Simplify the constant swap function.
On amd64: Replace the CMOVQEQ scheme with SSE2 code similar to the non-amd64 code.
On non-amd64: Avoid unnecessary loop iterations.
The result is less code that is also slightly faster.
name              old time/op  new time/op  delta
ScalarBaseMult-4  653µs ± 0%   636µs ± 0%   ~        (p=0.100 n=3+3)

name              old time/op  new time/op  delta
ConstantSwap-4    10.4ns ± 1%  6.2ns ± 0%   -39.86%  (p=0.029 n=4+4)
Measured on an i7-6500U.
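For readers unfamiliar with the technique, a branch-free conditional swap can be sketched in Go roughly as follows. This is an illustration only, not the code from this change: the name cswap, the [10]int32 limb layout, and the 0-or-1 flag convention are assumptions made for the example.

```go
package main

import "fmt"

// cswap swaps f and g when b == 1 and leaves both unchanged when b == 0.
// b must be 0 or 1. Hypothetical helper for illustration; the limb count
// and element type are assumptions, not taken from the package.
func cswap(f, g *[10]int32, b int32) {
	mask := -b // all zero bits when b == 0, all one bits when b == 1
	for i := range f {
		// t is f[i]^g[i] when swapping and 0 otherwise, so the same
		// instructions run regardless of the value of b.
		t := mask & (f[i] ^ g[i])
		f[i] ^= t
		g[i] ^= t
	}
}

func main() {
	f := [10]int32{1, 2, 3}
	g := [10]int32{4, 5, 6}

	cswap(&f, &g, 0)
	fmt.Println(f[0], g[0]) // unchanged

	cswap(&f, &g, 1)
	fmt.Println(f[0], g[0]) // swapped
}
```

The point of the mask is that neither the instruction sequence nor the memory access pattern depends on the secret bit.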
Change-Id: Ia5eea92e0b3eabb6c291d25229aa582b51278552
Reviewed-on: https://go-review.googlesource.com/39693
Reviewed-by: Adam Langley <agl@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The assembly implementations of ladderstep and mul contain register
save prologues that are unnecessary in Go because there are no callee
save registers in the Go ABI. Remove these prologues, update all SP
offsets, and reduce the frame size accordingly.
The SP offsets were updated with:
python -c 'import sys, re; sys.stdout.write(re.sub(r"(\d+)\(SP\)", lambda m: "%d(SP)" % (int(m.group(1))-YYY), sys.stdin.read()))'
where YYY was 64 for mul_amd64.s and 56 for ladderstep_amd64.s.
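The same rewrite could be sketched in Go as below, in case a Go tool is preferred over the Python one-liner. The function name shiftSP and the sample instructions are made up for illustration; the regular expression is the one used above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// spRef matches a decimal offset followed by (SP), e.g. "64(SP)".
var spRef = regexp.MustCompile(`(\d+)\(SP\)`)

// shiftSP subtracts delta from every decimal N(SP) reference in src.
// Hypothetical helper mirroring the Python one-liner.
func shiftSP(src string, delta int) string {
	return spRef.ReplaceAllStringFunc(src, func(m string) string {
		sub := spRef.FindStringSubmatch(m)
		n, err := strconv.Atoi(sub[1])
		if err != nil {
			return m // leave anything unparsable untouched
		}
		return strconv.Itoa(n-delta) + "(SP)"
	})
}

func main() {
	// Sample lines invented for the example, with delta = 56.
	fmt.Println(shiftSP("MOVQ DI,56(SP)", 56))
	fmt.Println(shiftSP("MOVQ 64(SP),AX", 56))
}
```

As with the Python version, the result should be checked by hand for offsets that would go negative.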
Change-Id: I728948809f479b1c061cc65167dadad651efab31
Reviewed-on: https://go-review.googlesource.com/31580
Reviewed-by: Adam Langley <agl@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
The curve25519 assembly routines do very non-Go-ABI SP adjustments.
These would thoroughly confuse traceback if it were to fire in one of
these functions (say, because of a signal). Plus, we're about to make
the assembler track SP balance through more operations (which it
should have done all along), and the SP alignment performed by these
functions is going to make the assembler think the SP is out of
balance.
Fix this by eliminating the SP alignment prologue from all four
assembly functions. They don't do any operations that care about SP
alignment, so this is simply unnecessary. square and freeze don't even
use the stack for anything other than saving what were presumably
"callee save" registers in some other ABI, so for these we can
eliminate the stack frame entirely.
Change-Id: If9dbb2fb6800d9cd733daa91f483eb2937e95f0f
Reviewed-on: https://go-review.googlesource.com/31579
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
Fixes tests when using Go tip. Without this, the link step
fails with errors like:
missing Go type information for global symbol: google3/third_party/golang/go_crypto/curve25519/curve25519.REDMASK51 size 8
LGTM=agl
R=agl
CC=golang-codereviews
https://golang.org/cl/156810043
Previously curve25519 contained a constant-time, optimised amd64 implementation and
a generic implementation that used math/big and that was not constant-time.
This change contains a Go port of the public-domain "ref10" implementation from
SUPERCOP. This has the advantage of being faster and constant-time.
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/13343045
This consists of ~2000 lines of amd64 assembly and a much slower,
generic Go version in curve25519.go. The assembly has been ported from
djb's public domain sources and the only semantic alterations are to
deal with Go's split stacks.
R=rsc
CC=golang-dev
https://golang.org/cl/5786045