Commit Graph

36 Commits

Author SHA1 Message Date
Filippo Valsorda 2aa609cf4a chacha20,poly1305,chacha20poly1305: set consistent build tags
appengine was only necessary for the legacy system based on Go 1.9, drop
that. Add purego tags instead. Remove redundant architecture tags.

Change-Id: Ib1f65a4837511e63e08c1aa43163a79cfe868e0c
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/215498
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Katie Hockman <katie@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
2020-02-21 23:15:18 +00:00
Filippo Valsorda 61a87790db poly1305: drop broken arm assembly
The ARM assembly uses the reserved G register. This started causing
frequent crashes due to async preemption, but it was already broken in
the presence of signals, including SIGPROF.

name                          old speed      new speed      delta
Chacha20Poly1305/Open-64      2.88MB/s ± 0%  1.85MB/s ± 0%   -35.76%  (p=0.008 n=6+7)
Chacha20Poly1305/Seal-64      3.17MB/s ± 1%  1.97MB/s ± 0%   -37.78%  (p=0.000 n=10+8)
Chacha20Poly1305/Open-64-X    2.41MB/s ± 0%  1.61MB/s ± 0%   -33.29%  (p=0.000 n=9+9)
Chacha20Poly1305/Seal-64-X    2.55MB/s ± 0%  1.64MB/s ± 0%   -35.61%  (p=0.000 n=10+9)
Chacha20Poly1305/Open-1350    8.43MB/s ± 0%  4.15MB/s ± 0%   -50.78%  (p=0.000 n=10+10)
Chacha20Poly1305/Seal-1350    8.55MB/s ± 0%  4.18MB/s ± 0%   -51.12%  (p=0.000 n=9+9)
Chacha20Poly1305/Open-1350-X  8.16MB/s ± 0%  4.06MB/s ± 0%   -50.18%  (p=0.000 n=10+10)
Chacha20Poly1305/Seal-1350-X  8.24MB/s ± 1%  4.08MB/s ± 1%   -50.53%  (p=0.000 n=10+10)
Chacha20Poly1305/Open-8192    9.73MB/s ± 1%  4.56MB/s ± 0%   -53.15%  (p=0.000 n=9+10)
Chacha20Poly1305/Seal-8192    9.57MB/s ± 0%  4.52MB/s ± 0%   -52.77%  (p=0.000 n=9+9)
Chacha20Poly1305/Open-8192-X  9.65MB/s ± 0%  4.54MB/s ± 0%   -52.95%  (p=0.000 n=10+7)
Chacha20Poly1305/Seal-8192-X  9.47MB/s ± 1%  4.50MB/s ± 0%   -52.50%  (p=0.000 n=10+9)

Fixes golang/go#35511

Change-Id: I5e5ca3a0499f04c5fece5bc669a417e32d2656c6
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/213880
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-09 15:21:10 +00:00
Filippo Valsorda 2dbfe9001f poly1305: rewrite the Go implementation with 64-bit limbs
The new code is meant to be readable without external references for
Poly1305, and explains the field logic. The generic code is now 30-50%
faster on an Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz, and even better
on a 3.1 GHz i7 MacBook.

name        old time/op    new time/op    delta
64-48          126ns ± 0%      80ns ± 1%  -36.24%  (p=0.000 n=16+20)
1K-48         1.07µs ± 0%    0.81µs ± 2%  -23.63%  (p=0.000 n=19+20)
2M-48         2.07ms ± 0%    1.61ms ± 1%  -22.31%  (p=0.000 n=20+20)
Write64-48    79.3ns ± 0%    58.0ns ± 1%  -26.89%  (p=0.000 n=20+19)
Write1K-48    1.02µs ± 0%    0.79µs ± 1%  -22.91%  (p=0.000 n=19+19)
Write2M-48    2.07ms ± 0%    1.61ms ± 2%  -22.33%  (p=0.000 n=17+20)

name        old speed      new speed      delta
64-48        508MB/s ± 0%   797MB/s ± 1%  +56.95%  (p=0.000 n=16+20)
1K-48        960MB/s ± 0%  1257MB/s ± 2%  +30.94%  (p=0.000 n=18+20)
2M-48       1.01GB/s ± 0%  1.30GB/s ± 1%  +28.73%  (p=0.000 n=20+20)
Write64-48   807MB/s ± 0%  1104MB/s ± 1%  +36.78%  (p=0.000 n=18+19)
Write1K-48  1.00GB/s ± 0%  1.30GB/s ± 1%  +29.71%  (p=0.000 n=18+19)
Write2M-48  1.01GB/s ± 0%  1.31GB/s ± 2%  +28.77%  (p=0.000 n=17+20)

The assembly is still 50-90% faster on the Xeon, 30-60% on the MacBook.
The Go code does not use all the arithmetic tricks the assembly does,
and it does not have access to the three operand wide shift instruction.

name        old time/op    new time/op    delta
64-48         80.3ns ± 1%    54.2ns ± 0%  -32.50%  (p=0.000 n=20+17)
1K-48          815ns ± 2%     446ns ± 1%  -45.27%  (p=0.000 n=20+20)
2M-48         1.61ms ± 1%    0.86ms ± 0%  -46.54%  (p=0.000 n=20+17)
Write64-48    58.0ns ± 1%    34.0ns ± 0%  -41.34%  (p=0.000 n=19+20)
Write1K-48     790ns ± 1%     427ns ± 0%  -45.92%  (p=0.000 n=19+17)
Write2M-48    1.61ms ± 2%    0.86ms ± 0%  -46.51%  (p=0.000 n=20+20)

name        old speed      new speed      delta
64-48        797MB/s ± 1%  1180MB/s ± 0%  +48.09%  (p=0.000 n=20+19)
1K-48       1.26GB/s ± 2%  2.30GB/s ± 1%  +82.71%  (p=0.000 n=20+20)
2M-48       1.30GB/s ± 1%  2.44GB/s ± 0%  +87.04%  (p=0.000 n=20+17)
Write64-48  1.10GB/s ± 1%  1.88GB/s ± 0%  +70.52%  (p=0.000 n=19+18)
Write1K-48  1.30GB/s ± 1%  2.40GB/s ± 0%  +84.84%  (p=0.000 n=19+18)
Write2M-48  1.31GB/s ± 2%  2.44GB/s ± 0%  +86.93%  (p=0.000 n=20+20)

Hopefully this will also avoid the need for an arm64 implementation.

Since now the Go and the amd64/ppc64le assembly use the same limb
schedule, drop the assembly initialize and finalize implementations,
and make the wrapper code match. It comes with a minor slowdown.

name        old time/op    new time/op    delta
64-48         50.3ns ± 0%    54.2ns ± 0%  +7.73%  (p=0.000 n=20+17)
1K-48          441ns ± 0%     446ns ± 1%  +1.10%  (p=0.000 n=19+20)
2M-48          860µs ± 0%     859µs ± 0%    ~     (p=0.178 n=19+17)
Write64-48    34.0ns ± 0%    34.0ns ± 0%    ~     (all equal)
Write1K-48     424ns ± 0%     427ns ± 0%  +0.71%  (p=0.000 n=17+17)
Write2M-48     860µs ± 0%     859µs ± 0%  -0.04%  (p=0.000 n=19+20)

name        old speed      new speed      delta
64-48       1.27GB/s ± 0%  1.18GB/s ± 0%  -7.20%  (p=0.000 n=20+19)
1K-48       2.32GB/s ± 0%  2.30GB/s ± 1%  -1.07%  (p=0.000 n=18+20)
2M-48       2.44GB/s ± 0%  2.44GB/s ± 0%    ~     (p=0.173 n=19+17)
Write64-48  1.88GB/s ± 0%  1.88GB/s ± 0%  +0.04%  (p=0.000 n=19+18)
Write1K-48  2.41GB/s ± 0%  2.40GB/s ± 0%  -0.67%  (p=0.000 n=19+18)
Write2M-48  2.44GB/s ± 0%  2.44GB/s ± 0%  +0.04%  (p=0.000 n=19+20)

Since poly1305/sum_generic.go was almost entirely rewritten, it's
probably best reviewed on gitiles.

This is the implementation published at
https://blog.filippo.io/a-literate-go-implementation-of-poly1305/

Updates #31470

Change-Id: I74f9011d3ee317a43b05ae7f05d96081d08bffd3
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/169037
Reviewed-by: Katie Hockman <katie@golang.org>
2019-11-11 21:33:42 +00:00
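The 64-bit limb arithmetic that commit 2dbfe9001f describes depends on widening multiplication and explicit carry propagation. The following is a minimal sketch of those two building blocks, assuming Go's `math/bits` package; the names `uint128`, `mul64` and `add128` are illustrative and are not claimed to match the package source.

```go
package main

import (
	"fmt"
	"math/bits"
)

// uint128 holds a 128-bit value as two 64-bit halves (illustrative type).
type uint128 struct {
	lo, hi uint64
}

// mul64 returns the full 128-bit product of a and b.
func mul64(a, b uint64) uint128 {
	hi, lo := bits.Mul64(a, b)
	return uint128{lo: lo, hi: hi}
}

// add128 returns a + b, discarding any carry out of the top half.
func add128(a, b uint128) uint128 {
	lo, carry := bits.Add64(a.lo, b.lo, 0)
	hi, _ := bits.Add64(a.hi, b.hi, carry)
	return uint128{lo: lo, hi: hi}
}

func main() {
	// 2^63 * 4 = 2^65, which only fits in the high half.
	p := mul64(1<<63, 4)
	// Adding 1 to an all-ones low half carries into the high half.
	s := add128(uint128{lo: ^uint64(0)}, uint128{lo: 1})
	fmt.Println(p.hi, p.lo, s.hi, s.lo)
}
```

Keeping each product in a two-word value lets partial products of limbs be summed before any reduction, which is the general shape of a limb schedule like the one the commit message mentions.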
Lynn Boger f99c8df09e poly1305: improve performance with asm for ppc64le
This adds an asm implementation for poly1305 on ppc64le, based on
the amd64 asm implementation using the mac interface.

The improvements on a power8 based on the poly1305 benchmarks are:

    name              old time/op   new time/op    delta
    64                  172ns ± 0%      78ns ± 0%   -54.77%  (p=1.000 n=1+1)
    1K                 1.47µs ± 0%    0.59µs ± 0%   -59.69%  (p=1.000 n=1+1)
    2M                 2.84ms ± 0%    1.12ms ± 0%   -60.47%  (p=1.000 n=1+1)
    64Unaligned         172ns ± 0%      78ns ± 0%   -54.59%  (p=1.000 n=1+1)
    1KUnaligned        1.47µs ± 0%    0.59µs ± 0%   -59.69%  (p=1.000 n=1+1)
    2MUnaligned        2.84ms ± 0%    1.13ms ± 0%   -60.23%  (p=1.000 n=1+1)
    Write64             100ns ± 0%      46ns ± 0%   -53.80%  (p=1.000 n=1+1)
    Write1K            1.40µs ± 0%    0.56µs ± 0%   -59.90%  (p=1.000 n=1+1)
    Write2M            2.84ms ± 0%    1.12ms ± 0%   -60.46%  (p=1.000 n=1+1)
    Write64Unaligned    100ns ± 0%      46ns ± 0%   -53.60%  (p=1.000 n=1+1)
    Write1KUnaligned   1.40µs ± 0%    0.56µs ± 0%   -59.90%  (p=1.000 n=1+1)
    Write2MUnaligned   2.84ms ± 0%    1.13ms ± 0%   -60.22%  (p=1.000 n=1+1)

Change-Id: I77cc9bb3645a6b1a6edc414b5651dc37ae4a7410
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/173421
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2019-06-05 12:30:33 +00:00
Andreas Auernhammer c2843e01d9 poly1305: implement a subset of the hash.Hash interface
This CL adds the poly1305.MAC type which implements a
subset of the hash.Hash interface. With MAC it is possible
to compute an authentication tag of data without copying
it into a single byte slice.

This commit modifies the reference/generic and the
AMD64 assembler but not the ARM/s390x implementation
to support an io.Writer interface.

Updates golang/go#25219

Change-Id: I7ee5a9eadd43387cf3cd887d734c625575eee47d
Reviewed-on: https://go-review.googlesource.com/c/crypto/+/111335
Run-TryBot: Filippo Valsorda <filippo@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
2019-03-08 22:17:18 +00:00
Michael Munday 0091315ad7 poly1305: use x/sys/cpu for s390x feature detection
Use the recently added CPU feature detection API rather than custom
assembly. This will need to be updated to use 'internal/cpu' when
the package is revendored into std.

Change-Id: Ia99c51c7409fe4fabcd88fdf5ff19772c1ca2257
Reviewed-on: https://go-review.googlesource.com/c/164382
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-02-28 16:14:51 +00:00
Michael Munday 425cc7d9a7 poly1305: add additional test cases
Increase the number of test vectors in this package to provide
better validation of new SIMD implementations.

Change-Id: Ia89883609e78cef53ba40a9cae41f4e0a3bccc80
Reviewed-on: https://go-review.googlesource.com/112855
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-05-14 23:09:06 +00:00
bill_ofarrell 4eb8c2c8d8 poly1305: add optimized s390x SIMD implementation with VMSL
SIMD implementation based on the algorithm outlined in:
NEON crypto, Daniel J. Bernstein and Peter Schwabe
https://cryptojedi.org/papers/neoncrypto-20120320.pdf
and as modified for VMSL as described in
Accelerating Poly1305 Cryptographic Message Authentication on the z14
O'Farrell, Gadriwala, et al, CASCON 2017, p48-55
https://ibm.ent.box.com/s/jf9gedj0e9d2vjctfyh186shaztavnht

name         old      new       delta
64           485MB/s  1315MB/s  +171.58%
1K           607MB/s  4352MB/s  +616.97%
64Unaligned  485MB/s  1373MB/s  +183.09%
1KUnaligned  606MB/s  4286MB/s  +607.26%
2M           607MB/s  5529MB/s  +810.87%

Change-Id: I31ccc25ced09180d99ea5c9233f0dcdc8666fc98
Reviewed-on: https://go-review.googlesource.com/110297
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2018-05-14 22:55:51 +00:00
Kevin Burke 5ef0053f77 all: use HTTPS for links that support it
Many websites now support HTTPS that may not have when the code was
committed; let's use the HTTPS links where we can.

Change-Id: I7099dfa0dbb213294e65b4387f343d6e8f955b97
Reviewed-on: https://go-review.googlesource.com/47131
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2017-06-29 04:21:55 +00:00
Adam Langley 453249f01c poly1305: add burn-in test.
This is the test that I use to sanity-check significant changes to the
package, thus it's probably worth checking it in. Since it's very slow,
it's disabled by default.

(Note that while it stands a good chance of catching errors in 32-bit
implementations, no amount of random testing is going to get useful
coverage for 64-bit implementations. Thus it really is just a sanity
check, despite the long run-time.)

Change-Id: I95b321eec6f3026dafbbc157a7ef35a27e88d247
Reviewed-on: https://go-review.googlesource.com/36566
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-09 23:39:01 +00:00
Andreas Auernhammer 537c9dfe43 poly1305: simplify reference implementation
Reduce code complexity by replacing the floating-point implementation
with a 32-bit implementation.

Moreover this improves the performance on 386:

name           old time/op  new time/op  delta
64-2            972ns ± 2%   350ns ± 1%  -64.04%  (p=0.029 n=4+4)
1K-2           10.9µs ± 3%   4.2µs ± 1%  -61.11%  (p=0.029 n=4+4)
64Unaligned-2   969ns ± 2%   354ns ± 2%  -63.44%  (p=0.029 n=4+4)
1KUnaligned-2  10.8µs ± 3%   4.2µs ± 1%  -61.15%  (p=0.029 n=4+4)

name           old speed      new speed       delta
64-2           65.8MB/s ± 2%  182.9MB/s ± 1%  +177.93%  (p=0.029 n=4+4)
1K-2           94.3MB/s ± 3%  242.3MB/s ± 1%  +157.08%  (p=0.029 n=4+4)
64Unaligned-2  66.0MB/s ± 2%  180.4MB/s ± 2%  +173.32%  (p=0.029 n=4+4)
1KUnaligned-2  94.4MB/s ± 3%  243.0MB/s ± 1%  +157.36%  (p=0.029 n=4+4)

There are already optimized versions for amd64 and arm,
and an optimized version for s390x seems to be planned.
See: https://go-review.googlesource.com/#/c/32812/

Change-Id: I7a5ac62ae33727b0e6060cb966de73a468317e30
Reviewed-on: https://go-review.googlesource.com/35294
Reviewed-by: Michael Munday <munday@ca.ibm.com>
Reviewed-by: Adam Langley <agl@golang.org>
2017-02-08 20:50:45 +00:00
Michael Munday dc137beb6c poly1305: add test vectors for edge cases
Often intermediate results of poly1305 calculations are only reduced to
the range [0, 2^130). These new test vectors exercise the code that
reduces the final output to the range [0, 2^130-5).

This improves the test coverage of CL 35294 and CL 32812.

Change-Id: Ifd2f64d4668c08a396ed81db3e88969a49baf777
Reviewed-on: https://go-review.googlesource.com/35918
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
2017-01-30 17:18:27 +00:00
Shenghou Ma abc5fa7ad0 poly1305: make data declared in assembly files private
Update golang/go#18673.

Change-Id: I3ba89bab42f17e6fd7005df40c7a853aef1fda37
Reviewed-on: https://go-review.googlesource.com/35259
Run-TryBot: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
2017-01-16 01:45:04 +00:00
Adam Langley 1150b8bd09 poly1305: don't move R13 in sum_arm.s.
Rather than change the value of R13 during the execution, keep R13 fixed
(after the initial prelude) and always use offsets from it.

This should help the runtime figure out what's going on if, say, a
signal should occur while running this code.

I've also trimmed the set of saved registers since Go doesn't require
the callee to maintain anything except R10 and R13.

Change-Id: Ifbeca73c1d964cc43bb7f8c20c61066f22fd562d
Reviewed-on: https://go-review.googlesource.com/31717
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-10-24 22:35:48 +00:00
Adam Langley 3ded668c53 poly1305: enable assembly for ARM in Go 1.6.
5f31782cfb added build constraints to
disable assembly for Go 1.6 but didn't add the needed tags to the ARM
files. Also, it's not clear that was needed as the error given in
golang/go#17424 only complains about the chacha20poly1305 package.

This change reenables the assembly for Go 1.6 in the poly1305 package.
Tested with 1.6.3 and 1.5.4.

Fixes golang/go#17512.

Change-Id: I81b41f8810437ea327b415542402cd8ff5c8a390
Reviewed-on: https://go-review.googlesource.com/31492
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-20 16:32:05 +00:00
Adam Langley dec8741f62 poly1305: fix stack handling in sum_arm.s
Up till now, sum_arm.s was working only because of luck. It was written
assuming that it had stack space below the current stack pointer, but Go
decrements the stack pointer in the function prelude, so it was just
writing off the end of the stack.

This change fixes the stack manipulation so that it only writes within
the bounds.

Fixes golang/go#17499.

Change-Id: I1951b3344c21f6bd6ade79da8b96dd1bb68180db
Reviewed-on: https://go-review.googlesource.com/31443
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-20 16:31:58 +00:00
Adam Langley cdcb58c6ca poly1305: fix NaCl build.
The ARM assembly doesn't work for NaCl on ARM because it doesn't meet
the required rules. This change disables it on ARM and also fixes the
issue that the build constraints in sum_arm.s would be ignored because
they came after the #include.

Change-Id: I6cb3815ec62ac4686a6e72f405af104293586bb6
Reviewed-on: https://go-review.googlesource.com/31264
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-17 20:27:05 +00:00
Brad Fitzpatrick 5f31782cfb poly1305, chacha20poly1305: fix build for Go 1.6
Fixes golang/go#17424

Change-Id: I49d6e475c173da6a31542931d555ab87cc45a1c6
Reviewed-on: https://go-review.googlesource.com/30971
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
2016-10-12 22:20:46 +00:00
Brad Fitzpatrick 85ce60fb24 poly1305: fix build
Updates golang/go#17422

Change-Id: Ie5f16e24f87b3d800f1182b5b09d6cf638135e33
Reviewed-on: https://go-review.googlesource.com/30970
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
2016-10-12 22:18:25 +00:00
Adam Langley a81735b1ea poly1305: rename files to sum_𝑥.s
Since the wrapper files are called sum_𝑥.go, it makes sense that the
assembly files would be named similarly.

Change-Id: I5c515008b86c7fedd04b940d7846b84dfccdba33
Reviewed-on: https://go-review.googlesource.com/30727
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-10-11 21:55:02 +00:00
Adam Langley 1265e0190f poly1305: minor updates.
This change updates the Poly1305 code in x/crypto to reflect some
comments from the review of
https://go-review.googlesource.com/cl/29245/.

Following this change, poly1305_arm.s will be renamed to sum_arm.s, to
match the other files here. (The review becomes confusing if that's done
in the same change as the asmfmt changes.)

Change-Id: Iddf43615eba97c975adb135aef3a814a37e9ec02
Reviewed-on: https://go-review.googlesource.com/30820
Run-TryBot: Adam Langley <agl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <munday@ca.ibm.com>
2016-10-11 21:53:50 +00:00
Adam Langley 7682e7e394 poly1305: add test for carry edge-case.
This change adds a test that catches the bug which existed in
[https://go-review.googlesource.com/#/c/29993/, /30101/).

Change-Id: I71177fec0fcbca0dcec7fb11d3ad0a48df1e0b82
Reviewed-on: https://go-review.googlesource.com/30214
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-10-04 21:33:40 +00:00
Andreas Auernhammer 84e98f4576 poly1305: fix bug in amd64 assembly
Add the conditional subtraction of 3 from 'h2' (register R10).

Change-Id: I75615b0375f050a5cd97b968075c2992ccd1dee7
Reviewed-on: https://go-review.googlesource.com/30101
Reviewed-by: Adam Langley <agl@golang.org>
2016-10-02 16:48:29 +00:00
Andreas Auernhammer 568507f56e x/crypto/poly1305: optimize amd64 assembly performance
Improve performance on amd64 through faster assembly.

name           old time/op  new time/op  delta
64-8            101ns ± 4%    42ns ± 3%  -58.31%  (p=0.002 n=6+6)
1K-8            887ns ± 1%   456ns ± 1%  -48.53%  (p=0.002 n=6+6)
64Unaligned-8  98.1ns ± 1%  41.1ns ± 1%  -58.06%  (p=0.002 n=6+6)
1KUnaligned-8   885ns ± 2%   460ns ± 3%  -48.04%  (p=0.002 n=6+6)

name           old speed      new speed      delta
64-8            635MB/s ± 4%  1525MB/s ± 3%  +140.15%  (p=0.002 n=6+6)
1K-8           1.15GB/s ± 1%  2.24GB/s ± 1%   +94.22%  (p=0.002 n=6+6)
64Unaligned-8   653MB/s ± 1%  1557MB/s ± 1%  +138.58%  (p=0.002 n=6+6)
1KUnaligned-8  1.16GB/s ± 2%  2.23GB/s ± 3%   +92.46%  (p=0.002 n=6+6)

Change-Id: Ia3be8e7ff012f8a9b451d728a646f29f809ba665
Reviewed-on: https://go-review.googlesource.com/29993
Reviewed-by: Adam Langley <agl@golang.org>
2016-09-30 20:23:06 +00:00
Jungho Ahn 81bf7719a6 x/crypto/poly1305: fix memory alignment fault in ARM
The current ARM implementation assumes that the input message
is memory aligned, so it can cause an alignment fault when unaligned
access is not enabled. It may also generate incorrect output on ARMv5.

This change fixes the issue by temporarily copying the input
to a local aligned space. Although there may be a better way
to handle unaligned access, this is a safe approach on all
ARM versions.

This change also adds a test and benchmarks with unaligned
data. The benchmark results on a Raspberry Pi 2 are:

Benchmark64            2000000     812 ns/op   78.81 MB/s
Benchmark1K             200000    7809 ns/op  131.12 MB/s
Benchmark64Unaligned   2000000     967 ns/op   66.13 MB/s
Benchmark1KUnaligned    200000   10316 ns/op   99.26 MB/s

Change-Id: I189cc1b7bb6c67a04c9877271fb27326f2896e82
Reviewed-on: https://go-review.googlesource.com/12797
Reviewed-by: Adam Langley <agl@golang.org>
2015-08-19 00:13:40 +00:00
Alexander Neumann f6a608df62 poly1305/arm: allow building with Go 1.3
This is the same as https://golang.org/cl/154120043

Since the file textflag.h is not available on Go 1.3, the macros defined
in textflag.h are replaced with their respective value.

Fixes golang/go#11448

Change-Id: I0d4aed67b7afe50d8e4e88915edd2cefeac4cc96
Reviewed-on: https://go-review.googlesource.com/12033
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-07-11 23:27:05 +00:00
Joel Sing 644910e6da poly1305: fix compilation on arm with go tip
Fix compilation of poly1305 using go tip - it currently fails with:

./poly1305_arm.s:124: cannot reference SP without a symbol
./poly1305_arm.s:161: cannot reference SP without a symbol
./poly1305_arm.s:162: cannot reference SP without a symbol
asm: asm: assembly of ./poly1305_arm.s failed

Change-Id: I797dcf3641cc881b6cc192034b693ccf58317987
Reviewed-on: https://go-review.googlesource.com/10307
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Adam Langley <agl@golang.org>
2015-05-21 15:47:07 +00:00
Jungho Ahn 4d48e5fa3d x/crypto/poly1305: add ARM assembly
This change adds an ARMv6 assembly implementation. The reference assembly code is
the public domain source by Andrew Moon at https://github.com/floodyberry/poly1305-opt/blob/master/app/extensions/poly1305/poly1305_armv6-32.inc.
The author has confirmed that it's ok to put it under the Go license.

Benchmark results on Raspberry Pi (ARMv6-compatible processor rev 7):
 o Without ARMv6 assembly
   Benchmark1K      5000      287177 ns/op     3.57 MB/s
   Benchmark64     50000       38880 ns/op     1.65 MB/s

 o With ARMv6 assembly
   Benchmark1K    100000       15964 ns/op    64.14 MB/s
   Benchmark64   1000000        1472 ns/op    43.46 MB/s

Change-Id: Iea5b0b831ac097cc6d477a8fccbf0ddb4819724c
Reviewed-on: https://go-review.googlesource.com/9765
Reviewed-by: Adam Langley <agl@golang.org>
Run-TryBot: Adam Langley <agl@golang.org>
2015-05-14 21:10:51 +00:00
Marga Manterola c57d4a7191 poly1305, curve25519: add build constraints for appengine
Updates: golang/go#9845

Change-Id: I78ce460d2a188ee13dd3f80015919a14eba03d07
Reviewed-on: https://go-review.googlesource.com/8100
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-03-27 05:11:19 +00:00
David Symonds 1fbbd62cfe crypto: add import comments.
Change-Id: I33240faf1b8620d0cd600de661928d8e422ebdbc
Reviewed-on: https://go-review.googlesource.com/1235
Reviewed-by: Andrew Gerrand <adg@golang.org>
2014-12-09 23:26:36 +00:00
Ian Lance Taylor 902e2dcb72 curve25519, poly1305: change last CL to build with Go 1.3
It also still works with Go 1.4.

LGTM=agl
R=agl
CC=golang-codereviews
https://golang.org/cl/154120043
2014-10-07 18:09:31 -07:00
Ian Lance Taylor 20b2ab3f62 curve25519, poly1305: mark constants as RODATA
Fixes tests when using Go tip.  Without this the link steps
fails with errors like:

missing Go type information for global symbol: google3/third_party/golang/go_crypto/curve25519/curve25519.REDMASK51 size 8

LGTM=agl
R=agl
CC=golang-codereviews
https://golang.org/cl/156810043
2014-10-07 16:59:07 -07:00
Shenghou Ma bf5456312c go.crypto/{curve25519,poly1305,salsa20/salsa}: add //go:noescape annotation
R=golang-dev, rsc, agl
CC=golang-dev
https://golang.org/cl/7319045
2013-02-19 19:15:01 +08:00
Ian Lance Taylor 6779fad1d0 go.crypto: add and adjust +build lines for 386 and gccgo
R=golang-dev, bradfitz
CC=golang-dev
https://golang.org/cl/6827061
2012-11-07 22:50:39 -08:00
Adam Langley 9c0a3ae199 go.crypto/poly1305: enable AMD64 assembly
This change alters the assembly to use FSUBD instructions such that
6l will actually emit the correct FSUBRD instructions and enables
the assembly code.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/6514044
2012-09-14 17:00:38 -04:00
Adam Langley 6814ed3bb5 go.crypto/poly1305: add package.
(Reference implementation by dchest. amd64 disabled pending 6l fix.)

R=golang-dev, dchest
CC=golang-dev
https://golang.org/cl/6494105
2012-09-08 14:24:19 -04:00