WSL2-Linux-Kernel

Commit graph

Author	SHA1	Message	Date
Herbert Xu	0914999744	crypto: aegis128 - Move simd prototypes into aegis.h This patch fixes missing prototype warnings in crypto/aegis128-neon.c. Fixes: `a4397635af` ("crypto: aegis128 - provide a SIMD...") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2021-03-19 21:59:45 +11:00
Ard Biesheuvel	97b70180b7	crypto: aegis128/neon - move final tag check to SIMD domain Instead of calculating the tag and returning it to the caller on decryption, use a SIMD compare and min across vector to perform the comparison. This is slightly more efficient, and removes the need on the caller's part to wipe the tag from memory if the decryption failed. While at it, switch to unsigned int when passing cryptlen and assoclen - we don't support input sizes where it matters anyway. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Ondrej Mosnacek <omosnacek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2020-11-27 17:13:40 +11:00
Ard Biesheuvel	528282630c	crypto: aegis128 - duplicate init() and final() hooks in SIMD code In order to speed up aegis128 processing even more, duplicate the init() and final() routines as SIMD versions in their entirety. This results in a 2x speedup on ARM Cortex-A57 for ~1500 byte packets (using AES instructions). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-10-26 02:06:05 +11:00
Ard Biesheuvel	198429631a	crypto: arm64/aegis128 - implement plain NEON version Provide a version of the core AES transform to the aegis128 SIMD code that does not rely on the special AES instructions, but uses plain NEON instructions instead. This allows the SIMD version of the aegis128 driver to be used on arm64 systems that do not implement those instructions (which are not mandatory in the architecture), such as the Raspberry Pi 3. Since GCC makes a mess of this when using the tbl/tbx intrinsics to perform the sbox substitution, preload the Sbox into v16..v31 in this case and use inline asm to emit the tbl/tbx instructions. Clang does not support this approach, nor does it require it, since it does a much better job at code generation, so there we use the intrinsics as usual. Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-08-15 21:52:15 +10:00
Ard Biesheuvel	a4397635af	crypto: aegis128 - provide a SIMD implementation based on NEON intrinsics Provide an accelerated implementation of aegis128 by wiring up the SIMD hooks in the generic driver to an implementation based on NEON intrinsics, which can be compiled to both ARM and arm64 code. This results in a performance of 2.2 cycles per byte on Cortex-A53, which is a performance increase of ~11x compared to the generic code. Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-08-15 21:52:15 +10:00
Herbert Xu	c9f1fd4f2f	Revert "crypto: aegis128 - add support for SIMD acceleration" This reverts commit `ecc8bc81f2` ("crypto: aegis128 - provide a SIMD implementation based on NEON intrinsics") and commit `7cdc0ddbf7` ("crypto: aegis128 - add support for SIMD acceleration"). They cause compile errors on platforms other than ARM because the mechanism to selectively compile the SIMD code is broken. Repoted-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-08-02 13:31:35 +10:00
Ard Biesheuvel	ecc8bc81f2	crypto: aegis128 - provide a SIMD implementation based on NEON intrinsics Provide an accelerated implementation of aegis128 by wiring up the SIMD hooks in the generic driver to an implementation based on NEON intrinsics, which can be compiled to both ARM and arm64 code. This results in a performance of 2.2 cycles per byte on Cortex-A53, which is a performance increase of ~11x compared to the generic code. Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2019-07-26 15:03:58 +10:00

7 commits