WSL2-Linux-Kernel/lib/crypto/poly1305.c

// SPDX-License-Identifier: GPL-2.0-or-later
/*
 * Poly1305 authenticator algorithm, RFC7539
 *
 * Copyright (C) 2015 Martin Willi
 *
 * Based on public domain code by Andrew Moon and Daniel J. Bernstein.
 */

#include <crypto/internal/poly1305.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <asm/unaligned.h>

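/*
 * Set up the MAC state from a 32-byte one-time key: the first 16 bytes are
 * loaded as r by poly1305_core_setkey(), the last 16 bytes are kept as the
 * nonce s and added when the tag is emitted.
 */
void poly1305_init_generic(struct poly1305_desc_ctx *desc, const u8 *key)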
{
	poly1305_core_setkey(&desc->core_r, key);
	desc->s[0] = get_unaligned_le32(key + 16);
	desc->s[1] = get_unaligned_le32(key + 20);
	desc->s[2] = get_unaligned_le32(key + 24);
	desc->s[3] = get_unaligned_le32(key + 28);
	poly1305_core_init(&desc->h);
	desc->buflen = 0;
	desc->sset = true;
	desc->rset = 2;
}
EXPORT_SYMBOL_GPL(poly1305_init_generic);

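/*
 * Absorb nbytes of message data. A partial block left over from an earlier
 * call is topped up first, whole blocks are then processed directly from
 * src, and any trailing bytes are buffered for the next call.
 */
void poly1305_update_generic(struct poly1305_desc_ctx *desc, const u8 *src,
			     unsigned int nbytes)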
{
	unsigned int bytes;

	if (unlikely(desc->buflen)) {
		bytes = min(nbytes, POLY1305_BLOCK_SIZE - desc->buflen);
		memcpy(desc->buf + desc->buflen, src, bytes);
		src += bytes;
		nbytes -= bytes;
		desc->buflen += bytes;

		if (desc->buflen == POLY1305_BLOCK_SIZE) {
			poly1305_core_blocks(&desc->h, &desc->core_r, desc->buf,
					     1, 1);
			desc->buflen = 0;
		}
	}

	if (likely(nbytes >= POLY1305_BLOCK_SIZE)) {
		poly1305_core_blocks(&desc->h, &desc->core_r, src,
				     nbytes / POLY1305_BLOCK_SIZE, 1);
		src += nbytes - (nbytes % POLY1305_BLOCK_SIZE);
		nbytes %= POLY1305_BLOCK_SIZE;
	}

	if (unlikely(nbytes)) {
		desc->buflen = nbytes;
		memcpy(desc->buf, src, nbytes);
	}
}
EXPORT_SYMBOL_GPL(poly1305_update_generic);

/*
 * Process any buffered partial block, write the tag to dst, and clear the
 * context.
 */
void poly1305_final_generic(struct poly1305_desc_ctx *desc, u8 *dst)
{
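	if (unlikely(desc->buflen)) {
		/*
		 * Final partial block: append the 0x01 pad byte and zero-fill.
		 * The last argument is 0 because the pad bit is already in
		 * the buffer.
		 */
		desc->buf[desc->buflen++] = 1;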
		memset(desc->buf + desc->buflen, 0,
		       POLY1305_BLOCK_SIZE - desc->buflen);
		poly1305_core_blocks(&desc->h, &desc->core_r, desc->buf, 1, 0);
	}

	poly1305_core_emit(&desc->h, desc->s, dst);
	*desc = (struct poly1305_desc_ctx){};
}
EXPORT_SYMBOL_GPL(poly1305_final_generic);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Martin Willi <martin@strongswan.org>");

/*
 * Commit history:
 *
 * 2019-11-08 15:22:19 +03:00
 * crypto: poly1305 - move core routines into a separate library
 *
 * Move the core Poly1305 routines shared between the generic Poly1305 shash
 * driver and the Adiantum and NHPoly1305 drivers into a separate library so
 * that using just this pieces does not pull in the crypto API pieces of the
 * generic Poly1305 routine.
 *
 * In a subsequent patch, we will augment this generic library with
 * init/update/final routines so that Poyl1305 algorithm can be used directly
 * without the need for using the crypto API's shash abstraction.
 *
 * Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
 * Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
 *
 * 2019-11-08 15:22:21 +03:00
 * crypto: poly1305 - expose init/update/final library interface
 *
 * Expose the existing generic Poly1305 code via a init/update/final library
 * interface so that callers are not required to go through the crypto API's
 * shash abstraction to access it. At the same time, make some preparations
 * so that the library implementation can be superseded by an accelerated
 * arch-specific version in the future.
 *
 * Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
 * Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
 *
 * 2020-01-06 06:40:46 +03:00
 * crypto: poly1305 - add new 32 and 64-bit generic versions
 *
 * These two C implementations from Zinc -- a 32x32 one and a 64x64 one,
 * depending on the platform -- come from Andrew Moon's public domain
 * poly1305-donna portable code, modified for usage in the kernel. The
 * precomputation in the 32-bit version and the use of 64x64 multiplies in
 * the 64-bit version make these perform better than the code it replaces.
 * Moon's code is also very widespread and has received many eyeballs of
 * scrutiny.
 *
 * There's a bit of interference between the x86 implementation, which
 * relies on internal details of the old scalar implementation. In the next
 * commit, the x86 implementation will be replaced with a faster one that
 * doesn't rely on this, so none of this matters much. But for now, to keep
 * this passing the tests, we inline the bits of the old implementation that
 * the x86 implementation relied on. Also, since we now support a slightly
 * larger key space, via the union, some offsets had to be fixed up.
 *
 * Nonce calculation was folded in with the emit function, to take advantage
 * of 64x64 arithmetic. However, Adiantum appeared to rely on no nonce
 * handling in emit, so this path was conditionalized. We also introduced a
 * new struct, poly1305_core_key, to represent the precise amount of space
 * that particular implementation uses.
 *
 * Testing with kbench9000, depending on the CPU, the update function for
 * the 32x32 version has been improved by 4%-7%, and for the 64x64 by
 * 19%-30%. The 32x32 gains are small, but I think there's great value in
 * having a parallel implementation to the 64x64 one so that the two can be
 * compared side-by-side as nice stand-alone units.
 *
 * Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
 * Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
 */