License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
  SPDX license identifier                            # files
  ---------------------------------------------------|-------
  GPL-2.0                                              11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results were:
  SPDX license identifier                            # files
  ---------------------------------------------------|-------
  GPL-2.0 WITH Linux-syscall-note                        930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
  SPDX license identifier                            # files
  ---------------------------------------------------|------
  GPL-2.0 WITH Linux-syscall-note                       270
  GPL-2.0+ WITH Linux-syscall-note                      169
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
  LGPL-2.1+ WITH Linux-syscall-note                      15
  GPL-1.0+ WITH Linux-syscall-note                       14
  ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
  LGPL-2.0+ WITH Linux-syscall-note                       4
  LGPL-2.1 WITH Linux-syscall-note                        3
  ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
  ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights.
The Windriver scanner is based on an older version of FOSSology in part,
so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
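The last step described above, picking a comment style that fits the
file type, might look roughly like the following C sketch. The helper
names has_suffix() and spdx_line() are hypothetical, and the rule used
here (a "//" comment for .c files, a "/* */" comment for headers) is an
assumption made for illustration; the actual script is not reproduced
in this message and handled more file types.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if path ends with the given suffix, 0 otherwise. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Format the SPDX tag line for a file into out (hypothetical helper).
 * Returns the number of characters written, or -1 for an unhandled
 * file type or a buffer that is too small.
 */
static int spdx_line(const char *path, const char *id, char *out, size_t len)
{
	int n;

	if (has_suffix(path, ".c"))
		n = snprintf(out, len, "// SPDX-License-Identifier: %s", id);
	else if (has_suffix(path, ".h"))
		n = snprintf(out, len, "/* SPDX-License-Identifier: %s */", id);
	else
		return -1;

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Calling spdx_line("lib/random32.c", "GPL-2.0", buf, sizeof(buf)) yields
the same first line that this file carries.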
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 17:07:57 +03:00
// SPDX-License-Identifier: GPL-2.0
2006-10-17 11:09:42 +04:00
/*
2014-04-04 01:49:08 +04:00
 * This is a maximally equidistributed combined Tausworthe generator
 * based on code from GNU Scientific Library 1.5 (30 Jun 2004)
 *
 * lfsr113 version:
 *
 * x_n = (s1_n ^ s2_n ^ s3_n ^ s4_n)
 *
 * s1_{n+1} = (((s1_n & 4294967294) << 18) ^ (((s1_n <<  6) ^ s1_n) >> 13))
 * s2_{n+1} = (((s2_n & 4294967288) <<  2) ^ (((s2_n <<  2) ^ s2_n) >> 27))
 * s3_{n+1} = (((s3_n & 4294967280) <<  7) ^ (((s3_n << 13) ^ s3_n) >> 21))
 * s4_{n+1} = (((s4_n & 4294967168) << 13) ^ (((s4_n <<  3) ^ s4_n) >> 12))
 *
 * The period of this generator is about 2^113 (see erratum paper).
 *
 * From: P. L'Ecuyer, "Maximally Equidistributed Combined Tausworthe
 * Generators", Mathematics of Computation, 65, 213 (1996), 203--213:
 * http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
 * ftp://ftp.iro.umontreal.ca/pub/simulation/lecuyer/papers/tausme.ps
 *
 * There is an erratum in the paper "Tables of Maximally Equidistributed
 * Combined LFSR Generators", Mathematics of Computation, 68, 225 (1999),
 * 261--269: http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps
 *
 *      ... the k_j most significant bits of z_j must be non-zero,
 *      for each j. (Note: this restriction also applies to the
 *      computer code given in [4], but was mistakenly not mentioned
 *      in that paper.)
 *
 * This affects the seeding procedure by imposing the requirement
 * s1 > 1, s2 > 7, s3 > 15, s4 > 127.
 */
2006-10-17 11:09:42 +04:00

#include <linux/types.h>
#include <linux/percpu.h>
2011-11-17 06:29:17 +04:00
#include <linux/export.h>
2006-10-18 09:47:25 +04:00
#include <linux/jiffies.h>
2006-10-17 11:09:42 +04:00
#include <linux/random.h>
2013-11-11 15:20:37 +04:00
#include <linux/sched.h>
2020-10-24 19:36:27 +03:00
#include <linux/bitops.h>
random32: improvements to prandom_bytes
This patch addresses a couple of minor items, mostly addressing
prandom_bytes(): 1) prandom_bytes{,_state}() should use size_t
for length arguments, 2) We can use put_unaligned() when filling
the array instead of open coding it [ perhaps some archs will
further benefit from their own arch specific implementation when
GCC cannot make up for it ], 3) Fix a typo, 4) Better use unsigned
int as type for getting the arch seed, 5) Make use of
prandom_u32_max() for timer slack.
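Item 5 depends on mapping a 32-bit pseudo-random value into a bounded
range. One common way to do that without a division is a
multiply-and-shift reduction, sketched below in userspace C. The name
bounded_u32() is hypothetical and the sketch is not claimed to be the
kernel's exact implementation of prandom_u32_max().

```c
#include <stdint.h>

/*
 * Map a uniformly distributed 32-bit value r into [0, ceil).
 * The 64-bit product r * ceil lies in [0, ceil * 2^32), so its upper
 * 32 bits are always smaller than ceil.
 */
static uint32_t bounded_u32(uint32_t r, uint32_t ceil)
{
	return (uint32_t)(((uint64_t)r * ceil) >> 32);
}
```

For a timer slack of up to N jiffies, a caller would pass the raw
generator output as r and N as ceil.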
Regarding the change to put_unaligned(), callers of prandom_bytes()
which internally invoke prandom_bytes_state(), don't bother as
they expect the array to be filled randomly and don't have any
control of the internal state what-so-ever (that's also why we
have periodic reseeding there, etc), so they really don't care.
Now for the direct callers of prandom_bytes_state(), which
are solely located in test cases for MTD devices, that is,
drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}:
These tests basically fill a test write-vector through
prandom_bytes_state() with an a-priori defined seed each time
and write that to an MTD device. Later on, they set up a read-vector
and read back those blocks from the device. So in the verification
phase, the write-vector is being re-setup [ so same seed and
prandom_bytes_state() called ], and then memcmp()'ed against the
read-vector to check if the data is the same.
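The verification scheme just described can be sketched in userspace C
as follows. This is an illustration under stated assumptions, not the
MTD test code: taus113_next(), fill_bytes() and verify_readback() are
hypothetical names, memcpy() stands in for put_unaligned(), and the
caller is assumed to supply a seed that already satisfies
s1 > 1, s2 > 7, s3 > 15, s4 > 127.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct rnd_state { uint32_t s1, s2, s3, s4; };

/* One generator step, following the recurrences in the file header. */
static uint32_t taus113_next(struct rnd_state *st)
{
	st->s1 = ((st->s1 & 4294967294U) << 18) ^ (((st->s1 << 6) ^ st->s1) >> 13);
	st->s2 = ((st->s2 & 4294967288U) << 2) ^ (((st->s2 << 2) ^ st->s2) >> 27);
	st->s3 = ((st->s3 & 4294967280U) << 7) ^ (((st->s3 << 13) ^ st->s3) >> 21);
	st->s4 = ((st->s4 & 4294967168U) << 13) ^ (((st->s4 << 3) ^ st->s4) >> 12);
	return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}

/* Fill buf with bytes from the generator, one 32-bit word at a time. */
static void fill_bytes(struct rnd_state *st, void *buf, size_t bytes)
{
	uint8_t *ptr = buf;

	while (bytes >= sizeof(uint32_t)) {
		uint32_t r = taus113_next(st);

		memcpy(ptr, &r, sizeof(r));	/* safe for unaligned ptr */
		ptr += sizeof(r);
		bytes -= sizeof(r);
	}
	if (bytes > 0) {
		uint32_t rem = taus113_next(st);

		/* Emit the remaining 1..3 bytes, lowest byte first. */
		do {
			*ptr++ = (uint8_t)rem;
			rem >>= 8;
		} while (--bytes > 0);
	}
}

/*
 * Regenerate the write-vector from the same seed into scratch and
 * compare it with what was read back. Returns 1 on a match.
 */
static int verify_readback(struct rnd_state seed, const void *readbuf,
			   void *scratch, size_t bytes)
{
	fill_bytes(&seed, scratch, bytes);
	return memcmp(scratch, readbuf, bytes) == 0;
}
```

Because the seed is passed by value, the caller's copy stays unchanged
and can be reused for every verification pass.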
Akinobu, Lothar and I also tested this patch and it runs through
the 3 relevant MTD test cases w/o any errors on the nandsim device
(simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53
and i.MX6):
# modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \
third_id_byte=0x00 fourth_id_byte=0x15
# modprobe mtd_oobtest dev=0
# modprobe mtd_pagetest dev=0
# modprobe mtd_subpagetest dev=0
We also don't have any users depending directly on a particular
result of the PRNG (except the PRNG self-test itself), and that's
just fine as it e.g. allowed us easily to do things like upgrading
from taus88 to taus113.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Akinobu Mita <akinobu.mita@gmail.com>
Tested-by: Lothar Waßmann <LW@KARO-electronics.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-23 19:03:28 +04:00
#include <asm/unaligned.h>
2020-08-13 20:06:43 +03:00
#include <trace/events/random.h>
2013-11-11 15:20:37 +04:00

2010-05-27 01:44:13 +04:00
/**
2012-12-18 04:04:23 +04:00
 *	prandom_u32_state - seeded pseudo-random number generator.
2010-05-27 01:44:13 +04:00
 *	@state: pointer to state structure holding seeded state.
 *
 *	This is used for pseudo-randomness with no outside seeding.
2012-12-18 04:04:23 +04:00
 *	For more random results, use prandom_u32().
2010-05-27 01:44:13 +04:00
 */
2012-12-18 04:04:23 +04:00
u32 prandom_u32_state(struct rnd_state *state)
2006-10-17 11:09:42 +04:00
{
random32: mix in entropy from core to late initcall
Currently, we have a 3-stage seeding process in prandom():
Phase 1 is from the early actual initialization of prandom()
subsystem which happens during core_initcall() and remains
most likely until the beginning of late_initcall() phase.
Here, the system might not have enough entropy available
for seeding with strong randomness from the random driver.
That means, we currently have a 32bit weak LCG() seeding
the PRNG status register 1 and mixing that successively
into the other 3 registers just to get it up and running.
Phase 2 starts with late_initcall() phase resp. when the
random driver has initialized its non-blocking pool with
enough entropy. At that time, we throw away *all* inner
state from its 4 registers and do a full reseed with strong
randomness.
Phase 3 starts right after that and does a periodic reseed
with random slack of status register 1 by a strong random
source again.
A problem in phase 1 is that during bootup data structures
can be initialized, e.g. on module load time, and thus access
a weakly seeded prandom and are never changed for the rest
of their lifetime, thus carrying along the results from a
weak seed. Let's make sure that current but also future users
access a possibly better early seeded prandom.
This patch therefore improves phase 1 by trying to make it
more 'unpredictable' through mixing in seed from a possible
hardware source. Now, the mix-in xors inner state with the
outcome of either of the two functions arch_get_random_{,seed}_int(),
preferably arch_get_random_seed_int() as it likely represents
a non-deterministic random bit generator in hw rather than
a cryptographically secure PRNG in hw. However, not all might
have the first one, so we use the PRNG as a fallback if
available. As we xor the seed into the current state, the
worst case would be that a hardware source could be unverifiably
compromised or backdoored. In that case nevertheless it
would be as good as our original early seeding function
prandom_seed_very_weak() since we mix through xor which is
entropy preserving.
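The mix-in with fallback described above can be sketched in userspace C
as follows. The hw_seed_fn callback type and the two stub sources are
hypothetical stand-ins for arch_get_random_seed_int() and
arch_get_random_int(); the sketch only shows the xor step, not the full
early seeding path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware source: stores a word in *out, false if unavailable. */
typedef bool (*hw_seed_fn)(uint32_t *out);

/* Demo stub: a platform without any hardware source. */
static bool hw_none(uint32_t *out)
{
	(void)out;
	return false;
}

/* Demo stub: a source that always returns the same word. */
static bool hw_fixed(uint32_t *out)
{
	*out = 0xdeadbeefU;
	return true;
}

/*
 * xor a hardware word into the weak seed. The preferred source is tried
 * first, then the fallback. With no source at all the weak seed is
 * returned unchanged, so the result is never worse than before.
 */
static uint32_t mix_seed(uint32_t weak, hw_seed_fn preferred, hw_seed_fn fallback)
{
	uint32_t hw = 0;

	if (preferred && preferred(&hw))
		return weak ^ hw;
	if (fallback && fallback(&hw))
		return weak ^ hw;
	return weak;
}
```

Note that the xor can produce any value, including 0, so the result
still has to be checked against the minimum state values before it is
used as generator state.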
Joint work with Daniel Borkmann.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-28 16:01:38 +04:00
#define TAUSWORTHE(s, a, b, c, d) ((s & c) << d) ^ (((s << a) ^ s) >> b)
random32: upgrade taus88 generator to taus113 from errata paper
Since we use prandom*() functions quite often in networking code
e.g. in UDP port selection, netfilter code, etc., upgrade the PRNG
from Pierre L'Ecuyer's original paper "Maximally Equidistributed
Combined Tausworthe Generators", Mathematics of Computation, 65,
213 (1996), 203--213 to the version published in his errata paper [1].
The Tausworthe generator is a maximally-equidistributed generator,
that is fast and has good statistical properties [1].
The version presented there upgrades the 3 state LFSR to a 4 state
LFSR with increased periodicity from about 2^88 to 2^113. The
algorithm is presented in [1] by the very same author who also
designed the original algorithm in [2].
Also, by increasing the state, we make it a bit harder for attackers
to "guess" the PRNGs internal state. See also discussion in [3].
Now, as we use this sort of weak initialization discussed in [3]
only between core_initcall() until late_initcall() time [*] for
prandom32*() users, namely in prandom_init(), it is less relevant
from late_initcall() onwards as we overwrite seeds through
prandom_reseed() anyways with a seed source of higher entropy, that
is, get_random_bytes(). In other words, an exhaustive keysearch of
96 bit would be needed. Now, with the help of this patch, this
state-search increases further to 128 bit. Initialization needs
to make sure that s1 > 1, s2 > 7, s3 > 15, s4 > 127.
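A minimal userspace sketch of an initialization that enforces these
lower bounds is shown below. It derives the state words from one 32-bit
seed with the LCG x <- x * 69069 mentioned in [*]. The names seed_min()
and seed_state() are hypothetical, and deriving all four words this way
is an assumption made for illustration rather than a copy of
prandom_init().

```c
#include <stdint.h>

struct rnd_state { uint32_t s1, s2, s3, s4; };

/* Return x unchanged if it is at least m, otherwise lift it to x + m. */
static uint32_t seed_min(uint32_t x, uint32_t m)
{
	return (x < m) ? x + m : x;
}

/* Derive four state words from one seed and enforce the minimum values. */
static void seed_state(struct rnd_state *st, uint32_t seed)
{
	uint32_t x = seed;

	x *= 69069U;
	st->s1 = seed_min(x, 2U);	/* s1 > 1   */
	x *= 69069U;
	st->s2 = seed_min(x, 8U);	/* s2 > 7   */
	x *= 69069U;
	st->s3 = seed_min(x, 16U);	/* s3 > 15  */
	x *= 69069U;
	st->s4 = seed_min(x, 128U);	/* s4 > 127 */
}
```

Even the degenerate seed 0, for which the LCG output stays 0, ends up
with the valid state (2, 8, 16, 128).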
The taus88 and taus113 algorithms are also part of GSL. I added a test
case in the next patch to verify internal behaviour of this patch
with GSL and ran tests with the dieharder 3.31.1 RNG test suite:
$ dieharder -g 052 -a -m 10 -s 1 -S 4137730333 #taus88
$ dieharder -g 054 -a -m 10 -s 1 -S 4137730333 #taus113
With this seed configuration, in order to compare both, we get
the following differences:
  algorithm                 taus88          taus113
  rands/second [**]         1.61e+08        1.37e+08
  sts_serial(4, 1st run)    WEAK            PASSED
  sts_serial(9, 2nd run)    WEAK            PASSED
  rgb_lagged_sum(31)        WEAK            PASSED
We took out diehard_sums test as according to the authors it is
considered broken and unusable [4]. Despite that and the slight
decrease in performance (which is acceptable), taus113 here passes
all 113 tests (only rgb_minimum_distance_5 in WEAK, the rest PASSED).
In general, taus/taus113 is considered "very good" by the authors
of dieharder [5].
The papers [1][2] state a single warm-up step is sufficient by
running quicktaus once on each state to ensure proper initialization
of ~s_{0}:
Our selection of (s) according to Table 1 of [1] row 1 holds the
condition L - k <= r - s, that is,
(32 32 32 32) - (31 29 28 25) <= (25 27 15 22) - (18 2 7 13)
with r = k - q and q = (6 2 13 3) as also stated by the paper.
So according to [2] we are safe with one round of quicktaus for
initialization. However we decided to include the warm-up phase
of the PRNG as done in GSL in every case as a safety net. We also
use the warm up phase to make the output of the RNG easier to
verify by the GSL output.
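The condition L - k <= r - s quoted above can be checked mechanically
for the four components. The sketch below is a small userspace C helper
with the parameters taken from this message; the function and array
names are hypothetical.

```c
#include <stddef.h>

/*
 * Check L - k_j <= r_j - s_j for every component j, with r_j = k_j - q_j.
 * Returns 1 if the condition holds for all n components, 0 otherwise.
 */
static int warmup_condition_holds(unsigned int L, const unsigned int *k,
				  const unsigned int *q, const unsigned int *s,
				  size_t n)
{
	for (size_t j = 0; j < n; j++) {
		unsigned int r = k[j] - q[j];

		if (r < s[j] || L - k[j] > r - s[j])
			return 0;
	}
	return 1;
}

/* Parameters as quoted in the commit message. */
static const unsigned int taus113_k[4] = { 31, 29, 28, 25 };
static const unsigned int taus113_q[4] = {  6,  2, 13,  3 };
static const unsigned int taus113_s[4] = { 18,  2,  7, 13 };
```

With L = 32 the left-hand side is (1 3 4 7) and the right-hand side is
(7 25 8 9), so the condition holds in every component. The s values are
the same as the final shift arguments passed to TAUSWORTHE().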
In prandom_init(), we also mix random_get_entropy() into it, just
like drivers/char/random.c does it, jiffies ^ random_get_entropy().
random_get_entropy() is get_cycles(). xor is entropy preserving so
it is fine if it is not implemented by some architectures.
Note, this PRNG is *not* used for cryptography in the kernel, but
rather as a fast PRNG for various randomizations, e.g. in the
networking code, or elsewhere for debugging purposes.
[*]: In order to generate some "sort of pseudo-randomness", since
get_random_bytes() is not yet available for us, we use jiffies and
initialize states s1 - s3 with a simple linear congruential generator
(LCG), that is x <- x * 69069; and derive s2, s3, from the 32bit
initialization from s1. So the above quote from [3] accounts only
for the time from core to late initcall, not afterwards.
[**] Single threaded run on MacBook Air w/ Intel Core i5-3317U
[1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps
[2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
[3] http://thread.gmane.org/gmane.comp.encryption.general/12103/
[4] http://code.google.com/p/dieharder/source/browse/trunk/libdieharder/diehard_sums.c?spec=svn490&r=490#20
[5] http://www.phy.duke.edu/~rgb/General/dieharder.php
Joint work with Hannes Frederic Sowa.
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11 15:20:36 +04:00
	state->s1 = TAUSWORTHE(state->s1,  6U, 13U, 4294967294U, 18U);
	state->s2 = TAUSWORTHE(state->s2,  2U, 27U, 4294967288U,  2U);
	state->s3 = TAUSWORTHE(state->s3, 13U, 21U, 4294967280U,  7U);
	state->s4 = TAUSWORTHE(state->s4,  3U, 12U, 4294967168U, 13U);
2006-10-17 11:09:42 +04:00

2013-11-11 15:20:36 +04:00
	return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4);
2006-10-17 11:09:42 +04:00
}
2012-12-18 04:04:23 +04:00
EXPORT_SYMBOL(prandom_u32_state);
2006-10-17 11:09:42 +04:00

2014-04-04 01:49:08 +04:00
/**
2012-12-18 04:04:25 +04:00
 *	prandom_bytes_state - get the requested number of pseudo-random bytes
 *
 *	@state: pointer to state structure holding seeded state.
 *	@buf: where to copy the pseudo-random bytes to
 *	@bytes: the requested number of bytes
 *
 *	This is used for pseudo-randomness with no outside seeding.
 *	For more random results, use prandom_bytes().
 */
2014-08-23 19:03:28 +04:00
void prandom_bytes_state(struct rnd_state *state, void *buf, size_t bytes)
2012-12-18 04:04:25 +04:00
{
2014-08-23 19:03:28 +04:00
	u8 *ptr = buf;
2012-12-18 04:04:25 +04:00

2014-08-23 19:03:28 +04:00
	while (bytes >= sizeof(u32)) {
		put_unaligned(prandom_u32_state(state), (u32 *) ptr);
		ptr += sizeof(u32);
		bytes -= sizeof(u32);
2012-12-18 04:04:25 +04:00
	}

2014-08-23 19:03:28 +04:00
	if (bytes > 0) {
		u32 rem = prandom_u32_state(state);
		do {
			*ptr++ = (u8) rem;
			bytes--;
			rem >>= BITS_PER_BYTE;
		} while (bytes > 0);
2012-12-18 04:04:25 +04:00
	}
}
EXPORT_SYMBOL(prandom_bytes_state);

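The fill loop above can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `fill_bytes()`, `next_u32_fn` and `counter_next()` are hypothetical names introduced for illustration, and `memcpy()` stands in for `put_unaligned()`, which is assumed here to store a 32-bit word at a possibly unaligned address in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generator callback: returns the next 32-bit word. */
typedef uint32_t (*next_u32_fn)(void *ctx);

/* Fill buf with whole 32-bit words first, then spread one last word
 * over the remaining 1..3 bytes, lowest byte first. */
static void fill_bytes(next_u32_fn next, void *ctx, void *buf, size_t bytes)
{
	uint8_t *ptr = buf;

	assert(buf != NULL || bytes == 0);

	while (bytes >= sizeof(uint32_t)) {
		uint32_t word = next(ctx);

		/* Assumption: memcpy() as a stand-in for put_unaligned(). */
		memcpy(ptr, &word, sizeof(word));
		ptr += sizeof(word);
		bytes -= sizeof(word);
	}

	if (bytes > 0) {
		uint32_t rem = next(ctx);

		do {
			*ptr++ = (uint8_t)rem;
			bytes--;
			rem >>= 8;
		} while (bytes > 0);
	}
}

/* Hypothetical stand-in generator: yields 0x04030201, 0x08070605, ... */
static uint32_t counter_next(void *ctx)
{
	uint32_t *n = ctx;
	uint32_t base = *n;

	*n += 4;
	return (base + 1) | ((base + 2) << 8) |
	       ((base + 3) << 16) | ((base + 4) << 24);
}
```

Because the remainder bytes are produced by shifting rather than by a memory copy, their order does not depend on the host's endianness, while the whole words do. That is one reason the MTD tests regenerate the expected data with the same seed on the same machine instead of comparing against stored constants.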
random32: upgrade taus88 generator to taus113 from errata paper
Since we use prandom*() functions quite often in networking code,
e.g. in UDP port selection, netfilter code, etc., upgrade the PRNG
from Pierre L'Ecuyer's original paper "Maximally Equidistributed
Combined Tausworthe Generators", Mathematics of Computation, 65,
213 (1996), 203--213 to the version published in his errata paper [1].
The Tausworthe generator is a maximally-equidistributed generator,
that is fast and has good statistical properties [1].
The version presented there upgrades the 3 state LFSR to a 4 state
LFSR with increased periodicity from about 2^88 to 2^113. The
algorithm is presented in [1] by the very same author who also
designed the original algorithm in [2].
Also, by increasing the state, we make it a bit harder for attackers
to "guess" the PRNG's internal state. See also discussion in [3].
Now, as we use this sort of weak initialization discussed in [3]
only between core_initcall() until late_initcall() time [*] for
prandom32*() users, namely in prandom_init(), it is less relevant
from late_initcall() onwards as we overwrite seeds through
prandom_reseed() anyway with a seed source of higher entropy, that
is, get_random_bytes(). In other words, an exhaustive key search of
96 bits would be needed. Now, with the help of this patch, this
state search increases further to 128 bits. Initialization needs
to make sure that s1 > 1, s2 > 7, s3 > 15, s4 > 127.
The taus88 and taus113 algorithms are also part of GSL. I added a test
case in the next patch to verify internal behaviour of this patch
with GSL and ran tests with the dieharder 3.31.1 RNG test suite:
$ dieharder -g 052 -a -m 10 -s 1 -S 4137730333 #taus88
$ dieharder -g 054 -a -m 10 -s 1 -S 4137730333 #taus113
With this seed configuration, in order to compare both, we get
the following differences:
algorithm                 taus88    taus113
rands/second [**]         1.61e+08  1.37e+08
sts_serial(4, 1st run)    WEAK      PASSED
sts_serial(9, 2nd run)    WEAK      PASSED
rgb_lagged_sum(31)        WEAK      PASSED
We took out diehard_sums test as according to the authors it is
considered broken and unusable [4]. Despite that and the slight
decrease in performance (which is acceptable), taus113 here passes
all 113 tests (only rgb_minimum_distance_5 in WEAK, the rest PASSED).
In general, taus/taus113 is considered "very good" by the authors
of dieharder [5].
The papers [1][2] state that a single warm-up step is sufficient by
running quicktaus once on each state to ensure proper initialization
of ~s_{0}:
Our selection of (s) according to Table 1 of [1] row 1 holds the
condition L - k <= r - s, that is,
(32 32 32 32) - (31 29 28 25) <= (25 27 15 22) - (18 2 7 13)
with r = k - q and q = (6 2 13 3) as also stated by the paper.
So according to [2] we are safe with one round of quicktaus for
initialization. However we decided to include the warm-up phase
of the PRNG as done in GSL in every case as a safety net. We also
use the warm up phase to make the output of the RNG easier to
verify by the GSL output.
In prandom_init(), we also mix random_get_entropy() into it, just
like drivers/char/random.c does it, jiffies ^ random_get_entropy().
random_get_entropy() is get_cycles(). xor is entropy preserving so
it is fine if it is not implemented by some architectures.
Note, this PRNG is *not* used for cryptography in the kernel, but
rather as a fast PRNG for various randomizations, e.g. in the
networking code, or elsewhere for debugging purposes.
[*]: In order to generate some "sort of pseudo-randomness", since
get_random_bytes() is not yet available for us, we use jiffies and
initialize states s1 - s3 with a simple linear congruential generator
(LCG), that is x <- x * 69069; and derive s2, s3, from the 32bit
initialization from s1. So the above quote from [3] accounts only
for the time from core to late initcall, not afterwards.
[**] Single threaded run on MacBook Air w/ Intel Core i5-3317U
[1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps
[2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
[3] http://thread.gmane.org/gmane.comp.encryption.general/12103/
[4] http://code.google.com/p/dieharder/source/browse/trunk/libdieharder/diehard_sums.c?spec=svn490&r=490#20
[5] http://www.phy.duke.edu/~rgb/General/dieharder.php
Joint work with Hannes Frederic Sowa.
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-11 15:20:36 +04:00
static void prandom_warmup(struct rnd_state *state)
{
2014-08-23 19:03:28 +04:00
	/* Calling RNG ten times to satisfy recurrence condition */
2013-11-11 15:20:36 +04:00
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
	prandom_u32_state(state);
}

2015-10-08 02:20:38 +03:00
void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state)
2015-10-08 02:20:37 +03:00
{
	int i;

	for_each_possible_cpu(i) {
		struct rnd_state *state = per_cpu_ptr(pcpu_state, i);
		u32 seeds[4];

		get_random_bytes(&seeds, sizeof(seeds));
		state->s1 = __seed(seeds[0], 2U);
		state->s2 = __seed(seeds[1], 8U);
		state->s3 = __seed(seeds[2], 16U);
		state->s4 = __seed(seeds[3], 128U);

		prandom_warmup(state);
	}
}
2016-02-16 19:24:08 +03:00
EXPORT_SYMBOL(prandom_seed_full_state);
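The seeding above relies on each register meeting a minimum value (s1 > 1, s2 > 7, s3 > 15, s4 > 127). The definition of `__seed()` is not part of this section, so the sketch below is an assumption about its behaviour: a value below the minimum is bumped by adding the minimum, anything else passes through unchanged. `seed_min()`, `seed_full()` and `struct rnd_state_sketch` are hypothetical user-space names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the four-register generator state. */
struct rnd_state_sketch {
	uint32_t s1, s2, s3, s4;
};

/* Assumed behaviour of __seed(): lift values below the minimum m. */
static uint32_t seed_min(uint32_t x, uint32_t m)
{
	return (x < m) ? x + m : x;
}

/* Seed all four registers from four independent 32-bit words. */
static void seed_full(struct rnd_state_sketch *st, const uint32_t seeds[4])
{
	assert(st && seeds);

	st->s1 = seed_min(seeds[0], 2U);
	st->s2 = seed_min(seeds[1], 8U);
	st->s3 = seed_min(seeds[2], 16U);
	st->s4 = seed_min(seeds[3], 128U);
}
```

With this rule an all-zero input still yields a usable state (2, 8, 16, 128), which matters because a Tausworthe component whose significant bits are all zero would stay at zero forever.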
2015-10-08 02:20:37 +03:00
2013-11-11 15:20:37 +04:00
#ifdef CONFIG_RANDOM32_SELFTEST
static struct prandom_test1 {
	u32 seed;
	u32 result;
} test1[] = {
	{ 1U, 3484351685U },
	{ 2U, 2623130059U },
	{ 3U, 3125133893U },
	{ 4U, 984847254U },
};

static struct prandom_test2 {
	u32 seed;
	u32 iteration;
	u32 result;
} test2[] = {
	/* Test cases against taus113 from GSL library. */
	{ 931557656U, 959U, 2975593782U },
	{ 1339693295U, 876U, 3887776532U },
	{ 1545556285U, 961U, 1615538833U },
	{ 601730776U, 723U, 1776162651U },
	{ 1027516047U, 687U, 511983079U },
	{ 416526298U, 700U, 916156552U },
	{ 1395522032U, 652U, 2222063676U },
	{ 366221443U, 617U, 2992857763U },
	{ 1539836965U, 714U, 3783265725U },
	{ 556206671U, 994U, 799626459U },
	{ 684907218U, 799U, 367789491U },
	{ 2121230701U, 931U, 2115467001U },
	{ 1668516451U, 644U, 3620590685U },
	{ 768046066U, 883U, 2034077390U },
	{ 1989159136U, 833U, 1195767305U },
	{ 536585145U, 996U, 3577259204U },
	{ 1008129373U, 642U, 1478080776U },
	{ 1740775604U, 939U, 1264980372U },
	{ 1967883163U, 508U, 10734624U },
	{ 1923019697U, 730U, 3821419629U },
	{ 442079932U, 560U, 3440032343U },
	{ 1961302714U, 845U, 841962572U },
	{ 2030205964U, 962U, 1325144227U },
	{ 1160407529U, 507U, 240940858U },
	{ 635482502U, 779U, 4200489746U },
	{ 1252788931U, 699U, 867195434U },
	{ 1961817131U, 719U, 668237657U },
	{ 1071468216U, 983U, 917876630U },
	{ 1281848367U, 932U, 1003100039U },
	{ 582537119U, 780U, 1127273778U },
	{ 1973672777U, 853U, 1071368872U },
	{ 1896756996U, 762U, 1127851055U },
	{ 847917054U, 500U, 1717499075U },
	{ 1240520510U, 951U, 2849576657U },
	{ 1685071682U, 567U, 1961810396U },
	{ 1516232129U, 557U, 3173877U },
	{ 1208118903U, 612U, 1613145022U },
	{ 1817269927U, 693U, 4279122573U },
	{ 1510091701U, 717U, 638191229U },
	{ 365916850U, 807U, 600424314U },
	{ 399324359U, 702U, 1803598116U },
	{ 1318480274U, 779U, 2074237022U },
	{ 697758115U, 840U, 1483639402U },
	{ 1696507773U, 840U, 577415447U },
	{ 2081979121U, 981U, 3041486449U },
	{ 955646687U, 742U, 3846494357U },
	{ 1250683506U, 749U, 836419859U },
	{ 595003102U, 534U, 366794109U },
	{ 47485338U, 558U, 3521120834U },
	{ 619433479U, 610U, 3991783875U },
	{ 704096520U, 518U, 4139493852U },
	{ 1712224984U, 606U, 2393312003U },
	{ 1318233152U, 922U, 3880361134U },
	{ 855572992U, 761U, 1472974787U },
	{ 64721421U, 703U, 683860550U },
	{ 678931758U, 840U, 380616043U },
	{ 692711973U, 778U, 1382361947U },
	{ 677703619U, 530U, 2826914161U },
	{ 92393223U, 586U, 1522128471U },
	{ 1222592920U, 743U, 3466726667U },
	{ 358288986U, 695U, 1091956998U },
	{ 1935056945U, 958U, 514864477U },
	{ 735675993U, 990U, 1294239989U },
	{ 1560089402U, 897U, 2238551287U },
	{ 70616361U, 829U, 22483098U },
	{ 368234700U, 731U, 2913875084U },
	{ 20221190U, 879U, 1564152970U },
	{ 539444654U, 682U, 1835141259U },
	{ 1314987297U, 840U, 1801114136U },
	{ 2019295544U, 645U, 3286438930U },
	{ 469023838U, 716U, 1637918202U },
	{ 1843754496U, 653U, 2562092152U },
	{ 400672036U, 809U, 4264212785U },
	{ 404722249U, 965U, 2704116999U },
	{ 600702209U, 758U, 584979986U },
	{ 519953954U, 667U, 2574436237U },
	{ 1658071126U, 694U, 2214569490U },
	{ 420480037U, 749U, 3430010866U },
	{ 690103647U, 969U, 3700758083U },
	{ 1029424799U, 937U, 3787746841U },
	{ 2012608669U, 506U, 3362628973U },
	{ 1535432887U, 998U, 42610943U },
	{ 1330635533U, 857U, 3040806504U },
	{ 1223800550U, 539U, 3954229517U },
	{ 1322411537U, 680U, 3223250324U },
	{ 1877847898U, 945U, 2915147143U },
	{ 1646356099U, 874U, 965988280U },
	{ 805687536U, 744U, 4032277920U },
	{ 1948093210U, 633U, 1346597684U },
	{ 392609744U, 783U, 1636083295U },
	{ 690241304U, 770U, 1201031298U },
	{ 1360302965U, 696U, 1665394461U },
	{ 1220090946U, 780U, 1316922812U },
	{ 447092251U, 500U, 3438743375U },
	{ 1613868791U, 592U, 828546883U },
	{ 523430951U, 548U, 2552392304U },
	{ 726692899U, 810U, 1656872867U },
	{ 1364340021U, 836U, 3710513486U },
	{ 1986257729U, 931U, 935013962U },
	{ 407983964U, 921U, 728767059U },
};

2020-08-09 09:57:44 +03:00
static u32 __extract_hwseed(void)
{
	unsigned int val = 0;

	(void)(arch_get_random_seed_int(&val) ||
	       arch_get_random_int(&val));

	return val;
}

static void prandom_seed_early(struct rnd_state *state, u32 seed,
			       bool mix_with_hwseed)
{
#define LCG(x) ((x) * 69069U) /* super-duper LCG */
#define HWSEED() (mix_with_hwseed ? __extract_hwseed() : 0)
	state->s1 = __seed(HWSEED() ^ LCG(seed), 2U);
	state->s2 = __seed(HWSEED() ^ LCG(state->s1), 8U);
	state->s3 = __seed(HWSEED() ^ LCG(state->s2), 16U);
	state->s4 = __seed(HWSEED() ^ LCG(state->s3), 128U);
}

static int __init prandom_state_selftest(void)
2013-11-11 15:20:37 +04:00
{
	int i, j, errors = 0, runs = 0;
	bool error = false;

	for (i = 0; i < ARRAY_SIZE(test1); i++) {
		struct rnd_state state;

random32: mix in entropy from core to late initcall
Currently, we have a 3-stage seeding process in prandom():
Phase 1 is from the early actual initialization of prandom()
subsystem which happens during core_initcall() and remains
most likely until the beginning of late_initcall() phase.
Here, the system might not have enough entropy available
for seeding with strong randomness from the random driver.
That means, we currently have a 32bit weak LCG() seeding
the PRNG status register 1 and mixing that successively
into the other 3 registers just to get it up and running.
Phase 2 starts with late_initcall() phase resp. when the
random driver has initialized its non-blocking pool with
enough entropy. At that time, we throw away *all* inner
state from its 4 registers and do a full reseed with strong
randomness.
Phase 3 starts right after that and does a periodic reseed
with random slack of status register 1 by a strong random
source again.
A problem in phase 1 is that during bootup data structures
can be initialized, e.g. on module load time, and thus access
a weakly seeded prandom and are never changed for the rest
of their lifetime, thus carrying along the results from a
weak seed. Let's make sure that current but also future users
access a possibly better early seeded prandom.
This patch therefore improves phase 1 by trying to make it
more 'unpredictable' through mixing in seed from a possible
hardware source. Now, the mix-in xors inner state with the
outcome of either of the two functions arch_get_random_{,seed}_int(),
preferably arch_get_random_seed_int() as it likely represents
a non-deterministic random bit generator in hw rather than
a cryptographically secure PRNG in hw. However, not all might
have the first one, so we use the PRNG as a fallback if
available. As we xor the seed into the current state, the
worst case would be that a hardware source could be unverifiably
compromised or backdoored. In that case nevertheless it
would be as good as our original early seeding function
prandom_seed_very_weak() since we mix through xor which is
entropy preserving.
Joint work with Daniel Borkmann.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
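The claim that mixing through xor cannot make the early seed worse can be illustrated with a small user-space sketch. This is only an illustration under stated assumptions: `lcg()`, `mix_early()` and `mix_early_selfcheck()` are hypothetical names, and the hardware word is passed in as a plain argument rather than read from `arch_get_random_seed_int()`.

```c
#include <assert.h>
#include <stdint.h>

/* Same multiplier as the LCG() macro used for early seeding. */
static uint32_t lcg(uint32_t x)
{
	return x * 69069U;
}

/* XOR a hardware-provided word into the LCG-derived value. */
static uint32_t mix_early(uint32_t seed, uint32_t hw_word)
{
	return hw_word ^ lcg(seed);
}

/* For any fixed hw_word the mapping is invertible, so a constant or
 * attacker-chosen hw_word cannot collapse distinct LCG values. */
static void mix_early_selfcheck(void)
{
	const uint32_t hw = 0xdeadbeefU;

	/* No hardware source (word of 0): plain LCG output. */
	assert(mix_early(1U, 0U) == 69069U);
	/* XORing the same word again recovers the LCG output. */
	assert((mix_early(1U, hw) ^ hw) == 69069U);
}
```

The sketch only covers the case where the hardware word does not depend on the software seed; that independence is what the "entropy preserving" argument in the message assumes.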
2014-07-28 16:01:38 +04:00
		prandom_seed_early(&state, test1[i].seed, false);
2013-11-11 15:20:37 +04:00
		prandom_warmup(&state);

		if (test1[i].result != prandom_u32_state(&state))
			error = true;
	}

	if (error)
		pr_warn("prandom: seed boundary self test failed\n");
	else
		pr_info("prandom: seed boundary self test passed\n");

	for (i = 0; i < ARRAY_SIZE(test2); i++) {
		struct rnd_state state;

2014-07-28 16:01:38 +04:00
		prandom_seed_early(&state, test2[i].seed, false);
2013-11-11 15:20:37 +04:00
		prandom_warmup(&state);

		for (j = 0; j < test2[i].iteration - 1; j++)
			prandom_u32_state(&state);

		if (test2[i].result != prandom_u32_state(&state))
			errors++;

		runs++;
		cond_resched();
	}

	if (errors)
		pr_warn("prandom: %d/%d self tests failed\n", errors, runs);
	else
		pr_info("prandom: %d self tests passed\n", runs);
2020-08-09 09:57:44 +03:00
	return 0;
2013-11-11 15:20:37 +04:00
}
2020-08-09 09:57:44 +03:00
core_initcall(prandom_state_selftest);
2013-11-11 15:20:37 +04:00
#endif
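The self-test above can be approximated in user space. `prandom_u32_state()` is not shown in this section, so the sketch below assumes it follows L'Ecuyer's taus113 as implemented in GSL, with the parameters given in the taus113 commit message (q = 6 2 13 3, s = 18 2 7 13, k = 31 29 28 25). All names (`taus113_next()`, `taus113_seed_early()`, `taus113_check()` and so on) are hypothetical, and the hardware seed mix-in is left out, matching the `false` argument the self-test passes.

```c
#include <assert.h>
#include <stdint.h>

struct taus113_state {
	uint32_t s1, s2, s3, s4;
};

/* One component step: keep the top k bits, shift, fold in feedback. */
static uint32_t tausworthe(uint32_t s, unsigned int a, unsigned int b,
			   uint32_t c, unsigned int d)
{
	return ((s & c) << d) ^ (((s << a) ^ s) >> b);
}

/* Advance all four components and combine them into one output word. */
static uint32_t taus113_next(struct taus113_state *st)
{
	st->s1 = tausworthe(st->s1, 6U, 13U, 4294967294U, 18U);
	st->s2 = tausworthe(st->s2, 2U, 27U, 4294967288U, 2U);
	st->s3 = tausworthe(st->s3, 13U, 21U, 4294967280U, 7U);
	st->s4 = tausworthe(st->s4, 3U, 12U, 4294967168U, 13U);
	return st->s1 ^ st->s2 ^ st->s3 ^ st->s4;
}

/* Assumed behaviour of __seed(): lift values below the minimum m. */
static uint32_t seed_min(uint32_t x, uint32_t m)
{
	return (x < m) ? x + m : x;
}

/* Early seeding without hardware mix-in: chain the LCG x * 69069. */
static void taus113_seed_early(struct taus113_state *st, uint32_t seed)
{
	st->s1 = seed_min(seed * 69069U, 2U);
	st->s2 = seed_min(st->s1 * 69069U, 8U);
	st->s3 = seed_min(st->s2 * 69069U, 16U);
	st->s4 = seed_min(st->s3 * 69069U, 128U);
}

/* Ten discarded draws, as in prandom_warmup(). */
static void taus113_warmup(struct taus113_state *st)
{
	int i;

	for (i = 0; i < 10; i++)
		taus113_next(st);
}

/* Returns 1 if draw number `iteration` after warm-up equals `expected`. */
static int taus113_check(uint32_t seed, uint32_t iteration, uint32_t expected)
{
	struct taus113_state st;
	uint32_t i, out = 0;

	assert(iteration > 0);

	taus113_seed_early(&st, seed);
	taus113_warmup(&st);
	for (i = 0; i < iteration; i++)
		out = taus113_next(&st);

	return out == expected;
}
```

For an entry of `test1`, the call would be `taus113_check(seed, 1U, result)`; for an entry of `test2`, `taus113_check(seed, iteration, result)`. If the assumptions above hold, both tables should be reproduced; a mismatch would point at a difference between this sketch and the kernel's generator rather than at the tables.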
2020-08-09 09:57:44 +03:00
/*
 * The prandom_u32() implementation is now completely separate from the
 * prandom_state() functions, which are retained (for now) for compatibility.
 *
 * Because of (ab)use in the networking code for choosing random TCP/UDP port
 * numbers, which open DoS possibilities if guessable, we want something
 * stronger than a standard PRNG. But the performance requirements of
 * the network code do not allow robust crypto for this application.
 *
 * So this is a homebrew Junior Spaceman implementation, based on the
 * lowest-latency trustworthy crypto primitive available, SipHash.
 * (The authors of SipHash have not been consulted about this abuse of
 * their work.)
 *
 * Standard SipHash-2-4 uses 2n+4 rounds to hash n words of input to
 * one word of output. This abbreviated version uses 2 rounds per word
 * of output.
 */

struct siprand_state {
	unsigned long v0;
	unsigned long v1;
	unsigned long v2;
	unsigned long v3;
};

static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
random32: add noise from network and scheduling activity
With the removal of the interrupt perturbations in previous random32
change (random32: make prandom_u32() output unpredictable), the PRNG
has become 100% deterministic again. While SipHash is expected to be
way more robust against brute force than the previous Tausworthe LFSR,
there's still the risk that whoever has even one temporary access to
the PRNG's internal state is able to predict all subsequent draws till
the next reseed (roughly every minute). This may happen through a side
channel attack or any data leak.
This patch restores the spirit of commit f227e3ec3b5c ("random32: update
the net random state on interrupt and activity") in that it will perturb
the internal PRNG's state using externally collected noise, except that
it will not pick that noise from the random pool's bits nor upon
interrupt, but will rather combine a few elements along the Tx path
that are collectively hard to predict, such as dev, skb and txq
pointers, packet length and jiffies values. These are combined
using a single round of SipHash into a single long variable that is
mixed with the net_rand_state upon each invocation.
The operation was inlined because it produces very small and efficient
code, typically 3 xor, 2 add and 2 rol. The performance was measured
to be the same (even very slightly better) than before the switch to
SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC
(i40e), the connection rate dropped from 556k/s to 555k/s while the
SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps.
Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
Cc: George Spelvin <lkml@sdf.org>
Cc: Amit Klein <aksecurity@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: tytso@mit.edu
Cc: Florian Westphal <fw@strlen.de>
Cc: Marc Plumb <lkml.mplumb@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2020-08-10 11:27:42 +03:00
DEFINE_PER_CPU(unsigned long, net_rand_noise);
EXPORT_PER_CPU_SYMBOL(net_rand_noise);
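The abbreviated SipHash generator that the comments in this file describe (two rounds per output word, output taken as v1 + v3) can be sketched in user space as follows. This is a sketch under assumptions, not the kernel implementation: it uses fixed 64-bit words and the standard SipHash round rotations (13, 16, 21, 17, 32), and the way `noise` is folded in (xor into v3 before the rounds, xor into v0 afterwards, like a SipHash message word) is an assumption. `struct siprand_sketch`, `sipround()` and `siprand_draw()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit mirror of the four-word generator state. */
struct siprand_sketch {
	uint64_t v0, v1, v2, v3;
};

static uint64_t rol64(uint64_t x, unsigned int r)
{
	assert(r > 0 && r < 64);
	return (x << r) | (x >> (64 - r));
}

/* One standard SipHash round; note that it ends with "v3 ^= v0" on v3
 * and "v1 ^= v2" on v1, which is why v0^v1^v2^v3 would be wasteful. */
static void sipround(struct siprand_sketch *s)
{
	s->v0 += s->v1; s->v1 = rol64(s->v1, 13); s->v1 ^= s->v0;
	s->v0 = rol64(s->v0, 32);
	s->v2 += s->v3; s->v3 = rol64(s->v3, 16); s->v3 ^= s->v2;
	s->v0 += s->v3; s->v3 = rol64(s->v3, 21); s->v3 ^= s->v0;
	s->v2 += s->v1; s->v1 = rol64(s->v1, 17); s->v1 ^= s->v2;
	s->v2 = rol64(s->v2, 32);
}

/* Draw one 32-bit value: two rounds, then return v1 + v3. */
static uint32_t siprand_draw(struct siprand_sketch *s, uint64_t noise)
{
	s->v3 ^= noise;		/* assumption: noise enters like a message word */
	sipround(s);
	sipround(s);
	s->v0 ^= noise;
	return (uint32_t)(s->v1 + s->v3);
}
```

One property worth noting is that the all-zero state with zero noise is a fixed point of the round function, so a real generator has to be seeded with non-zero state before use; in this sketch that is the caller's responsibility.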
2020-08-09 09:57:44 +03:00
/*
 * This is the core CPRNG function. As "pseudorandom", this is not used
 * for truly valuable things, just intended to be a PITA to guess.
 * For maximum speed, we do just two SipHash rounds per word. This is
 * the same rate as 4 rounds per 64 bits that SipHash normally uses,
 * so hopefully it's reasonably secure.
 *
 * There are two changes from the official SipHash finalization:
 * - We omit some constants XORed with v2 in the SipHash spec as irrelevant;
 *   they are there only to make the output rounds distinct from the input
 *   rounds, and this application has no input rounds.
 * - Rather than returning v0^v1^v2^v3, return v1+v3.
 *   If you look at the SipHash round, the last operation on v3 is
 *   "v3 ^= v0", so "v0 ^ v3" just undoes that, a waste of time.
 *   Likewise "v1 ^= v2". (The rotate of v2 makes a difference, but
 *   it still cancels out half of the bits in v2 for no benefit.)
 *   Second, since the last combining operation was xor, continue the
 *   pattern of alternating xor/add for a tiny bit of extra non-linearity.
 */
static inline u32 siprand_u32(struct siprand_state *s)
{
	unsigned long v0 = s->v0, v1 = s->v1, v2 = s->v2, v3 = s->v3;
random32: add noise from network and scheduling activity
With the removal of the interrupt perturbations in the previous random32
change (random32: make prandom_u32() output unpredictable), the PRNG
has become 100% deterministic again. While SipHash is expected to be
far more robust against brute force than the previous Tausworthe LFSR,
there's still the risk that whoever gains even temporary access to
the PRNG's internal state is able to predict all subsequent draws until
the next reseed (roughly every minute). This may happen through a side
channel attack or any data leak.
This patch restores the spirit of commit f227e3ec3b5c ("random32: update
the net random state on interrupt and activity") in that it perturbs
the PRNG's internal state using externally collected noise, except that
it does not pick that noise from the random pool's bits nor upon
interrupt, but rather combines a few elements along the Tx path
that are collectively hard to predict, such as dev, skb and txq
pointers, packet length and jiffies values. These are combined
using a single round of SipHash into a single long variable that is
mixed with the net_rand_state upon each invocation.
The operation was inlined because it produces very small and efficient
code, typically 3 xor, 2 add and 2 rol. Performance was measured
to be the same as (even very slightly better than) before the switch to
SipHash; on a 6-core 12-thread Core i7-8700k equipped with a 40G NIC
(i40e), the connection rate dropped from 556k/s to 555k/s while the
SYN cookie rate grew from 5.38 Mpps to 5.45 Mpps.
Link: https://lore.kernel.org/netdev/20200808152628.GA27941@SDF.ORG/
Cc: George Spelvin <lkml@sdf.org>
Cc: Amit Klein <aksecurity@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: tytso@mit.edu
Cc: Florian Westphal <fw@strlen.de>
Cc: Marc Plumb <lkml.mplumb@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
2020-08-10 11:27:42 +03:00
	unsigned long n = raw_cpu_read(net_rand_noise);
2020-08-09 09:57:44 +03:00
2020-08-10 11:27:42 +03:00
	v3 ^= n;
2020-08-09 09:57:44 +03:00
	PRND_SIPROUND(v0, v1, v2, v3);
	PRND_SIPROUND(v0, v1, v2, v3);
2020-08-10 11:27:42 +03:00
	v0 ^= n;
2020-08-09 09:57:44 +03:00
	s->v0 = v0; s->v1 = v1; s->v2 = v2; s->v3 = v3;
	return v1 + v3;
}
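
The generation step above can be tried outside the kernel. What follows is a minimal userspace sketch, not the kernel implementation: it assumes a 64-bit word and that PRND_SIPROUND (defined outside this file) performs one standard SipHash round. The names sip_state, rotl64, sip_round and sip_next_u32 are hypothetical, and the noise word is passed as a parameter instead of being read from the per-CPU net_rand_noise variable.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct siprand_state. */
struct sip_state {
	uint64_t v0, v1, v2, v3;
};

/* Rotate left; only called with 0 < r < 64 in this sketch. */
static inline uint64_t rotl64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

/* One standard SipHash round, assumed to match PRND_SIPROUND on 64-bit. */
static inline void sip_round(uint64_t *v0, uint64_t *v1,
			     uint64_t *v2, uint64_t *v3)
{
	*v0 += *v1; *v1 = rotl64(*v1, 13); *v1 ^= *v0; *v0 = rotl64(*v0, 32);
	*v2 += *v3; *v3 = rotl64(*v3, 16); *v3 ^= *v2;
	*v0 += *v3; *v3 = rotl64(*v3, 21); *v3 ^= *v0;
	*v2 += *v1; *v1 = rotl64(*v1, 17); *v1 ^= *v2; *v2 = rotl64(*v2, 32);
}

/*
 * Same sequence as siprand_u32(): XOR the noise word into v3, run two
 * rounds, XOR the noise into v0, store the state back and return the
 * low 32 bits of v1 + v3.
 */
static uint32_t sip_next_u32(struct sip_state *s, uint64_t noise)
{
	uint64_t v0 = s->v0, v1 = s->v1, v2 = s->v2, v3 = s->v3;

	v3 ^= noise;
	sip_round(&v0, &v1, &v2, &v3);
	sip_round(&v0, &v1, &v2, &v3);
	v0 ^= noise;
	s->v0 = v0; s->v1 = v1; s->v2 = v2; s->v3 = v3;
	return (uint32_t)(v1 + v3);
}
```

Two states seeded identically and fed the same noise word produce the same output, which is why the unpredictability of the noise matters. A 32-bit build would need a different round function, which this sketch does not cover.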

/**
 * prandom_u32 - pseudo random number generator
 *
 * A 32 bit pseudo-random number is generated using a fast
 * algorithm suitable for simulation. This algorithm is NOT
 * considered safe for cryptographic use.
 */
u32 prandom_u32(void)
{
	struct siprand_state *state = get_cpu_ptr(&net_rand_state);
	u32 res = siprand_u32(state);

	trace_prandom_u32(res);
	put_cpu_ptr(&net_rand_state);
	return res;
}
EXPORT_SYMBOL(prandom_u32);

/**
 * prandom_bytes - get the requested number of pseudo-random bytes
 * @buf: where to copy the pseudo-random bytes to
 * @bytes: the requested number of bytes
 */
void prandom_bytes(void *buf, size_t bytes)
{
	struct siprand_state *state = get_cpu_ptr(&net_rand_state);
	u8 *ptr = buf;

	while (bytes >= sizeof(u32)) {
		put_unaligned(siprand_u32(state), (u32 *)ptr);
		ptr += sizeof(u32);
		bytes -= sizeof(u32);
	}

	if (bytes > 0) {
		u32 rem = siprand_u32(state);

		do {
			*ptr++ = (u8)rem;
			rem >>= BITS_PER_BYTE;
		} while (--bytes > 0);
	}
	put_cpu_ptr(&net_rand_state);
}
EXPORT_SYMBOL(prandom_bytes);
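
The word-then-tail filling pattern of prandom_bytes() can be illustrated in userspace. This is a sketch under stated assumptions: fill_bytes_sketch, u32_source_fn and counter_source are hypothetical names, memcpy() stands in for put_unaligned(), and a toy counter replaces siprand_u32() so that the result is easy to inspect.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generator callback standing in for siprand_u32(). */
typedef uint32_t (*u32_source_fn)(void *ctx);

static void fill_bytes_sketch(void *buf, size_t bytes,
			      u32_source_fn next, void *ctx)
{
	uint8_t *ptr = buf;

	/* Whole 32-bit words: memcpy plays the role of put_unaligned(). */
	while (bytes >= sizeof(uint32_t)) {
		uint32_t word = next(ctx);

		memcpy(ptr, &word, sizeof(word));
		ptr += sizeof(word);
		bytes -= sizeof(word);
	}

	/* Tail of 1..3 bytes: draw one more word, emit its low bytes first. */
	if (bytes > 0) {
		uint32_t rem = next(ctx);

		do {
			*ptr++ = (uint8_t)rem;
			rem >>= 8;
		} while (--bytes > 0);
	}
}

/* Toy deterministic source for demonstration only: returns 1, 2, 3, ... */
static uint32_t counter_source(void *ctx)
{
	uint32_t *counter = ctx;

	return ++*counter;
}
```

Filling six bytes with counter_source draws two words: the first is copied whole and the second supplies the two trailing bytes, with its unused upper bytes discarded.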

/**
 * prandom_seed - add entropy to pseudo random number generator
 * @entropy: entropy value
 *
 * Add some additional seed material to the prandom pool.
 * The "entropy" is actually our IP address (the only caller is
 * the network code), not for unpredictability, but to ensure that
 * different machines are initialized differently.
 */
void prandom_seed(u32 entropy)
{
	int i;

	add_device_randomness(&entropy, sizeof(entropy));

	for_each_possible_cpu(i) {
		struct siprand_state *state = per_cpu_ptr(&net_rand_state, i);
		unsigned long v0 = state->v0, v1 = state->v1;
		unsigned long v2 = state->v2, v3 = state->v3;

		do {
			v3 ^= entropy;
			PRND_SIPROUND(v0, v1, v2, v3);
			PRND_SIPROUND(v0, v1, v2, v3);
			v0 ^= entropy;
		} while (unlikely(!v0 || !v1 || !v2 || !v3));

		WRITE_ONCE(state->v0, v0);
		WRITE_ONCE(state->v1, v1);
		WRITE_ONCE(state->v2, v2);
		WRITE_ONCE(state->v3, v3);
	}
}
EXPORT_SYMBOL(prandom_seed);
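
The retry loop in prandom_seed() refuses any state that contains a zero word. One way to see why is that an all-zero state is a fixed point of the round function: additions, rotations and XORs of zeros stay zero. The sketch below shows this in userspace, assuming a 64-bit word and a standard SipHash round in place of PRND_SIPROUND; all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate left; only called with 0 < r < 64 in this sketch. */
static inline uint64_t rol64_sketch(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

/* One standard SipHash round over a four-word state v[0..3]. */
static void sipround_sketch(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rol64_sketch(v[1], 13);
	v[1] ^= v[0]; v[0] = rol64_sketch(v[0], 32);
	v[2] += v[3]; v[3] = rol64_sketch(v[3], 16);
	v[3] ^= v[2];
	v[0] += v[3]; v[3] = rol64_sketch(v[3], 21);
	v[3] ^= v[0];
	v[2] += v[1]; v[1] = rol64_sketch(v[1], 17);
	v[1] ^= v[2]; v[2] = rol64_sketch(v[2], 32);
}

/* True if any state word is zero, the condition prandom_seed() retries on. */
static bool has_zero_word(const uint64_t v[4])
{
	return !v[0] || !v[1] || !v[2] || !v[3];
}

/* Mix `entropy` into one state the way the per-CPU loop above does. */
static void seed_mix_sketch(uint64_t v[4], uint32_t entropy)
{
	do {
		v[3] ^= entropy;
		sipround_sketch(v);
		sipround_sketch(v);
		v[0] ^= entropy;
	} while (has_zero_word(v));
}
```

In this sketch, calling seed_mix_sketch() on an all-zero state with zero entropy would never terminate, because the state cannot leave the fixed point. Callers are assumed to start from a state that has already been seeded.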

/*
 * Generate some initially weak seeding values to allow
 * the prandom_u32() engine to be started.
 */
static int __init prandom_init_early(void)
{
	int i;
	unsigned long v0, v1, v2, v3;

	if (!arch_get_random_long(&v0))
		v0 = jiffies;
	if (!arch_get_random_long(&v1))
		v1 = random_get_entropy();
	v2 = v0 ^ PRND_K0;
	v3 = v1 ^ PRND_K1;

	for_each_possible_cpu(i) {
		struct siprand_state *state;

		v3 ^= i;
		PRND_SIPROUND(v0, v1, v2, v3);
		PRND_SIPROUND(v0, v1, v2, v3);
		v0 ^= i;

		state = per_cpu_ptr(&net_rand_state, i);
		state->v0 = v0; state->v1 = v1;
		state->v2 = v2; state->v3 = v3;
	}

	return 0;
}
core_initcall(prandom_init_early);


/* Stronger reseeding when available, and periodically thereafter. */
static void prandom_reseed(struct timer_list *unused);

static DEFINE_TIMER(seed_timer, prandom_reseed);

static void prandom_reseed(struct timer_list *unused)
{
	unsigned long expires;
	int i;

	/*
	 * Reinitialize each CPU's PRNG with 128 bits of key.
	 * No locking on the CPUs, but then somewhat random results are,
	 * well, expected.
	 */
	for_each_possible_cpu(i) {
		struct siprand_state *state;
		unsigned long v0 = get_random_long(), v2 = v0 ^ PRND_K0;
		unsigned long v1 = get_random_long(), v3 = v1 ^ PRND_K1;
#if BITS_PER_LONG == 32
		int j;

		/*
		 * On 32-bit machines, hash in two extra words to
		 * approximate 128-bit key length. Not that the hash
		 * has that much security, but this prevents a trivial
		 * 64-bit brute force.
		 */
		for (j = 0; j < 2; j++) {
			unsigned long m = get_random_long();

			v3 ^= m;
			PRND_SIPROUND(v0, v1, v2, v3);
			PRND_SIPROUND(v0, v1, v2, v3);
			v0 ^= m;
		}
#endif
		/*
		 * Probably impossible in practice, but there is a
		 * theoretical risk that a race between this reseeding
		 * and the target CPU writing its state back could
		 * create the all-zero SipHash fixed point.
		 *
		 * To ensure that never happens, ensure the state
		 * we write contains no zero words.
		 */
		state = per_cpu_ptr(&net_rand_state, i);
		WRITE_ONCE(state->v0, v0 ? v0 : -1ul);
		WRITE_ONCE(state->v1, v1 ? v1 : -1ul);
		WRITE_ONCE(state->v2, v2 ? v2 : -1ul);
		WRITE_ONCE(state->v3, v3 ? v3 : -1ul);
	}

	/* reseed every ~60 seconds, in [40 .. 80) interval with slack */
	expires = round_jiffies(jiffies + 40 * HZ + prandom_u32_max(40 * HZ));
	mod_timer(&seed_timer, expires);
}
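
The "[40 .. 80) interval" in the reseed timer comes from adding a fixed 40 seconds to a random offset below 40 seconds. The arithmetic can be sketched as follows, assuming that prandom_u32_max() scales a 32-bit value into a range with a multiply-and-shift (it is defined outside this file, so treat that as an assumption). The helper names are hypothetical, and the rounding done by round_jiffies() is left out.

```c
#include <stdint.h>

/* Scale a 32-bit random value into [0, bound) with a multiply-shift. */
static uint32_t scale_u32(uint32_t rnd, uint32_t bound)
{
	return (uint32_t)(((uint64_t)rnd * bound) >> 32);
}

/*
 * Delay in ticks before the next reseed, for a tick rate of hz ticks
 * per second: 40 seconds plus a random offset in [0, 40) seconds.
 */
static uint32_t reseed_delay_ticks(uint32_t rnd, uint32_t hz)
{
	return 40 * hz + scale_u32(rnd, 40 * hz);
}
```

With hz = 100, the smallest random value gives a delay of 4000 ticks (40 s) and the largest gives 7999 ticks, just under 80 s, so the average lands near the 60 seconds mentioned in the comment.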

/*
 * The random ready callback can be called from almost any interrupt.
 * To avoid worrying about whether it's safe to delay that interrupt
 * long enough to seed all CPUs, just schedule an immediate timer event.
 */
static void prandom_timer_start(struct random_ready_callback *unused)
{
	mod_timer(&seed_timer, jiffies);
}
2020-10-24 19:36:27 +03:00
#ifdef CONFIG_RANDOM32_SELFTEST
/* Principle: True 32-bit random numbers will all have 16 differing bits on
 * average. For each 32-bit number, there are 601M numbers differing by 16
 * bits, and 89% of the numbers differ by at least 12 bits. Note that more
 * than 16 differing bits also implies a correlation with inverted bits. Thus
 * we take 1024 random numbers and compare each of them to the other ones,
 * counting the deviation of correlated bits to 16. Constants report 32,
 * counters 32-log2(TEST_SIZE), and pure randoms, around 6 or lower. With the
 * u32 total, TEST_SIZE may be as large as 4096 samples.
 */
#define TEST_SIZE 1024
static int __init prandom32_state_selftest(void)
{
	unsigned int x, y, bits, samples;
	u32 xor, flip;
	u32 total;
	u32 *data;

	data = kmalloc(sizeof(*data) * TEST_SIZE, GFP_KERNEL);
	if (!data)
		return 0;

	for (samples = 0; samples < TEST_SIZE; samples++)
		data[samples] = prandom_u32();

	flip = total = 0;
	for (x = 0; x < samples; x++) {
		for (y = 0; y < samples; y++) {
			if (x == y)
				continue;
			xor = data[x] ^ data[y];
			flip |= xor;
			bits = hweight32(xor);
			total += (bits - 16) * (bits - 16);
		}
	}

	/* We'll return the average deviation as 2*sqrt(corr/samples), which
	 * is also sqrt(4*corr/samples) which provides a better resolution.
	 */
	bits = int_sqrt(total / (samples * (samples - 1)) * 4);
	if (bits > 6)
		pr_warn("prandom32: self test failed (at least %u bits"
			" correlated, fixed_mask=%#x fixed_value=%#x)\n",
			bits, ~flip, data[0] & ~flip);
	else
		pr_info("prandom32: self test passed (less than %u bits"
			" correlated)\n",
			bits+1);
	kfree(data);
	return 0;
}
core_initcall(prandom32_state_selftest);
#endif /* CONFIG_RANDOM32_SELFTEST */
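
The score computed by the self-test can be reproduced on small inputs in userspace. This is a sketch, not the kernel code: popcount32 stands in for hweight32(), isqrt_small for int_sqrt(), and correlation_score is a hypothetical name. It assumes 2 <= n <= 4096 so that the u32 total cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable population count standing in for hweight32(). */
static unsigned int popcount32(uint32_t x)
{
	unsigned int n = 0;

	while (x) {
		x &= x - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Integer square root by linear search; inputs here never exceed 1024. */
static unsigned int isqrt_small(uint32_t x)
{
	unsigned int r = 0;

	while ((uint64_t)(r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/*
 * For every ordered pair of distinct samples, square the distance of
 * the differing-bit count from 16, average over all pairs, then take
 * the square root of four times that average, as the self-test does.
 */
static unsigned int correlation_score(const uint32_t *data, size_t n)
{
	uint32_t total = 0;
	size_t x, y;

	for (x = 0; x < n; x++) {
		for (y = 0; y < n; y++) {
			int dev;

			if (x == y)
				continue;
			dev = (int)popcount32(data[x] ^ data[y]) - 16;
			total += (uint32_t)(dev * dev);
		}
	}
	return isqrt_small(total / (uint32_t)(n * (n - 1)) * 4);
}
```

A buffer of identical values scores 32, matching the "Constants report 32" remark in the comment, and so does a buffer alternating between a value and its bitwise inverse. Two samples differing in exactly 16 bits score 0.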
2020-08-09 09:57:44 +03:00
/*
 * Start periodic full reseeding as soon as strong
 * random numbers are available.
 */
static int __init prandom_init_late(void)
{
	static struct random_ready_callback random_ready = {
		.func = prandom_timer_start
	};
	int ret = add_random_ready_callback(&random_ready);

	if (ret == -EALREADY) {
		prandom_timer_start(&random_ready);
		ret = 0;
	}
	return ret;
}
late_initcall(prandom_init_late);