[ Upstream commit cf2a85efda ]
For files that lack trailing newlines and match a leaking address (e.g.
wchan[1]), the leaking_addresses.pl report would run together with the
next line, making things look corrupted.
Unconditionally remove the newline on input, and write it back out on
output.
[1] https://lore.kernel.org/all/20210103142726.GC30643@xsang-OptiPlex-9020/
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211008111626.151570317@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Based on 1 normalized pattern(s):
licensed under the terms of the gnu gpl license version 2
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 62 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.929121379@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recently attempt to remove the '--version' flag was made, badly. We
failed to remove mention of it from the help output. And we (me) failed
to actually remove the flag from the options list.
_Completely_ remove --version flag.
Currently calls to function dprint() are non uniform and at times
incorrect.
Use uniform _correct_ call to function dprint().
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Sometimes files may be created by using output from printk. As the scan
traverses the directory tree we should parse each path name and check if
it is leaking an address.
Add check for leaking address on each path name.
Suggested-by: Tycho Andersen <tycho@tycho.ws>
Acked-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently sub routine may_leak_address() is checking regex against Perl
special variable $_ which is _fortunately_ being set correctly in a loop
before this sub routine is called. We already have declared a variable
to hold this value '$line' we should use it.
Use $line in regex match instead of implicit $_
Signed-off-by: Tobin C. Harding <me@tobin.cc>
We have git now, we don't need a version number. This was originally
added because leaking_addresses.pl shamelessly (and mindlessly) copied
checkpatch.pl
Remove version number from script.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
The pointers listed in /proc/1/syscall are user pointers, and negative
syscall args will show up like kernel addresses.
For example
/proc/31808/syscall: 0 0x3 0x55b107a38180 0x2000 0xffffffffffffffb0 \
0x55b107a302d0 0x55b107a38180 0x7fffa313b8e8 0x7ff098560d11
Skip parsing /proc/1/syscall
Suggested-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
When the system is idle it is likely that most files under /proc/PID
will be identical for various processes. Scanning _all_ the PIDs under
/proc is unnecessary and implies that we are thoroughly scanning /proc.
This is _not_ the case because there may be ways userspace can trigger
creation of /proc files that leak addresses but were not present during
a scan. For these two reasons we should exclude all PID directories
under /proc except '1/'
Exclude all /proc/PID except /proc/1.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently we are repeatedly calling `uname -m`. This is causing the
script to take a long time to run (more than 10 seconds to parse
/proc/kallsyms). We can use Perl state variables to cache the result of
the first call to `uname -m`. With this change in place the script
scans the whole kernel in under a minute.
Cache machine architecture in state variable.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script has multiple configuration arrays. This is confusing,
evident by the fact that a bunch of the entries are in the wrong place.
We can simplify the code by just having a single array for absolute
paths to skip and a single array for file names to skip wherever they
appear in the scanned directory tree. There are also currently multiple
subroutines to handle the different arrays, we can reduce these to a
single subroutine also.
Simplify the path skipping code.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script parses binary files. Since we are scanning for
readable kernel addresses there is no need to parse binary files. We
can use Perl to check if file is binary and skip parsing it if so.
Do not parse binary files.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script only supports x86_64 and ppc64. It would be nice to be
able to scan 32-bit machines also. We can add support for 32-bit
architectures by modifying how we check for false positives, taking
advantage of the page offset used by the kernel, and using the correct
regular expression.
Support for 32-bit machines is enabled by the observation that the kernel
addresses on 32-bit machines are larger [in value] than the page offset.
We can use this to filter false positives when scanning the kernel for
leaking addresses.
Programmatic determination of the running architecture is not
immediately obvious (current 32-bit machines return various strings from
`uname -m`). We therefore provide a flag to enable scanning of 32-bit
kernels. Also we can check the kernel config file for the offset and if
not found default to 0xc0000000. A command line option to parse in the
page offset is also provided. We do automatically detect architecture
if running on ix86.
Add support for 32-bit kernels. Add a command line option for page
offset.
Suggested-by: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently there is duplicate code when checking the architecture type.
We can remove the duplication by implementing a wrapper function
is_arch().
Implement and use wrapper function is_arch().
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script uses Perl to get the machine architecture. This can be
erroneous since Perl uses the architecture of the machine that Perl was
compiled on not the architecture of the running machine. We should use
the systems `uname` command instead.
Use `uname -m` instead of Perl to get the machine architecture.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script only supports 4 page table levels because of the way
the kernel address regular expression is crafted. We can do better than
this. Using previously added support for kernel configuration options we
can get the number of page table levels defined by
CONFIG_PGTABLE_LEVELS. Using this value a correct regular expression can
be crafted. This only supports 5 page tables on x86_64.
Add support for 5 page table levels on x86_64.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Features that rely on the ability to get kernel configuration options
are ready to be implemented in script. In preparation for this we can
add support for kernel config options as a separate patch to ease
review.
Add support for locating and parsing kernel configuration file.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script checks only first and last address in the vsyscall
memory range. We can do better than this. When checking for false
positives against $match, we can convert $match to a hexadecimal value
then check if it lies within the range of vsyscall addresses.
Check whole range of vsyscall addresses when checking for false
positive.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
A number of the command line options to script are dependant on the
option --input-raw being set. If we indent these options it makes
explicit this dependency.
Indent options dependant on --input-raw.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently help output includes command examples. These were cute when we
first started development of this script but are unnecessary.
Remove command examples.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
leaking_addresses.pl can be run with kptr_restrict==0 now, we don't need
the comment about setting kptr_restrict any more.
Remove comment suggesting setting kptr_restrict.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently code uses a check against an undefined variable because the
variable is a sub routine name and is not evaluated.
Evaluate subroutine; add parenthesis to sub routine name.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signal masks are false positives, we already check for SigBlk and SigCgt
but we missed SigIgn.
Add SigIgn to false positive check.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script can stall if we read certain files (like
/proc/kmsg). While we have a mechanism to skip these files once they are
discovered it would be nice to not stall on as yet undiscovered files of
this kind.
Set a timer before each file is parsed, warn user if timer expires.
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script is targeted at x86_64. We can support other
architectures by using the correct regular expressions for each
architecture.
Add the infrastructure to support multiple architectures. Add support
for ppc64.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script just dumps all results found. Potentially, this risks
losing single results among multiple duplicate results. We need some
way of restricting duplicates to assist users of the script. It would
also be nice if we got a report instead of raw results.
Duplicates can be defined in various ways, instead of trying to find a
single perfect solution we can present the user with various options to
display the output. Doing so will typically lead to users wanting to
view the output multiple times. Currently we scan the kernel each time,
this is slow and unnecessary. We can expedite the process by writing the
results to file for subsequent viewing.
Add command line options to enable summary reporting, including options
to write to and read from file.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
There are a couple more files that cause the script to stall.
/sys/firmware/devicetree and its symlink /proc/device-tree, reported by
Michael Ellerman.
usbmon should be skipped were ever it appears. Reported by Kees Cook
Add files to be excluded from parsing.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently script accepts files to skip. This was added to make running
the script faster (for repeat runs). We can remove this functionality in
preparation for adding sub commands (scan and format) to the script.
Remove command line options.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
debug_arrays is not called. Also, %seen hash is not used. We should
remove unused code.
Remove dead code.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Currently we are leaking addresses from the kernel to user space. This
script is an attempt to find some of those leakages. Script parses
`dmesg` output and /proc and /sys files for hex strings that look like
kernel addresses.
Only works for 64 bit kernels, the reason being that kernel addresses on
64 bit kernels have 'ffff' as the leading bit pattern making greping
possible. On 32 kernels we don't have this luxury.
Scripts is _slightly_ smarter than a straight grep, we check for false
positives (all 0's or all 1's, and vsyscall start/finish addresses).
[ I think there is a lot of room for improvement here, but it's already
useful, so I'm merging it as-is. The whole "hash %p format" series is
expected to go into 4.15, but will not fix %x users, and will not
incentivize people to look at what they are leaking. - Linus ]
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>