Граф коммитов

5153 Коммитов

Автор SHA1 Сообщение Дата
Mickaël Salaün 83e804f0bf fs,security: Add sb_delete hook
The sb_delete security hook is called when shutting down a superblock,
which may be useful to release kernel objects tied to the superblock's
lifetime (e.g. inodes).

This new hook is needed by Landlock to release (ephemerally) tagged
struct inodes.  This comes from the unprivileged nature of Landlock
described in the next commit.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-7-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:11 -07:00
Mickaël Salaün cb2c7d1a17 landlock: Support filesystem access-control
Using Landlock objects and ruleset, it is possible to tag inodes
according to a process's domain.  To enable an unprivileged process to
express a file hierarchy, it first needs to open a directory (or a file)
and pass this file descriptor to the kernel through
landlock_add_rule(2).  When checking if a file access request is
allowed, we walk from the requested dentry to the real root, following
the different mount layers.  The access to each "tagged" inodes are
collected according to their rule layer level, and ANDed to create
access to the requested file hierarchy.  This makes possible to identify
a lot of files without tagging every inodes nor modifying the
filesystem, while still following the view and understanding the user
has from the filesystem.

Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
keep the same struct inodes for the same inodes whereas these inodes are
in use.

This commit adds a minimal set of supported filesystem access-control
which doesn't enable to restrict all file-related actions.  This is the
result of multiple discussions to minimize the code of Landlock to ease
review.  Thanks to the Landlock design, extending this access-control
without breaking user space will not be a problem.  Moreover, seccomp
filters can be used to restrict the use of syscall families which may
not be currently handled by Landlock.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Link: https://lore.kernel.org/r/20210422154123.13086-8-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:11 -07:00
Casey Schaufler 1aea780837 LSM: Infrastructure management of the superblock
Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.

Cc: John Johansen <john.johansen@canonical.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-6-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:10 -07:00
Mickaël Salaün afe81f7541 landlock: Add ptrace restrictions
Using ptrace(2) and related debug features on a target process can lead
to a privilege escalation.  Indeed, ptrace(2) can be used by an attacker
to impersonate another task and to remain undetected while performing
malicious activities.  Thanks to  ptrace_may_access(), various part of
the kernel can check if a tracer is more privileged than a tracee.

A landlocked process has fewer privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process's rules (i.e. the tracee must be in a sub-domain of the tracer).

Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-5-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:10 -07:00
Mickaël Salaün 385975dca5 landlock: Set up the security framework and manage credentials
Process's credentials point to a Landlock domain, which is underneath
implemented with a ruleset.  In the following commits, this domain is
used to check and enforce the ptrace and filesystem security policies.
A domain is inherited from a parent to its child the same way a thread
inherits a seccomp policy.

Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-4-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:10 -07:00
Mickaël Salaün ae271c1b14 landlock: Add ruleset and domain management
A Landlock ruleset is mainly a red-black tree with Landlock rules as
nodes.  This enables quick update and lookup to match a requested
access, e.g. to a file.  A ruleset is usable through a dedicated file
descriptor (cf. following commit implementing syscalls) which enables a
process to create and populate a ruleset with new rules.

A domain is a ruleset tied to a set of processes.  This group of rules
defines the security policy enforced on these processes and their future
children.  A domain can transition to a new domain which is the
intersection of all its constraints and those of a ruleset provided by
the current process.  This modification only impact the current process.
This means that a process can only gain more constraints (i.e. lose
accesses) over time.

Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20210422154123.13086-3-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:10 -07:00
Mickaël Salaün 90945448e9 landlock: Add object management
A Landlock object enables to identify a kernel object (e.g. an inode).
A Landlock rule is a set of access rights allowed on an object.  Rules
are grouped in rulesets that may be tied to a set of processes (i.e.
subjects) to enforce a scoped access-control (i.e. a domain).

Because Landlock's goal is to empower any process (especially
unprivileged ones) to sandbox themselves, we cannot rely on a
system-wide object identification such as file extended attributes.
Indeed, we need innocuous, composable and modular access-controls.

The main challenge with these constraints is to identify kernel objects
while this identification is useful (i.e. when a security policy makes
use of this object).  But this identification data should be freed once
no policy is using it.  This ephemeral tagging should not and may not be
written in the filesystem.  We then need to manage the lifetime of a
rule according to the lifetime of its objects.  To avoid a global lock,
this implementation make use of RCU and counters to safely reference
objects.

A following commit uses this generic object management for inodes.

Cc: James Morris <jmorris@namei.org>
Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Reviewed-by: Jann Horn <jannh@google.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210422154123.13086-2-mic@digikod.net
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-22 12:22:10 -07:00
Paul Moore e4c82eafb6 selinux: add proper NULL termination to the secclass_map permissions
This patch adds the missing NULL termination to the "bpf" and
"perf_event" object class permission lists.

This missing NULL termination should really only affect the tools
under scripts/selinux, with the most important being genheaders.c,
although in practice this has not been an issue on any of my dev/test
systems.  If the problem were to manifest itself it would likely
result in bogus permissions added to the end of the object class;
thankfully with no access control checks using these bogus
permissions and no policies defining these permissions the impact
would likely be limited to some noise about undefined permissions
during policy load.

Cc: stable@vger.kernel.org
Fixes: ec27c3568a ("selinux: bpf: Add selinux check for eBPF syscall operations")
Fixes: da97e18458 ("perf_event: Add support for LSM and SELinux checks")
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-04-21 21:43:25 -04:00
James Bottomley 60dc5f1bcf KEYS: trusted: fix TPM trusted keys for generic framework
The generic framework patch broke the current TPM trusted keys because
it doesn't correctly remove the values consumed by the generic parser
before passing them on to the implementation specific parser.  Fix
this by having the generic parser return the string minus the consumed
tokens.

Additionally, there may be no tokens left for the implementation
specific parser, so make it handle the NULL case correctly and finally
fix a TPM 1.2 specific check for no keyhandle.

Fixes: 5d0682be31 ("KEYS: trusted: Add generic trusted keys framework")
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2021-04-21 16:30:06 -07:00
James Bottomley 9d5171eab4 KEYS: trusted: Fix TPM reservation for seal/unseal
The original patch 8c657a0590 ("KEYS: trusted: Reserve TPM for seal
and unseal operations") was correct on the mailing list:

https://lore.kernel.org/linux-integrity/20210128235621.127925-4-jarkko@kernel.org/

But somehow got rebased so that the tpm_try_get_ops() in
tpm2_seal_trusted() got lost.  This causes an imbalanced put of the
TPM ops and causes oopses on TIS based hardware.

This fix puts back the lost tpm_try_get_ops()

Fixes: 8c657a0590 ("KEYS: trusted: Reserve TPM for seal and unseal operations")
Reported-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2021-04-21 16:28:20 -07:00
Gustavo A. R. Silva 28073eb09c ima: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple break statements instead of just
letting the code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-04-20 16:54:14 -04:00
Jakub Kicinski 8203c7ce4e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
 - keep the ZC code, drop the code related to reinit
net/bridge/netfilter/ebtables.c
 - fix build after move to net_generic

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-17 11:08:07 -07:00
Walter Wu 02c587733c kasan: remove redundant config option
CONFIG_KASAN_STACK and CONFIG_KASAN_STACK_ENABLE both enable KASAN stack
instrumentation, but we should only need one config, so that we remove
CONFIG_KASAN_STACK_ENABLE and make CONFIG_KASAN_STACK workable.  see [1].

When enable KASAN stack instrumentation, then for gcc we could do no
prompt and default value y, and for clang prompt and default value n.

This patch fixes the following compilation warning:

  include/linux/kasan.h:333:30: warning: 'CONFIG_KASAN_STACK' is not defined, evaluates to 0 [-Wundef]

[akpm@linux-foundation.org: fix merge snafu]

Link: https://bugzilla.kernel.org/show_bug.cgi?id=210221 [1]
Link: https://lkml.kernel.org/r/20210226012531.29231-1-walter-zh.wu@mediatek.com
Fixes: d9b571c885 ("kasan: fix KASAN_STACK dependency for HW_TAGS")
Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-16 16:10:36 -07:00
Randy Dunlap 049ae601f3 security: commoncap: clean up kernel-doc comments
Fix kernel-doc notation in commoncap.c.

Use correct (matching) function name in comments as in code.
Use correct function argument names in kernel-doc comments.
Use kernel-doc's "Return:" format for function return values.

Fixes these kernel-doc warnings:

../security/commoncap.c:1206: warning: expecting prototype for cap_task_ioprio(). Prototype was for cap_task_setioprio() instead
../security/commoncap.c:1219: warning: expecting prototype for cap_task_ioprio(). Prototype was for cap_task_setnice() instead

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-04-15 09:21:58 -07:00
Colin Ian King aec00aa04b KEYS: trusted: Fix missing null return from kzalloc call
The kzalloc call can return null with the GFP_KERNEL flag so
add a null check and exit via a new error exit label. Use the
same exit error label for another error path too.

Addresses-Coverity: ("Dereference null return value")
Fixes: 830027e2cb55 ("KEYS: trusted: Add generic trusted keys framework")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14 16:30:31 +03:00
Sumit Garg 0a95ebc913 KEYS: trusted: Introduce TEE based Trusted Keys
Add support for TEE based trusted keys where TEE provides the functionality
to seal and unseal trusted keys using hardware unique key.

Refer to Documentation/staging/tee.rst for detailed information about TEE.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14 16:30:30 +03:00
Sumit Garg 5d0682be31 KEYS: trusted: Add generic trusted keys framework
Current trusted keys framework is tightly coupled to use TPM device as
an underlying implementation which makes it difficult for implementations
like Trusted Execution Environment (TEE) etc. to provide trusted keys
support in case platform doesn't posses a TPM device.

Add a generic trusted keys framework where underlying implementations
can be easily plugged in. Create struct trusted_key_ops to achieve this,
which contains necessary functions of a backend.

Also, define a module parameter in order to select a particular trust
source in case a platform support multiple trust sources. In case its
not specified then implementation itetrates through trust sources list
starting with TPM and assign the first trust source as a backend which
has initiazed successfully during iteration.

Note that current implementation only supports a single trust source at
runtime which is either selectable at compile time or during boot via
aforementioned module parameter.

Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14 16:30:30 +03:00
James Bottomley e5fb5d2c5a security: keys: trusted: Make sealed key properly interoperable
The current implementation appends a migratable flag to the end of a
key, meaning the format isn't exactly interoperable because the using
party needs to know to strip this extra byte.  However, all other
consumers of TPM sealed blobs expect the unseal to return exactly the
key.  Since TPM2 keys have a key property flag that corresponds to
migratable, use that flag instead and make the actual key the only
sealed quantity.  This is secure because the key properties are bound
to a hash in the private part, so if they're altered the key won't
load.

Backwards compatibility is implemented by detecting whether we're
loading a new format key or not and correctly setting migratable from
the last byte of old format keys.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14 16:30:30 +03:00
James Bottomley f221974525 security: keys: trusted: use ASN.1 TPM2 key format for the blobs
Modify the TPM2 key format blob output to export and import in the
ASN.1 form for TPM2 sealed object keys.  For compatibility with prior
trusted keys, the importer will also accept two TPM2B quantities
representing the public and private parts of the key.  However, the
export via keyctl pipe will only output the ASN.1 format.

The benefit of the ASN.1 format is that it's a standard and thus the
exported key can be used by userspace tools (openssl_tpm2_engine,
openconnect and tpm2-tss-engine).  The format includes policy
specifications, thus it gets us out of having to construct policy
handles in userspace and the format includes the parent meaning you
don't have to keep passing it in each time.

This patch only implements basic handling for the ASN.1 format, so
keys with passwords but no policy.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14 16:30:30 +03:00
James Bottomley de66514d93 security: keys: trusted: fix TPM2 authorizations
In TPM 1.2 an authorization was a 20 byte number.  The spec actually
recommended you to hash variable length passwords and use the sha1
hash as the authorization.  Because the spec doesn't require this
hashing, the current authorization for trusted keys is a 40 digit hex
number.  For TPM 2.0 the spec allows the passing in of variable length
passwords and passphrases directly, so we should allow that in trusted
keys for ease of use.  Update the 'blobauth' parameter to take this
into account, so we can now use plain text passwords for the keys.

so before

keyctl add trusted kmk "new 32 blobauth=f572d396fae9206628714fb2ce00f72e94f2258fkeyhandle=81000001" @u

after we will accept both the old hex sha1 form as well as a new
directly supplied password:

keyctl add trusted kmk "new 32 blobauth=hello keyhandle=81000001" @u

Since a sha1 hex code must be exactly 40 bytes long and a direct
password must be 20 or less, we use the length as the discriminator
for which form is input.

Note this is both and enhancement and a potential bug fix.  The TPM
2.0 spec requires us to strip leading zeros, meaning empyty
authorization is a zero length HMAC whereas we're currently passing in
20 bytes of zeros.  A lot of TPMs simply accept this as OK, but the
Microsoft TPM emulator rejects it with TPM_RC_BAD_AUTH, so this patch
makes the Microsoft TPM emulator work with trusted keys.

Fixes: 0fe5480303 ("keys, trusted: seal/unseal with TPM 2.0 chips")
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14 16:30:30 +03:00
Jakub Kicinski 8859a44ea0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

MAINTAINERS
 - keep Chandrasekar
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
 - simple fix + trust the code re-added to param.c in -next is fine
include/linux/bpf.h
 - trivial
include/linux/ethtool.h
 - trivial, fix kdoc while at it
include/linux/skmsg.h
 - move to relevant place in tcp.c, comment re-wrapped
net/core/skmsg.c
 - add the sk = sk // sk = NULL around calls
net/tipc/crypto.c
 - trivial

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-04-09 20:48:35 -07:00
Linus Torvalds 60144b23c9 selinux/stable-5.12 PR 20210409
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmBwjZcUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOAcg//eZL4z0ksGo9s/Y+/9qOIYH2tMPU5
 OOVZCekBENiq2LOuVbzAndeHOLZflf3iigBwtMvqHaAsdPAKH/3UedzD0/nxG39m
 S2gowEuNEfxtuBwuIZMFaMGzzLyjlZJ3xxi6omIyj/2JqPNyBbbFxR/VC4agJZI5
 oG6VfwhZJmFi1oJiNoGKjwihKHZQ90yd8UU5rMI+Np0TnP1Or3OvRaZjR47r+dWS
 tAu3nTKrVEyGTcPeGzg9TS5tIko0jQ1FyrqPDBhfaJta48bX/9s70We6rwqJj8Vg
 HiiSDPMK5EKkPLso+1vqvBI9q6xdhNeS+M2JP+/ewK/cqVKMkTVVys6l+T3a6HcY
 rIXdgTWdMFiAQ6OW44z30fiwSxW3kI5M62um31nepoqvzX7acl6R1laILFztedWM
 EOfCznZmE6ccYmcZnrqEmNsdF+Se1TUiM87bN90tAGmF9F4Yw2qGM0raiV3OJhDZ
 P2zR/+DceSHI2pNfFtB5VVXZelHoKVhoRcRWvpzn7YW3UmnAl83HoJasBfa/j4rx
 qvo+nj5ptCSX/kUYjvfvrRV1rY/BAaSVlFLpgYKY1r8/hdRN5DLpdE5cHh6Gky6B
 fJen4a7yVecp8IKK+WR3maJ0hymo5ccUoB5AKzMOXeECKRqKkIDAKiEN59g9t96+
 avKfojgsh1tNHA0=
 =mGDl
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20210409' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fixes from Paul Moore:
 "Three SELinux fixes.

  These fix known problems relating to (re)loading SELinux policy or
  changing the policy booleans, and pass our test suite without problem"

* tag 'selinux-pr-20210409' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix race between old and new sidtab
  selinux: fix cond_list corruption when changing booleans
  selinux: make nslot handling in avtab more robust
2021-04-09 11:51:06 -07:00
Jiele Zhao 282c0a4d15 integrity: Add declarations to init_once void arguments.
init_once is a callback to kmem_cache_create. The parameter
type of this function is void *, so it's better to give a
explicit cast here.

Signed-off-by: Jiele Zhao <unclexiaole@gmail.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-04-09 12:17:52 -04:00
Jiele Zhao 41d75dd962 ima: Fix function name error in comment.
The original function name was ima_path_check().  The policy parsing
still supports PATH_CHECK.   Commit 9bbb6cad01 ("ima: rename
ima_path_check to ima_file_check") renamed the function to
ima_file_check(), but missed modifying the function name in the
comment.

Fixes: 9bbb6cad01 ("ima: rename ima_path_check to ima_file_check").

Signed-off-by: Jiele Zhao <unclexiaole@gmail.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-04-09 12:17:30 -04:00
Nayna Jain 6cbdfb3d91 ima: enable loading of build time generated key on .ima keyring
The kernel currently only loads the kernel module signing key onto the
builtin trusted keyring. Load the module signing key onto the IMA keyring
as well.

Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Acked-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-04-09 10:40:20 -04:00
Ondrej Mosnacek 9ad6e9cb39 selinux: fix race between old and new sidtab
Since commit 1b8b31a2e6 ("selinux: convert policy read-write lock to
RCU"), there is a small window during policy load where the new policy
pointer has already been installed, but some threads may still be
holding the old policy pointer in their read-side RCU critical sections.
This means that there may be conflicting attempts to add a new SID entry
to both tables via sidtab_context_to_sid().

See also (and the rest of the thread):
https://lore.kernel.org/selinux/CAFqZXNvfux46_f8gnvVvRYMKoes24nwm2n3sPbMjrB8vKTW00g@mail.gmail.com/

Fix this by installing the new policy pointer under the old sidtab's
spinlock along with marking the old sidtab as "frozen". Then, if an
attempt to add new entry to a "frozen" sidtab is detected, make
sidtab_context_to_sid() return -ESTALE to indicate that a new policy
has been installed and that the caller will have to abort the policy
transaction and try again after re-taking the policy pointer (which is
guaranteed to be a newer policy). This requires adding a retry-on-ESTALE
logic to all callers of sidtab_context_to_sid(), but fortunately these
are easy to determine and aren't that many.

This seems to be the simplest solution for this problem, even if it
looks somewhat ugly. Note that other places in the kernel (e.g.
do_mknodat() in fs/namei.c) use similar stale-retry patterns, so I think
it's reasonable.

Cc: stable@vger.kernel.org
Fixes: 1b8b31a2e6 ("selinux: convert policy read-write lock to RCU")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-04-07 20:42:56 -04:00
Ondrej Mosnacek d8f5f0ea5b selinux: fix cond_list corruption when changing booleans
Currently, duplicate_policydb_cond_list() first copies the whole
conditional avtab and then tries to link to the correct entries in
cond_dup_av_list() using avtab_search(). However, since the conditional
avtab may contain multiple entries with the same key, this approach
often fails to find the right entry, potentially leading to wrong rules
being activated/deactivated when booleans are changed.

To fix this, instead start with an empty conditional avtab and add the
individual entries one-by-one while building the new av_lists. This
approach leads to the correct result, since each entry is present in the
av_lists exactly once.

The issue can be reproduced with Fedora policy as follows:

    # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A
    allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True
    allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True
    # setsebool ftpd_anon_write=off ftpd_connect_all_unreserved=off ftpd_connect_db=off ftpd_full_access=off

On fixed kernels, the sesearch output is the same after the setsebool
command:

    # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A
    allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True
    allow ftpd_t public_content_rw_t:dir { add_name create link remove_name rename reparent rmdir setattr unlink watch watch_reads write }; [ ftpd_anon_write ]:True

While on the broken kernels, it will be different:

    # sesearch -s ftpd_t -t public_content_rw_t -c dir -p create -A
    allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True
    allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True
    allow ftpd_t non_security_file_type:dir { add_name create getattr ioctl link lock open read remove_name rename reparent rmdir search setattr unlink watch watch_reads write }; [ ftpd_full_access ]:True

While there, also simplify the computation of nslots. This changes the
nslots values for nrules 2 or 3 to just two slots instead of 4, which
makes the sequence more consistent.

Cc: stable@vger.kernel.org
Fixes: c7c556f1e8 ("selinux: refactor changing booleans")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-04-02 11:46:55 -04:00
Ondrej Mosnacek 442dc00f82 selinux: make nslot handling in avtab more robust
1. Make sure all fileds are initialized in avtab_init().
2. Slightly refactor avtab_alloc() to use the above fact.
3. Use h->nslot == 0 as a sentinel in the access functions to prevent
   dereferencing h->htable when it's not allocated.

Cc: stable@vger.kernel.org
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-04-02 11:46:37 -04:00
Jens Axboe 4e53d1701b tomoyo: don't special case PF_IO_WORKER for PF_KTHREAD
Since commit 3bfe610669 ("io-wq: fork worker threads from original
task") stopped using PF_KTHREAD flag for the io_uring PF_IO_WORKER threads,
tomoyo_kernel_service() no longer needs to check PF_IO_WORKER flag.

(This is a 5.12+ patch. Please don't send to stable kernels.)

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2021-03-28 13:11:29 +09:00
Stefan Berger 947d705972 ima: Support EC keys for signature verification
Add support for IMA signature verification for EC keys. Since SHA type
of hashes can be used by RSA and ECDSA signature schemes we need to
look at the key and derive from the key which signature scheme to use.
Since this can be applied to all types of keys, we change the selection
of the encoding type to be driven by the key's signature scheme rather
than by the hash type.

Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: linux-integrity@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Cc: keyrings@vger.kernel.org
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Vitaly Chikunov <vt@altlinux.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2021-03-26 19:41:59 +11:00
Linus Torvalds db24726bfe integrity-v5.12-fix
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEjSMCCC7+cjo3nszSa3kkZrA+cVoFAmBc8t0UHHpvaGFyQGxp
 bnV4LmlibS5jb20ACgkQa3kkZrA+cVr82g/+M4Aj+geNTYg0so7iWuNoc1xrj32k
 AkNt82WWTOz5FCJdz5Svv6lS8fr0MLNlLV8Y2RqNjkbJfmrbuLT0nu8FmAgcEgO5
 XeDasz3v/ij0a2A9nqnnk7UUFpx+7I/peFBEgmupLhsnVE3UOsnzz8GZ922tsZ/k
 UeRHH/Nm+dJ8hAoY0RdfC09YGL/bxLO7w7+NjoaA1GaiiHZ+Jwb6aVbUt8F8X4K3
 R946QnOHqsyvXZ6f5LgSRr8eNEXszR4sJiUpzfpyAaEPF3eV/V3OEWAiv5YQmhnn
 /rLeprkW4C25jTwSVNWOwkSic9dQW3WRg8V6H51EAbm7BN67bO5B7GIsk5H61YPL
 wFvENmZpMxlXt8rcYI+Js8KnIR7ihYQVksRp6N5yTqYz28e+RZN/dqTLCBkRDjgs
 qLH6jEFcr4qzMpjF71nI2AZrepFXoXkH8vm9yz/uvzffdXyP+ZRbPoPEkx25meAt
 PC2neLco13unm2Yp/fuOOzE/sgVlI6n3o+9sLGSeexTkY+ewGKE+c7+bohoZtOk4
 OETYJZ9a7TzTvKsOqan5gcQfdhru6/M73Ev9lR6shoWe1TgF+hLkaKhVLWFL5dUn
 EVcQ/gc8zKmljNji1TtlXSP6+1InvgRUmes42G7HEcf3H3C6CejvAVfdIX1sx9av
 vzL7JN0q7SAIDnI=
 =qVQp
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v5.12-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fix from Mimi Zohar:
 "Just one patch to address a NULL ptr dereferencing when there is a
  mismatch between the user enabled LSMs and IMA/EVM"

* tag 'integrity-v5.12-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  integrity: double check iint_cache was initialized
2021-03-25 16:46:43 -07:00
David S. Miller efd13b71a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 15:31:22 -07:00
Arnd Bergmann 82e5d8cc76 security: commoncap: fix -Wstringop-overread warning
gcc-11 introdces a harmless warning for cap_inode_getsecurity:

security/commoncap.c: In function ‘cap_inode_getsecurity’:
security/commoncap.c:440:33: error: ‘memcpy’ reading 16 bytes from a region of size 0 [-Werror=stringop-overread]
  440 |                                 memcpy(&nscap->data, &cap->data, sizeof(__le32) * 2 * VFS_CAP_U32);
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The problem here is that tmpbuf is initialized to NULL, so gcc assumes
it is not accessible unless it gets set by vfs_getxattr_alloc().  This is
a legitimate warning as far as I can tell, but the code is correct since
it correctly handles the error when that function fails.

Add a separate NULL check to tell gcc about it as well.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2021-03-24 13:52:19 -07:00
Al Viro 64b2f34f38 apparmor:match_mn() - constify devpath argument
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-03-24 14:11:29 -04:00
Li Huafei 7990ccafaa ima: Fix the error code for restoring the PCR value
In ima_restore_measurement_list(), hdr[HDR_PCR].data is pointing to a
buffer of type u8, which contains the dumped 32-bit pcr value.
Currently, only the least significant byte is used to restore the pcr
value. We should convert hdr[HDR_PCR].data to a pointer of type u32
before fetching the value to restore the correct pcr value.

Fixes: 47fdee60b4 ("ima: use ima_parse_buf() to parse measurements headers")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-03-24 07:14:53 -04:00
Paul Moore 1fb057dcde smack: differentiate between subjective and objective task credentials
With the split of the security_task_getsecid() into subjective and
objective variants it's time to update Smack to ensure it is using
the correct task creds.

Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Reviewed-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22 15:24:14 -04:00
Paul Moore eb1231f73c selinux: clarify task subjective and objective credentials
SELinux has a function, task_sid(), which returns the task's
objective credentials, but unfortunately is used in a few places
where the subjective task credentials should be used.  Most notably
in the new security_task_getsecid_subj() LSM hook.

This patch fixes this and attempts to make things more obvious by
introducing a new function, task_sid_subj(), and renaming the
existing task_sid() function to task_sid_obj().

This patch also adds an interesting function in task_sid_binder().
The task_sid_binder() function has a comment which hopefully
describes it's reason for being, but it basically boils down to the
simple fact that we can't safely access another task's subjective
credentials so in the case of binder we need to stick with the
objective credentials regardless.

Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22 15:24:01 -04:00
Paul Moore 4ebd7651bf lsm: separate security_task_getsecid() into subjective and objective variants
Of the three LSMs that implement the security_task_getsecid() LSM
hook, all three LSMs provide the task's objective security
credentials.  This turns out to be unfortunate as most of the hook's
callers seem to expect the task's subjective credentials, although
a small handful of callers do correctly expect the objective
credentials.

This patch is the first step towards fixing the problem: it splits
the existing security_task_getsecid() hook into two variants, one
for the subjective creds, one for the objective creds.

  void security_task_getsecid_subj(struct task_struct *p,
				   u32 *secid);
  void security_task_getsecid_obj(struct task_struct *p,
				  u32 *secid);

While this patch does fix all of the callers to use the correct
variant, in order to keep this patch focused on the callers and to
ease review, the LSMs continue to use the same implementation for
both hooks.  The net effect is that this patch should not change
the behavior of the kernel in any way, it will be up to the latter
LSM specific patches in this series to change the hook
implementations and return the correct credentials.

Acked-by: Mimi Zohar <zohar@linux.ibm.com> (IMA)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22 15:23:32 -04:00
Mimi Zohar f873b28f26 ima: without an IMA policy loaded, return quickly
Unless an IMA policy is loaded, don't bother checking for an appraise
policy rule.  Return immediately.

Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-03-22 15:12:26 -04:00
Mimi Zohar 92063f3ca7 integrity: double check iint_cache was initialized
The kernel may be built with multiple LSMs, but only a subset may be
enabled on the boot command line by specifying "lsm=".  Not including
"integrity" on the ordered LSM list may result in a NULL deref.

As reported by Dmitry Vyukov:
in qemu:
qemu-system-x86_64       -enable-kvm     -machine q35,nvdimm -cpu
max,migratable=off -smp 4       -m 4G,slots=4,maxmem=16G        -hda
wheezy.img      -kernel arch/x86/boot/bzImage   -nographic -vga std
 -soundhw all     -usb -usbdevice tablet  -bt hci -bt device:keyboard
   -net user,host=10.0.2.10,hostfwd=tcp::10022-:22 -net
nic,model=virtio-net-pci   -object
memory-backend-file,id=pmem1,share=off,mem-path=/dev/zero,size=64M
  -device nvdimm,id=nvdimm1,memdev=pmem1  -append "console=ttyS0
root=/dev/sda earlyprintk=serial rodata=n oops=panic panic_on_warn=1
panic=86400 lsm=smack numa=fake=2 nopcid dummy_hcd.num=8"   -pidfile
vm_pid -m 2G -cpu host

But it crashes on NULL deref in integrity_inode_get during boot:

Run /sbin/init as init process
BUG: kernel NULL pointer dereference, address: 000000000000001c
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc2+ #97
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
rel-1.13.0-44-g88ab0c15525c-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmem_cache_alloc+0x2b/0x370 mm/slub.c:2920
Code: 57 41 56 41 55 41 54 41 89 f4 55 48 89 fd 53 48 83 ec 10 44 8b
3d d9 1f 90 0b 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 <8b> 5f
1c 4cf
RSP: 0000:ffffc9000032f9d8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888017fc4f00 RCX: 0000000000000000
RDX: ffff888040220000 RSI: 0000000000000c40 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: ffff888019263627
R10: ffffffff83937cd1 R11: 0000000000000000 R12: 0000000000000c40
R13: ffff888019263538 R14: 0000000000000000 R15: 0000000000ffffff
FS:  0000000000000000(0000) GS:ffff88802d180000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000001c CR3: 000000000b48e000 CR4: 0000000000750ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 integrity_inode_get+0x47/0x260 security/integrity/iint.c:105
 process_measurement+0x33d/0x17e0 security/integrity/ima/ima_main.c:237
 ima_bprm_check+0xde/0x210 security/integrity/ima/ima_main.c:474
 security_bprm_check+0x7d/0xa0 security/security.c:845
 search_binary_handler fs/exec.c:1708 [inline]
 exec_binprm fs/exec.c:1761 [inline]
 bprm_execve fs/exec.c:1830 [inline]
 bprm_execve+0x764/0x19a0 fs/exec.c:1792
 kernel_execve+0x370/0x460 fs/exec.c:1973
 try_to_run_init_process+0x14/0x4e init/main.c:1366
 kernel_init+0x11d/0x1b8 init/main.c:1477
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Modules linked in:
CR2: 000000000000001c
---[ end trace 22d601a500de7d79 ]---

Since LSMs and IMA may be configured at build time, but not enabled at
run time, panic the system if "integrity" was not initialized before use.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 79f7865d84 ("LSM: Introduce "lsm=" for boottime LSM selection")
Cc: stable@vger.kernel.org
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-03-22 14:54:11 -04:00
Olga Kornievskaia 69c4a42d72 lsm,selinux: add new hook to compare new mount to an existing mount
Add a new hook that takes an existing super block and a new mount
with new options and determines if new options confict with an
existing mount or not.

A filesystem can use this new hook to determine if it can share
the an existing superblock with a new superblock for the new mount.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
[PM: tweak the subject line, fix tab/space problems]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-22 14:53:37 -04:00
Linus Torvalds 8419639062 selinux/stable-5.12 PR 20210322
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmBYx3gUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPSSw/+MnJxbBEfxMXll2LwCRXvyW0Q/F++
 sSLPKZL9B5E7jANbTBlkUW+tMwsckTS7euPvRuJj2+mrSujRnSTl158JAAcn34gd
 lpiGQpttFZD75Eh9sLNg0OZ7PflwQvAzHt52EweD8/OE5O8BLBg7o56SYMr3LkGu
 Up9YcZPHNlj+NhfvWebv3jSB6dv392cG33iZoqmW81wSzmlXHGdzS5UTiIFnsp3X
 kbhLKaZWDSBHuAVMuAxtx3x3sQO1ElfFHxKRYM1fzfl0BMy30Wv6YnXHW2nn08Hr
 oT26968C0Rl9carTnA+G60Nj4WoTWW2dF20Mih+05vkpqFLjdMtFra7fFndbmfNi
 f7Gj5DJNrbunX1dMFJkyPnO/1x74RFUhZbCKm5ffvmF8AcYVivbbsyUAy/xduPWo
 m9hjXDVZLUbWxGBUFxyJD6qQw/wuz+qII8B7SBCKaDdCtM74TlXBVug8prrPcWHV
 tO3ljjbxEjBJ6zsFIJ9IlV3rJTL0v4RbAELXXp5qcZOJpnUtuH8cxj0Ryzo3yCY5
 g/m6IHhm5OfJ5TBSc5UIj2NJQi7sJ+Yv/++lms+RB2MVopx4lJ+UK7140gCA40iC
 1EPOGXCnB/b1k5F38dqdpI5MD+/uAzOMusQvPfL4x0xoQidzsqDmqgaS+V8pIYl6
 nisL4eEe2K7PWX4=
 =mFaE
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fixes from Paul Moore:
 "Three SELinux patches:

   - Fix a problem where a local variable is used outside its associated
     function. Thankfully this can only be triggered by reloading the
     SELinux policy, which is a restricted operation for other obvious
     reasons.

   - Fix some incorrect, and inconsistent, audit and printk messages
     when loading the SELinux policy.

  All three patches are relatively minor and have been through our
  testing with no failures"

* tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinuxfs: unify policy load error reporting
  selinux: fix variable scope issue in live sidtab conversion
  selinux: don't log MAC_POLICY_LOAD record on failed policy load
2021-03-22 11:34:31 -07:00
Ondrej Mosnacek ee5de60a08 selinuxfs: unify policy load error reporting
Let's drop the pr_err()s from sel_make_policy_nodes() and just add one
pr_warn_ratelimited() call to the sel_make_policy_nodes() error path in
sel_write_load().

Changing from error to warning makes sense, since after 02a52c5c8c
("selinux: move policy commit after updating selinuxfs"), this error
path no longer leads to a broken selinuxfs tree (it's just kept in the
original state and policy load is aborted).

I also added _ratelimited to be consistent with the other prtin in the
same function (it's probably not necessary, but can't really hurt...
there are likely more important error messages to be printed when
filesystem entry creation starts erroring out).

Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-18 23:26:59 -04:00
Ondrej Mosnacek 6406887a12 selinux: fix variable scope issue in live sidtab conversion
Commit 02a52c5c8c ("selinux: move policy commit after updating
selinuxfs") moved the selinux_policy_commit() call out of
security_load_policy() into sel_write_load(), which caused a subtle yet
rather serious bug.

The problem is that security_load_policy() passes a reference to the
convert_params local variable to sidtab_convert(), which stores it in
the sidtab, where it may be accessed until the policy is swapped over
and RCU synchronized. Before 02a52c5c8c, selinux_policy_commit() was
called directly from security_load_policy(), so the convert_params
pointer remained valid all the way until the old sidtab was destroyed,
but now that's no longer the case and calls to sidtab_context_to_sid()
on the old sidtab after security_load_policy() returns may cause invalid
memory accesses.

This can be easily triggered using the stress test from commit
ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve
performance"):
```
function rand_cat() {
	echo $(( $RANDOM % 1024 ))
}

function do_work() {
	while true; do
		echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \
			>/sys/fs/selinux/context 2>/dev/null || true
	done
}

do_work >/dev/null &
do_work >/dev/null &
do_work >/dev/null &

while load_policy; do echo -n .; sleep 0.1; done

kill %1
kill %2
kill %3
```

Fix this by allocating the temporary sidtab convert structures
dynamically and passing them among the
selinux_policy_{load,cancel,commit} functions.

Fixes: 02a52c5c8c ("selinux: move policy commit after updating selinuxfs")
Cc: stable@vger.kernel.org
Tested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: merge fuzz in security.h and services.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-18 23:23:46 -04:00
Ondrej Mosnacek 519dad3bcd selinux: don't log MAC_POLICY_LOAD record on failed policy load
If sel_make_policy_nodes() fails, we should jump to 'out', not 'out1',
as the latter would incorrectly log an MAC_POLICY_LOAD audit record,
even though the policy hasn't actually been reloaded. The 'out1' jump
label now becomes unused and can be removed.

Fixes: 02a52c5c8c ("selinux: move policy commit after updating selinuxfs")
Cc: stable@vger.kernel.org
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-18 23:13:04 -04:00
Eric W. Biederman 3b0c2d3eaa Revert 95ebabde38 ("capabilities: Don't allow writing ambiguous v3 file capabilities")
It turns out that there are in fact userspace implementations that
care and this recent change caused a regression.

https://github.com/containers/buildah/issues/3071

As the motivation for the original change was future development,
and the impact is existing real world code just revert this change
and allow the ambiguity in v3 file caps.

Cc: stable@vger.kernel.org
Fixes: 95ebabde38 ("capabilities: Don't allow writing ambiguous v3 file capabilities")
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-03-12 15:27:14 -06:00
Ido Schimmel 710ec56223 nexthop: Add netlink defines and enumerators for resilient NH groups
- RTM_NEWNEXTHOP et.al. that handle resilient groups will have a new nested
  attribute, NHA_RES_GROUP, whose elements are attributes NHA_RES_GROUP_*.

- RTM_NEWNEXTHOPBUCKET et.al. is a suite of new messages that will
  currently serve only for dumping of individual buckets of resilient next
  hop groups. For nexthop group buckets, these messages will carry a nested
  attribute NHA_RES_BUCKET, whose elements are attributes NHA_RES_BUCKET_*.

  There are several reasons why a new suite of messages is created for
  nexthop buckets instead of overloading the information on the existing
  RTM_{NEW,DEL,GET}NEXTHOP messages.

  First, a nexthop group can contain a large number of nexthop buckets (4k
  is not unheard of). This imposes limits on the amount of information that
  can be encoded for each nexthop bucket given a netlink message is limited
  to 64k bytes.

  Second, while RTM_NEWNEXTHOPBUCKET is only used for notifications at
  this point, in the future it can be extended to provide user space with
  control over nexthop buckets configuration.

- The new group type is NEXTHOP_GRP_TYPE_RES. Note that nexthop code is
  adjusted to bounce groups with that type for now.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-11 16:12:59 -08:00
Eric Snowberg ebd9c2ae36 integrity: Load mokx variables into the blacklist keyring
During boot the Secure Boot Forbidden Signature Database, dbx,
is loaded into the blacklist keyring.  Systems booted with shim
have an equivalent Forbidden Signature Database called mokx.
Currently mokx is only used by shim and grub, the contents are
ignored by the kernel.

Add the ability to load mokx into the blacklist keyring during boot.

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/c33c8e3839a41e9654f41cc92c7231104931b1d7.camel@HansenPartnership.com/
Link: https://lore.kernel.org/r/20210122181054.32635-5-eric.snowberg@oracle.com/ # v5
Link: https://lore.kernel.org/r/161428674320.677100.12637282414018170743.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/161433313205.902181.2502803393898221637.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161529607422.163428.13530426573612578854.stgit@warthog.procyon.org.uk/ # v3
2021-03-11 16:34:48 +00:00
Eric Snowberg 56c5812623 certs: Add EFI_CERT_X509_GUID support for dbx entries
This fixes CVE-2020-26541.

The Secure Boot Forbidden Signature Database, dbx, contains a list of now
revoked signatures and keys previously approved to boot with UEFI Secure
Boot enabled.  The dbx is capable of containing any number of
EFI_CERT_X509_SHA256_GUID, EFI_CERT_SHA256_GUID, and EFI_CERT_X509_GUID
entries.

Currently when EFI_CERT_X509_GUID are contained in the dbx, the entries are
skipped.

Add support for EFI_CERT_X509_GUID dbx entries. When a EFI_CERT_X509_GUID
is found, it is added as an asymmetrical key to the .blacklist keyring.
Anytime the .platform keyring is used, the keys in the .blacklist keyring
are referenced, if a matching key is found, the key will be rejected.

[DH: Made the following changes:
 - Added to have a config option to enable the facility.  This allows a
   Kconfig solution to make sure that pkcs7_validate_trust() is
   enabled.[1][2]
 - Moved the functions out from the middle of the blacklist functions.
 - Added kerneldoc comments.]

Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
cc: Randy Dunlap <rdunlap@infradead.org>
cc: Mickaël Salaün <mic@digikod.net>
cc: Arnd Bergmann <arnd@kernel.org>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/20200901165143.10295-1-eric.snowberg@oracle.com/ # rfc
Link: https://lore.kernel.org/r/20200909172736.73003-1-eric.snowberg@oracle.com/ # v2
Link: https://lore.kernel.org/r/20200911182230.62266-1-eric.snowberg@oracle.com/ # v3
Link: https://lore.kernel.org/r/20200916004927.64276-1-eric.snowberg@oracle.com/ # v4
Link: https://lore.kernel.org/r/20210122181054.32635-2-eric.snowberg@oracle.com/ # v5
Link: https://lore.kernel.org/r/161428672051.677100.11064981943343605138.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/161433310942.902181.4901864302675874242.stgit@warthog.procyon.org.uk/ # v2
Link: https://lore.kernel.org/r/161529605075.163428.14625520893961300757.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/bc2c24e3-ed68-2521-0bf4-a1f6be4a895d@infradead.org/ [1]
Link: https://lore.kernel.org/r/20210225125638.1841436-1-arnd@kernel.org/ [2]
2021-03-11 16:31:28 +00:00
Xiong Zhenwu 431c3be16b selinux: fix misspellings using codespell tool
A typo is f out by codespell tool in 422th line of security.h:

$ codespell ./security/selinux/include/
./security.h:422: thie  ==> the, this

Fix a typo found by codespell.

Signed-off-by: Xiong Zhenwu <xiong.zhenwu@zte.com.cn>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-08 19:46:33 -05:00
Xiong Zhenwu 63ddf1baa0 selinux: fix misspellings using codespell tool
A typo is found out by codespell tool in 16th line of hashtab.c

$ codespell ./security/selinux/ss/
./hashtab.c:16: rouding  ==> rounding

Fix a typo found by codespell.

Signed-off-by: Xiong Zhenwu <xiong.zhenwu@zte.com.cn>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-08 19:44:30 -05:00
Lakshmi Ramasubramanian 2554a48f44 selinux: measure state and policy capabilities
SELinux stores the configuration state and the policy capabilities
in kernel memory.  Changes to this data at runtime would have an impact
on the security guarantees provided by SELinux.  Measuring this data
through IMA subsystem provides a tamper-resistant way for
an attestation service to remotely validate it at runtime.

Measure the configuration state and policy capabilities by calling
the IMA hook ima_measure_critical_data().

To enable SELinux data measurement, the following steps are required:

 1, Add "ima_policy=critical_data" to the kernel command line arguments
    to enable measuring SELinux data at boot time.
    For example,
      BOOT_IMAGE=/boot/vmlinuz-5.11.0-rc3+ root=UUID=fd643309-a5d2-4ed3-b10d-3c579a5fab2f ro nomodeset security=selinux ima_policy=critical_data

 2, Add the following rule to /etc/ima/ima-policy
       measure func=CRITICAL_DATA label=selinux

Sample measurement of SELinux state and policy capabilities:

10 2122...65d8 ima-buf sha256:13c2...1292 selinux-state 696e...303b

Execute the following command to extract the measured data
from the IMA's runtime measurements list:

  grep "selinux-state" /sys/kernel/security/integrity/ima/ascii_runtime_measurements | tail -1 | cut -d' ' -f 6 | xxd -r -p

The output should be a list of key-value pairs. For example,
 initialized=1;enforcing=0;checkreqprot=1;network_peer_controls=1;open_perms=1;extended_socket_class=1;always_check_network=0;cgroup_seclabel=1;nnp_nosuid_transition=1;genfs_seclabel_symlinks=0;

To verify the measurement is consistent with the current SELinux state
reported on the system, compare the integer values in the following
files with those set in the IMA measurement (using the following commands):

 - cat /sys/fs/selinux/enforce
 - cat /sys/fs/selinux/checkreqprot
 - cat /sys/fs/selinux/policy_capabilities/[capability_file]

Note that the actual verification would be against an expected state
and done on a separate system (likely an attestation server) requiring
"initialized=1;enforcing=1;checkreqprot=0;"
for a secure state and then whatever policy capabilities are actually
set in the expected policy (which can be extracted from the policy
itself via seinfo, for example).

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-08 19:39:07 -05:00
Vivek Goyal 7fa2e79a6b selinux: Allow context mounts for unpriviliged overlayfs
Now overlayfs allow unpriviliged mounts. That is root inside a non-init
user namespace can mount overlayfs. This is being added in 5.11 kernel.

Giuseppe tried to mount overlayfs with option "context" and it failed
with error -EACCESS.

$ su test
$ unshare -rm
$ mkdir -p lower upper work merged
$ mount -t overlay -o lowerdir=lower,workdir=work,upperdir=upper,userxattr,context='system_u:object_r:container_file_t:s0' none merged

This fails with -EACCESS. It works if option "-o context" is not specified.

Little debugging showed that selinux_set_mnt_opts() returns -EACCESS.

So this patch adds "overlay" to the list, where it is fine to specific
context from non init_user_ns.

Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
[PM: trimmed the changelog from the description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-03-08 19:34:38 -05:00
Lakshmi Ramasubramanian fee3ff99bc powerpc: Move arch independent ima kexec functions to drivers/of/kexec.c
The functions defined in "arch/powerpc/kexec/ima.c" handle setting up
and freeing the resources required to carry over the IMA measurement
list from the current kernel to the next kernel across kexec system call.
These functions do not have architecture specific code, but are
currently limited to powerpc.

Move remove_ima_buffer() and setup_ima_buffer() calls into
of_kexec_alloc_and_setup_fdt() defined in "drivers/of/kexec.c".

Move the remaining architecture independent functions from
"arch/powerpc/kexec/ima.c" to "drivers/of/kexec.c".
Delete "arch/powerpc/kexec/ima.c" and "arch/powerpc/include/asm/ima.h".
Remove references to the deleted files and functions in powerpc and
in ima.

Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Tested-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-11-nramas@linux.microsoft.com
2021-03-08 12:06:29 -07:00
Lakshmi Ramasubramanian 0c605158be powerpc: Move ima buffer fields to struct kimage
The fields ima_buffer_addr and ima_buffer_size in "struct kimage_arch"
for powerpc are used to carry forward the IMA measurement list across
kexec system call.  These fields are not architecture specific, but are
currently limited to powerpc.

arch_ima_add_kexec_buffer() defined in "arch/powerpc/kexec/ima.c"
sets ima_buffer_addr and ima_buffer_size for the kexec system call.
This function does not have architecture specific code, but is
currently limited to powerpc.

Move ima_buffer_addr and ima_buffer_size to "struct kimage".
Set ima_buffer_addr and ima_buffer_size in ima_add_kexec_buffer()
in security/integrity/ima/ima_kexec.c.

Co-developed-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Prakhar Srivastava <prsriva@linux.microsoft.com>
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20210221174930.27324-9-nramas@linux.microsoft.com
2021-03-08 12:06:29 -07:00
Linus Torvalds c03c21ba6f Keyrings miscellany
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmAj3ncACgkQ+7dXa6fL
 C2s7eQ/+Obr0Mp9mYJhht/LN3YAIgFrgyPCgwsmYsanc0j8cdECDMoz6b287/W3g
 69zHQUv7iVqHPIK+NntBSSpHKlCapfUKikt5c9kfPNuDn3aT3ZpTBr1t3DYJX1uO
 K6tMUXNDNoi1O70yqsVZEq4Qcv2+1uQXP+F/GxjNkd/brID1HsV/VENKCLSRbyP/
 iazgXx/hChQSdu0YbZwMCkuVErEAJvRWU75l9D1v1Uaaaqro5QdelMdz9DZeO4E5
 CirXXA5d9zAA9ANj0T7odyg79vhFOz8yc0lFhybc/EPNYSHeOV1o8eK3h4ZIZ+hl
 BShwe7feHlmxkQ5WQBppjAn+aFiBtw7LKIptS3YpMI5M7clgT1THDPhgOdVWmbZk
 sBbD0bToP8sst6Zi/95StbqawjagR3uE6YBXRVSyTefGQdG1q1c0u9FM/8bZTc3B
 q4iDTbvfYdUFN6ywQZhh09v6ljZLdNSv0ht1wLcgByBmgdBvzmBgfczEKtAZcxfY
 cLBRvjc8ZjWpfqjrvmmURGQaqwVlO9YBGRzJJwALH9xib1IQbuVmUOilaIGTcCiE
 W1Qd4YLPh8Gv1B9GDY2HMw56IGp75QHD56KwIbf93c8JeEB08/iWSuH+kKwyup8+
 h5xXpzt5NKAx4GQesWeBjWvt+AmZ+uJDtt4dNb/j91gmbh3POTI=
 =HCrJ
 -----END PGP SIGNATURE-----

Merge tag 'keys-misc-20210126' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull keyring updates from David Howells:
 "Here's a set of minor keyrings fixes/cleanups that I've collected from
  various people for the upcoming merge window.

  A couple of them might, in theory, be visible to userspace:

   - Make blacklist_vet_description() reject uppercase letters as they
     don't match the all-lowercase hex string generated for a blacklist
     search.

     This may want reconsideration in the future, but, currently, you
     can't add to the blacklist keyring from userspace and the only
     source of blacklist keys generates lowercase descriptions.

   - Fix blacklist_init() to use a new KEY_ALLOC_* flag to indicate that
     it wants KEY_FLAG_KEEP to be set rather than passing KEY_FLAG_KEEP
     into keyring_alloc() as KEY_FLAG_KEEP isn't a valid alloc flag.

     This isn't currently a problem as the blacklist keyring isn't
     currently writable by userspace.

  The rest of the patches are cleanups and I don't think they should
  have any visible effect"

* tag 'keys-misc-20210126' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  watch_queue: rectify kernel-doc for init_watch()
  certs: Replace K{U,G}IDT_INIT() with GLOBAL_ROOT_{U,G}ID
  certs: Fix blacklist flag type confusion
  PKCS#7: Fix missing include
  certs: Fix blacklisted hexadecimal hash string check
  certs/blacklist: fix kernel doc interface issue
  crypto: public_key: Remove redundant header file from public_key.h
  keys: remove trailing semicolon in macro definition
  crypto: pkcs7: Use match_string() helper to simplify the code
  PKCS#7: drop function from kernel-doc pkcs7_validate_trust_one
  encrypted-keys: Replace HTTP links with HTTPS ones
  crypto: asymmetric_keys: fix some comments in pkcs7_parser.h
  KEYS: remove redundant memset
  security: keys: delete repeated words in comments
  KEYS: asymmetric: Fix kerneldoc
  security/keys: use kvfree_sensitive()
  watch_queue: Drop references to /dev/watch_queue
  keys: Remove outdated __user annotations
  security: keys: Fix fall-through warnings for Clang
2021-02-23 16:09:23 -08:00
Linus Torvalds 7d6beb71da idmapped-mounts-v5.12
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc
 ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f
 4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg=
 =yPaw
 -----END PGP SIGNATURE-----

Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull idmapped mounts from Christian Brauner:
 "This introduces idmapped mounts which has been in the making for some
  time. Simply put, different mounts can expose the same file or
  directory with different ownership. This initial implementation comes
  with ports for fat, ext4 and with Christoph's port for xfs with more
  filesystems being actively worked on by independent people and
  maintainers.

  Idmapping mounts handle a wide range of long standing use-cases. Here
  are just a few:

   - Idmapped mounts make it possible to easily share files between
     multiple users or multiple machines especially in complex
     scenarios. For example, idmapped mounts will be used in the
     implementation of portable home directories in
     systemd-homed.service(8) where they allow users to move their home
     directory to an external storage device and use it on multiple
     computers where they are assigned different uids and gids. This
     effectively makes it possible to assign random uids and gids at
     login time.

   - It is possible to share files from the host with unprivileged
     containers without having to change ownership permanently through
     chown(2).

   - It is possible to idmap a container's rootfs and without having to
     mangle every file. For example, Chromebooks use it to share the
     user's Download folder with their unprivileged containers in their
     Linux subsystem.

   - It is possible to share files between containers with
     non-overlapping idmappings.

   - Filesystem that lack a proper concept of ownership such as fat can
     use idmapped mounts to implement discretionary access (DAC)
     permission checking.

   - They allow users to efficiently changing ownership on a per-mount
     basis without having to (recursively) chown(2) all files. In
     contrast to chown (2) changing ownership of large sets of files is
     instantenous with idmapped mounts. This is especially useful when
     ownership of a whole root filesystem of a virtual machine or
     container is changed. With idmapped mounts a single syscall
     mount_setattr syscall will be sufficient to change the ownership of
     all files.

   - Idmapped mounts always take the current ownership into account as
     idmappings specify what a given uid or gid is supposed to be mapped
     to. This contrasts with the chown(2) syscall which cannot by itself
     take the current ownership of the files it changes into account. It
     simply changes the ownership to the specified uid and gid. This is
     especially problematic when recursively chown(2)ing a large set of
     files which is commong with the aforementioned portable home
     directory and container and vm scenario.

   - Idmapped mounts allow to change ownership locally, restricting it
     to specific mounts, and temporarily as the ownership changes only
     apply as long as the mount exists.

  Several userspace projects have either already put up patches and
  pull-requests for this feature or will do so should you decide to pull
  this:

   - systemd: In a wide variety of scenarios but especially right away
     in their implementation of portable home directories.

         https://systemd.io/HOME_DIRECTORY/

   - container runtimes: containerd, runC, LXD:To share data between
     host and unprivileged containers, unprivileged and privileged
     containers, etc. The pull request for idmapped mounts support in
     containerd, the default Kubernetes runtime is already up for quite
     a while now: https://github.com/containerd/containerd/pull/4734

   - The virtio-fs developers and several users have expressed interest
     in using this feature with virtual machines once virtio-fs is
     ported.

   - ChromeOS: Sharing host-directories with unprivileged containers.

  I've tightly synced with all those projects and all of those listed
  here have also expressed their need/desire for this feature on the
  mailing list. For more info on how people use this there's a bunch of
  talks about this too. Here's just two recent ones:

      https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
      https://fosdem.org/2021/schedule/event/containers_idmap/

  This comes with an extensive xfstests suite covering both ext4 and
  xfs:

      https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts

  It covers truncation, creation, opening, xattrs, vfscaps, setid
  execution, setgid inheritance and more both with idmapped and
  non-idmapped mounts. It already helped to discover an unrelated xfs
  setgid inheritance bug which has since been fixed in mainline. It will
  be sent for inclusion with the xfstests project should you decide to
  merge this.

  In order to support per-mount idmappings vfsmounts are marked with
  user namespaces. The idmapping of the user namespace will be used to
  map the ids of vfs objects when they are accessed through that mount.
  By default all vfsmounts are marked with the initial user namespace.
  The initial user namespace is used to indicate that a mount is not
  idmapped. All operations behave as before and this is verified in the
  testsuite.

  Based on prior discussions we want to attach the whole user namespace
  and not just a dedicated idmapping struct. This allows us to reuse all
  the helpers that already exist for dealing with idmappings instead of
  introducing a whole new range of helpers. In addition, if we decide in
  the future that we are confident enough to enable unprivileged users
  to setup idmapped mounts the permission checking can take into account
  whether the caller is privileged in the user namespace the mount is
  currently marked with.

  The user namespace the mount will be marked with can be specified by
  passing a file descriptor refering to the user namespace as an
  argument to the new mount_setattr() syscall together with the new
  MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
  of extensibility.

  The following conditions must be met in order to create an idmapped
  mount:

   - The caller must currently have the CAP_SYS_ADMIN capability in the
     user namespace the underlying filesystem has been mounted in.

   - The underlying filesystem must support idmapped mounts.

   - The mount must not already be idmapped. This also implies that the
     idmapping of a mount cannot be altered once it has been idmapped.

   - The mount must be a detached/anonymous mount, i.e. it must have
     been created by calling open_tree() with the OPEN_TREE_CLONE flag
     and it must not already have been visible in the filesystem.

  The last two points guarantee easier semantics for userspace and the
  kernel and make the implementation significantly simpler.

  By default vfsmounts are marked with the initial user namespace and no
  behavioral or performance changes are observed.

  The manpage with a detailed description can be found here:

      1d7b902e28

  In order to support idmapped mounts, filesystems need to be changed
  and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The
  patches to convert individual filesystem are not very large or
  complicated overall as can be seen from the included fat, ext4, and
  xfs ports. Patches for other filesystems are actively worked on and
  will be sent out separately. The xfstestsuite can be used to verify
  that port has been done correctly.

  The mount_setattr() syscall is motivated independent of the idmapped
  mounts patches and it's been around since July 2019. One of the most
  valuable features of the new mount api is the ability to perform
  mounts based on file descriptors only.

  Together with the lookup restrictions available in the openat2()
  RESOLVE_* flag namespace which we added in v5.6 this is the first time
  we are close to hardened and race-free (e.g. symlinks) mounting and
  path resolution.

  While userspace has started porting to the new mount api to mount
  proper filesystems and create new bind-mounts it is currently not
  possible to change mount options of an already existing bind mount in
  the new mount api since the mount_setattr() syscall is missing.

  With the addition of the mount_setattr() syscall we remove this last
  restriction and userspace can now fully port to the new mount api,
  covering every use-case the old mount api could. We also add the
  crucial ability to recursively change mount options for a whole mount
  tree, both removing and adding mount options at the same time. This
  syscall has been requested multiple times by various people and
  projects.

  There is a simple tool available at

      https://github.com/brauner/mount-idmapped

  that allows to create idmapped mounts so people can play with this
  patch series. I'll add support for the regular mount binary should you
  decide to pull this in the following weeks:

  Here's an example to a simple idmapped mount of another user's home
  directory:

	u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt

	u1001@f2-vm:/$ ls -al /home/ubuntu/
	total 28
	drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
	drwxr-xr-x 4 root   root   4096 Oct 28 04:00 ..
	-rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
	-rw-r--r-- 1 ubuntu ubuntu  220 Feb 25  2020 .bash_logout
	-rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25  2020 .bashrc
	-rw-r--r-- 1 ubuntu ubuntu  807 Feb 25  2020 .profile
	-rw-r--r-- 1 ubuntu ubuntu    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ ls -al /mnt/
	total 28
	drwxr-xr-x  2 u1001 u1001 4096 Oct 28 22:07 .
	drwxr-xr-x 29 root  root  4096 Oct 28 22:01 ..
	-rw-------  1 u1001 u1001 3154 Oct 28 22:12 .bash_history
	-rw-r--r--  1 u1001 u1001  220 Feb 25  2020 .bash_logout
	-rw-r--r--  1 u1001 u1001 3771 Feb 25  2020 .bashrc
	-rw-r--r--  1 u1001 u1001  807 Feb 25  2020 .profile
	-rw-r--r--  1 u1001 u1001    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw-------  1 u1001 u1001 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ touch /mnt/my-file

	u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file

	u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file

	u1001@f2-vm:/$ ls -al /mnt/my-file
	-rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file

	u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
	-rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file

	u1001@f2-vm:/$ getfacl /mnt/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: mnt/my-file
	# owner: u1001
	# group: u1001
	user::rw-
	user:u1001:rwx
	group::rw-
	mask::rwx
	other::r--

	u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: home/ubuntu/my-file
	# owner: ubuntu
	# group: ubuntu
	user::rw-
	user:ubuntu:rwx
	group::rw-
	mask::rwx
	other::r--"

* tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits)
  xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl
  xfs: support idmapped mounts
  ext4: support idmapped mounts
  fat: handle idmapped mounts
  tests: add mount_setattr() selftests
  fs: introduce MOUNT_ATTR_IDMAP
  fs: add mount_setattr()
  fs: add attr_flags_to_mnt_flags helper
  fs: split out functions to hold writers
  namespace: only take read lock in do_reconfigure_mnt()
  mount: make {lock,unlock}_mount_hash() static
  namespace: take lock_mount_hash() directly when changing flags
  nfs: do not export idmapped mounts
  overlayfs: do not mount on top of idmapped mounts
  ecryptfs: do not mount on top of idmapped mounts
  ima: handle idmapped mounts
  apparmor: handle idmapped mounts
  fs: make helpers idmap mount aware
  exec: handle idmapped mounts
  would_dump: handle idmapped mounts
  ...
2021-02-23 13:39:45 -08:00
Linus Torvalds 7b0b78df9c Merge branch 'userns-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace update from Eric Biederman:
 "There are several pieces of active development, but only a single
  change made it through the gauntlet to be ready for v5.12. That change
  is tightening up the semantics of the v3 capabilities xattr. It is
  just short of being a bug-fix/security issue as no user space is known
  to even generate the problem case"

* 'userns-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  capabilities: Don't allow writing ambiguous v3 file capabilities
2021-02-22 17:13:33 -08:00
Linus Torvalds 250a25e7a1 Merge branch 'work.audit' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull RCU-safe common_lsm_audit() from Al Viro:
 "Make common_lsm_audit() non-blocking and usable from RCU pathwalk
  context.

  We don't really need to grab/drop dentry in there - rcu_read_lock() is
  enough. There's a couple of followups using that to simplify the
  logics in selinux, but those hadn't soaked in -next yet, so they'll
  have to go in next window"

* 'work.audit' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  make dump_common_audit_data() safe to be called from RCU pathwalk
  new helper: d_find_alias_rcu()
2021-02-22 13:05:30 -08:00
Linus Torvalds a2b095e0ef tpmdd updates for Linux v5.12-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIgEABYIADAWIQRE6pSOnaBC00OEHEIaerohdGur0gUCYC2ZZhIcamFya2tvQGtl
 cm5lbC5vcmcACgkQGnq6IXRrq9KM4wEAiMpUqHIX87ZnpRIH3SHD7niOS5+AKwWU
 lklNFH4WLMcBANlp4icOxzNQEpVrBWAf5oNBp9Gyknvd7oat4AUum7kP
 =yNI9
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-v5.12-rc1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen:
 "New features:

   - Cr50 I2C TPM driver

   - sysfs exports of PCR registers in TPM 2.0 chips

  Bug fixes:

   - bug fixes for tpm_tis driver, which had a racy wait for hardware
     state change to be ready to send a command to the TPM chip. The bug
     has existed already since 2006, but has only made itself known in
     recent past. This is the same as the "last time" :-)

   - Otherwise there's bunch of fixes for not as alarming regressions. I
     think the list is about the same as last time, except I added fixes
     for some disjoint bugs in trusted keys that I found some time ago"

* tag 'tpmdd-next-v5.12-rc1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  KEYS: trusted: Reserve TPM for seal and unseal operations
  KEYS: trusted: Fix migratable=1 failing
  KEYS: trusted: Fix incorrect handling of tpm_get_random()
  tpm/ppi: Constify static struct attribute_group
  ABI: add sysfs description for tpm exports of PCR registers
  tpm: add sysfs exports for all banks of PCR registers
  keys: Update comment for restrict_link_by_key_or_keyring_chain
  tpm: Remove tpm_dev_wq_lock
  char: tpm: add i2c driver for cr50
  tpm: Fix fall-through warnings for Clang
  tpm_tis: Clean up locality release
  tpm_tis: Fix check_locality for correct locality acquisition
2021-02-21 17:15:44 -08:00
Linus Torvalds 92ae63c07b Smack updates for v5.12.
Bounds checking for writes to smackfs interfaces.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmAsOFwXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFMKA/9H0eX7y8Np7QErAjVg4rfTkFW
 nJJ6EzBCvmPsYoqNZsb+jWmnCo8k4LmRn3BXSsr4fiI/OwdoBHHuF9ZPl/sHOx+v
 eZZr7+WCJCKv/xdROtz0gVRFs3vbng4cGuhX/vDzqiVtbZ0w1Uh0G1Dpe0XDsGdU
 SL4fh9x8UQwGsTdsCfUAKMhiUxHX1qupsVeH3DC20KSc3wVoddeZTi9GkU6bOXAM
 jxa+w1RwYewpchKeGAjErJ2sNz/yQ7Na6MlejNLgG9QQM9uraY+VoffyInDTcOy8
 yJsYikk6HElVdU8UGWk6ZKcDFd7PlLw2b0FJfx9ICHmvNzZbWHJwPVr8zFCq1SLe
 ydX31IKz6zTsKWxRYUNvLFn4LlT+Okg8u0r/apc/Yn7Cxy8OfElwA5s0K8NURIBs
 cG4li4MiRi1v8JkwQZBN8mhyEV8JF98Wdm6hXqvTITYt4sz4XmVc8c15o/7cOEWo
 zeF5i/HDy9aZRQt4z1y1NKVRx7CQylgJ5INeLebMtVuWILjsO/VIj/bsBcO+LQ4/
 jjnkLJLOQ49TryisgDNY8M+vgCODo6GeFFCjhBQQ7+i4LNedrkhZqOJKsPKOSV8s
 RR6rvrotWDIjKrzmraP3rSLv/HsgKEymo42K+hQlOmN9nbZtSahr7JxanamC5avw
 HIAt0QGJ1XqtprniOyk=
 =Ft+S
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-v5.12' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Bounds checking for writes to smackfs interfaces"

* tag 'Smack-for-v5.12' of git://github.com/cschaufler/smack-next:
  smackfs: restrict bytes count in smackfs write functions
2021-02-21 17:11:07 -08:00
Linus Torvalds d643a99089 integrity-v5.12
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEjSMCCC7+cjo3nszSa3kkZrA+cVoFAmArRwIUHHpvaGFyQGxp
 bnV4LmlibS5jb20ACgkQa3kkZrA+cVo6JxAAkZHDhv6Zv7FfVsFjE7yDJwRFBu4o
 jnAowPxa/xl6MlR2ICTFLHjOimEcpvzySO2IM85WCxjRYaNevITOxEZE+qfE/Byo
 K1MuZOSXXBa2+AgO1Tku+ZNrQvzTsphgtvhlSD9ReN7P84C/rxG5YDomME+8/6rR
 QH7Ly/izyc3VNKq7nprT8F2boJ0UxpcwNHZiH2McQD3UvUaZOecwpcpvth5pbgad
 Ej2r72Q+IR0voqM/T1dc4TjW5Wcw/m27vhGQoOfQ5f+as5r9r1cPSWj0wRJTkATo
 F/SiKuyWUwOGkRO8I9aaXXzTBgcJw/7MmZe8yNDg5QJrUzD8F5cdjlHZdsnz5BJq
 tLo4kUsR4xMePEppJ4a10ZUDQa737j97C20xTwOHf6mKGIqmoooGAsjW9xUyYqHU
 rYuLP4qB7ua4j8Uz9zVJazjgQWPQ+8Ad9MkjQLLhr00Azpz4mVweWVGjCJQC0pky
 Jr2H4xj3JLAoygqMWfJxr9aVBpfy4Wmo0U29ryZuxZUr178qSXoL3QstGWXRa2MN
 TwzpgHi1saItQ6iXAO0HB6Tsw0h8INyjrm7c3ANbmBwMsYMYeKcTG87+Z0LkK82w
 C5SW2uQT9aLBXx9lZx8z0RpxygO1cW+KjlZxRYSfQa/ev/aF2kBz0ruGQgvqai4K
 ceh/cwrYjrCbFVc=
 =mojv
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull IMA updates from Mimi Zohar:
 "New is IMA support for measuring kernel critical data, as per usual
  based on policy. The first example measures the in memory SELinux
  policy. The second example measures the kernel version.

  In addition are four bug fixes to address memory leaks and a missing
  'static' function declaration"

* tag 'integrity-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  integrity: Make function integrity_add_key() static
  ima: Free IMA measurement buffer after kexec syscall
  ima: Free IMA measurement buffer on error
  IMA: Measure kernel version in early boot
  selinux: include a consumer of the new IMA critical data hook
  IMA: define a builtin critical data measurement policy
  IMA: extend critical data hook to limit the measurement based on a label
  IMA: limit critical data measurement based on a label
  IMA: add policy rule to measure critical data
  IMA: define a hook to measure kernel integrity critical data
  IMA: add support to measure buffer data hash
  IMA: generalize keyring specific measurement constructs
  evm: Fix memleak in init_desc
2021-02-21 17:08:06 -08:00
Linus Torvalds d1fec2214b selinux/stable-5.12 PR 20210215
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmAqwVUUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXP1nw//bbmtBhpaG+RnmPrSGZgy3gqbB3gU
 ggJ5UKNvYclrej2dur3EHXPEB0YWDv2D2OgChfTAu+T7sc2sBF3bz9qAu1a556mV
 JdfID8aoUwSk+oN7AKcwbdua+wLhXppAnYSKaknR+tjmWzvVKBDkrOovl52oR6L8
 Wx3YHCy7yPO79wqGqoWLCI7aI8ByfovoyOf6Xr/sPl+gMuBvbFoJeO1Pa9YNoI0z
 noGT1h6vLjgyvegqMX5lCkh1sUlcOsmXkAksw1FyEAfJfr0MPLLkVoTaBAook5NO
 X7VEhv845CjfIfoCXDdIHzriDWHp3tEDMSQaLwU3QSjfsbyNVh4ggwuHZYqrR9dL
 DerCa+89XYdCldrBzBeRs3Qd/6bZtHpd62pHDgn+NwMdjEckCHh41t2f2odD+Rdy
 2Fv+50C3m+7JjUawKhzgWR3BYJhafiKKUiWc2GBm1cBSr7+vSKokDG27gJmtNCoE
 TedSlQTPyi47zjZMnf/laSqGEUG9xz79xAiDPDP5yuxbDvN5andRYHmhI4thbGcq
 5DsVx5DDWaXtJxRVlsTgTeyvjdp61Rbvj8jvbbD/St+8PNsbpFOerbjSaidovfJK
 Y0YrkL/sKcGkM8HbQCcl1DKd4l1EfDIKUch078LQJHetuHh4L89U+r5uqZRsgZYD
 /EWeEw56llrepMQ=
 =5fVL
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20210215' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "We've got a good handful of patches for SELinux this time around; with
  everything passing the selinux-testsuite and applying cleanly to your
  tree as of a few minutes ago. The highlights are:

   - Add support for labeling anonymous inodes, and extend this new
     support to userfaultfd.

   - Fallback to SELinux genfs file labeling if the filesystem does not
     have xattr support. This is useful for virtiofs which can vary in
     its xattr support depending on the backing filesystem.

   - Classify and handle MPTCP the same as TCP in SELinux.

   - Ensure consistent behavior between inode_getxattr and
     inode_listsecurity when the SELinux policy is not loaded. This
     fixes a known problem with overlayfs.

   - A couple of patches to prune some unused variables from the SELinux
     code, mark private variables as static, and mark other variables as
     __ro_after_init or __read_mostly"

* tag 'selinux-pr-20210215' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  fs: anon_inodes: rephrase to appropriate kernel-doc
  userfaultfd: use secure anon inodes for userfaultfd
  selinux: teach SELinux about anonymous inodes
  fs: add LSM-supporting anon-inode interface
  security: add inode_init_security_anon() LSM hook
  selinux: fall back to SECURITY_FS_USE_GENFS if no xattr support
  selinux: mark selinux_xfrm_refcount as __read_mostly
  selinux: mark some global variables __ro_after_init
  selinux: make selinuxfs_mount static
  selinux: drop the unnecessary aurule_callback variable
  selinux: remove unused global variables
  selinux: fix inconsistency between inode_getxattr and inode_listsecurity
  selinux: handle MPTCP consistently with TCP
2021-02-21 16:54:54 -08:00
Linus Torvalds e210761fb3 Detect kernel thread correctly, and ignore harmless data race.
tomoyo: recognize kernel threads correctly
   tomoyo: ignore data race while checking quota
 
  security/tomoyo/file.c    |   16 ++++++++--------
  security/tomoyo/network.c |   10 +++++-----
  security/tomoyo/util.c    |   24 ++++++++++++------------
  3 files changed, 25 insertions(+), 25 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJgKc61AAoJEEJfEo0MZPUq1qkQAKqABDH28UI/T1lq8YWQ/geC
 Z5SIisZ6IS8ovNonEmO13g6j7N45Dul7oIgA81GbL8D0CJFdQaixz6WKTepy4clz
 6KOnwKimezA3sLyKEJFUywu3VT8w3kb2bUb10gbqRTOiB0xNH/Ix8lnbLu5XWzKM
 /gVmDNqRIdjr864bRTygJZxJcn+KXpkfK/Oc02+xx1AzG8ajc5AjJh8oRQq4PsQn
 dUQLdGyHmVY66NIn19ErV9OVEnbcZIQoKNRnnKvCPJLkZRheqNoVFWwW4ZqhznV1
 9MWRcx626pDUDDkP5a72vVPLmMi1zqHk4I70cu865Tpm2NwjovztX1Ru6z2aWfKd
 GTqt3ajOzjWBPoGVAoTdrvcBena2cljMK6q0+DXT8dr2z/LKFdYVNK4t/ioMywXy
 6CS56bVzWevBtUpXypwsjxtk4Fi4w+NWw4GnnPTiaiSKnOEcOIdPU4VFMVan14Mx
 pMkzKrGt2YBKUVYcyaz67lfU3/lhqxtMt0oOuaMhXYM+YpaBbRcNztPTJXfyYRHJ
 PLKyPX/G343WFjDD0qnhbYixtbAJzjIjo7NB0ZGXYTbpYYNm4TaLztYiE0bQ8X5e
 fxIma4Ua65E8dmpUa1JinZ7peL7cEJErOYHC4so/VNQ9B44BUgJJOf8Mh0WU32mN
 fGQCkWGLFWgVVDLs5/V/
 =WxKO
 -----END PGP SIGNATURE-----

Merge tag 'tomoyo-pr-20210215' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1

Pull tomoyo updates from Tetsuo Handa:
 "Detect kernel thread correctly, and ignore harmless data race"

* tag 'tomoyo-pr-20210215' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
  tomoyo: recognize kernel threads correctly
  tomoyo: ignore data race while checking quota
2021-02-21 16:52:06 -08:00
Jarkko Sakkinen 8c657a0590 KEYS: trusted: Reserve TPM for seal and unseal operations
When TPM 2.0 trusted keys code was moved to the trusted keys subsystem,
the operations were unwrapped from tpm_try_get_ops() and tpm_put_ops(),
which are used to take temporarily the ownership of the TPM chip. The
ownership is only taken inside tpm_send(), but this is not sufficient,
as in the key load TPM2_CC_LOAD, TPM2_CC_UNSEAL and TPM2_FLUSH_CONTEXT
need to be done as a one single atom.

Take the TPM chip ownership before sending anything with
tpm_try_get_ops() and tpm_put_ops(), and use tpm_transmit_cmd() to send
TPM commands instead of tpm_send(), reverting back to the old behaviour.

Fixes: 2e19e10131 ("KEYS: trusted: Move TPM2 trusted keys code")
Reported-by: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: stable@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Acked-by Sumit Garg <sumit.garg@linaro.org>
Tested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-02-16 10:40:28 +02:00
Jarkko Sakkinen 8da7520c80 KEYS: trusted: Fix migratable=1 failing
Consider the following transcript:

$ keyctl add trusted kmk "new 32 blobauth=helloworld keyhandle=80000000 migratable=1" @u
add_key: Invalid argument

The documentation has the following description:

  migratable=   0|1 indicating permission to reseal to new PCR values,
                default 1 (resealing allowed)

The consequence is that "migratable=1" should succeed. Fix this by
allowing this condition to pass instead of return -EINVAL.

[*] Documentation/security/keys/trusted-encrypted.rst

Cc: stable@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Fixes: d00a1c72f7 ("keys: add new trusted key-type")
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-02-16 10:40:28 +02:00
Jarkko Sakkinen 5df16caada KEYS: trusted: Fix incorrect handling of tpm_get_random()
When tpm_get_random() was introduced, it defined the following API for the
return value:

1. A positive value tells how many bytes of random data was generated.
2. A negative value on error.

However, in the call sites the API was used incorrectly, i.e. as it would
only return negative values and otherwise zero. Returning he positive read
counts to the user space does not make any possible sense.

Fix this by returning -EIO when tpm_get_random() returns a positive value.

Fixes: 41ab999c80 ("tpm: Move tpm_get_random api into the TPM device driver")
Cc: stable@vger.kernel.org
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
2021-02-16 10:40:28 +02:00
Wei Yongjun f6692213b5 integrity: Make function integrity_add_key() static
The sparse tool complains as follows:

security/integrity/digsig.c:146:12: warning:
 symbol 'integrity_add_key' was not declared. Should it be static?

This function is not used outside of digsig.c, so this
commit marks it static.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 60740accf7 ("integrity: Load certs to the platform keyring")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-02-12 11:11:59 -05:00
Mimi Zohar cccb0efdef Merge branch 'ima-kexec-fixes' into next-integrity 2021-02-10 16:34:06 -05:00
Lakshmi Ramasubramanian f31e3386a4 ima: Free IMA measurement buffer after kexec syscall
IMA allocates kernel virtual memory to carry forward the measurement
list, from the current kernel to the next kernel on kexec system call,
in ima_add_kexec_buffer() function.  This buffer is not freed before
completing the kexec system call resulting in memory leak.

Add ima_buffer field in "struct kimage" to store the virtual address
of the buffer allocated for the IMA measurement list.
Free the memory allocated for the IMA measurement list in
kimage_file_post_load_cleanup() function.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Fixes: 7b8589cc29 ("ima: on soft reboot, save the measurement list")
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-02-10 15:49:38 -05:00
Lakshmi Ramasubramanian 6d14c65178 ima: Free IMA measurement buffer on error
IMA allocates kernel virtual memory to carry forward the measurement
list, from the current kernel to the next kernel on kexec system call,
in ima_add_kexec_buffer() function.  In error code paths this memory
is not freed resulting in memory leak.

Free the memory allocated for the IMA measurement list in
the error code paths in ima_add_kexec_buffer() function.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Fixes: 7b8589cc29 ("ima: on soft reboot, save the measurement list")
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-02-10 15:49:35 -05:00
Sabyrzhan Tasbolatov 7ef4c19d24 smackfs: restrict bytes count in smackfs write functions
syzbot found WARNINGs in several smackfs write operations where
bytes count is passed to memdup_user_nul which exceeds
GFP MAX_ORDER. Check count size if bigger than PAGE_SIZE.

Per smackfs doc, smk_write_net4addr accepts any label or -CIPSO,
smk_write_net6addr accepts any label or -DELETE. I couldn't find
any general rule for other label lengths except SMK_LABELLEN,
SMK_LONGLABEL, SMK_CIPSOMAX which are documented.

Let's constrain, in general, smackfs label lengths for PAGE_SIZE.
Although fuzzer crashes write to smackfs/netlabel on 0x400000 length.

Here is a quick way to reproduce the WARNING:
python -c "print('A' * 0x400000)" > /sys/fs/smackfs/netlabel

Reported-by: syzbot+a71a442385a0b2815497@syzkaller.appspotmail.com
Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2021-02-02 17:14:02 -08:00
Tetsuo Handa 9c83465f32 tomoyo: recognize kernel threads correctly
Commit db68ce10c4 ("new helper: uaccess_kernel()") replaced
segment_eq(get_fs(), KERNEL_DS) with uaccess_kernel(). But the correct
method for tomoyo to check whether current is a kernel thread in order
to assume that kernel threads are privileged for socket operations was
(current->flags & PF_KTHREAD). Now that uaccess_kernel() became 0 on x86,
tomoyo has to fix this problem. Do like commit 942cb357ae ("Smack:
Handle io_uring kernel thread privileges") does.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2021-02-01 11:53:05 +09:00
Tetsuo Handa 5797e861e4 tomoyo: ignore data race while checking quota
syzbot is reporting that tomoyo's quota check is racy [1]. But this check
is tolerant of some degree of inaccuracy. Thus, teach KCSAN to ignore
this data race.

[1] https://syzkaller.appspot.com/bug?id=999533deec7ba6337f8aa25d8bd1a4d5f7e50476

Reported-by: syzbot <syzbot+0789a72b46fd91431bd8@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2021-02-01 11:52:11 +09:00
Miklos Szeredi f2b00be488 cap: fix conversions on getxattr
If a capability is stored on disk in v2 format cap_inode_getsecurity() will
currently return in v2 format unconditionally.

This is wrong: v2 cap should be equivalent to a v3 cap with zero rootid,
and so the same conversions performed on it.

If the rootid cannot be mapped, v3 is returned unconverted.  Fix this so
that both v2 and v3 return -EOVERFLOW if the rootid (or the owner of the fs
user namespace in case of v2) cannot be mapped into the current user
namespace.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-01-28 10:22:48 +01:00
Raphael Gianotti b3f82afc10 IMA: Measure kernel version in early boot
The integrity of a kernel can be verified by the boot loader on cold
boot, and during kexec, by the current running kernel, before it is
loaded. However, it is still possible that the new kernel being
loaded is older than the current kernel, and/or has known
vulnerabilities. Therefore, it is imperative that an attestation
service be able to verify the version of the kernel being loaded on
the client, from cold boot and subsequent kexec system calls,
ensuring that only kernels with versions known to be good are loaded.

Measure the kernel version using ima_measure_critical_data() early on
in the boot sequence, reducing the chances of known kernel
vulnerabilities being exploited. With IMA being part of the kernel,
this overall approach makes the measurement itself more trustworthy.

To enable measuring the kernel version "ima_policy=critical_data"
needs to be added to the kernel command line arguments.
For example,
        BOOT_IMAGE=/boot/vmlinuz-5.11.0-rc3+ root=UUID=fd643309-a5d2-4ed3-b10d-3c579a5fab2f ro nomodeset ima_policy=critical_data

If runtime measurement of the kernel version is ever needed, the
following should be added to /etc/ima/ima-policy:

        measure func=CRITICAL_DATA label=kernel_info

To extract the measured data after boot, the following command can be used:

        grep -m 1 "kernel_version" \
        /sys/kernel/security/integrity/ima/ascii_runtime_measurements

Sample output from the command above:

        10 a8297d408e9d5155728b619761d0dd4cedf5ef5f ima-buf
        sha256:5660e19945be0119bc19cbbf8d9c33a09935ab5d30dad48aa11f879c67d70988
        kernel_version 352e31312e302d7263332d31363138372d676564623634666537383234342d6469727479

The above hex-ascii string corresponds to the kernel version
(e.g. xxd -r -p):

        5.11.0-rc3-16187-gedb64fe78244-dirty

Signed-off-by: Raphael Gianotti <raphgi@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-26 19:06:41 -05:00
Christian Brauner a2d2329e30
ima: handle idmapped mounts
IMA does sometimes access the inode's i_uid and compares it against the
rules' fowner. Enable IMA to handle idmapped mounts by passing down the
mount's user namespace. We simply make use of the helpers we introduced
before. If the initial user namespace is passed nothing changes so
non-idmapped mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-27-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:20 +01:00
Christian Brauner 3cee6079f6
apparmor: handle idmapped mounts
The i_uid and i_gid are mostly used when logging for AppArmor. This is
broken in a bunch of places where the global root id is reported instead
of the i_uid or i_gid of the file. Nonetheless, be kind and log the
mapped inode if we're coming from an idmapped mount. If the initial user
namespace is passed nothing changes so non-idmapped mounts will see
identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-26-christian.brauner@ubuntu.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:20 +01:00
Christian Brauner 549c729771
fs: make helpers idmap mount aware
Extend some inode methods with an additional user namespace argument. A
filesystem that is aware of idmapped mounts will receive the user
namespace the mount has been marked with. This can be used for
additional permission checking and also to enable filesystems to
translate between uids and gids if they need to. We have implemented all
relevant helpers in earlier patches.

As requested we simply extend the exisiting inode method instead of
introducing new ones. This is a little more code churn but it's mostly
mechanical and doesnt't leave us with additional inode methods.

Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:20 +01:00
Christian Brauner 71bc356f93
commoncap: handle idmapped mounts
When interacting with user namespace and non-user namespace aware
filesystem capabilities the vfs will perform various security checks to
determine whether or not the filesystem capabilities can be used by the
caller, whether they need to be removed and so on. The main
infrastructure for this resides in the capability codepaths but they are
called through the LSM security infrastructure even though they are not
technically an LSM or optional. This extends the existing security hooks
security_inode_removexattr(), security_inode_killpriv(),
security_inode_getsecurity() to pass down the mount's user namespace and
makes them aware of idmapped mounts.

In order to actually get filesystem capabilities from disk the
capability infrastructure exposes the get_vfs_caps_from_disk() helper.
For user namespace aware filesystem capabilities a root uid is stored
alongside the capabilities.

In order to determine whether the caller can make use of the filesystem
capability or whether it needs to be ignored it is translated according
to the superblock's user namespace. If it can be translated to uid 0
according to that id mapping the caller can use the filesystem
capabilities stored on disk. If we are accessing the inode that holds
the filesystem capabilities through an idmapped mount we map the root
uid according to the mount's user namespace. Afterwards the checks are
identical to non-idmapped mounts: reading filesystem caps from disk
enforces that the root uid associated with the filesystem capability
must have a mapping in the superblock's user namespace and that the
caller is either in the same user namespace or is a descendant of the
superblock's user namespace. For filesystems that are mountable inside
user namespace the caller can just mount the filesystem and won't
usually need to idmap it. If they do want to idmap it they can create an
idmapped mount and mark it with a user namespace they created and which
is thus a descendant of s_user_ns. For filesystems that are not
mountable inside user namespaces the descendant rule is trivially true
because the s_user_ns will be the initial user namespace.

If the initial user namespace is passed nothing changes so non-idmapped
mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-11-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:17 +01:00
Tycho Andersen c7c7a1a18a
xattr: handle idmapped mounts
When interacting with extended attributes the vfs verifies that the
caller is privileged over the inode with which the extended attribute is
associated. For posix access and posix default extended attributes a uid
or gid can be stored on-disk. Let the functions handle posix extended
attributes on idmapped mounts. If the inode is accessed through an
idmapped mount we need to map it according to the mount's user
namespace. Afterwards the checks are identical to non-idmapped mounts.
This has no effect for e.g. security xattrs since they don't store uids
or gids and don't perform permission checks on them like posix acls do.

Link: https://lore.kernel.org/r/20210121131959.646623-10-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Tycho Andersen <tycho@tycho.pizza>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:17 +01:00
Christian Brauner e65ce2a50c
acl: handle idmapped mounts
The posix acl permission checking helpers determine whether a caller is
privileged over an inode according to the acls associated with the
inode. Add helpers that make it possible to handle acls on idmapped
mounts.

The vfs and the filesystems targeted by this first iteration make use of
posix_acl_fix_xattr_from_user() and posix_acl_fix_xattr_to_user() to
translate basic posix access and default permissions such as the
ACL_USER and ACL_GROUP type according to the initial user namespace (or
the superblock's user namespace) to and from the caller's current user
namespace. Adapt these two helpers to handle idmapped mounts whereby we
either map from or into the mount's user namespace depending on in which
direction we're translating.
Similarly, cap_convert_nscap() is used by the vfs to translate user
namespace and non-user namespace aware filesystem capabilities from the
superblock's user namespace to the caller's user namespace. Enable it to
handle idmapped mounts by accounting for the mount's user namespace.

In addition the fileystems targeted in the first iteration of this patch
series make use of the posix_acl_chmod() and, posix_acl_update_mode()
helpers. Both helpers perform permission checks on the target inode. Let
them handle idmapped mounts. These two helpers are called when posix
acls are set by the respective filesystems to handle this case we extend
the ->set() method to take an additional user namespace argument to pass
the mount's user namespace down.

Link: https://lore.kernel.org/r/20210121131959.646623-9-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:17 +01:00
Christian Brauner 21cb47be6f
inode: make init and permission helpers idmapped mount aware
The inode_owner_or_capable() helper determines whether the caller is the
owner of the inode or is capable with respect to that inode. Allow it to
handle idmapped mounts. If the inode is accessed through an idmapped
mount it according to the mount's user namespace. Afterwards the checks
are identical to non-idmapped mounts. If the initial user namespace is
passed nothing changes so non-idmapped mounts will see identical
behavior as before.

Similarly, allow the inode_init_owner() helper to handle idmapped
mounts. It initializes a new inode on idmapped mounts by mapping the
fsuid and fsgid of the caller from the mount's user namespace. If the
initial user namespace is passed nothing changes so non-idmapped mounts
will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-7-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:16 +01:00
Christian Brauner 0558c1bf5a
capability: handle idmapped mounts
In order to determine whether a caller holds privilege over a given
inode the capability framework exposes the two helpers
privileged_wrt_inode_uidgid() and capable_wrt_inode_uidgid(). The former
verifies that the inode has a mapping in the caller's user namespace and
the latter additionally verifies that the caller has the requested
capability in their current user namespace.
If the inode is accessed through an idmapped mount map it into the
mount's user namespace. Afterwards the checks are identical to
non-idmapped inodes. If the initial user namespace is passed all
operations are a nop so non-idmapped mounts will not see a change in
behavior.

Link: https://lore.kernel.org/r/20210121131959.646623-5-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:16 +01:00
David Howells 4993e1f947 certs: Fix blacklist flag type confusion
KEY_FLAG_KEEP is not meant to be passed to keyring_alloc() or key_alloc(),
as these only take KEY_ALLOC_* flags.  KEY_FLAG_KEEP has the same value as
KEY_ALLOC_BYPASS_RESTRICTION, but fortunately only key_create_or_update()
uses it.  LSMs using the key_alloc hook don't check that flag.

KEY_FLAG_KEEP is then ignored but fortunately (again) the root user cannot
write to the blacklist keyring, so it is not possible to remove a key/hash
from it.

Fix this by adding a KEY_ALLOC_SET_KEEP flag that tells key_alloc() to set
KEY_FLAG_KEEP on the new key.  blacklist_init() can then, correctly, pass
this to keyring_alloc().

We can also use this in ima_mok_init() rather than setting the flag
manually.

Note that this doesn't fix an observable bug with the current
implementation but it is required to allow addition of new hashes to the
blacklist in the future without making it possible for them to be removed.

Fixes: 734114f878 ("KEYS: Add a system blacklist keyring")
Reported-by: Mickaël Salaün <mic@linux.microsoft.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mickaël Salaün <mic@linux.microsoft.com>
cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
2021-01-21 16:16:10 +00:00
Tom Rix c224926edf KEYS: remove redundant memset
Reviewing use of memset in keyctl_pkey.c

keyctl_pkey_params_get prologue code to set params up

	memset(params, 0, sizeof(*params));
	params->encoding = "raw";

keyctl_pkey_query has the same prologue
and calls keyctl_pkey_params_get.

So remove the prologue.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
2021-01-21 16:16:09 +00:00
Randy Dunlap 328c95db01 security: keys: delete repeated words in comments
Drop repeated words in comments.
{to, will, the}

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
Cc: keyrings@vger.kernel.org
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
2021-01-21 16:16:09 +00:00
Denis Efremov 272a121940 security/keys: use kvfree_sensitive()
Use kvfree_sensitive() instead of open-coding it.

Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
2021-01-21 16:16:09 +00:00
Gabriel Krisman Bertazi 8fe62e0c0e watch_queue: Drop references to /dev/watch_queue
The merged API doesn't use a watch_queue device, but instead relies on
pipes, so let the documentation reflect that.

Fixes: f7e47677e3 ("watch_queue: Add a key/keyring notification facility")
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
2021-01-21 16:16:08 +00:00
Jann Horn 796e46f9e2 keys: Remove outdated __user annotations
When the semantics of the ->read() handlers were changed such that "buffer"
is a kernel pointer, some __user annotations survived.
Since they're wrong now, get rid of them.

Fixes: d3ec10aa95 ("KEYS: Don't write out to userspace while holding key semaphore")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
2021-01-21 16:16:08 +00:00
Gustavo A. R. Silva 634c21bb98 security: keys: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of letting the code fall
through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Ben Boeckel <mathstuf@gmail.com>
2021-01-21 16:16:08 +00:00
Al Viro 23d8f5b684 make dump_common_audit_data() safe to be called from RCU pathwalk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-01-16 15:12:08 -05:00
Al Viro d36a1dd9f7 dump_common_audit_data(): fix racy accesses to ->d_name
We are not guaranteed the locking environment that would prevent
dentry getting renamed right under us.  And it's possible for
old long name to be freed after rename, leading to UAF here.

Cc: stable@kernel.org # v2.6.2+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-01-16 15:11:35 -05:00
Lakshmi Ramasubramanian fdd1ffe8a8 selinux: include a consumer of the new IMA critical data hook
SELinux stores the active policy in memory, so the changes to this data
at runtime would have an impact on the security guarantees provided
by SELinux.  Measuring in-memory SELinux policy through IMA subsystem
provides a secure way for the attestation service to remotely validate
the policy contents at runtime.

Measure the hash of the loaded policy by calling the IMA hook
ima_measure_critical_data().  Since the size of the loaded policy
can be large (several MB), measure the hash of the policy instead of
the entire policy to avoid bloating the IMA log entry.

To enable SELinux data measurement, the following steps are required:

1, Add "ima_policy=critical_data" to the kernel command line arguments
   to enable measuring SELinux data at boot time.
For example,
  BOOT_IMAGE=/boot/vmlinuz-5.10.0-rc1+ root=UUID=fd643309-a5d2-4ed3-b10d-3c579a5fab2f ro nomodeset security=selinux ima_policy=critical_data

2, Add the following rule to /etc/ima/ima-policy
   measure func=CRITICAL_DATA label=selinux

Sample measurement of the hash of SELinux policy:

To verify the measured data with the current SELinux policy run
the following commands and verify the output hash values match.

  sha256sum /sys/fs/selinux/policy | cut -d' ' -f 1

  grep "selinux-policy-hash" /sys/kernel/security/integrity/ima/ascii_runtime_measurements | tail -1 | cut -d' ' -f 6

Note that the actual verification of SELinux policy would require loading
the expected policy into an identical kernel on a pristine/known-safe
system and run the sha256sum /sys/kernel/selinux/policy there to get
the expected hash.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:46 -05:00
Lakshmi Ramasubramanian 03cee16836 IMA: define a builtin critical data measurement policy
Define a new critical data builtin policy to allow measuring
early kernel integrity critical data before a custom IMA policy
is loaded.

Update the documentation on kernel parameters to document
the new critical data builtin policy.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:43 -05:00
Tushar Sugandhi 9f5d7d23cc IMA: extend critical data hook to limit the measurement based on a label
The IMA hook ima_measure_critical_data() does not support a way to
specify the source of the critical data provider.  Thus, the data
measurement cannot be constrained based on the data source label
in the IMA policy.

Extend the IMA hook ima_measure_critical_data() to support passing
the data source label as an input parameter, so that the policy rule can
be used to limit the measurements based on the label.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:38 -05:00
Tushar Sugandhi 47d76a4840 IMA: limit critical data measurement based on a label
Integrity critical data may belong to a single subsystem or it may
arise from cross subsystem interaction.  Currently there is no mechanism
to group or limit the data based on certain label.  Limiting and
grouping critical data based on a label would make it flexible and
configurable to measure.

Define "label:=", a new IMA policy condition, for the IMA func
CRITICAL_DATA to allow grouping and limiting measurement of integrity
critical data.

Limit the measurement to the labels that are specified in the IMA
policy - CRITICAL_DATA+"label:=".  If "label:=" is not provided with
the func CRITICAL_DATA, measure all the input integrity critical data.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:34 -05:00
Tushar Sugandhi c4e43aa2ee IMA: add policy rule to measure critical data
A new IMA policy rule is needed for the IMA hook
ima_measure_critical_data() and the corresponding func CRITICAL_DATA for
measuring the input buffer.  The policy rule should ensure the buffer
would get measured only when the policy rule allows the action.  The
policy rule should also support the necessary constraints (flags etc.)
for integrity critical buffer data measurements.

Add policy rule support for measuring integrity critical data.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:29 -05:00
Tushar Sugandhi d6e645012d IMA: define a hook to measure kernel integrity critical data
IMA provides capabilities to measure file and buffer data.  However,
various data structures, policies, and states stored in kernel memory
also impact the integrity of the system.  Several kernel subsystems
contain such integrity critical data.  These kernel subsystems help
protect the integrity of the system.  Currently, IMA does not provide a
generic function for measuring kernel integrity critical data.

Define ima_measure_critical_data, a new IMA hook, to measure kernel
integrity critical data.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:26 -05:00
Tushar Sugandhi 291af651b3 IMA: add support to measure buffer data hash
The original IMA buffer data measurement sizes were small (e.g.  boot
command line), but the new buffer data measurement use cases have data
sizes that are a lot larger.  Just as IMA measures the file data hash,
not the file data, IMA should similarly support the option for measuring
buffer data hash.

Introduce a boolean parameter to support measuring buffer data hash,
which would be much smaller, instead of the buffer itself.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:23 -05:00
Tushar Sugandhi 2b4a2474a2 IMA: generalize keyring specific measurement constructs
IMA functions such as ima_match_keyring(), process_buffer_measurement(),
ima_match_policy() etc.  handle data specific to keyrings.  Currently,
these constructs are not generic to handle any func specific data.
This makes it harder to extend them without code duplication.

Refactor the keyring specific measurement constructs to be generic and
reusable in other measurement scenarios.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-14 23:41:13 -05:00
Daniel Colascione 29cd6591ab selinux: teach SELinux about anonymous inodes
This change uses the anon_inodes and LSM infrastructure introduced in
the previous patches to give SELinux the ability to control
anonymous-inode files that are created using the new
anon_inode_getfd_secure() function.

A SELinux policy author detects and controls these anonymous inodes by
adding a name-based type_transition rule that assigns a new security
type to anonymous-inode files created in some domain. The name used
for the name-based transition is the name associated with the
anonymous inode for file listings --- e.g., "[userfaultfd]" or
"[perf_event]".

Example:

type uffd_t;
type_transition sysadm_t sysadm_t : anon_inode uffd_t "[userfaultfd]";
allow sysadm_t uffd_t:anon_inode { create };

(The next patch in this series is necessary for making userfaultfd
support this new interface.  The example above is just
for exposition.)

Signed-off-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-14 17:38:10 -05:00
Lokesh Gidra 215b674b84 security: add inode_init_security_anon() LSM hook
This change adds a new LSM hook, inode_init_security_anon(), that will
be used while creating secure anonymous inodes. The hook allows/denies
its creation and assigns a security context to the inode.

The new hook accepts an optional context_inode parameter that callers
can use to provide additional contextual information to security modules
for granting/denying permission to create an anon-inode of the same type.
This context_inode's security_context can also be used to initialize the
newly created anon-inode's security_context.

Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-14 17:28:24 -05:00
Ondrej Mosnacek 08abe46b2c selinux: fall back to SECURITY_FS_USE_GENFS if no xattr support
When a superblock is assigned the SECURITY_FS_USE_XATTR behavior by the
policy yet it lacks xattr support, try to fall back to genfs rather than
rejecting the mount. If a genfscon rule is found for the filesystem,
then change the behavior to SECURITY_FS_USE_GENFS, otherwise reject the
mount as before. A similar fallback is already done in security_fs_use()
if no behavior specification is found for the given filesystem.

This is needed e.g. for virtiofs, which may or may not support xattrs
depending on the backing host filesystem.

Example:
    # seinfo --genfs | grep ' ramfs'
       genfscon ramfs /  system_u:object_r:ramfs_t:s0
    # echo '(fsuse xattr ramfs (system_u object_r fs_t ((s0) (s0))))' >ramfs_xattr.cil
    # semodule -i ramfs_xattr.cil
    # mount -t ramfs none /mnt

Before:
    mount: /mnt: mount(2) system call failed: Operation not supported.

After:
    (mount succeeds)
    # ls -Zd /mnt
    system_u:object_r:ramfs_t:s0 /mnt

See also:
https://lore.kernel.org/selinux/20210105142148.GA3200@redhat.com/T/
https://github.com/fedora-selinux/selinux-policy/pull/478

Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-13 08:55:11 -05:00
Dinghao Liu ccf11dbaa0 evm: Fix memleak in init_desc
tmp_tfm is allocated, but not freed on subsequent kmalloc failure, which
leads to a memory leak.  Free tmp_tfm.

Fixes: d46eb36995 ("evm: crypto hash replaced by shash")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
[zohar@linux.ibm.com: formatted/reworded patch description]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2021-01-13 07:22:12 -05:00
Ondrej Mosnacek e0de8a9aeb selinux: mark selinux_xfrm_refcount as __read_mostly
This is motivated by a perfomance regression of selinux_xfrm_enabled()
that happened on a RHEL kernel due to false sharing between
selinux_xfrm_refcount and (the late) selinux_ss.policy_rwlock (i.e. the
.bss section memory layout changed such that they happened to share the
same cacheline). Since the policy rwlock's memory region was modified
upon each read-side critical section, the readers of
selinux_xfrm_refcount had frequent cache misses, eventually leading to a
significant performance degradation under a TCP SYN flood on a system
with many cores (32 in this case, but it's detectable on less cores as
well).

While upstream has since switched to RCU locking, so the same can no
longer happen here, selinux_xfrm_refcount could still share a cacheline
with another frequently written region, thus marking it __read_mostly
still makes sense. __read_mostly helps, because it will put the symbol
in a separate section along with other read-mostly variables, so there
should never be a clash with frequently written data.

Since selinux_xfrm_refcount is modified only in case of an explicit
action, it should be safe to do this (i.e. it shouldn't disrupt other
read-mostly variables too much).

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-12 10:12:58 -05:00
Ondrej Mosnacek cd2bb4cb09 selinux: mark some global variables __ro_after_init
All of these are never modified outside initcalls, so they can be
__ro_after_init.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-12 10:08:55 -05:00
Ondrej Mosnacek db478cd60d selinux: make selinuxfs_mount static
It is not referenced outside selinuxfs.c, so remove its extern header
declaration and make it static.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-12 10:02:26 -05:00
Ondrej Mosnacek 3c797e514b selinux: drop the unnecessary aurule_callback variable
Its value is actually not changed anywhere, so it can be substituted for
a direct call to audit_update_lsm_rules().

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-12 09:53:57 -05:00
Ondrej Mosnacek 46434ba040 selinux: remove unused global variables
All of sel_ib_pkey_list, sel_netif_list, sel_netnode_list, and
sel_netport_list are declared but never used. Remove them.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-12 09:49:01 -05:00
Amir Goldstein a9ffe682c5 selinux: fix inconsistency between inode_getxattr and inode_listsecurity
When inode has no listxattr op of its own (e.g. squashfs) vfs_listxattr
calls the LSM inode_listsecurity hooks to list the xattrs that LSMs will
intercept in inode_getxattr hooks.

When selinux LSM is installed but not initialized, it will list the
security.selinux xattr in inode_listsecurity, but will not intercept it
in inode_getxattr.  This results in -ENODATA for a getxattr call for an
xattr returned by listxattr.

This situation was manifested as overlayfs failure to copy up lower
files from squashfs when selinux is built-in but not initialized,
because ovl_copy_xattr() iterates the lower inode xattrs by
vfs_listxattr() and vfs_getxattr().

Match the logic of inode_listsecurity to that of inode_getxattr and
do not list the security.selinux xattr if selinux is not initialized.

Reported-by: Michael Labriola <michael.d.labriola@gmail.com>
Tested-by: Michael Labriola <michael.d.labriola@gmail.com>
Link: https://lore.kernel.org/linux-unionfs/2nv9d47zt7.fsf@aldarion.sourceruckus.org/
Fixes: c8e222616c ("selinux: allow reading labels before policy is loaded")
Cc: stable@vger.kernel.org#v5.9+
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-04 20:41:09 -05:00
Paolo Abeni 95ca90726e selinux: handle MPTCP consistently with TCP
The MPTCP protocol uses a specific protocol value, even if
it's an extension to TCP. Additionally, MPTCP sockets
could 'fall-back' to TCP at run-time, depending on peer MPTCP
support and available resources.

As a consequence of the specific protocol number, selinux
applies the raw_socket class to MPTCP sockets.

Existing TCP application converted to MPTCP - or forced to
use MPTCP socket with user-space hacks - will need an
updated policy to run successfully.

This change lets selinux attach the TCP socket class to
MPTCP sockets, too, so that no policy changes are needed in
the above scenario.

Note that the MPTCP is setting, propagating and updating the
security context on all the subflows and related request
socket.

Link: https://lore.kernel.org/linux-security-module/CAHC9VhTaK3xx0hEGByD2zxfF7fadyPP1kb-WeWH_YCyq9X-sRg@mail.gmail.com/T/#t
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
[PM: tweaked subject's prefix]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-01-04 19:43:59 -05:00
Eric W. Biederman 95ebabde38 capabilities: Don't allow writing ambiguous v3 file capabilities
The v3 file capabilities have a uid field that records the filesystem
uid of the root user of the user namespace the file capabilities are
valid in.

When someone is silly enough to have the same underlying uid as the
root uid of multiple nested containers a v3 filesystem capability can
be ambiguous.

In the spirit of don't do that then, forbid writing a v3 filesystem
capability if it is ambiguous.

Fixes: 8db6c34f1d ("Introduce v3 namespaced file capabilities")
Reviewed-by: Andrew G. Morgan <morgan@kernel.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2020-12-29 09:32:35 -06:00
Linus Torvalds 2f2fce3d53 Provide a fix for the incorrect handling of privilege
in the face of io_uring's use of kernel threads.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl/ig/gXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBGqvRAArz1+Ki7ipJVOusXAMtBLimUc
 4rQnk6vCLZllojY5K0swfN7Y8zDMwrOiGLO9zXFeWFT3lc9vf0HVDDAerIZ+go8n
 g6gII+OWPM2E0ywBgzXgzt5eSOkJ7npcn5zYAHz5sZGlYjWfVu7IAtR6GAcN0aeC
 C75NTJSTbx2uyxUL/5rRJBAiNKHRbmzsNX103SKqAEuhEmHwb1yXE2i8xrxKe/un
 vQv7CJQYZRf6tFEpC8zYPDZb1/EXifxQkzxGhmJCJcl+ZTWW8RplNk8ij/BIyb/V
 Bupj+cF/waq6lXKGQB/lxrUiOYrUzk09RBaawH5ZTVnQ+kBgXPJjc3gjrOdjuVji
 DVl4oImgOGrTpq36PdldWIlB1tY1hrGgP4qjaRn0/6zJocLmuu3fVlVbHb/xMgMT
 Nq5/Ny1ymfH3w/hsg9Usze8nAL8Sj3pt6j0hdmgqS31Fth7aR16eLxZY5rK0jDWu
 d+eEobZx/ww0jJB8s+cVMGxhFNlq1Qmu0+nCPGHKtfnBwAWdMfu2M25lg3s2CBb+
 8+1edxG9Vfcq7laNUUrV9z8RAJpog+22DjEZpq2pCKIHgGqEYsvrxFcdLX/ymlTi
 RDtIi0+SF2Vt8B6M+iZ8gbTAGzcGZi1uHXZCha95A1nwLq6X7ZwzIiIzL0CicvZJ
 QPG3O4zvys3LthxZFoA=
 =zXSS
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.11-io_uring-fix' of git://github.com/cschaufler/smack-next

Pull smack fix from Casey Schaufler:
 "Provide a fix for the incorrect handling of privilege in the face of
  io_uring's use of kernel threads. That invalidated an long standing
  assumption regarding the privilege of kernel threads.

  The fix is simple and safe. It was provided by Jens Axboe and has been
  tested"

* tag 'Smack-for-5.11-io_uring-fix' of git://github.com/cschaufler/smack-next:
  Smack: Handle io_uring kernel thread privileges
2020-12-24 14:08:43 -08:00
Linus Torvalds 4a1106afee EFI updates collected by Ard Biesheuvel:
- Don't move BSS section around pointlessly in the x86 decompressor
  - Refactor helper for discovering the EFI secure boot mode
  - Wire up EFI secure boot to IMA for arm64
  - Some fixes for the capsule loader
  - Expose the RT_PROP table via the EFI test module
  - Relax DT and kernel placement restrictions on ARM
 
 + followup fixes:
 
  - fix the build breakage on IA64 caused by recent capsule loader changes
  - suppress a type mismatch build warning in the expansion of
        EFI_PHYS_ALIGN on ARM
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl/kWCMACgkQEsHwGGHe
 VUqVlxAAg3jSS5w5fuaXON2xYZmgKdlRB0fjbklo1ZrWS6sEHrP+gmVmrJSWGZP+
 qFleQ6AxaYK57UiBXxS6Xfn7hHRToqdOAGnaSYzIg1aQIofRoLxvm3YHBMKllb+g
 x73IBS/Hu9/kiH8EVDrJSkBpVdbPwDnw+FeW4ZWUMF9GVmV8oA6Zx23BVSVsbFda
 jat/cEsJQS3GfECJ/Fg5ae+c/2zn5NgbaVtLxVnMnJfAwEpoPz3ogKoANSskdZg3
 z6pA1aMFoHr+lnlzcsM5zdboQlwZRKPHvFpsXPexESBy5dPkYhxFnHqgK4hSZglC
 c3QoO9Gn+KOJl4KAKJWNzCrd3G9kKY5RXkoei4bH9wGMjW2c68WrbFyXgNsO3vYR
 v5CKpq3+jlwGo03GiLJgWQFdgqX0EgTVHHPTcwFpt8qAMi9/JIPSIeTE41p2+AjZ
 cW5F0IlikaR+N8vxc2TDvQTuSsroMiLcocvRWR61oV/48pFlEjqiUjV31myDsASg
 gGkOxZOOz2iBJfK8lCrKp5p9JwGp0M0/GSHTxlYQFy+p4SrcOiPX4wYYdLsWxioK
 AbVhvOClgB3kN7y7TpLvdjND00ciy4nKEC0QZ5p5G59jSLnpSBM/g6av24LsSQwo
 S1HJKhQPbzcI1lhaPjo91HQoOOMZHWLes0SqK4FGNIH+0imHliA=
 =n7Gc
 -----END PGP SIGNATURE-----

Merge tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI updates from Borislav Petkov:
 "These got delayed due to a last minute ia64 build issue which got
  fixed in the meantime.

  EFI updates collected by Ard Biesheuvel:

   - Don't move BSS section around pointlessly in the x86 decompressor

   - Refactor helper for discovering the EFI secure boot mode

   - Wire up EFI secure boot to IMA for arm64

   - Some fixes for the capsule loader

   - Expose the RT_PROP table via the EFI test module

   - Relax DT and kernel placement restrictions on ARM

  with a few followup fixes:

   - fix the build breakage on IA64 caused by recent capsule loader
     changes

   - suppress a type mismatch build warning in the expansion of
     EFI_PHYS_ALIGN on ARM"

* tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: arm: force use of unsigned type for EFI_PHYS_ALIGN
  efi: ia64: disable the capsule loader
  efi: stub: get rid of efi_get_max_fdt_addr()
  efi/efi_test: read RuntimeServicesSupported
  efi: arm: reduce minimum alignment of uncompressed kernel
  efi: capsule: clean scatter-gather entries from the D-cache
  efi: capsule: use atomic kmap for transient sglist mappings
  efi: x86/xen: switch to efi_get_secureboot_mode helper
  arm64/ima: add ima_arch support
  ima: generalize x86/EFI arch glue for other EFI architectures
  efi: generalize efi_get_secureboot
  efi/libstub: EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER should not default to yes
  efi/x86: Only copy the compressed kernel image in efi_relocate_kernel()
  efi/libstub/x86: simplify efi_is_native()
2020-12-24 12:40:07 -08:00
Casey Schaufler 942cb357ae Smack: Handle io_uring kernel thread privileges
Smack assumes that kernel threads are privileged for smackfs
operations. This was necessary because the credential of the
kernel thread was not related to a user operation. With io_uring
the credential does reflect a user's rights and can be used.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-12-22 15:34:24 -08:00
Linus Torvalds 92dbc9dedc overlayfs update for 5.11
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCX9te7AAKCRDh3BK/laaZ
 PGu/AP4i7Em2byhNCl/A/cSmx5bKWqwOWwgvT8HGOXd+H/vP5wD/Yqcl6mRxVqlk
 J19tOpIagJoMVr62yNgD2esJyMtzKgo=
 =Od8+
 -----END PGP SIGNATURE-----

Merge tag 'ovl-update-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:

 - Allow unprivileged mounting in a user namespace.

   For quite some time the security model of overlayfs has been that
   operations on underlying layers shall be performed with the
   privileges of the mounting task.

   This way an unprvileged user cannot gain privileges by the act of
   mounting an overlayfs instance. A full audit of all function calls
   made by the overlayfs code has been performed to see whether they
   conform to this model, and this branch contains some fixes in this
   regard.

 - Support running on copied filesystem images by optionally disabling
   UUID verification.

 - Bug fixes as well as documentation updates.

* tag 'ovl-update-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: unprivieged mounts
  ovl: do not get metacopy for userxattr
  ovl: do not fail because of O_NOATIME
  ovl: do not fail when setting origin xattr
  ovl: user xattr
  ovl: simplify file splice
  ovl: make ioctl() safe
  ovl: check privs before decoding file handle
  vfs: verify source area in vfs_dedupe_file_range_one()
  vfs: move cap_convert_nscap() call into vfs_setxattr()
  ovl: fix incorrect extent info in metacopy case
  ovl: expand warning in ovl_d_real()
  ovl: document lower modification caveats
  ovl: warn about orphan metacopy
  ovl: doc clarification
  ovl: introduce new "uuid=off" option for inodes index feature
  ovl: propagate ovl_fs to ovl_decode_real_fh and ovl_encode_real_fh
2020-12-17 11:42:48 -08:00
Linus Torvalds 8bda68d68b Smack updates for Linux 5.11. One minor code clean-up and a
set of corrections in function header comments.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl/ZAUgXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFgFhAAgxDoN1EV1LMbt5YHHWVCRkuN
 a9anGIoKQz++B+XMXJ4p5aouJvm5Bg7XATgHagUIK8oT3i4EW/8t1mD2tSw1kvrn
 WARe0T+DrH+4NL1S53hcsfNVYdfUL4oSaQP5S/D7OezAILs8VDz108a97AWmrTnA
 0Ed3q+C8YzqQfPP3KjcNhXYOBpx2EwSwmWNeyAL9QU4lGYUbeO9gN7TVXQnRK4OH
 /cjhPRhe9D7Abr2QtHuVjzjFvpLx6hlkNOCI9o1kI9FM7JlC2m2nAEUerm1/DRxC
 z7vB0CP1nc8Q6MlbeI+fM3CQvWaDXV6/fKcHfUQ5sqIL2utyZzKZwg1xxfDvFYZ8
 WmF3v0XKUwW30rBa7Zbq+IBPSZ7ITWnIrIig2HwhdGxEBa6GHcy6MLMrOVKmQPsx
 2cNiFd8+FizP82tSzHpfb8hS2834Qz1RS/z7rpXFhkq71EqgLTl9Taghtv6gOYiX
 WGRiz4zOCqtOIzdQWfz+npLwfmr7ILfw1B5ftjMJv5U6Fq0JPowGswl999OYMvUr
 Ct/iyAcT64M5TTac8T7qFjzmFpjjGF8Ev3NPPnqYLTBFYA13/3Wrjdj8VDM92RXW
 HaqVv3bMaYwOzREI1OjiDyfJq+IwfUQNLaABSW+3lwuw3BTAabttkheMLnYAAY7O
 KXNhqO97Vp8Q83JBbtk=
 =LBD6
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.11' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "There are no functional changes. Just one minor code clean-up and a
  set of corrections in function header comments"

* tag 'Smack-for-5.11' of git://github.com/cschaufler/smack-next:
  security/smack: remove unused varible 'rc'
  Smack: fix kernel-doc interface on functions
2020-12-16 11:11:58 -08:00
Linus Torvalds e20a9b92dd integrity-v5.11
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEjSMCCC7+cjo3nszSa3kkZrA+cVoFAl/X/p4UHHpvaGFyQGxp
 bnV4LmlibS5jb20ACgkQa3kkZrA+cVq/0w//QatvPcNJ/lrfR28J3Jdf2dPWChEr
 OzeTuCT3/gPKvwtvT5QbaJVEJfSzeZb2SnOIRExOoIMET5kwCSWBAXLgU/eFy5jV
 5fXOh8HjONNTEsP/o8ycHw43fQ+BxM0x6aayatGda2PREArF47YfKQ4drlB2znLG
 veHmwk+42Bj4TgP1C1dv1MQwSUlaC0iu4lYu7c3kM/LftUjvgDv0+dkhhOKfgJMG
 3spUkSb9XfCl3qCqPy6wa1768bsvRwf7mTGRN/dfFtlfIXfkS+E9XwOLgUFIhWBG
 3DZYtZnZ0biEwT0Kc7cA310ZFYxftYs1nQqAG6TsBvtpjqhy0Tpd167WQa8VWprS
 G+AqsroZ5TrGmL0lqCTu6X+0GJ5gLwfWhO/e+T4zpQT+JE69fxn1RtrqgSB+Rk3p
 7qDYAgICmcEtMUiFBXQZ9N5SNg1fv7ACQ6dhEWRDW/tbT6BVnas+f3mtjhp7Asut
 /40jKQxPo+93KxYHa4dqjKfCOGqZqPDxcvsmHtRy0ddrr9R32E23O8EtpWSIsJyC
 lo7XoVAauvAhhewt0lEBgEgAMrEvwoqHDp/sdDJQdvZGIWs/6gc4w/2JLryZOQFn
 wbC2ynRU/I2Htri8nGnJMhZU85lIEj3o6C/+XAWa6wSwFR9KTW8BOvdoTXPMP1uy
 TZcs3WFQetti+VE=
 =+8op
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity subsystem updates from Mimi Zohar:
 "Just three patches here. Other integrity changes are being upstreamed
  via EFI (defines a common EFI secure and trusted boot IMA policy) and
  BPF LSM (exporting the IMA file cache hash info based on inode).

  The three patches included here:

   - bug fix: fail calculating the file hash, when a file not opened for
     read and the attempt to re-open it for read fails.

   - defer processing the "ima_appraise" boot command line option to
     avoid enabling different modes (e.g. fix, log) to when the secure
     boot flag is available on arm.

   - defines "ima-buf" as the default IMA buffer measurement template in
     preparation for the builtin integrity "critical data" policy"

* tag 'integrity-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: Don't modify file descriptor mode on the fly
  ima: select ima-buf template for buffer measurement
  ima: defer arch_ima_get_secureboot() call to IMA init time
2020-12-16 11:06:07 -08:00
Linus Torvalds ca5b877b6c selinux/stable-5.11 PR 20201214
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl/YBtEUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNnwA/9Ek8DG/1t8CEoJxpoRvwovQxNo+bi
 0rCT9vqvx9PeCwoZi/0Vp6oKmpE1HADvbeB/+e00VrbLYnzE3oRY6VkpjoZRofKS
 vc0/MzHSFxFUR1OTHwCefcXlPLK+bfitQbX5jEMeVyQCXNXXIrN7CnJf1LmCeLTR
 kQBPlEN9lt7HyNVAi34FhOD/TQbWnFHgl2z5puffgri6cWnc+TALKMYytUZ+rYex
 NYndDJW5b3g5kTat2eErn0FruxfzloGs0xMIiWb+z2i9kl41D+dkKPdAN7idqCSC
 Jv0nJP/bDftzA0wOe9szmGaLQzu7YnCN5kiWcSspatZVnon42Cy/tp9tiuPGLRFU
 XtelDfpyX6o3CLN0tX7LQEO+GYxPzvM6iaR2OrsChWPozUIIR3TLQg7jJN4bvNKl
 TR6gCGZCoAeS5JLNGjzVKxT/oKQY+tCLLlYXQdQY6swNFi3EKmPr+K1D9lgm98fO
 f3d1QmWiZZNmtxxoVogT0qoQYjkfgpnm3dVx813Vt+lwHlVpHGMEPpO27iD3/RYb
 w2yWOJaGKwMD8iL0l+Cm6CPW0/nE5FFISQjWgC8b4Vgxlyan6+L9eViqGICkrUQ2
 Edo0i1YFFZ4utHYkDf1VYBbJ+36KyCtdktgLAcbgnePiPB3E1XBsXTIIStSUIbVQ
 iEbTkBlsCG4GIeU=
 =6Cqb
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "While we have a small number of SELinux patches for v5.11, there are a
  few changes worth highlighting:

   - Change the LSM network hooks to pass flowi_common structs instead
     of the parent flowi struct as the LSMs do not currently need the
     full flowi struct and they do not have enough information to use it
     safely (missing information on the address family).

     This patch was discussed both with Herbert Xu (representing team
     netdev) and James Morris (representing team
     LSMs-other-than-SELinux).

   - Fix how we handle errors in inode_doinit_with_dentry() so that we
     attempt to properly label the inode on following lookups instead of
     continuing to treat it as unlabeled.

   - Tweak the kernel logic around allowx, auditallowx, and dontauditx
     SELinux policy statements such that the auditx/dontauditx are
     effective even without the allowx statement.

  Everything passes our test suite"

* tag 'selinux-pr-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  lsm,selinux: pass flowi_common instead of flowi to the LSM hooks
  selinux: Fix fall-through warnings for Clang
  selinux: drop super_block backpointer from superblock_security_struct
  selinux: fix inode_doinit_with_dentry() LABEL_INVALID error handling
  selinux: allow dontauditx and auditallowx rules to take effect without allowx
  selinux: fix error initialization in inode_doinit_with_dentry()
2020-12-16 11:01:04 -08:00
Linus Torvalds 3d5de2ddc6 audit/stable-5.11 PR 20201214
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl/YBw4UHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNndg/+JyEYzO+B0y0+0iTeUmBgLB1Hsbvt
 2RlQe8sZo3nBLr96hty4jwRUNdudUSKwKxXjIEr9DplNTpMd3/DzIMb92b00vVIi
 kBMDawsgtrAmWBE99Jo8YtL2vKbr5e5XlCjD1iH4UdfPvHemusMzGSMfzSetAgLU
 JTe0vzgdE46Y4peELTOGeCosO3WC2j4QU6B1QW4rFQEUr9AlN3c2Q40JEPUCKPCU
 3cLRWPQTmr9yiKis1i5HD7mHKqseSgvlxnl1SCboWSEJVbdfg+ceK4ugI7gXbweL
 EXxBDFJxuQEk5ENPu6MUZDgbcy7ROXMpE1TyFx8+SHxQJSmNiylddg/dZMbUk9Cs
 dLNkWMQbol827XdhcbXun5KVRGzh4sTwDL9QnxCfPtxpjGuYdQmXUTFnePgLVBH3
 Ial4mTGOOd37m6a7peAPtnjgR4W1jugoZQMSp//bOKTQvaZlDnWnoPGhgJENDELs
 Ys+tpsam+CjvoPzGfMRF/DQhk4QZtMhlFyd5H+6EeBh8K6WJepXTg+fMpBgXAKat
 Cy1YS5O0vKE+y2J0SKds/Gd7skTREN2QiYdVWo7LX8Vp8hWI9ClZiJHBO3QOQGI3
 2hJBPTzZ4qex6F2kSX6O17MFd/eOBLhTf+V+X5JjlE/YPQyYXxGvlSbCW0tVVyzW
 xFgeevnwl1aOlPU=
 =J+S/
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "A small set of audit patches for v5.11 with four patches in total and
  only one of any real significance.

  Richard's patch to trigger accompanying records causes the kernel to
  emit additional related records when an audit event occurs; helping
  provide some much needed context to events in the audit log. It is
  also worth mentioning that this is a revised patch based on an earlier
  attempt that had to be reverted in the v5.8 time frame.

  Everything passes our test suite, and with no problems reported please
  merge this for v5.11"

* tag 'audit-pr-20201214' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: replace atomic_add_return()
  audit: fix macros warnings
  audit: trigger accompanying records when no rules present
  audit: fix a kernel-doc markup
2020-12-16 10:54:03 -08:00
Andy Shevchenko 9801ca279a apparmor: remove duplicate macro list_entry_is_head()
Strangely I hadn't had noticed the existence of the list_entry_is_head()
in apparmor code when added the same one in the list.h.  Luckily it's
fully identical and didn't break builds.  In any case we don't need a
duplicate anymore, thus remove it from apparmor code.

Link: https://lkml.kernel.org/r/20201208100639.88182-1-andriy.shevchenko@linux.intel.com
Fixes: e130816164 ("include/linux/list.h: add a macro to test if entry is pointing to the head")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E . Hallyn " <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-15 22:46:19 -08:00
Linus Torvalds d635a69dd4 Networking updates for 5.11
Core:
 
  - support "prefer busy polling" NAPI operation mode, where we defer softirq
    for some time expecting applications to periodically busy poll
 
  - AF_XDP: improve efficiency by more batching and hindering
            the adjacency cache prefetcher
 
  - af_packet: make packet_fanout.arr size configurable up to 64K
 
  - tcp: optimize TCP zero copy receive in presence of partial or unaligned
         reads making zero copy a performance win for much smaller messages
 
  - XDP: add bulk APIs for returning / freeing frames
 
  - sched: support fragmenting IP packets as they come out of conntrack
 
  - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
 
 BPF:
 
  - BPF switch from crude rlimit-based to memcg-based memory accounting
 
  - BPF type format information for kernel modules and related tracing
    enhancements
 
  - BPF implement task local storage for BPF LSM
 
  - allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage
 
 Protocols:
 
  - mptcp: improve multiple xmit streams support, memory accounting and
           many smaller improvements
 
  - TLS: support CHACHA20-POLY1305 cipher
 
  - seg6: add support for SRv6 End.DT4/DT6 behavior
 
  - sctp: Implement RFC 6951: UDP Encapsulation of SCTP
 
  - ppp_generic: add ability to bridge channels directly
 
  - bridge: Connectivity Fault Management (CFM) support as is defined in
            IEEE 802.1Q section 12.14.
 
 Drivers:
 
  - mlx5: make use of the new auxiliary bus to organize the driver internals
 
  - mlx5: more accurate port TX timestamping support
 
  - mlxsw:
    - improve the efficiency of offloaded next hop updates by using
      the new nexthop object API
    - support blackhole nexthops
    - support IEEE 802.1ad (Q-in-Q) bridging
 
  - rtw88: major bluetooth co-existance improvements
 
  - iwlwifi: support new 6 GHz frequency band
 
  - ath11k: Fast Initial Link Setup (FILS)
 
  - mt7915: dual band concurrent (DBDC) support
 
  - net: ipa: add basic support for IPA v4.5
 
 Refactor:
 
  - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior
 
  - phy: add support for shared interrupts; get rid of multiple driver
         APIs and have the drivers write a full IRQ handler, slight growth
 	of driver code should be compensated by the simpler API which
 	also allows shared IRQs
 
  - add common code for handling netdev per-cpu counters
 
  - move TX packet re-allocation from Ethernet switch tag drivers to
    a central place
 
  - improve efficiency and rename nla_strlcpy
 
  - number of W=1 warning cleanups as we now catch those in a patchwork
    build bot
 
 Old code removal:
 
  - wan: delete the DLCI / SDLA drivers
 
  - wimax: move to staging
 
  - wifi: remove old WDS wifi bridging support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl/YXmUACgkQMUZtbf5S
 IrvSQBAAgOrt4EFopEvVqlTHZbqI45IEqgtXS+YWmlgnjZCgshyMj8q1yK1zzane
 qYxr/NNJ9kV3FdtaynmmHPgEEEfR5kJ/D3B2BsxYDkaDDrD0vbNsBGw+L+/Gbhxl
 N/5l/9FjLyLY1D+EErknuwR5XGuQ6BSDVaKQMhYOiK2hgdnAAI4hszo8Chf6wdD0
 XDBslQ7vpD/05r+eMj0IkS5dSAoGOIFXUxhJ5dqrDbRHiKsIyWqA3PLbYemfAhxI
 s2XckjfmSgGE3FKL8PSFu+EcfHbJQQjLcULJUnqgVcdwEEtRuE9ggEi52nZRXMWM
 4e8sQJAR9Fx7pZy0G1xfS149j6iPU5LjRlU9TNSpVABz14Vvvo3gEL6gyIdsz+xh
 hMN7UBdp0FEaP028CXoIYpaBesvQqj0BSndmee8qsYAtN6j+QKcM2AOSr7JN1uMH
 C/86EDoGAATiEQIVWJvnX5MPmlAoblyLA+RuVhmxkIBx2InGXkFmWqRkXT5l4jtk
 LVl8/TArR4alSQqLXictXCjYlCm9j5N4zFFtEVasSYi7/ZoPfgRNWT+lJ2R8Y+Zv
 +htzGaFuyj6RJTVeFQMrkl3whAtBamo2a0kwg45NnxmmXcspN6kJX1WOIy82+MhD
 Yht7uplSs7MGKA78q/CDU0XBeGjpABUvmplUQBIfrR/jKLW2730=
 =GXs1
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core:

   - support "prefer busy polling" NAPI operation mode, where we defer
     softirq for some time expecting applications to periodically busy
     poll

   - AF_XDP: improve efficiency by more batching and hindering the
     adjacency cache prefetcher

   - af_packet: make packet_fanout.arr size configurable up to 64K

   - tcp: optimize TCP zero copy receive in presence of partial or
     unaligned reads making zero copy a performance win for much smaller
     messages

   - XDP: add bulk APIs for returning / freeing frames

   - sched: support fragmenting IP packets as they come out of conntrack

   - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs

  BPF:

   - BPF switch from crude rlimit-based to memcg-based memory accounting

   - BPF type format information for kernel modules and related tracing
     enhancements

   - BPF implement task local storage for BPF LSM

   - allow the FENTRY/FEXIT/RAW_TP tracing programs to use
     bpf_sk_storage

  Protocols:

   - mptcp: improve multiple xmit streams support, memory accounting and
     many smaller improvements

   - TLS: support CHACHA20-POLY1305 cipher

   - seg6: add support for SRv6 End.DT4/DT6 behavior

   - sctp: Implement RFC 6951: UDP Encapsulation of SCTP

   - ppp_generic: add ability to bridge channels directly

   - bridge: Connectivity Fault Management (CFM) support as is defined
     in IEEE 802.1Q section 12.14.

  Drivers:

   - mlx5: make use of the new auxiliary bus to organize the driver
     internals

   - mlx5: more accurate port TX timestamping support

   - mlxsw:
      - improve the efficiency of offloaded next hop updates by using
        the new nexthop object API
      - support blackhole nexthops
      - support IEEE 802.1ad (Q-in-Q) bridging

   - rtw88: major bluetooth co-existance improvements

   - iwlwifi: support new 6 GHz frequency band

   - ath11k: Fast Initial Link Setup (FILS)

   - mt7915: dual band concurrent (DBDC) support

   - net: ipa: add basic support for IPA v4.5

  Refactor:

   - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej
     Siewior

   - phy: add support for shared interrupts; get rid of multiple driver
     APIs and have the drivers write a full IRQ handler, slight growth
     of driver code should be compensated by the simpler API which also
     allows shared IRQs

   - add common code for handling netdev per-cpu counters

   - move TX packet re-allocation from Ethernet switch tag drivers to a
     central place

   - improve efficiency and rename nla_strlcpy

   - number of W=1 warning cleanups as we now catch those in a patchwork
     build bot

  Old code removal:

   - wan: delete the DLCI / SDLA drivers

   - wimax: move to staging

   - wifi: remove old WDS wifi bridging support"

* tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits)
  net: hns3: fix expression that is currently always true
  net: fix proc_fs init handling in af_packet and tls
  nfc: pn533: convert comma to semicolon
  af_vsock: Assign the vsock transport considering the vsock address flags
  af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path
  vsock_addr: Check for supported flag values
  vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag
  vm_sockets: Add flags field in the vsock address data structure
  net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled
  tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit
  net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context
  nfc: s3fwrn5: Release the nfc firmware
  net: vxget: clean up sparse warnings
  mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router
  mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3
  mlxsw: spectrum_router_xm: Introduce basic XM cache flushing
  mlxsw: reg: Add Router LPM Cache Enable Register
  mlxsw: reg: Add Router LPM Cache ML Delete Register
  mlxsw: spectrum_router_xm: Implement L-value tracking for M-index
  mlxsw: reg: Add XM Router M Table Register
  ...
2020-12-15 13:22:29 -08:00
Linus Torvalds 9e4b0d55d8 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Add speed testing on 1420-byte blocks for networking

  Algorithms:
   - Improve performance of chacha on ARM for network packets
   - Improve performance of aegis128 on ARM for network packets

  Drivers:
   - Add support for Keem Bay OCS AES/SM4
   - Add support for QAT 4xxx devices
   - Enable crypto-engine retry mechanism in caam
   - Enable support for crypto engine on sdm845 in qce
   - Add HiSilicon PRNG driver support"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (161 commits)
  crypto: qat - add capability detection logic in qat_4xxx
  crypto: qat - add AES-XTS support for QAT GEN4 devices
  crypto: qat - add AES-CTR support for QAT GEN4 devices
  crypto: atmel-i2c - select CONFIG_BITREVERSE
  crypto: hisilicon/trng - replace atomic_add_return()
  crypto: keembay - Add support for Keem Bay OCS AES/SM4
  dt-bindings: Add Keem Bay OCS AES bindings
  crypto: aegis128 - avoid spurious references crypto_aegis128_update_simd
  crypto: seed - remove trailing semicolon in macro definition
  crypto: x86/poly1305 - Use TEST %reg,%reg instead of CMP $0,%reg
  crypto: x86/sha512 - Use TEST %reg,%reg instead of CMP $0,%reg
  crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg
  crypto: cpt - Fix sparse warnings in cptpf
  hwrng: ks-sa - Add dependency on IOMEM and OF
  crypto: lib/blake2s - Move selftest prototype into header file
  crypto: arm/aes-ce - work around Cortex-A57/A72 silion errata
  crypto: ecdh - avoid unaligned accesses in ecdh_set_secret()
  crypto: ccree - rework cache parameters handling
  crypto: cavium - Use dma_set_mask_and_coherent to simplify code
  crypto: marvell/octeontx - Use dma_set_mask_and_coherent to simplify code
  ...
2020-12-14 12:18:19 -08:00
Linus Torvalds da06285598 Limit recursion depth, fix clang warning, fix comment typo, and
silence memory allocation failure warning.
 
 tomoyo: fix clang pointer arithmetic warning
 tomoyo: Limit wildcard recursion depth.
 tomoyo: Fix null pointer check
 tomoyo: Fix typo in comments.
 
  security/tomoyo/audit.c         |    2 -
  security/tomoyo/common.c        |    8 ++---
  security/tomoyo/condition.c     |    2 -
  security/tomoyo/domain.c        |    6 +---
  security/tomoyo/gc.c            |    2 -
  security/tomoyo/memory.c        |    4 +-
  security/tomoyo/securityfs_if.c |    6 ++--
  security/tomoyo/util.c          |   55 +++++++++++++++++++++-------------------
  8 files changed, 44 insertions(+), 41 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJf1uc6AAoJEEJfEo0MZPUqRL8P/0a6n5oi4Ka2pmfh9u4AnuKo
 LY+LRxOv4g0oCMF/teVdeWNgM2B8sKE3F3sLlkLfcUu313mcsc1spg9LiZ87izkO
 OiCzFRI6Psn+o86s2mPcRExFDgg4DKdi91e3NlQQNDGkoHRoaPMr/viajzVCcKx1
 toNtxEFczLYlATslRDK7CeXKhzOGvWM3HyuujI/nSPMdJIk0pY9UIS57NCtwUH2v
 HAWBgl6hU1Txa9rcl2F3jjDYE/v4JjOiZS+1IPm2nAYSAC8dMmmXthEZttlRIdqr
 4zkoF+YzqPTnvjZvi5Owz8AmhmlW0FPtxAmack++yFaNXP3b1tCUrlFS6U5GPetE
 5ZT0dwHJE9NGX07pVmNYb+APf0VLcNUdHtp+ja7QeyfmRdU8mV4RepTC7vdojcxS
 RgLz3aIhJL3EudxkHAes/Oc/rpdmJazSrhkzeIIShX1FC/iR6qTbU2tXMcBvwz8e
 hz5Jltr5RAgT3vrdApcvC/e/X9lZ8g+qdwgCYb2Wy/paRejjfxlGvYtB3ihVsDyl
 pPeIJTkQoMp4YycPYaQ1IwAEHCke6rKzbtxfOFR3otNRP0wHNn7H0bvzyAvv+ymB
 OTy/62p+DYpE9xDnKagjJGRsMp0zNum5UsMkg3fYixdK7PnQ/yNVcd7y12VldKhI
 mfkNbIsmFrrWsBJRJZNX
 =xb99
 -----END PGP SIGNATURE-----

Merge tag 'tomoyo-pr-20201214' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1

Pull tomoyo updates from Tetsuo Handa:
 "Limit recursion depth, fix clang warning, fix comment typo, and
  silence memory allocation failure warning"

* tag 'tomoyo-pr-20201214' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
  tomoyo: Fix typo in comments.
  tomoyo: Fix null pointer check
  tomoyo: Limit wildcard recursion depth.
  tomoyo: fix clang pointer arithmetic warning
  tomoyo: Loosen pathname/domainname validation.
2020-12-14 12:05:10 -08:00
Miklos Szeredi 7c03e2cda4 vfs: move cap_convert_nscap() call into vfs_setxattr()
cap_convert_nscap() does permission checking as well as conversion of the
xattr value conditionally based on fs's user-ns.

This is needed by overlayfs and probably other layered fs (ecryptfs) and is
what vfs_foo() is supposed to do anyway.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
2020-12-14 15:26:13 +01:00
Jakub Kicinski e2437ac2f5 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2020-12-12

Just one patch this time:

1) Redact the SA keys with kernel lockdown confidentiality.
   If enabled, no secret keys are sent to uuserspace.
   From Antony Antony.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
  xfrm: redact SA secret with lockdown confidentiality
====================

Link: https://lore.kernel.org/r/20201212085737.2101294-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-12 12:28:42 -08:00
Tetsuo Handa 15269fb193 tomoyo: Fix typo in comments.
Spotted by developers and codespell program.

Co-developed-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Co-developed-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2020-12-06 13:44:57 +09:00
Jakub Kicinski a1dd1d8697 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-12-03

The main changes are:

1) Support BTF in kernel modules, from Andrii.

2) Introduce preferred busy-polling, from Björn.

3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh.

4) Memcg-based memory accounting for bpf objects, from Roman.

5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits)
  selftests/bpf: Fix invalid use of strncat in test_sockmap
  libbpf: Use memcpy instead of strncpy to please GCC
  selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module
  selftests/bpf: Add tp_btf CO-RE reloc test for modules
  libbpf: Support attachment of BPF tracing programs to kernel modules
  libbpf: Factor out low-level BPF program loading helper
  bpf: Allow to specify kernel module BTFs when attaching BPF programs
  bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier
  selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF
  selftests/bpf: Add support for marking sub-tests as skipped
  selftests/bpf: Add bpf_testmod kernel module for testing
  libbpf: Add kernel module BTF support for CO-RE relocations
  libbpf: Refactor CO-RE relocs to not assume a single BTF object
  libbpf: Add internal helper to load BTF data by FD
  bpf: Keep module's btf_data_size intact after load
  bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address()
  selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP
  bpf: Adds support for setting window clamp
  samples/bpf: Fix spelling mistake "recieving" -> "receiving"
  bpf: Fix cold build of test_progs-no_alu32
  ...
====================

Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04 07:48:12 -08:00
Florian Westphal 41dd9596d6 security: add const qualifier to struct sock in various places
A followup change to tcp_request_sock_op would have to drop the 'const'
qualifier from the 'route_req' function as the
'security_inet_conn_request' call is moved there - and that function
expects a 'struct sock *'.

However, it turns out its also possible to add a const qualifier to
security_inet_conn_request instead.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-03 12:56:03 -08:00
Roberto Sassu 207cdd565d ima: Don't modify file descriptor mode on the fly
Commit a408e4a86b ("ima: open a new file instance if no read
permissions") already introduced a second open to measure a file when the
original file descriptor does not allow it. However, it didn't remove the
existing method of changing the mode of the original file descriptor, which
is still necessary if the current process does not have enough privileges
to open a new one.

Changing the mode isn't really an option, as the filesystem might need to
do preliminary steps to make the read possible. Thus, this patch removes
the code and keeps the second open as the only option to measure a file
when it is unreadable with the original file descriptor.

Cc: <stable@vger.kernel.org> # 4.20.x: 0014cc04e8 ima: Set file->f_mode
Fixes: 2fe5d6def1 ("ima: integrity appraisal extension")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-11-29 07:02:53 -05:00
Zheng Zengkai 1b6b924efe tomoyo: Fix null pointer check
Since tomoyo_memory_ok() will check for null pointer returned by
kzalloc() in tomoyo_assign_profile(), tomoyo_assign_namespace(),
tomoyo_get_name() and tomoyo_commit_ok(), then emit OOM warnings
if needed. And this is the expected behavior as informed by
Tetsuo Handa.

Let's add __GFP_NOWARN to kzalloc() in those related functions.
Besides, to achieve this goal, remove the null check for entry
right after kzalloc() in tomoyo_assign_namespace().

Reported-by: Hulk Robot <hulkci@huawei.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2020-11-27 19:36:11 +09:00
Antony Antony c7a5899eb2 xfrm: redact SA secret with lockdown confidentiality
redact XFRM SA secret in the netlink response to xfrm_get_sa()
or dumpall sa.
Enable lockdown, confidentiality mode, at boot or at run time.

e.g. when enabled:
cat /sys/kernel/security/lockdown
none integrity [confidentiality]

ip xfrm state
src 172.16.1.200 dst 172.16.1.100
	proto esp spi 0x00000002 reqid 2 mode tunnel
	replay-window 0
	aead rfc4106(gcm(aes)) 0x0000000000000000000000000000000000000000 96

note: the aead secret is redacted.
Redacting secret is also a FIPS 140-2 requirement.

v1->v2
 - add size checks before memset calls
v2->v3
 - replace spaces with tabs for consistency
v3->v4
 - use kernel lockdown instead of a /proc setting
v4->v5
 - remove kconfig option

Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-11-27 11:03:06 +01:00
KP Singh 403319be5d ima: Implement ima_inode_hash
This is in preparation to add a helper for BPF LSM programs to use
IMA hashes when attached to LSM hooks. There are LSM hooks like
inode_unlink which do not have a struct file * argument and cannot
use the existing ima_file_hash API.

An inode based API is, therefore, useful in LSM based detections like an
executable trying to delete itself which rely on the inode_unlink LSM
hook.

Moreover, the ima_file_hash function does nothing with the struct file
pointer apart from calling file_inode on it and converting it to an
inode.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20201124151210.1081188-2-kpsingh@chromium.org
2020-11-26 00:04:04 +01:00
Paul Moore 3df98d7921 lsm,selinux: pass flowi_common instead of flowi to the LSM hooks
As pointed out by Herbert in a recent related patch, the LSM hooks do
not have the necessary address family information to use the flowi
struct safely.  As none of the LSMs currently use any of the protocol
specific flowi information, replace the flowi pointers with pointers
to the address family independent flowi_common struct.

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-11-23 18:36:21 -05:00
Gustavo A. R. Silva b2d99bcb27 selinux: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of letting the code fall
through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-11-23 18:21:13 -05:00
David Howells 8eb621698f keys: Provide the original description to the key preparser
Provide the proposed description (add key) or the original description
(update/instantiate key) when preparsing a key so that the key type can
validate it against the data.

This is important for rxrpc server keys as we need to check that they have
the right amount of key material present - and it's better to do that when
the key is loaded rather than deep in trying to process a response packet.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: keyrings@vger.kernel.org
2020-11-23 18:09:29 +00:00
Lakshmi Ramasubramanian dea87d0889 ima: select ima-buf template for buffer measurement
The default IMA template used for all policy rules is the value set
for CONFIG_IMA_DEFAULT_TEMPLATE if the policy rule does not specify
a template. The default IMA template for buffer measurements should be
'ima-buf' - so that the measured buffer is correctly included in the IMA
measurement log entry.

With the default template format, buffer measurements are added to
the measurement list, but do not include the buffer data, making it
difficult, if not impossible, to validate. Including 'ima-buf'
template records in the measurement list by default, should not impact
existing attestation servers without 'ima-buf' template support.

Initialize a global 'ima-buf' template and select that template,
by default, for buffer measurements.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-11-20 13:52:43 -05:00
Eric Biggers a24d22b225 crypto: sha - split sha.h into sha1.h and sha2.h
Currently <crypto/sha.h> contains declarations for both SHA-1 and SHA-2,
and <crypto/sha3.h> contains declarations for SHA-3.

This organization is inconsistent, but more importantly SHA-1 is no
longer considered to be cryptographically secure.  So to the extent
possible, SHA-1 shouldn't be grouped together with any of the other SHA
versions, and usage of it should be phased out.

Therefore, split <crypto/sha.h> into two headers <crypto/sha1.h> and
<crypto/sha2.h>, and make everyone explicitly specify whether they want
the declarations for SHA-1, SHA-2, or both.

This avoids making the SHA-1 declarations visible to files that don't
want anything to do with SHA-1.  It also prepares for potentially moving
sha1.h into a new insecure/ or dangerous/ directory.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-11-20 14:45:33 +11:00
Jakub Kicinski 56495a2442 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 19:08:46 -08:00
Alex Shi 9b0072e2b2 security/smack: remove unused varible 'rc'
This varible isn't used and can be removed to avoid a gcc warning:
security/smack/smack_lsm.c:3873:6: warning: variable ‘rc’ set but not
used [-Wunused-but-set-variable]

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-11-16 17:26:31 -08:00
Linus Torvalds 30636a59f4 selinux/stable-5.10 PR 20201113
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl+vD0MUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOHZw//SbZ4c32LGxANDJrASdRWGRYzUsSm
 cQZrLR373TMMmawZDenvRK0Wfa8vj+BK5V5knBXXTDFq5M+TVtlzhRhED4ZH40RP
 CmaL+XC24ed1IPsYiZZ0lytXrDnb7EgmGaSjHP7RttBdLG/2ABamm9FNZFpiFJjV
 y9txgNyJ9NvrdFCO+UoLEm9BFq4x/Ddzu+BAEqu1iSGaKXufiBlq9uQ+3qommJlx
 Bv+i6HeqCruTj+y8p8tkR7aLma9jsbqCpBlQkG1ja4pgxMXwC2uRbNzz//dlxWZ5
 pKtz8HJqi+myla/rwNZAA82uZlkFnEOp7Yy404Deu73WznMdrQ1oJqlGDrXdafKn
 BJwpcUQnJ7NP2MK1JrMbxdVllUKsZF2ICP0t3CFTuuyJ2CakOuaSW+ERBvbO6NYa
 G7h9Ll8AO4ADbG3W+/c6nybDt8AkjBLJTg7uXWO3LIjjZ/UJwJxO4YwzYbaMdg6Q
 7zVNj7j36YLHhv9Hc60vhCjO53KXYo60Z6TCblkeyClifvM8Gx109JkOwPXWFGIY
 Bh7KbKtGr2dZoFxU2yS5oQgtbzJhsfJQGYsF8uANHcj76hQ/rh+Ff1NKKofHaRSl
 umQSgNbekaQjnmq08otVlIHtjdaG+or1PFW2Yep37OwsCXwKrFG4YZRY9FdpwQeG
 Cjx3JAJTibvjMqg=
 =qpR+
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
 "One small SELinux patch to make sure we return an error code when an
  allocation fails. It passes all of our tests, but given the nature of
  the patch that isn't surprising"

* tag 'selinux-pr-20201113' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: Fix error return code in sel_ib_pkey_sid_slow()
2020-11-14 12:04:02 -08:00
Jakub Kicinski 07cbce2e46 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-11-14

1) Add BTF generation for kernel modules and extend BTF infra in kernel
   e.g. support for split BTF loading and validation, from Andrii Nakryiko.

2) Support for pointers beyond pkt_end to recognize LLVM generated patterns
   on inlined branch conditions, from Alexei Starovoitov.

3) Implements bpf_local_storage for task_struct for BPF LSM, from KP Singh.

4) Enable FENTRY/FEXIT/RAW_TP tracing program to use the bpf_sk_storage
   infra, from Martin KaFai Lau.

5) Add XDP bulk APIs that introduce a defer/flush mechanism to optimize the
   XDP_REDIRECT path, from Lorenzo Bianconi.

6) Fix a potential (although rather theoretical) deadlock of hashtab in NMI
   context, from Song Liu.

7) Fixes for cross and out-of-tree build of bpftool and runqslower allowing build
   for different target archs on same source tree, from Jean-Philippe Brucker.

8) Fix error path in htab_map_alloc() triggered from syzbot, from Eric Dumazet.

9) Move functionality from test_tcpbpf_user into the test_progs framework so it
   can run in BPF CI, from Alexander Duyck.

10) Lift hashtab key_size limit to be larger than MAX_BPF_STACK, from Florian Lehner.

Note that for the fix from Song we have seen a sparse report on context
imbalance which requires changes in sparse itself for proper annotation
detection where this is currently being discussed on linux-sparse among
developers [0]. Once we have more clarification/guidance after their fix,
Song will follow-up.

  [0] https://lore.kernel.org/linux-sparse/CAHk-=wh4bx8A8dHnX612MsDO13st6uzAz1mJ1PaHHVevJx_ZCw@mail.gmail.com/T/
      https://lore.kernel.org/linux-sparse/20201109221345.uklbp3lzgq6g42zb@ltop.local/T/

* git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (66 commits)
  net: mlx5: Add xdp tx return bulking support
  net: mvpp2: Add xdp tx return bulking support
  net: mvneta: Add xdp tx return bulking support
  net: page_pool: Add bulk support for ptr_ring
  net: xdp: Introduce bulking for xdp tx return path
  bpf: Expose bpf_d_path helper to sleepable LSM hooks
  bpf: Augment the set of sleepable LSM hooks
  bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Rename some functions in bpf_sk_storage
  bpf: Folding omem_charge() into sk_storage_charge()
  selftests/bpf: Add asm tests for pkt vs pkt_end comparison.
  selftests/bpf: Add skb_pkt_end test
  bpf: Support for pointers beyond pkt_end.
  tools/bpf: Always run the *-clean recipes
  tools/bpf: Add bootstrap/ to .gitignore
  bpf: Fix NULL dereference in bpf_task_storage
  tools/bpftool: Fix build slowdown
  tools/runqslower: Build bpftool using HOSTCC
  tools/runqslower: Enable out-of-tree build
  ...
====================

Link: https://lore.kernel.org/r/20201114020819.29584-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-14 09:13:41 -08:00
Alex Shi 7da31b858e Smack: fix kernel-doc interface on functions
The are some kernel-doc interface issues:
security/smack/smackfs.c:1950: warning: Function parameter or member
'list' not described in 'smk_parse_label_list'
security/smack/smackfs.c:1950: warning: Excess function parameter
'private' description in 'smk_parse_label_list'
security/smack/smackfs.c:1979: warning: Function parameter or member
'list' not described in 'smk_destroy_label_list'
security/smack/smackfs.c:1979: warning: Excess function parameter 'head'
description in 'smk_destroy_label_list'
security/smack/smackfs.c:2141: warning: Function parameter or member
'count' not described in 'smk_read_logging'
security/smack/smackfs.c:2141: warning: Excess function parameter 'cn'
description in 'smk_read_logging'
security/smack/smackfs.c:2278: warning: Function parameter or member
'format' not described in 'smk_user_access'

Correct them in this patch.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-11-13 11:50:44 -08:00
Chen Zhou c350f8bea2 selinux: Fix error return code in sel_ib_pkey_sid_slow()
Fix to return a negative error code from the error handling case
instead of 0 in function sel_ib_pkey_sid_slow(), as done elsewhere
in this function.

Cc: stable@vger.kernel.org
Fixes: 409dcf3153 ("selinux: Add a cache for quicker retreival of PKey SIDs")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-11-12 20:16:09 -05:00
Ondrej Mosnacek b159e86b5a selinux: drop super_block backpointer from superblock_security_struct
It appears to have been needed for selinux_complete_init() in the past,
but today it's useless.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-11-12 19:52:21 -05:00
KP Singh 4cf1bc1f10 bpf: Implement task local storage
Similar to bpf_local_storage for sockets and inodes add local storage
for task_struct.

The life-cycle of storage is managed with the life-cycle of the
task_struct.  i.e. the storage is destroyed along with the owning task
with a callback to the bpf_task_storage_free from the task_free LSM
hook.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in
the security blob which are now stackable and can co-exist with other
LSMs.

The userspace map operations can be done by using a pid fd as a key
passed to the lookup, update and delete operations.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-3-kpsingh@chromium.org
2020-11-06 08:08:37 -08:00
Chester Lin 25519d6834 ima: generalize x86/EFI arch glue for other EFI architectures
Move the x86 IMA arch code into security/integrity/ima/ima_efi.c,
so that we will be able to wire it up for arm64 in a future patch.

Co-developed-by: Chester Lin <clin@suse.com>
Signed-off-by: Chester Lin <clin@suse.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-11-06 07:40:42 +01:00
Paul Moore 200ea5a229 selinux: fix inode_doinit_with_dentry() LABEL_INVALID error handling
A previous fix, commit 83370b31a9 ("selinux: fix error initialization
in inode_doinit_with_dentry()"), changed how failures were handled
before a SELinux policy was loaded.  Unfortunately that patch was
potentially problematic for two reasons: it set the isec->initialized
state without holding a lock, and it didn't set the inode's SELinux
label to the "default" for the particular filesystem.  The later can
be a problem if/when a later attempt to revalidate the inode fails
and SELinux reverts to the existing inode label.

This patch should restore the default inode labeling that existed
before the original fix, without affecting the LABEL_INVALID marking
such that revalidation will still be attempted in the future.

Fixes: 83370b31a9 ("selinux: fix error initialization in inode_doinit_with_dentry()")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-11-05 22:47:31 -05:00
Tetsuo Handa e991a40b3d tomoyo: Limit wildcard recursion depth.
Since wildcards that need recursion consume kernel stack memory (or might
cause CPU stall warning problem), we cannot allow infinite recursion.

Since TOMOYO 1.8 survived with 20 recursions limit for 5 years, nobody
would complain if applying this limit to TOMOYO 2.6.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2020-11-03 13:50:02 +09:00
Ard Biesheuvel b000d5cb95 ima: defer arch_ima_get_secureboot() call to IMA init time
Chester reports that it is necessary to introduce a new way to pass
the EFI secure boot status between the EFI stub and the core kernel
on ARM systems. The usual way of obtaining this information is by
checking the SecureBoot and SetupMode EFI variables, but this can
only be done after the EFI variable workqueue is created, which
occurs in a subsys_initcall(), whereas arch_ima_get_secureboot()
is called much earlier by the IMA framework.

However, the IMA framework itself is started as a late_initcall,
and the only reason the call to arch_ima_get_secureboot() occurs
so early is because it happens in the context of a __setup()
callback that parses the ima_appraise= command line parameter.

So let's refactor this code a little bit, by using a core_param()
callback to capture the command line argument, and deferring any
reasoning based on its contents to the IMA init routine.

Cc: Chester Lin <clin@suse.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Link: https://lore.kernel.org/linux-arm-kernel/20200904072905.25332-2-clin@suse.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: kernel test robot <lkp@intel.com> [missing core_param()]
[zohar@linux.ibm.com: included linux/module.h]
Tested-by: Chester Lin <clin@suse.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-11-02 14:19:01 -05:00
Gustavo A. R. Silva 4739eeafb9 ima: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-10-29 17:22:59 -05:00
Arnd Bergmann d9594e0409 tomoyo: fix clang pointer arithmetic warning
clang warns about additions on NULL pointers being undefined in C:

security/tomoyo/securityfs_if.c:226:59: warning: arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension [-Wnull-pointer-arithmetic]
        securityfs_create_file(name, mode, parent, ((u8 *) NULL) + key,

Change the code to instead use a cast through uintptr_t to avoid
the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2020-10-28 23:21:43 +09:00
bauen1 44141f58e1 selinux: allow dontauditx and auditallowx rules to take effect without allowx
This allows for dontauditing very specific ioctls e.g. TCGETS without
dontauditing every ioctl or granting additional permissions.

Now either an allowx, dontauditx or auditallowx rules enables checking
for extended permissions.

Signed-off-by: Jonathan Hettwer <j2468h@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-10-27 22:21:11 -04:00
Tianyue Ren 83370b31a9 selinux: fix error initialization in inode_doinit_with_dentry()
Mark the inode security label as invalid if we cannot find
a dentry so that we will retry later rather than marking it
initialized with the unlabeled SID.

Fixes: 9287aed2ad ("selinux: Convert isec->lock into a spinlock")
Signed-off-by: Tianyue Ren <rentianyue@kylinos.cn>
[PM: minor comment tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-10-27 22:14:25 -04:00
Richard Guy Briggs 6d915476e6 audit: trigger accompanying records when no rules present
When there are no audit rules registered, mandatory records (config,
etc.) are missing their accompanying records (syscall, proctitle, etc.).

This is due to audit context dummy set on syscall entry based on absence
of rules that signals that no other records are to be printed.  Clear the dummy
bit if any record is generated, open coding this in audit_log_start().

The proctitle context and dummy checks are pointless since the
proctitle record will not be printed if no syscall records are printed.

The fds array is reset to -1 after the first syscall to indicate it
isn't valid any more, but was never set to -1 when the context was
allocated to indicate it wasn't yet valid.

Check ctx->pwd in audit_log_name().

The audit_inode* functions can be called without going through
getname_flags() or getname_kernel() that sets audit_names and cwd, so
set the cwd in audit_alloc_name() if it has not already been done so due to
audit_names being valid and purge all other audit_getcwd() calls.

Revert the LSM dump_common_audit_data() LSM_AUDIT_DATA_* cases from the
ghak96 patch since they are no longer necessary due to cwd coverage in
audit_alloc_name().

Thanks to bauen1 <j2468h@googlemail.com> for reporting LSM situations in
which context->cwd is not valid, inadvertantly fixed by the ghak96 patch.

Please see upstream github issue
https://github.com/linux-audit/audit-kernel/issues/120
This is also related to upstream github issue
https://github.com/linux-audit/audit-kernel/issues/96

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-10-27 21:02:57 -04:00
Linus Torvalds 81ecf91eab SafeSetID changes for v5.10
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEgvWslnM+qUy+sgVg5n2WYw6TPBAFAl+Ifu8ACgkQ5n2WYw6T
 PBCoxA/+Pn0XvwYa6V773lPNjon+Oa94Aq7Wl6YryDMJakiGDJFSJa0tEI8TmRkJ
 z21kjww2Us9gEvfmNoc0t4oDJ98UNAXERjc98fOZgxH1d1urpGUI7qdQ07YCo0xZ
 CDOvqXk/PobGF6p9BpF5QWqEJNq6G8xAKpA8nLa6OUPcjofHroWCgIs86Rl3CtTc
 DwjcOvCgUoTxFm9Vpvm04njFFkVuGUwmXuhyV3Xjh2vNhHvfpP/ibTPmmv1sx4dO
 9WE8BjW0HL5VMzms/BE/mnXmbu2BdPs+PW9/RjQfebbAH8DM3Noqr9f3Db8eqp7t
 TiqU8AO06TEVZa011+V3aywgz9rnH+XJ17TfutB28Z7lG3s4XPZYDgzubJxb1X8M
 4d2mCL3N/ao5otx6FqpgJ2oK0ZceB/voY9qyyfErEBhRumxifl7AQCHxt3LumH6m
 fvvNY+UcN/n7hZPJ7sgZVi/hnnwvO0e1eX0L9ZdNsDjR1bgzBQCdkY53XNxam+rM
 z7tmT3jlDpNtPzOzFCZeiJuTgWYMDdJFqekPLess/Vqaswzc4PPT2lyQ6N81NR5H
 +mzYf/PNIg5fqN8QlMQEkMTv2fnC19dHJT83NPgy4dQObpXzUqYGWAmdKcBxLpnG
 du8wDpPHusChRFMZKRMTXztdMvMAuNqY+KJ6bFojG0Z+qgR7oQk=
 =/anB
 -----END PGP SIGNATURE-----

Merge tag 'safesetid-5.10' of git://github.com/micah-morton/linux

Pull SafeSetID updates from Micah Morton:
 "The changes are mostly contained to within the SafeSetID LSM, with the
  exception of a few 1-line changes to change some ns_capable() calls to
  ns_capable_setid() -- causing a flag (CAP_OPT_INSETID) to be set that
  is examined by SafeSetID code and nothing else in the kernel.

  The changes to SafeSetID internally allow for setting up GID
  transition security policies, as already existed for UIDs"

* tag 'safesetid-5.10' of git://github.com/micah-morton/linux:
  LSM: SafeSetID: Fix warnings reported by test bot
  LSM: SafeSetID: Add GID security policy handling
  LSM: Signal to SafeSetID when setting group IDs
2020-10-25 10:45:26 -07:00
Jens Axboe 91989c7078 task_work: cleanup notification modes
A previous commit changed the notification mode from true/false to an
int, allowing notify-no, notify-yes, or signal-notify. This was
backwards compatible in the sense that any existing true/false user
would translate to either 0 (on notification sent) or 1, the latter
which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2.

Clean this up properly, and define a proper enum for the notification
mode. Now we have:

- TWA_NONE. This is 0, same as before the original change, meaning no
  notification requested.
- TWA_RESUME. This is 1, same as before the original change, meaning
  that we use TIF_NOTIFY_RESUME.
- TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the
  notification.

Clean up all the callers, switching their 0/1/false/true to using the
appropriate TWA_* mode for notifications.

Fixes: e91b481623 ("task_work: teach task_work_add() to do signal_wake_up()")
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-17 15:05:30 -06:00
Linus Torvalds 9ff9b0d392 networking changes for the 5.10 merge window
Add redirect_neigh() BPF packet redirect helper, allowing to limit stack
 traversal in common container configs and improving TCP back-pressure.
 Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.
 
 Expand netlink policy support and improve policy export to user space.
 (Ge)netlink core performs request validation according to declared
 policies. Expand the expressiveness of those policies (min/max length
 and bitmasks). Allow dumping policies for particular commands.
 This is used for feature discovery by user space (instead of kernel
 version parsing or trial and error).
 
 Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge.
 
 Allow more than 255 IPv4 multicast interfaces.
 
 Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
 packets of TCPv6.
 
 In Multi-patch TCP (MPTCP) support concurrent transmission of data
 on multiple subflows in a load balancing scenario. Enhance advertising
 addresses via the RM_ADDR/ADD_ADDR options.
 
 Support SMC-Dv2 version of SMC, which enables multi-subnet deployments.
 
 Allow more calls to same peer in RxRPC.
 
 Support two new Controller Area Network (CAN) protocols -
 CAN-FD and ISO 15765-2:2016.
 
 Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
 kernel problem.
 
 Add TC actions for implementing MPLS L2 VPNs.
 
 Improve nexthop code - e.g. handle various corner cases when nexthop
 objects are removed from groups better, skip unnecessary notifications
 and make it easier to offload nexthops into HW by converting
 to a blocking notifier.
 
 Support adding and consuming TCP header options by BPF programs,
 opening the doors for easy experimental and deployment-specific
 TCP option use.
 
 Reorganize TCP congestion control (CC) initialization to simplify life
 of TCP CC implemented in BPF.
 
 Add support for shipping BPF programs with the kernel and loading them
 early on boot via the User Mode Driver mechanism, hence reusing all the
 user space infra we have.
 
 Support sleepable BPF programs, initially targeting LSM and tracing.
 
 Add bpf_d_path() helper for returning full path for given 'struct path'.
 
 Make bpf_tail_call compatible with bpf-to-bpf calls.
 
 Allow BPF programs to call map_update_elem on sockmaps.
 
 Add BPF Type Format (BTF) support for type and enum discovery, as
 well as support for using BTF within the kernel itself (current use
 is for pretty printing structures).
 
 Support listing and getting information about bpf_links via the bpf
 syscall.
 
 Enhance kernel interfaces around NIC firmware update. Allow specifying
 overwrite mask to control if settings etc. are reset during update;
 report expected max time operation may take to users; support firmware
 activation without machine reboot incl. limits of how much impact
 reset may have (e.g. dropping link or not).
 
 Extend ethtool configuration interface to report IEEE-standard
 counters, to limit the need for per-vendor logic in user space.
 
 Adopt or extend devlink use for debug, monitoring, fw update
 in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw,
 mv88e6xxx, dpaa2-eth).
 
 In mlxsw expose critical and emergency SFP module temperature alarms.
 Refactor port buffer handling to make the defaults more suitable and
 support setting these values explicitly via the DCBNL interface.
 
 Add XDP support for Intel's igb driver.
 
 Support offloading TC flower classification and filtering rules to
 mscc_ocelot switches.
 
 Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
 fixed interval period pulse generator and one-step timestamping in
 dpaa-eth.
 
 Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
 offload.
 
 Add Lynx PHY/PCS MDIO module, and convert various drivers which have
 this HW to use it. Convert mvpp2 to split PCS.
 
 Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
 7-port Mediatek MT7531 IP.
 
 Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
 and wcn3680 support in wcn36xx.
 
 Improve performance for packets which don't require much offloads
 on recent Mellanox NICs by 20% by making multiple packets share
 a descriptor entry.
 
 Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto
 subtree to drivers/net. Move MDIO drivers out of the phy directory.
 
 Clean up a lot of W=1 warnings, reportedly the actively developed
 subsections of networking drivers should now build W=1 warning free.
 
 Make sure drivers don't use in_interrupt() to dynamically adapt their
 code. Convert tasklets to use new tasklet_setup API (sadly this
 conversion is not yet complete).
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+ItRwACgkQMUZtbf5S
 IrtTMg//UxpdR/MirT1DatBU0K/UGAZY82hV7F/UC8tPgjfHZeHvWlDFxfi3YP81
 PtPKbhRZ7DhwBXefUp6nY3UdvjftrJK2lJm8prJUPSsZRye8Wlcb7y65q7/P2y2U
 Efucyopg6RUrmrM0DUsIGYGJgylQLHnMYUl/keCsD4t5Bp4ksyi9R2t5eitGoWzh
 r3QGdbSa0AuWx4iu0i+tqp6Tj0ekMBMXLVb35dtU1t0joj2KTNEnSgABN3prOa8E
 iWYf2erOau68Ogp3yU3miCy0ZU4p/7qGHTtzbcp677692P/ekak6+zmfHLT9/Pjy
 2Stq2z6GoKuVxdktr91D9pA3jxG4LxSJmr0TImcGnXbvkMP3Ez3g9RrpV5fn8j6F
 mZCH8TKZAoD5aJrAJAMkhZmLYE1pvDa7KolSk8WogXrbCnTEb5Nv8FHTS1Qnk3yl
 wSKXuvutFVNLMEHCnWQLtODbTST9DI/aOi6EctPpuOA/ZyL1v3pl+gfp37S+LUTe
 owMnT/7TdvKaTD0+gIyU53M6rAWTtr5YyRQorX9awIu/4Ha0F0gYD7BJZQUGtegp
 HzKt59NiSrFdbSH7UdyemdBF4LuCgIhS7rgfeoUXMXmuPHq7eHXyHZt5dzPPa/xP
 81P0MAvdpFVwg8ij2yp2sHS7sISIRKq17fd1tIewUabxQbjXqPc=
 =bc1U
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:

 - Add redirect_neigh() BPF packet redirect helper, allowing to limit
   stack traversal in common container configs and improving TCP
   back-pressure.

   Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.

 - Expand netlink policy support and improve policy export to user
   space. (Ge)netlink core performs request validation according to
   declared policies. Expand the expressiveness of those policies
   (min/max length and bitmasks). Allow dumping policies for particular
   commands. This is used for feature discovery by user space (instead
   of kernel version parsing or trial and error).

 - Support IGMPv3/MLDv2 multicast listener discovery protocols in
   bridge.

 - Allow more than 255 IPv4 multicast interfaces.

 - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
   packets of TCPv6.

 - In Multi-patch TCP (MPTCP) support concurrent transmission of data on
   multiple subflows in a load balancing scenario. Enhance advertising
   addresses via the RM_ADDR/ADD_ADDR options.

 - Support SMC-Dv2 version of SMC, which enables multi-subnet
   deployments.

 - Allow more calls to same peer in RxRPC.

 - Support two new Controller Area Network (CAN) protocols - CAN-FD and
   ISO 15765-2:2016.

 - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
   kernel problem.

 - Add TC actions for implementing MPLS L2 VPNs.

 - Improve nexthop code - e.g. handle various corner cases when nexthop
   objects are removed from groups better, skip unnecessary
   notifications and make it easier to offload nexthops into HW by
   converting to a blocking notifier.

 - Support adding and consuming TCP header options by BPF programs,
   opening the doors for easy experimental and deployment-specific TCP
   option use.

 - Reorganize TCP congestion control (CC) initialization to simplify
   life of TCP CC implemented in BPF.

 - Add support for shipping BPF programs with the kernel and loading
   them early on boot via the User Mode Driver mechanism, hence reusing
   all the user space infra we have.

 - Support sleepable BPF programs, initially targeting LSM and tracing.

 - Add bpf_d_path() helper for returning full path for given 'struct
   path'.

 - Make bpf_tail_call compatible with bpf-to-bpf calls.

 - Allow BPF programs to call map_update_elem on sockmaps.

 - Add BPF Type Format (BTF) support for type and enum discovery, as
   well as support for using BTF within the kernel itself (current use
   is for pretty printing structures).

 - Support listing and getting information about bpf_links via the bpf
   syscall.

 - Enhance kernel interfaces around NIC firmware update. Allow
   specifying overwrite mask to control if settings etc. are reset
   during update; report expected max time operation may take to users;
   support firmware activation without machine reboot incl. limits of
   how much impact reset may have (e.g. dropping link or not).

 - Extend ethtool configuration interface to report IEEE-standard
   counters, to limit the need for per-vendor logic in user space.

 - Adopt or extend devlink use for debug, monitoring, fw update in many
   drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
   dpaa2-eth).

 - In mlxsw expose critical and emergency SFP module temperature alarms.
   Refactor port buffer handling to make the defaults more suitable and
   support setting these values explicitly via the DCBNL interface.

 - Add XDP support for Intel's igb driver.

 - Support offloading TC flower classification and filtering rules to
   mscc_ocelot switches.

 - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
   fixed interval period pulse generator and one-step timestamping in
   dpaa-eth.

 - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
   offload.

 - Add Lynx PHY/PCS MDIO module, and convert various drivers which have
   this HW to use it. Convert mvpp2 to split PCS.

 - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
   7-port Mediatek MT7531 IP.

 - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
   and wcn3680 support in wcn36xx.

 - Improve performance for packets which don't require much offloads on
   recent Mellanox NICs by 20% by making multiple packets share a
   descriptor entry.

 - Move chelsio inline crypto drivers (for TLS and IPsec) from the
   crypto subtree to drivers/net. Move MDIO drivers out of the phy
   directory.

 - Clean up a lot of W=1 warnings, reportedly the actively developed
   subsections of networking drivers should now build W=1 warning free.

 - Make sure drivers don't use in_interrupt() to dynamically adapt their
   code. Convert tasklets to use new tasklet_setup API (sadly this
   conversion is not yet complete).

* tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
  Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
  net, sockmap: Don't call bpf_prog_put() on NULL pointer
  bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
  bpf, sockmap: Add locking annotations to iterator
  netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
  net: fix pos incrementment in ipv6_route_seq_next
  net/smc: fix invalid return code in smcd_new_buf_create()
  net/smc: fix valid DMBE buffer sizes
  net/smc: fix use-after-free of delayed events
  bpfilter: Fix build error with CONFIG_BPFILTER_UMH
  cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
  net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
  bpf: Fix register equivalence tracking.
  rxrpc: Fix loss of final ack on shutdown
  rxrpc: Fix bundle counting for exclusive connections
  netfilter: restore NF_INET_NUMHOOKS
  ibmveth: Identify ingress large send packets.
  ibmveth: Switch order of ibmveth_helper calls.
  cxgb4: handle 4-tuple PEDIT to NAT mode translation
  selftests: Add VRF route leaking tests
  ...
2020-10-15 18:42:13 -07:00
Linus Torvalds 840e5bb326 integrity-v5.10
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEjSMCCC7+cjo3nszSa3kkZrA+cVoFAl+GX1oUHHpvaGFyQGxp
 bnV4LmlibS5jb20ACgkQa3kkZrA+cVpACRAAqkLjZZioCl8SB2PgtOvfNGmK8b70
 j4RIWKVFnzZVq6cjzc6OY/ujjOg1Psuyr48g//5fLAZVpqD7RLv0s12npZ/Q+Pim
 8uInUh4G4TKZlcPmsMA2uC31NmK6xwVz2+rQjQUB8XP0ZWq392M+nrcjg2nPU1r0
 ozXg0zefY/NAJwpgJlZfjxCs2YhLYe6ooqBF5Hw6kiBgWEW7O3cgBCeW3zXv9CDA
 TTh7bL8Y3tLiB9DYat6alfT/IU9tb9GLCgWMxmzb+MeAiSjKLWG/9wMvsAW/7M69
 ECARmf28zBNjRU8OZaf615q6hXp3JghYJNpirob3B8MX6galA5oux6sOecrXxduR
 yEexPR2HWiCgazcN/a1NkE9nGxICsrOmLoiYdAs4pz7Csqlj4hY0HQkL8HLQzD7U
 MTXvdZMAd35cFDb2zMGWSnOvJrX7RlvulkgAkFpM5y3WjddY1R0hdf4fs5dYrfxb
 CVi+40ZCO/Xt4c689Jnga/nwFABgMhU3XCYHgZz5tBv/3YW41xgM7HwK0WskunFM
 xQ3zNHnj9ZmjVfwuVH+yQU16FgSjJ7rdrMj1NucH8xColPnfr8erSBIc3UWoMGPb
 zq0E9y8LXebI8HVmymDiWLSjH0CQ86VtSGfmtJWB0oE3ad8DZ0HW9rydoYFzgc6+
 VZISjEfZN5t7xws=
 =qyUG
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "Continuing IMA policy rule cleanup and validation in particular for
  measuring keys, adding/removing/updating informational and error
  messages (e.g. "ima_appraise" boot command line option), and other bug
  fixes (e.g. minimal data size validation before use, return code and
  NULL pointer checking)"

* tag 'integrity-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: Fix NULL pointer dereference in ima_file_hash
  evm: Check size of security.evm before using it
  ima: Remove semicolon at the end of ima_get_binary_runtime_size()
  ima: Don't ignore errors from crypto_shash_update()
  ima: Use kmemdup rather than kmalloc+memcpy
  integrity: include keyring name for unknown key request
  ima: limit secure boot feedback scope for appraise
  integrity: invalid kernel parameters feedback
  ima: add check for enforced appraise option
  integrity: Use current_uid() in integrity_audit_message()
  ima: Fail rule parsing when asymmetric key measurement isn't supportable
  ima: Pre-parse the list of keyrings in a KEY_CHECK rule
2020-10-15 15:58:18 -07:00
Linus Torvalds 726eb70e0d Char/Misc driver patches for 5.10-rc1
Here is the big set of char, misc, and other assorted driver subsystem
 patches for 5.10-rc1.
 
 There's a lot of different things in here, all over the drivers/
 directory.  Some summaries:
 	- soundwire driver updates
 	- habanalabs driver updates
 	- extcon driver updates
 	- nitro_enclaves new driver
 	- fsl-mc driver and core updates
 	- mhi core and bus updates
 	- nvmem driver updates
 	- eeprom driver updates
 	- binder driver updates and fixes
 	- vbox minor bugfixes
 	- fsi driver updates
 	- w1 driver updates
 	- coresight driver updates
 	- interconnect driver updates
 	- misc driver updates
 	- other minor driver updates
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCX4g8YQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yngKgCeNpArCP/9vQJRK9upnDm8ZLunSCUAn1wUT/2A
 /bTQ42c/WRQ+LU828GSM
 =6sO2
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of char, misc, and other assorted driver subsystem
  patches for 5.10-rc1.

  There's a lot of different things in here, all over the drivers/
  directory. Some summaries:

   - soundwire driver updates

   - habanalabs driver updates

   - extcon driver updates

   - nitro_enclaves new driver

   - fsl-mc driver and core updates

   - mhi core and bus updates

   - nvmem driver updates

   - eeprom driver updates

   - binder driver updates and fixes

   - vbox minor bugfixes

   - fsi driver updates

   - w1 driver updates

   - coresight driver updates

   - interconnect driver updates

   - misc driver updates

   - other minor driver updates

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (396 commits)
  binder: fix UAF when releasing todo list
  docs: w1: w1_therm: Fix broken xref, mistakes, clarify text
  misc: Kconfig: fix a HISI_HIKEY_USB dependency
  LSM: Fix type of id parameter in kernel_post_load_data prototype
  misc: Kconfig: add a new dependency for HISI_HIKEY_USB
  firmware_loader: fix a kernel-doc markup
  w1: w1_therm: make w1_poll_completion static
  binder: simplify the return expression of binder_mmap
  test_firmware: Test partial read support
  firmware: Add request_partial_firmware_into_buf()
  firmware: Store opt_flags in fw_priv
  fs/kernel_file_read: Add "offset" arg for partial reads
  IMA: Add support for file reads without contents
  LSM: Add "contents" flag to kernel_read_file hook
  module: Call security_kernel_post_load_data()
  firmware_loader: Use security_post_load_data()
  LSM: Introduce kernel_post_load_data() hook
  fs/kernel_read_file: Add file_size output argument
  fs/kernel_read_file: Switch buffer size arg to size_t
  fs/kernel_read_file: Remove redundant size argument
  ...
2020-10-15 10:01:51 -07:00
Linus Torvalds 7b540812cc selinux/stable-5.10 PR 20201012
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl+E9UoUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMG2BAApHLKLsfH5gf7gZNjHmQxddg8maCl
 BGt7K1xc9iYBZN56Cbc7v9uKc5pM+UOoOlVmWh+8jaROpX10jJmvhsebQzpcWEEs
 O/BDg/Y/AafoLr5e7gbAnlA7TJXNSR9MG9RB7c9xC14LG/bqBmkaUNsv8isWlLgl
 J2atHLsdlvCbmqJvnc6Fh3VJCbY/I0kt9L04GBQ4pEK3TKOxtORQaQcjVgLhlcw9
 YdMPKYIwy2Ze2HUuyW2o9OuryHhoMrwxpN/35/PAxrRwpO0LVnjjiw6njQqYVGH3
 el8mPXlhHah/7QUKcngSsvcvUcaSencp9sUBrp1vK9C1vkSFyubZweVi4A2TEWnh
 Ctceje7XP/YWDcJ+5BgASvosQdqOBB7huuOOKVpvaBXqgUHFgaxphV4/FDNnlF62
 AteX5RcWb/JiFJ4YnbknPNa/MWxVYuVn78AlNsM2ZponWYWs9JZ17lX4tHAKF1Qm
 x6ZMvMCDJTj8622l8nw3dTZKNDE3nFblDThX8aSrAhCQQE6HvugbKU4Fzo1oiSPl
 84PlCPgb+3tP3OsvZDIOPCJxC6IHgS+meA0IjhjwuCb+U+YWaAIeOlOPSkxUmfLu
 iJVWHmDtsAM3bTBxwQudhgXF3a1oKCEqeqNxM6P6p55jti7xal9FnZNHTbSh2sO1
 Km4oIqTEb1XWNdU=
 =NNLw
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "A decent number of SELinux patches for v5.10, twenty two in total. The
  highlights are listed below, but all of the patches pass our test
  suite and merge cleanly.

   - A number of changes to how the SELinux policy is loaded and managed
     inside the kernel with the goal of improving the atomicity of a
     SELinux policy load operation.

     These changes account for the bulk of the diffstat as well as the
     patch count. A special thanks to everyone who contributed patches
     and fixes for this work.

   - Convert the SELinux policy read-write lock to RCU.

   - A tracepoint was added for audited SELinux access control events;
     this should help provide a more unified backtrace across kernel and
     userspace.

   - Allow the removal of security.selinux xattrs when a SELinux policy
     is not loaded.

   - Enable policy capabilities in SELinux policies created with the
     scripts/selinux/mdp tool.

   - Provide some "no sooner than" dates for the SELinux checkreqprot
     sysfs deprecation"

* tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (22 commits)
  selinux: provide a "no sooner than" date for the checkreqprot removal
  selinux: Add helper functions to get and set checkreqprot
  selinux: access policycaps with READ_ONCE/WRITE_ONCE
  selinux: simplify away security_policydb_len()
  selinux: move policy mutex to selinux_state, use in lockdep checks
  selinux: fix error handling bugs in security_load_policy()
  selinux: convert policy read-write lock to RCU
  selinux: delete repeated words in comments
  selinux: add basic filtering for audit trace events
  selinux: add tracepoint on audited events
  selinux: Create new booleans and class dirs out of tree
  selinux: Standardize string literal usage for selinuxfs directory names
  selinux: Refactor selinuxfs directory populating functions
  selinux: Create function for selinuxfs directory cleanup
  selinux: permit removing security.selinux xattr before policy load
  selinux: fix memdup.cocci warnings
  selinux: avoid dereferencing the policy prior to initialization
  selinux: fix allocation failure check on newpolicy->sidtab
  selinux: refactor changing booleans
  selinux: move policy commit after updating selinuxfs
  ...
2020-10-13 16:29:55 -07:00
Linus Torvalds 99a6740f88 Smack LSM changes for Linux 5.10
Two kernel test robot suggested clean-ups.
 Teach Smack to use the IPv4 netlabel cache.
 This results in a 12-14% improvement on TCP benchmarks.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl+EkO0XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFmWBAAqJzkYd1Pd3g0vewZXGe5mRxs
 Qqt6i3HMJ7yAPIJN3445DyHUB6Sad2jmGXGullROfUtJYFkTbxqIj0XT+RUrJFap
 qf54Sgzt/ucNTvZ45s5mV6iYdL2ol7NSB9kQOMVMV0BZf2GXknYv0/rP23wI/upo
 Ylk41UYZNE5RDwoH90sy9aUUJ6VHhMYJOj33aUXz6rwaO1Ck6xvu1tCDudJpERkU
 kUmgpYzkNVGmg+GQIOT/xr7C7LRFxfnbSX0UL5TYPEKC23tYfkSBoClgCpNSSYSh
 3+7+YNRjMjxuKYJEPTUliDeDMQLxQB/3tLMD2c92nADtTJyRkTZelSgFhDFiOW0L
 ln5otQ0aGuDvkmLmBLk/jcq3eKctztqJIQdxQG9K3kJdohz1t/onJJt3cwuWZIo5
 T8dKz/mhga11u0ii6Zk+ecDdpagcxrQ8FDKcjI1tgXtTKAEvjfhVwAbj/9X55BGY
 T3M52b6RuE/ZRPD/BKXMaAUTNI0jwYbJyZYm+5GY/fk6lg0CRNaPBTVr0hj1Ia4P
 7akSD1gYUSTblFvjGa+hBsnmUDHo7Htl+kXJ3v6U8bSPFgFsO7hLbZJR6nRbxDvV
 k/KejR7RsORkbTFczWGngTKGI2UUNgMJf3Km9WU5bLqU3PpM32Hlal3CLMm3dQTB
 /3Gfl6aCy7Ct72tNilw=
 =EgBn
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.10' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Two minor fixes and one performance enhancement to Smack. The
  performance improvement is significant and the new code is more like
  its counterpart in SELinux.

   - Two kernel test robot suggested clean-ups.

   - Teach Smack to use the IPv4 netlabel cache. This results in a
     12-14% improvement on TCP benchmarks"

* tag 'Smack-for-5.10' of git://github.com/cschaufler/smack-next:
  Smack: Remove unnecessary variable initialization
  Smack: Fix build when NETWORK_SECMARK is not set
  Smack: Use the netlabel cache
  Smack: Set socket labels only once
  Smack: Consolidate uses of secmark into a function
2020-10-13 16:18:51 -07:00
Linus Torvalds b274279a0b One patch for making it possible to execute usermode driver's path.
tomoyo: Loosen pathname/domainname validation.
 
  security/tomoyo/util.c |   29 +++++++++++++++++++++++------
  1 file changed, 23 insertions(+), 6 deletions(-)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJfhDj0AAoJEEJfEo0MZPUqKBYP/in6mwhoEs8EczqSJ04cRYf9
 zD1BRwubpwsiQgD6TBxHcJ1Kplpd1vp+IRtLYCG38HDZlLjRnSMKDn6dOUEwi3bf
 Uc8cBZ4FgUSxlFIyBoUjiG5IbUSndFbigJrbrMno8xKEedUcnoSntDTYsfDjuHAL
 5G6yBgCGZd3UI57Utgt+se73eptdseTLtGlCd/fD6tfZJpwiWpBlRdqM3C6PVAS5
 M9hFblYTVcR7mh1zB2fGxclgX0PIB9l8eq24yrWqOMyGaQP1C7aFuoonTxJbh295
 g4Ea5jLBmZkvjE0L1Wxu9WFLBfdepNVnKoDLayKasLFIl4OLoWUad1R+ALb5RhgM
 6pVlaJO7FjSKOc1gTq3R8WGUI0xUhP3BEwRKOron0x5n2CDew0+qZ0GIntv9K27A
 mSzk4US4T2rwKUp6L5kwsCSvh2GEvYrEdKACz8Ey0hqLSJgoUgowISh5+dpCNSd/
 GosDlZ1m35MyINT827YTVqzg7KN+HGrH/dt/vfHXCAprpcr33RBS/UniUqERA5y4
 aNHY+bwy2/uw6UUmpwwNuAVM97jIGQAfr13CxCOztCo4oXos4ksFU/BEp2fh6rJV
 5zW1+QoEQOKuDNvRT+KeFPxXunX45FcRceh7SumFvGA0vk9X6kMS5I1B2gmcMqIT
 3BZYCqGPm73o0xzhE3Xf
 =onIp
 -----END PGP SIGNATURE-----

Merge tag 'tomoyo-pr-20201012' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1

Pull tomoyo fix from Tetsuo HandaL
 "One patch to make it possible to execute usermode-driver's path"

* tag 'tomoyo-pr-20201012' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
  tomoyo: Loosen pathname/domainname validation.
2020-10-13 16:10:37 -07:00
Thomas Cedeno 03ca0ec138 LSM: SafeSetID: Fix warnings reported by test bot
Fix multiple cast-to-union warnings related to casting kuid_t and kgid_t
types to kid_t union type. Also fix incompatible type warning that
arises from accidental omission of "__rcu" qualifier on the struct
setid_ruleset pointer in the argument list for safesetid_file_read().

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Cedeno <thomascedeno@google.com>
Signed-off-by: Micah Morton <mortonm@chromium.org>
2020-10-13 09:17:36 -07:00
Thomas Cedeno 5294bac97e LSM: SafeSetID: Add GID security policy handling
The SafeSetID LSM has functionality for restricting setuid() calls based
on its configured security policies. This patch adds the analogous
functionality for setgid() calls. This is mostly a copy-and-paste change
with some code deduplication, plus slight modifications/name changes to
the policy-rule-related structs (now contain GID rules in addition to
the UID ones) and some type generalization since SafeSetID now needs to
deal with kgid_t and kuid_t types.

Signed-off-by: Thomas Cedeno <thomascedeno@google.com>
Signed-off-by: Micah Morton <mortonm@chromium.org>
2020-10-13 09:17:35 -07:00
Linus Torvalds 39a5101f98 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Allow DRBG testing through user-space af_alg
   - Add tcrypt speed testing support for keyed hashes
   - Add type-safe init/exit hooks for ahash

  Algorithms:
   - Mark arc4 as obsolete and pending for future removal
   - Mark anubis, khazad, sead and tea as obsolete
   - Improve boot-time xor benchmark
   - Add OSCCA SM2 asymmetric cipher algorithm and use it for integrity

  Drivers:
   - Fixes and enhancement for XTS in caam
   - Add support for XIP8001B hwrng in xiphera-trng
   - Add RNG and hash support in sun8i-ce/sun8i-ss
   - Allow imx-rngc to be used by kernel entropy pool
   - Use crypto engine in omap-sham
   - Add support for Ingenic X1830 with ingenic"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (205 commits)
  X.509: Fix modular build of public_key_sm2
  crypto: xor - Remove unused variable count in do_xor_speed
  X.509: fix error return value on the failed path
  crypto: bcm - Verify GCM/CCM key length in setkey
  crypto: qat - drop input parameter from adf_enable_aer()
  crypto: qat - fix function parameters descriptions
  crypto: atmel-tdes - use semicolons rather than commas to separate statements
  crypto: drivers - use semicolons rather than commas to separate statements
  hwrng: mxc-rnga - use semicolons rather than commas to separate statements
  hwrng: iproc-rng200 - use semicolons rather than commas to separate statements
  hwrng: stm32 - use semicolons rather than commas to separate statements
  crypto: xor - use ktime for template benchmarking
  crypto: xor - defer load time benchmark to a later time
  crypto: hisilicon/zip - fix the uninitalized 'curr_qm_qp_num'
  crypto: hisilicon/zip - fix the return value when device is busy
  crypto: hisilicon/zip - fix zero length input in GZIP decompress
  crypto: hisilicon/zip - fix the uncleared debug registers
  lib/mpi: Fix unused variable warnings
  crypto: x86/poly1305 - Remove assignments with no effect
  hwrng: npcm - modify readl to readb
  ...
2020-10-13 08:50:16 -07:00
Linus Torvalds 85ed13e78d Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull compat iovec cleanups from Al Viro:
 "Christoph's series around import_iovec() and compat variant thereof"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  security/keys: remove compat_keyctl_instantiate_key_iov
  mm: remove compat_process_vm_{readv,writev}
  fs: remove compat_sys_vmsplice
  fs: remove the compat readv/writev syscalls
  fs: remove various compat readv/writev helpers
  iov_iter: transparently handle compat iovecs in import_iovec
  iov_iter: refactor rw_copy_check_uvector and import_iovec
  iov_iter: move rw_copy_check_uvector() into lib/iov_iter.c
  compat.h: fix a spelling error in <linux/compat.h>
2020-10-12 16:35:51 -07:00
Linus Torvalds e6412f9833 EFI changes for v5.10:
- Preliminary RISC-V enablement - the bulk of it will arrive via the RISCV tree.
 
  - Relax decompressed image placement rules for 32-bit ARM
 
  - Add support for passing MOK certificate table contents via a config table
    rather than a EFI variable.
 
  - Add support for 18 bit DIMM row IDs in the CPER records.
 
  - Work around broken Dell firmware that passes the entire Boot#### variable
    contents as the command line
 
  - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we can
    identify it in the memory map listings.
 
  - Don't abort the boot on arm64 if the EFI RNG protocol is available but
    returns with an error
 
  - Replace slashes with exclamation marks in efivarfs file names
 
  - Split efi-pstore from the deprecated efivars sysfs code, so we can
    disable the latter on !x86.
 
  - Misc fixes, cleanups and updates.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+Ec9QRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1inTQ//TYj3kJq/7sWfUAxmAsWnUEC005YCNf0T
 x3kJQv3zYX4Rl4eEwkff8S1PrqqvwUP5yUZYApp8HD9s9CYvzz5iG5xtf/jX+QaV
 06JnTMnkoycx2NaOlbr1cmcIn4/cAhQVYbVCeVrlf7QL8enNTBr5IIQmo4mgP8Lc
 mauSsO1XU8ZuMQM+JcZSxAkAPxlhz3dbR5GteP4o2K4ShQKpiTCOfOG1J3FvUYba
 s1HGnhHFlkQr6m3pC+iG8dnAG0YtwHMH1eJVP7mbeKUsMXz944U8OVXDWxtn81pH
 /Xt/aFZXnoqwlSXythAr6vFTuEEn40n+qoOK6jhtcGPUeiAFPJgiaeAXw3gO0YBe
 Y8nEgdGfdNOMih94McRd4M6gB/N3vdqAGt+vjiZSCtzE+nTWRyIXSGCXuDVpkvL4
 VpEXpPINnt1FZZ3T/7dPro4X7pXALhODE+pl36RCbfHVBZKRfLV1Mc1prAUGXPxW
 E0MfaM9TxDnVhs3VPWlHmRgavee2MT1Tl/ES4CrRHEoz8ZCcu4MfROQyao8+Gobr
 VR+jVk+xbyDrykEc6jdAK4sDFXpTambuV624LiKkh6Mc4yfHRhPGrmP5c5l7SnCd
 aLp+scQ4T7sqkLuYlXpausXE3h4sm5uur5hNIRpdlvnwZBXpDEpkzI8x0C9OYr0Q
 kvFrreQWPLQ=
 =ZNI8
 -----END PGP SIGNATURE-----

Merge tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI changes from Ingo Molnar:

 - Preliminary RISC-V enablement - the bulk of it will arrive via the
   RISCV tree.

 - Relax decompressed image placement rules for 32-bit ARM

 - Add support for passing MOK certificate table contents via a config
   table rather than a EFI variable.

 - Add support for 18 bit DIMM row IDs in the CPER records.

 - Work around broken Dell firmware that passes the entire Boot####
   variable contents as the command line

 - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we
   can identify it in the memory map listings.

 - Don't abort the boot on arm64 if the EFI RNG protocol is available
   but returns with an error

 - Replace slashes with exclamation marks in efivarfs file names

 - Split efi-pstore from the deprecated efivars sysfs code, so we can
   disable the latter on !x86.

 - Misc fixes, cleanups and updates.

* tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  efi: mokvar: add missing include of asm/early_ioremap.h
  efi: efivars: limit availability to X86 builds
  efi: remove some false dependencies on CONFIG_EFI_VARS
  efi: gsmi: fix false dependency on CONFIG_EFI_VARS
  efi: efivars: un-export efivars_sysfs_init()
  efi: pstore: move workqueue handling out of efivars
  efi: pstore: disentangle from deprecated efivars module
  efi: mokvar-table: fix some issues in new code
  efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure
  efivarfs: Replace invalid slashes with exclamation marks in dentries.
  efi: Delete deprecated parameter comments
  efi/libstub: Fix missing-prototypes in string.c
  efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report it
  cper,edac,efi: Memory Error Record: bank group/address and chip id
  edac,ghes,cper: Add Row Extension to Memory Error Record
  efi/x86: Add a quirk to support command line arguments on Dell EFI firmware
  efi/libstub: Add efi_warn and *_once logging helpers
  integrity: Load certs from the EFI MOK config table
  integrity: Move import of MokListRT certs to a separate routine
  efi: Support for MOK variable config table
  ...
2020-10-12 13:26:49 -07:00
Tetsuo Handa a207516776 tomoyo: Loosen pathname/domainname validation.
Since commit e2dc9bf3f5 ("umd: Transform fork_usermode_blob into
fork_usermode_driver") started calling execve() on a program written in
a local mount which is not connected to mount tree,
tomoyo_realpath_from_path() started returning a pathname in
"$fsname:/$pathname" format which violates TOMOYO's domainname rule that
it must start with "<$namespace>" followed by zero or more repetitions of
pathnames which start with '/'.

Since $fsname must not contain '.' since commit 79c0b2df79 ("add
filesystem subtype support"), tomoyo_correct_path() can recognize a token
which appears '/' before '.' appears (e.g. proc:/self/exe ) as a pathname
while rejecting a token which appears '.' before '/' appears (e.g.
exec.realpath="/bin/bash" ) as a condition parameter.

Therefore, accept domainnames which contain pathnames which do not start
with '/' but contain '/' before '.' (e.g. <kernel> tmpfs:/bpfilter_umh ).

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
2020-10-12 19:53:34 +09:00
Casey Schaufler edd615371b Smack: Remove unnecessary variable initialization
The initialization of rc in smack_from_netlbl() is pointless.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-10-05 14:20:51 -07:00
Kees Cook 0fa8e08464 fs/kernel_file_read: Add "offset" arg for partial reads
To perform partial reads, callers of kernel_read_file*() must have a
non-NULL file_size argument and a preallocated buffer. The new "offset"
argument can then be used to seek to specific locations in the file to
fill the buffer to, at most, "buf_size" per call.

Where possible, the LSM hooks can report whether a full file has been
read or not so that the contents can be reasoned about.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201002173828.2099543-14-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:04 +02:00
Scott Branden 34736daeec IMA: Add support for file reads without contents
When the kernel_read_file LSM hook is called with contents=false, IMA
can appraise the file directly, without requiring a filled buffer. When
such a buffer is available, though, IMA can continue to use it instead
of forcing a double read here.

Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/lkml/20200706232309.12010-10-scott.branden@broadcom.com/
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-13-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:04 +02:00
Kees Cook 2039bda1fa LSM: Add "contents" flag to kernel_read_file hook
As with the kernel_load_data LSM hook, add a "contents" flag to the
kernel_read_file LSM hook that indicates whether the LSM can expect
a matching call to the kernel_post_read_file LSM hook with the full
contents of the file. With the coming addition of partial file read
support for kernel_read_file*() API, the LSM will no longer be able
to always see the entire contents of a file during the read calls.

For cases where the LSM must read examine the complete file contents,
it will need to do so on its own every time the kernel_read_file
hook is called with contents=false (or reject such cases). Adjust all
existing LSMs to retain existing behavior.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-12-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook 4f2d99b06b firmware_loader: Use security_post_load_data()
Now that security_post_load_data() is wired up, use it instead
of the NULL file argument style of security_post_read_file(),
and update the security_kernel_load_data() call to indicate that a
security_kernel_post_load_data() call is expected.

Wire up the IMA check to match earlier logic. Perhaps a generalized
change to ima_post_load_data() might look something like this:

    return process_buffer_measurement(buf, size,
                                      kernel_load_data_id_str(load_id),
                                      read_idmap[load_id] ?: FILE_CHECK,
                                      0, NULL);

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-10-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook b64fcae74b LSM: Introduce kernel_post_load_data() hook
There are a few places in the kernel where LSMs would like to have
visibility into the contents of a kernel buffer that has been loaded or
read. While security_kernel_post_read_file() (which includes the
buffer) exists as a pairing for security_kernel_read_file(), no such
hook exists to pair with security_kernel_load_data().

Earlier proposals for just using security_kernel_post_read_file() with a
NULL file argument were rejected (i.e. "file" should always be valid for
the security_..._file hooks, but it appears at least one case was
left in the kernel during earlier refactoring. (This will be fixed in
a subsequent patch.)

Since not all cases of security_kernel_load_data() can have a single
contiguous buffer made available to the LSM hook (e.g. kexec image
segments are separately loaded), there needs to be a way for the LSM to
reason about its expectations of the hook coverage. In order to handle
this, add a "contents" argument to the "kernel_load_data" hook that
indicates if the newly added "kernel_post_load_data" hook will be called
with the full contents once loaded. That way, LSMs requiring full contents
can choose to unilaterally reject "kernel_load_data" with contents=false
(which is effectively the existing hook coverage), but when contents=true
they can allow it and later evaluate the "kernel_post_load_data" hook
once the buffer is loaded.

With this change, LSMs can gain coverage over non-file-backed data loads
(e.g. init_module(2) and firmware userspace helper), which will happen
in subsequent patches.

Additionally prepare IMA to start processing these cases.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: KP Singh <kpsingh@google.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-9-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook 885352881f fs/kernel_read_file: Add file_size output argument
In preparation for adding partial read support, add an optional output
argument to kernel_read_file*() that reports the file size so callers
can reason more easily about their reading progress.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-8-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:37:03 +02:00
Kees Cook 113eeb5177 fs/kernel_read_file: Switch buffer size arg to size_t
In preparation for further refactoring of kernel_read_file*(), rename
the "max_size" argument to the more accurate "buf_size", and correct
its type to size_t. Add kerndoc to explain the specifics of how the
arguments will be used. Note that with buf_size now size_t, it can no
longer be negative (and was never called with a negative value). Adjust
callers to use it as a "maximum size" when *buf is NULL.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-7-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:19 +02:00
Kees Cook f7a4f689bc fs/kernel_read_file: Remove redundant size argument
In preparation for refactoring kernel_read_file*(), remove the redundant
"size" argument which is not needed: it can be included in the return
code, with callers adjusted. (VFS reads already cannot be larger than
INT_MAX.)

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Link: https://lore.kernel.org/r/20201002173828.2099543-6-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:18 +02:00
Scott Branden b89999d004 fs/kernel_read_file: Split into separate include file
Move kernel_read_file* out of linux/fs.h to its own linux/kernel_read_file.h
include file. That header gets pulled in just about everywhere
and doesn't really need functions not related to the general fs interface.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/r/20200706232309.12010-2-scott.branden@broadcom.com
Link: https://lore.kernel.org/r/20201002173828.2099543-4-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:18 +02:00
Kees Cook c307459b9d fs/kernel_read_file: Remove FIRMWARE_PREALLOC_BUFFER enum
FIRMWARE_PREALLOC_BUFFER is a "how", not a "what", and confuses the LSMs
that are interested in filtering between types of things. The "how"
should be an internal detail made uninteresting to the LSMs.

Fixes: a098ecd2fa ("firmware: support loading into a pre-allocated buffer")
Fixes: fd90bc559b ("ima: based on policy verify firmware signatures (pre-allocated buffer)")
Fixes: 4f0496d8ff ("ima: based on policy warn about loading firmware (pre-allocated buffer)")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201002173828.2099543-2-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-05 13:34:18 +02:00
Christoph Hellwig 5d47b39479 security/keys: remove compat_keyctl_instantiate_key_iov
Now that import_iovec handles compat iovecs, the native version of
keyctl_instantiate_key_iov can be used for the compat case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:16 -04:00
Christoph Hellwig 89cd35c58b iov_iter: transparently handle compat iovecs in import_iovec
Use in compat_syscall to import either native or the compat iovecs, and
remove the now superflous compat_import_iovec.

This removes the need for special compat logic in most callers, and
the remaining ones can still be simplified by using __import_iovec
with a bool compat parameter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03 00:02:13 -04:00
Tianjia Zhang 0b7e44d39c integrity: Asymmetric digsig supports SM2-with-SM3 algorithm
Asymmetric digsig supports SM2-with-SM3 algorithm combination,
so that IMA can also verify SM2's signature data.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Tested-by: Xufeng Zhang <yunbo.xufeng@linux.alibaba.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-09-25 17:48:55 +10:00
David S. Miller 3ab0a7a0c3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Two minor conflicts:

1) net/ipv4/route.c, adding a new local variable while
   moving another local variable and removing it's
   initial assignment.

2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes.
   One pretty prints the port mode differently, whilst another
   changes the driver to try and obtain the port mode from
   the port node rather than the switch node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-22 16:45:34 -07:00
Casey Schaufler bf0afe673b Smack: Fix build when NETWORK_SECMARK is not set
Use proper conditional compilation for the secmark field in
the network skb.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-22 14:59:31 -07:00
KP Singh aa662fc04f ima: Fix NULL pointer dereference in ima_file_hash
ima_file_hash can be called when there is no iint->ima_hash available
even though the inode exists in the integrity cache. It is fairly
common for a file to not have a hash. (e.g. an mknodat, prior to the
file being closed).

Another example where this can happen (suggested by Jann Horn):

Process A does:

	while(1) {
		unlink("/tmp/imafoo");
		fd = open("/tmp/imafoo", O_RDWR|O_CREAT|O_TRUNC, 0700);
		if (fd == -1) {
			perror("open");
			continue;
		}
		write(fd, "A", 1);
		close(fd);
	}

and Process B does:

	while (1) {
		int fd = open("/tmp/imafoo", O_RDONLY);
		if (fd == -1)
			continue;
    		char *mapping = mmap(NULL, 0x1000, PROT_READ|PROT_EXEC,
			 	     MAP_PRIVATE, fd, 0);
		if (mapping != MAP_FAILED)
			munmap(mapping, 0x1000);
		close(fd);
  	}

Due to the race to get the iint->mutex between ima_file_hash and
process_measurement iint->ima_hash could still be NULL.

Fixes: 6beea7afcc ("ima: add the ability to query the cached hash of a given file")
Signed-off-by: KP Singh <kpsingh@google.com>
Reviewed-by: Florent Revest <revest@chromium.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-16 17:43:02 -04:00
Lenny Szubowicz 726bd8965a integrity: Load certs from the EFI MOK config table
Because of system-specific EFI firmware limitations, EFI volatile
variables may not be capable of holding the required contents of
the Machine Owner Key (MOK) certificate store when the certificate
list grows above some size. Therefore, an EFI boot loader may pass
the MOK certs via a EFI configuration table created specifically for
this purpose to avoid this firmware limitation.

An EFI configuration table is a much more primitive mechanism
compared to EFI variables and is well suited for one-way passage
of static information from a pre-OS environment to the kernel.

This patch adds the support to load certs from the MokListRT
entry in the MOK variable configuration table, if it's present.
The pre-existing support to load certs from the MokListRT EFI
variable remains and is used if the EFI MOK configuration table
isn't present or can't be successfully used.

Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Link: https://lore.kernel.org/r/20200905013107.10457-4-lszubowi@redhat.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-16 18:53:42 +03:00
Lenny Szubowicz 38a1f03aa2 integrity: Move import of MokListRT certs to a separate routine
Move the loading of certs from the UEFI MokListRT into a separate
routine to facilitate additional MokList functionality.

There is no visible functional change as a result of this patch.
Although the UEFI dbx certs are now loaded before the MokList certs,
they are loaded onto different key rings. So the order of the keys
on their respective key rings is the same.

Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Link: https://lore.kernel.org/r/20200905013107.10457-3-lszubowi@redhat.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2020-09-16 18:53:42 +03:00
Linus Torvalds 1e484d3887 device_cgroup RCU warning fix from Amol Grover <frextrite@gmail.com>
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAl9hIXMACgkQrZhLv9lQ
 BTy34RAAkt6BAcEPA9hSMkvRCrA44Doq6jaK45vmMExuWh2/QMvfK1E4KjxXGn0U
 e74TcCPgh+AuSWQABAuZrMVx4Ai9fyDBWkhtzwz7XsHeUtUvEMUkb2fzKRGoBg6h
 6WvYtzdO4NHEcy3Lf59EYW2Hm08eEjZfmVMRlCF6MoLmsj/ifh+yQ3Xxy0RAd/Jo
 X4IvSwan6EitXNEHy7onmpDjL7BvncXs1dXpGXqHzhLF8W4EtFmIZGH3T5/W82n0
 IgtEqqsCw5MY5mSIixjUPcRxbi+NUkymEzYQyvceVU0W+voMITQ8Qb/NkGMklMLE
 KUHwP1r4q1XR1WVFqHxRCPB4c+njNwiTUtAO44ODNNgC1R+wT70CGhujP1bSW6Eo
 Gf5DWJniD9I8viBWD5tYBFWPBlH+DfURY8wqkrEEC8fntsIDkSWf2XK2dVBrDxMM
 PxXOYEKfZVIRQTRAz/HJmCAoW8rVkCa5ptpKFJzWvoLqS3FclFRg0i1FZ6fcMuz1
 4phZCL+pGDSp3yhSi6lamdhPhRPq9Pbk4ZVSPK2gAg4VzhI6w7TY/zicZapaRQ0g
 hScOmTk4YKqLhUbcWiBErH/AwV6op+H4DwG/A1z8ASUxQST5oiJk0d7dMGCF9cgG
 VuHXj/dQbtymUMjo7MSqLqpx0ieEarEqZ2BOgn1DVTYgn4c1yms=
 =DgJN
 -----END PGP SIGNATURE-----

Merge tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security layer fix from James  Morris:
 "A device_cgroup RCU warning fix from Amol Grover"

* tag 'fixes-v5.9a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  device_cgroup: Fix RCU list debugging warning
2020-09-15 16:26:57 -07:00
Lakshmi Ramasubramanian 8861d0af64 selinux: Add helper functions to get and set checkreqprot
checkreqprot data member in selinux_state struct is accessed directly by
SELinux functions to get and set. This could cause unexpected read or
write access to this data member due to compiler optimizations and/or
compiler's reordering of access to this field.

Add helper functions to get and set checkreqprot data member in
selinux_state struct. These helper functions use READ_ONCE and
WRITE_ONCE macros to ensure atomic read or write of memory for
this data member.

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Suggested-by: Paul Moore <paul@paul-moore.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-09-15 14:36:28 -04:00
Roberto Sassu 455b6c9112 evm: Check size of security.evm before using it
This patch checks the size for the EVM_IMA_XATTR_DIGSIG and
EVM_XATTR_PORTABLE_DIGSIG types to ensure that the algorithm is read from
the buffer returned by vfs_getxattr_alloc().

Cc: stable@vger.kernel.org # 4.19.x
Fixes: 5feeb61183 ("evm: Allow non-SHA1 digital signatures")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 13:47:42 -04:00
Roberto Sassu 4be92db3b5 ima: Remove semicolon at the end of ima_get_binary_runtime_size()
This patch removes the unnecessary semicolon at the end of
ima_get_binary_runtime_size().

Cc: stable@vger.kernel.org
Fixes: d158847ae8 ("ima: maintain memory size needed for serializing the measurement list")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 13:47:41 -04:00
Roberto Sassu 60386b8540 ima: Don't ignore errors from crypto_shash_update()
Errors returned by crypto_shash_update() are not checked in
ima_calc_boot_aggregate_tfm() and thus can be overwritten at the next
iteration of the loop. This patch adds a check after calling
crypto_shash_update() and returns immediately if the result is not zero.

Cc: stable@vger.kernel.org
Fixes: 3323eec921 ("integrity: IMA as an integrity service provider")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 13:47:37 -04:00
Alex Dewar f60c826d03 ima: Use kmemdup rather than kmalloc+memcpy
Issue identified with Coccinelle.

Signed-off-by: Alex Dewar <alex.dewar90@gmail.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-15 09:57:48 -04:00
Casey Schaufler 322dd63c7f Smack: Use the netlabel cache
Utilize the Netlabel cache mechanism for incoming packet matching.
Refactor the initialization of secattr structures, as it was being
done in two places.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:31 -07:00
Casey Schaufler a2af031885 Smack: Set socket labels only once
Refactor the IP send checks so that the netlabel value
is set only when necessary, not on every send. Some functions
get renamed as the changes made the old name misleading.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:30 -07:00
Casey Schaufler 36be81293d Smack: Consolidate uses of secmark into a function
Add a function smack_from_skb() that returns the Smack label
identified by a network secmark. Replace the explicit uses of
the secmark with this function.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-09-11 15:31:30 -07:00
Stephen Smalley e8ba53d002 selinux: access policycaps with READ_ONCE/WRITE_ONCE
Use READ_ONCE/WRITE_ONCE for all accesses to the
selinux_state.policycaps booleans to prevent compiler
mischief.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-09-11 10:08:51 -04:00
Bruno Meneguele 8c2f516c99 integrity: include keyring name for unknown key request
Depending on the IMA policy rule a key may be searched for in multiple
keyrings (e.g. .ima and .platform) and possibly not found.  This patch
improves feedback by including the keyring "description" (name) in the
error message.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
[zohar@linux.ibm.com: updated commit message]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-09 20:05:28 -04:00
Bruno Meneguele e4d7e2df3a ima: limit secure boot feedback scope for appraise
Only emit an unknown/invalid message when setting the IMA appraise mode
to anything other than "enforce", when secureboot is enabled.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
[zohar@linux.ibm.com: updated commit message]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-09 20:01:55 -04:00
Bruno Meneguele 7fe2bb7e7e integrity: invalid kernel parameters feedback
Don't silently ignore unknown or invalid ima_{policy,appraise,hash} and evm
kernel boot command line options.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-08 22:03:50 -04:00
Bruno Meneguele 4afb28ab03 ima: add check for enforced appraise option
The "enforce" string is allowed as an option for ima_appraise= kernel
paramenter per kernel-paramenters.txt and should be considered on the
parameter setup checking as a matter of completeness. Also it allows futher
checking on the options being passed by the user.

Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-09-08 22:02:57 -04:00
Jakub Kicinski 44a8c4f33c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.

Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit 507ebe6444 ("ibmvnic: Fix use-after-free of VNIC login
response buffer") did).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-04 21:28:59 -07:00
Denis Efremov e44f128768 integrity: Use current_uid() in integrity_audit_message()
Modify integrity_audit_message() to use current_uid().

Signed-off-by: Denis Efremov <efremov@linux.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-08-31 17:46:50 -04:00
Tyler Hicks 48ce1ddce1 ima: Fail rule parsing when asymmetric key measurement isn't supportable
Measuring keys is currently only supported for asymmetric keys. In the
future, this might change.

For now, the "func=KEY_CHECK" and "keyrings=" options are only
appropriate when CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS is enabled. Make
this clear at policy load so that IMA policy authors don't assume that
these policy language constructs are supported.

Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Fixes: 5808611ccc ("IMA: Add KEY_CHECK func to measure keys")
Suggested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-08-31 17:45:14 -04:00
Tyler Hicks 176377d97d ima: Pre-parse the list of keyrings in a KEY_CHECK rule
The ima_keyrings buffer was used as a work buffer for strsep()-based
parsing of the "keyrings=" option of an IMA policy rule. This parsing
was re-performed each time an asymmetric key was added to a kernel
keyring for each loaded policy rule that contained a "keyrings=" option.

An example rule specifying this option is:

 measure func=KEY_CHECK keyrings=a|b|c

The rule says to measure asymmetric keys added to any of the kernel
keyrings named "a", "b", or "c". The size of the buffer size was
equal to the size of the largest "keyrings=" value seen in a previously
loaded rule (5 + 1 for the NUL-terminator in the previous example) and
the buffer was pre-allocated at the time of policy load.

The pre-allocated buffer approach suffered from a couple bugs:

1) There was no locking around the use of the buffer so concurrent key
   add operations, to two different keyrings, would result in the
   strsep() loop of ima_match_keyring() to modify the buffer at the same
   time. This resulted in unexpected results from ima_match_keyring()
   and, therefore, could cause unintended keys to be measured or keys to
   not be measured when IMA policy intended for them to be measured.

2) If the kstrdup() that initialized entry->keyrings in ima_parse_rule()
   failed, the ima_keyrings buffer was freed and set to NULL even when a
   valid KEY_CHECK rule was previously loaded. The next KEY_CHECK event
   would trigger a call to strcpy() with a NULL destination pointer and
   crash the kernel.

Remove the need for a pre-allocated global buffer by parsing the list of
keyrings in a KEY_CHECK rule at the time of policy load. The
ima_rule_entry will contain an array of string pointers which point to
the name of each keyring specified in the rule. No string processing
needs to happen at the time of asymmetric key add so iterating through
the list and doing a string comparison is all that's required at the
time of policy check.

In the process of changing how the "keyrings=" policy option is handled,
a couple additional bugs were fixed:

1) The rule parser accepted rules containing invalid "keyrings=" values
   such as "a|b||c", "a|b|", or simply "|".

2) The /sys/kernel/security/ima/policy file did not display the entire
   "keyrings=" value if the list of keyrings was longer than what could
   fit in the fixed size tbuf buffer in ima_policy_show().

Fixes: 5c7bac9fb2 ("IMA: pre-allocate buffer to hold keyrings string")
Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-08-31 17:45:02 -04:00
Ondrej Mosnacek 66ccd2560a selinux: simplify away security_policydb_len()
Remove the security_policydb_len() calls from sel_open_policy() and
instead update the inode size from the size returned from
security_read_policy().

Since after this change security_policydb_len() is only called from
security_load_policy(), remove it entirely and just open-code it there.

Also, since security_load_policy() is always called with policy_mutex
held, make it dereference the policy pointer directly and drop the
unnecessary RCU locking.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-31 10:00:14 -04:00
Stephen Smalley 9ff9abc4c6 selinux: move policy mutex to selinux_state, use in lockdep checks
Move the mutex used to synchronize policy changes (reloads and setting
of booleans) from selinux_fs_info to selinux_state and use it in
lockdep checks for rcu_dereference_protected() calls in the security
server functions.  This makes the dependency on the mutex explicit
in the code rather than relying on comments.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-27 09:52:47 -04:00
Dan Carpenter 0256b0aa80 selinux: fix error handling bugs in security_load_policy()
There are a few bugs in the error handling for security_load_policy().

1) If the newpolicy->sidtab allocation fails then it leads to a NULL
   dereference.  Also the error code was not set to -ENOMEM on that
   path.
2) If policydb_read() failed then we call policydb_destroy() twice
   which meands we call kvfree(p->sym_val_to_name[i]) twice.
3) If policydb_load_isids() failed then we call sidtab_destroy() twice
   and that results in a double free in the sidtab_destroy_tree()
   function because entry.ptr_inner and entry.ptr_leaf are not set to
   NULL.

One thing that makes this code nice to deal with is that none of the
functions return partially allocated data.  In other words, the
policydb_read() either allocates everything successfully or it frees
all the data it allocates.  It never returns a mix of allocated and
not allocated data.

I re-wrote this to only free the successfully allocated data which
avoids the double frees.  I also re-ordered selinux_policy_free() so
it's in the reverse order of the allocation function.

Fixes: c7c556f1e8 ("selinux: refactor changing booleans")
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
[PM: partially merged by hand due to merge fuzz]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-26 10:19:08 -04:00
KP Singh 8ea636848a bpf: Implement bpf_local_storage for inodes
Similar to bpf_local_storage for sockets, add local storage for inodes.
The life-cycle of storage is managed with the life-cycle of the inode.
i.e. the storage is destroyed along with the owning inode.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the
security blob which are now stackable and can co-exist with other LSMs.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200825182919.1118197-6-kpsingh@chromium.org
2020-08-25 15:00:04 -07:00
Stephen Smalley 1b8b31a2e6 selinux: convert policy read-write lock to RCU
Convert the policy read-write lock to RCU.  This is significantly
simplified by the earlier work to encapsulate the policy data
structures and refactor the policy load and boolean setting logic.
Move the latest_granting sequence number into the selinux_policy
structure so that it can be updated atomically with the policy.
Since removing the policy rwlock and moving latest_granting reduces
the selinux_ss structure to nothing more than a wrapper around the
selinux_policy pointer, get rid of the extra layer of indirection.

At present this change merely passes a hardcoded 1 to
rcu_dereference_check() in the cases where we know we do not need to
take rcu_read_lock(), with the preceding comment explaining why.
Alternatively we could pass fsi->mutex down from selinuxfs and
apply a lockdep check on it instead.

Based in part on earlier attempts to convert the policy rwlock
to RCU by Kaigai Kohei [1] and by Peter Enderborg [2].

[1] https://lore.kernel.org/selinux/6e2f9128-e191-ebb3-0e87-74bfccb0767f@tycho.nsa.gov/
[2] https://lore.kernel.org/selinux/20180530141104.28569-1-peter.enderborg@sony.com/

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-25 08:34:47 -04:00
Randy Dunlap c76a2f9ecd selinux: delete repeated words in comments
Drop a repeated word in comments.
{open, is, then}

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Cc: Eric Paris <eparis@parisplace.org>
Cc: selinux@vger.kernel.org
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-security-module@vger.kernel.org
[PM: fix subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-24 09:03:14 -04:00
Gustavo A. R. Silva df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Peter Enderborg 30969bc8e0 selinux: add basic filtering for audit trace events
This patch adds further attributes to the event. These attributes are
helpful to understand the context of the message and can be used
to filter the events.

There are three common items. Source context, target context and tclass.
There are also items from the outcome of operation performed.

An event is similar to:
           <...>-1309  [002] ....  6346.691689: selinux_audited:
       requested=0x4000000 denied=0x4000000 audited=0x4000000
       result=-13
       scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
       tcontext=system_u:object_r:bin_t:s0 tclass=file

With systems where many denials are occurring, it is useful to apply a
filter. The filtering is a set of logic that is inserted with
the filter file. Example:
 echo "tclass==\"file\" " > events/avc/selinux_audited/filter

This adds that we only get tclass=file.

The trace can also have extra properties. Adding the user stack
can be done with
   echo 1 > options/userstacktrace

Now the output will be
         runcon-1365  [003] ....  6960.955530: selinux_audited:
     requested=0x4000000 denied=0x4000000 audited=0x4000000
     result=-13
     scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
     tcontext=system_u:object_r:bin_t:s0 tclass=file
          runcon-1365  [003] ....  6960.955560: <user stack trace>
 =>  <00007f325b4ce45b>
 =>  <00005607093efa57>

Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
Reviewed-by: Thiébaud Weksteen <tweek@google.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 17:07:29 -04:00
Thiébaud Weksteen dd8166212d selinux: add tracepoint on audited events
The audit data currently captures which process and which target
is responsible for a denial. There is no data on where exactly in the
process that call occurred. Debugging can be made easier by being able to
reconstruct the unified kernel and userland stack traces [1]. Add a
tracepoint on the SELinux denials which can then be used by userland
(i.e. perf).

Although this patch could manually be added by each OS developer to
trouble shoot a denial, adding it to the kernel streamlines the
developers workflow.

It is possible to use perf for monitoring the event:
  # perf record -e avc:selinux_audited -g -a
  ^C
  # perf report -g
  [...]
      6.40%     6.40%  audited=800000 tclass=4
               |
                  __libc_start_main
                  |
                  |--4.60%--__GI___ioctl
                  |          entry_SYSCALL_64
                  |          do_syscall_64
                  |          __x64_sys_ioctl
                  |          ksys_ioctl
                  |          binder_ioctl
                  |          binder_set_nice
                  |          can_nice
                  |          capable
                  |          security_capable
                  |          cred_has_capability.isra.0
                  |          slow_avc_audit
                  |          common_lsm_audit
                  |          avc_audit_post_callback
                  |          avc_audit_post_callback
                  |

It is also possible to use the ftrace interface:
  # echo 1 > /sys/kernel/debug/tracing/events/avc/selinux_audited/enable
  # cat /sys/kernel/debug/tracing/trace
  tracer: nop
  entries-in-buffer/entries-written: 1/1   #P:8
  [...]
  dmesg-3624  [001] 13072.325358: selinux_denied: audited=800000 tclass=4

The tclass value can be mapped to a class by searching
security/selinux/flask.h. The audited value is a bit field of the
permissions described in security/selinux/av_permissions.h for the
corresponding class.

[1] https://source.android.com/devices/tech/debug/native_stack_dump

Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Suggested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Peter Enderborg <peter.enderborg@sony.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 17:05:22 -04:00
Daniel Burgener 0eea609153 selinux: Create new booleans and class dirs out of tree
In order to avoid concurrency issues around selinuxfs resource availability
during policy load, we first create new directories out of tree for
reloaded resources, then swap them in, and finally delete the old versions.

This fix focuses on concurrency in each of the two subtrees swapped, and
not concurrency between the trees.  This means that it is still possible
that subsequent reads to eg the booleans directory and the class directory
during a policy load could see the old state for one and the new for the other.
The problem of ensuring that policy loads are fully atomic from the perspective
of userspace is larger than what is dealt with here.  This commit focuses on
ensuring that the directories contents always match either the new or the old
policy state from the perspective of userspace.

In the previous implementation, on policy load /sys/fs/selinux is updated
by deleting the previous contents of
/sys/fs/selinux/{class,booleans} and then recreating them.  This means
that there is a period of time when the contents of these directories do not
exist which can cause race conditions as userspace relies on them for
information about the policy.  In addition, it means that error recovery in
the event of failure is challenging.

In order to demonstrate the race condition that this series fixes, you
can use the following commands:

while true; do cat /sys/fs/selinux/class/service/perms/status
>/dev/null; done &
while true; do load_policy; done;

In the existing code, this will display errors fairly often as the class
lookup fails.  (In normal operation from systemd, this would result in a
permission check which would be allowed or denied based on policy settings
around unknown object classes.) After applying this patch series you
should expect to no longer see such error messages.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:41:31 -04:00
Daniel Burgener 613ba18798 selinux: Standardize string literal usage for selinuxfs directory names
Switch class and policy_capabilities directory names to be referred to with
global constants, consistent with booleans directory name.  This will allow
for easy consistency of naming in future development.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:39:10 -04:00
Daniel Burgener 66ec384ad3 selinux: Refactor selinuxfs directory populating functions
Make sel_make_bools and sel_make_classes take the specific elements of
selinux_fs_info that they need rather than the entire struct.

This will allow a future patch to pass temporary elements that are not in
the selinux_fs_info struct to these functions so that the original elements
can be preserved until we are ready to perform the switch over.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:37:12 -04:00
Daniel Burgener aeecf4a3fb selinux: Create function for selinuxfs directory cleanup
Separating the cleanup from the creation will simplify two things in
future patches in this series.  First, the creation can be made generic,
to create directories not tied to the selinux_fs_info structure.  Second,
we will ultimately want to reorder creation and deletion so that the
deletions aren't performed until the new directory structures have already
been moved into place.

Signed-off-by: Daniel Burgener <dburgener@linux.microsoft.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 09:35:59 -04:00
Stephen Smalley 9530a3e004 selinux: permit removing security.selinux xattr before policy load
Currently SELinux denies attempts to remove the security.selinux xattr
always, even when permissive or no policy is loaded.  This was originally
motivated by the view that all files should be labeled, even if that label
is unlabeled_t, and we shouldn't permit files that were once labeled to
have their labels removed entirely.  This however prevents removing
SELinux xattrs in the case where one "disables" SELinux by not loading
a policy (e.g. a system where runtime disable is removed and selinux=0
was not specified).  Allow removing the xattr before SELinux is
initialized.  We could conceivably permit it even after initialization
if permissive, or introduce a separate permission check here.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-20 21:55:31 -04:00
Amol Grover bc62d68e2a device_cgroup: Fix RCU list debugging warning
exceptions may be traversed using list_for_each_entry_rcu()
outside of an RCU read side critical section BUT under the
protection of decgroup_mutex. Hence add the corresponding
lockdep expression to fix the following false-positive
warning:

[    2.304417] =============================
[    2.304418] WARNING: suspicious RCU usage
[    2.304420] 5.5.4-stable #17 Tainted: G            E
[    2.304422] -----------------------------
[    2.304424] security/device_cgroup.c:355 RCU-list traversed in non-reader section!!

Signed-off-by: Amol Grover <frextrite@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
2020-08-20 11:25:03 -07:00
kernel test robot 879229311b selinux: fix memdup.cocci warnings
Use kmemdup rather than duplicating its implementation

Generated by: scripts/coccinelle/api/memdup.cocci

Fixes: c7c556f1e8 ("selinux: refactor changing booleans")
CC: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-20 08:39:05 -04:00
Stephen Smalley 37ea433c66 selinux: avoid dereferencing the policy prior to initialization
Certain SELinux security server functions (e.g. security_port_sid,
called during bind) were not explicitly testing to see if SELinux
has been initialized (i.e. initial policy loaded) and handling
the no-policy-loaded case.  In the past this happened to work
because the policydb was statically allocated and could always
be accessed, but with the recent encapsulation of policy state
and conversion to dynamic allocation, we can no longer access
the policy state prior to initialization.  Add a test of
!selinux_initialized(state) to all of the exported functions that
were missing them and handle appropriately.

Fixes: 461698026f ("selinux: encapsulate policy state, refactor policy load")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-19 21:14:41 -04:00
Colin Ian King 69ea651c40 selinux: fix allocation failure check on newpolicy->sidtab
The allocation check of newpolicy->sidtab is null checking if
newpolicy is null and not newpolicy->sidtab. Fix this.

Addresses-Coverity: ("Logically dead code")
Fixes: c7c556f1e8 ("selinux: refactor changing booleans")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-19 09:14:04 -04:00
Stephen Smalley c7c556f1e8 selinux: refactor changing booleans
Refactor the logic for changing SELinux policy booleans in a similar
manner to the refactoring of policy load, thereby reducing the
size of the critical section when the policy write-lock is held
and making it easier to convert the policy rwlock to RCU in the
future.  Instead of directly modifying the policydb in place, modify
a copy and then swap it into place through a single pointer update.
Only fully copy the portions of the policydb that are affected by
boolean changes to avoid the full cost of a deep policydb copy.
Introduce another level of indirection for the sidtab since changing
booleans does not require updating the sidtab, unlike policy load.
While we are here, create a common helper for notifying
other kernel components and userspace of a policy change and call it
from both security_set_bools() and selinux_policy_commit().

Based on an old (2004) patch by Kaigai Kohei [1] to convert the policy
rwlock to RCU that was deferred at the time since it did not
significantly improve performance and introduced complexity. Peter
Enderborg later submitted a patch series to convert to RCU [2] that
would have made changing booleans a much more expensive operation
by requiring a full policydb_write();policydb_read(); sequence to
deep copy the entire policydb and also had concerns regarding
atomic allocations.

This change is now simplified by the earlier work to encapsulate
policy state in the selinux_policy struct and to refactor
policy load.  After this change, the last major obstacle to
converting the policy rwlock to RCU is likely the sidtab live
convert support.

[1] https://lore.kernel.org/selinux/6e2f9128-e191-ebb3-0e87-74bfccb0767f@tycho.nsa.gov/
[2] https://lore.kernel.org/selinux/20180530141104.28569-1-peter.enderborg@sony.com/

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 21:00:33 -04:00
Stephen Smalley 02a52c5c8c selinux: move policy commit after updating selinuxfs
With the refactoring of the policy load logic in the security
server from the previous change, it is now possible to split out
the committing of the new policy from security_load_policy() and
perform it only after successful updating of selinuxfs.  Change
security_load_policy() to return the newly populated policy
data structures to the caller, export selinux_policy_commit()
for external callers, and introduce selinux_policy_cancel() to
provide a way to cancel the policy load in the event of an error
during updating of the selinuxfs directory tree.  Further, rework
the interfaces used by selinuxfs to get information from the policy
when creating the new directory tree to take and act upon the
new policy data structure rather than the current/active policy.
Update selinuxfs to use these updated and new interfaces.  While
we are here, stop re-creating the policy_capabilities directory
on each policy load since it does not depend on the policy, and
stop trying to create the booleans and classes directories during
the initial creation of selinuxfs since no information is available
until first policy load.

After this change, a failure while updating the booleans and class
directories will cause the entire policy load to be canceled, leaving
the original policy intact, and policy load notifications to userspace
will only happen after a successful completion of updating those
directories.  This does not (yet) provide full atomicity with respect
to the updating of the directory trees themselves.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 20:50:22 -04:00
Stephen Smalley 461698026f selinux: encapsulate policy state, refactor policy load
Encapsulate the policy state in its own structure (struct
selinux_policy) that is separately allocated but referenced from the
selinux_ss structure.  The policy state includes the SID table
(particularly the context structures), the policy database, and the
mapping between the kernel classes/permissions and the policy values.
Refactor the security server portion of the policy load logic to
cleanly separate loading of the new structures from committing the new
policy.  Unify the initial policy load and reload code paths as much
as possible, avoiding duplicated code.  Make sure we are taking the
policy read-lock prior to any dereferencing of the policy.  Move the
copying of the policy capability booleans into the state structure
outside of the policy write-lock because they are separate from the
policy and are read outside of any policy lock; possibly they should
be using at least READ_ONCE/WRITE_ONCE or smp_load_acquire/store_release.

These changes simplify the policy loading logic, reduce the size of
the critical section while holding the policy write-lock, and should
facilitate future changes to e.g. refactor the entire policy reload
logic including the selinuxfs code to make the updating of the policy
and the selinuxfs directory tree atomic and/or to convert the policy
read-write lock to RCU.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 20:48:57 -04:00
Stephen Smalley 339949be25 scripts/selinux,selinux: update mdp to enable policy capabilities
Presently mdp does not enable any SELinux policy capabilities
in the dummy policy it generates. Thus, policies derived from
it will by default lack various features commonly used in modern
policies such as open permission, extended socket classes, network
peer controls, etc.  Split the policy capability definitions out into
their own headers so that we can include them into mdp without pulling in
other kernel headers and extend mdp generate policycap statements for the
policy capabilities known to the kernel.  Policy authors may wish to
selectively remove some of these from the generated policy.

Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-17 20:42:00 -04:00
Linus Torvalds 9ad57f6dfc Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
   mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
   memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

 - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
   checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
   exec, kdump, rapidio, panic, kcov, kgdb, ipc).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (164 commits)
  mm/gup: remove task_struct pointer for all gup code
  mm: clean up the last pieces of page fault accountings
  mm/xtensa: use general page fault accounting
  mm/x86: use general page fault accounting
  mm/sparc64: use general page fault accounting
  mm/sparc32: use general page fault accounting
  mm/sh: use general page fault accounting
  mm/s390: use general page fault accounting
  mm/riscv: use general page fault accounting
  mm/powerpc: use general page fault accounting
  mm/parisc: use general page fault accounting
  mm/openrisc: use general page fault accounting
  mm/nios2: use general page fault accounting
  mm/nds32: use general page fault accounting
  mm/mips: use general page fault accounting
  mm/microblaze: use general page fault accounting
  mm/m68k: use general page fault accounting
  mm/ia64: use general page fault accounting
  mm/hexagon: use general page fault accounting
  mm/csky: use general page fault accounting
  ...
2020-08-12 11:24:12 -07:00
Peter Xu 64019a2e46 mm/gup: remove task_struct pointer for all gup code
After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more.  Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:04 -07:00
Linus Torvalds ce13266d97 Minor fixes for v5.9.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAl8xl0QACgkQrZhLv9lQ
 BTzEUA/+Muf7gha2mtxGJ49ZX/AsUOi/feHFDjt+NEA6lQTIaaqU5LxXNdtARu/5
 j+RlJkrw8+3QGJ4h544HIJodbLZHghWpp15AxBAy+1BaeAoswEnrW2/6mD1iUBEH
 pFI0P2OjnVYxEPJGubLhp4qQ0lnqVKwzciNbBDLMydr6SerwoPDz9h0h5SMDoOxF
 m4f1/dsoXrpyp86GSvHDVa9NRs/GMKz/qIeC6DXuMRoqGX15EZVV1iABC7vPd2we
 84IacCRIE/DO1M1rmbNBSpeErmvkxRo00Qjupl0XGf7D4aazxnQl+RpaLdHAtBI1
 ubzU/76DCkaCO1x+3KPHyQUHZvXa3dt0/n4yEkOv01RIzivKZZz6jahsCrbX6lzX
 Dq4n0zg8sA7vh/T7aNX77z0FU1TuFBpiJ8dn/0vUgJPxDwt2V9F2k9jyV1pUeK1V
 yvSkIleIQmwmuT0p2nB/1g7yE5xkvWTM5WOy8/zIQj2aCvuo3ToY06Qc0rNOKTa8
 6Qi/Byi/5S1bBwYQqrAyrd5GhPVdZ8oNZyaUu8Mpm+4P0+2CquvDfN3ZHUxwILNX
 /TMVTVMu1PQQIltWANA0L0BjGIjSGxutisEUQL7o24566GXA3wTQd8HoKRBc6h+p
 DGeVMehPG7GIwoCIvuzdahSRdzzI/iBG3P10TZ5u+3BwTL0OUNY=
 =s2Jw
 -----END PGP SIGNATURE-----

Merge tag 'for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security subsystem updates from James Morris:
 "A couple of minor documentation updates only for this release"

* tag 'for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  LSM: drop duplicated words in header file comments
  Replace HTTP links with HTTPS ones: security
2020-08-11 14:30:36 -07:00
Waiman Long 453431a549 mm, treewide: rename kzfree() to kfree_sensitive()
As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/kzfree/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.

[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:22 -07:00
Alexander A. Klimov c9fecf505a Replace HTTP links with HTTPS ones: security
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If both the HTTP and HTTPS versions
          return 200 OK and serve the same content:
            Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <jmorris@namei.org>
2020-08-06 12:00:05 -07:00
Linus Torvalds 4cec929370 integrity-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEjSMCCC7+cjo3nszSa3kkZrA+cVoFAl8puJgUHHpvaGFyQGxp
 bnV4LmlibS5jb20ACgkQa3kkZrA+cVq47w//VDg2pTD+/fPadleRJkKVSPaKJu4k
 N/gAVPxhYpJVJ+BTZKMFzTjX3kjfQG7udjORzC+saEdii7W1EfJJqHabLEnihfxd
 VDUS0RQndMwOkioAAZOsy5dFE84wUOX8O1kq31Aw2G+QLCYhn1dNMg10j6SBM034
 cJbS59k3w+lyqFy/Fje8e7aO1xmc/83x9MfLgzZTscCZqzf1vIJY8onwfTxRVBpQ
 QS0AZJM+b0+9MlJxpzBYxZARwYb5cXBLh07W/vBFmJRh15n0e20uWM4YFkBixicX
 gi3LtXd/75hFIHgm6QqbwDJrrA45zOJs5YsOudCctWVAe5k5mV0H7ysJ6phcRI9E
 uQvBb7Z+0viQXis6Cjx4gYSYAcAJPcDrfcjR4itQSOj5anUFBvCju+Jr373S0Vn8
 3eXGyimRAc33vEFkI7RJNfExkGh7pkYWzcruk90bHD6dAKuki/tisIs7ZvhTuFOp
 eyWt7hbctqbt/gESop3zXjUDRJsX9GyAA4OvJwFGRfRJ4ziQ5w8LGc+VendSWald
 1zjkJxXAZLjDPQlYv2074PYeIguTbcDkjeRVxUD9mWvdi0tyXK+r2qC+PeX7Rs71
 y1aGIT/NX9qYI2H0xIm3ettztdIE8F1tnAn2ziNkQiXEzCrEqKtAAxxSErTQuB78
 LMgCDPF8y06ZjD8=
 =M/tq
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity updates from Mimi Zohar:
 "The nicest change is the IMA policy rule checking. The other changes
  include allowing the kexec boot cmdline line measure policy rules to
  be defined in terms of the inode associated with the kexec kernel
  image, making the IMA_APPRAISE_BOOTPARAM, which governs the IMA
  appraise mode (log, fix, enforce), a runtime decision based on the
  secure boot mode of the system, and including errno in the audit log"

* tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  integrity: remove redundant initialization of variable ret
  ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime
  ima: AppArmor satisfies the audit rule requirements
  ima: Rename internal filter rule functions
  ima: Support additional conditionals in the KEXEC_CMDLINE hook function
  ima: Use the common function to detect LSM conditionals in a rule
  ima: Move comprehensive rule validation checks out of the token parser
  ima: Use correct type for the args_p member of ima_rule_entry.lsm elements
  ima: Shallow copy the args_p member of ima_rule_entry.lsm elements
  ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
  ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
  ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
  ima: Fail rule parsing when buffer hook functions have an invalid action
  ima: Free the entire rule if it fails to parse
  ima: Free the entire rule when deleting a list of rules
  ima: Have the LSM free its audit rule
  IMA: Add audit log for failure conditions
  integrity: Add errno field in audit message
2020-08-06 11:35:57 -07:00
Linus Torvalds bfdd5aaa54 Smack fixes for 5.9
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl8pmVMXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBE4tA/7BvMGZjrGVu7F8oOJGam6xiQ+
 iVczarvuH9HUfIWWKytgAZXaNf8/lQvpTGYjJFdLGVaLVfB4CSrrrdOGlfvZGpZ/
 5CXeSMAAUOAmYnqjSXFuLqp0NJEEwBg0L7/YWGbp/fq7t8WcY73esDRpVw/R78fV
 rOSMq+qt+IKRERUZWHa3gY7/FDA1/KnEkXjcMViM9PN3Gi5u2Fw6IMVWugPYJYrh
 86jlrDxYV3ceqDpoiHGyulMcBk/f/GppjIWz4ihZ0YShn+ZgAXsnJGjEhGxsoSXb
 CdG0O0y/m6y/+ekyovhvL8+AymfVWbK3qzE8IxWp8Xbrd882naur0c0vTG6y2eF7
 Z2HnI3ivusNrKTkXQnzC86YJ0T96Pg9OCab5poSIsWK5bM542dKZ/7KoSDfzIfBH
 uPhTY2PyRSiUiCWpt7esHtp4mYwCGqUs2j80TZDb8ptAZbYKxzC0pFIDoFeEyOM4
 FDBXjH8Bo+eEG9V8OgGeVvluYaat+C00dba9fiOT7/Tm0F9KqOJEOO9uJPV9CuJN
 +jrL90mopy8T0bygqMjp+GB5V13IGL5+NDHZcnO3Ij5jI/7q0TQ6FlFZ4tg4o4yM
 Y7FyMF+ybUlRod0ayYgurtxn91c0E/PTOPSqtL/RKtRLbLdGrOM3Rfbxn9rREONY
 h1/RmmX5jSBQOKNq9/Y=
 =ZoLN
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Minor fixes to Smack for the v5.9 release.

  All were found by automated checkers and have straightforward
  resolution"

* tag 'Smack-for-5.9' of git://github.com/cschaufler/smack-next:
  Smack: prevent underflow in smk_set_cipso()
  Smack: fix another vsscanf out of bounds
  Smack: fix use-after-free in smk_write_relabel_self()
2020-08-06 11:02:23 -07:00
Linus Torvalds 74858abbb1 cap-checkpoint-restore-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygegQAKCRCRxhvAZXjc
 olWZAQCMPbhI/20LA3OYJ6s+BgBEnm89PymvlHcym6Z4AvTungD+KqZonIYuxWgi
 6Ttlv/fzgFFbXgJgbuass5mwFVoN5wM=
 =oK7d
 -----END PGP SIGNATURE-----

Merge tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull checkpoint-restore updates from Christian Brauner:
 "This enables unprivileged checkpoint/restore of processes.

  Given that this work has been going on for quite some time the first
  sentence in this summary is hopefully more exciting than the actual
  final code changes required. Unprivileged checkpoint/restore has seen
  a frequent increase in interest over the last two years and has thus
  been one of the main topics for the combined containers &
  checkpoint/restore microconference since at least 2018 (cf. [1]).

  Here are just the three most frequent use-cases that were brought forward:

   - The JVM developers are integrating checkpoint/restore into a Java
     VM to significantly decrease the startup time.

   - In high-performance computing environment a resource manager will
     typically be distributing jobs where users are always running as
     non-root. Long-running and "large" processes with significant
     startup times are supposed to be checkpointed and restored with
     CRIU.

   - Container migration as a non-root user.

  In all of these scenarios it is either desirable or required to run
  without CAP_SYS_ADMIN. The userspace implementation of
  checkpoint/restore CRIU already has the pull request for supporting
  unprivileged checkpoint/restore up (cf. [2]).

  To enable unprivileged checkpoint/restore a new dedicated capability
  CAP_CHECKPOINT_RESTORE is introduced. This solution has last been
  discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1]
  "Update on Task Migration at Google Using CRIU") with Adrian and
  Nicolas providing the implementation now over the last months. In
  essence, this allows the CRIU binary to be installed with the
  CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling
  unprivileged users to restore processes.

  To make this possible the following permissions are altered:

   - Selecting a specific PID via clone3() set_tid relaxed from userns
     CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed
     from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE.

   - Accessing /proc/pid/map_files relaxed from init userns
     CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE.

   - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns
     CAP_CHECKPOINT_RESTORE.

  Of these four changes the /proc/self/exe change deserves a few words
  because the reasoning behind even restricting /proc/self/exe changes
  in the first place is just full of historical quirks and tracking this
  down was a questionable version of fun that I'd like to spare others.

  In short, it is trivial to change /proc/self/exe as an unprivileged
  user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace()
  or by simply intercepting the elf loader in userspace during exec.
  Nicolas was nice enough to even provide a POC for the latter (cf. [3])
  to illustrate this fact.

  The original patchset which introduced PR_SET_MM_MAP had no
  permissions around changing the exe link. They too argued that it is
  trivial to spoof the exe link already which is true. The argument
  brought up against this was that the Tomoyo LSM uses the exe link in
  tomoyo_manager() to detect whether the calling process is a policy
  manager. This caused changing the exe links to be guarded by userns
  CAP_SYS_ADMIN.

  All in all this rather seems like a "better guard it with something
  rather than nothing" argument which imho doesn't qualify as a great
  security policy. Again, because spoofing the exe link is possible for
  the calling process so even if this were security relevant it was
  broken back then and would be broken today. So technically, dropping
  all permissions around changing the exe link would probably be
  possible and would send a clearer message to any userspace that relies
  on /proc/self/exe for security reasons that they should stop doing
  this but for now we're only relaxing the exe link permissions from
  userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE.

  There's a final uapi change in here. Changing the exe link used to
  accidently return EINVAL when the caller lacked the necessary
  permissions instead of the more correct EPERM. This pr contains a
  commit fixing this. I assume that userspace won't notice or care and
  if they do I will revert this commit. But since we are changing the
  permissions anyway it seems like a good opportunity to try this fix.

  With these changes merged unprivileged checkpoint/restore will be
  possible and has already been tested by various users"

[1] LPC 2018
     1. "Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095
     2. "Securely Migrating Untrusted Workloads with CRIU"
        https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400
     LPC 2019
     1. "CRIU and the PID dance"
         https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s
     2. "Update on Task Migration at Google Using CRIU"
        https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s

[2] https://github.com/checkpoint-restore/criu/pull/1155

[3] https://github.com/nviennot/run_as_exe

* tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests: add clone3() CAP_CHECKPOINT_RESTORE test
  prctl: exe link permission error changed from -EINVAL to -EPERM
  prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe
  proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE
  pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid
  pid: use checkpoint_restore_ns_capable() for set_tid
  capabilities: Introduce CAP_CHECKPOINT_RESTORE
2020-08-04 15:02:07 -07:00
Linus Torvalds 3950e97543 Merge branch 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull execve updates from Eric Biederman:
 "During the development of v5.7 I ran into bugs and quality of
  implementation issues related to exec that could not be easily fixed
  because of the way exec is implemented. So I have been diggin into
  exec and cleaning up what I can.

  This cycle I have been looking at different ideas and different
  implementations to see what is possible to improve exec, and cleaning
  the way exec interfaces with in kernel users. Only cleaning up the
  interfaces of exec with rest of the kernel has managed to stabalize
  and make it through review in time for v5.9-rc1 resulting in 2 sets of
  changes this cycle.

   - Implement kernel_execve

   - Make the user mode driver code a better citizen

  With kernel_execve the code size got a little larger as the copying of
  parameters from userspace and copying of parameters from userspace is
  now separate. The good news is kernel threads no longer need to play
  games with set_fs to use exec. Which when combined with the rest of
  Christophs set_fs changes should security bugs with set_fs much more
  difficult"

* 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits)
  exec: Implement kernel_execve
  exec: Factor bprm_stack_limits out of prepare_arg_pages
  exec: Factor bprm_execve out of do_execve_common
  exec: Move bprm_mm_init into alloc_bprm
  exec: Move initialization of bprm->filename into alloc_bprm
  exec: Factor out alloc_bprm
  exec: Remove unnecessary spaces from binfmts.h
  umd: Stop using split_argv
  umd: Remove exit_umh
  bpfilter: Take advantage of the facilities of struct pid
  exit: Factor thread_group_exited out of pidfd_poll
  umd: Track user space drivers with struct pid
  bpfilter: Move bpfilter_umh back into init data
  exec: Remove do_execve_file
  umh: Stop calling do_execve_file
  umd: Transform fork_usermode_blob into fork_usermode_driver
  umd: Rename umd_info.cmdline umd_info.driver_name
  umd: For clarity rename umh_info umd_info
  umh: Separate the user mode driver and the user mode helper support
  umh: Remove call_usermodehelper_setup_file.
  ...
2020-08-04 14:27:25 -07:00
Linus Torvalds fd76a74d94 audit/stable-5.9 PR 20200803
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl8okpIUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNqOQ/8D+m9Ykcby3csEKsp8YtsaukEu62U
 lRVaxzRNO9wwB24aFwDFuJnIkmsSi/s/O4nBsy2mw+Apn+uDCvHQ9tBU07vlNn2f
 lu27YaTya7YGlqoe315xijd8tyoX99k8cpQeixvAVr9/jdR09yka7SJ8O7X9mjV7
 +SUVDiKCplPKpiwCCRS9cqD7F64T6y35XKzbrzYqdP0UOF2XelZo/Evt5rDRvWUf
 5qDN2tP+iM/Fvu5lCfczFwAeivfAdxjQ11n783hx8Ms2qyiaKQCzbEwjqAslmkbs
 1k/+ED0NjzXX1ne0JZaz/bk0wsMnmOoa8o+NDcyd7Za/cj5prUZi7kBy+xry4YV8
 qKJ40Lk0flCWgUpm6bkYVOByIYHk0gmfBNvjilqf25NR/eOC/9e9ir8PywvYUW/7
 kvVK37+N/a3LnFj80sZpIeqqnNU8z9PV1i7//5/kDuKvz94Bq83TJDO6pPKvqDtC
 njQfCFoHwdEeF8OalK793lIiYaoODqvbkWKChKMqziODJ4ZP8AW06gXpEbEWn7G3
 TTnJx7hqzR9t90vBQJeO3Fromfn+9TDlZVdX+EGO8gIqUiLGr0r7LPPep4VkDbNw
 LxMYKeC2cgRp8Z+XXPDxfXSDL2psTwg6CXcDrXcYnUyBo/yerpBvbJkeaR0h+UR0
 j6cvMX+T39X2JXM=
 =Xs3M
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:
 "Aside from some smaller bug fixes, here are the highlights:

   - add a new backlog wait metric to the audit status message, this is
     intended to help admins determine how long processes have been
     waiting for the audit backlog queue to clear

   - generate audit records for nftables configuration changes

   - generate CWD audit records for for the relevant LSM audit records"

* tag 'audit-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: report audit wait metric in audit status reply
  audit: purge audit_log_string from the intra-kernel audit API
  audit: issue CWD record to accompany LSM_AUDIT_DATA_* records
  audit: use the proper gfp flags in the audit_log_nfcfg() calls
  audit: remove unused !CONFIG_AUDITSYSCALL __audit_inode* stubs
  audit: add gfp parameter to audit_log_nfcfg
  audit: log nftables configuration change events
  audit: Use struct_size() helper in alloc_chunk
2020-08-04 14:20:26 -07:00
Linus Torvalds 49e917deeb selinux/stable-5.9 PR 20200803
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl8okmsUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMaRA//XO7JKJEyLcpqRzhQP/QY50JXdQtE
 c9vKeb7y4wlfbTozRgjBN3Xj+tFbqANzX/rVsR1aKV+hExyEuUfNZ0Fl8MbPEccQ
 1RUCW2808/YRTYsl0g0DDZsc+vxVosfouk91pZfld9ZRnZbrNTGXFP7vuVyFKdBy
 wBX1FCL9q31wLc8Jk7f6otSbBvSCG0YXjkkxEM7LQx3oQ59s8dfOed41kDGpLoNk
 TS5BN/W3uuYEDIsIwTRZjU4h42dpc/wbxVMJhBg85rU/2bF4u5sDs2qgwqaa1tXs
 aRDH5J+eBMZRCkF4shxlDrrOWeXvEEtal9yYzQUx664tWDjZazoTLctCAe3PWI1i
 q61cG8PXw/5/oB6RyvPkRMLc5pU8P6/Xdfg6R6kOsGSq8bj+g30J6jqGXnW9FIVr
 5rIaooiw19vqH+ASVuq9oLmhuWJQyn6ImFqOkREJFWVaqufglWw9RWDCGFsLq9Tr
 w6HbA9UYCoWpdQBfRXpa086sSQm1wuCP39fIcY64uHpR5gPJuzyd8Tswz3tbEAtg
 v7vgIRtBpghhdcBLzIJgSIXJKR7W/Y49eFwNf3x0OTSeAIia6Z9paaQjXYl71I9V
 6oUiQgVE3lX2SkgMbOK2V5UsjbVkpjjv7MWxdm0mPQCU0Fmb8W2FN/wVR7FBlCZc
 yhde+bs4zTmPNFw=
 =ocPe
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "Beyond the usual smattering of bug fixes, we've got three small
  improvements worth highlighting:

   - improved SELinux policy symbol table performance due to a reworking
     of the insert and search functions

   - allow reading of SELinux labels before the policy is loaded,
     allowing for some more "exotic" initramfs approaches

   - improved checking an error reporting about process
     class/permissions during SELinux policy load"

* tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: complete the inlining of hashtab functions
  selinux: prepare for inlining of hashtab functions
  selinux: specialize symtab insert and search functions
  selinux: Fix spelling mistakes in the comments
  selinux: fixed a checkpatch warning with the sizeof macro
  selinux: log error messages on required process class / permissions
  scripts/selinux/mdp: fix initial SID handling
  selinux: allow reading labels before policy is loaded
2020-08-04 14:18:01 -07:00
Linus Torvalds 5b5d3be5d6 Automatic variable initialization updates for v5.9-rc1
- Introduce CONFIG_INIT_STACK_ALL_ZERO (Alexander Potapenko)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oXX4WHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJt/FD/wJISl6Va3UvJrwGWcjLqb3iQh/
 38Nq7LV9ysUStpi5ibxhiB95uawFtAUsBLKyBKLtOERUz5RXiHrR9MI4UWNPBgNc
 7/H5ZAkkD21LpzC76FH+a4SWQp1kQTiyu/iONn03LE8p4vSwSVZzoGqA1r4fpzGY
 Np++2Ym/bzWV7R0Xdq/LI5oH9109dm75PhcCqCZPAtlIq+USXpyNAozimgREplVl
 /clYmj7oruoRYiF5uheOlbpCEXYlybwVHfDKE2Uh5IcXcpm3OYZU9HEK5ot5oudJ
 Z7bIcMeS2mMtSH/hhyjFbi0cZBVtJFc9exHRmuiDiYzNkWzaT2/5xAMUzw65q7Yk
 BTpr5AU+nkVQwuAmkN3AyBLrqQYyhWL0+xnWRmbbjt2yoqCx5x3AyxaBgHDV4vgF
 sTNhczFQdGqhlmvbxOw93PARV+lU9pozcc6b8TpXVdsE+bFFN5mBuRljIOTCRvke
 yxFsLF9olfNB3CXTHXAWLC/RuqdH/Vk7zC0vS34tlmvWgVC07P9QXyWciqcldAgL
 BsFXsRt6bRvOukyunhRfQkLVRxsOCLhQuYC33cRX9xY9vwCkM5v6TQH5WRcfxK7Q
 swujqqvozYZ/njblBTeagg8sGg0OiqxpCvJZD6qA6s1mO3lG58CDqqwxd4DemIDF
 /BxVarzUtmvBuiMBSQ==
 =c2Rf
 -----END PGP SIGNATURE-----

Merge tag 'var-init-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull automatic variable initialization updates from Kees Cook:
 "This adds the "zero" init option from Clang, which is being used
  widely in production builds of Android and Chrome OS (though it also
  keeps the "pattern" init, which is better for debug builds).

   - Introduce CONFIG_INIT_STACK_ALL_ZERO (Alexander Potapenko)"

* tag 'var-init-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  security: allow using Clang's zero initialization for stack variables
2020-08-04 13:38:35 -07:00
Linus Torvalds 382625d0d4 for-5.9/block-20200802
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8m7YwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpt+dEAC7a0HYuX2OrkyawBnsgd1QQR/soC7surec
 yDDa7SMM8cOq3935bfzcYHV9FWJszEGIknchiGb9R3/T+vmSohbvDsM5zgwya9u/
 FHUIuTq324I6JWXKl30k4rwjiX9wQeMt+WZ5gC8KJYCWA296i2IpJwd0A45aaKuS
 x4bTjxqknE+fD4gQiMUSt+bmuOUAp81fEku3EPapCRYDPAj8f5uoY7R2arT/POwB
 b+s+AtXqzBymIqx1z0sZ/XcdZKmDuhdurGCWu7BfJFIzw5kQ2Qe3W8rUmrQ3pGut
 8a21YfilhUFiBv+B4wptfrzJuzU6Ps0BXHCnBsQjzvXwq5uFcZH495mM/4E4OJvh
 SbjL2K4iFj+O1ngFkukG/F8tdEM1zKBYy2ZEkGoWKUpyQanbAaGI6QKKJA+DCdBi
 yPEb7yRAa5KfLqMiocm1qCEO1I56HRiNHaJVMqCPOZxLmpXj19Fs71yIRplP1Trv
 GGXdWZsccjuY6OljoXWdEfnxAr5zBsO3Yf2yFT95AD+egtGsU1oOzlqAaU1mtflw
 ABo452pvh6FFpxGXqz6oK4VqY4Et7WgXOiljA4yIGoPpG/08L1Yle4eVc2EE01Jb
 +BL49xNJVeUhGFrvUjPGl9kVMeLmubPFbmgrtipW+VRg9W8+Yirw7DPP6K+gbPAR
 RzAUdZFbWw==
 =abJG
 -----END PGP SIGNATURE-----

Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
 "Good amount of cleanups and tech debt removals in here, and as a
  result, the diffstat shows a nice net reduction in code.

   - Softirq completion cleanups (Christoph)

   - Stop using ->queuedata (Christoph)

   - Cleanup bd claiming (Christoph)

   - Use check_events, moving away from the legacy media change
     (Christoph)

   - Use inode i_blkbits consistently (Christoph)

   - Remove old unused writeback congestion bits (Christoph)

   - Cleanup/unify submission path (Christoph)

   - Use bio_uninit consistently, instead of bio_disassociate_blkg
     (Christoph)

   - sbitmap cleared bits handling (John)

   - Request merging blktrace event addition (Jan)

   - sysfs add/remove race fixes (Luis)

   - blk-mq tag fixes/optimizations (Ming)

   - Duplicate words in comments (Randy)

   - Flush deferral cleanup (Yufen)

   - IO context locking/retry fixes (John)

   - struct_size() usage (Gustavo)

   - blk-iocost fixes (Chengming)

   - blk-cgroup IO stats fixes (Boris)

   - Various little fixes"

* tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits)
  block: blk-timeout: delete duplicated word
  block: blk-mq-sched: delete duplicated word
  block: blk-mq: delete duplicated word
  block: genhd: delete duplicated words
  block: elevator: delete duplicated word and fix typos
  block: bio: delete duplicated words
  block: bfq-iosched: fix duplicated word
  iocost_monitor: start from the oldest usage index
  iocost: Fix check condition of iocg abs_vdebt
  block: Remove callback typedefs for blk_mq_ops
  block: Use non _rcu version of list functions for tag_set_list
  blk-cgroup: show global disk stats in root cgroup io.stat
  blk-cgroup: make iostat functions visible to stat printing
  block: improve discard bio alignment in __blkdev_issue_discard()
  block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
  block: defer flush request no matter whether we have elevator
  block: make blk_timeout_init() static
  block: remove retry loop in ioc_release_fn()
  block: remove unnecessary ioc nested locking
  block: integrate bd_start_claiming into __blkdev_get
  ...
2020-08-03 11:57:03 -07:00
Colin Ian King 3db0d0c276 integrity: remove redundant initialization of variable ret
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Fixes: eb5798f2e2 ("integrity: convert digsig to akcipher api")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Addresses-Coverity: ("Unused value")
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-27 16:52:09 -04:00
Dan Carpenter 42a2df3e82 Smack: prevent underflow in smk_set_cipso()
We have an upper bound on "maplevel" but forgot to check for negative
values.

Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-27 13:35:12 -07:00
Dan Carpenter a6bd4f6d9b Smack: fix another vsscanf out of bounds
This is similar to commit 84e99e58e8 ("Smack: slab-out-of-bounds in
vsscanf") where we added a bounds check on "rule".

Reported-by: syzbot+a22c6092d003d6fe1122@syzkaller.appspotmail.com
Fixes: f7112e6c9a ("Smack: allow for significantly longer Smack labels v4")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-27 13:35:03 -07:00
Richard Guy Briggs f1d9b23cab audit: purge audit_log_string from the intra-kernel audit API
audit_log_string() was inteded to be an internal audit function and
since there are only two internal uses, remove them.  Purge all external
uses of it by restructuring code to use an existing audit_log_format()
or using audit_log_format().

Please see the upstream issue
https://github.com/linux-audit/audit-kernel/issues/84

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-21 11:12:31 -04:00
Eric W. Biederman be619f7f06 exec: Implement kernel_execve
To allow the kernel not to play games with set_fs to call exec
implement kernel_execve.  The function kernel_execve takes pointers
into kernel memory and copies the values pointed to onto the new
userspace stack.

The calls with arguments from kernel space of do_execve are replaced
with calls to kernel_execve.

The calls do_execve and do_execveat are made static as there are now
no callers outside of exec.

The comments that mention do_execve are updated to refer to
kernel_execve or execve depending on the circumstances.  In addition
to correcting the comments, this makes it easy to grep for do_execve
and verify it is not used.

Inspired-by: https://lkml.kernel.org/r/20200627072704.2447163-1-hch@lst.de
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/87wo365ikj.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2020-07-21 08:24:52 -05:00
Bruno Meneguele 311aa6aafe ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime
The IMA_APPRAISE_BOOTPARAM config allows enabling different "ima_appraise="
modes - log, fix, enforce - at run time, but not when IMA architecture
specific policies are enabled.  This prevents properly labeling the
filesystem on systems where secure boot is supported, but not enabled on the
platform.  Only when secure boot is actually enabled should these IMA
appraise modes be disabled.

This patch removes the compile time dependency and makes it a runtime
decision, based on the secure boot state of that platform.

Test results as follows:

-> x86-64 with secure boot enabled

[    0.015637] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[    0.015668] ima: Secure boot enabled: ignoring ima_appraise=fix boot parameter option

-> powerpc with secure boot disabled

[    0.000000] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[    0.000000] Secure boot mode disabled

-> Running the system without secure boot and with both options set:

CONFIG_IMA_APPRAISE_BOOTPARAM=y
CONFIG_IMA_ARCH_POLICY=y

Audit prompts "missing-hash" but still allow execution and, consequently,
filesystem labeling:

type=INTEGRITY_DATA msg=audit(07/09/2020 12:30:27.778:1691) : pid=4976
uid=root auid=root ses=2
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 op=appraise_data
cause=missing-hash comm=bash name=/usr/bin/evmctl dev="dm-0" ino=493150
res=no

Cc: stable@vger.kernel.org
Fixes: d958083a8f ("x86/ima: define arch_get_ima_policy() for x86")
Signed-off-by: Bruno Meneguele <bmeneg@redhat.com>
Cc: stable@vger.kernel.org # 5.0
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 18:18:49 -04:00
Tyler Hicks 1768215a65 ima: AppArmor satisfies the audit rule requirements
AppArmor meets all the requirements for IMA in terms of audit rules
since commit e79c26d040 ("apparmor: Add support for audit rule
filtering"). Update IMA's Kconfig section for CONFIG_IMA_LSM_RULES to
reflect this.

Fixes: e79c26d040 ("apparmor: Add support for audit rule filtering")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 18:18:37 -04:00
Tyler Hicks b8867eedcf ima: Rename internal filter rule functions
Rename IMA's internal filter rule functions from security_filter_rule_*()
to ima_filter_rule_*(). This avoids polluting the security_* namespace,
which is typically reserved for general security subsystem
infrastructure.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Suggested-by: Casey Schaufler <casey@schaufler-ca.com>
[zohar@linux.ibm.com: reword using the term "filter", not "audit"]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 18:18:23 -04:00
Tyler Hicks 4834177e63 ima: Support additional conditionals in the KEXEC_CMDLINE hook function
Take the properties of the kexec kernel's inode and the current task
ownership into consideration when matching a KEXEC_CMDLINE operation to
the rules in the IMA policy. This allows for some uniformity when
writing IMA policy rules for KEXEC_KERNEL_CHECK, KEXEC_INITRAMFS_CHECK,
and KEXEC_CMDLINE operations.

Prior to this patch, it was not possible to write a set of rules like
this:

 dont_measure func=KEXEC_KERNEL_CHECK obj_type=foo_t
 dont_measure func=KEXEC_INITRAMFS_CHECK obj_type=foo_t
 dont_measure func=KEXEC_CMDLINE obj_type=foo_t
 measure func=KEXEC_KERNEL_CHECK
 measure func=KEXEC_INITRAMFS_CHECK
 measure func=KEXEC_CMDLINE

The inode information associated with the kernel being loaded by a
kexec_kernel_load(2) syscall can now be included in the decision to
measure or not

Additonally, the uid, euid, and subj_* conditionals can also now be
used in KEXEC_CMDLINE rules. There was no technical reason as to why
those conditionals weren't being considered previously other than
ima_match_rules() didn't have a valid inode to use so it immediately
bailed out for KEXEC_CMDLINE operations rather than going through the
full list of conditional comparisons.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: kexec@lists.infradead.org
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:16 -04:00
Tyler Hicks 592b24cbdc ima: Use the common function to detect LSM conditionals in a rule
Make broader use of ima_rule_contains_lsm_cond() to check if a given
rule contains an LSM conditional. This is a code cleanup and has no
user-facing change.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:16 -04:00
Tyler Hicks 30031b0ec8 ima: Move comprehensive rule validation checks out of the token parser
Use ima_validate_rule(), at the end of the token parsing stage, to
verify combinations of actions, hooks, and flags. This is useful to
increase readability by consolidating such checks into a single function
and also because rule conditionals can be specified in arbitrary order
making it difficult to do comprehensive rule validation until the entire
rule has been parsed.

This allows for the check that ties together the "keyrings" conditional
with the KEY_CHECK function hook to be moved into the final rule
validation.

The modsig check no longer needs to compiled conditionally because the
token parser will ensure that modsig support is enabled before accepting
"imasig|modsig" appraise type values. The final rule validation will
ensure that appraise_type and appraise_flag options are only present in
appraise rules.

Finally, this allows for the check that ties together the "pcr"
conditional with the measure action to be moved into the final rule
validation.

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:15 -04:00
Tyler Hicks aa0c0227d3 ima: Use correct type for the args_p member of ima_rule_entry.lsm elements
Make args_p be of the char pointer type rather than have it be a void
pointer that gets casted to char pointer when it is used. It is a simple
NUL-terminated string as returned by match_strdup().

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:14 -04:00
Tyler Hicks 39e5993d0d ima: Shallow copy the args_p member of ima_rule_entry.lsm elements
The args_p member is a simple string that is allocated by
ima_rule_init(). Shallow copy it like other non-LSM references in
ima_rule_entry structs.

There are no longer any necessary error path cleanups to do in
ima_lsm_copy_rule().

Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:28:13 -04:00
Tyler Hicks 5f3e92657b ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
Verifying that a file hash is not blacklisted is currently only
supported for files with appended signatures (modsig).  In the future,
this might change.

For now, the "appraise_flag" option is only appropriate for appraise
actions and its "blacklist" value is only appropriate when
CONFIG_IMA_APPRAISE_MODSIG is enabled and "appraise_flag=blacklist" is
only appropriate when "appraise_type=imasig|modsig" is also present.
Make this clear at policy load so that IMA policy authors don't assume
that other uses of "appraise_flag=blacklist" are supported.

Fixes: 273df864cf ("ima: Check against blacklisted hashes for files with modsig")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reivewed-by: Nayna Jain <nayna@linux.ibm.com>
Tested-by: Nayna Jain <nayna@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-20 13:06:26 -04:00
Adrian Reber 124ea650d3 capabilities: Introduce CAP_CHECKPOINT_RESTORE
This patch introduces CAP_CHECKPOINT_RESTORE, a new capability facilitating
checkpoint/restore for non-root users.

Over the last years, The CRIU (Checkpoint/Restore In Userspace) team has
been asked numerous times if it is possible to checkpoint/restore a
process as non-root. The answer usually was: 'almost'.

The main blocker to restore a process as non-root was to control the PID
of the restored process. This feature available via the clone3 system
call, or via /proc/sys/kernel/ns_last_pid is unfortunately guarded by
CAP_SYS_ADMIN.

In the past two years, requests for non-root checkpoint/restore have
increased due to the following use cases:
* Checkpoint/Restore in an HPC environment in combination with a
  resource manager distributing jobs where users are always running as
  non-root. There is a desire to provide a way to checkpoint and
  restore long running jobs.
* Container migration as non-root
* We have been in contact with JVM developers who are integrating
  CRIU into a Java VM to decrease the startup time. These
  checkpoint/restore applications are not meant to be running with
  CAP_SYS_ADMIN.

We have seen the following workarounds:
* Use a setuid wrapper around CRIU:
  See https://github.com/FredHutch/slurm-examples/blob/master/checkpointer/lib/checkpointer/checkpointer-suid.c
* Use a setuid helper that writes to ns_last_pid.
  Unfortunately, this helper delegation technique is impossible to use
  with clone3, and is thus prone to races.
  See https://github.com/twosigma/set_ns_last_pid
* Cycle through PIDs with fork() until the desired PID is reached:
  This has been demonstrated to work with cycling rates of 100,000 PIDs/s
  See https://github.com/twosigma/set_ns_last_pid
* Patch out the CAP_SYS_ADMIN check from the kernel
* Run the desired application in a new user and PID namespace to provide
  a local CAP_SYS_ADMIN for controlling PIDs. This technique has limited
  use in typical container environments (e.g., Kubernetes) as /proc is
  typically protected with read-only layers (e.g., /proc/sys) for
  hardening purposes. Read-only layers prevent additional /proc mounts
  (due to proc's SB_I_USERNS_VISIBLE property), making the use of new
  PID namespaces limited as certain applications need access to /proc
  matching their PID namespace.

The introduced capability allows to:
* Control PIDs when the current user is CAP_CHECKPOINT_RESTORE capable
  for the corresponding PID namespace via ns_last_pid/clone3.
* Open files in /proc/pid/map_files when the current user is
  CAP_CHECKPOINT_RESTORE capable in the root namespace, useful for
  recovering files that are unreachable via the file system such as
  deleted files, or memfd files.

See corresponding selftest for an example with clone3().

Signed-off-by: Adrian Reber <areber@redhat.com>
Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200719100418.2112740-2-areber@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-19 20:14:42 +02:00
Tyler Hicks eb624fe214 ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
The KEY_CHECK function only supports the uid, pcr, and keyrings
conditionals. Make this clear at policy load so that IMA policy authors
don't assume that other conditionals are supported.

Fixes: 5808611ccc ("IMA: Add KEY_CHECK func to measure keys")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks db2045f589 ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
The KEXEC_CMDLINE hook function only supports the pcr conditional. Make
this clear at policy load so that IMA policy authors don't assume that
other conditionals are supported.

Since KEXEC_CMDLINE's inception, ima_match_rules() has always returned
true on any loaded KEXEC_CMDLINE rule without any consideration for
other conditionals present in the rule. Make it clear that pcr is the
only supported KEXEC_CMDLINE conditional by returning an error during
policy load.

An example of why this is a problem can be explained with the following
rule:

 dont_measure func=KEXEC_CMDLINE obj_type=foo_t

An IMA policy author would have assumed that rule is valid because the
parser accepted it but the result was that measurements for all
KEXEC_CMDLINE operations would be disabled.

Fixes: b0935123a1 ("IMA: Define a new hook to measure the kexec boot command line arguments")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 712183437e ima: Fail rule parsing when buffer hook functions have an invalid action
Buffer based hook functions, such as KEXEC_CMDLINE and KEY_CHECK, can
only measure. The process_buffer_measurement() function quietly ignores
all actions except measure so make this behavior clear at the time of
policy load.

The parsing of the keyrings conditional had a check to ensure that it
was only specified with measure actions but the check should be on the
hook function and not the keyrings conditional since
"appraise func=KEY_CHECK" is not a valid rule.

Fixes: b0935123a1 ("IMA: Define a new hook to measure the kexec boot command line arguments")
Fixes: 5808611ccc ("IMA: Add KEY_CHECK func to measure keys")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 2bdd737c56 ima: Free the entire rule if it fails to parse
Use ima_free_rule() to fix memory leaks of allocated ima_rule_entry
members, such as .fsname and .keyrings, when an error is encountered
during rule parsing.

Set the args_p pointer to NULL after freeing it in the error path of
ima_lsm_rule_init() so that it isn't freed twice.

This fixes a memory leak seen when loading an rule that contains an
additional piece of allocated memory, such as an fsname, followed by an
invalid conditional:

 # echo "measure fsname=tmpfs bad=cond" > /sys/kernel/security/ima/policy
 -bash: echo: write error: Invalid argument
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
 unreferenced object 0xffff98e7e4ece6c0 (size 8):
   comm "bash", pid 672, jiffies 4294791843 (age 21.855s)
   hex dump (first 8 bytes):
     74 6d 70 66 73 00 6b a5                          tmpfs.k.
   backtrace:
     [<00000000abab7413>] kstrdup+0x2e/0x60
     [<00000000f11ede32>] ima_parse_add_rule+0x7d4/0x1020
     [<00000000f883dd7a>] ima_write_policy+0xab/0x1d0
     [<00000000b17cf753>] vfs_write+0xde/0x1d0
     [<00000000b8ddfdea>] ksys_write+0x68/0xe0
     [<00000000b8e21e87>] do_syscall_64+0x56/0xa0
     [<0000000089ea7b98>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: f1b08bbcbd ("ima: define a new policy condition based on the filesystem name")
Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 465aee77aa ima: Free the entire rule when deleting a list of rules
Create a function, ima_free_rule(), to free all memory associated with
an ima_rule_entry. Use the new function to fix memory leaks of allocated
ima_rule_entry members, such as .fsname and .keyrings, when deleting a
list of rules.

Make the existing ima_lsm_free_rule() function specific to the LSM
audit rule array of an ima_rule_entry and require that callers make an
additional call to kfree to free the ima_rule_entry itself.

This fixes a memory leak seen when loading by a valid rule that contains
an additional piece of allocated memory, such as an fsname, followed by
an invalid rule that triggers a policy load failure:

 # echo -e "dont_measure fsname=securityfs\nbad syntax" > \
    /sys/kernel/security/ima/policy
 -bash: echo: write error: Invalid argument
 # echo scan > /sys/kernel/debug/kmemleak
 # cat /sys/kernel/debug/kmemleak
 unreferenced object 0xffff9bab67ca12c0 (size 16):
   comm "bash", pid 684, jiffies 4295212803 (age 252.344s)
   hex dump (first 16 bytes):
     73 65 63 75 72 69 74 79 66 73 00 6b 6b 6b 6b a5  securityfs.kkkk.
   backtrace:
     [<00000000adc80b1b>] kstrdup+0x2e/0x60
     [<00000000d504cb0d>] ima_parse_add_rule+0x7d4/0x1020
     [<00000000444825ac>] ima_write_policy+0xab/0x1d0
     [<000000002b7f0d6c>] vfs_write+0xde/0x1d0
     [<0000000096feedcf>] ksys_write+0x68/0xe0
     [<0000000052b544a2>] do_syscall_64+0x56/0xa0
     [<000000007ead1ba7>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: f1b08bbcbd ("ima: define a new policy condition based on the filesystem name")
Fixes: 2b60c0eced ("IMA: Read keyrings= option from the IMA policy")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Tyler Hicks 9ff8a616df ima: Have the LSM free its audit rule
Ask the LSM to free its audit rule rather than directly calling kfree().
Both AppArmor and SELinux do additional work in their audit_rule_free()
hooks. Fix memory leaks by allowing the LSMs to perform necessary work.

Fixes: b169424551 ("ima: use the lsm policy update notifier")
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Cc: Janne Karhunen <janne.karhunen@gmail.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:53:55 -04:00
Lakshmi Ramasubramanian 34e980bb83 IMA: Add audit log for failure conditions
process_buffer_measurement() and ima_alloc_key_entry() functions need to
log an audit message for auditing integrity measurement failures.

Add audit message in these two functions. Remove "pr_devel" log message
in process_buffer_measurement().

Sample audit messages:

[    6.303048] audit: type=1804 audit(1592506281.627:2): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel op=measuring_key cause=ENOMEM comm="swapper/0" name=".builtin_trusted_keys" res=0 errno=-12

[    8.019432] audit: type=1804 audit(1592506283.344:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 op=measuring_kexec_cmdline cause=hashing_error comm="systemd" name="kexec-cmdline" res=0 errno=-22

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:48:36 -04:00
Lakshmi Ramasubramanian 2f845882ec integrity: Add errno field in audit message
Error code is not included in the audit messages logged by
the integrity subsystem.

Define a new function integrity_audit_message() that takes error code
in the "errno" parameter. Add "errno" field in the audit messages logged
by the integrity subsystem and set the value passed in the "errno"
parameter.

[    6.303048] audit: type=1804 audit(1592506281.627:2): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=kernel op=measuring_key cause=ENOMEM comm="swapper/0" name=".builtin_trusted_keys" res=0 errno=-12

[    7.987647] audit: type=1802 audit(1592506283.312:9): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 op=policy_update cause=completed comm="systemd" res=1 errno=0

[    8.019432] audit: type=1804 audit(1592506283.344:10): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 op=measuring_kexec_cmdline cause=hashing_error comm="systemd" name="kexec-cmdline" res=0 errno=-22

Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: Steve Grubb <sgrubb@redhat.com>
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-07-16 21:48:11 -04:00
Eric Biggers beb4ee6770 Smack: fix use-after-free in smk_write_relabel_self()
smk_write_relabel_self() frees memory from the task's credentials with
no locking, which can easily cause a use-after-free because multiple
tasks can share the same credentials structure.

Fix this by using prepare_creds() and commit_creds() to correctly modify
the task's credentials.

Reproducer for "BUG: KASAN: use-after-free in smk_write_relabel_self":

	#include <fcntl.h>
	#include <pthread.h>
	#include <unistd.h>

	static void *thrproc(void *arg)
	{
		int fd = open("/sys/fs/smackfs/relabel-self", O_WRONLY);
		for (;;) write(fd, "foo", 3);
	}

	int main()
	{
		pthread_t t;
		pthread_create(&t, NULL, thrproc, NULL);
		thrproc(NULL);
	}

Reported-by: syzbot+e6416dabb497a650da40@syzkaller.appspotmail.com
Fixes: 38416e5393 ("Smack: limited capability for changing process label")
Cc: <stable@vger.kernel.org> # v4.4+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2020-07-14 11:19:58 -07:00
Ondrej Mosnacek 54b27f9287 selinux: complete the inlining of hashtab functions
Move (most of) the definitions of hashtab_search() and hashtab_insert()
to the header file. In combination with the previous patch, this avoids
calling the callbacks indirectly by function pointers and allows for
better optimization, leading to a drastic performance improvement of
these operations.

With this patch, I measured a speed up in the following areas (measured
on x86_64 F32 VM with 4 CPUs):
  1. Policy load (`load_policy`) - takes ~150 ms instead of ~230 ms.
  2. `chcon -R unconfined_u:object_r:user_tmp_t:s0:c381,c519 /tmp/linux-src`
     where /tmp/linux-src is an extracted linux-5.7 source tarball -
     takes ~522 ms instead of ~576 ms. This is because of many
     symtab_search() calls in string_to_context_struct() when there are
     many categories specified in the context.
  3. `stress-ng --msg 1 --msg-ops 10000000` - takes 12.41 s instead of
     13.95 s (consumes 18.6 s of kernel CPU time instead of 21.6 s).
     This is thanks to security_transition_sid() being ~43% faster after
     this patch.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-09 19:08:16 -04:00
Ondrej Mosnacek 24def7bb92 selinux: prepare for inlining of hashtab functions
Refactor searching and inserting into hashtabs to pave the way for
converting hashtab_search() and hashtab_insert() to inline functions in
the next patch. This will avoid indirect calls and allow the compiler to
better optimize individual callers, leading to a significant performance
improvement.

In order to avoid the indirect calls, the key hashing and comparison
callbacks need to be extracted from the hashtab struct and passed
directly to hashtab_search()/_insert() by the callers so that the
callback address is always known at compile time. The kernel's
rhashtable library (<linux/rhashtable*.h>) does the same thing.

This of course makes the hashtab functions slightly easier to misuse by
passing a wrong callback set, but unfortunately there is no better way
to implement a hash table that is both generic and efficient in C. This
patch tries to somewhat mitigate this by only calling the hashtab
functions in the same file where the corresponding callbacks are
defined (wrapping them into more specialized functions as needed).

Note that this patch doesn't bring any benefit without also moving the
definitions of hashtab_search() and -_insert() to the header file, which
is done in a follow-up patch for easier review of the hashtab.c changes
in this patch.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-09 19:05:36 -04:00
Ondrej Mosnacek 237389e301 selinux: specialize symtab insert and search functions
This encapsulates symtab a little better and will help with further
refactoring later.

Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-08 20:21:43 -04:00
Richard Guy Briggs d7481b24b8 audit: issue CWD record to accompany LSM_AUDIT_DATA_* records
The LSM_AUDIT_DATA_* records for PATH, FILE, IOCTL_OP, DENTRY and INODE
are incomplete without the task context of the AUDIT Current Working
Directory record.  Add it.

This record addition can't use audit_dummy_context to determine whether
or not to store the record information since the LSM_AUDIT_DATA_*
records are initiated by various LSMs independent of any audit rules.
context->in_syscall is used to determine if it was called in user
context like audit_getname.

Please see the upstream issue
https://github.com/linux-audit/audit-kernel/issues/96

Adapted from Vladis Dronov's v2 patch.

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-08 19:02:11 -04:00
lihao 2c3d8dfece selinux: Fix spelling mistakes in the comments
Fix spelling mistakes in the comments
    quering==>querying

Signed-off-by: lihao <fly.lihao@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-07-08 12:15:52 -04:00
Christoph Hellwig a1f9b1c043 integrity/ima: switch to using __kernel_read
__kernel_read has a bunch of additional sanity checks, and this moves
the set_fs out of non-core code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08 08:27:57 +02:00
Linus Torvalds 615bc218d6 Two simple fixes for v5.8:
1) Fix hook iteration and default value for inode_copy_up_xattr
 	from KP Singh <kpsingh@google.com>
 
 2) Fix the key_permission LSM hook function type
 	from Sami Tolvanen <samitolvanen@google.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgycj0O+d1G2aycA8rZhLv9lQBTwFAl76VWoACgkQrZhLv9lQ
 BTzA7hAAoOtVwDy0eop24cCmBnJZBk234oyEB/2Qer7F66TbXJsbRtDN2Pmo9H8T
 SKVkg1LpeDdwAlByAThalTiJeZSSK6p3t7Yhhd0FmZpUKv+/WyAyWy8m2KmBF1M6
 xq2Sa9GzBFOm96vJlSRIMlpvpeVClY6soCiKowSdLXZ/Bqeg1daHEGXGnTTtC7Sg
 ju4aW/BylJzF9XhMBlcK3qLCd26FX2qPnqtTR0XeNLA+kX007lA2MyJ79xnnj2zb
 mWslT0e/z3xF3b1fGXLr16ELHIaK0+Nu5S5S1Y8OJdTqpL+fKmV68rePrDX2VCrB
 H0fdHXuVwMTP1SEimItTHsYsFXZuS8rjV5IgMPwiih3u5tUki/1C/4uQqbkXx5Uv
 ele7QBOgq48nKv1/tIp/7CnfS7SWsJVMYvVIYpBp6Svvguih4Ud+bksVQx9evYR5
 74ZFJXWMiLeXEdbPeVqaFCHrDggYpCV8Gcqnq+v2fn1R5mEK4tB9Y/xYGXGPt5QN
 CuoACM83B1PsYFhTHiLaEnVTe3ToAtgth3cm0PbfkPXmyGzwlf1ANNIaRpBkoJh7
 9Ms1B97EBsI4smkriv0WbmfAydJSVoqJaUOFqnTSLLwMivLJtCkQbxB44SxSV9AP
 tLEgvTn/CaY/O1nZ98ALtLMNlmj5Q2AhUkd3J/Hobl+oQzvb++8=
 =SToQ
 -----END PGP SIGNATURE-----

Merge tag 'fixes-v5.8-rc3-a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security subsystem fixes from James Morris:
 "Two simple fixes for v5.8:

   - Fix hook iteration and default value for inode_copy_up_xattr
     (KP Singh)

   - Fix the key_permission LSM hook function type (Sami Tolvanen)"

* tag 'fixes-v5.8-rc3-a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: Fix hook iteration and default value for inode_copy_up_xattr
  security: fix the key_permission LSM hook function type
2020-06-30 12:21:53 -07:00
Ethan Edwards 65d96351b1 selinux: fixed a checkpatch warning with the sizeof macro
`sizeof buf` changed to `sizeof(buf)`

Signed-off-by: Ethan Edwards <ethancarteredwards@gmail.com>
[PM: rewrote the subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-06-29 19:26:17 -04:00
Maurizio Drocco 20c59ce010 ima: extend boot_aggregate with kernel measurements
Registers 8-9 are used to store measurements of the kernel and its
command line (e.g., grub2 bootloader with tpm module enabled). IMA
should include them in the boot aggregate. Registers 8-9 should be
only included in non-SHA1 digests to avoid ambiguity.

Signed-off-by: Maurizio Drocco <maurizio.drocco@ibm.com>
Reviewed-by: Bruno Meneguele <bmeneg@redhat.com>
Tested-by: Bruno Meneguele <bmeneg@redhat.com>  (TPM 1.2, TPM 2.0)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-06-24 20:47:24 -04:00
Christoph Hellwig 3f1266f1f8 block: move block-related definitions out of fs.h
Move most of the block related definition out of fs.h into more suitable
headers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-24 09:16:02 -06:00
Stephen Smalley 7383c0f94d selinux: log error messages on required process class / permissions
In general SELinux no longer treats undefined object classes or permissions
in the policy as a fatal error, instead handling them in accordance with
handle_unknown. However, the process class and process transition and
dyntransition permissions are still required to be defined due to
dependencies on these definitions for default labeling behaviors,
role and range transitions in older policy versions that lack an explicit
class field, and role allow checking.  Log error messages in these cases
since otherwise the policy load will fail silently with no indication
to the user as to the underlying cause.  While here, fix the checking for
process transition / dyntransition so that omitting either permission is
handled as an error; both are needed in order to ensure that role allow
checking is consistently applied.

Reported-by: bauen1 <j2468h@googlemail.com>
Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-06-23 20:57:01 -04:00
Jonathan Lebon c8e222616c selinux: allow reading labels before policy is loaded
This patch does for `getxattr` what commit 3e3e24b420 ("selinux: allow
labeling before policy is loaded") did for `setxattr`; it allows
querying the current SELinux label on disk before the policy is loaded.

One of the motivations described in that commit message also drives this
patch: for Fedora CoreOS (and eventually RHEL CoreOS), we want to be
able to move the root filesystem for example, from xfs to ext4 on RAID,
on first boot, at initrd time.[1]

Because such an operation works at the filesystem level, we need to be
able to read the SELinux labels first from the original root, and apply
them to the files of the new root. The previous commit enabled the
second part of this process; this commit enables the first part.

[1] https://github.com/coreos/fedora-coreos-tracker/issues/94

Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Jonathan Lebon <jlebon@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-06-23 20:42:38 -04:00
KP Singh 23e390cdbe security: Fix hook iteration and default value for inode_copy_up_xattr
inode_copy_up_xattr returns 0 to indicate the acceptance of the xattr
and 1 to reject it. If the LSM does not know about the xattr, it's
expected to return -EOPNOTSUPP, which is the correct default value for
this hook. BPF LSM, currently, uses 0 as the default value and thereby
falsely allows all overlay fs xattributes to be copied up.

The iteration logic is also updated from the "bail-on-fail"
call_int_hook to continue on the non-decisive -EOPNOTSUPP and bail out
on other values.

Fixes: 98e828a065 ("security: Refactor declaration of LSM hooks")
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: James Morris <jmorris@namei.org>
2020-06-23 16:39:23 -07:00
Linus Torvalds 817d914d17 selinux/stable-5.8 PR 20200621
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl7vxoUUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOs3Q/+JSNYoKZiOJax5u1ePEwk5JRasRij
 JkKNjspXueVcoVkVF8lOiSOmQm/7FyfDUi2Qk8U5Gmx7Pr6vQJZgghHxdPMsemCz
 mNbbR8UMm6ssTim19ULBQ3S0Sc1QMQvQCLNDZcv24ne2K8d9HTBrFGenqlLU4UtZ
 JrMLircBt39fVonooMrf9ycGlM8tUZwm8te+Jp7KL18GUKZT8hr0HKzu2WE6/qT4
 WBGNaWxqnfbajnDb41ix2rL+lb8Snqn94cxCjp248rn7M5fJRSCKmYaumBh5ViJ2
 VuD/ZQsTX5SSnc9YDpkUDXya9M1wzFwf64ku6Avga1BXS6lNWB1wqWueSTMfggiL
 2B+LVANWkGFfHtVAVA5xsxXjeJnYmIj/g8qSiHS/RSFJazr1b/cXWedvyewll/Nv
 rFRBsVzktV6BBrlTclcrsu9FmlZRAThNC3uYs/s5vbAja+wHEhCLuacO+jiducRP
 fqQCP2iF6MqC6B2I8WzVp3jU8k2t02i6ySaXmXjzrwOZSLvnOdvDBzE7e95yNLRg
 WLeGd/o2PdLpVoSNVHelFrqm8VZKYSCkWty9WppklnrIVVydKMJ3bgihXY4pADyf
 1ABtKUZgySZKZOpr1pQBqIivHuvKqUGFynj6PSRsngQBoq6V3XpJ7ZCBhuG7cNAT
 9BfnUkhFW7lW70I=
 =nILH
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull SELinux fixes from Paul Moore:
 "Three small patches to fix problems in the SELinux code, all found via
  clang.

  Two patches fix potential double-free conditions and one fixes an
  undefined return value"

* tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix undefined return of cond_evaluate_expr
  selinux: fix a double free in cond_read_node()/cond_read_list()
  selinux: fix double free
2020-06-21 15:41:24 -07:00
Tom Rix 8231b0b9c3 selinux: fix undefined return of cond_evaluate_expr
clang static analysis reports an undefined return

security/selinux/ss/conditional.c:79:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
        return s[0];
        ^~~~~~~~~~~

static int cond_evaluate_expr( ...
{
	u32 i;
	int s[COND_EXPR_MAXDEPTH];

	for (i = 0; i < expr->len; i++)
	  ...

	return s[0];

When expr->len is 0, the loop which sets s[0] never runs.

So return -1 if the loop never runs.

Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-06-17 17:36:40 -04:00
Tom Rix aa449a7965 selinux: fix a double free in cond_read_node()/cond_read_list()
Clang static analysis reports this double free error

security/selinux/ss/conditional.c:139:2: warning: Attempt to free released memory [unix.Malloc]
        kfree(node->expr.nodes);
        ^~~~~~~~~~~~~~~~~~~~~~~

When cond_read_node fails, it calls cond_node_destroy which frees the
node but does not poison the entry in the node list.  So when it
returns to its caller cond_read_list, cond_read_list deletes the
partial list.  The latest entry in the list will be deleted twice.

So instead of freeing the node in cond_read_node, let list freeing in
code_read_list handle the freeing the problem node along with all of the
earlier nodes.

Because cond_read_node no longer does any error handling, the goto's
the error case are redundant.  Instead just return the error code.

Cc: stable@vger.kernel.org
Fixes: 60abd3181d ("selinux: convert cond_list to array")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: subject line tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-06-16 20:25:19 -04:00
glider@google.com f0fe00d497 security: allow using Clang's zero initialization for stack variables
In addition to -ftrivial-auto-var-init=pattern (used by
CONFIG_INIT_STACK_ALL now) Clang also supports zero initialization for
locals enabled by -ftrivial-auto-var-init=zero. The future of this flag
is still being debated (see https://bugs.llvm.org/show_bug.cgi?id=45497).
Right now it is guarded by another flag,
-enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang,
which means it may not be supported by future Clang releases. Another
possible resolution is that -ftrivial-auto-var-init=zero will persist
(as certain users have already started depending on it), but the name
of the guard flag will change.

In the meantime, zero initialization has proven itself as a good
production mitigation measure against uninitialized locals. Unlike pattern
initialization, which has a higher chance of triggering existing bugs,
zero initialization provides safe defaults for strings, pointers, indexes,
and sizes. On the other hand, pattern initialization remains safer for
return values. Chrome OS and Android are moving to using zero
initialization for production builds.

Performance-wise, the difference between pattern and zero initialization
is usually negligible, although the generated code for zero
initialization is more compact.

This patch renames CONFIG_INIT_STACK_ALL to CONFIG_INIT_STACK_ALL_PATTERN
and introduces another config option, CONFIG_INIT_STACK_ALL_ZERO, that
enables zero initialization for locals if the corresponding flags are
supported by Clang.

Cc: Kees Cook <keescook@chromium.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alexander Potapenko <glider@google.com>
Link: https://lore.kernel.org/r/20200616083435.223038-1-glider@google.com
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-06-16 02:06:23 -07:00
Gustavo A. R. Silva eb492c627a ima: Replace zero-length array with flexible-array
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://github.com/KSPP/linux/issues/21

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-06-15 23:08:32 -05:00
Linus Torvalds 4a87b197c1 Add additional LSM hooks for SafeSetID
SafeSetID is capable of making allow/deny decisions for set*uid calls
 on a system, and we want to add similar functionality for set*gid
 calls. The work to do that is not yet complete, so probably won't make
 it in for v5.8, but we are looking to get this simple patch in for
 v5.8 since we have it ready. We are planning on the rest of the work
 for extending the SafeSetID LSM being merged during the v5.9 merge
 window.
 
 This patch was sent to the security mailing list and there were no objections.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEgvWslnM+qUy+sgVg5n2WYw6TPBAFAl7mZCoACgkQ5n2WYw6T
 PBAk1RAAl8t3/m3lELf8qIir4OAd4nK0kc4e+7W8WkznX2ljUl2IetlNxDCBmEXr
 T5qoW6uPsr6kl5AKnbl9Ii7WpW/halsslpKSUNQCs6zbecoVdxekJ8ISW7xHuboZ
 SvS1bqm+t++PM0c0nWSFEr7eXYmPH8OGbCqu6/+nnbxPZf2rJX03e5LnHkEFDFnZ
 0D/rsKgzMt01pdBJQXeoKk79etHO5MjuAkkYVEKJKCR1fM16lk7ECaCp0KJv1Mmx
 I88VncbLvI+um4t82d1Z8qDr2iLgogjJrMZC4WKfxDTmlmxox2Fz9ZJo+8sIWk6k
 T3a95x0s/mYCO4gWtpCVICt9+71Z3ie9T2iaI+CIe/kJvI/ysb+7LSkF+PD33bdz
 0yv6Y9+VMRdzb3pW69R28IoP4wdYQOJRomsY49z6ypH0RgBWcBvyE6e4v+WJGRNK
 E164Imevf6rrZeqJ0kGSBS1nL9WmQHMaXabAwxg1jK1KRZD+YZj3EKC9S/+PAkaT
 1qXUgvGuXHGjQrwU0hclQjgc6BAudWfAGdfrVr7IWwNKJmjgBf6C35my/azrkOg9
 wHCEpUWVmZZLIZLM69/6QXdmMA+iR+rPz5qlVnWhWTfjRYJUXM455Zk+aNo+Qnwi
 +saCcdU+9xqreLeDIoYoebcV/ctHeW0XCQi/+ebjexXVlyeSfYs=
 =I+0L
 -----END PGP SIGNATURE-----

Merge tag 'LSM-add-setgid-hook-5.8-author-fix' of git://github.com/micah-morton/linux

Pull SafeSetID update from Micah Morton:
 "Add additional LSM hooks for SafeSetID

  SafeSetID is capable of making allow/deny decisions for set*uid calls
  on a system, and we want to add similar functionality for set*gid
  calls.

  The work to do that is not yet complete, so probably won't make it in
  for v5.8, but we are looking to get this simple patch in for v5.8
  since we have it ready.

  We are planning on the rest of the work for extending the SafeSetID
  LSM being merged during the v5.9 merge window"

* tag 'LSM-add-setgid-hook-5.8-author-fix' of git://github.com/micah-morton/linux:
  security: Add LSM hooks to set*gid syscalls
2020-06-14 11:39:31 -07:00
Thomas Cedeno 39030e1351 security: Add LSM hooks to set*gid syscalls
The SafeSetID LSM uses the security_task_fix_setuid hook to filter
set*uid() syscalls according to its configured security policy. In
preparation for adding analagous support in the LSM for set*gid()
syscalls, we add the requisite hook here. Tested by putting print
statements in the security_task_fix_setgid hook and seeing them get hit
during kernel boot.

Signed-off-by: Thomas Cedeno <thomascedeno@google.com>
Signed-off-by: Micah Morton <mortonm@chromium.org>
2020-06-14 10:52:02 -07:00
Linus Torvalds 6adc19fd13 Kbuild updates for v5.8 (2nd)
- fix build rules in binderfs sample
 
  - fix build errors when Kbuild recurses to the top Makefile
 
  - covert '---help---' in Kconfig to 'help'
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl7lBuYVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGHvIP/3iErjPshpg/phwH8NTCS4SFkiti
 BZRM+2lupSn7Qs53BTpVzIkXoHBJQZlJxlQ5HY8ScO+fiz28rKZr+b40us+je1Q+
 SkvSPfwZzxjEg7lAZutznG4KgItJLWJKmDyh9T8Y8TAuG4f8WO0hKnXoAp3YorS2
 zppEIxso8O5spZPjp+fF/fPbxPjIsabGK7Jp2LpSVFR5pVDHI/ycTlKQS+MFpMEx
 6JIpdFRw7TkvKew1dr5uAWT5btWHatEqjSR3JeyVHv3EICTGQwHmcHK67cJzGInK
 T51+DT7/CpKtmRgGMiTEu/INfMzzoQAKl6Fcu+vMaShTN97Hk9DpdtQyvA6P/h3L
 8GA4UBct05J7fjjIB7iUD+GYQ0EZbaFujzRXLYk+dQqEJRbhcCwvdzggGp0WvGRs
 1f8/AIpgnQv8JSL/bOMgGMS5uL2dSLsgbzTdr6RzWf1jlYdI1i4u7AZ/nBrwWP+Z
 iOBkKsVceEoJrTbaynl3eoYqFLtWyDau+//oBc2gUvmhn8ioM5dfqBRiJjxJnPG9
 /giRj6xRIqMMEw8Gg8PCG7WebfWxWyaIQwlWBbPok7DwISURK5mvOyakZL+Q25/y
 6MBr2H8NEJsf35q0GTINpfZnot7NX4JXrrndJH8NIRC7HEhwd29S041xlQJdP0rs
 E76xsOr3hrAmBu4P
 =1NIT
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

 - fix build rules in binderfs sample

 - fix build errors when Kbuild recurses to the top Makefile

 - covert '---help---' in Kconfig to 'help'

* tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  treewide: replace '---help---' in Kconfig files with 'help'
  kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
  samples: binderfs: really compile this sample and fix build issues
2020-06-13 13:29:16 -07:00
Masahiro Yamada a7f7f6248d treewide: replace '---help---' in Kconfig files with 'help'
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following commend:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-14 01:57:21 +09:00
Linus Torvalds 6c32978414 Notifications over pipes + Keyring notifications
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl7U/i8ACgkQ+7dXa6fL
 C2u2eg/+Oy6ybq0hPovYVkFI9WIG7ZCz7w9Q6BEnfYMqqn3dnfJxKQ3l4pnQEOWw
 f4QfvpvevsYfMtOJkYcG6s66rQgbFdqc5TEyBBy0QNp3acRolN7IXkcopvv9xOpQ
 JxedpbFG1PTFLWjvBpyjlrUPouwLzq2FXAf1Ox0ZIMw6165mYOMWoli1VL8dh0A0
 Ai7JUB0WrvTNbrwhV413obIzXT/rPCdcrgbQcgrrLPex8lQ47ZAE9bq6k4q5HiwK
 KRzEqkQgnzId6cCNTFBfkTWsx89zZunz7jkfM5yx30MvdAtPSxvvpfIPdZRZkXsP
 E2K9Fk1/6OQZTC0Op3Pi/bt+hVG/mD1p0sQUDgo2MO3qlSS+5mMkR8h3mJEgwK12
 72P4YfOJkuAy2z3v4lL0GYdUDAZY6i6G8TMxERKu/a9O3VjTWICDOyBUS6F8YEAK
 C7HlbZxAEOKTVK0BTDTeEUBwSeDrBbvH6MnRlZCG5g1Fos2aWP0udhjiX8IfZLO7
 GN6nWBvK1fYzfsUczdhgnoCzQs3suoDo04HnsTPGJ8De52T4x2RsjV+gPx0nrNAq
 eWChl1JvMWsY2B3GLnl9XQz4NNN+EreKEkk+PULDGllrArrPsp5Vnhb9FJO1PVCU
 hMDJHohPiXnKbc8f4Bd78OhIvnuoGfJPdM5MtNe2flUKy2a2ops=
 =YTGf
 -----END PGP SIGNATURE-----

Merge tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull notification queue from David Howells:
 "This adds a general notification queue concept and adds an event
  source for keys/keyrings, such as linking and unlinking keys and
  changing their attributes.

  Thanks to Debarshi Ray, we do have a pull request to use this to fix a
  problem with gnome-online-accounts - as mentioned last time:

     https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

  Without this, g-o-a has to constantly poll a keyring-based kerberos
  cache to find out if kinit has changed anything.

  [ There are other notification pending: mount/sb fsinfo notifications
    for libmount that Karel Zak and Ian Kent have been working on, and
    Christian Brauner would like to use them in lxc, but let's see how
    this one works first ]

  LSM hooks are included:

   - A set of hooks are provided that allow an LSM to rule on whether or
     not a watch may be set. Each of these hooks takes a different
     "watched object" parameter, so they're not really shareable. The
     LSM should use current's credentials. [Wanted by SELinux & Smack]

   - A hook is provided to allow an LSM to rule on whether or not a
     particular message may be posted to a particular queue. This is
     given the credentials from the event generator (which may be the
     system) and the watch setter. [Wanted by Smack]

  I've provided SELinux and Smack with implementations of some of these
  hooks.

  WHY
  ===

  Key/keyring notifications are desirable because if you have your
  kerberos tickets in a file/directory, your Gnome desktop will monitor
  that using something like fanotify and tell you if your credentials
  cache changes.

  However, we also have the ability to cache your kerberos tickets in
  the session, user or persistent keyring so that it isn't left around
  on disk across a reboot or logout. Keyrings, however, cannot currently
  be monitored asynchronously, so the desktop has to poll for it - not
  so good on a laptop. This facility will allow the desktop to avoid the
  need to poll.

  DESIGN DECISIONS
  ================

   - The notification queue is built on top of a standard pipe. Messages
     are effectively spliced in. The pipe is opened with a special flag:

        pipe2(fds, O_NOTIFICATION_PIPE);

     The special flag has the same value as O_EXCL (which doesn't seem
     like it will ever be applicable in this context)[?]. It is given up
     front to make it a lot easier to prohibit splice&co from accessing
     the pipe.

     [?] Should this be done some other way?  I'd rather not use up a new
         O_* flag if I can avoid it - should I add a pipe3() system call
         instead?

     The pipe is then configured::

        ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
        ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

     Messages are then read out of the pipe using read().

   - It should be possible to allow write() to insert data into the
     notification pipes too, but this is currently disabled as the
     kernel has to be able to insert messages into the pipe *without*
     holding pipe->mutex and the code to make this work needs careful
     auditing.

   - sendfile(), splice() and vmsplice() are disabled on notification
     pipes because of the pipe->mutex issue and also because they
     sometimes want to revert what they just did - but one or more
     notification messages might've been interleaved in the ring.

   - The kernel inserts messages with the wait queue spinlock held. This
     means that pipe_read() and pipe_write() have to take the spinlock
     to update the queue pointers.

   - Records in the buffer are binary, typed and have a length so that
     they can be of varying size.

     This allows multiple heterogeneous sources to share a common
     buffer; there are 16 million types available, of which I've used
     just a few, so there is scope for others to be used. Tags may be
     specified when a watchpoint is created to help distinguish the
     sources.

   - Records are filterable as types have up to 256 subtypes that can be
     individually filtered. Other filtration is also available.

   - Notification pipes don't interfere with each other; each may be
     bound to a different set of watches. Any particular notification
     will be copied to all the queues that are currently watching for it
     - and only those that are watching for it.

   - When recording a notification, the kernel will not sleep, but will
     rather mark a queue as having lost a message if there's
     insufficient space. read() will fabricate a loss notification
     message at an appropriate point later.

   - The notification pipe is created and then watchpoints are attached
     to it, using one of:

        keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
        watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
        watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

     where in both cases, fd indicates the queue and the number after is
     a tag between 0 and 255.

   - Watches are removed if either the notification pipe is destroyed or
     the watched object is destroyed. In the latter case, a message will
     be generated indicating the enforced watch removal.

  Things I want to avoid:

   - Introducing features that make the core VFS dependent on the
     network stack or networking namespaces (ie. usage of netlink).

   - Dumping all this stuff into dmesg and having a daemon that sits
     there parsing the output and distributing it as this then puts the
     responsibility for security into userspace and makes handling
     namespaces tricky. Further, dmesg might not exist or might be
     inaccessible inside a container.

   - Letting users see events they shouldn't be able to see.

  TESTING AND MANPAGES
  ====================

   - The keyutils tree has a pipe-watch branch that has keyctl commands
     for making use of notifications. Proposed manual pages can also be
     found on this branch, though a couple of them really need to go to
     the main manpages repository instead.

     If the kernel supports the watching of keys, then running "make
     test" on that branch will cause the testing infrastructure to spawn
     a monitoring process on the side that monitors a notifications pipe
     for all the key/keyring changes induced by the tests and they'll
     all be checked off to make sure they happened.

        https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

   - A test program is provided (samples/watch_queue/watch_test) that
     can be used to monitor for keyrings, mount and superblock events.
     Information on the notifications is simply logged to stdout"

* tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  smack: Implement the watch_key and post_notification hooks
  selinux: Implement the watch_key security hook
  keys: Make the KEY_NEED_* perms an enum rather than a mask
  pipe: Add notification lossage handling
  pipe: Allow buffers to be marked read-whole-or-error for notifications
  Add sample notification program
  watch_queue: Add a key/keyring notification facility
  security: Add hooks to rule on setting a watch
  pipe: Add general notification queue support
  pipe: Add O_NOTIFICATION_PIPE
  security: Add a hook for the point of notification insertion
  uapi: General notification queue definitions
2020-06-13 09:56:21 -07:00
Linus Torvalds 923ea1631e ima: mprotect performance fix
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJe46D/AAoJEGt5JGawPnFaIQsQALJSQjR9xxnWjoriSAEoo6wb
 6PdIZkHyDJvpX7ZFZAjHN6ntipY4I7+TlSeAn31XPgjrmQICfaP6LldHwlap2iI+
 0Ty/0E+aLcsfUJvuDd6YaOTQzKoefZyAD5jIHuhgVqYUQlZ77+7Ouht8HOh5zLHw
 w7IYTrs4V3ckd6KF2kZ3k+7Bbeod2EXXBwL7E3D9a8gIk4vAWo7+ONHI0h75kHKZ
 u+g/n/3hXv0wDDDRH2bwQg2XkMknkw63SMOwDm8UvwjIzVj8Xyoo3g7fJJOt8kOj
 edH1zndn5QL6GXsbv3Au0VSWXfizjbqUIxPv/GZaLLm0hM5c0rSRnV0F356huamb
 F5iS1i5b+iIaOoKMAiJbD2C3cMQ1ssTU2y4W3B43xx/yR6TNIGWR60efj5NN8fYN
 oNSBMf7ARelDSa696wpL6W5yZ598CzgK/WTvLHwQkZJrwXwifFNvK7T2AkggWpFf
 sICQEjJjwICAqIgbjVGM+cBXr9050nq/HGADhGOdgshl2iLPd0iqhHGFyWbDs3JI
 kscF9I/JooZ2tj1UrTGJRZSudd8fTZOoTpGrTUNPMm5ib68/ObYriAIOMEmaoYEu
 9X8FFJDJSZg5Efg+Orx5ju2iJaO+LUSp+/hyWrE9c4m2ZtAKK/s5WFU/dzeQcYC6
 uHGvXLE+Eh/qvgoYfQnv
 =QV8/
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v5.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fix from Mimi Zohar:
 "ima mprotect performance fix"

* tag 'integrity-v5.8-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: fix mprotect checking
2020-06-12 12:02:41 -07:00
Mimi Zohar 4235b1a4ef ima: fix mprotect checking
Make sure IMA is enabled before checking mprotect change.  Addresses
report of a 3.7% regression of boot-time.dhcp.

Fixes: 8eb613c0b8 ("ima: verify mprotect change is consistent with mmap policy")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Tested-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2020-06-12 11:30:18 -04:00
Tom Rix 65de50969a selinux: fix double free
Clang's static analysis tool reports these double free memory errors.

security/selinux/ss/services.c:2987:4: warning: Attempt to free released memory [unix.Malloc]
                        kfree(bnames[i]);
                        ^~~~~~~~~~~~~~~~
security/selinux/ss/services.c:2990:2: warning: Attempt to free released memory [unix.Malloc]
        kfree(bvalues);
        ^~~~~~~~~~~~~~

So improve the security_get_bools error handling by freeing these variables
and setting their return pointers to NULL and the return len to 0

Cc: stable@vger.kernel.org
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-06-10 22:10:35 -04:00
Linus Torvalds 52435c86bf overlayfs update for 5.8
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCXt9klAAKCRDh3BK/laaZ
 PBeeAP9GRI0yajPzBzz2ZK9KkDc6A7wPiaAec+86Q+c02VncVwEAvq5Pi4um5RTZ
 7SVv56ggKO3Cqx779zVyZTRYDs3+YA4=
 =bpKI
 -----END PGP SIGNATURE-----

Merge tag 'ovl-update-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:
 "Fixes:

   - Resolve mount option conflicts consistently

   - Sync before remount R/O

   - Fix file handle encoding corner cases

   - Fix metacopy related issues

   - Fix an unintialized return value

   - Add missing permission checks for underlying layers

  Optimizations:

   - Allow multipe whiteouts to share an inode

   - Optimize small writes by inheriting SB_NOSEC from upper layer

   - Do not call ->syncfs() multiple times for sync(2)

   - Do not cache negative lookups on upper layer

   - Make private internal mounts longterm"

* tag 'ovl-update-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (27 commits)
  ovl: remove unnecessary lock check
  ovl: make oip->index bool
  ovl: only pass ->ki_flags to ovl_iocb_to_rwf()
  ovl: make private mounts longterm
  ovl: get rid of redundant members in struct ovl_fs
  ovl: add accessor for ofs->upper_mnt
  ovl: initialize error in ovl_copy_xattr
  ovl: drop negative dentry in upper layer
  ovl: check permission to open real file
  ovl: call secutiry hook in ovl_real_ioctl()
  ovl: verify permissions in ovl_path_open()
  ovl: switch to mounter creds in readdir
  ovl: pass correct flags for opening real directory
  ovl: fix redirect traversal on metacopy dentries
  ovl: initialize OVL_UPPERDATA in ovl_lookup()
  ovl: use only uppermetacopy state in ovl_lookup()
  ovl: simplify setting of origin for index lookup
  ovl: fix out of bounds access warning in ovl_check_fb_len()
  ovl: return required buffer size for file handles
  ovl: sync dirty data when remounting to ro mode
  ...
2020-06-09 15:40:50 -07:00
Linus Torvalds 595a56ac1b linux-kselftest-kunit-5.8-rc1
This Kunit update for Linux 5.8-rc1 consists of:
 
 - Several config fragment fixes from Anders Roxell to improve
   test coverage.
 - Improvements to kunit run script to use defconfig as default and
   restructure the code for config/build/exec/parse from Vitor Massaru Iha
   and David Gow.
 - Miscellaneous documentation warn fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl7etrcACgkQCwJExA0N
 QxzGYg/+KHpPhB31IAjNFKCRqwDooftst3dohhzguxJLpDHdEmVJ4moQhLr4gL+/
 qpi3T9hr4Rx++n/A5NoxDvyJvGr+FAL40U+Of7F2UyHpqQmfKPj37I+yvyeR1JEL
 z4+yXEpfQLZaQkmZ7f3GWHyqN3+xwvyTEy7NYUad7xMxLF/99No+I6RMD6yp3srS
 wUUeuBIesSFT0LXYrgI+wgsNGUESlj/McjiP5eMj6UtlMgKpzmfzH56Fia8uw1pw
 6QtpntxDHjtxVfp8YKM4qExI54YI2t6sgHTIoOUsMWD5Q2kHd8kNf1L+lb1sKYUF
 j7lzol5nuqqchAVQYjHzNHa8XKndvexGyWMsPz1gAnkpgVrvBTSFcavdDpDuDZ0T
 HoJZnk9XPsguBQjDxapzPYfAQ81Un/rEmZQ8/X2TaNjdSIH1hHljhaP2OZ6eND/Q
 iobq9x8nC9D95TIqjDbRw3Sp2na/pZLN8Gp27hmKlc+L1XzV8NuZe/WGOUe3lsrq
 fG1ZSLo/iRau8gHuF6fRSrGIzQSCEMGKl3jlQ28OT9HGMAgTlncEwVzQId48/AsS
 UOY+bIAnRZuK+B5F/vw6L3o1e3c17z5bruVlb0M0alP5b7P9/3WLNHsHA3r8haZF
 F6PwIWu41wdRjJf2HI7zD5LaQe/7oU3jfwvuA7n2z8Py+zGx7m4=
 =S+HY
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kunit updates from Shuah Khan:
 "This consists of:

   - Several config fragment fixes from Anders Roxell to improve test
     coverage.

   - Improvements to kunit run script to use defconfig as default and
     restructure the code for config/build/exec/parse from Vitor Massaru
     Iha and David Gow.

   - Miscellaneous documentation warn fix"

* tag 'linux-kselftest-kunit-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  security: apparmor: default KUNIT_* fragments to KUNIT_ALL_TESTS
  fs: ext4: default KUNIT_* fragments to KUNIT_ALL_TESTS
  drivers: base: default KUNIT_* fragments to KUNIT_ALL_TESTS
  lib: Kconfig.debug: default KUNIT_* fragments to KUNIT_ALL_TESTS
  kunit: default KUNIT_* fragments to KUNIT_ALL_TESTS
  kunit: Kconfig: enable a KUNIT_ALL_TESTS fragment
  kunit: Fix TabError, remove defconfig code and handle when there is no kunitconfig
  kunit: use KUnit defconfig by default
  kunit: use --build_dir=.kunit as default
  Documentation: test.h - fix warnings
  kunit: kunit_tool: Separate out config/build/exec/parse
2020-06-09 10:04:47 -07:00
Michel Lespinasse c1e8d7c6a7 mmap locking API: convert mmap_sem comments
Convert comments that reference mmap_sem to reference mmap_lock instead.

[akpm@linux-foundation.org: fix up linux-next leftovers]
[akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
[akpm@linux-foundation.org: more linux-next fixups, per Michel]

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Linus Torvalds a2b447066c Tag summary
+ Features
   - Replace zero-length array with flexible-array
   - add a valid state flags check
   - add consistency check between state and dfa diff encode flags
   - add apparmor subdir to proc attr interface
   - fail unpack if profile mode is unknown
   - add outofband transition and use it in xattr match
   - ensure that dfa state tables have entries
 
 + Cleanups
   - Use true and false for bool variable
   - Remove semicolon
   - Clean code by removing redundant instructions
   - Replace two seq_printf() calls by seq_puts() in aa_label_seq_xprint()
   - remove duplicate check of xattrs on profile attachment
   - remove useless aafs_create_symlink
 
 + Bug fixes
   - Fix memory leak of profile proxy
   - fix introspection of of task mode for unconfined tasks
   - fix nnp subset test for unconfined
   - check/put label on apparmor_sk_clone_security()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAl7dUf4ACgkQBS82cBjV
 w9j8rA//R3qbVeiN3SJtxLhiF3AAdP2cVbZ/mAhQLwYObI6flb1bliiahJHRf8Ey
 FaVb4srOH8NlmzNINZehXOvD3UDwX/sbpw8h0Y0JolO+v1m3UXkt/eRoMt6gRz7I
 jtaImY1/V+G4O5rV5fGA1HQI8Geg+W9Abt32d16vyKIIpnBS/Pfv8ppM0NcHCZ4G
 e8935T/dMNK5K0Y7HNb1nMjyzEr0LtEXvXznBOrGVpCtDQ45m0/NBvAqpfhuKsVm
 FE5Na8rgtiB9sU72LaoNXNr8Y5LVgkXPmBr/e1FqZtF01XEarKb7yJDGOLrLpp1o
 rGYpY9DQSBT/ZZrwMaLFqCd1XtnN1BAmhlM6TXfnm25ArEnQ49ReHFc7ZHZRSTZz
 LWVBD6atZbapvqckk1SU49eCLuGs5wmRj/CmwdoQUbZ+aOfR68zF+0PANbP5xDo4
 862MmeMsm8JHndeCelpZQRbhtXt0t9MDzwMBevKhxV9hbpt4g8DcnC5tNUc9AnJi
 qJDsMkytYhazIW+/4MsnLTo9wzhqzXq5kBeE++Xl7vDE/V+d5ocvQg73xtwQo9sx
 LzMlh3cPmBvOnlpYfnONZP8pJdjDAuESsi/H5+RKQL3cLz7NX31CLWR8dXLBHy80
 Dvxqvy84Cf7buigqwSzgAGKjDI5HmeOECAMjpLbEB2NS9xxQYuk=
 =U7d2
 -----END PGP SIGNATURE-----

Merge tag 'apparmor-pr-2020-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor

Pull apparmor updates from John Johansen:
 "Features:
   - Replace zero-length array with flexible-array
   - add a valid state flags check
   - add consistency check between state and dfa diff encode flags
   - add apparmor subdir to proc attr interface
   - fail unpack if profile mode is unknown
   - add outofband transition and use it in xattr match
   - ensure that dfa state tables have entries

  Cleanups:
   - Use true and false for bool variable
   - Remove semicolon
   - Clean code by removing redundant instructions
   - Replace two seq_printf() calls by seq_puts() in aa_label_seq_xprint()
   - remove duplicate check of xattrs on profile attachment
   - remove useless aafs_create_symlink

  Bug fixes:
   - Fix memory leak of profile proxy
   - fix introspection of of task mode for unconfined tasks
   - fix nnp subset test for unconfined
   - check/put label on apparmor_sk_clone_security()"

* tag 'apparmor-pr-2020-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: Fix memory leak of profile proxy
  apparmor: fix introspection of of task mode for unconfined tasks
  apparmor: check/put label on apparmor_sk_clone_security()
  apparmor: Use true and false for bool variable
  security/apparmor/label.c: Clean code by removing redundant instructions
  apparmor: Replace zero-length array with flexible-array
  apparmor: ensure that dfa state tables have entries
  apparmor: remove duplicate check of xattrs on profile attachment.
  apparmor: add outofband transition and use it in xattr match
  apparmor: fail unpack if profile mode is unknown
  apparmor: fix nnp subset test for unconfined
  apparmor: remove useless aafs_create_symlink
  apparmor: add proc subdir to attrs
  apparmor: add consistency check between state and dfa diff encode flags
  apparmor: add a valid state flags check
  AppArmor: Remove semicolon
  apparmor: Replace two seq_printf() calls by seq_puts() in aa_label_seq_xprint()
2020-06-07 16:04:49 -07:00
Roberto Sassu 8b8c704d91 ima: Remove __init annotation from ima_pcrread()
Commit 6cc7c266e5 ("ima: Call ima_calc_boot_aggregate() in
ima_eventdigest_init()") added a call to ima_calc_boot_aggregate() so that
the digest can be recalculated for the boot_aggregate measurement entry if
the 'd' template field has been requested. For the 'd' field, only SHA1 and
MD5 digests are accepted.

Given that ima_eventdigest_init() does not have the __init annotation, all
functions called should not have it. This patch removes __init from
ima_pcrread().

Cc: stable@vger.kernel.org
Fixes:  6cc7c266e5 ("ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-07 16:03:09 -07:00
John Johansen 3622ad25d4 apparmor: Fix memory leak of profile proxy
When the proxy isn't replaced and the profile is removed, the proxy
is being leaked resulting in a kmemleak check message of

unreferenced object 0xffff888077a3a490 (size 16):
  comm "apparmor_parser", pid 128041, jiffies 4322684109 (age 1097.028s)
  hex dump (first 16 bytes):
    03 00 00 00 00 00 00 00 b0 92 fd 4b 81 88 ff ff  ...........K....
  backtrace:
    [<0000000084d5daf2>] aa_alloc_proxy+0x58/0xe0
    [<00000000ecc0e21a>] aa_alloc_profile+0x159/0x1a0
    [<000000004cc9ce15>] unpack_profile+0x275/0x1c40
    [<000000007332b3ca>] aa_unpack+0x1e7/0x7e0
    [<00000000e25e31bd>] aa_replace_profiles+0x18a/0x1d10
    [<00000000350d9415>] policy_update+0x237/0x650
    [<000000003fbf934e>] profile_load+0x122/0x160
    [<0000000047f7b781>] vfs_write+0x139/0x290
    [<000000008ad12358>] ksys_write+0xcd/0x170
    [<000000001a9daa7b>] do_syscall_64+0x70/0x310
    [<00000000b9efb0cf>] entry_SYSCALL_64_after_hwframe+0x49/0xb3

Make sure to cleanup the profile's embedded label which will result
on the proxy being properly freed.

Fixes: 637f688dc3 ("apparmor: switch from profiles to using labels on contexts")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-06-07 13:38:55 -07:00
John Johansen dd2569fbb0 apparmor: fix introspection of of task mode for unconfined tasks
Fix two issues with introspecting the task mode.

1. If a task is attached to a unconfined profile that is not the
   ns->unconfined profile then. Mode the mode is always reported
   as -

      $ ps -Z
      LABEL                               PID TTY          TIME CMD
      unconfined                         1287 pts/0    00:00:01 bash
      test (-)                           1892 pts/0    00:00:00 ps

   instead of the correct value of (unconfined) as shown below

      $ ps -Z
      LABEL                               PID TTY          TIME CMD
      unconfined                         2483 pts/0    00:00:01 bash
      test (unconfined)                  3591 pts/0    00:00:00 ps

2. if a task is confined by a stack of profiles that are unconfined
   the output of label mode is again the incorrect value of (-) like
   above, instead of (unconfined). This is because the visibile
   profile count increment is skipped by the special casing of
   unconfined.

Fixes: f1bd904175 ("apparmor: add the base fns() for domain labels")
Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-06-07 13:38:55 -07:00
Mauricio Faria de Oliveira 3b646abc5b apparmor: check/put label on apparmor_sk_clone_security()
Currently apparmor_sk_clone_security() does not check for existing
label/peer in the 'new' struct sock; it just overwrites it, if any
(with another reference to the label of the source sock.)

    static void apparmor_sk_clone_security(const struct sock *sk,
                                           struct sock *newsk)
    {
            struct aa_sk_ctx *ctx = SK_CTX(sk);
            struct aa_sk_ctx *new = SK_CTX(newsk);

            new->label = aa_get_label(ctx->label);
            new->peer = aa_get_label(ctx->peer);
    }

This might leak label references, which might overflow under load.
Thus, check for and put labels, to prevent such errors.

Note this is similarly done on:

    static int apparmor_socket_post_create(struct socket *sock, ...)
    ...
            if (sock->sk) {
                    struct aa_sk_ctx *ctx = SK_CTX(sock->sk);

                    aa_put_label(ctx->label);
                    ctx->label = aa_get_label(label);
            }
    ...

Context:
-------

The label reference count leak is observed if apparmor_sock_graft()
is called previously: this sets the 'ctx->label' field by getting
a reference to the current label (later overwritten, without put.)

    static void apparmor_sock_graft(struct sock *sk, ...)
    {
            struct aa_sk_ctx *ctx = SK_CTX(sk);

            if (!ctx->label)
                    ctx->label = aa_get_current_label();
    }

And that is the case on crypto/af_alg.c:af_alg_accept():

    int af_alg_accept(struct sock *sk, struct socket *newsock, ...)
    ...
            struct sock *sk2;
            ...
            sk2 = sk_alloc(...);
            ...
            security_sock_graft(sk2, newsock);
            security_sk_clone(sk, sk2);
    ...

Apparently both calls are done on their own right, especially for
other LSMs, being introduced in 2010/2014, before apparmor socket
mediation in 2017 (see commits [1,2,3,4]).

So, it looks OK there! Let's fix the reference leak in apparmor.

Test-case:
---------

Exercise that code path enough to overflow label reference count.

    $ cat aa-refcnt-af_alg.c
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <linux/if_alg.h>

    int main() {
            int sockfd;
            struct sockaddr_alg sa;

            /* Setup the crypto API socket */
            sockfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
            if (sockfd < 0) {
                    perror("socket");
                    return 1;
            }

            memset(&sa, 0, sizeof(sa));
            sa.salg_family = AF_ALG;
            strcpy((char *) sa.salg_type, "rng");
            strcpy((char *) sa.salg_name, "stdrng");

            if (bind(sockfd, (struct sockaddr *) &sa, sizeof(sa)) < 0) {
                    perror("bind");
                    return 1;
            }

            /* Accept a "connection" and close it; repeat. */
            while (!close(accept(sockfd, NULL, 0)));

            return 0;
    }

    $ gcc -o aa-refcnt-af_alg aa-refcnt-af_alg.c

    $ ./aa-refcnt-af_alg
    <a few hours later>

    [ 9928.475953] refcount_t overflow at apparmor_sk_clone_security+0x37/0x70 in aa-refcnt-af_alg[1322], uid/euid: 1000/1000
    ...
    [ 9928.507443] RIP: 0010:apparmor_sk_clone_security+0x37/0x70
    ...
    [ 9928.514286]  security_sk_clone+0x33/0x50
    [ 9928.514807]  af_alg_accept+0x81/0x1c0 [af_alg]
    [ 9928.516091]  alg_accept+0x15/0x20 [af_alg]
    [ 9928.516682]  SYSC_accept4+0xff/0x210
    [ 9928.519609]  SyS_accept+0x10/0x20
    [ 9928.520190]  do_syscall_64+0x73/0x130
    [ 9928.520808]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Note that other messages may be seen, not just overflow, depending on
the value being incremented by kref_get(); on another run:

    [ 7273.182666] refcount_t: saturated; leaking memory.
    ...
    [ 7273.185789] refcount_t: underflow; use-after-free.

Kprobes:
-------

Using kprobe events to monitor sk -> sk_security -> label -> count (kref):

Original v5.7 (one reference leak every iteration)

 ... (af_alg_accept+0x0/0x1c0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd2
 ... (af_alg_release_parent+0x0/0xd0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd4
 ... (af_alg_accept+0x0/0x1c0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd3
 ... (af_alg_release_parent+0x0/0xd0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd5
 ... (af_alg_accept+0x0/0x1c0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd4
 ... (af_alg_release_parent+0x0/0xd0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd6

Patched v5.7 (zero reference leak per iteration)

 ... (af_alg_accept+0x0/0x1c0) label=0xffff9ff376c25eb0 label_refcnt=0x593
 ... (af_alg_release_parent+0x0/0xd0) label=0xffff9ff376c25eb0 label_refcnt=0x594
 ... (af_alg_accept+0x0/0x1c0) label=0xffff9ff376c25eb0 label_refcnt=0x593
 ... (af_alg_release_parent+0x0/0xd0) label=0xffff9ff376c25eb0 label_refcnt=0x594
 ... (af_alg_accept+0x0/0x1c0) label=0xffff9ff376c25eb0 label_refcnt=0x593
 ... (af_alg_release_parent+0x0/0xd0) label=0xffff9ff376c25eb0 label_refcnt=0x594

Commits:
-------

[1] commit 507cad355f ("crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets")
[2] commit 4c63f83c2c ("crypto: af_alg - properly label AF_ALG socket")
[3] commit 2acce6aa9f ("Networking") a.k.a ("crypto: af_alg - Avoid sock_graft call warning)
[4] commit 56974a6fcf ("apparmor: add base infastructure for socket mediation")

Fixes: 56974a6fcf ("apparmor: add base infastructure for socket mediation")
Reported-by: Brian Moyles <bmoyles@netflix.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
2020-06-07 13:38:56 -07:00