WSL2-Linux-Kernel/Documentation
Tycho Andersen 6a21cc50f0 seccomp: add a return code to trap to userspace
This patch introduces a means for syscalls matched in seccomp to notify
some other task that a particular filter has been triggered.

The primary motivation is use with containers. For example,
if a container does an init_module(), we obviously don't want to load this
untrusted code, which may be compiled for the wrong version of the kernel
anyway. Instead, we could parse the module image, figure out which module
the container is trying to load and load it on the host.

As another example, containers cannot mount() in general since various
filesystems assume a trusted image. However, if an orchestrator knows that
e.g. a particular block device has not been exposed to a container for
writing, it may want to allow the container to mount that block device (that
is, handle the mount for it).

This patch adds functionality that is already possible via at least two
other means that I know about, both of which involve ptrace(): first, one
could ptrace attach, and then iterate through syscalls via PTRACE_SYSCALL.
Unfortunately this is slow, so a faster version would be to install a
filter that does SECCOMP_RET_TRACE, which triggers a PTRACE_EVENT_SECCOMP.
Since ptrace allows only one tracer, if the container runtime is that
tracer, users inside the container (or outside) trying to debug it will not
be able to use ptrace, which is annoying. It also means that older
distributions based on Upstart cannot boot inside containers using ptrace,
since Upstart itself uses ptrace to monitor services while starting.
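
In contrast to the ptrace-based approaches above, a filter using the new return code might look roughly like the sketch below. It is an illustration, not code from this patch. The helper name build_notify_filter and the NOTIFY_FILTER_LEN macro are hypothetical. SECCOMP_RET_USER_NOTIF is assumed to be the name of the new return code, and a fallback definition is given for older headers. The architecture check that a production filter needs is omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/syscall.h>

/* Assumed name and value of the new return code; older headers lack it. */
#ifndef SECCOMP_RET_USER_NOTIF
#define SECCOMP_RET_USER_NOTIF 0x7fc00000U
#endif

/* Number of instructions in the filter built below (hypothetical macro). */
#define NOTIFY_FILTER_LEN 4

/*
 * Hypothetical helper: fill `out` with a classic BPF program that hands
 * one syscall number (`nr`) to a userspace listener and allows all others.
 * Returns the instruction count, or 0 if `cap` is too small.
 * NOTE: a real filter should also verify seccomp_data.arch first.
 */
static size_t build_notify_filter(struct sock_filter *out, size_t cap,
                                  uint32_t nr)
{
    const struct sock_filter prog[NOTIFY_FILTER_LEN] = {
        /* Load the syscall number from struct seccomp_data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
        /* On a match run the next instruction, otherwise skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, nr, 0, 1),
        /* Matched: notify the userspace listener. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_USER_NOTIF),
        /* Not matched: let the syscall proceed. */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };

    if (cap < NOTIFY_FILTER_LEN)
        return 0;
    memcpy(out, prog, sizeof(prog));
    return NOTIFY_FILTER_LEN;
}
```

A container runtime would presumably build such a filter for init_module (SYS_init_module) or mount and install it with seccomp(2). Because the notification does not go through ptrace, the single tracer slot stays free for debuggers and for Upstart.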

The actual implementation of this is fairly small, although getting the
synchronization right was/is slightly complex.

Finally, it's worth noting that the classic seccomp TOCTOU of reading
memory data from the task still applies here, but can be avoided with
careful design of the userspace handler: if the userspace handler reads all
of the task memory that is necessary before applying its security policy,
the tracee's subsequent memory edits will not be read by the tracer.
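
The "read everything first, then decide" pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the handler from this patch. Task memory is simulated by an ordinary buffer, whereas a real handler would have to read it from the target task. The function names, the PATH_SNAPSHOT_MAX limit, and the "/dev/allowed0" policy are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on the argument we are willing to copy. */
#define PATH_SNAPSHOT_MAX 256

/* Hypothetical policy: only this one device may be mounted. */
static bool policy_allows(const char *path)
{
    return strcmp(path, "/dev/allowed0") == 0;
}

/*
 * Copy the argument out of (simulated) task memory exactly once, then
 * validate the copy rather than the source. Returns false if the copy
 * holds no NUL terminator within the bytes read.
 */
static bool snapshot_path(const char *task_mem, size_t task_len,
                          char *out, size_t out_len)
{
    size_t n;

    if (out_len == 0)
        return false;
    n = task_len < out_len - 1 ? task_len : out_len - 1;
    memcpy(out, task_mem, n);   /* the single read of task memory */
    out[n] = '\0';
    return memchr(out, '\0', n) != NULL;
}

/*
 * Apply the policy to the snapshot only. The caller must also perform
 * the privileged action on `snap`, never on a re-read of task memory.
 */
static bool handle_mount_request(const char *task_mem, size_t task_len,
                                 char *snap, size_t snap_len)
{
    if (!snapshot_path(task_mem, task_len, snap, snap_len))
        return false;
    return policy_allows(snap);
}
```

Because both the policy check and the later action use the same private copy, anything the task writes to its own memory after the copy has no effect on the decision.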

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
CC: Kees Cook <keescook@chromium.org>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: "Serge E. Hallyn" <serge@hallyn.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
CC: Christian Brauner <christian@brauner.io>
CC: Tyler Hicks <tyhicks@canonical.com>
CC: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
2018-12-11 16:28:41 -08:00
ABI LED fixes for 4.20-rc2 2018-11-08 17:49:04 -06:00
EDID
PCI pci-v4.20-changes 2018-10-25 06:50:48 -07:00
RCU This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
accelerators
accounting psi: cgroup support 2018-10-26 16:26:32 -07:00
acpi
admin-guide Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-11-03 18:25:17 -07:00
aoe
arm ARM: SoC device tree updates for 4.20 2018-10-29 15:05:20 -07:00
arm64 Documentation/arm64: HugeTLB page implementation 2018-10-10 18:08:36 +01:00
auxdisplay
backlight
block
blockdev This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
bpf
bus-devices
cdrom
cgroup-v1 Merge branch 'for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2018-10-25 17:15:46 -07:00
cma
connector
console
core-api docs/boot-time-mm: remove bootmem documentation 2018-10-31 08:54:16 -07:00
cpu-freq
cpuidle
crypto KEYS: Implement PKCS#8 RSA Private Key parser [ver #2] 2018-10-26 09:30:46 +01:00
dev-tools
device-mapper This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
devicetree Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2018-11-10 06:57:34 -06:00
doc-guide
driver-api Char/Misc driver patches for 4.20-rc1 2018-10-26 09:11:43 -07:00
driver-model
early-userspace
extcon
fault-injection
fb This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
features
filesystems This pull request contains updates for UBIFS: 2018-11-04 14:46:04 -08:00
firmware_class
fmc
fpga
gpio
gpu
hid
hwmon hwmon: (ina3221) Read channel input source info from DT 2018-10-10 20:37:13 -07:00
i2c i2c: add i2c bus driver for NVIDIA GPU 2018-11-09 17:46:43 +01:00
ia64
ide
iio
infiniband
input
ioctl seccomp: add a return code to trap to userspace 2018-12-11 16:28:41 -08:00
isdn
kbuild kbuild: remove unused cc-fullversion variable 2018-11-02 00:15:26 +09:00
kdump
kernel-hacking
laptops platform-drivers-x86 for v4.20-1 2018-11-01 08:42:21 -07:00
leds
lightnvm
livepatch
locking This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
m68k
maintainer
md
media media updates for v4.20-rc1 2018-10-31 10:53:29 -07:00
memory-devices
mic
mips
misc-devices
mmc
mtd
namespaces
netlabel
networking Kbuild updates for v4.20 (2nd) 2018-11-03 10:47:33 -07:00
nfc
nios2
nvdimm
nvmem
openrisc
parisc
pcmcia
perf
phy
platform
power This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
powerpc
pps
process The Compiler Attributes series 2018-11-01 18:34:46 -07:00
pti
ptp
rapidio
riscv
s390 KVM updates for v4.20 2018-10-25 17:57:35 -07:00
scheduler This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
scsi SCSI misc on 20181024 2018-10-25 07:40:30 -07:00
security Merge branch 'next-keys2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2018-11-01 15:23:59 -07:00
serial TTY/Serial patches for 4.20-rc1 2018-10-29 10:42:20 -07:00
sh
sound ALSA: doc: Brush up the old writing-an-alsa-driver 2018-10-18 10:30:01 +02:00
sparc
sphinx
sphinx-static
spi
sysctl New gcc plugin: stackleak 2018-11-01 11:46:27 -07:00
target
thermal
timers
trace The biggest change here is the updates to kprobes 2018-10-30 09:49:56 -07:00
translations
usb
userspace-api seccomp: add a return code to trap to userspace 2018-12-11 16:28:41 -08:00
virtual KVM updates for v4.20 2018-10-25 17:57:35 -07:00
vm slub: extend slub debug to handle multiple slabs 2018-10-26 16:25:19 -07:00
w1
watchdog documentation: watchdog: add documentation for armada-37xx-wdt 2018-10-13 15:19:40 +02:00
wimax
x86 x86/mm: Move LDT remap out of KASLR region on 5-level paging 2018-11-06 21:35:11 +01:00
xilinx Documentation: xilinx: Add documentation for eemi APIs 2018-10-09 13:26:05 +02:00
xtensa
.gitignore
Changes
CodingStyle
DMA-API-HOWTO.txt
DMA-API.txt
DMA-ISA-LPC.txt
DMA-attributes.txt
IPMI.txt
IRQ-affinity.txt
IRQ-domain.txt
IRQ.txt
Intel-IOMMU.txt
Makefile
SAK.txt
SM501.txt
SubmittingPatches
atomic_bitops.txt
atomic_t.txt
bt8xxgpio.txt
btmrvl.txt
bus-virt-phys-mapping.txt
clearing-warn-once.txt
conf.py This is a fairly typical cycle for documentation. There's some welcome 2018-10-24 18:01:11 +01:00
cpu-load.txt
cputopology.txt
crc32.txt
dcdbas.txt
debugging-modules.txt
debugging-via-ohci1394.txt
dell_rbu.txt
digsig.txt
docutils.conf
dontdiff
efi-stub.txt
eisa.txt
flexible-arrays.txt
futex-requeue-pi.txt
gcc-plugins.txt
highuid.txt
hw_random.txt
hwspinlock.txt
index.rst
intel_txt.txt
io-mapping.txt
io_ordering.txt
iostats.txt
irqflags-tracing.txt
isa.txt
isapnp.txt
kernel-per-CPU-kthreads.txt
kobject.txt
kprobes.txt
kref.txt
ldm.txt
lockup-watchdogs.txt
logo.gif
logo.txt
lsm.txt
lzo.txt
mailbox.txt
memory-barriers.txt
men-chameleon-bus.txt
nommu-mmap.txt
ntb.txt
numastat.txt
padata.txt
parport-lowlevel.txt
percpu-rw-semaphore.txt
phy.txt
pi-futex.txt
pnp.txt
preempt-locking.txt Documentation: preempt-locking: Use better example 2018-10-12 11:35:47 -06:00
pwm.txt
rbtree.txt
remoteproc.txt
rfkill.txt
robust-futex-ABI.txt
robust-futexes.txt
rpmsg.txt
rtc.txt
sgi-ioc4.txt
siphash.txt
smsc_ece1099.txt
speculation.txt
static-keys.txt
svga.txt
switchtec.txt NTB: switchtec_ntb: Update switchtec documentation with prerequisites for NTB 2018-10-11 11:28:53 -05:00
sync_file.txt
tee.txt
this_cpu_ops.txt
unaligned-memory-access.txt
vfio-mediated-device.txt
vfio.txt
video-output.txt
xillybus.txt
xz.txt
zorro.txt