WSL2-Linux-Kernel/kernel
Daniel Borkmann 60a3b2253c net: bpf: make eBPF interpreter images read-only
With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate.  This patch moves the to be interpreted bytecode to
read-only pages.

In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bca ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").

This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).

Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.

Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables.  Brad Spengler proposed the same idea
via Twitter during development of this patch.

Joint work with Hannes Frederic Sowa.

Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-05 12:02:48 -07:00
..
bpf net: bpf: make eBPF interpreter images read-only 2014-09-05 12:02:48 -07:00
debug
events mm: memcontrol: rewrite charge API 2014-08-08 15:57:17 -07:00
gcov kernel/gcov/fs.c: remove unnecessary null test before debugfs_remove 2014-08-08 15:57:24 -07:00
irq Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-05 17:38:45 -07:00
locking arch, locking: Ciao arch_mutex_cpu_relax() 2014-07-17 12:32:47 +02:00
power Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle' 2014-08-11 23:19:48 +02:00
printk printk: Add function to return log buffer address and size 2014-08-13 15:13:44 +10:00
rcu Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-04 15:55:08 -07:00
sched Merge branches 'pm-sleep', 'pm-cpufreq' and 'pm-cpuidle' 2014-08-11 23:19:48 +02:00
time timekeeping: Another fix to the VSYSCALL_OLD update_vsyscall 2014-08-14 11:04:11 -06:00
trace This contains a fix for two long standing bugs. Both of which are 2014-08-09 17:29:36 -07:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER 2014-07-16 14:57:13 +02:00
Kconfig.preempt
Makefile bin2c: move bin2c in scripts/basic 2014-08-08 15:57:32 -07:00
acct.c kernel/acct.c: fix coding style warnings and errors 2014-08-08 15:57:27 -07:00
async.c
audit.c CAPABILITIES: remove undefined caps from all processes 2014-07-24 21:53:47 +10:00
audit.h
audit_tree.c
audit_watch.c
auditfilter.c kernel/auditfilter.c: replace count*size kmalloc by kcalloc 2014-08-06 18:01:12 -07:00
auditsc.c
backtracetest.c
bounds.c page-cgroup: get rid of NR_PCG_FLAGS 2014-08-08 15:57:18 -07:00
capability.c CAPABILITIES: remove undefined caps from all processes 2014-07-24 21:53:47 +10:00
cgroup.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-08-04 10:11:28 -07:00
cgroup_freezer.c
compat.c
configs.c
context_tracking.c
cpu.c
cpu_pm.c
cpuset.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-08-04 10:11:28 -07:00
crash_dump.c crash_dump: Make is_kdump_kernel() accessible from modules 2014-08-25 15:42:19 -07:00
cred.c
delayacct.c delayacct: Remove braindamaged type conversions 2014-07-23 10:18:06 -07:00
dma.c
elfcore.c
exec_domain.c
exit.c kernel/exit.c: fix coding style warnings and errors 2014-08-08 15:57:22 -07:00
extable.c
fork.c seccomp: Replace BUG(!spin_is_locked()) with assert_spin_lock 2014-08-11 13:29:12 -07:00
freezer.c
futex.c
futex_compat.c
groups.c
hung_task.c
irq_work.c
jump_label.c
kallsyms.c kernel/kallsyms.c: fix %pB when there's no symbol at the address 2014-08-08 15:57:18 -07:00
kcmp.c
kexec.c kexec: verify the signature of signed PE bzImage 2014-08-08 15:57:33 -07:00
kmod.c
kprobes.c kprobes: Fix "Failed to find blacklist" probing errors on ia64 and ppc64 2014-07-18 06:23:40 +02:00
ksysfs.c
kthread.c kthread_work: wake up worker only when the worker is idle 2014-07-28 14:07:52 -04:00
latencytop.c
module-internal.h
module.c module: Clean up ro/nx after early module load failures 2014-08-16 04:47:00 +09:30
module_signing.c
notifier.c
nsproxy.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
padata.c
panic.c panic: add TAINT_SOFTLOCKUP 2014-08-08 15:57:24 -07:00
params.c Add module param type 'ullong' 2014-07-17 22:07:37 +02:00
pid.c
pid_namespace.c
profile.c
ptrace.c sched: Remove proliferation of wait_on_bit() action functions 2014-07-16 15:10:39 +02:00
range.c
reboot.c
relay.c
res_counter.c
resource.c resource: provide new functions to walk through resources 2014-08-08 15:57:32 -07:00
seccomp.c net: bpf: make eBPF interpreter images read-only 2014-09-05 12:02:48 -07:00
signal.c Merge branch 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc 2014-08-09 09:58:12 -07:00
smp.c kernel/smp.c:on_each_cpu_cond(): fix warning in fallback path 2014-08-06 18:01:22 -07:00
smpboot.c
smpboot.h
softirq.c
stacktrace.c
stop_machine.c
sys.c sched: move no_new_privs into new atomic flags 2014-07-18 12:13:38 -07:00
sys_ni.c kexec: new syscall kexec_file_load() declaration 2014-08-08 15:57:32 -07:00
sysctl.c mm, hugetlb: remove hugetlb_zero and hugetlb_infinity 2014-08-06 18:01:19 -07:00
sysctl_binary.c
system_certificates.S
system_keyring.c KEYS: validate certificate trust only with builtin keys 2014-07-17 09:35:17 -04:00
task_work.c
taskstats.c
test_kprobes.c kernel/test_kprobes.c: use current logging functions 2014-08-08 15:57:18 -07:00
torture.c
tracepoint.c
tsacct.c sched: Make task->start_time nanoseconds based 2014-07-23 10:18:05 -07:00
uid16.c
up.c
user-return-notifier.c
user.c
user_namespace.c proc: constify seq_operations 2014-08-08 15:57:22 -07:00
utsname.c namespaces: Use task_lock and not rcu to protect nsproxy 2014-07-29 18:08:50 -07:00
utsname_sysctl.c
watchdog.c panic: add TAINT_SOFTLOCKUP 2014-08-08 15:57:24 -07:00
workqueue.c Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2014-08-04 10:09:27 -07:00
workqueue_internal.h