WSL2-Linux-Kernel/kernel
Rik van Riel 5eed6f1dff fork,memcg: fix crash in free_thread_stack on memcg charge fail
Commit 9b6f7e163c ("mm: rework memcg kernel stack accounting") will
result in fork failing if allocating a kernel stack for a task in
dup_task_struct exceeds the kernel memory allowance for that cgroup.

Unfortunately, it also results in a crash.

This is due to the code jumping to free_stack and calling
free_thread_stack when the memcg kernel stack charge fails, but without
tsk->stack pointing at the freshly allocated stack.

This in turn results in the vfree_atomic in free_thread_stack oopsing
with a backtrace like this:

#5 [ffffc900244efc88] die at ffffffff8101f0ab
 #6 [ffffc900244efcb8] do_general_protection at ffffffff8101cb86
 #7 [ffffc900244efce0] general_protection at ffffffff818ff082
    [exception RIP: llist_add_batch+7]
    RIP: ffffffff8150d487  RSP: ffffc900244efd98  RFLAGS: 00010282
    RAX: 0000000000000000  RBX: ffff88085ef55980  RCX: 0000000000000000
    RDX: ffff88085ef55980  RSI: 343834343531203a  RDI: 343834343531203a
    RBP: ffffc900244efd98   R8: 0000000000000001   R9: ffff8808578c3600
    R10: 0000000000000000  R11: 0000000000000001  R12: ffff88029f6c21c0
    R13: 0000000000000286  R14: ffff880147759b00  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffc900244efda0] vfree_atomic at ffffffff811df2c7
 #9 [ffffc900244efdb8] copy_process at ffffffff81086e37
#10 [ffffc900244efe98] _do_fork at ffffffff810884e0
#11 [ffffc900244eff10] sys_vfork at ffffffff810887ff
#12 [ffffc900244eff20] do_syscall_64 at ffffffff81002a43
    RIP: 000000000049b948  RSP: 00007ffcdb307830  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: 0000000000896030  RCX: 000000000049b948
    RDX: 0000000000000000  RSI: 00007ffcdb307790  RDI: 00000000005d7421
    RBP: 000000000067370f   R8: 00007ffcdb3077b0   R9: 000000000001ed00
    R10: 0000000000000008  R11: 0000000000000246  R12: 0000000000000040
    R13: 000000000000000f  R14: 0000000000000000  R15: 000000000088d018
    ORIG_RAX: 000000000000003a  CS: 0033  SS: 002b

The simplest fix is to assign tsk->stack right where it is allocated.

Link: http://lkml.kernel.org/r/20181214231726.7ee4843c@imladris.surriel.com
Fixes: 9b6f7e163c ("mm: rework memcg kernel stack accounting")
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-21 14:51:18 -08:00
..
bpf bpf: verifier: make sure callees don't prune with caller differences 2018-12-13 10:35:40 -08:00
cgroup for-linus-20181102 2018-11-02 11:25:48 -07:00
configs kvm_config: add CONFIG_VIRTIO_MENU 2018-10-24 20:55:56 -04:00
debug kdb: kdb_support: mark expected switch fall-throughs 2018-11-13 20:38:50 +00:00
dma dma-direct: do not include SME mask in the DMA supported check 2018-12-17 18:02:11 +01:00
events This is a single commit that fixes a bug in uprobes SDT code 2018-12-06 10:35:19 -08:00
gcov gcov: remove CONFIG_GCOV_FORMAT_AUTODETECT 2018-06-08 18:56:02 +09:00
irq irq/matrix: Fix memory overallocation 2018-11-01 10:00:38 +01:00
livepatch Merge branch 'for-4.19/upstream' into for-linus 2018-08-20 18:33:50 +02:00
locking mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
power memblock: stop using implicit alignment to SMP_CACHE_BYTES 2018-10-31 08:54:16 -07:00
printk mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
rcu Merge branches 'doc.2018.08.30a', 'dynticks.2018.08.30b', 'srcu.2018.08.30b' and 'torture.2018.08.29a' into HEAD 2018-08-30 16:12:53 -07:00
sched Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-12-01 12:35:48 -08:00
time posix-timers: Fix division by zero bug 2018-12-17 17:35:45 +01:00
trace While running various ftrace tests on new development code, the kmemleak 2018-12-12 18:15:29 -08:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt kconfig: include kernel/Kconfig.preempt from init/Kconfig 2018-08-02 08:06:54 +09:00
Makefile x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls 2018-09-04 10:35:47 -07:00
acct.c kernel/acct.c: fix the acct->needcheck check in check_free_space() 2018-01-04 16:45:09 -08:00
async.c kernel/async.c: revert "async: simplify lowest_in_progress()" 2018-02-06 18:32:44 -08:00
audit.c audit: use ktime_get_coarse_real_ts64() for timestamps 2018-07-17 14:45:08 -04:00
audit.h audit: track the owner of the command mutex ourselves 2018-02-23 11:22:22 -05:00
audit_fsnotify.c fsnotify: add fsnotify_add_inode_mark() wrappers 2018-05-18 14:58:22 +02:00
audit_tree.c \n 2018-08-17 09:41:28 -07:00
audit_watch.c audit: fix use-after-free in audit_add_watch 2018-07-18 11:43:36 -04:00
auditfilter.c audit: rename FILTER_TYPE to FILTER_EXCLUDE 2018-06-19 10:39:54 -04:00
auditsc.c audit/stable-4.18 PR 20180814 2018-08-15 10:46:54 -07:00
backtracetest.c
bounds.c kbuild: fix kernel/bounds.c 'W=1' warning 2018-10-31 08:54:14 -07:00
capability.c
compat.c y2038: globally rename compat_time to old_time32 2018-08-27 14:48:48 +02:00
configs.c
context_tracking.c
cpu.c x86/speculation: Rework SMT state change 2018-11-28 11:57:07 +01:00
cpu_pm.c
crash_core.c kernel/crash_core.c: print timestamp using time64_t 2018-08-22 10:52:47 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: track delays from thrashing cache pages 2018-10-26 16:26:32 -07:00
dma.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
elfcore.c
exec_domain.c proc: introduce proc_create_single{,_data} 2018-05-16 07:23:35 +02:00
exit.c signal: Pass pid type into group_send_sig_info 2018-07-21 12:57:35 -05:00
extable.c extable: Make init_kernel_text() global 2018-02-21 16:54:06 +01:00
fail_function.c kernel/fail_function.c: remove meaningless null pointer check before debugfs_remove_recursive 2018-10-31 08:54:12 -07:00
fork.c fork,memcg: fix crash in free_thread_stack on memcg charge fail 2018-12-21 14:51:18 -08:00
freezer.c PM / reboot: Eliminate race between reboot and suspend 2018-08-06 12:35:20 +02:00
futex.c futex: Cure exit race 2018-12-18 23:13:15 +01:00
futex_compat.c y2038: globally rename compat_time to old_time32 2018-08-27 14:48:48 +02:00
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2017-12-14 16:00:49 -08:00
hung_task.c kernel: hung_task.c: disable on suspend 2018-10-25 18:45:08 +02:00
iomem.c memremap: split devm_memremap_pages() and memremap() infrastructure 2018-05-15 23:08:33 -07:00
irq_work.c irq/work: Improve the flag definitions 2018-01-08 19:43:15 +01:00
jump_label.c Merge branch 'x86/build' into locking/core, to pick up dependent patches and unify jump-label work 2018-10-16 17:30:11 +02:00
kallsyms.c kallsyms: reduce size a little on 64-bit 2018-09-10 22:54:33 +09:00
kcmp.c
kcov.c kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace 2018-11-30 14:56:14 -08:00
kexec.c kexec: add call to LSM hook in original kexec_load syscall 2018-07-16 12:31:57 -07:00
kexec_core.c kexec: Allocate decrypted control pages for kdump if SME is enabled 2018-10-06 12:01:51 +02:00
kexec_file.c kernel/kexec_file.c: remove some duplicated includes 2018-11-03 10:09:37 -07:00
kexec_internal.h
kmod.c
kprobes.c kprobes: Don't call BUG_ON() if there is a kprobe in use on free list 2018-09-12 08:01:16 +02:00
ksysfs.c
kthread.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
latencytop.c
memremap.c Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
module-internal.h modsign: log module name in the event of an error 2018-07-02 11:36:17 +02:00
module.c jump_table: Move entries into ro_after_init region 2018-09-27 17:56:49 +02:00
module_signing.c modsign: log module name in the event of an error 2018-07-02 11:36:17 +02:00
notifier.c
nsproxy.c
padata.c padata: add SPDX identifier 2018-01-05 18:43:00 +11:00
panic.c kernel/panic.c: filter out a potential trailing newline 2018-10-31 08:54:14 -07:00
params.c kernel/params.c: downgrade warning for unsafe parameters 2018-04-11 10:28:37 -07:00
pid.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
pid_namespace.c signal: Use group_send_sig_info to kill all processes in a pid namespace 2018-09-16 16:08:25 +02:00
profile.c mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
ptrace.c ptrace: Remove unused ptrace_may_access_sched() and MODE_IBRS 2018-11-28 11:57:11 +01:00
range.c
reboot.c kernel/reboot.c: export pm_power_off_prepare 2018-09-11 16:13:24 +01:00
relay.c kernel/relay.c: change return type to vm_fault_t 2018-06-15 07:55:24 +09:00
resource.c resource/docs: Complete kernel-doc style function documentation 2018-11-07 16:47:47 +01:00
rseq.c rseq: uapi: Declare rseq_cs field as union, update includes 2018-07-10 22:18:52 +02:00
seccomp.c Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2018-10-24 11:49:35 +01:00
signal.c kernel/signal.c: fix a comment error 2018-10-31 08:54:14 -07:00
smp.c smp,cpumask: introduce on_each_cpu_cond_mask 2018-10-09 16:51:11 +02:00
smpboot.c smpboot: Remove cpumask from the API 2018-07-03 09:20:44 +02:00
smpboot.h
softirq.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-25 11:43:47 -07:00
stackleak.c stackleak: Mark stackleak_track_stack() as notrace 2018-12-05 19:31:44 -08:00
stacktrace.c
stop_machine.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-08-13 11:25:07 -07:00
sys.c kernel/sys.c: remove duplicated include 2018-09-20 22:01:11 +02:00
sys_ni.c Merge branch 'core-rseq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-10 10:17:09 -07:00
sysctl.c kernel/sysctl.c: remove duplicated include 2018-11-03 10:09:37 -07:00
sysctl_binary.c staging: irda: remove remaining remants of irda code removal 2018-04-16 11:26:49 +02:00
task_work.c locking/barriers: Convert users of lockless_dereference() to READ_ONCE() 2017-12-17 13:57:15 +01:00
taskstats.c pids: introduce find_get_task_by_vpid() helper 2018-02-06 18:32:46 -08:00
test_kprobes.c kprobes: Remove jprobe API implementation 2018-06-21 12:33:05 +02:00
torture.c rcutorture: Check GP completion at stutter end 2018-08-29 09:20:48 -07:00
tracepoint.c tracepoint: Fix tracepoint array element size mismatch 2018-10-17 15:35:29 -04:00
tsacct.c
ucount.c headers: untangle kmemleak.h from mm.h 2018-04-05 21:36:27 -07:00
uid16.c fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers 2018-04-02 20:15:59 +02:00
uid16.h kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c 2018-04-02 20:15:30 +02:00
umh.c umh: Add command line to user mode helpers 2018-10-22 19:37:36 -07:00
up.c smp,cpumask: introduce on_each_cpu_cond_mask 2018-10-09 16:51:11 +02:00
user-return-notifier.c
user.c userns: use irqsave variant of refcount_dec_and_lock() 2018-08-22 10:52:47 -07:00
user_namespace.c userns: also map extents in the reverse map to kernel IDs 2018-11-07 23:51:16 -06:00
utsname.c uts: create "struct uts_namespace" from kmem_cache 2018-04-11 10:28:35 -07:00
utsname_sysctl.c sys: don't hold uts_sem while accessing userspace memory 2018-08-11 02:05:53 -05:00
watchdog.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
watchdog_hld.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
workqueue.c watchdog: Mark watchdog touch functions as notrace 2018-08-30 12:56:40 +02:00
workqueue_internal.h workqueue: Set worker->desc to workqueue name by default 2018-05-18 08:47:13 -07:00