WSL2-Linux-Kernel/kernel
Oleg Nesterov abd50b39e7 wait: introduce EXIT_TRACE to avoid the racy EXIT_DEAD->EXIT_ZOMBIE transition
wait_task_zombie() first does EXIT_ZOMBIE->EXIT_DEAD transition and
drops tasklist_lock.  If this task is not the natural child and it is
traced, we change its state back to EXIT_ZOMBIE for ->real_parent.

The last transition is racy, this is even documented in 50b8d25748
"ptrace: partially fix the do_wait(WEXITED) vs EXIT_DEAD->EXIT_ZOMBIE
race".  wait_consider_task() tries to detect this transition and clear
->notask_error but we can't rely on ptrace_reparented(), debugger can
exit and do ptrace_unlink() before its sub-thread sets EXIT_ZOMBIE.

And there is another problem which were missed before: this transition
can also race with reparent_leader() which doesn't reset >exit_signal if
EXIT_DEAD, assuming that this task must be reaped by someone else.  So
the tracee can be re-parented with ->exit_signal != SIGCHLD, and if
/sbin/init doesn't use __WALL it becomes unreapable.  This was fixed by
the previous commit, but it was the temporary hack.

1. Add the new exit_state, EXIT_TRACE. It means that the task is the
   traced zombie, debugger is going to detach and notify its natural
   parent.

   This new state is actually EXIT_ZOMBIE | EXIT_DEAD. This way we
   can avoid the changes in proc/kgdb code, get_task_state() still
   reports "X (dead)" in this case.

   Note: with or without this change userspace can see Z -> X -> Z
   transition. Not really bad, but probably makes sense to fix.

2. Change wait_task_zombie() to use EXIT_TRACE instead of EXIT_DEAD
   if we need to notify the ->real_parent.

3. Revert the previous hack in reparent_leader(), now that EXIT_DEAD
   is always the final state we can safely ignore such a task.

4. Change wait_consider_task() to check EXIT_TRACE separately and kill
   the racy and no longer needed ptrace_reparented() case.

   If ptrace == T an EXIT_TRACE thread should be simply ignored, the
   owner of this state is going to ptrace_unlink() this task. We can
   pretend that it was already removed from ->ptraced list.

   Otherwise we should skip this thread too but clear ->notask_error,
   we must be the natural parent and debugger is going to untrace and
   notify us. IOW, this doesn't differ from "EXIT_ZOMBIE && p->ptrace"
   even if the task was already untraced.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Jan Kratochvil <jan.kratochvil@redhat.com>
Reported-by: Michal Schmidt <mschmidt@redhat.com>
Tested-by: Michal Schmidt <mschmidt@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Lennart Poettering <lpoetter@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:36:05 -07:00
..
debug mm: per-thread vma caching 2014-04-07 16:35:53 -07:00
events Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2014-04-05 13:20:43 -07:00
gcov gcov: reuse kbasename helper 2013-11-13 12:09:34 +09:00
irq genirq: Export symbol no_action() 2014-03-22 11:33:09 +01:00
locking Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-03-31 14:13:25 -07:00
power Merge branches 'pm-runtime' and 'pm-sleep' 2014-03-20 13:25:54 +01:00
printk printk: fix one circular lockdep warning about console_lock 2014-04-03 16:21:08 -07:00
rcu Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-03-31 11:21:19 -07:00
sched Merge branch 'akpm' (incoming from Andrew) 2014-04-03 16:22:16 -07:00
time Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-04-01 11:00:07 -07:00
trace Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2014-04-05 13:20:43 -07:00
.gitignore Ignore generated file kernel/x509_certificate_list 2013-12-10 18:21:34 +00:00
Kconfig.freezer
Kconfig.hz kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS 2013-11-15 09:32:22 +09:00
Kconfig.locks
Kconfig.preempt
Makefile Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-03-31 14:13:25 -07:00
acct.c
async.c
audit.c AUDIT: Allow login in non-init namespaces 2014-03-30 17:02:53 -07:00
audit.h audit: Use struct net not pid_t to remember the network namespce to reply in 2014-02-28 04:04:33 -08:00
audit_tree.c inotify: Fix reporting of cookies for inotify events 2014-02-18 11:17:17 +01:00
audit_watch.c inotify: Fix reporting of cookies for inotify events 2014-02-18 11:17:17 +01:00
auditfilter.c audit: Update kdoc for audit_send_reply and audit_list_rules_send 2014-03-08 15:31:54 -08:00
auditsc.c execve: use 'struct filename *' for executable name passing 2014-02-05 12:54:53 -08:00
backtracetest.c
bounds.c mm: do not allocate page->ptl dynamically, if spinlock_t fits to long 2013-12-20 12:25:45 -08:00
capability.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2014-04-03 09:26:18 -07:00
cgroup.c Merge branch 'for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-04-03 13:05:42 -07:00
cgroup_freezer.c cgroup: drop const from @buffer of cftype->write_string() 2014-03-19 10:23:54 -04:00
compat.c Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-04-02 12:51:41 -07:00
configs.c
context_tracking.c context_tracking: Wrap static key check into more intuitive function name 2013-12-02 20:43:14 +01:00
cpu.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-11-14 16:55:11 +09:00
cpu_pm.c
cpuset.c Merge branch 'akpm' (incoming from Andrew) 2014-04-03 16:22:16 -07:00
crash_dump.c
cred.c
delayacct.c kernel/delayacct.c: remove redundant checking in __delayacct_add_tsk() 2013-11-13 12:09:12 +09:00
dma.c
elfcore.c switch elf_core_write_extra_phdrs() to dump_emit() 2013-11-09 00:16:23 -05:00
exec_domain.c
exit.c wait: introduce EXIT_TRACE to avoid the racy EXIT_DEAD->EXIT_ZOMBIE transition 2014-04-07 16:36:05 -07:00
extable.c asmlinkage: Make main_extable_sort_needed visible 2014-02-13 18:13:22 -08:00
fork.c mm, mempolicy: remove per-process flag 2014-04-07 16:35:54 -07:00
freezer.c libata, freezer: avoid block device removal while system is frozen 2013-12-19 13:50:32 -05:00
futex.c Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-03-31 10:59:39 -07:00
futex_compat.c compat: Get rid of (get|put)_compat_time(val|spec) 2014-02-02 14:09:12 -08:00
groups.c kernel/groups.c: remove return value of set_groups 2014-04-03 16:21:05 -07:00
hrtimer.c timer: Remove code redundancy while calling get_nohz_timer_target() 2014-03-20 12:35:46 +01:00
hung_task.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
irq_work.c perf/x86: Warn to early_printk() in case irq_work is too slow 2014-02-21 21:49:07 +01:00
itimer.c
jump_label.c static_key: WARN on usage before jump_label_init was called 2013-10-19 19:45:35 -04:00
kallsyms.c
kcmp.c
kexec.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
kmod.c execve: use 'struct filename *' for executable name passing 2014-02-05 12:54:53 -08:00
kprobes.c kprobes: use KSYM_NAME_LEN to size identifier buffers 2013-11-13 12:09:26 +09:00
ksysfs.c rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c 2014-02-26 06:35:16 -08:00
kthread.c kthread: ensure locality of task_struct allocations 2014-04-03 16:20:49 -07:00
latencytop.c
module-internal.h KEYS: Separate the kernel signature checking keyring from module signing 2013-09-25 17:17:01 +01:00
module.c Nothing major: the stricter permissions checking for sysfs broke 2014-04-06 09:38:07 -07:00
module_signing.c keys: change asymmetric keys to use common hash definitions 2013-10-25 17:15:18 -04:00
notifier.c notifier: Substitute rcu_access_pointer() for rcu_dereference_raw() 2014-02-26 06:35:13 -08:00
nsproxy.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2013-09-07 14:35:32 -07:00
padata.c padata: Fix wrong usage of rcu_dereference() 2013-12-05 21:28:42 +08:00
panic.c Nothing major: the stricter permissions checking for sysfs broke 2014-04-06 09:38:07 -07:00
params.c params: improve standard definitions 2013-12-04 14:09:46 +10:30
pid.c pidns: fix free_pid() to handle the first fork failure 2013-09-30 14:31:03 -07:00
pid_namespace.c pid_namespace: pidns_get() should check task_active_pid_ns() != NULL 2014-04-02 16:20:21 -07:00
posix-cpu-timers.c posix-timers: Convert abuses of BUG_ON to WARN_ON 2013-12-09 16:56:29 +01:00
posix-timers.c
profile.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
ptrace.c kernel/compat: convert to COMPAT_SYSCALL_DEFINE 2014-03-06 15:35:10 +01:00
range.c range: Do not add new blank slot with add_range_with_merge 2013-06-18 11:32:10 -05:00
reboot.c kexec: migrate to reboot cpu 2013-12-18 19:04:50 -08:00
relay.c treewide: Fix typo in Documentation/DocBook 2014-02-19 14:58:17 +01:00
res_counter.c res_counter: remove interface for locked charging and uncharging 2014-04-07 16:35:54 -07:00
resource.c kernel/resource.c: make reallocate_resource() static 2014-04-03 16:21:07 -07:00
seccomp.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2014-04-03 09:26:18 -07:00
signal.c Merge branch 'master' into for-next 2014-02-20 14:54:28 +01:00
smp.c smp: Rename __smp_call_function_single() to smp_call_function_single_async() 2014-02-24 14:47:15 -08:00
smpboot.c kernel: delete __cpuinit usage from all core kernel files 2013-07-14 19:36:59 -04:00
smpboot.h
softirq.c softirq: Add linux/irq.h to make it compile again 2014-03-19 11:28:14 +01:00
stacktrace.c
stop_machine.c stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus() 2014-03-11 11:33:47 +01:00
sys.c mm, thp: add VM_INIT_DEF_MASK and PRCTL_THP_DISABLE 2014-04-07 16:35:52 -07:00
sys_ni.c fs, kernel: permit disabling the uselib syscall 2014-04-03 16:21:05 -07:00
sysctl.c drop_caches: add some documentation and info message 2014-04-03 16:21:04 -07:00
sysctl_binary.c kernel/sysctl_binary.c: use scnprintf() instead of snprintf() 2013-11-13 12:09:33 +09:00
system_certificates.S KEYS: correct alignment of system_certificate_list content in assembly file 2013-12-10 18:25:28 +00:00
system_keyring.c KEYS: correct alignment of system_certificate_list content in assembly file 2013-12-10 18:25:28 +00:00
task_work.c task_work: documentation 2013-09-11 15:58:27 -07:00
taskstats.c genetlink: only pass array to genl_register_family_with_ops() 2013-11-19 16:39:05 -05:00
test_kprobes.c
time.c sched: Rename sched.c as sched/core.c in comments and Documentation 2013-06-19 12:58:42 +02:00
timeconst.bc
timer.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-04-01 11:00:07 -07:00
torture.c rcutorture: Gracefully handle NULL cleanup hooks 2014-02-23 09:04:39 -08:00
tracepoint.c Nothing major: the stricter permissions checking for sysfs broke 2014-04-06 09:38:07 -07:00
tsacct.c
uid16.c userns: Kill nsown_capable it makes the wrong thing easy 2013-08-30 23:44:11 -07:00
up.c smp: Rename __smp_call_function_single() to smp_call_function_single_async() 2014-02-24 14:47:15 -08:00
user-return-notifier.c
user.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
user_namespace.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
utsname.c userns: Kill nsown_capable it makes the wrong thing easy 2013-08-30 23:44:11 -07:00
utsname_sysctl.c
watchdog.c kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one 2014-04-03 16:20:58 -07:00
workqueue.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-04-01 11:00:07 -07:00
workqueue_internal.h sched: Rename sched.c as sched/core.c in comments and Documentation 2013-06-19 12:58:42 +02:00