WSL2-Linux-Kernel/kernel
Hou Tao b44d28b98f bpf, cpumap: Make sure kthread is running before map update returns
commit 640a604585 upstream.

The following warning was reported when running stress-mode enabled
xdp_redirect_cpu with some RT threads:

  ------------[ cut here ]------------
  WARNING: CPU: 4 PID: 65 at kernel/bpf/cpumap.c:135
  CPU: 4 PID: 65 Comm: kworker/4:1 Not tainted 6.5.0-rc2+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
  Workqueue: events cpu_map_kthread_stop
  RIP: 0010:put_cpu_map_entry+0xda/0x220
  ......
  Call Trace:
   <TASK>
   ? show_regs+0x65/0x70
   ? __warn+0xa5/0x240
   ......
   ? put_cpu_map_entry+0xda/0x220
   cpu_map_kthread_stop+0x41/0x60
   process_one_work+0x6b0/0xb80
   worker_thread+0x96/0x720
   kthread+0x1a5/0x1f0
   ret_from_fork+0x3a/0x70
   ret_from_fork_asm+0x1b/0x30
   </TASK>

The root cause is the same as commit 4369016497 ("bpf: cpumap: Fix memory
leak in cpu_map_update_elem"). The kthread is stopped prematurely by
kthread_stop() in cpu_map_kthread_stop(), and kthread() doesn't call
cpu_map_kthread_run() at all but XDP program has already queued some
frames or skbs into ptr_ring. So when __cpu_map_ring_cleanup() checks
the ptr_ring, it will find it was not emptied and report a warning.

An alternative fix is to use __cpu_map_ring_cleanup() to drop these
pending frames or skbs when kthread_stop() returns -EINTR, but it may
confuse the user, because these frames or skbs have been handled
correctly by XDP program. So instead of dropping these frames or skbs,
just make sure the per-cpu kthread is running before
__cpu_map_entry_alloc() returns.

After apply the fix, the error handle for kthread_stop() will be
unnecessary because it will always return 0, so just remove it.

Fixes: 6710e11269 ("bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230729095107.1722450-2-houtao@huaweicloud.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11 15:13:58 +02:00
..
bpf bpf, cpumap: Make sure kthread is running before map update returns 2023-08-11 15:13:58 +02:00
cgroup cgroup: Do not corrupt task iteration when rebinding subsystem 2023-06-28 10:29:43 +02:00
configs drivers/char: remove /dev/kmem for good 2021-05-07 00:26:34 -07:00
debug lockdown: also lock down previous kgdb use 2022-05-25 09:57:37 +02:00
dma swiotlb: max mapping size takes min align mask into account 2022-10-05 10:39:40 +02:00
entry entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up 2023-03-30 12:47:51 +02:00
events perf: Fix function pointer case 2023-08-11 15:13:48 +02:00
futex futex: Resend potentially swallowed owner death notification 2022-12-31 13:14:04 +01:00
gcov gcov: add support for checksum field 2022-12-31 13:14:47 +01:00
irq irqdomain: Fix mapping-creation race 2023-03-17 08:48:58 +01:00
kcsan kcsan: Don't expect 64 bits atomic builtins from 32 bits architectures 2023-07-23 13:47:12 +02:00
livepatch livepatch: fix race between fork and KLP transition 2022-10-26 12:34:30 +02:00
locking locking/rtmutex: Fix task->pi_waiters integrity 2023-08-03 10:22:45 +02:00
power workqueue: Introduce show_one_worker_pool and show_one_workqueue. 2023-05-11 23:00:35 +09:00
printk kernel/printk/index.c: fix memory leak with using debugfs_lookup() 2023-03-11 13:57:32 +01:00
rcu rcu/rcuscale: Stop kfree_scale_thread thread(s) after unloading rcuscale 2023-07-23 13:46:47 +02:00
sched sched: Fix DEBUG && !SCHEDSTATS warn 2023-05-11 23:00:40 +09:00
time posix-timers: Prevent RT livelock in itimer_delete() 2023-07-23 13:46:45 +02:00
trace bpf: Disable preemption in bpf_event_output 2023-08-11 15:13:57 +02:00
.gitignore .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwlock: Provide RT variant 2021-08-17 17:50:51 +02:00
Kconfig.preempt sched/core: Disable CONFIG_SCHED_CORE by default 2021-06-28 22:43:05 +02:00
Makefile futex: Move to kernel/futex/ 2022-12-31 13:14:04 +01:00
acct.c acct: fix potential integer overflow in encode_comp_t() 2022-12-31 13:14:40 +01:00
async.c Revert "module, async: async_synchronize_full() on module init iff async is used" 2022-02-23 12:03:07 +01:00
audit.c audit: improve audit queue handling when "audit=1" on cmdline 2022-02-08 18:34:03 +01:00
audit.h audit: log AUDIT_TIME_* records only from rules 2022-04-08 14:23:06 +02:00
audit_fsnotify.c audit: fix potential double free on error path from fsnotify_add_inode_mark 2022-08-31 17:16:33 +02:00
audit_tree.c audit: move put_tree() to avoid trim_trees refcount underflow and UAF 2021-08-24 18:52:36 -04:00
audit_watch.c
auditfilter.c lsm: separate security_task_getsecid() into subjective and objective variants 2021-03-22 15:23:32 -04:00
auditsc.c audit: log AUDIT_TIME_* records only from rules 2022-04-08 14:23:06 +02:00
backtracetest.c
bounds.c
capability.c
cfi.c cfi: Fix __cfi_slowpath_diag RCU usage with cpuidle 2022-06-22 14:22:04 +02:00
compat.c sched_getaffinity: don't assume 'cpumask_size()' is fully initialized 2023-04-05 11:24:53 +02:00
configs.c
context_tracking.c
cpu.c cpu/hotplug: Do not bail-out in DYING/STARTING sections 2022-12-31 13:14:04 +01:00
cpu_pm.c PM: cpu: Make notifier chain use a raw_spinlock_t 2021-08-16 18:55:32 +02:00
crash_core.c kernel/crash_core: suppress unknown crashkernel parameter warning 2021-12-29 12:28:49 +01:00
crash_dump.c
cred.c ucounts: Base set_cred_ucounts changes on the real user 2022-02-23 12:03:20 +01:00
delayacct.c delayacct: Add sysctl to enable at runtime 2021-05-12 11:43:25 +02:00
dma.c
exec_domain.c
exit.c exit: Use READ_ONCE() for all oops/warn limit reads 2023-02-01 08:27:22 +01:00
extable.c
fail_function.c kernel/fail_function: fix memory leak with using debugfs_lookup() 2023-03-11 13:57:38 +01:00
fork.c mm: Move mm_cachep initialization to mm_init() 2023-08-08 19:58:33 +02:00
freezer.c sched: Add get_current_state() 2021-06-18 11:43:08 +02:00
gen_kheaders.sh kbuild: clean up ${quiet} checks in shell scripts 2021-05-27 04:01:50 +09:00
groups.c groups: simplify struct group_info allocation 2021-02-26 09:41:03 -08:00
hung_task.c Merge branch 'akpm' (patches from Andrew) 2021-07-02 12:08:10 -07:00
iomem.c
irq_work.c irq_work: Make irq_work_queue() NMI-safe again 2021-06-10 10:00:08 +02:00
jump_label.c jump_label: Fix jump_label_text_reserved() vs __init 2021-07-05 10:46:20 +02:00
kallsyms.c module: add printk formats to add module build ID to stacktraces 2021-07-08 11:48:22 -07:00
kcmp.c
kcov.c
kexec.c panic, kexec: make __crash_kexec() NMI safe 2023-04-20 12:13:57 +02:00
kexec_core.c kexec: fix a memory leak in crash_shrink_memory() 2023-07-23 13:46:52 +02:00
kexec_elf.c
kexec_file.c kexec: support purgatories with .text.hot sections 2023-06-21 15:59:14 +02:00
kexec_internal.h panic, kexec: make __crash_kexec() NMI safe 2023-04-20 12:13:57 +02:00
kheaders.c kheaders: Use array declaration instead of char 2023-05-11 23:00:17 +09:00
kmod.c modules: add CONFIG_MODPROBE_PATH 2021-05-07 00:26:33 -07:00
kprobes.c x86/kprobes: Fix arch_check_optimized_kprobe check within optimized_kprobe range 2023-03-10 09:40:01 +01:00
ksysfs.c kexec: turn all kexec_mutex acquisitions into trylocks 2023-04-20 12:13:57 +02:00
kthread.c kthread: add the helper function kthread_run_on_cpu() 2023-03-30 12:47:42 +02:00
latencytop.c
module-internal.h
module.c module: Don't wait for GOING modules 2023-02-01 08:27:23 +01:00
module_signature.c
module_signing.c
notifier.c notifier: Remove atomic_notifier_call_chain_robust() 2021-08-16 18:55:32 +02:00
nsproxy.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
padata.c padata: Fix list iterator in padata_do_serial() 2022-12-31 13:14:24 +01:00
panic.c exit: Use READ_ONCE() for all oops/warn limit reads 2023-02-01 08:27:22 +01:00
params.c params: lift param_set_uint_minmax to common code 2021-08-16 14:42:22 +02:00
pid.c kernel/pid.c: implement additional checks upon pidfd_create() parameters 2021-08-10 12:53:07 +02:00
pid_namespace.c rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes() 2023-03-10 09:39:09 +01:00
profile.c profiling: fix shift too large makes kernel panic 2022-08-17 14:24:04 +02:00
ptrace.c ptrace: Reimplement PTRACE_KILL by always sending SIGKILL 2022-06-09 10:22:29 +02:00
range.c
reboot.c reboot: Add hardware protection power-off 2021-06-21 13:08:36 +01:00
regset.c
relay.c relayfs: fix out-of-bounds access in relay_file_read 2023-05-11 23:00:18 +09:00
resource.c dax/kmem: Fix leak of memory-hotplug resources 2023-03-10 09:40:08 +01:00
resource_kunit.c
rseq.c rseq: Remove broken uapi field layout on 32-bit little endian 2022-04-08 14:23:10 +02:00
scftorture.c scftorture: Fix distribution of short handler delays 2022-06-09 10:22:46 +02:00
scs.c scs: Release kasan vmalloc poison in scs_free process 2021-11-18 19:16:29 +01:00
seccomp.c seccomp: Invalidate seccomp mode to catch death failures 2022-02-16 12:56:38 +01:00
signal.c signal handling: don't use BUG_ON() for debugging 2022-07-21 21:24:42 +02:00
smp.c locking/csd_lock: Change csdlock_debug from early_param to __setup 2022-08-17 14:24:24 +02:00
smpboot.c smpboot: Replace deprecated CPU-hotplug functions. 2021-08-10 14:57:42 +02:00
smpboot.h
softirq.c genirq: Change force_irqthreads to a static key 2021-08-10 22:50:07 +02:00
stackleak.c gcc-plugins/stackleak: Use noinstr in favor of notrace 2022-02-23 12:03:07 +01:00
stacktrace.c stacktrace: move filter_irq_stacks() to kernel/stacktrace.c 2022-04-13 20:59:28 +02:00
static_call.c static_call: Don't make __static_call_return0 static 2022-04-13 20:59:28 +02:00
static_call_inline.c static_call: Don't make __static_call_return0 static 2022-04-13 20:59:28 +02:00
stop_machine.c stop_machine: Add caller debug info to queue_stop_cpus_work 2021-03-23 16:01:58 +01:00
sys.c kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() 2023-04-26 13:51:52 +02:00
sys_ni.c kernel/sys_ni: add compat entry for fadvise64_64 2022-08-31 17:16:33 +02:00
sysctl-test.c kernel/sysctl-test: Remove some casts which are no-longer required 2021-06-23 16:41:24 -06:00
sysctl.c kernel/panic: move panic sysctls to its own file 2023-02-01 08:27:20 +01:00
task_work.c kasan: record task_work_add() call stack 2021-04-30 11:20:42 -07:00
taskstats.c
test_kprobes.c
torture.c torture: Replace deprecated CPU-hotplug functions. 2021-08-10 10:48:07 -07:00
tracepoint.c tracepoint: Fix kerneldoc comments 2021-08-16 11:39:51 -04:00
tsacct.c taskstats: Cleanup the use of task->exit_code 2022-01-27 11:05:35 +01:00
ucount.c ucounts: Handle wrapping in is_ucounts_overlimit 2022-02-23 12:03:20 +01:00
uid16.c
uid16.h
umh.c kernel/umh.c: fix some spelling mistakes 2021-05-07 00:26:34 -07:00
up.c A set of locking related fixes and updates: 2021-05-09 13:07:03 -07:00
user-return-notifier.c
user.c fs/epoll: use a per-cpu counter for user's watches count 2021-09-08 11:50:27 -07:00
user_namespace.c ucounts: Fix systemd LimitNPROC with private users regression 2022-03-08 19:12:42 +01:00
usermode_driver.c Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:41:14 -07:00
utsname.c
utsname_sysctl.c
watch_queue.c watch_queue: fix IOC_WATCH_QUEUE_SET_SIZE alloc error paths 2023-03-17 08:48:59 +01:00
watchdog.c watchdog: export lockup_detector_reconfigure 2022-08-25 11:40:43 +02:00
watchdog_hld.c watchdog/perf: more properly prevent false positives with turbo modes 2023-07-23 13:46:52 +02:00
workqueue.c workqueue: clean up WORK_* constant types, clarify masking 2023-07-23 13:47:38 +02:00
workqueue_internal.h workqueue: Assign a color to barrier work items 2021-08-17 07:49:10 -10:00