WSL2-Linux-Kernel/kernel
Dongli Zhang 9eeda3e007 genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
commit a6c11c0a5235fb144a65e0cb2ffd360ddc1f6c32 upstream.

The absence of IRQD_MOVE_PCNTXT prevents immediate effectiveness of
interrupt affinity reconfiguration via procfs. Instead, the change is
deferred until the next instance of the interrupt being triggered on the
original CPU.

When the interrupt next triggers on the original CPU, the new affinity is
enforced within __irq_move_irq(). A vector is allocated from the new CPU,
but the old vector on the original CPU remains and is not immediately
reclaimed. Instead, apicd->move_in_progress is flagged, and the reclaiming
process is delayed until the next trigger of the interrupt on the new CPU.

Upon the subsequent triggering of the interrupt on the new CPU,
irq_complete_move() adds a task to the old CPU's vector_cleanup list if it
remains online. Subsequently, the timer on the old CPU iterates over its
vector_cleanup list, reclaiming old vectors.

However, a rare scenario arises if the old CPU is outgoing before the
interrupt triggers again on the new CPU.

In that case irq_force_complete_move() is not invoked on the outgoing CPU
to reclaim the old apicd->prev_vector because the interrupt isn't currently
affine to the outgoing CPU, and irq_needs_fixup() returns false. Even
though __vector_schedule_cleanup() is later called on the new CPU, it
doesn't reclaim apicd->prev_vector; instead, it simply resets both
apicd->move_in_progress and apicd->prev_vector to 0.

As a result, the vector remains unreclaimed in vector_matrix, leading to a
CPU vector leak.

To address this issue, move the invocation of irq_force_complete_move()
before the irq_needs_fixup() call to reclaim apicd->prev_vector, if the
interrupt is currently or used to be affine to the outgoing CPU.

Additionally, reclaim the vector in __vector_schedule_cleanup() as well,
following a warning message, although theoretically it should never see
apicd->move_in_progress with apicd->prev_cpu pointing to an offline CPU.

Fixes: f0383c24b4 ("genirq/cpuhotplug: Add support for cleaning up move in progress")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240522220218.162423-1-dongli.zhang@oracle.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16 13:39:52 +02:00
..
bpf bpf: Allow delete from sockmap/sockhash only if update is allowed 2024-06-16 13:39:50 +02:00
cgroup sched/fair: Allow disabling sched_balance_newidle with sched_relax_domain_level 2024-06-16 13:39:34 +02:00
configs
debug kdb: Fix a potential buffer overflow in kdb_local() 2024-01-25 14:52:54 -08:00
dma dma-mapping: benchmark: handle NUMA_NO_NODE correctly 2024-06-16 13:39:49 +02:00
entry entry: Respect changes to system call number by trace_sys_enter() 2024-04-10 16:18:46 +02:00
events perf/core: Fix reentry problem in perf_output_read_group() 2024-04-10 16:19:29 +02:00
futex futex: Don't include process MM in futex key on no-MMU 2023-11-20 11:08:13 +01:00
gcov gcov: add support for checksum field 2022-12-31 13:14:47 +01:00
irq genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline 2024-06-16 13:39:52 +02:00
kcsan kcsan: Don't expect 64 bits atomic builtins from 32 bits architectures 2023-07-23 13:47:12 +02:00
livepatch livepatch: Fix missing newline character in klp_resolve_symbols() 2023-11-20 11:08:25 +01:00
locking locking/rwsem: Disable preemption while trying for rwsem lock 2024-04-10 16:19:37 +02:00
power PM: suspend: Set mem_sleep_current during kernel command line setup 2024-04-10 16:18:36 +02:00
printk printk: Update @console_may_schedule in console_trylock_spinning() 2024-04-10 16:18:47 +02:00
rcu rcu-tasks: Provide rcu_trace_implies_rcu_gp() 2024-03-26 18:21:11 -04:00
sched sched/core: Fix incorrect initialization of the 'burst' parameter in cpu_max_write() 2024-06-16 13:39:34 +02:00
time timers: Rename del_timer_sync() to timer_delete_sync() 2024-04-10 16:18:33 +02:00
trace ring-buffer: Fix a race between readers and resize checks 2024-06-16 13:39:12 +02:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwlock: Provide RT variant 2021-08-17 17:50:51 +02:00
Kconfig.preempt
Makefile futex: Move to kernel/futex/ 2022-12-31 13:14:04 +01:00
acct.c acct: fix potential integer overflow in encode_comp_t() 2022-12-31 13:14:40 +01:00
async.c async: Introduce async_schedule_dev_nocall() 2024-02-23 08:54:25 +01:00
audit.c audit: Send netlink ACK before setting connection in auditd_set 2024-02-23 08:54:37 +01:00
audit.h audit: log AUDIT_TIME_* records only from rules 2022-04-08 14:23:06 +02:00
audit_fsnotify.c fsnotify: make allow_dups a property of the group 2024-04-10 16:19:02 +02:00
audit_tree.c fsnotify: pass flags argument to fsnotify_alloc_group() 2024-04-10 16:19:02 +02:00
audit_watch.c fsnotify: pass flags argument to fsnotify_alloc_group() 2024-04-10 16:19:02 +02:00
auditfilter.c
auditsc.c audit: fix possible soft lockup in __audit_inode_child() 2023-09-19 12:22:39 +02:00
backtracetest.c
bounds.c bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS 2024-05-02 16:24:50 +02:00
capability.c
cfi.c cfi: Fix __cfi_slowpath_diag RCU usage with cpuidle 2022-06-22 14:22:04 +02:00
compat.c sched_getaffinity: don't assume 'cpumask_size()' is fully initialized 2023-04-05 11:24:53 +02:00
configs.c
context_tracking.c
cpu.c cpu: Re-enable CPU mitigations by default for !X86 architectures 2024-05-02 16:24:48 +02:00
cpu_pm.c PM: cpu: Make notifier chain use a raw_spinlock_t 2021-08-16 18:55:32 +02:00
crash_core.c kernel/crash_core: suppress unknown crashkernel parameter warning 2021-12-29 12:28:49 +01:00
crash_dump.c
cred.c cred: switch to using atomic_long_t 2023-12-20 15:17:37 +01:00
delayacct.c
dma.c
exec_domain.c
exit.c exit: Use READ_ONCE() for all oops/warn limit reads 2023-02-01 08:27:22 +01:00
extable.c
fail_function.c kernel/fail_function: fix memory leak with using debugfs_lookup() 2023-03-11 13:57:38 +01:00
fork.c kernel/fork: beware of __put_task_struct() calling context 2023-09-23 11:09:55 +02:00
freezer.c
gen_kheaders.sh
groups.c
hung_task.c
iomem.c
irq_work.c
jump_label.c
kallsyms.c kallsyms: Make kallsyms_on_each_symbol generally available 2023-12-13 18:36:45 +01:00
kcmp.c
kcov.c
kexec.c kernel: kexec: copy user-array safely 2023-11-28 16:56:16 +00:00
kexec_core.c kexec: fix a memory leak in crash_shrink_memory() 2023-07-23 13:46:52 +02:00
kexec_elf.c
kexec_file.c kexec: support purgatories with .text.hot sections 2023-06-21 15:59:14 +02:00
kexec_internal.h panic, kexec: make __crash_kexec() NMI safe 2023-04-20 12:13:57 +02:00
kheaders.c kheaders: Use array declaration instead of char 2023-05-11 23:00:17 +09:00
kmod.c
kprobes.c kprobes: Fix possible use-after-free issue on kprobe registration 2024-04-27 17:05:23 +02:00
ksysfs.c kexec: turn all kexec_mutex acquisitions into trylocks 2023-04-20 12:13:57 +02:00
kthread.c exit: Implement kthread_exit 2024-04-10 16:18:55 +02:00
latencytop.c
module-internal.h
module.c NFSD: Remove svc_serv_ops::svo_module 2024-04-10 16:19:01 +02:00
module_signature.c
module_signing.c
notifier.c notifier: Remove atomic_notifier_call_chain_robust() 2021-08-16 18:55:32 +02:00
nsproxy.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
padata.c crypto: pcrypt - Fix hungtask for PADATA_RESET 2023-11-28 16:56:18 +00:00
panic.c panic: Flush kernel log buffer at the end 2024-04-13 13:01:43 +02:00
params.c params: lift param_set_uint_minmax to common code 2021-08-16 14:42:22 +02:00
pid.c kernel/pid.c: implement additional checks upon pidfd_create() parameters 2021-08-10 12:53:07 +02:00
pid_namespace.c rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes() 2023-03-10 09:39:09 +01:00
profile.c profiling: fix shift too large makes kernel panic 2022-08-17 14:24:04 +02:00
ptrace.c ptrace: Reimplement PTRACE_KILL by always sending SIGKILL 2022-06-09 10:22:29 +02:00
range.c
reboot.c kernel/reboot: emergency_restart: Set correct system_state 2023-11-28 16:56:31 +00:00
regset.c
relay.c relayfs: fix out-of-bounds access in relay_file_read 2023-05-11 23:00:18 +09:00
resource.c dax/kmem: Fix leak of memory-hotplug resources 2023-03-10 09:40:08 +01:00
resource_kunit.c
rseq.c rseq: Remove broken uapi field layout on 32-bit little endian 2022-04-08 14:23:10 +02:00
scftorture.c scftorture: Forgive memory-allocation failure if KASAN 2023-09-23 11:09:55 +02:00
scs.c scs: Release kasan vmalloc poison in scs_free process 2021-11-18 19:16:29 +01:00
seccomp.c seccomp: Invalidate seccomp mode to catch death failures 2022-02-16 12:56:38 +01:00
signal.c signal handling: don't use BUG_ON() for debugging 2022-07-21 21:24:42 +02:00
smp.c locking/csd_lock: Change csdlock_debug from early_param to __setup 2022-08-17 14:24:24 +02:00
smpboot.c smpboot: Replace deprecated CPU-hotplug functions. 2021-08-10 14:57:42 +02:00
smpboot.h
softirq.c softirq: Fix suspicious RCU usage in __do_softirq() 2024-06-16 13:39:15 +02:00
stackleak.c gcc-plugins/stackleak: Use noinstr in favor of notrace 2022-02-23 12:03:07 +01:00
stacktrace.c stacktrace: move filter_irq_stacks() to kernel/stacktrace.c 2022-04-13 20:59:28 +02:00
static_call.c static_call: Don't make __static_call_return0 static 2022-04-13 20:59:28 +02:00
static_call_inline.c static_call: Don't make __static_call_return0 static 2022-04-13 20:59:28 +02:00
stop_machine.c
sys.c getrusage: use sig->stats_lock rather than lock_task_sighand() 2024-03-15 10:48:22 -04:00
sys_ni.c kernel/sys_ni: add compat entry for fadvise64_64 2022-08-31 17:16:33 +02:00
sysctl-test.c
sysctl.c sched/rt: Disallow writing invalid values to sched_rt_period_us 2024-03-01 13:21:43 +01:00
task_work.c
taskstats.c
test_kprobes.c
torture.c torture: Fix hang during kthread shutdown phase 2023-08-30 16:18:19 +02:00
tracepoint.c tracepoint: Fix kerneldoc comments 2021-08-16 11:39:51 -04:00
tsacct.c taskstats: Cleanup the use of task->exit_code 2022-01-27 11:05:35 +01:00
ucount.c ucounts: Handle wrapping in is_ucounts_overlimit 2022-02-23 12:03:20 +01:00
uid16.c
uid16.h
umh.c
up.c
user-return-notifier.c
user.c fs/epoll: use a per-cpu counter for user's watches count 2021-09-08 11:50:27 -07:00
user_namespace.c ucounts: Fix systemd LimitNPROC with private users regression 2022-03-08 19:12:42 +01:00
usermode_driver.c
utsname.c
utsname_sysctl.c
watch_queue.c kernel: watch_queue: copy user-array safely 2023-11-28 16:56:16 +00:00
watchdog.c watchdog: move softlockup_panic back to early_param 2023-11-28 16:56:28 +00:00
watchdog_hld.c watchdog/perf: more properly prevent false positives with turbo modes 2023-07-23 13:46:52 +02:00
workqueue.c Revert "workqueue: remove unused cancel_work()" 2023-12-08 08:48:03 +01:00
workqueue_internal.h workqueue: Assign a color to barrier work items 2021-08-17 07:49:10 -10:00