WSL2-Linux-Kernel/arch/x86/kernel
Wanpeng Li 03bd4e1f72 sched: Fix unreleased llc_shared_mask bit during CPU hotplug
The following bug can be triggered by hot adding and removing a large number of
xen domain0's vcpus repeatedly:

	BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
	PGD 5a9d5067 PUD 13067 PMD 0
	Oops: 0000 [#3] SMP
	[...]
	Call Trace:
	load_balance
	? _raw_spin_unlock_irqrestore
	idle_balance
	__schedule
	schedule
	schedule_timeout
	? lock_timer_base
	schedule_timeout_uninterruptible
	msleep
	lock_device_hotplug_sysfs
	online_store
	dev_attr_store
	sysfs_write_file
	vfs_write
	SyS_write
	system_call_fastpath

The last level cache shared mask is built during CPU up, and the
build_sched_domain() routine uses it to set up the sched domain
CPU topology.

However, llc_shared_mask is not released during CPU disable,
which leads to an invalid sched domain CPU topology.

Fix this by releasing the llc_shared_mask correctly
during CPU disable.
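
The release step can be illustrated with a small userspace model. This is a
sketch under stated assumptions, not the kernel code touched by this commit:
struct model_mask, llc_shared[], mask_set(), mask_clear(), mask_test(),
model_link_llc() and model_release_llc() are hypothetical names invented for
the example, and a fixed 128-bit bitmap stands in for a real cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NR_CPUS 128

/* Hypothetical stand-in for a cpumask: one bit per CPU. */
struct model_mask {
	uint64_t bits[MODEL_NR_CPUS / 64];
};

/* llc_shared[cpu] models the llc_shared_mask of that CPU. */
static struct model_mask llc_shared[MODEL_NR_CPUS];

static void mask_set(struct model_mask *m, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	m->bits[cpu / 64] |= UINT64_C(1) << (cpu % 64);
}

static void mask_clear(struct model_mask *m, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	m->bits[cpu / 64] &= ~(UINT64_C(1) << (cpu % 64));
}

static bool mask_test(const struct model_mask *m, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return (m->bits[cpu / 64] >> (cpu % 64)) & 1;
}

/* CPU up: record in both masks that cpu and sibling share a cache. */
static void model_link_llc(int cpu, int sibling)
{
	mask_set(&llc_shared[cpu], sibling);
	mask_set(&llc_shared[sibling], cpu);
}

/*
 * CPU disable: drop cpu from the mask of every CPU it shared a cache
 * with, then empty its own mask so no stale bits survive a later
 * hot-add.
 */
static void model_release_llc(int cpu)
{
	for (int sibling = 0; sibling < MODEL_NR_CPUS; sibling++)
		if (mask_test(&llc_shared[cpu], sibling))
			mask_clear(&llc_shared[sibling], cpu);
	memset(&llc_shared[cpu], 0, sizeof(llc_shared[cpu]));
}
```

The model clears both directions because the link is symmetric: emptying
only the departing CPU's own mask would leave its bit set in the masks of
its former siblings.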

Yasuaki also reported that this can happen on real hardware:

  https://lkml.org/lkml/2014/7/22/1018

His case is here:

	==
	Here is an example on my system.
	My system has 4 sockets and each socket has 15 cores and HT is
	enabled. In this case, each core of the sockets is numbered as
	follows:

		 | CPU#
	Socket#0 | 0-14 , 60-74
	Socket#1 | 15-29, 75-89
	Socket#2 | 30-44, 90-104
	Socket#3 | 45-59, 105-119

	Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.

	It means that last level cache of Socket#2 is shared with
	CPU#30-44 and 90-104.

	When hot-removing socket#2 and #3, each core of sockets is
	numbered as follows:

		 | CPU#
	Socket#0 | 0-14 , 60-74
	Socket#1 | 15-29, 75-89

	But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
	still has 0x3fff80000001fffc0000000.

	After that, when hot-adding socket#2 and #3, each core of
	sockets is numbered as follows:

		 | CPU#
	Socket#0 | 0-14 , 60-74
	Socket#1 | 15-29, 75-89
	Socket#2 | 30-59
	Socket#3 | 90-119

	Then llc_shared_mask of CPU#30 becomes
	0x3fff8000fffffffc0000000. It means that last level cache of
	Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
	the wrong value.
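
The sequence in the report above can be replayed in a userspace model. This
is a minimal sketch, not kernel code: shares_llc[][], add_range(),
online_socket(), offline_cpu() and replay_cpu30_sees_cpu90() are hypothetical
names made up for the example, and a boolean matrix replaces the per-CPU
masks. Only the CPU numbering is taken from the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define EX_NR_CPUS 120

/* shares_llc[a][b]: b is set in the modeled llc_shared_mask of a. */
static bool shares_llc[EX_NR_CPUS][EX_NR_CPUS];

/* Append CPUs lo..hi to cpus[] and return the new element count. */
static int add_range(int *cpus, int n, int lo, int hi)
{
	assert(n + (hi - lo + 1) <= EX_NR_CPUS);
	for (int cpu = lo; cpu <= hi; cpu++)
		cpus[n++] = cpu;
	return n;
}

/* CPU up for one socket: every listed CPU shares the cache with the rest. */
static void online_socket(const int *cpus, int n)
{
	for (int i = 0; i < n; i++)
		for (int j = 0; j < n; j++)
			shares_llc[cpus[i]][cpus[j]] = true;
}

/* CPU disable; release == false models the behavior before the fix. */
static void offline_cpu(int cpu, bool release)
{
	if (!release)
		return;	/* bits stay set, as in the report */
	for (int other = 0; other < EX_NR_CPUS; other++) {
		shares_llc[cpu][other] = false;
		shares_llc[other][cpu] = false;
	}
}

/* Replay the report; return whether CPU#30 still lists CPU#90. */
static bool replay_cpu30_sees_cpu90(bool release)
{
	int cpus[EX_NR_CPUS];
	int n;

	memset(shares_llc, 0, sizeof(shares_llc));

	/* Initial layout: Socket#2 = 30-44, 90-104; Socket#3 = 45-59, 105-119. */
	n = add_range(cpus, 0, 30, 44);
	n = add_range(cpus, n, 90, 104);
	online_socket(cpus, n);
	n = add_range(cpus, 0, 45, 59);
	n = add_range(cpus, n, 105, 119);
	online_socket(cpus, n);

	/* Hot-remove Socket#2 and Socket#3. */
	for (int cpu = 30; cpu <= 59; cpu++)
		offline_cpu(cpu, release);
	for (int cpu = 90; cpu <= 119; cpu++)
		offline_cpu(cpu, release);

	/* Hot-add with the new numbering: Socket#2 = 30-59, Socket#3 = 90-119. */
	n = add_range(cpus, 0, 30, 59);
	online_socket(cpus, n);
	n = add_range(cpus, 0, 90, 119);
	online_socket(cpus, n);

	return shares_llc[30][90];
}
```

Calling replay_cpu30_sees_cpu90(false) leaves CPU#30 linked to CPU#90, which
now belongs to Socket#3, matching the wrong mask described in the report.
With release set to true the stale link is gone after the hot-add.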

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: Linn Crosetto <linn@hp.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: <stable@vger.kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-09-24 15:13:20 +02:00
acpi Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-13 18:23:32 -06:00
apic x86, irq, PCI: Keep IRQ assignment for runtime power management 2014-08-29 13:38:00 +02:00
cpu PCI changes for the v3.17 merge window (part 2): 2014-08-14 18:10:33 -06:00
kprobes kprobes/x86: Free 'optinsn' cache when range check fails 2014-08-27 20:24:32 +02:00
.gitignore
Makefile kexec: create a new config option CONFIG_KEXEC_FILE for new syscall 2014-08-29 16:28:16 -07:00
alternative.c
amd_gart_64.c x86: enable DMA CMA with swiotlb 2014-06-04 16:53:57 -07:00
amd_nb.c
apb_timer.c
aperture_64.c x86/gart: Tidy messages and add bridge device info 2014-05-23 10:47:19 -06:00
apm_32.c x86: Remove unused variable "polling" 2014-07-16 12:58:47 +02:00
asm-offsets.c
asm-offsets_32.c
asm-offsets_64.c
audit_64.c
bootflag.c
check.c
cpuid.c
crash.c kexec: create a new config option CONFIG_KEXEC_FILE for new syscall 2014-08-29 16:28:16 -07:00
crash_dump_32.c
crash_dump_64.c
devicetree.c x86, irq, devicetree: Release IOAPIC pin when PCI device is disabled 2014-06-21 23:05:44 +02:00
doublefault.c
dumpstack.c
dumpstack_32.c
dumpstack_64.c
e820.c
early-quirks.c Merge commit '9e9a928eed8796a0a1aaed7e0b676db86ba84594' into drm-next 2014-06-05 20:28:59 +10:00
early_printk.c
entry_32.S x86_32, entry: Clean up sysenter_badsys declaration 2014-08-15 13:45:32 -07:00
entry_64.S Merge branches 'x86-build-for-linus', 'x86-cleanups-for-linus' and 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-04 16:56:16 -07:00
espfix_64.c x86/espfix/xen: Fix allocation of pages for paravirt page tables 2014-07-14 13:47:32 -07:00
ftrace.c ftrace/x86: Add call to ftrace_graph_is_dead() in function graph code 2014-07-17 09:45:08 -04:00
head.c
head32.c
head64.c kernel/printk: use symbolic defines for console loglevels 2014-06-04 16:54:17 -07:00
head_32.S
head_64.S
hpet.c Merge branch 'x86/vdso' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-05 08:05:29 -07:00
hw_breakpoint.c
i386_ksyms_32.c
i387.c x86/xsaves: Clear reserved bits in xsave header 2014-05-29 14:33:00 -07:00
i8237.c
i8253.c
i8259.c
io_delay.c
ioport.c
iosf_mbi.c PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use 2014-08-12 12:15:14 -06:00
irq.c Merge branch 'x86-irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-06-12 20:03:47 -07:00
irq_32.c
irq_64.c
irq_work.c
irqinit.c x86: Fix non-PC platform kernel crash on boot due to NULL dereference 2014-08-25 22:36:57 +02:00
jump_label.c
kdebugfs.c
kexec-bzimage64.c kexec: verify the signature of signed PE bzImage 2014-08-08 15:57:33 -07:00
kgdb.c
ksysfs.c
kvm.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-06-12 19:18:49 -07:00
kvmclock.c
ldt.c Revert "x86-64, modify_ldt: Make support for 16-bit segments a runtime option" 2014-05-21 10:22:59 -07:00
machine_kexec_32.c
machine_kexec_64.c kexec: create a new config option CONFIG_KEXEC_FILE for new syscall 2014-08-29 16:28:16 -07:00
mcount_64.S ftrace: x86: Remove check of obsolete variable function_trace_stop 2014-07-18 13:57:02 -04:00
mmconf-fam10h_64.c
module.c
mpparse.c x86, apic: Remove mps_oem_check callback 2014-07-31 08:05:42 -07:00
msr.c
nmi.c
nmi_selftest.c
paravirt-spinlocks.c
paravirt.c
paravirt_patch_32.c
paravirt_patch_64.c x86_64/entry/xen: Do not invoke espfix64 on Xen 2014-07-28 15:25:40 -07:00
pci-calgary_64.c
pci-dma.c arch/x86/kernel/pci-dma.c: fix dma_generic_alloc_coherent() when CONFIG_DMA_CMA is enabled 2014-06-04 16:53:57 -07:00
pci-iommu_table.c
pci-nommu.c
pci-swiotlb.c x86: enable DMA CMA with swiotlb 2014-06-04 16:53:57 -07:00
pcspeaker.c
perf_regs.c
pmc_atom.c x86/pmc_atom: Silence shift wrapping warnings in pmc_sleep_tmr_show() 2014-08-02 16:52:17 -07:00
preempt.S
probe_roms.c
process.c Define kernel API to get address of each state in xsave area 2014-05-29 14:33:09 -07:00
process_32.c
process_64.c Merge branch 'perf/urgent' into perf/core, to resolve conflict and to prepare for new patches 2014-06-06 07:55:06 +02:00
ptrace.c
pvclock.c
quirks.c
reboot.c x86/reboot: Add EFI reboot quirk for ACPI Hardware Reduced flag 2014-07-18 21:23:52 +01:00
reboot_fixups_32.c
relocate_kernel_32.S
relocate_kernel_64.S
resource.c x86: don't exclude low BIOS area when allocating address space for non-PCI cards 2014-07-16 12:29:36 -06:00
rtc.c
setup.c arch/x86: Replace plain strings with constants 2014-07-18 21:23:59 +01:00
setup_percpu.c
signal.c x86_32, signal: Fix vdso rt_sigreturn 2014-06-23 15:54:42 -07:00
smp.c
smpboot.c sched: Fix unreleased llc_shared_mask bit during CPU hotplug 2014-09-24 15:13:20 +02:00
stacktrace.c
step.c
sys_x86_64.c
syscall_32.c
syscall_64.c
sysfb.c
sysfb_efi.c
sysfb_simplefb.c
tboot.c
tce_64.c
test_nx.c
test_rodata.c
time.c x86: Fix non-PC platform kernel crash on boot due to NULL dereference 2014-08-25 22:36:57 +02:00
tls.c
tls.h
topology.c
trace_clock.c
tracepoint.c
traps.c x86/kprobes: Fix build errors and blacklist context_track_user 2014-06-14 09:07:44 +02:00
tsc.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-08-05 17:46:42 -07:00
tsc_msr.c
tsc_sync.c
uprobes.c uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates 2014-06-05 16:21:57 +02:00
verify_cpu.S
vm86_32.c
vmlinux.lds.S
vsmp_64.c x86/apic/vsmp: Make is_vsmp_box() static 2014-08-01 15:09:45 -07:00
vsyscall_64.c x86_64/vsyscall: Fix warn_bad_vsyscall log output 2014-07-25 16:34:15 -07:00
vsyscall_emu_64.S
vsyscall_gtod.c timekeeping: Create struct tk_read_base and use it in struct timekeeper 2014-07-23 15:01:53 -07:00
vsyscall_trace.h
x86_init.c
x8664_ksyms_64.c
xsave.c x86/xsaves: Clean up code in xstate offsets computation in xsave area 2014-05-30 17:12:41 -07:00