WSL2-Linux-Kernel

History

Tianchen Ding f3dd3f6745 sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle

Wakelist can help avoid cache bouncing and offload the overhead of waker cpu. So far, using wakelist within the same llc only happens on WF_ON_CPU, and this limitation could be removed to further improve wakeup performance.

The commit `518cd62341` ("sched: Only queue remote wakeups when crossing cache boundaries") disabled queuing tasks on wakelist when the cpus share llc. This is because, at that time, the scheduler had to send IPIs to do ttwu_queue_wakelist. Nowadays, ttwu_queue_wakelist also supports TIF_POLLING, so this is no longer a problem when the wakee cpu is in idle polling.

Benefits:
  Queuing the task on an idle cpu can help improve performance on the waker cpu and utilization on the wakee cpu, and further improve locality because the wakee cpu can handle its own rq. This patch helps improve rt on our real java workloads where wakeup happens frequently.

  Consider the normal condition (CPU0 and CPU1 share same llc)

  Before this patch:

         CPU0                                       CPU1

    select_task_rq()                                idle
    rq_lock(CPU1->rq)
    enqueue_task(CPU1->rq)
    notify CPU1 (by sending IPI or CPU1 polling)

                                                    resched()

  After this patch:

         CPU0                                       CPU1

    select_task_rq()                                idle
    add to wakelist of CPU1
    notify CPU1 (by sending IPI or CPU1 polling)

                                                    rq_lock(CPU1->rq)
                                                    enqueue_task(CPU1->rq)
                                                    resched()

  We see CPU0 can finish its work earlier. It only needs to put the task on the wakelist and return. Since CPU1 is idle, let it handle its own runqueue data.

This patch brings no difference regarding IPIs.
  This patch only takes effect when the wakee cpu is:
  1) idle polling
  2) idle not polling

  For 1), there will be no IPI with or without this patch.

  For 2), there will always be an IPI before or after this patch.
  Before this patch: waker cpu will enqueue task and check preempt. Since "idle" will be sure to be preempted, waker cpu must send a resched IPI.
  After this patch: waker cpu will put the task to the wakelist of wakee cpu, and send an IPI.

Benchmark:
We've tested schbench, unixbench, and hackbench on both x86 and arm64.

On x86 (Intel Xeon Platinum 8269CY):
  schbench -m 2 -t 8

    Latency percentiles (usec)              before        after
        50.0000th:                             8            6
        75.0000th:                            10            7
        90.0000th:                            11            8
        95.0000th:                            12            8
        99.0000th:                            13           10
        99.5000th:                            15           11
        99.9000th:                            18           14

  Unixbench with full threads (104)
                                            before        after
    Dhrystone 2 using register variables  3011862938    3009935994   -0.06%
    Double-Precision Whetstone              617119.3      617298.5    0.03%
    Execl Throughput                         27667.3       27627.3   -0.14%
    File Copy 1024 bufsize 2000 maxblocks   785871.4      784906.2   -0.12%
    File Copy 256 bufsize 500 maxblocks     210113.6      212635.4    1.20%
    File Copy 4096 bufsize 8000 maxblocks  2328862.2     2320529.1   -0.36%
    Pipe Throughput                      145535622.8   145323033.2   -0.15%
    Pipe-based Context Switching           3221686.4     3583975.4   11.25%
    Process Creation                        101347.1      103345.4    1.97%
    Shell Scripts (1 concurrent)            120193.5      123977.8    3.15%
    Shell Scripts (8 concurrent)             17233.4       17138.4   -0.55%
    System Call Overhead                   5300604.8     5312213.6    0.22%

  hackbench -g 1 -l 100000
                                            before        after
    Time                                     3.246        2.251

On arm64 (Ampere Altra):
  schbench -m 2 -t 8

    Latency percentiles (usec)              before        after
        50.0000th:                            14           10
        75.0000th:                            19           14
        90.0000th:                            22           16
        95.0000th:                            23           16
        99.0000th:                            24           17
        99.5000th:                            24           17
        99.9000th:                            28           25

  Unixbench with full threads (80)
                                            before        after
    Dhrystone 2 using register variables  3536194249    3537019613    0.02%
    Double-Precision Whetstone              629383.6      629431.6    0.01%
    Execl Throughput                         65920.5       65846.2   -0.11%
    File Copy 1024 bufsize 2000 maxblocks  1063722.8     1064026.8    0.03%
    File Copy 256 bufsize 500 maxblocks     322684.5      318724.5   -1.23%
    File Copy 4096 bufsize 8000 maxblocks  2348285.3     2328804.8   -0.83%
    Pipe Throughput                      133542875.3   131619389.8   -1.44%
    Pipe-based Context Switching           3215356.1     3576945.1   11.25%
    Process Creation                        108520.5      120184.6   10.75%
    Shell Scripts (1 concurrent)            122636.3        121888   -0.61%
    Shell Scripts (8 concurrent)             17462.1       17381.4   -0.46%
    System Call Overhead                   4429998.9     4435006.7    0.11%

  hackbench -g 1 -l 100000
                                            before        after
    Time                                     4.217        2.916

Our patch improves schbench, hackbench and Pipe-based Context Switching of unixbench when idle cpus exist, and shows no obvious regression on other tests of unixbench. This can help improve rt in scenes where wakeup happens frequently.

Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20220608233412.327341-3-dtcccc@linux.alibaba.com

2022-06-13 10:30:01 +02:00
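The decision change described in the commit message above can be sketched as a small user-space model. This is only an illustration built from the commit text, not the kernel's code: `struct wakeup_model`, its fields, and both function names are hypothetical, and the real check in the scheduler looks at more state (for example CPU hotplug state and run-queue load) that is left out here.

```c
#include <stdbool.h>

/* Hypothetical, simplified description of one wakeup (not a kernel type). */
struct wakeup_model {
    bool share_llc;   /* waker and wakee CPUs share a last-level cache */
    bool wf_on_cpu;   /* wakeup carries WF_ON_CPU (wakee is still descheduling) */
    bool wakee_idle;  /* wakee CPU is idle, polling or not */
};

/*
 * Model of the behavior before the patch, as described in the commit
 * message: across cache boundaries the wakelist is always used, but
 * within the same LLC it is used only for WF_ON_CPU wakeups.
 */
static bool use_wakelist_before(const struct wakeup_model *w)
{
    if (!w->share_llc)
        return true;
    return w->wf_on_cpu;
}

/*
 * Model of the behavior after the patch: within the same LLC, an idle
 * wakee CPU is enough to queue on its wakelist, so the waker can return
 * early and the wakee handles its own runqueue.
 */
static bool use_wakelist_after(const struct wakeup_model *w)
{
    if (!w->share_llc)
        return true;
    return w->wakee_idle;
}
```

In this model, a wakeup with `share_llc = true`, `wf_on_cpu = false` and `wakee_idle = true` is the case the patch targets: the "before" function rejects the wakelist and the "after" function accepts it, while wakeups that cross cache boundaries behave the same in both.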
..
bpf	bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs	2022-06-07 10:41:20 -07:00
cgroup	Merge branch 'for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup	2022-05-25 11:47:25 -07:00
configs	x86/configs: Add x86 debugging Kconfig fragment plus docs	2022-04-06 19:56:29 +02:00
debug	Modules updates for v5.19-rc1	2022-05-26 17:13:43 -07:00
dma	swiotlb: fix setting ->force_bounce	2022-06-02 07:17:59 +02:00
entry	* Fix syzkaller NULL pointer dereference	2022-06-08 09:16:31 -07:00
events	Two small perf updates:	2022-06-05 10:40:31 -07:00
futex	drm for 5.19-rc1	2022-05-25 16:18:27 -07:00
gcov	…
irq	Updates for interrupt core and drivers:	2022-05-23 16:58:49 -07:00
kcsan	linux-kselftest-kunit-5.19-rc1	2022-05-25 11:32:53 -07:00
livepatch	Livepatching changes for 5.19	2022-06-02 08:55:01 -07:00
locking	sysctl changes for v5.19-rc1	2022-05-26 16:57:20 -07:00
module	module: Fix prefix for module.sig_enforce module param	2022-06-02 12:44:33 -07:00
power	cxl for 5.19	2022-05-27 21:24:19 -07:00
printk	Revert "printk: wake up all waiters"	2022-05-27 13:04:46 +02:00
rcu	sysctl changes for v5.19-rc1	2022-05-26 16:57:20 -07:00
sched	sched: Remove the limitation of WF_ON_CPU on wakelist if wakee cpu is idle	2022-06-13 10:30:01 +02:00
time	While looking at the ptrace problems with PREEMPT_RT and the problems	2022-06-03 16:13:25 -07:00
trace	Networking fixes for 5.19-rc2, including fixes from bpf and netfilter.	2022-06-09 12:06:52 -07:00
.gitignore	…
Kconfig.freezer	…
Kconfig.hz	…
Kconfig.locks	…
Kconfig.preempt	Revert "signal, x86: Delay calling signals in atomic on RT enabled kernels"	2022-03-31 10:36:55 +02:00
Makefile	kernel: add platform_has() infrastructure	2022-06-06 08:06:00 +02:00
acct.c	kernel/acct: move acct sysctls to its own file	2022-04-06 13:43:44 -07:00
async.c	Revert "module, async: async_synchronize_full() on module init iff async is used"	2022-02-03 11:20:34 -08:00
audit.c	audit: improve audit queue handling when "audit=1" on cmdline	2022-01-25 13:22:51 -05:00
audit.h	audit: log AUDIT_TIME_* records only from rules	2022-02-22 13:51:40 -05:00
audit_fsnotify.c	fsnotify: make allow_dups a property of the group	2022-04-25 14:37:18 +02:00
audit_tree.c	audit: use fsnotify group lock helpers	2022-04-25 14:37:28 +02:00
audit_watch.c	fsnotify: pass flags argument to fsnotify_alloc_group()	2022-04-25 14:37:12 +02:00
auditfilter.c	audit/stable-5.17 PR 20220110	2022-01-11 13:08:21 -08:00
auditsc.c	audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts	2022-05-17 15:03:36 -04:00
backtracetest.c	…
bounds.c	…
capability.c	xfs: don't generate selinux audit messages for capability testing	2022-03-09 10:32:06 -08:00
cfi.c	…
compat.c	…
configs.c	…
context_tracking.c	…
cpu.c	Intel Trust Domain Extensions	2022-05-23 17:51:12 -07:00
cpu_pm.c	…
crash_core.c	Not a lot of material this cycle. Many singleton patches against various	2022-05-27 11:22:03 -07:00
crash_dump.c	…
cred.c	x86: Mark __invalid_creds() __noreturn	2022-03-15 10:32:44 +01:00
delayacct.c	delayacct: track delays from write-protect copy	2022-06-01 15:55:25 -07:00
dma.c	…
exec_domain.c	…
exit.c	ptrace: Cleanups for v5.18	2022-03-28 17:29:53 -07:00
extable.c	lkdtm: Really write into kernel text in WRITE_KERN	2022-02-16 23:25:12 +11:00
fail_function.c	…
fork.c	This set of changes updates init and user mode helper tasks to be	2022-06-03 16:03:05 -07:00
freezer.c	…
gen_kheaders.sh	kheaders: Have cpio unconditionally replace files	2022-05-08 03:16:59 +09:00
groups.c	…
hung_task.c	Not a lot of material this cycle. Many singleton patches against various	2022-05-27 11:22:03 -07:00
iomem.c	…
irq_work.c	irq_work: use kasan_record_aux_stack_noalloc() record callstack	2022-04-15 14:49:55 -07:00
jump_label.c	…
kallsyms.c	ftrace: Add ftrace_lookup_symbols function	2022-05-10 14:42:06 -07:00
kcmp.c	…
kcov.c	kcov: update pos before writing pc in trace function	2022-05-25 13:05:42 -07:00
kexec.c	…
kexec_core.c	Not a lot of material this cycle. Many singleton patches against various	2022-05-27 11:22:03 -07:00
kexec_elf.c	…
kexec_file.c	RISC-V Patches for the 5.19 Merge Window, Part 1	2022-05-31 14:10:54 -07:00
kexec_internal.h	…
kheaders.c	…
kmod.c	…
kprobes.c	tracing updates for 5.19:	2022-05-29 10:31:36 -07:00
ksysfs.c	kernel/ksysfs.c: use helper macro __ATTR_RW	2022-03-23 19:00:33 -07:00
kthread.c	kthread: unexport kthread_blkcg	2022-05-02 14:06:20 -06:00
latencytop.c	latencytop: move sysctl to its own file	2022-04-21 11:40:59 -07:00
module_signature.c	…
notifier.c	notifier: Add blocking/atomic_notifier_chain_register_unique_prio()	2022-05-19 19:30:30 +02:00
nsproxy.c	…
padata.c	padata: replace cpumask_weight with cpumask_empty in padata.c	2022-01-31 11:21:46 +11:00
panic.c	sysctl changes for v5.19-rc1	2022-05-26 16:57:20 -07:00
params.c	…
pid.c	…
pid_namespace.c	kernel: pid_namespace: use NULL instead of using plain integer as pointer	2022-04-29 14:38:00 -07:00
platform-feature.c	kernel: add platform_has() infrastructure	2022-06-06 08:06:00 +02:00
profile.c	exit: Remove profile_handoff_task	2022-01-08 12:43:57 -06:00
ptrace.c	While looking at the ptrace problems with PREEMPT_RT and the problems	2022-06-03 16:13:25 -07:00
range.c	…
reboot.c	kernel/reboot: Fix powering off using a non-syscall code paths	2022-06-07 19:42:31 +02:00
regset.c	…
relay.c	relay: remove redundant assignment to pointer buf	2022-05-12 20:38:37 -07:00
resource.c	kernel/resource: fix kfree() of bootmem memory again	2022-03-23 19:00:35 -07:00
resource_kunit.c	…
rseq.c	rseq: Remove broken uapi field layout on 32-bit little endian	2022-02-02 13:11:34 +01:00
scftorture.c	scftorture: Fix distribution of short handler delays	2022-04-11 17:07:29 -07:00
scs.c	kasan, vmalloc: only tag normal vmalloc allocations	2022-03-24 19:06:48 -07:00
seccomp.c	seccomp: Add wait_killable semantic to seccomp user notifier	2022-05-03 14:11:58 -07:00
signal.c	While looking at the ptrace problems with PREEMPT_RT and the problems	2022-06-03 16:13:25 -07:00
smp.c	Scheduler changes in this cycle were:	2022-05-24 11:11:13 -07:00
smpboot.c	cpu/hotplug: Allow the CPU in CPU_UP_PREPARE state to be brought up again.	2022-04-12 14:13:01 +02:00
smpboot.h	…
softirq.c	smp: Make softirq handling RT safe in flush_smp_call_function_queue()	2022-05-01 10:03:43 +02:00
stackleak.c	stackleak: add on/off stack variants	2022-05-08 01:33:09 -07:00
stacktrace.c	uaccess: remove CONFIG_SET_FS	2022-02-25 09:36:06 +01:00
static_call.c	static_call: Don't make __static_call_return0 static	2022-04-05 09:59:38 +02:00
static_call_inline.c	static_call: Don't make __static_call_return0 static	2022-04-05 09:59:38 +02:00
stop_machine.c	Scheduler changes in this cycle were:	2022-05-24 11:11:13 -07:00
sys.c	arm64/sme: Implement vector length configuration prctl()s	2022-04-22 18:50:54 +01:00
sys_ni.c	mm/mempolicy: wire up syscall set_mempolicy_home_node	2022-01-15 16:30:30 +02:00
sysctl-test.c	…
sysctl.c	sysctl changes for v5.19-rc1	2022-05-26 16:57:20 -07:00
task_work.c	task_work: allow TWA_SIGNAL without a rescheduling IPI	2022-04-30 08:39:32 -06:00
taskstats.c	kernel: make taskstats available from all net namespaces	2022-04-29 14:38:03 -07:00
torture.c	torture: Wake up kthreads after storing task_struct pointer	2022-02-01 17:24:39 -08:00
tracepoint.c	…
tsacct.c	taskstats: version 12 with thread group and exe info	2022-04-29 14:38:03 -07:00
ucount.c	ucounts: Handle wrapping in is_ucounts_overlimit	2022-02-17 09:11:57 -06:00
uid16.c	…
uid16.h	…
umh.c	kthread: Don't allocate kthread_struct for init and umh	2022-05-06 14:49:44 -05:00
up.c	…
user-return-notifier.c	…
user.c	…
user_namespace.c	ucounts: Fix systemd LimitNPROC with private users regression	2022-02-25 10:40:14 -06:00
usermode_driver.c	blob_to_mnt(): kern_unmount() is needed to undo kern_mount()	2022-05-19 23:25:47 -04:00
utsname.c	…
utsname_sysctl.c	…
watch_queue.c	watch_queue: Free the page array when watch_queue is dismantled	2022-04-02 10:37:39 -07:00
watchdog.c	Not a lot of material this cycle. Many singleton patches against various	2022-05-27 11:22:03 -07:00
watchdog_hld.c	printk: add functions to prefer direct printing	2022-04-22 21:30:58 +02:00
workqueue.c	workqueue: Wrap flush_workqueue() using a macro	2022-06-07 07:07:14 -10:00
workqueue_internal.h	…