x86/percpu: Remove volatile from arch_raw_cpu_ptr().
The volatile attribute in the inline assembly of arch_raw_cpu_ptr() forces the compiler to always generate the code, even if the compiler can decide upfront that its result is not needed. For instance invoking __intel_pmu_disable_all(false) (like intel_pmu_snapshot_arch_branch_stack() does) leads to loading the address of &cpu_hw_events into the register while compiler knows that it has no need for it. This ends up with code like: | movq $cpu_hw_events, %rax #, tcp_ptr__ | add %gs:this_cpu_off(%rip), %rax # this_cpu_off, tcp_ptr__ | xorl %eax, %eax # tmp93 It also creates additional code within local_lock() with !RT && !LOCKDEP which is not desired. By removing the volatile attribute the compiler can place the function freely and avoid it if it is not needed in the end. By using the function twice the compiler properly caches only the variable offset and always loads the CPU-offset. this_cpu_ptr() also remains properly placed within a preempt_disable() sections because - arch_raw_cpu_ptr() assembly has a memory input ("m" (this_cpu_off)) - prempt_{dis,en}able() fundamentally has a 'barrier()' in it Therefore this_cpu_ptr() is already properly serialized and does not rely on the 'volatile' attribute. Remove volatile from arch_raw_cpu_ptr(). [ bigeasy: Added Linus' explanation why this_cpu_ptr() is not moved out of a preempt_disable() section without the 'volatile' attribute. ] Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220328145810.86783-2-bigeasy@linutronix.de
This commit is contained in:
Родитель
df21c0d7a9
Коммит
1c1e7e3c23
|
@ -38,9 +38,9 @@
|
|||
#define arch_raw_cpu_ptr(ptr) \
|
||||
({ \
|
||||
unsigned long tcp_ptr__; \
|
||||
asm volatile("add " __percpu_arg(1) ", %0" \
|
||||
: "=r" (tcp_ptr__) \
|
||||
: "m" (this_cpu_off), "0" (ptr)); \
|
||||
asm ("add " __percpu_arg(1) ", %0" \
|
||||
: "=r" (tcp_ptr__) \
|
||||
: "m" (this_cpu_off), "0" (ptr)); \
|
||||
(typeof(*(ptr)) __kernel __force *)tcp_ptr__; \
|
||||
})
|
||||
#else
|
||||
|
|
Загрузка…
Ссылка в новой задаче