WSL2-Linux-Kernel/arch/arm64/include/asm/spinlock.h

/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * Copyright (C) 2012 ARM Ltd.
 */
#ifndef __ASM_SPINLOCK_H
#define __ASM_SPINLOCK_H

#include <asm/qspinlock.h>
#include <asm/qrwlock.h>

/* See include/linux/spinlock.h */
#define smp_mb__after_spinlock()	smp_mb()

/*
 * Changing this will break osq_lock() thanks to the call inside
 * smp_cond_load_relaxed().
 *
 * See:
 * https://lore.kernel.org/lkml/20200110100612.GC2827@hirez.programming.kicks-ass.net
 */
#define vcpu_is_preempted vcpu_is_preempted
static inline bool vcpu_is_preempted(int cpu)
{
	return false;
}

#endif /* __ASM_SPINLOCK_H */
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not see http www gnu org licenses extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 503 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Enrico Weigelt <info@metux.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> 2019-06-03 08:44:50 +03:00			`/* SPDX-License-Identifier: GPL-2.0-only */`
arm64: SMP support This patch adds SMP initialisation and spinlocks implementation for AArch64. The spinlock support uses the new load-acquire/store-release instructions to avoid explicit barriers. The architecture also specifies that an event is automatically generated when clearing the exclusive monitor state to wake up processors in WFE, so there is no need for an explicit DSB/SEV instruction sequence. The SEVL instruction is used to set the exclusive monitor locally as there is no conditional WFE and a branch is more expensive. For the SMP booting protocol, see Documentation/arm64/booting.txt. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> 2012-03-05 15:49:30 +04:00			`/*`
			`* Copyright (C) 2012 ARM Ltd.`
			`*/`
			`#ifndef __ASM_SPINLOCK_H`
			`#define __ASM_SPINLOCK_H`

arm64: locking: Replace ticket lock implementation with qspinlock It's fair to say that our ticket lock has served us well over time, but it's time to bite the bullet and start using the generic qspinlock code so we can make use of explicit MCS queuing and potentially better PV performance in future. Signed-off-by: Will Deacon <will.deacon@arm.com> 2018-03-13 23:45:45 +03:00			`#include <asm/qspinlock.h>`
locking/arch: Move qrwlock.h include after qspinlock.h include/asm-generic/qrwlock.h was trying to get arch_spin_is_locked via asm-generic/qspinlock.h. However, this does not work because architectures might be using queued rwlocks but not queued spinlocks (csky), or because they might be defining their own queued_* macros before including asm/qspinlock.h. To fix this, ensure that asm/spinlock.h always includes qrwlock.h after defining arch_spin_is_locked (either directly for csky, or via asm/qspinlock.h for other architectures). The only inclusion elsewhere is in kernel/locking/qrwlock.c. That one is really unnecessary because the file is only compiled in SMP configurations (config QUEUED_RWLOCKS depends on SMP) and in that case linux/spinlock.h already includes asm/qrwlock.h if needed, via asm/spinlock.h. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Waiman Long <longman@redhat.com> Fixes: 26128cb6c7e6 ("locking/rwlocks: Add contention detection for rwlocks") Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Ben Gardon <bgardon@google.com> [Add arch/sparc and kernel/locking parts per discussion with Waiman. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> 2021-02-10 21:16:31 +03:00			`#include <asm/qrwlock.h>`
arm64: SMP support This patch adds SMP initialisation and spinlocks implementation for AArch64. The spinlock support uses the new load-acquire/store-release instructions to avoid explicit barriers. The architecture also specifies that an event is automatically generated when clearing the exclusive monitor state to wake up processors in WFE, so there is no need for an explicit DSB/SEV instruction sequence. The SEVL instruction is used to set the exclusive monitor locally as there is no conditional WFE and a branch is more expensive. For the SMP booting protocol, see Documentation/arm64/booting.txt. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> 2012-03-05 15:49:30 +04:00
locking: Introduce smp_mb__after_spinlock() Since its inception, our understanding of ACQUIRE, esp. as applied to spinlocks, has changed somewhat. Also, I wonder if, with a simple change, we cannot make it provide more. The problem with the comment is that the STORE done by spin_lock isn't itself ordered by the ACQUIRE, and therefore a later LOAD can pass over it and cross with any prior STORE, rendering the default WMB insufficient (pointed out by Alan). Now, this is only really a problem on PowerPC and ARM64, both of which already defined smp_mb__before_spinlock() as a smp_mb(). At the same time, we can get a much stronger construct if we place that same barrier _inside_ the spin_lock(). In that case we upgrade the RCpc spinlock to an RCsc. That would make all schedule() calls fully transitive against one another. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> 2016-09-05 12:37:53 +03:00			`/* See include/linux/spinlock.h */`
			`#define smp_mb__after_spinlock() smp_mb()`
arm64: spinlocks: implement smp_mb__before_spinlock() as smp_mb() smp_mb__before_spinlock() is intended to upgrade a spin_lock() operation to a full barrier, such that prior stores are ordered with respect to loads and stores occuring inside the critical section. Unfortunately, the core code defines the barrier as smp_wmb(), which is insufficient to provide the required ordering guarantees when used in conjunction with our load-acquire-based spinlock implementation. This patch overrides the arm64 definition of smp_mb__before_spinlock() to map to a full smp_mb(). Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Reported-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> 2016-09-05 13:56:05 +03:00
locking/osq: Use optimized spinning loop for arm64 Arm64 has a more optimized spinning loop (atomic_cond_read_acquire) using wfe for spinlock that can boost performance of sibling threads by putting the current cpu to a wait state that is broken only when the monitored variable changes or an external event happens. OSQ has a more complicated spinning loop. Besides the lock value, it also checks for need_resched() and vcpu_is_preempted(). The check for need_resched() is not a problem as it is only set by the tick interrupt handler. That will be detected by the spinning cpu right after iret. The vcpu_is_preempted() check, however, is a problem as changes to the preempt state of of previous node will not affect the wait state. For ARM64, vcpu_is_preempted is not currently defined and so is a no-op. Will has indicated that he is planning to para-virtualize wfe instead of defining vcpu_is_preempted for PV support. So just add a comment in arch/arm64/include/asm/spinlock.h to indicate that vcpu_is_preempted() should not be defined as suggested. On a 2-socket 56-core 224-thread ARM64 system, a kernel mutex locking microbenchmark was run for 10s with and without the patch. The performance numbers before patch were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 316/123,143/2,121,269 Threads = 224, Total Rate = 2,757 kop/s; Percpu Rate = 12 kop/s After patch, the numbers were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 334/147,836/1,304,787 Threads = 224, Total Rate = 3,311 kop/s; Percpu Rate = 15 kop/s So there was about 20% performance improvement. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200113150735.21956-1-longman@redhat.com 2020-01-13 18:07:35 +03:00			`/*`
			`* Changing this will break osq_lock() thanks to the call inside`
			`* smp_cond_load_relaxed().`
			`*`
			`* See:`
			`* https://lore.kernel.org/lkml/20200110100612.GC2827@hirez.programming.kicks-ass.net`
			`*/`
arm64/spinlock: fix a -Wunused-function warning The commit f5bfdc8e3947 ("locking/osq: Use optimized spinning loop for arm64") introduced a warning from Clang because vcpu_is_preempted() is compiled away, kernel/locking/osq_lock.c:25:19: warning: unused function 'node_cpu' [-Wunused-function] static inline int node_cpu(struct optimistic_spin_node *node) ^ 1 warning generated. Fix it by converting vcpu_is_preempted() to a static inline function. Fixes: f5bfdc8e3947 ("locking/osq: Use optimized spinning loop for arm64") Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Will Deacon <will@kernel.org> 2020-01-23 23:20:51 +03:00			`#define vcpu_is_preempted vcpu_is_preempted`
			`static inline bool vcpu_is_preempted(int cpu)`
			`{`
			`return false;`
			`}`
locking/osq: Use optimized spinning loop for arm64 Arm64 has a more optimized spinning loop (atomic_cond_read_acquire) using wfe for spinlock that can boost performance of sibling threads by putting the current cpu to a wait state that is broken only when the monitored variable changes or an external event happens. OSQ has a more complicated spinning loop. Besides the lock value, it also checks for need_resched() and vcpu_is_preempted(). The check for need_resched() is not a problem as it is only set by the tick interrupt handler. That will be detected by the spinning cpu right after iret. The vcpu_is_preempted() check, however, is a problem as changes to the preempt state of of previous node will not affect the wait state. For ARM64, vcpu_is_preempted is not currently defined and so is a no-op. Will has indicated that he is planning to para-virtualize wfe instead of defining vcpu_is_preempted for PV support. So just add a comment in arch/arm64/include/asm/spinlock.h to indicate that vcpu_is_preempted() should not be defined as suggested. On a 2-socket 56-core 224-thread ARM64 system, a kernel mutex locking microbenchmark was run for 10s with and without the patch. The performance numbers before patch were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 316/123,143/2,121,269 Threads = 224, Total Rate = 2,757 kop/s; Percpu Rate = 12 kop/s After patch, the numbers were: Running locktest with mutex [runtime = 10s, load = 1] Threads = 224, Min/Mean/Max = 334/147,836/1,304,787 Threads = 224, Total Rate = 3,311 kop/s; Percpu Rate = 15 kop/s So there was about 20% performance improvement. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lkml.kernel.org/r/20200113150735.21956-1-longman@redhat.com 2020-01-13 18:07:35 +03:00
arm64: SMP support This patch adds SMP initialisation and spinlocks implementation for AArch64. The spinlock support uses the new load-acquire/store-release instructions to avoid explicit barriers. The architecture also specifies that an event is automatically generated when clearing the exclusive monitor state to wake up processors in WFE, so there is no need for an explicit DSB/SEV instruction sequence. The SEVL instruction is used to set the exclusive monitor locally as there is no conditional WFE and a branch is more expensive. For the SMP booting protocol, see Documentation/arm64/booting.txt. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de> 2012-03-05 15:49:30 +04:00			`#endif /* __ASM_SPINLOCK_H */`