Commit ca74b316df ("arm: Use common cpu_topology structure and
functions.") changed cpu_coregroup_mask() from the ARM32 specific
implementation in arch/arm/include/asm/topology.h to the one shared
with ARM64 and RISCV in drivers/base/arch_topology.c.
Currently on ARM32 (TC2 w/ CONFIG_SCHED_MC) the task scheduler setup
code (w/ CONFIG_SCHED_DEBUG) shows this during CPU hotplug:
ERROR: groups don't span domain->span
It happens to CPUs of the cluster of the CPU which gets hot-plugged
out on scheduler domain MC.
Turns out that the shared cpu_coregroup_mask() requires that the
hot-plugged CPU is removed from the core_sibling mask via
remove_cpu_topology(). Otherwise the 'is core_sibling subset of
cpumask_of_node()' doesn't work. In this case the task scheduler has to
deal with cpumask_of_node instead of core_sibling which is wrong on
scheduler domain MC.
e.g. CPU3 hot-plugged out on TC2 [cluster0: 0,3-4 cluster1: 1-2]:
cpu_coregroup_mask(): CPU3 cpumask_of_node=0-2,4 core_sibling=0,3-4
^
should be:
cpu_coregroup_mask(): CPU3 cpumask_of_node=0-2,4 core_sibling=0,4
Add remove_cpu_topology() to __cpu_disable() to remove the CPU from the
topology masks in case of a CPU hotplug out operation.
At the same time tweak store_cpu_topology() slightly so it will call
update_siblings_masks() in case of CPU hotplug in operation via
secondary_start_kernel()->smp_store_cpu_info().
This aligns the ARM32 implementation with the ARM64 one.
Guarding remove_cpu_topology() with CONFIG_GENERIC_ARCH_TOPOLOGY is
necessary since some Arm32 defconfigs (aspeed_g5_defconfig,
milbeaut_m10v_defconfig, spear13xx_defconfig) specify an explicit
# CONFIG_ARM_CPU_TOPOLOGY is not set
w/ ./arch/arm/Kconfig: select GENERIC_ARCH_TOPOLOGY if ARM_CPU_TOPOLOGY
Fixes: ca74b316df ("arm: Use common cpu_topology structure and functions")
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Ondrej Jirman <megous@megous.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
This commit removes the open-coded CPU-offline notification with new
common code. In particular, this change avoids calling scheduler code
using RCU from an offline CPU that RCU is ignoring. This is a minimal
change. A more intrusive change might invoke the cpu_check_up_prepare()
and cpu_set_state_online() functions at CPU-online time, which would
allow onlining throw an error if the CPU did not go offline properly.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
- Add a "cut here" to make it clearer where oops dumps should be cut
from - we already have a marker for the end of the dumps.
- Add logging severity to show_pte()
- Drop unnecessary common-page-size linker flag
- Errata workarounds for Cortex A12 857271, Cortex A17 857272 and
Cortex A7 814220.
- Remove some unused variables that had started to provoke a compiler
warning.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAXSMct/TnkBvkraxkAQIpKBAAoE+5ZLeBanLZ6GMUpIMW+wVfVYaI4p8g
kZd0Dm2EB5Y9x9kfYwCLmdUp5A/Lh3dmAkAGHSP0TVdDj34qK9KXxCA3t+k85ph5
+VDzdpAtUrbuDWaF/6KIaGMLgA4ss+VXpoqMOvZuwDQReXbXaT325WVzcTFlNLe7
sUBS8k2DxTTjiF0HM/jnxtbvVmc3sfkPij+MTEz9XEcS9K0Hilz9suoJEKZCXY6H
ULX3n+8wJAxbnbnXdGLyQ095BqtaQC88mEcMUvBci7NBGiRne19vvRqdLwe1mJPc
PTz1v/Pjvbfj1apSZDGVJvIu3BVEkwJNBF1NujRGUVrYnKQd/YJSKJkqjXat5Q5M
H/G2l+y0vmzn4VSbn1LUEvleuou6Hq/1nucksS1DHruEB7tH/aQRtf0EduzGRsop
VmYkTiIiaF28N+eHDolDtUzSg5RNzEMCQnni4y6ajGhbWiu1Q63GBR7k9zwB27Bj
3t6RdrqIaE52uKUdgwc9APcpZhdJR/ncqrI68B+wIwbuObhUSY5sa1Ec8uiELOiK
g3zCp4AWRBbYQFNeQy87A811JSFU+L/lCbABs32Bj8UEKGn/qdmVwEaH9rZ8/PHk
TAlcu8PtwBVQQIEgs+ur+OyfZcLt2U4DUDiL/26GCyG+pzzL7fYap5GSoBuqR02p
MmoHFr87YZM=
=Ru/4
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
- Add a "cut here" to make it clearer where oops dumps should be cut
from - we already have a marker for the end of the dumps.
- Add logging severity to show_pte()
- Drop unnecessary common-page-size linker flag
- Errata workarounds for Cortex A12 857271, Cortex A17 857272 and
Cortex A7 814220.
- Remove some unused variables that had started to provoke a compiler
warning.
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8863/1: stm32: select ARM errata 814220
ARM: 8862/1: errata: 814220-B-Cache maintenance by set/way operations can execute out of order
ARM: 8865/1: mm: remove unused variables
ARM: 8864/1: Add workaround for I-Cache line size mismatch between CPU cores
ARM: 8861/1: errata: Workaround errata A12 857271 / A17 857272
ARM: 8860/1: VDSO: Drop implicit common-page-size linker flag
ARM: arrange show_pte() to issue severity-based messages
ARM: add "8<--- cut here ---" to kernel dumps
Some big.LITTLE systems have I-Cache line size mismatch between
LITTLE and big cores. This patch adds a workaround for proper I-Cache
support on such systems. Without it, some class of the userspace code
(typically self-modifying) might suffer from random SIGILL failures.
Similar workaround already exists for ARM64 architecture. I has been
added by commit 116c81f427 ("arm64: Work around systems with mismatched
cache line sizes").
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Fix recent regression causing kernels built with CONFIG_PM
unset to crash on systems that support the Performance and
Energy Bias Hint (EPB) by avoiding to compile the EPB-related
code depending on CONFIG_PM when it is unset (Rafael Wysocki).
- Clean up the transition notifier invocation code in the cpufreq
core and change some users of cpufreq transition notifiers
accordingly (Viresh Kumar).
- Change MAINTAINERS to cover the schedutil governor as part of
cpufreq (Viresh Kumar).
- Simplify cpufreq_init_policy() to avoid redundant computations
(Yue Hu).
- Add explanatory comment to the cpufreq core (Rafael Wysocki).
- Introduce a new flag, GENPD_FLAG_RPM_ALWAYS_ON, to the generic
power domains (genpd) framework along with the first user of it
(Leonard Crestez).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAlzb4TASHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxiEAP/37uQOx+I8J3IU7HQcPIkdI1hgksLEzo
g2eoREekjszIjFK9xa70X3V/QnGK4YSPQ/cHCjgXfVhwkO5TJzte5T5M2z9gUCDT
7OMYWCI6hP6Mo5UWlP4dQ9Cqce4SB3TdibadevxcVOhFAW/xz42y5Gr6s4WkexJf
Swb2uoLS4gGANyhUhx6XEZ5NpWZkWcK2ygZ8VJZETnoIwxMSUW7FTJkF+4s2tXLZ
GH+F5jWAbwPlg6g2c54lPL1HtiAvK+/018aF8CZMqUBec94RHDFybVOlb5sacfQW
+Y0W/mc/6SMqT3OUcQ0H3Z/qkgwR8mL01hH6gCP1jA5OBljmTjzk0Bbc4c3n9BEN
aRy4M8Qc/GXzEBPO3Z9AlYik6ALH9iUgL2hewGZAFN8kn9ZGPAqYsctdCVkfKL1u
4Esz5+wOsyYmBx910PozL+p2jbTH0x89sSo1qXUQr2JEiNm2iL4I4+ndqhuiq4LO
sQPHCpe4HhYWzIQzJLDurv6hAxxU5PUsGg8XDEGlsyowIPDoIkMgC93RRLGZ/taY
Ivc2FSlwLTSkzBHwVfckakXPvfyFdw8DFL2n66dQbXS9FFNshOF/TFx40iV42i5H
wusyIZIT1y1H74De0EVntUho3xBo3nrrsu1o2NaXsTBoEsYwJiCji4yOZlI1Zh+m
A9coiXKm4hY5
=LqTN
-----END PGP SIGNATURE-----
Merge tag 'pm-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These fix a recent regression causing kernels built with CONFIG_PM
unset to crash on systems that support the Performance and Energy Bias
Hint (EPB), clean up the cpufreq core and some users of transition
notifiers and introduce a new power domain flag into the generic power
domains framework (genpd).
Specifics:
- Fix recent regression causing kernels built with CONFIG_PM unset to
crash on systems that support the Performance and Energy Bias Hint
(EPB) by avoiding to compile the EPB-related code depending on
CONFIG_PM when it is unset (Rafael Wysocki).
- Clean up the transition notifier invocation code in the cpufreq
core and change some users of cpufreq transition notifiers
accordingly (Viresh Kumar).
- Change MAINTAINERS to cover the schedutil governor as part of
cpufreq (Viresh Kumar).
- Simplify cpufreq_init_policy() to avoid redundant computations (Yue
Hu).
- Add explanatory comment to the cpufreq core (Rafael Wysocki).
- Introduce a new flag, GENPD_FLAG_RPM_ALWAYS_ON, to the generic
power domains (genpd) framework along with the first user of it
(Leonard Crestez)"
* tag 'pm-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
soc: imx: gpc: Use GENPD_FLAG_RPM_ALWAYS_ON for ERR009619
PM / Domains: Add GENPD_FLAG_RPM_ALWAYS_ON flag
cpufreq: Update MAINTAINERS to include schedutil governor
cpufreq: Don't find governor for setpolicy drivers in cpufreq_init_policy()
cpufreq: Explain the kobject_put() in cpufreq_policy_alloc()
cpufreq: Call transition notifier only once for each policy
x86: intel_epb: Take CONFIG_PM into account
Patch series "compiler: allow all arches to enable
CONFIG_OPTIMIZE_INLINING", v3.
This patch (of 11):
When function tracing for IPIs is enabled, we get a warning for an
overflow of the ipi_types array with the IPI_CPU_BACKTRACE type as
triggered by raise_nmi():
arch/arm/kernel/smp.c: In function 'raise_nmi':
arch/arm/kernel/smp.c:489:2: error: array subscript is above array bounds [-Werror=array-bounds]
trace_ipi_raise(target, ipi_types[ipinr]);
This is a correct warning as we actually overflow the array here.
This patch raise_nmi() to call __smp_cross_call() instead of
smp_cross_call(), to avoid calling into ftrace. For clarification, I'm
also adding a two new code comments describing how this one is special.
The warning appears to have shown up after commit e7273ff49a ("ARM:
8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI"), which changed the
number assignment from '15' to '8', but as far as I can tell has existed
since the IPI tracepoints were first introduced. If we decide to
backport this patch to stable kernels, we probably need to backport
e7273ff49a as well.
[yamada.masahiro@socionext.com: rebase on v5.1-rc1]
Link: http://lkml.kernel.org/r/20190423034959.13525-2-yamada.masahiro@socionext.com
Fixes: e7273ff49a ("ARM: 8488/1: Make IPI_CPU_BACKTRACE a "non-secure" SGI")
Fixes: 365ec7b173 ("ARM: add IPI tracepoints") # v3.17
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Marek Vasut <marek.vasut@gmail.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Borislav Petkov <bp@suse.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, the notifiers are called once for each CPU of the policy->cpus
cpumask. It would be more optimal if the notifier can be called only
once and all the relevant information be provided to it. Out of the 23
drivers that register for the transition notifiers today, only 4 of them
do per-cpu updates and the callback for the rest can be called only once
for the policy without any impact.
This would also avoid multiple function calls to the notifier callbacks
and reduce multiple iterations of notifier core's code (which does
locking as well).
This patch adds pointer to the cpufreq policy to the struct
cpufreq_freqs, so the notifier callback has all the information
available to it with a single call. The five drivers which perform
per-cpu updates are updated to use the cpufreq policy. The freqs->cpu
field is redundant now and is removed.
Acked-by: David S. Miller <davem@davemloft.net> (sparc)
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
machine_crash_nonpanic_core() does this:
while (1)
cpu_relax();
because the kernel has crashed, and we have no known safe way to deal
with the CPU. So, we place the CPU into an infinite loop which we
expect it to never exit - at least not until the system as a whole is
reset by some method.
In the absence of erratum 754327, this code assembles to:
b .
In other words, an infinite loop. When erratum 754327 is enabled,
this becomes:
1: dmb
b 1b
It has been observed that on some systems (eg, OMAP4) where, if a
crash is triggered, the system tries to kexec into the panic kernel,
but fails after taking the secondary CPU down - placing it into one
of these loops. This causes the system to livelock, and the most
noticable effect is the system stops after issuing:
Loading crashdump kernel...
to the system console.
The tested as working solution I came up with was to add wfe() to
these infinite loops thusly:
while (1) {
cpu_relax();
wfe();
}
which, without 754327 builds to:
1: wfe
b 1b
or with 754327 is enabled:
1: dmb
wfe
b 1b
Adding "wfe" does two things depending on the environment we're running
under:
- where we're running on bare metal, and the processor implements
"wfe", it stops us spinning endlessly in a loop where we're never
going to do any useful work.
- if we're running in a VM, it allows the CPU to be given back to the
hypervisor and rescheduled for other purposes (maybe a different VM)
rather than wasting CPU cycles inside a crashed VM.
However, in light of erratum 794072, Will Deacon wanted to see 10 nops
as well - which is reasonable to cover the case where we have erratum
754327 enabled _and_ we have a processor that doesn't implement the
wfe hint.
So, we now end up with:
1: wfe
b 1b
when erratum 754327 is disabled, or:
1: dmb
nop
nop
nop
nop
nop
nop
nop
nop
nop
nop
wfe
b 1b
when erratum 754327 is enabled. We also get the dmb + 10 nop
sequence elsewhere in the kernel, in terminating loops.
This is reasonable - it means we get the workaround for erratum
794072 when erratum 754327 is enabled, but still relinquish the dead
processor - either by placing it in a lower power mode when wfe is
implemented as such or by returning it to the hypervisior, or in the
case where wfe is a no-op, we use the workaround specified in erratum
794072 to avoid the problem.
These as two entirely orthogonal problems - the 10 nops addresses
erratum 794072, and the wfe is an optimisation that makes the system
more efficient when crashed either in terms of power consumption or
by allowing the host/other VMs to make use of the CPU.
I don't see any reason not to use kexec() inside a VM - it has the
potential to provide automated recovery from a failure of the VMs
kernel with the opportunity for saving a crashdump of the failure.
A panic() with a reboot timeout won't do that, and reading the
libvirt documentation, setting on_reboot to "preserve" won't either
(the documentation states "The preserve action for an on_reboot event
is treated as a destroy".) Surely it has to be a good thing to
avoiding having CPUs spinning inside a VM that is doing no useful
work.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Consolidating the "pen_release" stuff amongst the various SoC
implementations gives credence to having a CPU holding pen for
secondary CPUs. However, this is far from the truth.
Many SoC implementations cargo-cult copied various bits of the pen
release implementation from the initial Realview/Versatile Express
implementation without understanding what it was or why it existed.
The reason it existed is because these are _development_ platforms,
and some board firmware is unable to individually control the
startup of secondary CPUs. Moreover, they do not have a way to
power down or reset secondary CPUs for hot-unplug. Hence, the
pen_release implementation was designed for ARM Ltd's development
platforms to provide a working implementation, even though it is
very far from what is required.
It was decided a while back to reduce the duplication by consolidating
the "pen_release" variable, but this only made the situation worse -
we have ended up with several implementations that read this variable
but do not write it - again, showing the cargo-cult mentality at work,
lack of proper review of new code, and in some cases a lack of testing.
While it would be preferable to remove pen_release entirely from the
kernel, this is not possible without help from the SoC maintainers,
which seems to be lacking. However, I want to remove pen_release from
arch code to remove the credence that having it gives.
This patch removes pen_release from the arch code entirely, adding
private per-SoC definitions for it instead, and explicitly stating
that write_pen_release() is cargo-cult copied and should not be
copied any further. Rename write_pen_release() in a similar fashion
as well.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Arm TC2 fails cpu hotplug stress test.
This issue was tracked down to a missing copy of the new affinity
cpumask for the vexpress-spc interrupt into struct
irq_common_data.affinity when the interrupt is migrated in
migrate_one_irq().
Fix it by replacing the arm specific hotplug cpu migration with the
generic irq code.
This is the counterpart implementation to commit 217d453d47 ("arm64:
fix a migrating irq bug when hotplug cpu").
Tested with cpu hotplug stress test on Arm TC2 (multi_v7_defconfig plus
CONFIG_ARM_BIG_LITTLE_CPUFREQ=y and CONFIG_ARM_VEXPRESS_SPC_CPUFREQ=y).
The vexpress-spc interrupt (irq=22) on this board is affine to CPU0.
Its affinity cpumask now changes correctly e.g. from 0 to 1-4 when
CPU0 is hotplugged out.
Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
In big.Little systems, some CPUs require the Spectre workarounds in
paths such as the context switch, but other CPUs do not. In order
to handle these differences, we need per-CPU vtables.
We are unable to use the kernel's per-CPU variables to support this
as per-CPU is not initialised at times when we need access to the
vtables, so we have to use an array indexed by logical CPU number.
We use an array-of-pointers to avoid having function pointers in
the kernel's read/write .data section.
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
In case panic() and panic() called at the same time on different CPUS.
For example:
CPU 0:
panic()
__crash_kexec
machine_crash_shutdown
crash_smp_send_stop
machine_kexec
BUG_ON(num_online_cpus() > 1);
CPU 1:
panic()
local_irq_disable
panic_smp_self_stop
If CPU 1 calls panic_smp_self_stop() before crash_smp_send_stop(), kdump
fails. CPU1 can't receive the ipi irq, CPU1 will be always online.
To fix this problem, this patch split out the panic_smp_self_stop()
and add set_cpu_online(smp_processor_id(), false).
Signed-off-by: Yufen Wang <wangyufen@huawei.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Check for CPU bugs when secondary processors are being brought online,
and also when CPUs are resuming from a low power mode. This gives an
opportunity to check that processor specific bug workarounds are
correctly enabled for all paths that a CPU re-enters the kernel.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Boot-tested-by: Tony Lindgren <tony@atomide.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Suspending a CPU on a RT kernel results in the following backtrace:
| Disabling non-boot CPUs ...
| BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
| in_atomic(): 1, irqs_disabled(): 128, pid: 18, name: migration/1
| INFO: lockdep is turned off.
| irq event stamp: 122
| hardirqs last enabled at (121): [<c06ac0ac>] _raw_spin_unlock_irqrestore+0x88/0x90
| hardirqs last disabled at (122): [<c06abed0>] _raw_spin_lock_irq+0x28/0x5c
| CPU: 1 PID: 18 Comm: migration/1 Tainted: G W 4.1.4-rt3-01046-g96ac8da #204
| Hardware name: Generic DRA74X (Flattened Device Tree)
| [<c0019134>] (unwind_backtrace) from [<c0014774>] (show_stack+0x20/0x24)
| [<c0014774>] (show_stack) from [<c06a70f4>] (dump_stack+0x88/0xdc)
| [<c06a70f4>] (dump_stack) from [<c006cab8>] (___might_sleep+0x198/0x2a8)
| [<c006cab8>] (___might_sleep) from [<c06ac4dc>] (rt_spin_lock+0x30/0x70)
| [<c06ac4dc>] (rt_spin_lock) from [<c013f790>] (find_lock_task_mm+0x9c/0x174)
| [<c013f790>] (find_lock_task_mm) from [<c00409ac>] (clear_tasks_mm_cpumask+0xb4/0x1ac)
| [<c00409ac>] (clear_tasks_mm_cpumask) from [<c00166a4>] (__cpu_disable+0x98/0xbc)
| [<c00166a4>] (__cpu_disable) from [<c06a2e8c>] (take_cpu_down+0x1c/0x50)
| [<c06a2e8c>] (take_cpu_down) from [<c00f2600>] (multi_cpu_stop+0x11c/0x158)
| [<c00f2600>] (multi_cpu_stop) from [<c00f2a9c>] (cpu_stopper_thread+0xc4/0x184)
| [<c00f2a9c>] (cpu_stopper_thread) from [<c0069058>] (smpboot_thread_fn+0x18c/0x324)
| [<c0069058>] (smpboot_thread_fn) from [<c00649c4>] (kthread+0xe8/0x104)
| [<c00649c4>] (kthread) from [<c0010058>] (ret_from_fork+0x14/0x3c)
| CPU1: shutdown
The root cause of above backtrace is task_lock() which takes a sleeping
lock on -RT.
To fix the issue, move clear_tasks_mm_cpumask() call from __cpu_disable()
to __cpu_die() which is called on the thread which is asking for a target
CPU to be shutdown. In addition, this change restores CPU hotplug
functionality on ARM CPU1 can be unplugged/plugged many times.
Link: http://lkml.kernel.org/r/1441995683-30817-1-git-send-email-grygorii.strashko@ti.com
[bigeasy: slighty edited the commit message]
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
With switch to dynamic exception base address setting, VBAR/Hivecs
set only for boot CPU, but secondaries stay unaware of that. That
might lead to weird effects when trying up to bring up secondaries.
Fixes: ad475117d2 ("ARM: 8649/2: nommu: remove Hivecs configuration is asm")
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: afzal mohammed <afzal.mohd.ma@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Currently, there are several issues with how MPU is setup:
1. We won't boot if MPU is missing
2. We won't boot if use XIP
3. Further extension of MPU setup requires asm skills
The 1st point can be relaxed, so we can continue with boot CPU even if
MPU is missed and fail boot for secondaries only. To address the 2nd
point we could create region covering CONFIG_XIP_PHYS_ADDR - _end and
that might work for the first stage of MPU enable, but due to MPU's
alignment requirement we could cover too much, IOW we need more
flexibility in how we're partitioning memory regions... and it'd be
hardly possible to archive because of the 3rd point.
This patch is trying to address 1st and 3rd issues and paves the path
for 2nd and further improvements.
The most visible change introduced with this patch is that we start
using mpu_rgn_info array (as it was supposed?), so change in MPU setup
done by boot CPU is recorded there and feed to secondaries. It
allows us to keep minimal region setup for boot CPU and do the rest in
C. Since we start programming MPU regions in C evaluation of MPU
constrains (number of regions supported and minimal region order) can
be done once, which in turn open possibility to free-up "probe"
region early.
Tested-by: Szemző András <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.
Adjust the system_state check in ipi_cpu_stop() to handle the extra states.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170516184735.020718977@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are going to split <linux/sched/hotplug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/hotplug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull ARM updates from Russell King:
- nommu updates from Afzal Mohammed cleaning up the vectors support
- allow DMA memory "mapping" for nommu Benjamin Gaignard
- fixing a correctness issue with R_ARM_PREL31 relocations in the
module linker
- add strlen() prototype for the decompressor
- support for DEBUG_VIRTUAL from Florian Fainelli
- adjusting memory bounds after memory reservations have been
registered
- unipher cache handling updates from Masahiro Yamada
- initrd and Thumb Kconfig cleanups
* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (23 commits)
ARM: mm: round the initrd reservation to page boundaries
ARM: mm: clean up initrd initialisation
ARM: mm: move initrd init code out of arm_memblock_init()
ARM: 8655/1: improve NOMMU definition of pgprot_*()
ARM: 8654/1: decompressor: add strlen prototype
ARM: 8652/1: cache-uniphier: clean up active way setup code
ARM: 8651/1: cache-uniphier: include <linux/errno.h> instead of <linux/types.h>
ARM: 8650/1: module: handle negative R_ARM_PREL31 addends correctly
ARM: 8649/2: nommu: remove Hivecs configuration is asm
ARM: 8648/2: nommu: display vectors base
ARM: 8647/2: nommu: dynamic exception base address setting
ARM: 8646/1: mmu: decouple VECTORS_BASE from Kconfig
ARM: 8644/1: Reduce "CPU: shutdown" message to debug level
ARM: 8641/1: treewide: Replace uses of virt_to_phys with __pa_symbol
ARM: 8640/1: Add support for CONFIG_DEBUG_VIRTUAL
ARM: 8639/1: Define KERNEL_START and KERNEL_END
ARM: 8638/1: mtd: lart: Rename partition defines to be prefixed with PART_
ARM: 8637/1: Adjust memory boundaries after reservations
ARM: 8636/1: Cleanup sanity_check_meminfo
ARM: add CPU_THUMB_CAPABLE to indicate possible Thumb support
...
Similar to c68b0274fb ("ARM: reduce "Booted secondary processor"
message to debug level"), demote the "CPU: shutdown" pr_notice() into a
pr_debug().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Apart from adding the helper function itself, the rest of the kernel is
converted mechanically using:
git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)->mm_count);/mmgrab\(\1\);/'
git grep -l 'atomic_inc.*mm_count' | xargs sed -i 's/atomic_inc(&\(.*\)\.mm_count);/mmgrab\(\&\1\);/'
This is needed for a later patch that hooks into the helper, but might
be a worthwhile cleanup on its own.
(Michal Hocko provided most of the kerneldoc comment.)
Link: http://lkml.kernel.org/r/20161218123229.22952-1-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently on arm there is code that checks whether it should call
dump_stack() explicitly, to avoid trying to raise an NMI when the
current context is not preemptible by the backtrace IPI. Similarly, the
forthcoming arch/tile support uses an IPI mechanism that does not
support generating an NMI to self.
Accordingly, move the code that guards this case into the generic
mechanism, and invoke it unconditionally whenever we want a backtrace of
the current cpu. It seems plausible that in all cases, dump_stack()
will generate better information than generating a stack from the NMI
handler. The register state will be missing, but that state is likely
not particularly helpful in any case.
Or, if we think it is helpful, we should be capturing and emitting the
current register state in all cases when regs == NULL is passed to
nmi_cpu_backtrace().
Link: http://lkml.kernel.org/r/1472487169-14923-3-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Aaron Tomlin <atomlin@redhat.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "improvements to the nmi_backtrace code" v9.
This patch series modifies the trigger_xxx_backtrace() NMI-based remote
backtracing code to make it more flexible, and makes a few small
improvements along the way.
The motivation comes from the task isolation code, where there are
scenarios where we want to be able to diagnose a case where some cpu is
about to interrupt a task-isolated cpu. It can be helpful to see both
where the interrupting cpu is, and also an approximation of where the
cpu that is being interrupted is. The nmi_backtrace framework allows us
to discover the stack of the interrupted cpu.
I've tested that the change works as desired on tile, and build-tested
x86, arm, mips, and sparc64. For x86 I confirmed that the generic
cpuidle stuff as well as the architecture-specific routines are in the
new cpuidle section. For arm, mips, and sparc I just build-tested it
and made sure the generic cpuidle routines were in the new cpuidle
section, but I didn't attempt to figure out which the platform-specific
idle routines might be. That might be more usefully done by someone
with platform experience in follow-up patches.
This patch (of 4):
Currently you can only request a backtrace of either all cpus, or all
cpus but yourself. It can also be helpful to request a remote backtrace
of a single cpu, and since we want that, the logical extension is to
support a cpumask as the underlying primitive.
This change modifies the existing lib/nmi_backtrace.c code to take a
cpumask as its basic primitive, and modifies the linux/nmi.h code to use
the new "cpumask" method instead.
The existing clients of nmi_backtrace (arm and x86) are converted to
using the new cpumask approach in this change.
The other users of the backtracing API (sparc64 and mips) are converted
to use the cpumask approach rather than the all/allbutself approach.
The mips code ignored the "include_self" boolean but with this change it
will now also dump a local backtrace if requested.
Link: http://lkml.kernel.org/r/1472487169-14923-2-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Guided by grsecurity's analogous __read_only markings in arch/arm,
this applies several uses of __ro_after_init to structures that are
only updated during __init.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
- Two boot warning fixes from the RCU tree that should have gotten
merged several weeks ago already but did not because of issues
with who merges them. Paul has now split the RCU warning fixes into
sets for various maintainers.
- Fix ams-delta FIQ regression caused by omap1 sparse IRQ changes
- Fix PM for omap3 boards using timer12 and gptimer, like the
original beagleboard
- Fix hangs on am437x-sk-evm by lowering the I2C bus speed
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXY8wuAAoJEBvUPslcq6VzGBQQAJ6OIH0Gws19Wyi8IqnjMLJN
npu+JXU0xP5bBZ+HbCVjyN8k32drhXdwDMQ+u1DvBYwUuyLIIRZPZF4aHb8fDfOC
v1VqUzQRzj1FCh9MlkdqTedA180WCo5PCGlFOon0BmaZlv9WevEaTOYrEgyZPrmk
quBnaE+baZfGxWBbDSN+OrGYobQRs7Eu8bel0gh628CDiajrbwlIyAcNdEn5C/Uu
GHiEuIQcxb4b62mwAwh/t7el9ureirsS1b6mFB41puPmF/lYawI6YaCWIL48lbMd
XsgKGnFDU6dgSO5QRx5PhP/7YP9FetS0U+g7iFhgjmArNCraNQYBY0ltMweOG0qe
M8BPxDuCnhm1Q+PcjBORteN/PF47kcnBMpiJVVTmq5JRlXUqXecKpoScCt9HfPgy
EJq+esLQNIuRw9cEwVPQBj80GyxFUVqL/Rzo7xjEwTDPYRQERGCB7V68iV25on3w
07dOVl/lSwe902ik580wnlGUq+J09wk+9hIKV2XwQAV/8mKaWMc3pA8rE/efLJoC
buAsccxVcEsR3+uLSsU/P+fFm8nfBRmiOO9yIR4gez0BhbiDMc1zpwwhLkI+vTT4
D3PnuUdVeBvoGTNnpwqSURxajhaK0DSKTwhWnWGubYfXd3B48sW76rvKLO1FThgL
qyaed06QFeWj8gV+VZLb
=P0Vi
-----END PGP SIGNATURE-----
Merge tag 'fixes-rcu-fiq-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omaps for v4.7-rc cycle:
- Two boot warning fixes from the RCU tree that should have gotten
merged several weeks ago already but did not because of issues
with who merges them. Paul has now split the RCU warning fixes into
sets for various maintainers.
- Fix ams-delta FIQ regression caused by omap1 sparse IRQ changes
- Fix PM for omap3 boards using timer12 and gptimer, like the
original beagleboard
- Fix hangs on am437x-sk-evm by lowering the I2C bus speed
* tag 'fixes-rcu-fiq-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am437x-sk-evm: Reduce i2c0 bus speed for tps65218
ARM: OMAP2+: timer: add probe for clocksources
ARM: OMAP1: fix ams-delta FIQ handler to work with sparse IRQ
arm: Use _rcuidle for smp_cross_call() tracepoints
arm: Use _rcuidle tracepoint to allow use from idle
Signed-off-by: Olof Johansson <olof@lixom.net>
Further testing with false negatives suppressed by commit 293e2421fe
("rcu: Remove superfluous versions of rcu_read_lock_sched_held()")
identified another unprotected use of RCU from the idle loop. Because RCU
actively ignores idle-loop code (for energy-efficiency reasons, among
other things), using RCU from the idle loop can result in too-short
grace periods, in turn resulting in arbitrary misbehavior.
The resulting lockdep-RCU splat is as follows:
------------------------------------------------------------------------
===============================
[ INFO: suspicious RCU usage. ]
4.6.0-rc5-next-20160426+ #1112 Not tainted
-------------------------------
include/trace/events/ipi.h:35 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1112
Hardware name: Generic OMAP4 (Flattened Device Tree)
[<c0110308>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
[<c010c3a8>] (show_stack) from [<c047fec8>] (dump_stack+0xb0/0xe4)
[<c047fec8>] (dump_stack) from [<c010dcfc>] (smp_cross_call+0xbc/0x188)
[<c010dcfc>] (smp_cross_call) from [<c01c9e28>] (generic_exec_single+0x9c/0x15c)
[<c01c9e28>] (generic_exec_single) from [<c01ca0a0>] (smp_call_function_single_async+0 x38/0x9c)
[<c01ca0a0>] (smp_call_function_single_async) from [<c0603728>] (cpuidle_coupled_poke_others+0x8c/0xa8)
[<c0603728>] (cpuidle_coupled_poke_others) from [<c0603c10>] (cpuidle_enter_state_coupled+0x26c/0x390)
[<c0603c10>] (cpuidle_enter_state_coupled) from [<c0183c74>] (cpu_startup_entry+0x198/0x3a0)
[<c0183c74>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
[<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
------------------------------------------------------------------------
Reported-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <linux-omap@vger.kernel.org>
Cc: <linux-arm-kernel@lists.infradead.org>
printk() takes some locks and could not be used a safe way in NMI
context.
The chance of a deadlock is real especially when printing stacks from
all CPUs. This particular problem has been addressed on x86 by the
commit a9edc88093 ("x86/nmi: Perform a safe NMI stack trace on all
CPUs").
The patchset brings two big advantages. First, it makes the NMI
backtraces safe on all architectures for free. Second, it makes all NMI
messages almost safe on all architectures (the temporary buffer is
limited. We still should keep the number of messages in NMI context at
minimum).
Note that there already are several messages printed in NMI context:
WARN_ON(in_nmi()), BUG_ON(in_nmi()), anything being printed out from MCE
handlers. These are not easy to avoid.
This patch reuses most of the code and makes it generic. It is useful
for all messages and architectures that support NMI.
The alternative printk_func is set when entering and is reseted when
leaving NMI context. It queues IRQ work to copy the messages into the
main ring buffer in a safe context.
__printk_nmi_flush() copies all available messages and reset the buffer.
Then we could use a simple cmpxchg operations to get synchronized with
writers. There is also used a spinlock to get synchronized with other
flushers.
We do not longer use seq_buf because it depends on external lock. It
would be hard to make all supported operations safe for a lockless use.
It would be confusing and error prone to make only some operations safe.
The code is put into separate printk/nmi.c as suggested by Steven
Rostedt. It needs a per-CPU buffer and is compiled only on
architectures that call nmi_enter(). This is achieved by the new
HAVE_NMI Kconfig flag.
The are MN10300 and Xtensa architectures. We need to clean up NMI
handling there first. Let's do it separately.
The patch is heavily based on the draft from Peter Zijlstra, see
https://lkml.org/lkml/2015/6/10/327
[arnd@arndb.de: printk-nmi: use %zu format string for size_t]
[akpm@linux-foundation.org: min_t->min - all types are size_t here]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Jan Kara <jack@suse.cz>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> [arm part]
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jiri Kosina <jkosina@suse.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Let the non boot cpus call into idle with the corresponding hotplug state, so
the hotplug core can handle the further bringup. That's a first step to
convert the boot side of the hotplugged cpus to do all the synchronization
with the other side through the state machine. For now it'll only start the
hotplug thread and kick the full bringup of the cpu.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: Rik van Riel <riel@redhat.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: "Srivatsa S. Bhat" <srivatsa@mit.edu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: http://lkml.kernel.org/r/20160226182341.614102639@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Having IPI_CPU_BACKTRACE as SGI15 may not work if the kernel is
running in non-secure mode and that the secure firmware has
decided to follow ARM's recommendations that SGI8-15 should
be reserved for secure purpose.
Now that we are "only" using SGI0-6, change IPI_CPU_BACKTRACE
to use SGI7, which makes it more likely to work.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since 9a46ad6d6d ("smp: make smp_call_function_many() use logic
similar to smp_call_function_single()"), the core IPI handling
has been simplified, and generic_smp_call_function_interrupt is
now the same as generic_smp_call_function_single_interrupt.
This means that one of IPI_CALL_FUNC and IPI_CALL_FUNC_SINGLE has
become redundant. We can then safely drop IPI_CALL_FUNC_SINGLE,
and use only IPI_CALL_FUNC.
This has the advantage of reducing the number of SGI IDs we're using
(a fairly scarse resource).
Tested on a dual A7 board.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Currently on ARM when <SysRq-L> is triggered from an interrupt handler
(e.g. a SysRq issued using UART or kbd) the main CPU will wedge for ten
seconds with interrupts masked before issuing a backtrace for every CPU
except itself.
The new backtrace code introduced by commit 96f0e00378 ("ARM: add
basic support for on-demand backtrace of other CPUs") does not work
correctly when run from an interrupt handler because IPI_CPU_BACKTRACE
is used to generate the backtrace on all CPUs but cannot preempt the
current calling context.
This can be fixed by detecting that the calling context cannot be
preempted and issuing the backtrace directly in this case. Issuing
directly leaves us without any pt_regs to pass to nmi_cpu_backtrace()
so we also modify the generic code to call dump_stack() when its
argument is NULL.
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This function just copies '*ops' to 'smp_ops', so the given
structure '*ops' is not modified at all.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch adds imprecise abort enable/disable macros and uses them to
enable imprecise aborts early when starting the kernel.
This helps in tracking down the real cause for such imprecise abort, as
they are handled as soon as they occur. Until now those aborts would
only be enabled when entering the userspace and as a consequence crash
the first userspace process if any abort had been raised during kernel
startup.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull NMI backtrace update from Russell King:
"These changes convert the x86 NMI handling to be a library
implementation which other architectures can make use of. Thomas
Gleixner has reviewed and tested these changes, and wishes me to send
these rather than taking them through the tip tree.
The final patch in the set adds an initial implementation using this
infrastructure to ARM, even though it doesn't send the IPI at "NMI"
level. Patches are in progress to add the ARM equivalent of NMI, but
we still need the IRQ-level fallback for systems where the "NMI" isn't
available due to secure firmware denying access to it"
* 'nmi' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: add basic support for on-demand backtrace of other CPUs
nmi: x86: convert to generic nmi handler
nmi: create generic NMI backtrace implementation
The only caller of cpu_die() on ARM is arch_cpu_idle_dead(), so
let's simplify the code by renaming cpu_die() to
arch_cpu_idle_dead(). While were here, drop the __ref annotation
because __cpuinit is gone nowadays.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Writes to /sys/.../cpuX/online fail if we determine the platform
doesn't support hotplug for that CPU. Furthermore, if the cpu_die
op isn't specified the system hangs when we try to offline a CPU
and it comes right back online unexpectedly. Let's figure this
stuff out before we make the sysfs nodes so that the online file
doesn't even exist if it isn't (at least sometimes) possible to
hotplug the CPU.
Add a new 'cpu_can_disable' op and repoint all 'cpu_disable'
implementations at it because all implementers use the op to
indicate if a CPU can be hotplugged or not in a static fashion.
With PSCI we may need to add a 'cpu_disable' op so that the
secure OS can be migrated off the CPU we're trying to hotplug.
In this case, the 'cpu_can_disable' op will indicate that all
CPUs are hotpluggable by returning true, but the 'cpu_disable' op
will make a PSCI migration call and occasionally fail, denying
the hotplug of a CPU. This shouldn't be any worse than x86 where
we may indicate that all CPUs are hotpluggable but occasionally
we can't offline a CPU due to check_irq_vectors_for_cpu_disable()
failing to find a CPU to move vectors to.
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Dave Martin <Dave.Martin@arm.com>
Acked-by: Simon Horman <horms@verge.net.au> [shmobile portion]
Tested-by: Simon Horman <horms@verge.net.au>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: <linux-sh@vger.kernel.org>
Tested-by: Tyler Baker <tyler.baker@linaro.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As we now have generic infrastructure to support backtracing of other
CPUs in the system on lockups, we can start to implement this for ARM.
Initially, we add an IPI based implementation, as the GIC code needs
modification to support the generation of FIQ IPIs, and not all ARM
platforms have the ability to raise a FIQ in the non-secure world.
This provides us with a "best efforts" implementation in the absence
of FIQs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
John Stultz reports an RCU splat on boot with ARM ipi trace
events enabled.
===============================
[ INFO: suspicious RCU usage. ]
4.1.0-rc7-00033-gb5bed2f #153 Not tainted
-------------------------------
include/trace/events/ipi.h:68 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.1.0-rc7-00033-gb5bed2f #153
Hardware name: Qualcomm (Flattened Device Tree)
[<c0216b08>] (unwind_backtrace) from [<c02136e8>] (show_stack+0x10/0x14)
[<c02136e8>] (show_stack) from [<c075e678>] (dump_stack+0x70/0xbc)
[<c075e678>] (dump_stack) from [<c0215a80>] (handle_IPI+0x428/0x604)
[<c0215a80>] (handle_IPI) from [<c020942c>] (gic_handle_irq+0x54/0x5c)
[<c020942c>] (gic_handle_irq) from [<c0766604>] (__irq_svc+0x44/0x7c)
Exception stack(0xc09f3f48 to 0xc09f3f90)
3f40: 00000001 00000001 00000000 c09f73b8 c09f4528 c0a5de9c
3f60: c076b4f0 00000000 00000000 c09ef108 c0a5cec1 00000001 00000000 c09f3f90
3f80: c026bf60 c0210ab8 20000113 ffffffff
[<c0766604>] (__irq_svc) from [<c0210ab8>] (arch_cpu_idle+0x20/0x3c)
[<c0210ab8>] (arch_cpu_idle) from [<c02647f0>] (cpu_startup_entry+0x2c0/0x5dc)
[<c02647f0>] (cpu_startup_entry) from [<c099bc1c>] (start_kernel+0x358/0x3c4)
[<c099bc1c>] (start_kernel) from [<8020807c>] (0x8020807c)
At this point in the IPI handling path we haven't called
irq_enter() yet, so RCU doesn't know that we're about to exit
idle and properly warns that we're using RCU from an idle CPU.
Use trace_ipi_entry_rcuidle() instead of trace_ipi_entry() so
that RCU is informed about our exit from idle.
Fixes: 365ec7b173 ("ARM: add IPI tracepoints")
Reported-by: John Stultz <john.stultz@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Re-engineer the LPAE TTBR setup code. Rather than passing some shifted
address in order to fit in a CPU register, pass either a full physical
address (in the case of r4, r5 for TTBR0) or a PFN (for TTBR1).
This removes the ARCH_PGD_SHIFT hack, and the last dangerous user of
cpu_set_ttbr() in the secondary CPU startup code path (which was there
to re-set TTBR1 to the appropriate high physical address space on
Keystone2.)
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When trying to kexec into a new kernel on a platform where multiple CPU
cores are present, but no SMP bringup code is available yet, the
kexec_load system call fails with:
kexec_load failed: Invalid argument
The SMP test added to machine_kexec_prepare() in commit 2103f6cba6
("ARM: 7807/1: kexec: validate CPU hotplug support") wants to prohibit
kexec on SMP platforms where it cannot disable secondary CPUs.
However, this test is too strict: if the secondary CPUs couldn't be
enabled in the first place, there's no need to disable them later at
kexec time. Hence skip the test in the absence of SMP bringup code.
This allows to add all CPU cores to the DTS from the beginning, without
having to implement SMP bringup first, improving DT compatibility.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The "SMP: Total of %d processors activated." message which we print in
smp_cpus_done() provides no further information than the message in
genreic code in smp_announce(). Kill it.
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>