The restart interrupt is triggered whenever a secondary CPU is
brought online, a remote function call dispatched from another
CPU or a manual PSW restart is initiated and causes the system
to kdump. The handling routine is always called with DAT turned
off. It then initializes the stack frame and invokes a callback.
The existing callbacks handle DAT as follows:
* __do_restart() and __machine_kexec() turn in on upon entry;
* __ipl_run(), __reipl_run() and __dump_run() do not turn it
right away, but all of them call diag308() - which turns DAT
on, but only if kasan is enabled;
In addition to the described complexity all callbacks (and the
functions they call) should avoid kasan instrumentation while
DAT is off.
This update enables DAT in the assembler restart handler and
relieves any callbacks (which are mostly C functions) from
dealing with DAT altogether.
There are four types of CPU restart that initialize control
registers in different ways:
1. Start of secondary CPU on boot - control registers are
inherited from the IPL CPU;
2. Restart of online CPU - control registers of the CPU being
restarted are kept;
3. Hotplug of offline CPU - control registers are inherited
from the starting CPU;
4. Start of offline CPU triggered by manual PSW restart -
the control registers are read from the absolute lowcore
and contain the boot time IPL CPU values updated with all
follow-up calls of smp_ctl_set_bit() and smp_ctl_clear_bit()
routines;
In first three cases contents of the control registers is the
most recent. In the latter case control registers are good
enough to facilitate successful completion of kdump operation.
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
The dma section name is confusing, since the code which resides within
that section has nothing to do with direct memory access. Instead the
limitation is that the code has to run in 31 bit addressing mode, and
therefore has to reside below 2GB. So the name was chosen since
ZONE_DMA is the same region.
To reduce confusion rename the section to amode31, which hopefully
describes better what this is about.
Note: this will also change vmcoreinfo strings
- SDMA=... gets renamed to SAMODE31=...
- EDMA=... gets renamed to EAMODE31=...
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
- Rework inline asm to get rid of error prone "register asm" constructs,
which are problematic especially when code instrumentation is enabled. In
particular introduce and use register pair union to allocate even/odd
register pairs. Unfortunately this breaks compatibility with older
clang compilers and minimum clang version for s390 has been raised to 13.
https://lore.kernel.org/linux-next/CAK7LNARuSmPCEy-ak0erPrPTgZdGVypBROFhtw+=3spoGoYsyw@mail.gmail.com/
- Fix gcc 11 warnings, which triggered various minor reworks all over
the code.
- Add zstd kernel image compression support.
- Rework boot CPU lowcore handling.
- De-duplicate and move kernel memory layout setup logic earlier.
- Few fixes in preparation for FORTIFY_SOURCE performing compile-time
and run-time field bounds checking for mem functions.
- Remove broken and unused power management support leftovers in s390
drivers.
- Disable stack-protector for decompressor and purgatory to fix buildroot
build.
- Fix vt220 sclp console name to match the char device name.
- Enable HAVE_IOREMAP_PROT and add zpci_set_irq()/zpci_clear_irq() in
zPCI code.
- Remove some implausible WARN_ON_ONCEs and remove arch specific counter
transaction call backs in favour of default transaction handling in
perf code.
- Extend/add new uevents for online/config/mode state changes of
AP card / queue device in zcrypt.
- Minor entry and ccwgroup code improvements.
- Other small various fixes and improvements all over the code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmDhuTEACgkQjYWKoQLX
FBjVlggAgDFBkDjlyfvrm4xzmHi7BJMmhrTJIONsSz+3tcA4/u5kE+Hrdrqxm0Uh
ZH4MXBxn4q4Fmoomhu5w5ZDe8o2ip0aN9fFNdsBoP8hurmQbL/IbdTnBETKMrKpV
XpogU2G7p+2nQ0+9+o6PS/vWlZhI88NVh8dWyRd2+5/XdMycgLv2Qm7NpQoACVw1
CbUvxP2PlpZ0wltLvNBKPg1xXMZa3GS0wbVUsS2jiWcr/3VzCqfTHenZJ/RadoE6
axG99QXCbLDMsJgVQcXtlI8K6Z461fAwbNtWZWC+Uq7o5pYuUFW1dovMg9WWF+7T
lFNqXyyNy5wwITRkvuzjlVTE8yzYYg==
=ADZ4
-----END PGP SIGNATURE-----
Merge tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Rework inline asm to get rid of error prone "register asm"
constructs, which are problematic especially when code
instrumentation is enabled.
In particular introduce and use register pair union to allocate
even/odd register pairs. Unfortunately this breaks compatibility with
older clang compilers and minimum clang version for s390 has been
raised to 13.
https://lore.kernel.org/linux-next/CAK7LNARuSmPCEy-ak0erPrPTgZdGVypBROFhtw+=3spoGoYsyw@mail.gmail.com/
- Fix gcc 11 warnings, which triggered various minor reworks all over
the code.
- Add zstd kernel image compression support.
- Rework boot CPU lowcore handling.
- De-duplicate and move kernel memory layout setup logic earlier.
- Few fixes in preparation for FORTIFY_SOURCE performing compile-time
and run-time field bounds checking for mem functions.
- Remove broken and unused power management support leftovers in s390
drivers.
- Disable stack-protector for decompressor and purgatory to fix
buildroot build.
- Fix vt220 sclp console name to match the char device name.
- Enable HAVE_IOREMAP_PROT and add zpci_set_irq()/zpci_clear_irq() in
zPCI code.
- Remove some implausible WARN_ON_ONCEs and remove arch specific
counter transaction call backs in favour of default transaction
handling in perf code.
- Extend/add new uevents for online/config/mode state changes of AP
card / queue device in zcrypt.
- Minor entry and ccwgroup code improvements.
- Other small various fixes and improvements all over the code.
* tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (91 commits)
s390/dasd: use register pair instead of register asm
s390/qdio: get rid of register asm
s390/ioasm: use symbolic names for asm operands
s390/ioasm: get rid of register asm
s390/cmf: get rid of register asm
s390/lib,string: get rid of register asm
s390/lib,uaccess: get rid of register asm
s390/string: get rid of register asm
s390/cmpxchg: use register pair instead of register asm
s390/mm,pages-states: get rid of register asm
s390/lib,xor: get rid of register asm
s390/timex: get rid of register asm
s390/hypfs: use register pair instead of register asm
s390/zcrypt: Switch to flexible array member
s390/speculation: Use statically initialized const for instructions
virtio/s390: get rid of open-coded kvm hypercall
s390/pci: add zpci_set_irq()/zpci_clear_irq()
scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0 for s390
s390/ipl: use register pair instead of register asm
s390/mem_detect: fix tprot() program check new psw handling
...
kernel.h is being used as a dump for all kinds of stuff for a long time.
Here is the attempt to start cleaning it up by splitting out panic and
oops helpers.
There are several purposes of doing this:
- dropping dependency in bug.h
- dropping a loop by moving out panic_notifier.h
- unload kernel.h from something which has its own domain
At the same time convert users tree-wide to use new headers, although for
the time being include new header back to kernel.h to avoid twisted
indirected includes for existing users.
[akpm@linux-foundation.org: thread_info.h needs limits.h]
[andriy.shevchenko@linux.intel.com: ia64 fix]
Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Sebastian Reichel <sre@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Funciton do_restart() is a callback invoked from the
restart CPU routine and passed a single parameter.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
udelay_simple() callers can make use of the now simplified udelay()
implementation. No need to keep it.
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Re-IPL for nvme is currently done by using diag 308 with the "Load Clear"
subcode, which means that all memory will be cleared.
This can increase re-IPL duration considerably on very large machines.
For list-directed IPL like nvme or fcp IPL, a "Load Normal" subcode was
introduced with z14. The "Load Normal" diag 308 subcode allows to re-IPL
without clearing memory.
This patch adds a new "clear" sysfs attribute to /sys/firmware/reipl/nvme,
which can be set to either "0" or "1" to disable or enable re-IPL with
memory clearing. The default value is "0", which disables memory clearing.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Add the nvme dump ipl type, associated data, and sysfs entries. This allows
booting into a stand alone dump environment that resides on an nvme device.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
snprintf() returns the number of bytes that would be written,
which may be greater than the the actual length to be written.
show() methods should return the number of bytes printed into the
buffer. This is the return value of scnprintf().
Link: https://lkml.kernel.org/r/20200509085608.41061-3-chenzhou10@huawei.com
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Populate sysfs and structs with reipl entries for nvme ipl type.
This allows specifying a target nvme device when rebooting/reipling.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Recognize IPL Block's Ipl Type of "nvme". Populate related structs and sysfs
entries.
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Re-IPL for both CCW and FCP is currently done by using diag 308 with the
"Load Clear" subcode, which means that all memory will be cleared.
This can increase re-IPL duration considerably on very large machines.
For CCW devices, there is also a "Load Normal" subcode that was only used
for dump kernels so far. For FCP devices, a similar "Load Normal" subcode
was introduced with z14. The "Load Normal" diag 308 subcode allows to
re-IPL without clearing memory.
This patch adds a new "clear" sysfs attribute to /sys/firmware/reipl for
both the ccw and fcp subdirectories, which can be set to either "0" or "1"
to disable or enable re-IPL with memory clearing. The default value is "0",
which disables memory clearing.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
This reverts commit db9492cef4 ("s390/protvirt: add memory sharing for
diag 308 set/store") which due to ultravisor implementation change is
not needed after all.
Fixes: db9492cef4 ("s390/protvirt: add memory sharing for diag 308 set/store")
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Use the correct bit for detection of the machine capability associated
with the has_secure attribute. It is expected that the underlying
platform (including hypervisors) unsets the bit when they don't provide
secure ipl for their guests.
Fixes: c9896acc78 ("s390/ipl: Provide has_secure sysfs attribute")
Cc: stable@vger.kernel.org # 5.2
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The disabled_wait() function uses its argument as the PSW address when
it stops the CPU with a wait PSW that is disabled for interrupts.
The different callers sometimes use a specific number like 0xdeadbeef
to indicate a specific failure, the early boot code uses 0 and some
other calls sites use __builtin_return_address(0).
At the time a dump is created the current PSW and the registers of a
CPU are written to lowcore to make them avaiable to the dump analysis
tool. For a CPU stopped with disabled_wait the PSW and the registers
do not really make sense together, the PSW address does not point to
the function the registers belong to.
Simplify disabled_wait() by using _THIS_IP_ for the PSW address and
drop the argument to the function.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With a relocatable kernel that could reside at any place in memory, code
and data that has to stay below 2 GB needs special handling.
This patch introduces .dma sections for such text, data and ex_table.
The sections will be part of the decompressor kernel, so they will not
be relocated and stay below 2 GB. Their location is passed over to the
decompressed / relocated kernel via the .boot.preserved.data section.
The duald and aste for control register setup also need to stay below
2 GB, so move the setup code from arch/s390/kernel/head64.S to
arch/s390/boot/head.S. The duct and linkage_stack could reside above
2 GB, but their content has to be preserved for the decompresed kernel,
so they are also moved into the .dma section.
The start and end address of the .dma sections is added to vmcoreinfo,
for crash support, to help debugging in case the kernel crashed there.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Provide an interface for userspace so it can find out if a machine is
capeable of doing secure boot. The interface is, for example, needed for
zipl so it can find out which file format it can/should write to disk.
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
PR: Adjusted to the use in kexec_file later.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Read the IPL Report block provided by secure-boot, add the entries
of the certificate list to the system key ring and print the list
of components.
PR: Adjust to Vasilys bootdata_preserved patch set. Preserve ipl_cert_list
for later use in kexec_file.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The IPL parameter block is used as an interface between Linux and
the machine to query and change the boot device and boot options.
To be able to create IPL parameter block in user space and pass it
as segment to kexec provide an uapi header with proper structure
definitions for the block.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The ipl_info union in struct ipl_parameter_block has the same name as
the struct ipl_info. This does not help while reading the code and the
union in struct ipl_parameter_block does not need to be named. Drop
the name from the union.
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
.boot.preserved.data is a better fit for ipl block than .boot.data
which is discarded after init. Reusing .boot.preserved.data allows to
simplify code a little bit and avoid copying data from .boot.data to
persistent variables.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Some functions from both arch/s390/kernel/ipl.c and
arch/s390/kernel/machine_kexec.c are called without DAT enabled
(or with and without DAT enabled code paths). There is no easy way
to partially disable kasan for those files without a substantial
rework. Disable kasan for both files for now.
To avoid disabling kasan for arch/s390/kernel/diag.c DAT flag is
enabled in diag308 call. pcpu_delegate which disables DAT is marked
with __no_sanitize_address to disable instrumentation for that one
function.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To distinguish zfcpdump case and to be able to parse some of the kernel
command line arguments early (e.g. mem=) ipl block retrieval and command
line construction code is moved to the early boot phase.
"memory_end" is set up correctly respecting "mem=" and hsa_size in case
of the zfcpdump.
arch/s390/boot/string.c is introduced to provide string handling and
command line parsing functions to early boot phase code for the compressed
kernel image case.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
reipl_method and dump_method have been used in addition to reipl_type
and dump_type, because a single reipl_type could be achieved with
multiple reipl_method (same for dump_type/method). After dropping
non-diag308_set based reipl methods, there is a single method per
reipl_type/dump_type and reipl_method and dump_method could be simply
removed.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
s390 kdump reipl implementation relies on os_info kernel structure
residing in old memory being dumped. os_info contains reipl block,
which is used (if valid) by the kdump kernel for reipl parameters.
The problem is that the reipl block and its checksum inside
os_info is updated only when /sys/firmware/reipl/reipl_type is
written. This sets an offset of a reipl block for "reipl_type" and
re-calculates reipl block checksum. Any further alteration of values
under /sys/firmware/reipl/{reipl_type}/ without subsequent write to
/sys/firmware/reipl/reipl_type lead to incorrect os_info reipl block
checksum. In such a case kdump kernel ignores it and reboots using
default logic.
To fix this, os_info reipl block update is moved right before kdump
execution.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
diag308 set has been available for many machine generations, and
alternative reipl code paths has not been exercised and seems to be
broken without noticing for a while now. So, cleaning up all obsolete
reipl methods except currently used ones, assuming that diag308 set
always works.
Also removing not longer needed reset callbacks.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In some cases diag308_set_works used to be misused as "we have valid ipl
parmblock", which is not the case when diag308 set works, but there is
no ipl parmblock (diag308 store returns DIAG308_RC_NOCONFIG). Such checks
are adjusted to reuse ipl_block_valid instead of diag308_set_works.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For both ccw and fcp boot retrieve ipl info from ipl block received via
diag308 store. Old scsi ipl parm block handling and cio_get_iplinfo are
removed. Ipl type is deducted from ipl block (if valid).
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ipl_flags and corresponding enum are not used outside of ipl.c and will
be reworked in later commits.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
ipl_ssid and ipl_devno used to be used during ccw boot when diag308 store
was not available. Reuse ipl_block to store those values.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ipl parm blocks received via "diag308 store" and during scsi boot at
IPL_PARMBLOCK_ORIGIN are merged into the "ipl_block".
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When loadparm is set in reipl parm block, the kernel should also set
DIAG308_FLAGS_LP_VALID flag.
This fixes loadparm ignoring during z/VM fcp -> ccw reipl and kvm direct
boot -> ccw reipl.
Cc: <stable@vger.kernel.org>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add the PPA instruction to the system entry and exit path to switch
the kernel to a different branch prediction behaviour. The instructions
are added via CPU alternatives and can be disabled with the "nospec"
or the "nobp=0" kernel parameter. If the default behaviour selected
with CONFIG_KERNEL_NOBP is set to "n" then the "nobp=1" parameter can be
used to enable the changed kernel branch prediction.
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
bss section is cleared before ipl.c code is called or global variables
are used nowadays. Remove stale comment and __section(.data) from
few global variables.
Also removes static/global variables initialization to 0.
Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It's good to have SPDX identifiers in all files to make it easier to
audit the kernel tree for correct licenses.
Update the arch/s390/kernel/ files with the correct SPDX license
identifier based on the license text in the file itself. The SPDX
identifier is a legally binding shorthand, which can be used instead of
the full boiler plate text.
This work is based on a script and data from Thomas Gleixner, Philippe
Ombredanne, and Kate Stewart.
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the support to create a z/VM named saved segment (NSS). This
feature is not supported since quite a while in favour of jump labels,
function tracing and (now) CPU alternatives. All of these features
require to write to the kernel text section which is not possible if
the kernel is contained within an NSS.
Given that memory savings are minimal if kernel images are shared and
in addition updates of shared images are painful, the NSS feature can
be removed.
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
This reverts the two commits
7afbeb6df2 ("s390/ipl: always use load normal for CCW-type re-IPL")
0f7451ff3a ("s390/ipl: use load normal for LPAR re-ipl")
The two commits did not take into account that behavior of standby
memory changes fundamentally if the re-IPL method is changed from
Load Clear to Load Normal.
In case of the old re-IPL clear method all memory that was initially
in standby state will be put into standby state again within the
re-IPL process. Or in other words: memory that was brought online
before a re-IPL will be offline again after a reboot.
Given that we use different re-IPL methods depending on the hypervisor
and CCW-type vs SCSI re-IPL it is not easy to tell in advance when and
why memory will stay online or will be offline after a re-IPL.
This does also have other side effects, since memory that is online
from the beginning will be in ZONE_NORMAL by default vs ZONE_MOVABLE
for memory that is offline.
Therefore, before the change, a user could online and offline memory
easily since standby memory was always in ZONE_NORMAL. After the
change, and a re-IPL, this depended on which memory parts were online
before the re-IPL.
From a usability point of view the current behavior is more than
suboptimal. Therefore revert these changes until we have a better
solution and get back to a consistent behavior. The bad thing about
this is that the time required for a re-IPL will be significantly
increased for configurations with several 100GB or 1TB of memory.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
commit 14890678687c ("s390/ipl: use load normal for LPAR re-ipl")
missed to convert one code path to use load normal semantics for
re-IPL. Convert the missing code path as well.
Fixes: 14890678687c ("s390/ipl: use load normal for LPAR re-ipl")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends. That changed
when we forked out support for the latter into the export.h file.
This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig. The advantage
in doing so is that module.h itself sources about 15 other headers;
adding significantly to what we feed cpp, and it can obscure what
headers we are effectively using.
Since module.h was the source for init.h (for __init) and for
export.h (for EXPORT_SYMBOL) we consider each change instance
for the presence of either and replace as needed. Build testing
revealed some implicit header usage that was fixed up accordingly.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch
- unifies the old sclp early code and the sclp early printk code, so
they can use common functions
- makes sure all sclp early functions and variables have the same
"sclp_early" prefix
- converts the sclp early printk code into readable code by using
existing data structures instead of hard coded magic arrays
- splits the early sclp code into two files: sclp_early.c and
sclp_early_core.c. The core file contains everything that is
required by the kernel decompressor and may not call functions not
contained within the core file. Otherwise the result would be a
link error.
- changes interrupt handling to be completely synchronous. The old
early sclp code had a small window which allowed to receive several
interrupts instead of exactly the single expected interrupt. This
did hide a subtle potential bug, which is fixed with this large
rework.
- contains a couple of small cleanups.
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Keep sparse and other static code checkers from emitting warnings like:
arch/s390/kernel/ipl.c:1549:14: warning: incorrect type in assignment (different base types)
arch/s390/kernel/ipl.c:1549:14: expected unsigned int [unsigned] csum
arch/s390/kernel/ipl.c:1549:14: got restricted __wsum
All usages in s390 code are ok. Therefore add proper casts.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The early C code within arch/s390/kernel/early.c saves ipl parameters
before the bss section is cleared. When doing that it jumps to code
that is potentially gcov/kcov instrumented. That code in turn will
corrupt an initrd that potentially may reside in the not yet ready to
be used bss section.
Instead of excluding more and more code from gcov/kcov instrumentation
provide an early memmove function which will be used to save ipl
parameters. The verification if these parameters are actually valid
will be done later.
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull s390 updates from Martin Schwidefsky:
"There are a couple of new things for s390 with this merge request:
- a new scheduling domain "drawer" is added to reflect the unusual
topology found on z13 machines. Performance tests showed up to 8
percent gain with the additional domain.
- the new crc-32 checksum crypto module uses the vector-galois-field
multiply and sum SIMD instruction to speed up crc-32 and crc-32c.
- proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in
the generic vmlinux.lds linker script definitions.
- kcov instrumentation support. A prerequisite for that is the
inline assembly basic block cleanup, which is the reason for the
net/iucv/iucv.c change.
- support for 2GB pages is added to the hugetlbfs backend.
Then there are two removals:
- the oprofile hardware sampling support is dead code and is removed.
The oprofile user space uses the perf interface nowadays.
- the ETR clock synchronization is removed, this has been superseeded
be the STP clock synchronization. And it always has been
"interesting" code..
And the usual bug fixes and cleanups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits)
s390/pci: Delete an unnecessary check before the function call "pci_dev_put"
s390/smp: clean up a condition
s390/cio/chp : Remove deprecated create_singlethread_workqueue
s390/chsc: improve channel path descriptor determination
s390/chsc: sanitize fmt check for chp_desc determination
s390/cio: make fmt1 channel path descriptor optional
s390/chsc: fix ioctl CHSC_INFO_CU command
s390/cio/device_ops: fix kernel doc
s390/cio: allow to reset channel measurement block
s390/console: Make preferred console handling more consistent
s390/mm: fix gmap tlb flush issues
s390/mm: add support for 2GB hugepages
s390: have unique symbol for __switch_to address
s390/cpuinfo: show maximum thread id
s390/ptrace: clarify bits in the per_struct
s390: stack address vs thread_info
s390: remove pointless load within __switch_to
s390: enable kcov support
s390/cpumf: use basic block for ecctr inline assembly
s390/hypfs: use basic block for diag inline assembly
...
This reverts commit 852ffd0f4e.
There are use cases where an intermediate boot kernel (1) uses kexec
to boot the final production kernel (2). For this scenario we should
provide the original boot information to the production kernel (2).
Therefore clearing the boot information during kexec() should not
be done.
Cc: stable@vger.kernel.org # v3.17+
Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rename DIAG308_IPL and DIAG308_DUMP to DIAG308_LOAD_CLEAR and
DIAG308_LOAD_NORMAL_DUMP to better reflect the associated IPL
functions.
Suggested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Avoid clearing memory for CCW-type re-ipl within a logical
partition. This can save a significant amount of time if a logical
partition contains a lot of memory.
On the other hand we still clear memory if running within a second
level hypervisor, since the hypervisor can simply free all memory that
was used for the guest.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>