Perf-fuzzer triggers non-existent MSR access in RAPL driver on
Haswell-EX.
Haswell/Broadwell server and client have differnt RAPL events.
Since 'commit 7f2236d0bf ("perf/x86/rapl: Use Intel family macros for
RAPL")', it accidentally assign RAPL client events to server.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-kernel@vger.kernel.org
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
1) BPF speculation prevention and BPF_JIT_ALWAYS_ON, from Alexei
Starovoitov.
2) Revert dev_get_random_name() changes as adjust the error code
returns seen by userspace definitely breaks stuff.
3) Fix TX DMA map/unmap on older iwlwifi devices, from Emmanuel
Grumbach.
4) From wrong AF family when requesting sock diag modules, from Andrii
Vladyka.
5) Don't add new ipv6 routes attached to the null_entry, from Wei Wang.
6) Some SCTP sockopt length fixes from Marcelo Ricardo Leitner.
7) Don't leak when removing VLAN ID 0, from Cong Wang.
8) Hey there's a potential leak in ipv6_make_skb() too, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
ipv6: sr: fix TLVs not being copied using setsockopt
ipv6: fix possible mem leaks in ipv6_make_skb()
mlxsw: spectrum_qdisc: Don't use variable array in mlxsw_sp_tclass_congestion_enable
mlxsw: pci: Wait after reset before accessing HW
nfp: always unmask aux interrupts at init
8021q: fix a memory leak for VLAN 0 device
of_mdio: avoid MDIO bus removal when a PHY is missing
caif_usb: use strlcpy() instead of strncpy()
doc: clarification about setting SO_ZEROCOPY
net: gianfar_ptp: move set_fipers() to spinlock protecting area
sctp: make use of pre-calculated len
sctp: add a ceiling to optlen in some sockopts
sctp: GFP_ATOMIC is not needed in sctp_setsockopt_events
bpf: introduce BPF_JIT_ALWAYS_ON config
bpf: avoid false sharing of map refcount with max_entries
ipv6: remove null_entry before adding default route
SolutionEngine771x: add Ether TSU resource
SolutionEngine771x: fix Ether platform data
docs-rst: networking: wire up msg_zerocopy
net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg()
...
This contains what I hope are the last RISC-V changes to go into 4.15.
I know it's a bit last minute, but I think they're all fairly small
changes:
* SR_* constants have been renamed to match the latest ISA
specification.
* Some CONFIG_MMU #ifdef cruft has been removed. We've never supported
!CONFIG_MMU.
* __NR_riscv_flush_icache is now visible to userspace. We were hoping
to avoid making this public in order to force userspace to call the
vDSO entry, but it looks like QEMU's user-mode emulation doesn't want
to emulate a vDSO. In order to allow glibc to fall back to a system
call when the vDSO entry doesn't exist we're just
* Our defconfig is no long empty. This is another one that just slipped
through the cracks. The defconfig isn't perfect, but it's at least
close to what users will want for the first RISC-V development board.
Getting closer is kind of splitting hairs here: none of the RISC-V
specific drivers are in yet, so it's not like things will boot out of
the box.
The only one that's strictly necessary is the __NR_riscv_flush_icache
change, as I want that to be part of the public API starting from our
first kernel so nobody has to worry about it. The others are nice to
haves, but they seem sane for 4.15 to me.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlpU/UcTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQdy5D/9fcTwXTk98U2gSoR4Dv25tztqbNMhw
+Lae5EeIqAaPI4xfyLGldJe0BWAJaouZWIY5xkB5JWzsdYPx/jYgC+SbwI/3aGVy
VjcU0d4haZtz2kdm0Y0ZKIGg91vDlULoVvcxrM8Jff0gDmyKoT1OjwKpt3esyhmN
Vc+iC0FxtJow/xIaFlnPa42qh/pFkcLDmmY/Im6N8IEcyHBT6vCDnD3CgCFY/hdu
9vcWJDvFBj4SFwL8y+ajspQ4tPzDt4Ko+3NLxtEv+19y3NEgLm+shbxv/J8AVO8O
BvBr51QfggM2rAqGzCa4nEZZR7Roxgg9bJVQARXyzX1tUhtBEz9+eUArJ0tzMtbx
GyXYY5NwyupDJ/MA9yn+GqYlLNnS2yL2y0zIBJehi/37+KpAFtH/cRnA58sXViqw
IKGhKW7JCGU3/xyW+RtuY3N5urU18+qE4CZRLtI5QN0QRcTWLhqqQvQRud86HqqD
g4KPo6g9Z6Ak9Xu81n/liIExp3Vp2kpQUts1lCF1D+4WYRwpb4Mqy4HiOCSf/OO2
wOuX5HY+tbS8yvupgYjszTXaYDn35RoGkcjK9o1Lkq9RgI5kzHDyaQrSK/c/oAzn
A7cJ2z7dBaV0W4O7R+2SJ2k9DHw1db/WVf19pKVjSi5osSoUds5w1YxHK25cSBUz
+47LVCgkQI/Scw==
=PhUK
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-4.15-rc8_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux
Pull RISC-V updates from Palmer Dabbelt:
"This contains what I hope are the last RISC-V changes to go into 4.15.
I know it's a bit last minute, but I think they're all fairly small
changes:
- SR_* constants have been renamed to match the latest ISA
specification.
- Some CONFIG_MMU #ifdef cruft has been removed. We've never
supported !CONFIG_MMU.
- __NR_riscv_flush_icache is now visible to userspace. We were hoping
to avoid making this public in order to force userspace to call the
vDSO entry, but it looks like QEMU's user-mode emulation doesn't
want to emulate a vDSO. In order to allow glibc to fall back to a
system call when the vDSO entry doesn't exist we're just
- Our defconfig is no long empty. This is another one that just
slipped through the cracks. The defconfig isn't perfect, but it's
at least close to what users will want for the first RISC-V
development board. Getting closer is kind of splitting hairs here:
none of the RISC-V specific drivers are in yet, so it's not like
things will boot out of the box.
The only one that's strictly necessary is the __NR_riscv_flush_icache
change, as I want that to be part of the public API starting from our
first kernel so nobody has to worry about it. The others are nice to
haves, but they seem sane for 4.15 to me"
* tag 'riscv-for-linus-4.15-rc8_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux:
riscv: rename SR_* constants to match the spec
riscv: remove CONFIG_MMU ifdefs
RISC-V: Make __NR_riscv_flush_icache visible to userspace
RISC-V: Add a basic defconfig
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.15.
- Maciej Rozycki found another series of FP issues which requires a
seven part series to restructure and fix.
- James fixes a warning about .set mt which gas doesn't like when
building for R1 processors"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Validate PR_SET_FP_MODE prctl(2) requests against the ABI of the task
MIPS: Disallow outsized PTRACE_SETREGSET NT_PRFPREG regset accesses
MIPS: Also verify sizeof `elf_fpreg_t' with PTRACE_SETREGSET
MIPS: Fix an FCSR access API regression with NT_PRFPREG and MSA
MIPS: Consistently handle buffer counter with PTRACE_SETREGSET
MIPS: Guard against any partial write attempt with PTRACE_SETREGSET
MIPS: Factor out NT_PRFPREG regset access helpers
MIPS: CPS: Fix r1 .set mt assembler warning
After the Ether platform data is fixed, the driver probe() method would
still fail since the 'struct sh_eth_cpu_data' corresponding to SH771x
indicates the presence of TSU but the memory resource for it is absent.
Add the missing TSU resource to both Ether devices and fix the harmless
off-by-one error in the main memory resources, while at it...
Fixes: 4986b99688 ("net: sh_eth: remove the SH_TSU_ADDR")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'sh_eth' driver's probe() method would fail on the SolutionEngine7710
board and crash on SolutionEngine7712 board as the platform code is
hopelessly behind the driver's platform data -- it passes the PHY address
instead of 'struct sh_eth_plat_data *'; pass the latter to the driver in
order to fix the bug...
Fixes: 71557a37ad ("[netdrvr] sh_eth: Add SH7619 support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RISC-V port doesn't suport a nommu mode, so there is no reason
to provide some code only under a CONFIG_MMU ifdef.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
We were hoping to avoid making this visible to userspace, but it looks
like we're going to have to because QEMU's user-mode emulation doesn't
want to emulate a vDSO. Having vDSO-only system calls was a bit
unothodox anyway, so I think in this case it's OK to just make the
actual system call number public.
This patch simply moves the definition of __NR_riscv_flush_icache
availiable to userspace, which results in the deletion of the now empty
vdso-syscalls.h.
Changes since v1:
* I've moved the definition into uapi/asm/syscalls.h rathen than
uapi/asm/unistd.h. This allows me to keep asm/unistd.h, so we can
keep the syscall table macros sane.
* As a side effect of the above, this no longer disables all system
calls on RISC-V. Whoops!
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
This patch provides a basic defconfig for the RISC-V
architecture that enables enough kernel features to run a
basic Linux distribution on qemu's "virt" board for native
software development. Features include:
- serial console
- virtio block and network device support
- VFAT and ext2/3/4 filesystem support
- NFS client and NFS rootfs support
- an assortment of other kernel features required for
running systemd
It also enables a number of drivers for physical hardware
that target the "SiFive U500" SoC and the corresponding
development platform. These include:
- PCIe host controller support for the FPGA-based U500
development platform (PCIE_XILINX)
- USB host controller support (OHCI/EHCI/XHCI)
- USB HID (keyboard/mouse) support
- USB mass storage support (bulk and UAS)
- SATA support (AHCI)
- ethernet drivers (MACB for a SoC-internal MAC block, microsemi
ethernet phy, E1000E and R8169 for PCIe-connected external devices)
- DRM and framebuffer console support for PCIe-connected
Radeon graphics chips
Signed-off-by: Karsten Merker <merker@debian.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Pull parisc fixes from Helge Deller:
- Many small fixes to show the real physical addresses of devices
instead of hashed addresses.
- One important fix to unbreak 32-bit SMP support: We forgot to 16-byte
align the spinlocks in the assembler code.
- Qemu support: The host will get a chance to sleep when the parisc
guest is idle. We use the same mechanism as the power architecture by
overlaying the "or %r10,%r10,%r10" instruction which is simply a nop
on real hardware.
* 'parisc-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: qemu idle sleep support
parisc: Fix alignment of pa_tlb_lock in assembly on 32-bit SMP kernel
parisc: Show unhashed EISA EEPROM address
parisc: Show unhashed HPA of Dino chip
parisc: Show initial kernel memory layout unhashed
parisc: Show unhashed hardware inventory
s390:
* Two fixes for potential bitmap overruns in the cmma migration code
x86:
* Clear guest provided GPRs to defeat the Project Zero PoC for CVE
2017-5715
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJaUTJ4AAoJEED/6hsPKofohk0IAJAFlMG66u5MxC0kSM61U4Zf
1vkzRwAkBbcN82LpGQKbqabVyTq0F3aLipyOn6WO5SN0K5m+OI2OV/aAroPyX8bI
F7nWIqTXLhJ9X6KXINFvyavHMprvWl8PA72tR/B/7GhhfShrZ2wGgqhl0vv/kCUK
/8q+5e693yJqw8ceemin9a6kPJrLpmjeH+Oy24KIlGbvJWV4UrIE86pRHnAnBtg8
L7Vbxn5+ezKmakvBh+zF8NKcD1zHDcmQZHoYFPsQT0vX5GPoYqT2bcO6gsh1Grmp
8ti6KkrnP+j2A/OEna4LBWfwKI/1xHXneB22BYrAxvNjHt+R4JrjaPpx82SEB4Y=
=URMR
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"s390:
- Two fixes for potential bitmap overruns in the cmma migration code
x86:
- Clear guest provided GPRs to defeat the Project Zero PoC for CVE
2017-5715"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: vmx: Scrub hardware GPRs at VM-exit
KVM: s390: prevent buffer overrun on memory hotplug during migration
KVM: s390: fix cmma migration for multiple memory slots
Just one fix to correctly return SEGV_ACCERR when we take a SEGV on a mapped
region. The bug was introduced in the refactoring of the page fault handler we
did in the previous release.
Thanks to:
John Sperbeck.
-----BEGIN PGP SIGNATURE-----
iQIwBAABCAAaBQJaULvEExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYBN
hw//W6NeKl1iS7Xf5tcFX97I/TBakl0rS7gHGBMheQT4IGengQ3dJqfCMdq6nfyx
Sss72UG1sVfeYNG5djJjoZ3hmnt1CjcMkRlQotUhFYACufHYiI8DwYlUTNYpQR6f
z6uaItK2S5JWSzuk8wOe/VUmBqyIfe+LIWplh8uGy1vn93avbyHrJAAtPFeSyiBm
TA6EMubY4n1NpNWyGIWBILO7e1yI0xT4jctwNy/ZAGC0lgutFb4sWY/ZxgYlQyKo
fxb0Al8REpY73IjbZbSzcZ1GfdzDztda1fCNyUeKShRInSJTp31zasn4YCXzYOU8
8yLw5DcnlA9Fyy7BV0IuFtAfV4wUHS9NDe8ebX6xKXarurTCwugoSbmHCQ8E7jIC
4FFVhArQdraY+tumOwouJA7g4nGUtGV6rpZAUnd++7xVvFspJiAbFpbU2vUNnnJ4
VoU2lvWjox9r6wxT001Ct/4M+XoR8+nnEKs8bll1771CyV+AQ4fGqoDga3dOB2cC
M1ejwLFZ80ZnXDUY6wc4Wzor3G1knVRzuRLEcsoAe4vJunGsS1i9tYce9bOJ9la+
okcmoPm0roPJSiT1bmiptIAsJRjZZq2cxr2+lBBQ9zlNuZyEIY/CwrBM/0ZP6RJI
ljbjOCj0xvJBkmIBSenOVO/tIBi/Ww+wL2MDzsYv+K1pp24=
=VAzk
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Just one fix to correctly return SEGV_ACCERR when we take a SEGV on a
mapped region. The bug was introduced in the refactoring of the page
fault handler we did in the previous release.
Thanks to John Sperbeck"
* tag 'powerpc-4.15-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR
Add qemu idle sleep support when running under qemu with SeaBIOS PDC
firmware.
Like the power architecture we use the "or" assembler instructions,
which translate to nops on real hardware, to indicate that qemu shall
idle sleep.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Richard Henderson <rth@twiddle.net>
CC: stable@vger.kernel.org # v4.9+
Recent changes made a bit of an inconsistent mess out of arch/x86/events/msr.c,
fix it:
- re-align the initialization tables to be vertically aligned and readable again
- harmonize comment style in terms of punctuation, capitalization and spelling
- use curly braces for multi-condition branches
- remove extra newlines
- simplify the code a bit
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1515169132-3980-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds support for the Digital Readout provided by the
IA32_THERM_STATUS MSR (0x19C) on Intel X86 processors. The readout
shows the number of degrees Celcius to the TCC (critical temperature)
supported by the processor. Thus, the larger, the better.
The perf_event support is provided via the msr PMU. The new
logical event is called cpu_thermal_margin. It comes with a unit and
snapshot files. The event shows the current temprature distance (margin).
It is not an accumulating event. The unit is degrees C. The event
is provided per logical CPU to make things simpler but it is the
same for both hyper-threads sharing a physical core.
$ perf stat -I 1000 -a -A -e msr/cpu_thermal_margin/
This will print the temperature for all logical CPUs.
time CPU counts unit events
1.000123741 CPU0 38 C msr/cpu_thermal_margin/
1.000161837 CPU1 37 C msr/cpu_thermal_margin/
1.000187906 CPU2 36 C msr/cpu_thermal_margin/
1.000189046 CPU3 39 C msr/cpu_thermal_margin/
1.000283044 CPU4 40 C msr/cpu_thermal_margin/
1.000344297 CPU5 40 C msr/cpu_thermal_margin/
1.000365832 CPU6 39 C msr/cpu_thermal_margin/
...
In case the temperature margin cannot be read, the reported value would be -1.
Works on all processors supporting the Digital Readout (dtherm in cpuinfo)
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1515169132-3980-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull more x86 pti fixes from Thomas Gleixner:
"Another small stash of fixes for fallout from the PTI work:
- Fix the modules vs. KASAN breakage which was caused by making
MODULES_END depend of the fixmap size. That was done when the cpu
entry area moved into the fixmap, but now that we have a separate
map space for that this is causing more issues than it solves.
- Use the proper cache flush methods for the debugstore buffers as
they are mapped/unmapped during runtime and not statically mapped
at boot time like the rest of the cpu entry area.
- Make the map layout of the cpu_entry_area consistent for 4 and 5
level paging and fix the KASLR vaddr_end wreckage.
- Use PER_CPU_EXPORT for per cpu variable and while at it unbreak
nvidia gfx drivers by dropping the GPL export. The subject line of
the commit tells it the other way around, but I noticed that too
late.
- Fix the ASM alternative macros so they can be used in the middle of
an inline asm block.
- Rename the BUG_CPU_INSECURE flag to BUG_CPU_MELTDOWN so the attack
vector is properly identified. The Spectre mitigations will come
with their own bug bits later"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN
x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm
x86/tlb: Drop the _GPL from the cpu_tlbstate export
x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers
x86/kaslr: Fix the vaddr_end mess
x86/mm: Map cpu_entry_area at the same place on 4/5 level
x86/mm: Set MODULES_END to 0xffffffffff000000
Pull EFI updates from Thomas Gleixner:
- A fix for a add_efi_memmap parameter regression which ensures that
the parameter is parsed before it is used.
- Reinstate the virtual capsule mapping as the cached copy turned out
to break Quark and other things
- Remove Matt Fleming as EFI co-maintainer. He stepped back a few days
ago. Thanks Matt for all your great work!
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Remove Matt Fleming as EFI co-maintainer
efi/capsule-loader: Reinstate virtual capsule mapping
x86/efi: Fix kernel param add_efi_memmap regression
Pull s390 fixes from Martin Schwidefsky:
"Four bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/dasd: fix wrongly assigned configuration data
s390: fix preemption race in disable_sacf_uaccess
s390/sclp: disable FORTIFY_SOURCE for early sclp code
s390/pci: handle insufficient resources during dma tlb flush
Guest GPR values are live in the hardware GPRs at VM-exit. Do not
leave any guest values in hardware GPRs after the guest GPR values are
saved to the vcpu_vmx structure.
This is a partial mitigation for CVE 2017-5715 and CVE 2017-5753.
Specifically, it defeats the Project Zero PoC for CVE 2017-5715.
Suggested-by: Eric Northup <digitaleric@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Eric Northup <digitaleric@google.com>
Reviewed-by: Benjamin Serebrin <serebrin@google.com>
Reviewed-by: Andrew Honig <ahonig@google.com>
[Paolo: Add AMD bits, Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use the name associated with the particular attack which needs page table
isolation for mitigation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Jiri Koshina <jikos@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Lutomirski <luto@amacapital.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Greg KH <gregkh@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos
Where an ALTERNATIVE is used in the middle of an inline asm block, this
would otherwise lead to the following instruction being appended directly
to the trailing ".popsection", and a failed compile.
Fixes: 9cebed423c ("x86, alternative: Use .pushsection/.popsection")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: ak@linux.intel.com
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180104143710.8961-8-dwmw@amazon.co.uk
The recent changes for PTI touch cpu_tlbstate from various tlb_flush
inlines. cpu_tlbstate is exported as GPL symbol, so this causes a
regression when building out of tree drivers for certain graphics cards.
Aside of that the export was wrong since it was introduced as it should
have been EXPORT_PER_CPU_SYMBOL_GPL().
Use the correct PER_CPU export and drop the _GPL to restore the previous
state which allows users to utilize the cards they payed for.
As always I'm really thrilled to make this kind of change to support the
#friends (or however the hot hashtag of today is spelled) from that closet
sauce graphics corp.
Fixes: 1e02ce4ccc ("x86: Store a per-cpu shadow copy of CR4")
Fixes: 6fd166aae7 ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Thomas reported the following warning:
BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
caller is native_flush_tlb_single+0x57/0xc0
native_flush_tlb_single+0x57/0xc0
__set_pte_vaddr+0x2d/0x40
set_pte_vaddr+0x2f/0x40
cea_set_pte+0x30/0x40
ds_update_cea.constprop.4+0x4d/0x70
reserve_ds_buffers+0x159/0x410
x86_reserve_hardware+0x150/0x160
x86_pmu_event_init+0x3e/0x1f0
perf_try_init_event+0x69/0x80
perf_event_alloc+0x652/0x740
SyS_perf_event_open+0x3f6/0xd60
do_syscall_64+0x5c/0x190
set_pte_vaddr is used to map the ds buffers into the cpu entry area, but
there are two problems with that:
1) The resulting flush is not supposed to be called in preemptible context
2) The cpu entry area is supposed to be per CPU, but the debug store
buffers are mapped for all CPUs so these mappings need to be flushed
globally.
Add the necessary preemption protection across the mapping code and flush
TLBs globally.
Fixes: c1961a4631 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net
vaddr_end for KASLR is only documented in the KASLR code itself and is
adjusted depending on config options. So it's not surprising that a change
of the memory layout causes KASLR to have the wrong vaddr_end. This can map
arbitrary stuff into other areas causing hard to understand problems.
Remove the whole ifdef magic and define the start of the cpu_entry_area to
be the end of the KASLR vaddr range.
Add documentation to that effect.
Fixes: 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Reported-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>,
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
There is no reason for 4 and 5 level pagetables to have a different
layout. It just makes determining vaddr_end for KASLR harder than
necessary.
Fixes: 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Gilbert <benjamin.gilbert@coreos.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable <stable@vger.kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>,
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
Since f06bdd4001 ("x86/mm: Adapt MODULES_END based on fixmap section size")
kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary.
So passing page unaligned address to kasan_populate_zero_shadow() have two
possible effects:
1) It may leave one page hole in supposed to be populated area. After commit
21506525fb ("x86/kasan/64: Teach KASAN about the cpu_entry_area") that
hole happens to be in the shadow covering fixmap area and leads to crash:
BUG: unable to handle kernel paging request at fffffbffffe8ee04
RIP: 0010:check_memory_region+0x5c/0x190
Call Trace:
<NMI>
memcpy+0x1f/0x50
ghes_copy_tofrom_phys+0xab/0x180
ghes_read_estatus+0xfb/0x280
ghes_notify_nmi+0x2b2/0x410
nmi_handle+0x115/0x2c0
default_do_nmi+0x57/0x110
do_nmi+0xf8/0x150
end_repeat_nmi+0x1a/0x1e
Note, the crash likely disappeared after commit 92a0f81d89, which
changed kasan_populate_zero_shadow() call the way it was before
commit 21506525fb.
2) Attempt to load module near MODULES_END will fail, because
__vmalloc_node_range() called from kasan_module_alloc() will hit the
WARN_ON(!pte_none(*pte)) in the vmap_pte_range() and bail out with error.
To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned
which means that MODULES_END should be 8*PAGE_SIZE aligned.
The whole point of commit f06bdd4001 was to move MODULES_END down if
NR_CPUS is big, so the cpu_entry_area takes a lot of space.
But since 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
the cpu_entry_area is no longer in fixmap, so we could just set
MODULES_END to a fixed 8*PAGE_SIZE aligned address.
Fixes: f06bdd4001 ("x86/mm: Adapt MODULES_END based on fixmap section size")
Reported-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Garnier <thgarnie@google.com>
Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com
Fixes this time include mostly device tree changes, as usual,
the notable ones include:
- A number of patches to fix most of the remaining DTC warnings
that got introduced when DTC started warning about some
obvious mistakes. We still have some remaining warnings that
probably may have to wait until 4.16 to get fixed while we
try to figure out what the correct contents should be.
- On Allwinner A64, Ethernet PHYs need a fix after a mistake in
coordination between patches merged through multiple branches.
- Various fixes for PMICs on allwinner based boards
- Two fixes for ethernet link detection on some Renesas machines
- Two stability fixes for rockchip based boards
Aside from device-tree, two other areas got fixes for older
problems:
- For TI Davinci DM365, a couple of fixes were needed to repair
the MMC DMA engine support, apparently this has been broken for
a while.
- One important fix for all Allwinner chips with the PMIC driver
as a loadable module.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJaTlgxAAoJEGCrR//JCVInCaMQAJAeEXqM3h0t353xWWdAw7N3
6iYcRMgGz0s6xx1+k6s8ez0hyooDn6d19j/dhFV5RcfL5iMKzYtM0mbzGhB9NCLl
uawJDPfuYKe3AVP4qnzNOU6qFNr6rp8+qY/ow0/tZtY+CzabEQKSe1TBOM2dNfF0
qEyHn55+s5HA+sjNOyy8NVPEFRP8OFU/8gFc7Hbacn4hbwxFeuwNxA+6PQCzPnd0
rMo5IwUMNoj04zu1SPGznaqJRMbhvYJr4tOmolPx4U2srInLK0mIFkhoBhVFrEHR
9mFfCayrKoZe+lq1cVHyoFTH4KWAc2RgcfeautWb5h/Nx9NFMKxOs5HCxXokrgUW
RFoELI35fJ0Mo+xdU1Yi7sppuTV27Br/Okx/ozuYkZGDxY/uj96TGTajFcEaE5aM
jZ/G5VgF16l03EBiDBwGkdI+BuHQeC+ulih8O6akhfW+NQlaK1egKiZiXyKWmpkp
wkEt3GCQsqB51lt1DMrF1toOoun7sTWkMb7PKBZjwQ7E6r2JHk93x76mfH077rWy
2rnfnYKqmWh70LQmOmLBpuB9M29xRv/tJH1u5MLyZSE6Q8cJOVI+v3NcKpxe5FJ1
Q7pLE9lkDTP8CYjVD0nFcNqH9SbklX+5O3AIb9mA6KEs3RVoNvBrTUnAYae2eMEz
tIOt3n2Uqh4dxqPZcmy/
=mQiF
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"Fixes this time include mostly device tree changes, as usual, the
notable ones include:
- A number of patches to fix most of the remaining DTC warnings that
got introduced when DTC started warning about some obvious
mistakes. We still have some remaining warnings that probably may
have to wait until 4.16 to get fixed while we try to figure out
what the correct contents should be.
- On Allwinner A64, Ethernet PHYs need a fix after a mistake in
coordination between patches merged through multiple branches.
- Various fixes for PMICs on allwinner based boards
- Two fixes for ethernet link detection on some Renesas machines
- Two stability fixes for rockchip based boards
Aside from device-tree, two other areas got fixes for older problems:
- For TI Davinci DM365, a couple of fixes were needed to repair the
MMC DMA engine support, apparently this has been broken for a
while.
- One important fix for all Allwinner chips with the PMIC driver as a
loadable module"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits)
arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC
arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property
arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property
ARM: dts: tango4: remove bogus interrupt-controller property
ARM: dts: ls1021a: fix incorrect clock references
ARM: dts: aspeed-g4: Correct VUART IRQ number
ARM: dts: exynos: Enable Mixer node for Exynos5800 Peach Pi machine
ARM: dts: sun8i: a711: Reinstate the PMIC compatible
ARM: davinci: fix mmc entries in dm365's dma_slave_map
ARM: dts: da850-lego-ev3: Fix battery voltage gpio
ARM: davinci: Add dma_mask to dm365's eDMA device
ARM: davinci: Use platform_device_register_full() to create pdev for dm365's eDMA
arm64: dts: rockchip: limit rk3328-rock64 gmac speed to 100MBit for now
arm64: dts: rockchip: remove vdd_log from rk3399-puma
arm64: dts: orange-pi-zero-plus2: fix sdcard detect
arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3
ARM: dts: sunxi: Convert to CCU index macros for HDMI controller
sunxi-rsb: Include OF based modalias in device uevent
ARM: dts: at91: disable the nxp,se97b SMBUS timeout on the TSE-850
arm64: dts: rockchip: fix trailing 0 in rk3328 tsadc interrupts
...
This is probably a copy-paste mistake. The gpio-ranges of PXs3 is
different from that of LD20.
Fixes: 277b51e705 ("arm64: dts: uniphier: add GPIO controller nodes")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
First, one fix that adds proper regulator references for the EMAC
external PHYs on A64 boards. The EMAC bindings were developed for 4.13,
but reverted at the last minute. They were finalized and brought back
for 4.15. However in the time between, regulator support for the A64
boards was merged. When EMAC device tree changes were reintroduced,
this was not taken into account.
Second, a patch that adds OF based modalias uevent for RSB slave devices.
This has been missing since the introduction of RSB, and recently with
PMIC regulator support introduced for the A64, has been seen affecting
distributions, which have the all-important PMIC mfd drivers built as
modules, which then don't get loaded.
Other minor cleanups include final conversion of raw indices to CCU
binding macros for sun[4567]i HDMI, cleanup of dummy regulators on the
A64 SOPINE, a SD card detection polarity fix for the Orange Pi Zero
Plus2, and adding a missing compatible for the PMIC on the TBS A711
tablet.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCgAsFiEE2nN1m/hhnkhOWjtHOJpUIZwPJDAFAlpDPAcOHHdlbnNAY3Np
ZS5vcmcACgkQOJpUIZwPJDCj9xAA2Y0NblMi896otrTDEtskSXqoUfQeX+fqeOfX
xBK9+1IrFJ+KiA5zC1Hs7wYMYG/AlvGBgpxpp+UnX1TPojIPydCLwmJuPrdaniJ7
O3OqiK5m0Dpp/tj4zdeJE3bDdFRg3QrCYIRljpHlKEXDAoBehWwIjwniw7jjcLyG
5V1hO11sGLclDhN14ezs3blsQDjtUEG4CA3YgIwgRTEFVzKfZ2GyHPUi1myE+ItM
5egZVPGCaiQPUf4HcB3rvX3xJNEaumQ1S1e/WZKnG5KEZfKqDkfqu1IRhnn8kIvo
xmRdcSi1p7iHlBquHwwntsTB3cxr7xEu6kRlGBU4yTFTVpDJsMZntRdDQHQ50jMJ
edRR4IqOVUETD7PQGIhK9qNq3UqiKDAvBJ99xhV2tvsJse+p2urbRCaCUwueRLKi
GLha3Y0U3Na7+Q4ODpLwelEIkR+NcSxLfHjovEs3EecUFqEFxiIkc+7bdZq8mGJP
UX31dDFHW6CjIEAVeHLLhBuU+01KPYXlwc4s1bEReu2/OBDE+KK0rOcrIpumxBp5
LjXW+s/sUVGZ5sbQ+3wr32/cEQf133O+AqN4S7vZ2p5AIrm6J2vzjRaxZvEFlfy6
NbxnW3Bkt0Pu70oe6KQh4FvSVXVL9XLB/5nKoCsOMxlKYmHinC+j+IHd7oDI1zv4
ScbIRN0=
=FCgU
-----END PGP SIGNATURE-----
Merge tag 'sunxi-fixes-for-4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Pull "Allwinner fixes for 4.15" from Chen-Yu Tsai:
First, one fix that adds proper regulator references for the EMAC
external PHYs on A64 boards. The EMAC bindings were developed for 4.13,
but reverted at the last minute. They were finalized and brought back
for 4.15. However in the time between, regulator support for the A64
boards was merged. When EMAC device tree changes were reintroduced,
this was not taken into account.
Second, a patch that adds OF based modalias uevent for RSB slave devices.
This has been missing since the introduction of RSB, and recently with
PMIC regulator support introduced for the A64, has been seen affecting
distributions, which have the all-important PMIC mfd drivers built as
modules, which then don't get loaded.
Other minor cleanups include final conversion of raw indices to CCU
binding macros for sun[4567]i HDMI, cleanup of dummy regulators on the
A64 SOPINE, a SD card detection polarity fix for the Orange Pi Zero
Plus2, and adding a missing compatible for the PMIC on the TBS A711
tablet.
* tag 'sunxi-fixes-for-4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
ARM: dts: sun8i: a711: Reinstate the PMIC compatible
arm64: dts: orange-pi-zero-plus2: fix sdcard detect
arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3
ARM: dts: sunxi: Convert to CCU index macros for HDMI controller
sunxi-rsb: Include OF based modalias in device uevent
arm64: allwinner: a64: add Ethernet PHY regulator for several boards
Vladimir Zapolskiy says:
The present change is a bug fix for AVB link iteratively up/down.
Steps to reproduce:
- start AVB TX stream (Using aplay via MSE),
- disconnect+reconnect the eth cable,
- after a reconnection the eth connection goes iteratively up/down
without user interaction,
- this may heal after some seconds or even stay for minutes.
As the documentation specifies, the "renesas,no-ether-link" option
should be used when a board does not provide a proper AVB_LINK signal.
There is no need for this option enabled on RCAR H3/M3 Salvator-X/XS
and ULCB starter kits since the AVB_LINK is correctly handled by HW.
Choosing to keep or remove the "renesas,no-ether-link" option will
have impact on the code flow in the following ways:
- keeping this option enabled may lead to unexpected behavior since
the RX & TX are enabled/disabled directly from adjust_link function
without any HW interrogation,
- removing this option, the RX & TX will only be enabled/disabled after
HW interrogation. The HW check is made through the LMON pin in PSR
register which specifies AVB_LINK signal value (0 - at low level;
1 - at high level).
In conclusion, the change is also a safety improvement because it
removes the "renesas,no-ether-link" option leading to a proper way
of detecting the link state based on HW interrogation and not on
software heuristic.
Note that DTS files for V3M Starter Kit, Draak and Eagle boards
contain the same property, the files are untouched due to unavailable
schematics to verify if the fix applies to these boards as well.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE4nzZofWswv9L/nKF189kaWo3T74FAlo83goACgkQ189kaWo3
T74cgw/+KEG74zpi6d3JYyrkXNCZQasPRVYWtKN8+6+ML31KfNNKGyIOqRHzc85D
5dMV0LkP4xJIiOpoDYwku6Mk7NUgvbywEUPAkScrAeUCTmNPI7uczI7SGepByoUI
zml5u/bLnglrmKcxo8vyJWEQwCEP1cicwVYPBfmWvHZpmYqNQLGwNgrQdYH/Bo9E
uD1x+ZcWeOYU9IVp8DQNAq0zJZ+n1T2dtu3nWoL5bUKcTHjF5IBIpLnfkK6TuTzB
kn167OowM8ZvUlkvaFNcSD4HbFLh3huySPtl1hsIbxQ+MaajntIzRDzhCaQoQYCQ
9FLM5bWBbb9AM/DqhQ7C8cVwfUKND/jcBXEEDObjyEX9VniBgGpDs/rK1V5PxG+d
ZZc1CCZxdx+qKDFYW95W4l9N2NQ1fSjHnlpMSiHI3z8rfeKLXMNo2Sx/506b6U3w
Fa+wylCPjDBmD9dzDS22UlTfXgifSLfbhedi6TO9SglwHzJsMERPmouBnFW3oZAm
GNUxMcqIxfeJtqOUwnl31sWqaH3UDV3PBYAKgZkv38qvp3KnK2cacwsMFh/p8KnZ
Wu3PzQlUM8Y181u5xVekmtBNjVCoyMWRTOMQiNnQpJ5ASHLbQ1Fi135XhFK5j9sZ
JgqVGHkO2r9ktfkIqnxp67ZHNnmwYZi6T5NIENP+Ba7FL0TrGLE=
=kmQd
-----END PGP SIGNATURE-----
Merge tag 'renesas-fixes-for-v4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Renesas ARM Based SoC Fixes for v4.15" from Simon Horman:
Vladimir Zapolskiy says:
The present change is a bug fix for AVB link iteratively up/down.
Steps to reproduce:
- start AVB TX stream (Using aplay via MSE),
- disconnect+reconnect the eth cable,
- after a reconnection the eth connection goes iteratively up/down
without user interaction,
- this may heal after some seconds or even stay for minutes.
As the documentation specifies, the "renesas,no-ether-link" option
should be used when a board does not provide a proper AVB_LINK signal.
There is no need for this option enabled on RCAR H3/M3 Salvator-X/XS
and ULCB starter kits since the AVB_LINK is correctly handled by HW.
Choosing to keep or remove the "renesas,no-ether-link" option will
have impact on the code flow in the following ways:
- keeping this option enabled may lead to unexpected behavior since
the RX & TX are enabled/disabled directly from adjust_link function
without any HW interrogation,
- removing this option, the RX & TX will only be enabled/disabled after
HW interrogation. The HW check is made through the LMON pin in PSR
register which specifies AVB_LINK signal value (0 - at low level;
1 - at high level).
In conclusion, the change is also a safety improvement because it
removes the "renesas,no-ether-link" option leading to a proper way
of detecting the link state based on HW interrogation and not on
software heuristic.
Note that DTS files for V3M Starter Kit, Draak and Eagle boards
contain the same property, the files are untouched due to unavailable
schematics to verify if the fix applies to these boards as well.
* tag 'renesas-fixes-for-v4.15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property
arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property
Pull x86 page table isolation fixes from Thomas Gleixner:
"A couple of urgent fixes for PTI:
- Fix a PTE mismatch between user and kernel visible mapping of the
cpu entry area (differs vs. the GLB bit) and causes a TLB mismatch
MCE on older AMD K8 machines
- Fix the misplaced CR3 switch in the SYSCALL compat entry code which
causes access to unmapped kernel memory resulting in double faults.
- Fix the section mismatch of the cpu_tss_rw percpu storage caused by
using a different mechanism for declaration and definition.
- Two fixes for dumpstack which help to decode entry stack issues
better
- Enable PTI by default in Kconfig. We should have done that earlier,
but it slipped through the cracks.
- Exclude AMD from the PTI enforcement. Not necessarily a fix, but if
AMD is so confident that they are not affected, then we should not
burden users with the overhead"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/process: Define cpu_tss_rw in same section as declaration
x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
x86/dumpstack: Print registers for first stack frame
x86/dumpstack: Fix partial register dumps
x86/pti: Make sure the user/kernel PTEs match
x86/cpu, x86/pti: Do not enable PTI on AMD processors
x86/pti: Enable PTI by default
The preparation for PTI which added CR3 switching to the entry code
misplaced the CR3 switch in entry_SYSCALL_compat().
With PTI enabled the entry code tries to access a per cpu variable after
switching to kernel GS. This fails because that variable is not mapped to
user space. This results in a double fault and in the worst case a kernel
crash.
Move the switch ahead of the access and clobber RSP which has been saved
already.
Fixes: 8a09317b89 ("x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching")
Reported-by: Lars Wendler <wendler.lars@web.de>
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>,
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Greg KH <gregkh@linuxfoundation.org>, ,
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Cc: Juergen Gross <jgross@suse.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031949200.1957@nanos
In the stack dump code, if the frame after the starting pt_regs is also
a regs frame, the registers don't get printed. Fix that.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Tested-by: Alexander Tsoy <alexander@tsoy.me>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@vger.kernel.org
Fixes: 3b3fa11bc7 ("x86/dumpstack: Print any pt_regs found on the stack")
Link: http://lkml.kernel.org/r/396f84491d2f0ef64eda4217a2165f5712f6a115.1514736742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The show_regs_safe() logic is wrong. When there's an iret stack frame,
it prints the entire pt_regs -- most of which is random stack data --
instead of just the five registers at the end.
show_regs_safe() is also poorly named: the on_stack() checks aren't for
safety. Rename the function to show_regs_if_on_stack() and add a
comment to explain why the checks are needed.
These issues were introduced with the "partial register dump" feature of
the following commit:
b02fcf9ba1 ("x86/unwinder: Handle stack overflows more gracefully")
That patch had gone through a few iterations of development, and the
above issues were artifacts from a previous iteration of the patch where
'regs' pointed directly to the iret frame rather than to the (partially
empty) pt_regs.
Tested-by: Alexander Tsoy <alexander@tsoy.me>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@vger.kernel.org
Fixes: b02fcf9ba1 ("x86/unwinder: Handle stack overflows more gracefully")
Link: http://lkml.kernel.org/r/5b05b8b344f59db2d3d50dbdeba92d60f2304c54.1514736742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Meelis reported that his K8 Athlon64 emits MCE warnings when PTI is
enabled:
[Hardware Error]: Error Addr: 0x0000ffff81e000e0
[Hardware Error]: MC1 Error: L1 TLB multimatch.
[Hardware Error]: cache level: L1, tx: INSN
The address is in the entry area, which is mapped into kernel _AND_ user
space. That's special because we switch CR3 while we are executing
there.
User mapping:
0xffffffff81e00000-0xffffffff82000000 2M ro PSE GLB x pmd
Kernel mapping:
0xffffffff81000000-0xffffffff82000000 16M ro PSE x pmd
So the K8 is complaining that the TLB entries differ. They differ in the
GLB bit.
Drop the GLB bit when installing the user shared mapping.
Fixes: 6dc72c3cbc ("x86/mm/pti: Share entry text PMD")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Meelis Roos <mroos@linux.ee>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031407180.1957@nanos
AMD processors are not subject to the types of attacks that the kernel
page table isolation feature protects against. The AMD microarchitecture
does not allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode
when that access would result in a page fault.
Disable page table isolation by default on AMD processors by not setting
the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI
is set.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20171227054354.20369.94587.stgit@tlendack-t1.amdoffice.net
Commit:
82c3768b8d ("efi/capsule-loader: Use a cached copy of the capsule header")
... refactored the capsule loading code that maps the capsule header,
to avoid having to map it several times.
However, as it turns out, the vmap() call we ended up removing did not
just map the header, but the entire capsule image, and dropping this
virtual mapping breaks capsules that are processed by the firmware
immediately (i.e., without a reboot).
Unfortunately, that change was part of a larger refactor that allowed
a quirk to be implemented for Quark, which has a non-standard memory
layout for capsules, and we have slightly painted ourselves into a
corner by allowing quirk code to mangle the capsule header and memory
layout.
So we need to fix this without breaking Quark. Fortunately, Quark does
not appear to care about the virtual mapping, and so we can simply
do a partial revert of commit:
2a457fb31d ("efi/capsule-loader: Use page addresses rather than struct page pointers")
... and create a vmap() mapping of the entire capsule (including header)
based on the reinstated struct page array, unless running on Quark, in
which case we pass the capsule header copy as before.
Reported-by: Ge Song <ge.song@hxt-semitech.com>
Tested-by: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Tested-by: Ge Song <ge.song@hxt-semitech.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 82c3768b8d ("efi/capsule-loader: Use a cached copy of the capsule header")
Link: http://lkml.kernel.org/r/20180102172110.17018-3-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
'add_efi_memmap' is an early param, but do_add_efi_memmap() has no
chance to run because the code path is before parse_early_param().
I believe it worked when the param was introduced but probably later
some other changes caused the wrong order and nobody noticed it.
Move efi_memblock_x86_reserve_range() after parse_early_param()
to fix it.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Ge Song <ge.song@hxt-semitech.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180102172110.17018-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ARC gcc prior to GNU 2018.03 release didn't have a target specific
__builtin_trap() implementation, generating default abort() call.
Implement the abort() call - emulating what newer gcc does for the same,
as suggested by Arnd.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Qemu for PARISC reported on a 32bit SMP parisc kernel strange failures
about "Not-handled unaligned insn 0x0e8011d6 and 0x0c2011c9."
Those opcodes evaluate to the ldcw() assembly instruction which requires
(on 32bit) an alignment of 16 bytes to ensure atomicity.
As it turns out, qemu is correct and in our assembly code in entry.S and
pacache.S we don't pay attention to the required alignment.
This patch fixes the problem by aligning the lock offset in assembly
code in the same manner as we do in our C-code.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v4.0+