While we've only seen inlining problems with atomic_sub_return(),
the other atomic operations could have the same problem. Convert all
remaining operations to use the same solution as atomic_sub_return().
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Since my system use python3 as default, arch/ia64/scripts/unwcheck.py no
longer run.
This patch convert it to the python3 syntax.
I have ran it with python2/python3 while printing values of
start/end/rlen_sum which could be impacted by this change and I see no difference.
Fixes: 94a4708352 ("scripts: change scripts to use system python instead of env")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Pull networking fixes from David Miller:
1) Use an appropriate TSQ pacing shift in mac80211, from Toke
Høiland-Jørgensen.
2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk
in ip6_route_me_harder, from Eric Dumazet.
3) Fix several shutdown races and similar other problems in l2tp, from
James Chapman.
4) Handle missing XDP flush properly in tuntap, for real this time.
From Jason Wang.
5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel
Borkmann.
6) Fix phy_resume() locking, from Andrew Lunn.
7) IFLA_MTU values are ignored on newlink for some tunnel types, fix
from Xin Long.
8) Revert F-RTO middle box workarounds, they only handle one dimension
of the problem. From Yuchung Cheng.
9) Fix socket refcounting in RDS, from Ka-Cheong Poon.
10) Don't allow ppp unit registration to an unregistered channel, from
Guillaume Nault.
11) Various hv_netvsc fixes from Stephen Hemminger.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits)
hv_netvsc: propagate rx filters to VF
hv_netvsc: filter multicast/broadcast
hv_netvsc: defer queue selection to VF
hv_netvsc: use napi_schedule_irqoff
hv_netvsc: fix race in napi poll when rescheduling
hv_netvsc: cancel subchannel setup before halting device
hv_netvsc: fix error unwind handling if vmbus_open fails
hv_netvsc: only wake transmit queue if link is up
hv_netvsc: avoid retry on send during shutdown
virtio-net: re enable XDP_REDIRECT for mergeable buffer
ppp: prevent unregistered channels from connecting to PPP units
tc-testing: skbmod: fix match value of ethertype
mlxsw: spectrum_switchdev: Check success of FDB add operation
net: make skb_gso_*_seglen functions private
net: xfrm: use skb_gso_validate_network_len() to check gso sizes
net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len
rds: Incorrect reference counting in TCP socket creation
net: ethtool: don't ignore return from driver get_fecparam method
vrf: check forwarding on the original netdevice when generating ICMP dest unreachable
...
Commit 7a407aa5e0 ("MIPS: Push ARCH_MIGHT_HAVE_PC_SERIO down to
platform level") moves the global MIPS ARCH_MIGHT_HAVE_PC_SERIO select
down to various platforms, but doesn't add it to Loongson64 platforms
which need it, so add the selects to these platforms too.
Fixes: 7a407aa5e0 ("MIPS: Push ARCH_MIGHT_HAVE_PC_SERIO down to platform level")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18704/
Signed-off-by: James Hogan <jhogan@kernel.org>
Commit a211a0820d ("MIPS: Push ARCH_MIGHT_HAVE_PC_PARPORT down to
platform level") moves the global MIPS ARCH_MIGHT_HAVE_PC_PARPORT select
down to various platforms, but doesn't add it to Loongson64 platforms
which need it, so add the selects to these platforms too.
Fixes: a211a0820d ("MIPS: Push ARCH_MIGHT_HAVE_PC_PARPORT down to platform level")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18703/
Signed-off-by: James Hogan <jhogan@kernel.org>
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes for x86:
- Add missing instruction suffixes to assembly code so it can be
compiled by newer GAS versions without warnings.
- Switch refcount WARN exceptions to UD2 as we did in general
- Make the reboot on Intel Edison platforms work
- A small documentation update so text and sample command match"
* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation, x86, resctrl: Make text and sample command match
x86/platform/intel-mid: Handle Intel Edison reboot correctly
x86/asm: Add instruction suffixes to bitops
x86/entry/64: Add instruction suffix
x86/refcounts: Switch to UD2 for exceptions
Pull x86/pti fixes from Thomas Gleixner:
"Three fixes related to melted spectrum:
- Sync the cpu_entry_area page table to initial_page_table on 32 bit.
Otherwise suspend/resume fails because resume uses
initial_page_table and triggers a triple fault when accessing the
cpu entry area.
- Zero the SPEC_CTL MRS on XEN before suspend to address a
shortcoming in the hypervisor.
- Fix another switch table detection issue in objtool"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu_entry_area: Sync cpu_entry_area to initial_page_table
objtool: Fix another switch table detection issue
x86/xen: Zero MSR_IA32_SPEC_CTRL before suspend
There is no event extension (bit 21) for SKX UPI, so
use 'event' instead of 'event_ext'.
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: cd34cd97b7 ("perf/x86/intel/uncore: Add Skylake server uncore support")
Link: http://lkml.kernel.org/r/1520004150-4855-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- suppress sparse warnings about unknown attributes
- fix typos and stale comments
- fix build error of arch/sh
- fix wrong use of ld-option vs cc-ldoption
- remove redundant GCC_PLUGINS_CFLAGS assignment
- fix another memory leak of Kconfig
- fix line number in error messages of Kconfig
- do not write confusing CONFIG_DEFCONFIG_LIST out to .config
- add xstrdup() to Kconfig to handle memory shortage errors
- show also a Debian package name if ncurses is missing
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJamiV5AAoJED2LAQed4NsGgbUQAInk9+SPBtulpK1HayDoYrC6
6LYQWjdO89jbKAaQFsDQ0ZC5HwPgafX18HWunJ67Cs6kunLYTMJLu8cHr5rRHrv1
B+ceAkMe5EE62Bmju35qX54BSgcPyWt9jFloFmdzOG7nR7D+C9DUx2WZnF3Kd0Gb
XPw8R4u2EEKGOMYh5PwYFP5mjhfWMijg6OZaAdcV4E/fSXiwSlDjwuVY1ymhsEAn
OalDNcJ5+Foe9ADy5yLEna0Jlj8j8UMGPRzNErvL2CxZP2CSfpKfuXJABig5x8rk
ZXULN6w7nIN/dpUkPgkTYgF6behUuSUPj+7xBDWbVjRQLMMy6bI65s5hylBzYAMy
tB3hKnpZtrrJNc00XQWLY18b03YzLv5wW0CK8jmRJPLWDE8znb8aMqxwtfJKhfa+
cW5v5KzQm4ercArSy983jj4FJD6ZuQf1lqAARD4vxX5kK9oy+7oJvfzmwrlh1U2c
dk/3VKF2AjUDDdzxq+zgJIWFR1MMTxFKqL95CsdaFEUMRXr80a8XBNYP07R4l3qB
y2zUsZZi7Ad363MkohBCEBdw1yPtHupRETr3xdQlUDh0ZQj94TXB6lX2SaY4v1NS
J3eqTl8t/RHboS5kbPMz8MduO4otH/ZxYDd0uMKA5c/v/9LkD9y+AIpndswk4jcN
/uKkw5sSRgO+9VfsTqed
=sr4l
-----END PGP SIGNATURE-----
Merge tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- suppress sparse warnings about unknown attributes
- fix typos and stale comments
- fix build error of arch/sh
- fix wrong use of ld-option vs cc-ldoption
- remove redundant GCC_PLUGINS_CFLAGS assignment
- fix another memory leak of Kconfig
- fix line number in error messages of Kconfig
- do not write confusing CONFIG_DEFCONFIG_LIST out to .config
- add xstrdup() to Kconfig to handle memory shortage errors
- show also a Debian package name if ncurses is missing
* tag 'kbuild-fixes-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
MAINTAINERS: take over Kconfig maintainership
kconfig: fix line number in recursive inclusion error message
Coccinelle: memdup: Fix typo in warning messages
kconfig: Update ncurses package names for menuconfig
kbuild/kallsyms: trivial typo fix
kbuild: test --build-id linker flag by ld-option instead of cc-ldoption
kbuild: drop superfluous GCC_PLUGINS_CFLAGS assignment
kconfig: Don't leak choice names during parsing
sh: fix build error for empty CONFIG_BUILTIN_DTB_SOURCE
kconfig: set SYMBOL_AUTO to the symbol marked with defconfig_list
kconfig: add xstrdup() helper
kbuild: disable sparse warnings about unknown attributes
Makefile: Fix lying comment re. silentoldconfig
Since commit 8b24e69fc4 ("KVM: PPC: Book3S HV: Close race with testing
for signals on guest entry"), if CONFIG_VIRT_CPU_ACCOUNTING_GEN is set, the
guest time is not accounted to guest time and user time, but instead to
system time.
This is because guest_enter()/guest_exit() are called while interrupts
are disabled and the tick counter cannot be updated between them.
To fix that, move guest_exit() after local_irq_enable(), and as
guest_enter() is called with IRQ disabled, call guest_enter_irqoff()
instead.
Fixes: 8b24e69fc4 ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation features
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJambvzAAoJEED/6hsPKofo9HwH/2il8xNSLIYf9pJtxZo/puyQ
ZSwByGdeLKBZ1GP1dhdZ8kMk3eBoci0a/sQJmhDiEG6GDf1Mrgri/xj3p60sWwXT
iReG+ZhBKg4QMj/IgOJQrh+53JT73QQP14wIhzc/DSi0Fo0ziqDA/lINxqMKc7oF
b5qratjsb4xF1db4d1g8Ii1VRk64UoBEVpEoP37OOyAu1rgXgDr+9C832KkP0rb+
pVYT8hLFiaYiwVN+WN52/NrIkqBlMvMp3ouRtMAajCQ9OznnraDJE6eTkPGDSDBM
RizSuQbev5R7pcmzCAP9/XKfTbeSUZQei2ZFXQXAOvIXMrQd/ITjbPvnObsceSE=
=wtQd
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"x86:
- fix NULL dereference when using userspace lapic
- optimize spectre v1 mitigations by allowing guests to use LFENCE
- make microcode revision configurable to prevent guests from
unnecessarily blacklisting spectre v2 mitigation feature"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: fix vcpu initialization with userspace lapic
KVM: X86: Allow userspace to define the microcode version
KVM: X86: Introduce kvm_get_msr_feature()
KVM: SVM: Add MSR-based feature support for serializing LFENCE
KVM: x86: Add a framework for supporting MSR-based features
Pull parisc fixes from Helge Deller:
- a patch to change the ordering of cache and TLB flushes to hopefully
fix the random segfaults we very rarely face (by Dave Anglin).
- a patch to hide the virtual kernel memory layout due to security
reasons.
- two small patches to make the kernel run more smoothly under qemu.
* 'parisc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Reduce irq overhead when run in qemu
parisc: Use cr16 interval timers unconditionally on qemu
parisc: Check if secondary CPUs want own PDC calls
parisc: Hide virtual kernel memory layout
parisc: Fix ordering of cache and TLB flushes
When running s390 images with 'compat' processes, the following
BUG is seen repeatedly.
BUG: non-zero pgtables_bytes on freeing mm: -16384
Bisect points to commit b4e98d9ac7 ("mm: account pud page tables").
Analysis shows that init_new_context() is called with
mm->context.asce_limit set to _REGION3_SIZE. In this situation,
pgtables_bytes remains set to 0 and is not increased. The message is
displayed when the affected process dies and mm_dec_nr_puds() is called.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Fixes: b4e98d9ac7 ("mm: account pud page tables")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When run under QEMU, calling mfctl(16) creates some overhead because the
qemu timer has to be scaled and moved into the register. This patch
reduces the number of calls to mfctl(16) by moving the calls out of the
loops.
Additionally, increase the minimal time interval to 8000 cycles instead
of 500 to compensate possible QEMU delays when delivering interrupts.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
When running on qemu we know that the (emulated) cr16 cpu-internal
clocks are syncronized. So let's use them unconditionally on qemu.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
The architecture specification says (for 64-bit systems): PDC is a per
processor resource, and operating system software must be prepared to
manage separate pointers to PDCE_PROC for each processor. The address
of PDCE_PROC for the monarch processor is stored in the Page Zero
location MEM_PDC. The address of PDCE_PROC for each non-monarch
processor is passed in gr26 when PDCE_RESET invokes OS_RENDEZ.
Currently we still use one PDC for all CPUs, but in case we face a
machine which is following the specification let's warn about it.
Signed-off-by: Helge Deller <deller@gmx.de>
The change to flush_kernel_vmap_range() wasn't sufficient to avoid the
SMP stalls. The problem is some drivers call these routines with
interrupts disabled. Interrupts need to be enabled for flush_tlb_all()
and flush_cache_all() to work. This version adds checks to ensure
interrupts are not disabled before calling routines that need IPI
interrupts. When interrupts are disabled, we now drop into slower code.
The attached change fixes the ordering of cache and TLB flushes in
several cases. When we flush the cache using the existing PTE/TLB
entries, we need to flush the TLB after doing the cache flush. We don't
need to do this when we flush the entire instruction and data caches as
these flushes don't use the existing TLB entries. The same is true for
tmpalias region flushes.
The flush_kernel_vmap_range() and invalidate_kernel_vmap_range()
routines have been updated.
Secondly, we added a new purge_kernel_dcache_range_asm() routine to
pacache.S and use it in invalidate_kernel_vmap_range(). Nominally,
purges are faster than flushes as the cache lines don't have to be
written back to memory.
Hopefully, this is sufficient to resolve the remaining problems due to
cache speculation. So far, testing indicates that this is the case. I
did work up a patch using tmpalias flushes, but there is a performance
hit because we need the physical address for each page, and we also need
to sequence access to the tmpalias flush code. This increases the
probability of stalls.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
The current code for initializing the VRMA (virtual real memory area)
for HPT guests requires the page size of the backing memory to be one
of 4kB, 64kB or 16MB. With a radix host we have the possibility that
the backing memory page size can be 2MB or 1GB. In these cases, if the
guest switches to HPT mode, KVM will not initialize the VRMA and the
guest will fail to run.
In fact it is not necessary that the VRMA page size is the same as the
backing memory page size; any VRMA page size less than or equal to the
backing memory page size is acceptable. Therefore we now choose the
largest page size out of the set {4k, 64k, 16M} which is not larger
than the backing memory page size.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This fixes several bugs in the radix page fault handler relating to
the way large pages in the memory backing the guest were handled.
First, the check for large pages only checked for explicit huge pages
and missed transparent huge pages. Then the check that the addresses
(host virtual vs. guest physical) had appropriate alignment was
wrong, meaning that the code never put a large page in the partition
scoped radix tree; it was always demoted to a small page.
Fixing this exposed bugs in kvmppc_create_pte(). We were never
invalidating a 2MB PTE, which meant that if a page was initially
faulted in without write permission and the guest then attempted
to store to it, we would never update the PTE to have write permission.
If we find a valid 2MB PTE in the PMD, we need to clear it and
do a TLB invalidation before installing either the new 2MB PTE or
a pointer to a page table page.
This also corrects an assumption that get_user_pages_fast would set
the _PAGE_DIRTY bit if we are writing, which is not true. Instead we
mark the page dirty explicitly with set_page_dirty_lock(). This
also means we don't need the dirty bit set on the host PTE when
providing write access on a read fault.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Daniel Borkmann says:
====================
pull-request: bpf 2018-02-28
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Add schedule points and reduce the number of loop iterations
the test_bpf kernel module is performing in order to not hog
the CPU for too long, from Eric.
2) Fix an out of bounds access in tail calls in the ppc64 BPF
JIT compiler, from Daniel.
3) Fix a crash on arm64 on unaligned BPF xadd operations that
could be triggered via interpreter and JIT, from Daniel.
Please not that once you merge net into net-next at some point, there
is a minor merge conflict in test_verifier.c since test cases had
been added at the end in both trees. Resolution is trivial: keep all
the test cases from both trees.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_USE_BUILTIN_DTB is enabled, but CONFIG_BUILTIN_DTB_SOURCE
is empty (for example, allmodconfig), it fails to build, like this:
make[2]: *** No rule to make target 'arch/sh/boot/dts/.dtb.o',
needed by 'arch/sh/boot/dts/built-in.o'. Stop.
Surround obj-y with ifneq ... endif.
I replaced $(CONFIG_USE_BUILTIN_DTB) with 'y' since this is always
the case from the following code from arch/sh/Makefile:
core-$(CONFIG_USE_BUILTIN_DTB) += arch/sh/boot/dts/
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
- MCIP aka ARconnect fixes for SMP builds [Euginey]
- Preventive fix for SLC (L2 cache) flushing [Euginey]
- Kconfig default fix [Ulf Magnusson]
- trailing semicolon fixes [Luis de Bethencourt]
- other assorted minor fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJamHh7AAoJEGnX8d3iisJeM+MQAL+cIkdRd9NJoPPL66IOgwmi
68jUkVQ8PSgR2P/sWvwIRTionOG9siu58Q1ygXpPR9SNnSJlyIYIW4Onuoajs+Ii
TmTwWm8jnKtuPtKMBep8XpYQQ+FRL6sDNn0QLjCnnvqAeE7rTFODSlpBR+jL6ur4
t2h1wHQem6oas1YRwnKrxxMnfV3KP3TkCS2a/lvmeAAt8xi+ll+9OnzVcSCj7pCJ
ZToenfH3UAbhAd5T7gWkJv2v00zbHNzKtpdluSW6WBybP1Ib2IxOxEiUlAvxQgpl
NctvS8Q1uveOPQZHO+fNXsJvf+imP2RdWh5RaOcmm8a4tp8jsR51BScX55usjr/z
ybg4bzHBV3x9YbZkkW9DKyF9eeZ7hWST8nNoebcHNVjboUeD+wgtV8e3Fvc6bIoo
/Xkw28y/mZwCs7mBfu1fNOIIiiZDSgz7YeeqzOPBzEYVcHE4VGINwX+9cLXPOsgz
rmI/Md3buSjrfXPnXTuk1R4fl0WKM5FT/2BJ+IbNU2w6VO1/iKqiBQREH5lBfYAI
BUikxKNwrJ9+zG8EXFUAEi6gn4dxosxMKJ04CTviwGFc1y6yNsmMgizevFpPUyx7
9o3z19hHUZLx94QsGHLI7QJe3Kawamu0tmdJvQBgB/eZ9lj6OdkzY48I2lgGezSw
00e+JPSy6EDi8YrlMki5
=+Bmc
-----END PGP SIGNATURE-----
Merge tag 'arc-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- MCIP aka ARconnect fixes for SMP builds [Euginey]
- preventive fix for SLC (L2 cache) flushing [Euginey]
- Kconfig default fix [Ulf Magnusson]
- trailing semicolon fixes [Luis de Bethencourt]
- other assorted minor fixes
* tag 'arc-4.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: setup cpu possible mask according to possible-cpus dts property
ARC: mcip: update MCIP debug mask when the new cpu came online
ARC: mcip: halt GFRC counter when ARC cores halt
ARCv2: boot log: fix HS48 release number
arc: dts: use 'atmel' as manufacturer for at24 in axs10x_mb
ARC: Fix malformed ARC_EMUL_UNALIGNED default
ARC: boot log: Fix trailing semicolon
ARC: dw2 unwind: Fix trailing semicolon
ARC: Enable fatal signals on boot for dev platforms
ARCv2: Don't pretend we may set L-bit in STATUS32 with kflag instruction
ARCv2: cache: fix slc_entire_op: flush only instead of flush-n-inv
Moving the code around broke this rare configuration.
Use this opportunity to finally call lapic reset from vcpu reset.
Reported-by: syzbot+fb7a33a4b6c35007a72b@syzkaller.appspotmail.com
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes: 0b2e9904c1 ("KVM: x86: move LAPIC initialization after VMCS creation")
Cc: stable@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Linux (among the others) has checks to make sure that certain features
aren't enabled on a certain family/model/stepping if the microcode version
isn't greater than or equal to a known good version.
By exposing the real microcode version, we're preventing buggy guests that
don't check that they are running virtualized (i.e., they should trust the
hypervisor) from disabling features that are effectively not buggy.
Suggested-by: Filippo Sironi <sironi@amazon.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Introduce kvm_get_msr_feature() to handle the msrs which are supported
by different vendors and sharing the same emulation logic.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
In order to determine if LFENCE is a serializing instruction on AMD
processors, MSR 0xc0011029 (MSR_F10H_DECFG) must be read and the state
of bit 1 checked. This patch will add support to allow a guest to
properly make this determination.
Add the MSR feature callback operation to svm.c and add MSR 0xc0011029
to the list of MSR-based features. If LFENCE is serializing, then the
feature is supported, allowing the hypervisor to set the value of the
MSR that guest will see. Support is also added to write (hypervisor only)
and read the MSR value for the guest. A write by the guest will result in
a #GP. A read by the guest will return the value as set by the host. In
this way, the support to expose the feature to the guest is controlled by
the hypervisor.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Provide a new KVM capability that allows bits within MSRs to be recognized
as features. Two new ioctls are added to the /dev/kvm ioctl routine to
retrieve the list of these MSRs and then retrieve their values. A kvm_x86_ops
callback is used to determine support for the listed MSR-based features.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Tweaked documentation. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
We already count io interrupts, but we forgot to print them.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: d8346b7d9b ("KVM: s390: Support for I/O interrupts.")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The separation of the cpu_entry_area from the fixmap missed the fact that
on 32bit non-PAE kernels the cpu_entry_area mapping might not be covered in
initial_page_table by the previous synchronizations.
This results in suspend/resume failures because 32bit utilizes initial page
table for resume. The absence of the cpu_entry_area mapping results in a
triple fault, aka. insta reboot.
With PAE enabled this works by chance because the PGD entry which covers
the fixmap and other parts incindentally provides the cpu_entry_area
mapping as well.
Synchronize the initial page table after setting up the cpu entry
area. Instead of adding yet another copy of the same code, move it to a
function and invoke it from the various places.
It needs to be investigated if the existing calls in setup_arch() and
setup_per_cpu_areas() can be replaced by the later invocation from
setup_cpu_entry_areas(), but that's beyond the scope of this fix.
Fixes: 92a0f81d89 ("x86/cpu_entry_area: Move it out of the fixmap")
Reported-by: Woody Suwalski <terraluna977@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Woody Suwalski <terraluna977@gmail.com>
Cc: William Grant <william.grant@canonical.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1802282137290.1392@nanos.tec.linutronix.de
This is the first set of bugfixes for ARM SoCs, fixing a couple
of stability problems, mostly on TI OMAP and Rockchips platforms:
- OMAP2 hwmod clocks must be enabled in the correct order
- OMAP3 Wakeup from resume through PRM IRQ was unreliable
- One regression on OMAP5 caused by a kexec fix
- Rockchip ethernet needs some settings for stable operation on Rock64
- Rockchip based Chrombook Plus needs another clock setting for
stable display suspend/resume
- Rockchip based phyCORE-RK3288 was able to run at an invalid
CPU clock frequency
- Rockchip MMC link was sometimes unreliable
- Multiple fixes to avoid crashes in the Broadcom STB DPFE driver
Other minor changes include:
- Devicetree fixes for incorrect hardware description (rockchip,
omap, Gemini, amlogic)
- Some MAINTAINER file updates to correct email and git addresses
- Some fixes addressing 'make W=1' dtc warnings (broadcom, amlogic,
cavium, qualcomm, hisilicon, zx)
- Fixes for LTO-compilation (orion, davinci, clps711x)
- One fix for an incorrect Kconfig errata selection
- A memory leak in the OMAP timer driver
- A kernel data leak in OMAP1 debugfs files
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJalzTaAAoJEGCrR//JCVInBeQP/3wBXCnzfCkmSSliZHoNzgYB
XGkC+JIqw9AnHvn/ckvHMwUv8kQlbi7ImPXz1P8yafy3h2vHIdN2My0XYtRyQkNT
NoAxIXT+NiQx9sAoLGY8gWTN4Do63q1vw5SLmOEDD2GYzo1jao4s7J0mhFZopBLw
WkgHf8t4jRmoBDA4GEYcdJZS5shMydFDyb9CiiqNHVA4S4IL87XcPoJDpJmyVDZ4
vZVeccyhw0Xh0NJLzRIhVDGRN2pj1ayFFVodfRNTseRGf0QRexntiIyIHa2wOi1l
93IjJ3XgHuYEj0NNNpZiHV5OZxxRbQlTD/ji5L8j71lklVjIedJsJdWFUKiK53oh
ufQXTRZaVMmh4xcvihABSchg8vEXMqx4cZ/hj/+LIepDJM6GC39uGipg6enORVym
BuZpol8b1owABN461Bt2RfAVyXqJ7TRkdVy+RaP7RCsddLEcdKdI6HYi3aeDVmHQ
krvTrLQhRsDL4IHvi6rQDqyJMf5GDP4y7aInf7YzvJlbV2uU+M0ndiSHpGhw6vbG
brhc/n56U/waMPG8tOv9AB1+afARQOc4Fo9xg96PADA69SXn7Eq2dgf1D/ern8UQ
6KgNZ1hmmEHzkxsAXjEcStlmhpwk4lh4T0nSDbamsMRvZRNQaqmskMbmYYepIXKC
71k/Uwf4CQhMxe2aXIOo
=fcv0
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"This is the first set of bugfixes for ARM SoCs, fixing a couple of
stability problems, mostly on TI OMAP and Rockchips platforms:
- OMAP2 hwmod clocks must be enabled in the correct order
- OMAP3 Wakeup from resume through PRM IRQ was unreliable
- one regression on OMAP5 caused by a kexec fix
- Rockchip ethernet needs some settings for stable operation on
Rock64
- Rockchip based Chrombook Plus needs another clock setting for
stable display suspend/resume
- Rockchip based phyCORE-RK3288 was able to run at an invalid CPU
clock frequency
- Rockchip MMC link was sometimes unreliable
- multiple fixes to avoid crashes in the Broadcom STB DPFE driver
Other minor changes include:
- Devicetree fixes for incorrect hardware description (rockchip,
omap, Gemini, amlogic)
- some MAINTAINER file updates to correct email and git addresses
- some fixes addressing 'make W=1' dtc warnings (broadcom, amlogic,
cavium, qualcomm, hisilicon, zx)
- fixes for LTO-compilation (orion, davinci, clps711x)
- one fix for an incorrect Kconfig errata selection
- a memory leak in the OMAP timer driver
- a kernel data leak in OMAP1 debugfs files"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits)
MAINTAINERS: update entries for ARM/STM32
ARM: dts: bcm283x: Move arm-pmu out of soc node
ARM: dts: bcm283x: Fix unit address of local_intc
ARM: dts: NSP: Fix amount of RAM on BCM958625HR
ARM: dts: Set D-Link DNS-313 SATA to muxmode 0
ARM: omap2: set CONFIG_LIRC=y in defconfig
ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS
memory: brcmstb: dpfe: support new way of passing data from the DCPU
memory: brcmstb: dpfe: fix type declaration of variable "ret"
memory: brcmstb: dpfe: properly mask vendor error bits
ARM: BCM: dts: Remove leading 0x and 0s from bindings notation
ARM: orion: fix orion_ge00_switch_board_info initialization
ARM: davinci: mark spi_board_info arrays as const
ARM: clps711x: mark clps711x_compat as const
arm: zx: dts: Remove leading 0x and 0s from bindings notation
arm64: dts: Remove leading 0x and 0s from bindings notation
arm64: dts: cavium: fix PCI bus dtc warnings
MAINTAINERS: ARM: at91: update my email address
soc: imx: gpc: de-register power domains only if initialized
ARM: dts: rockchip: Fix DWMMC clocks
...
This week we have a single fix: replacing smp_mb() with __smp_mb(). We
were the only architecture with smp_mb() and it appears to just be
clearly wrong, so I think this is a pretty safe patch for an RC.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlqUQEETHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQciND/wJOHzf/O7aHWlt7UYvzKsqgIdxNKiH
IWbv4jTmJGJju10Ht5jImuIxSYJ5gu24Vj5f+/XkFRZfTGrOGgByXh37RuFP51kH
3yFqzDoBfmBHu8NTeyAb6hHtozvcVTBYynSqKBJzsCz3mrBgqobvzoii1R2TX4lc
+cPNffv+YRPZSHdHOpaytYIYdVxbcFmsGocftjzZ9AbR87mbT1zxDJV+YHgH86vE
2qqaWYTJChCCQo+We57+0HheMtzjvuC9wXRg7yqDOhe5mvpxNQf/RLhr4cxJtcFN
i96ERaRg0B5PAU+Pcql0wVL9oigd7bPv4Ncc+rjPG9efkXDKeN/ueD1az9yWRqE1
SUjJROel5tF6Yxytn2XX+MEIPEy0AO1WlD+9o23aoaKXHgSu8kMU3Zgryvs+S6W5
VkZeeUk/VV4YIQy8HwMujgjKzLXOoBjAHOvSyyf75ERkYbQxpEjkWRZWtjsiLGbP
lgHX59p3C2J3w/xLA+rUE3gdPm4qY45l232RJFlFS6e+wG32/Fw+Yc+PxFHhPARK
J+mhAVswhfrC4x9KbL83y8Zf9RvJ4y4qnpJAOyze+OgARq30SLzr6ul+5nHQAkb+
4lcPS0ROsxpa83BnGC8bfz9TGoomdYJrT8piV+lx8wTPrZQg4gfAoStSyu89V0Vz
LBcwO6MObWcdhw==
=ke4H
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-4.16-rc4_smp_mb' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V fix from Palmer Dabbelt:
"This week we have a single fix: replacing smp_mb() with __smp_mb().
We were the only architecture with smp_mb() and it appears to just be
clearly wrong, so I think this is a pretty safe patch for an RC"
* tag 'riscv-for-linus-4.16-rc4_smp_mb' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
riscv/barrier: Define __smp_{mb,rmb,wmb}
4.16, please pull the following:
- Mathieu fixes leading 0x and 0's from bindings and Device Tree source
files, he has done this treewide and most of his changes are already in
4.16
- Stefan provides two changes to the BCM283x DTS files in order to fix
DTC warnings
- Florian fixes the amount of RAM on the BCM958625HR reference board to
properly limit to what is initialized by the bootloader
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJalfuUAAoJEIfQlpxEBwcEaMoQAIR2oLcNZcrUzsgOuBSQ/m1N
+EpXw5TIeDzZRIdNTqofcldGDn0FMnAsPEBnMMYCti+2iPntPGK64IFAD0u7E8JW
+CU6sfk3SYijPP/nz5/KOSwRZ2KrDombYx73yrXqY2hs4kYupT078NfiI70jZYSB
6v8cua4Pg/Uw9c2ZSC/lkgrW3G1ZImAxA6tgAOZvRaq7qgCBlBvWEweLIgMcBAGY
Eq5p6KsmC01/1QGNV0sL9v90Mg8uAe3IJK0hGgk57BeYERcyVZ/V1C6veQmiiyjj
XYcILm2ww3KOfbZYslwKBfr9V76lFcsTPQD16Z8IOWTL9X/B0DRXDXiRnUX5P6w+
ZIzVBjLH8UxSU5bgD2tXVBWfEKs/kgvQPAv+8FlpCU0bY7fB2y8E6Pr44FVH/dTV
7Vkm60Jz9AFGVjaqhx+t8hyjQ7g2NQGwonWYGUWYXD7Wi+OK7cazC49lnebZJyPN
rSJjOWGxblHUAh2CjGUQNyk4CplbUybuo8441xacGK2BXJS0gl+xvGdxY+uaUcbc
zMb1Ox4I0gpe8xpWcc7vyKt+cgWCaXnyLHpFdcaC3bBT63O04yVU5mirlHYQu/NR
R/yY7lPStA+DzL6uihRi4K05+BoO12J7I3Mm0Wq//KgE1ij7LeHL7Nzy+AzQI2Rs
Dy1H//EWpDvfZTbONIwz
=9RVY
-----END PGP SIGNATURE-----
Merge tag 'arm-soc/for-4.16/devicetree-fixes' of https://github.com/Broadcom/stblinux into fixes
Pull "Broadcom devicetree fixes for 4.16" from Florian Fainelli:
This pull request contains Broadcom ARM-based SoCs Device Tree fixes for
4.16, please pull the following:
- Mathieu fixes leading 0x and 0's from bindings and Device Tree source
files, he has done this treewide and most of his changes are already in
4.16
- Stefan provides two changes to the BCM283x DTS files in order to fix
DTC warnings
- Florian fixes the amount of RAM on the BCM958625HR reference board to
properly limit to what is initialized by the bootloader
* tag 'arm-soc/for-4.16/devicetree-fixes' of https://github.com/Broadcom/stblinux:
ARM: dts: bcm283x: Move arm-pmu out of soc node
ARM: dts: bcm283x: Fix unit address of local_intc
ARM: dts: NSP: Fix amount of RAM on BCM958625HR
ARM: BCM: dts: Remove leading 0x and 0s from bindings notation
- Fix i.MX GPC driver to remove power domains only when they are
initialized in imx_gpc_probe().
- Fix the broken Engicam i.CoreM6 DualLite/Solo RQS board DT to include
imx6dl.dtsi instead of imx6q.dtsi.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJalfO8AAoJEFBXWFqHsHzOPk4IALx0Gl07tgaTy2VgZJT2w0ud
ZoyeicSNxfM42auRHjC6QhYpQztw8+k4PCuGmUOutHc2Hbw21EXDxLiSwYgGarBR
3OuYShxBg6IExpUk7TM0kDFU0Z4GcpTjAovloM7yjLxCj88dzS+NIiUwPSl02kTp
iAGxVtTQOYLotJFZLQ+DGH3SuaEvOEzeQZz/v4v2/z1IsJRI5IQ2ILtSFEOYhMwh
vgagvvHopS2jIfLXd47PJJvUQuRjXH00FMvrEmBT2b31HJDU2uKn+y4BtMCJHzJV
cTaCwKcrJcLCH2tv5LVUg86VcbL7qbMGjqKQk4aSuEtMtJkpkD4palNG+y69N9g=
=NKMF
-----END PGP SIGNATURE-----
Merge tag 'imx-fixes-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
Pull "i.MX fixes for 4.16" from Shawn Guo:
- Fix i.MX GPC driver to remove power domains only when they are
initialized in imx_gpc_probe().
- Fix the broken Engicam i.CoreM6 DualLite/Solo RQS board DT to include
imx6dl.dtsi instead of imx6q.dtsi.
* tag 'imx-fixes-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS
soc: imx: gpc: de-register power domains only if initialized
Today the tty0 and hvc0 consoles are added as a preferred consoles for
pv domUs only. As this requires a boot parameter for getting dom0
messages per default, add them for dom0, too.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
As we have option in u-boot to set CPU mask for running linux,
we want to pass information to kernel about CPU cores should
be brought up. So we patch kernel dtb in u-boot to set
possible-cpus property.
This also allows us to have correctly setuped MCIP debug mask.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
As of today we use hardcoded MCIP debug mask, so if we launch
kernel via debugger and kick fever cores than HW has all cpus
hang at the momemt of setup MCIP debug mask.
So update MCIP debug mask when the new cpu came online, instead of
use hardcoded MCIP debug mask.
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In SMP systems, GFRC is used for clocksource. However by default the
counter keeps running even when core is halted (say when debugging via a
JTAG debugger). This confuses Linux timekeeping and triggers flase RCU stall
splat such as below:
| [ARCLinux]# while true; do ./shm_open_23-1.run-test ; done
| Running with 1000 processes for 1000 objects
| hrtimer: interrupt took 485060 ns
|
| create_cnt: 1000
| Running with 1000 processes for 1000 objects
| [ARCLinux]# INFO: rcu_preempt self-detected stall on CPU
| 2-...: (1 GPs behind) idle=a01/1/0 softirq=135770/135773 fqs=0
| INFO: rcu_preempt detected stalls on CPUs/tasks:
| 0-...: (1 GPs behind) idle=71e/0/0 softirq=135264/135264 fqs=0
| 2-...: (1 GPs behind) idle=a01/1/0 softirq=135770/135773 fqs=0
| 3-...: (1 GPs behind) idle=4e0/0/0 softirq=134304/134304 fqs=0
| (detected by 1, t=13648 jiffies, g=31493, c=31492, q=1)
Starting from ARC HS v3.0 it's possible to tie GFRC to state of up-to 4
ARC cores with help of GFRC's CORE register where we set a mask for
cores which state we need to rely on.
We update cpu mask every time new cpu came online instead of using
hardcoded one or using mask generated from "possible_cpus" as we
want it set correctly even if we run kernel on HW which has fewer cores
than expected (or we launch kernel via debugger and kick fever cores
than HW has)
Note that GFRC halts when all cores have halted and thus relies on
programming of Inter-Core-dEbug register to halt all cores when one
halts.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote changelog]
When the Intel Edison module is powered with 3.3V, the reboot command makes
the module stuck. If the module is powered at a greater voltage, like 4.4V
(as the Edison Mini Breakout board does), reboot works OK.
The official Intel Edison BSP sends the IPCMSG_COLD_RESET message to the
SCU by default. The IPCMSG_COLD_BOOT which is used by the upstream kernel
is only sent when explicitely selected on the kernel command line.
Use IPCMSG_COLD_RESET unconditionally which makes reboot work independent
of the power supply voltage.
[ tglx: Massaged changelog ]
Fixes: bda7b072de ("x86/platform/intel-mid: Implement power off sequence")
Signed-off-by: Sebastian Panceac <sebastian@resin.io>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1519810849-15131-1-git-send-email-sebastian@resin.io
Omitting suffixes from instructions in AT&T mode is bad practice when
operand size cannot be determined by the assembler from register
operands, and is likely going to be warned about by upstream gas in the
future (mine does already). Add the missing suffixes here. Note that for
64-bit this means some operations change from being 32-bit to 64-bit.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/5A93F98702000078001ABACC@prv-mh.provo.novell.com
Omitting suffixes from instructions in AT&T mode is bad practice when
operand size cannot be determined by the assembler from register
operands, and is likely going to be warned about by upstream gas in the
future (mine does already). Add the single missing suffix here.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/5A93F96902000078001ABAC8@prv-mh.provo.novell.com
As done in commit 3b3a371cc9 ("x86/debug: Use UD2 for WARN()"), this
switches to UD2 from UD0 to keep disassembly readable.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20180225165056.GA11719@beast
Once in a while I see build errors similar to the following
when building images from a clean tree.
Building powerpc:virtex-ml507:44x/virtex5_defconfig ... failed
------------
Error log:
arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
libfdt.h: No such file or directory
Building powerpc🎍smpdev:44x/bamboo_defconfig ... failed
------------
Error log:
arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
libfdt.h: No such file or directory
arch/powerpc/boot/treeboot-currituck.c:35:20: fatal error:
libfdt.h: No such file or directory
Rebuilds will succeed.
Turns out that several source files in arch/powerpc/boot/ include
libfdt.h, but Makefile dependencies are incomplete. Let's fix that.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The ARM PMU doesn't have a reg address, so fix the following DTC warning
(requires W=1):
Node /soc/arm-pmu missing or empty reg/ranges property
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
This patch fixes the following DTC warning (requires W=1):
Node /soc/local_intc simple-bus unit address format error, expected "40000000"
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Jon attempted to fix the amount of RAM on the BCM958625HR in commit
c53beb47f6 ("ARM: dts: NSP: Correct RAM amount for BCM958625HR board")
but it seems like we tripped over some poorly documented schematics.
The top-level page of the schematics says the board has 2GB, but when
you end-up scrolling to page 6, you see two chips of 4GBit (512MB) but
what the bootloader really initializes only 512MB, any attempt to use
more than that results in data aborts. Fix this again back to 512MB.
Fixes: c53beb47f6 ("ARM: dts: NSP: Correct RAM amount for BCM958625HR board")
Acked-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
While working on 16338a9b3a ("bpf, arm64: fix out of bounds access in
tail call") I noticed that ppc64 JIT is partially affected as well. While
the bound checking is correctly performed as unsigned comparison, the
register with the index value however, is never truncated into 32 bit
space, so e.g. a index value of 0x100000000ULL with a map of 1 element
would pass with PPC_CMPLW() whereas we later on continue with the full
64 bit register value. Therefore, as we do in interpreter and other JITs
truncate the value to 32 bit initially in order to fix access.
Fixes: ce0761419f ("powerpc/bpf: Implement support for tail calls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This stops the driver from trying to probe the ATA slave
interface. The vendor code enables the slave interface
but the driver in the vendor tree does not make use of
it.
Setting it to muxmode 0 disables the slave interface:
the hardware only has the master interface connected
to the one harddrive slot anyways.
Without this change booting takes excessive time, so it
is very annoying to end users.
Fixes: dd5c0561db ("ARM: dts: Add basic devicetree for D-Link DNS-313")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The CONFIG_LIRC symbol has changed from 'tristate' to 'bool, so we now
get a warning for omap2plus_defconfig:
arch/arm/configs/omap2plus_defconfig:322:warning: symbol value 'm' invalid for LIRC
This changes the file to mark the symbol as built-in to get rid of the
warning.
Fixes: a60d64b15c ("media: lirc: lirc interface should not be a raw decoder")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
It is not valid for orion5x to use mac_pton().
First of all, the orion5x buffer is not NULL terminated. mac_pton()
has no business operating on non-NULL terminated buffers because
only the caller can know that this is valid and in what manner it
is ok to parse this NULL'less buffer.
Second of all, orion5x operates on an __iomem pointer, which cannot
be dereferenced using normal C pointer operations. Accesses to
such areas much be performed with the proper iomem accessors.
Fixes: 4904dbda41 ("ARM: orion5x: use mac_pton() helper")
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull x86 fixes from Thomas Gleixner:
"Yet another pile of melted spectrum related changes:
- sanitize the array_index_nospec protection mechanism: Remove the
overengineered array_index_nospec_mask_check() magic and allow
const-qualified types as index to avoid temporary storage in a
non-const local variable.
- make the microcode loader more robust by properly propagating error
codes. Provide information about new feature bits after micro code
was updated so administrators can act upon.
- optimizations of the entry ASM code which reduce code footprint and
make the code simpler and faster.
- fix the {pmd,pud}_{set,clear}_flags() implementations to work
properly on paravirt kernels by removing the address translation
operations.
- revert the harmful vmexit_fill_RSB() optimization
- use IBRS around firmware calls
- teach objtool about retpolines and add annotations for indirect
jumps and calls.
- explicitly disable jumplabel patching in __init code and handle
patching failures properly instead of silently ignoring them.
- remove indirect paravirt calls for writing the speculation control
MSR as these calls are obviously proving the same attack vector
which is tried to be mitigated.
- a few small fixes which address build issues with recent compiler
and assembler versions"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely()
KVM/x86: Remove indirect MSR op calls from SPEC_CTRL
objtool, retpolines: Integrate objtool with retpoline support more closely
x86/entry/64: Simplify ENCODE_FRAME_POINTER
extable: Make init_kernel_text() global
jump_label: Warn on failed jump_label patching attempt
jump_label: Explicitly disable jump labels in __init code
x86/entry/64: Open-code switch_to_thread_stack()
x86/entry/64: Move ASM_CLAC to interrupt_entry()
x86/entry/64: Remove 'interrupt' macro
x86/entry/64: Move the switch_to_thread_stack() call to interrupt_entry()
x86/entry/64: Move ENTER_IRQ_STACK from interrupt macro to interrupt_entry
x86/entry/64: Move PUSH_AND_CLEAR_REGS from interrupt macro to helper function
x86/speculation: Move firmware_restrict_branch_speculation_*() from C to CPP
objtool: Add module specific retpoline rules
objtool: Add retpoline validation
objtool: Use existing global variables for options
x86/mm/sme, objtool: Annotate indirect call in sme_encrypt_execute()
x86/boot, objtool: Annotate indirect jump in secondary_startup_64()
x86/paravirt, objtool: Annotate indirect calls
...
- optimization for the exitless interrupt support that was merged in 4.16-rc1
- improve the branch prediction blocking for nested KVM
- replace some jump tables with switch statements to improve expoline performance
- fixes for multiple epoch facility
ARM:
- fix the interaction of userspace irqchip VMs with in-kernel irqchip VMs
- make sure we can build 32-bit KVM/ARM with gcc-8.
x86:
- fixes for AMD SEV
- fixes for Intel nested VMX, emulated UMIP and a dump_stack() on VM startup
- fixes for async page fault migration
- small optimization to PV TLB flush (new in 4.16-rc1)
- syzkaller fixes
Generic:
- compiler warning fixes
- syzkaller fixes
- more improvements to the kvm_stat tool
Two more small Spectre fixes are going to reach you via Ingo.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEbBAABAgAGBQJakL/fAAoJEL/70l94x66Dzp4H9j6qMzgOTAQ0bYmupQp81tad
V8lNabVSNi0UBYwk2D44oNigtNjQckE18KGnjuJ4tZW+GZ+D7zrrHrKXWtATXgxP
SIfHj+raSd/lgJoy6HLu/N0oT6wS+PdZMYFgSu600Vi618lGKGX1SIAwBhjoxdMX
7QKKAuPcDZ1qgGddhWaLnof28nQQEWcCAVfFeVojmM0TyhvSbgSysh/Gq10ydybh
NVUfgP3fzLtT9gVngX/ZtbogNkltPYmucpI+wT3nWfsgBic783klfWrfpnC/GM85
OeXLVhHwVLG6tXUGhb4ULO+F9HwRGX31+er6iIxmwH9PvqnQMRcQ0Xxf2gbNXg==
=YmH6
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"s390:
- optimization for the exitless interrupt support that was merged in 4.16-rc1
- improve the branch prediction blocking for nested KVM
- replace some jump tables with switch statements to improve expoline performance
- fixes for multiple epoch facility
ARM:
- fix the interaction of userspace irqchip VMs with in-kernel irqchip VMs
- make sure we can build 32-bit KVM/ARM with gcc-8.
x86:
- fixes for AMD SEV
- fixes for Intel nested VMX, emulated UMIP and a dump_stack() on VM startup
- fixes for async page fault migration
- small optimization to PV TLB flush (new in 4.16-rc1)
- syzkaller fixes
Generic:
- compiler warning fixes
- syzkaller fixes
- more improvements to the kvm_stat tool
Two more small Spectre fixes are going to reach you via Ingo"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (40 commits)
KVM: SVM: Fix SEV LAUNCH_SECRET command
KVM: SVM: install RSM intercept
KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE command
include: psp-sev: Capitalize invalid length enum
crypto: ccp: Fix sparse, use plain integer as NULL pointer
KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled
x86/kvm: Make parse_no_xxx __init for kvm
KVM: x86: fix backward migration with async_PF
kvm: fix warning for non-x86 builds
kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD builds
tools/kvm_stat: print 'Total' line for multiple events only
tools/kvm_stat: group child events indented after parent
tools/kvm_stat: separate drilldown and fields filtering
tools/kvm_stat: eliminate extra guest/pid selection dialog
tools/kvm_stat: mark private methods as such
tools/kvm_stat: fix debugfs handling
tools/kvm_stat: print error on invalid regex
tools/kvm_stat: fix crash when filtering out all non-child trace events
tools/kvm_stat: avoid 'is' for equality checks
tools/kvm_stat: use a more pythonic way to iterate over dictionaries
...
Introduce __smp_{mb,rmb,wmb}, and rely on the generic definitions
for smp_{mb,rmb,wmb}. A first consequence is that smp_{mb,rmb,wmb}
map to a compiler barrier on !SMP (while their definition remains
unchanged on SMP). As a further consequence, smp_load_acquire and
smp_store_release have "fence rw,rw" instead of "fence iorw,iorw".
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
The routine pgattr_change_is_safe() was extended in commit 4e60205655
("arm64: mm: Permit transitioning from Global to Non-Global without BBM")
to permit changing the nG attribute from not set to set, but did so in a
way that inadvertently disallows such changes if other permitted attribute
changes take place at the same time. So update the code to take this into
account.
Fixes: 4e60205655 ("arm64: mm: Permit transitioning from Global to ...")
Cc: <stable@vger.kernel.org> # 4.14.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- fix memory accounting when reserved memory is in high memory region;
- fix DMA allocation from high memory.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEK2eFS5jlMn3N6xfYUfnMkfg/oEQFAlqTLMUTHGpjbXZia2Jj
QGdtYWlsLmNvbQAKCRBR+cyR+D+gRF/3D/49eKQJf8P/F1MQlZCGXPPR+NAtPnVe
NakVAE4jNZ/6m+pt694hdifG2LIbo5ePxWeW+7Nmn5POI6QO71qZEJR9KKD6Sbs5
1pG/+BtLHyRgef7QEqWuCaljFSvoIm7CrXnBvFbEN8FAiDlZ7MDu8w3iNtULTcyu
jxML8vfAIq8w/aRa0v+GlKNkQDk+K6UCvzTn8oBGet5zWIJ/ufjLreNQm0qkDyB9
K+UkyIUg94hC79RM4ZcJhka+cBq6B6qITft0TbdpMzWeYyZH01zGaw1t2tgi596B
T/rIlP4fpjF+aBHRoWBoiz0I0sFEoTtwiEWXS+PBSDfoDSzT9KYH+eJMi6KjQxXR
MwPmSz8wgMoObX91TO5pMhq281yZMkuee4mEdutssXisHNscwimpnlrCxAQLDLY7
kosMxHkU4UW/elpnPUWTWWReJjif9+cXzu69Hwn1QKWIZJfbPeoIqGtiFeZTBAeR
NANhS80A4Psk7XF/hgyWD44xXYB7PXLzg1cYyQDz/3hTydAnoh9Rq8I0T4ccmbDO
c4SpKwe0I9H/b8jBAuxu6YmF02EOMPNQapVOoTe/mNLUj5tsOUH8oDx/P2fxWUpZ
bY+9lwwo32AljujVxGEmJmtKKZQoJYfNlNDWIgL/fSd1sXE1Sft2M6eRG54K5tEa
+Q6M+lu+kXKWoQ==
=IRmt
-----END PGP SIGNATURE-----
Merge tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa fixes from Max Filippov:
"Two fixes for reserved memory/DMA buffers allocation in high memory on
xtensa architecture
- fix memory accounting when reserved memory is in high memory region
- fix DMA allocation from high memory"
* tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: support DMA buffers in high memory
xtensa: fix high memory/reserved memory collision
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes:
- UAPI data type correction for hyperv
- correct the cpu cores field in /proc/cpuinfo on CPU hotplug
- return proper error code in the resctrl file system failure path to
avoid silent subsequent failures
- correct a subtle accounting issue in the new vector allocation code
which went unnoticed for a while and caused suspend/resume
failures"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
x86/topology: Fix function name in documentation
x86/intel_rdt: Fix incorrect returned value when creating rdgroup sub-directory in resctrl file system
x86/apic/vector: Handle vector release on CPU unplug correctly
genirq/matrix: Handle CPU offlining proper
x86/headers/UAPI: Use __u64 instead of u64 in <uapi/asm/hyperv.h>
Pull perf fix from Thomas Gleixner:
"A single commit which shuts up a bogus GCC-8 warning"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/oprofile: Fix bogus GCC-8 warning in nmi_setup()
Pull locking fixes from Thomas Gleixner:
"Three patches to fix memory ordering issues on ALPHA and a comment to
clarify the usage scope of a mutex internal function"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
locking/xchg/alpha: Clean up barrier usage by using smp_mb() in place of __ASM__MB
locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
locking/mutex: Add comment to __mutex_owner() to deter usage
Pull cleanup patchlet from Thomas Gleixner:
"A single commit removing a bunch of bogus double semicolons all over
the tree"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
treewide/trivial: Remove ';;$' typo noise
Add handling for a missing instruction in our 32-bit BPF JIT so that it can be
used for seccomp filtering.
Add a missing NULL pointer check before a function call in new EEH code.
Fix an error path in the new ocxl driver to correctly return EFAULT.
The support for the new ibm,drc-info device tree property turns out to need
several fixes, so for now we just stop advertising to firmware that we support
it until the bugs can be ironed out.
One fix for the new drmem code which was incorrectly modifying the device tree
in place.
Finally two fixes for the RFI flush support, so that firmware can advertise to
us that it should be disabled entirely so as not to affect performance.
Thanks to:
Bharata B Rao, Frederic Barrat, Juan J. Alvarez, Mark Lord, Michael Bringmann.
-----BEGIN PGP SIGNATURE-----
iQIwBAABCAAaBQJakNObExxtcGVAZWxsZXJtYW4uaWQuYXUACgkQUevqPMjhpYAy
5A//RHFnwaXrOmcQ7h6fJim9lh+CsbhyHkf+hEGqJ4gp/ZOxjgWLjxgqB8O09hLb
CZUvbqc3phip431E+qCw1KVVoPzKXVVqEQD1mckvpVwVL7vVxOvcNfjzRH3oMsjC
jLQAfEvZMO6Lf14tb83vC5YBYOrWjojyOqJ/DbFzedw9hLLt1JTI12KfcgRZo5xr
3WhyevTHa4JJSgLhY9r34Q2hsf7CS2yTfhHa2+iHFPg2fINYb+Ld8V5JNj3JiFtx
fd6JiPFjQSyK+xXwK55eQGqMXZdEJU0iSwnYWBLUnFGaK7MQpEIJdv7FiRjD7Hcp
b3wpBuCSsN2mQIYpsoBY3GUm0thtAIbLDTRq1nAWdULbnOi1QBQy694Vgn4f8A79
T10Y1QuuwL2nInPy9qb/c6sukGz/czu9txFyqO/PwtRs99A6i4hRAtfKLUeW0cbG
QxI3GMfxst31SThqjVQifyhy7omXEjU8Oc6o1VMp8Vm1eH1QviwzcfmKA5UQuq3k
Y+ninT4wjo8MceAZ6F3w767DQmTK9e9lcgLU1ZZd/jntcnDzTr6AF54TFNmmicRb
o0arHMneMDo1IHEUfeski+0j0uZGuA/reQ6A/mDGSxum09voPwtWEH+dlAr9oKSQ
lgETCP6GN7rWluEWTYidgGGhV77QsfLNZyJYbGJXQ8cValo=
=mTqU
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Add handling for a missing instruction in our 32-bit BPF JIT so that
it can be used for seccomp filtering.
- Add a missing NULL pointer check before a function call in new EEH
code.
- Fix an error path in the new ocxl driver to correctly return EFAULT.
- The support for the new ibm,drc-info device tree property turns out
to need several fixes, so for now we just stop advertising to
firmware that we support it until the bugs can be ironed out.
- One fix for the new drmem code which was incorrectly modifying the
device tree in place.
- Finally two fixes for the RFI flush support, so that firmware can
advertise to us that it should be disabled entirely so as not to
affect performance.
Thanks to: Bharata B Rao, Frederic Barrat, Juan J. Alvarez, Mark Lord,
Michael Bringmann.
* tag 'powerpc-4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: Support firmware disable of RFI flush
powerpc/pseries: Support firmware disable of RFI flush
powerpc/mm/drmem: Fix unexpected flag value in ibm,dynamic-memory-v2
powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
powerpc/pseries: Revert support for ibm,drc-info devtree property
powerpc/pseries: Fix duplicate firmware feature for DRC_INFO
ocxl: Fix potential bad errno on irq allocation
powerpc/eeh: Fix crashes in eeh_report_resume()
This patch fixes the wrongly included dtsi file which
was breaking mainline support for Engicam i.CoreM6 DualLite/Solo RQS.
As per the board name, the correct file should be imx6dl.dtsi instead
of imx6q.dtsi
Reported-by: Michael Trimarchi <michael@amarulasolutions.com>
Suggested-by: Jagan Teki <jagan@amarulasolutions.com>
Signed-off-by: Shyam Saini <shyam@amarulasolutions.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Fixes: 7a9caba55a ("ARM: dts: imx6dl: Add Engicam i.CoreM6 DualLite/Solo RQS initial support")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The SEV LAUNCH_SECRET command fails with error code 'invalid param'
because we missed filling the guest and header system physical address
while issuing the command.
Fixes: 9f5b5b950a (KVM: SVM: Add support for SEV LAUNCH_SECRET command)
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
RSM instruction is used by the SMM handler to return from SMM mode.
Currently, rsm causes a #UD - which results in instruction fetch, decode,
and emulate. By installing the RSM intercept we can avoid the instruction
fetch since we know that #VMEXIT was due to rsm.
The patch is required for the SEV guest, because in case of SEV guest
memory is encrypted with guest-specific key and hypervisor will not
able to fetch the instruction bytes from the guest memory.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using the access_ok() to validate the input before issuing the SEV
command does not buy us anything in this case. If userland is
giving us a garbage pointer then copy_to_user() will catch it when we try
to return the measurement.
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Fixes: 0d0736f763 (KVM: SVM: Add support for KVM_SEV_LAUNCH_MEASURE ...)
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid traversing all the cpus for pv tlb flush when steal time
is disabled since pv tlb flush depends on the field in steal time
for shared data.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The early_param() is only called during kernel initialization, So Linux
marks the functions of it with __init macro to save memory.
But it forgot to mark the parse_no_kvmapf/stealacc/kvmclock_vsyscall,
So, Make them __init as well.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: rkrcmar@redhat.com
Cc: kvm@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: x86@kernel.org
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
bit when enabling async_PF, but this bit is reserved on old hypervisors,
which results in a failure upon migration.
To avoid breaking different cases, we are checking for CPUID feature bit
before enabling the feature and nothing else.
Fixes: 52a5c155cf ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
Cc: <stable@vger.kernel.org>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the following sparse warning by moving the prototype
of kvm_arch_mmu_notifier_invalidate_range() to linux/kvm_host.h .
CHECK arch/s390/kvm/../../../virt/kvm/kvm_main.c
arch/s390/kvm/../../../virt/kvm/kvm_main.c:138:13: warning: symbol 'kvm_arch_mmu_notifier_invalidate_range' was not declared. Should it be static?
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported by syzkaller:
WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
Call Trace:
vmx_handle_exit+0xbd/0xe20 [kvm_intel]
kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
do_vfs_ioctl+0xa4/0x6a0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x25/0x9c
The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
a second thread to mmap and operate on the same vCPU. This triggers a race
condition when running the testcase with multiple threads. Sometimes one thread
exits with a triple fault while another thread mmaps and operates on the same
vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
exit with KVM_EXIT_INTERNAL_ERROR.
Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Although L2 is in halt state, it will be in the active state after
VM entry if the VM entry is vectoring according to SDM 26.6.2 Activity
State. Halting the vcpu here means the event won't be injected to L2
and this decision isn't reported to L1. Thus L0 drops an event that
should be injected to L2.
Cc: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On x86, special KVM memslots such as the TSS region have anonymous
memory mappings created on behalf of userspace, and these mappings are
removed when the VM is destroyed.
It is however possible for removing these mappings via vm_munmap() to
fail. This can most easily happen if the thread receives SIGKILL while
it's waiting to acquire ->mmap_sem. This triggers the 'WARN_ON(r < 0)'
in __x86_set_memory_region(). syzkaller was able to hit this, using
'exit()' to send the SIGKILL. Note that while the vm_munmap() failure
results in the mapping not being removed immediately, it is not leaked
forever but rather will be freed when the process exits.
It's not really possible to handle this failure properly, so almost
every other caller of vm_munmap() doesn't check the return value. It's
a limitation of having the kernel manage these mappings rather than
userspace.
So just remove the WARN_ON() so that users can't spam the kernel log
with this warning.
Fixes: f0d648bdf0 ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
L1 might want to use SECONDARY_EXEC_DESC, so we must not clear the VMCS
bit if UMIP is not being emulated.
We must still set the bit when emulating UMIP as the feature can be
passed to L2 where L0 will do the emulation and because L2 can change
CR4 without a VM exit, we should clear the bit if UMIP is disabled.
Fixes: 0367f205a3 ("KVM: vmx: add support for emulating UMIP")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
The initial reset of the local APIC is performed before the VMCS has been
created, but it tries to do a vmwrite:
vmwrite error: reg 810 value 4a00 (err 18944)
CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G W I 4.16.0-0.rc2.git0.1.fc28.x86_64 #1
Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
Call Trace:
vmx_set_rvi [kvm_intel]
vmx_hwapic_irr_update [kvm_intel]
kvm_lapic_reset [kvm]
kvm_create_lapic [kvm]
kvm_arch_vcpu_init [kvm]
kvm_vcpu_init [kvm]
vmx_create_vcpu [kvm_intel]
kvm_vm_ioctl [kvm]
Move it later, after the VMCS has been created.
Fixes: 4191db26b7 ("KVM: x86: Update APICv on APIC reset")
Cc: stable@vger.kernel.org
Cc: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull networking fixes from David Miller:
1) Fix TTL offset calculation in mac80211 mesh code, from Peter Oh.
2) Fix races with procfs in ipt_CLUSTERIP, from Cong Wang.
3) Memory leak fix in lpm_trie BPF map code, from Yonghong Song.
4) Need to use GFP_ATOMIC in BPF cpumap allocations, from Jason Wang.
5) Fix potential deadlocks in netfilter getsockopt() code paths, from
Paolo Abeni.
6) Netfilter stackpointer size checks really are needed to validate
user input, from Florian Westphal.
7) Missing timer init in x_tables, from Paolo Abeni.
8) Don't use WQ_MEM_RECLAIM in mac80211 hwsim, from Johannes Berg.
9) When an ibmvnic device is brought down then back up again, it can be
sent queue entries from a previous session, handle this properly
instead of crashing. From Thomas Falcon.
10) Fix TCP checksum on LRO buffers in mlx5e, from Gal Pressman.
11) When we are dumping filters in cls_api, the output SKB is empty, and
the filter we are dumping is too large for the space in the SKB, we
should return -EMSGSIZE like other netlink dump operations do.
Otherwise userland has no signal that is needs to increase the size
of its read buffer. From Roman Kapl.
12) Several XDP fixes for virtio_net, from Jesper Dangaard Brouer.
13) Module refcount leak in netlink when a dump start fails, from Jason
Donenfeld.
14) Handle sub-optimal GSO sizes better in TCP BBR congestion control,
from Eric Dumazet.
15) Releasing bpf per-cpu arraymaps can take a long time, add a
condtional scheduling point. From Eric Dumazet.
16) Implement retpolines for tail calls in x64 and arm64 bpf JITs. From
Daniel Borkmann.
17) Fix page leak in gianfar driver, from Andy Spencer.
18) Missed clearing of estimator scratch buffer, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
net_sched: gen_estimator: fix broken estimators based on percpu stats
gianfar: simplify FCS handling and fix memory leak
ipv6 sit: work around bogus gcc-8 -Wrestrict warning
macvlan: fix use-after-free in macvlan_common_newlink()
bpf, arm64: fix out of bounds access in tail call
bpf, x64: implement retpoline for tail call
rxrpc: Fix send in rxrpc_send_data_packet()
net: aquantia: Fix error handling in aq_pci_probe()
bpf: fix rcu lockdep warning for lpm_trie map_free callback
bpf: add schedule points in percpu arrays management
regulatory: add NUL to request alpha2
ibmvnic: Fix early release of login buffer
net/smc9194: Remove bogus CONFIG_MAC reference
net: ipv4: Set addr_type in hash_keys for forwarded case
tcp_bbr: better deal with suboptimal GSO
smsc75xx: fix smsc75xx_set_features()
netlink: put module reference if dump start fails
selftests/bpf/test_maps: exit child process without error in ENOMEM case
selftests/bpf: update gitignore with test_libbpf_open
selftests/bpf: tcpbpf_kern: use in6_* macros from glibc
..
A single MIPS fix for mismatching struct compat_flock, resulting in bus
errors starting Firefox on Debian 8 since 4.13.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAlqP5u0ACgkQbAtpk944
dnqPeA//dLmtRS9Ogjkwrpfmb9AkTiB2iECKYxIUFMAeFHgBTT9skYaI8smaaZlj
VHrIP5mnL/IJaSI9exZHdYDTIdDJ9iRYPaIq29Z63rHWOQKTQmcuU5SzDZuiJdq9
lGxF/fu1cHVJzZ5SsDIDol94P2GIoZ3Gw9u4kutk//H05tzhVgnIR7ETRIKY6HsO
gij4Ubg9YqEyxGvv+opt5SCSVEOxtt/gFB4/JvR74L6mxPVUYXcCMA4MB5RqckpO
aU7zH5YAHrtcNmto4qDsJIeGBrXYVo5H/MOq9j+1Nt3RJXOB3/934ZpMsLYOCBml
JYqe+k8OlTAsWGBqHXIYhdMlTXLFlaDHW8WHViY7wZ1Eh97DIeZ7KRVJuZR9t0kW
2bdifdMhfVfekPjSjYFQJB6AaEtYMTMNSm7ZaTk4zGXShIEN6N0byfFXt2gc0olg
oZu+NwhevlQhifS3qjgACddbPru2bf0QI6hByCXRbjccKJlB7B/ClKo3l7AmCyb9
Bfe/WW24gbSDp1VhHNNaB9kO2PaXXzDqnJN2AjrwgG4g0HixCcsoPVk/dKgp/AY2
+ewQfCrzdYW4Y/Nkfwg0gfuy/eQx2vZ3DdoMS7MeHVgdJw956bdnD5lgDgeldLpy
KSuQlxCPJm2zV3gV/QFJxAHOHrIIk5V9+WDh1ZdInfYjxcgDFW8=
=h/p9
-----END PGP SIGNATURE-----
Merge tag 'mips_fixes_4.16_3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips
Pull MIPS fix from James Hogan:
"A single MIPS fix for mismatching struct compat_flock, resulting in
bus errors starting Firefox on Debian 8 since 4.13"
* tag 'mips_fixes_4.16_3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
MIPS: Drop spurious __unused in struct compat_flock
We have certain cases where the multiple epoch facility is broken:
- timer wakeup during epoch change
- cpu hotplug
- SCK instruction
- stp sync checks
Fix those.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJajqO9AAoJEBF7vIC1phx8fk4QAIACbrjlCpdqQ/3s8JC6SfNb
2tnyt4gQLw0ztb11kPGkjYXAkG9SA7v4Y3J3oXDtFH8BP/xhf6CO3jVmWCXUNv7E
wfk04Dh0xJwnwBsHYuERFlngB2BTODLoHV/w00fd/ja1c8T5yGULzADie6dJjNNT
B/q/eCIpzHQYZLZnBW7+YO05ciMwssi2luq46uijY/MZkfCYIvO8pf4MNcuLPvWq
CepxzCyXbdy2xw0fWu7lrYk/0VU08eYchGbqjsDbpuz3CdbKJhVLwZhSGx89WebX
/+s2IKXQZEtxKcBWHOZS2k98mB8LNMLumnaoeEJDjDt3T+lu3B/ujGfPURipcvGQ
0ch4iM5Fmhyx3IxYk4lEgrdoRpjHdjnBs1ONyNGIx35NJrfWjAsRRHw6ov6qQ0rH
rcDmBC8bBZmZYTxBXD+R5rTn+noJp2OkNt4Wc5X7SnKj3DIbfR3FKgT3z+mtJyIX
l8+qnaQpj/Pchuko4j7gh0/uzHVt3WtG3HtLqQnqHJTZM+b9nkIeDfUbdp3cLycD
W2wfs9LO2tXXcX1A05KFPPSjNDUypz1ToAfyt6JgPXjE7ZfHkpLJTQPrN+BoJZCk
3P//LQ85yJaDNcEJtH9S7nGjhSTdW1MqeO61mhlkag4A5Qe2Mquqd18H1ngac7aq
0Xna6qvJBvdvPwEmI04H
=umvB
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-master-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: fixes for multiple epoch facility
We have certain cases where the multiple epoch facility is broken:
- timer wakeup during epoch change
- cpu hotplug
- SCK instruction
- stp sync checks
Fix those.
Fix the interaction of userspace irqchip VMs with in-kernl irqchip VMs
and make sure we can build 32-bit KVM/ARM with gcc-8.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJakFXlAAoJEEtpOizt6ddyg0kH/1Aor4Gkra8AFVaNP8r3xZxN
ZZfxiKDBuca32uEKtkk0DUKaPIFRJS60K4SvKJTPbOV5gj43vq29SNrpBiBHNdj+
5hcZRjH66p5HzBmZZk/hispijpQUW2cXDnf2tsupknwENqWHhgf440t3ruuDLdLU
npg+I/0588uPWaljj2YfAn1DQOFLx/gpjuNY8n+dHYm6PcTtVp+tTYEYu2P5znFa
WOn6Zw0XLVdC4e3sFOdajlGk5TZ8+ILEWVJkXKg5bd0vV67WB3mCvsz6byr9c5CU
vdjRgz2/iSGldbo+e4LuqnxSUjaVF53T2BSfKLl1YO0J2vIP1hkZl7XR0tRJfww=
=SF7J
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-fixes-for-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM Fixes for v4.16, Round 1
Fix the interaction of userspace irqchip VMs with in-kernl irqchip VMs
and make sure we can build 32-bit KVM/ARM with gcc-8.
do_task_stat() calls get_wchan(), which further does unwind_frame().
unwind_frame() restores frame->pc to original value in case function
graph tracer has modified a return address (LR) in a stack frame to hook
a function return. However, if function graph tracer has hit a filtered
function, then we can't unwind it as ftrace_push_return_trace() has
biased the index(frame->graph) with a 'huge negative'
offset(-FTRACE_NOTRACE_DEPTH).
Moreover, arm64 stack walker defines index(frame->graph) as unsigned
int, which can not compare a -ve number.
Similar problem we can have with calling of walk_stackframe() from
save_stack_trace_tsk() or dump_backtrace().
This patch fixes unwind_frame() to test the index for -ve value and
restore index accordingly before we can restore frame->pc.
Reproducer:
cd /sys/kernel/debug/tracing/
echo schedule > set_graph_notrace
echo 1 > options/display-graph
echo wakeup > current_tracer
ps -ef | grep -i agent
Above commands result in:
Unable to handle kernel paging request at virtual address ffff801bd3d1e000
pgd = ffff8003cbe97c00
[ffff801bd3d1e000] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000006 [#1] SMP
[...]
CPU: 5 PID: 11696 Comm: ps Not tainted 4.11.0+ #33
[...]
task: ffff8003c21ba000 task.stack: ffff8003cc6c0000
PC is at unwind_frame+0x12c/0x180
LR is at get_wchan+0xd4/0x134
pc : [<ffff00000808892c>] lr : [<ffff0000080860b8>] pstate: 60000145
sp : ffff8003cc6c3ab0
x29: ffff8003cc6c3ab0 x28: 0000000000000001
x27: 0000000000000026 x26: 0000000000000026
x25: 00000000000012d8 x24: 0000000000000000
x23: ffff8003c1c04000 x22: ffff000008c83000
x21: ffff8003c1c00000 x20: 000000000000000f
x19: ffff8003c1bc0000 x18: 0000fffffc593690
x17: 0000000000000000 x16: 0000000000000001
x15: 0000b855670e2b60 x14: 0003e97f22cf1d0f
x13: 0000000000000001 x12: 0000000000000000
x11: 00000000e8f4883e x10: 0000000154f47ec8
x9 : 0000000070f367c0 x8 : 0000000000000000
x7 : 00008003f7290000 x6 : 0000000000000018
x5 : 0000000000000000 x4 : ffff8003c1c03cb0
x3 : ffff8003c1c03ca0 x2 : 00000017ffe80000
x1 : ffff8003cc6c3af8 x0 : ffff8003d3e9e000
Process ps (pid: 11696, stack limit = 0xffff8003cc6c0000)
Stack: (0xffff8003cc6c3ab0 to 0xffff8003cc6c4000)
[...]
[<ffff00000808892c>] unwind_frame+0x12c/0x180
[<ffff000008305008>] do_task_stat+0x864/0x870
[<ffff000008305c44>] proc_tgid_stat+0x3c/0x48
[<ffff0000082fde0c>] proc_single_show+0x5c/0xb8
[<ffff0000082b27e0>] seq_read+0x160/0x414
[<ffff000008289e6c>] __vfs_read+0x58/0x164
[<ffff00000828b164>] vfs_read+0x88/0x144
[<ffff00000828c2e8>] SyS_read+0x60/0xc0
[<ffff0000080834a0>] __sys_trace_return+0x0/0x4
Fixes: 20380bb390 (arm64: ftrace: fix a stack tracer's output under function graph tracer)
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
[catalin.marinas@arm.com: replace WARN_ON with WARN_ON_ONCE]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The allocation of host_data is not null checked, leading to a null
pointer dereference if the allocation fails. Fix this by adding a null
check and return with -ENOMEM.
Fixes: 64b139f97c ("MIPS: OCTEON: irq: add CIB and other fixes")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "Steven J. Hill" <Steven.Hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.0+
Patchwork: https://patchwork.linux-mips.org/patch/18658/
Signed-off-by: James Hogan <jhogan@kernel.org>
Currently there is no null check on a failed allocation of board_data,
and hence a null pointer dereference will occurr. Fix this by checking
for the out of memory null pointer.
Fixes: a747371748 ("MIPS: ath25: add board configuration detection")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.19+
Patchwork: https://patchwork.linux-mips.org/patch/18657/
Signed-off-by: James Hogan <jhogan@kernel.org>
Without this fix, /proc/cpuinfo will display an incorrect amount
of CPU cores, after bringing them offline and online again, as
exemplified below:
$ cat /proc/cpuinfo | grep cores
cpu cores : 4
cpu cores : 8
cpu cores : 8
cpu cores : 20
cpu cores : 4
cpu cores : 3
cpu cores : 2
cpu cores : 2
This patch fixes this by always zeroing the booted_cores variable
upon turning off a logical CPU.
Tested-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: jgross@suse.com
Cc: luto@kernel.org
Cc: prarit@redhat.com
Cc: vkuznets@redhat.com
Link: http://lkml.kernel.org/r/20180221205036.5244-1-sneves@dei.uc.pt
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Successful RMW operations are supposed to be fully ordered, but
Alpha's xchg() and cmpxchg() do not meet this requirement.
Will Deacon noticed the bug:
> So MP using xchg:
>
> WRITE_ONCE(x, 1)
> xchg(y, 1)
>
> smp_load_acquire(y) == 1
> READ_ONCE(x) == 0
>
> would be allowed.
... which thus violates the above requirement.
Fix it by adding a leading smp_mb() to the xchg() and cmpxchg() implementations.
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-alpha@vger.kernel.org
Link: http://lkml.kernel.org/r/1519291488-5752-1-git-send-email-parri.andrea@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Replace each occurrence of __ASM__MB with a (trailing) smp_mb() in
xchg(), cmpxchg(), and remove the now unused __ASM__MB definitions;
this improves readability, with no additional synchronization cost.
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-alpha@vger.kernel.org
Link: http://lkml.kernel.org/r/1519291469-5702-1-git-send-email-parri.andrea@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving
branch hints to the compiler can actually make a substantial cycle
difference by keeping the fast path contiguous in memory.
With this optimization, the retpoline-guest/retpoline-host case is
about 50 cycles faster.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Having a paravirt indirect call in the IBRS restore path is not a
good idea, since we are trying to protect from speculative execution
of bogus indirect branch targets. It is also slower, so use
native_wrmsrl() on the vmentry path too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: d28b387fb7
Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If no monitoring feature is detected because all monitoring features are
disabled during boot time or there is no monitoring feature in hardware,
creating rdtgroup sub-directory by "mkdir" command reports error:
mkdir: cannot create directory ‘/sys/fs/resctrl/p1’: No such file or directory
But the sub-directory actually is generated and content is correct:
cpus cpus_list schemata tasks
The error is because rdtgroup_mkdir_ctrl_mon() returns non zero value after
the sub-directory is created and the returned value is reported as an error
to user.
Clear the returned value to report to user that the sub-directory is
actually created successfully.
Signed-off-by: Wang Hui <john.wanghui@huawei.com>
Signed-off-by: Zhang Yanfei <yanfei.zhang@huawei.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vikas <vikas.shivappa@intel.com>
Cc: Xiaochen Shen <xiaochen.shen@intel.com>
Link: http://lkml.kernel.org/r/1519356363-133085-1-git-send-email-fenghua.yu@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When a irq vector is replaced, then the previous vector is normally
released when the first interrupt happens on the new vector. If the target
CPU of the previous vector is already offline when the new vector is
installed, then the previous vector is silently discarded, which leads to
accounting issues causing suspend failures and other problems.
Adjust the logic so that the previous vector is freed in the underlying
matrix allocator to ensure that the accounting stays correct.
Fixes: 69cde0004a ("x86/vector: Use matrix allocator for vector assignment")
Reported-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Yuriy Vostrikov <delamonpansie@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180222112316.930791749@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some versions of firmware will have a setting that can be configured
to disable the RFI flush, add support for it.
Fixes: 6e032b350c ("powerpc/powernv: Check device-tree for RFI flush settings")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Some versions of firmware will have a setting that can be configured
to disable the RFI flush, add support for it.
Fixes: 8989d56878 ("powerpc/pseries: Query hypervisor for RFI flush settings")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Memory addtion and removal by count and indexed-count methods
temporarily mark the LMBs that are being added/removed by a special
flag value DRMEM_LMB_RESERVED. Accessing flags value directly at a few
places without proper accessor method is causing two unexpected
side-effects:
- DRMEM_LMB_RESERVED bit is becoming part of the flags word of
drconf_cell_v2 entries in ibm,dynamic-memory-v2 DT property.
- This results in extra drconf_cell entries in ibm,dynamic-memory-v2.
For example if 1G memory is added, it leads to one entry for 3 LMBs
and 1 separate entry for the last LMB. All the 4 LMBs should be
defined by one entry here.
Fix this by always accessing the flags by its accessor method
drmem_lmb_flags().
Fixes: 2b31e3aec1 ("powerpc/drmem: Add support for ibm, dynamic-memory-v2 property")
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The MIPS %.its.S compiler command did not define __ASSEMBLY__, which meant
when compiler_types.h was added to kconfig.h, unexpected things appeared
(e.g. struct declarations) which should not have been present. As done in
the general %.S compiler command, __ASSEMBLY__ is now included here too.
The failure was:
Error: arch/mips/boot/vmlinux.gz.its:201.1-2 syntax error
FATAL ERROR: Unable to parse input tree
/usr/bin/mkimage: Can't read arch/mips/boot/vmlinux.gz.itb.tmp: Invalid argument
/usr/bin/mkimage Can't add hashes to FIT blob
Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 28128c61e0 ("kconfig.h: Include compiler types to avoid missed struct attributes")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I recently noticed a crash on arm64 when feeding a bogus index
into BPF tail call helper. The crash would not occur when the
interpreter is used, but only in case of JIT. Output looks as
follows:
[ 347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
[...]
[ 347.043065] [fffb850e96492510] address between user and kernel address ranges
[ 347.050205] Internal error: Oops: 96000004 [#1] SMP
[...]
[ 347.190829] x13: 0000000000000000 x12: 0000000000000000
[ 347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
[ 347.201427] x9 : 0000000000000000 x8 : 0000000000000000
[ 347.206726] x7 : 0000000000000000 x6 : 001c991738000000
[ 347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
[ 347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
[ 347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
[ 347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
[ 347.235221] Call trace:
[ 347.237656] 0xffff000002f3a4fc
[ 347.240784] bpf_test_run+0x78/0xf8
[ 347.244260] bpf_prog_test_run_skb+0x148/0x230
[ 347.248694] SyS_bpf+0x77c/0x1110
[ 347.251999] el0_svc_naked+0x30/0x34
[ 347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
[...]
In this case the index used in BPF r3 is the same as in r1
at the time of the call, meaning we fed a pointer as index;
here, it had the value 0xffff808fd7cf0500 which sits in x2.
While I found tail calls to be working in general (also for
hitting the error cases), I noticed the following in the code
emission:
# bpftool p d j i 988
[...]
38: ldr w10, [x1,x10]
3c: cmp w2, w10
40: b.ge 0x000000000000007c <-- signed cmp
44: mov x10, #0x20 // #32
48: cmp x26, x10
4c: b.gt 0x000000000000007c
50: add x26, x26, #0x1
54: mov x10, #0x110 // #272
58: add x10, x1, x10
5c: lsl x11, x2, #3
60: ldr x11, [x10,x11] <-- faulting insn (f86b694b)
64: cbz x11, 0x000000000000007c
[...]
Meaning, the tests passed because commit ddb55992b0 ("arm64:
bpf: implement bpf_tail_call() helper") was using signed compares
instead of unsigned which as a result had the test wrongly passing.
Change this but also the tail call count test both into unsigned
and cap the index as u32. Latter we did as well in 90caccdd8c
("bpf: fix bpf_tail_call() x64 JIT") and is needed in addition here,
too. Tested on HiSilicon Hi1616.
Result after patch:
# bpftool p d j i 268
[...]
38: ldr w10, [x1,x10]
3c: add w2, w2, #0x0
40: cmp w2, w10
44: b.cs 0x0000000000000080
48: mov x10, #0x20 // #32
4c: cmp x26, x10
50: b.hi 0x0000000000000080
54: add x26, x26, #0x1
58: mov x10, #0x110 // #272
5c: add x10, x1, x10
60: lsl x11, x2, #3
64: ldr x11, [x10,x11]
68: cbz x11, 0x0000000000000080
[...]
Fixes: ddb55992b0 ("arm64: bpf: implement bpf_tail_call() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Implement a retpoline [0] for the BPF tail call JIT'ing that converts
the indirect jump via jmp %rax that is used to make the long jump into
another JITed BPF image. Since this is subject to speculative execution,
we need to control the transient instruction sequence here as well
when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
The latter aligns also with what gcc / clang emits (e.g. [1]).
JIT dump after patch:
# bpftool p d x i 1
0: (18) r2 = map[id:1]
2: (b7) r3 = 0
3: (85) call bpf_tail_call#12
4: (b7) r0 = 2
5: (95) exit
With CONFIG_RETPOLINE:
# bpftool p d j i 1
[...]
33: cmp %edx,0x24(%rsi)
36: jbe 0x0000000000000072 |*
38: mov 0x24(%rbp),%eax
3e: cmp $0x20,%eax
41: ja 0x0000000000000072 |
43: add $0x1,%eax
46: mov %eax,0x24(%rbp)
4c: mov 0x90(%rsi,%rdx,8),%rax
54: test %rax,%rax
57: je 0x0000000000000072 |
59: mov 0x28(%rax),%rax
5d: add $0x25,%rax
61: callq 0x000000000000006d |+
66: pause |
68: lfence |
6b: jmp 0x0000000000000066 |
6d: mov %rax,(%rsp) |
71: retq |
72: mov $0x2,%eax
[...]
* relative fall-through jumps in error case
+ retpoline for indirect jump
Without CONFIG_RETPOLINE:
# bpftool p d j i 1
[...]
33: cmp %edx,0x24(%rsi)
36: jbe 0x0000000000000063 |*
38: mov 0x24(%rbp),%eax
3e: cmp $0x20,%eax
41: ja 0x0000000000000063 |
43: add $0x1,%eax
46: mov %eax,0x24(%rbp)
4c: mov 0x90(%rsi,%rdx,8),%rax
54: test %rax,%rax
57: je 0x0000000000000063 |
59: mov 0x28(%rax),%rax
5d: add $0x25,%rax
61: jmpq *%rax |-
63: mov $0x2,%eax
[...]
* relative fall-through jumps in error case
- plain indirect jump as before
[0] https://support.google.com/faqs/answer/7625886
[1] a31e654fa1
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Improve the DTS files by removing all the leading "0x" and zeros to fix the
following dtc warnings:
Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
and
Warning (unit_address_format): Node /XXX unit name should not have leading 0s
Converted using the following command:
find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -i -e "s/@\([0-9a-fA-FxX\.;:#]+\)\s*{/@\L\1 {/g" -e "s/@0x\(.*\) {/@\1 {/g" -e "s/@0+\(.*\) {/@\1 {/g" {} +^C
For simplicity, two sed expressions were used to solve each warnings separately.
To make the regex expression more robust a few other issues were resolved,
namely setting unit-address to lower case, and adding a whitespace before the
the opening curly brace:
https://elinux.org/Device_Tree_Linux#Linux_conventions
This will solve as a side effect warning:
Warning (simple_bus_reg): Node /XXX@<UPPER> simple-bus unit address format error, expected "<lower>"
This is a follow up to commit 4c9847b737 ("dt-bindings: Remove leading 0x from bindings notation")
Reported-by: David Daney <ddaney@caviumnetworks.com>
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
This pull request contains a handful of small cleanups. The only
functional change is that IRQs are now enabled during exception
handling, which was found when some warnings triggered with
`CONFIG_DEBUG_ATOMIC_SLEEP=y`. The remaining fixes should have no
functional change: `sbi_save()` has been renamed to `parse_dtb()`
reflect what it actually does, and a handful of unused Kconfig entries
have been removed.
This is based on rc1 as a break to my usual flow, as it appears I missed
rc2 over the holiday. I've merged master as of Wednesday morning,
af3e79d295 ("Merge tag 'leds_for-4.16-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds")
without any conflicts and I've given it a simple build test. If this
isn't OK then feel free to drop the patch set and I'll send another
against rc3 for rc4.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlqNnPYTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRDvTKFQLMurQdslD/48VsB5shoMy2hzs7N0myJ0869zB9w4
C+7A+vgJQCutDbzZVRLCWhRLqZ3bL51ids1cnDtOMISYqHctuIW7xGOtoJELm1qc
78KXyhr6UvIFqPHyVGWOygW52dzDdrL7HKkDvy+TWk11yWIDZpgqVSVWJPZrvTm2
qK5IU2NpFVeH8pKkq7G4hZuWRJh2XlPefH7pOh1NYIDK90UShUOK1BmIahraq7Jb
wIlR6Xxet2BoxjG5wHhSG9JOYJCsgEeoEb6D0fJJz4JIBXHySLUWb3ZfwGstLY2m
AGO485E+EU0YL6E5AhPrenCzX4xyJlOTb7spAFAqTEkKyytwdM27A+Yv95wo47NZ
FtlsIRyQsiU6+XWCTvZP53ARQB8J92Pyw1KyDBf8WCl47BJOKwlCBNOvKNDLXtKK
E9h8FuXsBHGeay/+QQCWvwr7uqT2Z4Q33MYBKlmRVAQlEGj10WywVqai4Kk5Tz3f
tmvhHxtbA3Fuu4OACM8xZ4m/vDj+pGLkKw91v9XHF8hcBB5x/fkVRt+GHmg5EsKN
n9f936OpvT8RoBAvCNrtK7zXNWemwKmzkAH0GOdTpkW2fSSokTT9Gnyb66pvrIPL
TH0SqKOMQ8vQMMm/W0nxuqxaF99qvxoy2bTqa2ojT/NUZQqXz42WxSfhoeRJFTvF
UdElhd88BBwcoA==
=9zYf
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-4.16-rc3-riscv_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V cleanups from Palmer Dabbelt:
"This contains a handful of small cleanups.
The only functional change is that IRQs are now enabled during
exception handling, which was found when some warnings triggered with
`CONFIG_DEBUG_ATOMIC_SLEEP=y`.
The remaining fixes should have no functional change: `sbi_save()` has
been renamed to `parse_dtb()` reflect what it actually does, and a
handful of unused Kconfig entries have been removed"
* tag 'riscv-for-linus-4.16-rc3-riscv_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Rename sbi_save to parse_dtb to improve code readability
RISC-V: Enable IRQ during exception handling
riscv: Remove ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE select
riscv: kconfig: Remove RISCV_IRQ_INTC select
riscv: Remove ARCH_WANT_OPTIONAL_GPIOLIB select
Merge misc fixes from Andrew Morton:
"16 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: don't defer struct page initialization for Xen pv guests
lib/Kconfig.debug: enable RUNTIME_TESTING_MENU
vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems
selftests/memfd: add run_fuse_test.sh to TEST_FILES
bug.h: work around GCC PR82365 in BUG()
mm/swap.c: make functions and their kernel-doc agree (again)
mm/zpool.c: zpool_evictable: fix mismatch in parameter name and kernel-doc
ida: do zeroing in ida_pre_get()
mm, swap, frontswap: fix THP swap if frontswap enabled
certs/blacklist_nohashes.c: fix const confusion in certs blacklist
kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
mm, mlock, vmscan: no more skipping pagevecs
mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats
Kbuild: always define endianess in kconfig.h
include/linux/sched/mm.h: re-inline mmdrop()
tools: fix cross-compile var clobbering