Some special-purpose registers that were present and accessible
by guests on POWER8 no longer exist on POWER9, so this adds
feature sections to ensure that we don't try to context-switch
them when going into or out of a guest on POWER9. These are
all relatively obscure, rarely-used registers, but we had to
context-switch them on POWER8 to avoid creating a covert channel.
They are: SPMC1, SPMC2, MMCRS, CSIGR, TACR, TCSCR, and ACOP.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
On POWER9, the SDR1 register (hashed page table base address) is no
longer used, and instead the hardware reads the HPT base address
and size from the partition table. The partition table entry also
contains the bits that specify the page size for the VRMA mapping,
which were previously in the LPCR. The VPM0 bit of the LPCR is
now reserved; the processor now always uses the VRMA (virtual
real-mode area) mechanism for guest real-mode accesses in HPT mode,
and the RMO (real-mode offset) mechanism has been dropped.
When entering or exiting the guest, we now only have to set the
LPIDR (logical partition ID register), not the SDR1 register.
There is also no requirement now to transition via a reserved
LPID value.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This adapts the KVM-HV hashed page table (HPT) code to read and write
HPT entries in the new format defined in Power ISA v3.00 on POWER9
machines. The new format moves the B (segment size) field from the
first doubleword to the second, and trims some bits from the AVA
(abbreviated virtual address) and ARPN (abbreviated real page number)
fields. As far as possible, the conversion is done when reading or
writing the HPT entries, and the rest of the code continues to use
the old format.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This merges in the ppc-kvm topic branch to get changes to
arch/powerpc code that are necessary for adding POWER9 KVM support.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Define and set the POWER9 HFSCR doorbell bit so that guests can use
msgsndp.
ISA 3.0 calls this MSGP, so name it accordingly in the code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ISA 3.0 defines a new PECE (Power-saving mode Exit Cause Enable) field
in the LPCR (Logical Partitioning Control Register), called
LPCR_PECE_HVEE (Hypervisor Virtualization Exit Enable).
KVM code will need to know about this bit, so add a definition for it.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
ISA 3.00 adds the logical PVR value 0x0f000005, so add a definition for
this.
Define PCR_ARCH_207 to reflect ISA 2.07 compatibility mode in the processor
compatibility register (PCR).
[paulus@ozlabs.org - moved dummy PCR_ARCH_300 value into next patch]
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi()
and opal_int_set_mfrr(), for use by KVM real-mode code.
It also exports opal_int_set_mfrr() so that the modular part of KVM
can use it to send IPIs.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
POWER9 requires the host to set up a partition table, which is a
table in memory indexed by logical partition ID (LPID) which
contains the pointers to page tables and process tables for the
host and each guest.
This factors out the initialization of the partition table into
a single function. This code was previously duplicated between
hash_utils_64.c and pgtable-radix.c.
This provides a function for setting a partition table entry,
which is used in early MMU initialization, and will be used by
KVM whenever a guest is created. This function includes a tlbie
instruction which will flush all TLB entries for the LPID and
all caches of the partition table entry for the LPID, across the
system.
This also moves a call to memblock_set_current_limit(), which was
in radix_init_partition_table(), but has nothing to do with the
partition table. By analogy with the similar code for hash, the
call gets moved to near the end of radix__early_init_mmu(). It
now gets called when running as a guest, whereas previously it
would only be called if the kernel is running as the host.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Drop duplicate header asm/iommu.h from book3s_64_vio_hv.c.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
The hashed page table MMU in POWER processors can update the R
(reference) and C (change) bits in a HPTE at any time until the
HPTE has been invalidated and the TLB invalidation sequence has
completed. In kvmppc_h_protect, which implements the H_PROTECT
hypercall, we read the HPTE, modify the second doubleword,
invalidate the HPTE in memory, do the TLB invalidation sequence,
and then write the modified value of the second doubleword back
to memory. In doing so we could overwrite an R/C bit update done
by hardware between when we read the HPTE and when the TLB
invalidation completed. To fix this we re-read the second
doubleword after the TLB invalidation and OR in the (possibly)
new values of R and C. We can use an OR since hardware only ever
sets R and C, never clears them.
This race was found by code inspection. In principle this bug could
cause occasional guest memory corruption under host memory pressure.
Fixes: a8606e20e4 ("KVM: PPC: Handle some PAPR hcalls in the kernel", 2011-06-29)
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
When switching from/to a guest that has a transaction in progress,
we need to save/restore the checkpointed register state. Although
XER is part of the CPU state that gets checkpointed, the code that
does this saving and restoring doesn't save/restore XER.
This fixes it by saving and restoring the XER. To allow userspace
to read/write the checkpointed XER value, we also add a new ONE_REG
specifier.
The visible effect of this bug is that the guest may see its XER
value being corrupted when it uses transactions.
Fixes: e4e3812150 ("KVM: PPC: Book3S HV: Add transactional memory support")
Fixes: 0a8eccefcb ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This keeps a per vcpu cache for recently page faulted MMIO entries.
On a page fault, if the entry exists in the cache, we can avoid some
time-consuming paths, for example, looking up HPT, locking HPTE twice
and searching mmio gfn from memslots, then directly call
kvmppc_hv_emulate_mmio().
In current implenment, we limit the size of cache to four. We think
it's enough to cover the high-frequency MMIO HPTEs in most case.
For example, considering the case of using virtio device, for virtio
legacy devices, one HPTE could handle notifications from up to
1024 (64K page / 64 byte Port IO register) devices, so one cache entry
is enough; for virtio modern devices, we always need one HPTE to handle
notification for each device because modern device would use a 8M MMIO
register to notify host instead of Port IO register, typically the
system's configuration should not exceed four virtio devices per
vcpu, four cache entry is also enough in this case. Of course, if needed,
we could also modify the macro to a module parameter in the future.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Currently we mark a HPTE for emulated MMIO with HPTE_V_ABSENT bit
set as well as key 0x1f. However, those HPTEs may be conflicted with
the HPTE for real guest RAM page HPTE with key 0x1f when the page
get paged out.
This patch clears the key field of HPTE when the page is paged out,
then recover it when HPTE is re-established.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Using list_move_tail() instead of list_del() + list_add_tail().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
A bunch of KVM functions are only called from assembler.
Give them prototypes in asm-prototypes.h
This reduces sparse warnings.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Squash a couple of sparse warnings by making things static.
Build tested.
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Sparse populated CPUID leafs are collected in a software provided leaf to
avoid bloat of the x86_capability array, but there is no way to rebuild the
real leafs (e.g. for KVM CPUID enumeration) other than rereading the CPUID
leaf from the CPU. While this is possible it is problematic as it does not
take software disabled features into account. If a feature is disabled on
the host it should not be exposed to a guest either.
Add get_scattered_cpuid_leaf() which rebuilds the leaf from the scattered
cpuid table information and the active CPU features.
[ tglx: Rewrote changelog ]
Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-3-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
cpuid_regs is defined multiple times as structure and enum. Rename the enum
and move all of it to processor.h so we don't end up with more instances.
Rename the misnomed register enumeration from CR_* to the obvious CPUID_*.
[ tglx: Rewrote changelog ]
Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-2-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This changes the way that we support the new ISA v3.00 HPTE format.
Instead of adapting everything that uses HPTE values to handle either
the old format or the new format, depending on which CPU we are on,
we now convert explicitly between old and new formats if necessary
in the low-level routines that actually access HPTEs in memory.
This limits the amount of code that needs to know about the new
format and makes the conversions explicit. This is OK because the
old format contains all the information that is in the new format.
This also fixes operation under a hypervisor, because the H_ENTER
hypercall (and other hypercalls that deal with HPTEs) will continue
to require the HPTE value to be supplied in the old format. At
present the kernel will not boot in HPT mode on POWER9 under a
hypervisor.
This fixes and partially reverts commit 50de596de8
("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29).
Fixes: 50de596de8 ("powerpc/mm/hash: Add support for Power9 Hash")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
wait for next week.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQExBAABCAAbBQJYJvAXFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
ztUH/3DZVYkVYUea+Sk1mBnLaK5cbEJMGtxV/NsAqwz8rYB981cB5Iqw0+f2HX2G
LWLc0cf/kyUH+6TrFxQFyXLsBvV7xku2YwJdoiRutpR50767qPFPaPUBHcxCYPYR
MI4VSXb1hW0c4rd8409HesKv5qc5BiYjdWKPMu+f1LUF2CED+Xq4SWHVg+CwSE8i
UInvFUlj/ejQlSdxN2piv87SRcMuO+4nzP+JbtyN7onvtLIah6JfvHhlOO6em+0c
KX1G3qQ6fth6jewTdOPUB1Tx8CoEAZ+YLieRQA19vGHVI4eIhofDgtSaPZJmu9G7
0kjbrlqAuViTEsRl7Dx8zT8OpBM=
=N6w9
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"ARM fixes. There are a couple pending x86 patches but they'll have to
wait for next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: arm/arm64: vgic: Kick VCPUs when queueing already pending IRQs
KVM: arm/arm64: vgic: Prevent access to invalid SPIs
arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU
- mmap handler for dma ops as generic handler no longer works for us [Alexey]
- Fixes for EZChip platform [Noam]
- Fix RTC clocksource driver build issue
- ARC IRQ handling fixes [Yuriy]
- Revert a recent makefile change which doesn't go well with oldish tools out in the wild
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYJhV1AAoJEGnX8d3iisJeZZgP/2/nJVi4LTpXYqhpt04yNvCD
Fxwt8KypWfpbd8KDiK+C0OJiFmCa2XAZwAamRBkgtgKreq9/Vq4MYEiQXcYlQkQb
dCmU6T+RmSb15/7N3ZD1xhXB34ktGz1ffjcp8si0LIOkOj5cHQ3Ex4MH/SVQWuRh
VypLOfTJlLS0DoEvo/S4Vk2fBH65dLAWoeBl63J6yAcEdflzkoTmH9QWVzGlHGmi
mZ66/OeD4Rh7GDnrv5BQYLZJwUGQjoV3n4MhchrZ5uVt6UWs56ypi+ZVML/qYS5m
YuCOY7O9ZJ94/XY/pMa7CqEPgNm4ZHN/7DQ32F9X5BemQxZY8/JvUjPsuP42iXol
DfhtA55mNRk6yP0Ku5L2xg64mazyy9HqZnSh2pYpUMePSkcX/AXMrlOWSzFh250s
qkXEpwQgccl4HMA6xarjeQxFZL34lh1N40p3s6PXT6UJwfWOgD2w8Z4qW1jdp7G2
VkblWUN5AVT2YcyBcKgMGG09sLgIXhV40NVO4fvOo4p6qBSYk/yL3YRBKppOt+aQ
EVf6yaUKlpac6D9Ozy1pG5yfcx6wwaT0Iregjsld35WJ8Lvg9tLo0oLJrU0tlyCY
tX6zbgcXXU2y0TPKN1jJQ2u14Pfscfd5DLo1M64COKAGIaPrWYUBxJ0sBhqolq66
gKJwyOWse63H4ZxkEFCP
=hbBv
-----END PGP SIGNATURE-----
Merge tag 'arc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
- mmap handler for dma ops as generic handler no longer works for us
[Alexey]
- Fixes for EZChip platform [Noam]
- Fix RTC clocksource driver build issue
- ARC IRQ handling fixes [Yuriy]
- Revert a recent makefile change which doesn't go well with oldish
tools out in the wild
* tag 'arc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARCv2: MCIP: Use IDU_M_DISTRI_DEST mode if there is only 1 destination core
ARC: IRQ: Do not use hwirq as virq and vice versa
ARC: [plat-eznps] set default baud for early console
ARC: [plat-eznps] remove IPI clear from SMP operations
Revert "ARC: build: retire old toggles"
ARC: timer: rtc: implement read loop in "C" vs. inline asm
ARC: change return value of userspace cmpxchg assist syscall
arc: Implement arch-specific dma_map_ops.mmap
ARC: [SMP] avoid overriding present cpumask
ARC: Enable PERF_EVENTS in nSIM driven platforms
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYJg7LAAoJEFmIoMA60/r8J3UP/04+KrtlbNWV4UzIadry4NJ4
w05NUZZQnCc4PfN63Yab9IS6UjDiuK0spe3tzjPub/1uKrjEVZzOTnKZLaJe5hKA
/eRavTTeBwYfI1/Otw1pMLzbEOe/OMjvMuf96KRla/dlnVG6QTppD81jwW+lYZ51
I7gRqD+cL6f2X4tw3ORWi0EsqvOvbhSiNBklkQkEXH9epDXFnqiKgpom+Wqhl3TP
G7ZPX4PAk8eScEfwkAOCP9QSdCFaKlRMMLNXCScKsxRGKkkXUrGQxj17GWYjInAx
tAbjUwImrAHuVYQMggawXlmRI4+gdbpOYVV7yl2yuHecYFqr2o+9KX7A7hUJatYJ
37TeWUHZKONiQ31+fYFIUM814H6Yzl76m7ZrexYQ0Dq3roDLZSP9/RtFxO1/ADlJ
VKlGZ7clvklKPU2GfVCL6GR2mvIxBv8AOZi1YuK0haMXuPJmgkxnoBvM7W4krWhR
Ynffu5HI7+BWi1syp/tay1fYhWy08TmKTnSZFiIxOQgC3MCjB4uEHa31RB3BxnpV
tJiPYxikjO6hXwTWJJRULsP9P2fMObX/+fphb5FamHbx0kRr+wSqlmjCJJA2g338
1cK6TLAJ2Zl6aF2l7FWtr6YYvLBfPAj5PVxpySSS+yYt4Pq41weKKH0XUn7u+OXG
xLcZcDEUdjqoZnjkEG8X
=5c2j
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.9-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Update MAINTAINERS for Intel VMD driver filename
- Update Rockchip rk3399 host bridge driver DTS and resets
- Fix ROM shadow problem that made some video device initialization
fail
* tag 'pci-v4.9-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: VMD: Update filename to reflect move
arm64: dts: rockchip: add three new resets for rk3399 PCIe controller
PCI: rockchip: Add three new resets as required properties
PCI: Don't attempt to claim shadow copies of ROM
Merge fixes for -Wmaybe-uninitialized from Arnd Bergmann:
"It took a while for some patches to make it into mainline through
maintainer trees, but the 28-patch series is now reduced to 10, with
one tiny patch added at the end.
Aside from patches that are no longer required, I did these changes
compared to version 1:
- Dropped "iio: maxim_thermocouple: detect invalid storage size in
read()", which is currently in linux-next as commit 32cb7d27e6.
This is the only remaining warning I see for a couple of corner
cases (kbuild bot reports it on blackfin, kernelci bot and arm-soc
bot both report it on arm64)
- Dropped "brcmfmac: avoid maybe-uninitialized warning in
brcmf_cfg80211_start_ap", which is currently in net/master merge
pending.
- Dropped two x86 patches, "x86: math-emu: possible uninitialized
variable use" and "x86: mark target address as output in 'insb'
asm" as they do not seem to trigger for a default build, and I got
no feedback on them. Both of these are ancient issues and seem
harmless, I will send them again to the x86 maintainers once the
rest is merged.
- Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on
feedback from Ilya Dryomov, who already has a different fix queued
up for v4.10. The kbuild bot reports this as a warning for xtensa.
- Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with
a simpler patch, this one always triggers but my first solution
would not be safe for linux-4.9 any more at this point. I'll follow
up with the larger patch as a cleanup for 4.10.
- Replaced "dib0700: fix nec repeat handling" with a better one,
contributed by Sean Young"
* -Wmaybe-uninitialized fixes:
Kbuild: enable -Wmaybe-uninitialized warnings by default
pcmcia: fix return value of soc_pcmcia_regulator_set
infiniband: shut up a maybe-uninitialized warning
crypto: aesni: shut up -Wmaybe-uninitialized warning
rc: print correct variable for z8f0811
dib0700: fix nec repeat handling
s390: pci: don't print uninitialized data for debugging
nios2: fix timer initcall return value
x86: apm: avoid uninitialized data
NFSv4.1: work around -Wmaybe-uninitialized warning
Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"
The rfc4106 encrypy/decrypt helper functions cause an annoying
false-positive warning in allmodconfig if we turn on
-Wmaybe-uninitialized warnings again:
arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’:
include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
The problem seems to be that the compiler doesn't track the state of the
'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end
section.
This takes the easy way out by adding a bogus initialization, which
should be harmless enough to get the patch into v4.9 so we can turn on
this warning again by default without producing useless output. A
follow-up patch for v4.10 rearranges the code to make the warning go
away.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
gcc correctly warns about an incorrect use of the 'pa' variable in case
we pass an empty scatterlist to __s390_dma_map_sg:
arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg':
arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized]
This adds a bogus initialization to the function to sanitize the debug
output. I would have preferred a solution without the initialization,
but I only got the report from the kbuild bot after turning on the
warning again, and didn't manage to reproduce it myself.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When called more than twice, the nios2_time_init() function return an
uninitialized value, as detected by gcc -Wmaybe-uninitialized
arch/nios2/kernel/time.c: warning: 'ret' may be used uninitialized in this function
This makes it return '0' here, matching the comment above the function.
Acked-by: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
apm_bios_call() can fail, and return a status in its argument structure.
If that status however is zero during a call from
apm_get_power_status(), we end up using data that may have never been
set, as reported by "gcc -Wmaybe-uninitialized":
arch/x86/kernel/apm_32.c: In function ‘apm’:
arch/x86/kernel/apm_32.c:1729:17: error: ‘bx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1835:5: error: ‘cx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1730:17: note: ‘cx’ was declared here
arch/x86/kernel/apm_32.c:1842:27: error: ‘dx’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
arch/x86/kernel/apm_32.c:1731:17: note: ‘dx’ was declared here
This changes the function to return "APM_NO_ERROR" here, which makes the
code more robust to broken BIOS versions, and avoids the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Traditionally, we have always had warnings about uninitialized variables
enabled, as this is part of -Wall, and generally a good idea [1], but it
also always produced false positives, mainly because this is a variation
of the halting problem and provably impossible to get right in all cases
[2].
Various people have identified cases that are particularly bad for false
positives, and in commit e74fc973b6 ("Turn off -Wmaybe-uninitialized
when building with -Os"), I turned off the warning for any build that
was done with CC_OPTIMIZE_FOR_SIZE. This drastically reduced the number
of false positive warnings in the default build but unfortunately had
the side effect of turning the warning off completely in 'allmodconfig'
builds, which in turn led to a lot of warnings (both actual bugs, and
remaining false positives) to go in unnoticed.
With commit 877417e6ff ("Kbuild: change CC_OPTIMIZE_FOR_SIZE
definition") enabled the warning again for allmodconfig builds in v4.7
and in v4.8-rc1, I had finally managed to address all warnings I get in
an ARM allmodconfig build and most other maybe-uninitialized warnings
for ARM randconfig builds.
However, commit 6e8d666e92 ("Disable "maybe-uninitialized" warning
globally") was merged at the same time and disabled it completely for
all configurations, because of false-positive warnings on x86 that I had
not addressed until then. This caused a lot of actual bugs to get
merged into mainline, and I sent several dozen patches for these during
the v4.9 development cycle. Most of these are actual bugs, some are for
correct code that is safe because it is only called under external
constraints that make it impossible to run into the case that gcc sees,
and in a few cases gcc is just stupid and finds something that can
obviously never happen.
I have now done a few thousand randconfig builds on x86 and collected
all patches that I needed to address every single warning I got (I can
provide the combined patch for the other warnings if anyone is
interested), so I hope we can get the warning back and let people catch
the actual bugs earlier.
This reverts the change to disable the warning completely and for now
brings it back at the "make W=1" level, so we can get it merged into
mainline without introducing false positives. A follow-up patch enables
it on all levels unless some configuration option turns it off because
of false-positives.
Link: https://rusty.ozlabs.org/?p=232 [1]
Link: https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings [2]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Limit the number of kmemleak false positives by including
.data.ro_after_init in memory scanning. To achieve this we need to add
symbols for start and end of the section to the linker scripts.
The problem was been uncovered by commit 56989f6d85 ("genetlink: mark
families as __ro_after_init").
Link: http://lkml.kernel.org/r/1478274173-15218-1-git-send-email-jakub.kicinski@netronome.com
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Kick the vcpu when a pending interrupt becomes pending again
- Prevent access to invalid interrupt registers
- Invalid TLBs when two vcpus from the same VM share a CPU
-----BEGIN PGP SIGNATURE-----
iQIyBAABCAAcBQJYHNMTFRxtYXJjLnp5bmdpZXJAYXJtLmNvbQAKCRAj0NC60T16
Q1WDD/9d5KfQ3dWiLtBXbeD3w2K0gXknwLAMsCCAdhgkCdLenxSBjlB7lmVYi1lZ
pTnshnR4HC0P3yW3bA78J7LZnUzJg72pq/S5K/om9KylVUdXz9WzQ3u+XyB3KTFW
b+viTUK3mqose67UcBSKGfFEWpIOmJ/nZVvWAIaUTg49btxnetKjyhv2Ux744Hm/
Jba3trcA4m8RPJ8Vu6mIfd6gkTXzSkQaN2wGVaEFhCFHOPDCQHjcdspe20Ig9fmY
kTXEBe4r0sC+8fXoymEM6TDQFWB8WthIIqfeIJ3FgfoETKrwmyJ23YfLAh49m1cB
nFpyy/lr9PNsOjJKXFi84pzx6l8U/CDslnBm5klYTT2kFc3stKbyDtIILvUOwKl8
n9UZSO8NGhOpKscGXLzO/CmIO+wgL15LTsxYsOh3HK7KjzocspQpxyD7pPWN8CUI
M2IGLvYMzCaBAOzs6WO4P9xlJRNtUMK8lvAthnBiCeE2Nnu3Oajf8krR4DZmBcQh
Q/GOACa1kuBMfqmWNrCVq3UNiFLxxAseShgxq9/E/dNe20daXOnxSaRGdRzTvAQF
dRBEtHXdY0qDgLz3tVzBdTTmx3M2k4B4/t+VxnsFFVlvbr0OyOozvFH42tGeTw5t
IBoXP9x87+Rpl6P6wW+ICketXQMRmdl40JXNjR96sXN94Y/Z4A==
=vj/s
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM updates for v4.9-rc4
- Kick the vcpu when a pending interrupt becomes pending again
- Prevent access to invalid interrupt registers
- Invalid TLBs when two vcpus from the same VM share a CPU
pm_rst, aclk_rst and pclk_rst should be controlled by driver, so we
need to add these three resets for PCIe controller.
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Pull s390 fixes from Martin Schwidefsky:
"Two bug fixes
- a memory alignment fix in the s390 only hypfs code
- a fix for the generic percpu code that caused ftrace to break on
s390. This is not relevant for x86 but for all architectures that
use the generic percpu code"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
percpu: use notrace variant of preempt_disable/preempt_enable
s390/hypfs: Use get_free_page() instead of kmalloc to ensure page alignment
ARC linux uses 2 distribution modes for common interrupts: round robin
mode (IDU_M_DISTRI_RR) and a simple destination mode (IDU_M_DISTRI_DEST).
The first one is used when more than 1 cores may handle a common interrupt
and the second one is used when only 1 core may handle a common interrupt.
However idu_irq_set_affinity() always sets IDU_M_DISTRI_RR for all affinity
values. But there is no sense in setting of such mode if only 1 core must
handle a common interrupt.
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This came up when reviewing code to address missing IRQ affinity
setting in AXS103 platform and/or implementing hierarchical IRQ domains
- smp_ipi_irq_setup() callers pass hwirq but in turn calls
request_percpu_irq() which expects a linux virq. So invoke
irq_find_mapping() to do the conversion
(also explicitify this in code by renaming the args appropriately)
- idu_of_init()/idu_cascade_isr() were similarly using linux virq where
hwirq is expected, so do the conversion using irqd_to_hwirq() helper
Signed-off-by: Yuriy Kolerov <yuriy.kolerov@synopsys.com>
[vgupta: made changelog a bit concise a bit]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
For CONFIG_SERIAL_EARLYCON we need 800MHz for NPS SoC
The early console driver uses BASE_BAUD and not using dtb.
The default of 50MHz is NOT good for NPS SoC.
Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Today we register to plat_smp_ops.clear() method which actually
is acking the IPI.
However this is already taking care by our irqchip driver specifically
by the irq_chip.irq_eoi() method.
This is perfect timing where it should be done and no special handling
is needed at plat_smp_ops.clear().
Signed-off-by: Noam Camus <noamca@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This has caused a bunch of build failures at a few sites, with GNU
2015.12 and older as the assembler seems to need -mlock to be able to
grok llock/scond instructions for ARC700 builds.
different places since the
older tools still seem to release
of tools which most people are using seem to trip with the -mlock flag
not being passed.
This reverts commit c300547588.
The current code doesn't even compile as somehow the inline assembly
can't see the register names defined as ARC_RTC_*
I'm pretty sure It worked when I first got it merged, but the tools were
definitely different then.
So better to write this in "C" anyways.
CC: stable@vger.kernel.org #4.2+
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The original syscall only used to return errno to indicate if cmpxchg
succeeded. It was not returning the "previous" value which typical cmpxchg
callers are interested in to build their slowpaths or retry loops.
Given user preemption in syscall return path etc, it is not wise to
check this in userspace afterwards, but should be what kernel actually
observed in the syscall.
So change the syscall interface to always return the previous value and
additionally set Z flag to indicate whether operation succeeded or not
(just like ARM implementation when they used to have this syscall)
The flag approach avoids having to put_user errno which is nice given
the use case for this syscall cares mostly about the "previous" value.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- Fix build failure on compilers without asm goto
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJYH+anAAoJELescNyEwWM0fiQIALprhwbhNmqc5q6IVXYJuPw7
kPonaVa1MA6EGANkBwdxD4pbkK2fcsFYsUtiSewP71TwLrRhW/2gPJGUOv3sK6Bq
BXM10O9Meu4Toy45uPofKpRk4yNhJh4WBPPYedzlwitoBUaC4R4sclioqfIOJsvv
z3UI/EXsGpgEuEKQkNHZ5PGUzQ9eLwbhexMJROsdnqVektaDrCSkVtdQNxpMsmve
yy92epfnH9xlk79KrTF1a0lM/SwlQHscua9jsOO+C4Txu6z5s2ltHVtM95Rb5X3a
nmczpMAUesL8g89AdJYOrbSO+dGbCI7ZHhXmcTnWJdRp7g2Hyhei3jNUUwfDJUA=
=GgpB
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"It's been pretty quiet on the fixes side of things for us, but Artem
reported a build failure introduced during the merge window that
appears with older GCCs that do not support asm goto. The fix is
bigger than I'd like, but it's a mechnical move of some constants to
break an include dependency between atomic.h and jump_label.h when
!HAVE_JUMP_LABEL.
Summary:
- Fix build failure on compilers without asm goto"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix circular include of asm/lse.h through linux/jump_label.h
Commit efd9e03fac ("arm64: Use static keys for CPU features")
introduced support for static keys in asm/cpufeature.h, including
linux/jump_label.h. When CC_HAVE_ASM_GOTO is not defined, this causes a
circular dependency via linux/atomic.h, asm/lse.h and asm/cpufeature.h.
This patch moves the capability macros out out of asm/cpufeature.h into
a separate asm/cpucaps.h and modifies some of the #includes accordingly.
Fixes: efd9e03fac ("arm64: Use static keys for CPU features")
Reported-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
during the merge window. The rest are fixes for MIPS, s390 and nested VMX.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJYG2H5AAoJEL/70l94x66DK/cH/0jEQ3ynuLAd5CKux7JxI/EP
msSJh1Xqr4+XhXZnuDpGQWrdsBlxoiqA6PsJrUTtyi4nQCDXlT8g+2MDuvqhWIHz
7vw58j/EMJDCVQzYAbN5VDUfk13uB5aSWTo3M9Rf09v0hU1Ql7z8u4CtKEdLpN5Y
LY9bT9fxUmXO7REKP7bdW6ZrDX/hUShYHgMqzXGFMyGBG3ym3a9bggXEzTCD6eNQ
ioogQIWqg+icdhta0iLNAwFClPlcKB2/xo4IUuNgrPwGoHFGJN/8+qxT4+sVbp2B
v8u1zOXlCFXBcskWE+yRRsGe72+mIzz6QScCyO+5HbhKYVfbE9H7KBlFX9rZZ2c=
=IbKx
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"One NULL pointer dereference, and two fixes for regressions introduced
during the merge window.
The rest are fixes for MIPS, s390 and nested VMX"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: Check memopp before dereference (CVE-2016-8630)
kvm: nVMX: VMCLEAR an active shadow VMCS after last use
KVM: x86: drop TSC offsetting kvm_x86_ops to fix KVM_GET/SET_CLOCK
KVM: x86: fix wbinvd_dirty_mask use-after-free
kvm/x86: Show WRMSR data is in hex
kvm: nVMX: Fix kernel panics induced by illegal INVEPT/INVVPID types
KVM: document lock orders
KVM: fix OOPS on flush_work
KVM: s390: Fix STHYI buffer alignment for diag224
KVM: MIPS: Precalculate MMIO load resume PC
KVM: MIPS: Make ERET handle ERL before EXL
KVM: MIPS: Fix lazy user ASID regenerate for SMP
Pull MIPS fixes from Ralf Baechle:
"A set of MIPS fixes for 4.9:
- lots of fixes for printk continuations
- six fixes for FP related code.
- fix max_low_pfn with disabled highmem
- fix KASLR handling of NULL FDT and KASLR for generic kernels
- fix build of compressed image
- provide default mips_cpc_default_phys_base to ignore CPC
- fix reboot on Malta"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix max_low_pfn with disabled highmem
MIPS: Correct MIPS I FP sigcontext layout
MIPS: Fix ISA I/II FP signal context offsets
MIPS: Remove FIR from ISA I FP signal context
MIPS: Fix ISA I FP sigcontext access violation handling
MIPS: Fix FCSR Cause bit handling for correct SIGFPE issue
MIPS: ptrace: Also initialize the FP context on individual FCSR writes
MIPS: dump_tlb: Fix printk continuations
MIPS: Fix __show_regs() output
MIPS: traps: Fix output of show_code
MIPS: traps: Fix output of show_stacktrace
MIPS: traps: Fix output of show_backtrace
MIPS: Fix build of compressed image
MIPS: generic: Fix KASLR for generic kernel.
MIPS: KASLR: Fix handling of NULL FDT
MIPS: Malta: Fixup reboot
MIPS: CPC: Provide default mips_cpc_default_phys_base to ignore CPC
Pull parisc updates from Helge Deller:
"The first three patches are trivial and add some required KERN_CONT,
ignore the new pkey syscalls on parisc and use the LINUX_GATEWAY_ADDR
define instead of hardcoded values.
The two patches from Dave Anglin are important.
The first one avoids trashing the sr2 and sr3 space registers in the
Light-weight syscall path. Especially the usage of sr3 is critical
since it may get trashed by the interrupt handler.
The second patch is even more important and tagged for stable series.
It protects one critical section in the syscall entry path by
disabling local interrupts. Without disabling interrupts, the sr7
space register may not be in sync with the current stack setup and
thus an incoming hardware interrupt may destroy memory in random
userspace areas"
* 'parisc-4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Ignore the pkey system calls for now
parisc: Use LINUX_GATEWAY_ADDR define instead of hardcoded value
parisc: Ensure consistent state when switching to kernel stack at syscall entry
parisc: Avoid trashing sr2 and sr3 in LWS code
parisc: use KERN_CONT when printing device inventory
Architecturally, TLBs are private to the (physical) CPU they're
associated with. But when multiple vcpus from the same VM are
being multiplexed on the same CPU, the TLBs are not private
to the vcpus (and are actually shared across the VMID).
Let's consider the following scenario:
- vcpu-0 maps PA to VA
- vcpu-1 maps PA' to VA
If run on the same physical CPU, vcpu-1 can hit TLB entries generated
by vcpu-0 accesses, and access the wrong physical page.
The solution to this is to keep a per-VM map of which vcpu ran last
on each given physical CPU, and invalidate local TLBs when switching
to a different vcpu from the same VM.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>