Граф коммитов

122830 Коммитов

Автор SHA1 Сообщение Дата
James Hogan e4e94c0fc8 MIPS: KVM: Drop unused host_cp0_entryhi
The host EntryHi in the KVM VCPU context is virtually unused. It gets
stored on exceptions, but only ever used in a kvm_debug() when a TLB
miss occurs.

Drop it entirely, removing that information from the kvm_debug output.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14 11:02:39 +02:00
James Hogan d40dd9e8da MIPS: KVM: Drop unused guest_inst from kvm_vcpu_arch
The MIPS kvm_vcpu_arch::guest_inst isn't used, so drop it from the
struct and drop its asm-offsets definition.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14 11:02:38 +02:00
Paolo Bonzini b11c3f5947 Merge branch 'kvm-mips-fixes' into HEAD
Merge MIPS patches destined to both 4.7 and kvm/next, to avoid
unnecessary conflicts.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14 11:02:33 +02:00
James Hogan 6df82a7b88 MIPS: KVM: Fix CACHE triggered exception emulation
When emulating TLB miss / invalid exceptions during CACHE instruction
emulation, be sure to set up the correct PC and host_cp0_badvaddr state
for the kvm_mips_emlulate_tlb*_ld() function to pick up for guest EPC
and BadVAddr.

PC needs to be rewound otherwise the guest EPC will end up pointing at
the next instruction after the faulting CACHE instruction.

host_cp0_badvaddr must be set because guest CACHE instructions trap with
a Coprocessor Unusable exception, which doesn't update the host BadVAddr
as a TLB exception would.

This doesn't tend to get hit when dynamic translation of emulated
instructions is enabled, since only the first execution of each CACHE
instruction actually goes through this code path, with subsequent
executions hitting the SYNCI instruction that it gets replaced with.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14 10:59:45 +02:00
James Hogan cc81e94862 MIPS: KVM: Don't unwind PC when emulating CACHE
When a CACHE instruction is emulated by kvm_mips_emulate_cache(), the PC
is first updated to point to the next instruction, and afterwards it
falls through the "dont_update_pc" label, which rewinds the PC back to
its original address.

This works when dynamic translation of emulated instructions is enabled,
since the CACHE instruction is replaced with a SYNCI which works without
trapping, however when dynamic translation is disabled the guest hangs
on CACHE instructions as they always trap and are never stepped over.

Roughly swap the meanings of the "done" and "dont_update_pc" to match
kvm_mips_emulate_CP0(), so that "done" will roll back the PC on failure,
and "dont_update_pc" won't change PC at all (for the sake of exceptions
that have already modified the PC).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14 10:59:45 +02:00
James Hogan 7f5a1ddc79 MIPS: KVM: Include bit 31 in segment matches
When faulting guest addresses are matched against guest segments with
the KVM_GUEST_KSEGX() macro, change the mask to 0xe0000000 so as to
include bit 31.

This is mainly for safety's sake, as it prevents a rogue BadVAddr in the
host kseg2/kseg3 segments (e.g. 0xC*******) after a TLB exception from
matching the guest kseg0 segment (e.g. 0x4*******), triggering an
internal KVM error instead of allowing the corresponding guest kseg0
page to be mapped into the host vmalloc space.

Such a rogue BadVAddr was observed to happen with the host MIPS kernel
running under QEMU with KVM built as a module, due to a not entirely
transparent optimisation in the QEMU TLB handling. This has already been
worked around properly in a previous commit.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14 10:59:44 +02:00
James Hogan 797179bc4f MIPS: KVM: Fix modular KVM under QEMU
Copy __kvm_mips_vcpu_run() into unmapped memory, so that we can never
get a TLB refill exception in it when KVM is built as a module.

This was observed to happen with the host MIPS kernel running under
QEMU, due to a not entirely transparent optimisation in the QEMU TLB
handling where TLB entries replaced with TLBWR are copied to a separate
part of the TLB array. Code in those pages continue to be executable,
but those mappings persist only until the next ASID switch, even if they
are marked global.

An ASID switch happens in __kvm_mips_vcpu_run() at exception level after
switching to the guest exception base. Subsequent TLB mapped kernel
instructions just prior to switching to the guest trigger a TLB refill
exception, which enters the guest exception handlers without updating
EPC. This appears as a guest triggered TLB refill on a host kernel
mapped (host KSeg2) address, which is not handled correctly as user
(guest) mode accesses to kernel (host) segments always generate address
error exceptions.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.10.x-
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-14 10:59:44 +02:00
David Hildenbrand a7e19ab55f KVM: s390: handle missing storage-key facility
Without the storage-key facility, SIE won't interpret SSKE, ISKE and
RRBE for us. So let's add proper interception handlers that will be called
if lazy sske cannot be enabled.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:31 +02:00
David Hildenbrand 11ddcd41bc KVM: s390: trace and count all skey intercepts
Let's trace and count all skey handling operations, even if lazy skey
handling was already activated. Also, don't enable lazy skey handling if
anything went wrong while enabling skey handling for the SIE.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:31 +02:00
David Hildenbrand 2386145152 s390/sclp: detect storage-key facility
Let's correctly detect that facility.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:30 +02:00
David Hildenbrand 695be0e7a2 KVM: s390: pfmf: handle address overflows
In theory, end could always end up being < start, if overflowing to 0.
Although very unlikely for now, let's just fix it.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:30 +02:00
David Hildenbrand 1824c723ac KVM: s390: pfmf: support conditional-sske facility
We already indicate that facility but don't implement it in our pfmf
interception handler. Let's add a new storage key handling function for
conditionally setting the guest storage key.

As we will reuse this function later on, let's directly implement returning
the old key via parameter and indicating if any change happened via rc.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:30 +02:00
David Hildenbrand 2c26d1d23a KVM: s390: pfmf: take care of amode when setting reg2
Depending on the addressing mode, we must not overwrite bit 0-31 of the
register. In addition, 24 bit and 31 bit have to set certain bits to 0,
which is guaranteed by converting the end address to an effective
address.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:29 +02:00
David Hildenbrand 9a68f0af8c KVM: s390: pfmf: MR and MC are ignored without CSSKE
These two bits are simply ignored when the conditional-SSKE facility is
not installed.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:29 +02:00
David Hildenbrand 6164a2e90a KVM: s390: pfmf: fix end address calculation
The current calculation is wrong if absolute != real address. Let's just
calculate the start address for 4k frames upfront. Otherwise, the
calculated end address will be wrong, resulting in wrong memory
location/storage keys getting touched.

To keep low-address protection working (using the effective address),
we have to move the check.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:28 +02:00
David Hildenbrand fe69eabf8d KVM: s390: storage keys fit into a char
No need to convert the storage key into an unsigned long, the target
function expects a char as argument.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:28 +02:00
David Hildenbrand 154c8c19c3 s390/mm: return key via pointer in get_guest_storage_key
Let's just split returning the key and reporting errors. This makes calling
code easier and avoids bugs as happened already.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:28 +02:00
David Hildenbrand 8d6037a7b4 s390/mm: simplify get_guest_storage_key
We can safe a few LOC and make that function easier to understand
by rewriting existing code.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:27 +02:00
Martin Schwidefsky d3ed1ceeac s390/mm: set and get guest storage key mmap locking
Move the mmap semaphore locking out of set_guest_storage_key
and get_guest_storage_key. This makes the two functions more
like the other ptep_xxx operations and allows to avoid repeated
semaphore operations if multiple keys are read or written.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:27 +02:00
David Hildenbrand c427c42cd6 s390/mm: don't drop errors in get_guest_storage_key
Commit 1e133ab296 ("s390/mm: split arch/s390/mm/pgtable.c") changed
the return value of get_guest_storage_key to an unsigned char, resulting
in -EFAULT getting interpreted as a valid storage key.

Cc: stable@vger.kernel.org # 4.6+
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:26 +02:00
Christian Borntraeger dcc98ea614 KVM: s390: fixup I/O interrupt traces
We currently have two issues with the I/O  interrupt injection logging:
1. All QEMU versions up to 2.6 have a wrong encoding of device numbers
etc for the I/O interrupt type, so the inject VM_EVENT will have wrong
data. Let's fix this by using the interrupt parameters and not the
interrupt type number.
2. We only log in kvm_s390_inject_vm, but not when coming from
kvm_s390_reinject_io_int or from flic. Let's move the logging to the
common __inject_io function.

We also enhance the logging for delivery to match the data.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-10 12:07:26 +02:00
Christian Borntraeger 1bb78d161f KVM: s390: provide logging for diagnose 0x500
We might need to debug some virtio things, so better have diagnose 500
logged.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-10 12:07:26 +02:00
David Hildenbrand f597d24eee KVM: s390: turn on tx even without ctx
Constrained transactional execution is an addon of transactional execution.

Let's enable the assist also if only TX is enabled for the guest.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:25 +02:00
David Hildenbrand bdab09f3d8 KVM: s390: enable host-protection-interruption only with ESOP
host-protection-interruption control was introduced with ESOP. So let's
enable it only if we have ESOP and add an explanatory comment why
we can live without it.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:25 +02:00
David Hildenbrand 09a400e78e KVM: s390: enable ibs only if available
Let's enable interlock-and-broadcast suppression only if the facility is
actually available.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:24 +02:00
David Hildenbrand 9c375490fc s390/sclp: detect interlock-and-broadcast-suppression facility
Let's detect that facility.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:24 +02:00
David Hildenbrand 873b425e4c KVM: s390: enable PFMFI only if available
Let's enable interpretation of PFMFI only if the facility is
actually available. Emulation code still works in case the guest is
offered EDAT-1.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:23 +02:00
David Hildenbrand a0eb55e631 s390/sclp: detect PFMF interpretation facility
Let's detect that facility.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:23 +02:00
David Hildenbrand 48ee7d3a7f KVM: s390: enable cei only if available
Let's only enable conditional-external-interruption if the facility is
actually available.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:23 +02:00
David Hildenbrand 4a5c3e0827 s390/sclp: detect conditional-external-interception facility
Let's detect if we have that facility.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:22 +02:00
David Hildenbrand 11ad65b79e KVM: s390: enable ib only if available
Let's enable intervention bypass only if the facility is acutally
available.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:22 +02:00
David Hildenbrand 72cd82b9e9 s390/sclp: detect intervention bypass facility
Let's detect if we have the intervention bypass facility installed.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:21 +02:00
David Hildenbrand efed110446 KVM: s390: handle missing guest-storage-limit-suppression
If guest-storage-limit-suppression is not available, we would for now
have a valid guest address space with size 0. So let's simply set the
origin to 0 and the limit to hamax.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:21 +02:00
David Hildenbrand 5236c751da s390/sclp: detect guest-storage-limit-suppression
Let's detect that facility.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:20 +02:00
David Hildenbrand f9cbd9b025 KVM: s390: provide CMMA attributes only if available
Let's not provide the device attribute for cmma enabling and clearing
if the hardware doesn't support it.

This also helps getting rid of the undocumented return value "-EINVAL"
in case CMMA is not available when trying to enable it.

Also properly document the meaning of -EINVAL for CMMA clearing.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:20 +02:00
David Hildenbrand c24cc9c8a6 KVM: s390: enable CMMA if the interpration is available
Now that we can detect if collaborative-memory-management interpretation
is available, replace the heuristic by a real hardware detection.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:19 +02:00
David Hildenbrand 09be9cb92b s390/sclp: detect cmma
Let's detect the Collaborative-memory-management-interpretation facility,
aka CMM assist, so we can correctly enable cmma later.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:19 +02:00
David Hildenbrand 89b5b4de33 KVM: s390: guestdbg: signal missing hardware support
Without guest-PER enhancement, we can't provide any debugging support.
Therefore act like kernel support is missing.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:18 +02:00
David Hildenbrand b9e28897e6 s390/sclp: detect guest-PER enhancement
Let's detect that facility, so we can correctly handle its abscence.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:18 +02:00
David Hildenbrand 76a6dd7241 KVM: s390: handle missing 64-bit-SCAO facility
Without that facility, we may only use scaol. So fallback
to DMA allocation in that case, so we won't overwrite random memory
via the SIE.

Also disallow ESCA, so we don't have to handle that allocation case.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:18 +02:00
David Hildenbrand 4013ade3fb s390/sclp: detect 64-bit-SCAO facility
Let's correctly detect that facility, so we can correctly handle its
abscence later on.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:17 +02:00
David Hildenbrand 0a763c780b KVM: s390: interface to query and configure cpu subfunctions
We have certain instructions that indicate available subfunctions via
a query subfunction (crypto functions and ptff), or via a test bit
function (plo).

By exposing these "subfunction blocks" to user space, we allow user space
to
1) query available subfunctions and make sure subfunctions won't get lost
   during migration - e.g. properly indicate them via a CPU model
2) change the subfunctions to be reported to the guest (even adding
   unavailable ones)

This mechanism works just like the way we indicate the stfl(e) list to
user space.

This way, user space could even emulate some subfunctions in QEMU in the
future. If this is ever applicable, we have to make sure later on, that
unsupported subfunctions result in an intercept to QEMU.

Please note that support to indicate them to the guest is still missing
and requires hardware support. Usually, the IBC takes already care of these
subfunctions for migration safety. QEMU should make sure to always set
these bits properly according to the machine generation to be emulated.

Available subfunctions are only valid in combination with STFLE bits
retrieved via KVM_S390_VM_CPU_MACHINE and enabled via
KVM_S390_VM_CPU_PROCESSOR. If the applicable bits are available, the
indicated subfunctions are guaranteed to be correct.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:17 +02:00
David Hildenbrand 1afd43e0fb s390/crypto: allow to query all known cpacf functions
KVM will have to query these functions, let's add at least the query
capabilities.

PCKMO has RRE format, as bit 16-31 are ignored, we can still use the
existing function. As PCKMO won't touch the cc, let's force it to 0
upfront.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Ingo Tuchscherer <ingo.tuchscherer@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:16 +02:00
David Hildenbrand bcfa01d787 KVM: s390: gaccess: convert get_vcpu_asce()
Let's use our new function for preparing translation exceptions.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:16 +02:00
David Hildenbrand cde0dcfb5d KVM: s390: gaccess: convert guest_page_range()
Let's use our new function for preparing translation exceptions. As we will
need the correct ar, let's pass that to guest_page_range().

This will also make sure that the guest address is stored in the tec
for applicable excptions.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:15 +02:00
David Hildenbrand fbcb7d5157 KVM: s390: gaccess: convert guest_translate_address()
Let's use our new function for preparing translation exceptions.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:15 +02:00
David Hildenbrand 3e3c67f6a3 KVM: s390: gaccess: convert kvm_s390_check_low_addr_prot_real()
Let's use our new function for preparing translation exceptions.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:15 +02:00
David Hildenbrand d03193de30 KVM: s390: gaccess: function for preparing translation exceptions
Let's provide a function trans_exc() that can be used for handling
preparation of translation exceptions on a central basis. We will use
that function to replace existing code in gaccess.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:14 +02:00
David Hildenbrand 6167375b55 KVM: s390: gaccess: store guest address on ALC prot exceptions
Let's pass the effective guest address to get_vcpu_asce(), so we
can properly set the guest address in case we inject an ALC protection
exception.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:14 +02:00
David Hildenbrand 22be5a1331 KVM: s390: forward ESOP if available
ESOP guarantees that during a protection exception, bit 61 of real location
168-175 will only be set to 1 if it was because of ALCP or DATP. If the
exception is due to LAP or KCP, the bit will always be set to 0.

The old SOP definition allowed bit 61 to be unpredictable in case of LAP
or KCP in some conditions. So ESOP replaces this unpredictability by
a guarantee.

Therefore, we can directly forward ESOP if it is available on our machine.
We don't have to do anything when ESOP is disabled - the guest will simply
expect unpredictable values. Our guest access functions are already
handling ESOP properly.

Please note that future functionality in KVM will require knowledge about
ESOP being enabled for a guest or not.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:13 +02:00
David Hildenbrand 15c9705f0c KVM: s390: interface to query and configure cpu features
For now, we only have an interface to query and configure facilities
indicated via STFL(E). However, we also have features indicated via
SCLP, that have to be indicated to the guest by user space and usually
require KVM support.

This patch allows user space to query and configure available cpu features
for the guest.

Please note that disabling a feature doesn't necessarily mean that it is
completely disabled (e.g. ESOP is mostly handled by the SIE). We will try
our best to disable it.

Most features (e.g. SCLP) can't directly be forwarded, as most of them need
in addition to hardware support, support in KVM. As we later on want to
turn these features in KVM explicitly on/off (to simulate different
behavior), we have to filter all features provided by the hardware and
make them configurable.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:13 +02:00
Alexander Yarygin c1778e5157 KVM: s390: Add mnemonic print to kvm_s390_intercept_prog
We have a table of mnemonic names for intercepted program
interruptions, let's print readable name of the interruption in the
kvm_s390_intercept_prog trace event.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:13 +02:00
Janosch Frank 7d0a5e6241 KVM: s390: Limit sthyi execution
Store hypervisor information is a valid instruction not only in
supervisor state but also in problem state, i.e. the guest's
userspace. Its execution is not only computational and memory
intensive, but also has to get hold of the ipte lock to write to the
guest's memory.

This lock is not intended to be held often and long, especially not
from the untrusted guest userspace. Therefore we apply rate limiting
of sthyi executions per VM.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Acked-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:12 +02:00
Janosch Frank 95ca2cb579 KVM: s390: Add sthyi emulation
Store Hypervisor Information is an emulated z/VM instruction that
provides a guest with basic information about the layers it is running
on. This includes information about the cpu configuration of both the
machine and the lpar, as well as their names, machine model and
machine type. This information enables an application to determine the
maximum capacity of CPs and IFLs available to software.

The instruction is available whenever the facility bit 74 is set,
otherwise executing it results in an operation exception.

It is important to check the validity flags in the sections before
using data from any structure member. It is not guaranteed that all
members will be valid on all machines / machine configurations.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:12 +02:00
Janosch Frank a2d57b35c0 KVM: s390: Extend diag 204 fields
The new store hypervisor information instruction, which we are going
to introduce, needs previously unused fields in diag 204 structures.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:11 +02:00
Janosch Frank a011eeb2a3 KVM: s390: Add operation exception interception handler
This commit introduces code that handles operation exception
interceptions. With this handler we can emulate instructions by using
illegal opcodes.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:11 +02:00
Janosch Frank 022bd2d11c s390: Make diag224 public
Diag204's cpu structures only contain the cpu type by means of an
index in the diag224 name table. Hence, to be able to use diag204 in
any meaningful way, we also need a usable diag224 interface.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:10 +02:00
Janosch Frank e435dc3139 s390: Make cpc_name accessible
sclp_ocf.c is the only way to get the cpc name, as it registers the
sole event handler for the ocf event. By creating a new global
function that copies that name, we make it accessible to the world
which longs to retrieve it.

Additionally we now also store the cpc name as EBCDIC, so we don't
have to convert it to and from ASCII if it is requested in native
encoding.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:10 +02:00
Janosch Frank e65f30e0cb s390: hypfs: Move diag implementation and data definitions
Diag 204 data and function definitions currently live in the hypfs
files. As KVM will be a consumer of this data, we need to make it
publicly available and move it to the appropriate diag.{c,h} files.

__attribute__ ((packed)) occurences were replaced with __packed for
all moved structs.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-10 12:07:09 +02:00
Kai Huang dca4d72877 kvm/x86: remove unnecessary header file inclusion
arch/x86/kvm/iommu.c includes <linux/intel-iommu.h> and <linux/dmar.h>, which
both are unnecessary, in fact incorrect to be here as they are intel specific.

Building kvm on x86 passed after removing above inclusion.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-03 15:28:19 +02:00
Paolo Bonzini 250715a617 KVM: x86: protect KVM_CREATE_PIT/KVM_CREATE_PIT2 with kvm->lock
The syzkaller folks reported a NULL pointer dereference that seems
to be cause by a race between KVM_CREATE_IRQCHIP and KVM_CREATE_PIT2.
The former takes kvm->lock (except when registering the devices,
which needs kvm->slots_lock); the latter takes kvm->slots_lock only.
Change KVM_CREATE_PIT2 to follow the same model as KVM_CREATE_IRQCHIP.

Testcase:

    #include <pthread.h>
    #include <linux/kvm.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <stdint.h>
    #include <string.h>
    #include <stdlib.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    long r[23];

    void* thr1(void* arg)
    {
        struct kvm_pit_config pitcfg = { .flags = 4 };
        switch ((long)arg) {
        case 0: r[2]  = open("/dev/kvm", O_RDONLY|O_ASYNC);    break;
        case 1: r[3]  = ioctl(r[2], KVM_CREATE_VM, 0);         break;
        case 2: r[4]  = ioctl(r[3], KVM_CREATE_IRQCHIP, 0);    break;
        case 3: r[22] = ioctl(r[3], KVM_CREATE_PIT2, &pitcfg); break;
        }
        return 0;
    }

    int main(int argc, char **argv)
    {
        long i;
        pthread_t th[4];

        memset(r, -1, sizeof(r));
        for (i = 0; i < 4; i++) {
            pthread_create(&th[i], 0, thr, (void*)i);
            if (argc > 1 && rand()%2) usleep(rand()%1000);
        }
        usleep(20000);
        return 0;
    }

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-03 15:28:19 +02:00
Paolo Bonzini ee2cd4b755 KVM: x86: rename process_smi to enter_smm, process_smi_request to process_smi
Make the function names more similar between KVM_REQ_NMI and KVM_REQ_SMI.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-03 15:28:19 +02:00
Paolo Bonzini c43203cab1 KVM: x86: avoid simultaneous queueing of both IRQ and SMI
If the processor exits to KVM while delivering an interrupt,
the hypervisor then requeues the interrupt for the next vmentry.
Trying to enter SMM in this same window causes to enter non-root
mode in emulated SMM (i.e. with IF=0) and with a request to
inject an IRQ (i.e. with a valid VM-entry interrupt info field).
This is invalid guest state (SDM 26.3.1.4 "Check on Guest RIP
and RFLAGS") and the processor fails vmentry.

The fix is to defer the injection from KVM_REQ_SMI to KVM_REQ_EVENT,
like we already do for e.g. NMIs.  This patch doesn't change the
name of the process_smi function so that it can be applied to
stable releases.  The next patch will modify the names so that
process_nmi and process_smi handle respectively KVM_REQ_NMI and
KVM_REQ_SMI.

This is especially common with Windows, probably due to the
self-IPI trick that it uses to deliver deferred procedure
calls (DPCs).

Reported-by: Laszlo Ersek <lersek@redhat.com>
Reported-by: Michał Zegan <webczat_200@poczta.onet.pl>
Fixes: 64d6067057
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-03 15:28:19 +02:00
Linus Torvalds 4340fa5529 ARM:
- two fixes for 4.6 vgic [Christoffer]
    (cc stable)
 
  - six fixes for 4.7 vgic [Marc]
 
 x86:
  - six fixes from syzkaller reports [Paolo]
    (two of them cc stable)
 
  - allow OS X to boot [Dmitry]
 
  - don't trust compilers [Nadav]
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJXUI6+AAoJEED/6hsPKofof70H/iL+tri1Mu9q+8+EtANPp9i1
 XWjjAQCGszq27H/A1Io+igszW/ghhp0OoVYSB8wSUBubTSjjyba90dWe41UpBywt
 r2W41xbGUBRdC6aK5rZ0+Zw2MIv5+ubSsbDKqfY8IsWQzWln+NHo+q5UU8+NfdZ7
 Nu7+lRhXJfJWy31k/FeDp3FeZXPZk9b9iZBafmnNcg/i1JMjrrVK8nSBXOSUvZ22
 Dy/b3ln+8emv4lvOX0jfostQH0HkfulpBmJouZ5TadoegBtQEDEKpkglxSyUHAcT
 TmMhRwRyx4K6i5Bp3zAJ0LtjEks9/ayHVmerMJ+LH1MBaJqNmvxUzSPXvKGA8JE=
 =PCWN
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "ARM:
   - two fixes for 4.6 vgic [Christoffer] (cc stable)

   - six fixes for 4.7 vgic [Marc]

  x86:
   - six fixes from syzkaller reports [Paolo] (two of them cc stable)

   - allow OS X to boot [Dmitry]

   - don't trust compilers [Nadav]"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS
  KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID
  KVM: irqfd: fix NULL pointer dereference in kvm_irq_map_gsi
  KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number
  KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID
  kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR
  KVM: Handle MSR_IA32_PERF_CTL
  KVM: x86: avoid write-tearing of TDP
  KVM: arm/arm64: vgic-new: Removel harmful BUG_ON
  arm64: KVM: vgic-v3: Relax synchronization when SRE==1
  arm64: KVM: vgic-v3: Prevent the guest from messing with ICC_SRE_EL1
  arm64: KVM: Make ICC_SRE_EL1 access return the configured SRE value
  KVM: arm/arm64: vgic-v3: Always resample level interrupts
  KVM: arm/arm64: vgic-v2: Always resample level interrupts
  KVM: arm/arm64: vgic-v3: Clear all dirty LRs
  KVM: arm/arm64: vgic-v2: Clear all dirty LRs
2016-06-02 15:08:06 -07:00
Paolo Bonzini d14bdb553f KVM: x86: fix OOPS after invalid KVM_SET_DEBUGREGS
MOV to DR6 or DR7 causes a #GP if an attempt is made to write a 1 to
any of bits 63:32.  However, this is not detected at KVM_SET_DEBUGREGS
time, and the next KVM_RUN oopses:

   general protection fault: 0000 [#1] SMP
   CPU: 2 PID: 14987 Comm: a.out Not tainted 4.4.9-300.fc23.x86_64 #1
   Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
   [...]
   Call Trace:
    [<ffffffffa072c93d>] kvm_arch_vcpu_ioctl_run+0x141d/0x14e0 [kvm]
    [<ffffffffa071405d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
    [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
    [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
    [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
   Code: 55 83 ff 07 48 89 e5 77 27 89 ff ff 24 fd 90 87 80 81 0f 23 fe 5d c3 0f 23 c6 5d c3 0f 23 ce 5d c3 0f 23 d6 5d c3 0f 23 de 5d c3 <0f> 23 f6 5d c3 0f 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
   RIP  [<ffffffff810639eb>] native_set_debugreg+0x2b/0x40
    RSP <ffff88005836bd50>

Testcase (beautified/reduced from syzkaller output):

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <string.h>
    #include <stdint.h>
    #include <linux/kvm.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>

    long r[8];

    int main()
    {
        struct kvm_debugregs dr = { 0 };

        r[2] = open("/dev/kvm", O_RDONLY);
        r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
        r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7);

        memcpy(&dr,
               "\x5d\x6a\x6b\xe8\x57\x3b\x4b\x7e\xcf\x0d\xa1\x72"
               "\xa3\x4a\x29\x0c\xfc\x6d\x44\x00\xa7\x52\xc7\xd8"
               "\x00\xdb\x89\x9d\x78\xb5\x54\x6b\x6b\x13\x1c\xe9"
               "\x5e\xd3\x0e\x40\x6f\xb4\x66\xf7\x5b\xe3\x36\xcb",
               48);
        r[7] = ioctl(r[4], KVM_SET_DEBUGREGS, &dr);
        r[6] = ioctl(r[4], KVM_RUN, 0);
    }

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02 17:38:50 +02:00
Paolo Bonzini 78e546c824 KVM: fail KVM_SET_VCPU_EVENTS with invalid exception number
This cannot be returned by KVM_GET_VCPU_EVENTS, so it is okay to return
EINVAL.  It causes a WARN from exception_type:

    WARNING: CPU: 3 PID: 16732 at arch/x86/kvm/x86.c:345 exception_type+0x49/0x50 [kvm]()
    CPU: 3 PID: 16732 Comm: a.out Tainted: G        W       4.4.6-300.fc23.x86_64 #1
    Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
     0000000000000286 000000006308a48b ffff8800bec7fcf8 ffffffff813b542e
     0000000000000000 ffffffffa0966496 ffff8800bec7fd30 ffffffff810a40f2
     ffff8800552a8000 0000000000000000 00000000002c267c 0000000000000001
    Call Trace:
     [<ffffffff813b542e>] dump_stack+0x63/0x85
     [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
     [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
     [<ffffffffa0924809>] exception_type+0x49/0x50 [kvm]
     [<ffffffffa0934622>] kvm_arch_vcpu_ioctl_run+0x10a2/0x14e0 [kvm]
     [<ffffffffa091c04d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
     [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
     [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
     [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
    ---[ end trace b1a0391266848f50 ]---

Testcase (beautified/reduced from syzkaller output):

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <string.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    long r[31];

    int main()
    {
        memset(r, -1, sizeof(r));
        r[2] = open("/dev/kvm", O_RDONLY);
        r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
        r[7] = ioctl(r[3], KVM_CREATE_VCPU, 0);

        struct kvm_vcpu_events ve = {
                .exception.injected = 1,
                .exception.nr = 0xd4
        };
        r[27] = ioctl(r[7], KVM_SET_VCPU_EVENTS, &ve);
        r[30] = ioctl(r[7], KVM_RUN, 0);
        return 0;
    }

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02 17:38:50 +02:00
Paolo Bonzini 83676e9238 KVM: x86: avoid vmalloc(0) in the KVM_SET_CPUID
This causes an ugly dmesg splat.  Beautified syzkaller testcase:

    #include <unistd.h>
    #include <sys/syscall.h>
    #include <sys/ioctl.h>
    #include <fcntl.h>
    #include <linux/kvm.h>

    long r[8];

    int main()
    {
        struct kvm_cpuid2 c = { 0 };
        r[2] = open("/dev/kvm", O_RDWR);
        r[3] = ioctl(r[2], KVM_CREATE_VM, 0);
        r[4] = ioctl(r[3], KVM_CREATE_VCPU, 0x8);
        r[7] = ioctl(r[4], KVM_SET_CPUID, &c);
        return 0;
    }

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02 17:38:50 +02:00
Paolo Bonzini b21629da12 kvm: x86: avoid warning on repeated KVM_SET_TSS_ADDR
Found by syzkaller:

    WARNING: CPU: 3 PID: 15175 at arch/x86/kvm/x86.c:7705 __x86_set_memory_region+0x1dc/0x1f0 [kvm]()
    CPU: 3 PID: 15175 Comm: a.out Tainted: G        W       4.4.6-300.fc23.x86_64 #1
    Hardware name: LENOVO 2325F51/2325F51, BIOS G2ET32WW (1.12 ) 05/30/2012
     0000000000000286 00000000950899a7 ffff88011ab3fbf0 ffffffff813b542e
     0000000000000000 ffffffffa0966496 ffff88011ab3fc28 ffffffff810a40f2
     00000000000001fd 0000000000003000 ffff88014fc50000 0000000000000000
    Call Trace:
     [<ffffffff813b542e>] dump_stack+0x63/0x85
     [<ffffffff810a40f2>] warn_slowpath_common+0x82/0xc0
     [<ffffffff810a423a>] warn_slowpath_null+0x1a/0x20
     [<ffffffffa09251cc>] __x86_set_memory_region+0x1dc/0x1f0 [kvm]
     [<ffffffffa092521b>] x86_set_memory_region+0x3b/0x60 [kvm]
     [<ffffffffa09bb61c>] vmx_set_tss_addr+0x3c/0x150 [kvm_intel]
     [<ffffffffa092f4d4>] kvm_arch_vm_ioctl+0x654/0xbc0 [kvm]
     [<ffffffffa091d31a>] kvm_vm_ioctl+0x9a/0x6f0 [kvm]
     [<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
     [<ffffffff812414a9>] SyS_ioctl+0x79/0x90
     [<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71

Testcase:

    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <fcntl.h>
    #include <string.h>
    #include <linux/kvm.h>

    long r[8];

    int main()
    {
        memset(r, -1, sizeof(r));
	r[2] = open("/dev/kvm", O_RDONLY|O_TRUNC);
        r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
        r[5] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
        r[7] = ioctl(r[3], KVM_SET_TSS_ADDR, 0x20000000ul);
        return 0;
    }

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02 17:38:50 +02:00
Dmitry Bilunov 0c2df2a1af KVM: Handle MSR_IA32_PERF_CTL
Intel CPUs having Turbo Boost feature implement an MSR to provide a
control interface via rdmsr/wrmsr instructions. One could detect the
presence of this feature by issuing one of these instructions and
handling the #GP exception which is generated in case the referenced MSR
is not implemented by the CPU.

KVM's vCPU model behaves exactly as a real CPU in this case by injecting
a fault when MSR_IA32_PERF_CTL is called (which KVM does not support).
However, some operating systems use this register during an early boot
stage in which their kernel is not capable of handling #GP correctly,
causing #DP and finally a triple fault effectively resetting the vCPU.

This patch implements a dummy handler for MSR_IA32_PERF_CTL to avoid the
crashes.

Signed-off-by: Dmitry Bilunov <kmeaw@yandex-team.ru>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02 17:38:50 +02:00
Nadav Amit b19ee2ff3b KVM: x86: avoid write-tearing of TDP
In theory, nothing prevents the compiler from write-tearing PTEs, or
split PTE writes. These partially-modified PTEs can be fetched by other
cores and cause mayhem. I have not really encountered such case in
real-life, but it does seem possible.

For example, the compiler may try to do something creative for
kvm_set_pte_rmapp() and perform multiple writes to the PTE.

Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-06-02 17:38:50 +02:00
Linus Torvalds 58c1f9950f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "sparc64 mmu context allocation and trap return bug fixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix return from trap window fill crashes.
  sparc: Harden signal return frame checks.
  sparc64: Take ctx_alloc_lock properly in hugetlb_setup().
2016-05-31 22:20:56 -07:00
Linus Torvalds 367d3fd505 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "Three bugs fixes and an update for the default configuration"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: fix info leak in do_sigsegv
  s390/config: update default configuration
  s390/bpf: fix recache skb->data/hlen for skb_vlan_push/pop
  s390/bpf: reduce maximum program size to 64 KB
2016-05-31 09:43:24 -07:00
Marc Zyngier c585132840 arm64: KVM: vgic-v3: Relax synchronization when SRE==1
The GICv3 backend of the vgic is quite barrier heavy, in order
to ensure synchronization of the system registers and the
memory mapped view for a potential GICv2 guest.

But when the guest is using a GICv3 model, there is absolutely
no need to execute all these heavy barriers, and it is actually
beneficial to avoid them altogether.

This patch makes the synchonization conditional, and ensures
that we do not change the EL1 SRE settings if we do not need to.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-31 16:12:17 +02:00
Marc Zyngier a057001e9e arm64: KVM: vgic-v3: Prevent the guest from messing with ICC_SRE_EL1
Both our GIC emulations are "strict", in the sense that we either
emulate a GICv2 or a GICv3, and not a GICv3 with GICv2 legacy
support.

But when running on a GICv3 host, we still allow the guest to
tinker with the ICC_SRE_EL1 register during its time slice:
it can switch SRE off, observe that it is off, and yet on the
next world switch, find the SRE bit to be set again. Not very
nice.

An obvious solution is to always trap accesses to ICC_SRE_EL1
(by clearing ICC_SRE_EL2.Enable), and to let the handler return
the programmed value on a read, or ignore the write.

That way, the guest can always observe that our GICv3 is SRE==1
only.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-31 16:12:17 +02:00
Marc Zyngier b34f2bcbf5 arm64: KVM: Make ICC_SRE_EL1 access return the configured SRE value
When we trap ICC_SRE_EL1, we handle it as RAZ/WI. It would be
more correct to actual make it RO, and return the configured
value when read.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-05-31 16:12:16 +02:00
Christoffer Dall fa89c77e89 KVM: arm/arm64: vgic-v3: Clear all dirty LRs
When saving the state of the list registers, it is critical to
reset them zero, as we could otherwise leave unexpected EOI
interrupts pending for virtual level interrupts.

Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-05-31 16:12:09 +02:00
David S. Miller 7cafc0b8bf sparc64: Fix return from trap window fill crashes.
We must handle data access exception as well as memory address unaligned
exceptions from return from trap window fill faults, not just normal
TLB misses.

Otherwise we can get an OOPS that looks like this:

ld-linux.so.2(36808): Kernel bad sw trap 5 [#1]
CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 #34
task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
TPC: <do_sparc64_fault+0x5c4/0x700>
g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
RPC: <do_sparc64_fault+0x5bc/0x700>
l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
I7: <user_rtt_fill_fixup+0x6c/0x7c>
Call Trace:
 [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c

The window trap handlers are slightly clever, the trap table entries for them are
composed of two pieces of code.  First comes the code that actually performs
the window fill or spill trap handling, and then there are three instructions at
the end which are for exception processing.

The userland register window fill handler is:

	add	%sp, STACK_BIAS + 0x00, %g1;		\
	ldxa	[%g1 + %g0] ASI, %l0;			\
	mov	0x08, %g2;				\
	mov	0x10, %g3;				\
	ldxa	[%g1 + %g2] ASI, %l1;			\
	mov	0x18, %g5;				\
	ldxa	[%g1 + %g3] ASI, %l2;			\
	ldxa	[%g1 + %g5] ASI, %l3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %l4;			\
	ldxa	[%g1 + %g2] ASI, %l5;			\
	ldxa	[%g1 + %g3] ASI, %l6;			\
	ldxa	[%g1 + %g5] ASI, %l7;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i0;			\
	ldxa	[%g1 + %g2] ASI, %i1;			\
	ldxa	[%g1 + %g3] ASI, %i2;			\
	ldxa	[%g1 + %g5] ASI, %i3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i4;			\
	ldxa	[%g1 + %g2] ASI, %i5;			\
	ldxa	[%g1 + %g3] ASI, %i6;			\
	ldxa	[%g1 + %g5] ASI, %i7;			\
	restored;					\
	retry; nop; nop; nop; nop;			\
	b,a,pt	%xcc, fill_fixup_dax;			\
	b,a,pt	%xcc, fill_fixup_mna;			\
	b,a,pt	%xcc, fill_fixup;

And the way this works is that if any of those memory accesses
generate an exception, the exception handler can revector to one of
those final three branch instructions depending upon which kind of
exception the memory access took.  In this way, the fault handler
doesn't have to know if it was a spill or a fill that it's handling
the fault for.  It just always branches to the last instruction in
the parent trap's handler.

For example, for a regular fault, the code goes:

winfix_trampoline:
	rdpr	%tpc, %g3
	or	%g3, 0x7c, %g3
	wrpr	%g3, %tnpc
	done

All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
trap time program counter, we'll get that final instruction in the
trap handler.

On return from trap, we have to pull the register window in but we do
this by hand instead of just executing a "restore" instruction for
several reasons.  The largest being that from Niagara and onward we
simply don't have enough levels in the trap stack to fully resolve all
possible exception cases of a window fault when we are already at
trap level 1 (which we enter to get ready to return from the original
trap).

This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
code branches directly to these to do the window fill by hand if
necessary.  Now if you look at them, we'll see at the end:

	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;

And oops, all three cases are handled like a fault.

This doesn't work because each of these trap types (data access
exception, memory address unaligned, and faults) store their auxiliary
info in different registers to pass on to the C handler which does the
real work.

So in the case where the stack was unaligned, the unaligned trap
handler sets up the arg registers one way, and then we branched to
the fault handler which expects them setup another way.

So the FAULT_TYPE_* value ends up basically being garbage, and
randomly would generate the backtrace seen above.

Reported-by: Nick Alcock <nix@esperi.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-29 18:55:54 -07:00
David S. Miller d11c2a0de2 sparc: Harden signal return frame checks.
All signal frames must be at least 16-byte aligned, because that is
the alignment we explicitly create when we build signal return stack
frames.

All stack pointers must be at least 8-byte aligned.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-29 11:24:05 -07:00
Linus Torvalds 4029632c34 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull more MIPS updates from Ralf Baechle:
 "This is the secondnd batch of MIPS patches for 4.7. Summary:

  CPS:
   - Copy EVA configuration when starting secondary VPs.

  EIC:
   - Clear Status IPL.

  Lasat:
   - Fix a few off by one bugs.

  lib:
   - Mark intrinsics notrace.  Not only are the intrinsics
     uninteresting, it would cause infinite recursion.

  MAINTAINERS:
   - Add file patterns for MIPS BRCM device tree bindings.
   - Add file patterns for mips device tree bindings.

  MT7628:
   - Fix MT7628 pinmux typos.
   - wled_an pinmux gpio.
   - EPHY LEDs pinmux support.

  Pistachio:
   - Enable KASLR

  VDSO:
   - Build microMIPS VDSO for microMIPS kernels.
   - Fix aliasing warning by building with `-fno-strict-aliasing' for
     debugging but also tracing them might result in recursion.

  Misc:
   - Add missing FROZEN hotplug notifier transitions.
   - Fix clk binding example for varioius PIC32 devices.
   - Fix cpu interrupt controller node-names in the DT files.
   - Fix XPA CPU feature separation.
   - Fix write_gc0_* macros when writing zero.
   - Add inline asm encoding helpers.
   - Add missing VZ accessor microMIPS encodings.
   - Fix little endian microMIPS MSA encodings.
   - Add 64-bit HTW fields and fix its configuration.
   - Fix sigreturn via VDSO on microMIPS kernel.
   - Lots of typo fixes.
   - Add definitions of SegCtl registers and use them"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (49 commits)
  MIPS: Add missing FROZEN hotplug notifier transitions
  MIPS: Build microMIPS VDSO for microMIPS kernels
  MIPS: Fix sigreturn via VDSO on microMIPS kernel
  MIPS: devicetree: fix cpu interrupt controller node-names
  MIPS: VDSO: Build with `-fno-strict-aliasing'
  MIPS: Pistachio: Enable KASLR
  MIPS: lib: Mark intrinsics notrace
  MIPS: Fix 64-bit HTW configuration
  MIPS: Add 64-bit HTW fields
  MAINTAINERS: Add file patterns for mips device tree bindings
  MAINTAINERS: Add file patterns for mips brcm device tree bindings
  MIPS: Simplify DSP instruction encoding macros
  MIPS: Add missing tlbinvf/XPA microMIPS encodings
  MIPS: Fix little endian microMIPS MSA encodings
  MIPS: Add missing VZ accessor microMIPS encodings
  MIPS: Add inline asm encoding helpers
  MIPS: Spelling fix lets -> let's
  MIPS: VR41xx: Fix typo
  MIPS: oprofile: Fix typo
  MIPS: math-emu: Fix typo
  ...
2016-05-28 16:41:39 -07:00
Linus Torvalds 7e0fb73c52 Merge branch 'hash' of git://ftp.sciencehorizons.net/linux
Pull string hash improvements from George Spelvin:
 "This series does several related things:

   - Makes the dcache hash (fs/namei.c) useful for general kernel use.

     (Thanks to Bruce for noticing the zero-length corner case)

   - Converts the string hashes in <linux/sunrpc/svcauth.h> to use the
     above.

   - Avoids 64-bit multiplies in hash_64() on 32-bit platforms.  Two
     32-bit multiplies will do well enough.

   - Rids the world of the bad hash multipliers in hash_32.

     This finishes the job started in commit 689de1d6ca ("Minimal
     fix-up of bad hashing behavior of hash_64()")

     The vast majority of Linux architectures have hardware support for
     32x32-bit multiply and so derive no benefit from "simplified"
     multipliers.

     The few processors that do not (68000, h8/300 and some models of
     Microblaze) have arch-specific implementations added.  Those
     patches are last in the series.

   - Overhauls the dcache hash mixing.

     The patch in commit 0fed3ac866 ("namei: Improve hash mixing if
     CONFIG_DCACHE_WORD_ACCESS") was an off-the-cuff suggestion.
     Replaced with a much more careful design that's simultaneously
     faster and better.  (My own invention, as there was noting suitable
     in the literature I could find.  Comments welcome!)

   - Modify the hash_name() loop to skip the initial HASH_MIX().  This
     would let us salt the hash if we ever wanted to.

   - Sort out partial_name_hash().

     The hash function is declared as using a long state, even though
     it's truncated to 32 bits at the end and the extra internal state
     contributes nothing to the result.  And some callers do odd things:

      - fs/hfs/string.c only allocates 32 bits of state
      - fs/hfsplus/unicode.c uses it to hash 16-bit unicode symbols not bytes

   - Modify bytemask_from_count to handle inputs of 1..sizeof(long)
     rather than 0..sizeof(long)-1.  This would simplify users other
     than full_name_hash"

  Special thanks to Bruce Fields for testing and finding bugs in v1.  (I
  learned some humbling lessons about "obviously correct" code.)

  On the arch-specific front, the m68k assembly has been tested in a
  standalone test harness, I've been in contact with the Microblaze
  maintainers who mostly don't care, as the hardware multiplier is never
  omitted in real-world applications, and I haven't heard anything from
  the H8/300 world"

* 'hash' of git://ftp.sciencehorizons.net/linux:
  h8300: Add <asm/hash.h>
  microblaze: Add <asm/hash.h>
  m68k: Add <asm/hash.h>
  <linux/hash.h>: Add support for architecture-specific functions
  fs/namei.c: Improve dcache hash function
  Eliminate bad hash multipliers from hash_32() and  hash_64()
  Change hash_64() return value to 32 bits
  <linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()
  fs/namei.c: Add hashlen_string() function
  Pull out string hash to <linux/stringhash.h>
2016-05-28 16:15:25 -07:00
George Spelvin 4684fe9530 h8300: Add <asm/hash.h>
This will improve the performance of hash_32() and hash_64(), but due
to complete lack of multi-bit shift instructions on H8, performance will
still be bad in surrounding code.

Designing H8-specific hash algorithms to work around that is a separate
project.  (But if the maintainers would like to get in touch...)

Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp
2016-05-28 15:48:58 -04:00
George Spelvin 7b13277b68 microblaze: Add <asm/hash.h>
Microblaze is an FPGA soft core that can be configured various ways.

If it is configured without a multiplier, the standard __hash_32()
will require a call to __mulsi3, which is a slow software loop.

Instead, use a shift-and-add sequence for the constant multiply.
GCC knows how to do this, but it's not as clever as some.

Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
2016-05-28 15:48:58 -04:00
George Spelvin 14c44b95b3 m68k: Add <asm/hash.h>
This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
for the original mc68000, which lacks a 32x32-bit multiply instruction.

Yes, the amount of optimization effort put in is excessive. :-)

Shift-add chain found by Yevgen Voronenko's Hcub algorithm at
http://spiral.ece.cmu.edu/mcm/gen.html

Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
2016-05-28 15:48:57 -04:00
George Spelvin 468a942852 <linux/hash.h>: Add support for architecture-specific functions
This is just the infrastructure; there are no users yet.

This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of <asm/hash.h>.

That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.

Included is a self-test (in lib/test_hash.c) that verifies the basics.
It is NOT in general required that the arch-specific functions compute
the same thing as the generic, but if a HAVE_* symbol is defined with
the value 1, then equality is tested.

Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
Cc: Alistair Francis <alistai@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp
2016-05-28 15:48:31 -04:00
Anna-Maria Gleixner a8c5ddf08f MIPS: Add missing FROZEN hotplug notifier transitions
The corresponding FROZEN hotplug notifier transitions used on
suspend/resume are ignored. Therefore the switch case action argument
is masked with the frozen hotplug notifier transition mask.

Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: rt@linutronix.de
Patchwork: https://patchwork.linux-mips.org/patch/13351/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:12 +02:00
James Hogan bb93078e65 MIPS: Build microMIPS VDSO for microMIPS kernels
MicroMIPS kernels may be expected to run on microMIPS only cores which
don't support the normal MIPS instruction set, so be sure to pass the
-mmicromips flag through to the VDSO cflags.

Fixes: ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/13349/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:12 +02:00
James Hogan 13eb192d10 MIPS: Fix sigreturn via VDSO on microMIPS kernel
In microMIPS kernels, handle_signal() sets the isa16 mode bit in the
vdso address so that the sigreturn trampolines (which are offset from
the VDSO) get executed as microMIPS.

However commit ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
changed the offsets to come from the VDSO image, which already have the
isa16 mode bit set correctly since they're extracted from the VDSO
shared library symbol table.

Drop the isa16 mode bit handling from handle_signal() to fix sigreturn
for cores which support both microMIPS and normal MIPS. This doesn't fix
microMIPS only cores, since the VDSO is still built for normal MIPS, but
thats a separate problem.

Fixes: ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 4.4.x-
Patchwork: https://patchwork.linux-mips.org/patch/13348/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:12 +02:00
Antony Pavlov 5214cae77c MIPS: devicetree: fix cpu interrupt controller node-names
Here is the quote from [1]:

    The unit-address must match the first address specified
    in the reg property of the node. If the node has no reg property,
    the @ and unit-address must be omitted and the node-name alone
    differentiates the node from other nodes at the same level

This patch adjusts MIPS dts-files and devicetree binding
documentation in accordance with [1].

    [1] Power.org(tm) Standard for Embedded Power Architecture(tm)
        Platform Requirements (ePAPR). Version 1.1 – 08 April 2011.
        Chapter 2.2.1.1 Node Name Requirements

Signed-off-by: Antony Pavlov <antonynpavlov@gmail.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13345/
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:12 +02:00
Maciej W. Rozycki 94cc36b84a MIPS: VDSO: Build with `-fno-strict-aliasing'
Avoid an aliasing issue causing a build error in VDSO:

In file included from include/linux/srcu.h:34:0,
                 from include/linux/notifier.h:15,
                 from ./arch/mips/include/asm/uprobes.h:9,
                 from include/linux/uprobes.h:61,
                 from include/linux/mm_types.h:13,
                 from ./arch/mips/include/asm/vdso.h:14,
                 from arch/mips/vdso/vdso.h:27,
                 from arch/mips/vdso/gettimeofday.c:11:
include/linux/workqueue.h: In function 'work_static':
include/linux/workqueue.h:186:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
  return *work_data_bits(work) & WORK_STRUCT_STATIC;
  ^
cc1: all warnings being treated as errors
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1

with a CONFIG_DEBUG_OBJECTS_WORK configuration and GCC 5.2.0.  Include
`-fno-strict-aliasing' along with compiler options used, as required for
kernel code, fixing a problem present since the introduction of VDSO
with commit ebb5e78cc6 ("MIPS: Initial implementation of a VDSO").

Thanks to Tejun for diagnosing this properly!

Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Fixes: ebb5e78cc6 ("MIPS: Initial implementation of a VDSO")
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.3+
Patchwork: https://patchwork.linux-mips.org/patch/13357/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:12 +02:00
Matt Redfearn 41cc07be42 MIPS: Pistachio: Enable KASLR
Allow KASLR to be selected on Pistachio based systems. Tested on a
Creator Ci40.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Andrew Bresticker <abrestic@chromium.org>
Cc: Jonas Gorski <jogo@openwrt.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13356/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:11 +02:00
Harvey Hunt aedcfbe065 MIPS: lib: Mark intrinsics notrace
On certain MIPS32 devices, the ftrace tracer "function_graph" uses
__lshrdi3() during the capturing of trace data. ftrace then attempts to
trace __lshrdi3() which leads to infinite recursion and a stack overflow.
Fix this by marking __lshrdi3() as notrace. Mark the other compiler
intrinsics as notrace in case the compiler decides to use them in the
ftrace path.

Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com>
Cc: <linux-mips@linux-mips.org>
Cc: <linux-kernel@vger.kernel.org>
Cc: <stable@vger.kernel.org> # 4.2.x-
Patchwork: https://patchwork.linux-mips.org/patch/13354/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:11 +02:00
James Hogan aa76042a01 MIPS: Fix 64-bit HTW configuration
The Hardware page Table Walker (HTW) is being misconfigured on 64-bit
kernels. The PWSize.PS (pointer size) bit determines whether pointers
within directories are loaded as 32-bit or 64-bit addresses, but was
never being set to 1 for 64-bit kernels where the unsigned long in pgd_t
is 64-bits wide.

This actually reduces rather than improves performance when the HTW is
enabled on P6600 since the HTW is initiated lots, but walks are all
aborted due I think to bad intermediate pointers.

Since we were already taking the width of the PTEs into account by
setting PWSize.PTEW, which is the left shift applied to the page table
index *in addition to* the native pointer size, we also need to reduce
PTEW by 1 when PS=1. This is done by calculating PTEW based on the
relative size of pte_t compared to pgd_t.

Finally in order for the HTW to be used when PS=1, the appropriate
XK/XS/XU bits corresponding to the different 64-bit segments need to be
set in PWCtl. We enable only XU for now to enable walking for XUSeg.

Supporting walking for XKSeg would be a bit more involved so is left for
a future patch. It would either require the use of a per-CPU top level
base directory if supported by the HTW (a bit like pgd_current but with
a second entry pointing at swapper_pg_dir), or the HTW would prepend bit
63 of the address to the global directory index which doesn't really
match how we split user and kernel page directories.

Fixes: cab25bc753 ("MIPS: Extend hardware table walking support to MIPS64")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13364/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:11 +02:00
James Hogan 6446e6cf44 MIPS: Add 64-bit HTW fields
Add field definitions for some of the 64-bit specific Hardware page
Table Walker (HTW) register fields in PWSize and PWCtl, in preparation
for fixing the 64-bit HTW configuration.

Also print these fields out along with the others in print_htw_config().

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13363/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:11 +02:00
James Hogan 5aadab0c1a MIPS: Simplify DSP instruction encoding macros
Simplify the DSP instruction wrapper macros which use explicit encodings
for microMIPS and normal MIPS by using the new encoding macros and
removing duplication.

To me this makes it easier to read since it is much shorter, but it also
ensures .insn is used, preventing objdump disassembling the microMIPS
code as normal MIPS.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13314/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:10 +02:00
James Hogan c84700cc57 MIPS: Add missing tlbinvf/XPA microMIPS encodings
Hardcoded MIPS instruction encodings are provided for tlbinvf, mfhc0 &
mthc0 instructions, but microMIPS encodings are missing. I doubt any
microMIPS cores exist at present which support these instructions, but
the microMIPS encodings exist, and microMIPS cores may support them in
the future. Add the missing microMIPS encodings using the new macros.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13313/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:10 +02:00
James Hogan 6e1b29c309 MIPS: Fix little endian microMIPS MSA encodings
When the toolchain doesn't support MSA we encode MSA instructions
explicitly in assembly. Unfortunately we use .word for both MIPS and
microMIPS encodings which is wrong, since 32-bit microMIPS instructions
are made up from a pair of halfwords.

- The most significant halfword always comes first, so for little endian
  builds the halves will be emitted in the wrong order.

- 32-bit alignment isn't guaranteed, so the assembler may insert a
  16-bit nop instruction to pad the instruction stream to a 32-bit
  boundary.

Use the new instruction encoding macros to encode microMIPS MSA
instructions correctly.

Fixes: d96cc3d1ec ("MIPS: Add microMIPS MSA support.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <Paul.Burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13312/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:10 +02:00
James Hogan 1c48a17735 MIPS: Add missing VZ accessor microMIPS encodings
Toolchains may be used which support microMIPS but not VZ instructions
(i.e. binutis 2.22 & 2.23), so extend the explicitly encoded versions of
the guest COP0 register & guest TLB access macros to support microMIPS
encodings too, using the new macros.

This prevents non-microMIPS instructions being executed in microMIPS
mode during CPU probe on cores supporting VZ (e.g. M5150), which cause
reserved instruction exceptions early during boot.

Fixes: bad50d7925 ("MIPS: Fix VZ probe gas errors with binutils <2.24")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13311/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:10 +02:00
James Hogan 0dfa1c12f3 MIPS: Add inline asm encoding helpers
To allow simplification of macros which use inline assembly to
explicitly encode instructions, add a few simple abstractions to
mipsregs.h which expand to specific microMIPS or normal MIPS encodings
depending on what type of kernel is being built:

_ASM_INSN_IF_MIPS(_enc) : Emit a 32bit MIPS instruction if microMIPS is
                          not enabled.
_ASM_INSN32_IF_MM(_enc) : Emit a 32bit microMIPS instruction if enabled.
_ASM_INSN16_IF_MM(_enc) : Emit a 16bit microMIPS instruction if enabled.

The macros can be used one after another since the MIPS / microMIPS
macros are mutually exclusive, for example:

__asm__ __volatile__(
        ".set push\n\t"
        ".set noat\n\t"
        "# mfgc0 $1, $%1, %2\n\t"
        _ASM_INSN_IF_MIPS(0x40610000 | %1 << 11 | %2)
        _ASM_INSN32_IF_MM(0x002004fc | %1 << 16 | %2 << 11)
        "move %0, $1\n\t"
        ".set pop"
        : "=r" (__res)
        : "i" (source), "i" (sel));

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13310/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:09 +02:00
Ralf Baechle 4939788eb8 MIPS: Spelling fix lets -> let's
As noticed by Sergei in the discussion of Andrea Gelmini's patch series.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
2016-05-28 12:35:09 +02:00
Andrea Gelmini a320a1156a MIPS: VR41xx: Fix typo
Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Cc: trivial@kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/13338/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-05-28 12:35:09 +02:00