Граф коммитов

773 Коммитов

Автор SHA1 Сообщение Дата
Markus Elfring a1708a2ead KVM: s390: Improve determination of sizes in kvm_s390_import_bp_data()
* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus reuse the corresponding function "kmalloc_array".

  Suggested-by: Paolo Bonzini <pbonzini@redhat.com>

  This issue was detected also by using the Coccinelle software.

* Replace the specification of data structures by pointer dereferences
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

* Delete the local variable "size" which became unnecessary with
  this refactoring.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-Id: <c3323f6b-4af2-0bfb-9399-e529952e378e@users.sourceforge.net>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 13:40:54 +02:00
David Hildenbrand a6940674c3 KVM: s390: allow 255 VCPUs when sca entries aren't used
If the SCA entries aren't used by the hardware (no SIGPIF), we
can simply not set the entries, stick to the basic sca and allow more
than 64 VCPUs.

To hinder any other facility from using these entries, let's properly
provoke intercepts by not setting the MCN and keeping the entries
unset.

This effectively allows when running KVM under KVM (vSIE) or under z/VM to
provide more than 64 VCPUs to a guest. Let's limit it to 255 for now, to
not run into problems if the CPU numbers are limited somewhere else.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 13:40:53 +02:00
Fan Zhang 80cd876338 KVM: s390: lazy enable RI
Only enable runtime instrumentation if the guest issues an RI related
instruction or if userspace changes the riccb to a valid state.
This makes entry/exit a tiny bit faster.

Initial patch by Christian Borntraeger
Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 13:40:39 +02:00
Janosch Frank c14b88d766 KVM: s390: gaccess: simplify translation exception handling
The payload data for protection exceptions is a superset of the
payload of other translation exceptions. Let's set the additional
flags and use a fall through to minimize code duplication.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:53 +02:00
David Hildenbrand b1ffffbd0f KVM: s390: guestdbg: separate defines for per code
Let's avoid working with the PER_EVENT* defines, used for control register
manipulation, when checking the u8 PER code. Introduce separate defines
based on the existing defines.

Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:52 +02:00
David Hildenbrand 8953fb08ab KVM: s390: write external damage code on machine checks
Let's also write the external damage code already provided by
struct kvm_s390_mchk_info.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:52 +02:00
David Hildenbrand ff5dc1492a KVM: s390: fix delivery of vector regs during machine checks
Vector registers are only to be stored if the facility is available
and if the guest has set up the machine check extended save area.

If anything goes wrong while writing the vector registers, the vector
registers are to be marked as invalid. Please note that we are allowed
to write the registers although they are marked as invalid.

Machine checks and "store status" SIGP orders are two different concepts,
let's correctly separate these. As the SIGP part is completely handled in
user space, we can drop it.

This patch is based on a patch from Cornelia Huck.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:52 +02:00
David Hildenbrand 0319dae677 KVM: s390: split store status and machine check handling
Store status writes the prefix which is not to be done by a machine check.
Also, the psw is stored and later on overwritten by the failing-storage
address, which looks strange at first sight.

Store status and machine check handling look similar, but they are actually
two different things.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:51 +02:00
David Hildenbrand d6404ded30 KVM: s390: factor out actual delivery of machine checks
Let's factor this out to prepare for bigger changes. Reorder to calls to
match the logical order given in the PoP.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-09-08 09:07:51 +02:00
Julius Niedworok aca411a4b1 KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
When triggering KVM_RUN without a user memory region being mapped
(KVM_SET_USER_MEMORY_REGION) a validity intercept occurs. This could
happen, if the user memory region was not mapped initially or if it
was unmapped after the vcpu is initialized. The function
kvm_s390_handle_requests checks for the KVM_REQ_MMU_RELOAD bit. The
check function always clears this bit. If gmap_mprotect_notify
returns an error code, the mapping failed, but the KVM_REQ_MMU_RELOAD
was not set anymore. So the next time kvm_s390_handle_requests is
called, the execution would fall trough the check for
KVM_REQ_MMU_RELOAD. The bit needs to be resetted, if
gmap_mprotect_notify returns an error code. Resetting the bit with
kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu) fixes the bug.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-08-12 09:11:08 +02:00
Julius Niedworok 75a4615c95 KVM: s390: set the prefix initially properly
When KVM_RUN is triggered on a VCPU without an initial reset, a
validity intercept occurs.
Setting the prefix will set the KVM_REQ_MMU_RELOAD bit initially,
thus preventing the bug.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Julius Niedworok <jniedwor@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-08-12 09:10:17 +02:00
Linus Torvalds 221bb8a46e - ARM: GICv3 ITS emulation and various fixes. Removal of the old
VGIC implementation.
 
 - s390: support for trapping software breakpoints, nested virtualization
 (vSIE), the STHYI opcode, initial extensions for CPU model support.
 
 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
 preliminary to this and the upcoming support for hardware virtualization
 extensions.
 
 - x86: support for execute-only mappings in nested EPT; reduced vmexit
 latency for TSC deadline timer (by about 30%) on Intel hosts; support for
 more than 255 vCPUs.
 
 - PPC: bugfixes.
 
 The ugly bit is the conflicts.  A couple of them are simple conflicts due
 to 4.7 fixes, but most of them are with other trees. There was definitely
 too much reliance on Acked-by here.  Some conflicts are for KVM patches
 where _I_ gave my Acked-by, but the worst are for this pull request's
 patches that touch files outside arch/*/kvm.  KVM submaintainers should
 probably learn to synchronize better with arch maintainers, with the
 latter providing topic branches whenever possible instead of Acked-by.
 This is what we do with arch/x86.  And I should learn to refuse pull
 requests when linux-next sends scary signals, even if that means that
 submaintainers have to rebase their branches.
 
 Anyhow, here's the list:
 
 - arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
 by the nvdimm tree.  This tree adds handle_preemption_timer and
 EXIT_REASON_PREEMPTION_TIMER at the same place.  In general all mentions
 of pcommit have to go.
 
 There is also a conflict between a stable fix and this patch, where the
 stable fix removed the vmx_create_pml_buffer function and its call.
 
 - virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
 This tree adds kvm_io_bus_get_dev at the same place.
 
 - virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
 file was completely removed for 4.8.
 
 - include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
 this is a change that should have gone in through the irqchip tree and
 pulled by kvm-arm.  I think I would have rejected this kvm-arm pull
 request.  The KVM version is the right one, except that it lacks
 GITS_BASER_PAGES_SHIFT.
 
 - arch/powerpc: what a mess.  For the idle_book3s.S conflict, the KVM
 tree is the right one; everything else is trivial.  In this case I am
 not quite sure what went wrong.  The commit that is causing the mess
 (fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
 path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
 and arch/powerpc/kvm/.  It's large, but at 396 insertions/5 deletions
 I guessed that it wasn't really possible to split it and that the 5
 deletions wouldn't conflict.  That wasn't the case.
 
 - arch/s390: also messy.  First is hypfs_diag.c where the KVM tree
 moved some code and the s390 tree patched it.  You have to reapply the
 relevant part of commits 6c22c98637, plus all of e030c1125e, to
 arch/s390/kernel/diag.c.  Or pick the linux-next conflict
 resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
 Second, there is a conflict in gmap.c between a stable fix and 4.8.
 The KVM version here is the correct one.
 
 I have pushed my resolution at refs/heads/merge-20160802 (commit
 3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
 SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
 U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
 x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
 qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
 69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
 =42ti
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
2016-08-02 16:11:27 -04:00
Linus Torvalds 015cd867e5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "There are a couple of new things for s390 with this merge request:

   - a new scheduling domain "drawer" is added to reflect the unusual
     topology found on z13 machines.  Performance tests showed up to 8
     percent gain with the additional domain.

   - the new crc-32 checksum crypto module uses the vector-galois-field
     multiply and sum SIMD instruction to speed up crc-32 and crc-32c.

   - proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in
     the generic vmlinux.lds linker script definitions.

   - kcov instrumentation support.  A prerequisite for that is the
     inline assembly basic block cleanup, which is the reason for the
     net/iucv/iucv.c change.

   - support for 2GB pages is added to the hugetlbfs backend.

  Then there are two removals:

   - the oprofile hardware sampling support is dead code and is removed.
     The oprofile user space uses the perf interface nowadays.

   - the ETR clock synchronization is removed, this has been superseeded
     be the STP clock synchronization.  And it always has been
     "interesting" code..

  And the usual bug fixes and cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits)
  s390/pci: Delete an unnecessary check before the function call "pci_dev_put"
  s390/smp: clean up a condition
  s390/cio/chp : Remove deprecated create_singlethread_workqueue
  s390/chsc: improve channel path descriptor determination
  s390/chsc: sanitize fmt check for chp_desc determination
  s390/cio: make fmt1 channel path descriptor optional
  s390/chsc: fix ioctl CHSC_INFO_CU command
  s390/cio/device_ops: fix kernel doc
  s390/cio: allow to reset channel measurement block
  s390/console: Make preferred console handling more consistent
  s390/mm: fix gmap tlb flush issues
  s390/mm: add support for 2GB hugepages
  s390: have unique symbol for __switch_to address
  s390/cpuinfo: show maximum thread id
  s390/ptrace: clarify bits in the per_struct
  s390: stack address vs thread_info
  s390: remove pointless load within __switch_to
  s390: enable kcov support
  s390/cpumf: use basic block for ecctr inline assembly
  s390/hypfs: use basic block for diag inline assembly
  ...
2016-07-26 12:22:51 -07:00
David Hildenbrand 9acc317b18 KVM: s390: let ptff intercepts result in cc=3
We don't emulate ptff subfunctions, therefore react on any attempt of
execution by setting cc=3 (Requested function not available).

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-07-18 14:15:00 +02:00
David Hildenbrand 6502a34cfd KVM: s390: allow user space to handle instr 0x0000
We will use illegal instruction 0x0000 for handling 2 byte sw breakpoints
from user space. As it can be enabled dynamically via a capability,
let's move setting of ICTL_OPEREXC to the post creation step, so we avoid
any races when enabling that capability just while adding new cpus.

Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-07-18 14:15:00 +02:00
Radim Krčmář c63cf538eb KVM: pass struct kvm to kvm_set_routing_entry
Arch-specific code will use it.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-14 09:03:56 +02:00
David Hildenbrand 5ffe466cd3 KVM: s390: inject PER i-fetch events on applicable icpts
In case we have to emuluate an instruction or part of it (instruction,
partial instruction, operation exception), we have to inject a PER
instruction-fetching event for that instruction, if hardware told us to do
so.

In case we retry an instruction, we must not inject the PER event.

Please note that we don't filter the events properly yet, so guest
debugging will be visible for the guest.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-07-05 12:02:56 +02:00
Paolo Bonzini 6edaa5307f KVM: remove kvm_guest_enter/exit wrappers
Use the functions from context_tracking.h directly.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-01 11:03:21 +02:00
David Hildenbrand a411edf132 KVM: s390: vsie: add module parameter "nested"
Let's be careful first and allow nested virtualization only if enabled
by the system administrator. In addition, user space still has to
explicitly enable it via SCLP features for it to work.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:47 +02:00
David Hildenbrand 5d3876a8bf KVM: s390: vsie: add indication for future features
We have certain SIE features that we cannot support for now.
Let's add these features, so user space can directly prepare to enable
them, so we don't have to update yet another component.

In addition, add a comment block, telling why it is for now not possible to
forward/enable these features.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:47 +02:00
David Hildenbrand 91473b487d KVM: s390: vsie: correctly set and handle guest TOD
Guest 2 sets up the epoch of guest 3 from his point of view. Therefore,
we have to add the guest 2 epoch to the guest 3 epoch. We also have to take
care of guest 2 epoch changes on STP syncs. This will work just fine by
also updating the guest 3 epoch when a vsie_block has been set for a VCPU.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:46 +02:00
David Hildenbrand b917ae573f KVM: s390: vsie: speed up VCPU external calls
Whenever a SIGP external call is injected via the SIGP external call
interpretation facility, the VCPU is not kicked. When a VCPU is currently
in the VSIE, the external call might not be processed immediately.

Therefore we have to provoke partial execution exceptions, which leads to a
kick of the VCPU and therefore also kick out of VSIE. This is done by
simulating the WAIT state. This bit has no other side effects.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:46 +02:00
David Hildenbrand 94a15de8fb KVM: s390: don't use CPUSTAT_WAIT to detect if a VCPU is idle
As we want to make use of CPUSTAT_WAIT also when a VCPU is not idle but
to force interception of external calls, let's check in the bitmap instead.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:45 +02:00
David Hildenbrand adbf16985c KVM: s390: vsie: speed up VCPU irq delivery when handling vsie
Whenever we want to wake up a VCPU (e.g. when injecting an IRQ), we
have to kick it out of vsie, so the request will be handled faster.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:44 +02:00
David Hildenbrand 1b7029bec1 KVM: s390: vsie: try to refault after a reported fault to g2
We can avoid one unneeded SIE entry after we reported a fault to g2.
Theoretically, g2 resolves the fault and we can create the shadow mapping
directly, instead of failing again when entering the SIE.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:44 +02:00
David Hildenbrand 7fd7f39daa KVM: s390: vsie: support IBS interpretation
We can easily enable ibs for guest 2, so he can use it for guest 3.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:43 +02:00
David Hildenbrand 13ee3f678b KVM: s390: vsie: support conditional-external-interception
We can easily enable cei for guest 2, so he can use it for guest 3.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:42 +02:00
David Hildenbrand 5630a8e82b KVM: s390: vsie: support intervention-bypass
We can easily enable intervention bypass for guest 2, so it can use it
for guest 3.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:42 +02:00
David Hildenbrand a1b7b9b286 KVM: s390: vsie: support guest-storage-limit-suppression
We can easily forward guest-storage-limit-suppression if available.

One thing to care about is keeping the prefix properly mapped when
gsls in toggled on/off or the mso changes in between. Therefore we better
remap the prefix on any mso changes just like we already do with the
prefix.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:41 +02:00
David Hildenbrand 77d18f6d47 KVM: s390: vsie: support guest-PER-enhancement
We can easily forward the guest-PER-enhancement facility to guest 2 if
available.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:40 +02:00
David Hildenbrand 0615a326e0 KVM: s390: vsie: support shared IPTE-interlock facility
As we forward the whole SCA provided by guest 2, we can directly forward
SIIF if available.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:40 +02:00
David Hildenbrand 19c439b564 KVM: s390: vsie: support 64-bit-SCAO
Let's provide the 64-bit-SCAO facility to guest 2, so he can set up a SCA
for guest 3 that has a 64 bit address. Please note that we already require
the 64 bit SCAO for our vsie implementation, in order to forward the SCA
directly (by pinning the page).

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:39 +02:00
David Hildenbrand 588438cba0 KVM: s390: vsie: support run-time-instrumentation
As soon as guest 2 is allowed to use run-time-instrumentation (indicated
via via STFLE), it can also enable it for guest 3.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:39 +02:00
David Hildenbrand c9bc1eabe5 KVM: s390: vsie: support vectory facility (SIMD)
As soon as guest 2 is allowed to use the vector facility (indicated via
STFLE), it can also enable it for guest 3. We have to take care of the
sattellite block that might be used when not relying on lazy vector
copying (not the case for KVM).

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:38 +02:00
David Hildenbrand 166ecb3d3c KVM: s390: vsie: support transactional execution
As soon as guest 2 is allowed to use transactional execution (indicated via
STFLE), he can also enable it for guest 3.

Active transactional execution requires also the second prefix page to be
mapped. If that page cannot be mapped, a validity icpt has to be presented
to the guest.

We have to take care of tx being toggled on/off, otherwise we might get
wrong prefix validity icpt.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:37 +02:00
David Hildenbrand bbeaa58b32 KVM: s390: vsie: support aes dea wrapping keys
As soon as message-security-assist extension 3 is enabled for guest 2,
we have to allow key wrapping for guest 3.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:37 +02:00
David Hildenbrand 66b630d5b7 KVM: s390: vsie: support STFLE interpretation
Issuing STFLE is extremely rare. Instead of copying 2k on every
VSIE call, let's do this lazily, when a guest 3 tries to execute
STFLE. We can setup the block and retry.

Unfortunately, we can't directly forward that facility list, as
we only have a 31 bit address for the facility list designation.
So let's use a DMA allocation for our vsie_page instead for now.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:36 +02:00
David Hildenbrand 4ceafa9027 KVM: s390: vsie: support host-protection-interruption
Introduced with ESOP, therefore available for the guest if it
is allowed to use ESOP.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:35 +02:00
David Hildenbrand 535ef81c6e KVM: s390: vsie: support edat1 / edat2
If guest 2 is allowed to use edat 1 / edat 2, it can also set it up for
guest 3, so let's properly check and forward the edat cpuflags.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:35 +02:00
David Hildenbrand 3573602b20 KVM: s390: vsie: support setting the ibc
As soon as we forward an ibc to guest 2 (indicated via
kvm->arch.model.ibc), he can also use it for guest 3. Let's properly round
the ibc up/down, so we avoid any potential validity icpts from the
underlying SIE, if it doesn't simply round the values.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:34 +02:00
David Hildenbrand 06d68a6c85 KVM: s390: vsie: optimize gmap prefix mapping
In order to not always map the prefix, we have to take care of certain
aspects that implicitly unmap the prefix:
- Changes to the prefix address
- Changes to MSO, because the HVA of the prefix is changed
- Changes of the gmap shadow (e.g. unshadowed, asce or edat changes)

By properly handling these cases, we can stop remapping the prefix when
there is no reason to do so.

This also allows us now to not acquire any gmap shadow locks when
rerunning the vsie and still having a valid gmap shadow.

Please note, to detect changing gmap shadows, we have to keep the reference
of the gmap shadow. The address of a gmap shadow does otherwise not
reliably indicate if the gmap shadow has changed (the memory chunk
could get reused).

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:34 +02:00
David Hildenbrand a3508fbe9d KVM: s390: vsie: initial support for nested virtualization
This patch adds basic support for nested virtualization on s390x, called
VSIE (virtual SIE) and allows it to be used by the guest if the necessary
facilities are supported by the hardware and enabled for the guest.

In order to make this work, we have to shadow the sie control block
provided by guest 2. In order to gain some performance, we have to
reuse the same shadow blocks as good as possible. For now, we allow
as many shadow blocks as we have VCPUs (that way, every VCPU can run the
VSIE concurrently).

We have to watch out for the prefix getting unmapped out of our shadow
gmap and properly get the VCPU out of VSIE in that case, to fault the
prefix pages back in. We use the PROG_REQUEST bit for that purpose.

This patch is based on an initial prototype by Tobias Elpelt.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-21 09:43:33 +02:00
David Hildenbrand 37d9df98b7 KVM: s390: backup the currently enabled gmap when scheduled out
Nested virtualization will have to enable own gmaps. Current code
would enable the wrong gmap whenever scheduled out and back in,
therefore resulting in the wrong gmap being enabled.

This patch reenables the last enabled gmap, therefore avoiding having to
touch vcpu->arch.gmap when enabling a different gmap.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:55:24 +02:00
David Hildenbrand 65d0b0d4bc KVM: s390: fast path for shadow gmaps in gmap notifier
The default kvm gmap notifier doesn't have to handle shadow gmaps.
So let's just directly exit in case we get notified about one.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:55:21 +02:00
David Hildenbrand 3218f7094b s390/mm: support real-space for gmap shadows
We can easily support real-space designation just like EDAT1 and EDAT2.
So guest2 can provide for guest3 an asce with the real-space control being
set.

We simply have to allocate the biggest page table possible and fake all
levels.

There is no protection to consider. If we exceed guest memory, vsie code
will inject an addressing exception (via program intercept). In the future,
we could limit the fake table level to the gmap page table.

As the top level page table can never go away, such gmap shadows will never
get unshadowed, we'll have to come up with another way to limit the number
of kept gmap shadows.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:55:02 +02:00
David Hildenbrand 1c65781b56 s390/mm: push rte protection down to shadow pte
Just like we already do with ste protection, let's take rte protection
into account. This way, the host pte doesn't have to be mapped writable.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:55:00 +02:00
David Hildenbrand 18b8980988 s390/mm: support EDAT2 for gmap shadows
If the guest is enabled for EDAT2, we can easily create shadows for
guest2 -> guest3 provided tables that make use of EDAT2.

If guest2 references a 2GB page, this memory looks consecutive for guest2,
but it does not have to be so for us. Therefore we have to create fake
segment and page tables.

This works just like EDAT1 support, so page tables are removed when the
parent table (r3t table entry) is changed.

We don't hve to care about:
- ACCF-Validity Control in RTTE
- Access-Control Bits in RTTE
- Fetch-Protection Bit in RTTE
- Common-Region Bit in RTTE

Just like for EDAT1, all bits might be dropped and there is no guaranteed
that they are active.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:54:56 +02:00
David Hildenbrand fd8d4e3ab6 s390/mm: support EDAT1 for gmap shadows
If the guest is enabled for EDAT1, we can easily create shadows for
guest2 -> guest3 provided tables that make use of EDAT1.

If guest2 references a 1MB page, this memory looks consecutive for guest2,
but it might not be so for us. Therefore we have to create fake page tables.

We can easily add that to our existing infrastructure. The invalidation
mechanism will make sure that fake page tables are removed when the parent
table (sgt table entry) is changed.

As EDAT1 also introduced protection on all page table levels, we have to
also shadow these correctly.

We don't have to care about:
- ACCF-Validity Control in STE
- Access-Control Bits in STE
- Fetch-Protection Bit in STE
- Common-Segment Bit in STE

As all bits might be dropped and there is no guaranteed that they are
active ("unpredictable whether the CPU uses these bits", "may be used").
Without using EDAT1 in the shadow ourselfes (STE-format control == 0),
simply shadowing these bits would not be enough. They would be ignored.

Please note that we are using the "fake" flag to make this look consistent
with further changes (EDAT2, real-space designation support) and don't let
the shadow functions handle fc=1 stes.

In the future, with huge pages in the host, gmap_shadow_pgt() could simply
try to map a huge host page if "fake" is set to one and indicate via return
value that no lower fake tables / shadow ptes are required.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:54:51 +02:00
David Hildenbrand 00fc062d53 s390/mm: push ste protection down to shadow pte
If a guest ste is read-only, it doesn't make sense to force the ptes in as
writable in the host. If the source page is read-only in the host, it won't
have to be made writable. Please note that if the source page is not
available, it will still be faulted in writable. This can be changed
internally later on.

If ste protection is removed, underlying shadow tables are also removed,
therefore this change does not affect the guest.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:54:45 +02:00
David Hildenbrand f4debb4090 s390/mm: take ipte_lock during shadow faults
Let's take the ipte_lock while working on guest 2 provided page table, just
like the other gaccess functions.

Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-06-20 09:54:40 +02:00