WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Oliver Upton	1577cb5823	KVM: arm64: Handle stage-2 faults in parallel The stage-2 map walker has been made parallel-aware, and as such can be called while only holding the read side of the MMU lock. Rip out the conditional locking in user_mem_abort() and instead grab the read lock. Continue to take the write lock from other callsites to kvm_pgtable_stage2_map(). Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107220033.1895655-1-oliver.upton@linux.dev	2022-11-10 14:43:47 +00:00
Oliver Upton	af87fc03cf	KVM: arm64: Make table->block changes parallel-aware stage2_map_walker_try_leaf() and friends now handle stage-2 PTEs generically, and perform the correct flush when a table PTE is removed. Additionally, they've been made parallel-aware, using an atomic break to take ownership of the PTE. Stop clearing the PTE in the pre-order callback and instead let stage2_map_walker_try_leaf() deal with it. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107220006.1895572-1-oliver.upton@linux.dev	2022-11-10 14:43:47 +00:00
Oliver Upton	946fbfdf33	KVM: arm64: Make leaf->leaf PTE changes parallel-aware Convert stage2_map_walker_try_leaf() to use the new break-before-make helpers, thereby making the handler parallel-aware. As before, avoid the break-before-make if recreating the existing mapping. Additionally, retry execution if another vCPU thread is modifying the same PTE. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215934.1895478-1-oliver.upton@linux.dev	2022-11-10 14:43:47 +00:00
Oliver Upton	0ab12f3574	KVM: arm64: Make block->table PTE changes parallel-aware In order to service stage-2 faults in parallel, stage-2 table walkers must take exclusive ownership of the PTE being worked on. An additional requirement of the architecture is that software must perform a 'break-before-make' operation when changing the block size used for mapping memory. Roll these two concepts together into helpers for performing a 'break-before-make' sequence. Use a special PTE value to indicate a PTE has been locked by a software walker. Additionally, use an atomic compare-exchange to 'break' the PTE when the stage-2 page tables are possibly shared with another software walker. Elide the DSB + TLBI if the evicted PTE was invalid (and thus not subject to break-before-make). All of the atomics do nothing for now, as the stage-2 walker isn't fully ready to perform parallel walks. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215855.1895367-1-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	331aa3a054	KVM: arm64: Split init and set for table PTE Create a helper to initialize a table and directly call smp_store_release() to install it (for now). Prepare for a subsequent change that generalizes PTE writes with a helper. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-11-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	ca5de2448c	KVM: arm64: Atomically update stage 2 leaf attributes in parallel walks The stage2 attr walker is already used for parallel walks. Since commit `f783ef1c0e` ("KVM: arm64: Add fast path to handle permission relaxation during dirty logging"), KVM acquires the read lock when write-unprotecting a PTE. However, the walker only uses a simple store to update the PTE. This is safe as the only possible race is with hardware updates to the access flag, which is benign. However, a subsequent change to KVM will allow more changes to the stage 2 page tables to be done in parallel. Prepare the stage 2 attribute walker by performing atomic updates to the PTE when walking in parallel. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-10-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	c3119ae45d	KVM: arm64: Protect stage-2 traversal with RCU Use RCU to safely walk the stage-2 page tables in parallel. Acquire and release the RCU read lock when traversing the page tables. Defer the freeing of table memory to an RCU callback. Indirect the calls into RCU and provide stubs for hypervisor code, as RCU is not available in such a context. The RCU protection doesn't amount to much at the moment, as readers are already protected by the read-write lock (all walkers that free table memory take the write lock). Nonetheless, a subsequent change will futher relax the locking requirements around the stage-2 MMU, thereby depending on RCU. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-9-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	5c359cca1f	KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make The break-before-make sequence is a bit annoying as it opens a window wherein memory is unmapped from the guest. KVM should replace the PTE as quickly as possible and avoid unnecessary work in between. Presently, the stage-2 map walker tears down a removed table before installing a block mapping when coalescing a table into a block. As the removed table is no longer visible to hardware walkers after the DSB+TLBI, it is possible to move the remaining cleanup to happen after installing the new PTE. Reshuffle the stage-2 map walker to install the new block entry in the pre-order callback. Unwire all of the teardown logic and replace it with a call to kvm_pgtable_stage2_free_removed() after fixing the PTE. The post-order visitor is now completely unnecessary, so drop it. Finally, touch up the comments to better represent the now simplified map walker. Note that the call to tear down the unlinked stage-2 is indirected as a subsequent change will use an RCU callback to trigger tear down. RCU is not available to pKVM, so there is a need to use different implementations on pKVM and non-pKVM VMs. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-8-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	6b91b8f95c	KVM: arm64: Use an opaque type for pteps Use an opaque type for pteps and require visitors explicitly dereference the pointer before using. Protecting page table memory with RCU requires that KVM dereferences RCU-annotated pointers before using. However, RCU is not available for use in the nVHE hypervisor and the opaque type can be conditionally annotated with RCU for the stage-2 MMU. Call the type a 'pteref' to avoid a naming collision with raw pteps. No functional change intended. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-7-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	8e94e1252c	KVM: arm64: Add a helper to tear down unlinked stage-2 subtrees A subsequent change to KVM will move the tear down of an unlinked stage-2 subtree out of the critical path of the break-before-make sequence. Introduce a new helper for tearing down unlinked stage-2 subtrees. Leverage the existing stage-2 free walkers to do so, with a deep call into __kvm_pgtable_walk() as the subtree is no longer reachable from the root. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-6-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	fa002e8e79	KVM: arm64: Don't pass kvm_pgtable through kvm_pgtable_walk_data In order to tear down page tables from outside the context of kvm_pgtable (such as an RCU callback), stop passing a pointer through kvm_pgtable_walk_data. No functional change intended. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-5-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	2a611c7f87	KVM: arm64: Pass mm_ops through the visitor context As a prerequisite for getting visitors off of struct kvm_pgtable, pass mm_ops through the visitor context. No functional change intended. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-4-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	83844a2317	KVM: arm64: Stash observed pte value in visitor context Rather than reading the ptep all over the shop, read the ptep once from __kvm_pgtable_visit() and stick it in the visitor context. Reread the ptep after visiting a leaf in case the callback installed a new table underneath. No functional change intended. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-3-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Oliver Upton	dfc7a7769a	KVM: arm64: Combine visitor arguments into a context structure Passing new arguments by value to the visitor callbacks is extremely inflexible for stuffing new parameters used by only some of the visitors. Use a context structure instead and pass the pointer through to the visitor callback. While at it, redefine the 'flags' parameter to the visitor to contain the bit indicating the phase of the walk. Pass the entire set of flags through the context structure such that the walker can communicate additional state to the visitor callback. No functional change intended. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Ben Gardon <bgardon@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221107215644.1895162-2-oliver.upton@linux.dev	2022-11-10 14:43:46 +00:00
Gavin Shan	9cb1096f85	KVM: arm64: Enable ring-based dirty memory tracking Enable ring-based dirty memory tracking on ARM64: - Enable CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL. - Enable CONFIG_NEED_KVM_DIRTY_RING_WITH_BITMAP. - Set KVM_DIRTY_LOG_PAGE_OFFSET for the ring buffer's physical page offset. - Add ARM64 specific kvm_arch_allow_write_without_running_vcpu() to keep the site of saving vgic/its tables out of the no-running-vcpu radar. Signed-off-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221110104914.31280-5-gshan@redhat.com	2022-11-10 13:11:58 +00:00
Ard Biesheuvel	68c76ad4a9	arm64: unwind: add asynchronous unwind tables to kernel and modules Enable asynchronous unwind table generation for both the core kernel as well as modules, and emit the resulting .eh_frame sections as init code so we can use the unwind directives for code patching at boot or module load time. This will be used by dynamic shadow call stack support, which will rely on code patching rather than compiler codegen to emit the shadow call stack push and pop instructions. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Tested-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20221027155908.1940624-2-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-11-09 18:06:35 +00:00
Paolo Bonzini	d663b8a285	KVM: replace direct irq.h inclusion virt/kvm/irqchip.c is including "irq.h" from the arch-specific KVM source directory (i.e. not from arch/*/include) for the sole purpose of retrieving irqchip_in_kernel. Making the function inline in a header that is already included, such as asm/kvm_host.h, is not possible because it needs to look at struct kvm which is defined after asm/kvm_host.h is included. So add a kvm_arch_irqchip_in_kernel non-inline function; irqchip_in_kernel() is only performance critical on arm64 and x86, and the non-inline function is enough on all other architectures. irq.h can then be deleted from all architectures except x86. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-11-09 12:31:37 -05:00
Peter Xu	c8b88b332b	kvm: Add interruptible flag to __gfn_to_pfn_memslot() Add a new "interruptible" flag showing that the caller is willing to be interrupted by signals during the __gfn_to_pfn_memslot() request. Wire it up with a FOLL_INTERRUPTIBLE flag that we've just introduced. This prepares KVM to be able to respond to SIGUSR1 (for QEMU that's the SIGIPI) even during e.g. handling an userfaultfd page fault. No functional change intended. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221011195809.557016-4-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-11-09 12:31:27 -05:00
Marc Zyngier	4151bb636a	KVM: arm64: Fix SMPRI_EL1/TPIDR2_EL0 trapping on VHE The trapping of SMPRI_EL1 and TPIDR2_EL0 currently only really work on nVHE, as only this mode uses the fine-grained trapping that controls these two registers. Move the trapping enable/disable code into __{de,}activate_traps_common(), allowing it to be called when it actually matters on VHE, and remove the flipping of EL2 control for TPIDR2_EL0, which only affects the host access of this register. Fixes: `861262ab86` ("KVM: arm64: Handle SME host state when running guests") Reported-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/86bkpqer4z.wl-maz@kernel.org	2022-11-01 15:56:52 +00:00
Ryan Roberts	b6bcdc9f6b	KVM: arm64: Fix bad dereference on MTE-enabled systems enter_exception64() performs an MTE check, which involves dereferencing vcpu->kvm. While vcpu has already been fixed up to be a HYP VA pointer, kvm is still a pointer in the kernel VA space. This only affects nVHE configurations with MTE enabled, as in other cases, the pointer is either valid (VHE) or not dereferenced (!MTE). Fix this by first converting kvm to a HYP VA pointer. Fixes: `ea7fc1bb1c` ("KVM: arm64: Introduce MTE VM feature") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> [maz: commit message tidy-up] Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221027120945.29679-1-ryan.roberts@arm.com	2022-10-27 19:49:40 +01:00
Quentin Perret	6853a71726	KVM: arm64: Use correct accessor to parse stage-1 PTEs hyp_get_page_state() is used with pKVM to retrieve metadata about a page by parsing a hypervisor stage-1 PTE. However, it incorrectly uses a helper which parses stage-2 mappings. Ouch. Luckily, pkvm_getstate() only looks at the software bits, which happen to be in the same place for stage-1 and stage-2 PTEs, and this all ends up working correctly by accident. But clearly, we should do better. Fix hyp_get_page_state() to use the correct helper. Fixes: `e82edcc75c` ("KVM: arm64: Implement do_share() helper for sharing memory") Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221025145156.855308-1-qperret@google.com	2022-10-25 16:29:58 +01:00
Paolo Bonzini	ebccb53e93	KVM/arm64 fixes for 6.1, take #2 - Fix a bug preventing restoring an ITS containing mappings for very large and very sparse device topology - Work around a relocation handling error when compiling the nVHE object with profile optimisation -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmNRGjoPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDVcEP/2zn+J/pAv1FoNKxSiSk/H2hx5PH685mqEwv I7YPYPOuGJDv/uJFXzMMDpjjs8RHG9yWwPwtFZYGKqtmlkXgM3a0jk5jtsgfSVcp vsvqgR0kL67x5D/Mc+I/bkVMNEecrucvkewf0Sp7W2gdXnC+rm1EFN5McK5U60tW +s6fKVkIB0/fxNJCXPdETGiNpoYGEBeVVzoTrXGFm9QBcSDKvqdaPF8koZzOecpS 3UcVYKBaxNf1zowAseNYiS4kndwKuopaYsk4BY/KDgTbPyoFQBwMZYm6DAkb9ugW sFK6g8FmOJJ5meLWIg4vTdruILL+YP6gbJ0HrzFoxJJEdd2iRzEUgQlBp7G+STSY npXmeTJh5DJelm1Qd4OdiSBlqwYTIxKEyaV34/xeFh9P3w/LbN9uSiZAybbTw92h sE1kQpT//s8lTXOnn23YHycZ1Dsy0JSExoJutCt6E5YnTb0wtPQ6ZLoWmyWMnTL/ 6Gyj7LTy+trv4l0N2ND+LF7BfN6Cb4eSXMfLLhwo2qfQI2kD/gBrbrTYYdyiVvHD YnZftvVViOcQVaocXP1SeQ5/yhdj6ASrCiFWie7nJW9Z3IJ9BJuXCxk3gSW5yCde Rw37+FpVSciyhpEK9f4aVGfAGA8ifO1Gi5yp70ioF8Y3i1CvS238g0knx6v6p8Vm dAi3i1ga =C0jb -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.1, take #2 - Fix a bug preventing restoring an ITS containing mappings for very large and very sparse device topology - Work around a relocation handling error when compiling the nVHE object with profile optimisation	2022-10-22 03:33:26 -04:00
Paolo Bonzini	5834816829	KVM/arm64 fixes for 6.1, take #1 - Fix for stage-2 invalidation holding the VM MMU lock for too long by limiting the walk to the largest block mapping size - Enable stack protection and branch profiling for VHE - Two selftest fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmNIEAUPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDIXcP/AyWlCEJlmc1Jcd9rlaW1Wenr82U+StLVeMy qP5P02gMWdbGExWIWEi4zkt+pAm7K2WRgXid9z5Vjw7kZY/+WwswTzKHWcQhVuZv cBHfeOqgtoHVGR8NcwX6xcp406y3WRqYIsyAmbc5qmo75L8Ew1o3m+3eDfFtAq7l 3XuTCv+lQGNSGMhXHN2SVewZ+pCAo3XJmuHfCBXTqRjwqH4Tzh+54IKzo+9mqBWW 7yeIm5qcbIKGuXLuLL7XCf99gWy/3kQ0xQ1yJeXLAyiHswHqEISZXGHnKeATvD+6 RdbmQ9oRmIYfZfoDKZRUJg8TyTvW1rIKokFbe0q2iyuDnI5D/fAJ48epZaLw+kEf PUzdB3UgPk19SLwgZKQddqY4wOD420ZD5x1TUFUQuLL7sjVv1vUILDvuCLWpq7F7 GyfSB+LEMgexHGsZ1wjslN/ivTbG+dQgaSS9mlV8/WDOLPtD2uOf65vYR3P28hAX zOHrwm3e2+UV83BsEFEY2FQiiIBD24JmSecMbmAIHY09MCSZ+vJ/WbF4J1PcPP8C 3vjueIYTcjhzLtQrfIkGZcS7+wC9ji/RRmpJjbg79EpwrjhEs9G8h1+HyL9+zBZ4 Xn6X+ZG/cv0/ZYdin0ZRzJMvM0RutbsR77blVCLY97PBuLtBlqJDcxr+lmmjyIZ2 Db8Qd6uW =IOxM -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.1, take #1 - Fix for stage-2 invalidation holding the VM MMU lock for too long by limiting the walk to the largest block mapping size - Enable stack protection and branch profiling for VHE - Two selftest fixes	2022-10-22 03:32:23 -04:00
Eric Ren	c000a26071	KVM: arm64: vgic: Fix exit condition in scan_its_table() With some PCIe topologies, restoring a guest fails while parsing the ITS device tables. Reproducer hints: 1. Create ARM virt VM with pxb-pcie bus which adds extra host bridges, with qemu command like: ``` -device pxb-pcie,bus_nr=8,id=pci.x,numa_node=0,bus=pcie.0 \ -device pcie-root-port,..,bus=pci.x \ ... -device pxb-pcie,bus_nr=37,id=pci.y,numa_node=1,bus=pcie.0 \ -device pcie-root-port,..,bus=pci.y \ ... ``` 2. Ensure the guest uses 2-level device table 3. Perform VM migration which calls save/restore device tables In that setup, we get a big "offset" between 2 device_ids, which makes unsigned "len" round up a big positive number, causing the scan loop to continue with a bad GPA. For example: 1. L1 table has 2 entries; 2. and we are now scanning at L2 table entry index 2075 (pointed to by L1 first entry) 3. if next device id is 9472, we will get a big offset: 7397; 4. with unsigned 'len', 'len -= offset * esz', len will underflow to a positive number, mistakenly into next iteration with a bad GPA; (It should break out of the current L2 table scanning, and jump into the next L1 table entry) 5. that bad GPA fails the guest read. Fix it by stopping the L2 table scan when the next device id is outside of the current table, allowing the scan to continue from the next L1 table entry. Thanks to Eric Auger for the fix suggestion. Fixes: `920a7a8fa9` ("KVM: arm64: vgic-its: Add infrastructure for tableookup") Suggested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Eric Ren <renzhengeek@gmail.com> [maz: commit message tidy-up] Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/d9c3a564af9e2c5bf63f48a7dcbf08cd593c5c0b.1665802985.git.renzhengeek@gmail.com	2022-10-15 12:10:54 +01:00
Denis Nikitin	bde971a83b	KVM: arm64: nvhe: Fix build with profile optimization Kernel build with clang and KCFLAGS=-fprofile-sample-use=<profile> fails with: error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL section ".rel.llvm.call-graph-profile" Starting from 13.0.0 llvm can generate SHT_REL section, see https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086. gen-hyprel does not support SHT_REL relocation section. Filter out profile use flags to fix the build with profile optimization. Signed-off-by: Denis Nikitin <denik@chromium.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221014184532.3153551-1-denik@chromium.org	2022-10-15 12:09:50 +01:00
Linus Torvalds	f311d498be	ARM: * Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS * Better handling of AArch32 ID registers on AArch64-only systems * Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering * Advertise the new kvmarm mailing list * Various minor cleanups and spelling fixes RISC-V: * Improved instruction encoding infrastructure for instructions not yet supported by binutils * Svinval support for both KVM Host and KVM Guest * Zihintpause support for KVM Guest * Zicbom support for KVM Guest * Record number of signal exits as a VCPU stat * Use generic guest entry infrastructure x86: * Misc PMU fixes and cleanups. * selftests: fixes for Hyper-V hypercall * selftests: fix nx_huge_pages_test on TDP-disabled hosts * selftests: cleanups for fix_hypercall_test -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM7OcMUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAFgf/Rqc9hrXZVdbh2OZ+gScSsFsPK1zO DISUksLcXaYVYYsvQAEg/N2BPz3XbmO4jA+z8bIUrYTA7fC98we2C4jfR+EaX/fO +/Kzf0lAgu/nQZyFzUya+1jRsZqvVbC/HmDCI2kzN4u78e/LZ7NVcMijdV/ly6ib cq0b0LLqJHe/fcpJ806JZP3p5sndQhDmlUkZ2AWZf6CUKSEFcufbbYkt+84ZK4PL N9mEqXYQ3DXClLQmIBv+NZhtGlmADkWDE4BNouw8dVxhaXH7Hw/jfBHdb6SSHMRe tQ6Src1j8AYOhf5J35SMudgkbGcMelm0yeZ7Sizk+5Ft0EmdbZsnkvsGdQ== =4RA+ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull more kvm updates from Paolo Bonzini: "The main batch of ARM + RISC-V changes, and a few fixes and cleanups for x86 (PMU virtualization and selftests). ARM: - Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS - Better handling of AArch32 ID registers on AArch64-only systems - Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering - Advertise the new kvmarm mailing list - Various minor cleanups and spelling fixes RISC-V: - Improved instruction encoding infrastructure for instructions not yet supported by binutils - Svinval support for both KVM Host and KVM Guest - Zihintpause support for KVM Guest - Zicbom support for KVM Guest - Record number of signal exits as a VCPU stat - Use generic guest entry infrastructure x86: - Misc PMU fixes and cleanups. - selftests: fixes for Hyper-V hypercall - selftests: fix nx_huge_pages_test on TDP-disabled hosts - selftests: cleanups for fix_hypercall_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (57 commits) riscv: select HAVE_POSIX_CPU_TIMERS_TASK_WORK RISC-V: KVM: Use generic guest entry infrastructure RISC-V: KVM: Record number of signal exits as a vCPU stat RISC-V: KVM: add __init annotation to riscv_kvm_init() RISC-V: KVM: Expose Zicbom to the guest RISC-V: KVM: Provide UAPI for Zicbom block size RISC-V: KVM: Make ISA ext mappings explicit RISC-V: KVM: Allow Guest use Zihintpause extension RISC-V: KVM: Allow Guest use Svinval extension RISC-V: KVM: Use Svinval for local TLB maintenance when available RISC-V: Probe Svinval extension form ISA string RISC-V: KVM: Change the SBI specification version to v1.0 riscv: KVM: Apply insn-def to hlv encodings riscv: KVM: Apply insn-def to hfence encodings riscv: Introduce support for defining instructions riscv: Add X register names to gpr-nums KVM: arm64: Advertise new kvmarm mailing list kvm: vmx: keep constant definition format consistent kvm: mmu: fix typos in struct kvm_arch KVM: selftests: Fix nx_huge_pages_test on TDP-disabled hosts ...	2022-10-11 20:07:44 -07:00
Linus Torvalds	ef688f8b8c	The first batch of KVM patches, mostly covering x86, which I am sending out early due to me travelling next week. There is a lone mm patch for which Andrew gave an informal ack at https://lore.kernel.org/linux-mm/20220817102500.440c6d0a3fce296fdf91bea6@linux-foundation.org. I will send the bulk of ARM work, as well as other architectures, at the end of next week. ARM: * Account stage2 page table allocations in memory stats. x86: * Account EPT/NPT arm64 page table allocations in memory stats. * Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses. * Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of Hyper-V now support eVMCS fields associated with features that are enumerated to the guest. * Use KVM's sanitized VMCS config as the basis for the values of nested VMX capabilities MSRs. * A myriad event/exception fixes and cleanups. Most notably, pending exceptions morph into VM-Exits earlier, as soon as the exception is queued, instead of waiting until the next vmentry. This fixed a longstanding issue where the exceptions would incorrecly become double-faults instead of triggering a vmexit; the common case of page-fault vmexits had a special workaround, but now it's fixed for good. * A handful of fixes for memory leaks in error paths. * Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow. * Never write to memory from non-sleepable kvm_vcpu_check_block() * Selftests refinements and cleanups. * Misc typo cleanups. Generic: * remove KVM_REQ_UNHALT -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM2zwcUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpbwf+MlVeOlzE5SBdrJ0TEnLmKUel1lSz QnZzP5+D65oD0zhCilUZHcg6G4mzZ5SdVVOvrGJvA0eXh25ruLNMF6jbaABkMLk/ FfI1ybN7A82hwJn/aXMI/sUurWv4Jteaad20JC2DytBCnsW8jUqc49gtXHS2QWy4 3uMsFdpdTAg4zdJKgEUfXBmQviweVpjjl3ziRyZZ7yaeo1oP7XZ8LaE1nR2l5m0J mfjzneNm5QAnueypOh5KhSwIvqf6WHIVm/rIHDJ1HIFbgfOU0dT27nhb1tmPwAcE +cJnnMUHjZqtCXteHkAxMClyRq0zsEoKk0OGvSOOMoq3Q0DavSXUNANOig== =/hqX -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "The first batch of KVM patches, mostly covering x86. ARM: - Account stage2 page table allocations in memory stats x86: - Account EPT/NPT arm64 page table allocations in memory stats - Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses - Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of Hyper-V now support eVMCS fields associated with features that are enumerated to the guest - Use KVM's sanitized VMCS config as the basis for the values of nested VMX capabilities MSRs - A myriad event/exception fixes and cleanups. Most notably, pending exceptions morph into VM-Exits earlier, as soon as the exception is queued, instead of waiting until the next vmentry. This fixed a longstanding issue where the exceptions would incorrecly become double-faults instead of triggering a vmexit; the common case of page-fault vmexits had a special workaround, but now it's fixed for good - A handful of fixes for memory leaks in error paths - Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow - Never write to memory from non-sleepable kvm_vcpu_check_block() - Selftests refinements and cleanups - Misc typo cleanups Generic: - remove KVM_REQ_UNHALT" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits) KVM: remove KVM_REQ_UNHALT KVM: mips, x86: do not rely on KVM_REQ_UNHALT KVM: x86: never write to memory from kvm_vcpu_check_block() KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set KVM: x86: lapic does not have to process INIT if it is blocked KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed KVM: nVMX: Make an event request when pending an MTF nested VM-Exit KVM: x86: make vendor code check for all nested events mailmap: Update Oliver's email address KVM: x86: Allow force_emulation_prefix to be written without a reload KVM: selftests: Add an x86-only test to verify nested exception queueing KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events() KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions KVM: x86: Morph pending exceptions to pending VM-Exits at queue time ...	2022-10-09 09:39:55 -07:00
Vincent Donnefort	837d632a38	KVM: arm64: Enable stack protection and branch profiling for VHE For historical reasons, the VHE code inherited the build configuration from nVHE. Now those two parts have their own folder and makefile, we can enable stack protection and branch profiling for VHE. Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Reviewed-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221004154216.2833636-1-vdonnefort@google.com	2022-10-09 03:15:55 +01:00
Oliver Upton	5994bc9e05	KVM: arm64: Limit stage2_apply_range() batch size to largest block Presently stage2_apply_range() works on a batch of memory addressed by a stage 2 root table entry for the VM. Depending on the IPA limit of the VM and PAGE_SIZE of the host, this could address a massive range of memory. Some examples: 4 level, 4K paging -> 512 GB batch size 3 level, 64K paging -> 4TB batch size Unsurprisingly, working on such a large range of memory can lead to soft lockups. When running dirty_log_perf_test: ./dirty_log_perf_test -m -2 -s anonymous_thp -b 4G -v 48 watchdog: BUG: soft lockup - CPU#0 stuck for 45s! [dirty_log_perf_:16703] Modules linked in: vfat fat cdc_ether usbnet mii xhci_pci xhci_hcd sha3_generic gq(O) CPU: 0 PID: 16703 Comm: dirty_log_perf_ Tainted: G O 6.0.0-smp-DEV #1 pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : dcache_clean_inval_poc+0x24/0x38 lr : clean_dcache_guest_page+0x28/0x4c sp : ffff800021763990 pmr_save: 000000e0 x29: ffff800021763990 x28: 0000000000000005 x27: 0000000000000de0 x26: 0000000000000001 x25: 00400830b13bc77f x24: ffffad4f91ead9c0 x23: 0000000000000000 x22: ffff8000082ad9c8 x21: 0000fffafa7bc000 x20: ffffad4f9066ce50 x19: 0000000000000003 x18: ffffad4f92402000 x17: 000000000000011b x16: 000000000000011b x15: 0000000000000124 x14: ffff07ff8301d280 x13: 0000000000000000 x12: 00000000ffffffff x11: 0000000000010001 x10: fffffc0000000000 x9 : ffffad4f9069e580 x8 : 000000000000000c x7 : 0000000000000000 x6 : 000000000000003f x5 : ffff07ffa2076980 x4 : 0000000000000001 x3 : 000000000000003f x2 : 0000000000000040 x1 : ffff0830313bd000 x0 : ffff0830313bcc40 Call trace: dcache_clean_inval_poc+0x24/0x38 stage2_unmap_walker+0x138/0x1ec __kvm_pgtable_walk+0x130/0x1d4 __kvm_pgtable_walk+0x170/0x1d4 __kvm_pgtable_walk+0x170/0x1d4 __kvm_pgtable_walk+0x170/0x1d4 kvm_pgtable_stage2_unmap+0xc4/0xf8 kvm_arch_flush_shadow_memslot+0xa4/0x10c kvm_set_memslot+0xb8/0x454 __kvm_set_memory_region+0x194/0x244 kvm_vm_ioctl_set_memory_region+0x58/0x7c kvm_vm_ioctl+0x49c/0x560 __arm64_sys_ioctl+0x9c/0xd4 invoke_syscall+0x4c/0x124 el0_svc_common+0xc8/0x194 do_el0_svc+0x38/0xc0 el0_svc+0x2c/0xa4 el0t_64_sync_handler+0x84/0xf0 el0t_64_sync+0x1a0/0x1a4 Use the largest supported block mapping for the configured page size as the batch granularity. In so doing the walker is guaranteed to visit a leaf only once. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221007234151.461779-3-oliver.upton@linux.dev	2022-10-09 02:33:49 +01:00
Linus Torvalds	18fd049731	arm64 updates for 6.1: - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation. - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously). - More conversions to automatic system registers generation. - vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it. - arm64 stacktrace cleanups and improvements. - arm64 atomics improvements: always inline assembly, remove LL/SC trampolines. - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting. - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result. - arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm. - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions). - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function. - kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups. - arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap. - Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmM9W4cACgkQa9axLQDI XvEy3w/+LJ3KCFowWiz5gTAWikjv+UVssHjLMJixn47V7hsEFQ26Xnam/438rTMI kE95u6DHUpw2SMIxKzFRO7oI5cQtP+cWGwTtOUnjVO+U1oN+HqDOIbO9DbylWDcU eeeqMMmawMfTPuZrYklpOhXscsorbrKIvYBg7wHYOcwBYV3EPhWr89lwMvTVRuyJ qpX628KlkGMaBcONNhv3nS3qZcAOs0oHQCAVS4C8czLDL+vtJlumXUS3xr1Mqm72 xtFe7sje8Djr2kZ8mzh0GbFiZEBoBD3F/l7ayq8gVRaVpToUt8sk36Stjs4LojF1 6imuAfji/5TItkScq5KhGqj6MIugwp/eUVbRN74OLNTYx7msF1ZADNFQ+Q0UuY0H SYK13KvmOji0xjS8qAfhqrwNB79sk3fb+zF9LjETbdz4ZJCgg9gcFbSUTY0DvMfS MXZk/jVeB07olA8xYbjh0BRt4UV9xU628FPQzK5k7e4Nzl4jSvgtJZCZanfuVtjy /ZS1vbN8o7tQLBAlVnw+Exi/VedkKxkkMgm8tPKsMgERTFDx0Pc4Gs72hRpDnPWT MRbeCCGleAf3JQ5vF0coBDNOCEVvweQgShHOyHTz0GyhWXLCFx3RJICo5I4EIpps LLUk4JK0fO3LVrf1AEpu5ZP4+Sact0zfsH3gB7qyLPYFDmjDXD8= =jl3Z -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation. - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously). - More conversions to automatic system registers generation. - vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it. - arm64 stacktrace cleanups and improvements. - arm64 atomics improvements: always inline assembly, remove LL/SC trampolines. - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting. - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result. - arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm. - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions). - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function. - kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups. - arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap. - Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits) arm64: alternatives: Use vdso/bits.h instead of linux/bits.h arm64/kprobe: Optimize the performance of patching single-step slot arm64: defconfig: Add Coresight as module kselftest/arm64: Handle EINTR while reading data from children kselftest/arm64: Flag fp-stress as exiting when we begin finishing up kselftest/arm64: Don't repeat termination handler for fp-stress ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs arm64/mm: fold check for KFENCE into can_set_direct_map() arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static arm64: fix the build with binutils 2.27 kselftest/arm64: Don't enable v8.5 for MTE selftest builds arm64: uaccess: simplify uaccess_mask_ptr() arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header kselftest/arm64: Fix typo in hwcap check arm64: mte: move register initialization to C arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate() arm64: dma: Drop cache invalidation from arch_dma_prep_coherent() arm64/sve: Add Perf extensions documentation ...	2022-10-06 11:51:49 -07:00
Paolo Bonzini	fe4d9e4abf	KVM/arm64 updates for v6.1 - Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS - Better handling of AArch32 ID registers on AArch64-only systems - Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering - Advertise the new kvmarm mailing list - Various minor cleanups and spelling fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmM5hQcPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDoMUP/jra4HSmujLUB5G7Op8HxuurEecOc6xtw0Af AbDLlVc2Vs4rrdVh8GMc8D80atUAVitp8IFjdp/PzI2GTBTzWz43Gav2AbhgIJbJ xoFVHL8LkdHKyMbq10359DqGMqhIf41OFzGwhbzcx2V4pKNkSpjbCpu3bi/+Ybjg 006ZpZc7NAU0rZgw9Flb/dhn0jw7RMc3orhoDQ4tBp1P/VhvqvgFt5bWipkvvBP7 +lQK28ujG3ghST/hKRhg6ozgy5+6NEEHMuhErMYP8nIivRchX+pWF2Lb0qGH1e+U v2MZIZnIIUjyTV1vbYlxtltzfYmPuQ2MFNUBawI9tmlIOU9vJSCzeJS64uWK4KLV kbmk57OfC7rQoSNJH4jaKQp0YpIktrB9Vei97t4I7NwEmkjQj6cLTgg4tQrNqTiQ cFGeC9mE+lhFC8z1lCbna2eG631FxpPrB1SJ1/CU9wboam9dUfXGIvBPh+i2pvMZ vcxzUZJ11y+/uhp4k8i2PBwNno0iwRXd5MinwRUs2CR5vhs8qa5y7FVWKyqKpgI2 xqr4lYTixJZL3mWkYyOQuClrTbT1zkoaPldLq6M7wvO08+QV8ryMeyKT+9s/gNQU dcYSwBCWZaOZm2nN8/zjxRb7VqZVu3cwyXi9XXUWNTCgIe/Q/SDPbXU/Hwbgzf8X UsQF7e9A =aNPK -----END PGP SIGNATURE----- Merge tag 'kvmarm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v6.1 - Fixes for single-stepping in the presence of an async exception as well as the preservation of PSTATE.SS - Better handling of AArch32 ID registers on AArch64-only systems - Fixes for the dirty-ring API, allowing it to work on architectures with relaxed memory ordering - Advertise the new kvmarm mailing list - Various minor cleanups and spelling fixes	2022-10-03 15:33:32 -04:00
Marc Zyngier	b302ca52ba	Merge branch kvm-arm64/misc-6.1 into kvmarm-master/next * kvm-arm64/misc-6.1: : . : Misc KVM/arm64 fixes and improvement for v6.1 : : - Simplify the affinity check when moving a GICv3 collection : : - Tone down the shouting when kvm-arm.mode=protected is passed : to a guest : : - Fix various comments : : - Advertise the new kvmarm@lists.linux.dev and deprecate the : old Columbia list : . KVM: arm64: Advertise new kvmarm mailing list KVM: arm64: Fix comment typo in nvhe/switch.c KVM: selftests: Update top-of-file comment in psci_test KVM: arm64: Ignore kvm-arm.mode if !is_hyp_mode_available() KVM: arm64: vgic: Remove duplicate check in update_affinity_collection() Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-10-01 10:19:36 +01:00
Paolo Bonzini	c99ad25b0d	Merge tag 'kvm-x86-6.1-2' of https://github.com/sean-jc/linux into HEAD KVM x86 updates for 6.1, batch #2: - Misc PMU fixes and cleanups. - Fixes for Hyper-V hypercall selftest	2022-09-30 07:09:48 -04:00
Catalin Marinas	c704cf27a1	Merge branch 'for-next/alternatives' into for-next/core * for-next/alternatives: : Alternatives (code patching) improvements arm64: fix the build with binutils 2.27 arm64: avoid BUILD_BUG_ON() in alternative-macros arm64: alternatives: add shared NOP callback arm64: alternatives: add alternative_has_feature_*() arm64: alternatives: have callbacks take a cap arm64: alternatives: make alt_region const arm64: alternatives: hoist print out of __apply_alternatives() arm64: alternatives: proton-pack: prepare for cap changes arm64: alternatives: kvm: prepare for cap changes arm64: cpufeature: make cpus_have_cap() noinstr-safe	2022-09-30 09:18:22 +01:00
Catalin Marinas	b23ec74cbd	Merge branches 'for-next/doc', 'for-next/sve', 'for-next/sysreg', 'for-next/gettimeofday', 'for-next/stacktrace', 'for-next/atomics', 'for-next/el1-exceptions', 'for-next/a510-erratum-2658417', 'for-next/defconfig', 'for-next/tpidr2_el0' and 'for-next/ftrace', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header arm64/sve: Add Perf extensions documentation perf: arm64: Add SVE vector granule register to user regs MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC docs: perf: Add description for Alibaba's T-Head PMU driver * for-next/doc: : Documentation/arm64 updates arm64/sve: Document our actual ABI for clearing registers on syscall * for-next/sve: : SVE updates arm64/sysreg: Add hwcap for SVE EBF16 * for-next/sysreg: (35 commits) : arm64 system registers generation (more conversions) arm64/sysreg: Fix a few missed conversions arm64/sysreg: Convert ID_AA64AFRn_EL1 to automatic generation arm64/sysreg: Convert ID_AA64DFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64FDR0_EL1 to automatic generation arm64/sysreg: Use feature numbering for PMU and SPE revisions arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture arm64/sysreg: Add defintion for ALLINT arm64/sysreg: Convert SCXTNUM_EL1 to automatic generation arm64/sysreg: Convert TIPDR_EL1 to automatic generation arm64/sysreg: Convert ID_AA64PFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64PFR0_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR2_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR0_EL1 to automatic generation arm64/sysreg: Convert HCRX_EL2 to automatic generation arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 SME enumeration arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 BTI enumeration arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 fractional version fields arm64/sysreg: Standardise naming for MTE feature enumeration ... * for-next/gettimeofday: : Use self-synchronising counter access in gettimeofday() (if FEAT_ECV) arm64: vdso: use SYS_CNTVCTSS_EL0 for gettimeofday arm64: alternative: patch alternatives in the vDSO arm64: module: move find_section to header * for-next/stacktrace: : arm64 stacktrace cleanups and improvements arm64: stacktrace: track hyp stacks in unwinder's address space arm64: stacktrace: track all stack boundaries explicitly arm64: stacktrace: remove stack type from fp translator arm64: stacktrace: rework stack boundary discovery arm64: stacktrace: add stackinfo_on_stack() helper arm64: stacktrace: move SDEI stack helpers to stacktrace code arm64: stacktrace: rename unwind_next_common() -> unwind_next_frame_record() arm64: stacktrace: simplify unwind_next_common() arm64: stacktrace: fix kerneldoc comments * for-next/atomics: : arm64 atomics improvements arm64: atomic: always inline the assembly arm64: atomics: remove LL/SC trampolines * for-next/el1-exceptions: : Improve the reporting of EL1 exceptions arm64: rework BTI exception handling arm64: rework FPAC exception handling arm64: consistently pass ESR_ELx to die() arm64: die(): pass 'err' as long arm64: report EL1 UNDEFs better * for-next/a510-erratum-2658417: : Cortex-A510: 2658417: remove BF16 support due to incorrect result arm64: errata: remove BF16 HWCAP due to incorrect result on Cortex-A510 arm64: cpufeature: Expose get_arm64_ftr_reg() outside cpufeature.c arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space * for-next/defconfig: : arm64 defconfig updates arm64: defconfig: Add Coresight as module arm64: Enable docker support in defconfig arm64: defconfig: Enable memory hotplug and hotremove config arm64: configs: Enable all PMUs provided by Arm * for-next/tpidr2_el0: : arm64 ptrace() support for TPIDR2_EL0 kselftest/arm64: Add coverage of TPIDR2_EL0 ptrace interface arm64/ptrace: Support access to TPIDR2_EL0 arm64/ptrace: Document extension of NT_ARM_TLS to cover TPIDR2_EL0 kselftest/arm64: Add test coverage for NT_ARM_TLS * for-next/ftrace: : arm64 ftraces updates/fixes arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static	2022-09-30 09:17:57 +01:00
Wei-Lin Chang	43b233b158	KVM: arm64: Fix comment typo in nvhe/switch.c Fix the comment of __hyp_vgic_restore_state() from saying VEH to VHE, also change the underscore to a dash to match the comment above __hyp_vgic_save_state(). Signed-off-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220929042839.24277-1-r09922117@csie.ntu.edu.tw	2022-09-29 09:07:08 +01:00
Paolo Bonzini	c59fb12758	KVM: remove KVM_REQ_UNHALT KVM_REQ_UNHALT is now unnecessary because it is replaced by the return value of kvm_vcpu_block/kvm_vcpu_halt. Remove it. No functional change intended. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-Id: <20220921003201.1441511-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:37:21 -04:00
Elliot Berman	b2a4d007c3	KVM: arm64: Ignore kvm-arm.mode if !is_hyp_mode_available() Ignore kvm-arm.mode if !is_hyp_mode_available(). Specifically, we want to avoid switching kvm_mode to KVM_MODE_PROTECTED if hypervisor mode is not available. This prevents "Protected KVM" cpu capability being reported when Linux is booting in EL1 and would not have KVM enabled. Reasonably though, we should warn if the command line is requesting a KVM mode at all if KVM isn't actually available. Allow "kvm-arm.mode=none" to skip the warning since this would disable KVM anyway. Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220920190658.2880184-1-quic_eberman@quicinc.com	2022-09-26 10:49:49 +01:00
Gavin Shan	096560dd13	KVM: arm64: vgic: Remove duplicate check in update_affinity_collection() The 'coll' parameter to update_affinity_collection() is never NULL, so comparing it with 'ite->collection' is enough to cover both the NULL case and the "another collection" case. Remove the duplicate check in update_affinity_collection(). Signed-off-by: Gavin Shan <gshan@redhat.com> [maz: repainted commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220923065447.323445-1-gshan@redhat.com	2022-09-26 10:46:37 +01:00
Zenghui Yu	522c9a64c7	KVM: arm64: Use kmemleak_free_part_phys() to unregister hyp_mem_base With commit `0c24e06119` ("mm: kmemleak: add rbtree and store physical address for objects allocated with PA"), kmemleak started to put the objects allocated with physical address onto object_phys_tree_root tree. The kmemleak_free_part() therefore no longer worked as expected on physically allocated objects (hyp_mem_base in this case) as it attempted to search and remove things in object_tree_root tree. Fix it by using kmemleak_free_part_phys() to unregister hyp_mem_base. This fixes an immediate crash when booting a KVM host in protected mode with kmemleak enabled. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220908130659.2021-1-yuzenghui@huawei.com	2022-09-19 17:59:48 +01:00
Marc Zyngier	bb0cca240a	Merge branch kvm-arm64/single-step-async-exception into kvmarm-master/next * kvm-arm64/single-step-async-exception: : . : Single-step fixes from Reiji Watanabe: : : "This series fixes two bugs of single-step execution enabled by : userspace, and add a test case for KVM_GUESTDBG_SINGLESTEP to : the debug-exception test to verify the single-step behavior." : . KVM: arm64: selftests: Add a test case for KVM_GUESTDBG_SINGLESTEP KVM: arm64: selftests: Refactor debug-exceptions to make it amenable to new test cases KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pending KVM: arm64: Preserve PSTATE.SS for the guest while single-step is enabled Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-09-19 10:59:29 +01:00
Reiji Watanabe	370531d1e9	KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pending While userspace enables single-step, if the Software Step state at the last guest exit was "Active-pending", clear PSTATE.SS on guest entry to restore the state. Currently, KVM sets PSTATE.SS to 1 on every guest entry while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP). It means KVM always makes the vCPU's Software Step state "Active-not-pending" on the guest entry, which lets the VCPU perform single-step (then Software Step exception is taken). This could cause extra single-step (without returning to userspace) if the Software Step state at the last guest exit was "Active-pending" (i.e. the last exit was triggered by an asynchronous exception after the single-step is performed, but before the Software Step exception is taken. See "Figure D2-3 Software step state machine" and "D2.12.7 Behavior in the active-pending state" in ARM DDI 0487I.a for more info about this behavior). Fix this by clearing PSTATE.SS on guest entry if the Software Step state at the last exit was "Active-pending" so that KVM restore the state (and the exception is taken before further single-step is performed). Fixes: `337b99bf7e` ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-3-reijiw@google.com	2022-09-19 10:48:53 +01:00
Reiji Watanabe	34fbdee086	KVM: arm64: Preserve PSTATE.SS for the guest while single-step is enabled Preserve the PSTATE.SS value for the guest while userspace enables single-step (i.e. while KVM manipulates the PSTATE.SS) for the vCPU. Currently, while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP), KVM sets PSTATE.SS to 1 on every guest entry, not saving its original value. When userspace disables single-step, KVM doesn't restore the original value for the subsequent guest entry (use the current value instead). Exception return instructions copy PSTATE.SS from SPSR_ELx.SS only in certain cases when single-step is enabled (and set it to 0 in other cases). So, the value matters only when the guest enables single-step (and when the guest's Software step state isn't affected by single-step enabled by userspace, practically), though. Fix this by preserving the original PSTATE.SS value while userspace enables single-step, and restoring the value once it is disabled. This fix modifies the behavior of GET_ONE_REG/SET_ONE_REG for the PSTATE.SS while single-step is enabled by userspace. Presently, GET_ONE_REG/SET_ONE_REG gets/sets the current PSTATE.SS value, which KVM will override on the next guest entry (i.e. the value userspace gets/sets is not used for the next guest entry). With this patch, GET_ONE_REG/SET_ONE_REG will get/set the guest's preserved value, which KVM will preserve and try to restore after single-step is disabled. Fixes: `337b99bf7e` ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-2-reijiw@google.com	2022-09-19 10:48:53 +01:00
Marc Zyngier	b04b331502	Merge remote-tracking branch 'arm64/for-next/sysreg' into kvmarm-master/next Merge arm64/for-next/sysreg in order to avoid upstream conflicts due to the never ending sysreg repainting... Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-09-19 09:45:00 +01:00
Mark Rutland	4c0bd995d7	arm64: alternatives: have callbacks take a cap Today, callback alternatives are special-cased within __apply_alternatives(), and are applied alongside patching for system capabilities as ARM64_NCAPS is not part of the boot_capabilities feature mask. This special-casing is less than ideal. Giving special meaning to ARM64_NCAPS for this requires some structures and loops to use ARM64_NCAPS + 1 (AKA ARM64_NPATCHABLE), while others use ARM64_NCAPS. It's also not immediately clear callback alternatives are only applied when applying alternatives for system-wide features. To make this a bit clearer, changes the way that callback alternatives are identified to remove the special-casing of ARM64_NCAPS, and to allow callback alternatives to be associated with a cpucap as with all other alternatives. New cpucaps, ARM64_ALWAYS_BOOT and ARM64_ALWAYS_SYSTEM are added which are always detected alongside boot cpu capabilities and system capabilities respectively. All existing callback alternatives are made to use ARM64_ALWAYS_SYSTEM, and so will be patched at the same point during the boot flow as before. Subsequent patches will make more use of these new cpucaps. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220912162210.3626215-7-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 17:15:03 +01:00
Mark Rutland	34bbfdfb14	arm64: alternatives: kvm: prepare for cap changes The KVM patching callbacks use cpus_have_final_cap() internally within has_vhe(), and subsequent patches will make it invalid to call cpus_have_final_cap() before alternatives patching has completed, and will mean that cpus_have_const_cap() will always fall back to dynamic checks prior to alternatives patching. In preparation for said change, this patch modifies the KVM patching callbacks to use cpus_have_cap() directly. This is not subject to patching, and will dynamically check the cpu_hwcaps array, which is functionally equivalent to the existing behaviour. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220912162210.3626215-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 17:15:02 +01:00
Mark Brown	121a8fc088	arm64/sysreg: Use feature numbering for PMU and SPE revisions Currently the kernel refers to the versions of the PMU and SPE features by the version of the architecture where those features were updated but the ARM refers to them using the FEAT_ names for the features. To improve consistency and help with updating for newer features and since v9 will make our current naming scheme a bit more confusing update the macros identfying features to use the FEAT_ based scheme. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:57 +01:00
Mark Brown	fcf37b38ff	arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64DFR0_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:57 +01:00
Mark Brown	c0357a73fa	arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture The naming scheme the architecture uses for the fields in ID_AA64DFR0_EL1 does not align well with kernel conventions, using as it does a lot of MixedCase in various arrangements. In preparation for automatically generating the defines for this register rename the defines used to match what is in the architecture. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:57 +01:00
Oliver Upton	d5efec7ed8	KVM: arm64: Treat 32bit ID registers as RAZ/WI on 64bit-only system One of the oddities of the architecture is that the AArch64 views of the AArch32 ID registers are UNKNOWN if AArch32 isn't implemented at any EL. Nonetheless, KVM exposes these registers to userspace for the sake of save/restore. It is possible that the UNKNOWN value could differ between systems, leading to a rejected write from userspace. Avoid the issue altogether by handling the AArch32 ID registers as RAZ/WI when on an AArch64-only system. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220913094441.3957645-7-oliver.upton@linux.dev	2022-09-14 11:36:16 +01:00
Oliver Upton	4de06e4c1d	KVM: arm64: Add a visibility bit to ignore user writes We're about to ignore writes to AArch32 ID registers on AArch64-only systems. Add a bit to indicate a register is handled as write ignore when accessed from userspace. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220913094441.3957645-6-oliver.upton@linux.dev	2022-09-14 11:36:16 +01:00
Oliver Upton	5d9a718b64	KVM: arm64: Spin off helper for calling visibility hook No functional change intended. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220913094441.3957645-5-oliver.upton@linux.dev	2022-09-14 11:36:16 +01:00
Oliver Upton	cdd5036d04	KVM: arm64: Drop raz parameter from read_id_reg() There is no longer a need for caller-specified RAZ visibility. Hoist the call to sysreg_visible_as_raz() into read_id_reg() and drop the parameter. No functional change intended. Suggested-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220913094441.3957645-4-oliver.upton@linux.dev	2022-09-14 11:36:16 +01:00
Oliver Upton	4782ccc8ef	KVM: arm64: Remove internal accessor helpers for id regs The internal accessors are only ever called once. Dump out their contents in the caller. No functional change intended. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220913094441.3957645-3-oliver.upton@linux.dev	2022-09-14 11:36:16 +01:00
Oliver Upton	34b4d20399	KVM: arm64: Use visibility hook to treat ID regs as RAZ The generic id reg accessors already handle RAZ registers by way of the visibility hook. Add a visibility hook that returns REG_RAZ unconditionally and throw out the RAZ specific accessors. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220913094441.3957645-2-oliver.upton@linux.dev	2022-09-14 11:36:16 +01:00
Mark Rutland	4b5e694e25	arm64: stacktrace: track hyp stacks in unwinder's address space Currently unwind_next_frame_record() has an optional callback to convert the address space of the FP. This is necessary for the NVHE unwinder, which tracks the stacks in the hyp VA space, but accesses the frame records in the kernel VA space. This is a bit unfortunate since it clutters unwind_next_frame_record(), which will get in the way of future rework. Instead, this patch changes the NVHE unwinder to track the stacks in the kernel's VA space and translate to FP prior to calling unwind_next_frame_record(). This removes the need for the translate_fp() callback, as all unwinders consistently track stacks in the native address space of the unwinder. At the same time, this patch consolidates the generation of the stack addresses behind the stackinfo_get_*() helpers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-10-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:08 +01:00
Mark Rutland	8df137300d	arm64: stacktrace: track all stack boundaries explicitly Currently we call an on_accessible_stack() callback for each step of the unwinder, requiring redundant work to be performed in the core of the unwind loop (e.g. disabling preemption around accesses to per-cpu variables containing stack boundaries). To prevent unwind loops which go through a stack multiple times, we have to track the set of unwound stacks, requiring a stack_type enum which needs to cater for all the stacks of all possible callees. To prevent loops within a stack, we must track the prior FP values. This patch reworks the unwinder to minimize the work in the core of the unwinder, and to remove the need for the stack_type enum. The set of accessible stacks (and their boundaries) are determined at the start of the unwind, and the current stack is tracked during the unwind, with completed stacks removed from the set of accessible stacks. This makes the boundary checks more accurate (e.g. detecting overlapped frame records), and removes the need for separate tracking of the prior FP and visited stacks. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-9-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:08 +01:00
Mark Rutland	bd8abd6883	arm64: stacktrace: remove stack type from fp translator In subsequent patches we'll remove the stack_type enum, and move the FP translation logic out of the raw FP unwind code. In preparation for doing so, this patch removes the type parameter from the FP translation callback, and modifies kvm_nvhe_stack_kern_va() to determine the relevant stack directly. So that kvm_nvhe_stack_kern_va() can use the stackinfo_*() helpers, these are moved earlier in the file, but are not modified in any way. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-8-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:08 +01:00
Mark Rutland	d1f684e46b	arm64: stacktrace: rework stack boundary discovery In subsequent patches we'll want to acquire the stack boundaries ahead-of-time, and we'll need to be able to acquire the relevant stack_info regardless of whether we have an object the happens to be on the stack. This patch replaces the on_XXX_stack() helpers with stackinfo_get_XXX() helpers, with the caller being responsible for the checking whether an object is on a relevant stack. For the moment this is moved into the on_accessible_stack() functions, making these slightly larger; subsequent patches will remove the on_accessible_stack() functions and simplify the logic. The on_irq_stack() and on_task_stack() helpers are kept as these are used by IRQ entry sequences and stackleak respectively. As they're only used as predicates, the stack_info pointer parameter is removed in both cases. As the on_accessible_stack() functions are always passed a non-NULL info pointer, these now update info unconditionally. When updating the type to STACK_TYPE_UNKNOWN, the low/high bounds are also modified, but as these will not be consumed this should have no adverse affect. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-7-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:07 +01:00
Mark Rutland	b532ab5f23	arm64: stacktrace: rename unwind_next_common() -> unwind_next_frame_record() The unwind_next_common() function unwinds a single frame record. There are other unwind steps (e.g. unwinding through trampolines) which are handled in the regular kernel unwinder, and in future there may be other common unwind helpers. Clarify the purpose of unwind_next_common() by renaming it to unwind_next_frame_record(). At the same time, add commentary, and delete the redundant comment at the top of asm/stacktrace/common.h. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:07 +01:00
Mark Rutland	bc8d75212d	arm64: stacktrace: simplify unwind_next_common() Currently unwind_next_common() takes a pointer to a stack_info which is only ever used within unwind_next_common(). Make it a local variable and simplify callers. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:07 +01:00
Mark Brown	5620b4b037	arm64/sysreg: Standardise naming for ID_AA64PFR0_EL1.AdvSIMD constants The architecture refers to the register field identifying advanced SIMD as AdvSIMD but the kernel refers to it as ASIMD. Use the architecture's naming. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-15-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:03 +01:00
Mark Brown	4f8456c319	arm64/sysreg: Standardise naming for ID_AA64PFR0_EL1 constants We generally refer to the baseline feature implemented as _IMP so in preparation for automatic generation of register defines update those for ID_AA64PFR0_EL1 to reflect this. In the case of ASIMD we don't actually use the define so just remove it. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-14-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:03 +01:00
Mark Brown	ca951862ad	arm64/sysreg: Standardise naming for ID_AA64MMFR2_EL1.CnP The kernel refers to ID_AA64MMFR2_EL1.CnP as CNP. In preparation for automatic generation of defines for the system registers bring the naming used by the kernel in sync with that of DDI0487H.a. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-13-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:03 +01:00
Kristina Martsenko	6fcd019359	arm64/sysreg: Standardise naming for ID_AA64MMFR1_EL1 fields In preparation for converting the ID_AA64MMFR1_EL1 system register defines to automatic generation, rename them to follow the conventions used by other automatically generated registers: * Add _EL1 in the register name. * Rename fields to match the names in the ARM ARM: * LOR -> LO * HPD -> HPDS * VHE -> VH * HADBS -> HAFDBS * SPECSEI -> SpecSEI * VMIDBITS -> VMIDBits There should be no functional change as a result of this patch. Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-11-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:03 +01:00
Mark Brown	07d7d848b9	arm64/sysreg: Standardise naming of ID_AA64MMFR0_EL1.ASIDBits For some reason we refer to ID_AA64MMFR0_EL1.ASIDBits as ASID. Add BITS into the name, bringing the naming into sync with DDI0487H.a. Due to the large amount of MixedCase in this register which isn't really consistent with either the kernel style or the majority of the architecture the use of upper case is preserved. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-10-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:02 +01:00
Mark Brown	ed7c138d6f	arm64/sysreg: Standardise naming of ID_AA64MMFR0_EL1.BigEnd For some reason we refer to ID_AA64MMFR0_EL1.BigEnd as BIGENDEL. Remove the EL from the name, bringing the naming into sync with DDI0487H.a. Due to the large amount of MixedCase in this register which isn't really consistent with either the kernel style or the majority of the architecture the use of upper case is preserved. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-9-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:02 +01:00
Mark Brown	6ca2b9ca45	arm64/sysreg: Add _EL1 into ID_AA64PFR1_EL1 constant names Our standard is to include the _EL1 in the constant names for registers but we did not do that for ID_AA64PFR1_EL1, update to do so in preparation for conversion to automatic generation. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-8-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:02 +01:00
Mark Brown	55adc08d7e	arm64/sysreg: Add _EL1 into ID_AA64PFR0_EL1 definition names Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64PFR0_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:02 +01:00
Mark Brown	a957c6be2b	arm64/sysreg: Add _EL1 into ID_AA64MMFR2_EL1 definition names Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64MMFR2_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:02 +01:00
Mark Brown	2d987e64e8	arm64/sysreg: Add _EL1 into ID_AA64MMFR0_EL1 definition names Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64MMFR0_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20220905225425.1871461-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 10:59:02 +01:00
Yosry Ahmed	d38ba8ccd9	KVM: arm64/mmu: count KVM s2 mmu usage in secondary pagetable stats Count the pages used by KVM in arm64 for stage2 mmu in memory stats under secondary pagetable stats (e.g. "SecPageTables" in /proc/meminfo) to give better visibility into the memory consumption of KVM mmu in a similar way to how normal user page tables are accounted. Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220823004639.2387269-5-yosryahmed@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-08-30 07:44:25 -07:00
Paolo Bonzini	959d6c4ae2	KVM/arm64 fixes for 6.0, take #1 - Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK - Tidy-up handling of AArch32 on asymmetric systems -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmL+RsgPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDHrUP/3IYZ0LnYUZBImSU/YTPL5yYzdSVAuMNcdRQ EgvLQKwP+JSrmd7B7wZ4MhY1LheKjpNmmuSqTRsZOHb/yBmnh3+ao5n2gqusYQeJ PCuLYjeF7ZU5fGIrPAW6BW0BFlmYMVbTrC6SEMhZsisBhna44jrrWgkBz9mOsXE/ YcDWv8kP15lisuQzMvnYxmZobbVgSJ3KgQY4/Dp6vyKMR8ULujCxziFV5R4RD0xP Ay8wnxtMUymx9P6sZsd6Vwi5h1MUXOOoI4He7+8ejIfoMOManIMOIq4PDQhINwQv tGysDmQavftSbUkXJ1VB+8cJ/9KufzwKxFoc5WqGk1y14QulyBNyb/XR3UtORe1n bitINTTkqibHY6fdQJA7z1sD0jaEAh/xNwO1Gq0BS40o4XVQDv2BjdQir9TEdlZO tsZKVaFpN3UZe681ru12No8YzQDhpuLH65gDHDjLaftH99WKsrSwMZLoEjqZTlM/ vH/9acd4UB+9zMGTpN2tJ//2cq6g3JoUC7jJIQB1oGStHX0/7AxKMlabR2xHmt9E 4CmJND9RLK6+yEelagxOAYMQfnCdj6pW/3bhvAmsZWh0t3fNxCeBBXFr2I5os+E9 hV0FYx4PG9GtorMSqudCDsP83SDIxCluNZ5iM8t1suSn3dFhk5bDFChS86XGuqQe XxHQ6JTF =r5iq -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.0, take #1 - Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK - Tidy-up handling of AArch32 on asymmetric systems	2022-08-19 05:43:53 -04:00
Chao Peng	20ec3ebd70	KVM: Rename mmu_notifier_* to mmu_invalidate_* The motivation of this renaming is to make these variables and related helper functions less mmu_notifier bound and can also be used for non mmu_notifier based page invalidation. mmu_invalidate_* was chosen to better describe the purpose of 'invalidating' a page that those variables are used for. - mmu_notifier_seq/range_start/range_end are renamed to mmu_invalidate_seq/range_start/range_end. - mmu_notifier_retry{_hva} helper functions are renamed to mmu_invalidate_retry{_hva}. - mmu_notifier_count is renamed to mmu_invalidate_in_progress to avoid confusion with mn_active_invalidate_count. - While here, also update kvm_inc/dec_notifier_count() to kvm_mmu_invalidate_begin/end() to match the change for mmu_notifier_count. No functional change intended. Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Message-Id: <20220816125322.1110439-3-chao.p.peng@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-08-19 04:05:41 -04:00
Oliver Upton	b10d86fb8e	KVM: arm64: Reject 32bit user PSTATE on asymmetric systems KVM does not support AArch32 EL0 on asymmetric systems. To that end, prevent userspace from configuring a vCPU in such a state through setting PSTATE. It is already ABI that KVM rejects such a write on a system where AArch32 EL0 is unsupported. Though the kernel's definition of a 32bit system changed in commit `2122a83331` ("arm64: Allow mismatched 32-bit EL0 support"), KVM's did not. Fixes: `2122a83331` ("arm64: Allow mismatched 32-bit EL0 support") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220816192554.1455559-3-oliver.upton@linux.dev	2022-08-17 10:29:07 +01:00
Oliver Upton	f3c6efc72f	KVM: arm64: Treat PMCR_EL1.LC as RES1 on asymmetric systems KVM does not support AArch32 on asymmetric systems. To that end, enforce AArch64-only behavior on PMCR_EL1.LC when on an asymmetric system. Fixes: `2122a83331` ("arm64: Allow mismatched 32-bit EL0 support") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220816192554.1455559-2-oliver.upton@linux.dev	2022-08-17 10:29:07 +01:00
Linus Torvalds	3bd6e5854b	asm-generic: updates for 6.0 There are three independent sets of changes: - Sai Prakash Ranjan adds tracing support to the asm-generic version of the MMIO accessors, which is intended to help understand problems with device drivers and has been part of Qualcomm's vendor kernels for many years. - A patch from Sebastian Siewior to rework the handling of IRQ stacks in softirqs across architectures, which is needed for enabling PREEMPT_RT. - The last patch to remove the CONFIG_VIRT_TO_BUS option and some of the code behind that, after the last users of this old interface made it in through the netdev, scsi, media and staging trees. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD 1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa 0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7 ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q= =aTR0 -----END PGP SIGNATURE----- Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three independent sets of changes: - Sai Prakash Ranjan adds tracing support to the asm-generic version of the MMIO accessors, which is intended to help understand problems with device drivers and has been part of Qualcomm's vendor kernels for many years - A patch from Sebastian Siewior to rework the handling of IRQ stacks in softirqs across architectures, which is needed for enabling PREEMPT_RT - The last patch to remove the CONFIG_VIRT_TO_BUS option and some of the code behind that, after the last users of this old interface made it in through the netdev, scsi, media and staging trees" * tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: uapi: asm-generic: fcntl: Fix typo 'the the' in comment arch//: remove CONFIG_VIRT_TO_BUS soc: qcom: geni: Disable MMIO tracing for GENI SE serial: qcom_geni_serial: Disable MMIO tracing for geni serial asm-generic/io: Add logging support for MMIO accessors KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM lib: Add register read/write tracing support drm/meson: Fix overflow implicit truncation warnings irqchip/tegra: Fix overflow implicit truncation warnings coresight: etm4x: Use asm-generic IO memory barriers arm64: io: Use asm-generic high level MMIO accessors arch/: Disable softirq stacks on PREEMPT_RT.	2022-08-05 10:07:23 -07:00
Linus Torvalds	7c5c3a6177	ARM: * Unwinder implementations for both nVHE modes (classic and protected), complete with an overflow stack * Rework of the sysreg access from userspace, with a complete rewrite of the vgic-v3 view to allign with the rest of the infrastructure * Disagregation of the vcpu flags in separate sets to better track their use model. * A fix for the GICv2-on-v3 selftest * A small set of cosmetic fixes RISC-V: * Track ISA extensions used by Guest using bitmap * Added system instruction emulation framework * Added CSR emulation framework * Added gfp_custom flag in struct kvm_mmu_memory_cache * Added G-stage ioremap() and iounmap() functions * Added support for Svpbmt inside Guest s390: * add an interface to provide a hypervisor dump for secure guests * improve selftests to use TAP interface * enable interpretive execution of zPCI instructions (for PCI passthrough) * First part of deferred teardown * CPU Topology * PV attestation * Minor fixes x86: * Permit guests to ignore single-bit ECC errors * Intel IPI virtualization * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS * PEBS virtualization * Simplify PMU emulation by just using PERF_TYPE_RAW events * More accurate event reinjection on SVM (avoid retrying instructions) * Allow getting/setting the state of the speaker port data bit * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent * "Notify" VM exit (detect microarchitectural hangs) for Intel * Use try_cmpxchg64 instead of cmpxchg64 * Ignore benign host accesses to PMU MSRs when PMU is disabled * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior * Allow NX huge page mitigation to be disabled on a per-vm basis * Port eager page splitting to shadow MMU as well * Enable CMCI capability by default and handle injected UCNA errors * Expose pid of vcpu threads in debugfs * x2AVIC support for AMD * cleanup PIO emulation * Fixes for LLDT/LTR emulation * Don't require refcounted "struct page" to create huge SPTEs * Miscellaneous cleanups: MCE MSR emulation Use separate namespaces for guest PTEs and shadow PTEs bitmasks PIO emulation Reorganize rmap API, mostly around rmap destruction Do not workaround very old KVM bugs for L0 that runs with nesting enabled new selftests API for CPUID Generic: * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmLnyo4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtQQf/XjVWiRcWLPR9dqzRM/vvRXpiG+UL jU93R7m6ma99aqTtrxV/AE+kHgamBlma3Cwo+AcWk9uCVNbIhFjv2YKg6HptKU0e oJT3zRYp+XIjEo7Kfw+TwroZbTlG6gN83l1oBLFMqiFmHsMLnXSI2mm8MXyi3dNB vR2uIcTAl58KIprqNNsYJ2dNn74ogOMiXYx9XzoA9/5Xb6c0h4rreHJa5t+0s9RO Gz7Io3PxumgsbJngjyL1Ve5oxhlIAcZA8DU0PQmjxo3eS+k6BcmavGFd45gNL5zg iLpCh4k86spmzh8CWkAAwWPQE4dZknK6jTctJc0OFVad3Z7+X7n0E8TFrA== =PM8o -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "Quite a large pull request due to a selftest API overhaul and some patches that had come in too late for 5.19. ARM: - Unwinder implementations for both nVHE modes (classic and protected), complete with an overflow stack - Rework of the sysreg access from userspace, with a complete rewrite of the vgic-v3 view to allign with the rest of the infrastructure - Disagregation of the vcpu flags in separate sets to better track their use model. - A fix for the GICv2-on-v3 selftest - A small set of cosmetic fixes RISC-V: - Track ISA extensions used by Guest using bitmap - Added system instruction emulation framework - Added CSR emulation framework - Added gfp_custom flag in struct kvm_mmu_memory_cache - Added G-stage ioremap() and iounmap() functions - Added support for Svpbmt inside Guest s390: - add an interface to provide a hypervisor dump for secure guests - improve selftests to use TAP interface - enable interpretive execution of zPCI instructions (for PCI passthrough) - First part of deferred teardown - CPU Topology - PV attestation - Minor fixes x86: - Permit guests to ignore single-bit ECC errors - Intel IPI virtualization - Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS - PEBS virtualization - Simplify PMU emulation by just using PERF_TYPE_RAW events - More accurate event reinjection on SVM (avoid retrying instructions) - Allow getting/setting the state of the speaker port data bit - Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent - "Notify" VM exit (detect microarchitectural hangs) for Intel - Use try_cmpxchg64 instead of cmpxchg64 - Ignore benign host accesses to PMU MSRs when PMU is disabled - Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior - Allow NX huge page mitigation to be disabled on a per-vm basis - Port eager page splitting to shadow MMU as well - Enable CMCI capability by default and handle injected UCNA errors - Expose pid of vcpu threads in debugfs - x2AVIC support for AMD - cleanup PIO emulation - Fixes for LLDT/LTR emulation - Don't require refcounted "struct page" to create huge SPTEs - Miscellaneous cleanups: - MCE MSR emulation - Use separate namespaces for guest PTEs and shadow PTEs bitmasks - PIO emulation - Reorganize rmap API, mostly around rmap destruction - Do not workaround very old KVM bugs for L0 that runs with nesting enabled - new selftests API for CPUID Generic: - Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache - new selftests API using struct kvm_vcpu instead of a (vm, id) tuple" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits) selftests: kvm: set rax before vmcall selftests: KVM: Add exponent check for boolean stats selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test selftests: KVM: Check stat name before other fields KVM: x86/mmu: remove unused variable RISC-V: KVM: Add support for Svpbmt inside Guest/VM RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap() RISC-V: KVM: Add G-stage ioremap() and iounmap() functions KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache RISC-V: KVM: Add extensible CSR emulation framework RISC-V: KVM: Add extensible system instruction emulation framework RISC-V: KVM: Factor-out instruction emulation into separate sources RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function RISC-V: KVM: Fix variable spelling mistake RISC-V: KVM: Improve ISA extension by using a bitmap KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs() KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT KVM: x86: Do not block APIC write for non ICR registers ...	2022-08-04 14:59:54 -07:00
Linus Torvalds	0cec3f24a7	arm64 updates for 5.20 - Remove unused generic cpuidle support (replaced by PSCI version) - Fix documentation describing the kernel virtual address space - Handling of some new CPU errata in Arm implementations - Rework of our exception table code in preparation for handling machine checks (i.e. RAS errors) more gracefully - Switch over to the generic implementation of ioremap() - Fix lockdep tracking in NMI context - Instrument our memory barrier macros for KCSAN - Rework of the kPTI G->nG page-table repainting so that the MMU remains enabled and the boot time is no longer slowed to a crawl for systems which require the late remapping - Enable support for direct swapping of 2MiB transparent huge-pages on systems without MTE - Fix handling of MTE tags with allocating new pages with HW KASAN - Expose the SMIDR register to userspace via sysfs - Continued rework of the stack unwinder, particularly improving the behaviour under KASAN - More repainting of our system register definitions to match the architectural terminology - Improvements to the layout of the vDSO objects - Support for allocating additional bits of HWCAP2 and exposing FEAT_EBF16 to userspace on CPUs that support it - Considerable rework and optimisation of our early boot code to reduce the need for cache maintenance and avoid jumping in and out of the kernel when handling relocation under KASLR - Support for disabling SVE and SME support on the kernel command-line - Support for the Hisilicon HNS3 PMU - Miscellanous cleanups, trivial updates and minor fixes -----BEGIN PGP SIGNATURE----- iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmLeccUQHHdpbGxAa2Vy bmVsLm9yZwAKCRC3rHDchMFjNCysB/4ml92RJLhVwRAofbtFfVgVz3JLTSsvob9x Z7FhNDxfM/G32wKtOHU9tHkGJ+PMVWOPajukzxkMhxmilfTyHBbiisNWVRjKQxj4 wrd07DNXPIv3bi8SWzS1y2y8ZqujZWjNJlX8SUCzEoxCVtuNKwrh96kU1jUjrkFZ kBo4E4wBWK/qW29nClGSCgIHRQNJaB/jvITlQhkqIb0pwNf3sAUzW7QoF1iTZWhs UswcLh/zC4q79k9poegdCt8chV5OBDLtLPnMxkyQFvsLYRp3qhyCSQQY/BxvO5JS jT9QR6d+1ewET9BFhqHlIIuOTYBCk3xn/PR9AucUl+ZBQd2tO4B1 =LVH0 -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "Highlights include a major rework of our kPTI page-table rewriting code (which makes it both more maintainable and considerably faster in the cases where it is required) as well as significant changes to our early boot code to reduce the need for data cache maintenance and greatly simplify the KASLR relocation dance. Summary: - Remove unused generic cpuidle support (replaced by PSCI version) - Fix documentation describing the kernel virtual address space - Handling of some new CPU errata in Arm implementations - Rework of our exception table code in preparation for handling machine checks (i.e. RAS errors) more gracefully - Switch over to the generic implementation of ioremap() - Fix lockdep tracking in NMI context - Instrument our memory barrier macros for KCSAN - Rework of the kPTI G->nG page-table repainting so that the MMU remains enabled and the boot time is no longer slowed to a crawl for systems which require the late remapping - Enable support for direct swapping of 2MiB transparent huge-pages on systems without MTE - Fix handling of MTE tags with allocating new pages with HW KASAN - Expose the SMIDR register to userspace via sysfs - Continued rework of the stack unwinder, particularly improving the behaviour under KASAN - More repainting of our system register definitions to match the architectural terminology - Improvements to the layout of the vDSO objects - Support for allocating additional bits of HWCAP2 and exposing FEAT_EBF16 to userspace on CPUs that support it - Considerable rework and optimisation of our early boot code to reduce the need for cache maintenance and avoid jumping in and out of the kernel when handling relocation under KASLR - Support for disabling SVE and SME support on the kernel command-line - Support for the Hisilicon HNS3 PMU - Miscellanous cleanups, trivial updates and minor fixes" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits) arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr} arm64: fix KASAN_INLINE arm64/hwcap: Support FEAT_EBF16 arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long arm64/hwcap: Document allocation of upper bits of AT_HWCAP arm64: enable THP_SWAP for arm64 arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52 arm64: errata: Remove AES hwcap for COMPAT tasks arm64: numa: Don't check node against MAX_NUMNODES drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node() docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags" mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON mm: kasan: Skip unpoisoning of user pages mm: kasan: Ensure the tags are visible before the tag in page->flags drivers/perf: hisi: add driver for HNS3 PMU drivers/perf: hisi: Add description for HNS3 PMU driver drivers/perf: riscv_pmu_sbi: perf format perf/arm-cci: Use the bitmap API to allocate bitmaps ...	2022-08-01 10:37:00 -07:00
Paolo Bonzini	c4edb2babc	Merge tag 'kvmarm-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.20: - Unwinder implementations for both nVHE modes (classic and protected), complete with an overflow stack - Rework of the sysreg access from userspace, with a complete rewrite of the vgic-v3 view to allign with the rest of the infrastructure - Disagregation of the vcpu flags in separate sets to better track their use model. - A fix for the GICv2-on-v3 selftest - A small set of cosmetic fixes	2022-08-01 03:24:12 -04:00
Paolo Bonzini	63f4b21041	Merge remote-tracking branch 'kvm/next' into kvm-next-5.20 KVM/s390, KVM/x86 and common infrastructure changes for 5.20 x86: * Permit guests to ignore single-bit ECC errors * Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache * Intel IPI virtualization * Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS * PEBS virtualization * Simplify PMU emulation by just using PERF_TYPE_RAW events * More accurate event reinjection on SVM (avoid retrying instructions) * Allow getting/setting the state of the speaker port data bit * Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent * "Notify" VM exit (detect microarchitectural hangs) for Intel * Cleanups for MCE MSR emulation s390: * add an interface to provide a hypervisor dump for secure guests * improve selftests to use TAP interface * enable interpretive execution of zPCI instructions (for PCI passthrough) * First part of deferred teardown * CPU Topology * PV attestation * Minor fixes Generic: * new selftests API using struct kvm_vcpu instead of a (vm, id) tuple x86: * Use try_cmpxchg64 instead of cmpxchg64 * Bugfixes * Ignore benign host accesses to PMU MSRs when PMU is disabled * Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior * x86/MMU: Allow NX huge pages to be disabled on a per-vm basis * Port eager page splitting to shadow MMU as well * Enable CMCI capability by default and handle injected UCNA errors * Expose pid of vcpu threads in debugfs * x2AVIC support for AMD * cleanup PIO emulation * Fixes for LLDT/LTR emulation * Don't require refcounted "struct page" to create huge SPTEs x86 cleanups: * Use separate namespaces for guest PTEs and shadow PTEs bitmasks * PIO emulation * Reorganize rmap API, mostly around rmap destruction * Do not workaround very old KVM bugs for L0 that runs with nesting enabled * new selftests API for CPUID	2022-08-01 03:21:00 -04:00
Marc Zyngier	0982c8d859	Merge branch kvm-arm64/nvhe-stacktrace into kvmarm-master/next * kvm-arm64/nvhe-stacktrace: (27 commits) : . : Add an overflow stack to the nVHE EL2 code, allowing : the implementation of an unwinder, courtesy of : Kalesh Singh. From the cover letter (slightly edited): : : "nVHE has two modes of operation: protected (pKVM) and unprotected : (conventional nVHE). Depending on the mode, a slightly different approach : is used to dump the hypervisor stacktrace but the core unwinding logic : remains the same. : : * Protected nVHE (pKVM) stacktraces: : : In protected nVHE mode, the host cannot directly access hypervisor memory. : : The hypervisor stack unwinding happens in EL2 and is made accessible to : the host via a shared buffer. Symbolizing and printing the stacktrace : addresses is delegated to the host and happens in EL1. : : * Non-protected (Conventional) nVHE stacktraces: : : In non-protected mode, the host is able to directly access the hypervisor : stack pages. : : The hypervisor stack unwinding and dumping of the stacktrace is performed : by the host in EL1, as this avoids the memory overhead of setting up : shared buffers between the host and hypervisor." : : Additional patches from Oliver Upton and Marc Zyngier, tidying up : the initial series. : . arm64: Update 'unwinder howto' KVM: arm64: Don't open code ARRAY_SIZE() KVM: arm64: Move nVHE-only helpers into kvm/stacktrace.c KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit KVM: arm64: Move PROTECTED_NVHE_STACKTRACE around KVM: arm64: Introduce pkvm_dump_backtrace() KVM: arm64: Implement protected nVHE hyp stack unwinder KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace KVM: arm64: Stub implementation of pKVM HYP stack unwinder KVM: arm64: Allocate shared pKVM hyp stacktrace buffers KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig KVM: arm64: Introduce hyp_dump_backtrace() KVM: arm64: Implement non-protected nVHE hyp stack unwinder KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder KVM: arm64: On stack overflow switch to hyp overflow_stack arm64: stacktrace: Add description of stacktrace/common.h arm64: stacktrace: Factor out common unwind() arm64: stacktrace: Handle frame pointer from different address spaces ... Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-27 18:33:27 +01:00
Oliver Upton	62ae21627a	KVM: arm64: Don't open code ARRAY_SIZE() Use ARRAY_SIZE() instead of an open-coded version. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Link: https://lore.kernel.org/r/20220727142906.1856759-6-maz@kernel.org	2022-07-27 18:18:38 +01:00
Marc Zyngier	0e773da1e6	KVM: arm64: Move nVHE-only helpers into kvm/stacktrace.c kvm_nvhe_stack_kern_va() only makes sense as part of the nVHE unwinder, so simply move it there. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20220727142906.1856759-5-maz@kernel.org	2022-07-27 18:18:03 +01:00
Marc Zyngier	4e00532f37	KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions Having multiple versions of on_accessible_stack() (one per unwinder) makes it very hard to reason about what is used where due to the complexity of the various includes, the forward declarations, and the reliance on everything being 'inline'. Instead, move the code back where it should be. Each unwinder implements: - on_accessible_stack() as well as the helpers it depends on, - unwind()/unwind_next(), as they pass on_accessible_stack as a parameter to unwind_next_common() (which is the only common code here) This hardly results in any duplication, and makes it much easier to reason about the code. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20220727142906.1856759-4-maz@kernel.org	2022-07-27 18:18:03 +01:00
Marc Zyngier	9f5fee05f6	KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit The unwinding code doesn't really belong to the exit handling code. Instead, move it to a file (conveniently named stacktrace.c to confuse the reviewer), and move all the stacktrace-related stuff there. It will be joined by more code very soon. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20220727142906.1856759-3-maz@kernel.org	2022-07-27 18:18:03 +01:00
Marc Zyngier	03fe9cd05b	KVM: arm64: Move PROTECTED_NVHE_STACKTRACE around Make the dependency with EL2_DEBUG more obvious by moving the stacktrace configurtion after it. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20220727142906.1856759-2-maz@kernel.org	2022-07-27 18:18:03 +01:00
Kalesh Singh	3a7e1b55aa	KVM: arm64: Introduce pkvm_dump_backtrace() Dumps the pKVM hypervisor backtrace from EL1 by reading the unwinded addresses from the shared stacktrace buffer. The nVHE hyp backtrace is dumped on hyp_panic(), before panicking the host. [ 111.623091] kvm [367]: nVHE call trace: [ 111.623215] kvm [367]: [<ffff8000090a6570>] __kvm_nvhe_hyp_panic+0xac/0xf8 [ 111.623448] kvm [367]: [<ffff8000090a65cc>] __kvm_nvhe_hyp_panic_bad_stack+0x10/0x10 [ 111.623642] kvm [367]: [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34 . . . [ 111.640366] kvm [367]: [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34 [ 111.640467] kvm [367]: [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34 [ 111.640574] kvm [367]: [<ffff8000090a5de4>] __kvm_nvhe___kvm_vcpu_run+0x30/0x40c [ 111.640676] kvm [367]: [<ffff8000090a8b64>] __kvm_nvhe_handle___kvm_vcpu_run+0x30/0x48 [ 111.640778] kvm [367]: [<ffff8000090a88b8>] __kvm_nvhe_handle_trap+0xc4/0x128 [ 111.640880] kvm [367]: [<ffff8000090a7864>] __kvm_nvhe___host_exit+0x64/0x64 [ 111.640996] kvm [367]: ---[ end nVHE call trace ]--- Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-18-kaleshsingh@google.com	2022-07-26 10:51:39 +01:00
Kalesh Singh	871c5d9314	KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace In protected nVHE mode, the host cannot access private owned hypervisor memory. Also the hypervisor aims to remains simple to reduce the attack surface and does not provide any printk support. For the above reasons, the approach taken to provide hypervisor stacktraces in protected mode is: 1) Unwind and save the hyp stack addresses in EL2 to a shared buffer with the host (done in this patch). 2) Delegate the dumping and symbolization of the addresses to the host in EL1 (later patch in the series). On hyp_panic(), the hypervisor prepares the stacktrace before returning to the host. Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-16-kaleshsingh@google.com	2022-07-26 10:51:17 +01:00
Kalesh Singh	6928bcc84b	KVM: arm64: Allocate shared pKVM hyp stacktrace buffers In protected nVHE mode the host cannot directly access hypervisor memory, so we will dump the hypervisor stacktrace to a shared buffer with the host. The minimum size for the buffer required, assuming the min frame size of [x29, x30] (2 * sizeof(long)), is half the combined size of the hypervisor and overflow stacks plus an additional entry to delimit the end of the stacktrace. The stacktrace buffers are used later in the series to dump the nVHE hypervisor stacktrace when using protected-mode. Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-14-kaleshsingh@google.com	2022-07-26 10:50:12 +01:00
Kalesh Singh	72adac1bd2	KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig This can be used to disable stacktrace for the protected KVM nVHE hypervisor, in order to save on the associated memory usage. This option is disabled by default, since protected KVM is not widely used on platforms other than Android currently. Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-13-kaleshsingh@google.com	2022-07-26 10:50:01 +01:00
Kalesh Singh	314a61dc31	KVM: arm64: Introduce hyp_dump_backtrace() In non-protected nVHE mode, unwinds and dumps the hypervisor backtrace from EL1. This is possible beacause the host can directly access the hypervisor stack pages in non-protected mode. The nVHE backtrace is dumped on hyp_panic(), before panicking the host. [ 101.498183] kvm [377]: nVHE call trace: [ 101.498363] kvm [377]: [<ffff8000090a6570>] __kvm_nvhe_hyp_panic+0xac/0xf8 [ 101.499045] kvm [377]: [<ffff8000090a65cc>] __kvm_nvhe_hyp_panic_bad_stack+0x10/0x10 [ 101.499498] kvm [377]: [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34 . . . [ 101.524929] kvm [377]: [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34 [ 101.525062] kvm [377]: [<ffff8000090a61e4>] __kvm_nvhe_recursive_death+0x24/0x34 [ 101.525195] kvm [377]: [<ffff8000090a5de4>] __kvm_nvhe___kvm_vcpu_run+0x30/0x40c [ 101.525333] kvm [377]: [<ffff8000090a8b64>] __kvm_nvhe_handle___kvm_vcpu_run+0x30/0x48 [ 101.525468] kvm [377]: [<ffff8000090a88b8>] __kvm_nvhe_handle_trap+0xc4/0x128 [ 101.525602] kvm [377]: [<ffff8000090a7864>] __kvm_nvhe___host_exit+0x64/0x64 [ 101.525745] kvm [377]: ---[ end nVHE call trace ]--- Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-12-kaleshsingh@google.com	2022-07-26 10:49:50 +01:00
Kalesh Singh	db129d486e	KVM: arm64: Implement non-protected nVHE hyp stack unwinder Implements the common framework necessary for unwind() to work for non-protected nVHE mode: - on_accessible_stack() - on_overflow_stack() - unwind_next() Non-protected nVHE unwind() is used to unwind and dump the hypervisor stacktrace by the host in EL1 Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-11-kaleshsingh@google.com	2022-07-26 10:49:39 +01:00
Kalesh Singh	879e5ac7b2	KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace In non-protected nVHE mode (non-pKVM) the host can directly access hypervisor memory; and unwinding of the hypervisor stacktrace is done from EL1 to save on memory for shared buffers. To unwind the hypervisor stack from EL1 the host needs to know the starting point for the unwind and information that will allow it to translate hypervisor stack addresses to the corresponding kernel addresses. This patch sets up this book keeping. It is made use of later in the series. Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-10-kaleshsingh@google.com	2022-07-26 10:49:27 +01:00
Kalesh Singh	548ec3336f	KVM: arm64: On stack overflow switch to hyp overflow_stack On hyp stack overflow switch to 16-byte aligned secondary stack. This provides us stack space to better handle overflows; and is used in a subsequent patch to dump the hypervisor stacktrace. Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220726073750.3219117-8-kaleshsingh@google.com	2022-07-26 10:49:05 +01:00
Marc Zyngier	ae98a4a989	Merge branch kvm-arm64/sysreg-cleanup-5.20 into kvmarm-master/next * kvm-arm64/sysreg-cleanup-5.20: : . : Long overdue cleanup of the sysreg userspace access, : with extra scrubbing on the vgic side of things. : From the cover letter: : : "Schspa Shi recently reported[1] that some of the vgic code interacting : with userspace was reading uninitialised stack memory, and although : that read wasn't used any further, it prompted me to revisit this part : of the code. : : Needless to say, this area of the kernel is pretty crufty, and shows a : bunch of issues in other parts of the KVM/arm64 infrastructure. This : series tries to remedy a bunch of them: : : - Sanitise the way we deal with sysregs from userspace: at the moment, : each and every .set_user/.get_user callback has to implement its own : userspace accesses (directly or indirectly). It'd be much better if : that was centralised so that we can reason about it. : : - Enforce that all AArch64 sysregs are 64bit. Always. This was sort of : implied by the code, but it took some effort to convince myself that : this was actually the case. : : - Move the vgic-v3 sysreg userspace accessors to the userspace : callbacks instead of hijacking the vcpu trap callback. This allows : us to reuse the sysreg infrastructure. : : - Consolidate userspace accesses for both GICv2, GICv3 and common code : as much as possible. : : - Cleanup a bunch of not-very-useful helpers, tidy up some of the code : as we touch it. : : [1] https://lore.kernel.org/r/m2h740zz1i.fsf@gmail.com" : . KVM: arm64: Get rid or outdated comments KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg() KVM: arm64: Get rid of find_reg_by_id() KVM: arm64: vgic: Tidy-up calls to vgic_{get,set}_common_attr() KVM: arm64: vgic: Consolidate userspace access for base address setting KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting KVM: arm64: vgic: Use {get,put}_user() instead of copy_{from.to}_user KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace KVM: arm64: vgic-v3: Convert userspace accessors over to FIELD_GET/FIELD_PREP KVM: arm64: vgic-v3: Make the userspace accessors use sysreg API KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess() KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr() KVM: arm64: Get rid of reg_from/to_user() KVM: arm64: Consolidate sysreg userspace accesses KVM: arm64: Rely on index_to_param() for size checks on userspace access KVM: arm64: Introduce generic get_user/set_user helpers for system registers KVM: arm64: Reorder handling of invariant sysregs from userspace KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:58 +01:00
Marc Zyngier	4274d42716	KVM: arm64: Get rid or outdated comments Once apon a time, the 32bit KVM/arm port was the reference, while the arm64 version was the new kid on the block, without a clear future... This was a long time ago. "The times, they are a-changing." Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:34 +01:00
Marc Zyngier	c5332898dc	KVM: arm64: Descope kvm_arm_sys_reg_{get,set}_reg() Having kvm_arm_sys_reg_get_reg and co in kvm_host.h gives the impression that these functions are free to be called from anywhere. Not quite. They really are tied to out internal sysreg handling, and they would be better off in the sys_regs.h header, which is private. kvm_host.h could also get a bit of a diet, so let's just do that. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	f6dddbb255	KVM: arm64: Get rid of find_reg_by_id() This helper doesn't have a user anymore, let's get rid of it. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	619064afa9	KVM: arm64: vgic: Tidy-up calls to vgic_{get,set}_common_attr() The userspace accessors have an early call to vgic_{get,set}_common_attr() that makes the code hard to follow. Move it to the default: clause of the decoding switch statement, which results in a nice cleanup. This requires us to move the handling of the pending table into the common handling, even if it is strictly a GICv3 feature (it has the benefit of keeping the whole control group handling in the same function). Also cleanup vgic_v3_{get,set}_attr() while we're at it, deduplicating the calls to vgic_v3_attr_regs_access(). Suggested-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	4b85080f4e	KVM: arm64: vgic: Consolidate userspace access for base address setting Align kvm_vgic_addr() with the rest of the code by moving the userspace accesses into it. kvm_vgic_addr() is also made static. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	9f968c9266	KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting We carry a legacy interface to set the base addresses for GICv2. As this is currently plumbed into the same handling code as the modern interface, it limits the evolution we can make there. Add a helper dedicated to this handling, with a view of maybe removing this in the future. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	d7df6f282d	KVM: arm64: vgic: Use {get,put}_user() instead of copy_{from.to}_user Tidy-up vgic_get_common_attr() and vgic_set_common_attr() to use {get,put}_user() instead of the more complex (and less type-safe) copy_{from,to}_user(). Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	7e9f723c2a	KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers Align the GICv2 MMIO accesses from userspace with the way the GICv3 code is now structured. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	e1246f3f2d	KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers For userspace accesses to GICv3 MMIO registers (and related data), vgic_v3_{get,set}_attr are littered with {get,put}_user() calls, making it hard to audit and reason about. Consolidate all userspace accesses in vgic_v3_attr_regs_access(), making the code far simpler to audit. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	38cf0bb762	KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace Despite the userspace ABI clearly defining the bits dealt with by KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO as a __u32, the kernel uses a u64. Use a u32 to match the userspace ABI, which will subsequently lead to some simplifications. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	71c3c7753c	KVM: arm64: vgic-v3: Convert userspace accessors over to FIELD_GET/FIELD_PREP The GICv3 userspace accessors are all about dealing with conversion between fields from architectural registers and internal representations. However, and owing to the age of this code, the accessors use a combination of shift/mask that is hard to read. It is nonetheless easy to make it better by using the FIELD_{GET,PREP} macros that solely rely on a mask. This results in somewhat nicer looking code, and is probably easier to maintain. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	cbcf14dd23	KVM: arm64: vgic-v3: Make the userspace accessors use sysreg API The vgic-v3 sysreg accessors have been ignored as the rest of the sysreg internal API was evolving, and are stuck with the .access method (which is normally reserved to the guest's own access) for the userspace accesses (which should use the .set/.get_user() methods). Catch up with the program and repaint all the accessors so that they fit into the normal userspace model, and plug the result into the helpers that have been introduced earlier. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	db25081e14	KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess() In order to start making the vgic sysreg access from userspace similar to all the other sysregs, push the userspace memory access one level down into vgic_v3_cpu_sysregs_uaccess(). The next step will be to rely on the sysreg infrastructure to perform this task. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	b61fc0857a	KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr() Finding out whether a sysreg exists has little to do with that register being accessed, so drop the is_write parameter. Also, the reg pointer is completely unused, and we're better off just passing the attr pointer to the function. This result in a small cleanup of the calling site, with a new helper converting the vGIC view of a sysreg into the canonical one (this is purely cosmetic, as the encoding is the same). Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	5a420ed964	KVM: arm64: Get rid of reg_from/to_user() These helpers are only used by the invariant stuff now, and while they pretend to support non-64bit registers, this only serves as a way to scare the casual reviewer... Replace these helpers with our good friends get/put_user(), and don't look back. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	978ceeb3e4	KVM: arm64: Consolidate sysreg userspace accesses Until now, the .set_user and .get_user callbacks have to implement (directly or not) the userspace memory accesses. Although this gives us maximem flexibility, this is also a maintenance burden, making it hard to audit, and I'd feel much better if it was all located in a single place. So let's do just that, simplifying most of the function signatures in the process (the callbacks are now only concerned with the data itself, and not with userspace). Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	e48407ff97	KVM: arm64: Rely on index_to_param() for size checks on userspace access index_to_param() already checks that we use 64bit accesses for all registers accessed from userspace. However, we have extra checks in other places (such as index_to_params), which is pretty confusing. Get rid off these redundant checks. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	ba23aec9f4	KVM: arm64: Introduce generic get_user/set_user helpers for system registers The userspace access to the system registers is done using helpers that hardcode the table that is looked up. extract some generic helpers from this, moving the handling of hidden sysregs into the core code. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	1deeffb559	KVM: arm64: Reorder handling of invariant sysregs from userspace In order to allow some further refactor of the sysreg helpers, move the handling of invariant sysreg to occur before we handle all the other ones. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	da8d120fba	KVM: arm64: Add get_reg_by_id() as a sys_reg_desc retrieving helper find_reg_by_id() requires a sys_reg_param as input, which most users provide as a on-stack variable, but don't make any use of the result. Provide a helper that doesn't have this requirement and simplify the callers (all but one). Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:32 +01:00
Marc Zyngier	aeb7942b64	Merge branch kvm-arm64/misc-5.20 into kvmarm-master/next * kvm-arm64/misc-5.20: : . : Misc fixes for 5.20: : : - Tidy up the hyp/nvhe Makefile : : - Fix functions pointlessly returning a void value : : - Fix vgic_init selftest to handle the GICv3-on-v3 case : : - Fix hypervisor symbolisation when CONFIG_RANDOMIZE_BASE=y : . KVM: arm64: Fix hypervisor address symbolization KVM: arm64: selftests: Add support for GICv2 on v3 KVM: arm64: Don't return from void function KVM: arm64: nvhe: Add intermediates to 'targets' instead of extra-y KVM: arm64: nvhe: Rename confusing obj-y Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:45:22 +01:00
Kalesh Singh	ed6313a93f	KVM: arm64: Fix hypervisor address symbolization With CONFIG_RANDOMIZE_BASE=y vmlinux addresses will resolve incorrectly from kallsyms. Fix this by adding the KASLR offset before printing the symbols. Fixes: `6ccf9cb557` ("KVM: arm64: Symbolize the nVHE HYP addresses") Reported-by: Fuad Tabba <tabba@google.com> Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220715235824.2549012-1-kaleshsingh@google.com	2022-07-17 11:43:40 +01:00
Quentin Perret	1c3ace2b8b	KVM: arm64: Don't return from void function Although harmless, the return statement in kvm_unexpected_el2_exception is rather confusing as the function itself has a void return type. The C standard is also pretty clear that "A return statement with an expression shall not appear in a function whose return type is void". Given that this return statement does not seem to add any actual value, let's not pointlessly violate the standard. Build-tested with GCC 10 and CLANG 13 for good measure, the disassembled code is identical with or without the return statement. Fixes: `e9ee186bb7` ("KVM: arm64: Add kvm_extable for vaxorcism code") Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220705142310.3847918-1-qperret@google.com	2022-07-06 10:00:31 +01:00
Mark Brown	b2d71f275d	arm64/sysreg: Add _EL1 into ID_AA64ISAR2_EL1 definition names Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64ISAR2_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220704170302.2609529-17-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-07-05 11:45:46 +01:00
Mark Brown	aa50479b4f	arm64/sysreg: Add _EL1 into ID_AA64ISAR1_EL1 definition names Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64ISAR1_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220704170302.2609529-16-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-07-05 11:45:46 +01:00
Mark Brown	9a2f3290bb	arm64/sysreg: Standardise naming for WFxT defines The defines for WFxT refer to the feature as WFXT and use SUPPORTED rather than IMP. In preparation for automatic generation of defines update these to be more standard. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220704170302.2609529-12-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-07-05 11:45:46 +01:00
Masahiro Yamada	40c56bd8e1	KVM: arm64: nvhe: Add intermediates to 'targets' instead of extra-y These are generated on demand. Adding them to 'targets' is enough. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220613092026.1705630-2-masahiroy@kernel.org	2022-06-29 10:53:08 +01:00
Masahiro Yamada	3d5697f95e	KVM: arm64: nvhe: Rename confusing obj-y This Makefile appends several objects to obj-y from line 15, but none of them is linked to vmlinux in an ordinary way. obj-y is overwritten at line 30: obj-y := kvm_nvhe.o So, kvm_nvhe.o is the only object directly linked to vmlinux. Replace the abused obj-y with hyp-obj-y. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220613092026.1705630-1-masahiroy@kernel.org	2022-06-29 10:52:58 +01:00
Marc Zyngier	dc94f89ae6	Merge branch kvm-arm64/burn-the-flags into kvmarm-master/next * kvm-arm64/burn-the-flags: : . : Rework the per-vcpu flags to make them more manageable, : splitting them in different sets that have specific : uses: : : - configuration flags : - input to the world-switch : - state bookkeeping for the kernel itself : : The FP tracking is also simplified and tracked outside : of the flags as a separate state. : . KVM: arm64: Move the handling of !FP outside of the fast path KVM: arm64: Document why pause cannot be turned into a flag KVM: arm64: Reduce the size of the vcpu flag members KVM: arm64: Add build-time sanity checks for flags KVM: arm64: Warn when PENDING_EXCEPTION and INCREMENT_PC are set together KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag KVM: arm64: Kill unused vcpu flags field KVM: arm64: Move vcpu WFIT flag to the state flag set KVM: arm64: Move vcpu ON_UNSUPPORTED_CPU flag to the state flag set KVM: arm64: Move vcpu SVE/SME flags to the state flag set KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set KVM: arm64: Move vcpu PC/Exception flags to the input flag set KVM: arm64: Move vcpu configuration flags into their own set KVM: arm64: Add three sets of flags to the vcpu state KVM: arm64: Add helpers to manipulate vcpu flags among a set KVM: arm64: Move FP state ownership from flag to a tristate KVM: arm64: Drop FP_FOREIGN_STATE from the hypervisor code Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:30:10 +01:00
Marc Zyngier	b4da91879e	KVM: arm64: Move the handling of !FP outside of the fast path We currently start by assuming that the host owns the FP unit at load time, then check again whether this is the case as we are about to run. Only at this point do we account for the fact that there is a (vanishingly small) chance that we're running on a system without a FPSIMD unit (yes, this is madness). We can actually move this FPSIMD check as early as load-time, and drop the check at run time. No intended change in behaviour. Suggested-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:56 +01:00
Marc Zyngier	30b6ab45f8	KVM: arm64: Convert vcpu sysregs_loaded_on_cpu to a state flag The aptly named boolean 'sysregs_loaded_on_cpu' tracks whether some of the vcpu system registers are resident on the physical CPU when running in VHE mode. This is obviously a flag in hidding, so let's convert it to a state flag, since this is solely a host concern (the hypervisor itself always knows which state we're in). Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:32 +01:00
Marc Zyngier	eebc538d8e	KVM: arm64: Move vcpu WFIT flag to the state flag set The host kernel uses the WFIT flag to remember that a vcpu has used this instruction and wake it up as required. Move it to the state set, as nothing in the hypervisor uses this information. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:23 +01:00
Marc Zyngier	0affa37fcd	KVM: arm64: Move vcpu SVE/SME flags to the state flag set The two HOST_{SVE,SME}_ENABLED are only used for the host kernel to track its own state across a vcpu run so that it can be fully restored. Move these flags to the so called state set. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:14 +01:00
Marc Zyngier	b1da49088a	KVM: arm64: Move vcpu debug/SPE/TRBE flags to the input flag set The three debug flags (which deal with the debug registers, SPE and TRBE) all are input flags to the hypervisor code. Move them into the input set and convert them to the new accessors. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-29 10:23:03 +01:00
David Matlack	837f66c712	KVM: Allow for different capacities in kvm_mmu_memory_cache structs Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at declaration time rather than being fixed for all declarations. This will be used in a follow-up commit to declare an cache in x86 with a capacity of 512+ objects without having to increase the capacity of all caches in KVM. This change requires each cache now specify its capacity at runtime, since the cache struct itself no longer has a fixed capacity known at compile time. To protect against someone accidentally defining a kvm_mmu_memory_cache struct directly (without the extra storage), this commit includes a WARN_ON() in kvm_mmu_topup_memory_cache(). In order to support different capacities, this commit changes the objects pointer array to be dynamically allocated the first time the cache is topped-up. While here, opportunistically clean up the stack-allocated kvm_mmu_memory_cache structs in riscv and arm64 to use designated initializers. No functional change intended. Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220516232138.1783324-22-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-06-24 04:52:00 -04:00
Quentin Perret	56961c6331	KVM: arm64: Prevent kmemleak from accessing pKVM memory Commit `a7259df767` ("memblock: make memblock_find_in_range method private") changed the API using which memory is reserved for the pKVM hypervisor. However, memblock_phys_alloc() differs from the original API in terms of kmemleak semantics -- the old one didn't report the reserved regions to kmemleak while the new one does. Unfortunately, when protected KVM is enabled, all kernel accesses to pKVM-private memory result in a fatal exception, which can now happen because of kmemleak scans: $ echo scan > /sys/kernel/debug/kmemleak [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290! [ 34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000 [ 34.991813] Kernel panic - not syncing: HYP panic: [ 34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 34.991813] VCPU:0000000000000000 [ 34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102 [ 34.994059] Hardware name: linux,dummy-virt (DT) [ 34.994452] Call trace: [ 34.994641] dump_backtrace.part.0+0xcc/0xe0 [ 34.994932] show_stack+0x18/0x6c [ 34.995094] dump_stack_lvl+0x68/0x84 [ 34.995276] dump_stack+0x18/0x34 [ 34.995484] panic+0x16c/0x354 [ 34.995673] __hyp_pgtable_total_pages+0x0/0x60 [ 34.995933] scan_block+0x74/0x12c [ 34.996129] scan_gray_list+0xd8/0x19c [ 34.996332] kmemleak_scan+0x2c8/0x580 [ 34.996535] kmemleak_write+0x340/0x4a0 [ 34.996744] full_proxy_write+0x60/0xbc [ 34.996967] vfs_write+0xc4/0x2b0 [ 34.997136] ksys_write+0x68/0xf4 [ 34.997311] __arm64_sys_write+0x20/0x2c [ 34.997532] invoke_syscall+0x48/0x114 [ 34.997779] el0_svc_common.constprop.0+0x44/0xec [ 34.998029] do_el0_svc+0x2c/0xc0 [ 34.998205] el0_svc+0x2c/0x84 [ 34.998421] el0t_64_sync_handler+0xf4/0x100 [ 34.998653] el0t_64_sync+0x18c/0x190 [ 34.999252] SMP: stopping secondary CPUs [ 35.000034] Kernel Offset: disabled [ 35.000261] CPU features: 0x800,00007831,00001086 [ 35.000642] Memory Limit: none [ 35.001329] ---[ end Kernel panic - not syncing: HYP panic: [ 35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 35.001329] VCPU:0000000000000000 ]--- Fix this by explicitly excluding the hypervisor's memory pool from kmemleak like we already do for the hyp BSS. Cc: Mike Rapoport <rppt@kernel.org> Fixes: `a7259df767` ("memblock: make memblock_find_in_range method private") Signed-off-by: Quentin Perret <qperret@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220616161135.3997786-1-qperret@google.com	2022-06-17 09:48:38 +01:00
Sai Prakash Ranjan	451f2f1c90	KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM Add a generic flag (__DISABLE_TRACE_MMIO__) to disable MMIO tracing in nVHE KVM as the tracepoint and MMIO logging symbols should not be visible at nVHE KVM as there is no way to execute them. It can also be used to disable MMIO tracing for specific drivers. Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-06-15 17:41:12 +02:00
Marc Zyngier	699bb2e0c6	KVM: arm64: Move vcpu PC/Exception flags to the input flag set The PC update flags (which also deal with exception injection) is one of the most complicated use of the flag we have. Make it more fool prof by: - moving it over to the new accessors and assign it to the input flag set - turn the combination of generic ELx flags with another flag indicating the target EL itself into an explicit set of flags for each EL and vector combination - add a new accessor to pend the exception This is otherwise a pretty straightformward conversion. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-10 09:54:34 +01:00
Marc Zyngier	4c0680d394	KVM: arm64: Move vcpu configuration flags into their own set The KVM_ARM64_{GUEST_HAS_SVE,VCPU_SVE_FINALIZED,GUEST_HAS_PTRAUTH} flags are purely configuration flags. Once set, they are never cleared, but evaluated all over the code base. Move these three flags into the configuration set in one go, using the new accessors, and take this opportunity to drop the KVM_ARM64_ prefix which doesn't provide any help. Reviewed-by: Fuad Tabba <tabba@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-06-09 15:43:46 +01:00
Will Deacon	5879c97f37	KVM: arm64: Remove redundant hyp_assert_lock_held() assertions host_stage2_try() asserts that the KVM host lock is held, so there's no need to duplicate the assertion in its wrappers. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-6-will@kernel.org	2022-06-09 13:24:02 +01:00
Will Deacon	cde5042adf	KVM: arm64: Ignore 'kvm-arm.mode=protected' when using VHE Ignore 'kvm-arm.mode=protected' when using VHE so that kvm_get_mode() only returns KVM_MODE_PROTECTED on systems where the feature is available. Cc: David Brazdil <dbrazdil@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-4-will@kernel.org	2022-06-09 13:24:02 +01:00
Marc Zyngier	fa7a172144	KVM: arm64: Handle all ID registers trapped for a protected VM A protected VM accessing ID_AA64ISAR2_EL1 gets punished with an UNDEF, while it really should only get a zero back if the register is not handled by the hypervisor emulation (as mandated by the architecture). Introduce all the missing ID registers (including the unallocated ones), and have them to return 0. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-3-will@kernel.org	2022-06-09 13:24:02 +01:00
Will Deacon	ae187fec75	KVM: arm64: Return error from kvm_arch_init_vm() on allocation failure If we fail to allocate the 'supported_cpus' cpumask in kvm_arch_init_vm() then be sure to return -ENOMEM instead of success (0) on the failure path. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220609121223.2551-2-will@kernel.org	2022-06-09 13:24:02 +01:00
Marc Zyngier	f8077b0d59	KVM: arm64: Move FP state ownership from flag to a tristate The KVM FP code uses a pair of flags to denote three states: - FP_ENABLED set: the guest owns the FP state - FP_HOST set: the host owns the FP state - FP_ENABLED and FP_HOST clear: nobody owns the FP state at all and both flags set is an illegal state, which nothing ever checks for... As it turns out, this isn't really a good match for flags, and we'd be better off if this was a simpler tristate, each state having a name that actually reflect the state: - FP_STATE_FREE - FP_STATE_HOST_OWNED - FP_STATE_GUEST_OWNED Kill the two flags, and move over to an enum encoding these three states. This results in less confusing code, and less risk of ending up in the uncharted territory of a 4th state if we forget to clear one of the two flags. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com>	2022-06-09 12:01:58 +01:00
Marc Zyngier	e9ada6c208	KVM: arm64: Drop FP_FOREIGN_STATE from the hypervisor code The vcpu KVM_ARM64_FP_FOREIGN_FPSTATE flag tracks the thread's own TIF_FOREIGN_FPSTATE so that we can evaluate just before running the vcpu whether it the FP regs contain something that is owned by the vcpu or not by updating the rest of the FP flags. We do this in the hypervisor code in order to make sure we're in a context where we are not interruptible. But we already have a hook in the run loop to generate this flag. We may as well update the FP flags directly and save the pointless flag tracking. Whilst we're at it, rename update_fp_enabled() to guest_owns_fp_regs() to indicate what the leftover of this helper actually do. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Mark Brown <broonie@kernel.org>	2022-06-09 12:01:51 +01:00
Marc Zyngier	efedd01de4	KVM: arm64: Warn if accessing timer pending state outside of vcpu context A recurrent bug in the KVM/arm64 code base consists in trying to access the timer pending state outside of the vcpu context, which makes zero sense (the pending state only exists when the vcpu is loaded). In order to avoid more embarassing crashes and catch the offenders red-handed, add a warning to kvm_arch_timer_get_input_level() and return the state as non-pending. This avoids taking the system down, and still helps tracking down silly bugs. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220607131427.1164881-4-maz@kernel.org	2022-06-08 10:16:23 +01:00
Marc Zyngier	98432ccdec	KVM: arm64: Replace vgic_v3_uaccess_read_pending with vgic_uaccess_read_pending Now that GICv2 has a proper userspace accessor for the pending state, switch GICv3 over to it, dropping the local version, moving over the specific behaviours that CGIv3 requires (such as the distinction between pending latch and line level which were never enforced with GICv2). We also gain extra locking that isn't really necessary for userspace, but that's a small price to pay for getting rid of superfluous code. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20220607131427.1164881-3-maz@kernel.org	2022-06-08 10:16:15 +01:00
Marc Zyngier	2cdea19a34	KVM: arm64: Don't read a HW interrupt pending state in user context Since `5bfa685e62` ("KVM: arm64: vgic: Read HW interrupt pending state from the HW"), we're able to source the pending bit for an interrupt that is stored either on the physical distributor or on a device. However, this state is only available when the vcpu is loaded, and is not intended to be accessed from userspace. Unfortunately, the GICv2 emulation doesn't provide specific userspace accessors, and we fallback with the ones that are intended for the guest, with fatal consequences. Add a new vgic_uaccess_read_pending() accessor for userspace to use, build on top of the existing vgic_mmio_read_pending(). Reported-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: `5bfa685e62` ("KVM: arm64: vgic: Read HW interrupt pending state from the HW") Link: https://lore.kernel.org/r/20220607131427.1164881-2-maz@kernel.org Cc: stable@vger.kernel.org	2022-06-07 16:28:19 +01:00
sunliming	e3fe65e0d3	KVM: arm64: Fix inconsistent indenting Fix the following smatch warnings: arch/arm64/kvm/vmid.c:62 flush_context() warn: inconsistent indenting Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: sunliming <sunliming@kylinos.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220602024805.511457-1-sunliming@kylinos.cn	2022-06-07 15:27:05 +01:00
Marc Zyngier	039f49c4ca	KVM: arm64: Always start with clearing SME flag on load On each vcpu load, we set the KVM_ARM64_HOST_SME_ENABLED flag if SME is enabled for EL0 on the host. This is used to restore the correct state on vpcu put. However, it appears that nothing ever clears this flag. Once set, it will stick until the vcpu is destroyed, which has the potential to spuriously enable SME for userspace. As it turns out, this is due to the SME code being more or less copied from SVE, and inheriting the same shortcomings. We never saw the issue because nothing uses SME, and the amount of testing is probably still pretty low. Fixes: `861262ab86` ("KVM: arm64: Handle SME host state when running guests") Signed-off-by: Marc Zyngier <maz@kernel.org> Reviwed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220528113829.1043361-3-maz@kernel.org	2022-06-07 14:31:30 +01:00
Marc Zyngier	d52d165d67	KVM: arm64: Always start with clearing SVE flag on load On each vcpu load, we set the KVM_ARM64_HOST_SVE_ENABLED flag if SVE is enabled for EL0 on the host. This is used to restore the correct state on vpcu put. However, it appears that nothing ever clears this flag. Once set, it will stick until the vcpu is destroyed, which has the potential to spuriously enable SVE for userspace. We probably never saw the issue because no VMM uses SVE, but that's still pretty bad. Unconditionally clearing the flag on vcpu load addresses the issue. Fixes: `8383741ab2` ("KVM: arm64: Get rid of host SVE tracking/saving") Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220528113829.1043361-2-maz@kernel.org	2022-06-07 14:19:23 +01:00
Linus Torvalds	bf9095424d	S390: * ultravisor communication device driver * fix TEID on terminating storage key ops RISC-V: * Added Sv57x4 support for G-stage page table * Added range based local HFENCE functions * Added remote HFENCE functions based on VCPU requests * Added ISA extension registers in ONE_REG interface * Updated KVM RISC-V maintainers entry to cover selftests support ARM: * Add support for the ARMv8.6 WFxT extension * Guard pages for the EL2 stacks * Trap and emulate AArch32 ID registers to hide unsupported features * Ability to select and save/restore the set of hypercalls exposed to the guest * Support for PSCI-initiated suspend in collaboration with userspace * GICv3 register-based LPI invalidation support * Move host PMU event merging into the vcpu data structure * GICv3 ITS save/restore fixes * The usual set of small-scale cleanups and fixes x86: * New ioctls to get/set TSC frequency for a whole VM * Allow userspace to opt out of hypercall patching * Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: * Add KVM_EXIT_SHUTDOWN metadata for SEV-ES * V_TSC_AUX support Nested virtualization improvements for AMD: * Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) * Allow AVIC to co-exist with a nested guest running * Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support * PAUSE filtering for nested hypervisors Guest support: * Decoupling of vcpu_is_preempted from PV spinlocks -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKN9M4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNLeAf+KizAlQwxEehHHeNyTkZuKyMawrD6 zsqAENR6i1TxiXe7fDfPFbO2NR0ZulQopHbD9mwnHJ+nNw0J4UT7g3ii1IAVcXPu rQNRGMVWiu54jt+lep8/gDg0JvPGKVVKLhxUaU1kdWT9PhIOC6lwpP3vmeWkUfRi PFL/TMT0M8Nfryi0zHB0tXeqg41BiXfqO8wMySfBAHUbpv8D53D2eXQL6YlMM0pL 2quB1HxHnpueE5vj3WEPQ3PCdy1M2MTfCDBJAbZGG78Ljx45FxSGoQcmiBpPnhJr C6UGP4ZDWpml5YULUoA70k5ylCbP+vI61U4vUtzEiOjHugpPV5wFKtx5nw== =ozWx -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ...	2022-05-26 14:20:14 -07:00
Paolo Bonzini	47e8eec832	KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmKGAGsPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDB/gQAMhyZ+wCG0OMEZhwFF6iDfxVEX2Kw8L41NtD a/e6LDWuIOGihItpRkYROc5myG74D7XckF2Bz3G7HJoU4vhwHOV/XulE26GFizoC O1GVRekeSUY81wgS1yfo0jojLupBkTjiq3SjTHoDP7GmCM0qDPBtA0QlMRzd2bMs Kx0+UUXZUHFSTXc7Lp4vqNH+tMp7se+yRx7hxm6PCM5zG+XYJjLxnsZ0qpchObgU 7f6YFojsLUs1SexgiUqJ1RChVQ+FkgICh5HyzORvGtHNNzK6D2sIbsW6nqMGAMql Kr3A5O/VOkCztSYnLxaa76/HqD21mvUrXvr3grhabNc7rOmuzWV0dDgr6c6wHKHb uNCtH4d7Ra06gUrEOrfsgLOLn0Zqik89y6aIlMsnTudMg9gMNgFHy1jz4LM7vMkY FS5AVj059heg2uJcfgTvzzcqneyuBLBmF3dS4coowO6oaj8SycpaEmP5e89zkPMI 1kk8d0e6RmXuCh/2AJ8GxxnKvBPgqp2mMKXOCJ8j4AmHEDX/CKpEBBqIWLKkplUU 8DGiOWJUtRZJg398dUeIpiVLoXJthMODjAnkKkuhiFcQbXomlwgg7YSnNAz6TRED Z7KR2leC247kapHnnagf02q2wED8pBeyrxbQPNdrHtSJ9Usm4nTkY443HgVTJW3s aTwPZAQ7 =mh7W -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes [Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated from 4 to 6. - Paolo]	2022-05-25 05:09:23 -04:00
Linus Torvalds	143a6252e1	arm64 updates for 5.19: - Initial support for the ARMv9 Scalable Matrix Extension (SME). SME takes the approach used for vectors in SVE and extends this to provide architectural support for matrix operations. No KVM support yet, SME is disabled in guests. - Support for crashkernel reservations above ZONE_DMA via the 'crashkernel=X,high' command line option. - btrfs search_ioctl() fix for live-lock with sub-page faults. - arm64 perf updates: support for the Hisilicon "CPA" PMU for monitoring coherent I/O traffic, support for Arm's CMN-650 and CMN-700 interconnect PMUs, minor driver fixes, kerneldoc cleanup. - Kselftest updates for SME, BTI, MTE. - Automatic generation of the system register macros from a 'sysreg' file describing the register bitfields. - Update the type of the function argument holding the ESR_ELx register value to unsigned long to match the architecture register size (originally 32-bit but extended since ARMv8.0). - stacktrace cleanups. - ftrace cleanups. - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(), avoid executable mappings in kexec/hibernate code, drop TLB flushing from get_clear_flush() (and rename it to get_clear_contig()), ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmKH19IACgkQa9axLQDI XvEFWg//bf0p6zjeNaOJmBbyVFsXsVyYiEaLUpFPUs3oB+81s2YZ+9i1rgMrNCft EIDQ9+/HgScKxJxnzWf68heMdcBDbk76VJtLALExbge6owFsjByQDyfb/b3v/bLd ezAcGzc6G5/FlI1IP7ct4Z9MnQry4v5AG8lMNAHjnf6GlBS/tYNAqpmj8HpQfgRQ ZbhfZ8Ayu3TRSLWL39NHVevpmxQm/bGcpP3Q9TtjUqg0r1FQ5sK/LCqOksueIAzT UOgUVYWSFwTpLEqbYitVqgERQp9LiLoK5RmNYCIEydfGM7+qmgoxofSq5e2hQtH2 SZM1XilzsZctRbBbhMit1qDBqMlr/XAy/R5FO0GauETVKTaBhgtj6mZGyeC9nU/+ RGDljaArbrOzRwMtSuXF+Fp6uVo5spyRn1m8UT/k19lUTdrV9z6EX5Fzuc4Mnhed oz4iokbl/n8pDObXKauQspPA46QpxUYhrAs10B/ELc3yyp/Qj3jOfzYHKDNFCUOq HC9mU+YiO9g2TbYgCrrFM6Dah2E8fU6/cR0ZPMeMgWK4tKa+6JMEINYEwak9e7M+ 8lZnvu3ntxiJLN+PrPkiPyG+XBh2sux1UfvNQ+nw4Oi9xaydeX7PCbQVWmzTFmHD q7UPQ8220e2JNCha9pULS8cxDLxiSksce06DQrGXwnHc1Ir7T04= =0DjE -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Initial support for the ARMv9 Scalable Matrix Extension (SME). SME takes the approach used for vectors in SVE and extends this to provide architectural support for matrix operations. No KVM support yet, SME is disabled in guests. - Support for crashkernel reservations above ZONE_DMA via the 'crashkernel=X,high' command line option. - btrfs search_ioctl() fix for live-lock with sub-page faults. - arm64 perf updates: support for the Hisilicon "CPA" PMU for monitoring coherent I/O traffic, support for Arm's CMN-650 and CMN-700 interconnect PMUs, minor driver fixes, kerneldoc cleanup. - Kselftest updates for SME, BTI, MTE. - Automatic generation of the system register macros from a 'sysreg' file describing the register bitfields. - Update the type of the function argument holding the ESR_ELx register value to unsigned long to match the architecture register size (originally 32-bit but extended since ARMv8.0). - stacktrace cleanups. - ftrace cleanups. - Miscellaneous updates, most notably: arm64-specific huge_ptep_get(), avoid executable mappings in kexec/hibernate code, drop TLB flushing from get_clear_flush() (and rename it to get_clear_contig()), ARCH_NR_GPIO bumped to 2048 for ARCH_APPLE. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (145 commits) arm64/sysreg: Generate definitions for FAR_ELx arm64/sysreg: Generate definitions for DACR32_EL2 arm64/sysreg: Generate definitions for CSSELR_EL1 arm64/sysreg: Generate definitions for CPACR_ELx arm64/sysreg: Generate definitions for CONTEXTIDR_ELx arm64/sysreg: Generate definitions for CLIDR_EL1 arm64/sve: Move sve_free() into SVE code section arm64: Kconfig.platforms: Add comments arm64: Kconfig: Fix indentation and add comments arm64: mm: avoid writable executable mappings in kexec/hibernate code arm64: lds: move special code sections out of kernel exec segment arm64/hugetlb: Implement arm64 specific huge_ptep_get() arm64/hugetlb: Use ptep_get() to get the pte value of a huge page arm64: kdump: Do not allocate crash low memory if not needed arm64/sve: Generate ZCR definitions arm64/sme: Generate defintions for SVCR arm64/sme: Generate SMPRI_EL1 definitions arm64/sme: Automatically generate SMPRIMAP_EL2 definitions arm64/sme: Automatically generate SMIDR_EL1 defines arm64/sme: Automatically generate defines for SMCR ...	2022-05-23 21:06:11 -07:00
Catalin Marinas	0616ea3f1b	Merge branch 'for-next/esr-elx-64-bit' into for-next/core * for-next/esr-elx-64-bit: : Treat ESR_ELx as a 64-bit register. KVM: arm64: uapi: Add kvm_debug_exit_arch.hsr_high KVM: arm64: Treat ESR_EL2 as a 64-bit register arm64: Treat ESR_ELx as a 64-bit register arm64: compat: Do not treat syscall number as ESR_ELx for a bad syscall arm64: Make ESR_ELx_xVC_IMM_MASK compatible with assembly	2022-05-20 18:51:54 +01:00
Paolo Bonzini	6f5adb3504	KVM/arm64 fixes for 5.18, take #3 - Correctly expose GICv3 support even if no irqchip is created so that userspace doesn't observe it changing pointlessly (fixing a regression with QEMU) - Don't issue a hypercall to set the id-mapped vectors when protected mode is enabled -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmKCKnIPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDD1IP/2y+6ntgxdwuvHWVMEttGh9dOG/jCiV0B+uZ R0x6G6i+VvqoBM3vzHl5fMqfRF47edQ17Kofa815Iae9dkoSR3oetA5qn8zZzGac z9102EYsPkb9qj+hOYpPDT3ST/jYLq3EUoEef/lGwcJ32CPldKIttWdyZvHbfjoP 6sOJYCWUiLiGt98VF/CNDazDInOgQtmRBkslHyNCeTC8w+7vT/2qXgfN2x513h92 CH9yM7dIzS0Qt3U6yMlx39zZ95T0FslonAgtzZfXQ4590aJD+w367HT3WaAOp9Qn MKIJF9DV9cy2o7pyz9R81x0NWiYmJvTsWBxqLdxDQuObevBayGrGNwEgGuUSwtYj zez536JOAIShKJZLyWP8t2a3NwIxu3KWOzKqhm+mt/1fikcP3KEhh7CTdJTp2GqX XBO5wGVW3I3M1s+rjziQues5aampsSo3dJbHU0hx+t4ODVKkVQo19dXfCtwFMLrT KLTDQLiUzRadv1c6q2rO66L//r6g3gA5DSRiCgOShA6iNcDaf2uVtvfG6p6n10k2 Tss5hvDfSJTSttnNYsCsVYdIGhJizpxVBLfXJHLyBn/DnTUcjkEqpIo0eWZvT2gD nxgh0lewenVKUYzP01jkph6kLnKU6LwtNKV6ZJbpazJYYcEQ+vVYoTweCu7L3RJa F7SURWTh =OGUb -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.18, take #3 - Correctly expose GICv3 support even if no irqchip is created so that userspace doesn't observe it changing pointlessly (fixing a regression with QEMU) - Don't issue a hypercall to set the id-mapped vectors when protected mode is enabled (fix for pKVM in combination with CPUs affected by Spectre-v3a)	2022-05-17 13:26:33 -04:00
Mark Brown	ec0067a63e	arm64/sme: Remove _EL0 from name of SVCR - FIXME sysreg.h The defines for SVCR call it SVCR_EL0 however the architecture calls the register SVCR with no _EL0 suffix. In preparation for generating the sysreg definitions rename to match the architecture, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-05-16 19:50:20 +01:00
Catalin Marinas	8c00c8f02f	Merge branch 'for-next/sme' into for-next/sysreg-gen * for-next/sme: (29 commits) : Scalable Matrix Extensions support. arm64/sve: Make kernel FPU protection RT friendly arm64/sve: Delay freeing memory in fpsimd_flush_thread() arm64/sme: More sensibly define the size for the ZA register set arm64/sme: Fix NULL check after kzalloc arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding() arm64/sme: Provide Kconfig for SME KVM: arm64: Handle SME host state when running guests KVM: arm64: Trap SME usage in guest KVM: arm64: Hide SME system registers from guests arm64/sme: Save and restore streaming mode over EFI runtime calls arm64/sme: Disable streaming mode and ZA when flushing CPU state arm64/sme: Add ptrace support for ZA arm64/sme: Implement ptrace support for streaming mode SVE registers arm64/sme: Implement ZA signal handling arm64/sme: Implement streaming SVE signal handling arm64/sme: Disable ZA and streaming mode when handling signals arm64/sme: Implement traps and syscall handling for SME arm64/sme: Implement ZA context switching arm64/sme: Implement streaming SVE context switching arm64/sme: Implement SVCR context switching ...	2022-05-16 19:49:58 +01:00
Marc Zyngier	5c0ad551e9	Merge branch kvm-arm64/its-save-restore-fixes-5.19 into kvmarm-master/next * kvm-arm64/its-save-restore-fixes-5.19: : . : Tighten the ITS save/restore infrastructure to fail early rather : than late. Patches courtesy of Rocardo Koller. : . KVM: arm64: vgic: Undo work in failed ITS restores KVM: arm64: vgic: Do not ignore vgic_its_restore_cte failures KVM: arm64: vgic: Add more checks when restoring ITS tables KVM: arm64: vgic: Check that new ITEs could be saved in guest memory Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:36 +01:00
Marc Zyngier	822ca7f82b	Merge branch kvm-arm64/misc-5.19 into kvmarm-master/next * kvm-arm64/misc-5.19: : . : Misc fixes and general improvements for KVMM/arm64: : : - Better handle out of sequence sysregs in the global tables : : - Remove a couple of unnecessary loads from constant pool : : - Drop unnecessary pKVM checks : : - Add all known M1 implementations to the SEIS workaround : : - Cleanup kerneldoc warnings : . KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround KVM: arm64: pkvm: Don't mask already zeroed FEAT_SVE KVM: arm64: pkvm: Drop unnecessary FP/SIMD trap handler KVM: arm64: nvhe: Eliminate kernel-doc warnings KVM: arm64: Avoid unnecessary absolute addressing via literals KVM: arm64: Print emulated register table name when it is unsorted KVM: arm64: Don't BUG_ON() if emulated register table is unsorted Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:36 +01:00
Marc Zyngier	8794b4f510	Merge branch kvm-arm64/per-vcpu-host-pmu-data into kvmarm-master/next * kvm-arm64/per-vcpu-host-pmu-data: : . : Pass the host PMU state in the vcpu to avoid the use of additional : shared memory between EL1 and EL2 (this obviously only applies : to nVHE and Protected setups). : : Patches courtesy of Fuad Tabba. : . KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selected KVM: arm64: Reenable pmu in Protected Mode KVM: arm64: Pass pmu events to hyp via vcpu KVM: arm64: Repack struct kvm_pmu to reduce size KVM: arm64: Wrapper for getting pmu_events Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:36 +01:00
Marc Zyngier	ec2cff6cbd	Merge branch kvm-arm64/vgic-invlpir into kvmarm-master/next * kvm-arm64/vgic-invlpir: : . : Implement MMIO-based LPI invalidation for vGICv3. : . KVM: arm64: vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision KVM: arm64: vgic-v3: Implement MMIO-based LPI invalidation KVM: arm64: vgic-v3: Expose GICR_CTLR.RWP when disabling LPIs irqchip/gic-v3: Exposes bit values for GICR_CTLR.{IR, CES} Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:35 +01:00
Marc Zyngier	3b8e21e3c3	Merge branch kvm-arm64/psci-suspend into kvmarm-master/next * kvm-arm64/psci-suspend: : . : Add support for PSCI SYSTEM_SUSPEND and allow userspace to : filter the wake-up events. : : Patches courtesy of Oliver. : . Documentation: KVM: Fix title level for PSCI_SUSPEND selftests: KVM: Test SYSTEM_SUSPEND PSCI call selftests: KVM: Refactor psci_test to make it amenable to new tests selftests: KVM: Use KVM_SET_MP_STATE to power off vCPU in psci_test selftests: KVM: Create helper for making SMCCC calls selftests: KVM: Rename psci_cpu_on_test to psci_test KVM: arm64: Implement PSCI SYSTEM_SUSPEND KVM: arm64: Add support for userspace to suspend a vCPU KVM: arm64: Return a value from check_vcpu_requests() KVM: arm64: Rename the KVM_REQ_SLEEP handler KVM: arm64: Track vCPU power state using MP state values KVM: arm64: Dedupe vCPU power off helpers KVM: arm64: Don't depend on fallthrough to hide SYSTEM_RESET2 Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:20 +01:00
Marc Zyngier	0586e28aaa	Merge branch kvm-arm64/hcall-selection into kvmarm-master/next * kvm-arm64/hcall-selection: : . : Introduce a new set of virtual sysregs for userspace to : select the hypercalls it wants to see exposed to the guest. : : Patches courtesy of Raghavendra and Oliver. : . KVM: arm64: Fix hypercall bitmap writeback when vcpus have already run KVM: arm64: Hide KVM_REG_ARM_*_BMAP_BIT_COUNT from userspace Documentation: Fix index.rst after psci.rst renaming selftests: KVM: aarch64: Add the bitmap firmware registers to get-reg-list selftests: KVM: aarch64: Introduce hypercall ABI test selftests: KVM: Create helper for making SMCCC calls selftests: KVM: Rename psci_cpu_on_test to psci_test tools: Import ARM SMCCC definitions Docs: KVM: Add doc for the bitmap firmware registers Docs: KVM: Rename psci.rst to hypercalls.rst KVM: arm64: Add vendor hypervisor firmware register KVM: arm64: Add standard hypervisor firmware register KVM: arm64: Setup a framework for hypercall bitmap firmware registers KVM: arm64: Factor out firmware register handling from psci.c Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:47:03 +01:00
Marc Zyngier	528ada2811	KVM: arm64: Fix hypercall bitmap writeback when vcpus have already run We generally want to disallow hypercall bitmaps being changed once vcpus have already run. But we must allow the write if the written value is unchanged so that userspace can rewrite the register file on reboot, for example. Without this, a QEMU-based VM will fail to reboot correctly. The original code was correct, and it is me that introduced the regression. Fixes: `05714cab7d` ("KVM: arm64: Setup a framework for hypercall bitmap firmware registers") Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:40:48 +01:00
Ricardo Koller	8c5e74c90b	KVM: arm64: vgic: Undo work in failed ITS restores Failed ITS restores should clean up all state restored until the failure. There is some cleanup already present when failing to restore some tables, but it's not complete. Add the missing cleanup. Note that this changes the behavior in case of a failed restore of the device tables. restore ioctl: 1. restore collection tables 2. restore device tables With this commit, failures in 2. clean up everything created so far, including state created by 1. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-5-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Ricardo Koller	a1ccfd6f6e	KVM: arm64: vgic: Do not ignore vgic_its_restore_cte failures Restoring a corrupted collection entry (like an out of range ID) is being ignored and treated as success. More specifically, a vgic_its_restore_cte failure is treated as success by vgic_its_restore_collection_table. vgic_its_restore_cte uses positive and negative numbers to return error, and +1 to return success. The caller then uses "ret > 0" to check for success. Fix this by having vgic_its_restore_cte only return negative numbers on error. Do this by changing alloc_collection return codes to only return negative numbers on error. Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-4-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Ricardo Koller	243b1f6c8f	KVM: arm64: vgic: Add more checks when restoring ITS tables Try to improve the predictability of ITS save/restores (and debuggability of failed ITS saves) by failing early on restore when trying to read corrupted tables. Restoring the ITS tables does some checks for corrupted tables, but not as many as in a save: an overflowing device ID will be detected on save but not on restore. The consequence is that restoring a corrupted table won't be detected until the next save; including the ITS not working as expected after the restore. As an example, if the guest sets tables overlapping each other, which would most likely result in some corrupted table, this is what we would see from the host point of view: guest sets base addresses that overlap each other save ioctl restore ioctl save ioctl (fails) Ideally, we would like the first save to fail, but overlapping tables could actually be intended by the guest. So, let's at least fail on the restore with some checks: like checking that device and event IDs don't overflow their tables. Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-3-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Ricardo Koller	cafe7e544d	KVM: arm64: vgic: Check that new ITEs could be saved in guest memory Try to improve the predictability of ITS save/restores by failing commands that would lead to failed saves. More specifically, fail any command that adds an entry into an ITS table that is not in guest memory, which would otherwise lead to a failed ITS save ioctl. There are already checks for collection and device entries, but not for ITEs. Add the corresponding check for the ITT when adding ITEs. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-2-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Marc Zyngier	20492a62b9	KVM: arm64: pmu: Restore compilation when HW_PERF_EVENTS isn't selected Moving kvm_pmu_events into the vcpu (and refering to it) broke the somewhat unusual case where the kernel has no support for a PMU at all. In order to solve this, move things around a bit so that we can easily avoid refering to the pmu structure outside of PMU-aware code. As a bonus, pmu.c isn't compiled in when HW_PERF_EVENTS isn't selected. Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/202205161814.KQHpOzsJ-lkp@intel.com	2022-05-16 13:42:41 +01:00
Quentin Perret	2e40316753	KVM: arm64: Don't hypercall before EL2 init Will reported the following splat when running with Protected KVM enabled: [ 2.427181] ------------[ cut here ]------------ [ 2.427668] WARNING: CPU: 3 PID: 1 at arch/arm64/kvm/mmu.c:489 __create_hyp_private_mapping+0x118/0x1ac [ 2.428424] Modules linked in: [ 2.429040] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc2-00084-g8635adc4efc7 #1 [ 2.429589] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 [ 2.430286] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 2.430734] pc : __create_hyp_private_mapping+0x118/0x1ac [ 2.431091] lr : create_hyp_exec_mappings+0x40/0x80 [ 2.431377] sp : ffff80000803baf0 [ 2.431597] x29: ffff80000803bb00 x28: 0000000000000000 x27: 0000000000000000 [ 2.432156] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 [ 2.432561] x23: ffffcd96c343b000 x22: 0000000000000000 x21: ffff80000803bb40 [ 2.433004] x20: 0000000000000004 x19: 0000000000001800 x18: 0000000000000000 [ 2.433343] x17: 0003e68cf7efdd70 x16: 0000000000000004 x15: fffffc81f602a2c8 [ 2.434053] x14: ffffdf8380000000 x13: ffffcd9573200000 x12: ffffcd96c343b000 [ 2.434401] x11: 0000000000000004 x10: ffffcd96c1738000 x9 : 0000000000000004 [ 2.434812] x8 : ffff80000803bb40 x7 : 7f7f7f7f7f7f7f7f x6 : 544f422effff306b [ 2.435136] x5 : 000000008020001e x4 : ffff207d80a88c00 x3 : 0000000000000005 [ 2.435480] x2 : 0000000000001800 x1 : 000000014f4ab800 x0 : 000000000badca11 [ 2.436149] Call trace: [ 2.436600] __create_hyp_private_mapping+0x118/0x1ac [ 2.437576] create_hyp_exec_mappings+0x40/0x80 [ 2.438180] kvm_init_vector_slots+0x180/0x194 [ 2.458941] kvm_arch_init+0x80/0x274 [ 2.459220] kvm_init+0x48/0x354 [ 2.459416] arm_init+0x20/0x2c [ 2.459601] do_one_initcall+0xbc/0x238 [ 2.459809] do_initcall_level+0x94/0xb4 [ 2.460043] do_initcalls+0x54/0x94 [ 2.460228] do_basic_setup+0x1c/0x28 [ 2.460407] kernel_init_freeable+0x110/0x178 [ 2.460610] kernel_init+0x20/0x1a0 [ 2.460817] ret_from_fork+0x10/0x20 [ 2.461274] ---[ end trace 0000000000000000 ]--- Indeed, the Protected KVM mode promotes __create_hyp_private_mapping() to a hypercall as EL1 no longer has access to the hypervisor's stage-1 page-table. However, the call from kvm_init_vector_slots() happens after pKVM has been initialized on the primary CPU, but before it has been initialized on secondaries. As such, if the KVM initcall procedure is migrated from one CPU to another in this window, the hypercall may end up running on a CPU for which EL2 has not been initialized. Fortunately, the pKVM hypervisor doesn't rely on the host to re-map the vectors in the private range, so the hypercall in question is in fact superfluous. Skip it when pKVM is enabled. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> [maz: simplified the checks slightly] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220513092607.35233-1-qperret@google.com	2022-05-15 12:14:14 +01:00
Marc Zyngier	5163373af1	KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC When adding support for the slightly wonky Apple M1, we had to populate ID_AA64PFR0_EL1.GIC==1 to present something to the guest, as the HW itself doesn't advertise the feature. However, we gated this on the in-kernel irqchip being created. This causes some trouble for QEMU, which snapshots the state of the registers before creating a virtual GIC, and then tries to restore these registers once the GIC has been created. Obviously, between the two stages, ID_AA64PFR0_EL1.GIC has changed value, and the write fails. The fix is to actually emulate the HW, and always populate the field if the HW is capable of it. Fixes: `562e530fd7` ("KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3") Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Oliver Upton <oupton@google.com> Link: https://lore.kernel.org/r/20220503211424.3375263-1-maz@kernel.org	2022-05-15 12:12:14 +01:00
Fuad Tabba	722625c6f4	KVM: arm64: Reenable pmu in Protected Mode Now that the pmu code does not access hyp data, reenable it in protected mode. Once fully supported, protected VMs will not have pmu support, since that could leak information. However, non-protected VMs in protected mode should have pmu support if available. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510095710.148178-5-tabba@google.com	2022-05-15 11:26:41 +01:00
Fuad Tabba	84d751a019	KVM: arm64: Pass pmu events to hyp via vcpu Instead of the host accessing hyp data directly, pass the pmu events of the current cpu to hyp via the vcpu. This adds 64 bits (in two fields) to the vcpu that need to be synced before every vcpu run in nvhe and protected modes. However, it isolates the hypervisor from the host, which allows us to use pmu in protected mode in a subsequent patch. No visible side effects in behavior intended. Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510095710.148178-4-tabba@google.com	2022-05-15 11:26:41 +01:00
Fuad Tabba	3cb8a091a7	KVM: arm64: Wrapper for getting pmu_events Eases migrating away from using hyp data and simplifies the code. No functional change intended. Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510095710.148178-2-tabba@google.com	2022-05-15 11:24:17 +01:00
Marc Zyngier	cae889302e	KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround Unsusprisingly, Apple M1 Pro/Max have the exact same defect as the original M1 and generate random SErrors in the host when a guest tickles the GICv3 CPU interface the wrong way. Add the part numbers for both the CPU types found in these two new implementations, and add them to the hall of shame. This also applies to the Ultra version, as it is composed of 2 Max SoCs. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220514102524.3188730-1-maz@kernel.org	2022-05-15 11:18:50 +01:00
Oliver Upton	249838b766	KVM: arm64: pkvm: Don't mask already zeroed FEAT_SVE FEAT_SVE is already masked by the fixed configuration for ID_AA64PFR0_EL1; don't try and mask it at runtime. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220509162559.2387784-3-oupton@google.com	2022-05-10 10:58:43 +01:00
Oliver Upton	4d2e469e16	KVM: arm64: pkvm: Drop unnecessary FP/SIMD trap handler The pVM-specific FP/SIMD trap handler just calls straight into the generic trap handler. Avoid the indirection and just call the hyp handler directly. Note that the BUILD_BUG_ON() pattern is repeated in pvm_init_traps_aa64pfr0(), which is likely a better home for it. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220509162559.2387784-2-oupton@google.com	2022-05-10 10:58:42 +01:00
Randy Dunlap	bd61395ae8	KVM: arm64: nvhe: Eliminate kernel-doc warnings Don't use begin-kernel-doc notation (/) for comments that are not in kernel-doc format. This prevents these kernel-doc warnings: arch/arm64/kvm/hyp/nvhe/switch.c:126: warning: This comment starts with '/', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Disable host events, enable guest events arch/arm64/kvm/hyp/nvhe/switch.c:146: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Disable guest events, enable host events arch/arm64/kvm/hyp/nvhe/switch.c:164: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Handler for protected VM restricted exceptions. arch/arm64/kvm/hyp/nvhe/switch.c:176: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Handler for protected VM MSR, MRS or System instruction execution in AArch64. arch/arm64/kvm/hyp/nvhe/switch.c:196: warning: Function parameter or member 'vcpu' not described in 'kvm_handle_pvm_fpsimd' arch/arm64/kvm/hyp/nvhe/switch.c:196: warning: Function parameter or member 'exit_code' not described in 'kvm_handle_pvm_fpsimd' arch/arm64/kvm/hyp/nvhe/switch.c:196: warning: expecting prototype for Handler for protected floating(). Prototype was for kvm_handle_pvm_fpsimd() instead Fixes: `09cf57eba3` ("KVM: arm64: Split hyp/switch.c to VHE/nVHE") Fixes: `1423afcb41` ("KVM: arm64: Trap access to pVM restricted features") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: David Brazdil <dbrazdil@google.com> Cc: James Morse <james.morse@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220430050123.2844-1-rdunlap@infradead.org	2022-05-06 11:03:56 +01:00
Ard Biesheuvel	7ee74cc7ad	KVM: arm64: Avoid unnecessary absolute addressing via literals There are a few cases in the nVHE code where we take the absolute address of a symbol via a literal pool entry, and subsequently translate it to another address space (PA, kimg VA, kernel linear VA, etc). Originally, this literal was needed because we relied on a different translation for absolute references, but this is no longer the case, so we can simply use relative addressing instead. This removes a couple of RELA entries pointing into the .text segment. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220428140350.3303481-1-ardb@kernel.org	2022-05-06 11:03:56 +01:00
Alexandru Elisei	325031d4f3	KVM: arm64: Print emulated register table name when it is unsorted When a sysreg table entry is out-of-order, KVM attempts to print the address of the table: [ 0.143911] kvm [1]: sys_reg table (____ptrval____) out of order (1) Printing the name of the table instead of a pointer is more helpful in this case. The message has also been slightly tweaked to be point out the offending entry (and to match the missing reset error message): [ 0.143891] kvm [1]: sys_reg table sys_reg_descs+0x50/0x7490 entry 1 out of order Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220428103405.70884-3-alexandru.elisei@arm.com	2022-05-04 17:52:56 +01:00
Alexandru Elisei	f1f0c0cfea	KVM: arm64: Don't BUG_ON() if emulated register table is unsorted To emulate a register access, KVM uses a table of registers sorted by register encoding to speed up queries using binary search. When Linux boots, KVM checks that the table is sorted and uses a BUG_ON() statement to let the user know if it's not. The unfortunate side effect is that an unsorted sysreg table brings down the whole kernel, not just KVM, even though the rest of the kernel can function just fine without KVM. To make matters worse, on machines which lack a serial console, the user is left pondering why the machine is taking so long to boot. Improve this situation by returning an error from kvm_arch_init() if the sysreg tables are not in the correct order. The machine is still very much usable for the user, with the exception of virtualization, who can now easily determine what went wrong. A minor typo has also been corrected in the check_sysreg_table() function. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220428103405.70884-2-alexandru.elisei@arm.com	2022-05-04 17:52:50 +01:00
Mark Brown	0eda2ec489	arm64/sysreg: Standardise ID_AA64ISAR0_EL1 macro names The macros for accessing fields in ID_AA64ISAR0_EL1 omit the _EL1 from the name of the register. In preparation for converting this register to be automatically generated update the names to include an _EL1, there should be no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220503170233.507788-8-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-05-04 15:30:28 +01:00
Mark Brown	6329eb543d	arm64: Update name of ID_AA64ISAR0_EL1_ATOMIC to reflect ARM The architecture reference manual refers to the field in bits 23:20 of ID_AA64ISAR0_EL1 with the name "atomic" but the kernel defines for this bitfield use the name "atomics". Bring the two into sync to make it easier to cross reference with the specification. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220503170233.507788-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-05-04 15:30:28 +01:00
Marc Zyngier	49a1a2c70a	KVM: arm64: vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision Since adversising GICR_CTLR.{IC,CES} is directly observable from a guest, we need to make it selectable from userspace. For that, bump the default GICD_IIDR revision and let userspace downgrade it to the previous default. For GICv2, the two distributor revisions are strictly equivalent. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220405182327.205520-5-maz@kernel.org	2022-05-04 14:09:53 +01:00
Marc Zyngier	4645d11f4a	KVM: arm64: vgic-v3: Implement MMIO-based LPI invalidation Since GICv4.1, it has become legal for an implementation to advertise GICR_{INVLPIR,INVALLR,SYNCR} while having an ITS, allowing for a more efficient invalidation scheme (no guest command queue contention when multiple CPUs are generating invalidations). Provide the invalidation registers as a primitive to their ITS counterpart. Note that we don't advertise them to the guest yet (the architecture allows an implementation to do this). Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Oliver Upton <oupton@google.com> Link: https://lore.kernel.org/r/20220405182327.205520-4-maz@kernel.org	2022-05-04 14:09:53 +01:00
Marc Zyngier	94828468a6	KVM: arm64: vgic-v3: Expose GICR_CTLR.RWP when disabling LPIs When disabling LPIs, a guest needs to poll GICR_CTLR.RWP in order to be sure that the write has taken effect. We so far reported it as 0, as we didn't advertise that LPIs could be turned off the first place. Start tracking this state during which LPIs are being disabled, and expose the 'in progress' state via the RWP bit. We also take this opportunity to disallow enabling LPIs and programming GICR_{PEND,PROP}BASER while LPI disabling is in progress, as allowed by the architecture (UNPRED behaviour). We don't advertise the feature to the guest yet (which is allowed by the architecture). Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220405182327.205520-3-maz@kernel.org	2022-05-04 14:09:53 +01:00
Marc Zyngier	d25f30fe41	Merge branch kvm-arm64/aarch32-idreg-trap into kvmarm-master/next * kvm-arm64/aarch32-idreg-trap: : . : Add trapping/sanitising infrastructure for AArch32 systen registers, : allowing more control over what we actually expose (such as the PMU). : : Patches courtesy of Oliver and Alexandru. : . KVM: arm64: Fix new instances of 32bit ESRs KVM: arm64: Hide AArch32 PMU registers when not available KVM: arm64: Start trapping ID registers for 32 bit guests KVM: arm64: Plumb cp10 ID traps through the AArch64 sysreg handler KVM: arm64: Wire up CP15 feature registers to their AArch64 equivalents KVM: arm64: Don't write to Rt unless sys_reg emulation succeeds KVM: arm64: Return a bool from emulate_cp() Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-04 09:42:45 +01:00
Marc Zyngier	904cabf471	Merge branch kvm-arm64/hyp-stack-guard into kvmarm-master/next * kvm-arm64/hyp-stack-guard: : . : Harden the EL2 stack by providing stack guards, courtesy of : Kalesh Singh. : . KVM: arm64: Symbolize the nVHE HYP addresses KVM: arm64: Detect and handle hypervisor stack overflows KVM: arm64: Add guard pages for pKVM (protected nVHE) hypervisor stack KVM: arm64: Add guard pages for KVM nVHE hypervisor stack KVM: arm64: Introduce pkvm_alloc_private_va_range() KVM: arm64: Introduce hyp_alloc_private_va_range() Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-04 09:42:37 +01:00
Marc Zyngier	b2c4caf331	Merge branch kvm-arm64/wfxt into kvmarm-master/next * kvm-arm64/wfxt: : . : Add support for the WFET/WFIT instructions that provide the same : service as WFE/WFI, only with a timeout. : . KVM: arm64: Expose the WFXT feature to guests KVM: arm64: Offer early resume for non-blocking WFxT instructions KVM: arm64: Handle blocking WFIT instruction KVM: arm64: Introduce kvm_counter_compute_delta() helper KVM: arm64: Simplify kvm_cpu_has_pending_timer() arm64: Use WFxT for __delay() when possible arm64: Add wfet()/wfit() helpers arm64: Add HWCAP advertising FEAT_WFXT arm64: Add RV and RN fields for ESR_ELx_WFx_ISS arm64: Expand ESR_ELx_WFx_ISS_TI to match its ARMv8.7 definition Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-04 09:42:16 +01:00
Marc Zyngier	4b88524c47	Merge remote-tracking branch 'arm64/for-next/sme' into kvmarm-master/next Merge arm64's SME branch to resolve conflicts with the WFxT branch. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-04 09:38:32 +01:00
Oliver Upton	bfbab44568	KVM: arm64: Implement PSCI SYSTEM_SUSPEND ARM DEN0022D.b 5.19 "SYSTEM_SUSPEND" describes a PSCI call that allows software to request that a system be placed in the deepest possible low-power state. Effectively, software can use this to suspend itself to RAM. Unfortunately, there really is no good way to implement a system-wide PSCI call in KVM. Any precondition checks done in the kernel will need to be repeated by userspace since there is no good way to protect a critical section that spans an exit to userspace. SYSTEM_RESET and SYSTEM_OFF are equally plagued by this issue, although no users have seemingly cared for the relatively long time these calls have been supported. The solution is to just make the whole implementation userspace's problem. Introduce a new system event, KVM_SYSTEM_EVENT_SUSPEND, that indicates to userspace a calling vCPU has invoked PSCI SYSTEM_SUSPEND. Additionally, add a CAP to get buy-in from userspace for this new exit type. Only advertise the SYSTEM_SUSPEND PSCI call if userspace has opted in. If a vCPU calls SYSTEM_SUSPEND, punt straight to userspace. Provide explicit documentation of userspace's responsibilites for the exit and point to the PSCI specification to describe the actual PSCI call. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-8-oupton@google.com	2022-05-04 09:28:45 +01:00
Oliver Upton	7b33a09d03	KVM: arm64: Add support for userspace to suspend a vCPU Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU is in a suspended state. In the suspended state the vCPU will block until a wakeup event (pending interrupt) is recognized. Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to userspace that KVM has recognized one such wakeup event. It is the responsibility of userspace to then make the vCPU runnable, or leave it suspended until the next wakeup event. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-7-oupton@google.com	2022-05-04 09:28:45 +01:00
Oliver Upton	3fdd04592d	KVM: arm64: Return a value from check_vcpu_requests() A subsequent change to KVM will introduce a vCPU request that could result in an exit to userspace. Change check_vcpu_requests() to return a value and document the function. Unconditionally return 1 for now. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-6-oupton@google.com	2022-05-04 09:28:45 +01:00
Oliver Upton	1c6219e3fa	KVM: arm64: Rename the KVM_REQ_SLEEP handler The naming of the kvm_req_sleep function is confusing: the function itself sleeps the vCPU, it does not request such an event. Rename the function to make its purpose more clear. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-5-oupton@google.com	2022-05-04 09:28:45 +01:00
Oliver Upton	b171f9bbb1	KVM: arm64: Track vCPU power state using MP state values A subsequent change to KVM will add support for additional power states. Store the MP state by value rather than keeping track of it as a boolean. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-4-oupton@google.com	2022-05-04 09:28:44 +01:00
Oliver Upton	1e5794295c	KVM: arm64: Dedupe vCPU power off helpers vcpu_power_off() and kvm_psci_vcpu_off() are equivalent; rename the former and replace all callsites to the latter. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-3-oupton@google.com	2022-05-04 09:28:44 +01:00
Oliver Upton	5bc2cb95ad	KVM: arm64: Don't depend on fallthrough to hide SYSTEM_RESET2 Depending on a fallthrough to the default case for hiding SYSTEM_RESET2 requires that any new case statements clean up the failure path for this PSCI call. Unhitch SYSTEM_RESET2 from the default case by setting val to PSCI_RET_NOT_SUPPORTED outside of the switch statement. Apply the cleanup to both the PSCI_1_1_FN_SYSTEM_RESET2 and PSCI_1_0_FN_PSCI_FEATURES handlers. No functional change intended. Signed-off-by: Oliver Upton <oupton@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220504032446.4133305-2-oupton@google.com	2022-05-04 09:28:44 +01:00
Marc Zyngier	ee87a9bd65	KVM: arm64: Fix new instances of 32bit ESRs Fix the new instances of ESR being described as a u32, now that we consistently are using a u64 for this register. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-04 08:01:05 +01:00
Raghavendra Rao Ananta	b22216e1a6	KVM: arm64: Add vendor hypervisor firmware register Introduce the firmware register to hold the vendor specific hypervisor service calls (owner value 6) as a bitmap. The bitmap represents the features that'll be enabled for the guest, as configured by the user-space. Currently, this includes support for KVM-vendor features along with reading the UID, represented by bit-0, and Precision Time Protocol (PTP), represented by bit-1. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: tidy-up bitmap values] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-5-rananta@google.com	2022-05-03 21:30:19 +01:00
Raghavendra Rao Ananta	428fd6788d	KVM: arm64: Add standard hypervisor firmware register Introduce the firmware register to hold the standard hypervisor service calls (owner value 5) as a bitmap. The bitmap represents the features that'll be enabled for the guest, as configured by the user-space. Currently, this includes support only for Paravirtualized time, represented by bit-0. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: tidy-up bitmap values] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-4-rananta@google.com	2022-05-03 21:30:19 +01:00
Raghavendra Rao Ananta	05714cab7d	KVM: arm64: Setup a framework for hypercall bitmap firmware registers KVM regularly introduces new hypercall services to the guests without any consent from the userspace. This means, the guests can observe hypercall services in and out as they migrate across various host kernel versions. This could be a major problem if the guest discovered a hypercall, started using it, and after getting migrated to an older kernel realizes that it's no longer available. Depending on how the guest handles the change, there's a potential chance that the guest would just panic. As a result, there's a need for the userspace to elect the services that it wishes the guest to discover. It can elect these services based on the kernels spread across its (migration) fleet. To remedy this, extend the existing firmware pseudo-registers, such as KVM_REG_ARM_PSCI_VERSION, but by creating a new COPROC register space for all the hypercall services available. These firmware registers are categorized based on the service call owners, but unlike the existing firmware pseudo-registers, they hold the features supported in the form of a bitmap. During the VM initialization, the registers are set to upper-limit of the features supported by the corresponding registers. It's expected that the VMMs discover the features provided by each register via GET_ONE_REG, and write back the desired values using SET_ONE_REG. KVM allows this modification only until the VM has started. Some of the standard features are not mapped to any bits of the registers. But since they can recreate the original problem of making it available without userspace's consent, they need to be explicitly added to the case-list in kvm_hvc_call_default_allowed(). Any function-id that's not enabled via the bitmap, or not listed in kvm_hvc_call_default_allowed, will be returned as SMCCC_RET_NOT_SUPPORTED to the guest. Older userspace code can simply ignore the feature and the hypercall services will be exposed unconditionally to the guests, thus ensuring backward compatibility. In this patch, the framework adds the register only for ARM's standard secure services (owner value 4). Currently, this includes support only for ARM True Random Number Generator (TRNG) service, with bit-0 of the register representing mandatory features of v1.0. Other services are momentarily added in the upcoming patches. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: reduced the scope of some helpers, tidy-up bitmap max values, dropped error-only fast path] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-3-rananta@google.com	2022-05-03 21:30:19 +01:00
Raghavendra Rao Ananta	85fbe08e4d	KVM: arm64: Factor out firmware register handling from psci.c Common hypercall firmware register handing is currently employed by psci.c. Since the upcoming patches add more of these registers, it's better to move the generic handling to hypercall.c for a cleaner presentation. While we are at it, collect all the firmware registers under fw_reg_ids[] to help implement kvm_arm_get_fw_num_regs() and kvm_arm_copy_fw_reg_indices() in a generic way. Also, define KVM_REG_FEATURE_LEVEL_MASK using a GENMASK instead. No functional change intended. Signed-off-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Reviewed-by: Gavin Shan <gshan@redhat.com> [maz: fixed KVM_REG_FEATURE_LEVEL_MASK] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220502233853.1233742-2-rananta@google.com	2022-05-03 14:48:54 +01:00
Alexandru Elisei	a9e192cd4f	KVM: arm64: Hide AArch32 PMU registers when not available commit `11663111cd` ("KVM: arm64: Hide PMU registers from userspace when not available") hid the AArch64 PMU registers from userspace and guest when the PMU VCPU feature was not set. Do the same when the PMU registers are accessed by an AArch32 guest. While we're at it, rename the previously unused AA32_ZEROHIGH to AA32_DIRECT to match the behavior of get_access_mask(). Now that KVM emulates ID_DFR0 and hides the PMU from the guest when the feature is not set, it is safe to inject to inject an undefined exception when the PMU is not present, as that corresponds to the architected behaviour. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> [Oliver - Add AA32_DIRECT to match the zero value of the enum] Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220503060205.2823727-7-oupton@google.com	2022-05-03 11:17:41 +01:00

... 2 3 4 5 6 ...

1928 Коммитов