When we save the state of the floating point registers, this can be done
in the form visible through either the FPSIMD V registers or the SVE Z and
P registers. At present we track which format is currently used based on
TIF_SVE and the SME streaming mode state, but particularly in the SVE case
this limits our options for optimising things, especially around syscalls.
Introduce a new enum, placed together with the saved floating point state
in both thread_struct and the KVM guest state, which explicitly records
which format is active, and keep it up to date when we change it.
At present we do not use this state except to verify that it has the
expected value when loading the state; future patches will introduce
functional changes.
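As a rough, self-contained sketch of the idea (the enum, field and function
names below are invented for illustration and are not taken from the patch):

```c
#include <stdio.h>

enum fp_state_type {
	FP_STATE_FPSIMD,	/* saved data uses the V-register layout */
	FP_STATE_SVE,		/* saved data uses the Z/P-register layout */
};

struct saved_fp_state {
	/* ... the saved FPSIMD/SVE register data lives here ... */
	enum fp_state_type fp_type;	/* which layout the saved data uses */
};

/* Record the format every time the state is saved in a given layout. */
static void note_saved_format(struct saved_fp_state *st, enum fp_state_type t)
{
	st->fp_type = t;
}

int main(void)
{
	struct saved_fp_state st;

	note_saved_format(&st, FP_STATE_SVE);
	printf("saved format: %s\n",
	       st.fp_type == FP_STATE_SVE ? "SVE" : "FPSIMD");
	return 0;
}
```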
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-3-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Since 8383741ab2 (KVM: arm64: Get rid of host SVE tracking/saving)
KVM has not tracked the host SVE state, relying on the fact that we
currently disable SVE whenever we perform a syscall. This may not be true
in future since performance optimisation may result in us keeping SVE
enabled in order to avoid needing to take access traps to re-enable it.
Handle this by clearing TIF_SVE and converting the stored task state to
FPSIMD format when preparing to run the guest. This is done with a new
call fpsimd_kvm_prepare() to keep the direct state manipulation
functions internal to fpsimd.c.
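A minimal standalone sketch of the conversion step (names, sizes and the
example vector length are illustrative; this is not the kernel code). The
FPSIMD V registers are architecturally the low 128 bits of the corresponding
SVE Z registers, so converting the saved state amounts to keeping only those
low bits:

```c
#include <stdint.h>
#include <string.h>

#define NR_VREGS	32
#define VREG_BYTES	16	/* a V register is 128 bits */
#define ZREG_BYTES	32	/* example SVE vector length of 256 bits */

struct saved_fp_state {
	uint8_t zregs[NR_VREGS][ZREG_BYTES];	/* SVE-format save area */
	uint8_t vregs[NR_VREGS][VREG_BYTES];	/* FPSIMD-format save area */
};

/*
 * V[n] is the low 128 bits of Z[n], so converting the saved state to
 * FPSIMD format means keeping only those low bits (the real code also
 * clears TIF_SVE so the guest run only deals with FPSIMD state).
 */
static void sve_saved_state_to_fpsimd(struct saved_fp_state *st)
{
	for (int i = 0; i < NR_VREGS; i++)
		memcpy(st->vregs[i], st->zregs[i], VREG_BYTES);
}
```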
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221115094640.112848-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Certain VMMs such as crosvm have features (e.g. sandboxing) that depend
on being able to map guest memory as MAP_SHARED. The current restriction
on sharing MAP_SHARED pages with the guest is preventing the use of
those features with MTE. Now that the races between tasks concurrently
clearing tags on the same page have been fixed, remove this restriction.
Note that this is a relaxation of the ABI.
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-8-pcc@google.com
Previously we allowed creating a memslot containing a private mapping that
was not VM_MTE_ALLOWED, but would later reject KVM_RUN with -EFAULT. Now
we reject the memory region at memslot creation time.
Since this is a minor tweak to the ABI (a VMM that created one of
these memslots would fail later anyway), no VMM to my knowledge has
MTE support yet, and the hardware with the necessary features is not
generally available, we can probably make this ABI change at this point.
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-7-pcc@google.com
Initialising the tags and setting the PG_mte_tagged flag for a page can
race between multiple set_pte_at() calls on shared pages or setting the
stage 2 pte via user_mem_abort(). Introduce a new PG_mte_lock flag as
PG_arch_3 and set it before attempting page initialisation. Given that
PG_mte_tagged is never cleared for a page, treat it being set as meaning
the page is unlocked, and wait on this bit with acquire semantics if the
page is locked:
- try_page_mte_tagging() - lock the page for tagging, return true if it
can be tagged, false if already tagged. No acquire semantics if it
returns true (PG_mte_tagged not set) as there is no serialisation with
a previous set_page_mte_tagged().
- set_page_mte_tagged() - set PG_mte_tagged with release semantics.
The two-bit locking is based on Peter Collingbourne's idea.
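A self-contained sketch of this protocol using C11 atomics (the bit values
and helper bodies are illustrative, not the kernel implementation):

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MTE_LOCK	(1u << 0)	/* stands in for PG_mte_lock */
#define MTE_TAGGED	(1u << 1)	/* stands in for PG_mte_tagged */

/* Lock the page for tagging: true if the caller should initialise the tags. */
static bool try_page_mte_tagging(atomic_uint *flags)
{
	/* First locker wins; no acquire needed as nothing was tagged before. */
	if (!(atomic_fetch_or_explicit(flags, MTE_LOCK,
				       memory_order_relaxed) & MTE_LOCK))
		return true;

	/* Someone else is (or was) tagging: wait for TAGGED with acquire. */
	while (!(atomic_load_explicit(flags, memory_order_acquire) & MTE_TAGGED))
		;

	return false;
}

/* Publish the initialised tags: set TAGGED with release semantics. */
static void set_page_mte_tagged(atomic_uint *flags)
{
	atomic_fetch_or_explicit(flags, MTE_TAGGED, memory_order_release);
}
```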
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-6-pcc@google.com
Currently sanitise_mte_tags() checks if it's an online page before
attempting to sanitise the tags. Such detection should be done in the
caller via the VM_MTE_ALLOWED vma flag. Since kvm_set_spte_gfn() does
not have the vma, leave the page unmapped if not already tagged. Tag
initialisation will be done on a subsequent access fault in
user_mem_abort().
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[pcc@google.com: fix the page initializer]
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-4-pcc@google.com
Currently the PG_mte_tagged page flag mostly means the page contains
valid tags and it should be set after the tags have been cleared or
restored. However, in mte_sync_tags() it is set before setting the tags
to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for
shared pages. Even so, a concurrent mprotect(PROT_MTE) with a copy-on-
write in another thread can cause the new page to have stale tags.
Similarly, tag reading via ptrace() can read stale tags if the
PG_mte_tagged flag is set before actually clearing/restoring the tags.
Fix the PG_mte_tagged semantics so that it is only set after the tags
have been cleared or restored. This is safe for swap restoring into a
MAP_SHARED or CoW page since the core code takes the page lock. Add two
functions to test and set the PG_mte_tagged flag with acquire and
release semantics. The downside is that concurrent mprotect(PROT_MTE) on
a MAP_SHARED page may cause tag loss. This is already the case for KVM
guests if a VMM changes the page protection while the guest triggers a
user_mem_abort().
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[pcc@google.com: fix build with CONFIG_ARM64_MTE disabled]
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221104011041.290951-3-pcc@google.com
Userspace can play some dirty tricks on us by selecting a given
PMU version (such as PMUv3p5), restoring a PMCR_EL0 value that
has PMCR_EL0.LP set, and then switching the PMU version to PMUv3p1,
for example. In this situation, we end up with PMCR_EL0.LP being
set and spreading havoc in the PMU emulation.
This is especially hard as the first two steps can be done on
one vcpu and the third step on another, meaning that we need
to sanitise *all* vcpus when the PMU version is changed.
In order to avoid a pretty complicated locking situation,
defer the sanitisation of PMCR_EL0 to the point where the
vcpu is actually run for the first time, using the existing
KVM_REQ_RELOAD_PMU request that calls into kvm_pmu_handle_pmcr().
There is still an obscure corner case where userspace could
do the above trick, and then save the VM without running it.
They would then observe an inconsistent state (PMUv3.1 + LP set),
but that state will be fixed on the first run anyway whenever
the guest gets restored on a host.
Reported-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Resetting PMCR_EL0 is a pretty involved process that includes
poisoning some of the writable bits, just because we can.
It makes it hard to reason about what gets configured,
and just resetting things to 0 seems like a much saner option.
Reduce reset_pmcr() to just preserving PMCR_EL0.N from the host,
and setting PMCR_EL0.LC if we don't support AArch32.
Signed-off-by: Marc Zyngier <maz@kernel.org>
kvm_host_pmu_init() returns early when the detected PMU is either not
implemented or implementation defined. kvm_pmu_probe_armpmu() has a similar
situation. The ID_AA64DFR0_EL1_PMUVer value extracted when the PMU is not
implemented is '0', which can be replaced with ID_AA64DFR0_EL1_PMUVer_NI,
defined as '0b0000'.
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221128135629.118346-1-anshuman.khandual@arm.com
Exclusive table walks are the only kind of table walk supported in the hyp,
as there is no construct like RCU available in the hypervisor code. Reject
any attempt to do a shared table walk by returning an error and allowing
the caller to clean up the mess.
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221118182222.3932898-4-oliver.upton@linux.dev
Rather than passing through the state of the KVM_PGTABLE_WALK_SHARED
flag, just take a pointer to the whole walker structure instead. Move
around struct kvm_pgtable and the RCU indirection such that the
associated ifdeffery remains in one place while ensuring the walker +
flags definitions precede their use.
No functional change intended.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221118182222.3932898-2-oliver.upton@linux.dev
The PMU code has historically been torn between referencing a counter
as a pair vcpu+index or as the PMC pointer.
Given that it is pretty easy to go from one representation to
the other, standardise on the latter which, IMHO, makes the
code slightly more readable. YMMV.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-17-maz@kernel.org
The way we compute the target vcpu on getting an overflow is
a bit odd, as we use the PMC array as an anchor for kvm_pmc_to_vcpu,
while we could directly compute the correct address.
Get rid of the intermediate step and directly compute the target
vcpu.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-16-maz@kernel.org
PMUv3p5 (which is mandatory with ARMv8.5) comes with some extra
features:
- All counters are 64bit
- The overflow point is controlled by the PMCR_EL0.LP bit
Add the required checks in the helpers that control counter
width and overflow, as well as the sysreg handling for the LP
bit. A new kvm_pmu_is_3p5() helper makes it easy to spot the
PMUv3p5 specific handling.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-14-maz@kernel.org
Allow userspace to write ID_DFR0_EL1, on the condition that only
the PerfMon field can be altered and be something that is compatible
with what was computed for the AArch64 view of the guest.
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-13-maz@kernel.org
Allow userspace to write ID_AA64DFR0_EL1, on the condition that only
the PMUver field can be altered and be at most the one that was
initially computed for the guest.
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-12-maz@kernel.org
As further patches will enable the selection of a PMU revision
from userspace, sample the supported PMU revision at VM creation
time, rather than building each time the ID_AA64DFR0_EL1 register
is accessed.
This shouldn't result in any change in behaviour.
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-11-maz@kernel.org
Even when using PMUv3p5 (which implies 64bit counters), there is
no way for AArch32 to write to the top 32 bits of the counters.
The only way to influence these bits (other than by counting
events) is by writing PMCR.P==1.
Make sure we obey the architecture and preserve the top 32 bits
on a counter update.
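A trivial sketch of the resulting write behaviour (illustrative helper, not
the kernel code):

```c
#include <stdint.h>

/* AArch32 can only provide the bottom 32 bits; preserve the top half. */
static uint64_t aarch32_counter_write(uint64_t old, uint32_t newval)
{
	return (old & 0xffffffff00000000ULL) | newval;
}
```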
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-10-maz@kernel.org
kvm_pmu_set_counter_value() is pretty odd, as it tries to update
the counter value while taking into account the value that is
currently held by the running perf counter.
This is not only complicated, this is quite wrong. Nowhere in
the architecture is it said that the counter would be offset
by something that is pending. The counter should be updated
with the value set by SW, and start counting from there if
required.
Remove the odd computation and just assign the provided value
after having released the perf event (which is then restarted).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-9-maz@kernel.org
In order to reduce the boilerplate code, add two helpers returning
the counter register index (resp. the event register) in the vcpu
register file from the counter index.
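Roughly, the two helpers map a counter index to the corresponding vcpu
register indices along these lines (the register constants below are made-up
placeholders, and the helper names are illustrative):

```c
#define PMEVCNTR0_EL0	0x100	/* placeholder */
#define PMEVTYPER0_EL0	0x120	/* placeholder */
#define PMCCNTR_EL0	0x140	/* placeholder */
#define PMCCFILTR_EL0	0x141	/* placeholder */
#define CYCLE_IDX	31	/* the cycle counter index */

/* Counter value register for a given counter index. */
static unsigned int counter_index_to_reg(unsigned int idx)
{
	return (idx == CYCLE_IDX) ? PMCCNTR_EL0 : PMEVCNTR0_EL0 + idx;
}

/* Event selection/filter register for a given counter index. */
static unsigned int counter_index_to_evtreg(unsigned int idx)
{
	return (idx == CYCLE_IDX) ? PMCCFILTR_EL0 : PMEVTYPER0_EL0 + idx;
}
```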
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-8-maz@kernel.org
The current PMU emulation sometimes narrows counters to 32bit
if the counter isn't the cycle counter. As this is going to
change with PMUv3p5 where the counters are all 64bit, fix
the couple of cases where this happens unconditionally.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20221113163832.3154370-7-maz@kernel.org
For 64bit counters that overflow on a 32bit boundary, make
sure we only check the bottom 32bit to generate a CHAIN event.
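A small illustrative check, assuming the usual case of an increment much
smaller than 2^32 (not the kernel code):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * For a 64-bit counter configured to overflow at 32 bits, only a wrap of
 * the bottom 32 bits should generate a CHAIN event on the adjacent counter.
 */
static bool chain_event_needed(uint64_t old, uint64_t inc)
{
	uint32_t lo_old = (uint32_t)old;
	uint32_t lo_new = (uint32_t)(old + inc);

	return lo_new < lo_old;
}
```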
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20221113163832.3154370-6-maz@kernel.org
The PMU architecture makes a subtle difference between a 64bit
counter and a counter that has a 64bit overflow. This is for example
the case of the cycle counter, which can generate an overflow on
a 32bit boundary if PMCR_EL0.LC==0 despite the accumulation being
done on 64 bits.
Use this distinction in the few cases where it matters in the code,
as we will reuse this with PMUv3p5 long counters.
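As an illustration of the distinction under a simplified pre-PMUv3p5 model
(constants and helper names are stand-ins, not the kernel's):

```c
#include <stdbool.h>
#include <stdint.h>

#define CYCLE_IDX	31
#define PMCR_LC		(1u << 6)	/* PMCR_EL0.LC: long cycle counter overflow */

/* Does the counter accumulate on 64 bits? (pre-PMUv3p5: cycle counter only) */
static bool counter_is_64bit(unsigned int idx)
{
	return idx == CYCLE_IDX;
}

/* Does it also *overflow* on a 64-bit boundary? Only with PMCR_EL0.LC set. */
static bool counter_has_64bit_overflow(unsigned int idx, uint32_t pmcr)
{
	return idx == CYCLE_IDX && (pmcr & PMCR_LC);
}
```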
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-5-maz@kernel.org
Even when the underlying HW doesn't offer the CHAIN event
(which happens with QEMU), we can always support it as we're
in control of the counter overflow.
Always advertise the event via PMCEID0_EL0.
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221113163832.3154370-4-maz@kernel.org
Ricardo recently pointed out that the PMU chained counter emulation
in KVM wasn't quite behaving like the one on actual hardware, in
the sense that hardware would expose an overflow on
both halves of a chained counter, while KVM would only expose the
overflow on the top half.
The difference is subtle, but significant. What does the architecture
say (DDI0487 H.a):
- Up to PMUv3p4, all counters but the cycle counter are 32bit
- A 32bit counter that overflows generates a CHAIN event on the
adjacent counter after exposing its own overflow status
- The CHAIN event is accounted if the counter is correctly
configured (CHAIN event selected and counter enabled)
This all means that our current implementation (which uses 64bit
perf events) prevents us from emulating this overflow on the lower half.
How to fix this? By implementing the above, to the letter.
This largely results in code deletion, removing the notions of
"counter pair", "chained counters", and "canonical counter".
The code is further restructured to make the CHAIN handling similar
to SWINC, as the two are now extremely similar in behaviour.
Reported-by: Ricardo Koller <ricarkol@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Link: https://lore.kernel.org/r/20221113163832.3154370-3-maz@kernel.org
As a stepping stone towards deprivileging the host's access to the
guest's vCPU structures, introduce some naive flush/sync routines to
copy most of the host vCPU into the hyp vCPU on vCPU run and back
again on return to EL1.
This allows us to run using the pKVM hyp structures when KVM is
initialised in protected mode.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-27-will@kernel.org
We no longer need to map the host's '.rodata' and '.bss' sections in the
stage-1 page-table of the pKVM hypervisor at EL2, so remove those
mappings and avoid creating any future dependencies at EL2 on
host-controlled data structures.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-25-will@kernel.org
The pkvm hypervisor at EL2 may need to read the 'kvm_vgic_global_state'
variable from the host, for example when saving and restoring the state
of the virtual GIC.
Explicitly map 'kvm_vgic_global_state' in the stage-1 page-table of the
pKVM hypervisor rather than relying on mapping all of the host '.rodata'
section.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-24-will@kernel.org
Sharing 'kvm_arm_vmid_bits' between EL1 and EL2 allows the host to
modify the variable arbitrarily, potentially leading to all sorts of
shenanigans as this is used to configure the VTTBR register for the
guest stage-2.
In preparation for unmapping host sections entirely from EL2, maintain
a copy of 'kvm_arm_vmid_bits' in the pKVM hypervisor and initialise it
from the host value while it is still trusted.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-23-will@kernel.org
When pKVM is enabled, the hypervisor at EL2 does not trust the host at
EL1 and must therefore prevent it from having unrestricted access to
internal hypervisor state.
The 'kvm_arm_hyp_percpu_base' array holds the offsets for hypervisor
per-cpu allocations, so move this into the nVHE code where it
cannot be modified by the untrusted host at EL1.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-22-will@kernel.org
Rather than relying on the host to free the previously-donated pKVM
hypervisor VM pages explicitly on teardown, introduce a dedicated
teardown memcache which allows the host to reclaim guest memory
resources without having to keep track of all of the allocations made by
the pKVM hypervisor at EL2.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
[maz: dropped __maybe_unused from unmap_donated_memory_noclear()]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-21-will@kernel.org
Extend the initialisation of guest data structures within the pKVM
hypervisor at EL2 so that we instantiate a memory pool and a full
'struct kvm_s2_mmu' structure for each VM, with a stage-2 page-table
entirely independent from the one managed by the host at EL1.
The 'struct kvm_pgtable_mm_ops' used by the page-table code is populated
with a set of callbacks that can manage guest pages in the hypervisor
without any direct intervention from the host, allocating page-table
pages from the provided pool and returning these to the host on VM
teardown. To keep things simple, the stage-2 MMU for the guest is
configured identically to the host stage-2 in the VTCR register and so
the IPA size of the guest must match the PA size of the host.
For now, the new page-table is unused as there is no way for the host
to map anything into it. Yet.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-20-will@kernel.org
The initialisation of guest stage-2 page-tables is currently split
across two functions: kvm_init_stage2_mmu() and kvm_arm_setup_stage2().
That is presumably for historical reasons as kvm_arm_setup_stage2()
originates from the (now defunct) KVM port for 32-bit Arm.
Simplify this code path by merging both functions into one, taking care
to map the 'struct kvm' into the hypervisor stage-1 early on in order to
simplify the failure path.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-19-will@kernel.org
The host at EL1 and the pKVM hypervisor at EL2 will soon need to
exchange memory pages dynamically for creating and destroying VM state.
Indeed, the hypervisor will rely on the host to donate memory pages it
can use to create guest stage-2 page-tables and to store VM and vCPU
metadata. In order to ease this process, introduce a
'struct hyp_memcache' which is essentially a linked list of available
pages, indexed by physical addresses so that it can be passed
meaningfully between the different virtual address spaces configured at
EL1 and EL2.
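A self-contained sketch of such a memcache (the struct layout, helper names
and the simulated phys/virt conversion are purely illustrative):

```c
#include <stdint.h>
#include <stddef.h>

typedef uint64_t phys_addr_t;

struct hyp_memcache {
	phys_addr_t head;	/* PA of the first free page, 0 if empty */
	unsigned long nr_pages;
};

/* Stand-ins for the per-address-space translation helpers. */
static void *pa_to_va(phys_addr_t pa)	{ return (void *)(uintptr_t)pa; }
static phys_addr_t va_to_pa(void *va)	{ return (phys_addr_t)(uintptr_t)va; }

/* Link the list through the free pages themselves, indexed by PA. */
static void memcache_push(struct hyp_memcache *mc, void *page)
{
	*(phys_addr_t *)page = mc->head;
	mc->head = va_to_pa(page);
	mc->nr_pages++;
}

static void *memcache_pop(struct hyp_memcache *mc)
{
	void *page;

	if (!mc->nr_pages)
		return NULL;

	page = pa_to_va(mc->head);
	mc->head = *(phys_addr_t *)page;
	mc->nr_pages--;
	return page;
}
```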
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-18-will@kernel.org
In preparation for handling cache maintenance of guest pages from within
the pKVM hypervisor at EL2, introduce an EL2 copy of icache_inval_pou()
which will later be plumbed into the stage-2 page-table cache
maintenance callbacks, ensuring that the initial contents of pages
mapped as executable into the guest stage-2 page-table are visible to the
instruction fetcher.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-17-will@kernel.org
The nVHE object at EL2 maintains its own copies of some host variables
so that, when pKVM is enabled, the host cannot directly modify the
hypervisor state. When running in normal nVHE mode, however, these
variables are still mirrored at EL2 but are not initialised.
Initialise the hypervisor symbols from the host copies regardless of
pKVM, ensuring that any reference to this data at EL2 with normal nVHE
will return a sensibly initialised value.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-16-will@kernel.org
Mapping pages in a guest page-table from within the pKVM hypervisor at
EL2 may require cache maintenance to ensure that the initialised page
contents are visible even to non-cacheable (e.g. MMU-off) accesses from
the guest.
In preparation for performing this maintenance at EL2, introduce a
per-vCPU fixmap which allows the pKVM hypervisor to map guest pages
temporarily into its stage-1 page-table for the purposes of cache
maintenance and, in future, poisoning on the reclaim path. The use of a
fixmap avoids the need for memory allocation or locking on the map()
path.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Co-developed-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-15-will@kernel.org
With the pKVM hypervisor at EL2 now offering hypercalls to the host for
creating and destroying VM and vCPU structures, plumb these into the
existing arm64 KVM backend to ensure that the hypervisor data structures
are allocated and initialised on first vCPU run for a pKVM guest.
In the host, 'struct kvm_protected_vm' is introduced to hold the handle
of the pKVM VM instance as well as to track references to the memory
donated to the hypervisor so that it can be freed back to the host
allocator following VM teardown. The stage-2 page-table, hypervisor VM
and vCPU structures are allocated separately so as to avoid the need for
a large physically-contiguous allocation in the host at run-time.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-14-will@kernel.org
Introduce a global table (and lock) to track pKVM instances at EL2, and
provide hypercalls that can be used by the untrusted host to create and
destroy pKVM VMs and their vCPUs. pKVM VM/vCPU state is directly
accessible only by the trusted hypervisor (EL2).
Each pKVM VM is directly associated with an untrusted host KVM instance,
and is referenced by the host using an opaque handle. Future patches
will provide hypercalls to allow the host to initialize/set/get pKVM
VM/vCPU state using the opaque handle.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Co-developed-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
[maz: silence warning on unmap_donated_memory_noclear()]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-13-will@kernel.org
In preparation for introducing VM and vCPU state at EL2, rename the
existing 'struct host_kvm' and its singleton 'host_kvm' instance to
'host_mmu' so as to avoid confusion between the structure tracking the
host stage-2 MMU state and the host instance of a 'struct kvm' for a
protected guest.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-12-will@kernel.org
Introduce a static initializer macro for 'hyp_spinlock_t' so that it is
straightforward to instantiate global locks at EL2. This will be later
utilised for locking the VM table in the hypervisor.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-11-will@kernel.org
nvhe/mem_protect.h refers to __load_stage2() in the definition of
__load_host_stage2() but doesn't include the relevant header.
Include asm/kvm_mmu.h in nvhe/mem_protect.h so that users of the latter
don't have to do this themselves.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-10-will@kernel.org
Add helpers allowing the hypervisor to check whether a range of pages
are currently shared by the host, and 'pin' them if so by blocking host
unshare operations until the memory has been unpinned.
This will allow the hypervisor to take references on host-provided
data-structures (e.g. 'struct kvm') with the guarantee that these pages
will remain in a stable state until the hypervisor decides to release
them, for example during guest teardown.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-9-will@kernel.org
Memory regions marked as "no-map" in the host device-tree routinely
include TrustZone carve-outs and DMA pools. Although donating such pages
to the hypervisor may not breach confidentiality, it could be used to
corrupt its state in uncontrollable ways. To prevent this, let's block
host-initiated memory transitions targeting "no-map" pages altogether in
nVHE protected mode as there should be no valid reason to do this in
current operation.
Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list
of memblock regions, so we can easily check for the presence of the
MEMBLOCK_NOMAP flag on a region containing pages being donated from the
host.
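A sketch of what such a check could look like, assuming a simplified copy of
the memblock list at EL2 (the names and flag value are illustrative):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEMBLOCK_NOMAP	(1u << 2)	/* illustrative flag value */

struct memblock_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/* Reject a host-initiated transition if [start, end) touches a no-map region. */
static bool range_allowed_for_transition(const struct memblock_region *regions,
					 size_t nr_regions,
					 uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < nr_regions; i++) {
		const struct memblock_region *r = &regions[i];

		if (start < r->base + r->size && end > r->base &&
		    (r->flags & MEMBLOCK_NOMAP))
			return false;
	}
	return true;
}
```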
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-8-will@kernel.org
Transferring ownership information of a memory region from one component
to another can be achieved using a "donate" operation, which results
in the previous owner losing access to the underlying pages entirely
and the new owner having exclusive access to the pages.
Implement a do_donate() helper, along the same lines as do_{un,}share,
and provide this functionality for the host-{to,from}-hyp cases as this
will later be used to donate/reclaim memory pages to store VM metadata
at EL2.
In a similar manner to the sharing transitions, permission checks are
performed by the hypervisor to ensure that the component initiating the
transition really is the owner of the page and also that the completer
does not currently have a page mapped at the target address.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Co-developed-by: Quentin Perret <qperret@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-7-will@kernel.org
The 'pkvm_component_id' enum type provides constants to refer to the
host and the hypervisor, yet this information is duplicated by the
'pkvm_hyp_id' constant.
Remove the definition of 'pkvm_hyp_id' and move the 'pkvm_component_id'
type definition to 'mem_protect.h' so that it can be used outside of
the memory protection code, for example when initialising the owner for
hypervisor-owned pages.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-6-will@kernel.org
In order to allow unmapping arbitrary memory pages from the hypervisor
stage-1 page-table, fix-up the initial refcount for pages that have been
mapped before the 'vmemmap' array was up and running so that it
accurately accounts for all existing hypervisor mappings.
This is achieved by traversing the entire hypervisor stage-1 page-table
during initialisation of EL2 and updating the corresponding
'struct hyp_page' for each valid mapping.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-5-will@kernel.org
The EL2 'vmemmap' array in nVHE Protected mode is currently very sparse:
only memory pages owned by the hypervisor itself have a matching 'struct
hyp_page'. However, as the size of this struct has been reduced
significantly since its introduction, it appears that we can now afford
to back the vmemmap for all of memory.
Having an easily accessible 'struct hyp_page' for every physical page in
memory provides the hypervisor with a simple mechanism to store metadata
(e.g. a refcount) that wouldn't otherwise fit in the very limited number
of software bits available in the host stage-2 page-table entries. This
will be used in subsequent patches when pinning host memory pages for
use by the hypervisor at EL2.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-4-will@kernel.org
All the contiguous pages used to initialize a 'struct hyp_pool' are
considered coalescable, which means that the hyp page allocator will
actively try to merge them with their buddies on the hyp_put_page() path.
However, using hyp_put_page() on a page that is not part of the initial
memory range given to a hyp_pool is currently unsupported.
In order to allow dynamically extending hyp pools at run-time, add a
check to __hyp_attach_page() to allow inserting 'external' pages into
the free-list of order 0. This will be necessary to allow lazy donation
of pages from the host to the hypervisor when allocating guest stage-2
page-table pages at EL2.
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-3-will@kernel.org
We will soon need to manipulate 'struct hyp_page' refcounts from outside
page_alloc.c, so move the helpers to a common header file to allow them
to be reused easily.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-2-will@kernel.org
For nvhe and protected modes, the hyp stage 1 page-tables were previously
configured to have the same number of VA bits as the kernel's idmap.
However, for kernel configs with VA_BITS=52 and where the kernel is
loaded in physical memory below 48 bits, the idmap VA bits are actually
smaller than the kernel's normal stage 1 VA bits. This can lead to
kernel addresses that can't be mapped into the hypervisor, leading to
kvm initialization failure during boot:
kvm [1]: IPA Size Limit: 48 bits
kvm [1]: Cannot map world-switch code
kvm [1]: error initializing Hyp mode: -34
Fix this by ensuring that the hyp stage 1 VA size is the maximum of
what's used for the idmap and the regular kernel stage 1. At the same
time, refactor the code so that the hyp VA bits are only calculated in
one place.
Prior to 7ba8f2b2d6, the idmap was always 52 bits for a 52-bit VA
kernel and therefore the hyp stage1 was also always 52 bits.
Fixes: 7ba8f2b2d6 ("arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
[maz: commit message fixes]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221103150507.32948-2-ryan.roberts@arm.com
The stage-2 map walker has been made parallel-aware, and as such can be
called while only holding the read side of the MMU lock. Rip out the
conditional locking in user_mem_abort() and instead grab the read lock.
Continue to take the write lock from other callsites to
kvm_pgtable_stage2_map().
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107220033.1895655-1-oliver.upton@linux.dev
stage2_map_walker_try_leaf() and friends now handle stage-2 PTEs
generically, and perform the correct flush when a table PTE is removed.
Additionally, they've been made parallel-aware, using an atomic break
to take ownership of the PTE.
Stop clearing the PTE in the pre-order callback and instead let
stage2_map_walker_try_leaf() deal with it.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107220006.1895572-1-oliver.upton@linux.dev
Convert stage2_map_walker_try_leaf() to use the new break-before-make
helpers, thereby making the handler parallel-aware. As before, avoid the
break-before-make if recreating the existing mapping. Additionally,
retry execution if another vCPU thread is modifying the same PTE.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215934.1895478-1-oliver.upton@linux.dev
In order to service stage-2 faults in parallel, stage-2 table walkers
must take exclusive ownership of the PTE being worked on. An additional
requirement of the architecture is that software must perform a
'break-before-make' operation when changing the block size used for
mapping memory.
Roll these two concepts together into helpers for performing a
'break-before-make' sequence. Use a special PTE value to indicate a PTE
has been locked by a software walker. Additionally, use an atomic
compare-exchange to 'break' the PTE when the stage-2 page tables are
possibly shared with another software walker. Elide the DSB + TLBI if
the evicted PTE was invalid (and thus not subject to break-before-make).
All of the atomics do nothing for now, as the stage-2 walker isn't fully
ready to perform parallel walks.
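A self-contained sketch of the 'break' side of the sequence using C11
atomics (the locked-PTE marker value, bit layout and helper name are
illustrative and do not match the kernel's definitions):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t kvm_pte_t;

#define KVM_PTE_VALID	(1ULL << 0)
#define KVM_PTE_LOCKED	(1ULL << 1)	/* software-walker-owned marker */

/*
 * 'old' is the caller's snapshot of the PTE; 'shared' means other software
 * walkers may touch the same tables concurrently. Returns false if we lost
 * a race and the caller should retry.
 */
static bool pte_try_break(_Atomic kvm_pte_t *ptep, kvm_pte_t old, bool shared)
{
	if (!shared) {
		/* Exclusive walker: a plain store takes ownership of the PTE. */
		atomic_store_explicit(ptep, KVM_PTE_LOCKED, memory_order_relaxed);
	} else if (!atomic_compare_exchange_strong(ptep, &old, KVM_PTE_LOCKED)) {
		/* Another software walker got here first. */
		return false;
	}

	/*
	 * Only an evicted *valid* PTE was ever visible to the hardware
	 * walker, so only that case needs the DSB + TLBI of break-before-make.
	 */
	if (old & KVM_PTE_VALID) {
		/* dsb(ishst); tlbi(...); dsb(ish); -- elided in this sketch */
	}

	return true;
}
```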
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215855.1895367-1-oliver.upton@linux.dev
Create a helper to initialize a table and directly call
smp_store_release() to install it (for now). Prepare for a subsequent
change that generalizes PTE writes with a helper.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-11-oliver.upton@linux.dev
The stage2 attr walker is already used for parallel walks. Since commit
f783ef1c0e ("KVM: arm64: Add fast path to handle permission relaxation
during dirty logging"), KVM acquires the read lock when
write-unprotecting a PTE. However, the walker only uses a simple store
to update the PTE. This is safe as the only possible race is with
hardware updates to the access flag, which is benign.
However, a subsequent change to KVM will allow more changes to the stage
2 page tables to be done in parallel. Prepare the stage 2 attribute
walker by performing atomic updates to the PTE when walking in parallel.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-10-oliver.upton@linux.dev
Use RCU to safely walk the stage-2 page tables in parallel. Acquire and
release the RCU read lock when traversing the page tables. Defer the
freeing of table memory to an RCU callback. Indirect the calls into RCU
and provide stubs for hypervisor code, as RCU is not available in such a
context.
The RCU protection doesn't amount to much at the moment, as readers are
already protected by the read-write lock (all walkers that free table
memory take the write lock). Nonetheless, a subsequent change will
further relax the locking requirements around the stage-2 MMU, thereby
depending on RCU.
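The indirection could look roughly like this (HYPERVISOR_BUILD is a
stand-in for however the nVHE object is distinguished at build time; only
rcu_read_lock()/rcu_read_unlock() are real kernel APIs here):

```c
#ifdef HYPERVISOR_BUILD		/* nVHE object: no RCU available */

static inline void kvm_pgtable_walk_begin(void) { }
static inline void kvm_pgtable_walk_end(void) { }

#else				/* kernel proper */

#include <linux/rcupdate.h>

static inline void kvm_pgtable_walk_begin(void)
{
	rcu_read_lock();
}

static inline void kvm_pgtable_walk_end(void)
{
	rcu_read_unlock();
}

#endif
```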
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-9-oliver.upton@linux.dev
The break-before-make sequence is a bit annoying as it opens a window
wherein memory is unmapped from the guest. KVM should replace the PTE
as quickly as possible and avoid unnecessary work in between.
Presently, the stage-2 map walker tears down a removed table before
installing a block mapping when coalescing a table into a block. As the
removed table is no longer visible to hardware walkers after the
DSB+TLBI, it is possible to move the remaining cleanup to happen after
installing the new PTE.
Reshuffle the stage-2 map walker to install the new block entry in
the pre-order callback. Unwire all of the teardown logic and replace
it with a call to kvm_pgtable_stage2_free_removed() after fixing
the PTE. The post-order visitor is now completely unnecessary, so drop
it. Finally, touch up the comments to better represent the now
simplified map walker.
Note that the call to tear down the unlinked stage-2 is indirected
as a subsequent change will use an RCU callback to trigger tear down.
RCU is not available to pKVM, so there is a need to use different
implementations on pKVM and non-pKVM VMs.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-8-oliver.upton@linux.dev
Use an opaque type for pteps and require that visitors explicitly
dereference the pointer before use. Protecting page table memory with RCU
requires that KVM dereferences RCU-annotated pointers before use. However,
RCU
is not available for use in the nVHE hypervisor and the opaque type can
be conditionally annotated with RCU for the stage-2 MMU.
Call the type a 'pteref' to avoid a naming collision with raw pteps. No
functional change intended.
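Schematically, the opaque type and its dereference helper might look like
the following (definitions are illustrative; kvm_pte_t, __rcu and
rcu_dereference_check() are assumed from the surrounding kernel context):

```c
#ifdef STAGE2_CAN_USE_RCU	/* stand-in for the kernel-proper build */

typedef kvm_pte_t __rcu *kvm_pteref_t;

static inline kvm_pte_t *kvm_dereference_pteref(kvm_pteref_t pteref, bool shared)
{
	/* Only shared walks are under RCU; exclusive walks need no protection. */
	return rcu_dereference_check(pteref, !shared);
}

#else				/* nVHE hypervisor: plain pointers, no RCU */

typedef kvm_pte_t *kvm_pteref_t;

static inline kvm_pte_t *kvm_dereference_pteref(kvm_pteref_t pteref, bool shared)
{
	return pteref;
}

#endif
```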
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-7-oliver.upton@linux.dev
A subsequent change to KVM will move the tear down of an unlinked
stage-2 subtree out of the critical path of the break-before-make
sequence.
Introduce a new helper for tearing down unlinked stage-2 subtrees.
Leverage the existing stage-2 free walkers to do so, with a deep call
into __kvm_pgtable_walk() as the subtree is no longer reachable from the
root.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-6-oliver.upton@linux.dev
In order to tear down page tables from outside the context of
kvm_pgtable (such as an RCU callback), stop passing a pointer through
kvm_pgtable_walk_data.
No functional change intended.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-5-oliver.upton@linux.dev
As a prerequisite for getting visitors off of struct kvm_pgtable, pass
mm_ops through the visitor context.
No functional change intended.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-4-oliver.upton@linux.dev
Rather than reading the ptep all over the shop, read the ptep once from
__kvm_pgtable_visit() and stick it in the visitor context. Reread the
ptep after visiting a leaf in case the callback installed a new table
underneath.
No functional change intended.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-3-oliver.upton@linux.dev
Passing new arguments by value to the visitor callbacks is extremely
inflexible for stuffing new parameters used by only some of the
visitors. Use a context structure instead and pass the pointer through
to the visitor callback.
While at it, redefine the 'flags' parameter to the visitor to contain
the bit indicating the phase of the walk. Pass the entire set of flags
through the context structure such that the walker can communicate
additional state to the visitor callback.
No functional change intended.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Ben Gardon <bgardon@google.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221107215644.1895162-2-oliver.upton@linux.dev
Enable ring-based dirty memory tracking on ARM64:
- Enable CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL.
- Enable CONFIG_NEED_KVM_DIRTY_RING_WITH_BITMAP.
- Set KVM_DIRTY_LOG_PAGE_OFFSET for the ring buffer's physical page
offset.
- Add ARM64 specific kvm_arch_allow_write_without_running_vcpu() to
keep the site of saving vgic/its tables out of the no-running-vcpu
radar.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110104914.31280-5-gshan@redhat.com
Enable asynchronous unwind table generation for both the core kernel as
well as modules, and emit the resulting .eh_frame sections as init code
so we can use the unwind directives for code patching at boot or module
load time.
This will be used by dynamic shadow call stack support, which will rely
on code patching rather than compiler codegen to emit the shadow call
stack push and pop instructions.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20221027155908.1940624-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
virt/kvm/irqchip.c is including "irq.h" from the arch-specific KVM source
directory (i.e. not from arch/*/include) for the sole purpose of retrieving
irqchip_in_kernel.
Making the function inline in a header that is already included,
such as asm/kvm_host.h, is not possible because it needs to look at
struct kvm which is defined after asm/kvm_host.h is included. So add a
kvm_arch_irqchip_in_kernel non-inline function; irqchip_in_kernel() is
only performance critical on arm64 and x86, and the non-inline function
is enough on all other architectures.
irq.h can then be deleted from all architectures except x86.
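For a given architecture, the non-inline wrapper could be as simple as the
following sketch (assuming the arch already provides irqchip_in_kernel(),
with declarations taken from the surrounding kernel context):

```c
/* Sketch: forward to the existing per-arch inline helper. */
bool kvm_arch_irqchip_in_kernel(struct kvm *kvm)
{
	return irqchip_in_kernel(kvm);
}
```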
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new "interruptible" flag showing that the caller is willing to be
interrupted by signals during the __gfn_to_pfn_memslot() request. Wire it
up with a FOLL_INTERRUPTIBLE flag that we've just introduced.
This prepares KVM to be able to respond to SIGUSR1 (for QEMU that's the
SIGIPI) even during e.g. handling a userfaultfd page fault.
No functional change intended.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221011195809.557016-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The trapping of SMPRI_EL1 and TPIDR2_EL0 currently only really
works on nVHE, as only this mode uses the fine-grained trapping
that controls these two registers.
Move the trapping enable/disable code into
__{de,}activate_traps_common(), allowing it to be called when it
actually matters on VHE, and remove the flipping of EL2 control
for TPIDR2_EL0, which only affects the host access of this
register.
Fixes: 861262ab86 ("KVM: arm64: Handle SME host state when running guests")
Reported-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/86bkpqer4z.wl-maz@kernel.org
enter_exception64() performs an MTE check, which involves dereferencing
vcpu->kvm. While vcpu has already been fixed up to be a HYP VA pointer,
kvm is still a pointer in the kernel VA space.
This only affects nVHE configurations with MTE enabled, as in other
cases, the pointer is either valid (VHE) or not dereferenced (!MTE).
Fix this by first converting kvm to a HYP VA pointer.
Fixes: ea7fc1bb1c ("KVM: arm64: Introduce MTE VM feature")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221027120945.29679-1-ryan.roberts@arm.com
hyp_get_page_state() is used with pKVM to retrieve metadata about a page
by parsing a hypervisor stage-1 PTE. However, it incorrectly uses a
helper which parses *stage-2* mappings. Ouch.
Luckily, pkvm_getstate() only looks at the software bits, which happen
to be in the same place for stage-1 and stage-2 PTEs, and this all ends
up working correctly by accident. But clearly, we should do better.
Fix hyp_get_page_state() to use the correct helper.
Fixes: e82edcc75c ("KVM: arm64: Implement do_share() helper for sharing memory")
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221025145156.855308-1-qperret@google.com
Merge tag 'kvmarm-fixes-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.1, take #2
- Fix a bug preventing restoring an ITS containing mappings
for very large and very sparse device topology
- Work around a relocation handling error when compiling
the nVHE object with profile optimisation
Merge tag 'kvmarm-fixes-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.1, take #1
- Fix for stage-2 invalidation holding the VM MMU lock
for too long by limiting the walk to the largest
block mapping size
- Enable stack protection and branch profiling for VHE
- Two selftest fixes
With some PCIe topologies, restoring a guest fails while
parsing the ITS device tables.
Reproducer hints:
1. Create ARM virt VM with pxb-pcie bus which adds
extra host bridges, with qemu command like:
```
-device pxb-pcie,bus_nr=8,id=pci.x,numa_node=0,bus=pcie.0 \
-device pcie-root-port,..,bus=pci.x \
...
-device pxb-pcie,bus_nr=37,id=pci.y,numa_node=1,bus=pcie.0 \
-device pcie-root-port,..,bus=pci.y \
...
```
2. Ensure the guest uses 2-level device table
3. Perform VM migration which calls save/restore device tables
In that setup, we get a big "offset" between 2 device_ids,
which makes unsigned "len" round up a big positive number,
causing the scan loop to continue with a bad GPA. For example:
1. L1 table has 2 entries;
2. and we are now scanning at L2 table entry index 2075 (pointed
to by L1 first entry)
3. if the next device id is 9472, we will get a big offset: 7397;
4. with an unsigned 'len', 'len -= offset * esz' underflows to a large
positive number, and we mistakenly continue into the next iteration
with a bad GPA;
(It should break out of the current L2 table scanning, and jump
into the next L1 table entry)
5. that bad GPA fails the guest read.
Fix it by stopping the L2 table scan when the next device id is
outside of the current table, allowing the scan to continue from
the next L1 table entry.
Thanks to Eric Auger for the fix suggestion.
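A small standalone demonstration of the unsigned underflow described above,
using the numbers from the example (purely illustrative):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t len = 256;	/* bytes left in the current L2 table */
	uint64_t esz = 8;	/* entry size */
	uint64_t offset = 7397;	/* gap to the next device id */

	len -= offset * esz;	/* wraps around instead of going negative */
	printf("len after the unsigned underflow: %llu\n",
	       (unsigned long long)len);

	/*
	 * The fix: once the next device id falls outside the current L2
	 * table, stop scanning it and move on to the next L1 entry.
	 */
	return 0;
}
```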
Fixes: 920a7a8fa9 ("KVM: arm64: vgic-its: Add infrastructure for table lookup")
Suggested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Eric Ren <renzhengeek@gmail.com>
[maz: commit message tidy-up]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/d9c3a564af9e2c5bf63f48a7dcbf08cd593c5c0b.1665802985.git.renzhengeek@gmail.com
Kernel build with clang and KCFLAGS=-fprofile-sample-use=<profile> fails with:
error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
section ".rel.llvm.call-graph-profile"
Starting from 13.0.0, llvm can generate an SHT_REL section, see
https://reviews.llvm.org/rGca3bdb57fa1ac98b711a735de048c12b5fdd8086.
gen-hyprel does not support the SHT_REL relocation section.
Filter out profile use flags to fix the build with profile optimization.
Signed-off-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221014184532.3153551-1-denik@chromium.org
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more kvm updates from Paolo Bonzini:
"The main batch of ARM + RISC-V changes, and a few fixes and cleanups
for x86 (PMU virtualization and selftests).
ARM:
- Fixes for single-stepping in the presence of an async exception as
well as the preservation of PSTATE.SS
- Better handling of AArch32 ID registers on AArch64-only systems
- Fixes for the dirty-ring API, allowing it to work on architectures
with relaxed memory ordering
- Advertise the new kvmarm mailing list
- Various minor cleanups and spelling fixes
RISC-V:
- Improved instruction encoding infrastructure for instructions not
yet supported by binutils
- Svinval support for both KVM Host and KVM Guest
- Zihintpause support for KVM Guest
- Zicbom support for KVM Guest
- Record number of signal exits as a VCPU stat
- Use generic guest entry infrastructure
x86:
- Misc PMU fixes and cleanups.
- selftests: fixes for Hyper-V hypercall
- selftests: fix nx_huge_pages_test on TDP-disabled hosts
- selftests: cleanups for fix_hypercall_test"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (57 commits)
riscv: select HAVE_POSIX_CPU_TIMERS_TASK_WORK
RISC-V: KVM: Use generic guest entry infrastructure
RISC-V: KVM: Record number of signal exits as a vCPU stat
RISC-V: KVM: add __init annotation to riscv_kvm_init()
RISC-V: KVM: Expose Zicbom to the guest
RISC-V: KVM: Provide UAPI for Zicbom block size
RISC-V: KVM: Make ISA ext mappings explicit
RISC-V: KVM: Allow Guest use Zihintpause extension
RISC-V: KVM: Allow Guest use Svinval extension
RISC-V: KVM: Use Svinval for local TLB maintenance when available
RISC-V: Probe Svinval extension form ISA string
RISC-V: KVM: Change the SBI specification version to v1.0
riscv: KVM: Apply insn-def to hlv encodings
riscv: KVM: Apply insn-def to hfence encodings
riscv: Introduce support for defining instructions
riscv: Add X register names to gpr-nums
KVM: arm64: Advertise new kvmarm mailing list
kvm: vmx: keep constant definition format consistent
kvm: mmu: fix typos in struct kvm_arch
KVM: selftests: Fix nx_huge_pages_test on TDP-disabled hosts
...
I am sending this out early as I will be travelling next week. There is a
lone mm patch for which Andrew gave an informal ack at
https://lore.kernel.org/linux-mm/20220817102500.440c6d0a3fce296fdf91bea6@linux-foundation.org.
I will send the bulk of the ARM work, as well as the other
architectures, at the end of next week.
ARM:
* Account stage2 page table allocations in memory stats.
x86:
* Account EPT/NPT page table allocations in memory stats.
* Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses.
* Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of
Hyper-V now support eVMCS fields associated with features that are
enumerated to the guest.
* Use KVM's sanitized VMCS config as the basis for the values of nested VMX
capabilities MSRs.
* A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed
a longstanding issue where the exceptions would incorrectly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed
for good.
* A handful of fixes for memory leaks in error paths.
* Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow.
* Never write to memory from non-sleepable kvm_vcpu_check_block()
* Selftests refinements and cleanups.
* Misc typo cleanups.
Generic:
* remove KVM_REQ_UNHALT
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM2zwcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpbwf+MlVeOlzE5SBdrJ0TEnLmKUel1lSz
QnZzP5+D65oD0zhCilUZHcg6G4mzZ5SdVVOvrGJvA0eXh25ruLNMF6jbaABkMLk/
FfI1ybN7A82hwJn/aXMI/sUurWv4Jteaad20JC2DytBCnsW8jUqc49gtXHS2QWy4
3uMsFdpdTAg4zdJKgEUfXBmQviweVpjjl3ziRyZZ7yaeo1oP7XZ8LaE1nR2l5m0J
mfjzneNm5QAnueypOh5KhSwIvqf6WHIVm/rIHDJ1HIFbgfOU0dT27nhb1tmPwAcE
+cJnnMUHjZqtCXteHkAxMClyRq0zsEoKk0OGvSOOMoq3Q0DavSXUNANOig==
=/hqX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"The first batch of KVM patches, mostly covering x86.
ARM:
- Account stage2 page table allocations in memory stats
x86:
- Account EPT/NPT page table allocations in memory stats
- Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR
accesses
- Drop eVMCS controls filtering for KVM on Hyper-V, all known
versions of Hyper-V now support eVMCS fields associated with
features that are enumerated to the guest
- Use KVM's sanitized VMCS config as the basis for the values of
nested VMX capabilities MSRs
- A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed a
longstanding issue where the exceptions would incorrectly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed for
good
- A handful of fixes for memory leaks in error paths
- Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow
- Never write to memory from non-sleepable kvm_vcpu_check_block()
- Selftests refinements and cleanups
- Misc typo cleanups
Generic:
- remove KVM_REQ_UNHALT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
KVM: remove KVM_REQ_UNHALT
KVM: mips, x86: do not rely on KVM_REQ_UNHALT
KVM: x86: never write to memory from kvm_vcpu_check_block()
KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending
KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter
KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set
KVM: x86: lapic does not have to process INIT if it is blocked
KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific
KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed
KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
KVM: x86: make vendor code check for all nested events
mailmap: Update Oliver's email address
KVM: x86: Allow force_emulation_prefix to be written without a reload
KVM: selftests: Add an x86-only test to verify nested exception queueing
KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events()
KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior
KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
...
For historical reasons, the VHE code inherited the build configuration from
nVHE. Now that those two parts have their own folder and makefile, we can
enable stack protection and branch profiling for VHE.
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Reviewed-by: Quentin Perret <qperret@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221004154216.2833636-1-vdonnefort@google.com
Presently stage2_apply_range() works on a batch of memory addressed by a
stage 2 root table entry for the VM. Depending on the IPA limit of the
VM and PAGE_SIZE of the host, this could address a massive range of
memory. Some examples:
4 level, 4K paging -> 512 GB batch size
3 level, 64K paging -> 4TB batch size
Unsurprisingly, working on such a large range of memory can lead to soft
lockups. When running dirty_log_perf_test:
./dirty_log_perf_test -m -2 -s anonymous_thp -b 4G -v 48
watchdog: BUG: soft lockup - CPU#0 stuck for 45s! [dirty_log_perf_:16703]
Modules linked in: vfat fat cdc_ether usbnet mii xhci_pci xhci_hcd sha3_generic gq(O)
CPU: 0 PID: 16703 Comm: dirty_log_perf_ Tainted: G O 6.0.0-smp-DEV #1
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : dcache_clean_inval_poc+0x24/0x38
lr : clean_dcache_guest_page+0x28/0x4c
sp : ffff800021763990
pmr_save: 000000e0
x29: ffff800021763990 x28: 0000000000000005 x27: 0000000000000de0
x26: 0000000000000001 x25: 00400830b13bc77f x24: ffffad4f91ead9c0
x23: 0000000000000000 x22: ffff8000082ad9c8 x21: 0000fffafa7bc000
x20: ffffad4f9066ce50 x19: 0000000000000003 x18: ffffad4f92402000
x17: 000000000000011b x16: 000000000000011b x15: 0000000000000124
x14: ffff07ff8301d280 x13: 0000000000000000 x12: 00000000ffffffff
x11: 0000000000010001 x10: fffffc0000000000 x9 : ffffad4f9069e580
x8 : 000000000000000c x7 : 0000000000000000 x6 : 000000000000003f
x5 : ffff07ffa2076980 x4 : 0000000000000001 x3 : 000000000000003f
x2 : 0000000000000040 x1 : ffff0830313bd000 x0 : ffff0830313bcc40
Call trace:
dcache_clean_inval_poc+0x24/0x38
stage2_unmap_walker+0x138/0x1ec
__kvm_pgtable_walk+0x130/0x1d4
__kvm_pgtable_walk+0x170/0x1d4
__kvm_pgtable_walk+0x170/0x1d4
__kvm_pgtable_walk+0x170/0x1d4
kvm_pgtable_stage2_unmap+0xc4/0xf8
kvm_arch_flush_shadow_memslot+0xa4/0x10c
kvm_set_memslot+0xb8/0x454
__kvm_set_memory_region+0x194/0x244
kvm_vm_ioctl_set_memory_region+0x58/0x7c
kvm_vm_ioctl+0x49c/0x560
__arm64_sys_ioctl+0x9c/0xd4
invoke_syscall+0x4c/0x124
el0_svc_common+0xc8/0x194
do_el0_svc+0x38/0xc0
el0_svc+0x2c/0xa4
el0t_64_sync_handler+0x84/0xf0
el0t_64_sync+0x1a0/0x1a4
Use the largest supported block mapping for the configured page size as
the batch granularity. In so doing the walker is guaranteed to visit a
leaf only once.
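A minimal sketch of the batching idea, with an illustrative helper standing
in for the real pgtable API (the constant below assumes 4K pages with 1GiB
blocks; other configurations differ):
```
#include <stdint.h>

/* Largest block mapping for the configured page size; 1GiB is only an
 * example value for 4K pages. */
static uint64_t largest_block_size(void)
{
	return 1ULL << 30;
}

/* End of the current batch: never cross a largest-block boundary, so the
 * walker visits each leaf at most once per batch and the caller can
 * reschedule between batches instead of soft-locking. */
static uint64_t stage2_batch_end(uint64_t addr, uint64_t end)
{
	uint64_t size = largest_block_size();
	uint64_t boundary = (addr + size) & ~(size - 1);

	return boundary - 1 < end - 1 ? boundary : end;
}
```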
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221007234151.461779-3-oliver.upton@linux.dev
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
vector granule register added to the user regs together with SVE perf
extensions documentation.
- SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation
to match the actual kernel behaviour (zeroing the registers on syscall
rather than "zeroed or preserved" previously).
- More conversions to automatic system registers generation.
- vDSO: use self-synchronising virtual counter access in gettimeofday()
if the architecture supports it.
- arm64 stacktrace cleanups and improvements.
- arm64 atomics improvements: always inline assembly, remove LL/SC
trampolines.
- Improve the reporting of EL1 exceptions: rework BTI and FPAC exception
handling, better EL1 undefs reporting.
- Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
result.
- arm64 defconfig updates: build CoreSight as a module, enable options
necessary for docker, memory hotplug/hotremove, enable all PMUs
provided by Arm.
- arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
extensions).
- arm64 ftrace updates/fixes: fix module PLTs with mcount, remove
unused function.
- kselftest updates for arm64: simple HWCAP validation, FP stress test
improvements, validation of ZA regs in signal handlers, include larger
SVE and SME vector lengths in signal tests, various cleanups.
- arm64 alternatives (code patching) improvements to robustness and
consistency: replace cpucap static branches with equivalent
alternatives, associate callback alternatives with a cpucap.
- Miscellaneous updates: optimise kprobe performance of patching
single-step slots, simplify uaccess_mask_ptr(), move MTE registers
initialisation to C, support huge vmalloc() mappings, run softirqs on
the per-CPU IRQ stack, compat (arm32) misalignment fixups for
multiword accesses.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmM9W4cACgkQa9axLQDI
XvEy3w/+LJ3KCFowWiz5gTAWikjv+UVssHjLMJixn47V7hsEFQ26Xnam/438rTMI
kE95u6DHUpw2SMIxKzFRO7oI5cQtP+cWGwTtOUnjVO+U1oN+HqDOIbO9DbylWDcU
eeeqMMmawMfTPuZrYklpOhXscsorbrKIvYBg7wHYOcwBYV3EPhWr89lwMvTVRuyJ
qpX628KlkGMaBcONNhv3nS3qZcAOs0oHQCAVS4C8czLDL+vtJlumXUS3xr1Mqm72
xtFe7sje8Djr2kZ8mzh0GbFiZEBoBD3F/l7ayq8gVRaVpToUt8sk36Stjs4LojF1
6imuAfji/5TItkScq5KhGqj6MIugwp/eUVbRN74OLNTYx7msF1ZADNFQ+Q0UuY0H
SYK13KvmOji0xjS8qAfhqrwNB79sk3fb+zF9LjETbdz4ZJCgg9gcFbSUTY0DvMfS
MXZk/jVeB07olA8xYbjh0BRt4UV9xU628FPQzK5k7e4Nzl4jSvgtJZCZanfuVtjy
/ZS1vbN8o7tQLBAlVnw+Exi/VedkKxkkMgm8tPKsMgERTFDx0Pc4Gs72hRpDnPWT
MRbeCCGleAf3JQ5vF0coBDNOCEVvweQgShHOyHTz0GyhWXLCFx3RJICo5I4EIpps
LLUk4JK0fO3LVrf1AEpu5ZP4+Sact0zfsH3gB7qyLPYFDmjDXD8=
=jl3Z
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE
vector granule register added to the user regs together with SVE perf
extensions documentation.
- SVE updates: add HWCAP for SVE EBF16, update the SVE ABI
documentation to match the actual kernel behaviour (zeroing the
registers on syscall rather than "zeroed or preserved" previously).
- More conversions to automatic system registers generation.
- vDSO: use self-synchronising virtual counter access in gettimeofday()
if the architecture supports it.
- arm64 stacktrace cleanups and improvements.
- arm64 atomics improvements: always inline assembly, remove LL/SC
trampolines.
- Improve the reporting of EL1 exceptions: rework BTI and FPAC
exception handling, better EL1 undefs reporting.
- Cortex-A510 erratum 2658417: remove BF16 support due to incorrect
result.
- arm64 defconfig updates: build CoreSight as a module, enable options
necessary for docker, memory hotplug/hotremove, enable all PMUs
provided by Arm.
- arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME
extensions).
- arm64 ftrace updates/fixes: fix module PLTs with mcount, remove
unused function.
- kselftest updates for arm64: simple HWCAP validation, FP stress test
improvements, validation of ZA regs in signal handlers, include
larger SVE and SME vector lengths in signal tests, various cleanups.
- arm64 alternatives (code patching) improvements to robustness and
consistency: replace cpucap static branches with equivalent
alternatives, associate callback alternatives with a cpucap.
- Miscellaneous updates: optimise kprobe performance of patching
single-step slots, simplify uaccess_mask_ptr(), move MTE registers
initialisation to C, support huge vmalloc() mappings, run softirqs on
the per-CPU IRQ stack, compat (arm32) misalignment fixups for
multiword accesses.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
arm64/kprobe: Optimize the performance of patching single-step slot
arm64: defconfig: Add Coresight as module
kselftest/arm64: Handle EINTR while reading data from children
kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
kselftest/arm64: Don't repeat termination handler for fp-stress
ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
arm64/mm: fold check for KFENCE into can_set_direct_map()
arm64: ftrace: fix module PLTs with mcount
arm64: module: Remove unused plt_entry_is_initialized()
arm64: module: Make plt_equals_entry() static
arm64: fix the build with binutils 2.27
kselftest/arm64: Don't enable v8.5 for MTE selftest builds
arm64: uaccess: simplify uaccess_mask_ptr()
arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
kselftest/arm64: Fix typo in hwcap check
arm64: mte: move register initialization to C
arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
arm64/sve: Add Perf extensions documentation
...
- Fixes for single-stepping in the presence of an async
exception as well as the preservation of PSTATE.SS
- Better handling of AArch32 ID registers on AArch64-only
systems
- Fixes for the dirty-ring API, allowing it to work on
architectures with relaxed memory ordering
- Advertise the new kvmarm mailing list
- Various minor cleanups and spelling fixes
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmM5hQcPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDoMUP/jra4HSmujLUB5G7Op8HxuurEecOc6xtw0Af
AbDLlVc2Vs4rrdVh8GMc8D80atUAVitp8IFjdp/PzI2GTBTzWz43Gav2AbhgIJbJ
xoFVHL8LkdHKyMbq10359DqGMqhIf41OFzGwhbzcx2V4pKNkSpjbCpu3bi/+Ybjg
006ZpZc7NAU0rZgw9Flb/dhn0jw7RMc3orhoDQ4tBp1P/VhvqvgFt5bWipkvvBP7
+lQK28ujG3ghST/hKRhg6ozgy5+6NEEHMuhErMYP8nIivRchX+pWF2Lb0qGH1e+U
v2MZIZnIIUjyTV1vbYlxtltzfYmPuQ2MFNUBawI9tmlIOU9vJSCzeJS64uWK4KLV
kbmk57OfC7rQoSNJH4jaKQp0YpIktrB9Vei97t4I7NwEmkjQj6cLTgg4tQrNqTiQ
cFGeC9mE+lhFC8z1lCbna2eG631FxpPrB1SJ1/CU9wboam9dUfXGIvBPh+i2pvMZ
vcxzUZJ11y+/uhp4k8i2PBwNno0iwRXd5MinwRUs2CR5vhs8qa5y7FVWKyqKpgI2
xqr4lYTixJZL3mWkYyOQuClrTbT1zkoaPldLq6M7wvO08+QV8ryMeyKT+9s/gNQU
dcYSwBCWZaOZm2nN8/zjxRb7VqZVu3cwyXi9XXUWNTCgIe/Q/SDPbXU/Hwbgzf8X
UsQF7e9A
=aNPK
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for v6.1
- Fixes for single-stepping in the presence of an async
exception as well as the preservation of PSTATE.SS
- Better handling of AArch32 ID registers on AArch64-only
systems
- Fixes for the dirty-ring API, allowing it to work on
architectures with relaxed memory ordering
- Advertise the new kvmarm mailing list
- Various minor cleanups and spelling fixes
* kvm-arm64/misc-6.1:
: .
: Misc KVM/arm64 fixes and improvement for v6.1
:
: - Simplify the affinity check when moving a GICv3 collection
:
: - Tone down the shouting when kvm-arm.mode=protected is passed
: to a guest
:
: - Fix various comments
:
: - Advertise the new kvmarm@lists.linux.dev and deprecate the
: old Columbia list
: .
KVM: arm64: Advertise new kvmarm mailing list
KVM: arm64: Fix comment typo in nvhe/switch.c
KVM: selftests: Update top-of-file comment in psci_test
KVM: arm64: Ignore kvm-arm.mode if !is_hyp_mode_available()
KVM: arm64: vgic: Remove duplicate check in update_affinity_collection()
Signed-off-by: Marc Zyngier <maz@kernel.org>
* for-next/alternatives:
: Alternatives (code patching) improvements
arm64: fix the build with binutils 2.27
arm64: avoid BUILD_BUG_ON() in alternative-macros
arm64: alternatives: add shared NOP callback
arm64: alternatives: add alternative_has_feature_*()
arm64: alternatives: have callbacks take a cap
arm64: alternatives: make alt_region const
arm64: alternatives: hoist print out of __apply_alternatives()
arm64: alternatives: proton-pack: prepare for cap changes
arm64: alternatives: kvm: prepare for cap changes
arm64: cpufeature: make cpus_have_cap() noinstr-safe
Fix the comment of __hyp_vgic_restore_state() to say VHE rather than VEH,
and change the underscore to a dash to match the comment
above __hyp_vgic_save_state().
Signed-off-by: Wei-Lin Chang <r09922117@csie.ntu.edu.tw>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220929042839.24277-1-r09922117@csie.ntu.edu.tw
KVM_REQ_UNHALT is now unnecessary because it is replaced by the return
value of kvm_vcpu_block/kvm_vcpu_halt. Remove it.
No functional change intended.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20220921003201.1441511-13-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ignore kvm-arm.mode if !is_hyp_mode_available(). Specifically, we want
to avoid switching kvm_mode to KVM_MODE_PROTECTED if hypervisor mode is
not available. This prevents the "Protected KVM" cpu capability from being
reported when Linux is booting at EL1 and would not have KVM enabled.
Reasonably though, we should warn if the command line is requesting a
KVM mode at all if KVM isn't actually available. Allow
"kvm-arm.mode=none" to skip the warning since this would disable KVM
anyway.
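A condensed sketch of the resulting early-parameter handling; the structure
and messages are approximations of arm.c rather than the exact code:
```
/* Hypothetical condensed version of the kvm-arm.mode early_param handler. */
static int __init early_kvm_mode_cfg(char *arg)
{
	if (!arg)
		return -EINVAL;

	if (strcmp(arg, "none") == 0) {
		kvm_mode = KVM_MODE_NONE;	/* disabling KVM: no warning needed */
		return 0;
	}

	if (!is_hyp_mode_available()) {
		pr_warn_once("KVM is not available. Ignoring kvm-arm.mode\n");
		return 0;			/* booted at EL1: leave the mode alone */
	}

	if (strcmp(arg, "protected") == 0) {
		kvm_mode = KVM_MODE_PROTECTED;
		return 0;
	}

	return -EINVAL;
}
early_param("kvm-arm.mode", early_kvm_mode_cfg);
```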
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220920190658.2880184-1-quic_eberman@quicinc.com
The 'coll' parameter to update_affinity_collection() is never NULL,
so comparing it with 'ite->collection' is enough to cover both
the NULL case and the "another collection" case.
Remove the duplicate check in update_affinity_collection().
Signed-off-by: Gavin Shan <gshan@redhat.com>
[maz: repainted commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220923065447.323445-1-gshan@redhat.com
With commit 0c24e06119 ("mm: kmemleak: add rbtree and store physical
address for objects allocated with PA"), kmemleak started to put objects
allocated with a physical address onto the object_phys_tree_root tree.
kmemleak_free_part() therefore no longer works as expected on physically
allocated objects (hyp_mem_base in this case), as it attempts to search
for and remove things in the object_tree_root tree.
Fix it by using kmemleak_free_part_phys() to unregister hyp_mem_base. This
fixes an immediate crash when booting a KVM host in protected mode with
kmemleak enabled.
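Schematically, the change boils down to unregistering the reservation with
the physical-address variant:
```
/* Before: looks the object up in the virtual-address tree and misses it. */
kmemleak_free_part(__va(hyp_mem_base), hyp_mem_size);

/* After: hyp_mem_base was registered by physical address, so the _phys
 * variant is needed to remove it from object_phys_tree_root. */
kmemleak_free_part_phys(hyp_mem_base, hyp_mem_size);
```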
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220908130659.2021-1-yuzenghui@huawei.com
* kvm-arm64/single-step-async-exception:
: .
: Single-step fixes from Reiji Watanabe:
:
: "This series fixes two bugs of single-step execution enabled by
: userspace, and add a test case for KVM_GUESTDBG_SINGLESTEP to
: the debug-exception test to verify the single-step behavior."
: .
KVM: arm64: selftests: Add a test case for KVM_GUESTDBG_SINGLESTEP
KVM: arm64: selftests: Refactor debug-exceptions to make it amenable to new test cases
KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pending
KVM: arm64: Preserve PSTATE.SS for the guest while single-step is enabled
Signed-off-by: Marc Zyngier <maz@kernel.org>
While userspace enables single-step, if the Software Step state at the
last guest exit was "Active-pending", clear PSTATE.SS on guest entry
to restore the state.
Currently, KVM sets PSTATE.SS to 1 on every guest entry while userspace
enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP).
This means KVM always puts the vCPU's Software Step state into
"Active-not-pending" on guest entry, which lets the vCPU perform a
single step (after which the Software Step exception is taken). This
could cause an extra single step (without returning to userspace) if
the Software Step state at the last guest exit was "Active-pending",
i.e. the last exit was triggered by an asynchronous exception after
the single step was performed but before the Software Step exception
was taken (see "Figure D2-3 Software step state machine" and "D2.12.7
Behavior in the active-pending state" in ARM DDI 0487I.a for more
details on this behavior).
Fix this by clearing PSTATE.SS on guest entry if the Software Step
state at the last exit was "Active-pending", so that KVM restores the
state (and the exception is taken before any further single step is
performed).
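Schematically, the guest-entry handling becomes something like the following;
the "active-pending recorded at the last exit" condition is shown as a plain
boolean placeholder rather than KVM's actual bookkeeping:
```
/* Sketch of guest entry while userspace has KVM_GUESTDBG_SINGLESTEP set. */
if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
	if (step_state_was_active_pending)		/* placeholder flag */
		*vcpu_cpsr(vcpu) &= ~DBG_SPSR_SS;	/* take the pending step exception first */
	else
		*vcpu_cpsr(vcpu) |= DBG_SPSR_SS;	/* arm a fresh single step */
}
```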
Fixes: 337b99bf7e ("KVM: arm64: guest debug, add support for single-step")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220917010600.532642-3-reijiw@google.com
Preserve the PSTATE.SS value for the guest while userspace enables
single-step (i.e. while KVM manipulates the PSTATE.SS) for the vCPU.
Currently, while userspace enables single-step for the vCPU
(with KVM_GUESTDBG_SINGLESTEP), KVM sets PSTATE.SS to 1 on every
guest entry, not saving its original value.
When userspace disables single-step, KVM doesn't restore the original
value for the subsequent guest entry (it uses the current value
instead).
Exception return instructions copy PSTATE.SS from SPSR_ELx.SS
only in certain cases when single-step is enabled (and set it to 0
in other cases). So, in practice, the value only matters when the
guest enables single-step (and when the guest's Software Step state
isn't affected by the single-step enabled by userspace).
Fix this by preserving the original PSTATE.SS value while userspace
enables single-step, and restoring the value once it is disabled.
This fix modifies the behavior of GET_ONE_REG/SET_ONE_REG for the
PSTATE.SS while single-step is enabled by userspace.
Presently, GET_ONE_REG/SET_ONE_REG gets/sets the current PSTATE.SS
value, which KVM will override on the next guest entry (i.e. the
value userspace gets/sets is not used for the next guest entry).
With this patch, GET_ONE_REG/SET_ONE_REG will get/set the guest's
preserved value, which KVM will preserve and try to restore after
single-step is disabled.
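A sketch of the resulting save/restore flow; the saved_pstate_ss field and
the helper shown here are assumptions for illustration, not the exact vcpu
state names:
```
/* When userspace turns single-step on: remember the guest's own SS bit. */
if (!vcpu_was_single_stepping(vcpu))			/* placeholder helper */
	vcpu->arch.saved_pstate_ss = *vcpu_cpsr(vcpu) & DBG_SPSR_SS;

/* While single-step stays enabled, KVM owns PSTATE.SS on guest entry
 * (set, or cleared for the Active-pending case, as above). */

/* When userspace turns single-step off: put the guest's value back. */
*vcpu_cpsr(vcpu) = (*vcpu_cpsr(vcpu) & ~DBG_SPSR_SS) |
		   vcpu->arch.saved_pstate_ss;
```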
Fixes: 337b99bf7e ("KVM: arm64: guest debug, add support for single-step")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220917010600.532642-2-reijiw@google.com
Merge arm64/for-next/sysreg in order to avoid upstream conflicts
due to the never ending sysreg repainting...
Signed-off-by: Marc Zyngier <maz@kernel.org>
Today, callback alternatives are special-cased within
__apply_alternatives(), and are applied alongside patching for system
capabilities as ARM64_NCAPS is not part of the boot_capabilities feature
mask.
This special-casing is less than ideal. Giving special meaning to
ARM64_NCAPS for this requires some structures and loops to use
ARM64_NCAPS + 1 (AKA ARM64_NPATCHABLE), while others use ARM64_NCAPS.
It's also not immediately clear that callback alternatives are only
applied when applying alternatives for system-wide features.
To make this a bit clearer, this patch changes the way that callback
alternatives are identified so as to remove the special-casing of
ARM64_NCAPS, and to allow callback alternatives to be associated with a
cpucap as with all other alternatives.
New cpucaps, ARM64_ALWAYS_BOOT and ARM64_ALWAYS_SYSTEM, are added which
are always detected alongside boot cpu capabilities and system
capabilities respectively. All existing callback alternatives are made
to use ARM64_ALWAYS_SYSTEM, and so will be patched at the same point
during the boot flow as before.
Subsequent patches will make more use of these new cpucaps.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The KVM patching callbacks use cpus_have_final_cap() internally within
has_vhe(). Subsequent patches will make it invalid to call
cpus_have_final_cap() before alternatives patching has completed, and
will mean that cpus_have_const_cap() always falls back to dynamic
checks prior to alternatives patching.
In preparation for said change, this patch modifies the KVM patching
callbacks to use cpus_have_cap() directly. This is not subject to
patching, and will dynamically check the cpu_hwcaps array, which is
functionally equivalent to the existing behaviour.
There should be no functional change as a result of this patch.
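The shape of the change in a patching callback is roughly the following
(callback bodies elided; only the check itself is shown):
```
/* Before: has_vhe() ends up in cpus_have_final_cap(ARM64_HAS_VIRT_HOST_EXTN),
 * which is itself implemented via alternatives patching. */
if (has_vhe())
	return;

/* After: query the cpu_hwcaps array directly; safe before patching. */
if (cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN))
	return;
```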
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220912162210.3626215-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently the kernel refers to the versions of the PMU and SPE features
by the version of the architecture where those features were updated,
but the architecture refers to them using the FEAT_ names for the
features. To improve consistency, and to help with updating for newer
features (since v9 will make our current naming scheme a bit more
confusing), update the macros identifying features to use the FEAT_
based scheme.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64DFR0_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The naming scheme the architecture uses for the fields in ID_AA64DFR0_EL1
does not align well with kernel conventions, using as it does a lot of
MixedCase in various arrangements. In preparation for automatically
generating the defines for this register rename the defines used to match
what is in the architecture.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220910163354.860255-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
One of the oddities of the architecture is that the AArch64 views of the
AArch32 ID registers are UNKNOWN if AArch32 isn't implemented at any EL.
Nonetheless, KVM exposes these registers to userspace for the sake of
save/restore. It is possible that the UNKNOWN value could differ between
systems, leading to a rejected write from userspace.
Avoid the issue altogether by handling the AArch32 ID registers as
RAZ/WI when on an AArch64-only system.
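A minimal sketch of the visibility hook this leads to; the helper and flag
names below (kvm_supports_32bit_el0(), REG_USER_WI) are assumptions taken
from the surrounding series rather than guaranteed spellings:
```
/* Sketch: on an AArch64-only system, expose the AArch32 ID registers as
 * RAZ and ignore userspace writes to them. */
static unsigned int aa32_id_visibility(const struct kvm_vcpu *vcpu,
				       const struct sys_reg_desc *r)
{
	if (!kvm_supports_32bit_el0())
		return REG_RAZ | REG_USER_WI;

	return 0;	/* otherwise visible as normal */
}
```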
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-7-oliver.upton@linux.dev
We're about to ignore writes to AArch32 ID registers on AArch64-only
systems. Add a bit to indicate a register is handled as write ignore
when accessed from userspace.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-6-oliver.upton@linux.dev
There is no longer a need for caller-specified RAZ visibility. Hoist the
call to sysreg_visible_as_raz() into read_id_reg() and drop the
parameter.
No functional change intended.
Suggested-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-4-oliver.upton@linux.dev
The internal accessors are only ever called once. Dump out their
contents in the caller.
No functional change intended.
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-3-oliver.upton@linux.dev
The generic id reg accessors already handle RAZ registers by way of the
visibility hook. Add a visibility hook that returns REG_RAZ
unconditionally and throw out the RAZ specific accessors.
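The resulting hook is tiny, roughly:
```
static unsigned int raz_visibility(const struct kvm_vcpu *vcpu,
				   const struct sys_reg_desc *r)
{
	return REG_RAZ;
}
```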
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220913094441.3957645-2-oliver.upton@linux.dev
Currently unwind_next_frame_record() has an optional callback to convert
the address space of the FP. This is necessary for the NVHE unwinder,
which tracks the stacks in the hyp VA space, but accesses the frame
records in the kernel VA space.
This is a bit unfortunate since it clutters unwind_next_frame_record(),
which will get in the way of future rework.
Instead, this patch changes the NVHE unwinder to track the stacks in the
kernel's VA space and to translate the FP prior to calling
unwind_next_frame_record(). This removes the need for the translate_fp()
callback, as all unwinders consistently track stacks in the native
address space of the unwinder.
At the same time, this patch consolidates the generation of the stack
addresses behind the stackinfo_get_*() helpers.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-10-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently we call an on_accessible_stack() callback for each step of the
unwinder, requiring redundant work to be performed in the core of the
unwind loop (e.g. disabling preemption around accesses to per-cpu
variables containing stack boundaries). To prevent unwind loops which go
through a stack multiple times, we have to track the set of unwound
stacks, requiring a stack_type enum which needs to cater for all the
stacks of all possible callees. To prevent loops within a stack, we must
track the prior FP values.
This patch reworks the unwinder to minimize the work in the core of the
unwinder, and to remove the need for the stack_type enum. The set of
accessible stacks (and their boundaries) are determined at the start of
the unwind, and the current stack is tracked during the unwind, with
completed stacks removed from the set of accessible stacks. This makes
the boundary checks more accurate (e.g. detecting overlapped frame
records), and removes the need for separate tracking of the prior FP and
visited stacks.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-9-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In subsequent patches we'll remove the stack_type enum, and move the FP
translation logic out of the raw FP unwind code.
In preparation for doing so, this patch removes the type parameter from
the FP translation callback, and modifies kvm_nvhe_stack_kern_va() to
determine the relevant stack directly.
So that kvm_nvhe_stack_kern_va() can use the stackinfo_*() helpers,
these are moved earlier in the file, but are not modified in any way.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-8-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In subsequent patches we'll want to acquire the stack boundaries
ahead-of-time, and we'll need to be able to acquire the relevant
stack_info regardless of whether we have an object that happens to be on
the stack.
This patch replaces the on_XXX_stack() helpers with stackinfo_get_XXX()
helpers, with the caller being responsible for checking whether an
object is on a relevant stack. For the moment this is moved into the
on_accessible_stack() functions, making these slightly larger;
subsequent patches will remove the on_accessible_stack() functions and
simplify the logic.
The on_irq_stack() and on_task_stack() helpers are kept as these are
used by IRQ entry sequences and stackleak respectively. As they're only
used as predicates, the stack_info pointer parameter is removed in both
cases.
As the on_accessible_stack() functions are always passed a non-NULL info
pointer, these now update info unconditionally. When updating the type
to STACK_TYPE_UNKNOWN, the low/high bounds are also modified, but as
these will not be consumed this should have no adverse effect.
There should be no functional change as a result of this patch.
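A condensed sketch of the new shape (boundary derivation simplified; the
real helpers also handle NULL tasks, empty stackinfo and the various
!CONFIG cases):
```
/* Describe a stack instead of asking "is this object on it?". */
static inline struct stack_info stackinfo_get_task(const struct task_struct *tsk)
{
	unsigned long low = (unsigned long)task_stack_page(tsk);

	return (struct stack_info) {
		.low  = low,
		.high = low + THREAD_SIZE,
	};
}

/* Callers now perform the membership test themselves. */
static inline bool stackinfo_on_stack(const struct stack_info *info,
				      unsigned long sp, unsigned long size)
{
	return info->low <= sp && sp + size <= info->high;
}
```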
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-7-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The unwind_next_common() function unwinds a single frame record. There
are other unwind steps (e.g. unwinding through trampolines) which are
handled in the regular kernel unwinder, and in future there may be other
common unwind helpers.
Clarify the purpose of unwind_next_common() by renaming it to
unwind_next_frame_record(). At the same time, add commentary, and delete
the redundant comment at the top of asm/stacktrace/common.h.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently unwind_next_common() takes a pointer to a stack_info which is
only ever used within unwind_next_common().
Make it a local variable and simplify callers.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Fuad Tabba <tabba@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220901130646.1316937-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The architecture refers to the register field identifying advanced SIMD as
AdvSIMD but the kernel refers to it as ASIMD. Use the architecture's
naming. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-15-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
We generally refer to the baseline implementation of a feature as _IMP, so
in preparation for automatic generation of the register defines update
those for ID_AA64PFR0_EL1 to reflect this.
In the case of ASIMD we don't actually use the define so just remove it.
No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-14-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The kernel refers to ID_AA64MMFR2_EL1.CnP as CNP. In preparation for
automatic generation of defines for the system registers bring the naming
used by the kernel in sync with that of DDI0487H.a. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-13-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
In preparation for converting the ID_AA64MMFR1_EL1 system register
defines to automatic generation, rename them to follow the conventions
used by other automatically generated registers:
* Add _EL1 in the register name.
* Rename fields to match the names in the ARM ARM:
* LOR -> LO
* HPD -> HPDS
* VHE -> VH
* HADBS -> HAFDBS
* SPECSEI -> SpecSEI
* VMIDBITS -> VMIDBits
There should be no functional change as a result of this patch.
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-11-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For some reason we refer to ID_AA64MMFR0_EL1.ASIDBits as ASID. Add BITS
into the name, bringing the naming into sync with DDI0487H.a. Due to the
large amount of MixedCase in this register, which isn't really consistent
with either the kernel style or the majority of the architecture, the use
of upper case is preserved. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-10-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For some reason we refer to ID_AA64MMFR0_EL1.BigEnd as BIGENDEL. Remove the
EL from the name, bringing the naming into sync with DDI0487H.a. Due to the
large amount of MixedCase in this register, which isn't really consistent
with either the kernel style or the majority of the architecture, the use
of upper case is preserved. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-9-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Our standard is to include the _EL1 in the constant names for registers but
we did not do that for ID_AA64PFR1_EL1, update to do so in preparation for
conversion to automatic generation. No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-8-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64PFR0_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-7-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64MMFR2_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-6-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Normally we include the full register name in the defines for fields within
registers but this has not been followed for ID registers. In preparation
for automatic generation of defines add the _EL1s into the defines for
ID_AA64MMFR0_EL1 to follow the convention. No functional changes.
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Kristina Martsenko <kristina.martsenko@arm.com>
Link: https://lore.kernel.org/r/20220905225425.1871461-5-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Count the pages used by KVM in arm64 for stage2 mmu in memory stats
under secondary pagetable stats (e.g. "SecPageTables" in /proc/meminfo)
to give better visibility into the memory consumption of KVM mmu in a
similar way to how normal user page tables are accounted.
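A sketch of what the accounting amounts to, assuming a small helper around
the node stat that backs the "SecPageTables" line (the helper name here is
illustrative):
```
/* Bump/drop the SecPageTables counter as stage-2 table pages come and go. */
static void account_stage2_table_page(void *table, int nr_pages)
{
	mod_lruvec_page_state(virt_to_page(table),
			      NR_SECONDARY_PAGETABLE, nr_pages);
}
```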
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220823004639.2387269-5-yosryahmed@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
- Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK
- Tidy-up handling of AArch32 on asymmetric systems
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmL+RsgPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDHrUP/3IYZ0LnYUZBImSU/YTPL5yYzdSVAuMNcdRQ
EgvLQKwP+JSrmd7B7wZ4MhY1LheKjpNmmuSqTRsZOHb/yBmnh3+ao5n2gqusYQeJ
PCuLYjeF7ZU5fGIrPAW6BW0BFlmYMVbTrC6SEMhZsisBhna44jrrWgkBz9mOsXE/
YcDWv8kP15lisuQzMvnYxmZobbVgSJ3KgQY4/Dp6vyKMR8ULujCxziFV5R4RD0xP
Ay8wnxtMUymx9P6sZsd6Vwi5h1MUXOOoI4He7+8ejIfoMOManIMOIq4PDQhINwQv
tGysDmQavftSbUkXJ1VB+8cJ/9KufzwKxFoc5WqGk1y14QulyBNyb/XR3UtORe1n
bitINTTkqibHY6fdQJA7z1sD0jaEAh/xNwO1Gq0BS40o4XVQDv2BjdQir9TEdlZO
tsZKVaFpN3UZe681ru12No8YzQDhpuLH65gDHDjLaftH99WKsrSwMZLoEjqZTlM/
vH/9acd4UB+9zMGTpN2tJ//2cq6g3JoUC7jJIQB1oGStHX0/7AxKMlabR2xHmt9E
4CmJND9RLK6+yEelagxOAYMQfnCdj6pW/3bhvAmsZWh0t3fNxCeBBXFr2I5os+E9
hV0FYx4PG9GtorMSqudCDsP83SDIxCluNZ5iM8t1suSn3dFhk5bDFChS86XGuqQe
XxHQ6JTF
=r5iq
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.0, take #1
- Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK
- Tidy-up handling of AArch32 on asymmetric systems
The motivation for this renaming is to make these variables and related
helper functions less mmu_notifier-bound, so that they can also be used
for non-mmu_notifier-based page invalidation. mmu_invalidate_* was chosen
because it better describes the page-invalidation purpose that those
variables serve.
- mmu_notifier_seq/range_start/range_end are renamed to
mmu_invalidate_seq/range_start/range_end.
- mmu_notifier_retry{_hva} helper functions are renamed to
mmu_invalidate_retry{_hva}.
- mmu_notifier_count is renamed to mmu_invalidate_in_progress to
avoid confusion with mn_active_invalidate_count.
- While here, also update kvm_inc/dec_notifier_count() to
kvm_mmu_invalidate_begin/end() to match the change for
mmu_notifier_count.
No functional change intended.
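For example, a typical fault-path retry check changes only in name (shown
out of context):
```
/* Before */
if (mmu_notifier_retry(kvm, mmu_seq))
	goto out_unlock;

/* After */
if (mmu_invalidate_retry(kvm, mmu_seq))
	goto out_unlock;
```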
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <20220816125322.1110439-3-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM does not support AArch32 EL0 on asymmetric systems. To that end,
prevent userspace from configuring a vCPU in such a state through
setting PSTATE.
It is already ABI that KVM rejects such a write on a system where
AArch32 EL0 is unsupported. Though the kernel's definition of a 32bit
system changed in commit 2122a83331 ("arm64: Allow mismatched
32-bit EL0 support"), KVM's did not.
Fixes: 2122a83331 ("arm64: Allow mismatched 32-bit EL0 support")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220816192554.1455559-3-oliver.upton@linux.dev
KVM does not support AArch32 on asymmetric systems. To that end, enforce
AArch64-only behavior on PMCR_EL1.LC when on an asymmetric system.
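Schematically, the PMCR handling gains a check along these lines;
kvm_supports_32bit_el0() is the helper used by the related EL0 fix and
should be read as an assumption here:
```
/* AArch32 is not usable on this (asymmetric) system, so the long cycle
 * counter bit must behave as RES1 regardless of what was written. */
if (!kvm_supports_32bit_el0())
	val |= ARMV8_PMU_PMCR_LC;
```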
Fixes: 2122a83331 ("arm64: Allow mismatched 32-bit EL0 support")
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220816192554.1455559-2-oliver.upton@linux.dev
There are three independent sets of changes:
- Sai Prakash Ranjan adds tracing support to the asm-generic
version of the MMIO accessors, which is intended to help
understand problems with device drivers and has been part
of Qualcomm's vendor kernels for many years.
- A patch from Sebastian Siewior to rework the handling of
IRQ stacks in softirqs across architectures, which is
needed for enabling PREEMPT_RT.
- The last patch to remove the CONFIG_VIRT_TO_BUS option and
some of the code behind that, after the last users of this
old interface made it in through the netdev, scsi, media and
staging trees.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA
GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt
a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD
1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU
dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn
SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa
0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr
MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7
ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw
MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA
tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU
KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q=
=aTR0
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"There are three independent sets of changes:
- Sai Prakash Ranjan adds tracing support to the asm-generic version
of the MMIO accessors, which is intended to help understand
problems with device drivers and has been part of Qualcomm's vendor
kernels for many years
- A patch from Sebastian Siewior to rework the handling of IRQ stacks
in softirqs across architectures, which is needed for enabling
PREEMPT_RT
- The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
the code behind that, after the last users of this old interface
made it in through the netdev, scsi, media and staging trees"
* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
uapi: asm-generic: fcntl: Fix typo 'the the' in comment
arch/*/: remove CONFIG_VIRT_TO_BUS
soc: qcom: geni: Disable MMIO tracing for GENI SE
serial: qcom_geni_serial: Disable MMIO tracing for geni serial
asm-generic/io: Add logging support for MMIO accessors
KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
lib: Add register read/write tracing support
drm/meson: Fix overflow implicit truncation warnings
irqchip/tegra: Fix overflow implicit truncation warnings
coresight: etm4x: Use asm-generic IO memory barriers
arm64: io: Use asm-generic high level MMIO accessors
arch/*: Disable softirq stacks on PREEMPT_RT.
* Unwinder implementations for both nVHE modes (classic and
protected), complete with an overflow stack
* Rework of the sysreg access from userspace, with a complete
rewrite of the vgic-v3 view to align with the rest of the
infrastructure
* Disaggregation of the vcpu flags into separate sets to better track
their use model.
* A fix for the GICv2-on-v3 selftest
* A small set of cosmetic fixes
RISC-V:
* Track ISA extensions used by Guest using bitmap
* Added system instruction emulation framework
* Added CSR emulation framework
* Added gfp_custom flag in struct kvm_mmu_memory_cache
* Added G-stage ioremap() and iounmap() functions
* Added support for Svpbmt inside Guest
s390:
* add an interface to provide a hypervisor dump for secure guests
* improve selftests to use TAP interface
* enable interpretive execution of zPCI instructions (for PCI passthrough)
* First part of deferred teardown
* CPU Topology
* PV attestation
* Minor fixes
x86:
* Permit guests to ignore single-bit ECC errors
* Intel IPI virtualization
* Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS
* PEBS virtualization
* Simplify PMU emulation by just using PERF_TYPE_RAW events
* More accurate event reinjection on SVM (avoid retrying instructions)
* Allow getting/setting the state of the speaker port data bit
* Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent
* "Notify" VM exit (detect microarchitectural hangs) for Intel
* Use try_cmpxchg64 instead of cmpxchg64
* Ignore benign host accesses to PMU MSRs when PMU is disabled
* Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
* Allow NX huge page mitigation to be disabled on a per-vm basis
* Port eager page splitting to shadow MMU as well
* Enable CMCI capability by default and handle injected UCNA errors
* Expose pid of vcpu threads in debugfs
* x2AVIC support for AMD
* cleanup PIO emulation
* Fixes for LLDT/LTR emulation
* Don't require refcounted "struct page" to create huge SPTEs
* Miscellaneous cleanups:
** MCE MSR emulation
** Use separate namespaces for guest PTEs and shadow PTEs bitmasks
** PIO emulation
** Reorganize rmap API, mostly around rmap destruction
** Do not work around very old KVM bugs for L0 that runs with nesting enabled
** new selftests API for CPUID
Generic:
* Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache
* new selftests API using struct kvm_vcpu instead of a (vm, id) tuple
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmLnyo4UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtQQf/XjVWiRcWLPR9dqzRM/vvRXpiG+UL
jU93R7m6ma99aqTtrxV/AE+kHgamBlma3Cwo+AcWk9uCVNbIhFjv2YKg6HptKU0e
oJT3zRYp+XIjEo7Kfw+TwroZbTlG6gN83l1oBLFMqiFmHsMLnXSI2mm8MXyi3dNB
vR2uIcTAl58KIprqNNsYJ2dNn74ogOMiXYx9XzoA9/5Xb6c0h4rreHJa5t+0s9RO
Gz7Io3PxumgsbJngjyL1Ve5oxhlIAcZA8DU0PQmjxo3eS+k6BcmavGFd45gNL5zg
iLpCh4k86spmzh8CWkAAwWPQE4dZknK6jTctJc0OFVad3Z7+X7n0E8TFrA==
=PM8o
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"Quite a large pull request due to a selftest API overhaul and some
patches that had come in too late for 5.19.
ARM:
- Unwinder implementations for both nVHE modes (classic and
protected), complete with an overflow stack
- Rework of the sysreg access from userspace, with a complete rewrite
of the vgic-v3 view to align with the rest of the infrastructure
- Disaggregation of the vcpu flags into separate sets to better track
their use model.
- A fix for the GICv2-on-v3 selftest
- A small set of cosmetic fixes
RISC-V:
- Track ISA extensions used by Guest using bitmap
- Added system instruction emulation framework
- Added CSR emulation framework
- Added gfp_custom flag in struct kvm_mmu_memory_cache
- Added G-stage ioremap() and iounmap() functions
- Added support for Svpbmt inside Guest
s390:
- add an interface to provide a hypervisor dump for secure guests
- improve selftests to use TAP interface
- enable interpretive execution of zPCI instructions (for PCI
passthrough)
- First part of deferred teardown
- CPU Topology
- PV attestation
- Minor fixes
x86:
- Permit guests to ignore single-bit ECC errors
- Intel IPI virtualization
- Allow getting/setting pending triple fault with
KVM_GET/SET_VCPU_EVENTS
- PEBS virtualization
- Simplify PMU emulation by just using PERF_TYPE_RAW events
- More accurate event reinjection on SVM (avoid retrying
instructions)
- Allow getting/setting the state of the speaker port data bit
- Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls
are inconsistent
- "Notify" VM exit (detect microarchitectural hangs) for Intel
- Use try_cmpxchg64 instead of cmpxchg64
- Ignore benign host accesses to PMU MSRs when PMU is disabled
- Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
- Allow NX huge page mitigation to be disabled on a per-vm basis
- Port eager page splitting to shadow MMU as well
- Enable CMCI capability by default and handle injected UCNA errors
- Expose pid of vcpu threads in debugfs
- x2AVIC support for AMD
- cleanup PIO emulation
- Fixes for LLDT/LTR emulation
- Don't require refcounted "struct page" to create huge SPTEs
- Miscellaneous cleanups:
- MCE MSR emulation
- Use separate namespaces for guest PTEs and shadow PTEs bitmasks
- PIO emulation
- Reorganize rmap API, mostly around rmap destruction
- Do not work around very old KVM bugs for L0 that runs with nesting enabled
- new selftests API for CPUID
Generic:
- Fix races in gfn->pfn cache refresh; do not pin pages tracked by
the cache
- new selftests API using struct kvm_vcpu instead of a (vm, id)
tuple"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits)
selftests: kvm: set rax before vmcall
selftests: KVM: Add exponent check for boolean stats
selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test
selftests: KVM: Check stat name before other fields
KVM: x86/mmu: remove unused variable
RISC-V: KVM: Add support for Svpbmt inside Guest/VM
RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap()
RISC-V: KVM: Add G-stage ioremap() and iounmap() functions
KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache
RISC-V: KVM: Add extensible CSR emulation framework
RISC-V: KVM: Add extensible system instruction emulation framework
RISC-V: KVM: Factor-out instruction emulation into separate sources
RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run
RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function
RISC-V: KVM: Fix variable spelling mistake
RISC-V: KVM: Improve ISA extension by using a bitmap
KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs()
KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog
KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT
KVM: x86: Do not block APIC write for non ICR registers
...
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"Highlights include a major rework of our kPTI page-table rewriting
code (which makes it both more maintainable and considerably faster in
the cases where it is required) as well as significant changes to our
early boot code to reduce the need for data cache maintenance and
greatly simplify the KASLR relocation dance.
Summary:
- Remove unused generic cpuidle support (replaced by PSCI version)
- Fix documentation describing the kernel virtual address space
- Handling of some new CPU errata in Arm implementations
- Rework of our exception table code in preparation for handling
machine checks (i.e. RAS errors) more gracefully
- Switch over to the generic implementation of ioremap()
- Fix lockdep tracking in NMI context
- Instrument our memory barrier macros for KCSAN
- Rework of the kPTI G->nG page-table repainting so that the MMU
remains enabled and the boot time is no longer slowed to a crawl
for systems which require the late remapping
- Enable support for direct swapping of 2MiB transparent huge-pages
on systems without MTE
- Fix handling of MTE tags when allocating new pages with HW KASAN
- Expose the SMIDR register to userspace via sysfs
- Continued rework of the stack unwinder, particularly improving the
behaviour under KASAN
- More repainting of our system register definitions to match the
architectural terminology
- Improvements to the layout of the vDSO objects
- Support for allocating additional bits of HWCAP2 and exposing
FEAT_EBF16 to userspace on CPUs that support it
- Considerable rework and optimisation of our early boot code to
reduce the need for cache maintenance and avoid jumping in and out
of the kernel when handling relocation under KASLR
- Support for disabling SVE and SME support on the kernel
command-line
- Support for the Hisilicon HNS3 PMU
- Miscellaneous cleanups, trivial updates and minor fixes"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits)
arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
arm64: fix KASAN_INLINE
arm64/hwcap: Support FEAT_EBF16
arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
arm64/hwcap: Document allocation of upper bits of AT_HWCAP
arm64: enable THP_SWAP for arm64
arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
arm64: errata: Remove AES hwcap for COMPAT tasks
arm64: numa: Don't check node against MAX_NUMNODES
drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
mm: kasan: Skip unpoisoning of user pages
mm: kasan: Ensure the tags are visible before the tag in page->flags
drivers/perf: hisi: add driver for HNS3 PMU
drivers/perf: hisi: Add description for HNS3 PMU driver
drivers/perf: riscv_pmu_sbi: perf format
perf/arm-cci: Use the bitmap API to allocate bitmaps
...
KVM/arm64 updates for 5.20:
- Unwinder implementations for both nVHE modes (classic and
protected), complete with an overflow stack
- Rework of the sysreg access from userspace, with a complete
rewrite of the vgic-v3 view to align with the rest of the
infrastructure
- Disaggregation of the vcpu flags into separate sets to better track
their use model.
- A fix for the GICv2-on-v3 selftest
- A small set of cosmetic fixes
KVM/s390, KVM/x86 and common infrastructure changes for 5.20
x86:
* Permit guests to ignore single-bit ECC errors
* Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache
* Intel IPI virtualization
* Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS
* PEBS virtualization
* Simplify PMU emulation by just using PERF_TYPE_RAW events
* More accurate event reinjection on SVM (avoid retrying instructions)
* Allow getting/setting the state of the speaker port data bit
* Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent
* "Notify" VM exit (detect microarchitectural hangs) for Intel
* Cleanups for MCE MSR emulation
s390:
* add an interface to provide a hypervisor dump for secure guests
* improve selftests to use TAP interface
* enable interpretive execution of zPCI instructions (for PCI passthrough)
* First part of deferred teardown
* CPU Topology
* PV attestation
* Minor fixes
Generic:
* new selftests API using struct kvm_vcpu instead of a (vm, id) tuple
x86:
* Use try_cmpxchg64 instead of cmpxchg64
* Bugfixes
* Ignore benign host accesses to PMU MSRs when PMU is disabled
* Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
* x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
* Port eager page splitting to shadow MMU as well
* Enable CMCI capability by default and handle injected UCNA errors
* Expose pid of vcpu threads in debugfs
* x2AVIC support for AMD
* cleanup PIO emulation
* Fixes for LLDT/LTR emulation
* Don't require refcounted "struct page" to create huge SPTEs
x86 cleanups:
* Use separate namespaces for guest PTEs and shadow PTEs bitmasks
* PIO emulation
* Reorganize rmap API, mostly around rmap destruction
* Do not work around very old KVM bugs for L0 that runs with nesting enabled
* new selftests API for CPUID
* kvm-arm64/nvhe-stacktrace: (27 commits)
: .
: Add an overflow stack to the nVHE EL2 code, allowing
: the implementation of an unwinder, courtesy of
: Kalesh Singh. From the cover letter (slightly edited):
:
: "nVHE has two modes of operation: protected (pKVM) and unprotected
: (conventional nVHE). Depending on the mode, a slightly different approach
: is used to dump the hypervisor stacktrace but the core unwinding logic
: remains the same.
:
: * Protected nVHE (pKVM) stacktraces:
:
: In protected nVHE mode, the host cannot directly access hypervisor memory.
:
: The hypervisor stack unwinding happens in EL2 and is made accessible to
: the host via a shared buffer. Symbolizing and printing the stacktrace
: addresses is delegated to the host and happens in EL1.
:
: * Non-protected (Conventional) nVHE stacktraces:
:
: In non-protected mode, the host is able to directly access the hypervisor
: stack pages.
:
: The hypervisor stack unwinding and dumping of the stacktrace is performed
: by the host in EL1, as this avoids the memory overhead of setting up
: shared buffers between the host and hypervisor."
:
: Additional patches from Oliver Upton and Marc Zyngier, tidying up
: the initial series.
: .
arm64: Update 'unwinder howto'
KVM: arm64: Don't open code ARRAY_SIZE()
KVM: arm64: Move nVHE-only helpers into kvm/stacktrace.c
KVM: arm64: Make unwind()/on_accessible_stack() per-unwinder functions
KVM: arm64: Move nVHE stacktrace unwinding into its own compilation unit
KVM: arm64: Move PROTECTED_NVHE_STACKTRACE around
KVM: arm64: Introduce pkvm_dump_backtrace()
KVM: arm64: Implement protected nVHE hyp stack unwinder
KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace
KVM: arm64: Stub implementation of pKVM HYP stack unwinder
KVM: arm64: Allocate shared pKVM hyp stacktrace buffers
KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig
KVM: arm64: Introduce hyp_dump_backtrace()
KVM: arm64: Implement non-protected nVHE hyp stack unwinder
KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace
KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder
KVM: arm64: On stack overflow switch to hyp overflow_stack
arm64: stacktrace: Add description of stacktrace/common.h
arm64: stacktrace: Factor out common unwind()
arm64: stacktrace: Handle frame pointer from different address spaces
...
Signed-off-by: Marc Zyngier <maz@kernel.org>
Use ARRAY_SIZE() instead of an open-coded version.
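For illustration, a small standalone sketch of the idiom; the ARRAY_SIZE()
macro below is a local stand-in for the kernel helper and the array is
invented for the example:

#include <stdio.h>

/* Local stand-in for the kernel's ARRAY_SIZE() helper. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

static const char *stack_names[] = { "hyp", "overflow", "shared" };

int main(void)
{
	/* Open-coded form this replaces: sizeof(stack_names) / sizeof(stack_names[0]) */
	for (size_t i = 0; i < ARRAY_SIZE(stack_names); i++)
		printf("%zu: %s\n", i, stack_names[i]);
	return 0;
}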
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Link: https://lore.kernel.org/r/20220727142906.1856759-6-maz@kernel.org
kvm_nvhe_stack_kern_va() only makes sense as part of the nVHE
unwinder, so simply move it there.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20220727142906.1856759-5-maz@kernel.org
Having multiple versions of on_accessible_stack() (one per unwinder)
makes it very hard to reason about what is used where due to the
complexity of the various includes, the forward declarations, and
the reliance on everything being 'inline'.
Instead, move the code back where it should be. Each unwinder
implements:
- on_accessible_stack() as well as the helpers it depends on,
- unwind()/unwind_next(), as they pass on_accessible_stack as
a parameter to unwind_next_common() (which is the only common
code here)
This hardly results in any duplication, and makes it much
easier to reason about the code.
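As a rough standalone sketch of that split (simplified, hypothetical
signatures rather than the kernel's), the shared unwind_next_common()
takes the caller's accessibility check as a parameter while each unwinder
keeps its own on_accessible_stack():

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct unwind_state {
	uint64_t fp;
	uint64_t pc;
};

typedef bool (*on_accessible_stack_fn)(uint64_t fp);

/* Common core: validate the frame pointer, then follow the [x29, x30]
 * frame record it points at. */
static bool unwind_next_common(struct unwind_state *state,
			       on_accessible_stack_fn accessible)
{
	const uint64_t *frame = (const uint64_t *)(uintptr_t)state->fp;

	if (!state->fp || (state->fp & 0xf) || !accessible(state->fp))
		return false;

	state->fp = frame[0];	/* saved x29 */
	state->pc = frame[1];	/* saved x30 */
	return true;
}

/* Each unwinder supplies its own notion of which stacks a frame
 * pointer may legitimately live on; always-true here for brevity. */
static bool hyp_on_accessible_stack(uint64_t fp)
{
	(void)fp;
	return true;
}

static bool hyp_unwind_next(struct unwind_state *state)
{
	return unwind_next_common(state, hyp_on_accessible_stack);
}

int main(void)
{
	/* Two fake frame records chained together, terminated by fp == 0. */
	static _Alignas(16) uint64_t frame1[2] = { 0, 0x1111 };
	static _Alignas(16) uint64_t frame0[2];
	struct unwind_state st = { .fp = (uint64_t)(uintptr_t)frame0 };

	frame0[0] = (uint64_t)(uintptr_t)frame1;
	frame0[1] = 0x2222;

	while (hyp_unwind_next(&st))
		printf("pc: 0x%llx\n", (unsigned long long)st.pc);
	return 0;
}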
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20220727142906.1856759-4-maz@kernel.org
The unwinding code doesn't really belong to the exit handling
code. Instead, move it to a file (conveniently named stacktrace.c
to confuse the reviewer), and move all the stacktrace-related
stuff there.
It will be joined by more code very soon.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20220727142906.1856759-3-maz@kernel.org
Make the dependency with EL2_DEBUG more obvious by moving the
stacktrace configuration *after* it.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Kalesh Singh <kaleshsingh@google.com>
Tested-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20220727142906.1856759-2-maz@kernel.org
In protected nVHE mode, the host cannot access privately owned hypervisor
memory. The hypervisor also aims to remain simple to reduce the attack
surface and does not provide any printk support.
For the above reasons, the approach taken to provide hypervisor stacktraces
in protected mode is:
1) Unwind and save the hyp stack addresses in EL2 to a buffer shared
with the host (done in this patch).
2) Delegate the dumping and symbolization of the addresses to the
host in EL1 (later patch in the series).
On hyp_panic(), the hypervisor prepares the stacktrace before returning to
the host.
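A rough userspace sketch of the two steps above, with made-up names and a
fixed-size buffer standing in for the real shared memory:

#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

#define STACKTRACE_ENTRIES 16

static uint64_t shared_stacktrace[STACKTRACE_ENTRIES];

/* Step 1 ("EL2"): save the unwound PCs, keeping room for the
 * terminating zero entry that delimits the trace. */
static void hyp_save_stacktrace(const uint64_t *pcs, size_t nr)
{
	size_t i;

	for (i = 0; i < nr && i < STACKTRACE_ENTRIES - 1; i++)
		shared_stacktrace[i] = pcs[i];
	shared_stacktrace[i] = 0;
}

/* Step 2 ("EL1"): symbolization is the host's job; here we just
 * print the raw addresses. */
static void host_dump_stacktrace(void)
{
	for (size_t i = 0; shared_stacktrace[i]; i++)
		printf(" [<0x%016llx>]\n",
		       (unsigned long long)shared_stacktrace[i]);
}

int main(void)
{
	uint64_t pcs[] = { 0xffff800008001234ULL, 0xffff800008005678ULL };

	hyp_save_stacktrace(pcs, 2);
	host_dump_stacktrace();
	return 0;
}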
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-16-kaleshsingh@google.com
In protected nVHE mode the host cannot directly access
hypervisor memory, so we will dump the hypervisor stacktrace
to a buffer shared with the host.
The minimum required buffer size, assuming a minimum frame
size of [x29, x30] (2 * sizeof(long)), is half the combined size of
the hypervisor and overflow stacks plus an additional entry to
delimit the end of the stacktrace.
The stacktrace buffers are used later in the series to dump the
nVHE hypervisor stacktrace when using protected mode.
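A small standalone sketch of that sizing argument, using illustrative
stack sizes (the real values are configuration dependent):

#include <stdio.h>

#define PAGE_SIZE		4096UL
#define HYP_STACK_SIZE		PAGE_SIZE
#define OVERFLOW_STACK_SIZE	PAGE_SIZE
#define MIN_FRAME_SIZE		(2 * sizeof(long))	/* [x29, x30] */

int main(void)
{
	/* Worst case: every frame is a minimal [x29, x30] record. */
	unsigned long max_frames =
		(HYP_STACK_SIZE + OVERFLOW_STACK_SIZE) / MIN_FRAME_SIZE;
	/* One saved address per frame, plus one entry to delimit the end. */
	unsigned long buf_bytes = (max_frames + 1) * sizeof(long);

	printf("worst-case frames: %lu\n", max_frames);
	printf("buffer: %lu bytes (half the stacks + one extra entry)\n",
	       buf_bytes);
	return 0;
}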
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-14-kaleshsingh@google.com
This can be used to disable stacktrace for the protected KVM
nVHE hypervisor, in order to save on the associated memory usage.
This option is disabled by default, since protected KVM is not widely
used on platforms other than Android currently.
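Purely as an illustration of how such an option saves memory (the
identifiers below are hypothetical, not the kernel's), the buffer and the
save path compile away when the option is off; build with
-DCONFIG_PROTECTED_NVHE_STACKTRACE to enable them:

#include <stdio.h>

#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
static unsigned long pkvm_trace_buf[512];

static void pkvm_save_backtrace(unsigned long pc)
{
	pkvm_trace_buf[0] = pc;	/* a real unwinder would fill the buffer */
}
#else
static void pkvm_save_backtrace(unsigned long pc)
{
	(void)pc;		/* no buffer is allocated when disabled */
}
#endif

int main(void)
{
	pkvm_save_backtrace(0x1234);
#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
	printf("stacktrace buffer costs %zu bytes\n", sizeof(pkvm_trace_buf));
#else
	printf("stacktrace support compiled out\n");
#endif
	return 0;
}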
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-13-kaleshsingh@google.com
Implement the common framework necessary for unwind() to work
in non-protected nVHE mode:
- on_accessible_stack()
- on_overflow_stack()
- unwind_next()
The non-protected nVHE unwind() is used by the host in EL1 to unwind
and dump the hypervisor stacktrace.
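A minimal standalone sketch of those hooks for the non-protected case,
with invented stack bounds standing in for the real hypervisor and
overflow stack ranges:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct stack_range {
	uint64_t low;
	uint64_t high;
};

static const struct stack_range hyp_stack      = { 0x1000, 0x2000 };
static const struct stack_range overflow_stack = { 0x3000, 0x4000 };

static bool on_range(const struct stack_range *r, uint64_t sp)
{
	return sp >= r->low && sp < r->high;
}

static bool on_overflow_stack(uint64_t sp)
{
	return on_range(&overflow_stack, sp);
}

/* unwind_next() would call this before following a frame record. */
static bool on_accessible_stack(uint64_t sp)
{
	return on_range(&hyp_stack, sp) || on_overflow_stack(sp);
}

int main(void)
{
	printf("0x1800 accessible: %d\n", on_accessible_stack(0x1800));
	printf("0x3800 accessible: %d\n", on_accessible_stack(0x3800));
	printf("0x5000 accessible: %d\n", on_accessible_stack(0x5000));
	return 0;
}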
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-11-kaleshsingh@google.com
In non-protected nVHE mode (non-pKVM) the host can directly access
hypervisor memory, and unwinding of the hypervisor stacktrace is
done from EL1 to avoid the memory overhead of shared buffers.
To unwind the hypervisor stack from EL1 the host needs to know the
starting point for the unwind and information that will allow it to
translate hypervisor stack addresses to the corresponding kernel
addresses. This patch sets up this bookkeeping. It is made use of
later in the series.
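A rough sketch of what that bookkeeping could look like; the structure,
field names and addresses below are illustrative, not the kernel's actual
layout:

#include <stdint.h>
#include <stdio.h>

struct hyp_unwind_params {
	uint64_t fp;		/* starting frame pointer, saved at panic */
	uint64_t pc;		/* starting PC, saved at panic */
	uint64_t stack_hyp_va;	/* hyp VA of the stack page */
	uint64_t stack_kern_va;	/* kernel VA of the same memory */
	uint64_t stack_size;
};

/* Translate a hyp stack address into one the host kernel can read. */
static uint64_t hyp_to_kern(const struct hyp_unwind_params *p, uint64_t hyp_va)
{
	return p->stack_kern_va + (hyp_va - p->stack_hyp_va);
}

int main(void)
{
	struct hyp_unwind_params p = {
		.fp            = 0xffff800008000f40ULL,
		.pc            = 0xffff800008123456ULL,
		.stack_hyp_va  = 0xffff800008000000ULL,
		.stack_kern_va = 0xffff000041230000ULL,
		.stack_size    = 0x1000,
	};

	printf("first frame record lives at kernel VA 0x%llx\n",
	       (unsigned long long)hyp_to_kern(&p, p.fp));
	return 0;
}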
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-10-kaleshsingh@google.com
On hyp stack overflow, switch to a 16-byte aligned secondary stack.
This provides us with stack space to better handle overflows, and is
used in a subsequent patch to dump the hypervisor stacktrace.
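A minimal sketch of the idea, assuming an illustrative 4KiB size for the
secondary stack; on overflow the exception entry code would point SP at
the top of such a buffer before running the panic and stacktrace code:

#include <stdint.h>
#include <stdio.h>

#define OVERFLOW_STACK_SIZE 4096

/* Statically allocated secondary stack, 16-byte aligned as AArch64
 * requires for SP. */
static _Alignas(16) unsigned char overflow_stack[OVERFLOW_STACK_SIZE];

int main(void)
{
	uintptr_t top = (uintptr_t)overflow_stack + OVERFLOW_STACK_SIZE;

	printf("overflow stack top: 0x%lx (16-byte aligned: %s)\n",
	       (unsigned long)top, (top % 16 == 0) ? "yes" : "no");
	return 0;
}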
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220726073750.3219117-8-kaleshsingh@google.com