WSL2-Linux-Kernel

Commit graph

Author	SHA1	Message	Date
Sean Christopherson	80a3e4ae96	KVM: x86/mmu: Map TDP MMU leaf SPTE iff target level is reached Map the leaf SPTE when handling a TDP MMU page fault if and only if the target level is reached. A recent commit reworked the retry logic and incorrectly assumed that walking SPTEs would never "fail", as the loop either bails (retries) or installs parent SPs. However, the iterator itself will bail early if it detects a frozen (REMOVED) SPTE when stepping down. The TDP iterator also rereads the current SPTE before stepping down specifically to avoid walking into a part of the tree that is being removed, which means it's possible to terminate the loop without the guts of the loop observing the frozen SPTE, e.g. if a different task zaps a parent SPTE between the initial read and try_step_down()'s refresh. Mapping a leaf SPTE at the wrong level results in all kinds of badness as page table walkers interpret the SPTE as a page table, not a leaf, and walk into the weeds. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1025 at arch/x86/kvm/mmu/tdp_mmu.c:1070 kvm_tdp_mmu_map+0x481/0x510 Modules linked in: kvm_intel CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_tdp_mmu_map+0x481/0x510 RSP: 0018:ffffc9000072fba8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffffc9000072fcc0 RCX: 0000000000000027 RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8 RBP: ffff888107d45a10 R08: ffff888277c5b4c0 R09: ffffc9000072fa48 R10: 0000000000000001 R11: 0000000000000001 R12: ffffc9000073a0e0 R13: ffff88810fc54800 R14: ffff888107d1ae60 R15: ffff88810fc54f90 FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0 Call Trace: <TASK> kvm_tdp_page_fault+0x10c/0x130 kvm_mmu_page_fault+0x103/0x680 vmx_handle_exit+0x132/0x5a0 [kvm_intel] 
vcpu_enter_guest+0x60c/0x16f0 kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0 kvm_vcpu_ioctl+0x271/0x660 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> ---[ end trace 0000000000000000 ]--- Invalid SPTE change: cannot replace a present leaf SPTE with another present leaf SPTE mapping a different PFN! as_id: 0 gfn: 100200 old_spte: 600000112400bf3 new_spte: 6000001126009f3 level: 2 ------------[ cut here ]------------ kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:__handle_changed_spte.cold+0x95/0x9c RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246 RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027 RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8 RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0 R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3 R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01 FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0 Call Trace: <TASK> kvm_tdp_mmu_map+0x3b0/0x510 kvm_tdp_page_fault+0x10c/0x130 kvm_mmu_page_fault+0x103/0x680 vmx_handle_exit+0x132/0x5a0 [kvm_intel] vcpu_enter_guest+0x60c/0x16f0 kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0 kvm_vcpu_ioctl+0x271/0x660 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> Modules linked in: kvm_intel ---[ end trace 0000000000000000 ]--- Fixes: `63d28a25e0` ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry") Cc: Robert Hoo <robert.hu@linux.intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221213033030.83345-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	
2022-12-23 12:33:52 -05:00
Sean Christopherson	f5d16bb9be	KVM: x86/mmu: Don't attempt to map leaf if target TDP MMU SPTE is frozen Hoist the is_removed_spte() check above the "level == goal_level" check when walking SPTEs during a TDP MMU page fault to avoid attempting to map a leaf entry if said entry is frozen by a different task/vCPU. ------------[ cut here ]------------ WARNING: CPU: 3 PID: 939 at arch/x86/kvm/mmu/tdp_mmu.c:653 kvm_tdp_mmu_map+0x269/0x4b0 Modules linked in: kvm_intel CPU: 3 PID: 939 Comm: nx_huge_pages_t Not tainted 6.1.0-rc4+ #67 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 RIP: 0010:kvm_tdp_mmu_map+0x269/0x4b0 RSP: 0018:ffffc9000068fba8 EFLAGS: 00010246 RAX: 00000000000005a0 RBX: ffffc9000068fcc0 RCX: 0000000000000005 RDX: ffff88810741f000 RSI: ffff888107f04600 RDI: ffffc900006a3000 RBP: 060000010b000bf3 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 000ffffffffff000 R12: 0000000000000005 R13: ffff888113670000 R14: ffff888107464958 R15: 0000000000000000 FS: 00007f01c942c740(0000) GS:ffff888277cc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000117013006 CR4: 0000000000172ea0 Call Trace: <TASK> kvm_tdp_page_fault+0x10c/0x130 kvm_mmu_page_fault+0x103/0x680 vmx_handle_exit+0x132/0x5a0 [kvm_intel] vcpu_enter_guest+0x60c/0x16f0 kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0 kvm_vcpu_ioctl+0x271/0x660 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> ---[ end trace 0000000000000000 ]--- Fixes: `63d28a25e0` ("KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry") Cc: Robert Hoo <robert.hu@linux.intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Robert Hoo <robert.hu@linux.intel.com> Message-Id: <20221213033030.83345-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:33:52 -05:00
Sean Christopherson	a0860d68a2	KVM: nVMX: Don't stuff secondary execution control if it's not supported When stuffing the allowed secondary execution controls for nested VMX in response to CPUID updates, don't set the allowed-1 bit for a feature that isn't supported by KVM, i.e. isn't allowed by the canonical vmcs_config. WARN if KVM attempts to manipulate a feature that isn't supported. All features that are currently stuffed are always advertised to L1 for nested VMX if they are supported in KVM's base configuration, and no additional features should ever be added to the CPUID-induced stuffing (updating VMX MSRs in response to CPUID updates is a long-standing KVM flaw that is slowly being fixed). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221213062306.667649-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:32:03 -05:00
Sean Christopherson	31de69f4ee	KVM: nVMX: Properly expose ENABLE_USR_WAIT_PAUSE control to L1 Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the feature is supported in hardware and enabled in KVM's base, non-nested configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported. This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and obviously allows L1 to enable the feature for L2. KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing the allowed-1 control in a vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when updating secondary controls in response to KVM_SET_CPUID(2), but (a) that depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction that the guest value must be a strict subset of the supported host value. Although no past commit explicitly enabled nested support for WAITPKG, doing so is safe and functionally correct from an architectural perspective as no additional KVM support is needed to virtualize TPAUSE, UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards VM-Exits to L1 as necessary (commit `bf653b78f9`, "KVM: vmx: Introduce handle_unexpected_vmexit and handle WAITPKG vmexit"). Note, KVM always keeps the host's MSR_IA32_UMWAIT_CONTROL resident in hardware, i.e. always runs both L1 and L2 with the host's power management settings for TPAUSE and UMWAIT. See commit `bf09fb6cba` ("KVM: VMX: Stop context switching MSR_IA32_UMWAIT_CONTROL") for more details. 
Fixes: `e69e72faa3` ("KVM: x86: Add support for user wait instructions") Cc: stable@vger.kernel.org Reported-by: Aaron Lewis <aaronlewis@google.com> Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20221213062306.667649-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:22:37 -05:00
Sean Christopherson	057b18756b	KVM: nVMX: Document that ignoring memory failures for VMCLEAR is deliberate Explicitly drop the result of kvm_vcpu_write_guest() when writing the "launch state" as part of VMCLEAR emulation, and add a comment to call out that KVM's behavior is architecturally valid. Intel's pseudocode effectively says that VMCLEAR is a nop if the target VMCS address isn't in memory, e.g. if the address points at MMIO. Add a FIXME to call out that suppressing failures on __copy_to_user() is wrong, as memory (a memslot) does exist in that case. Punt the issue to the future as open coding kvm_vcpu_write_guest() just to make sure the guest dies with -EFAULT isn't worth the extra complexity. The flaw will need to be addressed if KVM ever does something intelligent on uaccess failures, e.g. to support post-copy demand paging, but in that case KVM will need a more thorough overhaul, i.e. VMCLEAR shouldn't need to open code a core KVM helper. No functional change intended. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1527765 ("Error handling issues") Fixes: `587d7e72ae` ("kvm: nVMX: VMCLEAR should not cause the vCPU to shut down") Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221220154224.526568-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:16:49 -05:00
Sean Christopherson	53800f88d4	KVM: selftests: Zero out valid_bank_mask for "all" case in Hyper-V IPI test Zero out the valid_bank_mask when using the fast variant of HVCALL_SEND_IPI_EX to send IPIs to all vCPUs. KVM requires the "var_cnt" and "valid_bank_mask" inputs to be consistent even when targeting all vCPUs. See commit `bd1ba5732b` ("KVM: x86: Get the number of Hyper-V sparse banks from the VARHEAD field"). Fixes: `998489245d` ("KVM: selftests: Hyper-V PV IPI selftest") Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221219220416.395329-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:16:09 -05:00
Sean Christopherson	77b1908e10	KVM: x86: Sanity check inputs to kvm_handle_memory_failure() Add a sanity check in kvm_handle_memory_failure() to assert that a valid x86_exception structure is provided if the memory "failure" wants to propagate a fault into the guest. If a memory failure happens during a direct guest physical memory access, e.g. for nested VMX, KVM hardcodes the failure to X86EMUL_IO_NEEDED and doesn't provide an exception pointer (because the exception struct would just be filled with garbage). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221220153427.514032-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:15:25 -05:00
Peng Hao	3c649918b7	KVM: x86: Simplify kvm_apic_hw_enabled kvm_apic_hw_enabled() only needs to return bool, because no caller uses the MSR_IA32_APICBASE_ENABLE value it returns. Signed-off-by: Peng Hao <flyingpeng@tencent.com> Message-Id: <CAPm50aJ=BLXNWT11+j36Dd6d7nz2JmOBk4u7o_NPQ0N61ODu1g@mail.gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:09:28 -05:00
Vitaly Kuznetsov	8b9e13d2de	KVM: x86: hyper-v: Fix 'using uninitialized value' Coverity warning In kvm_hv_flush_tlb(), 'data_offset' and 'consumed_xmm_halves' variables are used in a mutually exclusive way: in 'hc->fast' we count in 'XMM halves' and increase 'data_offset' otherwise. Coverity discovered, that in one case both variables are incremented unconditionally. This doesn't seem to cause any issues as the only user of 'data_offset'/'consumed_xmm_halves' data is kvm_hv_get_tlb_flush_entries() -> kvm_hv_get_hc_data() which also takes into account 'hc->fast' but is still worth fixing. To make things explicit, put 'data_offset' and 'consumed_xmm_halves' to 'struct kvm_hv_hcall' as a union and use at call sites. This allows to remove explicit 'data_offset'/'consumed_xmm_halves' parameters from kvm_hv_get_hc_data()/kvm_get_sparse_vp_set()/kvm_hv_get_tlb_flush_entries() helpers. Note: 'struct kvm_hv_hcall' is allocated on stack in kvm_hv_hypercall() and is not zeroed, consumers are supposed to initialize the appropriate field if needed. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1527764 ("Uninitialized variables") Fixes: `260970862c` ("KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221208102700.959630-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:08:16 -05:00
Adamos Ttofari	fceb3a36c2	KVM: x86: ioapic: Fix level-triggered EOI and userspace I/OAPIC reconfigure race When scanning userspace I/OAPIC entries, intercept EOI for level-triggered IRQs if the current vCPU has a pending and/or in-service IRQ for the vector in its local APIC, even if the vCPU doesn't match the new entry's destination. This fixes a race between userspace I/OAPIC reconfiguration and IRQ delivery that results in the vector's bit being left set in the remote IRR due to the eventual EOI not being forwarded to the userspace I/OAPIC. Commit `0fc5a36dd6` ("KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race") fixed the in-kernel IOAPIC, but not the userspace IOAPIC configuration, which has a similar race. Fixes: `0fc5a36dd6` ("KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race") Signed-off-by: Adamos Ttofari <attofari@amazon.de> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221208094415.12723-1-attofari@amazon.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:07:40 -05:00
Like Xu	55c590adfe	KVM: x86/pmu: Prevent zero period event from being repeatedly released The current vPMU can reuse the same pmc->perf_event for the same hardware event via pmc_pause/resume_counter(), but this optimization does not apply to a portion of the TSX events (e.g., "event=0x3c,in_tx=1, in_tx_cp=1"), where event->attr.sample_period is legally zero at creation, thus making the perf call to perf_event_period() meaningless (no need to adjust sample period in this case), and instead causing such reusable perf_events to be repeatedly released and created. Avoid releasing zero sample_period events by checking is_sampling_event() to follow the previously enable/disable optimization. Signed-off-by: Like Xu <likexu@tencent.com> Message-Id: <20221207071506.15733-2-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-12-23 12:06:45 -05:00
Jens Axboe	343190841a	io_uring: check for valid register opcode earlier We only check the register opcode value inside the restricted ring section. Move the check into the main io_uring_register() function instead and do it up front. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-12-23 06:40:32 -07:00
Linus Torvalds	8395ae05cb	SCSI misc on 20221222 Mostly small bug fixes and small updates. The only things of note is a qla2xxx fix for crash on hotplug and timeout and the addition of a user exposed abstraction layer for persistent reservation error return handling (which necessitates the conversion of nvme.c as well as SCSI). Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY6SZISYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQkdAP9Juri0 ihkyA9tVx1ZslVOp8V8mWK3P2VROA4ArvcMRVwD/Qxf2REP8Fx2GIgC0sNaRedg3 +ncveg3EpZ1n/NXXeDw= =q+XO -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull more SCSI updates from James Bottomley: "Mostly small bug fixes and small updates. The only things of note is a qla2xxx fix for crash on hotplug and timeout and the addition of a user exposed abstraction layer for persistent reservation error return handling (which necessitates the conversion of nvme.c as well as SCSI)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix crash when I/O abort times out nvme: Convert NVMe errors to PR errors scsi: sd: Convert SCSI errors to PR errors scsi: core: Rename status_byte to sg_status_byte block: Add error codes for common PR failures scsi: sd: sd_zbc: Trace zone append emulation scsi: libfc: Include the correct header	2022-12-22 11:22:31 -08:00
Linus Torvalds	ff75ec43a2	afs next -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmOkQmcACgkQ+7dXa6fL C2vjNg/8CWHpUQj32SSASt5uQvndqBe3xyr+NPYRdNddcu/gS82UoMSuOdMh+afb OhZ/yrkWzkraJVMgEc2mbe0xfGEN9TRQnld+/oy5Co2dxlLAtA/Iw3xZKKG5V5J1 CVE8V2SUtPC0ycJ4XLNuwfmaTEGxZjKju832V4qWvT8oz299Xl4MTsu3zN+Rqpih TAAfokfMVTN57x6PTd+KCl8dmExRyRIq70Iu9OwHPF9lFFDVqGlzGPYJ+gPqSKxV B0F/sW6y1djuyL8wFuZn+W1ECf3DnA9Ol2cSP6qEsWrymQkjY/9tntN52Hu22y9x xP6MHXKQXF+gjmX7aokivTTcOSw6/ript1ykcaNlz7ZX31mxKQIsb++jHSWshs6f 7Ncbjffqg+L8CmgVvaQ63dNVBvHa+Y+9Os8H0t8DZ0DoY6Crv+W8ssQkjW3Lqdoq DIlOFRKEbeXO0+hTM00te3NhP8sYKGtjup8Xuv8TMqye2hE8DvBu80qdvISBmglP P8odB7Rlwxp9n7jkBUFdc86IrQOHchao1Q7xNY4RDe/CZc6smNBgwf7aK5TONZkk qQGmVk2Ca/rFNQxXAV/iHRFCPJtTcdOk7b6kYWHFVj0E+r0iYNeMD4+hKfQK6W4X u4MzrmX9qm8+zN4e+FSpMU7OEDw2Yi87KmGrz4nbvK6wNW7o6Go= =B9ez -----END PGP SIGNATURE----- Merge tag 'afs-next-20221222' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull afs update from David Howells: "A fix for a couple of missing resource counter decrements, two small cleanups of now-unused bits of code and a patch to remove writepage support from afs" * tag 'afs-next-20221222' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Stop implementing ->writepage() afs: remove afs_cache_netfs and afs_zap_permits() declarations afs: remove variable nr_servers afs: Fix lost servers_outstanding count	2022-12-22 11:17:34 -08:00
Linus Torvalds	d1ac1a2b14	perf tools fixes and improvements for v6.2: 2nd batch - Don't stop building perf if python setuptools isn't installed, just disable the affected perf feature. - Remove explicit reference to python 2.x devel files, that warning is about python-devel, no matter what version, being unavailable and thus disabling the linking with libpython. - Don't use -Werror=switch-enum when building the python support that handles libtraceevent enumerations, as there is no good way to test if some specific enum entry is available with the libtraceevent installed on the system. - Introduce 'perf lock contention' --type-filter and --lock-filter, to filter by lock type and lock name: $ sudo ./perf lock record -a -- ./perf bench sched messaging $ sudo ./perf lock contention -E 5 -Y spinlock contended total wait max wait avg wait type caller 802 1.26 ms 11.73 us 1.58 us spinlock __wake_up_common_lock+0x62 13 787.16 us 105.44 us 60.55 us spinlock remove_wait_queue+0x14 12 612.96 us 78.70 us 51.08 us spinlock prepare_to_wait+0x27 114 340.68 us 12.61 us 2.99 us spinlock try_to_wake_up+0x1f5 83 226.38 us 9.15 us 2.73 us spinlock folio_lruvec_lock_irqsave+0x5e $ sudo ./perf lock contention -l contended total wait max wait avg wait address symbol 57 1.11 ms 42.83 us 19.54 us ffff9f4140059000 15 280.88 us 23.51 us 18.73 us ffffffff9d007a40 jiffies_lock 1 20.49 us 20.49 us 20.49 us ffffffff9d0d50c0 rcu_state 1 9.02 us 9.02 us 9.02 us ffff9f41759e9ba0 $ sudo ./perf lock contention -L jiffies_lock,rcu_state contended total wait max wait avg wait type caller 15 280.88 us 23.51 us 18.73 us spinlock tick_sched_do_timer+0x93 1 20.49 us 20.49 us 20.49 us spinlock __softirqentry_text_start+0xeb $ sudo ./perf lock contention -L ffff9f4140059000 contended total wait max wait avg wait type caller 38 779.40 us 42.83 us 20.51 us spinlock worker_thread+0x50 11 216.30 us 39.87 us 19.66 us spinlock queue_work_on+0x39 8 118.13 us 20.51 us 14.77 us spinlock kthread+0xe5 - Fix splitting 
CC into compiler and options when checking if an option is present in clang to build the python binding, needed in systems such as yocto that set CC to, e.g.: "gcc --sysroot=/a/b/c". - Refresh metrics and events for Intel systems: alderlake, alderlake-n, bonnell, broadwell, broadwellde, broadwellx, cascadelakex, elkhartlake, goldmont, goldmontplus, haswell, haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, knightslanding, meteorlake, nehalemep, nehalemex, sandybridge, sapphirerapids, silvermont, skylake, skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp, westmereex. - Add vendor events files (JSON) for AMD Zen 4, from sections 2.1.15.4 "Core Performance Monitor Counters", 2.1.15.5 "L3 Cache Performance Monitor Counters" and Section 7.1 "Fabric Performance Monitor Counter (PMC) Events" in the Processor Programming Reference (PPR) for AMD Family 19h Model 11h Revision B1 processors. This constitutes events which capture op dispatch, execution and retirement, branch prediction, L1 and L2 cache activity, TLB activity, L3 cache activity and data bandwidth for various links and interfaces in the Data Fabric. - Also, from the same PPR are metrics taken from Section 2.1.15.2 "Performance Measurement", including pipeline utilization, which are new to Zen 4 processors and useful for finding performance bottlenecks by analyzing activity at different stages of the pipeline. - Greatly improve the 'srcline', 'srcline_from', 'srcline_to' and 'srcfile' sort keys performance by postponing calling the external addr2line utility to the collapse phase of histogram bucketing. - Fix 'perf test' "all PMU test" to skip parametrized events, which require setup and are not supported by this test. - Update tools/ copies of kernel headers: features, disabled-features, fscrypt.h, i915_drm.h, msr-index.h, powerpc syscall table and kvm.h. - Add .DELETE_ON_ERROR special Makefile target to clean up partially updated files on error. 
- Simplify the mksyscalltbl script for arm64 by avoiding to run the host compiler to create the syscall table, do it all just with the shell script. - Further fixes to honour quiet mode (-q). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCY6SJ+gAKCRCyPKLppCJ+ J5JSAQCSokw2lsIqelDfoBfOQcMwah4ogW1vuO5KiepHgGOjuwD/d+65IxFIRA/h tJjAtq4fReyi4u4eTc1aLgUwFh7V0ws= =rneN -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull more perf tools updates from Arnaldo Carvalho de Melo: "perf tools fixes and improvements: - Don't stop building perf if python setuptools isn't installed, just disable the affected perf feature. - Remove explicit reference to python 2.x devel files, that warning is about python-devel, no matter what version, being unavailable and thus disabling the linking with libpython. - Don't use -Werror=switch-enum when building the python support that handles libtraceevent enumerations, as there is no good way to test if some specific enum entry is available with the libtraceevent installed on the system. 
- Introduce 'perf lock contention' --type-filter and --lock-filter, to filter by lock type and lock name: $ sudo ./perf lock record -a -- ./perf bench sched messaging $ sudo ./perf lock contention -E 5 -Y spinlock contended total wait max wait avg wait type caller 802 1.26 ms 11.73 us 1.58 us spinlock __wake_up_common_lock+0x62 13 787.16 us 105.44 us 60.55 us spinlock remove_wait_queue+0x14 12 612.96 us 78.70 us 51.08 us spinlock prepare_to_wait+0x27 114 340.68 us 12.61 us 2.99 us spinlock try_to_wake_up+0x1f5 83 226.38 us 9.15 us 2.73 us spinlock folio_lruvec_lock_irqsave+0x5e $ sudo ./perf lock contention -l contended total wait max wait avg wait address symbol 57 1.11 ms 42.83 us 19.54 us ffff9f4140059000 15 280.88 us 23.51 us 18.73 us ffffffff9d007a40 jiffies_lock 1 20.49 us 20.49 us 20.49 us ffffffff9d0d50c0 rcu_state 1 9.02 us 9.02 us 9.02 us ffff9f41759e9ba0 $ sudo ./perf lock contention -L jiffies_lock,rcu_state contended total wait max wait avg wait type caller 15 280.88 us 23.51 us 18.73 us spinlock tick_sched_do_timer+0x93 1 20.49 us 20.49 us 20.49 us spinlock __softirqentry_text_start+0xeb $ sudo ./perf lock contention -L ffff9f4140059000 contended total wait max wait avg wait type caller 38 779.40 us 42.83 us 20.51 us spinlock worker_thread+0x50 11 216.30 us 39.87 us 19.66 us spinlock queue_work_on+0x39 8 118.13 us 20.51 us 14.77 us spinlock kthread+0xe5 - Fix splitting CC into compiler and options when checking if an option is present in clang to build the python binding, needed in systems such as yocto that set CC to, e.g.: "gcc --sysroot=/a/b/c". - Refresh metrics and events for Intel systems: alderlake, alderlake-n, bonnell, broadwell, broadwellde, broadwellx, cascadelakex, elkhartlake, goldmont, goldmontplus, haswell, haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, knightslanding, meteorlake, nehalemep, nehalemex, sandybridge, sapphirerapids, silvermont, skylake, skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp, westmereex. 
- Add vendor events files (JSON) for AMD Zen 4, from sections 2.1.15.4 "Core Performance Monitor Counters", 2.1.15.5 "L3 Cache Performance Monitor Counters" and Section 7.1 "Fabric Performance Monitor Counter (PMC) Events" in the Processor Programming Reference (PPR) for AMD Family 19h Model 11h Revision B1 processors. This constitutes events which capture op dispatch, execution and retirement, branch prediction, L1 and L2 cache activity, TLB activity, L3 cache activity and data bandwidth for various links and interfaces in the Data Fabric. - Also, from the same PPR are metrics taken from Section 2.1.15.2 "Performance Measurement", including pipeline utilization, which are new to Zen 4 processors and useful for finding performance bottlenecks by analyzing activity at different stages of the pipeline. - Greatly improve the 'srcline', 'srcline_from', 'srcline_to' and 'srcfile' sort keys performance by postponing calling the external addr2line utility to the collapse phase of histogram bucketing. - Fix 'perf test' "all PMU test" to skip parametrized events, which require setup and are not supported by this test. - Update tools/ copies of kernel headers: features, disabled-features, fscrypt.h, i915_drm.h, msr-index.h, powerpc syscall table and kvm.h. - Add .DELETE_ON_ERROR special Makefile target to clean up partially updated files on error. - Simplify the mksyscalltbl script for arm64 by avoiding to run the host compiler to create the syscall table, do it all just with the shell script. 
- Further fixes to honour quiet mode (-q)" * tag 'perf-tools-for-v6.2-2-2022-12-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits) perf python: Fix splitting CC into compiler and options perf scripting python: Don't be strict at handling libtraceevent enumerations perf arm64: Simplify mksyscalltbl perf build: Remove explicit reference to python 2.x devel files perf vendor events amd: Add Zen 4 mapping perf vendor events amd: Add Zen 4 metrics perf vendor events amd: Add Zen 4 uncore events perf vendor events amd: Add Zen 4 core events perf vendor events intel: Refresh westmereex events perf vendor events intel: Refresh westmereep-sp events perf vendor events intel: Refresh westmereep-dp events perf vendor events intel: Refresh tigerlake metrics and events perf vendor events intel: Refresh snowridgex events perf vendor events intel: Refresh skylakex metrics and events perf vendor events intel: Refresh skylake metrics and events perf vendor events intel: Refresh silvermont events perf vendor events intel: Refresh sapphirerapids metrics and events perf vendor events intel: Refresh sandybridge metrics and events perf vendor events intel: Refresh nehalemex events perf vendor events intel: Refresh nehalemep events ...	2022-12-22 11:07:29 -08:00
Mario Limonciello	e555c85792	ACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+ After we introduced a module parameter and quirk infrastructure for picking the Microsoft GUID over the SOC vendor GUID we discovered that lots and lots of systems are getting this wrong. The table continues to grow, and is becoming unwieldy. We don't really have any benefit to forcing vendors to populate the AMD GUID. This is just extra work, and more and more vendors seem to mess it up. As the Microsoft GUID is used by Windows as well, it's very likely that it won't be messed up like this. So drop all the quirks forcing it and the Rembrandt behavior. This means that Cezanne or later effectively only run the Microsoft GUID codepath with the exception of HP Elitebook 8*5 G9. Fixes: `fd894f05cf` ("ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt") Cc: stable@vger.kernel.org # 6.1 Reported-by: Benjamin Cheng <ben@bcheng.me> Reported-by: bilkow@tutanota.com Reported-by: Paul <paul@zogpog.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2292 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216768 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com> Tested-by: Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:39:31 +01:00
Mario Limonciello	3ea45390e9	ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865 HP Elitebook 865 supports both the AMD GUID w/ _REV 2 and Microsoft GUID with _REV 0. Both have very similar code but the AMD GUID has a special workaround that is specific to a problem with spurious wakeups on systems with Qualcomm WLAN. This is believed to be a bug in the Qualcomm WLAN F/W (it doesn't affect any other WLAN H/W). If this WLAN firmware is fixed this quirk can be dropped. Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:39:31 +01:00
Hans de Goede	3cf3b7f012	ACPI: video: Fix Apple GMUX backlight detection The apple-gmux driver only binds to old GMUX devices which have an IORESOURCE_IO resource (using inb()/outb()) rather than memory-mapped IO (IORESOURCE_MEM). T2 MacBooks use the new style GMUX devices (with IORESOURCE_MEM access), so these are not supported by the apple-gmux driver. This is not a problem since they have working ACPI video backlight support. But the apple_gmux_present() helper only checks if an ACPI device with the "APP000B" HID is present, causing acpi_video_get_backlight_type() to return acpi_backlight_apple_gmux, disabling the acpi_video backlight device. Add a new apple_gmux_backlight_present() helper which checks that the "APP000B" device actually is an old GMUX device with an IORESOURCE_IO resource. This fixes the acpi_video0 backlight no longer registering on T2 MacBooks. Note that people are working to add support for the new style GMUX to Linux: https://github.com/kekrby/linux-t2/commits/wip/hybrid-graphics Once this lands, this patch should be reverted so that acpi_video_get_backlight_type() also prefers the gmux on new style GMUX MacBooks, but for now this is necessary to avoid regressing backlight control on T2 Macs. Fixes: `21245df307` ("ACPI: video: Add Apple GMUX brightness control detection") Reported-and-tested-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:36:49 +01:00
Hans de Goede	7203481fd1	ACPI: resource: Add Asus ExpertBook B2502 to Asus quirks The Asus ExpertBook B2502 has the same keyboard issue as Asus Vivobook K3402ZA/K3502ZA. The kernel overrides IRQ 1 to Edge_High when it should be Active_Low. This patch adds the ExpertBook B2502 model to the existing quirk list of Asus laptops with this issue. Fixes: `b5f9223a10` ("ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2142574 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:35:29 +01:00
Adrian Freund	f3cb9b7408	ACPI: resource: do IRQ override on Lenovo 14ALC7 Commit `bfcdf58380` ("ACPI: resource: do IRQ override on LENOVO IdeaPad") added an override for Lenovo IdeaPad 5 16ALC7. The 14ALC7 variant also suffers from a broken touchscreen and trackpad. Fixes: `9946e39fe8` ("ACPI: resource: skip IRQ override on AMD Zen platforms") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216804 Signed-off-by: Adrian Freund <adrian@freund.io> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:32:34 +01:00
Erik Schumacher	7592b79ba4	ACPI: resource: do IRQ override on XMG Core 15 The Schenker XMG CORE 15 (M22) is Ryzen-6 based and needs IRQ overriding for the keyboard to work. Adding an entry for this laptop to the override_table makes the internal keyboard functional again. Signed-off-by: Erik Schumacher <ofenfisch@googlemail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:29:49 +01:00
Mario Limonciello	5aa9d943e9	ACPI: video: Don't enable fallback path for creating ACPI backlight by default The ACPI video detection code has a module parameter `register_backlight_delay` which is currently configured to 8 seconds. This means that if after 8 seconds of booting no native driver has created a backlight device then the code will attempt to make an ACPI video backlight device. This was intended as a safety mechanism with the backlight overhaul that occurred in kernel 6.1, but as it doesn't appear necessary, set it to be disabled by default. Suggested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:26:42 +01:00
Mario Limonciello	c573e24060	drm/amd/display: Report to ACPI video if no panels were found On desktop APUs amdgpu doesn't create a native backlight device as no eDP panels are found. However if the BIOS has reported backlight control methods in the ACPI tables then an acpi_video0 backlight device will be made 8 seconds after boot. This has manifested in a power slider on a number of desktop APUs ranging from Ryzen 5000 through Ryzen 7000 on various motherboard manufacturers. To avoid this, report to the acpi video detection that the system does not have any panel connected in the native driver. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783786 Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:26:42 +01:00
Mario Limonciello	00a734104a	ACPI: video: Allow GPU drivers to report no panels The current logic for the ACPI backlight detection will create a backlight device if no native or vendor drivers have created 8 seconds after the system has booted if the ACPI tables included backlight control methods. If the GPU drivers have loaded, they may be able to report whether any LCD panels were found. Allow using this information to factor in whether to enable the fallback logic for making an acpi_video0 backlight device. Suggested-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-12-22 17:26:41 +01:00
Jens Axboe	fb857b0bb2	nvme fixes for Linux 6.2 - fix doorbell buffer value endianness (Klaus Jensen) - fix Linux vs NVMe page size mismatch (Keith Busch) - fix a potential use memory access beyond the allocation limit (Keith Busch) - fix a multipath vs blktrace NULL pointer dereference (Yanjun Zhang) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmOkeqkLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYNqvBAAleIay/9mavb1iXTteEFKBN3ml/3Dslc1nETP5FWS 7j8oXaYT4TsXTN4D5lGUPNzeDIVaPvbVeduJLpGbA7Z/g4XSEdfnorc+AmLdje4q LzPAd9u99+P92U5Colj2el4eyPTPzZFbP8IHBZxsR6fTU1i2WyiVYDw+V+MCIQE0 yrg8oU4JHTq3/4B21guADIOK46hYlUMKUhNNsmW1DNsMs/i320ENbZ5gPY4+WiQq t9LK8QDY/NS519KCwtHsZOVwicTpXZoRG19Kx9duiLU+cRUwG5ApdRe0vBXBVjMH R65ekFUu7BUXcRHFoNOZeHzjLnDekYkdfBEHTol9+5fdLMZM3Dbv0CAZindYWA38 VNr63nUkkMh4kShBQjk6VR/TYMsVJ8ZmmrC9Q8kkV9JnvG0ajohQspVhVDwQDKgO +RJSZ0yE6uvw9Vzjha0lpUs/DxMEBzXyCe1kGhecb830lLDB0T9KH5EnBMcnpH9w E5QGqLHfgbqaAqOXq8aBrZRHc0gcb7ubh47LJI4G+d52XrbeHBmRIbpQ4HAq9A7s AeCNtTZ1ksByZsvX/Wwy/Osxs52U9+piRvdBBL39WuM7R0DFQuRykJNqxofhkf6g OG/8i1xd0jQusnyyGNY7jRra9FLcvHNKZTx8HNOFXP7RVeWWdVUrajwaRiGZufQ3 mwg= =1hmt -----END PGP SIGNATURE----- Merge tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme into block-6.2 Pull NVMe fixes from Christoph: "nvme fixes for Linux 6.2 - fix doorbell buffer value endianness (Klaus Jensen) - fix Linux vs NVMe page size mismatch (Keith Busch) - fix a potential use memory access beyond the allocation limit (Keith Busch) - fix a multipath vs blktrace NULL pointer dereference (Yanjun Zhang)" * tag 'nvme-6.2-2022-12-22' of git://git.infradead.org/nvme: nvme: fix multipath crash caused by flush request when blktrace is enabled nvme-pci: fix page size checks nvme-pci: fix mempool alloc size nvme-pci: fix doorbell buffer value endianness	2022-12-22 09:22:35 -07:00
Jeff Layton	789e1e10f2	nfsd: shut down the NFSv4 state objects before the filecache Currently, we shut down the filecache before trying to clean up the stateids that depend on it. This leads to the kernel trying to free an nfsd_file twice, and a refcount overput on the nf_mark. Change the shutdown procedure to tear down all of the stateids prior to shutting down the filecache. Reported-and-tested-by: Wang Yugui <wangyugui@e16-tech.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Fixes: `5e113224c1` ("nfsd: nfsd_file cache entries should be per net namespace") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2022-12-22 10:12:56 -05:00
Arnaldo Carvalho de Melo	09e6f9f983	perf python: Fix splitting CC into compiler and options Noticed this build failure on archlinux:base when building with clang: clang-14: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument] In tools/perf/util/setup.py we check if clang supports that option, but since commit `3cad53a6f9` ("perf python: Account for multiple words in CC") this got broken as in the common case where CC="clang": >>> cc="clang" >>> print(cc.split()[0]) clang >>> option="-ffat-lto-objects" >>> print(str(cc.split()[1:]) + option) []-ffat-lto-objects >>> And then the Popen will call clang with that bogus option name that in turn will not produce the b"unknown argument" or b"is not supported" that this function uses to detect if the option is not available and thus later on clang will be called with an unknown/unsupported option. Fix it by looking if really there are options in the provided CC variable, and if so override 'cc' with the first token and append the options to the 'option' variable. Fixes: `3cad53a6f9` ("perf python: Account for multiple words in CC") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Fangrui Song <maskray@google.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Keeping <john@metanate.com> Cc: Khem Raj <raj.khem@gmail.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sedat Dilek <sedat.dilek@gmail.com> Link: http://lore.kernel.org/lkml/Y6Rq5F5NI0v1QQHM@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2022-12-22 11:34:30 -03:00
David Howells	a9eb558a5b	afs: Stop implementing ->writepage() We're trying to get rid of the ->writepage() hook[1]. Stop afs from using it by unlocking the page and calling afs_writepages_region() rather than folio_write_one(). A flag is passed to afs_writepages_region() to indicate that it should only write a single region so that we don't flush the entire file in ->write_begin(), but do add other dirty data to the region being written to try and reduce the number of RPC ops. This requires ->migrate_folio() to be implemented, so point that at filemap_migrate_folio() for files and also for symlinks and directories. This can be tested by turning on the afs_folio_dirty tracepoint and then doing something like: xfs_io -c "w 2223 7000" -c "w 15000 22222" -c "w 23 7" /afs/my/test/foo and then looking in the trace to see if the write at position 15000 gets stored before page 0 gets dirtied for the write at position 23. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Christoph Hellwig <hch@lst.de> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20221113162902.883850-1-hch@lst.de/ [1] Link: https://lore.kernel.org/r/166876785552.222254.4403222906022558715.stgit@warthog.procyon.org.uk/ # v1	2022-12-22 11:40:35 +00:00
Gaosheng Cui	b3d3ca5567	afs: remove afs_cache_netfs and afs_zap_permits() declarations afs_zap_permits() has been removed since commit `be080a6f43` ("afs: Overhaul permit caching"). afs_cache_netfs has been removed since commit `523d27cda1` ("afs: Convert afs to use the new fscache API"). So remove their declarations from the header file. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20220909070353.1160228-1-cuigaosheng1@huawei.com/	2022-12-22 11:40:35 +00:00
Colin Ian King	318b83b712	afs: remove variable nr_servers Variable nr_servers is no longer being used, the last reference to it was removed in commit `45df846273` ("afs: Fix server list handling") so clean up the code by removing it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20221020173923.21342-1-colin.i.king@gmail.com/	2022-12-22 11:40:35 +00:00
David Howells	36f82c93ee	afs: Fix lost servers_outstanding count The afs_fs_probe_dispatcher() work function is passed a count on net->servers_outstanding when it is scheduled (which may come via its timer). This is passed back to the work_item, passed to the timer or dropped at the end of the dispatcher function. But, at the top of the dispatcher function, there are two checks which skip the rest of the function: if the network namespace is being destroyed or if there are no fileservers to probe. These two return paths, however, do not drop the count passed to the dispatcher, and so, sometimes, the destruction of a network namespace, such as induced by rmmod of the kafs module, may get stuck in afs_purge_servers(), waiting for net->servers_outstanding to become zero. Fix this by adding the missing decrements in afs_fs_probe_dispatcher(). Fixes: `f6cbb368bc` ("afs: Actively poll fileservers to maintain NAT or firewall openings") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/167164544917.2072364.3759519569649459359.stgit@warthog.procyon.org.uk/	2022-12-22 11:40:35 +00:00
Yanjun Zhang	3659fb5ac2	nvme: fix multipath crash caused by flush request when blktrace is enabled The flush request initialized by blk_kick_flush has NULL bio, and it may be dealt with nvme_end_req during io completion. When blktrace is enabled, nvme_trace_bio_complete with multipath activated trying to access NULL pointer bio from flush request results in the following crash: [ 2517.831677] BUG: kernel NULL pointer dereference, address: 000000000000001a [ 2517.835213] #PF: supervisor read access in kernel mode [ 2517.838724] #PF: error_code(0x0000) - not-present page [ 2517.842222] PGD 7b2d51067 P4D 0 [ 2517.845684] Oops: 0000 [#1] SMP NOPTI [ 2517.849125] CPU: 2 PID: 732 Comm: kworker/2:1H Kdump: loaded Tainted: G S 5.15.67-0.cl9.x86_64 #1 [ 2517.852723] Hardware name: XFUSION 2288H V6/BC13MBSBC, BIOS 1.13 07/27/2022 [ 2517.856358] Workqueue: nvme_tcp_wq nvme_tcp_io_work [nvme_tcp] [ 2517.859993] RIP: 0010:blk_add_trace_bio_complete+0x6/0x30 [ 2517.863628] Code: 1f 44 00 00 48 8b 46 08 31 c9 ba 04 00 10 00 48 8b 80 50 03 00 00 48 8b 78 50 e9 e5 fe ff ff 0f 1f 44 00 00 41 54 49 89 f4 55 <0f> b6 7a 1a 48 89 d5 e8 3e 1c 2b 00 48 89 ee 4c 89 e7 5d 89 c1 ba [ 2517.871269] RSP: 0018:ff7f6a008d9dbcd0 EFLAGS: 00010286 [ 2517.875081] RAX: ff3d5b4be00b1d50 RBX: 0000000002040002 RCX: ff3d5b0a270f2000 [ 2517.878966] RDX: 0000000000000000 RSI: ff3d5b0b021fb9f8 RDI: 0000000000000000 [ 2517.882849] RBP: ff3d5b0b96a6fa00 R08: 0000000000000001 R09: 0000000000000000 [ 2517.886718] R10: 000000000000000c R11: 000000000000000c R12: ff3d5b0b021fb9f8 [ 2517.890575] R13: 0000000002000000 R14: ff3d5b0b021fb1b0 R15: 0000000000000018 [ 2517.894434] FS: 0000000000000000(0000) GS:ff3d5b42bfc80000(0000) knlGS:0000000000000000 [ 2517.898299] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2517.902157] CR2: 000000000000001a CR3: 00000004f023e005 CR4: 0000000000771ee0 [ 2517.906053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2517.909930] DR3: 0000000000000000 DR6: 
00000000fffe0ff0 DR7: 0000000000000400 [ 2517.913761] PKRU: 55555554 [ 2517.917558] Call Trace: [ 2517.921294] <TASK> [ 2517.924982] nvme_complete_rq+0x1c3/0x1e0 [nvme_core] [ 2517.928715] nvme_tcp_recv_pdu+0x4d7/0x540 [nvme_tcp] [ 2517.932442] nvme_tcp_recv_skb+0x4f/0x240 [nvme_tcp] [ 2517.936137] ? nvme_tcp_recv_pdu+0x540/0x540 [nvme_tcp] [ 2517.939830] tcp_read_sock+0x9c/0x260 [ 2517.943486] nvme_tcp_try_recv+0x65/0xa0 [nvme_tcp] [ 2517.947173] nvme_tcp_io_work+0x64/0x90 [nvme_tcp] [ 2517.950834] process_one_work+0x1e8/0x390 [ 2517.954473] worker_thread+0x53/0x3c0 [ 2517.958069] ? process_one_work+0x390/0x390 [ 2517.961655] kthread+0x10c/0x130 [ 2517.965211] ? set_kthread_struct+0x40/0x40 [ 2517.968760] ret_from_fork+0x1f/0x30 [ 2517.972285] </TASK> To avoid this situation, add a NULL check for req->bio before calling trace_block_bio_complete. Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn> Signed-off-by: Christoph Hellwig <hch@lst.de>	2022-12-22 09:40:27 +01:00
Takashi Iwai	6bf5f9a8b4	ASoC: Updates for v6.2 Some more small fixes and board quirks that came in since my last update, the main one being the fixes from Kai for issues around the attempts to get kexec working well on SOF based systems. -----BEGIN PGP SIGNATURE----- iQEyBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmOhn2kACgkQJNaLcl1U h9Dfdgf47os8jUAaEuV3/pFl7OOh+L2jR2P5yCK60VHu0CfuHo3lynwpYvS/8wKN XqYz0eeuYOWpFeZ12wZBY/Dnk2dwkXiqpv7e0ID0szAH9TezSlQ3MRno9hwWGloU w3ntU5VeIYTKl91E2y5X9GMoDsfnfh751MsjXOcP40npjGEJpOtAO0z1sIXANSKz ftceXGapvTokSp7mbk68BM5ivom4TM3eDSlQiOeMj2OeOhXRylx5tHeQV3FzVeB+ 4K7bECzveDn/hTYBX2Lopn2stR1RF5S9HDynjo83YDKXOKUp8bJfEHK7R/3y3u56 eIwKgMmxb2eK2IwgjU/7sKP87ARz =OaSY -----END PGP SIGNATURE----- Merge tag 'asoc-v6.2-3' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Updates for v6.2 Some more small fixes and board quirks that came in since my last update, the main one being the fixes from Kai for issues around the attempts to get kexec working well on SOF based systems.	2022-12-22 09:18:38 +01:00
Jaroslav Kysela	fd28941cff	ALSA: usb-audio: Add new quirk FIXED_RATE for JBL Quantum810 Wireless It seems that the firmware is broken and does not accept the UAC_EP_CS_ATTR_SAMPLE_RATE URB. There is only one rate (48000Hz) available in the descriptors for the output endpoint. Create a new quirk QUIRK_FLAG_FIXED_RATE to skip the rate setup when only one rate is available (fixed). BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=216798 Signed-off-by: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20221215153037.1163786-1-perex@perex.cz Signed-off-by: Takashi Iwai <tiwai@suse.de>	2022-12-22 09:13:54 +01:00
Jiapeng Chong	a95e163a4b	ALSA: azt3328: Remove the unused function snd_azf3328_codec_outl() The function snd_azf3328_codec_outl is defined in the azt3328.c file, but not called elsewhere, so remove this unused function. sound/pci/azt3328.c:367:1: warning: unused function 'snd_azf3328_codec_outl'. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3432 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20221213061355.62856-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Takashi Iwai <tiwai@suse.de>	2022-12-22 09:12:26 +01:00
Takashi Iwai	2d78eb0342	Merge branch 'for-next' into for-linus	2022-12-22 09:11:48 +01:00
Linus Torvalds	9d2f6060fe	Tracing fix for 6.2: - Make monitor structures read only -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY6J+vxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qohJAP9Yx3A4xmopkMjpfK1HBzuB7j4U7blN 2NhqKM626unbeQEAi3FhPRc5N/sGBdsUClYZIKau0p3ip1TVfYbhk8vSgwg= =VcGm -----END PGP SIGNATURE----- Merge tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "I missed this minor hardening of the kernel in the first pull. - Make monitor structures read only" * tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv/monitors: Move monitor structure in rodata	2022-12-21 19:03:42 -08:00
Linus Torvalds	af9b3fa15d	Trace probes updates for 6.2: - New "symstr" type for dynamic events that writes the name of the function+offset into the ring buffer and not just the address - Prevent kernel symbol processing on addresses in user space probes (uprobes). - And minor fixes and clean ups -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY5yAHxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qoWoAP9ZLmqgIqlH3Zcms31SR250kLXxsxT3 JHe82hiuI1I3fAD/Z93QLHw9wngLqIMx/wXsdFjTNOGGWdxfclSWI2qI6Q0= =KaJg -----END PGP SIGNATURE----- Merge tag 'trace-probes-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull trace probes updates from Steven Rostedt: - New "symstr" type for dynamic events that writes the name of the function+offset into the ring buffer and not just the address - Prevent kernel symbol processing on addresses in user space probes (uprobes). - And minor fixes and clean ups * tag 'trace-probes-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/probes: Reject symbol/symstr type for uprobe tracing/probes: Add symstr type for dynamic events kprobes: kretprobe events missing on 2-core KVM guest kprobes: Fix check for probe enabled in kill_kprobe() test_kprobes: Fix implicit declaration error of test_kprobes tracing: Fix race where eprobes can be called before the event	2022-12-21 18:57:24 -08:00
Linus Torvalds	7a5189c58b	KVM/riscv changes for 6.2 * Allow unloading KVM module * Allow KVM user-space to set mvendorid, marchid, and mimpid * Several fixes and cleanups -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOhy+QUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroOdUwf+K3i8RHW1H8TF/JSrn1I6nURNLYhb 2wXzl3esOsfswtn6dxEvLEXivcKmD2G9bLpa2UIa3vw1Plg9tdce9IJ5qDodtxVL mlISMUSgMNy+lelKJiG+l5Ld4oJ4HUY0yw/p3Ml9WUpra98UCB0sJ+FsqXr4ndi9 LxkQJrNyZkQcRH2IXjQhKjdjkepFTmkhKs/uCxAZvW9zfUmGX0dcp9W22PTbsapQ IcaBKdVaNN3TXNSIdDCM2Iv+oBN7gJn1CbgFxhkp4L8eE5PvRjFw0QooFMn2TjDw VflP3gIs/41+5tnoPWXGAkKFe/Z5aJjGjx6Yx0WnEEgoAG47RUHYsKIUjw== =8ejV -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull RISC-V kvm updates from Paolo Bonzini: - Allow unloading KVM module - Allow KVM user-space to set mvendorid, marchid, and mimpid - Several fixes and cleanups * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: RISC-V: KVM: Add ONE_REG interface for mvendorid, marchid, and mimpid RISC-V: KVM: Save mvendorid, marchid, and mimpid when creating VCPU RISC-V: Export sbi_get_mvendorid() and friends RISC-V: KVM: Move sbi related struct and functions to kvm_vcpu_sbi.h RISC-V: KVM: Use switch-case in kvm_riscv_vcpu_set/get_reg() RISC-V: KVM: Remove redundant includes of asm/csr.h RISC-V: KVM: Remove redundant includes of asm/kvm_vcpu_timer.h RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config() RISC-V: KVM: Simplify kvm_arch_prepare_memory_region() RISC-V: KVM: Exit run-loop immediately if xfer_to_guest fails RISC-V: KVM: use vma_lookup() instead of find_vma_intersection() RISC-V: KVM: Add exit logic to main.c	2022-12-21 18:52:15 -08:00
Dave Airlie	fe8f5b2f7b	Merge tag 'amd-drm-fixes-6.2-2022-12-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-6.2-2022-12-21: amdgpu: - Avoid large variable on the stack - S0ix fixes - SMU 13.x fixes - VCN fix - Add missing fence reference amdkfd: - Fix init vm error handling - Fix double release of compute pasid Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221221205828.6093-1-alexander.deucher@amd.com	2022-12-22 11:02:56 +10:00
Linus Torvalds	569c3a283c	block-6.2-2022-12-19 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOgp5AQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpm5SD/9tduSZQW00aDm83HbEikWdCgQm0w37tyYl C2+IwRwLF8pnAoSb6yaO7LZM9ZUYfoIfIlkHXkKhT1xNJ/XdeGDgwjOHi106iaEx kG08DcFnUjyJ4Yh6hnnpnSepIo0ckwa18pSaE4smvmKZirj3it3O6xSspyBxtUcv q6PvJDMN15aG6uLHq3xNZPzoI2KYXBDgwanyImRhdvLoOTiS9rok+F9e2ob3lzAa PB+FOipQoKb7M6jbyfZe4KbeTiJh4EYEl5Qa6ebrDIkOTm7zjc8sQbCkNeI7osh+ D0FvEQ1Vsrjj5Bp6N9CmZcrmNagjEcAPbzguxAilrgw2/XvA8d0fymziGXvuyUEv bSAx6lyJzfMLrvtubSqMhIF+8DlccQnnXz2ccacwvAfayytzNJjC9serU+czHA4O ZkPTwZFjAmbn6q6SK3qaOCB9IgITHipj8R/ncGu9KjNvM2QgzM+OIrP0xGxtk6uI ZGrt9nGMUmgjtaliQjiDVZomMewru1lRWPRAjfQ995gmVkejgapUHYoaDtDzaLKZ Q9BaK5CC2jltGUuuoFEnXnwu/Eyvp9y++pKkz4Esb+/Wkst4qyGtr9DOSTnv1wKN W20h3Z5vOAXXquvUJ5S3mQl8TNJHiBz+/CRB9PZG8XFtn8ubGo8XttGdgjQgyLM3 6FHzcZgeWw== =TSec -----END PGP SIGNATURE----- Merge tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux Pull block fixes from Jens Axboe: - Various fixes for BFQ (Yu, Yuwei) - Fix for loop command line parsing (Isaac) - No need to specifically clear REQ_ALLOC_CACHE on IOPOLL downgrade anymore (me) - blk-iocost enum fix for newer gcc (Jiri) - UAF fix for queue release (Ming) - blk-iolatency error handling memory leak fix (Tejun) * tag 'block-6.2-2022-12-19' of git://git.kernel.dk/linux: block: don't clear REQ_ALLOC_CACHE for non-polled requests block: fix use-after-free of q->q_usage_counter block, bfq: only do counting of pending-request for BFQ_GROUP_IOSCHED blk-iolatency: Fix memory leak on add_disk() failures loop: Fix the max_loop commandline argument treatment when it is set to 0 block/blk-iocost (gcc13): keep large values in a new enum block, bfq: replace 0/1 with false/true in bic apis block, bfq: don't return bfqg from __bfq_bic_change_cgroup() block, bfq: fix possible uaf for 'bfqq->bic'	2022-12-21 16:35:26 -08:00
Linus Torvalds	5d4740fc78	io_uring-6.2-2022-12-19 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOgp3oQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpvjeD/4w17ERLignAko51qJFS+lcpjEWFYk63XZN tFaZqGOscH9PertlQu5IstORa/OWY2iCzhi2waMvtHAI9YaT7jpxkgrUdfEoGyNL 6Ij5DIqnlIkZG+cUBXKq+xLhThssJECkqckcVPgtZIbCZzDAL/ffghH94sZY/LxA +cwsloA24s0hjZX3Cm/RNQIgEBf2g4HNNA09Ft3Idd9tSL0WndqcHTasEGAC8K+Z r9ZFsKCSVKB+6wUCYawO5xF+zfm5wA4sD1PXjVA1q++mwDm8BKmpcBG10v3grJ24 qzh+k8wMeiD7BJLDekEyWvklV7bIpbMZ2dzkdMI7n0Cs6WRumhLo+Enrh1l5l9YJ wizqwWykGjWWyLm9QP5R249o7n/T6q7jKqmsBzN+3wWYasq4W7PIjr4hQZ2hqiAW pUdaqvb0V91OQHjDHi4wI4xZnsmgr6eJhDz0JAd5wzc+g2Uav8GWs6wEaW++4HkL IHWggX51oF3Mzjo+Lx0pfs3dkcA5vQ85KDcICeLXnv6HPm90ImZY4cTdSW+YYzlK 351GwaPOTepm4M+hLZHZYVj5pTQPIKAspxwbSNcYlQ4nLhVPfcKkQGZ6di4yHZaC j8zT1opSmh4OqKA9mE/tCUf3s3e2YDmemDuyUKD56luAIw+rsxScC6HEPJPBxrmm hZfEkgw/vQ== =Jq/9 -----END PGP SIGNATURE----- Merge tag 'io_uring-6.2-2022-12-19' of git://git.kernel.dk/linux Pull io_uring fixes from Jens Axboe: - Improve the locking for timeouts. This was originally queued up for the initial pull, but I messed up and it got missed. 
(Pavel) - Fix an issue with running task_work from the wait path, causing some inefficiencies (me) - Add a clear of ->free_iov upfront in the 32-bit compat data importing, so we ensure that it's always sane at completion time (me) - Use call_rcu_hurry() for the eventfd signaling (Dylan) - Ordering fix for multishot recv completions (Pavel) - Add the io_uring trace header to the MAINTAINERS entry (Ammar) * tag 'io_uring-6.2-2022-12-19' of git://git.kernel.dk/linux: MAINTAINERS: io_uring: Add include/trace/events/io_uring.h io_uring/net: fix cleanup after recycle io_uring/net: ensure compat import handlers clear free_iov io_uring: include task_work run after scheduling in wait for events io_uring: don't use TIF_NOTIFY_SIGNAL to test for availability of task_work io_uring: use call_rcu_hurry if signaling an eventfd io_uring: fix overflow handling regression io_uring: ease timeout flush locking requirements io_uring: revise completion_lock locking io_uring: protect cq_timeouts with timeout_lock	2022-12-21 16:28:25 -08:00
Rickard x Andersson	e96b95c2b7	gcov: add support for checksum field In GCC version 12.1 a checksum field was added. This patch fixes a kernel crash occurring during boot when using gcov-kernel with GCC version 12.2. The crash occurred on a system running on i.MX6SX. Link: https://lkml.kernel.org/r/20221220102318.3418501-1-rickaran@axis.com Fixes: `977ef30a7d` ("gcov: support GCC 12.1 and newer compilers") Signed-off-by: Rickard x Andersson <rickaran@axis.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Tested-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reviewed-by: Martin Liska <mliska@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:52 -08:00
Liam Howlett	c5651b31f5	test_maple_tree: add test for mas_spanning_rebalance() on insufficient data Add a test to the maple tree test suite so that the spanning rebalance insufficient node issue does not go undetected again. Link: https://lkml.kernel.org/r/20221219161922.2708732-3-Liam.Howlett@oracle.com Fixes: `54a611b605` ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:52 -08:00
Liam Howlett	0abb964aae	maple_tree: fix mas_spanning_rebalance() on insufficient data Mike Rapoport contacted me off-list with a regression in running criu. Periodic tests fail with an RCU stall during execution. Although rare, it is possible to hit this with other uses so this patch should be backported to fix the regression. This patchset adds the fix and a test case to the maple tree test suite. This patch (of 2): An insufficient node was causing an out-of-bounds access on the node in mas_leaf_max_gap(). The cause was the faulty detection of the new node being a root node when overwriting many entries at the end of the tree. Fix the detection of a new root and ensure there is sufficient data prior to entering the spanning rebalance loop. Link: https://lkml.kernel.org/r/20221219161922.2708732-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20221219161922.2708732-2-Liam.Howlett@oracle.com Fixes: `54a611b605` ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Mike Rapoport <rppt@kernel.org> Tested-by: Mike Rapoport <rppt@kernel.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:52 -08:00
Mike Kravetz	e700898fa0	hugetlb: really allocate vma lock for all sharable vmas Commit `bbff39cc6c` ("hugetlb: allocate vma lock for all sharable vmas") removed the pmd sharable checks in the vma lock helper routines. However, it left the functional version of helper routines behind #ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE. Therefore, the vma lock is not being used for sharable vmas on architectures that do not support pmd sharing. On these architectures, a potential fault/truncation race is exposed that could leave pages in a hugetlb file past i_size until the file is removed. Move the functional vma lock helpers outside the ifdef, and remove the non-functional stubs. Since the vma lock is not just for pmd sharing, rename the routine __vma_shareable_flags_pmd. Link: https://lkml.kernel.org/r/20221212235042.178355-1-mike.kravetz@oracle.com Fixes: `bbff39cc6c` ("hugetlb: allocate vma lock for all sharable vmas") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: James Houghton <jthoughton@google.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Peter Xu <peterx@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:52 -08:00
Arnd Bergmann	7ba594d700	kmsan: export kmsan_handle_urb USB support can be in a loadable module, and this causes a link failure with KMSAN: ERROR: modpost: "kmsan_handle_urb" [drivers/usb/core/usbcore.ko] undefined! Export the symbol so it can be used by this module. Link: https://lkml.kernel.org/r/20221215162710.3802378-1-arnd@kernel.org Fixes: `553a80188a` ("kmsan: handle memory sent to/from USB") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:52 -08:00
Arnd Bergmann	aaa746ad8b	kmsan: include linux/vmalloc.h This is needed for the vmap/vunmap declarations: mm/kmsan/kmsan_test.c:316:9: error: implicit declaration of function 'vmap' is invalid in C99 [-Werror,-Wimplicit-function-declaration] vbuf = vmap(pages, npages, VM_MAP, PAGE_KERNEL); ^ mm/kmsan/kmsan_test.c:316:29: error: use of undeclared identifier 'VM_MAP' vbuf = vmap(pages, npages, VM_MAP, PAGE_KERNEL); ^ mm/kmsan/kmsan_test.c:322:3: error: implicit declaration of function 'vunmap' is invalid in C99 [-Werror,-Wimplicit-function-declaration] vunmap(vbuf); ^ Link: https://lkml.kernel.org/r/20221215163046.4079767-1-arnd@kernel.org Fixes: `8ed691b02a` ("kmsan: add tests for KMSAN") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:51 -08:00
Mathieu Desnoyers	38ce7c9bdf	mm/mempolicy: fix memory leak in set_mempolicy_home_node system call When encountering any vma in the range with policy other than MPOL_BIND or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put on the policy just allocated with mpol_dup(). This allows arbitrary users to leak kernel memory. Link: https://lkml.kernel.org/r/20221215194621.202816-1-mathieu.desnoyers@efficios.com Fixes: `c6018b4b25` ("mm/mempolicy: add set_mempolicy_home_node syscall") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: <stable@vger.kernel.org> [5.17+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:51 -08:00
Vlastimil Babka	6f12be792f	mm, mremap: fix mremap() expanding vma with addr inside vma Since 6.1 we have noticed random rpm install failures that were tracked to mremap() returning -ENOMEM and to commit `ca3d76b0aa` ("mm: add merging after mremap resize"). The problem occurs when mremap() expands a VMA in place, but using a starting address that's not vma->vm_start, but somewhere in the middle. The extension_pgoff calculation introduced by the commit is wrong in that case, so vma_merge() fails due to pgoffs not being compatible. Fix the calculation. By the way it seems that the situations, where rpm now expands a vma from the middle, were made possible also due to that commit, thanks to the improved vma merging. Yet it should work just fine, except for the buggy calculation. Link: https://lkml.kernel.org/r/20221216163227.24648-1-vbabka@suse.cz Reported-by: Jiri Slaby <jirislaby@kernel.org> Link: https://bugzilla.suse.com/show_bug.cgi?id=1206359 Fixes: `ca3d76b0aa` ("mm: add merging after mremap resize") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jakub Matěna <matenajakub@gmail.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-12-21 14:31:51 -08:00
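
The page-offset arithmetic described in commit 6f12be792f can be illustrated with a small worked example. This is only a sketch reconstructed from the commit message, not the kernel patch itself: the function names, the 4 KiB page size, and the sample addresses are assumptions chosen for illustration. It compares a calculation that implicitly assumes `addr == vma->vm_start` with one that measures the extension's offset from the start of the VMA.

```python
# Hypothetical illustration of the extension_pgoff arithmetic; not kernel code.
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def extension_pgoff_naive(vm_pgoff: int, old_len: int) -> int:
    """Offset that is only right when the remapped range starts at vm_start."""
    return vm_pgoff + (old_len >> PAGE_SHIFT)


def extension_pgoff_from_vma(vm_start: int, vm_pgoff: int,
                             addr: int, old_len: int) -> int:
    """Offset of the first new page, measured from the start of the VMA."""
    extension_start = addr + old_len  # first byte past the old range
    return vm_pgoff + ((extension_start - vm_start) >> PAGE_SHIFT)


# A 4-page VMA at 0x10000..0x14000 with vm_pgoff 0. The caller expands the
# last two pages in place, so addr points into the middle of the VMA.
vm_start, vm_pgoff = 0x10000, 0
addr, old_len = 0x12000, 0x2000

naive = extension_pgoff_naive(vm_pgoff, old_len)
fixed = extension_pgoff_from_vma(vm_start, vm_pgoff, addr, old_len)
print(naive, fixed)
```

With these sample values the extension begins at 0x14000, four pages into the VMA, while the naive formula reports page 2. A mismatch of this kind is what the commit message means by "pgoffs not being compatible". When `addr` equals `vm_start`, the two functions agree.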


1153561 commits (all branches)
