WSL2-Linux-Kernel

Commit graph

Author	SHA1	Message	Date
Mitchell Levy	4fa7bc1bbd	Merge fix/xsaves-lbr/5.15 into v5.15 * commit '46b414261e8193c1118924e0c62b773ad1747aff': (1884 commits) x86/fpu: Avoid writing LBR bit to IA32_XSS unless supported Linux 5.15.167 udp: fix receiving fraglist GSO packets memcg: protect concurrent access to mem_cgroup_idr btrfs: fix race between direct IO write and fsync when using same fd net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket x86/mm: Fix PTI for i386 some more net: drop bad gso csum_start and offset in virtio_net_hdr gso: fix dodgy bit handling for GSO_UDP_L4 net: change maximum number of UDP segments to 128 net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation gpio: rockchip: fix OF node leak in probe() drm/i915/fence: Mark debug_fence_free() with __maybe_unused drm/i915/fence: Mark debug_fence_init_onstack() with __maybe_unused ASoC: sunxi: sun4i-i2s: fix LRCLK polarity in i2s mode nvmet-tcp: fix kernel crash if commands allocation fails arm64: acpi: Harden get_cpu_for_acpi_id() against missing CPU entry arm64: acpi: Move get_cpu_for_acpi_id() to a header ACPI: processor: Fix memory leaks in error paths of processor_add() ACPI: processor: Return an error if acpi_processor_get_info() fails in processor_add() ...	2024-10-10 15:55:58 -07:00
Paolo Pisati	cca17211c8	m68k: amiga: Turn off Warp1260 interrupts during boot commit 1d8491d3e726984343dd8c3cdbe2f2b47cfdd928 upstream. On an Amiga 1200 equipped with a Warp1260 accelerator, an interrupt storm coming from the accelerator board causes the machine to crash in local_irq_enable() or auto_irq_enable(). Disabling interrupts for the Warp1260 in amiga_parse_bootinfo() fixes the problem. Link: https://lore.kernel.org/r/ZkjwzVwYeQtyAPrL@amaterasu.local Cc: stable <stable@kernel.org> Signed-off-by: Paolo Pisati <p.pisati@gmail.com> Reviewed-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20240601153254.186225-1-p.pisati@gmail.com Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-08-19 05:45:13 +02:00
Pablo Neira Ayuso	8ad0ec7f36	netfilter: nf_tables: rise cap on SELinux secmark context [ Upstream commit e29630247be24c3987e2b048f8e152771b32d38b ] secmark context is artificially limited 256 bytes, rise it to 4Kbytes. Fixes: `fb96194545` ("netfilter: nf_tables: add SECMARK support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-08-19 05:44:58 +02:00
Iouri Tarassov	2d38986289	drivers: hv: dxgkrnl: Implement known escapes Implement an escape to build test command buffer. Implement other known escapes. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:11 +00:00
Iouri Tarassov	10f168f9c2	drivers: hv: dxgkrnl: Implement D3DDKMTIsFeatureEnabled API D3DKMTIsFeatureEnabled is used to query if a particular feature is supported by the given adapter. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:10 +00:00
Iouri Tarassov	d05528a083	drivers: hv: dxgkrnl: Implement the D3DKMTEnumProcesses API D3DKMTEnumProcesses is used to enumerate PIDs for all processes, which opened the /dev/dxg device. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:10 +00:00
Iouri Tarassov	61a4538209	drivers: hv: dxgkrnl: Added implementation for D3DKMTInvalidateCache D3DKMTInvalidateCache is called by user mode drivers when the device doesn't support cache coherent access to compute device allocations. It needs to be called after an allocation was accessed by CPU and now needs to be accessed by the device. And vice versa. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:10 +00:00
Iouri Tarassov	6fc4a21466	drivers: hv: dxgkrnl: Implement D3DKMTWaitSyncFile Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:10 +00:00
Iouri Tarassov	329f7fa954	drivers: hv: dxgkrnl: Creation of dxgsyncfile objects Implement the ioctl to create a dxgsyncfile object (LX_DXCREATESYNCFILE). This object is a wrapper around a monitored fence sync object and a fence value. dxgsyncfile is built on top of the Linux sync_file object and provides a way for the user mode to synchronize with the execution of the device DMA packets. The ioctl creates a dxgsyncfile object for the given GPU synchronization object and a fence value. A file descriptor of the sync_file object is returned to the caller. The caller could wait for the object by using poll(). When the underlying GPU synchronization object is signaled on the host, the host sends a message to the virtual machine and the sync_file object is signaled. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:09 +00:00
Iouri Tarassov	e2c32b38d1	drivers: hv: dxgkrnl: Manage compute device virtual addresses Implement ioctls to manage compute device virtual addresses (VA): - LX_DXRESERVEGPUVIRTUALADDRESS, - LX_DXFREEGPUVIRTUALADDRESS, - LX_DXMAPGPUVIRTUALADDRESS, - LX_DXUPDATEGPUVIRTUALADDRESS. Compute devices access memory by using virtual addresses. Each process has a dedicated VA space. The video memory manager on the host is responsible for updating device page tables before submitting a DMA buffer for execution. The LX_DXRESERVEGPUVIRTUALADDRESS ioctl reserves a portion of the process compute device VA space. The LX_DXMAPGPUVIRTUALADDRESS ioctl reserves a portion of the process compute device VA space and maps it to the given compute device allocation. The LX_DXFREEGPUVIRTUALADDRESS ioctl frees the previously reserved portion of the compute device VA space. The LX_DXUPDATEGPUVIRTUALADDRESS ioctl adds operations to modify the compute device VA space to a compute device execution context. It allows the operations to be queued and synchronized with execution of other compute device DMA buffers. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:09 +00:00
Iouri Tarassov	a2b48aede5	drivers: hv: dxgkrnl: Manage residency of allocations Implement ioctls to manage residency of compute device allocations: - LX_DXMAKERESIDENT, - LX_DXEVICT. An allocation is "resident" when the compute device is set up to access it. It means that the allocation is in the local device memory or in non-pageable system memory. The current design does not support on demand compute device page faulting. An allocation must be resident before the compute device is allowed to access it. The LX_DXMAKERESIDENT ioctl instructs the video memory manager to make the given allocations resident. The operation is submitted to a paging queue (dxgpagingqueue). When the ioctl returns a "pending" status, a monitored fence sync object can be used to synchronize with the completion of the operation. The LX_DXEVICT ioctl instructs the video memory manager to evict the given allocations from device accessible memory. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:09 +00:00
Iouri Tarassov	cb8161abde	drivers: hv: dxgkrnl: Ioctls to manage scheduling priority Implement ioctls to manage compute device scheduling priority: - LX_DXGETCONTEXTINPROCESSSCHEDULINGPRIORITY - LX_DXGETCONTEXTSCHEDULINGPRIORITY - LX_DXSETCONTEXTINPROCESSSCHEDULINGPRIORITY - LX_DXSETCONTEXTSCHEDULINGPRIORITY Each compute device execution context has an assigned scheduling priority. It is used by the compute device scheduler on the host to pick contexts for execution. There is a global priority and a priority within a process. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:09 +00:00
Iouri Tarassov	15f5e8e6c0	drivers: hv: dxgkrnl: Offer and reclaim allocations Implement ioctls to offer and reclaim compute device allocations: - LX_DXOFFERALLOCATIONS, - LX_DXRECLAIMALLOCATIONS2 When a user mode driver (UMD) does not need to access an allocation, it can "offer" it by issuing the LX_DXOFFERALLOCATIONS ioctl. This means that the allocation is not in use and its local device memory could be evicted. The freed space could be given to another allocation. When the allocation is again needed, the UMD can attempt to "reclaim" the allocation by issuing the LX_DXRECLAIMALLOCATIONS2 ioctl. If the allocation is still not evicted, the reclaim operation succeeds and no other action is required. If the reclaim operation fails, the caller must restore the content of the allocation before it can be used by the device. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:09 +00:00
Iouri Tarassov	92b6a85b1a	drivers: hv: dxgkrnl: Ioctls to query statistics and clock calibration Implement ioctls to query statistics from the VGPU device (LX_DXQUERYSTATISTICS) and to query clock calibration (LX_DXQUERYCLOCKCALIBRATION). The LX_DXQUERYSTATISTICS ioctl is used to query various statistics from the compute device on the host. The LX_DXQUERYCLOCKCALIBRATION ioctl queries the compute device clock and is used for performance monitoring. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:09 +00:00
Iouri Tarassov	39d7838ac1	drivers: hv: dxgkrnl: Ioctl to put device to error state Implement the ioctl to put the virtual compute device to the error state (LX_DXMARKDEVICEASERROR). This ioctl is used by the user mode driver when it detects an unrecoverable error condition. When a compute device is put to the error state, all subsequent ioctl calls to the device will fail. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:09 +00:00
Iouri Tarassov	ad1c37783f	drivers: hv: dxgkrnl: The escape ioctl Implement the escape ioctl (LX_DXESCAPE). This ioctl is used to send/receive private data between user mode compute device driver (guest) and kernel mode compute device driver (host). It allows the user mode driver to extend the virtual compute device API. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	c61c38dc6a	drivers: hv: dxgkrnl: Query video memory information Implement the ioctl to query video memory information from the host (LX_DXQUERYVIDEOMEMORYINFO). Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	bcd35de6f4	drivers: hv: dxgkrnl: Flush heap transitions Implement the ioctl to flush heap transitions (LX_DXFLUSHHEAPTRANSITIONS). The ioctl is used to ensure that the video memory manager on the host flushes all internal operations. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	77bf29aa37	drivers: hv: dxgkrnl: Manage device allocation properties Implement ioctls to manage properties of a compute device allocation: - LX_DXUPDATEALLOCPROPERTY, - LX_DXSETALLOCATIONPRIORITY, - LX_DXGETALLOCATIONPRIORITY, - LX_DXQUERYALLOCATIONRESIDENCY, - LX_DXCHANGEVIDEOMEMORYRESERVATION. The LX_DXUPDATEALLOCPROPERTY ioctl requests the host to update various properties of a compute device allocation. The LX_DXSETALLOCATIONPRIORITY and LX_DXGETALLOCATIONPRIORITY ioctls are used to set/get allocation priority, which defines the importance of the allocation to be in the local device memory. The LX_DXQUERYALLOCATIONRESIDENCY ioctl queries if the allocation is located in the compute device accessible memory. The LX_DXCHANGEVIDEOMEMORYRESERVATION ioctl changes compute device memory reservation of an allocation. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	f403f70856	drivers: hv: dxgkrnl: Map(unmap) CPU address to device allocation Implement ioctls to map/unmap CPU virtual addresses to compute device allocations - LX_DXLOCK2 and LX_DXUNLOCK2. The LX_DXLOCK2 ioctl maps a CPU virtual address to a compute device allocation. The allocation could be located in system memory or local device memory on the host. When the device allocation is created from the guest system memory (existing sysmem allocation), the allocation CPU address is known and is returned to the caller. For other CPU visible allocations the code flow is the following: 1. A VM bus message is sent to the host to map the allocation 2. The host allocates a portion of the guest IO space and maps it to the allocation backing store. The IO space address of the allocation is returned back to the guest. 3. The guest allocates a CPU virtual address and maps it to the IO space (see the dxg_map_iospace function). 4. The CPU VA is returned back to the caller cpu_address_mapped and cpu_address_refcount are used to track how many times an allocation was mapped. The LX_DXUNLOCK2 ioctl unmaps a CPU virtual address from a compute device allocation. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	3c3a6d1ee1	drivers: hv: dxgkrnl: Query the dxgdevice state Implement the ioctl to query the dxgdevice state - LX_DXGETDEVICESTATE. The IOCTL is used to query the state of the given dxgdevice object (active, error, etc.). A call to the dxgdevice execution state could be high frequency. The following method is used to avoid sending a synchronous VM bus message to the host for every call: - When a dxgdevice is created, a pointer to dxgglobal->device_state_counter is sent to the host - Every time the device state on the host is changed, the host will send an asynchronous message to the guest (DXGK_VMBCOMMAND_SETGUESTDATA) and the guest will increment the device_state_counter value. - the dxgdevice object has execution_state_counter member, which is equal to dxgglobal->device_state_counter value at the time when LX_DXGETDEVICESTATE was last processed.. - if execution_state_counter is different from device_state_counter, the dxgk_vmbcommand_getdevicestate VM bus message is sent to the host. Otherwise, the cached value is returned to the caller. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	359c7b1ac2	drivers: hv: dxgkrnl: Share objects with the host Implement the LX_DXSHAREOBJECTWITHHOST ioctl. This ioctl is used to create a Windows NT handle on the host for the given shared object (resource or sync object). The NT handle is returned to the caller. The caller could share the NT handle with a host application, which needs to access the object. The host application can open the shared resource using the NT handle. This way the guest and the host have access to the same object. Fix incorrect handling of error results from copy_from_user(). Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	e22f5ce9f2	drivers: hv: dxgkrnl: Submit execution commands to the compute device Implements ioctls for submission of compute device buffers for execution: - LX_DXSUBMITCOMMAND The ioctl is used to submit a command buffer to the device, working in the "packet scheduling" mode. - LX_DXSUBMITCOMMANDTOHWQUEUE The ioctl is used to submit a command buffer to the device, working in the "hardware scheduling" mode. To improve performance both ioctls use asynchronous VM bus messages to communicate with the host as these are high frequency operations. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	aa067375a7	drivers: hv: dxgkrnl: Creation of paging queue objects. Implement ioctls for creation/destruction of the paging queue objects: - LX_DXCREATEPAGINGQUEUE, - LX_DXDESTROYPAGINGQUEUE Paging queue objects (dxgpagingqueue) contain operations, which handle residency of device accessible allocations. An allocation is resident, when the device has access to it. For example, the allocation resides in local device memory or device page tables point to system memory which is made non-pageable. Each paging queue has an associated monitored fence sync object, which is used to detect when a paging operation is completed. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	cd03c649d3	drivers: hv: dxgkrnl: Sharing of sync objects Implement creation of a shared sync objects and the ioctl for sharing dxgsyncobject objects between processes in the virtual machine. Sync objects are shared using file descriptor (FD) handles. The name "NT handle" is used to be compatible with Windows implementation. An FD handle is created by the LX_DXSHAREOBJECTS ioctl. The created FD handle could be sent to another process using any Linux API. To use a shared sync object in other ioctls, the object needs to be opened using its FD handle. A sync object is opened by the LX_DXOPENSYNCOBJECTFROMNTHANDLE2 ioctl, which returns a d3dkmthandle value. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	3f4a94d21a	drivers: hv: dxgkrnl: Sharing of dxgresource objects Implement creation of shared resources and ioctls for sharing dxgresource objects between processes in the virtual machine. A dxgresource object is a collection of dxgallocation objects. The driver API allows addition/removal of allocations to a resource, but has limitations on addition/removal of allocations to a shared resource. When a resource is "sealed", addition/removal of allocations is not allowed. Resources are shared using file descriptor (FD) handles. The name "NT handle" is used to be compatible with Windows implementation. An FD handle is created by the LX_DXSHAREOBJECTS ioctl. The given FD handle could be sent to another process using any Linux API. To use a shared resource object in other ioctls the object needs to be opened using its FD handle. An resource object is opened by the LX_DXOPENRESOURCEFROMNTHANDLE ioctl. This ioctl returns a d3dkmthandle value, which can be used to reference the resource object. The LX_DXQUERYRESOURCEINFOFROMNTHANDLE ioctl is used to query private driver data of a shared resource object. This private data needs to be used to actually open the object using the LX_DXOPENRESOURCEFROMNTHANDLE ioctl. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:08 +00:00
Iouri Tarassov	4e561ebc06	drivers: hv: dxgkrnl: Operations using sync objects Implement ioctls to submit operations with compute device sync objects: - the LX_DXSIGNALSYNCHRONIZATIONOBJECT ioctl. The ioctl is used to submit a signal to a sync object. - the LX_DXWAITFORSYNCHRONIZATIONOBJECT ioctl. The ioctl is used to submit a wait for a sync object - the LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMCPU ioctl The ioctl is used to signal to a monitored fence sync object from a CPU thread. - the LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU ioctl. The ioctl is used to submit a signal to a monitored fence sync object.. - the LX_DXSIGNALSYNCHRONIZATIONOBJECTFROMGPU2 ioctl. The ioctl is used to submit a signal to a monitored fence sync object. - the LX_DXWAITFORSYNCHRONIZATIONOBJECTFROMGPU ioctl. The ioctl is used to submit a wait for a monitored fence sync object. Compute device synchronization objects are used to synchronize execution of DMA buffers between different execution contexts. Operations with sync objects include "signal" and "wait". A wait for a sync object is satisfied when the sync object is signaled. A signal operation could be submitted to a compute device context or the sync object could be signaled by a CPU thread. To improve performance, submitting operations to the host is done asynchronously when the host supports it. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Iouri Tarassov	6afd35c73a	drivers: hv: dxgkrnl: Creation of compute device sync objects Implement ioctls to create and destroy compute device sync objects: - the LX_DXCREATESYNCHRONIZATIONOBJECT ioctl, - the LX_DXDESTROYSYNCHRONIZATIONOBJECT ioctl. Compute device synchronization objects are used to synchronize execution of compute device commands, which are queued to different execution contexts (dxgcontext objects). There are several types of sync objects (mutex, monitored fence, CPU event, fence). A "signal" or a "wait" operation could be queued to an execution context. Monitored fence sync objects are particularly important. A monitored fence object has a fence value, which could be monitored by the compute device or by CPU. Therefore, a CPU virtual address is allocated during object creation to allow an application to read the fence value. dxg_map_iospace and dxg_unmap_iospace implement creation of the CPU virtual address. This is done as follows: - The host allocates a portion of the guest IO space, which is mapped to the actual fence value memory on the host - The host returns the guest IO space address to the guest - The guest allocates a CPU virtual address and updates page tables to point to the IO space address Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Iouri Tarassov	ce7a1f172d	drivers: hv: dxgkrnl: Creation of compute device allocations and resources Implemented ioctls to create and destroy virtual compute device allocations (dxgallocation) and resources (dxgresource): - the LX_DXCREATEALLOCATION ioctl, - the LX_DXDESTROYALLOCATION2 ioctl. Compute device allocations (dxgallocation objects) represent memory allocation, which could be accessible by the device. Allocations can be created around existing system memory (provided by an application) or memory, allocated by dxgkrnl on the host. Compute device resources (dxgresource objects) represent containers of compute device allocations. Allocations could be dynamically added, removed from a resource. Each allocation/resource has associated driver private data, which is provided during creation. Each created resource or allocation have a handle (d3dkmthandle), which is used to reference the corresponding object in other ioctls. A dxgallocation can be resident (meaning that it is accessible by the compute device) or evicted. When an allocation is evicted, its content is stored in the backing store in system memory. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Iouri Tarassov	8668f2837c	drivers: hv: dxgkrnl: Creation of dxgcontext objects Implement ioctls for creation/destruction of dxgcontext objects: - the LX_DXCREATECONTEXTVIRTUAL ioctl - the LX_DXDESTROYCONTEXT ioctl. A dxgcontext object represents a compute device execution thread. Compute device DMA buffers and synchronization operations are submitted for execution to a dxgcontext. dxgcontext objects belong to a dxgdevice object. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Iouri Tarassov	615677695d	drivers: hv: dxgkrnl: Creation of dxgdevice objects Implement ioctls for creation and destruction of dxgdevice objects: - the LX_DXCREATEDEVICE ioctl - the LX_DXDESTROYDEVICE ioctl A dxgdevice object represents a container of other virtual compute device objects (allocations, sync objects, contexts, etc.). It belongs to a dxgadapter object. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Iouri Tarassov	56538daeb2	drivers: hv: dxgkrnl: Opening of /dev/dxg device and dxgprocess creation - Implement opening of the device (/dev/dxg) file object and creation of dxgprocess objects. - Add VM bus messages to create and destroy the host side of a dxgprocess object. - Implement the handle manager, which manages d3dkmthandle handles for the internal process objects. The handles are used by a user mode client to reference dxgkrnl objects. dxgprocess is created for each process, which opens /dev/dxg. dxgprocess is ref counted, so the existing dxgprocess objects is used for a process, which opens the device object multiple time. dxgprocess is destroyed when the file object is released. A corresponding dxgprocess object is created on the host for every dxgprocess object in the guest. When a dxgkrnl object is created, in most cases the corresponding object is created in the host. The VM references the host objects by handles (d3dkmthandle). d3dkmthandle values for a host object and the corresponding VM object are the same. A host handle is allocated first and its value is assigned to the guest object. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Iouri Tarassov	1676b11742	drivers: hv: dxgkrnl: Add VMBus message support, initialize VMBus channels. Implement support for sending/receiving VMBus messages between the host and the guest. Initialize the VMBus channels and notify the host about IO space settings of the VMBus global channel. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Iouri Tarassov	334ce7fe44	drivers: hv: dxgkrnl: Driver initialization and loading - Create skeleton and add basic functionality for the Hyper-V compute device driver (dxgkrnl). - Register for PCI and VMBus driver notifications and handle initialization of VMBus channels. - Connect the dxgkrnl module to the drivers/hv/ Makefile and Kconfig - Create a MAINTAINERS entry A VMBus channel is a communication interface between the Hyper-V guest and the host. There are two types of VMBus channels used in the driver: - the global channel - per virtual compute device channel A PCI device is created for each virtual compute device, projected by the host. The device vendor is PCI_VENDOR_ID_MICROSOFT and device id is PCI_DEVICE_ID_VIRTUAL_RENDER. dxg_pci_probe_device handles arrival of such devices. The PCI config space of the virtual compute device has luid of the corresponding virtual compute device VM bus channel. This is how the compute device adapter objects are linked to VMBus channels. VMBus interface version is exchanged by reading/writing the PCI config space of the virtual compute device. The IO space is used to handle CPU accessible compute device allocations. Hyper-V allocates IO space for the global VMBus channel. Signed-off-by: Iouri Tarassov <iourit@linux.microsoft.com>	2024-07-09 23:40:07 +00:00
Arnd Bergmann	16c0403b7d	syscalls: fix compat_sys_io_pgetevents_time64 usage commit d3882564a77c21eb746ba5364f3fa89b88de3d61 upstream. Using sys_io_pgetevents() as the entry point for compat mode tasks works almost correctly, but misses the sign extension for the min_nr and nr arguments. This was addressed on parisc by switching to compat_sys_io_pgetevents_time64() in commit `6431e92fc8` ("parisc: io_pgetevents_time64() needs compat syscall in 32-bit compat mode"), as well as by using more sophisticated system call wrappers on x86 and s390. However, arm64, mips, powerpc, sparc and riscv still have the same bug. Change all of them over to use compat_sys_io_pgetevents_time64() like parisc already does. This was clearly the intention when the function was originally added, but it got hooked up incorrectly in the tables. Cc: stable@vger.kernel.org Fixes: `48166e6ea4` ("y2038: add 64-bit time_t syscalls to all 32-bit architectures") Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-07-05 09:14:50 +02:00
Matthias Goergens	f571c8ab18	hugetlb_encode.h: fix undefined behaviour (34 << 26) commit `710bb68c2e` upstream. Left-shifting past the size of your datatype is undefined behaviour in C. The literal 34 gets the type `int`, and that one is not big enough to be left shifted by 26 bits. An `unsigned` is long enough (on any machine that has at least 32 bits for their ints.) For uniformity, we mark all the literals as unsigned. But it's only really needed for HUGETLB_FLAG_ENCODE_16GB. Thanks to Randy Dunlap for an initial review and suggestion. Link: https://lkml.kernel.org/r/20220905031904.150925-1-matthias.goergens@gmail.com Signed-off-by: Matthias Goergens <matthias.goergens@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-07-05 09:14:23 +02:00
Anton Protopopov	f654b258e9	bpf: Pack struct bpf_fib_lookup [ Upstream commit f91717007217d975aa975ddabd91ae1a107b9bff ] The struct bpf_fib_lookup is supposed to be of size 64. A recent commit 59b418c7063d ("bpf: Add a check for struct bpf_fib_lookup size") added a static assertion to check this property so that future changes to the structure will not accidentally break this assumption. As it immediately turned out, on some 32-bit arm systems, when AEABI=n, the total size of the structure was equal to 68, see [1]. This happened because the bpf_fib_lookup structure contains a union of two 16-bit fields: union { __u16 tot_len; __u16 mtu_result; }; which was supposed to compile to a 16-bit-aligned 16-bit field. On the aforementioned setups it was instead both aligned and padded to 32-bits. Declare this inner union as __attribute__((packed, aligned(2))) such that it always is of size 2 and is aligned to 16 bits. [1] https://lore.kernel.org/all/CA+G9fYtsoP51f-oP_Sp5MOq-Ffv8La2RztNpwvE6+R1VtFiLrw@mail.gmail.com/#t Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: `e1850ea9bd` ("bpf: bpf_fib_lookup return MTU value as output when looked up") Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240403123303.1452184-1-aspsk@isovalent.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-16 13:39:18 +02:00
Gergo Koteles	18c51d97a2	Input: allocate keycode for Display refresh rate toggle [ Upstream commit cfeb98b95fff25c442f78a6f616c627bc48a26b7 ] Newer Lenovo Yogas and Legions with 60Hz/90Hz displays send a wmi event when Fn + R is pressed. This is intended for use to switch between the two refresh rates. Allocate a new KEY_REFRESH_RATE_TOGGLE keycode for it. Signed-off-by: Gergo Koteles <soyer@irl.hu> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://lore.kernel.org/r/15a5d08c84cf4d7b820de34ebbcf8ae2502fb3ca.1710065750.git.soyer@irl.hu Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-13 13:01:46 +02:00
Amir Goldstein	b65b2d4187	fanotify: introduce FAN_MARK_IGNORE [ Upstream commit `e252f2ed1c` ] This flag is a new way to configure ignore mask which allows adding and removing the event flags FAN_ONDIR and FAN_EVENT_ON_CHILD in ignore mask. The legacy FAN_MARK_IGNORED_MASK flag would always ignore events on directories and would ignore events on children depending on whether the FAN_EVENT_ON_CHILD flag was set in the (non ignored) mask. FAN_MARK_IGNORE can be used to ignore events on children without setting FAN_EVENT_ON_CHILD in the mark's mask and will not ignore events on directories unconditionally, only when FAN_ONDIR is set in ignore mask. The new behavior is non-downgradable. After calling fanotify_mark() with FAN_MARK_IGNORE once, calling fanotify_mark() with FAN_MARK_IGNORED_MASK on the same object will return EEXIST error. Setting the event flags with FAN_MARK_IGNORE on a non-dir inode mark has no meaning and will return ENOTDIR error. The meaning of FAN_MARK_IGNORED_SURV_MODIFY is preserved with the new FAN_MARK_IGNORE flag, but with a few semantic differences: 1. FAN_MARK_IGNORED_SURV_MODIFY is required for filesystem and mount marks and on an inode mark on a directory. Omitting this flag will return EINVAL or EISDIR error. 2. An ignore mask on a non-directory inode that survives modify could never be downgraded to an ignore mask that does not survive modify. With new FAN_MARK_IGNORE semantics we make that rule explicit - trying to update a surviving ignore mask without the flag FAN_MARK_IGNORED_SURV_MODIFY will return EEXIST error. The convenience macro FAN_MARK_IGNORE_SURV is added for (FAN_MARK_IGNORE | FAN_MARK_IGNORED_SURV_MODIFY), because the common case should use short constant names. Link: https://lore.kernel.org/r/20220629144210.2983229-4-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:07 +02:00
Amir Goldstein	7fcef3285a	fanotify: implement "evictable" inode marks [ Upstream commit `7d5e005d98` ] When an inode mark is created with flag FAN_MARK_EVICTABLE, it will not pin the marked inode to inode cache, so when inode is evicted from cache due to memory pressure, the mark will be lost. When an inode mark with flag FAN_MARK_EVICTABLE is updated without using this flag, the marked inode is pinned to inode cache. When an inode mark is updated with flag FAN_MARK_EVICTABLE but an existing mark already has the inode pinned, the mark update fails with error EEXIST. Evictable inode marks can be used to set up inode marks with ignored mask to suppress events from uninteresting files or directories in a lazy manner, upon receiving the first event, without having to iterate all the uninteresting files or directories beforehand. The evictable inode mark feature allows performing this lazy marks setup without exhausting the system memory with pinned inodes. This change does not enable the feature yet. Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiRDpuS=2uA6+ZUM7yG9vVU-u212tkunBmSnP_u=mkv=Q@mail.gmail.com/ Link: https://lore.kernel.org/r/20220422120327.3459282-15-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:03 +02:00
Amir Goldstein	a187e777d7	fanotify: report old and/or new parent+name in FAN_RENAME event [ Upstream commit `7326e382c2` ] In the special case of FAN_RENAME event, we report old or new or both old and new parent+name. A single info record will be reported if either the old or new dir is watched and two records will be reported if both old and new dir (or their filesystem) are watched. The old and new parent+name are reported using new info record types FAN_EVENT_INFO_TYPE_{OLD,NEW}_DFID_NAME, so if a single info record is reported, it is clear to the application which dir entry the fid+name info refers to. Link: https://lore.kernel.org/r/20211129201537.1932819-11-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:55 +02:00
Amir Goldstein	9acb63f955	fanotify: record old and new parent and name in FAN_RENAME event [ Upstream commit `3982534ba5` ] In the special case of FAN_RENAME event, we record both the old and new parent and name. Link: https://lore.kernel.org/r/20211129201537.1932819-9-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:55 +02:00
Amir Goldstein	8bd3d40ea3	fanotify: introduce group flag FAN_REPORT_TARGET_FID [ Upstream commit `d61fd650e9` ] FAN_REPORT_FID is ambiguous in that it reports the fid of the child for some events and the fid of the parent for create/delete/move events. The new FAN_REPORT_TARGET_FID flag is an implicit request to report the fid of the target object of the operation (a.k.a the child inode) also in create/delete/move events in addition to the fid of the parent and the name of the child. To reduce the test matrix for uninteresting use cases, the new FAN_REPORT_TARGET_FID flag requires both FAN_REPORT_NAME and FAN_REPORT_FID. The convenience macro FAN_REPORT_DFID_NAME_TARGET combines FAN_REPORT_TARGET_FID with all the required flags. Link: https://lore.kernel.org/r/20211129201537.1932819-4-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:54 +02:00
NeilBrown	f829bb3a06	NFSD: move filehandle format declarations out of "uapi". [ Upstream commit `ef5825e3cf` ] A small part of the declarations concerning the filehandle format is currently in the "uapi" include directory: include/uapi/linux/nfsd/nfsfh.h There is a lot more to the filehandle format, including "enum fid_type" and "enum nfsd_fsid" which are not exported via "uapi". This small part of the filehandle definition is of minimal use outside of the kernel, and I can find no evidence that any other code is using it. Certainly nfs-utils and wireshark (the most likely candidates) do not use these declarations. So move it out of "uapi" by copying the content from include/uapi/linux/nfsd/nfsfh.h into fs/nfsd/nfsfh.h A few unnecessary "#include" directives are not copied, and neither is the #define of fh_auth, which is annotated as being for userspace only. The copyright claims in the uapi file are identical to those in the nfsd file, so there is no need to copy those. The "__u32" style integer types are only needed in "uapi". In kernel-only code we can use the more familiar "u32" style. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:53 +02:00
Gabriel Krisman Bertazi	c7c013dff4	fanotify: Emit generic error info for error event [ Upstream commit `130a3c7421` ] The error info is a record sent to users on FAN_FS_ERROR events documenting the type of error. It also carries an error count, documenting how many errors were observed since the last reporting. Link: https://lore.kernel.org/r/20211025192746.66445-28-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Gabriel Krisman Bertazi	11280c7181	fanotify: Reserve UAPI bits for FAN_FS_ERROR [ Upstream commit `8d11a4f43e` ] FAN_FS_ERROR allows reporting of event type FS_ERROR to userspace, which is a mechanism to report file system wide problems via fanotify. This commit preallocates userspace visible bits to match the FS_ERROR event. Link: https://lore.kernel.org/r/20211025192746.66445-19-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Martynas Pumputis	68dbe92d67	bpf: Derive source IP addr via bpf_*_fib_lookup() commit dab4e1f06cabb6834de14264394ccab197007302 upstream. Extend the bpf_fib_lookup() helper by making it return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, &p, sizeof(p), BPF_FIB_LOOKUP_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-06 14:38:50 +00:00
Louis DeLosSantos	39b4ee40d2	bpf: Add table ID to bpf_fib_lookup BPF helper commit `8ad77e72ca` upstream. Add ability to specify routing table ID to the `bpf_fib_lookup` BPF helper. A new field `tbid` is added to `struct bpf_fib_lookup` used as parameters to the `bpf_fib_lookup` BPF helper. When the helper is called with the `BPF_FIB_LOOKUP_DIRECT` and `BPF_FIB_LOOKUP_TBID` flags the `tbid` field in `struct bpf_fib_lookup` will be used as the table ID for the fib lookup. If the `tbid` does not exist the fib lookup will fail with `BPF_FIB_LKUP_RET_NOT_FWDED`. The `tbid` field becomes a union over the vlan related output fields in `struct bpf_fib_lookup` and will be zeroed immediately after usage. This functionality is useful in containerized environments. For instance, if a CNI wants to dictate the next-hop for traffic leaving a container it can create a container-specific routing table and perform a fib lookup against this table in a "host-net-namespace-side" TC program. This functionality also allows `ip rule` like functionality at the TC layer, allowing an eBPF program to pick a routing table based on some aspect of the sk_buff. As a concrete use case, this feature will be used in Cilium's SRv6 L3VPN datapath. When egress traffic leaves a Pod an eBPF program attached by Cilium will determine which VRF the egress traffic should target, and then perform a FIB lookup in a specific table representing this VRF's FIB. Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-1-0a31c22c748c@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-06 14:38:50 +00:00
Martin KaFai Lau	75ca92271d	bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup commit `31de4105f0` upstream. The bpf_fib_lookup() also looks up the neigh table. This was done before bpf_redirect_neigh() was added. In the use case that does not manage the neigh table and requires bpf_fib_lookup() to look up a fib to decide if it needs to redirect or not, the bpf prog can depend only on using bpf_redirect_neigh() to look up the neigh. It also keeps the neigh entries fresh and connected. This patch adds a bpf_fib_lookup flag, SKIP_NEIGH, to avoid the double neigh lookup when the bpf prog always calls bpf_redirect_neigh() to do the neigh lookup. The params->smac output is skipped together when SKIP_NEIGH is set because bpf_redirect_neigh() will figure out the smac also. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230217205515.3583372-1-martin.lau@linux.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-06 14:38:50 +00:00
Justin Iurman	28bbdb4e19	uapi: in6: replace temporary label with rfc9486 [ Upstream commit 6a2008641920a9c6fe1abbeb9acbec463215d505 ] Not really a fix per se, but IPV6_TLV_IOAM is still tagged as "TEMPORARY IANA allocation for IOAM", while RFC 9486 is available for some time now. Just update the reference. Fixes: `9ee11f0fff` ("ipv6: ioam: Data plane support for Pre-allocated Trace") Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240226124921.9097-1-justin.iurman@uliege.be Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-06 14:38:45 +00:00
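
The fanotify commits listed above (b65b2d4187 and 7fcef3285a) describe how the new mark flags combine. As a rough illustration only, the sketch below builds the `flags` argument for a `fanotify_mark()` call that adds a new-style ignore mask. The helper name `ignore_mark_flags` and its parameters are hypothetical, and the fallback `#define` values are an assumption, copied from include/uapi/linux/fanotify.h for toolchains whose headers predate these flags.

```c
#include <stdbool.h>

#if defined(__linux__)
#include <sys/fanotify.h>
#endif

/* Fallbacks for older headers (assumed values from the uapi header). */
#ifndef FAN_MARK_ADD
#define FAN_MARK_ADD 0x00000001
#endif
#ifndef FAN_MARK_IGNORED_SURV_MODIFY
#define FAN_MARK_IGNORED_SURV_MODIFY 0x00000040
#endif
#ifndef FAN_MARK_EVICTABLE
#define FAN_MARK_EVICTABLE 0x00000200
#endif
#ifndef FAN_MARK_IGNORE
#define FAN_MARK_IGNORE 0x00000400
#endif
#ifndef FAN_MARK_IGNORE_SURV
#define FAN_MARK_IGNORE_SURV (FAN_MARK_IGNORE | FAN_MARK_IGNORED_SURV_MODIFY)
#endif

/*
 * Hypothetical helper: compose the flags for adding an ignore mask with
 * FAN_MARK_IGNORE semantics.
 *
 * survives_modify: set for filesystem marks, mount marks, and inode marks
 *                  on a directory, where the commit message says
 *                  FAN_MARK_IGNORED_SURV_MODIFY is required.
 * evictable:       request an inode mark that does not pin the inode in
 *                  the inode cache.
 */
static unsigned int ignore_mark_flags(bool survives_modify, bool evictable)
{
    unsigned int flags = FAN_MARK_ADD | FAN_MARK_IGNORE;

    if (survives_modify)
        flags |= FAN_MARK_IGNORED_SURV_MODIFY;
    if (evictable)
        flags |= FAN_MARK_EVICTABLE;

    return flags;
}
```

The returned value would be passed as the second argument of `fanotify_mark()`. Whether the call succeeds still depends on the running kernel carrying these commits and on the rules quoted in the commit messages (for example, EEXIST when downgrading an existing mark).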

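The three bpf_fib_lookup commits listed above (68dbe92d67, 39b4ee40d2, 75ca92271d) each add one flag to the helper. The sketch below shows, in plain host-side C, how a caller might choose the flag set. The helper name `fib_lookup_flags` and its parameters are hypothetical. The constants are redefined locally so the example compiles without kernel headers; their values are an assumption, mirrored from include/uapi/linux/bpf.h, and a real BPF program would include `<linux/bpf.h>` instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the uapi flag values (assumed; see <linux/bpf.h>). */
enum {
    BPF_FIB_LOOKUP_DIRECT     = 1U << 0,
    BPF_FIB_LOOKUP_OUTPUT     = 1U << 1,
    BPF_FIB_LOOKUP_SKIP_NEIGH = 1U << 2,
    BPF_FIB_LOOKUP_TBID       = 1U << 3,
    BPF_FIB_LOOKUP_SRC        = 1U << 4,
};

/*
 * Hypothetical helper: pick bpf_fib_lookup() flags.
 *
 * use_table_id:       look up in the table named by params->tbid; per the
 *                     commit message this needs DIRECT and TBID together.
 * want_src:           ask the helper to fill in the source address.
 * neigh_via_redirect: the program always calls bpf_redirect_neigh()
 *                     afterwards, so the neighbour lookup can be skipped.
 */
static uint32_t fib_lookup_flags(bool use_table_id, bool want_src,
                                 bool neigh_via_redirect)
{
    uint32_t flags = 0;

    if (use_table_id)
        flags |= BPF_FIB_LOOKUP_DIRECT | BPF_FIB_LOOKUP_TBID;
    if (want_src)
        flags |= BPF_FIB_LOOKUP_SRC;
    if (neigh_via_redirect)
        flags |= BPF_FIB_LOOKUP_SKIP_NEIGH;

    return flags;
}
```

With `want_src` and `neigh_via_redirect` set, the result matches the `BPF_FIB_LOOKUP_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH` combination used in the snippet from commit 68dbe92d67.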

10813 commits