F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter". These filters provide users
with finer-grained control over DAMOS's actions. SeongJae has also done
some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series "mm:
support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with his
series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings. The previous BPF-based approach had
shortcomings. See "mm: In-kernel support for memory-deny-write-execute
(MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a per-node
basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage during
compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in ths
series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's series
"mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
"fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest of
the kernel in the series "Simplify the external interface for GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the series
"mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA
jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K
DmxHkn0LAitGgJRS/W9w81yrgig9tAQ=
=MlGs
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X
bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()")
which does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter".
These filters provide users with finer-grained control over DAMOS's
actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series
"mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with
his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings.
The previous BPF-based approach had shortcomings. See "mm: In-kernel
support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a
per-node basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage
during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in
ths series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier
functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's
series "mm, arch: add generic implementation of pfn_valid() for
FLATMEM" and "fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest
of the kernel in the series "Simplify the external interface for
GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the
series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
include/linux/migrate.h: remove unneeded externs
mm/memory_hotplug: cleanup return value handing in do_migrate_range()
mm/uffd: fix comment in handling pte markers
mm: change to return bool for isolate_movable_page()
mm: hugetlb: change to return bool for isolate_hugetlb()
mm: change to return bool for isolate_lru_page()
mm: change to return bool for folio_isolate_lru()
objtool: add UACCESS exceptions for __tsan_volatile_read/write
kmsan: disable ftrace in kmsan core code
kasan: mark addr_has_metadata __always_inline
mm: memcontrol: rename memcg_kmem_enabled()
sh: initialize max_mapnr
m68k/nommu: add missing definition of ARCH_PFN_OFFSET
mm: percpu: fix incorrect size in pcpu_obj_full_size()
maple_tree: reduce stack usage with gcc-9 and earlier
mm: page_alloc: call panic() when memoryless node allocation fails
mm: multi-gen LRU: avoid futile retries
migrate_pages: move THP/hugetlb migration support check to simplify code
migrate_pages: batch flushing TLB
migrate_pages: share more code between _unmap and _move
...
- Performance tweaks for efifb earlycon by Andy
- Preparatory refactoring and cleanup work in the efivar layer by Johan,
which is needed to accommodate the Snapdragon arm64 laptops that
expose their EFI variable store via a TEE secure world API.
- Enhancements to the EFI memory map handling so that Xen dom0 can
safely access EFI configuration tables (Demi Marie)
- Wire up the newly introduced IBT/BTI flag in the EFI memory attributes
table, so that firmware that is generated with ENDBR/BTI landing pads
will be mapped with enforcement enabled.
- Clean up how we check and print the EFI revision exposed by the
firmware.
- Incorporate EFI memory attributes protocol definition contributed by
Evgeniy and wire it up in the EFI zboot code. This ensures that these
images can execute under new and stricter rules regarding the default
memory permissions for EFI page allocations. (More work is in progress
here)
- CPER header cleanup by Dan Williams
- Use a raw spinlock to protect the EFI runtime services stack on arm64
to ensure the correct semantics under -rt. (Pierre)
- EFI framebuffer quirk for Lenovo Ideapad by Darrell.
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmPzuwsACgkQw08iOZLZ
jyS7dwwAm95DlDxFIQi4FmTm2mqJws9PyDrkfaAK1CoyqCgeOLQT2FkVolgr8jne
pwpwCTXtYP8y0BZvdQEIjpAq/BHKaD3GJSPfl7lo+pnUu68PpsFWaV6EdT33KKfj
QeF0MnUvrqUeTFI77D+S0ZW2zxdo9eCcahF3TPA52/bEiiDHWBF8Qm9VHeQGklik
zoXA15ft3mgITybgjEA0ncGrVZiBMZrYoMvbdkeoedfw02GN/eaQn8d2iHBtTDEh
3XNlo7ONX0v50cjt0yvwFEA0AKo0o7R1cj+ziKH/bc4KjzIiCbINhy7blroSq+5K
YMlnPHuj8Nhv3I+MBdmn/nxRCQeQsE4RfRru04hfNfdcqjAuqwcBvRXvVnjWKZHl
CmUYs+p/oqxrQ4BjiHfw0JKbXRsgbFI6o3FeeLH9kzI9IDUPpqu3Ma814FVok9Ai
zbOCrJf5tEtg5tIavcUESEMBuHjEafqzh8c7j7AAqbaNjlihsqosDy9aYoarEi5M
f/tLec86
=+pOz
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"A healthy mix of EFI contributions this time:
- Performance tweaks for efifb earlycon (Andy)
- Preparatory refactoring and cleanup work in the efivar layer, which
is needed to accommodate the Snapdragon arm64 laptops that expose
their EFI variable store via a TEE secure world API (Johan)
- Enhancements to the EFI memory map handling so that Xen dom0 can
safely access EFI configuration tables (Demi Marie)
- Wire up the newly introduced IBT/BTI flag in the EFI memory
attributes table, so that firmware that is generated with ENDBR/BTI
landing pads will be mapped with enforcement enabled
- Clean up how we check and print the EFI revision exposed by the
firmware
- Incorporate EFI memory attributes protocol definition and wire it
up in the EFI zboot code (Evgeniy)
This ensures that these images can execute under new and stricter
rules regarding the default memory permissions for EFI page
allocations (More work is in progress here)
- CPER header cleanup (Dan Williams)
- Use a raw spinlock to protect the EFI runtime services stack on
arm64 to ensure the correct semantics under -rt (Pierre)
- EFI framebuffer quirk for Lenovo Ideapad (Darrell)"
* tag 'efi-next-for-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
firmware/efi sysfb_efi: Add quirk for Lenovo IdeaPad Duet 3
arm64: efi: Make efi_rt_lock a raw_spinlock
efi: Add mixed-mode thunk recipe for GetMemoryAttributes
efi: x86: Wire up IBT annotation in memory attributes table
efi: arm64: Wire up BTI annotation in memory attributes table
efi: Discover BTI support in runtime services regions
efi/cper, cxl: Remove cxl_err.h
efi: Use standard format for printing the EFI revision
efi: Drop minimum EFI version check at boot
efi: zboot: Use EFI protocol to remap code/data with the right attributes
efi/libstub: Add memory attribute protocol definitions
efi: efivars: prevent double registration
efi: verify that variable services are supported
efivarfs: always register filesystem
efi: efivars: add efivars printk prefix
efi: Warn if trying to reserve memory under Xen
efi: Actually enable the ESRT under Xen
efi: Apply allowlist to EFI configuration tables when running under Xen
efi: xen: Implement memory descriptor lookup based on hypercall
efi: memmap: Disregard bogus entries instead of returning them
...
If cache is enabled, make sure we are putting the right things
in place (mainly impacts mmap). This also sets us up for more
cache levels.
Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>
Just one fix which just came in, this just hit linux-next just yesterday
with a success build report. But since its a fix and reviewed I think its
good to take in.
Sadly the eager beavers willing to help with the sysctl moves have slowed.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmP2ndUSHG1jZ3JvZkBr
ZXJuZWwub3JnAAoJEM4jHQowkoinyLYQAM8HBVhPcXi4gp2DHbUY6Bd6MeXsQ4Mk
EulLxYZXJFIfgXcKYN9IMNcPtdYJ4xnGWLNzPoixiMb61KY6Cjw+l62AtNBiwvxw
/GliVtrwz0EU1Bw7vilm4UBn/BUH7vXk7HMRExBwMXLwY7y11TYWMtb6+V+7Zzf1
VImyPZ1MlfOYlvgRUdaNUnuvHLE19x2pdVG/oRDLez9gs38FYEAiBd34adUKDf2U
FxZBQPd+4xaEht2sdTp0ws52YoHHx3k9i8mSiRwQqsiydKMk32iD59hXxeA4r3Oc
lzf10VgN7EdSDCTzdfDYhpIxq04RuA1s1gtXU/eePMqfJZenR0FbFHhcHmIhon4j
D5EuON5eMISzzT9h9DSI5k7kXPDv0iIZPb/vunAfklC1UPVv8uZ2/RV6M1jNR1Bn
S5WYURzdlwpS+4+uwl479061W1YnnhmtwVAMNw+AzfTxv9eQQ3oa2X70L0xirfDx
xJxZS8lzc1rLgL2y3lO2/8W1yI6hA0xhME+pxuopVTMWcXt6aG4efq3tIrxEx3Kf
0HukI1KbN6O/Rmf7hqprHGsfIq4EP4IEOk2XIvMSOrDyf2fOXVGCQQQ0eMOw4ejR
WTtHp1TW2byA8lSOp4K+rMZamt9Mokci0tOEc/52OXFA2jDtq4KZNHvHAOdZv4/Y
kLMflEpOmU++
=Txc3
-----END PGP SIGNATURE-----
Merge tag 'sysctl-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull sysctl update from Luis Chamberlain:
"Just one fix which just came in.
Sadly the eager beavers willing to help with the sysctl moves have
slowed"
* tag 'sysctl-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
sysctl: fix proc_dobool() usability
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmP2kaAACgkQiiy9cAdy
T1Eergv9FHVs7hS0anJF0xgRghR4+g0m5UUo08iJazgJdDgcS5JY+ZasIpYpEsG3
QmsIT33XVYZypXoOzjMSsPlwo6esTCJQScVLz85e4ebedCbCBDks+wVQcbfTzD5/
KrwmUoTBLU0L/ppFhqRk9k53nrSf1SXCWPthjdfWa3mTHdIVM4kQJruTWwUDiJXp
mdYwTx6FnTNer3QWetNzYOwdUgLu3rk0zLcBwQNCo6g5LOpA44iFfEAO4zeiOuZT
LMDPbDj0nWQyWPLLdcbtsn2laYyEBDBLZevLirSaqPQ/KCtGcw0mBt6dCAzg8/CM
ONqHHxdEpvPON8Sxujcn4CxpXhl0nCLwwtKtWU4rt7IevI9U+PynNl57TtJJ16/s
b3XD2QVbFjlcdAMTmArvqnogdzoC3mZu1R1IRs+jukhLAOqZiLN6o/E2HAllt47i
krzXeXIzQr10w9fnJ7LtIc/7IUFgtUfrOkg4TKyNcnRVHQaSSxv+JLRgqMPOr/M0
I7zt0G0j
=4hIT
-----END PGP SIGNATURE-----
Merge tag '6.3-rc-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client updates from Steve French:
"The largest subset of this is from David Howells et al: making the
cifs/smb3 driver pass iov_iters down to the lowest layers, directly to
the network transport rather than passing lists of pages around,
helping multiple areas:
- Pin user pages, thereby fixing the race between concurrent DIO read
and fork, where the pages containing the DIO read buffer may end up
belonging to the child process and not the parent - with the result
that the parent might not see the retrieved data.
- cifs shouldn't take refs on pages extracted from non-user-backed
iterators (eg. KVEC). With these changes, cifs will apply the
appropriate cleanup.
- Making it easier to transition to using folios in cifs rather than
pages by dealing with them through BVEC and XARRAY iterators.
- Allowing cifs to use the new splice function
The remainder are:
- fixes for stable, including various fixes for uninitialized memory,
wrong length field causing mount issue to very old servers,
important directory lease fixes and reconnect fixes
- cleanups (unused code removal, change one element array usage, and
a change form strtobool to kstrtobool, and Kconfig cleanups)
- SMBDIRECT (RDMA) fixes including iov_iter integration and UAF fixes
- reconnect fixes
- multichannel fixes, including improving channel allocation (to
least used channel)
- remove the last use of lock_page_killable by moving to
folio_lock_killable"
* tag '6.3-rc-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (46 commits)
update internal module version number for cifs.ko
cifs: update ip_addr for ses only for primary chan setup
cifs: use tcon allocation functions even for dummy tcon
cifs: use the least loaded channel for sending requests
cifs: DIO to/from KVEC-type iterators should now work
cifs: Remove unused code
cifs: Build the RDMA SGE list directly from an iterator
cifs: Change the I/O paths to use an iterator rather than a page list
cifs: Add a function to read into an iter from a socket
cifs: Add some helper functions
cifs: Add a function to Hash the contents of an iterator
cifs: Add a function to build an RDMA SGE list from an iterator
netfs: Add a function to extract an iterator into a scatterlist
netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator
cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE
splice: Export filemap/direct_splice_read()
iov_iter: Add a function to extract a page list from an iterator
iov_iter: Define flags to qualify page extraction.
splice: Add a func to do a splice from an O_DIRECT file without ITER_PIPE
splice: Add a func to do a splice from a buffered file without ITER_PIPE
...
New Features:
* Convert the read and write paths to use folios
Bugfixes and Cleanups:
* Fix tracepoint state manager flag printing
* Fix disabling swap files
* Fix NFSv4 client identifier sysfs path in the documentation
* Don't clear NFS_CAP_COPY if server returns NFS4ERR_OFFLOAD_DENIED
* Treat GETDEVICEINFO errors as a layout failure
* Replace kmap_atomic() calls with kmap_local_page()
* Constify sunrpc sysfs kobj_type structures
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmP2lSQACgkQ18tUv7Cl
QOuaPw//TNLuEMiiMtM6EXFPc2p2tNMfM+EeMjYKZkS5WnvszVOZRzQfXz4dvvif
R2BIHT+C6AWkPOxnbuhIk86acy/wRuAPD9OMyp+3PNa4oEqjyIk5QaTvgYeeRBNe
H7haAqCiEBx/g8F9DladWLAwpilTSPpupdmiXwTS7Q7wB6NkKqc1eAJ+BTuRYAsm
19p2nGTzenWrut4HMOOmuAVrA2OcWtzgYzdJlY19lWRmzfCvX1gh3i2JxGutmnl3
tduviVBdu5QGnTKHJiP853kzs3VwEHIFSO+5z137JWKTm3IgPxW/u3zf+ijmw22t
vjbtFAhb4+u5juchvVKyIX4lClPSKZincQ6dn9soOm7qJcxYWNm40DzxAF21GUAq
d4tp+zBoe9pfKxZrylp6YxIchLQYhU+dL0hhqbKzfOAFOg28k1JsJubl/wRvWKVe
LGbvFOrUYo8arPOKfxBWMf4HCouu/vGq/qng+U/Ppjw3h8gD32aniyP3qN9Fm1JA
rFAMKTnD4I3r3x1E4HX3icUMwil36zBr9cPglQBYD5DW54ConEW8e6hvQmjTdQHP
EgN9+PSYzYXuKR/Fij7BYOymUD9grT1+5hVaOPzlM4SY3jMy2novcQGoGiLVOd0t
PGQ5047qbESFD7VkW4GBAYftrSMLiU+l5KL3aN1JRtprxu6NaYA=
=39ue
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-6.3-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"New Features:
- Convert the read and write paths to use folios
Bugfixes and Cleanups:
- Fix tracepoint state manager flag printing
- Fix disabling swap files
- Fix NFSv4 client identifier sysfs path in the documentation
- Don't clear NFS_CAP_COPY if server returns NFS4ERR_OFFLOAD_DENIED
- Treat GETDEVICEINFO errors as a layout failure
- Replace kmap_atomic() calls with kmap_local_page()
- Constify sunrpc sysfs kobj_type structures"
* tag 'nfs-for-6.3-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (25 commits)
fs/nfs: Replace kmap_atomic() with kmap_local_page() in dir.c
pNFS/filelayout: treat GETDEVICEINFO errors as layout failure
Documentation: Fix sysfs path for the NFSv4 client identifier
nfs42: do not fail with EIO if ssc returns NFS4ERR_OFFLOAD_DENIED
NFS: fix disabling of swap
SUNRPC: make kobj_type structures constant
nfs4trace: fix state manager flag printing
NFS: Remove unnecessary check in nfs_read_folio()
NFS: Improve tracing of nfs_wb_folio()
NFS: Enable tracing of nfs_invalidate_folio() and nfs_launder_folio()
NFS: fix up nfs_release_folio() to try to release the page
NFS: Clean up O_DIRECT request allocation
NFS: Fix up nfs_vm_page_mkwrite() for folios
NFS: Convert nfs_write_begin/end to use folios
NFS: Remove unused function nfs_wb_page()
NFS: Convert buffered writes to use folios
NFS: Convert the function nfs_wb_page() to use folios
NFS: Convert buffered reads to use folios
NFS: Add a helper nfs_wb_folio()
NFS: Convert the remaining pagelist helper functions to support folios
...
Two significant security enhancements are part of this release:
* NFSD's RPC header encoding and decoding, including RPCSEC GSS
and gssproxy header parsing, has been overhauled to make it
more memory-safe.
* Support for Kerberos AES-SHA2-based encryption types has been
added for both the NFS client and server. This provides a clean
path for deprecating and removing insecure encryption types
based on DES and SHA-1. AES-SHA2 is also FIPS-140 compliant, so
that NFS with Kerberos may now be used on systems with fips
enabled.
In addition to these, NFSD is now able to handle crossing into an
auto-mounted mount point on an exported NFS mount. A number of
fixes have been made to NFSD's server-side copy implementation.
RPC metrics have been converted to per-CPU variables. This helps
reduce unnecessary cross-CPU and cross-node memory bus traffic,
and significantly reduces noise when KCSAN is enabled.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmPzgiYACgkQM2qzM29m
f5dB2A//eqjpj+FgAN+UjygrwMC4ahAsPX3Sc3FG8/lTAiao3NFVFY2gxAiCPyVE
CFk+tUyfL23oXvbyfIBe3LhxSBOf621xU6up2OzqAzJqh1Q9iUWB6as3I14to8ZU
sWpxXo5ofwk1hzkbrvOAVkyfY0emwsr00iBeWMawkpBe8FZEQA31OYj3/xHr6bBI
zEVlZPBZAZlp0DZ74tb+bBLs/EOnqKj+XLWcogCH13JB3sn2umF6cQNkYgsxvHGa
TNQi4LEdzWZGme242LfBRiGGwm1xuVIjlAhYV/R1wIjaknE3QBzqfXc6lJx74WII
HaqpRJGrKqdo7B+1gaXCl/AMS7YluED1CBrxuej0wBG7l2JEB7m2MFMQ4LTQjgsn
nrr3P70DgbB4LuPCPyUS7dtsMmUXabIqP7niiCR4T1toH6lBmHAgEi4cFmkzg7Cd
EoFzn888mtDpfx4fghcsRWS5oKXEzbPJfu5+IZOD63+UB+NGpi0Xo2s23sJPK8vz
kqK/X63JYOUxWUvK0zkj/c/wW1cLqIaBwnSKbShou5/BL+cZVI+uJYrnEesgpoB2
5fh/cZv3hdcoOPO7OfcjCLQYy4J6RCWajptnk/hcS3lMvBTBrnq697iAqCVURDKU
Xfmlf7XbBwje+sk4eHgqVGEqqVjrEmoqbmA2OS44WSS5LDvxXdI=
=ZG/7
-----END PGP SIGNATURE-----
Merge tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
"Two significant security enhancements are part of this release:
- NFSD's RPC header encoding and decoding, including RPCSEC GSS and
gssproxy header parsing, has been overhauled to make it more
memory-safe.
- Support for Kerberos AES-SHA2-based encryption types has been added
for both the NFS client and server. This provides a clean path for
deprecating and removing insecure encryption types based on DES and
SHA-1. AES-SHA2 is also FIPS-140 compliant, so that NFS with
Kerberos may now be used on systems with fips enabled.
In addition to these, NFSD is now able to handle crossing into an
auto-mounted mount point on an exported NFS mount. A number of fixes
have been made to NFSD's server-side copy implementation.
RPC metrics have been converted to per-CPU variables. This helps
reduce unnecessary cross-CPU and cross-node memory bus traffic, and
significantly reduces noise when KCSAN is enabled"
* tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (121 commits)
NFSD: Clean up nfsd_symlink()
NFSD: copy the whole verifier in nfsd_copy_write_verifier
nfsd: don't fsync nfsd_files on last close
SUNRPC: Fix occasional warning when destroying gss_krb5_enctypes
nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open
NFSD: fix problems with cleanup on errors in nfsd4_copy
nfsd: fix race to check ls_layouts
nfsd: don't hand out delegation on setuid files being opened for write
SUNRPC: Remove ->xpo_secure_port()
SUNRPC: Clean up the svc_xprt_flags() macro
nfsd: remove fs/nfsd/fault_inject.c
NFSD: fix leaked reference count of nfsd4_ssc_umount_item
nfsd: clean up potential nfsd_file refcount leaks in COPY codepath
nfsd: zero out pointers after putting nfsd_files on COPY setup error
SUNRPC: Fix whitespace damage in svcauth_unix.c
nfsd: eliminate __nfs4_get_fd
nfsd: add some kerneldoc comments for stateid preprocessing functions
nfsd: eliminate find_deleg_file_locked
nfsd: don't take nfsd4_copy ref for OP_OFFLOAD_STATUS
SUNRPC: Add encryption self-tests
...
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmP1kvMACgkQiiy9cAdy
T1GwlQwAvFBHn8oUwDTj2bIlZRKsMMWGP61qdtgrsKPXcg2hZzw2jDmM6nk12MrS
AqYCnT1YdjKzDnEgBhHnaPLe+q2M27feJ5Y+FTqv1Hb9VwNY9OJT4hH3II3cuHfm
zLcnAAdVmtIh7FRYKrwGxqyESNik9N3g4evXcafKKq99sxaGnrnbTWQmOvTqM7kF
2AeBh6AHlHkoJNfd3BjlB8158Y/7Sfkul1AA7xXm6GWhgnWt2rmWNNqAEpTPXWyj
xkcb3Jy0wez1D8IXRqyrZdqdFXIfri6mvJoun3zAll6f29T/xxz0CYZUDA881q6E
EBANy/mLeMMKC4g3c2LPdEUB9ijwA9N+w8E7BSefCIMtN46cOQBTxYrbDfbCEs/m
ElGo2AcOM25FrkMdl9XSX0Phru3OAOrOivVLCBqbGbHPa0lgBhIZOdy38HCK3ZGm
0eH6ttyE0p0Mpt1+kex1e6t8XfdyFkHk5gjPElWbsrhSQYK1AAWuBhETWZ79CSlx
fCou/JZ1
=HBuN
-----END PGP SIGNATURE-----
Merge tag '6.3-rc-ksmbd-fixes' of git://git.samba.org/ksmbd
Pull ksmbd server updates from Steve French:
- Fix for memory leak
- Two important fixes for frame length checks (which are also now
stricter)
- four minor cleanup fixes
- Fix to clarify ksmbd/Kconfig to indent properl
- Conversion of the channel list and rpc handle list to xarrays
* tag '6.3-rc-ksmbd-fixes' of git://git.samba.org/ksmbd:
ksmbd: fix possible memory leak in smb2_lock()
ksmbd: do not allow the actual frame length to be smaller than the rfc1002 length
ksmbd: fix wrong data area length for smb2 lock request
ksmbd: Fix parameter name and comment mismatch
ksmbd: Fix spelling mistake "excceed" -> "exceeded"
ksmbd: update Kconfig to note Kerberos support and fix indentation
ksmbd: Remove duplicated codes
ksmbd: fix typo, syncronous->synchronous
ksmbd: Implements sess->rpc_handle_list as xarray
ksmbd: Implements sess->ksmbd_chann_list as xarray
* Reorganize zonefs code to split file related operations to a new
fs/zonefs/file.c file. From me.
* Modify zonefs to use dynamically allocated inodes and dentries
(using the inode and dentry caches) instead of statically allocating
everything on mount. This saves a significant amount of memory for
very large zoned block devices with 10s of thousands of zones. From
me.
* Make zonefs_sb_ktype a const struct kobj_type, from Thomas.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCY/VokQAKCRDdoc3SxdoY
dgCVAP427Q2wBrphU1//iW6R/7OZMz24nwmJcYPBAVLTngQO5gD+IVX+ru4xhD4Q
W1VKFN9QIDbjEiEc+Lmt9iQuDz+roAI=
=e/FD
-----END PGP SIGNATURE-----
Merge tag 'zonefs-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs updates from Damien Le Moal:
- Reorganize zonefs code to split file related operations to a new
fs/zonefs/file.c file (me)
- Modify zonefs to use dynamically allocated inodes and dentries (using
the inode and dentry caches) instead of statically allocating
everything on mount. This saves a significant amount of memory for
very large zoned block devices with 10s of thousands of zones (me)
- Make zonefs_sb_ktype a const struct kobj_type (Thomas)
* tag 'zonefs-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: make kobj_type structure constant
zonefs: Cache zone group directory inodes
zonefs: Dynamically create file inodes when needed
zonefs: Separate zone information from inode information
zonefs: Reduce struct zonefs_inode_info size
zonefs: Simplify IO error handling
zonefs: Reorganize code
- Fix a race when disassociating inodes from their glocks after
iget_failed().
- On filesystems with a block size smaller than the page size, make
sure that ->writepages() writes out all buffers of journaled inodes.
- Various improvements to the way the delete workqueue is drained to
speed up unmount and prevent leftover inodes. At unmount time, evict
deleted inodes cooperatively across the cluster to avoid unnecessary
timeouts.
- Various minor cleanups and fixes.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmP2ASMUHGFncnVlbmJh
QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTodnQ//XSt/sKo6W4y56Pxz5AHO2cTdypLk
b+ki1uWfguBm/3o8nFtXedoCcTsWZ8EICPL9bZKFBwSLSRChZuc6kKDz07wwseKR
t8Lbi9G8tSwyDeQaiGTBd1UBWFAjNGqXniyqx/ki+RZZ3QMVJnwcb1Bjtl9DJ5DE
lAXcvz0DZSQQwtpsdG+qpme65XZSziS0uDgkaz5Pio/1NDfbZ/U28HBMLNkS/Ef7
RnF05PRaM07OGn2rXmvIqwIwoxjH2eF38x5EyMI3xpr//b3x/mVbwL+QFECfgf7r
iuISCtL46n/gs4NmroPfT5LbCDhkOw513mmkdJNKXwHsz4s8hS+BuZVtTa8b02hn
0K5Ova63qz3TIoZ+P6n44xiRFEVjqz0eqn0XhOr+HRljRXhn2ihxQLl4yGNgrmB2
KTC+xMqZHXs2J3H97OTZDJVHYe5k6HqzvBUN6BnGRZ1lSEJbt5Fe0b7Web2aaLY2
X2jFXplWVisTPKcusSuG3kP4WrEJq7ic8YLX6BgKU5DBbS69NETssUuMGUIxsd6k
P+A4wfrUWac+X+DHFRPJu1yNL2UsW237AX75sNNOqLNRX04ZjXGFxymnEw//t2Qr
2sPOEkU4O61o7tYWlK8PXTDVEbteZO3pBCdj7ARsmDEY401QuT9ZlmpmhDxxP/hP
TGLDXXMbG+eireU=
=3v/p
-----END PGP SIGNATURE-----
Merge tag 'gfs2-v6.2-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Fix a race when disassociating inodes from their glocks after
iget_failed()
- On filesystems with a block size smaller than the page size, make
sure that ->writepages() writes out all buffers of journaled inodes
- Various improvements to the way the delete workqueue is drained to
speed up unmount and prevent leftover inodes. At unmount time, evict
deleted inodes cooperatively across the cluster to avoid unnecessary
timeouts
- Various minor cleanups and fixes
* tag 'gfs2-v6.2-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Convert gfs2_page_add_databufs to folios
gfs2: jdata writepage fix
gfs2: Improve gfs2_make_fs_rw error handling
Revert "GFS2: free disk inode which is deleted by remote node -V2"
gfs2: Evict inodes cooperatively
gfs2: Flush delete work before shrinking inode cache
gfs2: Cease delete work during unmount
gfs2: Add SDF_DEACTIVATING super block flag
gfs2: check gl_object in rgrp glops
gfs2: Split the two kinds of glock "delete" work
gfs2: Move delete workqueue into super block
gfs2: Get rid of GLF_PENDING_DELETE flag
gfs2: Make glock lru list scanning safer
gfs2: Clean up gfs2_scan_glock_lru
gfs2: Improve gfs2_upgrade_iopen_glock comment
gfs2: gl_object races fix
* Eliminate repeated boxing and unboxing of log item parameters.
* Clean up some confusing variable names in the log item code.
* Fix a deadlock when doing unwritten extent conversion that causes a
bmbt split when there are sustained memory shortages and the worker
pool runs out of worker threads.
* Fix the panic_mask debug knob not being able to trigger on verifier
errors.
* Constify kobj_type objects.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCY+Z6BwAKCRBKO3ySh0YR
pkQJAQCjkzXqZuj8WH/g22S01smT51QhmX+1ubLdzMYSvRvrKQD+MlH74EcgurQD
GhgCWJh6dBTx1nICKpCXYgVD9Glvowc=
=J2Xw
-----END PGP SIGNATURE-----
Merge tag 'xfs-6.3-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
"There's a couple of bug fixes, some cleanups for inconsistent variable
names and reduction of struct boxing and unboxing in the logging code.
More work is pending, which will begin reworking allocation group
lifetimes and finally replace confusing indirect calls to the
allocator with actual ... function calls. But I want to let that
experience another week of testing.
Summary:
- Eliminate repeated boxing and unboxing of log item parameters
- Clean up some confusing variable names in the log item code
- Fix a deadlock when doing unwritten extent conversion that causes a
bmbt split when there are sustained memory shortages and the worker
pool runs out of worker threads
- Fix the panic_mask debug knob not being able to trigger on verifier
errors
- Constify kobj_type objects"
* tag 'xfs-6.3-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: revert commit 8954c44ff4
xfs: make kobj_type structures constant
xfs: allow setting full range of panic tags
xfs: don't use BMBT btree split workers for IO completion
xfs: fix confusing variable names in xfs_refcount_item.c
xfs: pass refcount intent directly through the log intent code
xfs: fix confusing variable names in xfs_rmap_item.c
xfs: pass rmap space mapping directly through the log intent code
xfs: fix confusing xfs_extent_item variable names
xfs: pass xfs_extent_free_item directly through the log intent code
xfs: fix confusing variable names in xfs_bmap_item.c
xfs: pass the xfs_bmbt_irec directly through the log intent code
xfs: use strscpy() to instead of strncpy()
- Change when the iomap page_done function is called so that we still
have a locked folio in the success case. This fixes a writeback race
in gfs2.
- Change when the iomap page_prepare function is called so that gfs2
can recover from OOM scenarios more gracefully.
- Rename the iomap page_ops to folio_ops, since they operate on folios
now.
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQ2qTKExjcn+O1o2YRKO3ySh0YRpgUCY8g/FwAKCRBKO3ySh0YR
pi19AQDCatxkzguJGV9BY52Bf8iDxCgdL34RatKXAzkZC3Y6UQEAsNdb88rkWkNK
qPlXgsZm9cNlFb8c7mFvA9JAL9IPxgE=
=ubh6
-----END PGP SIGNATURE-----
Merge tag 'iomap-6.3-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap updates from Darrick Wong:
"This is mostly rearranging things to make life easier for gfs2,
nothing all that mindblowing for this release.
- Change when the iomap page_done function is called so that we still
have a locked folio in the success case. This fixes a writeback
race in gfs2
- Change when the iomap page_prepare function is called so that gfs2
can recover from OOM scenarios more gracefully
- Rename the iomap page_ops to folio_ops, since they operate on
folios now"
* tag 'iomap-6.3-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: Rename page_ops to folio_ops
iomap: Rename page_prepare handler to get_folio
iomap: Add __iomap_get_folio helper
iomap/gfs2: Get page in page_prepare handler
iomap: Add iomap_get_folio helper
iomap: Rename page_done handler to put_folio
iomap/gfs2: Unlock and put folio in page_done handler
iomap: Add __iomap_put_folio helper
Convert gfs2_page_add_databufs() to folios and rename it to
gfs2_trans_add_databufs().
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
The ->writepage() and ->writepages() operations are supposed to write
entire pages. However, on filesystems with a block size smaller than
PAGE_SIZE, __gfs2_jdata_writepage() only adds the first block to the
current transaction instead of adding the entire page. Fix that.
Fixes: 18ec7d5c3f ("[GFS2] Make journaled data files identical to normal files on disk")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Core
----
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used
to describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols
---------
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP
path manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF
---
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key
to better support decap on GRE tunnel devices not operating
in collect metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk
and bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols
by livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter
---------
- Remove the CLUSTERIP target. It has been marked as obsolete
for years, and we still have WARN splats wrt. races of
the out-of-band /proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to
the existing 'delete' commands, but do not return an error if
the referenced object (set, chain, rule...) did not exist.
Driver API
----------
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into multiple
files, drop some of the unnecessarily granular locks and factor out
common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless Extensions
for Wi-Fi 7 devices at all. Everyone should switch to using nl80211
interface instead.
- Improve the CAN bit timing configuration. Use extack to return error
messages directly to user space, update the SJW handling, including
the definition of a new default value that will benefit CAN-FD
controllers, by increasing their oscillator tolerance.
New hardware / drivers
----------------------
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers
-------
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- enetc: support XDP_REDIRECT for XDP non-linear buffers
- enetc: improve reconfig, avoid link flap and waiting for idle
- enetc: support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmP1VIYACgkQMUZtbf5S
IrvsChAApz0rNL/sPKxXTEfxZ1tN7D3sYxYKQPomxvl5BV+MvicrLddJy3KmzEFK
nnJNO3nuRNuH422JQ/ylZ4mGX1opa6+5QJb0UINImXUI7Fm8HHBIuPGkv7d5CheZ
7JexFqjPJXUy9nPyh1Rra+IA9AcRd2U7jeGEZR38wb99bHJQj5Bzdk20WArEB0el
n44aqg49LXH71bSeXRz77x5SjkwVtYiccQxLcnmTbjLU2xVraLvI2J+wAhHnVXWW
9lrU1+V4Ex2Xcd1xR0L0cHeK+meP1TrPRAeF+JDpVI3a/zJiE7cZjfHdG/jH5xWl
leZJqghVozrZQNtewWWO7XhUFhMDgFu3W/1vNLjSHPZEqaz1JpM67J1+ql6s63l4
LMWoXbcYZz+SL9ZRCoPkbGue/5fKSHv8/Jl9Sh58+eTS+c/zgN8uFGRNFXLX1+EP
n8uvt985PxMd6x1+dHumhOUzxnY4Sfi1vjitSunTsNFQ3Cmp4SO0IfBVJWfLUCuC
xz5hbJGJJbSpvUsO+HWyCg83E5OWghRE/Onpt2jsQSZCrO9HDg4FRTEf3WAMgaqc
edb5KfbRZPTJQM08gWdluXzSk1nw3FNP2tXW4XlgUrEbjb+fOk0V9dQg2gyYTxQ1
Nhvn8ZQPi6/GMMELHAIPGmmW1allyOGiAzGlQsv8EmL+OFM6WDI=
=xXhC
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
API:
- Use kmap_local instead of kmap_atomic.
- Change request callback to take void pointer.
- Print FIPS status in /proc/crypto (when enabled).
Algorithms:
- Add rfc4106/gcm support on arm64.
- Add ARIA AVX2/512 support on x86.
Drivers:
- Add TRNG driver for StarFive SoC.
- Delete ux500/hash driver (subsumed by stm32/hash).
- Add zlib support in qat.
- Add RSA support in aspeed.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmPzAiwACgkQxycdCkmx
i6et8xAAoO3w5MZFGXMzWsYhfSZFdceXBEQfDR7JOCdHxpMIQhw0FLlb0uttFk6m
SeWrdP9wiifBDoCmw7qffFJml8ZftPL/XeXjob2d9v7jKbPyw3lDSIdsNfN/5EEL
oIc9915zwrgawvahPAa+PQ4Ue03qRjUyOcV42dpd1W3NYhzDVHoK5OUU+mEFYDvx
Sgw/YUugKf0VXkVDFzG5049+CPcheyRZqclAo9jyl2eZiXujgUyV33nxRCtqIA+t
7jlHKwi+6QzFHY0CX5BvShR8xyEuH5MLoU3H/jYGXnRb3nEpRYAEO4VZchIHqF0F
Y6pKIKc6Q8OyIVY8RsjQY3hioCqYnQFZ5Xtc1zGtOYEitVLbkmItMG0mVn0XOfyt
gJDi6gkEw5uPUbEQdI4R1xEgJ8eCckMsOJ+uRxqTm+uLqNDxPbsB9bohKniMogXV
lDlVXjU23AA9VeKtqU8FvWjfgqsN47X4aoq1j4/4aI7X9F7P9FOP21TZloP7+ssj
PFrzNaRXUrMEsvyS1wqPegIh987lj6WkH4hyU0wjzaIq4IQELidHsSXFS12iWIPH
kTEoC/trAVoYSr0zXKWUCs4h/x0FztVNbjs4KiDP2FLXX1RzeVZ0WlaXZhryHr+n
1+8yCuS6tVofAbSX0wNkZdf0x5+3CIBw4kqSIvjKDPYYEfIDaT0=
=dMYe
-----END PGP SIGNATURE-----
Merge tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
"API:
- Use kmap_local instead of kmap_atomic
- Change request callback to take void pointer
- Print FIPS status in /proc/crypto (when enabled)
Algorithms:
- Add rfc4106/gcm support on arm64
- Add ARIA AVX2/512 support on x86
Drivers:
- Add TRNG driver for StarFive SoC
- Delete ux500/hash driver (subsumed by stm32/hash)
- Add zlib support in qat
- Add RSA support in aspeed"
* tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (156 commits)
crypto: x86/aria-avx - Do not use avx2 instructions
crypto: aspeed - Fix modular aspeed-acry
crypto: hisilicon/qm - fix coding style issues
crypto: hisilicon/qm - update comments to match function
crypto: hisilicon/qm - change function names
crypto: hisilicon/qm - use min() instead of min_t()
crypto: hisilicon/qm - remove some unused defines
crypto: proc - Print fips status
crypto: crypto4xx - Call dma_unmap_page when done
crypto: octeontx2 - Fix objects shared between several modules
crypto: nx - Fix sparse warnings
crypto: ecc - Silence sparse warning
tls: Pass rec instead of aead_req into tls_encrypt_done
crypto: api - Remove completion function scaffolding
tls: Remove completion function scaffolding
tipc: Remove completion function scaffolding
net: ipv6: Remove completion function scaffolding
net: ipv4: Remove completion function scaffolding
net: macsec: Remove completion function scaffolding
dm: Remove completion function scaffolding
...
Currently proc_dobool expects a (bool *) in table->data, but sizeof(int)
in table->maxsize, because it uses do_proc_dointvec() directly.
This is unsafe for at least two reasons:
1. A sysctl table definition may use { .data = &variable, .maxsize =
sizeof(variable) }, not realizing that this makes the sysctl unusable
(see the Fixes: tag) and that they need to use the completely
counterintuitive sizeof(int) instead.
2. proc_dobool() will currently try to parse an array of values if given
.maxsize >= 2*sizeof(int), but will try to write values of type bool
by offsets of sizeof(int), so it will not work correctly with neither
an (int *) nor a (bool *). There is no .maxsize validation to prevent
this.
Fix this by:
1. Constraining proc_dobool() to allow only one value and .maxsize ==
sizeof(bool).
2. Wrapping the original struct ctl_table in a temporary one with .data
pointing to a local int variable and .maxsize set to sizeof(int) and
passing this one to proc_dointvec(), converting the value to/from
bool as needed (using proc_dou8vec_minmax() as an example).
3. Extending sysctl_check_table() to enforce proc_dobool() expectations.
4. Fixing the proc_dobool() docstring (it was just copy-pasted from
proc_douintvec, apparently...).
5. Converting all existing proc_dobool() users to set .maxsize to
sizeof(bool) instead of sizeof(int).
Fixes: 83efeeeb3d ("tty: Allow TIOCSTI to be disabled")
Fixes: a2071573d6 ("sysctl: introduce new proc handler proc_dobool")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
- Replace 0-length and 1-element arrays with flexible arrays in various
subsystems (Paulo Miguel Almeida, Stephen Rothwell, Kees Cook)
- randstruct: Disable Clang 15 support (Eric Biggers)
- GCC plugins: Drop -std=gnu++11 flag (Sam James)
- strpbrk(): Refactor to use strchr() (Andy Shevchenko)
- LoadPin LSM: Allow root filesystem switching when non-enforcing
- fortify: Use dynamic object size hints when available
- ext4: Fix CFI function prototype mismatch
- Nouveau: Fix DP buffer size arguments
- hisilicon: Wipe entire crypto DMA pool on error
- coda: Fully allocate sig_inputArgs
- UBSAN: Improve arm64 trap code reporting
- copy_struct_from_user(): Add minimum bounds check on kernel buffer size
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmPv1Y8WHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJg5UD/9x3Lx0EG3iL4qPtjmohaXd899r
AzP1ysoxYnmo/cY0//W3DPCJrUaVlTm7M2xXOpzi7YPVD8Jcofzy6Uxm9BiG/OJ9
bla7uQixlDMA2MBmWzAXhM7337WgEtBcr6kbXk6rHFnzmk8CdAY3wjmLmiefxEWT
gkdeJlbkBFynssSF2nejgCvr/ZyiWQr2V9hRdEavLQH/MDS785bmNwbLyUNqK+eo
gOtuyjyV90t+cSIN0bF7gOCFGf1ivKA/+GNFrob0jY0Fy2kGx1I2wQMn9yzjzerC
o6Majz9r+7Z7xIaz2Pm9nDaWyZDI05RfoRpQZ9dSEJ+zYgbFBFpDpJShcJvSpNa0
POqeR400n/6VWBcbk7UU0s7VCVU13IsOFhBSVMQM5FfzIcUkj0/VBm0Jm0ODrpM9
13/nKyAkvHkH0uSJbQjn79rXvEvqQyi5f28emm2CuhiHHUiDEUdsmMD7fE8UXo4r
U8dgfwTOLLQBKmOQJcgiLo8iLDPhatZKYQAZ7LMY9kbHLsJlRVxfzY9PriNCuI5o
XuMLJG33TrlUDfqQrKeSJ9srVRiiIBAzoWnIfIVE3Xb46LqFNXVRdJCt4A2678jn
gYIzkQ2HbVe2chUhUyjsjGTjmmeX9qZG0UOlhRQ0RvWFxi390wwYqhkSaOEGtDGv
QbVh0Lb86m3H/G+M9g==
=XnVa
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"Beyond some specific LoadPin, UBSAN, and fortify features, there are
other fixes scattered around in various subsystems where maintainers
were okay with me carrying them in my tree or were non-responsive but
the patches were reviewed by others:
- Replace 0-length and 1-element arrays with flexible arrays in
various subsystems (Paulo Miguel Almeida, Stephen Rothwell, Kees
Cook)
- randstruct: Disable Clang 15 support (Eric Biggers)
- GCC plugins: Drop -std=gnu++11 flag (Sam James)
- strpbrk(): Refactor to use strchr() (Andy Shevchenko)
- LoadPin LSM: Allow root filesystem switching when non-enforcing
- fortify: Use dynamic object size hints when available
- ext4: Fix CFI function prototype mismatch
- Nouveau: Fix DP buffer size arguments
- hisilicon: Wipe entire crypto DMA pool on error
- coda: Fully allocate sig_inputArgs
- UBSAN: Improve arm64 trap code reporting
- copy_struct_from_user(): Add minimum bounds check on kernel buffer
size"
* tag 'hardening-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
randstruct: disable Clang 15 support
uaccess: Add minimum bounds check on kernel buffer size
arm64: Support Clang UBSAN trap codes for better reporting
coda: Avoid partial allocation of sig_inputArgs
gcc-plugins: drop -std=gnu++11 to fix GCC 13 build
lib/string: Use strchr() in strpbrk()
crypto: hisilicon: Wipe entire pool on error
net/i40e: Replace 0-length array with flexible array
io_uring: Replace 0-length array with flexible array
ext4: Fix function prototype mismatch for ext4_feat_ktype
i915/gvt: Replace one-element array with flexible-array member
drm/nouveau/disp: Fix nvif_outp_acquire_dp() argument size
LoadPin: Allow filesystem switch when not enforcing
LoadPin: Move pin reporting cleanly out of locking
LoadPin: Refactor sysctl initialization
LoadPin: Refactor read-only check into a helper
ARM: ixp4xx: Replace 0-length arrays with flexible arrays
fortify: Use __builtin_dynamic_object_size() when available
rxrpc: replace zero-lenth array with DECLARE_FLEX_ARRAY() helper
This pull request contains the following branches:
doc.2023.01.05a: Documentation updates.
fixes.2023.01.23a: Miscellaneous fixes, perhaps most notably:
o Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number
of callbacks.
o Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized.
o Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have doen this for mnay years.)
o Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled,
so this should not (yet) affect production use cases.)
kvfree.2023.01.03a: Cause kfree_rcu() and friends to take advantage of
polled grace periods, thus reducing memory footprint by almost
two orders of magnitude, admittedly on a microbenchmark.
This series also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs
where kfree_rcu(p), which can block, was typed instead of the
intended kfree_rcu(p, rh).
srcu.2023.01.03a: SRCU updates, perhaps most notably fixing a bug that
causes SRCU to fail when booted on a system with a non-zero boot
CPU. This surprising situation actually happens for kdump kernels
on the powerpc architecture. It also adds an srcu_down_read()
and srcu_up_read(), which act like srcu_read_lock() and
srcu_read_unlock(), but allow an SRCU read-side critical section
to be handed off from one task to another.
srcu-always.2023.02.02a: Cleans up the now-useless SRCU Kconfig option.
There are a few more commits that are not yet acked or pulled
into maintainer trees, and these will be in a pull request for
a later merge window.
tasks.2023.01.03a: RCU-tasks updates, perhaps most notably these fixes:
o A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang.
o A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can result
in a too-short grace period.
o A race between shrinking RCU tasks down to a single callback list
and queuing a new callback to some other CPU, but where that
queuing is delayed for more than an RCU grace period. This can
result in that callback being stranded on the non-boot CPU.
torture.2023.01.05a: Torture-test updates and fixes.
torturescript.2023.01.03a: Torture-test scripting updates and fixes.
stall.2023.01.09a: Provide additional RCU CPU stall-warning information
in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and
restore the full five-minute timeout limit for expedited RCU
CPU stall warnings.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmPq29UTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jAhVEACEAKJY1VJ9IUqz7CwzAYkzgRJfiygh
oDUXmlqtm6ew9pr2GdLUVCVsUSldzBc0K7Djb/G1niv4JPs+v7YwupIV33+UbStU
Qxt6ztTdxc4lKospLm1+2vF9ZdzVEmiP4wVCc4iDarv5FM3FpWSTNc8+L7qmlC+X
myjv+GqMTxkXZBvYJOgJGFjDwN8noTd7Fr3mCCVLFm3PXMDa7tcwD6HRP5AqD2N8
qC5M6LEqepKVGmz0mYMLlSN1GPaqIsEcexIFEazRsPEivPh/iafyQCQ/cqxwhXmV
vEt7u+dXGZT/oiDq9cJ+/XRDS2RyKIS6dUE14TiiHolDCn1ONESahfA/gXWKykC2
BaGPfjWXrWv/hwbeZ+8xEdkAvTIV92tGpXir9Fby1Z5PjP3balvrnn6hs5AnQBJb
NdhRPLzy/dCnEF+CweAYYm1qvTo8cd5nyiNwBZHn7rEAIu3Axrecag1rhFl3AJ07
cpVMQXZtkQVa2X8aIRTUC+ijX6yIqNaHlu0HqNXgIUTDzL4nv5cMjOMzpNQP9/dZ
FwAMZYNiOk9IlMiKJ8ZiVcxeiA8ouIBlkYM3k6vGrmiONZ7a/EV/mSHoJqI8bvqr
AxUIJ2Ayhg3bxPboL5oKgCiLql0A7ZVvz6quX6McitWGMgaSvel1fDzT3TnZd41e
4AFBFd/+VedUGg==
=bBYK
-----END PGP SIGNATURE-----
Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Documentation updates
- Miscellaneous fixes, perhaps most notably:
- Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number of
callbacks
- Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized
- Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have done this for many years)
- Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled, so
this should not (yet) affect production use cases)
- Make kfree_rcu() and friends take advantage of polled grace periods,
thus reducing memory footprint by almost two orders of magnitude,
admittedly on a microbenchmark
This also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs where
kfree_rcu(p), which can block, was typed instead of the intended
kfree_rcu(p, rh)
- SRCU updates, perhaps most notably fixing a bug that causes SRCU to
fail when booted on a system with a non-zero boot CPU. This
surprising situation actually happens for kdump kernels on the
powerpc architecture
This also adds an srcu_down_read() and srcu_up_read(), which act like
srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side
critical section to be handed off from one task to another
- Clean up the now-useless SRCU Kconfig option
There are a few more commits that are not yet acked or pulled into
maintainer trees, and these will be in a pull request for a later
merge window
- RCU-tasks updates, perhaps most notably these fixes:
- A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang
- A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can
result in a too-short grace period
- A race between shrinking RCU tasks down to a single callback
list and queuing a new callback to some other CPU, but where
that queuing is delayed for more than an RCU grace period. This
can result in that callback being stranded on the non-boot CPU
- Torture-test updates and fixes
- Torture-test scripting updates and fixes
- Provide additional RCU CPU stall-warning information in kernels built
with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute
timeout limit for expedited RCU CPU stall warnings
* tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
kernel/notifier: Remove CONFIG_SRCU
init: Remove "select SRCU"
fs/quota: Remove "select SRCU"
fs/notify: Remove "select SRCU"
fs/btrfs: Remove "select SRCU"
fs: Remove CONFIG_SRCU
drivers/pci/controller: Remove "select SRCU"
drivers/net: Remove "select SRCU"
drivers/md: Remove "select SRCU"
drivers/hwtracing/stm: Remove "select SRCU"
drivers/dax: Remove "select SRCU"
drivers/base: Remove CONFIG_SRCU
rcu: Disable laziness if lazy-tracking says so
rcu: Track laziness during boot and suspend
rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
rcu: Align the output of RCU CPU stall warning messages
rcu: Add RCU stall diagnosis information
sched: Add helper nr_context_switches_cpu()
...
We update ses->ip_addr whenever we do a session setup.
But this should happen only for primary channel in mchan
scenario.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
In smb2_reconnect_server, we allocate a dummy tcon for
calling reconnect for just the session. This should be
allocated using tconInfoAlloc, and not kmalloc.
Fixes: 3663c9045f ("cifs: check reconnects for channels of active tcons too")
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
Till now, we've used a simple round robin approach to
distribute the requests between the channels. This does
not work well if the channels consume the requests at
different speeds, even if the advertised speeds are the
same.
This change will allow the client to pick the channel
with least number of requests currently in-flight. This
will disregard the link speed, and select a channel
based on the current load of the channels.
For cases when all the channels are equally loaded,
fall back to the old round robin method.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
- Improve the scalability of the CFS bandwidth unthrottling logic
with large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with
the generic scheduler code. Add __cpuidle methods as noinstr to
objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS,
to query previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period,
to improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- ... Misc other cleanups, fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmPzbJwRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iIvA//ZcEaB8Z6ChLRQjM+bsaudKJu3pdLQbPK
iYbP8Da+LsAfxbEfYuGV3m+jIp0LlBOtsI/EezxQrXV+V7FvNyAX9Y00eEu/zlj8
7Jn3LMy/DBYTwH7LwVdcU0MyIVI8ZPc6WNnkx0LOtGZn8n+qfHPSDzcP3CW+a5AV
UvllPYpYyEmsX0Eby7CF4Ue8mSmbViw/xR3rNr8ZSve0c25XzKabw8O9kE3jiHxP
d/zERJoAYeDyYUEuZqhfn5dTlB4an4IjNEkAfRE5SQ09RA8Gkxsa5Ar8gob9e9M1
eQsdd4/bdhnrkM8L5qDZczqmgCTZ2bukQrxkBXhRDhLgoFxwAn77b+2ZjmIW3Lae
AyGqRcDSg1q2oxaYm5ZiuO/t26aDOZu9vPHyHRDGt95EGbZlrp+GgeePyfCigJYz
UmPdZAAcHdSymnnnlcvdG37WVvaVkpgWZzd8LbtBi23QR+Zc4WQ2IlgnUS5WKNNf
VOBcAcP6E1IslDotZDQCc2dPFFQoQQEssVooyUc5oMytm7BsvxXLOeHG+Ncu/8uc
H+U8Qn8jnqTxJbC5hkWQIJlhVKCq2FJrHxxySYTKROfUNcDgCmxboFeAcXTCIU1K
T0S+sdoTS/CvtLklRkG0j6B8N4N98mOd9cFwUV3tX+/gMLMep3hCQs5L76JagvC5
skkQXoONNaM=
=l1nN
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Improve the scalability of the CFS bandwidth unthrottling logic with
large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with the
generic scheduler code. Add __cpuidle methods as noinstr to objtool's
noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query
previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period, to
improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- Misc other cleanups, fixes
* tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits)
sched/rt: pick_next_rt_entity(): check list_entry
sched/deadline: Add more reschedule cases to prio_changed_dl()
sched/fair: sanitize vruntime of entity being placed
sched/fair: Remove capacity inversion detection
sched/fair: unlink misfit task from cpu overutilized
objtool: mem*() are not uaccess safe
cpuidle: Fix poll_idle() noinstr annotation
sched/clock: Make local_clock() noinstr
sched/clock/x86: Mark sched_clock() noinstr
x86/pvclock: Improve atomic update of last_value in pvclock_clocksource_read()
x86/atomics: Always inline arch_atomic64*()
cpuidle: tracing, preempt: Squash _rcuidle tracing
cpuidle: tracing: Warn about !rcu_is_watching()
cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG
cpuidle: drivers: firmware: psci: Dont instrument suspend code
KVM: selftests: Fix build of rseq test
exit: Detect and fix irq disabled state in oops
cpuidle, arm64: Fix the ARM64 cpuidle logic
cpuidle: mvebu: Fix duplicate flags assignment
sched/fair: Limit sched slice duration
...
DIO to/from KVEC-type iterators should now work as the iterator is passed
down to the socket in non-RDMA/non-crypto mode and in RDMA or crypto mode
care is taken to handle vmap/vmalloc correctly and not take page refs when
building a scatterlist.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Tom Talpey <tom@talpey.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
In the depths of the cifs RDMA code, extract part of an iov iterator
directly into an SGE list without going through an intermediate
scatterlist.
Note that this doesn't support extraction from an IOBUF- or UBUF-type
iterator (ie. user-supplied buffer). The assumption is that the higher
layers will extract those to a BVEC-type iterator first and do whatever is
required to stop the pages from going away.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Tom Talpey <tom@talpey.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
cc: linux-rdma@vger.kernel.org
Link: https://lore.kernel.org/r/166697260361.61150.5064013393408112197.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732032518.3186319.1859601819981624629.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
Currently, the cifs I/O paths hand lists of pages from the VM interface
routines at the top all the way through the intervening layers to the
socket interface at the bottom.
This is a problem, however, for interfacing with netfslib which passes an
iterator through to the ->issue_read() method (and will pass an iterator
through to the ->issue_write() method in future). Netfslib takes over
bounce buffering for direct I/O, async I/O and encrypted content, so cifs
doesn't need to do that. Netfslib also converts IOVEC-type iterators into
BVEC-type iterators if necessary.
Further, cifs needs foliating - and folios may come in a variety of sizes,
so a page list pointing to an array of heterogeneous pages may cause
problems in places such as where crypto is done.
Change the cifs I/O paths to hand iov_iter iterators all the way through
instead.
Notes:
(1) Some old routines are #if'd out to be removed in a follow up patch so
as to avoid confusing diff, thereby making the diff output easier to
follow. I've removed functions that don't overlap with anything
added.
(2) struct smb_rqst loses rq_pages, rq_offset, rq_npages, rq_pagesz and
rq_tailsz which describe the pages forming the buffer; instead there's
an rq_iter describing the source buffer and an rq_buffer which is used
to hold the buffer for encryption.
(3) struct cifs_readdata and cifs_writedata are similarly modified to
smb_rqst. The ->read_into_pages() and ->copy_into_pages() are then
replaced with passing the iterator directly to the socket.
The iterators are stored in these structs so that they are persistent
and don't get deallocated when the function returns (unlike if they
were stack variables).
(4) Buffered writeback is overhauled, borrowing the code from the afs
filesystem to gather up contiguous runs of folios. The XARRAY-type
iterator is then used to refer directly to the pagecache and can be
passed to the socket to transmit data directly from there.
This includes:
cifs_extend_writeback()
cifs_write_back_from_locked_folio()
cifs_writepages_region()
cifs_writepages()
(5) Pages are converted to folios.
(6) Direct I/O uses netfs_extract_user_iter() to create a BVEC-type
iterator from an IOBUF/UBUF-type source iterator.
(7) smb2_get_aead_req() uses netfs_extract_iter_to_sg() to extract page
fragments from the iterator into the scatterlists that the crypto
layer prefers.
(8) smb2_init_transform_rq() attached pages to smb_rqst::rq_buffer, an
xarray, to use as a bounce buffer for encryption. An XARRAY-type
iterator can then be used to pass the bounce buffer to lower layers.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Paulo Alcantara <pc@cjr.nz>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/164311907995.2806745.400147335497304099.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/164928620163.457102.11602306234438271112.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165211420279.3154751.15923591172438186144.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165348880385.2106726.3220789453472800240.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165364827111.3334034.934805882842932881.stgit@warthog.procyon.org.uk/ # v3
Link: https://lore.kernel.org/r/166126396180.708021.271013668175370826.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/166697259595.61150.5982032408321852414.stgit@warthog.procyon.org.uk/ # rfc
Link: https://lore.kernel.org/r/166732031756.3186319.12528413619888902872.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
Provide a function for filling in a scatterlist from the list of pages
contained in an iterator.
If the iterator is UBUF- or IOBUF-type, the pages have a pin taken on them
(as FOLL_PIN).
If the iterator is BVEC-, KVEC- or XARRAY-type, no pin is taken on the
pages and it is left to the caller to manage their lifetime. It cannot be
assumed that a ref can be validly taken, particularly in the case of a KVEC
iterator.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Add a function to extract the pages from a user-space supplied iterator
(UBUF- or IOVEC-type) into a BVEC-type iterator, retaining the pages by
getting a pin on them (as FOLL_PIN) as we go.
This is useful in three situations:
(1) A userspace thread may have a sibling that unmaps or remaps the
process's VM during the operation, changing the assignment of the
pages and potentially causing an error. Retaining the pages keeps
some pages around, even if this occurs; futher, we find out at the
point of extraction if EFAULT is going to be incurred.
(2) Pages might get swapped out/discarded if not retained, so we want to
retain them to avoid the reload causing a deadlock due to a DIO
from/to an mmapped region on the same file.
(3) The iterator may get passed to sendmsg() by the filesystem. If a
fault occurs, we may get a short write to a TCP stream that's then
tricky to recover from.
We don't deal with other types of iterator here, leaving it to other
mechanisms to retain the pages (eg. PG_locked, PG_writeback and the pipe
lock).
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Provide cifs_splice_read() to use a bvec rather than an pipe iterator as
the latter cannot so easily be split and advanced, which is necessary to
pass an iterator down to the bottom levels. Upstream cifs gets around this
problem by using iov_iter_get_pages() to prefill the pipe and then passing
the list of pages down.
This is done by:
(1) Bulk-allocate a bunch of pages to carry as much of the requested
amount of data as possible, but without overrunning the available
slots in the pipe and add them to an ITER_BVEC.
(2) Synchronously call ->read_iter() to read into the buffer.
(3) Discard any unused pages.
(4) Load the remaining pages into the pipe in order and advance the head
pointer.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: linux-cifs@vger.kernel.org
Link: https://lore.kernel.org/r/166732028113.3186319.1793644937097301358.stgit@warthog.procyon.org.uk/ # rfc
Signed-off-by: Steve French <stfrench@microsoft.com>
filemap_splice_read() and direct_splice_read() should be exported.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steve French <sfrench@samba.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-cifs@vger.kernel.org
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Implement a function, direct_file_splice(), that deals with this by using
an ITER_BVEC iterator instead of an ITER_PIPE iterator as the former won't
free its buffers when reverted. The function bulk allocates all the
buffers it thinks it is going to use in advance, does the read
synchronously and only then trims the buffer down. The pages we did use
get pushed into the pipe.
This fixes a problem with the upcoming iov_iter_extract_pages() function,
whereby pages extracted from a non-user-backed iterator such as ITER_PIPE
aren't pinned. __iomap_dio_rw(), however, calls iov_iter_revert() to
shorten the iterator to just the bufferage it is going to use - which has
the side-effect of freeing the excess pipe buffers, even though they're
attached to a bio and may get written to by DMA (thanks to Hillf Danton for
spotting this[1]).
This then causes memory corruption that is particularly noticeable when the
syzbot test[2] is run. The test boils down to:
out = creat(argv[1], 0666);
ftruncate(out, 0x800);
lseek(out, 0x200, SEEK_SET);
in = open(argv[1], O_RDONLY | O_DIRECT | O_NOFOLLOW);
sendfile(out, in, NULL, 0x1dd00);
run repeatedly in parallel. What I think is happening is that ftruncate()
occasionally shortens the DIO read that's about to be made by sendfile's
splice core by reducing i_size.
This should be more efficient for DIO read by virtue of doing a bulk page
allocation, but slightly less efficient by ignoring any partial page in the
pipe.
Reported-by: syzbot+a440341a59e3b7142895@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20230207094731.1390-1-hdanton@sina.com/ [1]
Link: https://lore.kernel.org/r/000000000000b0b3c005f3a09383@google.com/ [2]
Signed-off-by: Steve French <stfrench@microsoft.com>
The kernel is globally removing the ambiguous 0-length and 1-element
arrays in favor of flexible arrays, so that we can gain both compile-time
and run-time array bounds checking[1].
Replace the trailing 1-element array with a flexible array in the
following structures:
struct smb2_err_rsp
struct smb2_tree_connect_req
struct smb2_negotiate_rsp
struct smb2_sess_setup_req
struct smb2_sess_setup_rsp
struct smb2_read_req
struct smb2_read_rsp
struct smb2_write_req
struct smb2_write_rsp
struct smb2_query_directory_req
struct smb2_query_directory_rsp
struct smb2_set_info_req
struct smb2_change_notify_rsp
struct smb2_create_rsp
struct smb2_query_info_req
struct smb2_query_info_rsp
Replace the trailing 1-element array with a flexible array, but leave
the existing structure padding:
struct smb2_file_all_info
struct smb2_lock_req
Adjust all related size calculations to match the changes to sizeof().
No machine code output or .data section differences are produced after
these changes.
[1] For lots of details, see both:
https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrayshttps://people.kernel.org/kees/bounded-flexible-arrays-in-c
Cc: Steve French <sfrench@samba.org>
Cc: Paulo Alcantara <pc@cjr.nz>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
We already upcall to resolve hostnames during reconnect by calling
reconn_set_ipaddr_from_hostname(), so there is no point in having a
worker to periodically call it.
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmPvfncQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpob2EADXJxcr2jjYHm/7cjKkyuVX8fr80dNdMeuY
JFdsjG1k6Uj73BVhQQWYTcs/PsrWBHWRsv6uz4WgOELj55eXmf5Q0kJszyUeJW33
/DjqLvtoppVcYf80xE13wKvCfn73BjwQo6xkGM0qAYn15eaXiD/Ax3xC6eJlsBeK
PEw7EJyhacbSxZa/1D2B6+mqII1jUQWProTCc3udZ4JHi3WvdWa3Rda0qCqHl4a1
+K2aP2YTFIRPxBzfMNa/CafWVIFubTdht+4Ds6R60RImzB9e0VUBfcsiUyW5Zg7L
Fwv7ptXuWrALwVNdW56Oz1QikBxn2pdRR2HMLwKJW1MD8kP9r8LMm2jV5Rhiwe0B
OQsGRYkOzBvw+bxeP5fvk0iPGVMz6ActH4gkraA5QdLqayDaFYOadlhqz0uRo5SH
Fb42Vl658K/MHDSIk8U58TNkmrsIJsBGohXI9DOGINPPvv3XOPi4Q1HmXkGRmii0
y+lNU/QEGh7xXXew29SPP76uQpQaYfC7NxXCMw/OpOMwehzjsjshmM2lpxi8zsgt
PJUmfHv5qxCplNmTJXmUpmX7sS7550HUdu9FJb13DM+gzKg8bk9jWVuLrzqrVlG5
1hKWEl1+heg1heRfaIuJVLbPI0au6Sb4uqhih/PHyrP9TWIoAruDbDJM65GKTxyE
2uEgcHzHQw==
=poRc
-----END PGP SIGNATURE-----
Merge tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe updates via Christoph:
- Small improvements to the logging functionality (Amit Engel)
- Authentication cleanups (Hannes Reinecke)
- Cleanup and optimize the DMA mapping cod in the PCIe driver
(Keith Busch)
- Work around the command effects for Format NVM (Keith Busch)
- Misc cleanups (Keith Busch, Christoph Hellwig)
- Fix and cleanup freeing single sgl (Keith Busch)
- MD updates via Song:
- Fix a rare crash during the takeover process
- Don't update recovery_cp when curr_resync is ACTIVE
- Free writes_pending in md_stop
- Change active_io to percpu
- Updates to drbd, inching us closer to unifying the out-of-tree driver
with the in-tree one (Andreas, Christoph, Lars, Robert)
- BFQ update adding support for multi-actuator drives (Paolo, Federico,
Davide)
- Make brd compliant with REQ_NOWAIT (me)
- Fix for IOPOLL and queue entering, fixing stalled IO waiting on
timeouts (me)
- Fix for REQ_NOWAIT with multiple bios (me)
- Fix memory leak in blktrace cleanup (Greg)
- Clean up sbitmap and fix a potential hang (Kemeng)
- Clean up some bits in BFQ, and fix a bug in the request injection
(Kemeng)
- Clean up the request allocation and issue code, and fix some bugs
related to that (Kemeng)
- ublk updates and fixes:
- Add support for unprivileged ublk (Ming)
- Improve device deletion handling (Ming)
- Misc (Liu, Ziyang)
- s390 dasd fixes (Alexander, Qiheng)
- Improve utility of request caching and fixes (Anuj, Xiao)
- zoned cleanups (Pankaj)
- More constification for kobjs (Thomas)
- blk-iocost cleanups (Yu)
- Remove bio splitting from drivers that don't need it (Christoph)
- Switch blk-cgroups to use struct gendisk. Some of this is now
incomplete as select late reverts were done. (Christoph)
- Add bvec initialization helpers, and convert callers to use that
rather than open-coding it (Christoph)
- Misc fixes and cleanups (Jinke, Keith, Arnd, Bart, Li, Martin,
Matthew, Ulf, Zhong)
* tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linux: (169 commits)
brd: use radix_tree_maybe_preload instead of radix_tree_preload
block: use proper return value from bio_failfast()
block: bio-integrity: Copy flags when bio_integrity_payload is cloned
block: Fix io statistics for cgroup in throttle path
brd: mark as nowait compatible
brd: check for REQ_NOWAIT and set correct page allocation mask
brd: return 0/-error from brd_insert_page()
block: sync mixed merged request's failfast with 1st bio's
Revert "blk-cgroup: pin the gendisk in struct blkcg_gq"
Revert "blk-cgroup: pass a gendisk to blkg_lookup"
Revert "blk-cgroup: delay blk-cgroup initialization until add_disk"
Revert "blk-cgroup: delay calling blkcg_exit_disk until disk_release"
Revert "blk-cgroup: move the cgroup information to struct gendisk"
nvme-pci: remove iod use_sgls
nvme-pci: fix freeing single sgl
block: ublk: check IO buffer based on flag need_get_data
s390/dasd: Fix potential memleak in dasd_eckd_init()
s390/dasd: sort out physical vs virtual pointers usage
block: Remove the ALLOC_CACHE_SLACK constant
block: make kobj_type structures constant
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmPueAQQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgplopEACo17a4Z2p2xCedA0NCqX2ggtsSIdYiluPm
pgdBzIEsgwKo1XVLGRgGiC8VdMRuzO4Zh/NGn4iRF1a68wjgjnwGWY7r052TUoSr
q1yya739BpffnkXjj15x86cwl+5rHv2RQkm15+2HqBgcruA63/ZgdKBtjj+EtVKs
zYOlmgyfFbkn8AdULMGiDKP4lixV8gUelv6vWneBwNrj4iSLnuN1+8nJNsl4wxwg
ImSpx63AzhUoeL6byc+fmiA8fZhDhSvwON2tCyyCmOjlFM/TLrsm5t1juWiDid1O
UROkQwQtsmjSUq3ow5fRJfjbZ3HLa1uGQr95DYHy0OBRAteAhDY5Upv0DXNL0ZBh
uNNg8AXtJbyc+pLHWnncyiTzi+3eWs7WiMn04/a5eDhFvcJ0PZjLIgRi5j1ezUS1
bWqoPaAIxoMD83WoMxjnKvBpGeMzPHvNTijeZjkGOu0vOk8JhXqNmLTjNG9aLtzf
1Nvp0o8AqtQAW7cgFazZSWtw4bPk/wZ7mW0zHtqLDHIzXkc7A/Uo0ftdv84G08aW
pvakNz4aNLwSPf7hxgPP9SgS9CeHhxK8PS6uk3V788SI8qGiew11+EcTNGkQNmvw
/ItCo93UaWD/7SZLObTLslmet7rFHzz6PXaXrMxrSvaeZMkgr7DWEy9XS+ueOtXO
fS8QhJX11w==
=IU45
-----END PGP SIGNATURE-----
Merge tag 'for-6.3/dio-2023-02-16' of git://git.kernel.dk/linux
Pull legacy dio update from Jens Axboe:
"We only have a few file systems that use the old dio code, make them
select it rather than build it unconditionally"
* tag 'for-6.3/dio-2023-02-16' of git://git.kernel.dk/linux:
fs: build the legacy direct I/O code conditionally
fs: move sb_init_dio_done_wq out of direct-io.c
This patch set fixes some races in the lowcomms startup and shutdown code
that were found by targetted stress testing that quickly and repeatedly
joins and leaves lockspaces.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJj870xAAoJEDgbc8f8gGmqGasP/2fy7GIe/NNaTDVfRPzJMVsh
v5GP42q7pwuaR2rTZxHOpALfDJXx3wu9zsA3A6luSaYRRHxdntztaVT2vaAFa13o
KI2IDB2DIdJ1PqDVgq8E9I6+X/y5gFpeq9Wyo2FdYATncQ370wN42uIk3EECPTZh
73HZHBkACQJ28ljt4zWg9ANn1rSiZQjaKayFwbnaH0AjbWrTehccyKR0BE2rNd6c
EqoRX0tRJLH1RVaOHKW+m4EZaMDo2m5hQH/lhcwth/uqxOqQYOSyMPneKVj/TsI8
8ToXRCqOYwV3uNZVhgRxu3hHPd2e0l3OgU+mlvNqURTSANr8/D3kEZp90AWDV2yg
yGtaSYspKY7tjgdIWZbTffxXu+5CZs0xTSkw53xjWyCZ4++c/MO4WO3w8/QIsmWd
Ow8l29YQc5t7XI0zjpsgyQMaKHQLXE06y9F1jZlZpxa0PfvDOevhl+8YQrKGIJYh
B+QzjtkmQLpD9RyBRdmxIrS5dl/Wv+wa4zhstD9jkUAkKvZ5ndYvCVtjRu5F8PZ7
HJzUorZFqDgKfXqTLt91irtS4OdMY4xPWr/V1K8NekTQ/+GY7Q1Qu1MlBOW2aqIk
UHfn1VaxdFom+tdFF83WG+kmnUu2+YZK83vtnFoZlTsbvfY//zveR1Hbt4MnTj6G
Z00nx5eb2SaKJFHD0J8z
=4Def
-----END PGP SIGNATURE-----
Merge tag 'dlm-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This fixes some races in the lowcomms startup and shutdown code that
were found by targeted stress testing that quickly and repeatedly
joins and leaves lockspaces"
* tag 'dlm-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
fs: dlm: remove unnecessary waker_up() calls
fs: dlm: move state change into else branch
fs: dlm: remove newline in log_print
fs: dlm: reduce the shutdown timeout to 5 secs
fs: dlm: make dlm sequence id more robust
fs: dlm: wait until all midcomms nodes detect version
fs: dlm: ignore unexpected non dlm opts msgs
fs: dlm: bring back previous shutdown handling
fs: dlm: send FIN ack back in right cases
fs: dlm: move sending fin message into state change handling
fs: dlm: don't set stop rx flag after node reset
fs: dlm: fix race setting stop tx flag
fs: dlm: be sure to call dlm_send_queue_flush()
fs: dlm: fix use after free in midcomms commit
fs: dlm: start midcomms before scand
fs/dlm: Remove "select SRCU"
fs: dlm: fix return value check in dlm_memory_init()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmPzxWcACgkQxWXV+ddt
WDt+fRAAg5pz7gWNMtIK30gp/uojjAkCWXymxRtK2tZU3naI+6IYSAKxuKq8Iz1Y
drdlpSvTX/Gv3XlGB9QuoH6digTjQzeVzjAm0eP6w8t8354KGSRUYdtoFp8I8E5Z
q0JUuZ6w/KvpZfOIsmcgpOScgcl+8+UlOxs2iuSrOvAqP8Dg1VCt5vBm7htIb0tm
5ClbgmIacxWrOII55XGuY0mWuZSlS4hdyWdYMelvtM8aPPG+e8eEzKjscVOOueLz
Smi1kN5QU3o+m4oKjN1OJlKfeURdbcZUwva9zOsegSbPHUzNwIao44cQ5cQhMR0r
kI3nCpJwGKdUd6IblEdcqBN5F4V64edLSruOLuGYzxySnEWhFE2YU2xW/v5b1eQW
GHurI52FGrPqcX9FgQNzfTjQzk341iQ0QIs5exycJH7xeohEZnlaK2yNUngKSo1C
naqczEMMMcxNjQaooUuxRkL/zz36D/Dkyo2YOCODtWyu61XY9LqvaxMvClFI20lL
40dzzYnnMQwkXJrQ/MVQhz1BBaPVqizt8+ErL7GQp2CWr9miD6mcA5b2pyZm5Q3r
hHadzeTXXS7P9g9UnuDxpZqkhvadGC2Sy4l/D6jURyKFzr8mtplaRRwUS2gSuP3z
zxavvP4UukwNWXxDz755NAhiGbA+xpSMATKCrZ/Sdogvxe8IhRg=
=NCpw
-----END PGP SIGNATURE-----
Merge tag 'for-6.3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"The usual mix of performance improvements and new features.
The core change is reworking how checksums are processed, with
followup cleanups and simplifications. There are two minor changes in
block layer and iomap code.
Features:
- block group allocation class heuristics:
- pack files by size (up to 128k, up to 8M, more) to avoid
fragmentation in block groups, assuming that file size and life
time is correlated, in particular this may help during balance
- with tracepoints and extensible in the future
Performance:
- send: cache directory utimes and only emit the command when
necessary
- speedup up to 10x
- smaller final stream produced (no redundant utimes commands
issued)
- compatibility not affected
- fiemap: skip backref checks for shared leaves
- speedup 3x on sample filesystem with all leaves shared (e.g. on
snapshots)
- micro optimized b-tree key lookup, speedup in metadata operations
(sample benchmark: fs_mark +10% of files/sec)
Core changes:
- change where checksumming is done in the io path:
- checksum and read repair does verification at lower layer
- cascaded cleanups and simplifications
- raid56 refactoring and cleanups
Fixes:
- sysfs: make sure that a run-time change of a feature is correctly
tracked by the feature files
- scrub: better reporting of tree block errors
Other:
- locally enable -Wmaybe-uninitialized after fixing all warnings
- misc cleanups, spelling fixes
Other code:
- block: export bio_split_rw
- iomap: remove IOMAP_F_ZONE_APPEND"
* tag 'for-6.3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (109 commits)
btrfs: make kobj_type structures constant
btrfs: remove the bdev argument to btrfs_rmap_block
btrfs: don't rely on unchanging ->bi_bdev for zone append remaps
btrfs: never return true for reads in btrfs_use_zone_append
btrfs: pass a btrfs_bio to btrfs_use_append
btrfs: set bbio->file_offset in alloc_new_bio
btrfs: use file_offset to limit bios size in calc_bio_boundaries
btrfs: do unsigned integer division in the extent buffer binary search loop
btrfs: eliminate extra call when doing binary search on extent buffer
btrfs: raid56: handle endio in scrub_rbio
btrfs: raid56: handle endio in recover_rbio
btrfs: raid56: handle endio in rmw_rbio
btrfs: raid56: submit the read bios from scrub_assemble_read_bios
btrfs: raid56: fold rmw_read_wait_recover into rmw_read_bios
btrfs: raid56: fold recover_assemble_read_bios into recover_rbio
btrfs: raid56: add a bio_list_put helper
btrfs: raid56: wait for I/O completion in submit_read_bios
btrfs: raid56: simplify code flow in rmw_rbio
btrfs: raid56: simplify error handling and code flow in raid56_parity_write
btrfs: replace btrfs_wait_tree_block_writeback by wait_on_extent_buffer_writeback
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmPvZWkACgkQnJ2qBz9k
QNkVtQf/V515KIb7lEkwOjlF7AQg5MS/c8zuodHISzWYhZyd0KSTb+qnF/QzLvVm
cJ3cIPXrIyVw4Eeqh0qQvukOYCcBvUa1IBW5kePiy3mHiHRD2PRgaxSGBXTXqqPG
xXagwllrn3/mG4ZKXlNYFrzgoshSFFeBdkLGEi+/L6DAe0B+mG+FIHON1eWylgOT
j1D+/k9RNvRhRU8WtStcI4u9mnVPqUI2RSWUpjxuNzUPtyFflPVNCz+bkXXovPQ0
ZQY2HeZcs7jsorCmeSHUzTt5bbj3BfhO3uWL4/wnHgp+88OBRUUyvMrNbOF97xd6
KFqbJVQbSevasUSEvCS+3+EChzjGoA==
=DaJi
-----END PGP SIGNATURE-----
Merge tag 'fixes_for_v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF and ext2 fixes from Jan Kara:
- Rewrite of udf directory iteration code to address multiple syzbot
reports
- Fixes to udf extent handling and block mapping code to address
several syzbot reports and filesystem corruption issues uncovered by
fsx & fsstress
- Convert udf to kmap_local()
- Add sanity checks when loading udf bitmaps
- Drop old VARCONV support which I've never seen used and which was
broken for quite some years without anybody noticing
- Finish conversion of ext2 to kmap_local()
- One fix to mpage_writepages() on which other udf fixes depend
* tag 'fixes_for_v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (78 commits)
udf: Avoid directory type conversion failure due to ENOMEM
udf: Use unsigned variables for size calculations
udf: remove reporting loc in debug output
udf: Check consistency of Space Bitmap Descriptor
udf: Fix file counting in LVID
udf: Limit file size to 4TB
udf: Don't return bh from udf_expand_dir_adinicb()
udf: Convert udf_expand_file_adinicb() to avoid kmap_atomic()
udf: Convert udf_adinicb_writepage() to memcpy_to_page()
udf: Switch udf_adinicb_readpage() to kmap_local_page()
udf: Move udf_adinicb_readpage() to inode.c
udf: Mark aops implementation static
udf: Switch to single address_space_operations
udf: Add handling of in-ICB files to udf_bmap()
udf: Convert all file types to use udf_write_end()
udf: Convert in-ICB files to use udf_write_begin()
udf: Convert in-ICB files to use udf_direct_IO()
udf: Convert in-ICB files to use udf_writepages()
udf: Unify .read_folio for normal and in-ICB files
udf: Fix off-by-one error when discarding preallocation
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmPvZJwACgkQnJ2qBz9k
QNlPcAf/UL7DDv37vnvfcFTa9lRyC0dXsgxnVZUwMU0hJs/ewbmueYGnJSBRTVLG
7ad7bKYQVWsjhas4YulofgRrFWxVDcR32qbC+pDo/X6vGjo4tDl2CNPYREY3n3kN
xR6Ca7nPxBH5AVYwwOqBJSTqhWGy1TSDeuskndS0P+YtTv6Y4Zvm4UEiNAXJ4nwo
5Nd+bsPpkrEgQqO/NK2rCXfBfkJr4jAMcp+Nn2zAP44icZAXJYn8QrN3gVL6OZlN
RKq36MGQf52lxyufVyFCulWKRbxhEKUS0nURZgAG+Sv87DlSuBJgRVG7xJ1baPpK
2g7wG2jaT7YMfA4PWms/rwAj/CkGLA==
=NRh0
-----END PGP SIGNATURE-----
Merge tag 'fsnotify_for_v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
"Support for auditing decisions regarding fanotify permission events"
* tag 'fsnotify_for_v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify,audit: Allow audit to use the full permission event response
fanotify: define struct members to hold response decision context
fanotify: Ensure consistent variable type for response
Fix the longstanding implementation limitation that fsverity was only
supported when the Merkle tree block size, filesystem block size, and
PAGE_SIZE were all equal. Specifically, add support for Merkle tree
block sizes less than PAGE_SIZE, and make ext4 support fsverity on
filesystems where the filesystem block size is less than PAGE_SIZE.
Effectively, this means that fsverity can now be used on systems with
non-4K pages, at least on ext4. These changes have been tested using
the verity group of xfstests, newly updated to cover the new code paths.
Also update fs/verity/ to support verifying data from large folios.
There's also a similar patch for fs/crypto/, to support decrypting data
from large folios, which I'm including in this pull request to avoid a
merge conflict between the fscrypt and fsverity branches.
There will be a merge conflict in fs/buffer.c with some of the foliation
work in the mm tree. Please use the merge resolution from linux-next.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY/KJtRQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK/A/AP0RUlCClBRuHwXPRG0we8R1L153ga4s
Vl+xRpCr+SswXwEAiOEpYN5cXoVKzNgxbEXo2pQzxi5lrpjZgUI6CL3DuQs=
=ZRFX
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull fsverity updates from Eric Biggers:
"Fix the longstanding implementation limitation that fsverity was only
supported when the Merkle tree block size, filesystem block size, and
PAGE_SIZE were all equal.
Specifically, add support for Merkle tree block sizes less than
PAGE_SIZE, and make ext4 support fsverity on filesystems where the
filesystem block size is less than PAGE_SIZE.
Effectively, this means that fsverity can now be used on systems with
non-4K pages, at least on ext4. These changes have been tested using
the verity group of xfstests, newly updated to cover the new code
paths.
Also update fs/verity/ to support verifying data from large folios.
There's also a similar patch for fs/crypto/, to support decrypting
data from large folios, which I'm including in here to avoid a merge
conflict between the fscrypt and fsverity branches"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fscrypt: support decrypting data from large folios
fsverity: support verifying data from large folios
fsverity.rst: update git repo URL for fsverity-utils
ext4: allow verity with fs block size < PAGE_SIZE
fs/buffer.c: support fsverity in block_read_full_folio()
f2fs: simplify f2fs_readpage_limit()
ext4: simplify ext4_readpage_limit()
fsverity: support enabling with tree block size < PAGE_SIZE
fsverity: support verification with tree block size < PAGE_SIZE
fsverity: replace fsverity_hash_page() with fsverity_hash_block()
fsverity: use EFBIG for file too large to enable verity
fsverity: store log2(digest_size) precomputed
fsverity: simplify Merkle tree readahead size calculation
fsverity: use unsigned long for level_start
fsverity: remove debug messages and CONFIG_FS_VERITY_DEBUG
fsverity: pass pos and size to ->write_merkle_tree_block
fsverity: optimize fsverity_cleanup_inode() on non-verity files
fsverity: optimize fsverity_prepare_setattr() on non-verity files
fsverity: optimize fsverity_file_open() on non-verity files
Simplify the implementation of the test_dummy_encryption mount option by
adding the "test dummy key" on-demand.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY/J7NRQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK3UVAP9Likiuy47D/RM4mOsPMwLAlQRx5uW6
iGxT6DutekA7DwEA4hNjEQQ/EKO+UxFb+fBCX+xpTDbS3LB7CxGsqHzZJQM=
=SiNJ
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux
Pull fscrypt updates from Eric Biggers:
"Simplify the implementation of the test_dummy_encryption mount option
by adding the 'test dummy key' on-demand"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
fscrypt: clean up fscrypt_add_test_dummy_key()
fs/super.c: stop calling fscrypt_destroy_keyring() from __put_super()
f2fs: stop calling fscrypt_add_test_dummy_key()
ext4: stop calling fscrypt_add_test_dummy_key()
fscrypt: add the test dummy encryption key on-demand
- Add per-cpu kthreads for low-latency decompression for Android
use cases;
- Get rid of tagged pointer helpers since they are rarely used now;
- Several code cleanups to reduce codebase;
- Documentation and MAINTAINERS updates.
-----BEGIN PGP SIGNATURE-----
iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCY/IDjhEceGlhbmdAa2Vy
bmVsLm9yZwAKCRA5NzHcH7XmBNbTAQDT2njll/B2JSYbVC2I2HYTZSyFXEaHhH+M
6gHRbEhTWAD/VNiAcdE600IkUwut/78tDvwlz/XJSd2JQMMwkTSviwc=
=oroQ
-----END PGP SIGNATURE-----
Merge tag 'erofs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"The most noticeable feature for this cycle is per-CPU kthread
decompression since Android use cases need low-latency I/O handling in
order to ensure the app runtime performance, currently unbounded
workqueue latencies are not quite good for production on many aarch64
hardwares and thus we need to introduce a deterministic expectation
for these. Decompression is CPU-intensive and it is sleepable for
EROFS, so other alternatives like decompression under softirq contexts
are not considered. More details are in the corresponding commit
message.
Others are random cleanups around the whole codebase and we will
continue to clean up further in the next few months.
Due to Lunar New Year holidays, some other new features were not
completely reviewed and solidified as expected and we may delay them
into the next version.
Summary:
- Add per-cpu kthreads for low-latency decompression for Android use
cases
- Get rid of tagged pointer helpers since they are rarely used now
- Several code cleanups to reduce codebase
- Documentation and MAINTAINERS updates"
* tag 'erofs-for-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (21 commits)
erofs: fix an error code in z_erofs_init_zip_subsystem()
erofs: unify anonymous inodes for blob
erofs: relinquish volume with mutex held
erofs: maintain cookies of share domain in self-contained list
erofs: remove unused device mapping in meta routine
MAINTAINERS: erofs: Add Documentation/ABI/testing/sysfs-fs-erofs
Documentation/ABI: sysfs-fs-erofs: update supported features
erofs: remove unused EROFS_GET_BLOCKS_RAW flag
erofs: update print symbols for various flags in trace
erofs: make kobj_type structures constant
erofs: add per-cpu threads for decompression as an option
erofs: tidy up internal.h
erofs: get rid of z_erofs_do_map_blocks() forward declaration
erofs: move zdata.h into zdata.c
erofs: remove tagged pointer helpers
erofs: avoid tagged pointers to mark sync decompression
erofs: get rid of erofs_inode_datablocks()
erofs: simplify iloc()
erofs: get rid of debug_one_dentry()
erofs: remove linux/buffer_head.h dependency
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY+5NzQAKCRCRxhvAZXjc
otv2AP9wJtg+RL01iYiUE2mRMYxq4R79yWrtPEyuDEZIq5tQSwEA/H4yk7EHgHMS
aKnEfny/P9JjKPtZzsxhMQcpiIVewQs=
=+Q0C
-----END PGP SIGNATURE-----
Merge tag 'fs.acl.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs acl update from Christian Brauner:
"This contains a single update to the internal get acl method and
replaces an open-coded cmpxchg() comparison with with try_cmpxchg().
It's clearer and also beneficial on some architectures"
* tag 'fs.acl.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
posix_acl: Use try_cmpxchg in get_acl
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY+5SogAKCRCRxhvAZXjc
orVwAP4jJ1dPZYx1xHip9TfB5fv5xHz3euhvWns6qGJdVYoHzwEAhVxgYUpqWdXX
L/+VKRFFujYxsSXP4BbS3xDPUJeQFAI=
=ccK2
-----END PGP SIGNATURE-----
Merge tag 'fs.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs hardening update from Christian Brauner:
"Jan pointed out that during shutdown both filp_close() and super block
destruction will use basic printk logging when bugs are detected. This
causes issues in a few scenarios:
- Tools like syzkaller cannot figure out that the logged message
indicates a bug.
- Users that explicitly opt in to have the kernel bug on data
corruption by selecting CONFIG_BUG_ON_DATA_CORRUPTION should see
the kernel crash when they did actually select that option.
- When there are busy inodes after the superblock is shut down later
access to such a busy inodes walks through freed memory. It would
be better to cleanly crash instead.
All of this can be addressed by using the already existing
CHECK_DATA_CORRUPTION() macro in these places when kernel bugs are
detected. Its logging improvement is useful for all users.
Otherwise this only has a meaningful behavioral effect when users do
select CONFIG_BUG_ON_DATA_CORRUPTION which means this is backward
compatible for regular users"
* tag 'fs.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
fs: Use CHECK_DATA_CORRUPTION() when kernel bugs are detected
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY+5NlQAKCRCRxhvAZXjc
orOaAP9i2h3OJy95nO2Fpde0Bt2UT+oulKCCcGlvXJ8/+TQpyQD/ZQq47gFQ0EAz
Br5NxeyGeecAb0lHpFz+CpLGsxMrMwQ=
=+BG5
-----END PGP SIGNATURE-----
Merge tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs idmapping updates from Christian Brauner:
- Last cycle we introduced the dedicated struct mnt_idmap type for
mount idmapping and the required infrastucture in 256c8aed2b ("fs:
introduce dedicated idmap type for mounts"). As promised in last
cycle's pull request message this converts everything to rely on
struct mnt_idmap.
Currently we still pass around the plain namespace that was attached
to a mount. This is in general pretty convenient but it makes it easy
to conflate namespaces that are relevant on the filesystem with
namespaces that are relevant on the mount level. Especially for
non-vfs developers without detailed knowledge in this area this was a
potential source for bugs.
This finishes the conversion. Instead of passing the plain namespace
around this updates all places that currently take a pointer to a
mnt_userns with a pointer to struct mnt_idmap.
Now that the conversion is done all helpers down to the really
low-level helpers only accept a struct mnt_idmap argument instead of
two namespace arguments.
Conflating mount and other idmappings will now cause the compiler to
complain loudly thus eliminating the possibility of any bugs. This
makes it impossible for filesystem developers to mix up mount and
filesystem idmappings as they are two distinct types and require
distinct helpers that cannot be used interchangeably.
Everything associated with struct mnt_idmap is moved into a single
separate file. With that change no code can poke around in struct
mnt_idmap. It can only be interacted with through dedicated helpers.
That means all filesystems are and all of the vfs is completely
oblivious to the actual implementation of idmappings.
We are now also able to extend struct mnt_idmap as we see fit. For
example, we can decouple it completely from namespaces for users that
don't require or don't want to use them at all. We can also extend
the concept of idmappings so we can cover filesystem specific
requirements.
In combination with the vfs{g,u}id_t work we finished in v6.2 this
makes this feature substantially more robust and thus difficult to
implement wrong by a given filesystem and also protects the vfs.
- Enable idmapped mounts for tmpfs and fulfill a longstanding request.
A long-standing request from users had been to make it possible to
create idmapped mounts for tmpfs. For example, to share the host's
tmpfs mount between multiple sandboxes. This is a prerequisite for
some advanced Kubernetes cases. Systemd also has a range of use-cases
to increase service isolation. And there are more users of this.
However, with all of the other work going on this was way down on the
priority list but luckily someone other than ourselves picked this
up.
As usual the patch is tiny as all the infrastructure work had been
done multiple kernel releases ago. In addition to all the tests that
we already have I requested that Rodrigo add a dedicated tmpfs
testsuite for idmapped mounts to xfstests. It is to be included into
xfstests during the v6.3 development cycle. This should add a slew of
additional tests.
* tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits)
shmem: support idmapped mounts for tmpfs
fs: move mnt_idmap
fs: port vfs{g,u}id helpers to mnt_idmap
fs: port fs{g,u}id helpers to mnt_idmap
fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap
fs: port i_{g,u}id_{needs_}update() to mnt_idmap
quota: port to mnt_idmap
fs: port privilege checking helpers to mnt_idmap
fs: port inode_owner_or_capable() to mnt_idmap
fs: port inode_init_owner() to mnt_idmap
fs: port acl to mnt_idmap
fs: port xattr to mnt_idmap
fs: port ->permission() to pass mnt_idmap
fs: port ->fileattr_set() to pass mnt_idmap
fs: port ->set_acl() to pass mnt_idmap
fs: port ->get_acl() to pass mnt_idmap
fs: port ->tmpfile() to pass mnt_idmap
fs: port ->rename() to pass mnt_idmap
fs: port ->mknod() to pass mnt_idmap
fs: port ->mkdir() to pass mnt_idmap
...
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmPuDAUTHGpsYXl0b25A
a2VybmVsLm9yZwAKCRAADmhBGVaCFQu9D/9ZEYVvoY3s7tK5Cw+My+1U0sNQFUW0
D46kVF0bLxv09G8prI6FZFQORguSMZli3TEsdyUbU12ak3kNHYkTW6AxxnN0jQVE
ur6f5iOauh9mWcv2S9/gulz9KH2uiG1rGHWfClfoZgQI1QCatfaKh9nrEcNrPy71
qpTbcN8AOhhYDP2ZAHajuwTuak1jLBHYN7ihfaUPyex9kAxUalbw5wIsXYfTB+Mn
/xPuCAN9YL5KpuUzVSF7QfWWVEiqErSFeFip+NCS/srihURKr14JuVSwUwkTn8rR
ONWBawXDw+eiWZ1nQmzvPTazbvsQGO1cKx8A20nOLmJjtRqAGlfngDQrCchkoSg7
SwYDiDpDqkg5PTWSfkDMG1wY/tEm6dO89P912kY+jFuGJIrG4HPkOR20StmKkuz5
La4bUs5UwucxxS9Hlu5+aFGrYFtuixWpR/1Wu5ITRtoq+j5vzHHrQ1wkbezNXs0w
VxS7YqW1TDDUUXgBzYOctmQ50ADVVC63J6AbhA7w7Qh+1vpEvyqYcWZw/Vd54bPP
tDh3FFnEGJjX0QqtdpHpyIiOm/BXiJ/8y96vEAxVzXtOmb59AQDCPq99HgWa4G8L
62ltF7qkgXKOsWCaKk07uycE0zISnx1ZXk9bvLFeN02XhRkPxevV93Yn/YiJ7nZe
xcR23eGRsseA/w==
=EKA6
-----END PGP SIGNATURE-----
Merge tag 'iversion-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull i_version updates from Jeff Layton:
"This overhauls how we handle i_version queries from nfsd.
Instead of having special routines and grabbing the i_version field
directly out of the inode in some cases, we've moved most of the
handling into the various filesystems' getattr operations. As a bonus,
this makes ceph's change attribute usable by knfsd as well.
This should pave the way for future work to make this value queryable
by userland, and to make it more resilient against rolling back on a
crash"
* tag 'iversion-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
nfsd: remove fetch_iversion export operation
nfsd: use the getattr operation to fetch i_version
nfsd: move nfsd4_change_attribute to nfsfh.c
ceph: report the inode version in getattr if requested
nfs: report the inode version in getattr if requested
vfs: plumb i_version handling into struct kstat
fs: clarify when the i_version counter must be updated
fs: uninline inode_query_iversion
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmPuC5kTHGpsYXl0b25A
a2VybmVsLm9yZwAKCRAADmhBGVaCFdXpEADFN/loTtcANLPQmLmgmJLDuZr2zKrf
aziMJRjqGMx6BLdwBgX8/XGBNwpG4tkVbI+zdRoVHkpcayMDpLq0dnrvi79a/dGU
fBrI72ZDMd/S9lzbodObHMziLqvgFthsPm9ldVAZ2Kt400KKNE+ozcveiC3yVGy0
n1k5BSt/78abzpqut5whVgJBooHtUMCh3XvBJPKwgOneHfAXCm+jqaXlKKpKlpZj
s2OUyn8BLfNkTgpAZ88L5Rkf0mftjziL6C8KOMy1hvOsyiP0IkwLuQ/kO+2H0Ate
p3tbOGvUT+n1gYpFYBDLnuWB4G8+CVPxfoO6KGhwT4OlCJpPlNCM8O+w/A/dKn4I
858spkpYPMy91lEkcrRLRkg/MARWGTgZ3k76fp3OWNnfruWd6ekMlYKx9n6CIy34
Aoc3Svy9KeA7oRrbRDltmw3UVmz53GcDo337ZL1J6Jph3s86dMG7AwGYvoDfKuKK
b0oNK5db5v50scBnRHX6UWejE5fSnnHvgC7pU57u08odCVEUALB+r8f04vmkxcVJ
Qed7lolQdFzn9ddaOXzpg5KeCe/cX3p4IPZSTHad7CPr8gswmC135DfXCr64x2hC
5jyNzKbe/x+7B2xCweHmEk4ojt8IU3UaYxLJoQkNeVr8rEGC9gqZgkSDe7BnTpOf
wT0ijzhy2u5RKg==
=Zhf3
-----END PGP SIGNATURE-----
Merge tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"The main change here is that I've broken out most of the file locking
definitions into a new header file. I also went ahead and completed
the removal of locks_inode function"
* tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fs: remove locks_inode
filelock: move file locking definitions to separate header file
If the MR allocate failed, the MR recovery work not initialized
and list not cleared. Then will be warning and UAF when release
the MR:
WARNING: CPU: 4 PID: 824 at kernel/workqueue.c:3066 __flush_work.isra.0+0xf7/0x110
CPU: 4 PID: 824 Comm: mount.cifs Not tainted 6.1.0-rc5+ #82
RIP: 0010:__flush_work.isra.0+0xf7/0x110
Call Trace:
<TASK>
__cancel_work_timer+0x2ba/0x2e0
smbd_destroy+0x4e1/0x990
_smbd_get_connection+0x1cbd/0x2110
smbd_get_connection+0x21/0x40
cifs_get_tcp_session+0x8ef/0xda0
mount_get_conns+0x60/0x750
cifs_mount+0x103/0xd00
cifs_smb3_do_mount+0x1dd/0xcb0
smb3_get_tree+0x1d5/0x300
vfs_get_tree+0x41/0xf0
path_mount+0x9b3/0xdd0
__x64_sys_mount+0x190/0x1d0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
BUG: KASAN: use-after-free in smbd_destroy+0x4fc/0x990
Read of size 8 at addr ffff88810b156a08 by task mount.cifs/824
CPU: 4 PID: 824 Comm: mount.cifs Tainted: G W 6.1.0-rc5+ #82
Call Trace:
dump_stack_lvl+0x34/0x44
print_report+0x171/0x472
kasan_report+0xad/0x130
smbd_destroy+0x4fc/0x990
_smbd_get_connection+0x1cbd/0x2110
smbd_get_connection+0x21/0x40
cifs_get_tcp_session+0x8ef/0xda0
mount_get_conns+0x60/0x750
cifs_mount+0x103/0xd00
cifs_smb3_do_mount+0x1dd/0xcb0
smb3_get_tree+0x1d5/0x300
vfs_get_tree+0x41/0xf0
path_mount+0x9b3/0xdd0
__x64_sys_mount+0x190/0x1d0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Allocated by task 824:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0x7a/0x90
_smbd_get_connection+0x1b6f/0x2110
smbd_get_connection+0x21/0x40
cifs_get_tcp_session+0x8ef/0xda0
mount_get_conns+0x60/0x750
cifs_mount+0x103/0xd00
cifs_smb3_do_mount+0x1dd/0xcb0
smb3_get_tree+0x1d5/0x300
vfs_get_tree+0x41/0xf0
path_mount+0x9b3/0xdd0
__x64_sys_mount+0x190/0x1d0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Freed by task 824:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
____kasan_slab_free+0x143/0x1b0
__kmem_cache_free+0xc8/0x330
_smbd_get_connection+0x1c6a/0x2110
smbd_get_connection+0x21/0x40
cifs_get_tcp_session+0x8ef/0xda0
mount_get_conns+0x60/0x750
cifs_mount+0x103/0xd00
cifs_smb3_do_mount+0x1dd/0xcb0
smb3_get_tree+0x1d5/0x300
vfs_get_tree+0x41/0xf0
path_mount+0x9b3/0xdd0
__x64_sys_mount+0x190/0x1d0
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Let's initialize the MR recovery work before MR allocate to prevent
the warning, remove the MRs from the list to prevent the UAF.
Fixes: c739858334 ("CIFS: SMBD: Implement RDMA memory registration")
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
If the MR allocate failed, the smb direct connection info is NULL,
then smbd_destroy() will directly return, then the connection info
will be leaked.
Let's set the smb direct connection info to the server before call
smbd_destroy().
Fixes: c739858334 ("CIFS: SMBD: Implement RDMA memory registration")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
If we did not get a lease we can still return a single use cfid to the caller.
The cfid will not have has_lease set and will thus not be shared with any
other concurrent users and will be freed immediately when the caller
drops the handle.
This avoids extra roundtrips for servers that do not support directory leases
where they would first fail to get a cfid with a lease and then fallback
to try a normal SMB2_open()
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Some servers may return that we got a lease in rsp->OplockLevel
but then in the lease context contradict this and say we got no lease
at all. Thus we need to check the context if we have a lease.
Additionally, If we do not get a lease we need to make sure we close
the handle before we return an error to the caller.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
The kernel is globally removing the ambiguous 0-length and 1-element
arrays in favor of flexible arrays, so that we can gain both compile-time
and run-time array bounds checking[1].
Replace the trailing 1-element array with a flexible array in the
following structures:
struct cifs_spnego_msg
struct cifs_quota_data
struct get_dfs_referral_rsp
struct file_alt_name_info
NEGOTIATE_RSP
SESSION_SETUP_ANDX
TCONX_REQ
TCONX_RSP
TCONX_RSP_EXT
ECHO_REQ
ECHO_RSP
OPEN_REQ
OPENX_REQ
LOCK_REQ
RENAME_REQ
COPY_REQ
COPY_RSP
NT_RENAME_REQ
DELETE_FILE_REQ
DELETE_DIRECTORY_REQ
CREATE_DIRECTORY_REQ
QUERY_INFORMATION_REQ
SETATTR_REQ
TRANSACT_IOCTL_REQ
TRANSACT_CHANGE_NOTIFY_REQ
TRANSACTION2_QPI_REQ
TRANSACTION2_SPI_REQ
TRANSACTION2_FFIRST_REQ
TRANSACTION2_GET_DFS_REFER_REQ
FILE_UNIX_LINK_INFO
FILE_DIRECTORY_INFO
FILE_FULL_DIRECTORY_INFO
SEARCH_ID_FULL_DIR_INFO
FILE_BOTH_DIRECTORY_INFO
FIND_FILE_STANDARD_INFO
Replace the trailing 1-element array with a flexible array, but leave
the existing structure padding:
FILE_ALL_INFO
FILE_UNIX_INFO
Remove unused structures:
struct gea
struct gealist
Adjust all related size calculations to match the changes to sizeof().
No machine code output differences are produced after these changes.
[1] For lots of details, see both:
https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrayshttps://people.kernel.org/kees/bounded-flexible-arrays-in-c
Cc: Steve French <sfrench@samba.org>
Cc: Paulo Alcantara <pc@cjr.nz>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The kernel is globally removing the ambiguous 0-length and 1-element
arrays in favor of flexible arrays, so that we can gain both compile-time
and run-time array bounds checking[1].
While struct fealist is defined as a "fake" flexible array (via a
1-element array), it is only used for examination of the first array
element. Walking the list is performed separately, so there is no reason
to treat the "list" member of struct fealist as anything other than a
single entry. Adjust the struct and code to match.
Additionally, struct fea uses the "name" member either as a dynamic
string, or is manually calculated from the start of the struct. Redefine
the member as a flexible array.
No machine code output differences are produced after these changes.
[1] For lots of details, see both:
https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrayshttps://people.kernel.org/kees/bounded-flexible-arrays-in-c
Cc: Steve French <sfrench@samba.org>
Cc: Paulo Alcantara <pc@cjr.nz>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The client was sending rfc1002 session request packet with a wrong
length field set, therefore failing to mount shares against old SMB
servers over port 139.
Fix this by calculating the correct length as specified in rfc1002.
Fixes: d7173623bf ("cifs: use ALIGN() and round_up() macros")
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Use a struct assignment with implicit member initialization
Signed-off-by: Volker Lendecke <vl@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Due to the 2bytes of padding from the smb2 tree connect request,
there is an unneeded difference between the rfc1002 length and the actual
frame length. In the case of windows client, it is sent by matching it
exactly.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Avoids many calls to compound_head() and removes calls to various
compat functions.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
oparms was not fully initialized
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
The aim of using encryption on a connection is to keep
the data confidential, so we must not use plaintext rdma offload
for that data!
It seems that current windows servers and ksmbd would allow
this, but that's no reason to expose the users data in plaintext!
And servers hopefully reject this in future.
Note modern windows servers support signed or encrypted offload,
see MS-SMB2 2.2.3.1.6 SMB2_RDMA_TRANSFORM_CAPABILITIES, but we don't
support that yet.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
We should have the logic to decide if we want rdma offload
in a single spot in order to advance it in future.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
This will simplify the following changes and makes it easy to get
in passed in from the caller in future.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Just have @skip set to 0 after first iterations of the two nested
loops.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Make sure to get an up-to-date TCP_Server_Info::nr_targets value prior
to waiting the server to be reconnected in smb2_reconnect(). It is
set in cifs_tcp_ses_needs_reconnect() and protected by
TCP_Server_Info::srv_lock.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
The options that are displayed for the smb3.1.1/cifs client
in "make menuconfig" are confusing because some of them are
not indented making them not appear to be related to cifs.ko
Fix that by adding an if/endif (similar to what ceph and 9pm did)
if fs/cifs/Kconfig
Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
There were various outdated or missing things in fs/cifs/Kconfig
e.g. mention of support for insecure NTLM which has been removed,
and lack of mention of some important features. This also shortens
it slightly, and fixes some confusing text (e.g. the SMB1 POSIX
extensions option).
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
In the smb2_get_aead_req() the skip variable is used only for
the very first iteration of the two nested loops, which means
it's basically in invariant to those loops. Hence, instead of
using conditional on each iteration, unconditionally assign
the 'skip' variable before the loops and at the end of the
inner loop.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
We store the last updated time for interface list while
parsing the interfaces. This change is to just print that
info in DebugData.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.
In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.
While at it, include the corresponding header file (<linux/kstrtox.h>)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Paulo Alcantara <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
The pointer dentry is assigned a value that is never read, the
assignment is redundant and can be removed.
Cleans up clang-scan warning:
fs/nfsd/nfsctl.c:1231:2: warning: Value stored to 'dentry' is
never read [deadcode.DeadStores]
dentry = ERR_PTR(ret);
No need to initialize "int ret = -ENOMEM;" either.
These are vestiges of nfsd_mkdir(), from whence I copied
nfsd_symlink().
Reported-by: Colin Ian King <colin.i.king@gmail.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Currently, we're only memcpy'ing the first __be32. Ensure we copy into
both words.
Fixes: 91d2e9b56c ("NFSD: Clean up the nfsd_net::nfssvc_boot field")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Most of the time, NFSv4 clients issue a COMMIT before the final CLOSE of
an open stateid, so with NFSv4, the fsync in the nfsd_file_free path is
usually a no-op and doesn't block.
We have a customer running knfsd over very slow storage (XFS over Ceph
RBD). They were using the "async" export option because performance was
more important than data integrity for this application. That export
option turns NFSv4 COMMIT calls into no-ops. Due to the fsync in this
codepath however, their final CLOSE calls would still stall (since a
CLOSE effectively became a COMMIT).
I think this fsync is not strictly necessary. We only use that result to
reset the write verifier. Instead of fsync'ing all of the data when we
free an nfsd_file, we can just check for writeback errors when one is
acquired and when it is freed.
If the client never comes back, then it'll never see the error anyway
and there is no point in resetting it. If an error occurs after the
nfsd_file is removed from the cache but before the inode is evicted,
then it will reset the write verifier on the next nfsd_file_acquire,
(since there will be an unseen error).
The only exception here is if something else opens and fsyncs the file
during that window. Given that local applications work with this
limitation today, I don't see that as an issue.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2166658
Fixes: ac3a2585f0 ("nfsd: rework refcounting in filecache")
Reported-and-tested-by: Pierguido Lambri <plambri@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The nested if statements here make no sense, as you can never reach
"else" branch in the nested statement. Fix the error handling for
when there is a courtesy client that holds a conflicting deny mode.
Fixes: 3d69427151 ("NFSD: add support for share reservation conflict to courteous server")
Reported-by: 張智諺 <cc85nod@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
When nfsd4_copy fails to allocate memory for async_copy->cp_src, or
nfs4_init_copy_state fails, it calls cleanup_async_copy to do the
cleanup for the async_copy which causes page fault since async_copy
is not yet initialized.
This patche rearranges the order of initializing the fields in
async_copy and adds checks in cleanup_async_copy to skip un-initialized
fields.
Fixes: ce0887ac96 ("NFSD add nfs4 inter ssc to nfsd4_copy")
Fixes: 87689df694 ("NFSD: Shrink size of struct nfsd4_copy")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Its possible for __break_lease to find the layout's lease before we've
added the layout to the owner's ls_layouts list. In that case, setting
ls_recalled = true without actually recalling the layout will cause the
server to never send a recall callback.
Move the check for ls_layouts before setting ls_recalled.
Fixes: c5c707f96f ("nfsd: implement pNFS layout recalls")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
We had a bug report that xfstest generic/355 was failing on NFSv4.0.
This test sets various combinations of setuid/setgid modes and tests
whether DIO writes will cause them to be stripped.
What I found was that the server did properly strip those bits, but
the client didn't notice because it held a delegation that was not
recalled. The recall didn't occur because the client itself was the
one generating the activity and we avoid recalls in that case.
Clearing setuid bits is an "implicit" activity. The client didn't
specifically request that we do that, so we need the server to issue a
CB_RECALL, or avoid the situation entirely by not issuing a delegation.
The easiest fix here is to simply not give out a delegation if the file
is being opened for write, and the mode has the setuid and/or setgid bit
set. Note that there is a potential race between the mode and lease
being set, so we test for this condition both before and after setting
the lease.
This patch fixes generic/355, generic/683 and generic/684 for me. (Note
that 355 fails only on v4.0, and 683 and 684 require NFSv4.2 to run and
fail).
Reported-by: Boyang Xue <bxue@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The reference count of nfsd4_ssc_umount_item is not decremented
on error conditions. This prevents the laundromat from unmounting
the vfsmount of the source file.
This patch decrements the reference count of nfsd4_ssc_umount_item
on error.
Fixes: f4e44b3933 ("NFSD: delay unmount source's export after inter-server copy completed.")
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
There are two different flavors of the nfsd4_copy struct. One is
embedded in the compound and is used directly in synchronous copies. The
other is dynamically allocated, refcounted and tracked in the client
struture. For the embedded one, the cleanup just involves releasing any
nfsd_files held on its behalf. For the async one, the cleanup is a bit
more involved, and we need to dequeue it from lists, unhash it, etc.
There is at least one potential refcount leak in this code now. If the
kthread_create call fails, then both the src and dst nfsd_files in the
original nfsd4_copy object are leaked.
The cleanup in this codepath is also sort of weird. In the async copy
case, we'll have up to four nfsd_file references (src and dst for both
flavors of copy structure). They are both put at the end of
nfsd4_do_async_copy, even though the ones held on behalf of the embedded
one outlive that structure.
Change it so that we always clean up the nfsd_file refs held by the
embedded copy structure before nfsd4_copy returns. Rework
cleanup_async_copy to handle both inter and intra copies. Eliminate
nfsd4_cleanup_intra_ssc since it now becomes a no-op.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
At first, I thought this might be a source of nfsd_file overputs, but
the current callers seem to avoid an extra put when nfsd4_verify_copy
returns an error.
Still, it's "bad form" to leave the pointers filled out when we don't
have a reference to them anymore, and that might lead to bugs later.
Zero them out as a defensive coding measure.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
This is wrapper is pointless, and just obscures what's going on.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
We're not doing any blocking operations for OP_OFFLOAD_STATUS, so taking
and putting a reference is a waste of effort. Take the client lock,
search for the copy and fetch the wr_bytes_written field and return.
Also, make find_async_copy a static function.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Now that I've added a file under /proc/net/rpc that is managed by
the SunRPC's Kerberos mechanism, replace NFSD's
supported_krb5_enctypes file with a symlink to the new SunRPC proc
file, which contains exactly the same content.
Remarkably, commit b0b0c0a26e ("nfsd: add proc file listing
kernel's gss_krb5 enctypes") added the nfsd_supported_krb5_enctypes
file in 2011, but this file has never been documented in nfsd(7).
Tested-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
There's no need to start the reply cache before nfsd is up and running,
and doing so means that we register a shrinker for every net namespace
instead of just the ones where nfsd is running.
Move it to the per-net nfsd startup instead.
Reported-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Currently, svcauth_gss_accept() pre-reserves response buffer space
for the RPC payload length and GSS sequence number before returning
to the dispatcher, which then adds the header's accept_stat field.
The problem is the accept_stat field is supposed to go before the
length and seq_num fields. So svcauth_gss_release() has to relocate
the accept_stat value (see svcauth_gss_prepare_to_wrap()).
To enable these fields to be added to the response buffer in the
correct (final) order, the pointer to the accept_stat has to be made
available to svcauth_gss_accept() so that it can set it before
reserving space for the length and seq_num fields.
As a first step, move the pointer to the location of the accept_stat
field into struct svc_rqst.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Now that all vs_dispatch functions invoke svcxdr_init_encode(), it
is common code and can be pushed down into the generic RPC server.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The WARN_ON_ONCE check is not terribly useful. It also seems possible
for nfs4_find_file to race with the destruction of an fi_deleg_file
while trying to take a reference to it.
Now that it's safe to pass nfs_get_file a NULL pointer, remove the WARN
and NULL pointer check. Take the fi_lock when fetching fi_deleg_file.
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
...and remove some now-useless NULL pointer checks in its callers.
Suggested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Currently nfsd4_setup_inter_ssc returns the vfsmount of the source
server's export when the mount completes. After the copy is done
nfsd4_cleanup_inter_ssc is called with the vfsmount of the source
server and it searches nfsd_ssc_mount_list for a matching entry
to do the clean up.
The problems with this approach are (1) the need to search the
nfsd_ssc_mount_list and (2) the code has to handle the case where
the matching entry is not found which looks ugly.
The enhancement is instead of nfsd4_setup_inter_ssc returning the
vfsmount, it returns the nfsd4_ssc_umount_item which has the
vfsmount embedded in it. When nfsd4_cleanup_inter_ssc is called
it's passed with the nfsd4_ssc_umount_item directly to do the
clean up so no searching is needed and there is no need to handle
the 'not found' case.
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
[ cel: adjusted whitespace and variable/function names ]
Reviewed-by: Olga Kornievskaia <kolga@netapp.com>
Now that all vs_dispatch functions invoke svcxdr_init_decode(), it
is common code and can be pushed down into the generic RPC server.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Now with NFSD being able to cross into auto mounts,
the check can be removed.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Ian Kent <raven@themaw.net>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>