Граф коммитов

78791 Коммитов

Автор SHA1 Сообщение Дата
Darrick J. Wong 921ed96b4f xfs: actually abort log recovery on corrupt intent-done log items
If log recovery picks up intent-done log items that are not of the
correct size it needs to abort recovery and fail the mount.  Debug
assertions are not good enough.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:58:20 -07:00
Darrick J. Wong 3c5aaaced9 xfs: refactor all the EFI/EFD log item sizeof logic
Refactor all the open-coded sizeof logic for EFI/EFD log item and log
format structures into common helper functions whose names reflect the
struct names.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:58:20 -07:00
Darrick J. Wong 03a7485cd7 xfs: fix memcpy fortify errors in EFI log format copying
Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of
memcpy.  Since we're already fixing problems with BUI item copying, we
should fix it everything else.

An extra difficulty here is that the ef[id]_extents arrays are declared
as single-element arrays.  This is not the convention for flex arrays in
the modern kernel, and it causes all manner of problems with static
checking tools, since they often cannot tell the difference between a
single element array and a flex array.

So for starters, change those array[1] declarations to array[]
declarations to signal that they are proper flex arrays and adjust all
the "size-1" expressions to fit the new declaration style.

Next, refactor the xfs_efi_copy_format function to handle the copying of
the head and the flex array members separately.  While we're at it, fix
a minor validation deficiency in the recovery function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:58:20 -07:00
Darrick J. Wong b45ca961e9 xfs: fix memcpy fortify errors in RUI log format copying
Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of
memcpy.  Since we're already fixing problems with BUI item copying, we
should fix it everything else.

Refactor the xfs_rui_copy_format function to handle the copying of the
head and the flex array members separately.  While we're at it, fix a
minor validation deficiency in the recovery function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:58:19 -07:00
Darrick J. Wong a38935c03c xfs: fix memcpy fortify errors in CUI log format copying
Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of
memcpy.  Since we're already fixing problems with BUI item copying, we
should fix it everything else.

Refactor the xfs_cui_copy_format function to handle the copying of the
head and the flex array members separately.  While we're at it, fix a
minor validation deficiency in the recovery function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:58:19 -07:00
Darrick J. Wong a38ebce1da xfs: fix memcpy fortify errors in BUI log format copying
Starting in 6.1, CONFIG_FORTIFY_SOURCE checks the length parameter of
memcpy.  Unfortunately, it doesn't handle flex arrays correctly:

------------[ cut here ]------------
memcpy: detected field-spanning write (size 48) of single field "dst_bui_fmt" at fs/xfs/xfs_bmap_item.c:628 (size 16)

Fix this by refactoring the xfs_bui_copy_format function to handle the
copying of the head and the flex array members separately.  While we're
at it, fix a minor validation deficiency in the recovery function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:58:19 -07:00
Darrick J. Wong 59da7ff49d xfs: fix validation in attr log item recovery
Before we start fixing all the complaints about memcpy'ing log items
around, let's fix some inadequate validation in the xattr log item
recovery code and get rid of the (now trivial) copy_format function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:58:19 -07:00
Filipe Manana 8184620ae2 btrfs: fix lost file sync on direct IO write with nowait and dsync iocb
When doing a direct IO write using a iocb with nowait and dsync set, we
end up not syncing the file once the write completes.

This is because we tell iomap to not call generic_write_sync(), which
would result in calling btrfs_sync_file(), in order to avoid a deadlock
since iomap can call it while we are holding the inode's lock and
btrfs_sync_file() needs to acquire the inode's lock. The deadlock happens
only if the write happens synchronously, when iomap_dio_rw() calls
iomap_dio_complete() before it returns. Instead we do the sync ourselves
at btrfs_do_write_iter().

For a nowait write however we can end up not doing the sync ourselves at
at btrfs_do_write_iter() because the write could have been queued, and
therefore we get -EIOCBQUEUED returned from iomap in such case. That makes
us skip the sync call at btrfs_do_write_iter(), as we don't do it for
any error returned from btrfs_direct_write(). We can't simply do the call
even if -EIOCBQUEUED is returned, since that would block the task waiting
for IO, both for the data since there are bios still in progress as well
as potentially blocking when joining a log transaction and when syncing
the log (writing log trees, super blocks, etc).

So let iomap do the sync call itself and in order to avoid deadlocks for
the case of synchronous writes (without nowait), use __iomap_dio_rw() and
have ourselves call iomap_dio_complete() after unlocking the inode.

A test case will later be sent for fstests, after this is fixed in Linus'
tree.

Fixes: 51bd9563b6 ("btrfs: fix deadlock due to page faults during direct IO reads and writes")
Reported-by: Марк Коренберг <socketpair@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CAEmTpZGRKbzc16fWPvxbr6AfFsQoLmz-Lcg-7OgJOZDboJ+SGQ@mail.gmail.com/
CC: stable@vger.kernel.org # 6.0+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-31 16:52:56 +01:00
Darrick J. Wong 47ba8cc7b4 xfs: fix incorrect return type for fsdax fault handlers
The kernel robot complained about this:

>> fs/xfs/xfs_file.c:1266:31: sparse: sparse: incorrect type in return expression (different base types) @@     expected int @@     got restricted vm_fault_t @@
   fs/xfs/xfs_file.c:1266:31: sparse:     expected int
   fs/xfs/xfs_file.c:1266:31: sparse:     got restricted vm_fault_t
   fs/xfs/xfs_file.c:1314:21: sparse: sparse: incorrect type in assignment (different base types) @@     expected restricted vm_fault_t [usertype] ret @@     got int @@
   fs/xfs/xfs_file.c:1314:21: sparse:     expected restricted vm_fault_t [usertype] ret
   fs/xfs/xfs_file.c:1314:21: sparse:     got int

Fix the incorrect return type for these two functions.

While we're at it, make the !fsdax version return VM_FAULT_SIGBUS
because a zero return value will cause some callers to try to lock
vmf->page, which we never set here.

Fixes: ea6c49b784 ("xfs: support CoW in fsdax mode")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-10-31 08:51:45 -07:00
Christophe JAILLET 063b1f21cc btrfs: fix a memory allocation failure test in btrfs_submit_direct
After allocation 'dip' is tested instead of 'dip->csums'.  Fix it.

Fixes: 642c5d34da ("btrfs: allocate the btrfs_dio_private as part of the iomap dio bio")
CC: stable@vger.kernel.org # 5.19+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-31 16:50:15 +01:00
Linus Torvalds 28b7bd4ad2 3 cifs/smb3 fixes (also for stable)
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmNd32gACgkQiiy9cAdy
 T1E/FQwAqaALaUBM+PHcqHnOKhflYf7+DjkClKcbKSuQ6kN9R5ajHk6pi7EEcexx
 eDNvNszxNm+eEGmSRMoH10cQ3U0C68Dtip1vEyBmywCRBk7QWKAvGDF2JzzS92Cv
 rz4IrZfBGDS5kxeKHAHMaGBy4xnU+4yeVFkcgESltv+g3+C2wLmwL72oeI/ttkIg
 +Tmr2EQLKG/FIxobLZePc90fWUg6vvUM3u0HwK0bzW2ZtkrxTa8/RU2ziNCbOOQN
 VVUEq9FlEVf+71TLa+N4fJStBWQWqldX197Fk15C4on7zcT05wVITXro2CYAPvhV
 ZwROwggSu0jCPiohVkrg4lQjVAFXejE/GNv8c1casliWbxaixpfLC2czONU0PItj
 lTpntaX+fxUIblhMgCsJNxpgYpgeyTS1XyC9kqmt2tsOAgkrgUx7wKtreI+M18yC
 GXPbXkczAEq73pkfkvBRd1UcWuoUPi2ex6UG6oQQO6DnFQwIBYNlMGGIHwEG3VoI
 kvdtwhi8
 =/RPn
 -----END PGP SIGNATURE-----

Merge tag '6.1-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:

 - use after free fix for reconnect race

 - two memory leak fixes

* tag '6.1-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix use-after-free caused by invalid pointer `hostname`
  cifs: Fix pages leak when writedata alloc failed in cifs_write_from_iter()
  cifs: Fix pages array leak when writedata alloc failed in cifs_writedata_alloc()
2022-10-30 09:40:04 -07:00
Linus Torvalds 3c339dbd13 23 hotfixes.
Eight fix pre-6.0 bugs and the remainder address issues which were
 introduced in the 6.1-rc merge cycle, or address issues which aren't
 considered sufficiently serious to warrant a -stable backport.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY1w/LAAKCRDdBJ7gKXxA
 jovHAQDqY3TGAVQsvCBKdUqkp5nakZ7o7kK+mUGvsZ8Cgp5fwQD/Upsu93RZsTgm
 oJfYW4W6eSVEKPu7oAY20xVwLvK6iQ0=
 =z0Fn
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2022-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "Eight fix pre-6.0 bugs and the remainder address issues which were
  introduced in the 6.1-rc merge cycle, or address issues which aren't
  considered sufficiently serious to warrant a -stable backport"

* tag 'mm-hotfixes-stable-2022-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  mm: multi-gen LRU: move lru_gen_add_mm() out of IRQ-off region
  lib: maple_tree: remove unneeded initialization in mtree_range_walk()
  mmap: fix remap_file_pages() regression
  mm/shmem: ensure proper fallback if page faults
  mm/userfaultfd: replace kmap/kmap_atomic() with kmap_local_page()
  x86: fortify: kmsan: fix KMSAN fortify builds
  x86: asm: make sure __put_user_size() evaluates pointer once
  Kconfig.debug: disable CONFIG_FRAME_WARN for KMSAN by default
  x86/purgatory: disable KMSAN instrumentation
  mm: kmsan: export kmsan_copy_page_meta()
  mm: migrate: fix return value if all subpages of THPs are migrated successfully
  mm/uffd: fix vma check on userfault for wp
  mm: prep_compound_tail() clear page->private
  mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs
  mm/page_isolation: fix clang deadcode warning
  fs/ext4/super.c: remove unused `deprecated_msg'
  ipc/msg.c: fix percpu_counter use after free
  memory tier, sysfs: rename attribute "nodes" to "nodelist"
  MAINTAINERS: git://github.com -> https://github.com for nilfs2
  mm/kmemleak: prevent soft lockup in kmemleak_scan()'s object iteration loops
  ...
2022-10-29 17:49:33 -07:00
Sebastian Andrzej Siewior dda1c41a07 mm: multi-gen LRU: move lru_gen_add_mm() out of IRQ-off region
lru_gen_add_mm() has been added within an IRQ-off region in the commit
mentioned below.  The other invocations of lru_gen_add_mm() are not within
an IRQ-off region.

The invocation within IRQ-off region is problematic on PREEMPT_RT because
the function is using a spin_lock_t which must not be used within
IRQ-disabled regions.

The other invocations of lru_gen_add_mm() occur while
task_struct::alloc_lock is acquired.  Move lru_gen_add_mm() after
interrupts are enabled and before task_unlock().

Link: https://lkml.kernel.org/r/20221026134830.711887-1-bigeasy@linutronix.de
Fixes: bd74fdaea1 ("mm: multi-gen LRU: support page table walks")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Yu Zhao <yuzhao@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28 13:37:23 -07:00
Andrew Morton bb2282cf01 fs/ext4/super.c: remove unused `deprecated_msg'
fs/ext4/super.c:1744:19: warning: 'deprecated_msg' defined but not used [-Wunused-const-variable=]

Reported-by: kernel test robot <lkp@intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28 13:37:22 -07:00
Phillip Lougher e11c4e088b squashfs: fix buffer release race condition in readahead code
Fix a buffer release race condition, where the error value was used after
release.

Link: https://lkml.kernel.org/r/20221020223616.7571-4-phillip@squashfs.org.uk
Fixes: b09a7a036d ("squashfs: support reading fragments in readahead call")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reported-by: Marc Miltenberger <marcmiltenberger@gmail.com>
Cc: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: Slade Watkins <srw@sladewatkins.net>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28 13:37:21 -07:00
Phillip Lougher c9199de82b squashfs: fix extending readahead beyond end of file
The readahead code will try to extend readahead to the entire size of the
Squashfs data block.

But, it didn't take into account that the last block at the end of the
file may not be a whole block.  In this case, the code would extend
readahead to beyond the end of the file, leaving trailing pages.

Fix this by only requesting the expected number of pages.

Link: https://lkml.kernel.org/r/20221020223616.7571-3-phillip@squashfs.org.uk
Fixes: 8fc78b6fe2 ("squashfs: implement readahead")
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reported-by: Marc Miltenberger <marcmiltenberger@gmail.com>
Cc: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: Slade Watkins <srw@sladewatkins.net>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: <stable@vger.kernel.org>

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28 13:37:21 -07:00
Phillip Lougher 9ef8eb6104 squashfs: fix read regression introduced in readahead code
Patch series "squashfs: fix some regressions introduced in the readahead
code".

This patchset fixes 3 regressions introduced by the recent readahead code
changes.  The first regression is causing "snaps" to randomly fail after a
couple of hours or days, which how the regression came to light.


This patch (of 3):

If a file isn't a whole multiple of the page size, the last page will have
trailing bytes unfilled.

There was a mistake in the readahead code which did this.  In particular
it incorrectly assumed that the last page in the readahead page array
(page[nr_pages - 1]) will always contain the last page in the block, which
if we're at file end, will be the page that needs to be zero filled.

But the readahead code may not return the last page in the block, which
means it is unmapped and will be skipped by the decompressors (a temporary
buffer used).

In this case the zero filling code will zero out the wrong page, leading
to data corruption.

Fix this by by extending the "page actor" to return the last page if
present, or NULL if a temporary buffer was used.

Link: https://lkml.kernel.org/r/20221020223616.7571-1-phillip@squashfs.org.uk
Link: https://lkml.kernel.org/r/20221020223616.7571-2-phillip@squashfs.org.uk
Fixes: 8fc78b6fe2 ("squashfs: implement readahead")
Link: https://lore.kernel.org/lkml/b0c258c3-6dcf-aade-efc4-d62a8b3a1ce2@alu.unizg.hr/
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Tested-by: Slade Watkins <srw@sladewatkins.net>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reported-by: Marc Miltenberger <marcmiltenberger@gmail.com>
Cc: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Cc: Hsin-Yi Wang <hsinyi@chromium.org>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-28 13:37:21 -07:00
Miklos Szeredi 4a6f278d48 fuse: add file_modified() to fallocate
Add missing file_modified() call to fuse_file_fallocate().  Without this
fallocate on fuse failed to clear privileges.

Fixes: 05ba1f0823 ("fuse: add FALLOCATE operation")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-10-28 14:25:20 +02:00
Zeng Heng 153695d36e cifs: fix use-after-free caused by invalid pointer `hostname`
`hostname` needs to be set as null-pointer after free in
`cifs_put_tcp_session` function, or when `cifsd` thread attempts
to resolve hostname and reconnect the host, the thread would deref
the invalid pointer.

Here is one of practical backtrace examples as reference:

Task 477
---------------------------
 do_mount
  path_mount
   do_new_mount
    vfs_get_tree
     smb3_get_tree
      smb3_get_tree_common
       cifs_smb3_do_mount
        cifs_mount
         mount_put_conns
          cifs_put_tcp_session
          --> kfree(server->hostname)

cifsd
---------------------------
 kthread
  cifs_demultiplex_thread
   cifs_reconnect
    reconn_set_ipaddr_from_hostname
    --> if (!server->hostname)
    --> if (server->hostname[0] == '\0')  // !! UAF fault here

CIFS: VFS: cifs_mount failed w/return code = -112
mount error(112): Host is down
BUG: KASAN: use-after-free in reconn_set_ipaddr_from_hostname+0x2ba/0x310
Read of size 1 at addr ffff888108f35380 by task cifsd/480
CPU: 2 PID: 480 Comm: cifsd Not tainted 6.1.0-rc2-00106-gf705792f89dd-dirty #25
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x68/0x85
 print_report+0x16c/0x4a3
 kasan_report+0x95/0x190
 reconn_set_ipaddr_from_hostname+0x2ba/0x310
 __cifs_reconnect.part.0+0x241/0x800
 cifs_reconnect+0x65f/0xb60
 cifs_demultiplex_thread+0x1570/0x2570
 kthread+0x2c5/0x380
 ret_from_fork+0x22/0x30
 </TASK>
Allocated by task 477:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x7e/0x90
 __kmalloc_node_track_caller+0x52/0x1b0
 kstrdup+0x3b/0x70
 cifs_get_tcp_session+0xbc/0x19b0
 mount_get_conns+0xa9/0x10c0
 cifs_mount+0xdf/0x1970
 cifs_smb3_do_mount+0x295/0x1660
 smb3_get_tree+0x352/0x5e0
 vfs_get_tree+0x8e/0x2e0
 path_mount+0xf8c/0x1990
 do_mount+0xee/0x110
 __x64_sys_mount+0x14b/0x1f0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 477:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x2a/0x50
 __kasan_slab_free+0x10a/0x190
 __kmem_cache_free+0xca/0x3f0
 cifs_put_tcp_session+0x30c/0x450
 cifs_mount+0xf95/0x1970
 cifs_smb3_do_mount+0x295/0x1660
 smb3_get_tree+0x352/0x5e0
 vfs_get_tree+0x8e/0x2e0
 path_mount+0xf8c/0x1990
 do_mount+0xee/0x110
 __x64_sys_mount+0x14b/0x1f0
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
The buggy address belongs to the object at ffff888108f35380
 which belongs to the cache kmalloc-16 of size 16
The buggy address is located 0 bytes inside of
 16-byte region [ffff888108f35380, ffff888108f35390)
The buggy address belongs to the physical page:
page:00000000333f8e58 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888108f350e0 pfn:0x108f35
flags: 0x200000000000200(slab|node=0|zone=2)
raw: 0200000000000200 0000000000000000 dead000000000122 ffff8881000423c0
raw: ffff888108f350e0 000000008080007a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
 ffff888108f35280: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
 ffff888108f35300: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
>ffff888108f35380: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
                   ^
 ffff888108f35400: fa fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff888108f35480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Fixes: 7be3248f31 ("cifs: To match file servers, make sure the server hostname matches")
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-27 23:59:13 -05:00
Theodore Ts'o 9a8c5b0d06 ext4: update the backup superblock's at the end of the online resize
When expanding a file system using online resize, various fields in
the superblock (e.g., s_blocks_count, s_inodes_count, etc.) change.
To update the backup superblocks, the online resize uses the function
update_backups() in fs/ext4/resize.c.  This function was not updating
the checksum field in the backup superblocks.  This wasn't a big deal
previously, because e2fsck didn't care about the checksum field in the
backup superblock.  (And indeed, update_backups() goes all the way
back to the ext3 days, well before we had support for metadata
checksums.)

However, there is an alternate, more general way of updating
superblock fields, ext4_update_primary_sb() in fs/ext4/ioctl.c.  This
function does check the checksum of the backup superblock, and if it
doesn't match will mark the file system as corrupted.  That was
clearly not the intent, so avoid to aborting the resize when a bad
superblock is found.

In addition, teach update_backups() to properly update the checksum in
the backup superblocks.  We will eventually want to unify
updapte_backups() with the infrasture in ext4_update_primary_sb(), but
that's for another day.

Note: The problem has been around for a while; it just didn't really
matter until ext4_update_primary_sb() was added by commit bbc605cdb1
("ext4: implement support for get/set fs label").  And it became
trivially easy to reproduce after commit 827891a38a ("ext4: update
the s_overhead_clusters in the backup sb's when resizing") in v6.0.

Cc: stable@kernel.org # 5.17+
Fixes: bbc605cdb1 ("ext4: implement support for get/set fs label")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-10-27 23:21:40 -04:00
Linus Torvalds 7dd257d02e execve fixes for v6.1-rc3
- Fix an ancient signal action copy race. (Bernd Edlinger)
 
 - Fix a memory leak in ELF loader, when under memory pressure. (Li Zetao)
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmNa1xEWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJoLqD/927ZXWxVLQ0GygmNz3xSEZh+5c
 34flrZv4LUDQPw1rNXycWx2D5MQv5MehrpsMvF+11pu/M1EP3e3+R3bngFeFXtBo
 12ov3yEloe6yA8bOPPWEDB1fU8K7C9aODKMcJOoWFCk20g7uQGYS8+GCUGhLxjHs
 mZn5U8OuEGGvn4QuGknIps+Ddca2SHuJ7jBtsw8NVjuvtWcAhlw9PYNbLTJEgBzU
 0zsfK68idMpQHDPvWMmoRcwAXn3kiVzc3wKeR9Zdx9q2NyDIS+OxgynEAc3fM2rf
 ag19+Epn6GUGPMakS/zJNQS0wCA4+pJi60Z+Hlddy0WNUocg55uHd0zY7xcT3s75
 rsPtbTeabOrtzQMf7lSpsn5OUeCDJjc3KcZIlmILaZaVXUZv+jvysRwH7CRdDNNS
 gM2j9nu87I8TbSPXbY79KutvucfKAl88iWxRgFqnzyqzRYLWahwWSKsiVubH7OoU
 kUYdDdPmiZh7XAqTFUsMF4++wyx/PAwU7RdYuxaUvHZd6PT8J92AqIisPwRT9ojL
 oqLpgRoeYX3JY7aDyvBjYan2IKfIPhB0WZF9vCeHVoTXoEy/LVZeWVNoBXyO6ILl
 BYzBAjp5oJRLbJYVtjI4/gkDizdtpAu8YYRYX36TUvBAkFqpGYn9dvySpMGl24uJ
 g3IEqTj/kajeZleHnQ==
 =dHXB
 -----END PGP SIGNATURE-----

Merge tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve fixes from Kees Cook:

 - Fix an ancient signal action copy race (Bernd Edlinger)

 - Fix a memory leak in ELF loader, when under memory pressure (Li
   Zetao)

* tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  fs/binfmt_elf: Fix memory leak in load_elf_binary()
  exec: Copy oldsighand->action under spin-lock
2022-10-27 13:16:36 -07:00
Zhang Xiaoxu 7e8436728e nfs4: Fix kmemleak when allocate slot failed
If one of the slot allocate failed, should cleanup all the other
allocated slots, otherwise, the allocated slots will leak:

  unreferenced object 0xffff8881115aa100 (size 64):
    comm ""mount.nfs"", pid 679, jiffies 4294744957 (age 115.037s)
    hex dump (first 32 bytes):
      00 cc 19 73 81 88 ff ff 00 a0 5a 11 81 88 ff ff  ...s......Z.....
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<000000007a4c434a>] nfs4_find_or_create_slot+0x8e/0x130
      [<000000005472a39c>] nfs4_realloc_slot_table+0x23f/0x270
      [<00000000cd8ca0eb>] nfs40_init_client+0x4a/0x90
      [<00000000128486db>] nfs4_init_client+0xce/0x270
      [<000000008d2cacad>] nfs4_set_client+0x1a2/0x2b0
      [<000000000e593b52>] nfs4_create_server+0x300/0x5f0
      [<00000000e4425dd2>] nfs4_try_get_tree+0x65/0x110
      [<00000000d3a6176f>] vfs_get_tree+0x41/0xf0
      [<0000000016b5ad4c>] path_mount+0x9b3/0xdd0
      [<00000000494cae71>] __x64_sys_mount+0x190/0x1d0
      [<000000005d56bdec>] do_syscall_64+0x35/0x80
      [<00000000687c9ae4>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: abf79bb341 ("NFS: Add a slot table to struct nfs_client for NFSv4.0 transport blocking")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:11 -04:00
Benjamin Coddington 038efb6348 NFSv4.2: Fixup CLONE dest file size for zero-length count
When holding a delegation, the NFS client optimizes away setting the
attributes of a file from the GETATTR in the compound after CLONE, and for
a zero-length CLONE we will end up setting the inode's size to zero in
nfs42_copy_dest_done().  Handle this case by computing the resulting count
from the server's reported size after CLONE's GETATTR.

Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 94d202d5ca ("NFSv42: Copy offload should update the file size when appropriate")
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Benjamin Coddington f5ea16137a NFSv4: Retry LOCK on OLD_STATEID during delegation return
There's a small window where a LOCK sent during a delegation return can
race with another OPEN on client, but the open stateid has not yet been
updated.  In this case, the client doesn't handle the OLD_STATEID error
from the server and will lose this lock, emitting:
"NFS: nfs4_handle_delegation_recall_error: unhandled error -10024".

Fix this by sending the task through the nfs4 error handling in
nfs4_lock_done() when we may have to reconcile our stateid with what the
server believes it to be.  For this case, the result is a retry of the
LOCK operation with the updated stateid.

Reported-by: Gonzalo Siero Humet <gsierohu@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Trond Myklebust e59679f2b7 NFSv4.1: We must always send RECLAIM_COMPLETE after a reboot
Currently, we are only guaranteed to send RECLAIM_COMPLETE if we have
open state to recover. Fix the client to always send RECLAIM_COMPLETE
after setting up the lease.

Fixes: fce5c838e1 ("nfs41: RECLAIM_COMPLETE functionality")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Trond Myklebust 5d917cba32 NFSv4.1: Handle RECLAIM_COMPLETE trunking errors
If RECLAIM_COMPLETE sets the NFS4CLNT_BIND_CONN_TO_SESSION flag, then we
need to loop back in order to handle it.

Fixes: 0048fdd066 ("NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Trond Myklebust 1ba04394e0 NFSv4: Fix a potential state reclaim deadlock
If the server reboots while we are engaged in a delegation return, and
there is a pNFS layout with return-on-close set, then the current code
can end up deadlocking in pnfs_roc() when nfs_inode_set_delegation()
tries to return the old delegation.
Now that delegreturn actually uses its own copy of the stateid, it
should be safe to just always update the delegation stateid in place.

Fixes: 078000d02d ("pNFS: We want return-on-close to complete when evicting the inode")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Kees Cook cf0d7e7f45 NFS: Avoid memcpy() run-time warning for struct sockaddr overflows
The 'nfs_server' and 'mount_server' structures include a union of
'struct sockaddr' (with the older 16 bytes max address size) and
'struct sockaddr_storage' which is large enough to hold all the
supported sa_family types (128 bytes max size). The runtime memcpy()
buffer overflow checker is seeing attempts to write beyond the 16
bytes as an overflow, but the actual expected size is that of 'struct
sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more
completely through-out NFS, which results in adjusting the memcpy()
buffers to the correct union members. Avoids this false positive run-time
warning under CONFIG_FORTIFY_SOURCE:

  memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16)

Reported-by: kernel test robot <yujie.liu@intel.com>
Link: https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Yushan Zhou 121affdf8a nfs: Remove redundant null checks before kfree
Fix the following coccicheck warning:
fs/nfs/dir.c:2494:2-7: WARNING:
NULL check before some freeing functions is not needed.

Signed-off-by: Yushan Zhou <katrinzhou@tencent.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2022-10-27 15:52:10 -04:00
Linus Torvalds 200204f56f fscrypt fix for 6.1-rc3
Fix a memory leak that was introduced by a change that went into -rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY1oM6BQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK3ixAP9IY1TdJu64uKTofFdYvO/wBASpdszm
 GkY1QnEFxATA9AEAwRswZgaGiuKj4hFBeIWmu9+luT4T7kVIcaumslTyTg8=
 =YinC
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt fix from Eric Biggers:
 "Fix a memory leak that was introduced by a change that went into -rc1"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: fix keyring memory leak on mount failure
2022-10-27 11:44:18 -07:00
Allison Henderson e07ee6fe21 xfs: increase rename inode reservation
xfs_rename can update up to 5 inodes: src_dp, target_dp, src_ip, target_ip
and wip.  So we need to increase the inode reservation to match.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-10-26 13:02:24 -07:00
Li Zetao 594d2a14f2 fs/binfmt_elf: Fix memory leak in load_elf_binary()
There is a memory leak reported by kmemleak:

  unreferenced object 0xffff88817104ef80 (size 224):
    comm "xfs_admin", pid 47165, jiffies 4298708825 (age 1333.476s)
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      60 a8 b3 00 81 88 ff ff a8 10 5a 00 81 88 ff ff  `.........Z.....
    backtrace:
      [<ffffffff819171e1>] __alloc_file+0x21/0x250
      [<ffffffff81918061>] alloc_empty_file+0x41/0xf0
      [<ffffffff81948cda>] path_openat+0xea/0x3d30
      [<ffffffff8194ec89>] do_filp_open+0x1b9/0x290
      [<ffffffff8192660e>] do_open_execat+0xce/0x5b0
      [<ffffffff81926b17>] open_exec+0x27/0x50
      [<ffffffff81a69250>] load_elf_binary+0x510/0x3ed0
      [<ffffffff81927759>] bprm_execve+0x599/0x1240
      [<ffffffff8192a997>] do_execveat_common.isra.0+0x4c7/0x680
      [<ffffffff8192b078>] __x64_sys_execve+0x88/0xb0
      [<ffffffff83bbf0a5>] do_syscall_64+0x35/0x80

If "interp_elf_ex" fails to allocate memory in load_elf_binary(),
the program will take the "out_free_ph" error handing path,
resulting in "interpreter" file resource is not released.

Fix it by adding an error handing path "out_free_file", which will
release the file resource when "interp_elf_ex" failed to allocate
memory.

Fixes: 0693ffebcf ("fs/binfmt_elf.c: allocate less for static executable")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221024154421.982230-1-lizetao1@huawei.com
2022-10-25 15:11:21 -07:00
Bernd Edlinger 5bf2fedca8 exec: Copy oldsighand->action under spin-lock
unshare_sighand should only access oldsighand->action
while holding oldsighand->siglock, to make sure that
newsighand->action is in a consistent state.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: stable@vger.kernel.org
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/AM8PR10MB470871DEBD1DED081F9CC391E4389@AM8PR10MB4708.EURPRD10.PROD.OUTLOOK.COM
2022-10-25 15:05:58 -07:00
Qu Wenruo 76a66ba101 btrfs: don't use btrfs_chunk::sub_stripes from disk
[BUG]
There are two reports (the earliest one from LKP, a more recent one from
kernel bugzilla) that we can have some chunks with 0 as sub_stripes.

This will cause divide-by-zero errors at btrfs_rmap_block, which is
introduced by a recent kernel patch ac0677348f ("btrfs: merge
calculations for simple striped profiles in btrfs_rmap_block"):

		if (map->type & (BTRFS_BLOCK_GROUP_RAID0 |
				 BTRFS_BLOCK_GROUP_RAID10)) {
			stripe_nr = stripe_nr * map->num_stripes + i;
			stripe_nr = div_u64(stripe_nr, map->sub_stripes); <<<
		}

[CAUSE]
From the more recent report, it has been proven that we have some chunks
with 0 as sub_stripes, mostly caused by older mkfs.

It turns out that the mkfs.btrfs fix is only introduced in 6718ab4d33aa
("btrfs-progs: Initialize sub_stripes to 1 in btrfs_alloc_data_chunk")
which is included in v5.4 btrfs-progs release.

So there would be quite some old filesystems with such 0 sub_stripes.

[FIX]
Just don't trust the sub_stripes values from disk.

We have a trusted btrfs_raid_array[] to fetch the correct sub_stripes
numbers for each profile and that are fixed.

By this, we can keep the compatibility with older filesystems while
still avoid divide-by-zero bugs.

Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: Viktor Kuzmin <kvaster@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216559
Fixes: ac0677348f ("btrfs: merge calculations for simple striped profiles in btrfs_rmap_block")
CC: stable@vger.kernel.org # 6.0
Reviewed-by: Su Yue <glass@fydeos.io>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-25 10:17:33 +02:00
David Sterba 2398091f9c btrfs: fix type of parameter generation in btrfs_get_dentry
The type of parameter generation has been u32 since the beginning,
however all callers pass a u64 generation, so unify the types to prevent
potential loss.

CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-24 15:28:58 +02:00
BingJing Chang 9b8be45f1e btrfs: send: fix send failure of a subcase of orphan inodes
Commit 9ed0a72e5b ("btrfs: send: fix failures when processing inodes with
no links") tries to fix all incremental send cases of orphan inodes the
send operation will meet. However, there's still a bug causing the corner
subcase fails with a ENOENT error.

Here's shortened steps of that subcase:

  $ btrfs subvolume create vol
  $ touch vol/foo

  $ btrfs subvolume snapshot -r vol snap1
  $ btrfs subvolume snapshot -r vol snap2

  # Turn the second snapshot to RW mode and delete the file while
  # holding an open file descriptor on it
  $ btrfs property set snap2 ro false
  $ exec 73<snap2/foo
  $ rm snap2/foo

  # Set the second snapshot back to RO mode and do an incremental send
  # with an unusal reverse order
  $ btrfs property set snap2 ro true
  $ btrfs send -p snap2 snap1 > /dev/null
  At subvol snap1
  ERROR: send ioctl failed with -2: No such file or directory

It's subcase 3 of BTRFS_COMPARE_TREE_CHANGED in the commit 9ed0a72e5b
("btrfs: send: fix failures when processing inodes with no links"). And
it's not a common case. We still have not met it in the real world.
Theoretically, this case can happen in a batch cascading snapshot backup.
In cascading backups, the receive operation in the middle may cause orphan
inodes to appear because of the open file descriptors on the snapshot files
during receiving. And if we don't do the batch snapshot backups in their
creation order, then we can have an inode, which is an orphan in the parent
snapshot but refers to a file in the send snapshot. Since an orphan inode
has no paths, the send operation will fail with a ENOENT error if it
tries to generate a path for it.

In that patch, this subcase will be treated as an inode with a new
generation. However, when the routine tries to delete the old paths in
the parent snapshot, the function process_all_refs() doesn't check whether
there are paths recorded or not before it calls the function
process_recorded_refs(). And the function process_recorded_refs() try
to get the first path in the parent snapshot in the beginning. Since it has
no paths in the parent snapshot, the send operation fails.

To fix this, we can easily put a link count check to avoid entering the
deletion routine like what we do a link count check to avoid creating a
new one. Moreover, we can assume that the function process_all_refs()
can always collect references to process because we know it has a
positive link count.

Fixes: 9ed0a72e5b ("btrfs: send: fix failures when processing inodes with no links")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-24 15:28:52 +02:00
Qu Wenruo 3d17adea74 btrfs: make thaw time super block check to also verify checksum
Previous commit a05d3c9153 ("btrfs: check superblock to ensure the fs
was not modified at thaw time") only checks the content of the super
block, but it doesn't really check if the on-disk super block has a
matching checksum.

This patch will add the checksum verification to thaw time superblock
verification.

This involves the following extra changes:

- Export btrfs_check_super_csum()
  As we need to call it in super.c.

- Change the argument list of btrfs_check_super_csum()
  Instead of passing a char *, directly pass struct btrfs_super_block *
  pointer.

- Verify that our checksum type didn't change before checking the
  checksum value, like it's done at mount time

Fixes: a05d3c9153 ("btrfs: check superblock to ensure the fs was not modified at thaw time")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-24 15:28:29 +02:00
Josef Bacik 968b715831 btrfs: fix tree mod log mishandling of reallocated nodes
We have been seeing the following panic in production

  kernel BUG at fs/btrfs/tree-mod-log.c:677!
  invalid opcode: 0000 [#1] SMP
  RIP: 0010:tree_mod_log_rewind+0x1b4/0x200
  RSP: 0000:ffffc9002c02f890 EFLAGS: 00010293
  RAX: 0000000000000003 RBX: ffff8882b448c700 RCX: 0000000000000000
  RDX: 0000000000008000 RSI: 00000000000000a7 RDI: ffff88877d831c00
  RBP: 0000000000000002 R08: 000000000000009f R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000100c40 R12: 0000000000000001
  R13: ffff8886c26d6a00 R14: ffff88829f5424f8 R15: ffff88877d831a00
  FS:  00007fee1d80c780(0000) GS:ffff8890400c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fee1963a020 CR3: 0000000434f33002 CR4: 00000000007706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   btrfs_get_old_root+0x12b/0x420
   btrfs_search_old_slot+0x64/0x2f0
   ? tree_mod_log_oldest_root+0x3d/0xf0
   resolve_indirect_ref+0xfd/0x660
   ? ulist_alloc+0x31/0x60
   ? kmem_cache_alloc_trace+0x114/0x2c0
   find_parent_nodes+0x97a/0x17e0
   ? ulist_alloc+0x30/0x60
   btrfs_find_all_roots_safe+0x97/0x150
   iterate_extent_inodes+0x154/0x370
   ? btrfs_search_path_in_tree+0x240/0x240
   iterate_inodes_from_logical+0x98/0xd0
   ? btrfs_search_path_in_tree+0x240/0x240
   btrfs_ioctl_logical_to_ino+0xd9/0x180
   btrfs_ioctl+0xe2/0x2ec0
   ? __mod_memcg_lruvec_state+0x3d/0x280
   ? do_sys_openat2+0x6d/0x140
   ? kretprobe_dispatcher+0x47/0x70
   ? kretprobe_rethook_handler+0x38/0x50
   ? rethook_trampoline_handler+0x82/0x140
   ? arch_rethook_trampoline_callback+0x3b/0x50
   ? kmem_cache_free+0xfb/0x270
   ? do_sys_openat2+0xd5/0x140
   __x64_sys_ioctl+0x71/0xb0
   do_syscall_64+0x2d/0x40

Which is this code in tree_mod_log_rewind()

	switch (tm->op) {
        case BTRFS_MOD_LOG_KEY_REMOVE_WHILE_FREEING:
		BUG_ON(tm->slot < n);

This occurs because we replay the nodes in order that they happened, and
when we do a REPLACE we will log a REMOVE_WHILE_FREEING for every slot,
starting at 0.  'n' here is the number of items in this block, which in
this case was 1, but we had 2 REMOVE_WHILE_FREEING operations.

The actual root cause of this was that we were replaying operations for
a block that shouldn't have been replayed.  Consider the following
sequence of events

1. We have an already modified root, and we do a btrfs_get_tree_mod_seq().
2. We begin removing items from this root, triggering KEY_REPLACE for
   it's child slots.
3. We remove one of the 2 children this root node points to, thus triggering
   the root node promotion of the remaining child, and freeing this node.
4. We modify a new root, and re-allocate the above node to the root node of
   this other root.

The tree mod log looks something like this

	logical 0	op KEY_REPLACE (slot 1)			seq 2
	logical 0	op KEY_REMOVE (slot 1)			seq 3
	logical 0	op KEY_REMOVE_WHILE_FREEING (slot 0)	seq 4
	logical 4096	op LOG_ROOT_REPLACE (old logical 0)	seq 5
	logical 8192	op KEY_REMOVE_WHILE_FREEING (slot 1)	seq 6
	logical 8192	op KEY_REMOVE_WHILE_FREEING (slot 0)	seq 7
	logical 0	op LOG_ROOT_REPLACE (old logical 8192)	seq 8

>From here the bug is triggered by the following steps

1.  Call btrfs_get_old_root() on the new_root.
2.  We call tree_mod_log_oldest_root(btrfs_root_node(new_root)), which is
    currently logical 0.
3.  tree_mod_log_oldest_root() calls tree_mod_log_search_oldest(), which
    gives us the KEY_REPLACE seq 2, and since that's not a
    LOG_ROOT_REPLACE we incorrectly believe that we don't have an old
    root, because we expect that the most recent change should be a
    LOG_ROOT_REPLACE.
4.  Back in tree_mod_log_oldest_root() we don't have a LOG_ROOT_REPLACE,
    so we don't set old_root, we simply use our existing extent buffer.
5.  Since we're using our existing extent buffer (logical 0) we call
    tree_mod_log_search(0) in order to get the newest change to start the
    rewind from, which ends up being the LOG_ROOT_REPLACE at seq 8.
6.  Again since we didn't find an old_root we simply clone logical 0 at
    it's current state.
7.  We call tree_mod_log_rewind() with the cloned extent buffer.
8.  Set n = btrfs_header_nritems(logical 0), which would be whatever the
    original nritems was when we COWed the original root, say for this
    example it's 2.
9.  We start from the newest operation and work our way forward, so we
    see LOG_ROOT_REPLACE which we ignore.
10. Next we see KEY_REMOVE_WHILE_FREEING for slot 0, which triggers the
    BUG_ON(tm->slot < n), because it expects if we've done this we have a
    completely empty extent buffer to replay completely.

The correct thing would be to find the first LOG_ROOT_REPLACE, and then
get the old_root set to logical 8192.  In fact making that change fixes
this particular problem.

However consider the much more complicated case.  We have a child node
in this tree and the above situation.  In the above case we freed one
of the child blocks at the seq 3 operation.  If this block was also
re-allocated and got new tree mod log operations we would have a
different problem.  btrfs_search_old_slot(orig root) would get down to
the logical 0 root that still pointed at that node.  However in
btrfs_search_old_slot() we call tree_mod_log_rewind(buf) directly.  This
is not context aware enough to know which operations we should be
replaying.  If the block was re-allocated multiple times we may only
want to replay a range of operations, and determining what that range is
isn't possible to determine.

We could maybe solve this by keeping track of which root the node
belonged to at every tree mod log operation, and then passing this
around to make sure we're only replaying operations that relate to the
root we're trying to rewind.

However there's a simpler way to solve this problem, simply disallow
reallocations if we have currently running tree mod log users.  We
already do this for leaf's, so we're simply expanding this to nodes as
well.  This is a relatively uncommon occurrence, and the problem is
complicated enough I'm worried that we will still have corner cases in
the reallocation case.  So fix this in the most straightforward way
possible.

Fixes: bd989ba359 ("Btrfs: add tree modification log functions")
CC: stable@vger.kernel.org # 3.3+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-24 15:28:07 +02:00
David Sterba ae0e5df4d1 btrfs: reorder btrfs_bio for better packing
After changes in commit 917f32a235 ("btrfs: give struct btrfs_bio a
real end_io handler") the layout of btrfs_bio can be improved.  There
are two holes and the structure size is 264 bytes on release build. By
reordering the iterator we can get rid of the holes and the size is 256
bytes which fits to slabs much better.

Final layout:

struct btrfs_bio {
	unsigned int               mirror_num;           /*     0     4 */
	struct bvec_iter           iter;                 /*     4    20 */
	u64                        file_offset;          /*    24     8 */
	struct btrfs_device *      device;               /*    32     8 */
	u8 *                       csum;                 /*    40     8 */
	u8                         csum_inline[64];      /*    48    64 */
	/* --- cacheline 1 boundary (64 bytes) was 48 bytes ago --- */
	btrfs_bio_end_io_t         end_io;               /*   112     8 */
	void *                     private;              /*   120     8 */
	/* --- cacheline 2 boundary (128 bytes) --- */
	struct work_struct         end_io_work;          /*   128    32 */
	struct bio                 bio;                  /*   160    96 */

	/* size: 256, cachelines: 4, members: 10 */
};

Fixes: 917f32a235 ("btrfs: give struct btrfs_bio a real end_io handler")
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-24 15:27:34 +02:00
Qu Wenruo ab4c54c643 btrfs: raid56: avoid double freeing for rbio if full_stripe_write() failed
Currently if full_stripe_write() failed to allocate the pages for
parity, it will call __free_raid_bio() first, then return -ENOMEM.

But some caller of full_stripe_write() will also call __free_raid_bio()
again, this would cause double freeing.

And it's not a logically sound either, normally we should either free
the memory at the same level where we allocated it, or let endio to
handle everything.

So this patch will solve the double freeing by make
raid56_parity_write() to handle the error and free the rbio.

Just like what we do in raid56_parity_recover().

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-24 15:26:56 +02:00
Qu Wenruo f15fb2cd97 btrfs: raid56: properly handle the error when unable to find the missing stripe
In raid56_alloc_missing_rbio(), if we can not determine where the
missing device is inside the full stripe, we just BUG_ON().

This is not necessary especially the only caller inside scrub.c is
already properly checking the return value, and will treat it as a
memory allocation failure.

Fix the error handling by:

- Add an extra warning for the reason
  Although personally speaking it may be better to be an ASSERT().

- Properly free the allocated rbio

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-10-24 15:26:54 +02:00
Zhang Xiaoxu f950c85e78 cifs: Fix pages leak when writedata alloc failed in cifs_write_from_iter()
There is a kmemleak when writedata alloc failed:

  unreferenced object 0xffff888175ae4000 (size 4096):
    comm "dd", pid 19419, jiffies 4296028749 (age 739.396s)
    hex dump (first 32 bytes):
      80 02 b0 04 00 ea ff ff c0 02 b0 04 00 ea ff ff  ................
      80 22 4c 04 00 ea ff ff c0 22 4c 04 00 ea ff ff  ."L......"L.....
    backtrace:
      [<0000000072fdbb86>] __kmalloc_node+0x50/0x150
      [<0000000039faf56f>] __iov_iter_get_pages_alloc+0x605/0xdd0
      [<00000000f862a9d4>] iov_iter_get_pages_alloc2+0x3b/0x80
      [<000000008f226067>] cifs_write_from_iter+0x2ae/0xe40
      [<000000001f78f2f1>] __cifs_writev+0x337/0x5c0
      [<00000000257fcef5>] vfs_write+0x503/0x690
      [<000000008778a238>] ksys_write+0xb9/0x150
      [<00000000ed82047c>] do_syscall_64+0x35/0x80
      [<000000003365551d>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

__iov_iter_get_pages_alloc+0x605/0xdd0 is:
  want_pages_array at lib/iov_iter.c:1304
  (inlined by) __iov_iter_get_pages_alloc at lib/iov_iter.c:1457

If writedata allocate failed, the pages and pagevec should be cleanup.

Fixes: 8c5f9c1ab7 ("CIFS: Add support for direct I/O write")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-23 17:50:10 -05:00
Zhang Xiaoxu 4153d789e2 cifs: Fix pages array leak when writedata alloc failed in cifs_writedata_alloc()
There is a memory leak when writedata alloc failed:

  unreferenced object 0xffff888192364000 (size 8192):
    comm "sync", pid 22839, jiffies 4297313967 (age 60.230s)
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<0000000027de0814>] __kmalloc+0x4d/0x150
      [<00000000b21e81ab>] cifs_writepages+0x35f/0x14a0
      [<0000000076f7d20e>] do_writepages+0x10a/0x360
      [<00000000d6a36edc>] filemap_fdatawrite_wbc+0x95/0xc0
      [<000000005751a323>] __filemap_fdatawrite_range+0xa7/0xe0
      [<0000000088afb0ca>] file_write_and_wait_range+0x66/0xb0
      [<0000000063dbc443>] cifs_strict_fsync+0x80/0x5f0
      [<00000000c4624754>] __x64_sys_fsync+0x40/0x70
      [<000000002c0dc744>] do_syscall_64+0x35/0x80
      [<0000000052f46bee>] entry_SYSCALL_64_after_hwframe+0x46/0xb0

cifs_writepages+0x35f/0x14a0 is:
  kmalloc_array at include/linux/slab.h:628
  (inlined by) kcalloc at include/linux/slab.h:659
  (inlined by) cifs_writedata_alloc at fs/cifs/file.c:2438
  (inlined by) wdata_alloc_and_fillpages at fs/cifs/file.c:2527
  (inlined by) cifs_writepages at fs/cifs/file.c:2705

If writedata alloc failed in cifs_writedata_alloc(), the pages array
should be freed.

Fixes: 8e7360f67e ("CIFS: Add support for direct pages in wdata")
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-23 17:50:10 -05:00
Linus Torvalds ec4cf5dbb1 First batch of EFI fixes for v6.1
- A pair of fixes for the EFI variable store refactor that landed in
   v6.0
 - A couple of fixes for issue that were introduced during the merge
   window
 - Back out some changes related to EFI zboot signing - we'll add a
   better solution for this during the next cycle
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmNSs/cACgkQw08iOZLZ
 jySlPgv/Zfwdbg7b+0Y2wkevUh3shwSH5/fzRAxxFa9dt1D1DcSr+JAqGslSS0Uo
 Hq9GEnEVGhGqzd6JC/0X5jCSZFJCbfA8F6v082Qg1sJwr1cbualVcyYY0KL5GCk5
 pyIfjHfotaTyS7mzQnjlxT9NQVzJEnZcEP9sxxIny5FRwS0KqhbtGN2V0xovDB/4
 2b9w9a7zg1YZVH/IJdeLZFzG5TMQzg8X5WPWVqKHNpqMC9gOW3V3R4Gxy0RCp85Q
 9j8PY5CI3KABeYwCDKB+sw7GFYSDK9e+4qEwdcC9Fp1+K0g35ELd8xDr98aMMyyw
 pl5qdsJz6XY7i9IfgZq4YhSlMA1+Ab7hAsweoQ7tYofs0TRtuNGzLTT2scU5Pws1
 67smMKxlfXTUSB7+1aH5qkV9sKM2uB/Rbib9qIkSyIeTN8Mo/290nimOIilhpsz8
 EQC0mIAsieoF7svy5HFgDsxFi7m6jxwYqlGZt3QCF3ULdlEDQYOA5boJh9OgbR6C
 3e1kMHR8
 =q3Qj
 -----END PGP SIGNATURE-----

Merge tag 'efi-fixes-for-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

 - fixes for the EFI variable store refactor that landed in v6.0

 - fixes for issues that were introduced during the merge window

 - back out some changes related to EFI zboot signing - we'll add a
   better solution for this during the next cycle

* tag 'efi-fixes-for-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: runtime: Don't assume virtual mappings are missing if VA == PA == 0
  efi: libstub: Fix incorrect payload size in zboot header
  efi: libstub: Give efi_main() asmlinkage qualification
  efi: efivars: Fix variable writes without query_variable_store()
  efi: ssdt: Don't free memory if ACPI table was loaded successfully
  efi: libstub: Remove zboot signing from build options
2022-10-21 18:02:36 -07:00
Linus Torvalds bd8e963412 12 cifs/smb3 fixes, half for stable
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmNSM84ACgkQiiy9cAdy
 T1Ev1gv/boWdv/ihnWEpdlAT4wqMQt9Q7dlfSZmNlcoSst1QGbTHjBrUNncBcRCZ
 +9PRqH0umGMx1dFBzPNp5AzgGK2hbCL5hfveZlxsZoAohFTkOCK4714vXyoI0dJM
 akbq3VMjljkrHS+biC4NwZgQyPFWnDVzI1rESI9iIooRiVlIQWRToWwyDDxHP7b8
 LiDdh62uj6aAirGBnGK8qrXnIgXcFlowNGDT4JsUuXqEavNBBcl0VH/6C1E3Zhmn
 a/w2NwCWnAxcRl5nO8v/yl/awV9hQ5Ma8mR5v6de1BDxsGUyHKxxI6+/iO31ZxzM
 4XzIdta7TNGPvWsHXEBfBZqNIt2XbEAKagNXsNwar5Py05Tb/JK1F4S5gUxVblRq
 Ro3JoP5gjdMvKVZGoQeG414g3b+z+58VP5DIVBIkCOu7ODoxCqbavneh9IUOsW0e
 3xrHPceoiN+q9pCfRYfjep6D53hkASzGihQM2rGBgPzyOATO6U7SaYosEVYuDkIW
 Li2ObspV
 =ZV02
 -----END PGP SIGNATURE-----

Merge tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:

 - memory leak fixes

 - fixes for directory leases, including an important one which fixes a
   problem noticed by git functional tests

 - fixes relating to missing free_xid calls (helpful for
   tracing/debugging of entry/exit into cifs.ko)

 - a multichannel fix

 - a small cleanup fix (use of list_move instead of list_del/list_add)

* tag '6.1-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module number
  cifs: fix memory leaks in session setup
  cifs: drop the lease for cached directories on rmdir or rename
  smb3: interface count displayed incorrectly
  cifs: Fix memory leak when build ntlmssp negotiate blob failed
  cifs: set rc to -ENOENT if we can not get a dentry for the cached dir
  cifs: use LIST_HEAD() and list_move() to simplify code
  cifs: Fix xid leak in cifs_get_file_info_unix()
  cifs: Fix xid leak in cifs_ses_add_channel()
  cifs: Fix xid leak in cifs_flock()
  cifs: Fix xid leak in cifs_copy_file_range()
  cifs: Fix xid leak in cifs_create()
2022-10-21 16:01:53 -07:00
Linus Torvalds 022c028f4c Fixes:
- Fixes for patches merged in v6.1
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmNSoKcACgkQM2qzM29m
 f5fhJRAAtvsbgjP8lME7MQp8v+19VsRvXzux7fOvbwqOb5yudftSq0wBB52Gejo2
 mNjk40h+D/SgeKpaB1Rwva6Uv3ZbdAa1ynAm6LHaoO7sHTiMRicNCtFnK8IyjCvI
 uNDlaFBoUXomVN68LmsZtwqPieGBu+YT5tVk/eej/VTaRO17mvP0KOVYVqaQV26g
 sOA8SQLnmOLrvo0GZodOwjFm0mhsD2VhAtfRDte3ULpcnGV2v/uQZthZwxpSm55W
 qToXJD1vNqruwYa06HJmF9XJRsImRz8iN6MyxA3SfLCALLmqXLekaioKF53ncSjT
 FWdGFiOotd92vU3ZCPH20CoNRHfXdrFRqt+g0nnBOowZvhYuafJMrfqxXz47Wwhm
 4Hn+r0IvcQxQ+n3+ixeFllj6RJhfjpIhb2UwztNcoJkRh3hFDtfbbetQj9ykkckE
 1KeTEw/zP78hAbH9ns5zfPBDVCebA8c/LNw/nPrlMYvb7v5nZ2/2fKB55J6nwKuK
 l+A8Juk98/heVvZlHMAg0fg6ACv2EWvjxesPp4WWnpTusMzXgM+BfOo/Xk9Y/zgT
 wJQHp4BUIAmAJEfNKF3KhKpxrKY6bb9nuwvDeWDkvqI7NrCHbQ9HyiQ9nbntaXMy
 4fArurYFTE0KxbZ484jAXHINV8XynYNxeT3GcO9bFBp3uFUShnw=
 =5zjT
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:
 "Fixes for patches merged in v6.1"

* tag 'nfsd-6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: ensure we always call fh_verify_error tracepoint
  NFSD: unregister shrinker when nfsd_init_net() fails
2022-10-21 15:51:30 -07:00
Linus Torvalds 440b7895c9 17 hotfixes, mainly for MM. 5 are cc:stable and the remainder address
post-6.0 issues.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY1IgYgAKCRDdBJ7gKXxA
 jpyRAQDkfa1LDkfbA4dQBZShkUhBX1k3AyRO1NWMjwwTxP3H8wD9HUz1BB3ynoKc
 ipzQs7q5jbBvndczEksHiG2AC7SvQAI=
 =wD9I
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2022-10-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morron:
 "Seventeen hotfixes, mainly for MM.

  Five are cc:stable and the remainder address post-6.0 issues"

* tag 'mm-hotfixes-stable-2022-10-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  nouveau: fix migrate_to_ram() for faulting page
  mm/huge_memory: do not clobber swp_entry_t during THP split
  hugetlb: fix memory leak associated with vma_lock structure
  mm/page_alloc: reduce potential fragmentation in make_alloc_exact()
  mm: /proc/pid/smaps_rollup: fix maple tree search
  mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages
  mm/mmap: fix MAP_FIXED address return on VMA merge
  mm/mmap.c: __vma_adjust(): suppress uninitialized var warning
  mm/mmap: undo ->mmap() when mas_preallocate() fails
  init: Kconfig: fix spelling mistake "satify" -> "satisfy"
  ocfs2: clear dinode links count in case of error
  ocfs2: fix BUG when iput after ocfs2_mknod fails
  gcov: support GCC 12.1 and newer compilers
  zsmalloc: zs_destroy_pool: add size_class NULL check
  mm/mempolicy: fix mbind_range() arguments to vma_merge()
  mailmap: update email for Qais Yousef
  mailmap: update Dan Carpenter's email address
2022-10-21 12:33:03 -07:00
Ard Biesheuvel 8a254d90a7 efi: efivars: Fix variable writes without query_variable_store()
Commit bbc6d2c6ef ("efi: vars: Switch to new wrapper layer")
refactored the efivars layer so that the 'business logic' related to
which UEFI variables affect the boot flow in which way could be moved
out of it, and into the efivarfs driver.

This inadvertently broke setting variables on firmware implementations
that lack the QueryVariableInfo() boot service, because we no longer
tolerate a EFI_UNSUPPORTED result from check_var_size() when calling
efivar_entry_set_get_size(), which now ends up calling check_var_size()
a second time inadvertently.

If QueryVariableInfo() is missing, we support writes of up to 64k -
let's move that logic into check_var_size(), and drop the redundant
call.

Cc: <stable@vger.kernel.org> # v6.0
Fixes: bbc6d2c6ef ("efi: vars: Switch to new wrapper layer")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-10-21 11:09:40 +02:00
Hugh Dickins 08ac85521c mm: /proc/pid/smaps_rollup: fix maple tree search
/proc/pid/smaps_rollup showed 0 kB for everything: now find first vma.

Link: https://lkml.kernel.org/r/3011bee7-182-97a2-1083-d5f5b688e54b@google.com
Fixes: c4c84f0628 ("fs/proc/task_mmu: stop using linked list and highest_vm_end")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20 21:27:23 -07:00
Joseph Qi 28f4821b1b ocfs2: clear dinode links count in case of error
In ocfs2_mknod(), if error occurs after dinode successfully allocated,
ocfs2 i_links_count will not be 0.

So even though we clear inode i_nlink before iput in error handling, it
still won't wipe inode since we'll refresh inode from dinode during inode
lock.  So just like clear inode i_nlink, we clear ocfs2 i_links_count as
well.  Also do the same change for ocfs2_symlink().

Link: https://lkml.kernel.org/r/20221017130227.234480-2-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Yan Wang <wangyan122@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20 21:27:22 -07:00
Joseph Qi 759a7c6126 ocfs2: fix BUG when iput after ocfs2_mknod fails
Commit b1529a41f7 "ocfs2: should reclaim the inode if
'__ocfs2_mknod_locked' returns an error" tried to reclaim the claimed
inode if __ocfs2_mknod_locked() fails later.  But this introduce a race,
the freed bit may be reused immediately by another thread, which will
update dinode, e.g.  i_generation.  Then iput this inode will lead to BUG:
inode->i_generation != le32_to_cpu(fe->i_generation)

We could make this inode as bad, but we did want to do operations like
wipe in some cases.  Since the claimed inode bit can only affect that an
dinode is missing and will return back after fsck, it seems not a big
problem.  So just leave it as is by revert the reclaim logic.

Link: https://lkml.kernel.org/r/20221017130227.234480-1-joseph.qi@linux.alibaba.com
Fixes: b1529a41f7 ("ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Yan Wang <wangyan122@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-20 21:27:22 -07:00
Li Zetao d08af40340 xfs: Fix unreferenced object reported by kmemleak in xfs_sysfs_init()
kmemleak reported a sequence of memory leaks, and one of them indicated we
failed to free a pointer:
  comm "mount", pid 19610, jiffies 4297086464 (age 60.635s)
    hex dump (first 8 bytes):
      73 64 61 00 81 88 ff ff                          sda.....
    backtrace:
      [<00000000d77f3e04>] kstrdup_const+0x46/0x70
      [<00000000e51fa804>] kobject_set_name_vargs+0x2f/0xb0
      [<00000000247cd595>] kobject_init_and_add+0xb0/0x120
      [<00000000f9139aaf>] xfs_mountfs+0x367/0xfc0
      [<00000000250d3caf>] xfs_fs_fill_super+0xa16/0xdc0
      [<000000008d873d38>] get_tree_bdev+0x256/0x390
      [<000000004881f3fa>] vfs_get_tree+0x41/0xf0
      [<000000008291ab52>] path_mount+0x9b3/0xdd0
      [<0000000022ba8f2d>] __x64_sys_mount+0x190/0x1d0

As mentioned in kobject_init_and_add() comment, if this function
returns an error, kobject_put() must be called to properly clean up
the memory associated with the object. Apparently, xfs_sysfs_init()
does not follow such a requirement. When kobject_init_and_add()
returns an error, the space of kobj->kobject.name alloced by
kstrdup_const() is unfree, which will cause the above stack.

Fix it by adding kobject_put() when kobject_init_and_add returns an
error.

Fixes: a31b1d3d89 ("xfs: add xfs_mount sysfs kobject")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-10-20 09:42:56 -07:00
Zeng Heng cf4f4c12de xfs: fix memory leak in xfs_errortag_init
When `xfs_sysfs_init` returns failed, `mp->m_errortag` needs to free.
Otherwise kmemleak would report memory leak after mounting xfs image:

unreferenced object 0xffff888101364900 (size 192):
  comm "mount", pid 13099, jiffies 4294915218 (age 335.207s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000f08ad25c>] __kmalloc+0x41/0x1b0
    [<00000000dca9aeb6>] kmem_alloc+0xfd/0x430
    [<0000000040361882>] xfs_errortag_init+0x20/0x110
    [<00000000b384a0f6>] xfs_mountfs+0x6ea/0x1a30
    [<000000003774395d>] xfs_fs_fill_super+0xe10/0x1a80
    [<000000009cf07b6c>] get_tree_bdev+0x3e7/0x700
    [<00000000046b5426>] vfs_get_tree+0x8e/0x2e0
    [<00000000952ec082>] path_mount+0xf8c/0x1990
    [<00000000beb1f838>] do_mount+0xee/0x110
    [<000000000e9c41bb>] __x64_sys_mount+0x14b/0x1f0
    [<00000000f7bb938e>] do_syscall_64+0x3b/0x90
    [<000000003fcd67a9>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: c684010115 ("xfs: expose errortag knobs via sysfs")
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-10-20 09:42:56 -07:00
Colin Ian King fc93812c72 xfs: remove redundant pointer lip
The assignment to pointer lip is not really required, the pointer lip
is redundant and can be removed.

Cleans up clang-scan warning:
warning: Although the value stored to 'lip' is used in the enclosing
expression, the value is never actually read from 'lip'
[deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-10-20 09:42:56 -07:00
Guo Xuenan 13cf24e006 xfs: fix exception caused by unexpected illegal bestcount in leaf dir
For leaf dir, In most cases, there should be as many bestfree slots
as the dir data blocks that can fit under i_size (except for [1]).

Root cause is we don't examin the number bestfree slots, when the slots
number less than dir data blocks, if we need to allocate new dir data
block and update the bestfree array, we will use the dir block number as
index to assign bestfree array, while we did not check the leaf buf
boundary which may cause UAF or other memory access problem. This issue
can also triggered with test cases xfs/473 from fstests.

According to Dave Chinner & Darrick's suggestion, adding buffer verifier
to detect this abnormal situation in time.
Simplify the testcase for fstest xfs/554 [1]

The error log is shown as follows:
==================================================================
BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0
Write of size 2 at addr ffff88810168b000 by task touch/1552
CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ #101
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 print_report.cold+0xf6/0x691
 kasan_report+0xa8/0x120
 xfs_dir2_leaf_addname+0x1995/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000
 </TASK>

The buggy address belongs to the physical page:
page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000
index:0x0 pfn:0x10168b
flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff)
raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000
raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Disabling lock debugging due to kernel taint
00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78
XDD3[S5........x
XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file
fs/xfs/libxfs/xfs_dir2_data.c.  Caller
xfs_dir2_data_use_free+0x28a/0xeb0
CPU: 5 PID: 1552 Comm: touch Tainted: G    B              6.0.0-rc3+
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x4d/0x66
 xfs_corruption_error+0x132/0x150
 xfs_dir2_data_use_free+0x198/0xeb0
 xfs_dir2_leaf_addname+0xa59/0x1ac0
 xfs_dir_createname+0x58c/0x7f0
 xfs_create+0x7af/0x1010
 xfs_generic_create+0x270/0x5e0
 path_openat+0x270b/0x3450
 do_filp_open+0x1cf/0x2b0
 do_sys_openat2+0x46b/0x7a0
 do_sys_open+0xb7/0x130
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fe4d9e9312b
Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0
75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00
f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b
RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c
RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000
 </TASK>
XFS (sdb): Corruption detected. Unmount and run xfs_repair

[1] https://lore.kernel.org/all/20220928095355.2074025-1-guoxuenan@huawei.com/
Reviewed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2022-10-20 09:42:56 -07:00
Miklos Szeredi 9fa248c65b fuse: fix readdir cache race
There's a race in fuse's readdir cache that can result in an uninitilized
page being read.  The page lock is supposed to prevent this from happening
but in the following case it doesn't:

Two fuse_add_dirent_to_cache() start out and get the same parameters
(size=0,offset=0).  One of them wins the race to create and lock the page,
after which it fills in data, sets rdc.size and unlocks the page.

In the meantime the page gets evicted from the cache before the other
instance gets to run.  That one also creates the page, but finds the
size to be mismatched, bails out and leaves the uninitialized page in the
cache.

Fix by marking a filled page uptodate and ignoring non-uptodate pages.

Reported-by: Frank Sorenson <fsorenso@redhat.com>
Fixes: 5d7bc7e868 ("fuse: allow using readdir cache")
Cc: <stable@vger.kernel.org> # v4.20
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2022-10-20 17:18:58 +02:00
Eric Biggers ccd30a476f fscrypt: fix keyring memory leak on mount failure
Commit d7e7b9af10 ("fscrypt: stop using keyrings subsystem for
fscrypt_master_key") moved the keyring destruction from __put_super() to
generic_shutdown_super() so that the filesystem's block device(s) are
still available.  Unfortunately, this causes a memory leak in the case
where a mount is attempted with the test_dummy_encryption mount option,
but the mount fails after the option has already been processed.

To fix this, attempt the keyring destruction in both places.

Reported-by: syzbot+104c2a89561289cec13e@syzkaller.appspotmail.com
Fixes: d7e7b9af10 ("fscrypt: stop using keyrings subsystem for fscrypt_master_key")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Link: https://lore.kernel.org/r/20221011213838.209879-1-ebiggers@kernel.org
2022-10-19 20:54:43 -07:00
Steve French 73b1b8d25e cifs: update internal module number
To 2.40

Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-19 17:57:51 -05:00
Paulo Alcantara 01f2ee7e32 cifs: fix memory leaks in session setup
We were only zeroing out the ntlmssp blob but forgot to free the
allocated buffer in the end of SMB2_sess_auth_rawntlmssp_negotiate()
and SMB2_sess_auth_rawntlmssp_authenticate() functions.

This fixes below kmemleak reports:

unreferenced object 0xffff88800ddcfc60 (size 96):
  comm "mount.cifs", pid 758, jiffies 4294696066 (age 42.967s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000d0beeb29>] __kmalloc+0x39/0xa0
    [<00000000e3834047>] build_ntlmssp_smb3_negotiate_blob+0x2c/0x110 [cifs]
    [<00000000e85f5ab2>] SMB2_sess_auth_rawntlmssp_negotiate+0xd3/0x230 [cifs]
    [<0000000080fdb897>] SMB2_sess_setup+0x16c/0x2a0 [cifs]
    [<000000009af320a8>] cifs_setup_session+0x13b/0x370 [cifs]
    [<00000000f15d5982>] cifs_get_smb_ses+0x643/0xb90 [cifs]
    [<00000000fe15eb90>] mount_get_conns+0x63/0x3e0 [cifs]
    [<00000000768aba03>] mount_get_dfs_conns+0x16/0xa0 [cifs]
    [<00000000cf1cf146>] cifs_mount+0x1c2/0x9a0 [cifs]
    [<000000000d66b51e>] cifs_smb3_do_mount+0x10e/0x710 [cifs]
    [<0000000077a996c5>] smb3_get_tree+0xf4/0x200 [cifs]
    [<0000000094dbd041>] vfs_get_tree+0x23/0xc0
    [<000000003a8561de>] path_mount+0x2d3/0xb50
    [<00000000ed5c86d6>] __x64_sys_mount+0x102/0x140
    [<00000000142142f3>] do_syscall_64+0x3b/0x90
    [<00000000e2b89731>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
unreferenced object 0xffff88801437f000 (size 512):
  comm "mount.cifs", pid 758, jiffies 4294696067 (age 42.970s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000d0beeb29>] __kmalloc+0x39/0xa0
    [<00000000004f53d2>] build_ntlmssp_auth_blob+0x4f/0x340 [cifs]
    [<000000005f333084>] SMB2_sess_auth_rawntlmssp_authenticate+0xd4/0x250 [cifs]
    [<0000000080fdb897>] SMB2_sess_setup+0x16c/0x2a0 [cifs]
    [<000000009af320a8>] cifs_setup_session+0x13b/0x370 [cifs]
    [<00000000f15d5982>] cifs_get_smb_ses+0x643/0xb90 [cifs]
    [<00000000fe15eb90>] mount_get_conns+0x63/0x3e0 [cifs]
    [<00000000768aba03>] mount_get_dfs_conns+0x16/0xa0 [cifs]
    [<00000000cf1cf146>] cifs_mount+0x1c2/0x9a0 [cifs]
    [<000000000d66b51e>] cifs_smb3_do_mount+0x10e/0x710 [cifs]
    [<0000000077a996c5>] smb3_get_tree+0xf4/0x200 [cifs]
    [<0000000094dbd041>] vfs_get_tree+0x23/0xc0
    [<000000003a8561de>] path_mount+0x2d3/0xb50
    [<00000000ed5c86d6>] __x64_sys_mount+0x102/0x140
    [<00000000142142f3>] do_syscall_64+0x3b/0x90
    [<00000000e2b89731>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: a4e430c8c8 ("cifs: replace kfree() with kfree_sensitive() for sensitive data")
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-19 17:57:51 -05:00
Ronnie Sahlberg 8e77860c62 cifs: drop the lease for cached directories on rmdir or rename
When we delete or rename a directory we must also drop any cached lease we have
on the directory.

Fixes: a350d6e73f5e ("cifs: enable caching of directories for which a lease is held")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-19 17:57:41 -05:00
Steve French 096bbeec7b smb3: interface count displayed incorrectly
The "Server interfaces" count in /proc/fs/cifs/DebugData increases
as the interfaces are requeried, rather than being reset to the new
value.  This could cause a problem if the server disabled
multichannel as the iface_count is checked in try_adding_channels
to see if multichannel still supported.

Also fixes a coverity warning:

Addresses-Coverity: 1526374 ("Concurrent data access violations  (MISSING_LOCK)")
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-19 10:06:23 -05:00
Darrick J. Wong 97cf79677e xfs: avoid a UAF when log intent item recovery fails
KASAN reported a UAF bug when I was running xfs/235:

 BUG: KASAN: use-after-free in xlog_recover_process_intents+0xa77/0xae0 [xfs]
 Read of size 8 at addr ffff88804391b360 by task mount/5680

 CPU: 2 PID: 5680 Comm: mount Not tainted 6.0.0-xfsx #6.0.0 77e7b52a4943a975441e5ac90a5ad7748b7867f6
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x34/0x44
  print_report.cold+0x2cc/0x682
  kasan_report+0xa3/0x120
  xlog_recover_process_intents+0xa77/0xae0 [xfs fb841c7180aad3f8359438576e27867f5795667e]
  xlog_recover_finish+0x7d/0x970 [xfs fb841c7180aad3f8359438576e27867f5795667e]
  xfs_log_mount_finish+0x2d7/0x5d0 [xfs fb841c7180aad3f8359438576e27867f5795667e]
  xfs_mountfs+0x11d4/0x1d10 [xfs fb841c7180aad3f8359438576e27867f5795667e]
  xfs_fs_fill_super+0x13d5/0x1a80 [xfs fb841c7180aad3f8359438576e27867f5795667e]
  get_tree_bdev+0x3da/0x6e0
  vfs_get_tree+0x7d/0x240
  path_mount+0xdd3/0x17d0
  __x64_sys_mount+0x1fa/0x270
  do_syscall_64+0x2b/0x80
  entry_SYSCALL_64_after_hwframe+0x46/0xb0
 RIP: 0033:0x7ff5bc069eae
 Code: 48 8b 0d 85 1f 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 52 1f 0f 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffe433fd448 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff5bc069eae
 RDX: 00005575d7213290 RSI: 00005575d72132d0 RDI: 00005575d72132b0
 RBP: 00005575d7212fd0 R08: 00005575d7213230 R09: 00005575d7213fe0
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 00005575d7213290 R14: 00005575d72132b0 R15: 00005575d7212fd0
  </TASK>

 Allocated by task 5680:
  kasan_save_stack+0x1e/0x40
  __kasan_slab_alloc+0x66/0x80
  kmem_cache_alloc+0x152/0x320
  xfs_rui_init+0x17a/0x1b0 [xfs]
  xlog_recover_rui_commit_pass2+0xb9/0x2e0 [xfs]
  xlog_recover_items_pass2+0xe9/0x220 [xfs]
  xlog_recover_commit_trans+0x673/0x900 [xfs]
  xlog_recovery_process_trans+0xbe/0x130 [xfs]
  xlog_recover_process_data+0x103/0x2a0 [xfs]
  xlog_do_recovery_pass+0x548/0xc60 [xfs]
  xlog_do_log_recovery+0x62/0xc0 [xfs]
  xlog_do_recover+0x73/0x480 [xfs]
  xlog_recover+0x229/0x460 [xfs]
  xfs_log_mount+0x284/0x640 [xfs]
  xfs_mountfs+0xf8b/0x1d10 [xfs]
  xfs_fs_fill_super+0x13d5/0x1a80 [xfs]
  get_tree_bdev+0x3da/0x6e0
  vfs_get_tree+0x7d/0x240
  path_mount+0xdd3/0x17d0
  __x64_sys_mount+0x1fa/0x270
  do_syscall_64+0x2b/0x80
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

 Freed by task 5680:
  kasan_save_stack+0x1e/0x40
  kasan_set_track+0x21/0x30
  kasan_set_free_info+0x20/0x30
  ____kasan_slab_free+0x144/0x1b0
  slab_free_freelist_hook+0xab/0x180
  kmem_cache_free+0x1f1/0x410
  xfs_rud_item_release+0x33/0x80 [xfs]
  xfs_trans_free_items+0xc3/0x220 [xfs]
  xfs_trans_cancel+0x1fa/0x590 [xfs]
  xfs_rui_item_recover+0x913/0xd60 [xfs]
  xlog_recover_process_intents+0x24e/0xae0 [xfs]
  xlog_recover_finish+0x7d/0x970 [xfs]
  xfs_log_mount_finish+0x2d7/0x5d0 [xfs]
  xfs_mountfs+0x11d4/0x1d10 [xfs]
  xfs_fs_fill_super+0x13d5/0x1a80 [xfs]
  get_tree_bdev+0x3da/0x6e0
  vfs_get_tree+0x7d/0x240
  path_mount+0xdd3/0x17d0
  __x64_sys_mount+0x1fa/0x270
  do_syscall_64+0x2b/0x80
  entry_SYSCALL_64_after_hwframe+0x46/0xb0

 The buggy address belongs to the object at ffff88804391b300
  which belongs to the cache xfs_rui_item of size 688
 The buggy address is located 96 bytes inside of
  688-byte region [ffff88804391b300, ffff88804391b5b0)

 The buggy address belongs to the physical page:
 page:ffffea00010e4600 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888043919320 pfn:0x43918
 head:ffffea00010e4600 order:2 compound_mapcount:0 compound_pincount:0
 flags: 0x4fff80000010200(slab|head|node=1|zone=1|lastcpupid=0xfff)
 raw: 04fff80000010200 0000000000000000 dead000000000122 ffff88807f0eadc0
 raw: ffff888043919320 0000000080140010 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff88804391b200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff88804391b280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 >ffff88804391b300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                        ^
  ffff88804391b380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff88804391b400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ==================================================================

The test fuzzes an rmap btree block and starts writer threads to induce
a filesystem shutdown on the corrupt block.  When the filesystem is
remounted, recovery will try to replay the committed rmap intent item,
but the corruption problem causes the recovery transaction to fail.
Cancelling the transaction frees the RUD, which frees the RUI that we
recovered.

When we return to xlog_recover_process_intents, @lip is now a dangling
pointer, and we cannot use it to find the iop_recover method for the
tracepoint.  Hence we must store the item ops before calling
->iop_recover if we want to give it to the tracepoint so that the trace
data will tell us exactly which intent item failed.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-10-18 14:39:29 -07:00
Linus Torvalds aae703b02f for-6.1-rc1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmNNTxsACgkQxWXV+ddt
 WDs7QA//WaEPFWO/086pWBhlJF8k+QCNwM/EEKQL5x4hxC6pEGbHk04q63IY3XKh
 B9WEoxTfkxpHz8p+p9wDVRJl1Fdby/UFc1l2xTLcU273wL4Iweqf00N4WyEYmqQ3
 DtZcYmh4r7gZ9cTuHk8Ex+llhqAUiN7mY3FoEU8naZCE+Fdn/h3T8DT79V2XLgzv
 4f46ci0ao74o30EE7vc/Yw3gr1ouJJ4Ajw/UCEXUVC9tWOLcDNE6501AshT/ozDp
 m2tljY630QIayaMjtR+HCJHmdmB5bNGdE01Cssqc8M+M7AtQKvf+A/nQNTiI0UfK
 6ODdukvteTEEVKL2XHHkW6RWzR1rfhT1JOrl3YRKZwAKYsURXI/t+2UIjZtVstY9
 GRb3YGBDVggtbjyXxC04i4WyF3RoHRehGiF/G303BBGFMXfgZ17rvSp7DfL9KLcc
 VNycW17CcQLVZXueNWJrNSu2dQ0X8Lx0X+OTcsxRkNCJW+JQHffDl/TwMt0GtRoO
 Vhwjp8vUKuJDZbjvGXg0ZKrmk0T12+L8ubt5o5fQtMiFf+RGq77xzI1112ZIwsL0
 OtGOD3ShgKDvz24HoxSAVTbHq+/s+bmhIL/xU4QAeol3sOVPfx6b+KqcmTyG9E9u
 +gbqB9js/2vbDFNtmhOV8Fv1HbGT8bwtMCIlq5CzsiX+aT5rT88=
 =aPaQ
 -----END PGP SIGNATURE-----

Merge tag 'for-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fiemap fixes:
      - add missing path cache update
      - fix processing of delayed data and tree refs during backref
        walking, this could lead to reporting incorrect extent sharing

 - fix extent range locking under heavy contention to avoid deadlocks

 - make it possible to test send v3 in debugging mode

 - update links in MAINTAINERS

* tag 'for-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  MAINTAINERS: update btrfs website links and files
  btrfs: ignore fiemap path cache if we have multiple leaves for a data extent
  btrfs: fix processing of delayed tree block refs during backref walking
  btrfs: fix processing of delayed data refs during backref walking
  btrfs: delete stale comments after merge conflict resolution
  btrfs: unlock locked extent area if we have contention
  btrfs: send: update command for protocol version check
  btrfs: send: allow protocol version 3 with CONFIG_BTRFS_DEBUG
  btrfs: add missing path cache update during fiemap
2022-10-18 11:25:50 -07:00
Zhang Xiaoxu 30b2d7f8f1 cifs: Fix memory leak when build ntlmssp negotiate blob failed
There is a memory leak when mount cifs:
  unreferenced object 0xffff888166059600 (size 448):
    comm "mount.cifs", pid 51391, jiffies 4295596373 (age 330.596s)
    hex dump (first 32 bytes):
      fe 53 4d 42 40 00 00 00 00 00 00 00 01 00 82 00  .SMB@...........
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<0000000060609a61>] mempool_alloc+0xe1/0x260
      [<00000000adfa6c63>] cifs_small_buf_get+0x24/0x60
      [<00000000ebb404c7>] __smb2_plain_req_init+0x32/0x460
      [<00000000bcf875b4>] SMB2_sess_alloc_buffer+0xa4/0x3f0
      [<00000000753a2987>] SMB2_sess_auth_rawntlmssp_negotiate+0xf5/0x480
      [<00000000f0c1f4f9>] SMB2_sess_setup+0x253/0x410
      [<00000000a8b83303>] cifs_setup_session+0x18f/0x4c0
      [<00000000854bd16d>] cifs_get_smb_ses+0xae7/0x13c0
      [<000000006cbc43d9>] mount_get_conns+0x7a/0x730
      [<000000005922d816>] cifs_mount+0x103/0xd10
      [<00000000e33def3b>] cifs_smb3_do_mount+0x1dd/0xc90
      [<0000000078034979>] smb3_get_tree+0x1d5/0x300
      [<000000004371f980>] vfs_get_tree+0x41/0xf0
      [<00000000b670d8a7>] path_mount+0x9b3/0xdd0
      [<000000005e839a7d>] __x64_sys_mount+0x190/0x1d0
      [<000000009404c3b9>] do_syscall_64+0x35/0x80

When build ntlmssp negotiate blob failed, the session setup request
should be freed.

Fixes: 49bd49f983 ("cifs: send workstation name during ntlmssp session setup")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Ronnie Sahlberg 053569ccde cifs: set rc to -ENOENT if we can not get a dentry for the cached dir
We already set rc to this return code further down in the function but
we can set it earlier in order to suppress a smash warning.

Also fix a false positive for Coverity. The reason this is a false positive is
that this happens during umount after all files and directories have been closed
but mosetting on ->on_list to suppress the warning.

Reported-by: Dan carpenter <dan.carpenter@oracle.com>
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1525256 ("Concurrent data access violations")
Fixes: a350d6e73f5e ("cifs: enable caching of directories for which a lease is held")
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Yang Yingliang d32f211adb cifs: use LIST_HEAD() and list_move() to simplify code
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD().

Using list_move() instead of list_del() and list_add().

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Zhang Xiaoxu 10269f1325 cifs: Fix xid leak in cifs_get_file_info_unix()
If stardup the symlink target failed, should free the xid,
otherwise the xid will be leaked.

Fixes: 76894f3e2f ("cifs: improve symlink handling for smb2+")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Zhang Xiaoxu e909d054bd cifs: Fix xid leak in cifs_ses_add_channel()
Before return, should free the xid, otherwise, the
xid will be leaked.

Fixes: d70e9fa558 ("cifs: try opening channels after mounting")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Zhang Xiaoxu 575e079c78 cifs: Fix xid leak in cifs_flock()
If not flock, before return -ENOLCK, should free the xid,
otherwise, the xid will be leaked.

Fixes: d0677992d2 ("cifs: add support for flock")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Zhang Xiaoxu 9a97df404a cifs: Fix xid leak in cifs_copy_file_range()
If the file is used by swap, before return -EOPNOTSUPP, should
free the xid, otherwise, the xid will be leaked.

Fixes: 4e8aea30f7 ("smb3: enable swap on SMB3 mounts")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Zhang Xiaoxu fee0fb1f15 cifs: Fix xid leak in cifs_create()
If the cifs already shutdown, we should free the xid before return,
otherwise, the xid will be leaked.

Fixes: 087f757b01 ("cifs: add shutdown support")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-18 11:33:43 -05:00
Dawei Li ce4b815686 erofs: protect s_inodes with s_inode_list_lock for fscache
s_inodes is superblock-specific resource, which should be
protected by sb's specific lock s_inode_list_lock.

Link: https://lore.kernel.org/r/TYCP286MB23238380DE3B74874E8D78ABCA299@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM
Fixes: 7d41963759 ("erofs: Support sharing cookies in the same domain")
Reviewed-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Dawei Li <set_pte_at@outlook.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-10-17 14:57:57 +08:00
Gao Xiang e7933278b4 erofs: fix up inplace decompression success rate
Partial decompression should be checked after updating length.
It's a new regression when introducing multi-reference pclusters.

Fixes: 2bfab9c0ed ("erofs: record the longest decompressed size in this round")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221014064915.8103-1-hsiangkao@linux.alibaba.com
2022-10-17 06:55:49 +08:00
Gao Xiang 63bbb85658 erofs: shouldn't churn the mapping page for duplicated copies
If other duplicated copies exist in one decompression shot, should
leave the old page as is rather than replace it with the new duplicated
one.  Otherwise, the following cold path to deal with duplicated copies
will use the invalid bvec.  It impacts compressed data deduplication.

Also, shift the onlinepage EIO bit to avoid touching the signed bit.

Fixes: 267f2492c8 ("erofs: introduce multi-reference pclusters (fully-referenced)")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221012045056.13421-1-hsiangkao@linux.alibaba.com
2022-10-17 06:55:49 +08:00
Yue Hu 664609e49f erofs: fix illegal unmapped accesses in z_erofs_fill_inode_lazy()
Note that we are still accessing 'h_idata_size' and 'h_fragmentoff'
after calling erofs_put_metabuf(), that is not correct. Fix it.

Fixes: ab92184ff8 ("erofs: add on-disk compressed tail-packing inline support")
Fixes: b15b2e307c ("erofs: support on-disk compressed fragments data")
Signed-off-by: Yue Hu <huyue2@coolpad.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20221005013528.62977-1-zbestahu@163.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-10-17 06:55:48 +08:00
Linus Torvalds f1947d7c8a Random number generator fixes for Linux 6.1-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmNHYD0ACgkQSfxwEqXe
 A655AA//dJK0PdRghqrKQsl18GOCffV5TUw5i1VbJQbI9d8anfxNjVUQiNGZi4et
 qUwZ8OqVXxYx1Z1UDgUE39PjEDSG9/cCvOpMUWqN20/+6955WlNZjwA7Fk6zjvlM
 R30fz5CIJns9RFvGT4SwKqbVLXIMvfg/wDENUN+8sxt36+VD2gGol7J2JJdngEhM
 lW+zqzi0ABqYy5so4TU2kixpKmpC08rqFvQbD1GPid+50+JsOiIqftDErt9Eg1Mg
 MqYivoFCvbAlxxxRh3+UHBd7ZpJLtp1UFEOl2Rf00OXO+ZclLCAQAsTczucIWK9M
 8LCZjb7d4lPJv9RpXFAl3R1xvfc+Uy2ga5KeXvufZtc5G3aMUKPuIU7k28ZyblVS
 XXsXEYhjTSd0tgi3d0JlValrIreSuj0z2QGT5pVcC9utuAqAqRIlosiPmgPlzXjr
 Us4jXaUhOIPKI+Musv/fqrxsTQziT0jgVA3Njlt4cuAGm/EeUbLUkMWwKXjZLTsv
 vDsBhEQFmyZqxWu4pYo534VX2mQWTaKRV1SUVVhQEHm57b00EAiZohoOvweB09SR
 4KiJapikoopmW4oAUFotUXUL1PM6yi+MXguTuc1SEYuLz/tCFtK8DJVwNpfnWZpE
 lZKvXyJnHq2Sgod/hEZq58PMvT6aNzTzSg7YzZy+VabxQGOO5mc=
 =M+mV
 -----END PGP SIGNATURE-----

Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while
     now and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes()

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that.

  So while this touches a lot of files, the actual amount of code that's
  hand fiddled is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
2022-10-16 15:27:07 -07:00
Linus Torvalds b08cd74448 15 cifs/smb3 fixes including 2 for stable
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmNLLdMACgkQiiy9cAdy
 T1EWGwv/SN2nGgQ5QBxcqHmKyYyP/ndo3iig6L55VdBrWjTIH6vMAJfDAZwq5Veb
 YNkN6EQhT4XQ8RQ3nzJdSuwSLZu3CF5JMdChyoM3qgLSrT9CU3n2E5n0S/OK7TNI
 QvX2RMZrOiwl2sApJXk5xQgmocGwNOMdsI9IwxZ8zDATpLUicwYmcNajDTmPHjX7
 W8IVh4hHTBmVpnN5WkKLninLCz23eqcEMQb+nxtD3sMhOAUGn6gkKo+0L1EOYwgS
 jFfZxGbLoIBi1IhMcoWEcODnDtpwS9vtGDy2ZPVkefp5LqjjEEqR7LQLAvSt0dYK
 MpXHAvTUO8T7EEDx2djUNVM0JfXYBa0w3iOqSkOS2EBZF/Q8/ABWccfNpCTnGoAZ
 G3VJvDP/K5iPiRwy/0KK6CeCvCiXgBslcvRVdIvSCsehuvEyYTXRrjRLHElZBVZz
 Odwn8C9UGy5w0DTQ6QvwYtYWpoi2UFm8GuoswOfljYwnrul+7P0R14xxARJE8TOh
 fBXL0FAu
 =8BkB
 -----END PGP SIGNATURE-----

Merge tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull more cifs updates from Steve French:

 - fix a regression in guest mounts to old servers

 - improvements to directory leasing (caching directory entries safely
   beyond the root directory)

 - symlink improvement (reducing roundtrips needed to process symlinks)

 - an lseek fix (to problem where some dir entries could be skipped)

 - improved ioctl for returning more detailed information on directory
   change notifications

 - clarify multichannel interface query warning

 - cleanup fix (for better aligning buffers using ALIGN and round_up)

 - a compounding fix

 - fix some uninitialized variable bugs found by Coverity and the kernel
   test robot

* tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: improve SMB3 change notification support
  cifs: lease key is uninitialized in two additional functions when smb1
  cifs: lease key is uninitialized in smb1 paths
  smb3: must initialize two ACL struct fields to zero
  cifs: fix double-fault crash during ntlmssp
  cifs: fix static checker warning
  cifs: use ALIGN() and round_up() macros
  cifs: find and use the dentry for cached non-root directories also
  cifs: enable caching of directories for which a lease is held
  cifs: prevent copying past input buffer boundaries
  cifs: fix uninitialised var in smb2_compound_op()
  cifs: improve symlink handling for smb2+
  smb3: clarify multichannel warning
  cifs: fix regression in very old smb1 mounts
  cifs: fix skipping to incorrect offset in emit_cached_dirents
2022-10-16 11:01:40 -07:00
Steve French e3e9463414 smb3: improve SMB3 change notification support
Change notification is a commonly supported feature by most servers,
but the current ioctl to request notification when a directory is
changed does not return the information about what changed
(even though it is returned by the server in the SMB3 change
notify response), it simply returns when there is a change.

This ioctl improves upon CIFS_IOC_NOTIFY by returning the notify
information structure which includes the name of the file(s) that
changed and why. See MS-SMB2 2.2.35 for details on the individual
filter flags and the file_notify_information structure returned.

To use this simply pass in the following (with enough space
to fit at least one file_notify_information structure)

struct __attribute__((__packed__)) smb3_notify {
       uint32_t completion_filter;
       bool     watch_tree;
       uint32_t data_len;
       uint8_t  data[];
} __packed;

using CIFS_IOC_NOTIFY_INFO 0xc009cf0b
 or equivalently _IOWR(CIFS_IOCTL_MAGIC, 11, struct smb3_notify_info)

The ioctl will block until the server detects a change to that
directory or its subdirectories (if watch_tree is set).

Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Steve French 2bff065933 cifs: lease key is uninitialized in two additional functions when smb1
cifs_open and _cifsFileInfo_put also end up with lease_key uninitialized
in smb1 mounts.  It is cleaner to set lease key to zero in these
places where leases are not supported (smb1 can not return lease keys
so the field was uninitialized).

Addresses-Coverity: 1514207 ("Uninitialized scalar variable")
Addresses-Coverity: 1514331 ("Uninitialized scalar variable")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Steve French 625b60d4f9 cifs: lease key is uninitialized in smb1 paths
It is cleaner to set lease key to zero in the places where leases are not
supported (smb1 can not return lease keys so the field was uninitialized).

Addresses-Coverity: 1513994 ("Uninitialized scalar variable")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Steve French f09bd695af smb3: must initialize two ACL struct fields to zero
Coverity spotted that we were not initalizing Stbz1 and Stbz2 to
zero in create_sd_buf.

Addresses-Coverity: 1513848 ("Uninitialized scalar variable")
Cc: <stable@vger.kernel.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Paulo Alcantara b854b4ee66 cifs: fix double-fault crash during ntlmssp
The crash occurred because we were calling memzero_explicit() on an
already freed sess_data::iov[1] (ntlmsspblob) in sess_free_buffer().

Fix this by not calling memzero_explicit() on sess_data::iov[1] as
it's already by handled by callers.

Fixes: a4e430c8c8 ("cifs: replace kfree() with kfree_sensitive() for sensitive data")
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:04:38 -05:00
Linus Torvalds b7cef0d21c This pull request contains updates for UBI and UBIFS
UBI:
 	- Use bitmap API to allocate bitmaps
 	- New attach mode, disable_fm, to attach without fastmap
         - Fixes for various typos in comments
 
 UBIFS:
 	- Fix for a deadlock when setting xattrs for encrypted file
 	- Fix for an assertion failures when truncating encrypted files
         - Fixes for various typos in comments
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmNJymMWHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wRuRD/0cAP02dtaUvOHrPqBf1VA/gMdt
 hxsscSbxrbIaSyrc/Olc4uI0rqmUzijr8AI3YBqrIQvZGxrjhDE/O2ai7dQ4wzku
 J+ynLaH5GRzqLtnf1yBc2ss5rkhLeuoQ2H7z7PV6v4dyqjDQYCWCHGUIbPD8XEMq
 m0cOJfw1COPDYz/R4OVH40qDWq/D9okr3siV15xmKV+9+dRZqqKmKay4WRjktuQd
 34ryO5Hl8ynlIUdonjtLGn5I+q2ZTggrHwEhwFPWdQIytOiyWkPUQK+vPiyjTdH2
 O7sHifqlPLOJcqBHxM64pVXyPg/f9nFWsLv25RodWu9BfXA3K9eOY1dyGr9In1ly
 RXyToXLeVOwkvVgTTBwaBlPBVwJmGWdklxErgx3ZbhZ/HarPafFjT8Fkt+c/tGny
 Yb7A3zo33oy8hJRApaTijtab7evtAH2lRxqRewssY9yEAoxiPuwAfHempom+CWQj
 WdaZjbvb3uQodQSSVBbdl9X5lmaF1eSoAEr7o09mA+MOT7jB80yxm/SoXEg+cdPj
 fVRMRt2ciCE84i96l4XmjhB9sFIw6XrRYDMAP9BVd6Lxm9o+Fv6rzC4ApbnJasK0
 56M0dUgec+EZ8WyJiJJF/uOrNYq/r4OYnohmMChErGsY8cF3orhEb+Dh2oOm944E
 giHLrLLHJFZofFBFrw==
 =UVHq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBI and UBIFS updates from Richard Weinberger:
 "UBI:
   - Use bitmap API to allocate bitmaps
   - New attach mode, disable_fm, to attach without fastmap
   - Fixes for various typos in comments

  UBIFS:
   - Fix for a deadlock when setting xattrs for encrypted file
   - Fix for an assertion failures when truncating encrypted files
   - Fixes for various typos in comments"

* tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubi: fastmap: Add fastmap control support for 'UBI_IOCATT' ioctl
  ubi: fastmap: Use the bitmap API to allocate bitmaps
  ubifs: Fix AA deadlock when setting xattr for encrypted file
  ubifs: Fix UBIFS ro fail due to truncate in the encrypted directory
  mtd: ubi: drop unexpected word 'a' in comments
  ubi: block: Fix typos in comments
  ubi: fastmap: Fix typo in comments
  ubi: Fix repeated words in comments
  ubi: ubi-media.h: Fix comment typo
  ubi: block: Remove in vain semicolon
  ubifs: Fix ubifs_check_dir_empty() kernel-doc comment
2022-10-14 18:23:23 -07:00
Linus Torvalds 91080ab38f This pull request contains the following changes for UML:
- Move to strscpy()
 - Improve panic notifiers
 - Fix NR_CPUS usage
 - Fixes for various comments
 - Fixes for virtio driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmNJy+8WHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wS8ZEADnEH5M4Igf2vUkBp6cGapYuLM3
 ADfDqRfJuKYOLEBSHcQ5cV6opYebPF6BJ/xvFGZyB5h5TogYAeZPHNWx6mh5PjmK
 NJjhZwwNJJakwmVPieOqS1LlDdxDW+YaWjF+tU+3kB7u5ACnaZ/bFDKoifDIvNo8
 nbFsTDXLVHYUr4b5zer9DRov4bniamn0cEMOm8FAxC9g2z/lacRAJWchzdAdyi8i
 HKctwhi4HYl3pfecwJZn5WS1aHCo3AlWovAQPY0aTZd+zUeRJBI0IerJbgxyrBfZ
 Gzr32w/DSMXEsCM3dvsRhMzkosqJSjkFdO6hnjwZVMhxFKGltMFx1wk5MFtveMmF
 FCfRgVW58AfFPst/6wWcSINnK75PgIBoetnk5RtwfiGDDYlk8foj/8wCnIFIyp6I
 PafrpqaV+KH7bs8U4upCjJFVGmgS4e1P/+W5evoPGxvc878sWa92JbFRJfImybpz
 VE8WbS+NjJs+p6rVwvzjmuF49jdEnAHHg7S7xfj+MU3/yKzTeaUwCJQdEvhUrmP/
 4gmGjSoHuFfUGu0yXA3zpJx/F4y+k8kQeVacB4t+CNAOcA93qvKxMZtXUrmfYA/9
 5nOo+TUx3dcxU7nAtcjSqh5GBCr/X0Aj2RYnERzXr1QkKf1BK3MZtxySYtQAGZvb
 sKvbGtYQi50pa006Vg==
 =YfF1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux

Pull UML updates from Richard Weinberger:

 - Move to strscpy()

 - Improve panic notifiers

 - Fix NR_CPUS usage

 - Fixes for various comments

 - Fixes for virtio driver

* tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
  uml: Remove the initialization of statics to 0
  um: Do not initialise statics to 0.
  um: Fix comment typo
  um: Improve panic notifiers consistency and ordering
  um: remove unused reactivate_chan() declaration
  um: mmaper: add __exit annotations to module exit funcs
  um: virt-pci: add __init/__exit annotations to module init/exit funcs
  hostfs: move from strlcpy with unused retval to strscpy
  um: move from strlcpy with unused retval to strscpy
  um: increase default virtual physical memory to 64 MiB
  UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
  um: read multiple msg from virtio slave request fd
2022-10-14 18:14:48 -07:00
Linus Torvalds 5e714bf171 - Alistair Popple has a series which addresses a race which causes page
refcounting errors in ZONE_DEVICE pages.
 
 - Peter Xu fixes some userfaultfd test harness instability.
 
 - Various other patches in MM, mainly fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0j6igAKCRDdBJ7gKXxA
 jnGxAP99bV39ZtOsoY4OHdZlWU16BUjKuf/cb3bZlC2G849vEwD+OKlij86SG20j
 MGJQ6TfULJ8f1dnQDd6wvDfl3FMl7Qc=
 =tbdp
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - fix a race which causes page refcounting errors in ZONE_DEVICE pages
   (Alistair Popple)

 - fix userfaultfd test harness instability (Peter Xu)

 - various other patches in MM, mainly fixes

* tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits)
  highmem: fix kmap_to_page() for kmap_local_page() addresses
  mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page
  mm/selftest: uffd: explain the write missing fault check
  mm/hugetlb: use hugetlb_pte_stable in migration race check
  mm/hugetlb: fix race condition of uffd missing/minor handling
  zram: always expose rw_page
  LoongArch: update local TLB if PTE entry exists
  mm: use update_mmu_tlb() on the second thread
  kasan: fix array-bounds warnings in tests
  hmm-tests: add test for migrate_device_range()
  nouveau/dmem: evict device private memory during release
  nouveau/dmem: refactor nouveau_dmem_fault_copy_one()
  mm/migrate_device.c: add migrate_device_range()
  mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page()
  mm/memremap.c: take a pgmap reference on page allocation
  mm: free device private pages have zero refcount
  mm/memory.c: fix race when faulting a device private page
  mm/damon: use damon_sz_region() in appropriate place
  mm/damon: move sz_damon_region to damon_sz_region
  lib/test_meminit: add checks for the allocation functions
  ...
2022-10-14 12:28:43 -07:00
Paulo Alcantara a9e17d3d74 cifs: fix static checker warning
Remove unnecessary NULL check of oparam->cifs_sb when parsing symlink
error response as it's already set by all smb2_open_file() callers and
deferenced earlier.

This fixes below report:

  fs/cifs/smb2file.c:126 smb2_open_file()
  warn: variable dereferenced before check 'oparms->cifs_sb' (see line 112)

Link: https://lore.kernel.org/r/Y0kt42j2tdpYakRu@kili
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-14 12:35:25 -05:00
Linus Torvalds 524d0c6882 A quiet round this time: several assorted filesystem fixes, the most
noteworthy one being some additional wakeups in cap handling code, and
 a messenger cleanup.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmNINMwTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi5UeB/0ZzxdhZarepRYdOw4K+hHmCWYjlrEi
 Aw91gfS9DmzXfLyV42/6kxhKGEmVH4Wpz/mAIfMcLaLJZxI9GspVZZuofK6XPJUY
 eGqllxXgbgvCqnX9puCfw4RTrEJkt/y6e0/6EhAhjArDNyEylHcApONbEsvHLB+L
 IjYEJRuDDNqBnacMjn0iqI2F3zpyu6DkuJNWLxfbGhnWWsj8LaxXVgLtBeePuoIN
 udVZiNxiJAldDGc99r0xX5gicjyihBRiomjnz6FO6F459CtrPE/qdx6TNUUt63N3
 Lt55JDCM8qJeA8ffblZrhNnT2iefcEuqRcSwSdLbQxW/l6y23O4drx+N
 =l/PT
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "A quiet round this time: several assorted filesystem fixes, the most
  noteworthy one being some additional wakeups in cap handling code, and
  a messenger cleanup"

* tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client:
  ceph: remove Sage's git tree from documentation
  ceph: fix incorrectly showing the .snap size for stat
  ceph: fail the open_by_handle_at() if the dentry is being unlinked
  ceph: increment i_version when doing a setattr with caps
  ceph: Use kcalloc for allocating multiple elements
  ceph: no need to wait for transition RDCACHE|RD -> RD
  ceph: fail the request if the peer MDS doesn't support getvxattr op
  ceph: wake up the waiters if any new caps comes
  libceph: drop last_piece flag from ceph_msg_data_cursor
2022-10-13 10:21:37 -07:00
Linus Torvalds 66b8345585 NFS Client Updates for Linux 6.1
- New Features:
   - Add NFSv4.2 xattr tracepoints
   - Replace xprtiod WQ in rpcrdma
   - Flexfiles cancels I/O on layout recall or revoke
 
 - Bugfixes and Cleanups:
   - Directly use ida_alloc() / ida_free()
   - Don't open-code max_t()
   - Prefer using strscpy over strlcpy
   - Remove unused forward declarations
   - Always return layout states on flexfiles layout return
   - Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of error
   - Allow more xprtrdma memory allocations to fail without triggering a reclaim
   - Various other xprtrdma clean ups
   - Fix rpc_killall_tasks() races
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmNHJToACgkQ18tUv7Cl
 QOtTbA//QiresBzf7cnZOAwiZbe9LXiWfR2p5IkBLJPYJ8xtTliRLwnwYgQib9OI
 +4DzBiEqujah9BDac5OeatYW1UDLQ9lMIoCyvPjSw8Yxa8JEHDb/1ODDUOMS+ZIo
 dk1AKV2Wi2stxn85Sy+VGriE3JKiaeJxAlsWgiT/BLP0hAyZw1L3Tg017EgxVIVz
 8cfPBciu/Bc2/pZp9f5+GBjAlcUX0u/JFKiLPDHDZkvFTr4RgREZOyStDWncgsxK
 iHAIfSr6TxlynHabNAnFNVuYq7gkBe3jg1TkABdQ+SilAgdLpugAW8MFdig0AZQO
 UIsVJHjRHLpz6cJurnDcu9tGB6jLVTZfyz8PZQl5H9CqnbSHUxdOCTuve7fGhVas
 +wSXq1U98gStzoqtw5pMwsB2YSSOsUR8QEZpLEkvQgzHwoszNa7FrELqaZUJyJHR
 qmRH2nKCzsSBbQn5AhnzHBxzeOv6r0r3YjvKd5utwsRtq3g9GX14KAOmqvDTKk2q
 9KmrGlDVtVmOww2QnPTXH6mSthHLuqcKg1H2H7Xymmskq9n8PC6M+EiQd8XsKNJa
 MfBkOVFdxrJq6Htpx4IMLJP6jvYVKEbef2eRFt8hNnla8pMPlsDqoIysJulaWpiB
 HqdoPHR9Y26Qxuw7G91ba5Q5qqu9+ZOLB9jeSRjcXtsUDxq9f/A=
 =p47k
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "New Features:
   - Add NFSv4.2 xattr tracepoints
   - Replace xprtiod WQ in rpcrdma
   - Flexfiles cancels I/O on layout recall or revoke

  Bugfixes and Cleanups:
   - Directly use ida_alloc() / ida_free()
   - Don't open-code max_t()
   - Prefer using strscpy over strlcpy
   - Remove unused forward declarations
   - Always return layout states on flexfiles layout return
   - Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of
     error
   - Allow more xprtrdma memory allocations to fail without triggering a
     reclaim
   - Various other xprtrdma clean ups
   - Fix rpc_killall_tasks() races"

* tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
  NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked
  SUNRPC: Add API to force the client to disconnect
  SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls
  SUNRPC: Fix races with rpc_killall_tasks()
  xprtrdma: Fix uninitialized variable
  xprtrdma: Prevent memory allocations from driving a reclaim
  xprtrdma: Memory allocation should be allowed to fail during connect
  xprtrdma: MR-related memory allocation should be allowed to fail
  xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc()
  xprtrdma: Clean up synopsis of rpcrdma_req_create()
  svcrdma: Clean up RPCRDMA_DEF_GFP
  SUNRPC: Replace the use of the xprtiod WQ in rpcrdma
  NFSv4.2: Add a tracepoint for listxattr
  NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr
  NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2
  NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR
  nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration
  NFSv4: remove nfs4_renewd_prepare_shutdown() declaration
  fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment
  NFSv4/pNFS: Always return layout stats on layout return for flexfiles
  ...
2022-10-13 09:58:42 -07:00
Linus Torvalds 531d3b5f73 Orangefs: change iterate to iterate_shared
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIGSFVdO6eop9nER2z0QOqevODb4FAmM7KyoACgkQz0QOqevO
 Db6b0g/7BRncAvvnX6l4ktJe1l5bdZYiQNFvnUSJ/Z2kQUUwBPF0Jt1tIF6ntcrq
 yDVZWiY5RRLkbPqiZNaBFuJtJvzT9vSs1JnpqNjhCpFB/4lTdTQbkUL2att6cQt0
 OTW2OeWH+WKIOWUdhAFZI9rinCsXizAN0bR0GUYujg2Px9kEXdjFR6BfbR7Q4QXY
 mjEaai9EV5JI9NXUgpjHYZkQOkn/B+JzVmR84HXZ1KpL6euZVeSj9ytKWPwNNsPB
 3AuXZIAGKNKsGehtY3svIZSkWkl5IdyJshE8fVBlKvwSIjoDoMbR4A70k3CstaN8
 XXlLe7tuVG5MqqPsn4AOFpiv/Om2Z29nm5pZYUU4ImxlaOrGwaoTBS393Zmb/aIp
 hLMJtK8AEZZjFRnLHnlUgfhdZO0QsuqMBuqHOXdszuJ6W303pu+E+/++AijSn5J6
 W8V//3gTEq/il37HPh9MOGQ91nuiaiJpBVO/UbAy289x7qwTgB52zdSlqBQRIYjA
 X+0u/lQ/jaNJE4YiJIeogm+FeKVY1KOVXqLAv658s8Jhd7jdbqixOYutxXLTh7xx
 RBprGu8Hrkmty04mkquKo7YFeO6S0hkpCi646rc87CWQorIuSfMjP1ZEnexZ7M3G
 WlQOVzYUeEPwLr0L+vG9R0g7QI7WImUKGCbDXLKOXOp4BntcIS4=
 =9dpC
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.1-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs update from Mike Marshall:
 "Change iterate to iterate_shared"

* tag 'for-linus-6.1-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  Orangefs: change iterate to iterate_shared
2022-10-13 09:56:14 -07:00
Jeff Layton 93c128e709 nfsd: ensure we always call fh_verify_error tracepoint
This is a conditional tracepoint. Call it every time, not just when
nfs_permission fails.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-10-13 12:12:37 -04:00
Enzo Matsumiya d7173623bf cifs: use ALIGN() and round_up() macros
Improve code readability by using existing macros:

Replace hardcoded alignment computations (e.g. (len + 7) & ~0x7) by
ALIGN()/IS_ALIGNED() macros.

Also replace (DIV_ROUND_UP(len, 8) * 8) with ALIGN(len, 8), which, if
not optimized by the compiler, has the overhead of a multiplication
and a division. Do the same for roundup() by replacing it by round_up()
(division-less version, but requires the multiple to be a power of 2,
which is always the case for us).

And remove some unnecessary checks where !IS_ALIGNED() would fit, but
calling round_up() directly is fine as it's a no-op if the value is
already aligned.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:36:39 -05:00
Ronnie Sahlberg e4029e0726 cifs: find and use the dentry for cached non-root directories also
This allows us to use cached attributes for the entries in a cached
directory for as long as a lease is held on the directory itself.
Previously we have always allowed "used cached attributes for 1 second"
but this extends this to the lifetime of the lease as well as making the
caching safer.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:36:39 -05:00
Ronnie Sahlberg ebe98f1447 cifs: enable caching of directories for which a lease is held
This expands the directory caching to now cache an open handle for all
directories (up to a maximum) and not just the root directory.

In this patch, locking and refcounting is intended to work as so:

The main function to get a reference to a cached handle is
find_or_create_cached_dir() called from open_cached_dir()
These functions are protected under the cfid_list_lock spin-lock
to make sure we do not race creating new references for cached dirs
with deletion of expired ones.

An successful open_cached_dir() will take out 2 references to the cfid if
this was the very first and successful call to open the directory and
it acquired a lease from the server.
One reference is for the lease  and the other is for the cfid that we
return. The is lease reference is tracked by cfid->has_lease.
If the directory already has a handle with an active lease, then we just
take out one new reference for the cfid and return it.
It can happen that we have a thread that tries to open a cached directory
where we have a cfid already but we do not, yet, have a working lease. In
this case we will just return NULL, and this the caller will fall back to
the case when no handle was available.

In this model the total number of references we have on a cfid is
1 for while the handle is open and we have a lease, and one additional
reference for each open instance of a cfid.

Once we get a lease break (cached_dir_lease_break()) we remove the
cfid from the list under the spinlock. This prevents any new threads to
use it, and we also call smb2_cached_lease_break() via the work_queue
in order to drop the reference we got for the lease (we drop it outside
of the spin-lock.)
Anytime a thread calls close_cached_dir() we also drop a reference to the
cfid.
When the last reference to the cfid is released smb2_close_cached_fid()
will be invoked which will drop the reference ot the dentry we held for
this cfid and it will also, if we the handle is open/has a lease
also call SMB2_close() to close the handle on the server.

Two events require special handling:
invalidate_all_cached_dirs() this function is called from SMB2_tdis()
and cifs_mark_open_files_invalid().
In both cases the tcon is either gone already or will be shortly so
we do not need to actually close the handles. They will be dropped
server side as part of the tcon dropping.
But we have to be careful about a potential race with a concurrent
lease break so we need to take out additional refences to avoid the
cfid from being freed while we are still referencing it.

free_cached_dirs() which is called from tconInfoFree().
This is called quite late in the umount process so there should no longer
be any open handles or files and we can just free all the remaining data.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:36:39 -05:00
Paulo Alcantara 9ee2afe520 cifs: prevent copying past input buffer boundaries
Prevent copying past @data buffer in smb2_validate_and_copy_iov() as
the output buffer in @iov might be potentially bigger and thus copying
more bytes than requested in @minbufsize.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:36:39 -05:00
Paulo Alcantara 69ccafdd35 cifs: fix uninitialised var in smb2_compound_op()
Fix uninitialised variable @idata when calling smb2_compound_op() with
SMB2_OP_POSIX_QUERY_INFO.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:36:38 -05:00
Paulo Alcantara 76894f3e2f cifs: improve symlink handling for smb2+
When creating inode for symlink, the client used to send below
requests to fill it in:

    * create+query_info+close (STATUS_STOPPED_ON_SYMLINK)
    * create(+reparse_flag)+query_info+close (set file attrs)
    * create+ioctl(get_reparse)+close (query reparse tag)

and then for every access to the symlink dentry, the ->link() method
would send another:

    * create+ioctl(get_reparse)+close (parse symlink)

So, in order to improve:

    (i) Get rid of unnecessary roundtrips and then resolve symlinks as
	follows:

        * create+query_info+close (STATUS_STOPPED_ON_SYMLINK +
	                           parse symlink + get reparse tag)
        * create(+reparse_flag)+query_info+close (set file attrs)

    (ii) Set the resolved symlink target directly in inode->i_link and
         use simple_get_link() for ->link() to simply return it.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:36:04 -05:00
Steve French 977bb65308 smb3: clarify multichannel warning
When server does not return network interfaces, clarify the
message to indicate that "multichannel not available" not just
that "empty network interface returned by server ..."

Suggested-by: Tom Talpey <tom@talpey.com>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:35:57 -05:00
Ronnie Sahlberg 2f6f19c7aa cifs: fix regression in very old smb1 mounts
BZ: 215375

Fixes: 76a3c92ec9 ("cifs: remove support for NTLM and weaker authentication algorithms")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-13 09:35:07 -05:00
Matthew Wilcox (Oracle) 4fa0e3ff21 ext4,f2fs: fix readahead of verity data
The recent change of page_cache_ra_unbounded() arguments was buggy in the
two callers, causing us to readahead the wrong pages.  Move the definition
of ractl down to after the index is set correctly.  This affected
performance on configurations that use fs-verity.

Link: https://lkml.kernel.org/r/20221012193419.1453558-1-willy@infradead.org
Fixes: 73bb49da50 ("mm/readahead: make page_cache_ra_unbounded take a readahead_control")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Jintao Yin <nicememory@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-12 18:51:48 -07:00
Linus Torvalds 1440f57602 Five hotfixes - three for nilfs2, two for MM. For are cc:stable, one is
not.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0YhtwAKCRDdBJ7gKXxA
 juJLAQDCa0g8sfe9cTw3PT1gRnn8gWLHEkMgUWVC/aBaqYFGeQEAta+g8muv9Tpd
 qODv0JARH4cwONKEA24Oql+A5RnI6gQ=
 =QZnW
 -----END PGP SIGNATURE-----

Merge tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc hotfixes from Andrew Morton:
 "Five hotfixes - three for nilfs2, two for MM. For are cc:stable, one
  is not"

* tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  nilfs2: fix leak of nilfs_root in case of writer thread creation failure
  nilfs2: fix NULL pointer dereference at nilfs_bmap_lookup_at_level()
  nilfs2: fix use-after-free bug of struct nilfs_root
  mm/damon/core: initialize damon_target->list in damon_new_target()
  mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page
2022-10-12 11:16:58 -07:00