WSL2-Linux-Kernel/fs
Josef Bacik 647af8a998 btrfs: use nofs when cleaning up aborted transactions
commit 597441b343 upstream.

Our CI system caught a lockdep splat:

  ======================================================
  WARNING: possible circular locking dependency detected
  6.3.0-rc7+ #1167 Not tainted
  ------------------------------------------------------
  kswapd0/46 is trying to acquire lock:
  ffff8c6543abd650 (sb_internal#2){++++}-{0:0}, at: btrfs_commit_inode_delayed_inode+0x5f/0x120

  but task is already holding lock:
  ffffffffabe61b40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x4aa/0x7a0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (fs_reclaim){+.+.}-{0:0}:
	 fs_reclaim_acquire+0xa5/0xe0
	 kmem_cache_alloc+0x31/0x2c0
	 alloc_extent_state+0x1d/0xd0
	 __clear_extent_bit+0x2e0/0x4f0
	 try_release_extent_mapping+0x216/0x280
	 btrfs_release_folio+0x2e/0x90
	 invalidate_inode_pages2_range+0x397/0x470
	 btrfs_cleanup_dirty_bgs+0x9e/0x210
	 btrfs_cleanup_one_transaction+0x22/0x760
	 btrfs_commit_transaction+0x3b7/0x13a0
	 create_subvol+0x59b/0x970
	 btrfs_mksubvol+0x435/0x4f0
	 __btrfs_ioctl_snap_create+0x11e/0x1b0
	 btrfs_ioctl_snap_create_v2+0xbf/0x140
	 btrfs_ioctl+0xa45/0x28f0
	 __x64_sys_ioctl+0x88/0xc0
	 do_syscall_64+0x38/0x90
	 entry_SYSCALL_64_after_hwframe+0x72/0xdc

  -> #0 (sb_internal#2){++++}-{0:0}:
	 __lock_acquire+0x1435/0x21a0
	 lock_acquire+0xc2/0x2b0
	 start_transaction+0x401/0x730
	 btrfs_commit_inode_delayed_inode+0x5f/0x120
	 btrfs_evict_inode+0x292/0x3d0
	 evict+0xcc/0x1d0
	 inode_lru_isolate+0x14d/0x1e0
	 __list_lru_walk_one+0xbe/0x1c0
	 list_lru_walk_one+0x58/0x80
	 prune_icache_sb+0x39/0x60
	 super_cache_scan+0x161/0x1f0
	 do_shrink_slab+0x163/0x340
	 shrink_slab+0x1d3/0x290
	 shrink_node+0x300/0x720
	 balance_pgdat+0x35c/0x7a0
	 kswapd+0x205/0x410
	 kthread+0xf0/0x120
	 ret_from_fork+0x29/0x50

  other info that might help us debug this:

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(fs_reclaim);
				 lock(sb_internal#2);
				 lock(fs_reclaim);
    lock(sb_internal#2);

   *** DEADLOCK ***

  3 locks held by kswapd0/46:
   #0: ffffffffabe61b40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x4aa/0x7a0
   #1: ffffffffabe50270 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x113/0x290
   #2: ffff8c6543abd0e0 (&type->s_umount_key#44){++++}-{3:3}, at: super_cache_scan+0x38/0x1f0

  stack backtrace:
  CPU: 0 PID: 46 Comm: kswapd0 Not tainted 6.3.0-rc7+ #1167
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x58/0x90
   check_noncircular+0xd6/0x100
   ? save_trace+0x3f/0x310
   ? add_lock_to_list+0x97/0x120
   __lock_acquire+0x1435/0x21a0
   lock_acquire+0xc2/0x2b0
   ? btrfs_commit_inode_delayed_inode+0x5f/0x120
   start_transaction+0x401/0x730
   ? btrfs_commit_inode_delayed_inode+0x5f/0x120
   btrfs_commit_inode_delayed_inode+0x5f/0x120
   btrfs_evict_inode+0x292/0x3d0
   ? lock_release+0x134/0x270
   ? __pfx_wake_bit_function+0x10/0x10
   evict+0xcc/0x1d0
   inode_lru_isolate+0x14d/0x1e0
   __list_lru_walk_one+0xbe/0x1c0
   ? __pfx_inode_lru_isolate+0x10/0x10
   ? __pfx_inode_lru_isolate+0x10/0x10
   list_lru_walk_one+0x58/0x80
   prune_icache_sb+0x39/0x60
   super_cache_scan+0x161/0x1f0
   do_shrink_slab+0x163/0x340
   shrink_slab+0x1d3/0x290
   shrink_node+0x300/0x720
   balance_pgdat+0x35c/0x7a0
   kswapd+0x205/0x410
   ? __pfx_autoremove_wake_function+0x10/0x10
   ? __pfx_kswapd+0x10/0x10
   kthread+0xf0/0x120
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x29/0x50
   </TASK>

This happens because when we abort the transaction in the transaction
commit path we call invalidate_inode_pages2_range on our block group
cache inodes (if we have space cache v1) and any delalloc inodes we may
have.  The plain invalidate_inode_pages2_range() call passes through
GFP_KERNEL, which makes sense in most cases, but not here.  Wrap these
two invalidate callees with memalloc_nofs_save/memalloc_nofs_restore to
make sure we don't end up with the fs reclaim dependency under the
transaction dependency.

CC: stable@vger.kernel.org # 4.14+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30 13:55:30 +01:00
..
9p 9p: fix a bunch of checkpatch warnings 2022-08-17 14:24:07 +02:00
adfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
affs affs: initialize fsdata in affs_truncate() 2023-02-01 08:27:06 +01:00
afs afs: Fix updating of i_size with dv jump from server 2023-05-11 23:00:38 +09:00
autofs autofs: fix wait name hash calculation in autofs_wait() 2021-10-20 21:09:02 -04:00
befs isystem: ship and use stdarg.h 2021-08-19 09:02:55 +09:00
bfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
btrfs btrfs: use nofs when cleaning up aborted transactions 2023-05-30 13:55:30 +01:00
cachefiles fs: add is_idmapped_mnt() helper 2022-07-02 16:41:14 +02:00
ceph ceph: force updating the msg pointer in non-split case 2023-05-24 17:36:55 +01:00
cifs SMB3: drop reference to cfile before sending oplock break 2023-05-24 17:36:54 +01:00
coda coda: Avoid partial allocation of sig_inputArgs 2023-03-10 09:39:50 +01:00
configfs configfs: fix possible memory leak in configfs_create_dir() 2022-12-31 13:14:15 +01:00
cramfs
crypto fscrypt: fix keyring memory leak on mount failure 2022-11-10 18:15:37 +01:00
debugfs debugfs: fix error when writing negative value to atomic_t debugfs file 2022-12-31 13:14:03 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:27:01 +01:00
dlm fs: dlm: start midcomms before scand 2023-03-17 08:48:50 +01:00
ecryptfs fs: add is_idmapped_mnt() helper 2022-07-02 16:41:14 +02:00
efivarfs
efs
erofs erofs: fix potential overflow calculating xattr_isize 2023-05-11 23:00:20 +09:00
exfat exfat: fix inode->i_blocks for non-512 byte sector size device 2023-03-10 09:39:58 +01:00
exportfs exportfs: support idmapped mounts 2022-06-09 10:23:32 +02:00
ext2 ext2: Check block size validity during mount 2023-05-24 17:36:44 +01:00
ext4 ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa() 2023-05-24 17:36:45 +01:00
f2fs f2fs: fix to check readonly condition correctly 2023-05-24 17:36:45 +01:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-09 10:22:42 +02:00
freevxfs
fscache fscache: Remove an unused static variable 2021-10-04 22:13:12 +01:00
fuse fuse: fix deadlock between atomic O_TRUNC and page invalidation 2023-04-26 13:51:54 +02:00
gfs2 gfs2: Fix inode height consistency check 2023-05-24 17:36:45 +01:00
hfs hfs: fix missing hfs_bnode_get() in __hfs_bnode_create 2023-03-10 09:39:57 +01:00
hfsplus fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() 2023-05-24 17:36:43 +01:00
hostfs hostfs: support splice_write 2021-08-26 22:28:02 +02:00
hpfs hpfs: use iomap_fiemap to implement ->fiemap 2021-07-27 11:00:36 +02:00
hugetlbfs hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param() 2022-12-31 13:14:44 +01:00
iomap iomap: iomap_write_failed fix 2022-06-09 10:22:55 +02:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 15:05:50 +01:00
jbd2 jdb2: Don't refuse invalidation of already invalidated buffers 2023-05-11 23:00:29 +09:00
jffs2 jffs2: correct logic when creating a hole in jffs2_write_begin 2023-03-22 13:31:30 +01:00
jfs fs/jfs: fix shift exponent db_agl2size negative 2023-03-11 13:57:22 +01:00
kernfs kernfs: fix use-after-free in __kernfs_remove 2022-11-03 23:59:13 +09:00
ksmbd ksmbd: fix global-out-of-bounds in smb2_find_context_vals 2023-05-24 17:36:54 +01:00
lockd lockd: set file_lock start and end when decoding nlm4 testargs 2023-03-30 12:47:56 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-13 20:59:10 +02:00
netfs netfs: fix parameter of cleanup() 2021-12-29 12:28:59 +01:00
nfs NFSv4.1: Always send a RECLAIM_COMPLETE after establishing lease 2023-05-11 23:00:36 +09:00
nfs_common nfs: Fix kerneldoc warning shown up by W=1 2021-10-04 22:02:17 +01:00
nfsd NFSD: callback request does not use correct credential for AUTH_SYS 2023-04-13 16:48:19 +02:00
nilfs2 nilfs2: fix use-after-free bug of nilfs_root in nilfs_evict_inode() 2023-05-24 17:36:55 +01:00
nls
notify inotify: Avoid reporting event with invalid wd 2023-05-17 11:50:22 +02:00
ntfs ntfs: check overflow when iterating ATTR_RECORDs 2022-11-26 09:24:52 +01:00
ntfs3 fs/ntfs3: Fix a possible null-pointer dereference in ni_clear() 2023-05-24 17:36:47 +01:00
ocfs2 ocfs2: Switch to security_inode_init_security() 2023-05-30 13:55:29 +01:00
omfs mm: require ->set_page_dirty to be explicitly wired up 2021-06-29 10:53:48 -07:00
openpromfs
orangefs orangefs: Fix kmemleak in orangefs_{kernel,client}_debug_init() 2022-12-31 13:14:44 +01:00
overlayfs ovl: Use "buf" flexible array for memcpy() destination 2023-02-09 11:26:47 +01:00
proc mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps 2023-02-09 11:26:44 +01:00
pstore pstore: Revert pmsg_lock back to a normal mutex 2023-05-11 23:00:31 +09:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-21 08:36:48 -07:00
qnx6
quota ext4: fix bug_on in __es_tree_search caused by bad quota inode 2023-01-12 11:59:01 +01:00
ramfs fs: move ramfs_aops to libfs 2021-06-29 10:53:48 -07:00
reiserfs reiserfs: Add security prefix to xattr name in reiserfs_security_write() 2023-05-11 23:00:17 +09:00
romfs
smbfs_common cifs: Fix crash on unload of cifs_arc4.ko 2021-12-14 10:57:12 +01:00
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-22 12:57:07 +01:00
sysfs sysfs: Allow deferred execution of iomem_get_mapping() 2021-08-06 13:05:28 +02:00
sysv fs: sysv: Fix sysv_nblocks() returns wrong value 2022-12-31 13:14:05 +01:00
tracefs tracefs: Only clobber mode/uid/gid on remount if asked 2022-09-20 12:39:43 +02:00
ubifs ubifs: Fix memory leak in do_rename 2023-05-17 11:50:14 +02:00
udf udf: Fix off-by-one error when discarding preallocation 2023-03-17 08:48:50 +01:00
ufs isystem: ship and use stdarg.h 2021-08-19 09:02:55 +09:00
unicode
vboxsf vboxfs: fix broken legacy mount signature checking 2021-09-27 11:26:21 -07:00
verity fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY 2023-04-05 11:24:51 +02:00
xfs xfs: don't consider future format versions valid 2023-05-11 23:00:18 +09:00
zonefs zonefs: Fix error message in zonefs_file_dio_append() 2023-04-05 11:25:01 +02:00
Kconfig 4 cifs/smb3 fixes, one for DFS reconnect, and one to begin creating common headers for server and client and the other two to rename the cifs_common directory to smbfs_common to be more consistent ie change use of the name cifs to smb which is more accurate 2021-09-12 10:10:21 -07:00
Kconfig.binfmt binfmt: remove support for em86 (alpha only) 2021-07-25 22:33:03 -07:00
Makefile io_uring: move to separate directory 2022-12-14 11:37:31 +01:00
aio.c aio: fix mremap after fork null-deref 2023-02-22 12:57:05 +01:00
anon_inodes.c
attr.c attr: use consistent sgid stripping checks 2023-03-17 08:49:01 +01:00
bad_inode.c vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
binfmt_aout.c binfmt: a.out: Fix bogus semicolon 2021-09-05 10:15:05 -07:00
binfmt_elf.c fs/binfmt_elf: Fix memory leak in load_elf_binary() 2022-11-03 23:59:12 +09:00
binfmt_elf_fdpic.c binfmt: Fix error return code in load_elf_fdpic_binary() 2023-01-12 11:58:46 +01:00
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-09 10:22:26 +02:00
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2022-12-31 13:14:39 +01:00
binfmt_script.c
buffer.c mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-11-26 09:24:51 +01:00
char_dev.c chardev: fix error handling in cdev_device_add() 2022-12-31 13:14:30 +01:00
compat_binfmt_elf.c
coredump.c coredump: Use the vma snapshot in fill_files_note 2022-04-08 14:24:18 +02:00
d_path.c d_path: make 'prepend()' fill up the buffer exactly on overflow 2021-09-02 10:07:29 -07:00
dax.c fsdax: Fix infinite loop in dax_iomap_rw() 2022-09-28 11:11:56 +02:00
dcache.c
direct-io.c
drop_caches.c fs: drop_caches: fix skipping over shadow cache inodes 2021-09-03 09:58:10 -07:00
eventfd.c eventfd: provide a eventfd_signal_mask() helper 2023-01-24 07:22:43 +01:00
eventpoll.c eventpoll: add EPOLL_URING_WAKE poll wakeup flag 2023-01-24 07:22:43 +01:00
exec.c exec: Copy oldsighand->action under spin-lock 2022-11-03 23:59:12 +09:00
fcntl.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
fhandle.c
file.c fs: prevent out-of-bounds array speculation when closing a file descriptor 2023-03-17 08:48:47 +01:00
file_table.c locks: fix TOCTOU race when granting write lease 2022-10-26 12:34:58 +02:00
filesystems.c fs: simplify get_filesystem_list / get_all_fs_names 2021-08-23 01:25:40 -04:00
fs-writeback.c writeback: fix call of incorrect macro 2023-05-17 11:50:16 +02:00
fs_context.c vfs: fs_context: fix up param length parsing in legacy_parse_param 2022-01-20 09:13:14 +01:00
fs_parser.c namei: Standardize callers of filename_lookup() 2021-09-07 16:07:47 -04:00
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
init.c
inode.c attr: use consistent sgid stripping checks 2023-03-17 08:49:01 +01:00
internal.h attr: use consistent sgid stripping checks 2023-03-17 08:49:01 +01:00
ioctl.c fs: fix an infinite loop in iomap_fiemap 2022-05-25 09:57:26 +02:00
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-18 20:22:03 -10:00
libfs.c libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed value 2022-12-31 13:14:03 +01:00
locks.c filelocks: use mount idmapping for setlease permission check 2023-03-17 08:49:02 +01:00
mbcache.c mbcache: Avoid nesting of cache->c_list_lock under bit locks 2023-01-12 11:59:20 +01:00
mount.h
mpage.c
namei.c fs: move S_ISGID stripping into the vfs_*() helpers 2023-03-17 08:49:00 +01:00
namespace.c fs: drop peer group ids under namespace lock 2023-04-13 16:48:25 +02:00
no-block.c
nsfs.c
open.c attr: use consistent sgid stripping checks 2023-03-17 08:49:01 +01:00
pipe.c pipe: Fix missing lock in pipe_resize_ring() 2022-06-06 08:43:37 +02:00
pnode.c pnode: terminate at peers of source 2023-01-12 11:58:47 +01:00
pnode.h
posix_acl.c fs: fix acl translation 2022-07-02 16:41:17 +02:00
proc_namespace.c fs: add is_idmapped_mnt() helper 2022-07-02 16:41:14 +02:00
read_write.c vfs: fix copy_file_range() averts filesystem freeze protection 2022-12-19 12:36:39 +01:00
readdir.c
remap_range.c fs/remap: constrain dedupe of EOF blocks 2022-07-21 21:24:14 +02:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:58:25 +01:00
seq_file.c rxrpc: Fix locking issue 2022-07-12 16:35:08 +02:00
signalfd.c signalfd: use wake_up_pollfree() 2021-12-14 10:57:15 +01:00
splice.c Revert "fs: check FMODE_LSEEK to control internal pipe splicing" 2022-10-26 12:34:17 +02:00
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 14:38:57 +02:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-24 17:36:54 +01:00
super.c fscrypt: destroy keyring after security_sb_delete() 2023-03-30 12:47:56 +02:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-04-27 14:38:50 +02:00
timerfd.c timerfd: Provide timerfd_resume() 2021-08-10 17:57:22 +02:00
userfaultfd.c userfaultfd: open userfaultfds with O_RDONLY 2022-10-26 12:34:36 +02:00
utimes.c
xattr.c fs: don't audit the capability check in simple_xattr_list() 2022-12-31 13:14:01 +01:00