Граф коммитов

2166 Коммитов

Автор SHA1 Сообщение Дата
Chuck Lever fc1021337e Revert "lockd: introduce safe async lock op"
This reverts commit 2267b2e845.

ltp test fcntl17 fails on v5.15.154. This was bisected to commit
2267b2e845 ("lockd: introduce safe async lock op").

Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27 17:05:23 +02:00
Alexander Aring 2267b2e845 lockd: introduce safe async lock op
[ Upstream commit 2dd10de8e6bcbacf85ad758b904543c294820c63 ]

This patch reverts mostly commit 40595cdc93 ("nfs: block notification
on fs with its own ->lock") and introduces an EXPORT_OP_ASYNC_LOCK
export flag to signal that the "own ->lock" implementation supports
async lock requests. The only main user is DLM that is used by GFS2 and
OCFS2 filesystem. Those implement their own lock() implementation and
return FILE_LOCK_DEFERRED as return value. Since commit 40595cdc93
("nfs: block notification on fs with its own ->lock") the DLM
implementation were never updated. This patch should prepare for DLM
to set the EXPORT_OP_ASYNC_LOCK export flag and update the DLM
plock implementation regarding to it.

Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-10 16:19:29 +02:00
Chuck Lever 394d3f294a Documentation: Add missing documentation for EXPORT_OP flags
[ Upstream commit b38a6023da ]

The commits that introduced these flags neglected to update the
Documentation/filesystems/nfs/exporting.rst file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-10 16:19:29 +02:00
Dai Ngo c3b2013544 fs/lock: add 2 callbacks to lock_manager_operations to resolve conflict
[ Upstream commit 2443da2259 ]

Add 2 new callbacks, lm_lock_expirable and lm_expire_lock, to
lock_manager_operations to allow the lock manager to take appropriate
action to resolve the lock conflict if possible.

A new field, lm_mod_owner, is also added to lock_manager_operations.
The lm_mod_owner is used by the fs/lock code to make sure the lock
manager module such as nfsd, is not freed while lock conflict is being
resolved.

lm_lock_expirable checks and returns true to indicate that the lock
conflict can be resolved else return false. This callback must be
called with the flc_lock held so it can not block.

lm_expire_lock is called to resolve the lock conflict if the returned
value from lm_lock_expirable is true. This callback is called without
the flc_lock held since it's allowed to block. Upon returning from
this callback, the lock conflict should be resolved and the caller is
expected to restart the conflict check from the beginnning of the list.

Lock manager, such as NFSv4 courteous server, uses this callback to
resolve conflict by destroying lock owner, or the NFSv4 courtesy client
(client that has expired but allowed to maintains its states) that owns
the lock.

Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-10 16:19:04 +02:00
Dai Ngo 93d2afc7d2 fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock.
[ Upstream commit 9d6647762b ]

Update lock usage of lock_manager_operations' functions to reflect
the changes in commit 6109c85037 ("locks: add a dedicated spinlock
to protect i_flctx lists").

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-10 16:19:02 +02:00
Al Viro dfda2a5eb6 rename(): fix the locking of subdirectories
commit 22e111ed6c83dcde3037fc81176012721bc34c0b upstream.

	We should never lock two subdirectories without having taken
->s_vfs_rename_mutex; inode pointer order or not, the "order" proposed
in 28eceeda13 "fs: Lock moved directories" is not transitive, with
the usual consequences.

	The rationale for locking renamed subdirectory in all cases was
the possibility of race between rename modifying .. in a subdirectory to
reflect the new parent and another thread modifying the same subdirectory.
For a lot of filesystems that's not a problem, but for some it can lead
to trouble (e.g. the case when short directory contents is kept in the
inode, but creating a file in it might push it across the size limit
and copy its contents into separate data block(s)).

	However, we need that only in case when the parent does change -
otherwise ->rename() doesn't need to do anything with .. entry in the
first place.  Some instances are lazy and do a tautological update anyway,
but it's really not hard to avoid.

Amended locking rules for rename():
	find the parent(s) of source and target
	if source and target have the same parent
		lock the common parent
	else
		lock ->s_vfs_rename_mutex
		lock both parents, in ancestor-first order; if neither
		is an ancestor of another, lock the parent of source
		first.
	find the source and target.
	if source and target have the same parent
		if operation is an overwriting rename of a subdirectory
			lock the target subdirectory
	else
		if source is a subdirectory
			lock the source
		if target is a subdirectory
			lock the target
	lock non-directories involved, in inode pointer order if both
	source and target are such.

That way we are guaranteed that parents are locked (for obvious reasons),
that any renamed non-directory is locked (nfsd relies upon that),
that any victim is locked (emptiness check needs that, among other things)
and subdirectory that changes parent is locked (needed to protect the update
of .. entries).  We are also guaranteed that any operation locking more
than one directory either takes ->s_vfs_rename_mutex or locks a parent
followed by its child.

Cc: stable@vger.kernel.org
Fixes: 28eceeda13 "fs: Lock moved directories"
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-02-23 08:54:26 +01:00
Jan Kara 6db001a7ed fs: Lock moved directories
commit 28eceeda13 upstream.

When a directory is moved to a different directory, some filesystems
(udf, ext4, ocfs2, f2fs, and likely gfs2, reiserfs, and others) need to
update their pointer to the parent and this must not race with other
operations on the directory. Lock the directories when they are moved.
Although not all filesystems need this locking, we perform it in
vfs_rename() because getting the lock ordering right is really difficult
and we don't want to expose these locking details to filesystems.

CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230601105830.13168-5-jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23 13:47:34 +02:00
Arnd Bergmann 17bdba70a8 autofs: use flexible array in ioctl structure
commit e910c8e3aa upstream.

Commit df8fc4e934 ("kbuild: Enable -fstrict-flex-arrays=3") introduced a warning
for the autofs_dev_ioctl structure:

In function 'check_name',
    inlined from 'validate_dev_ioctl' at fs/autofs/dev-ioctl.c:131:9,
    inlined from '_autofs_dev_ioctl' at fs/autofs/dev-ioctl.c:624:8:
fs/autofs/dev-ioctl.c:33:14: error: 'strchr' reading 1 or more bytes from a region of size 0 [-Werror=stringop-overread]
   33 |         if (!strchr(name, '/'))
      |              ^~~~~~~~~~~~~~~~~
In file included from include/linux/auto_dev-ioctl.h:10,
                 from fs/autofs/autofs_i.h:10,
                 from fs/autofs/dev-ioctl.c:14:
include/uapi/linux/auto_dev-ioctl.h: In function '_autofs_dev_ioctl':
include/uapi/linux/auto_dev-ioctl.h:112:14: note: source object 'path' of size 0
  112 |         char path[0];
      |              ^~~~

This is easily fixed by changing the gnu 0-length array into a c99
flexible array. Since this is a uapi structure, we have to be careful
about possible regressions but this one should be fine as they are
equivalent here. While it would break building with ancient gcc versions
that predate c99, it helps building with --std=c99 and -Wpedantic builds
in user space, as well as non-gnu compilers. This means we probably
also want it fixed in stable kernels.

Cc: stable@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230523081944.581710-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-07-23 13:47:33 +02:00
Glenn Washburn 29cb0f6c1d docs: Correct missing "d_" prefix for dentry_operations member d_weak_revalidate
[ Upstream commit 7459608579 ]

The details for struct dentry_operations member d_weak_revalidate is
missing a "d_" prefix.

Fixes: af96c1e304 ("docs: filesystems: vfs: Convert vfs.txt to RST")
Signed-off-by: Glenn Washburn <development@efficientek.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Link: https://lore.kernel.org/r/20230227184042.2375235-1-development@efficientek.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22 13:31:22 +01:00
Lukas Czerner 0d94230343 fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE
commit cbfecb927f upstream.

Currently the I_DIRTY_TIME will never get set if the inode already has
I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME.  That's
true, however ext4 will only update the on-disk inode in
->dirty_inode(), not on actual writeback. As a result if the inode
already has I_DIRTY_INODE state by the time we get to
__mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled
into on-disk inode and will not get updated until the next I_DIRTY_INODE
update, which might never come if we crash or get a power failure.

The problem can be reproduced on ext4 by running xfstest generic/622
with -o iversion mount option.

Fix it by allowing I_DIRTY_TIME to be set even if the inode already has
I_DIRTY_INODE. Also make sure that the case is properly handled in
writeback_single_inode() as well. Additionally changes in
xfs_fs_dirty_inode() was made to accommodate for I_DIRTY_TIME in flag.

Thanks Jan Kara for suggestions on how to make this work properly.

Cc: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Suggested-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220825100657.44217-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26 12:34:27 +02:00
Christian Brauner 1c62e0186d docs: update mapping documentation
commit 8cc5c54de4 upstream.

Now that we implement the full remapping algorithms described in our
documentation remove the section about shortcircuting them.

Link: https://lore.kernel.org/r/20211123114227.3124056-6-brauner@kernel.org (v1)
Link: https://lore.kernel.org/r/20211130121032.3753852-6-brauner@kernel.org (v2)
Link: https://lore.kernel.org/r/20211203111707.3901969-6-brauner@kernel.org
Cc: Seth Forshee <sforshee@digitalocean.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
CC: linux-fsdevel@vger.kernel.org
Reviewed-by: Seth Forshee <sforshee@digitalocean.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02 16:41:15 +02:00
Chao Yu 2646992ddf f2fs: support fault injection for dquot_initialize()
[ Upstream commit 10a2687856 ]

This patch adds a new function f2fs_dquot_initialize() to wrap
dquot_initialize(), and it supports to inject fault into
f2fs_dquot_initialize() to simulate inner failure occurs in
dquot_initialize().

Usage:
a) echo 65536 > /sys/fs/f2fs/<dev>/inject_type or
b) mount -o fault_type=65536 <dev> <mountpoint>

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 10:23:13 +02:00
wangjianjian (C) 6b95256393 ext4, doc: fix incorrect h_reserved size
commit 7102ffe4c1 upstream.

According to document and code, ext4_xattr_header's size is 32 bytes, so
h_reserved size should be 3.

Signed-off-by: Wang Jianjian <wangjianjian3@huawei.com>
Link: https://lore.kernel.org/r/92fcc3a6-7d77-8c09-4126-377fcb4c46a5@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-27 14:39:01 +02:00
Eric Biggers 9630d4d0d1 fscrypt: allow 256-bit master keys with AES-256-XTS
[ Upstream commit 7f595d6a6c ]

fscrypt currently requires a 512-bit master key when AES-256-XTS is
used, since AES-256-XTS keys are 512-bit and fscrypt requires that the
master key be at least as long any key that will be derived from it.

However, this is overly strict because AES-256-XTS doesn't actually have
a 512-bit security strength, but rather 256-bit.  The fact that XTS
takes twice the expected key size is a quirk of the XTS mode.  It is
sufficient to use 256 bits of entropy for AES-256-XTS, provided that it
is first properly expanded into a 512-bit key, which HKDF-SHA512 does.

Therefore, relax the check of the master key size to use the security
strength of the derived key rather than the size of the derived key
(except for v1 encryption policies, which don't use HKDF).

Besides making things more flexible for userspace, this is needed in
order for the use of a KDF which only takes a 256-bit key to be
introduced into the fscrypt key hierarchy.  This will happen with
hardware-wrapped keys support, as all known hardware which supports that
feature uses an SP800-108 KDF using AES-256-CMAC, so the wrapped keys
are wrapped 256-bit AES keys.  Moreover, there is interest in fscrypt
supporting the same type of AES-256-CMAC based KDF in software as an
alternative to HKDF-SHA512.  There is no security problem with such
features, so fix the key length check to work properly with them.

Reviewed-by: Paul Crowley <paulcrowley@google.com>
Link: https://lore.kernel.org/r/20210921030303.5598-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18 19:16:11 +01:00
Linus Torvalds 86a44e9067 Fixed xfstests generic/016 generic/021 generic/022 generic/041 generic/274 generic/423,
some memory leaks and panic. Also many minor fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEh0DEKNP0I9IjwfWEqbAzH4MkB7YFAmFoMFQACgkQqbAzH4Mk
 B7bTtQ/+KiF48deefbEEExjfT8Mm76+JE0XkdCPT0bXkhVNpqhRLSOQBR2hg5A81
 7SSFHNbSMzXxiXdh2KfcXbBmwdJtcH1N9tjwffC3zhMkCTcDKnmDczz/lo4rHd0g
 zZ3rPBP9yPCZGxo3W804XRYOeqLclrGJPI3kWQen+Rln/cZIzJMaHRUkVI22OYwj
 e0dSdtabFDxJbdewz9xcvycHrPpVlrZUsuib/ZHFu2XGtgKalccgfvwBy5cOrTVh
 N1WSBGcoy0xQGRGLP0o2hN62N2Md7/+UwWjXY+Wz4i+4gmziGvGuk8Y5uiSLu7lS
 EG12xlrUtwouf4QaeleQZLT9Ym5YU3EALtKpZxAQi6Rm4A8Z6EMNUq0WBHJcNP/u
 MRJlfK7jC25GnIFQjZtU+eMX8BT8MgMeSriv9FIY86T3ijedfxxEbb/cMvUGm2Hn
 3hoQelLCUkLSqTyMeZiAv507AJv5MjfMrSJ9r9f36OxDer3w84VCVcxDtyGH++CR
 fbRNjHvz7gYG5L5qwsFgfxSC/z+hyUXi01RalbosojsRyvg/f1p+yMxvQ57DrltY
 IfHrMGcd9FlUiijBGFvyWQoMAl/pb6EIym2IMxr9X+aXgPJiG/BhWLbmzU4MYUUP
 1PwIOpN2vhtU2Z3bVzbecxfy/TWjBhKBYe9jW1AH8KSSvLZExjk=
 =QUnM
 -----END PGP SIGNATURE-----

Merge tag 'ntfs3_for_5.15' of git://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 fixes from Konstantin Komarov:
 "Use the new api for mounting as requested by Christoph.

  Also fixed:

   - some memory leaks and panic

   - xfstests (tested on x86_64) generic/016 generic/021 generic/022
     generic/041 generic/274 generic/423

   - some typos, wrong returned error codes, dead code, etc"

* tag 'ntfs3_for_5.15' of git://github.com/Paragon-Software-Group/linux-ntfs3: (70 commits)
  fs/ntfs3: Check for NULL pointers in ni_try_remove_attr_list
  fs/ntfs3: Refactor ntfs_read_mft
  fs/ntfs3: Refactor ni_parse_reparse
  fs/ntfs3: Refactor ntfs_create_inode
  fs/ntfs3: Refactor ntfs_readlink_hlp
  fs/ntfs3: Rework ntfs_utf16_to_nls
  fs/ntfs3: Fix memory leak if fill_super failed
  fs/ntfs3: Keep prealloc for all types of files
  fs/ntfs3: Remove unnecessary functions
  fs/ntfs3: Forbid FALLOC_FL_PUNCH_HOLE for normal files
  fs/ntfs3: Refactoring of ntfs_set_ea
  fs/ntfs3: Remove locked argument in ntfs_set_ea
  fs/ntfs3: Use available posix_acl_release instead of ntfs_posix_acl_release
  fs/ntfs3: Check for NULL if ATTR_EA_INFO is incorrect
  fs/ntfs3: Refactoring of ntfs_init_from_boot
  fs/ntfs3: Reject mount if boot's cluster size < media sector size
  fs/ntfs3: Refactoring lock in ntfs_init_acl
  fs/ntfs3: Change posix_acl_equiv_mode to posix_acl_update_mode
  fs/ntfs3: Pass flags to ntfs_set_ea in ntfs_set_acl_ex
  fs/ntfs3: Refactor ntfs_get_acl_ex for better readability
  ...
2021-10-15 09:58:11 -04:00
Kari Argillander a0fc05a37c
Doc/fs/ntfs3: Fix rst format and make it cleaner
Current ntfs3 rst documentation is broken. I turn table to list table as
this is current Linux documentation quide line. Simple table also did
not quite work in our situation as we need to span rows together.

It still look quite good as text so we did not loss anything. This will
also make diffing quite bit more pleasure.

Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-20 18:53:12 +03:00
Linus Torvalds c0f7e49fc4 block-5.15-2021-09-11
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmE8ueIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpkSYD/9eaQ1Hxc+X+4eVb3A9Cpy36Qy/uY/hArnT
 kSUDtQitrRigqhStaD0MGpknWFnZE4cSojbYN0OoEWL7GC8idSZXx7KrVJpSHGbM
 XGVEflohvjDLNPkV99gmlzF2o6zPlWESApU1/HO2x+Ws1oKaYDAfFVf0CPGPe2C6
 MRerU5v3HSmTC0eFZxU246bwwX/phNuNDokndR27rrsjK0mLF5UoMKySeqy3INp5
 6mj3R+HNIW5j8eQk/HJPW7dgiKpWYneWV2Z90DuOLbcJ+wnx7s07wT1yRnOFUTsb
 p2ojVWmXtCJ1kRex6bK/eeIJC5TYvT3bNwsnIRmJHd9btHqhm2uKy77m3S1AuE7w
 K8bN581aXlr/3pUbFyYZDZQbYshUn25YP9OlyS9r4pklCh9C5KneL1b4xswWTDTB
 whvPZlkot3rGD8LHDpV5xVVzeaAcbSXanIRROjxHqQSRRTA9BjG3E4A2cDh8nmYD
 mRGEimfZcoojF2EQJYswPOQ24cZwpnihPpJO9NkOodRqfasn6XakAGg6SONFYyQ0
 Ewa6QzIOCebBgOVGbzMtpoDpnySE12ONmrDCbSEiYFJLXBMMiqgNON/Xaq0tmXHT
 lsDpyz3ytWAB9OZ3M0/9arZzlFf/E+FRqt4ExelmwxiutKRb1dIKQq8xip/YxdA+
 Y86kwUoAXQ==
 =1ajD
 -----END PGP SIGNATURE-----

Merge tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request from Christoph:
     - fix nvmet command set reporting for passthrough controllers (Adam Manzanares)
     - update a MAINTAINERS email address (Chaitanya Kulkarni)
     - set QUEUE_FLAG_NOWAIT for nvme-multipth (me)
     - handle errors from add_disk() (Luis Chamberlain)
     - update the keep alive interval when kato is modified (Tatsuya Sasaki)
     - fix a buffer overrun in nvmet_subsys_attr_serial (Hannes Reinecke)
     - do not reset transport on data digest errors in nvme-tcp (Daniel Wagner)
     - only call synchronize_srcu when clearing current path (Daniel Wagner)
     - revalidate paths during rescan (Hannes Reinecke)

 - Split out the fs/block_dev into block/fops.c and block/bdev.c, which
   has been long overdue. Do this now before -rc1, to avoid annoying
   conflicts due to this (Christoph)

 - blk-throtl use-after-free fix (Li)

 - Improve plug depth for multi-device plugs, greatly increasing md
   resync performance (Song)

 - blkdev_show() locking fix (Tetsuo)

 - n64cart error check fix (Yang)

* tag 'block-5.15-2021-09-11' of git://git.kernel.dk/linux-block:
  n64cart: fix return value check in n64cart_probe()
  blk-mq: allow 4x BLK_MAX_REQUEST_COUNT at blk_plug for multiple_queues
  block: move fs/block_dev.c to block/bdev.c
  block: split out operations on block special files
  blk-throttle: fix UAF by deleteing timer in blk_throtl_exit()
  block: genhd: don't call blkdev_show() with major_names_lock held
  nvme: update MAINTAINERS email address
  nvme: add error handling support for add_disk()
  nvme: only call synchronize_srcu when clearing current path
  nvme: update keep alive interval when kato is modified
  nvme-tcp: Do not reset transport on data digest errors
  nvmet: fixup buffer overrun in nvmet_subsys_attr_serial()
  nvmet: return bool from nvmet_passthru_ctrl and nvmet_is_passthru_req
  nvmet: looks at the passthrough controller when initializing CAP
  nvme: move nvme_multi_css into nvme.h
  nvme-multipath: revalidate paths during rescan
  nvme-multipath: set QUEUE_FLAG_NOWAIT
2021-09-11 10:19:51 -07:00
Kari Argillander 28a941ffc1
fs/ntfs3: Rename mount option no_acs_rules > (no)acsrules
Rename mount option no_acs_rules to (no)acsrules. This allow us to use
possibility to mount with options noaclrules or aclrules.

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09 19:28:54 +03:00
Kari Argillander e274cde8c7
fs/ntfs3: Add iocharset= mount option as alias for nls=
Other fs drivers are using iocharset= mount option for specifying charset.
So add it also for ntfs3 and mark old nls= mount option as deprecated.

Reviewed-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09 19:28:53 +03:00
Kari Argillander b8a30b4171
fs/ntfs3: Remove unnecesarry mount option noatime
Remove unnecesarry mount option noatime because this will be handled
by VFS. Our option parser will never get opt like this.

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Kari Argillander <kari.argillander@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-09-09 19:28:30 +03:00
Christoph Hellwig 0dca4462ed block: move fs/block_dev.c to block/bdev.c
Move it together with the rest of the block layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210907141303.1371844-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-09-07 08:39:40 -06:00
Linus Torvalds f7464060f7 Merge git://github.com/Paragon-Software-Group/linux-ntfs3
Merge NTFSv3 filesystem from Konstantin Komarov:
 "This patch adds NTFS Read-Write driver to fs/ntfs3.

  Having decades of expertise in commercial file systems development and
  huge test coverage, we at Paragon Software GmbH want to make our
  contribution to the Open Source Community by providing implementation
  of NTFS Read-Write driver for the Linux Kernel.

  This is fully functional NTFS Read-Write driver. Current version works
  with NTFS (including v3.1) and normal/compressed/sparse files and
  supports journal replaying.

  We plan to support this version after the codebase once merged, and
  add new features and fix bugs. For example, full journaling support
  over JBD will be added in later updates"

Link: https://lore.kernel.org/lkml/20210729134943.778917-1-almaz.alexandrovich@paragon-software.com/
Link: https://lore.kernel.org/lkml/aa4aa155-b9b2-9099-b7a2-349d8d9d8fbd@paragon-software.com/

* git://github.com/Paragon-Software-Group/linux-ntfs3: (35 commits)
  fs/ntfs3: Change how module init/info messages are displayed
  fs/ntfs3: Remove GPL boilerplates from decompress lib files
  fs/ntfs3: Remove unnecessary condition checking from ntfs_file_read_iter
  fs/ntfs3: Fix integer overflow in ni_fiemap with fiemap_prep()
  fs/ntfs3: Restyle comments to better align with kernel-doc
  fs/ntfs3: Rework file operations
  fs/ntfs3: Remove fat ioctl's from ntfs3 driver for now
  fs/ntfs3: Restyle comments to better align with kernel-doc
  fs/ntfs3: Fix error handling in indx_insert_into_root()
  fs/ntfs3: Potential NULL dereference in hdr_find_split()
  fs/ntfs3: Fix error code in indx_add_allocate()
  fs/ntfs3: fix an error code in ntfs_get_acl_ex()
  fs/ntfs3: add checks for allocation failure
  fs/ntfs3: Use kcalloc/kmalloc_array over kzalloc/kmalloc
  fs/ntfs3: Do not use driver own alloc wrappers
  fs/ntfs3: Use kernel ALIGN macros over driver specific
  fs/ntfs3: Restyle comment block in ni_parse_reparse()
  fs/ntfs3: Remove unused including <linux/version.h>
  fs/ntfs3: Fix fall-through warnings for Clang
  fs/ntfs3: Fix one none utf8 char in source file
  ...
2021-09-04 11:15:50 -07:00
Linus Torvalds 6abaa83c73 f2fs-for-5.15-rc1
In this cycle, we've addressed some performance issues such as lock contention,
 misbehaving compress_cache, allowing extent_cache for compressed files, and new
 sysfs to adjust ra_size for fadvise. In order to diagnose the performance issues
 quickly, we also added an iostat which shows the IO latencies periodically. On
 the stability side, we've found two memory leakage cases in the error path in
 compression flow. And, we've also fixed various corner cases in fiemap, quota,
 checkpoint=disable, zstd, and so on.
 
 Enhancement:
  - avoid long checkpoint latency by releasing nat_tree_lock
  - collect and show iostats periodically
  - support extent_cache for compressed files
  - add a sysfs entry to manage ra_size given fadvise(POSIX_FADV_SEQUENTIAL)
  - report f2fs GC status via sysfs
  - add discard_unit=%s in mount option to handle zoned device
 
 Bug fix:
  - fix two memory leakages when an error happens in the compressed IO flow
  - fix commpress_cache to get the right LBA
  - fix fiemap to deal with compressed case correctly
  - fix wrong EIO returns due to SBI_NEED_FSCK
  - fix missing writes when enabling checkpoint back
  - fix quota deadlock
  - fix zstd level mount option
 
 In addition to the above major updates, we've cleaned up several code paths such
 as dio, unnecessary operations, debugfs/f2fs/status, sanity check, and typos.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmEyw1sACgkQQBSofoJI
 UNLJmA/+NHUgwUjLMcHvmLyp6QYpQDZtKj93/sRDo+YHOYNdYFjWWUb329PYTKWS
 kEdzApCP+KHfVxeSkiL/x3qWP+RlTkIf96P0kR3/BKi0tjg25G2riFWztusDDFpt
 xi+AW5sUFDvIx1tFumvQHAQedSwBgcZ96ovT5EwxEuONkljhZC9phEC6vSXz9nOR
 e2EQIyezbC5O21np1KSeqSgqRMpVkJkVcEHy4VmpMBCLMOOYPepWwKw+yPaV/jR/
 zUXdo2/53vma50M5LCDPCtjCtWQgLoeNeGLxyjfzQuTJU6TmtPY65JObLPt6pUSj
 fRW6qIziTZbVYXzOWBD0EYilv2N4c3BNJdhQCpx2Vyjw9/LLxzqKPOUyzBoa1kjY
 eZVvmaLXVCKsoJdHDSi7OH/4BqS6SuSZE8eO/nGkgswqiErHZ0Vwl3bFCWC7r/Bk
 r2U5spJx/83XO6c9H1bzeWEies1DRtwnCDIRRuw35RtJ4uHZaqCfkuJ7rOBwC90X
 4SpaAKdUxP2RWc3GKELBIhaqPn7vyMy9ile6VU14PjM8UcY5hyE87T2azqR8gGut
 nVjRL4cbMGTPj6m1Qj8KqBRSaLuShe6AncUy7bNGiM+JlcLcdB6OJ1ZYLl9hjx2r
 TbIouXThgcZ4SIK0DEaBLKz2b9/0TfaO9gw1XzpRma+bWA1pApM=
 =W67o
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this cycle, we've addressed some performance issues such as lock
  contention, misbehaving compress_cache, allowing extent_cache for
  compressed files, and new sysfs to adjust ra_size for fadvise.

  In order to diagnose the performance issues quickly, we also added an
  iostat which shows the IO latencies periodically.

  On the stability side, we've found two memory leakage cases in the
  error path in compression flow. And, we've also fixed various corner
  cases in fiemap, quota, checkpoint=disable, zstd, and so on.

  Enhancements:
   - avoid long checkpoint latency by releasing nat_tree_lock
   - collect and show iostats periodically
   - support extent_cache for compressed files
   - add a sysfs entry to manage ra_size given fadvise(POSIX_FADV_SEQUENTIAL)
   - report f2fs GC status via sysfs
   - add discard_unit=%s in mount option to handle zoned device

  Bug fixes:
   - fix two memory leakages when an error happens in the compressed IO flow
   - fix commpress_cache to get the right LBA
   - fix fiemap to deal with compressed case correctly
   - fix wrong EIO returns due to SBI_NEED_FSCK
   - fix missing writes when enabling checkpoint back
   - fix quota deadlock
   - fix zstd level mount option

  In addition to the above major updates, we've cleaned up several code
  paths such as dio, unnecessary operations, debugfs/f2fs/status, sanity
  check, and typos"

* tag 'f2fs-for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (46 commits)
  f2fs: should put a page beyond EOF when preparing a write
  f2fs: deallocate compressed pages when error happens
  f2fs: enable realtime discard iff device supports discard
  f2fs: guarantee to write dirty data when enabling checkpoint back
  f2fs: fix to unmap pages from userspace process in punch_hole()
  f2fs: fix unexpected ENOENT comes from f2fs_map_blocks()
  f2fs: fix to account missing .skipped_gc_rwsem
  f2fs: adjust unlock order for cleanup
  f2fs: Don't create discard thread when device doesn't support realtime discard
  f2fs: rebuild nat_bits during umount
  f2fs: introduce periodic iostat io latency traces
  f2fs: separate out iostat feature
  f2fs: compress: do sanity check on cluster
  f2fs: fix description about main_blkaddr node
  f2fs: convert S_IRUGO to 0444
  f2fs: fix to keep compatibility of fault injection interface
  f2fs: support fault injection for f2fs_kmem_cache_alloc()
  f2fs: compress: allow write compress released file after truncate to zero
  f2fs: correct comment in segment.h
  f2fs: improve sbi status info in debugfs/f2fs/status
  ...
2021-09-04 10:48:47 -07:00
Linus Torvalds 111c1aa8ca In addition to some ext4 bug fixes and cleanups, this cycle we add the
orphan_file feature, which eliminates bottlenecks when doing a large
 number of parallel truncates and file deletions, and move the discard
 operation out of the jbd2 commit thread when using the discard mount
 option, to better support devices with slow discard operations.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmEw5gEACgkQ8vlZVpUN
 gaMatgf9GKc2H3JUDGJVXrOE1EZWXzyDI+Tv6zt0iTr05zi1ahjGccAbmJXiAwxU
 Zy5TGr53CpPcGUG+sO4NpVzcH8q7cQeG0pVx9OnzJUdVfmv+htoNE0aAqUY5L3vg
 AxV4KPGgxPofRQa3QRE2LDFHIkNs7c0ncprdaAtxNztd09iFo7bIayt614mARK++
 HIO7VOGrH5Wya8SSoqYHmlO0g5viy3ypP6CpysIQw0JifSlHYkmYBUJ0/hwPV/Xl
 WfzmwQ9p43C9EXVmIN4++l674TDzkSn9ebITXOgkq4C8KjnFgyhKQIj5QVj81MvH
 dac5jxsuLTXTLYnRpAQ/duV4jRd+Fw==
 =+NN7
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "In addition to some ext4 bug fixes and cleanups, this cycle we add the
  orphan_file feature, which eliminates bottlenecks when doing a large
  number of parallel truncates and file deletions, and move the discard
  operation out of the jbd2 commit thread when using the discard mount
  option, to better support devices with slow discard operations"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
  ext4: make the updating inode data procedure atomic
  ext4: remove an unnecessary if statement in __ext4_get_inode_loc()
  ext4: move inode eio simulation behind io completeion
  ext4: Improve scalability of ext4 orphan file handling
  ext4: Orphan file documentation
  ext4: Speedup ext4 orphan inode handling
  ext4: Move orphan inode handling into a separate file
  ext4: Support for checksumming from journal triggers
  ext4: fix race writing to an inline_data file while its xattrs are changing
  jbd2: add sparse annotations for add_transaction_credits()
  ext4: fix sparse warnings
  ext4: Make sure quota files are not grabbed accidentally
  ext4: fix e2fsprogs checksum failure for mounted filesystem
  ext4: if zeroout fails fall back to splitting the extent node
  ext4: reduce arguments of ext4_fc_add_dentry_tlv
  ext4: flush background discard kwork when retry allocation
  ext4: get discard out of jbd2 commit kthread contex
  ext4: remove the repeated comment of ext4_trim_all_free
  ext4: add new helper interface ext4_try_to_trim_range()
  ext4: remove the 'group' parameter of ext4_trim_extent
  ...
2021-09-02 09:37:09 -07:00
Linus Torvalds 815409a12c overlayfs update for 5.15
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCYTDKKAAKCRDh3BK/laaZ
 PG9PAQCUF0fdBlCKudwSEt5PV5xemycL9OCAlYCd7d4XbBIe9wEA6sVJL9J+OwV2
 aF0NomiXtJccE+S9+byjVCyqSzQJGQQ=
 =6L2Y
 -----END PGP SIGNATURE-----

Merge tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs update from Miklos Szeredi:

 - Copy up immutable/append/sync/noatime attributes (Amir Goldstein)

 - Improve performance by enabling RCU lookup.

 - Misc fixes and improvements

The reason this touches so many files is that the ->get_acl() method now
gets a "bool rcu" argument.  The ->get_acl() API was updated based on
comments from Al and Linus:

Link: https://lore.kernel.org/linux-fsdevel/CAJfpeguQxpd6Wgc0Jd3ks77zcsAv_bn0q17L3VNnnmPKu11t8A@mail.gmail.com/

* tag 'ovl-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: enable RCU'd ->get_acl()
  vfs: add rcu argument to ->get_acl() callback
  ovl: fix BUG_ON() in may_delete() when called from ovl_cleanup()
  ovl: use kvalloc in xattr copy-up
  ovl: update ctime when changing fileattr
  ovl: skip checking lower file's i_writecount on truncate
  ovl: relax lookup error on mismatch origin ftype
  ovl: do not set overlay.opaque for new directories
  ovl: add ovl_allow_offline_changes() helper
  ovl: disable decoding null uuid with redirect_dir
  ovl: consistent behavior for immutable/append-only inodes
  ovl: copy up sync/noatime fileattr flags
  ovl: pass ovl_fs to ovl_check_setxattr()
  fs: add generic helper for filling statx attribute flags
2021-09-02 09:21:27 -07:00
Linus Torvalds 412106c203 Changes since last update:
- support direct I/O for all uncompressed files;
 
  - support fsdax for non-tailpacking regular files;
 
  - use iomap infrastructure for all uncompressed cases;
 
  - support fiemap for both (un)compressed files;
 
  - introduce chunk-based files for chunk deduplication.
 
  - some cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYS03PBEceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBNdRAQCzumTFT7TRrA8PmcNB8viVrDO07czJRpwz
 Fsh0UdH3IQEAsESa11vmQjbKWATCB3g81SMZeeDqGEb4HYLDhgbS+gw=
 =98HB
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "In this cycle, direct I/O and fsdax support for uncompressed files are
  now added in order to avoid double-caching for loop device and VM
  container use cases. All uncompressed cases are now turned into iomap
  infrastructure, which looks much simpler and cleaner.

  In addition, fiemap support is added for both (un)compressed files by
  using iomap infrastructure as well so end users can easily get file
  distribution. We've also added chunk-based uncompressed files support
  for data deduplication as the next step of VM container use cases.

  Summary:

   - support direct I/O for all uncompressed files

   - support fsdax for non-tailpacking regular files

   - use iomap infrastructure for all uncompressed cases

   - support fiemap for both (un)compressed files

   - introduce chunk-based files for chunk deduplication

   - some cleanups"

* tag 'erofs-for-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix double free of 'copied'
  erofs: support reading chunk-based uncompressed files
  erofs: introduce chunk-based file on-disk format
  erofs: add fiemap support with iomap
  erofs: add support for the full decompressed length
  erofs: remove the mapping parameter from erofs_try_to_free_cached_page()
  erofs: directly use wrapper erofs_page_is_managed() when shrinking
  erofs: convert all uncompressed cases to iomap
  erofs: dax support for non-tailpacking regular file
  erofs: iomap support for non-tailpacking DIO
2021-09-02 09:12:05 -07:00
Linus Torvalds 67b03f93a3 fs.idmapped.v5.15
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYS3kewAKCRCRxhvAZXjc
 og5bAQD1G0tRKsxBZJPoB7JdfRG+26zekxPMGgmS9R+oRtwDagD8CdB2dTNsNYx7
 mAFotML2qtWN8c3ErcVUE3Q6yWkaGgw=
 =wJoh
 -----END PGP SIGNATURE-----

Merge tag 'fs.idmapped.v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull idmapping documentation updates from Christian Brauner:
 "The bulk of the idmapped work this cycle was adding support for
  idmapped mounts to btrfs.

  While this required the addition of a (simple) new vfs helper all the
  work is going through David Sterba's btrfs tree. It was way simpler to
  do it this way rather then forcing David to coordinate between his
  btrfs and my tree. Plus I don't care who merges it as long as I feel I
  can trust the maintainer and the btrfs folks were really fast and
  helpful in reviewing this work.

  As always, associated with the btrfs port for idmapped mounts is a new
  fstests extension specifically concerned with btrfs ioctls (e.g.
  subvolume creation, deletion etc.) on idmapped mounts which can be
  found in the fstests repo as 5f8179ce8b00 ("btrfs: introduce btrfs
  specific idmapped mounts tests").

  Consequently, this cycle the idmapping pull is boring. It only
  contains documentation updates, specifically about how idmappings and
  idmapped mounts work"

* tag 'fs.idmapped.v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  doc: give a more thorough id handling explanation
2021-08-31 12:06:17 -07:00
Linus Torvalds cd358208d7 fscrypt updates for 5.15
Some small fixes and cleanups for fs/crypto/:
 
 - Fix ->getattr() for ext4, f2fs, and ubifs to report the correct
   st_size for encrypted symlinks.
 
 - Use base64url instead of a custom Base64 variant.
 
 - Document struct fscrypt_operations.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCYS0HzhQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK+XZAQDfvDE9gK4Ii2uE4Jb5XYv4M/BnVhoR
 WIhNEoHROIGv+AEAtyfmeCMdpPobkWHFfAE1iBysl3iS56fibQhi2wqyuQI=
 =s6Wi
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt updates from Eric Biggers:
 "Some small fixes and cleanups for fs/crypto/:

   - Fix ->getattr() for ext4, f2fs, and ubifs to report the correct
     st_size for encrypted symlinks

   - Use base64url instead of a custom Base64 variant

   - Document struct fscrypt_operations"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: document struct fscrypt_operations
  fscrypt: align Base64 encoding with RFC 4648 base64url
  fscrypt: remove mention of symlink st_size quirk from documentation
  ubifs: report correct st_size for encrypted symlinks
  f2fs: report correct st_size for encrypted symlinks
  ext4: report correct st_size for encrypted symlinks
  fscrypt: add fscrypt_symlink_getattr() for computing st_size
2021-08-31 10:01:14 -07:00
Linus Torvalds e24c567b7e Initial merge of kernel smb3 file server, ksmbd
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmEr2okACgkQiiy9cAdy
 T1GucQwAoZykSKyy2AMbyYSap1Bm+CXX7YHoMlnqt0IF8ka9XKFj8nMBPgB8GKFE
 JLyikUxxeqgwbuOAkeyJu2knzX7QH18yTqz/YpytCx5waBgWdhLybn01xx/2n8fX
 ZA6E3Sb4aJKD7DO1Ia5t0u1lI4c2iZTIOwpoYgTX7VQi8VQYU/TN6aOIZR0Xgznw
 ffoxC4d9nlDjLzlTqY4oLnuIEQ0zHUBhndXxSy5Zh20cBUOLeWkOLUt1w1IkaR36
 f4k2UL3Y8njc9mKiOsQfFV/KE6YnPA0hZ48+h3xmWopjIJp5Wl7ublA/2c9rjxCf
 K5PwwgZ2bS5OpDvzSMPmquxGdQm9aPOlWwTbSL0NHe1ws/bD3Gx3/5/pjGJjyaYb
 i0YagAI8VaJPnPqBGdCfcwgV4NtshzSV2l9qfnkBG2hwkD4DldFt2qP4s/nQXPNt
 /bmEoTjuzrNhYbw3N2bbBQUK9CKhGywCbb2yAtSx5Me4FfxNANQ7lK1VLdfd6h4k
 o9esOBdF
 =yh1J
 -----END PGP SIGNATURE-----

Merge tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd

Pull initial ksmbd implementation from Steve French:
 "Initial merge of kernel smb3 file server, ksmbd.

  The SMB family of protocols is the most widely deployed network
  filesystem protocol, the default on Windows and Macs (and even on many
  phones and tablets), with clients and servers on all major operating
  systems, but lacked a kernel server for Linux. For many cases the
  current userspace server choices were suboptimal either due to memory
  footprint, performance or difficulty integrating well with advanced
  Linux features.

  ksmbd is a new kernel module which implements the server-side of the
  SMB3 protocol. The target is to provide optimized performance, GPLv2
  SMB server, and better lease handling (distributed caching). The
  bigger goal is to add new features more rapidly (e.g. RDMA aka
  "smbdirect", and recent encryption and signing improvements to the
  protocol) which are easier to develop on a smaller, more tightly
  optimized kernel server than for example in Samba.

  The Samba project is much broader in scope (tools, security services,
  LDAP, Active Directory Domain Controller, and a cross platform file
  server for a wider variety of purposes) but the user space file server
  portion of Samba has proved hard to optimize for some Linux workloads,
  including for smaller devices.

  This is not meant to replace Samba, but rather be an extension to
  allow better optimizing for Linux, and will continue to integrate well
  with Samba user space tools and libraries where appropriate. Working
  with the Samba team we have already made sure that the configuration
  files and xattrs are in a compatible format between the kernel and
  user space server.

  Various types of functional and regression tests are regularly run
  against it. One example is the automated 'buildbot' regression tests
  which use the Linux client to test against ksmbd, e.g.

     http://smb3-test-rhel-75.southcentralus.cloudapp.azure.com/#/builders/8/builds/56

  but other test suites, including Samba's smbtorture functional test
  suite are also used regularly"

* tag '5.15-rc-first-ksmbd-merge' of git://git.samba.org/ksmbd: (219 commits)
  ksmbd: fix __write_overflow warning in ndr_read_string
  MAINTAINERS: ksmbd: add cifs_common directory to ksmbd entry
  MAINTAINERS: ksmbd: update my email address
  ksmbd: fix permission check issue on chown and chmod
  ksmbd: don't set FILE DELETE and FILE_DELETE_CHILD in access mask by default
  MAINTAINERS: add git adddress of ksmbd
  ksmbd: update SMB3 multi-channel support in ksmbd.rst
  ksmbd: smbd: fix kernel oops during server shutdown
  ksmbd: remove select FS_POSIX_ACL in Kconfig
  ksmbd: use proper errno instead of -1 in smb2_get_ksmbd_tcon()
  ksmbd: update the comment for smb2_get_ksmbd_tcon()
  ksmbd: change int data type to boolean
  ksmbd: Fix multi-protocol negotiation
  ksmbd: fix an oops in error handling in smb2_open()
  ksmbd: add ipv6_addr_v4mapped check to know if connection from client is ipv4
  ksmbd: fix missing error code in smb2_lock
  ksmbd: use channel signingkey for binding SMB2 session setup
  ksmbd: don't set RSS capable in FSCTL_QUERY_NETWORK_INTERFACE_INFO
  ksmbd: Return STATUS_OBJECT_PATH_NOT_FOUND if smb2_creat() returns ENOENT
  ksmbd: fix -Wstringop-truncation warnings
  ...
2021-08-31 09:11:55 -07:00
Jan Kara 3a6541e97c ext4: Orphan file documentation
Add documentation about the orphan file feature.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210816095713.16537-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-08-30 23:36:51 -04:00
Linus Torvalds 6f01c935d9 File locking changes for v5.15.
-----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmEo3WcTHGpsYXl0b25A
 a2VybmVsLm9yZwAKCRAADmhBGVaCFbnuD/9Vj3dnXRgZ9LHFuQHp5Vy5yaGfujwP
 kIMN7bJiL0pCZ1zHapyn0cidZuTScDK2qMzGVR9Wrj/uYHEI+T08IEfaq/2CU3pS
 PdygHgqQi+WtJj4dks1OphVdlF0+46B/zy2YY0LE39zFUDO38VUIgb/gTiQ8JHIr
 BaDGtohlXmK8bDXo2XqMunSp7uQoRX92mv+bPjvIpsM60RnhPzhX0HeO/xhXO1PP
 OpswQG3nOTX1alZOXuSKdH7e3zfO+pLnC5NYeo5R4rys5xRiShU19OSAJfiPwf2w
 06OX7uLT7aF5LPQPjOP4MA9dHPWWeIuYnKzUpQZZ+D+zjuaxhF62saJG1Um0yQns
 qqPZgTWhxzkJ16bj94T2UZW0bufFKM5cEfBPFw9ZyzQshX5jIE2V5ecV+F/AdtTj
 EA55m/5N8NZMsNgnEOqn6o8TkWoV1seHKJXqja0+ZY9/Xs8GkmjHLhjNBCwGZIbw
 8S64Szp+yex3m1mRS+L6T0Gudic5WdFxOKQAzvvJcmT6u+ZGBzU0FjNTNNdnu3tp
 f9mANYxT1vNfYplRuNhMCp9oioSRpO2vfDQLCBRtsCy0QlKCum6qb7E/BcWRlwn1
 SOAFmjeXpr+VXGX+Rjp/bzCdHlNOVbR4ODWJ+h5D7QZaQwQ6FafkEq1D2yjPR0OR
 i722ECoM++vZ4w==
 =CfcC
 -----END PGP SIGNATURE-----

Merge tag 'locks-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking updates from Jeff Layton:
 "This starts with a couple of fixes for potential deadlocks in the
  fowner/fasync handling.

  The next patch removes the old mandatory locking code from the kernel
  altogether.

  The last patch cleans up rw_verify_area a bit more after the mandatory
  locking removal"

* tag 'locks-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  fs: clean up after mandatory file locking support removal
  fs: remove mandatory file locking support
  fcntl: fix potential deadlock for &fasync_struct.fa_lock
  fcntl: fix potential deadlocks for &fown_struct.lock
2021-08-30 12:38:13 -07:00
Linus Torvalds aa99f3c2b9 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmEmTZcACgkQnJ2qBz9k
 QNkkmAgArW6XoF1CePds/ZaC9vfg/nk66/zVo0n+J8xXjMWAPxcKbWFfV0uWVixq
 yk4lcLV47a2Mu/B/1oLNd3vrSmhwU+srWqNwOFn1nv+lP/6wJqr8oztRHn/0L9Q3
 ZSRrukSejbQ6AvTL/WzTNnCjjCc2ne3Kyko6W41aU6uyJuzhSM32wbx7qlV6t54Z
 iint9OrB4gM0avLohNafTUq6I+tEGzBMNwpCG/tqCmkcvDcv3rTDVAnPSCTm0Tx2
 hdrYDcY/rLxo93pDBaW1rYA/fohR+mIVye6k2TjkPAL6T1x+rxeT5qnc+YijH5yF
 sFPDhlD+ZsfOLi8stWXLOJ+8+gLODg==
 =pDBR
 -----END PGP SIGNATURE-----

Merge tag 'hole_punch_for_v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fs hole punching vs cache filling race fixes from Jan Kara:
 "Fix races leading to possible data corruption or stale data exposure
  in multiple filesystems when hole punching races with operations such
  as readahead.

  This is the series I was sending for the last merge window but with
  your objection fixed - now filemap_fault() has been modified to take
  invalidate_lock only when we need to create new page in the page cache
  and / or bring it uptodate"

* tag 'hole_punch_for_v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  filesystems/locking: fix Malformed table warning
  cifs: Fix race between hole punch and page fault
  ceph: Fix race between hole punch and page fault
  fuse: Convert to using invalidate_lock
  f2fs: Convert to using invalidate_lock
  zonefs: Convert to using invalidate_lock
  xfs: Convert double locking of MMAPLOCK to use VFS helpers
  xfs: Convert to use invalidate_lock
  xfs: Refactor xfs_isilocked()
  ext2: Convert to using invalidate_lock
  ext4: Convert to use mapping->invalidate_lock
  mm: Add functions to lock invalidate_lock for two mappings
  mm: Protect operations adding pages to page cache with invalidate_lock
  documentation: Sync file_operations members with reality
  mm: Fix comments mentioning i_mutex
2021-08-30 10:24:50 -07:00
Jeff Layton f7e33bdbd6 fs: remove mandatory file locking support
We added CONFIG_MANDATORY_FILE_LOCKING in 2015, and soon after turned it
off in Fedora and RHEL8. Several other distros have followed suit.

I've heard of one problem in all that time: Someone migrated from an
older distro that supported "-o mand" to one that didn't, and the host
had a fstab entry with "mand" in it which broke on reboot. They didn't
actually _use_ mandatory locking so they just removed the mount option
and moved on.

This patch rips out mandatory locking support wholesale from the kernel,
along with the Kconfig option and the Documentation file. It also
changes the mount code to ignore the "mand" mount option instead of
erroring out, and to throw a big, ugly warning.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
2021-08-23 06:15:36 -04:00
Gao Xiang 2a9dc7a8fe erofs: introduce chunk-based file on-disk format
Currently, uncompressed data except for tail-packing inline is
consecutive on disk.

In order to support chunk-based data deduplication, add a new
corresponding inode data layout.

In the future, the data source of chunks can be either (un)compressed.

Link: https://lore.kernel.org/r/20210820100019.208490-1-hsiangkao@linux.alibaba.com
Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-08-20 22:38:01 +08:00
Miklos Szeredi 0cad624662 vfs: add rcu argument to ->get_acl() callback
Add a rcu argument to the ->get_acl() callback to allow
get_cached_acl_rcu() to call the ->get_acl() method in the next patch.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-18 22:08:24 +02:00
Chao Yu b96d9b3b09 f2fs: fix to keep compatibility of fault injection interface
The value of FAULT_* macros and its description in f2fs.rst became
inconsistent, fix this to keep compatibility of fault injection
interface.

Fixes: 67883ade7a ("f2fs: remove FAULT_ALLOC_BIO")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-17 11:59:05 -07:00
Chao Yu 324105775c f2fs: support fault injection for f2fs_kmem_cache_alloc()
This patch supports to inject fault into f2fs_kmem_cache_alloc().

Usage:
a) echo 32768 > /sys/fs/f2fs/<dev>/inject_type or
b) mount -o fault_type=32768 <dev> <mountpoint>

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-17 11:59:05 -07:00
Fengnan Chang 4a4fc043f5 f2fs: compress: allow write compress released file after truncate to zero
For compressed file, after release compress blocks, don't allow write
direct, but we should allow write direct after truncate to zero.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Fengnan Chang <changfengnan@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-17 11:59:04 -07:00
Chengguang Xu b71759ef1e ovl: skip checking lower file's i_writecount on truncate
It is possible that a directory tree is shared between multiple overlay
instances as a lower layer.  In this case when one instance executes a file
residing on the lower layer, the other instance denies a truncate(2) call
on this file.

This only happens for truncate(2) and not for open(2) with the O_TRUNC
flag.

Fix this interference and inconsistency by removing the preliminary
i_writecount check before copy-up.

This means that unlike on normal filesystems truncate(argv[0]) will now
succeed.  If this ever causes a regression in a real world use case this
needs to be revisited.

One way to fix this properly would be to keep a correct i_writecount in the
overlay inode, but that is difficult due to memory mapping code only
dealing with the real file/inode.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-08-17 11:47:44 +02:00
Konstantin Komarov 12dad495ea
fs/ntfs3: Add Kconfig, Makefile and doc
This adds Kconfig, Makefile and doc

Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
2021-08-13 07:56:37 -07:00
Namjae Jeon 668fff0172 ksmbd: update SMB3 multi-channel support in ksmbd.rst
ksmbd start supporting SMB3 Multi-channel feature. Mark it as Partially
supported till replay/retry mechanisms are implemented.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-08-13 08:18:15 +09:00
Christian Brauner ad19607a90
doc: give a more thorough id handling explanation
Currently there's no document explaining how idmappings work at all.
Add a document that gives an introduction and also goes into a bit more
detail for more advanced use-cases.

Link: https://lore.kernel.org/r/20210727104416.828293-1-brauner@kernel.org
Cc: Seth Forshee <seth.forshee@digitalocean.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-08-11 15:28:32 +02:00
Gao Xiang 06252e9ce0 erofs: dax support for non-tailpacking regular file
DAX is quite useful for some VM use cases in order to save guest
memory extremely with minimal lightweight EROFS.

In order to prepare for such use cases, add preliminary dax support
for non-tailpacking regular files for now.

Tested with the DRAM-emulated PMEM and the EROFS image generated by
"mkfs.erofs -Enoinline_data enwik9.fsdax.img enwik9"

Link: https://lore.kernel.org/r/20210805003601.183063-3-hsiangkao@linux.alibaba.com
Cc: nvdimm@lists.linux.dev
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-08-10 00:14:59 +08:00
Chao Yu 4f993264fe f2fs: introduce discard_unit mount option
As James Z reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=213877

[1.] One-line summary of the problem:
Mount multiple SMR block devices exceed certain number cause system non-response

[2.] Full description of the problem/report:
Created some F2FS on SMR devices (mkfs.f2fs -m), then mounted in sequence. Each device is the same Model: HGST HSH721414AL (Size 14TB).
Empirically, found that when the amount of SMR device * 1.5Gb > System RAM, the system ran out of memory and hung. No dmesg output. For example, 24 SMR Disk need 24*1.5GB = 36GB. A system with 32G RAM can only mount 21 devices, the 22nd device will be a reproducible cause of system hang.
The number of SMR devices with other FS mounted on this system does not interfere with the result above.

[3.] Keywords (i.e., modules, networking, kernel):
F2FS, SMR, Memory

[4.] Kernel information
[4.1.] Kernel version (uname -a):
Linux 5.13.4-200.fc34.x86_64 #1 SMP Tue Jul 20 20:27:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

[4.2.] Kernel .config file:
Default Fedora 34 with f2fs-tools-1.14.0-2.fc34.x86_64

[5.] Most recent kernel version which did not have the bug:
None

[6.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/admin-guide/oops-tracing.rst)
None

[7.] A small shell script or example program which triggers the
     problem (if possible)
mount /dev/sdX /mnt/0X

[8.] Memory consumption

With 24 * 14T SMR Block device with F2FS
free -g
              total        used        free      shared  buff/cache   available
Mem:             46          36           0           0          10          10
Swap:             0           0           0

With 3 * 14T SMR Block device with F2FS
free -g
               total        used        free      shared  buff/cache   available
Mem:               7           5           0           0           1           1
Swap:              7           0           7

The root cause is, there are three bitmaps:
- cur_valid_map
- ckpt_valid_map
- discard_map
and each of them will cost ~500MB memory, {cur, ckpt}_valid_map are
necessary, but discard_map is optional, since this bitmap will only be
useful in mountpoint that small discard is enabled.

For a blkzoned device such as SMR or ZNS devices, f2fs will only issue
discard for a section(zone) when all blocks of that section are invalid,
so, for such device, we don't need small discard functionality at all.

This patch introduces a new mountoption "discard_unit=block|segment|
section" to support issuing discard with different basic unit which is
aligned to block, segment or section, so that user can specify
"discard_unit=segment" or "discard_unit=section" to disable small
discard functionality.

Note that this mount option can not be changed by remount() due to
related metadata need to be initialized during mount().

In order to save memory, let's use "discard_unit=section" for blkzoned
device by default.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-03 11:16:17 -07:00
Randy Dunlap 7882c55ef6 filesystems/locking: fix Malformed table warning
Update the bottom border to be the same as the top border.

Documentation/filesystems/locking.rst:274: WARNING: Malformed table.
Bottom/header table border does not match top border.

Fixes: 730633f0b7 ("mm: Protect operations adding pages to page cache with invalidate_lock")
Link: https://lore.kernel.org/r/20210727232212.12510-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-28 15:18:18 +02:00
Eric Biggers ba47b515f5 fscrypt: align Base64 encoding with RFC 4648 base64url
fscrypt uses a Base64 encoding to encode no-key filenames (the filenames
that are presented to userspace when a directory is listed without its
encryption key).  There are many variants of Base64, but the most common
ones are specified by RFC 4648.  fscrypt can't use the regular RFC 4648
"base64" variant because "base64" uses the '/' character, which isn't
allowed in filenames.  However, RFC 4648 also specifies a "base64url"
variant for use in URLs and filenames.  "base64url" is less common than
"base64", but it's still implemented in many programming libraries.

Unfortunately, what fscrypt actually uses is a custom Base64 variant
that differs from "base64url" in several ways:

- The binary data is divided into 6-bit chunks differently.

- Values 62 and 63 are encoded with '+' and ',' instead of '-' and '_'.

- '='-padding isn't used.  This isn't a problem per se, as the padding
  isn't technically necessary, and RFC 4648 doesn't strictly require it.
  But it needs to be properly documented.

There have been two attempts to copy the fscrypt Base64 code into lib/
(https://lkml.kernel.org/r/20200821182813.52570-6-jlayton@kernel.org and
https://lkml.kernel.org/r/20210716110428.9727-5-hare@suse.de), and both
have been caught up by the fscrypt Base64 variant being nonstandard and
not properly documented.  Also, the planned use of the fscrypt Base64
code in the CephFS storage back-end will prevent it from being changed
later (whereas currently it can still be changed), so we need to choose
an encoding that we're happy with before it's too late.

Therefore, switch the fscrypt Base64 variant to base64url, in order to
align more closely with RFC 4648 and other implementations and uses of
Base64.  However, I opted not to implement '='-padding, as '='-padding
adds complexity, is unnecessary, and isn't required by the RFC.

Link: https://lore.kernel.org/r/20210718000125.59701-1-ebiggers@kernel.org
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-07-25 20:47:05 -07:00
Eric Biggers e538b0985a fscrypt: remove mention of symlink st_size quirk from documentation
Now that the correct st_size is reported for encrypted symlinks on all
filesystems, update the documentation accordingly.

Link: https://lore.kernel.org/r/20210702065350.209646-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2021-07-25 20:01:07 -07:00
Robert Richter 5e60f363b3 Documentation: Fix intiramfs script name
Documentation was not changed when renaming the script in commit
80e715a06c ("initramfs: rename gen_initramfs_list.sh to
gen_initramfs.sh"). Fixing this.

Basically does:

 $ sed -i -e s/gen_initramfs_list.sh/gen_initramfs.sh/g $(git grep -l gen_initramfs_list.sh)

Fixes: 80e715a06c ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh")
Signed-off-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-07-18 23:48:14 +09:00
Jan Kara 730633f0b7 mm: Protect operations adding pages to page cache with invalidate_lock
Currently, serializing operations such as page fault, read, or readahead
against hole punching is rather difficult. The basic race scheme is
like:

fallocate(FALLOC_FL_PUNCH_HOLE)			read / fault / ..
  truncate_inode_pages_range()
						  <create pages in page
						   cache here>
  <update fs block mapping and free blocks>

Now the problem is in this way read / page fault / readahead can
instantiate pages in page cache with potentially stale data (if blocks
get quickly reused). Avoiding this race is not simple - page locks do
not work because we want to make sure there are *no* pages in given
range. inode->i_rwsem does not work because page fault happens under
mmap_sem which ranks below inode->i_rwsem. Also using it for reads makes
the performance for mixed read-write workloads suffer.

So create a new rw_semaphore in the address_space - invalidate_lock -
that protects adding of pages to page cache for page faults / reads /
readahead.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-13 13:14:27 +02:00
Jan Kara c625b4cc57 documentation: Sync file_operations members with reality
Sync listing of struct file_operations members with the real one in
fs.h.

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-07-13 12:47:55 +02:00