WSL2-Linux-Kernel

Commit graph

Author	SHA1	Message	Date
Chuck Lever	071eb319ce	NFSD: Fix reads with a non-zero offset that don't end on a page boundary [ Upstream commit `ac8db824ea` ] This was found when virtual machines with nfs-mounted qcow2 disks failed to boot properly. Reported-by: Anders Blomdell <anders.blomdell@control.lth.se> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2142132 Fixes: `bfbfb6182a` ("nfsd_splice_actor(): handle compound pages") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Chuck Lever	e4d7874308	NFSD: Fix trace_nfsd_fh_verify_err() crasher [ Upstream commit `5a01c80544` ] Now that the nfsd_fh_verify_err() tracepoint is always called on error, it needs to handle cases where the filehandle is not yet fully formed. Fixes: `93c128e709` ("nfsd: ensure we always call fh_verify_error tracepoint") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Jeff Layton	3f439c7701	nfsd: put the export reference in nfsd4_verify_deleg_dentry [ Upstream commit `50256e4793` ] nfsd_lookup_dentry returns an export reference in addition to the dentry ref. Ensure that we put it too. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138866 Fixes: `876c553cb4` ("NFSD: verify the opened dentry after setting a delegation") Reported-by: Yongcheng Yang <yoyang@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Jeff Layton	98d400fc2d	nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint [ Upstream commit `bdd6b5624c` ] When we fail to insert into the hashtable with a non-retryable error, we'll free the object and then goto out_status. If the tracepoint is enabled, it'll end up accessing the freed object when it tries to grab the fields out of it. Set nf to NULL after freeing it to avoid the issue. Fixes: `243a526301` ("nfsd: rework hashtable handling in nfsd_do_file_acquire") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Jeff Layton	3ec2c9976c	nfsd: fix net-namespace logic in __nfsd_file_cache_purge [ Upstream commit `d3aefd2b29` ] If the namespace doesn't match the one in "net", then we'll continue, but that doesn't cause another rhashtable_walk_next call, so it will loop infinitely. Fixes: `ce502f81ba` ("NFSD: Convert the filecache to use rhashtable") Reported-by: Petr Vorel <pvorel@suse.cz> Link: https://lore.kernel.org/ltp/Y1%2FP8gDAcWC%2F+VR3@pevik/ Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Jeff Layton	f17c07f8ea	nfsd: ensure we always call fh_verify_error tracepoint [ Upstream commit `93c128e709` ] This is a conditional tracepoint. Call it every time, not just when nfs_permission fails. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Tetsuo Handa	15d01caf35	NFSD: unregister shrinker when nfsd_init_net() fails [ Upstream commit `bd86c69dae` ] syzbot is reporting UAF read at register_shrinker_prepared() [1], for commit `7746b32f46` ("NFSD: add shrinker to reap courtesy clients on low memory condition") missed that nfsd4_leases_net_shutdown() from nfsd_exit_net() is called only when nfsd_init_net() succeeded. If nfsd_init_net() fails due to nfsd_reply_cache_init() failure, register_shrinker() from nfsd4_init_leases_net() has to be undone before nfsd_init_net() returns. Link: https://syzkaller.appspot.com/bug?extid=ff796f04613b4c84ad89 [1] Reported-by: syzbot <syzbot+ff796f04613b4c84ad89@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: `7746b32f46` ("NFSD: add shrinker to reap courtesy clients on low memory condition") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Jeff Layton	d1b0ceeac1	nfsd: rework hashtable handling in nfsd_do_file_acquire [ Upstream commit `243a526301` ] nfsd_file is RCU-freed, so we need to hold the rcu_read_lock long enough to get a reference after finding it in the hash. Take the rcu_read_lock() and call rhashtable_lookup directly. Switch to using rhashtable_lookup_insert_key as well, and use the usual retry mechanism if we hit an -EEXIST. Rename the "retry" bool to open_retry, and eliminate the insert_err goto target. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Jeff Layton	405ade5b56	nfsd: fix nfsd_file_unhash_and_dispose [ Upstream commit `8d0d254b15` ] nfsd_file_unhash_and_dispose() is called for two reasons: We're either shutting down and purging the filecache, or we've gotten a notification about a file delete, so we want to go ahead and unhash it so that it'll get cleaned up when we close. We're either walking the hashtable or doing a lookup in it and we don't take a reference in either case. What we want to do in both cases is to try and unhash the object and put it on the dispose list if that was successful. If it's no longer hashed, then we don't want to touch it, with the assumption being that something else is already cleaning up the sentinel reference. Instead of trying to selectively decrement the refcount in this function, just unhash it, and if that was successful, move it to the dispose list. Then, the disposal routine will just clean that up as usual. Also, just make this a void function, drop the WARN_ON_ONCE, and the comments about deadlocking since the nature of the purported deadlock is no longer clear. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:20 +02:00
Jeff Layton	3af497e3f7	nfsd: extra checks when freeing delegation stateids [ Upstream commit `895ddf5ed4` ] We've had some reports of problems in the refcounting for delegation stateids that we've yet to track down. Add some extra checks to ensure that we've removed the object from various lists before freeing it. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2127067 Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:19 +02:00
Jeff Layton	e55378bce5	nfsd: make nfsd4_run_cb a bool return function [ Upstream commit `b95239ca49` ] queue_work can return false and not queue anything, if the work is already queued. If that happens in the case of a CB_RECALL, we'll have taken an extra reference to the stid that will never be put. Ensure we throw a warning in that case. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:19 +02:00
Jeff Layton	f6279fa0dc	nfsd: fix comments about spinlock handling with delegations [ Upstream commit `25fbe1fca1` ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:19 +02:00
Jeff Layton	ecb0eb07ee	nfsd: only fill out return pointer on success in nfsd4_lookup_stateid [ Upstream commit `4d01416ab4` ] In the case of a revoked delegation, we still fill out the pointer even when returning an error, which is bad form. Only overwrite the pointer on success. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:19 +02:00
Chuck Lever	4ad28d583e	NFSD: Cap rsize_bop result based on send buffer size [ Upstream commit `76ce4dcec0` ] Since before the git era, NFSD has conserved the number of pages held by each nfsd thread by combining the RPC receive and send buffers into a single array of pages. This works because there are no cases where an operation needs a large RPC Call message and a large RPC Reply at the same time. Once an RPC Call has been received, svc_process() updates svc_rqst::rq_res to describe the part of rq_pages that can be used for constructing the Reply. This means that the send buffer (rq_res) shrinks when the received RPC record containing the RPC Call is large. Add an NFSv4 helper that computes the size of the send buffer. It replaces svc_max_payload() in spots where svc_max_payload() returns a value that might be larger than the remaining send buffer space. Callers who need to know the transport's actual maximum payload size will continue to use svc_max_payload(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:19 +02:00
Chuck Lever	4794c948de	NFSD: Rename the fields in copy_stateid_t [ Upstream commit `781fde1a2b` ] Code maintenance: The name of the copy_stateid_t::sc_count field collides with the sc_count field in struct nfs4_stid, making the latter difficult to grep for when auditing stateid reference counting. No behavior change expected. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:19 +02:00
ChenXiaoSong	0793ec49ba	nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops [ Upstream commit `1342f9dd3f` ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:19 +02:00
ChenXiaoSong	815efd78cb	nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_reply_cache_stats_fops [ Upstream commit `64776611a0` ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. nfsd_net is converted from seq_file->file instead of seq_file->private in nfsd_reply_cache_stats_show(). Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> [ cel: reduce line length ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
ChenXiaoSong	861a163d49	nfsd: use DEFINE_SHOW_ATTRIBUTE to define client_info_fops [ Upstream commit `1d7f6b302b` ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. inode is converted from seq_file->file instead of seq_file->private in client_info_show(). Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
ChenXiaoSong	25e0dd89d4	nfsd: use DEFINE_SHOW_ATTRIBUTE to define export_features_fops and supported_enctypes_fops [ Upstream commit `9beeaab8e0` ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> [ cel: reduce line length ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
ChenXiaoSong	685d01c2b2	nfsd: use DEFINE_PROC_SHOW_ATTRIBUTE to define nfsd_proc_ops [ Upstream commit `0cfb0c4228` ] Use DEFINE_PROC_SHOW_ATTRIBUTE helper macro to simplify the code. Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
Chuck Lever	82fbfbe92d	NFSD: Pack struct nfsd4_compoundres [ Upstream commit `9f553e61bd` ] Remove a couple of 4-byte holes on platforms with 64-bit pointers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
Chuck Lever	cd8bcaeeae	NFSD: Remove unused nfsd4_compoundargs::cachetype field [ Upstream commit `77e378cf2a` ] This field was added by commit `1091006c5e` ("nfsd: turn on reply cache for NFSv4") but was never put to use. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
Chuck Lever	ba3bd2bf0c	NFSD: Remove "inline" directives on op_rsize_bop helpers [ Upstream commit `6604148cf9` ] These helpers are always invoked indirectly, so the compiler can't inline these anyway. While we're updating the synopses of these helpers, defensively convert their parameters to const pointers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
Chuck Lever	d8d3a672e1	NFSD: Clean up nfs4svc_encode_compoundres() [ Upstream commit `9993a66317` ] In today's Linux NFS server implementation, the NFS dispatcher initializes each XDR result stream, and the NFSv4 .pc_func and .pc_encode methods all use xdr_stream-based encoding. This keeps rq_res.len automatically updated. There is no longer a need for the WARN_ON_ONCE() check in nfs4svc_encode_compoundres(). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
Chuck Lever	fc47f8ddfc	NFSD: Clean up WRITE arg decoders [ Upstream commit `d4da5baa53` ] xdr_stream_subsegment() already returns a boolean value. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:18 +02:00
Chuck Lever	b3f3b21ed2	NFSD: Use xdr_inline_decode() to decode NFSv3 symlinks [ Upstream commit `c3d2a04f05` ] Replace the check for buffer over/underflow with a helper that is commonly used for this purpose. The helper also sets xdr->nwords correctly after successfully linearizing the symlink argument into the stream's scratch buffer. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	cab5399262	NFSD: Refactor common code out of dirlist helpers [ Upstream commit `98124f5bd6` ] The dust has settled a bit and it's become obvious what code is totally common between nfsd_init_dirlist_pages() and nfsd3_init_dirlist_pages(). Move that common code to SUNRPC. The new helper brackets the existing xdr_init_decode_pages() API. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	07b68ff5c7	NFSD: Reduce amount of struct nfsd4_compoundargs that needs clearing [ Upstream commit `3fdc546462` ] Have SunRPC clear everything except for the iops array. Then have each NFSv4 XDR decoder clear its own argument before decoding. Now individual operations may have a large argument struct while not penalizing the vast majority of operations with a small struct. And, clearing the argument structure occurs as the argument fields are initialized, enabling the CPU to do write combining on that memory. In some cases, clearing is not even necessary because all of the fields in the argument structure are initialized by the decoder. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	2005eba603	SUNRPC: Parametrize how much of argsize should be zeroed [ Upstream commit `103cc1fafe` ] Currently, SUNRPC clears the whole of .pc_argsize before processing each incoming RPC transaction. Add an extra parameter to struct svc_procedure to enable upper layers to reduce the amount of each operation's argument structure that is zeroed by SUNRPC. The size of struct nfsd4_compoundargs, in particular, is a lot to clear on each incoming RPC Call. A subsequent patch will cut this down to something closer to what NFSv2 and NFSv3 uses. This patch should cause no behavior changes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Dai Ngo	9acc481242	NFSD: add shrinker to reap courtesy clients on low memory condition [ Upstream commit `7746b32f46` ] Add courtesy_client_reaper to react to low memory condition triggered by the system memory shrinker. The delayed_work for the courtesy_client_reaper is scheduled on the shrinker's count callback using the laundry_wq. The shrinker's scan callback is not used for expiring the courtesy clients due to potential deadlocks. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Dai Ngo	8c9e5ad103	NFSD: keep track of the number of courtesy clients in the system [ Upstream commit `3a4ea23d86` ] Add counter nfs4_courtesy_client_count to nfsd_net to keep track of the number of courtesy clients in the system. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	c65977020b	NFSD: Make nfsd4_remove() wait before returning NFS4ERR_DELAY [ Upstream commit `5f5f8b6d65` ] nfsd_unlink() can kick off a CB_RECALL (via vfs_unlink() -> leases_conflict()) if a delegation is present. Before returning NFS4ERR_DELAY, give the client holding that delegation a chance to return it and then retry the nfsd_unlink() again, once. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354 Tested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	d1ee3403e0	NFSD: Make nfsd4_rename() wait before returning NFS4ERR_DELAY [ Upstream commit `68c522afd0` ] nfsd_rename() can kick off a CB_RECALL (via vfs_rename() -> leases_conflict()) if a delegation is present. Before returning NFS4ERR_DELAY, give the client holding that delegation a chance to return it and then retry the nfsd_rename() again, once. This version of the patch handles renaming an existing file, but does not deal with renaming onto an existing file. That case will still always trigger an NFS4ERR_DELAY. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354 Tested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	50aa6a80d0	NFSD: Make nfsd4_setattr() wait before returning NFS4ERR_DELAY [ Upstream commit `34b91dda71` ] nfsd_setattr() can kick off a CB_RECALL (via notify_change() -> break_lease()) if a delegation is present. Before returning NFS4ERR_DELAY, give the client holding that delegation a chance to return it and then retry the nfsd_setattr() again, once. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=354 Tested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	9863ece99e	NFSD: Refactor nfsd_setattr() [ Upstream commit `c0aa1913db` ] Move code that will be retried (in a subsequent patch) into a helper function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:17 +02:00
Chuck Lever	8a3c48cd53	NFSD: Add a mechanism to wait for a DELEGRETURN [ Upstream commit `c035362eb9` ] Subsequent patches will use this mechanism to wake up an operation that is waiting for a client to return a delegation. The new tracepoint records whether the wait timed out or was properly awoken by the expected DELEGRETURN: nfsd-1155 [002] 83799.493199: nfsd_delegret_wakeup: xid=0x14b7d6ef fh_hash=0xf6826792 (timed out) Suggested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Chuck Lever	bcd4c75115	NFSD: Add tracepoints to report NFSv4 callback completions [ Upstream commit `1035d65446` ] Wireshark has always been lousy about dissecting NFSv4 callbacks, especially NFSv4.0 backchannel requests. Add tracepoints so we can surgically capture these events in the trace log. Tracepoints are time-stamped and ordered so that we can now observe the timing relationship between a CB_RECALL Reply and the client's DELEGRETURN Call. Example: nfsd-1153 [002] 211.986391: nfsd_cb_recall: addr=192.168.1.67:45767 client 62ea82e4:fee7492a stateid 00000003:00000001 nfsd-1153 [002] 212.095634: nfsd_compound: xid=0x0000002c opcnt=2 nfsd-1153 [002] 212.095647: nfsd_compound_status: op=1/2 OP_PUTFH status=0 nfsd-1153 [002] 212.095658: nfsd_file_put: hash=0xf72 inode=0xffff9291148c7410 ref=3 flags=HASHED\|REFERENCED may=READ file=0xffff929103b3ea00 nfsd-1153 [002] 212.095661: nfsd_compound_status: op=2/2 OP_DELEGRETURN status=0 kworker/u25:8-148 [002] 212.096713: nfsd_cb_recall_done: client 62ea82e4:fee7492a stateid 00000003:00000001 status=0 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Chuck Lever	3fe32c519b	NFSD: Trace NFSv4 COMPOUND tags [ Upstream commit `de29cf7e6c` ] The Linux NFSv4 client implementation does not use COMPOUND tags, but the Solaris and MacOS implementations do, and so does pynfs. Record these eye-catchers in the server's trace buffer to annotate client requests while troubleshooting. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Chuck Lever	62980365d6	NFSD: Replace dprintk() call site in fh_verify() [ Upstream commit `948755efc9` ] Record permission errors in the trace log. Note that the new trace event is conditional, so it will only record non-zero return values from nfsd_permission(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Gaosheng Cui	5118eb6c29	nfsd: remove nfsd4_prepare_cb_recall() declaration [ Upstream commit `18224dc58d` ] nfsd4_prepare_cb_recall() has been removed since commit `0162ac2b97` ("nfsd: introduce nfsd4_callback_ops"), so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Jeff Layton	4440588b93	nfsd: clean up mounted_on_fileid handling [ Upstream commit `6106d9119b` ] We only need the inode number for this, not a full rack of attributes. Rename this function, make it take a pointer to a u64 instead of struct kstat, and change it to just request STATX_INO. Signed-off-by: Jeff Layton <jlayton@kernel.org> [ cel: renamed get_mounted_on_ino() ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
NeilBrown	5f6f6b2a3b	NFSD: drop fname and flen args from nfsd_create_locked() [ Upstream commit `9558f9304c` ] nfsd_create_locked() does not use the "fname" and "flen" arguments, so drop them from declaration and all callers. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Chuck Lever	37f3b9c398	NFSD: Increase NFSD_MAX_OPS_PER_COMPOUND [ Upstream commit `80e591ce63` ] When attempting an NFSv4 mount, a Solaris NFSv4 client builds a single large COMPOUND that chains a series of LOOKUPs to get to the pseudo filesystem root directory that is to be mounted. The Linux NFS server's current maximum of 16 operations per NFSv4 COMPOUND is not large enough to ensure that this works for paths that are more than a few components deep. Since NFSD_MAX_OPS_PER_COMPOUND is mostly a sanity check, and most NFSv4 COMPOUNDS are between 3 and 6 operations (thus they do not trigger any re-allocation of the operation array on the server), increasing this maximum should result in little to no impact. The ops array can get large now, so allocate it via vmalloc() to help ensure memory fragmentation won't cause an allocation failure. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216383 Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Christophe JAILLET	56ffc3ab88	nfsd: Propagate some error code returned by memdup_user() [ Upstream commit `30a30fcc3f` ] Propagate the error code returned by memdup_user() instead of a hard coded -EFAULT. Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Christophe JAILLET	371d2d25bf	nfsd: Avoid some useless tests [ Upstream commit `d44899b8bb` ] memdup_user() can't return NULL, so there is no point for checking for it. Simplify some tests accordingly. Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:16 +02:00
Jinpeng Cui	211014047e	NFSD: remove redundant variable status [ Upstream commit `4ab3442ca3` ] Return value directly from fh_verify() do_open_permission() exp_pseudoroot() instead of getting value from redundant variable status. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Jinpeng Cui <cui.jinpeng2@zte.com.cn> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
Olga Kornievskaia	5b6441a5d3	NFSD enforce filehandle check for source file in COPY [ Upstream commit `754035ff79` ] If the passed in filehandle for the source file in the COPY operation is not a regular file, the server MUST return NFS4ERR_WRONG_TYPE. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> [ cel: adjusted to apply to v5.15.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
Wolfram Sang	574ec47ac8	NFSD: move from strlcpy with unused retval to strscpy [ Upstream commit `72f78ae00a` ] Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
Al Viro	460743da0e	nfsd_splice_actor(): handle compound pages [ Upstream commit `bfbfb6182a` ] pipe_buffer might refer to a compound page (and contain more than a PAGE_SIZE worth of data). Theoretically it had been possible since way back, but nfsd_splice_actor() hadn't run into that until copy_page_to_iter() change. Fortunately, the only thing that changes for compound pages is that we need to stuff each relevant subpage in and convert the offset into offset in the first subpage. Acked-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Fixes: `f0f6b614f8` "copy_page_to_iter(): don't split high-order page in case of ITER_PIPE" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
NeilBrown	c9cb32ad42	NFSD: fix regression with setting ACLs. [ Upstream commit `00801cd92d` ] A recent patch moved ACL setting into nfsd_setattr(). Unfortunately it didn't work as nfsd_setattr() aborts early if iap->ia_valid is 0. Remove this test, and instead avoid calling notify_change() when ia_valid is 0. This means that nfsd_setattr() will now always lock the inode. Previously it didn't if only a ATTR_MODE change was requested on a symlink (see Commit `15b7a1b86d` ("[PATCH] knfsd: fix setattr-on-symlink error return")). I don't think this change really matters. Fixes: `c0cbe70742` ("NFSD: add posix ACLs to struct nfsd_attrs") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
NeilBrown	4b910dd7fe	NFSD: discard fh_locked flag and fh_lock/fh_unlock [ Upstream commit `dd8dd403d7` ] As all inode locking is now fully balanced, fh_put() does not need to call fh_unlock(). fh_lock() and fh_unlock() are no longer used, so discard them. These are the only real users of ->fh_locked, so discard that too. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
NeilBrown	7538fc9cba	NFSD: use (un)lock_inode instead of fh_(un)lock for file operations [ Upstream commit `bb4d53d66e` ] When locking a file to access ACLs and xattrs etc, use explicit locking with inode_lock() instead of fh_lock(). This means that the calls to fh_fill_pre/post_attr() are also explicit which improves readability and allows us to place them only where they are needed. Only the xattr calls need pre/post information. When locking a file we don't need I_MUTEX_PARENT as the file is not a parent of anything, so we can use inode_lock() directly rather than the inode_lock_nested() call that fh_lock() uses. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
NeilBrown	e0335e7c4a	NFSD: use explicit lock/unlock for directory ops [ Upstream commit `debf16f0c6` ] When creating or unlinking a name in a directory use explicit inode_lock_nested() instead of fh_lock(), and explicit calls to fh_fill_pre_attrs() and fh_fill_post_attrs(). This is already done for renames, with lock_rename() as the explicit locking. Also move the 'fill' calls closer to the operation that might change the attributes. This way they are avoided on some error paths. For the v2-only code in nfsproc.c, the fill calls are not replaced as they aren't needed. Making the locking explicit will simplify proposed future changes to locking for directories. It also makes it easily visible exactly where pre/post attributes are used - not all callers of fh_lock() actually need the pre/post attributes. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
NeilBrown	ebd1b016ad	NFSD: reduce locking in nfsd_lookup() [ Upstream commit `19d008b469` ] nfsd_lookup() takes an exclusive lock on the parent inode, but no callers want the lock and it may not be needed at all if the result is in the dcache. Change nfsd_lookup_dentry() to not take the lock, and call lookup_one_len_locked() which takes lock only if needed. nfsd4_open() currently expects the lock to still be held, but that isn't necessary as nfsd_validate_delegated_dentry() provides required guarantees without the lock. NOTE: NFSv4 requires directory changeinfo for OPEN even when a create wasn't requested and no change happened. Now that nfsd_lookup() doesn't use fh_lock(), we need to explicitly fill the attributes when no create happens. A new fh_fill_both_attrs() is provided for that task. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:15 +02:00
NeilBrown	ba4b518a23	NFSD: only call fh_unlock() once in nfsd_link() [ Upstream commit `e18bcb33bc` ] On non-error paths, nfsd_link() calls fh_unlock() twice. This is safe because fh_unlock() records that the unlock has been done and doesn't repeat it. However it makes the code a little confusing and interferes with changes that are planned for directory locking. So rearrange the code to ensure fh_unlock() is called exactly once if fh_lock() was called. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
NeilBrown	ff01da71e4	NFSD: always drop directory lock in nfsd_unlink() [ Upstream commit `b677c0c63a` ] Some error paths in nfsd_unlink() allow it to exit without unlocking the directory. This is not a problem in practice as the directory will be unlocked with an fh_put(), but it is untidy and potentially confusing. This allows us to remove all the fh_unlock() calls that are immediately after nfsd_unlink() calls. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
NeilBrown	4655bcbce7	NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning. [ Upstream commit `927bfc5600` ] nfsd_create() usually returns with the directory still locked. nfsd_symlink() usually returns with it unlocked. This is clumsy. Until recently nfsd_create() needed to keep the directory locked until ACLs and security label had been set. These are now set inside nfsd_create() (in nfsd_setattr()) so this need is gone. So change nfsd_create() and nfsd_symlink() to always unlock, and remove any fh_unlock() calls that follow calls to these functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
NeilBrown	d52acd23a3	NFSD: add posix ACLs to struct nfsd_attrs [ Upstream commit `c0cbe70742` ] pacl and dpacl pointers are added to struct nfsd_attrs, which requires that we have an nfsd_attrs_free() function to free them. Those nfsv4 functions that can set ACLs now set up these pointers based on the passed in NFSv4 ACL. nfsd_setattr() sets the acls as appropriate. Errors are handled as with security labels. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
NeilBrown	a3f27177c2	NFSD: add security label to struct nfsd_attrs [ Upstream commit `d6a97d3f58` ] nfsd_setattr() now sets a security label if provided, and nfsv4 provides it in the 'open' and 'create' paths and the 'setattr' path. If setting the label failed (including because the kernel doesn't support labels), an error field in 'struct nfsd_attrs' is set, and the caller can respond. The open/create callers clear FATTR4_WORD2_SECURITY_LABEL in the returned attr set in this case. The setattr caller returns the error. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
NeilBrown	8a26a1b5c8	NFSD: set attributes when creating symlinks [ Upstream commit `93adc1e391` ] The NFS protocol includes attributes when creating symlinks. Linux does store attributes for symlinks and allows them to be set, though they are not used for permission checking. NFSD currently doesn't set standard (struct iattr) attributes when creating symlinks, but for NFSv4 it does set ACLs and security labels. This is inconsistent. To improve consistency, pass the provided attributes into nfsd_symlink() and call nfsd_create_setattr() to set them. NOTE: this results in a behaviour change for all NFS versions when the client sends non-default attributes with a SYMLINK request. With the Linux client, the only attributes are: attr.ia_mode = S_IFLNK | S_IRWXUGO; attr.ia_valid = ATTR_MODE; so the final outcome will be unchanged. Other clients might send different attributes, and if they did they probably expect them to be honoured. We ignore any error from nfsd_create_setattr(). It isn't really clear what should be done if a file is successfully created, but the attributes cannot be set. NFS doesn't allow partial success to be reported. Reporting failure is probably more misleading than reporting success, so the status is ignored. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
NeilBrown	1835147948	NFSD: introduce struct nfsd_attrs [ Upstream commit `7fe2a71dda` ] The attributes that nfsd might want to set on a file include 'struct iattr' as well as an ACL and security label. The latter two are passed around quite separately from the first, in part because they are only needed for NFSv4. This leads to some clumsiness in the code, such as the attributes NOT being set in nfsd_create_setattr(). We need to keep the directory locked until all attributes are set to ensure the file is never visible without all its attributes. This need combined with the inconsistent handling of attributes leads to more clumsiness. As a first step towards tidying this up, introduce 'struct nfsd_attrs'. This is passed (by reference) to vfs.c functions that work with attributes, and is assembled by the various nfs*proc functions which call them. As yet only iattr is included, but future patches will expand this. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
Jeff Layton	162f99ff7b	NFSD: verify the opened dentry after setting a delegation [ Upstream commit `876c553cb4` ] Between opening a file and setting a delegation on it, someone could rename or unlink the dentry. If this happens, we do not want to grant a delegation on the open. On a CLAIM_NULL open, we're opening by filename, and we may (in the non-create case) or may not (in the create case) be holding i_rwsem when attempting to set a delegation. The latter case allows a race. After getting a lease, redo the lookup of the file being opened and validate that the resulting dentry matches the one in the open file description. To properly redo the lookup we need an rqst pointer to pass to nfsd_lookup_dentry(), so make sure that is available. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
Jeff Layton	3a5ab224a8	NFSD: drop fh argument from alloc_init_deleg [ Upstream commit `bbf936edd5` ] Currently, we pass the fh of the opened file down through several functions so that alloc_init_deleg can pass it to delegation_blocked. The filehandle of the open file is available in the nfs4_file however, so there's no need to pass it in a separate argument. Drop the argument from alloc_init_deleg, nfs4_open_delegation and nfs4_set_delegation. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:14 +02:00
Chuck Lever	b6494b36b8	NFSD: Move copy offload callback arguments into a separate structure [ Upstream commit `a11ada99ce` ] Refactor so that CB_OFFLOAD arguments can be passed without allocating a whole struct nfsd4_copy object. On my system (x86_64) this removes another 96 bytes from struct nfsd4_copy. [ cel: adjusted to apply to v5.15.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	8918b50537	NFSD: Add nfsd4_send_cb_offload() [ Upstream commit `e72f9bc006` ] Refactor for legibility. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	bb1eb97558	NFSD: Remove kmalloc from nfsd4_do_async_copy() [ Upstream commit `ad1e46c9b0` ] Instead of manufacturing a phony struct nfsd_file, pass the struct file returned by nfs42_ssc_open() directly to nfsd4_do_copy(). [ cel: adjusted to apply to v5.15.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	9cecf4772e	NFSD: Refactor nfsd4_do_copy() [ Upstream commit `3b7bf5933c` ] Refactor: Now that nfsd4_do_copy() no longer calls the cleanup helpers, plumb the use of struct file pointers all the way down to _nfsd_copy_file_range(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	a48454785b	NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2) [ Upstream commit `478ed7b10d` ] Move the nfsd4_cleanup_*() call sites out of nfsd4_do_copy(). A subsequent patch will modify one of the new call sites to avoid the need to manufacture the phony struct nfsd_file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	4952fe6689	NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2) [ Upstream commit `24d796ea38` ] The @src parameter is sometimes a pointer to a struct nfsd_file and sometimes a pointer to struct file hiding in a phony struct nfsd_file. Refactor nfsd4_cleanup_inter_ssc() so the @src parameter is always an explicit struct file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	6cb00ba230	NFSD: Replace boolean fields in struct nfsd4_copy [ Upstream commit `1913cdf56c` ] Clean up: saves 8 bytes, and we can replace check_and_set_stop_copy() with an atomic bitop. [ cel: adjusted to apply to v5.15.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	6ff95a5f72	NFSD: Make nfs4_put_copy() static [ Upstream commit `8ea6e2c90b` ] Clean up: All call sites are in fs/nfsd/nfs4proc.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	9a99c7f5d9	NFSD: Reorder the fields in struct nfsd4_op [ Upstream commit `d314309425` ] Pack the fields to reduce the size of struct nfsd4_op, which is used as an array in struct nfsd4_compoundargs. sizeof(struct nfsd4_op): Before: /* size: 672, cachelines: 11, members: 5 */ After: /* size: 640, cachelines: 10, members: 5 */ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	7d1e44fd06	NFSD: Shrink size of struct nfsd4_copy [ Upstream commit `87689df694` ] struct nfsd4_copy is part of struct nfsd4_op, which resides in an 8-element array. sizeof(struct nfsd4_op): Before: /* size: 1696, cachelines: 27, members: 5 */ After: /* size: 672, cachelines: 11, members: 5 */ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:13 +02:00
Chuck Lever	24286575c6	NFSD: Shrink size of struct nfsd4_copy_notify [ Upstream commit `09426ef2a6` ] struct nfsd4_copy_notify is part of struct nfsd4_op, which resides in an 8-element array. sizeof(struct nfsd4_op): Before: /* size: 2208, cachelines: 35, members: 5 */ After: /* size: 1696, cachelines: 27, members: 5 */ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	00eb5bd384	NFSD: nfserrno(-ENOMEM) is nfserr_jukebox [ Upstream commit `bb4d842722` ] Suggested-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	9032c8e3ae	NFSD: Fix strncpy() fortify warning [ Upstream commit `5304877936` ] In function ‘strncpy’, inlined from ‘nfsd4_ssc_setup_dul’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1392:3, inlined from ‘nfsd4_interssc_connect’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1489:11: /home/cel/src/linux/manet/include/linux/fortify-string.h:52:33: warning: ‘__builtin_strncpy’ specified bound 63 equals destination size [-Wstringop-truncation] 52 | #define __underlying_strncpy __builtin_strncpy | ^ /home/cel/src/linux/manet/include/linux/fortify-string.h:89:16: note: in expansion of macro ‘__underlying_strncpy’ 89 | return __underlying_strncpy(p, q, size); | ^~~~~~~~~~~~~~~~~~~~ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	0dfb192896	NFSD: Clean up nfsd4_encode_readlink() [ Upstream commit `99b002a1fa` ] Similar changes to nfsd4_encode_readv(), all bundled into a single patch. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	fc7380a198	NFSD: Use xdr_pad_size() [ Upstream commit `5e64d85c7d` ] Clean up: Use a helper instead of open-coding the calculation of the XDR pad size. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	2528f487c8	NFSD: Simplify starting_len [ Upstream commit `071ae99fea` ] Clean-up: Now that nfsd4_encode_readv() does not have to encode the EOF or rd_length values, it no longer needs to subtract 8 from @starting_len. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	7bc5433117	NFSD: Optimize nfsd4_encode_readv() [ Upstream commit `28d5bc468e` ] write_bytes_to_xdr_buf() is pretty expensive to use for inserting an XDR data item that is always 1 XDR_UNIT at an address that is always XDR word-aligned. Since both the readv and splice read paths encode EOF and maxcount values, move both to a common code path. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	a70976ec89	NFSD: Add an nfsd4_read::rd_eof field [ Upstream commit `24c7fb8549` ] Refactor: Make the EOF result available in the entire NFSv4 READ path. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	2540b70429	NFSD: Clean up SPLICE_OK in nfsd4_encode_read() [ Upstream commit `c738b218a2` ] Do the test_bit() once -- this reduces the number of locked-bus operations and makes the function a little easier to read. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	3e7adac61d	NFSD: Optimize nfsd4_encode_fattr() [ Upstream commit `ab04de60ae` ] write_bytes_to_xdr_buf() is a generic way to place a variable-length data item in an already-reserved spot in the encoding buffer. However, it is costly. In nfsd4_encode_fattr(), it is unnecessary because the data item is fixed in size and the buffer destination address is always word-aligned. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:12 +02:00
Chuck Lever	0d6c82286d	NFSD: Optimize nfsd4_encode_operation() [ Upstream commit `095a764b7a` ] write_bytes_to_xdr_buf() is a generic way to place a variable-length data item in an already-reserved spot in the encoding buffer. However, it is costly, and here, it is unnecessary because the data item is fixed in size, the buffer destination address is always word-aligned, and the destination location is already in @p. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Jeff Layton	b9e6a5610b	nfsd: silence extraneous printk on nfsd.ko insertion [ Upstream commit `3a5940bfa1` ] This printk pops every time nfsd.ko gets plugged in. Most kmods don't do that and this one is not very informative. Olaf's email address seems to be defunct at this point anyway. Just drop it. Cc: Olaf Kirch <okir@suse.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Dai Ngo	650417956a	NFSD: limit the number of v4 clients to 1024 per 1GB of system memory [ Upstream commit `4271c2c088` ] Currently there is no limit on how many v4 clients are supported by the system. This can prevent systems with a small memory configuration from functioning properly when a very large number of clients exists and creates memory shortage conditions. This patch enforces a limit of 1024 NFSv4 clients, including courtesy clients, per 1GB of system memory. When the number of clients reaches the limit, requests that create new clients are returned with NFS4ERR_DELAY and the laundromat is kick-started to trim old clients. Due to the overhead of the upcall to remove the client record, the maximum number of clients the laundromat removes on each run is limited to 128. This is done to ensure the laundromat can still process the other tasks in a timely manner. Since there is now a limit on the number of clients, the 24-hr idle time limit of courtesy clients is no longer needed and was removed. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Dai Ngo	59d3587829	NFSD: keep track of the number of v4 clients in the system [ Upstream commit `0926c39515` ] Add counter nfs4_client_count to keep track of the total number of v4 clients, including courtesy clients, in the system. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Dai Ngo	0f202977ca	NFSD: refactoring v4 specific code to a helper in nfs4state.c [ Upstream commit `6867137ebc` ] This patch moves the v4 specific code from nfsd_init_net() to nfsd4_init_leases_net() helper in nfs4state.c Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Chuck Lever	a52bb607ab	NFSD: Ensure nf_inode is never dereferenced [ Upstream commit `427f5f83a3` ] The documenting comment for struct nf_file states: /* * A representation of a file that has been opened by knfsd. These are hashed * in the hashtable by inode pointer value. Note that this object doesn't * hold a reference to the inode by itself, so the nf_inode pointer should * never be dereferenced, only used for comparison. */ Replace the two existing dereferences to make the comment always true. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Chuck Lever	e3befca679	NFSD: NFSv4 CLOSE should release an nfsd_file immediately [ Upstream commit `5e138c4a75` ] The last close of a file should enable other accessors to open and use that file immediately. Leaving the file open in the filecache prevents other users from accessing that file until the filecache garbage-collects the file -- sometimes that takes several seconds. Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387 Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Chuck Lever	9be6499171	NFSD: Move nfsd_file_trace_alloc() tracepoint [ Upstream commit `b40a283947` ] Avoid recording the allocation of an nfsd_file item that is immediately released because a matching item was already inserted in the hash. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Chuck Lever	06d9c87204	NFSD: Separate tracepoints for acquire and create [ Upstream commit `be0230069f` ] These tracepoints collect different information: the create case does not open a file, so there's no nf_file available. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Chuck Lever	4b338b528c	NFSD: Clean up unused code after rhashtable conversion [ Upstream commit `0ec8e9d153` ] Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:11 +02:00
Chuck Lever	1bea66c088	NFSD: Convert the filecache to use rhashtable [ Upstream commit `ce502f81ba` ] Enable the filecache hash table to start small, then grow with the workload. Smaller server deployments benefit because there should be lower memory utilization. Larger server deployments should see improved scaling with the number of open files. Suggested-by: Jeff Layton <jlayton@kernel.org> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	208bd42a1a	NFSD: Set up an rhashtable for the filecache [ Upstream commit `fc22945ecc` ] Add code to initialize and tear down an rhashtable. The rhashtable is not used yet. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	0b3a69057d	NFSD: Replace the "init once" mechanism [ Upstream commit `c7b824c3d0` ] In a moment, the nfsd_file_hashtbl global will be replaced with an rhashtable. Replace the one or two spots that need to check if the hash table is available. We can easily reuse the SHUTDOWN flag for this purpose. Document that this mechanism relies on callers to hold the nfsd_mutex to prevent init, shutdown, and purging to run concurrently. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	76e2424c0d	NFSD: Remove nfsd_file::nf_hashval [ Upstream commit `f0743c2b25` ] The value in this field can always be computed from nf_inode, thus it is no longer used. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	ec30a45635	NFSD: nfsd_file_hash_remove can compute hashval [ Upstream commit `cb7ec76e73` ] Remove an unnecessary use of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	7e8d4a9334	NFSD: Refactor __nfsd_file_close_inode() [ Upstream commit `a845511007` ] The code that computes the hashval is the same in both callers. To prevent them from going stale, reframe the documenting comments to remove descriptions of the underlying hash table structure, which is about to be replaced. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	2aa9fd1db0	NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode [ Upstream commit `8755326399` ] Remove an unnecessary usage of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	d6a23d45e2	NFSD: Remove lockdep assertion from unhash_and_release_locked() [ Upstream commit `f53cef15dd` ] IIUC, holding the hash bucket lock is needed only in nfsd_file_unhash, and there is already a lockdep assertion there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	e97c2d5a71	NFSD: No longer record nf_hashval in the trace log [ Upstream commit `54f7df7094` ] I'm about to replace nfsd_file_hashtbl with an rhashtable. The individual hash values will no longer be visible or relevant, so remove them from the tracepoints. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	1db19c3574	NFSD: Never call nfsd_file_gc() in foreground paths [ Upstream commit `6df1941136` ] The checks in nfsd_file_acquire() and nfsd_file_put() that directly invoke filecache garbage collection are intended to keep cache occupancy between a low- and high-watermark. The reason to limit the capacity of the filecache is to keep filecache lookups reasonably fast. However, invoking garbage collection at those points has some undesirable negative impacts. Files that are held open by NFSv4 clients often push the occupancy of the filecache over these watermarks. At that point: - Every call to nfsd_file_acquire() and nfsd_file_put() results in an LRU walk. This has the same effect on lookup latency as long chains in the hash table. - Garbage collection will then run on every nfsd thread, causing a lot of unnecessary lock contention. - Limiting cache capacity pushes out files used only by NFSv3 clients, which are the type of files the filecache is supposed to help. To address those negative impacts, remove the direct calls to the garbage collector. Subsequent patches will address maintaining lookup efficiency as cache capacity increases. Suggested-by: Wang Yugui <wangyugui@e16-tech.com> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:10 +02:00
Chuck Lever	81e3c77027	NFSD: Fix the filecache LRU shrinker [ Upstream commit `edead3a558` ] Without LRU item rotation, the shrinker visits only a few items on the end of the LRU list, and those would always be long-term OPEN files for NFSv4 workloads. That makes the filecache shrinker completely ineffective. Adopt the same strategy as the inode LRU by using LRU_ROTATE. Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	ffb1a10a44	NFSD: Leave open files out of the filecache LRU [ Upstream commit `4a0e73e635` ] There have been reports of problems when running fstests generic/531 against Linux NFS servers with NFSv4. The NFS server that hosts the test's SCRATCH_DEV suffers from CPU soft lock-ups during the test. Analysis shows that: fs/nfsd/filecache.c 482 ret = list_lru_walk(&nfsd_file_lru, 483 nfsd_file_lru_cb, 484 &head, LONG_MAX); causes nfsd_file_gc() to walk the entire length of the filecache LRU list every time it is called (which is quite frequently). The walk holds a spinlock the entire time that prevents other nfsd threads from accessing the filecache. What's more, for NFSv4 workloads, none of the items that are visited during this walk may be evicted, since they are all files that are held OPEN by NFS clients. Address this by ensuring that open files are not kept on the LRU list. Reported-by: Frank van der Linden <fllinden@amazon.com> Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386 Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	175f88a6d5	NFSD: Trace filecache LRU activity [ Upstream commit `c46203acdd` ] Observe the operation of garbage collection and the lifetime of filecache items. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	eed6df3160	NFSD: WARN when freeing an item still linked via nf_lru [ Upstream commit `668ed92e65` ] Add a guardrail to prevent freeing memory that is still on a list. This includes either a dispose list or the LRU list. This is the sign of a bug, but this class of bugs can be detected so that they don't endanger system stability, especially while debugging. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	16cbc64f9c	NFSD: Hook up the filecache stat file [ Upstream commit `2e6c6e4c43` ] There has always been the capability of exporting filecache metrics via /proc, but it was never hooked up. Let's surface these metrics to enable better observability of the filecache. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	4ade29dd09	NFSD: Zero counters when the filecache is re-initialized [ Upstream commit `8b330f7804` ] If nfsd_file_cache_init() is called after a shutdown, be sure the stat counters are reset. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	a880dcef74	NFSD: Record number of flush calls [ Upstream commit `df2aff524f` ] Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	ae76efbdfe	NFSD: Report the number of items evicted by the LRU walk [ Upstream commit `94660cc19c` ] Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	5ce93c611c	NFSD: Refactor nfsd_file_lru_scan() [ Upstream commit `39f1d1ff81` ] Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	5b6f8b0836	NFSD: Refactor nfsd_file_gc() [ Upstream commit `3bc6d3470f` ] Refactor nfsd_file_gc() to use the new list_lru helper. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:09 +02:00
Chuck Lever	c162c99a29	NFSD: Add nfsd_file_lru_dispose_list() helper [ Upstream commit `0bac5a264d` ] Refactor the invariant part of nfsd_file_lru_walk_list() into a separate helper function. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Chuck Lever	4420d19ed4	NFSD: Report average age of filecache items [ Upstream commit `904940e94a` ] This is a measure of how long items stay in the filecache, to help assess how efficient the cache is. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Chuck Lever	c18563275f	NFSD: Report count of freed filecache items [ Upstream commit `d63293272a` ] Surface the count of freed nfsd_file items. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Chuck Lever	b2dc4d30b0	NFSD: Report count of calls to nfsd_file_acquire() [ Upstream commit `29d4bdbbb9` ] Count the number of successful acquisitions that did not create a file (ie, acquisitions that do not result in a compulsory cache miss). This count can be compared directly with the reported hit count to compute a hit ratio. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Chuck Lever	0369b53886	NFSD: Report filecache LRU size [ Upstream commit `0fd244c115` ] Surface the NFSD filecache's LRU list length to help field troubleshooters monitor filecache issues. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Chuck Lever	f1785afc89	NFSD: Demote a WARN to a pr_warn() [ Upstream commit `ca3f9acb6d` ] The call trace doesn't add much value, but it sure is noisy. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Colin Ian King	f87230a7db	nfsd: remove redundant assignment to variable len [ Upstream commit `842e00ac3a` ] Variable len is being assigned a value zero and this is never read, it is being re-assigned later. The assignment is redundant and can be removed. Cleans up clang scan-build warning: fs/nfsd/nfsctl.c:636:2: warning: Value stored to 'len' is never read Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Zhang Jiaming	cad76843c7	NFSD: Fix space and spelling mistake [ Upstream commit `f532c9ff10` ] Add a blank space after ','. Change 'succesful' to 'successful'. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Chuck Lever	dcbebc8685	NFSD: Instrument fh_verify() [ Upstream commit `0513828855` ] Capture file handles and how they map to local inodes. In particular, NFSv4 PUTFH uses fh_verify() so we can now observe which file handles are the target of OPEN, LOOKUP, RENAME, and so on. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:08 +02:00
Chuck Lever	f3222a6b66	NFSD: Decode NFSv4 birth time attribute [ Upstream commit `5b2f3e0777` ] NFSD has advertised support for the NFSv4 time_create attribute since commit `e377a3e698` ("nfsd: Add support for the birth time attribute"). Igor Mammedov reports that Mac OS clients attempt to set the NFSv4 birth time attribute via OPEN(CREATE) and SETATTR if the server indicates that it supports it, but since the above commit was merged, those attempts now fail. Table 5 in RFC 8881 lists the time_create attribute as one that can be both set and retrieved, but the above commit did not add server support for clients to provide a time_create attribute. IMO that's a bug in our implementation of the NFSv4 protocol, which this commit addresses. Whether NFSD silently ignores the new birth time or actually sets it is another matter. I haven't found another filesystem service in the Linux kernel that enables users or clients to modify a file's birth time attribute. This commit reflects my (perhaps incorrect) understanding of whether Linux users can set a file's birth time. NFSD will now recognize a time_create attribute but it ignores its value. It clears the time_create bit in the returned attribute bitmask to indicate that the value was not used. Reported-by: Igor Mammedov <imammedo@redhat.com> Fixes: `e377a3e698` ("nfsd: Add support for the birth time attribute") Tested-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:07 +02:00
Chuck Lever	261eabe19c	NFSD: Fix potential use-after-free in nfsd_file_put() [ Upstream commit `b6c71c66b0` ] nfsd_file_put_noref() can free @nf, so don't dereference @nf immediately upon return from nfsd_file_put_noref(). Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Fixes: `999397926a` ("nfsd: Clean up nfsd_file_put()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:07 +02:00
Chuck Lever	ba68ab7d14	NFSD: nfsd_file_put() can sleep [ Upstream commit `08af54b3e5` ] Now that there are no more callers of nfsd_file_put() that might hold a spin lock, ensure the lockdep infrastructure can catch newly introduced calls to nfsd_file_put() made while a spinlock is held. Link: https://lore.kernel.org/linux-nfs/ece7fd1d-5fb3-5155-54ba-347cfc19bd9a@oracle.com/T/#mf1855552570cf9a9c80d1e49d91438cd9085aada Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	f55b83a598	NFSD: Add documenting comment for nfsd4_release_lockowner() [ Upstream commit `043862b09c` ] And return explicit nfserr values that match what is documented in the new comment / API contract. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	0775c8784e	NFSD: Modernize nfsd4_release_lockowner() [ Upstream commit `bd8fdb6e54` ] Refactor: Use existing helpers that other lock operations use. This change removes several automatic variables, so re-organize the variable declarations for readability. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Zhang Xiaoxu	5e4ee807e3	nfsd: Fix null-ptr-deref in nfsd_fill_super() [ Upstream commit `6f6f84aa21` ] KASAN reports a null-ptr-deref as follows: BUG: KASAN: null-ptr-deref in nfsd_fill_super+0xc6/0xe0 [nfsd] Write of size 8 at addr 000000000000005d by task a.out/852 CPU: 7 PID: 852 Comm: a.out Not tainted 5.18.0-rc7-dirty #66 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 kasan_report+0xab/0x120 ? nfsd_mkdir+0x71/0x1c0 [nfsd] ? nfsd_fill_super+0xc6/0xe0 [nfsd] nfsd_fill_super+0xc6/0xe0 [nfsd] ? nfsd_mkdir+0x1c0/0x1c0 [nfsd] get_tree_keyed+0x8e/0x100 vfs_get_tree+0x41/0xf0 __do_sys_fsconfig+0x590/0x670 ? fscontext_read+0x180/0x180 ? anon_inode_getfd+0x4f/0x70 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae This can be reproduced by concurrent operations: 1. fsopen(nfsd)/fsconfig 2. insmod/rmmod nfsd Since the nfsd file system is registered before nfsd_net is allocated, the caller may get the file_system_type and use the nfsd_net before it is allocated, and a null-ptr-deref then occurs. So init_nfsd() should call register_filesystem() last. Fixes: `bd5ae9288d` ("nfsd: register pernet ops last, unregister first") Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Zhang Xiaoxu	bf31820549	nfsd: Unregister the cld notifier when laundry_wq create failed [ Upstream commit `62fdb65edb` ] If laundry_wq create failed, the cld notifier should be unregistered. Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	3a66ad7ea7	SUNRPC: Use RMW bitops in single-threaded hot paths [ Upstream commit `28df098881` ] I noticed CPU pipeline stalls while using perf. Once an svc thread is scheduled and executing an RPC, no other processes will touch svc_rqst::rq_flags. Thus bus-locked atomics are not needed outside the svc thread scheduler. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	7af208c9ea	NFSD: Trace filecache opens [ Upstream commit `0122e88211` ] Instrument calls to nfsd_open_verified() to get a sense of the filecache hit rate. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	73d9eb9e19	NFSD: Move documenting comment for nfsd4_process_open2() [ Upstream commit `7e2ce0cc15` ] Clean up nfsd4_open() by converting a large comment at the only call site for nfsd4_process_open2() to a kerneldoc comment in front of that function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	7dfad7f7da	NFSD: Fix whitespace [ Upstream commit `26320d7e31` ] Clean up: Pull case arms back one tab stop to conform every other switch statement in fs/nfsd/nfs4proc.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	b54f6a079a	NFSD: Remove dprintk call sites from tail of nfsd4_open() [ Upstream commit `f67a16b147` ] Clean up: These relics are not likely to benefit server administrators. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:06 +02:00
Chuck Lever	106331a12b	NFSD: Instantiate a struct file when creating a regular NFSv4 file [ Upstream commit `fb70bf124b` ] There have been reports of races that cause NFSv4 OPEN(CREATE) to return an error even though the requested file was created. NFSv4 does not provide a status code for this case. To mitigate some of these problems, reorganize the NFSv4 OPEN(CREATE) logic to allocate resources before the file is actually created, and open the new file while the parent directory is still locked. Two new APIs are added: + Add an API that works like nfsd_file_acquire() but does not open the underlying file. The OPEN(CREATE) path can use this API when it already has an open file. + Add an API that is kin to dentry_open(). NFSD needs to create a file and grab an open "struct file *" atomically. The alloc_empty_file() has to be done before the inode create. If it fails (for example, because the NFS server has exceeded its max_files limit), we avoid creating the file and can still return an error to the NFS client. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=382 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: JianHong Yin <jiyin@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Chuck Lever	ce2296da5d	NFSD: Clean up nfsd_open_verified() [ Upstream commit `f4d84c5264` ] Its only caller always passes S_IFREG as the @type parameter. As an additional clean-up, add a kerneldoc comment. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Chuck Lever	dabf24069b	NFSD: Remove do_nfsd_create() [ Upstream commit `1c388f2775` ] Now that its two callers have their own version-specific instance of this function, do_nfsd_create() is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Chuck Lever	62bac33a70	NFSD: Refactor NFSv4 OPEN(CREATE) [ Upstream commit `254454a5aa` ] Copy do_nfsd_create() to nfs4proc.c and remove NFSv3-specific logic. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Chuck Lever	ab407e0bf8	NFSD: Refactor NFSv3 CREATE [ Upstream commit `df9606abdd` ] The NFSv3 CREATE and NFSv4 OPEN(CREATE) use cases are about to diverge such that it makes sense to split do_nfsd_create() into one version for NFSv3 and one for NFSv4. As a first step, copy do_nfsd_create() to nfs3proc.c and remove NFSv4-specific logic. One immediate legibility benefit is that the logic for handling NFSv3 createhow is now quite straightforward. NFSv4 createhow has some subtleties that IMO do not belong in generic code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Chuck Lever	3bd0ae962b	NFSD: Refactor nfsd_create_setattr() [ Upstream commit `5f46e950c3` ] I'd like to move do_nfsd_create() out of vfs.c. Therefore nfsd_create_setattr() needs to be made publicly visible. Note that both call sites in vfs.c commit both the new object and its parent directory, so just combine those common metadata commits into nfsd_create_setattr(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Chuck Lever	cf655c890b	NFSD: Avoid calling fh_drop_write() twice in do_nfsd_create() [ Upstream commit `14ee45b70d` ] Clean up: The "out" label already invokes fh_drop_write(). Note that fh_drop_write() is already careful not to invoke mnt_drop_write() if either it has already been done or there is nothing to drop. Therefore no change in behavior is expected. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Chuck Lever	55cb08630e	NFSD: Clean up nfsd3_proc_create() [ Upstream commit `e61568599c` ] As near as I can tell, mode bit masking and setting S_IFREG is already done by do_nfsd_create() and vfs_create(). The NFSv4 path (do_open_lookup), for example, does not bother with this special processing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Dai Ngo	2e0f8ee3c1	NFSD: Show state of courtesy client in client info [ Upstream commit `e9488d5ae1` ] Update client_info_show to show state of courtesy client and seconds since last renew. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Dai Ngo	6e56a5f75c	NFSD: add support for lock conflict to courteous server [ Upstream commit `27431affb0` ] This patch allows expired client with lock state to be in COURTESY state. Lock conflict with COURTESY client is resolved by the fs/lock code using the lm_lock_expirable and lm_expire_lock callback in the struct lock_manager_operations. If conflict client is in COURTESY state, set it to EXPIRABLE and schedule the laundromat to run immediately to expire the client. The callback lm_expire_lock waits for the laundromat to flush its work queue before returning to caller. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:05 +02:00
Dai Ngo	d9fc2f8267	NFSD: move create/destroy of laundry_wq to init_nfsd and exit_nfsd [ Upstream commit `d76cc46b37` ] This patch moves create/destroy of laundry_wq from nfs4_state_start and nfs4_state_shutdown_net to init_nfsd and exit_nfsd to prevent the laundromat from being freed while a thread is processing a conflicting lock. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:04 +02:00
Dai Ngo	492634cbfe	NFSD: add support for share reservation conflict to courteous server [ Upstream commit `3d69427151` ] This patch allows expired client with open state to be in COURTESY state. Share/access conflict with COURTESY client is resolved by setting COURTESY client to EXPIRABLE state, schedule laundromat to run and returning nfserr_jukebox to the request client. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:04 +02:00
Dai Ngo	26540b8940	NFSD: add courteous server support for thread with only delegation [ Upstream commit `66af257999` ] This patch provides courteous server support for delegation only. Only expired client with delegation but no conflict and no open or lock state is allowed to be in COURTESY state. Delegation conflict with COURTESY/EXPIRABLE client is resolved by setting it to EXPIRABLE, queue work for the laundromat and return delay to the caller. Conflict is resolved when the laundromat runs and expires the EXPIRABLE client while the NFS client retries the OPEN request. Local thread request that gets conflict is doing the retry in _break_lease. Client in COURTESY or EXPIRABLE state is allowed to reconnect and continues to have access to its state. Access to the nfs4_client by the reconnecting thread and the laundromat is serialized via the client_lock. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:04 +02:00
Chuck Lever	56bc7e3821	NFSD: Clean up nfsd_splice_actor() [ Upstream commit `91e23b1c39` ] nfsd_splice_actor() checks that the page being spliced does not match the previous element in the svc_rqst::rq_pages array. We believe this is to prevent a double put_page() in cases where the READ payload is partially contained in the xdr_buf's head buffer. However, the NFSD READ proc functions no longer place any part of the READ payload in the head buffer, in order to properly support NFS/RDMA READ with Write chunks. Therefore, simplify the logic in nfsd_splice_actor() to remove this unnecessary check. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:04 +02:00
Amir Goldstein	50612cd6a7	nfsd: use fsnotify group lock helpers [ Upstream commit `b8962a9d8c` ] Before commit `9542e6a643` ("nfsd: Containerise filecache laundrette") nfsd would close open files in direct reclaim context and that could cause a deadlock when fsnotify mark allocation went into direct reclaim and nfsd shrinker tried to free existing fsnotify marks. To avoid issues like this in future code, set the FSNOTIFY_GROUP_NOFS flag on nfsd fsnotify group to prevent going into direct reclaim from fsnotify_add_inode_mark(). Link: https://lore.kernel.org/r/20220422120327.3459282-10-amir73il@gmail.com Suggested-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220321112310.vpr7oxro2xkz5llh@quack3.lan/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:03 +02:00
Amir Goldstein	ac51c087ab	fsnotify: pass flags argument to fsnotify_alloc_group() [ Upstream commit `867a448d58` ] Add flags argument to fsnotify_alloc_group(), define and use the flag FSNOTIFY_GROUP_USER in inotify and fanotify instead of the helper fsnotify_alloc_user_group() to indicate user allocation. Although the flag FSNOTIFY_GROUP_USER is currently not used after group allocation, we store the flags argument in the group struct for future use of other group flags. Link: https://lore.kernel.org/r/20220422120327.3459282-5-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:19:02 +02:00

3626 commits