Refactor the tail of nlm_gc_hosts() into nlm_destroy_host() so that
this logic can be used separately from garbage collection.
Rename it _locked() to document that it must be called with the hosts
cache mutex held.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Refactor nlm_host allocation and initialization into a separate
function. This will be the common piece of server and client nlm_host
lookup logic after the nlm_host cache is split.
Small change: use kmalloc() instead of kzalloc(), as we're overwriting
almost all fields in the new nlm_host struct with non-zero values
immediately after it is allocated. An added benefit is we now have an
explicit reference to each field name where it is initialized (for all
you cscope fans out there).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Minor reorganization; no change in behavior. This will save some
duplicated code after we split the client and server host caches.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
[ cel: Forward-ported to 2.6.37 ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We've got a lot of loops like this, and I find them a little easier to
read with the macros. More such loops are coming.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
[ cel: Forward-ported to 2.6.37 ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Nick Bowler reports:
There are no unusual messages on the client... but I just logged into
the server and I see lots of messages of the following form:
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
nfsd: request from insecure port (192.168.8.199:35766)!
Bisected to commit 9247685088 (SUNRPC:
Properly initialize sock_xprt.srcaddr in all cases)
Apparently, removing the 'transport->srcaddr.ss_family = family' from
xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly
initialising the srcaddr family to AF_UNSPEC.
Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
nsm_reboot_lookup takes a reference to the nsm_handle that it returns,
but nlm_host_rebooted never releases that reference.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
lockd needs these sort of routines, as does the NFSv4 callback code.
Move lockd's routines into common code and rename them so that they can
be used by others.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: Use shared rpc_set_port() function instead of nlm_clear_port().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: The include/linux/lockd/sm_inter.h header is nearly empty
now. Remove it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: nsm_find() now has only one caller, and that caller
unconditionally sets the @create argument. Thus the @create
argument is no longer needed.
Since nsm_find() now has a more specific purpose, pick a more
appropriate name for it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Invoke the newly introduced nsm_reboot_lookup() function in
nlm_host_rebooted() instead of nsm_find().
This introduces just one behavioral change: debugging messages
produced during reboot notification will now appear when the
NLMDBG_MONITOR flag is set, but not when the NLMDBG_HOSTCACHE flag
is set.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The NLM XDR decoders for the NLMPROC_SM_NOTIFY procedure should treat
their "priv" argument truly as an opaque, as defined by the protocol,
and let the upper layers figure out what is in it.
This will make it easier to modify the contents and interpretation of
the "priv" argument, and keep knowledge about what's in "priv" local
to fs/lockd/mon.c.
For now, the NLM and NSM implementations should behave exactly as they
did before.
The formation of the address of the rebooted host in
nlm_host_rebooted() may look a little strange, but it is the inverse
of how nsm_init_private() forms the private cookie. Plus, it's
going away soon anyway.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Pass the nlm_reboot data structure directly from the NLMPROC_SM_NOTIFY
XDR decoders to nlm_host_rebooted(). This eliminates some packing and
unpacking of the NLMPROC_SM_NOTIFY results, and prepares for passing
these results, including the "priv" cookie, directly to a lookup
routine in fs/lockd/mon.c.
This patch changes code organization but should not cause any
behavioral change.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The nsm_find() function sets up fresh nsm_handle entries. This is
where we will store the "priv" cookie used to lookup nsm_handles during
reboot recovery. The cookie will be constructed when nsm_find()
creates a new nsm_handle.
As much as possible, I would like to keep everything that handles a
"priv" cookie in fs/lockd/mon.c so that all the smarts are in one
source file. That organization should make it pretty simple to see how
all this works.
To me, it makes more sense than the current arrangement to keep
nsm_find() with nsm_monitor() and nsm_unmonitor().
So, start reorganizing by moving nsm_find() into fs/lockd/mon.c. The
nsm_release() function comes along too, since it shares the nsm_lock
global variable.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The nsm_handle's reference count is bumped in nlm_lookup_host(). It
should be decremented in nlm_destroy_host() to make it easier to see
the balance of these two operations.
Move the nsm_release() call to fs/lockd/host.c.
The h_nsmhandle pointer is set in nlm_lookup_host(), and never cleared.
The nlm_destroy_host() function is never called for the same nlm_host
twice, so h_nsmhandle won't ever be NULL when nsm_unmonitor() is
called.
All references to the nlm_host are gone before it is freed. We can
skip making h_nsmhandle NULL just before the nlm_host is deallocated.
It's also likely we can remove the h_nsmhandle NULL check in
nlmsvc_is_client() as well, but we can do that later when rearchitect-
ing the nlm_host cache.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: introduce a helper function to generate IPv4 addresses using
the same style as the IPv6 helper function we just added.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Scope ID support is needed since the kernel's NSM implementation is
about to use these displayed addresses as a mon_name in some cases.
When nsm_use_hostnames is zero, without scope ID support NSM will fail
to handle peers that contact us via a link-local address. Link-local
addresses do not work without an interface ID, which is stored in the
sockaddr's sin6_scope_id field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
AF_UNSPEC support is no longer needed in nlm_display_address() now
that a presentation address is no longer generated for the h_srcaddr
field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The h_name field in struct nlm_host is a just copy of
h_nsmhandle->sm_name. Likewise, the contents of the h_addrbuf field
should be identical to the sm_addrbuf field.
The h_srcaddrbuf field is used only in one place for debugging. We can
live without this until we get %pI formatting for printk().
Currently these buffers are 48 bytes, but we need to support scope IDs
in IPv6 presentation addresses, which means making the buffers even
larger. Instead, let's find ways to eliminate them to save space.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
If the admin has specified the "noresvport" option for an NFS mount
point, the kernel's NFS client uses an unprivileged source port for
the main NFS transport. The kernel's lockd client should use an
unprivileged port in this case as well.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Since commit c98451bd, the loop in nlm_lookup_host() unconditionally
compares the host's h_srcaddr field to the incoming source address.
For client-side nlm_host entries, both are always AF_UNSPEC, so this
check is unnecessary.
Since commit 781b61a6, which added support for AF_INET6 addresses to
nlm_cmp_addr(), nlm_cmp_addr() now returns FALSE for AF_UNSPEC
addresses, which causes nlm_lookup_host() to create a fresh nlm_host
entry every time it is called on the client.
These extra entries will eventually expire once the server is
unmounted, so the impact of this regression, introduced with lockd
IPv6 support in 2.6.28, should be minor.
We could fix this by adding an arm in nlm_cmp_addr() for AF_UNSPEC
addresses, but really, nlm_lookup_host() shouldn't be matching on the
srcaddr field for client-side nlm_host lookups.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The iscsi_ibft.c changes are almost certainly a bugfix as the
pointer 'ip' is a u8 *, so they never print the last 8 bytes
of the IPv6 address, and the eight bytes they do print have
a zero byte with them in each 16-bit word.
Other than that, this should cause no difference in functionality.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up nlmsvc_lookup_host() to pass AF_INET6 source addresses to
nlm_lookup_host().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Pass a struct sockaddr * and a length to nlmclnt_lookup_host() to
accomodate non-AF_INET family addresses.
As a side benefit, eliminate the hostname_len argument, as the hostname
is always NUL-terminated.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Use struct sockaddr * and length in nlm_lookup_host_info to all callers
to pass in either AF_INET or AF_INET6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The nlm_lookup_host() function already has a large number of arguments,
and I'm about to add a few more. As a clean up, convert the function
to use a single data structure argument.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: Having two separate functions doesn't add clarity, so
eliminate one of them. Use contemporary kernel coding conventions
where appropriate.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Adopt an approach similar to the RPC server's auth cache (from Aurelien
Charbon and Brian Haley).
Note nlm_lookup_host()'s existing IP address hash function has the same
issue with correctness on little-endian systems as the original IPv4 auth
cache hash function, so I've also updated it with a hash function similar
to the new auth cache hash function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Update the nlm_cmp_addr() helper to support AF_INET6 as well as AF_INET
addresses. New version takes two "struct sockaddr *" arguments instead of
"struct sockaddr_in *" arguments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
To store larger addresses in the nsm_handle structure, make sm_addr a
sockaddr_storage.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
To store larger addresses in the nlm_host structure, make h_saddr a
sockaddr_storage. And let's call it something more self-explanatory:
"saddr" could easily be mistaken for "server address".
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
To store larger addresses in the nlm_host structure, make h_addr a
sockaddr_storage, and add an address length field.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Make sure an address family is specified for source addresses passed to
nlm_lookup_host(). nlm_lookup_host() will need this when it becomes
capable of dealing with AF_INET6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Knowing which source address is used for communicating with remote NLM
services can be helpful for debugging configuration problems on hosts
with multiple addresses.
Keep the dprintk debugging here, but adapt it so it displays AF_INET6
addresses properly. There are also a couple of dprintk clean-ups as
well.
At some point we will aggregate the helpers that display presentation
format addresses into a single set of shared helpers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
We're about to introduce some extra debugging messages in nlm_lookup_host().
Bring the coding style up to date first so we can cleanly introduce the new
debugging messages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (80 commits)
SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request
make nfs_automount_list static
NFS: remove duplicate flags assignment from nfs_validate_mount_data
NFS - fix potential NULL pointer dereference v2
SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use
SUNRPC: Fix a race in gss_refresh_upcall()
SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests
SUNRPC: Remove the unused export of xprt_force_disconnect
SUNRPC: remove XS_SENDMSG_RETRY
SUNRPC: Protect creds against early garbage collection
NFSv4: Attempt to use machine credentials in SETCLIENTID calls
NFSv4: Reintroduce machine creds
NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid()
nfs: fix printout of multiword bitfields
nfs: return negative error value from nfs{,4}_stat_to_errno
NLM/lockd: Ensure client locking calls use correct credentials
NFS: Remove the buggy lock-if-signalled case from do_setlk()
NLM/lockd: Fix a race when cancelling a blocking lock
NLM/lockd: Ensure that nlmclnt_cancel() returns results of the CANCEL call
NLM: Remove the signal masking in nlmclnt_proc/nlmclnt_cancel
...
There's no reason for a mutex here, except to allow an allocation under
the lock, which we can avoid with the usual trick of preallocating
memory for the new object and freeing it if it turns out to be
unnecessary.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Use list_for_each_entry(). Also, in keeping with kernel style, make the
normal case (kzalloc succeeds) unindented and handle the abnormal case
with a goto.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
The sm_count is decremented to zero but left on the nsm_handles list.
So in the space between decrementing sm_count and acquiring nsm_mutex,
it is possible for another task to find this nsm_handle, increment the
use count and then enter nsm_release itself.
Thus there's nothing to prevent the nsm being freed before we acquire
nsm_mutex here.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Lockd caches information about hosts that have recently held locks to
expedite the taking of further locks.
It periodically discards this information for hosts that have not been
used for a few minutes.
lockd currently has a value NLM_HOST_MAX, and changes the 'garbage
collection' behaviour when the number of hosts exceeds this threshold.
However its behaviour is strange, and likely not what was intended.
When the number of hosts exceeds the max, it scans *less* often (every
2 minutes vs every minute) and allows unused host information to
remain around longer (5 minutes instead of 2).
Having this limit is of dubious value anyway, and we have not
suffered from the code not getting the limit right, so remove the
limit altogether. We go with the larger values (discard 5 minute old
hosts every 2 minutes) as they are probably safer.
Maybe the periodic garbage collection should be replace to with
'shrinker' handler so we just respond to memory pressure....
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: RPC protocol version numbers are u32. Make sure we use an
appropriate type for NLM version numbers when calling nlm_lookup_host().
Eliminates a harmless mixed sign comparison in nlm_host_lookup().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Now that it no longer does an RPC ping, lockd always ends up queueing
an RPC task for the GRANT_MSG callback. But, it also requeues the block
for later attempts. Since these are hard RPC tasks, if the client we're
calling back goes unresponsive the GRANT_MSG callbacks can stack up in
the RPC queue.
Fix this by making server-side RPC clients default to soft RPC tasks.
lockd requeues the block anyway, so this should be OK.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
It's currently possible for an unresponsive NLM client to completely
lock up a server's lockd. The scenario is something like this:
1) client1 (or a process on the server) takes a lock on a file
2) client2 tries to take a blocking lock on the same file and
awaits the callback
3) client2 goes unresponsive (plug pulled, network partition, etc)
4) client1 releases the lock
...at that point the server's lockd will try to queue up a GRANT_MSG
callback for client2, but first it requeues the block with a timeout of
30s. nlm_async_call will attempt to bind the RPC client to client2 and
will call rpc_ping. rpc_ping entails a sync RPC call and if client2 is
unresponsive it will take around 60s for that to time out. Once it times
out, it's already time to retry the block and the whole process repeats.
Once in this situation, nlmsvc_retry_blocked will never return until
the host starts responding again. lockd won't service new calls.
Fix this by skipping the RPC ping on NLM RPC clients. This makes
nlm_async_call return quickly when called.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>