This was unexpected behavior (at least for me)--why would you want
configuration settings automatically lost on nfsd restart?
In practice this won't affect distributions, which likely set everything
on every startup. But I'd expect the behavior to be less confusing to
someone manually restarting nfsd for testing.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The pool_to and to_pool fields of the global svc_pool_map are freed on
shutdown, but are initialized in nfsd startup only in the
SVC_POOL_PERCPU and SVC_POOL_PERNODE cases.
They *are* initialized to zero on kernel startup. So as long as you use
only SVC_POOL_GLOBAL (the default), this will never be a problem.
You're also OK if you only ever use SVC_POOL_PERCPU or SVC_POOL_PERNODE.
However, the following sequence events leads to a double-free:
1. set SVC_POOL_PERCPU or SVC_POOL_PERNODE
2. start nfsd: both fields are initialized.
3. shutdown nfsd: both fields are freed.
4. set SVC_POOL_GLOBAL
5. start nfsd: the fields are left untouched.
6. shutdown nfsd: now we try to free them again.
Step 4 is actually unnecessary, since (for some bizarre reason), nfsd
automatically resets the pool mode to SVC_POOL_GLOBAL on shutdown.
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
If the recovery directory doesn't exist, then behavior after a reboot
will be suboptimal. But it's unnecessarily harsh to then prevent the
nfsv4 server from working at all. Instead just print a warning
(already done in nfsd4_init_recdir()) and soldier on.
Tested-by: Lior <lior@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
In the NFSv4.1 case, this could cause a spurious "NFSD: failed to write
recovery record (err -17); please check that /var/lib/nfs/v4recovery
exists and is writable.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Reported-by: Steve Dickson <SteveD@redhat.com>
Otherwise the for loop could try to use a file recently removed from the
file_hashtbl list and oops.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Tested-by: Casey Bodley <cbodley@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
unhash_delegation() will grab the recall lock before calling
list_del_init() in each of these places. This patch removes the
redundant calls.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Stateid's with "other" ("opaque") field all zeros or all ones are
reserved. We define all_ones separately on the off chance there will be
more such some day, though currently all the other special stateid's
have zero other field.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
v2: cache_register_net() and cache_unregister_net() GPL exports added
This is a cleanup patch. Hope, some day generic cache_register() and
cache_unregister() will be removed.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch makes svc_xprt inherit network namespace link from its socket.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
With NFSv4.0 it was safe to assume that open-by-filehandles were always
reclaims.
With NFSv4.1 there are non-reclaim open-by-filehandle operations, so we
should ensure we're only insisting on reclaims in the
OPEN_CLAIM_PREVIOUS case.
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Socket callbacks use svc_xprt_enqueue() to add an xprt to a
pool->sp_sockets list. In normal operation a server thread will later
come along and take the xprt off that list. On shutdown, after all the
threads have exited, we instead manually walk the sv_tempsocks and
sv_permsocks lists to find all the xprt's and delete them.
So the sp_sockets lists don't really matter any more. As a result,
we've mostly just ignored them and hoped they would go away.
Which has gotten us into trouble; witness for example ebc63e531c
"svcrpc: fix list-corrupting race on nfsd shutdown", the result of Ben
Greear noticing that a still-running svc_xprt_enqueue() could re-add an
xprt to an sp_sockets list just before it was deleted. The fix was to
remove it from the list at the end of svc_delete_xprt(). But that only
made corruption less likely--I can see nothing that prevents a
svc_xprt_enqueue() from adding another xprt to the list at the same
moment that we're removing this xprt from the list. In fact, despite
the earlier xpo_detach(), I don't even see what guarantees that
svc_xprt_enqueue() couldn't still be running on this xprt.
So, instead, note that svc_xprt_enqueue() essentially does:
lock sp_lock
if XPT_BUSY unset
add to sp_sockets
unlock sp_lock
So, if we do:
set XPT_BUSY on every xprt.
Empty every sp_sockets list, under the sp_socks locks.
Then we're left knowing that the sp_sockets lists are all empty and will
stay that way, since any svc_xprt_enqueue() will check XPT_BUSY under
the sp_lock and see it set.
And *then* we can continue deleting the xprt's.
(Thanks to Jeff Layton for being correctly suspicious of this code....)
Cc: Ben Greear <greearb@candelatech.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
There's no reason I can see that we need to call sv_shutdown between
closing the two lists of sockets.
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The semantic patch that makes this change is available
in scripts/coccinelle/api/memdup.cocci.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Address the possible performance regression mentioned in "nfsd4: hash
lockowners to simplify RELEASE_LOCKOWNER" by providing a separate
(lockowner, inode) hash.
Really, I doubt this matters much, but I think it's likely we'll change
these data structures here and I'd rather that the need for (owner,
inode) lookups be well-documented.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Now that they're used in the same way, it's a little simpler to put open
and lock owners in the same hash table, and I can't see a reason not to.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Hash lockowners on just the owner string rather than on (owner, inode).
This makes the owner-string lookup needed for RELEASE_LOCKOWNER simpler
(currently it's doing at a linear search through the entire hash
table!). That may come at the expense of making (owner, inode) lookups
more expensive if a client reuses the same lockowner across multiple
files. We might add a separate lookup for that.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The close parenthesis was hard to find with it spaced so far over.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[bfields@redhat.com: get all these lines under 80 chars while we're here]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
init_nfsd() was calling free_slabs() during cleanup code, but the call
to init_slabs() was hidden in nfsd4_state_init(). This could be
confusing to people unfamiliar with the code.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This script provides a convenient way to use the NFSD fault injection
framework. Fault injection writes to dmesg using the KERN_INFO flag, so
this script will compare the before and after output of `dmesg` to show
the user what happened
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Fault injection on the NFS server makes it easier to test the client's
state manager and recovery threads. Simulating errors on the server is
easier than finding the right conditions that cause them naturally.
This patch uses debugfs to add a simple framework for fault injection to
the server. This framework is a config option, and can be enabled
through CONFIG_NFSD_FAULT_INJECTION. Assuming you have debugfs mounted
to /sys/debug, a set of files will be created in /sys/debug/nfsd/.
Writing to any of these files will cause the corresponding action and
write a log entry to dmesg.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Instead of creating a new lockowner and stateid for every
open_to_lockowner call, reuse the existing lockowner if it exists.
Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Lockowners are looked up by file as well as by owner, but we were
forgetting to do a comparison on the file. This could cause an
incorrect result from lockt.
(Note looking up the inode from the lockowner is pretty awkward here.
The data structures need fixing.)
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Neil hasn't really been at all active as a maintainer for some years now.
So move his maintainership to CREDITS
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
.. with new name. Because nothing says "really solid kernel release"
like naming it after an extinct animal that just happened to be in the
news lately.
Mountpoint crossing is similar to following procfs symlinks - we do
not get ->d_revalidate() called for dentry we have arrived at, with
unpleasant consequences for NFS4.
Simple way to reproduce the problem in mainline:
cat >/tmp/a.c <<'EOF'
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
main()
{
struct flock fl = {.l_type = F_RDLCK, .l_whence = SEEK_SET, .l_len = 1};
if (fcntl(0, F_SETLK, &fl))
perror("setlk");
}
EOF
cc /tmp/a.c -o /tmp/test
then on nfs4:
mount --bind file1 file2
/tmp/test < file1 # ok
/tmp/test < file2 # spews "setlk: No locks available"...
What happens is the missing call of ->d_revalidate() after mountpoint
crossing and that's where NFS4 would issue OPEN request to server.
The fix is simple - treat mountpoint crossing the same way we deal with
following procfs-style symlinks. I.e. set LOOKUP_JUMPED...
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf top: Fix live annotation in the --stdio interface
perf top tui: Don't recalc column widths considering just the first page
perf report: Add progress bar when processing time ordered events
perf hists browser: Warn about lost events
perf tools: Fix a typo of command name as trace-cmd
perf hists: Fix recalculation of total_period when sorting entries
perf header: Fix build on old systems
perf ui browser: Handle K_RESIZE in dialog windows
perf ui browser: No need to switch char sets that often
perf hists browser: Use K_TIMER
perf ui: Rename ui__warning_paranoid to ui__error_paranoid
perf ui: Reimplement the popup windows using libslang
perf ui: Reimplement ui__popup_menu using ui__browser
perf ui: Reimplement ui_helpline using libslang
perf ui: Improve handling sigwinch a bit
perf ui progress: Reimplement using slang
perf evlist: Fix grouping of multiple events
Commit 32aaeffbd4 (Merge branch
'modsplit-Oct31_2011'...) caused some build errors. Fix these
and make sure we always have export.h or module.h included
for MODULE_ and EXPORT_SYMBOL users:
$ grep -rl ^MODULE_ arch/arm/*omap*/*.c | xargs \
grep -L linux/module.h
arch/arm/mach-omap2/dsp.c
arch/arm/mach-omap2/mailbox.c
arch/arm/mach-omap2/omap-iommu.c
arch/arm/mach-omap2/smartreflex.c
Also check we either have export.h or module.h included
for the files exporting symbols:
$ grep -rl EXPORT_SYMBOL arch/arm/*omap*/*.c | xargs \
grep -L linux/export.h | xargs grep -L linux/module.h
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Include linux/export.h to fix below build warning:
CC arch/arm/plat-omap/omap_device.o
arch/arm/plat-omap/omap_device.c:1055: warning: data definition has no type or storage class
arch/arm/plat-omap/omap_device.c:1055: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
arch/arm/plat-omap/omap_device.c:1055: warning: parameter names (without types) in function declaration
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
forcedeth: fix a few sparse warnings (variable shadowing)
forcedeth: Improve stats counters
forcedeth: remove unneeded stats updates
forcedeth: Acknowledge only interrupts that are being processed
forcedeth: fix race when unloading module
MAINTAINERS/rds: update maintainer
wanrouter: Remove kernel_lock annotations
usbnet: fix oops in usbnet_start_xmit
ixgbe: Fix compile for kernel without CONFIG_PCI_IOV defined
etherh: Add MAINTAINERS entry for etherh
bonding: comparing a u8 with -1 is always false
sky2: fix regression on Yukon Optima
netlink: clarify attribute length check documentation
netlink: validate NLA_MSECS length
i825xx:xscale:8390:freescale: Fix Kconfig dependancies
macvlan: receive multicast with local address
tg3: Update version to 3.121
tg3: Eliminate timer race with reset_task
tg3: Schedule at most one tg3_reset_task run
tg3: Obtain PCI function number from device
...
This fixes the following sparse warnings:
drivers/net/ethernet/nvidia/forcedeth.c:2113:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2102:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2155:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2102:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2227:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2215:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2271:7: warning: symbol 'size' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2215:6: originally declared here
drivers/net/ethernet/nvidia/forcedeth.c:2986:20: warning: symbol 'addr' shadows an earlier one
drivers/net/ethernet/nvidia/forcedeth.c:2963:6: originally declared here
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rx byte count was off; instead use the hardware's count. Tx packet
count was counting pre-TSO packets; instead count on-the-wire packets.
Report hardware dropped frame count as rx_fifo_errors.
- The count of transmitted packets reported by the forcedeth driver
reports pre-TSO (TCP Segmentation Offload) packet counts and not the
count of the number of packets sent on the wire. This change fixes
the forcedeth driver to report the correct count. Fixed the code by
copying the count stored in the NIC H/W to the value reported by the
driver.
- Count rx_drop_frame errors as rx_fifo_errors:
We see a lot of rx_drop_frame errors if we disable the rx bottom-halves
for too long. Normally, rx_fifo_errors would be counted in this case.
The rx_drop_frame error count is private to forcedeth and is not
reported by ifconfig or sysfs. The rx_fifo_errors count is currently
unused in the forcedeth driver. It is reported by ifconfig as overruns.
This change reports rx_drop_frame errors as rx_fifo_errors.
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function ndo_get_stats() updates most of the stats from hardware
registers, making the manual updates un-needed. This change removes
these manual updates. Main exception is rx_missed_errors which needs
manual update.
Another exception is rx_packets, still updated manually in this commit
to make sure this patch doesn't change behavior of driver. This will
be addressed by a future patch.
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is to avoid a race, accidentally acknowledging an interrupt that
we didn't notice and won't immediately process. This is based solely
on code inspection; it is not known if there was an actual bug here.
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When forcedeth module is unloaded, there exists a path that can lead
to mod_timer() after del_timer_sync(), causing an oops. This patch
short-circuits this unneeded path, which originates in
nv_get_ethtool_stats().
Tested:
x86_64 16-way + 3 ethtool -S infinite loops + 100Mbps incoming traffic
+ rmmod/modprobe/ifconfig in a loop
Initial-Author: Salman Qazi <sqazi@google.com>
Discussion: http://patchwork.ozlabs.org/patch/123548/
Signed-off-by: David Decotigny <david.decotigny@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
since it uses the module facilities.
Reported-by: Witold Baryluk <baryluk@smp.if.uj.edu.pl>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For the files which are not themselves modular, we can change
them to include only the smaller export.h since all they are
doing is looking for EXPORT_SYMBOL.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These files didn't exist at the time of the module.h split, and
so were not fixed by the commits on that baseline. Since they use
the EXPORT_SYMBOL and/or THIS_MODULE macros, they will need the
new export.h file included that provides them.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The BKL is gone, these annotations are useless.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the bug added in commit v3.1-rc7-1055-gf9b491e
SKB can be NULL at this point, at least for cdc-ncm.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>