WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Chuck Lever	5fc83f470d	xprtrdma: Fix panic in rpcrdma_register_frmr_external() seg1->mr_nsegs is not yet initialized when it is used to unmap segments during an error exit. Use the same unmapping logic for all error exits. "if (frmr_wr.wr.fast_reg.length < len) {" used to be a BUG_ON check. The broken code will never be executed under normal operation. Fixes: `c977dea` (xprtrdma: Remove BUG_ON() call sites) Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Tested-by: Shirley Ma <shirley.ma@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@emulex.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-07-31 16:22:52 -04:00
Trond Myklebust	518776800c	SUNRPC: Allow svc_reserve() to notify TCP socket that space has been freed Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 16:10:20 -04:00
Trond Myklebust	c7fb3f0631	SUNRPC: svc_tcp_write_space: don't clear SOCK_NOSPACE prematurely If requests are queued in the socket inbuffer waiting for an svc_tcp_has_wspace() requirement to be satisfied, then we do not want to clear the SOCK_NOSPACE flag until we've satisfied that requirement. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 16:10:19 -04:00
Trond Myklebust	0971374e28	SUNRPC: Reduce contention in svc_xprt_enqueue() Ensure that all calls to svc_xprt_enqueue() except svc_xprt_received() check the value of XPT_BUSY, before attempting to grab spinlocks etc. This is to avoid situations such as the following "perf" trace, which shows heavy contention on the pool spinlock: 54.15% nfsd [kernel.kallsyms] [k] _raw_spin_lock_bh \| --- _raw_spin_lock_bh \| \|--71.43%-- svc_xprt_enqueue \| \| \| \|--50.31%-- svc_reserve \| \| \| \|--31.35%-- svc_xprt_received \| \| \| \|--18.34%-- svc_tcp_data_ready ... Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-29 16:10:15 -04:00
Chuck Lever	e560e3b510	svcrdma: Add zero padding if the client doesn't send it See RFC 5666 section 3.7: clients don't have to send zero XDR padding. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=246 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-22 16:40:21 -04:00
Yan Burman	bf858ab0ad	xprtrdma: Fix DMA-API-DEBUG warning by checking dma_map result Fix the following warning when DMA-API debug is enabled by checking ib_dma_map_single result: [ 1455.345548] ------------[ cut here ]------------ [ 1455.346863] WARNING: CPU: 3 PID: 3929 at /home/yanb/kernel/net-next/lib/dma-debug.c:1140 check_unmap+0x4e5/0x990() [ 1455.349350] mlx4_core 0000:00:07.0: DMA-API: device driver failed to check map error[device address=0x000000007c9f2090] [size=2656 bytes] [mapped as single] [ 1455.349350] Modules linked in: xprtrdma netconsole configfs nfsv3 nfs_acl ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm autofs4 auth_rpcgss oid_registry nfsv4 nfs fscache lockd sunrpc dm_mirror dm_region_hash dm_log microcode pcspkr mlx4_ib ib_sa ib_mad ib_core ib_addr mlx4_en ipv6 ptp pps_core vxlan mlx4_core virtio_balloon cirrus ttm drm_kms_helper drm sysimgblt sysfillrect syscopyarea i2c_piix4 i2c_core button ext3 jbd virtio_blk virtio_net virtio_pci virtio_ring virtio uhci_hcd ata_generic ata_piix libata [ 1455.349350] CPU: 3 PID: 3929 Comm: mount.nfs Not tainted 3.15.0-rc1-dbg+ #13 [ 1455.349350] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007 [ 1455.349350] 0000000000000474 ffff880069dcf628 ffffffff8151c341 ffffffff817b69d8 [ 1455.349350] ffff880069dcf678 ffff880069dcf668 ffffffff8105b5fc 0000000069dcf658 [ 1455.349350] ffff880069dcf778 ffff88007b0c9f00 ffffffff8255ec40 0000000000000a60 [ 1455.349350] Call Trace: [ 1455.349350] [<ffffffff8151c341>] dump_stack+0x52/0x81 [ 1455.349350] [<ffffffff8105b5fc>] warn_slowpath_common+0x8c/0xc0 [ 1455.349350] [<ffffffff8105b6e6>] warn_slowpath_fmt+0x46/0x50 [ 1455.349350] [<ffffffff812e6305>] check_unmap+0x4e5/0x990 [ 1455.349350] [<ffffffff81521fb0>] ? _raw_spin_unlock_irq+0x30/0x60 [ 1455.349350] [<ffffffff812e6a0a>] debug_dma_unmap_page+0x5a/0x60 [ 1455.349350] [<ffffffffa0389583>] rpcrdma_deregister_internal+0xb3/0xd0 [xprtrdma] [ 1455.349350] [<ffffffffa038a639>] rpcrdma_buffer_destroy+0x69/0x170 [xprtrdma] [ 1455.349350] [<ffffffffa03872ff>] xprt_rdma_destroy+0x3f/0xb0 [xprtrdma] [ 1455.349350] [<ffffffffa04a95ff>] xprt_destroy+0x6f/0x80 [sunrpc] [ 1455.349350] [<ffffffffa04a9625>] xprt_put+0x15/0x20 [sunrpc] [ 1455.349350] [<ffffffffa04a899a>] rpc_free_client+0x8a/0xe0 [sunrpc] [ 1455.349350] [<ffffffffa04a8a58>] rpc_release_client+0x68/0xa0 [sunrpc] [ 1455.349350] [<ffffffffa04a9060>] rpc_shutdown_client+0xb0/0xc0 [sunrpc] [ 1455.349350] [<ffffffffa04a8f5d>] ? rpc_ping+0x5d/0x70 [sunrpc] [ 1455.349350] [<ffffffffa04a91ab>] rpc_create_xprt+0xbb/0xd0 [sunrpc] [ 1455.349350] [<ffffffffa04a9273>] rpc_create+0xb3/0x160 [sunrpc] [ 1455.349350] [<ffffffff81129749>] ? __probe_kernel_read+0x69/0xb0 [ 1455.349350] [<ffffffffa053851c>] nfs_create_rpc_client+0xdc/0x100 [nfs] [ 1455.349350] [<ffffffffa0538cfa>] nfs_init_client+0x3a/0x90 [nfs] [ 1455.349350] [<ffffffffa05391c8>] nfs_get_client+0x478/0x5b0 [nfs] [ 1455.349350] [<ffffffffa0538e50>] ? nfs_get_client+0x100/0x5b0 [nfs] [ 1455.349350] [<ffffffff81172c6d>] ? kmem_cache_alloc_trace+0x24d/0x260 [ 1455.349350] [<ffffffffa05393f3>] nfs_create_server+0xf3/0x4c0 [nfs] [ 1455.349350] [<ffffffffa0545ff0>] ? nfs_request_mount+0xf0/0x1a0 [nfs] [ 1455.349350] [<ffffffffa031c0c3>] nfs3_create_server+0x13/0x30 [nfsv3] [ 1455.349350] [<ffffffffa0546293>] nfs_try_mount+0x1f3/0x230 [nfs] [ 1455.349350] [<ffffffff8108ea21>] ? get_parent_ip+0x11/0x50 [ 1455.349350] [<ffffffff812d6343>] ? __this_cpu_preempt_check+0x13/0x20 [ 1455.349350] [<ffffffff810d632b>] ? try_module_get+0x6b/0x190 [ 1455.349350] [<ffffffffa05449f7>] nfs_fs_mount+0x187/0x9d0 [nfs] [ 1455.349350] [<ffffffffa0545940>] ? nfs_clone_super+0x140/0x140 [nfs] [ 1455.349350] [<ffffffffa0543b20>] ? nfs_auth_info_match+0x40/0x40 [nfs] [ 1455.349350] [<ffffffff8117e360>] mount_fs+0x20/0xe0 [ 1455.349350] [<ffffffff811a1c16>] vfs_kern_mount+0x76/0x160 [ 1455.349350] [<ffffffff811a29a8>] do_mount+0x428/0xae0 [ 1455.349350] [<ffffffff811a30f0>] SyS_mount+0x90/0xe0 [ 1455.349350] [<ffffffff8152af52>] system_call_fastpath+0x16/0x1b [ 1455.349350] ---[ end trace f1f31572972e211d ]--- Signed-off-by: Yan Burman <yanb@mellanox.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-07-22 13:55:30 -04:00
Trond Myklebust	22cb43855d	SUNRPC: xdr_get_next_encode_buffer should be declared static Quell another sparse warning. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-18 11:35:46 -04:00
Chuck Lever	3c45ddf823	svcrdma: Select NFSv4.1 backchannel transport based on forward channel The current code always selects XPRT_TRANSPORT_BC_TCP for the back channel, even when the forward channel was not TCP (eg, RDMA). When a 4.1 mount is attempted with RDMA, the server panics in the TCP BC code when trying to send CB_NULL. Instead, construct the transport protocol number from the forward channel transport or'd with XPRT_TRANSPORT_BC. Transports that do not support bi-directional RPC will not have registered a "BC" transport, causing create_backchannel_client() to fail immediately. Fixes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=265 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-18 11:35:45 -04:00
NeilBrown	c1221321b7	sched: Allow wait_on_bit_action() functions to support a timeout It is currently not possible for various wait_on_bit functions to implement a timeout. While the "action" function that is called to do the waiting could certainly use schedule_timeout(), there is no way to carry forward the remaining timeout after a false wake-up. As false-wakeups a clearly possible at least due to possible hash collisions in bit_waitqueue(), this is a real problem. The 'action' function is currently passed a pointer to the word containing the bit being waited on. No current action functions use this pointer. So changing it to something else will be a little noisy but will have no immediate effect. This patch changes the 'action' function to take a pointer to the "struct wait_bit_key", which contains a pointer to the word containing the bit so nothing is really lost. It also adds a 'private' field to "struct wait_bit_key", which is initialized to zero. An action function can now implement a timeout with something like static int timed_out_waiter(struct wait_bit_key key) { unsigned long waited; if (key->private == 0) { key->private = jiffies; if (key->private == 0) key->private -= 1; } waited = jiffies - key->private; if (waited > 10 HZ) return -EAGAIN; schedule_timeout(waited - 10 * HZ); return 0; } If any other need for context in a waiter were found it would be easy to use ->private for some other purpose, or even extend "struct wait_bit_key". My particular need is to support timeouts in nfs_release_page() to avoid deadlocks with loopback mounted NFS. While wait_on_bit_timeout() would be a cleaner interface, it will not meet my need. I need the timeout to be sensitive to the state of the connection with the server, which could change. So I need to use an 'action' interface. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: David Howells <dhowells@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-16 15:10:41 +02:00
Daniel Walter	00cfaa943e	replace strict_strto calls Replace obsolete strict_strto calls with appropriate kstrto calls Signed-off-by: Daniel Walter <dwalter@google.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:45:49 -04:00
Himangi Saraogi	adcda652c9	rpc_pipe: Drop memory allocation cast Drop cast on the result of kmalloc and similar functions. The semantic patch that makes this change is as follows: // <smpl> @@ type T; @@ - (T *) (\(kmalloc\\|kzalloc\\|kcalloc\\|kmem_cache_alloc\\|kmem_cache_zalloc\\| kmem_cache_alloc_node\\|kmalloc_node\\|kzalloc_node\)(...)) // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:43:44 -04:00
Jeff Layton	a0337d1ddb	sunrpc: add a new "stringify_acceptor" rpc_credop ...and add an new rpc_auth function to call it when it exists. This is only applicable for AUTH_GSS mechanisms, so we only specify this for those sorts of credentials. Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:41:20 -04:00
Jeff Layton	2004c726b9	auth_gss: fetch the acceptor name out of the downcall If rpc.gssd sends us an acceptor name string trailing the context token, stash it as part of the context. Signed-off-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-12 18:41:06 -04:00
Steve Wise	255942907e	svcrdma: send_write() must not overflow the device's max sge Function send_write() must stop creating sges when it reaches the device max and return the amount sent in the RDMA Write to the caller. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-07-11 15:03:48 -04:00
Trond Myklebust	2fc193cf92	SUNRPC: Handle EPIPE in xprt_connect_status The callback handler xs_error_report() can end up propagating an EPIPE error by means of the call to xprt_wake_pending_tasks(). Ensure that xprt_connect_status() does not automatically convert this into an EIO error. Reported-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-07-03 00:11:35 -04:00
Trond Myklebust	3601c4a91e	SUNRPC: Ensure that we handle ENOBUFS errors correctly. Currently, an ENOBUFS error will result in a fatal error for the RPC call. Normally, we will just want to wait and then retry. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-06-30 13:42:19 -04:00
Andy Adamson	66b0686049	NFSv4: test SECINFO RPC_AUTH_GSS pseudoflavors for support Fix nfs4_negotiate_security to create an rpc_clnt used to test each SECINFO returned pseudoflavor. Check credential creation (and gss_context creation) which is important for RPC_AUTH_GSS pseudoflavors which can fail for multiple reasons including mis-configuration. Don't call nfs4_negotiate in nfs4_submount as it was just called by nfs4_proc_lookup_mountpoint (nfs4_proc_lookup_common) Signed-off-by: Andy Adamson <andros@netapp.com> [Trond: fix corrupt return value from nfs_find_best_sec()] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-06-24 18:46:58 -04:00
Kinglong Mee	f15a5cf912	SUNRPC/NFSD: Change to type of bool for rq_usedeferral and rq_splice_ok rq_usedeferral and rq_splice_ok are used as 0 and 1, just defined to bool. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-23 11:31:36 -04:00
Linus Torvalds	f9da455b93	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...	2014-06-12 14:27:40 -07:00
Tom Herbert	7e3cead517	net: Save software checksum complete In skb_checksum complete, if we need to compute the checksum for the packet (via skb_checksum) save the result as CHECKSUM_COMPLETE. Subsequent checksum verification can use this. Also, added csum_complete_sw flag to distinguish between software and hardware generated checksum complete, we should always be able to trust the software computation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-11 15:46:13 -07:00
Linus Torvalds	d1e1cda862	NFS client updates for Linux 3.16 Highlights include: - Massive cleanup of the NFS read/write code by Anna and Dros - Support multiple NFS read/write requests per page in order to deal with non-page aligned pNFS striping. Also cleans up the r/wsize < page size code nicely. - stable fix for ensuring inode is declared uptodate only after all the attributes have been checked. - stable fix for a kernel Oops when remounting - NFS over RDMA client fixes - move the pNFS files layout driver into its own subdirectory -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTl3pmAAoJEGcL54qWCgDyraIP/08ZbbDowVTP9572bxl+VR2i zNbrflBtl1R05D4Imi/IEySK0w6xj1CLsncNpXAT2bxTlyKPW70tpiiPlRKMPuO8 JW+iPiepR2t0mol6MEd46yuV8btXVk8I+7IYjPXANiMJG8O5dJzNQ8NiCQOERBNt FQ7rzTCFO0ESGXnT6vYrT4I0bwqYVklBiJRTT4PQVzhhhDq9qUdq21BlQjQJFXP4 9aBLurxKptlHBvE6A2Quja6ObEC0s31CxcijqHIJ+Ue4GbKcFbMG1tgjY7ESE/AD rqzDeF0jvWHT+frmvFEUUXWqzF1ReZ4x9pfDoOgeG6T9/K6DT91O0yMOgG8jvlbF 8DSATNYGDX5sSjpvaG5JokGG+cGCk9srVDx+itn7HlwzalRwn0PjKtIYwOJ7TJIr o/j20nOsPrRGF0OqLf9phyocgRrlbMKOzj1IXldHHfAbNkRcISTK08lxvsz96Ddn zRyDmbsbY6QFXdB3AVSeQmg5R0OOLtzNIcsFPmNdvy5eiy67qU0lsGg8UGNnoz8k PHN1pcGejkctLhQ32ee3w/W6zkrgpJZcNC9JSoG8Dc3SeXus0c3IgumRknFCmiep ssN+1jEITAGeS5a2aBxwLQLVI2JAr2lxs5e+R4D5EsQlFkCl6Mrgtzh/aToWTuFl Qt7l2zI3r3VieKT9u7Bh =OyXR -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: - massive cleanup of the NFS read/write code by Anna and Dros - support multiple NFS read/write requests per page in order to deal with non-page aligned pNFS striping. Also cleans up the r/wsize < page size code nicely. - stable fix for ensuring inode is declared uptodate only after all the attributes have been checked. - stable fix for a kernel Oops when remounting - NFS over RDMA client fixes - move the pNFS files layout driver into its own subdirectory" * tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits) NFS: populate ->net in mount data when remounting pnfs: fix lockup caused by pnfs_generic_pg_test NFSv4.1: Fix typo in dprintk NFSv4.1: Comment is now wrong and redundant to code NFS: Use raw_write_seqcount_begin/end int nfs4_reclaim_open_state xprtrdma: Disconnect on registration failure xprtrdma: Remove BUG_ON() call sites xprtrdma: Avoid deadlock when credit window is reset SUNRPC: Move congestion window constants to header file xprtrdma: Reset connection timeout after successful reconnect xprtrdma: Use macros for reconnection timeout constants xprtrdma: Allocate missing pagelist xprtrdma: Remove Tavor MTU setting xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting xprtrdma: Reduce the number of hardway buffer allocations xprtrdma: Limit work done by completion handler xprtrmda: Reduce calls to ib_poll_cq() in completion handlers xprtrmda: Reduce lock contention in completion handlers xprtrdma: Split the completion queue xprtrdma: Make rpcrdma_ep_destroy() return void ...	2014-06-10 15:02:42 -07:00
Linus Torvalds	5b174fd647	Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "The largest piece is a long-overdue rewrite of the xdr code to remove some annoying limitations: for example, there was no way to return ACLs larger than 4K, and readdir results were returned only in 4k chunks, limiting performance on large directories. Also: - part of Neil Brown's work to make NFS work reliably over the loopback interface (so client and server can run on the same machine without deadlocks). The rest of it is coming through other trees. - cleanup and bugfixes for some of the server RDMA code, from Steve Wise. - Various cleanup of NFSv4 state code in preparation for an overhaul of the locking, from Jeff, Trond, and Benny. - smaller bugfixes and cleanup from Christoph Hellwig and Kinglong Mee. Thanks to everyone! This summer looks likely to be busier than usual for knfsd. Hopefully we won't break it too badly; testing definitely welcomed" * 'for-3.16' of git://linux-nfs.org/~bfields/linux: (100 commits) nfsd4: fix FREE_STATEID lockowner leak svcrdma: Fence LOCAL_INV work requests svcrdma: refactor marshalling logic nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry nfs4: remove unused CHANGE_SECURITY_LABEL nfsd4: kill READ64 nfsd4: kill READ32 nfsd4: simplify server xdr->next_page use nfsd4: hash deleg stateid only on successful nfs4_set_delegation nfsd4: rename recall_lock to state_lock nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open nfsd4: use recall_lock for delegation hashing nfsd: fix laundromat next-run-time calculation nfsd: make nfsd4_encode_fattr static SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING nfsd: remove unused function nfsd_read_file nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer NFSD: Error out when getting more than one fsloc/secinfo/uuid NFSD: Using type of uint32_t for ex_nflavors instead of int ...	2014-06-10 11:50:57 -07:00
Steve Wise	83710fc753	svcrdma: Fence LOCAL_INV work requests Fencing forces the invalidate to only happen after all prior send work requests have been completed. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reported by : Devesh Sharma <Devesh.Sharma@Emulex.Com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-06 19:22:51 -04:00
Steve Wise	0bf4828983	svcrdma: refactor marshalling logic This patch refactors the NFSRDMA server marshalling logic to remove the intermediary map structures. It also fixes an existing bug where the NFSRDMA server was not minding the device fast register page list length limitations. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com>	2014-06-06 19:22:50 -04:00
J. Bruce Fields	05638dc73a	nfsd4: simplify server xdr->next_page use The rpc code makes available to the NFS server an array of pages to encod into. The server represents its reply as an xdr buf, with the head pointing into the first page in that array, the pages ** array starting just after that, and the tail (if any) sharing any leftover space in the page used by the head. While encoding, we use xdr_stream->page_ptr to keep track of which page we're currently using. Currently we set xdr_stream->page_ptr to buf->pages, which makes the head a weird exception to the rule that page_ptr always points to the page we're currently encoding into. So, instead set it to buf->pages - 1 (the page actually containing the head), and remove the need for a little unintuitive logic in xdr_get_next_encode_buffer() and xdr_truncate_encode. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-06-06 19:22:46 -04:00
Chuck Lever	c93c62231c	xprtrdma: Disconnect on registration failure If rpcrdma_register_external() fails during request marshaling, the current RPC request is killed. Instead, this RPC should be retried after reconnecting the transport instance. The most likely reason for registration failure with FRMR is a failed post_send, which would be due to a remote transport disconnect or memory exhaustion. These issues can be recovered by a retry. Problems encountered in the marshaling logic itself will not be corrected by trying again, so these should still kill a request. Now that we've added a clean exit for marshaling errors, take the opportunity to defang some BUG_ON's. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:53 -04:00
Chuck Lever	c977dea227	xprtrdma: Remove BUG_ON() call sites If an error occurs in the marshaling logic, fail the RPC request being processed, but leave the client running. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:53 -04:00
Chuck Lever	e7ce710a88	xprtrdma: Avoid deadlock when credit window is reset Update the cwnd while processing the server's reply. Otherwise the next task on the xprt_sending queue is still subject to the old credit window. Currently, no task is awoken if the old congestion window is still exceeded, even if the new window is larger, and a deadlock results. This is an issue during a transport reconnect. Servers don't normally shrink the credit window, but the client does reset it to 1 when reconnecting so the server can safely grow it again. As a minor optimization, remove the hack of grabbing the initial cwnd size (which happens to be RPC_CWNDSCALE) and using that value as the congestion scaling factor. The scaling value is invariant, and we are better off without the multiplication operation. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:52 -04:00
Chuck Lever	4f4cf5ad6f	SUNRPC: Move congestion window constants to header file I would like to use one of the RPC client's congestion algorithm constants in transport-specific code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:51 -04:00
Chuck Lever	18906972aa	xprtrdma: Reset connection timeout after successful reconnect If the new connection is able to make forward progress, reset the re-establish timeout. Otherwise it keeps growing even if disconnect events are rare. The same behavior as TCP is adopted: reconnect immediately if the transport instance has been able to make some forward progress. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:51 -04:00
Chuck Lever	bfaee096de	xprtrdma: Use macros for reconnection timeout constants Clean up: Ensure the same max and min constant values are used everywhere when setting reconnect timeouts. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:50 -04:00
Shirley Ma	196c69989d	xprtrdma: Allocate missing pagelist GETACL relies on transport layer to alloc memory for reply buffer. However xprtrdma assumes that the reply buffer (pagelist) has been pre-allocated in upper layer. This problem was reported by IOL OFA lab test on PPC. Signed-off-by: Shirley Ma <shirley.ma@oracle.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Edward Mossman <emossman@iol.unh.edu> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:49 -04:00
Chuck Lever	5bc4bc7292	xprtrdma: Remove Tavor MTU setting Clean up. Remove HCA-specific clutter in xprtrdma, which is supposed to be device-independent. Hal Rosenstock <hal@dev.mellanox.co.il> observes: > Note that there is OpenSM option (enable_quirks) to return 1K MTU > in SA PathRecord responses for Tavor so that can be used for this. > The default setting for enable_quirks is FALSE so that would need > changing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:48 -04:00
Chuck Lever	ec62f40d35	xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting Devesh Sharma <Devesh.Sharma@Emulex.Com> reports that after a disconnect, his HCA is failing to create a fresh QP, leaving ia_ri->ri_id->qp set to NULL. But xprtrdma still allows RPCs to wake up and post LOCAL_INV as they exit, causing an oops. rpcrdma_ep_connect() is allowing the wake-up by leaking the QP creation error code (-EPERM in this case) to the RPC client's generic layer. xprt_connect_status() does not recognize -EPERM, so it kills pending RPC tasks immediately rather than retrying the connect. Re-arrange the QP creation logic so that when it fails on reconnect, it leaves ->qp with the old QP rather than NULL. If pending RPC tasks wake and exit, LOCAL_INV work requests will flush rather than oops. On initial connect, leaving ->qp == NULL is OK, since there are no pending RPCs that might use ->qp. But be sure not to try to destroy a NULL QP when rpcrdma_ep_connect() is retried. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:47 -04:00
Chuck Lever	65866f8259	xprtrdma: Reduce the number of hardway buffer allocations While marshaling an RPC/RDMA request, the inline_{rsize,wsize} settings determine whether an inline request is used, or whether read or write chunks lists are built. The current default value of these settings is 1024. Any RPC request smaller than 1024 bytes is sent to the NFS server completely inline. rpcrdma_buffer_create() allocates and pre-registers a set of RPC buffers for each transport instance, also based on the inline rsize and wsize settings. RPC/RDMA requests and replies are built in these buffers. However, if an RPC/RDMA request is expected to be larger than 1024, a buffer has to be allocated and registered for that RPC, and deregistered and released when the RPC is complete. This is known has a "hardway allocation." Since the introduction of NFSv4, the size of RPC requests has become larger, and hardway allocations are thus more frequent. Hardway allocations are significant overhead, and they waste the existing RPC buffers pre-allocated by rpcrdma_buffer_create(). We'd like fewer hardway allocations. Increasing the size of the pre-registered buffers is the most direct way to do this. However, a blanket increase of the inline thresholds has interoperability consequences. On my 64-bit system, rpcrdma_buffer_create() requests roughly 7000 bytes for each RPC request buffer, using kmalloc(). Due to internal fragmentation, this wastes nearly 1200 bytes because kmalloc() already returns an 8192-byte piece of memory for a 7000-byte allocation request, though the extra space remains unused. So let's round up the size of the pre-allocated buffers, and make use of the unused space in the kmalloc'd memory. This change reduces the amount of hardway allocated memory for an NFSv4 general connectathon run from 1322092 to 9472 bytes (99%). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:46 -04:00
Chuck Lever	8301a2c047	xprtrdma: Limit work done by completion handler Sagi Grimberg <sagig@dev.mellanox.co.il> points out that a steady stream of CQ events could starve other work because of the boundless loop pooling in rpcrdma_{send,recv}_poll(). Instead of a (potentially infinite) while loop, return after collecting a budgeted number of completions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Sagi Grimberg <sagig@dev.mellanox.co.il> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:45 -04:00
Chuck Lever	1c00dd0776	xprtrmda: Reduce calls to ib_poll_cq() in completion handlers Change the completion handlers to grab up to 16 items per ib_poll_cq() call. No extra ib_poll_cq() is needed if fewer than 16 items are returned. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:44 -04:00
Chuck Lever	7f23f6f6e3	xprtrmda: Reduce lock contention in completion handlers Skip the ib_poll_cq() after re-arming, if the provider knows there are no additional items waiting. (Have a look at commit `ed23a727` for more details). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:43 -04:00
Chuck Lever	fc66448549	xprtrdma: Split the completion queue The current CQ handler uses the ib_wc.opcode field to distinguish between event types. However, the contents of that field are not reliable if the completion status is not IB_WC_SUCCESS. When an error completion occurs on a send event, the CQ handler schedules a tasklet with something that is not a struct rpcrdma_rep. This is never correct behavior, and sometimes it results in a panic. To resolve this issue, split the completion queue into a send CQ and a receive CQ. The send CQ handler now handles only struct rpcrdma_mw wr_id's, and the receive CQ handler now handles only struct rpcrdma_rep wr_id's. Fix suggested by Shirley Ma <shirley.ma@oracle.com> Reported-by: Rafael Reiter <rafael.reiter@ims.co.at> Fixes: `5c635e09ce` BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=73211 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Klemens Senn <klemens.senn@ims.co.at> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:42 -04:00
Chuck Lever	7f1d54191e	xprtrdma: Make rpcrdma_ep_destroy() return void Clean up: rpcrdma_ep_destroy() returns a value that is used only to print a debugging message. rpcrdma_ep_destroy() already prints debugging messages in all error cases. Make rpcrdma_ep_destroy() return void instead. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:41 -04:00
Chuck Lever	13c9ff8f67	xprtrdma: Simplify rpcrdma_deregister_external() synopsis Clean up: All remaining callers of rpcrdma_deregister_external() pass NULL as the last argument, so remove that argument. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:40 -04:00
Chuck Lever	cdd9ade711	xprtrdma: mount reports "Invalid mount option" if memreg mode not supported If the selected memory registration mode is not supported by the underlying provider/HCA, the NFS mount command reports that there was an invalid mount option, and fails. This is misleading. Reporting a problem allocating memory is a lot closer to the truth. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:39 -04:00
Chuck Lever	f10eafd3a6	xprtrdma: Fall back to MTHCAFMR when FRMR is not supported An audit of in-kernel RDMA providers that do not support the FRMR memory registration shows that several of them support MTHCAFMR. Prefer MTHCAFMR when FRMR is not supported. If MTHCAFMR is not supported, only then choose ALLPHYSICAL. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:39 -04:00
Chuck Lever	0ac531c183	xprtrdma: Remove REGISTER memory registration mode All kernel RDMA providers except amso1100 support either MTHCAFMR or FRMR, both of which are faster than REGISTER. amso1100 can continue to use ALLPHYSICAL. The only other ULP consumer in the kernel that uses the reg_phys_mr verb is Lustre. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:38 -04:00
Chuck Lever	b45ccfd25d	xprtrdma: Remove MEMWINDOWS registration modes The MEMWINDOWS and MEMWINDOWS_ASYNC memory registration modes were intended as stop-gap modes before the introduction of FRMR. They are now considered obsolete. MEMWINDOWS_ASYNC is also considered unsafe because it can leave client memory registered and exposed for an indeterminant time after each I/O. At this point, the MEMWINDOWS modes add needless complexity, so remove them. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:37 -04:00
Chuck Lever	03ff8821eb	xprtrdma: Remove BOUNCEBUFFERS memory registration mode Clean up: This memory registration mode is slow and was never meant for use in production environments. Remove it to reduce implementation complexity. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:37 -04:00
Chuck Lever	254f91e2fa	xprtrdma: RPC/RDMA must invoke xprt_wake_pending_tasks() in process context An IB provider can invoke rpcrdma_conn_func() in an IRQ context, thus rpcrdma_conn_func() cannot be allowed to directly invoke generic RPC functions like xprt_wake_pending_tasks(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:36 -04:00
Allen Andrews	4034ba0423	nfs-rdma: Fix for FMR leaks Two memory region leaks were found during testing: 1. rpcrdma_buffer_create: While allocating RPCRDMA_FRMR's ib_alloc_fast_reg_mr is called and then ib_alloc_fast_reg_page_list is called. If ib_alloc_fast_reg_page_list returns an error it bails out of the routine dropping the last ib_alloc_fast_reg_mr frmr region creating a memory leak. Added code to dereg the last frmr if ib_alloc_fast_reg_page_list fails. 2. rpcrdma_buffer_destroy: While cleaning up, the routine will only free the MR's on the rb_mws list if there are rb_send_bufs present. However, in rpcrdma_buffer_create while the rb_mws list is being built if one of the MR allocation requests fail after some MR's have been allocated on the rb_mws list the routine never gets to create any rb_send_bufs but instead jumps to the rpcrdma_buffer_destroy routine which will never free the MR's on rb_mws list because the rb_send_bufs were never created. This leaks all the MR's on the rb_mws list that were created prior to one of the MR allocations failing. Issue(2) was seen during testing. Our adapter had a finite number of MR's available and we created enough connections to where we saw an MR allocation failure on our Nth NFS connection request. After the kernel cleaned up the resources it had allocated for the Nth connection we noticed that FMR's had been leaked due to the coding error described above. Issue(1) was seen during a code review while debugging issue(2). Signed-off-by: Allen Andrews <allen.andrews@emulex.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:35 -04:00
Steve Wise	0fc6c4e7bb	xprtrdma: mind the device's max fast register page list depth Some rdma devices don't support a fast register page list depth of at least RPCRDMA_MAX_DATA_SEGS. So xprtrdma needs to chunk its fast register regions according to the minimum of the device max supported depth or RPCRDMA_MAX_DATA_SEGS. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2014-06-04 08:56:33 -04:00
Kinglong Mee	a48fd0f9f7	SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING When debugging, rpc prints messages from dprintk(KERN_WARNING ...) with "^A4" prefixed, [ 2780.339988] ^A4nfsd: connect from unprivileged port: 127.0.0.1, port=35316 Trond tells, > dprintk != printk. We have NEVER supported dprintk(KERN_WARNING...) This patch removes using of dprintk with KERN_WARNING. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 20:25:28 -04:00
J. Bruce Fields	a5cddc885b	nfsd4: better reservation of head space for krb5 RPC_MAX_AUTH_SIZE is scattered around several places. Better to set it once in the auth code, where this kind of estimate should be made. And while we're at it we can leave it zero when we're not using krb5i or krb5p. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:32:17 -04:00
J. Bruce Fields	db3f58a95b	rpc: define xdr_restrict_buflen With this xdr_reserve_space can help us enforce various limits. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:32:01 -04:00
J. Bruce Fields	2825a7f907	nfsd4: allow encoding across page boundaries After this we can handle for example getattr of very large ACLs. Read, readdir, readlink are still special cases with their own limits. Also we can't handle a new operation starting close to the end of a page. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:31:54 -04:00
J. Bruce Fields	3e19ce762b	rpc: xdr_truncate_encode This will be used in the server side in a few cases: - when certain operations (read, readdir, readlink) fail after encoding a partial response. - when we run out of space after encoding a partial response. - in readlink, where we initially reserve PAGE_SIZE bytes for data, then truncate to the actual size. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-30 17:31:47 -04:00
David Rientjes	c6c8fe79a8	net, sunrpc: suppress allocation warning in rpc_malloc() rpc_malloc() allocates with GFP_NOWAIT without making any attempt at reclaim so it easily fails when low on memory. This ends up spamming the kernel log: SLAB: Unable to allocate memory on node 0 (gfp=0x4000) cache: kmalloc-8192, object size: 8192, order: 1 node 0: slabs: 207/207, objs: 207/207, free: 0 rekonq: page allocation failure: order:1, mode:0x204000 CPU: 2 PID: 14321 Comm: rekonq Tainted: G O 3.15.0-rc3-12.gfc9498b-desktop+ #6 Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 2105 07/23/2010 0000000000000000 ffff880010ff17d0 ffffffff815e693c 0000000000204000 ffff880010ff1858 ffffffff81137bd2 0000000000000000 0000001000000000 ffff88011ffebc38 0000000000000001 0000000000204000 ffff88011ffea000 Call Trace: [<ffffffff815e693c>] dump_stack+0x4d/0x6f [<ffffffff81137bd2>] warn_alloc_failed+0xd2/0x140 [<ffffffff8113be19>] __alloc_pages_nodemask+0x7e9/0xa30 [<ffffffff811824a8>] kmem_getpages+0x58/0x140 [<ffffffff81183de6>] fallback_alloc+0x1d6/0x210 [<ffffffff81183be3>] ____cache_alloc_node+0x123/0x150 [<ffffffff81185953>] __kmalloc+0x203/0x490 [<ffffffffa06b0ee2>] rpc_malloc+0x32/0xa0 [sunrpc] [<ffffffffa06a6999>] call_allocate+0xb9/0x170 [sunrpc] [<ffffffffa06b19d8>] __rpc_execute+0x88/0x460 [sunrpc] [<ffffffffa06b2da9>] rpc_execute+0x59/0xc0 [sunrpc] [<ffffffffa06a932b>] rpc_run_task+0x6b/0x90 [sunrpc] [<ffffffffa077b5c1>] nfs4_call_sync_sequence+0x51/0x80 [nfsv4] [<ffffffffa077d45d>] _nfs4_do_setattr+0x1ed/0x280 [nfsv4] [<ffffffffa0782a72>] nfs4_do_setattr+0x72/0x180 [nfsv4] [<ffffffffa078334c>] nfs4_proc_setattr+0xbc/0x140 [nfsv4] [<ffffffffa074a7e8>] nfs_setattr+0xd8/0x240 [nfs] [<ffffffff811baa71>] notify_change+0x231/0x380 [<ffffffff8119cf5c>] chmod_common+0xfc/0x120 [<ffffffff8119df80>] SyS_chmod+0x40/0x90 [<ffffffff815f4cfd>] system_call_fastpath+0x1a/0x1f ... If the allocation fails, simply return NULL and avoid spamming the kernel log. Reported-by: Marc Dietrich <marvin24@gmx.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-05-29 11:11:51 -04:00
Tom Herbert	0f8066bd48	sunrpc: Remove sk_no_check setting Setting sk_no_check to UDP_CSUM_NORCV seems to have no effect. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-23 16:28:53 -04:00
NeilBrown	ef11ce2487	SUNRPC: track whether a request is coming from a loop-back interface. If an incoming NFS request is coming from the local host, then nfsd will need to perform some special handling. So detect that possibility and make the source visible in rq_local. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:59:18 -04:00
Trond Myklebust	c789102c20	SUNRPC: Fix a module reference leak in svc_handle_xprt If the accept() call fails, we need to put the module reference. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:57:22 -04:00
Chuck Lever	16e4d93f6d	NFSD: Ignore client's source port on RDMA transports An NFS/RDMA client's source port is meaningless for RDMA transports. The transport layer typically sets the source port value on the connection to a random ephemeral port. Currently, NFS server administrators must specify the "insecure" export option to enable clients to access exports via RDMA. But this means NFS clients can access such an export via IP using an ephemeral port, which may not be desirable. This patch eliminates the need to specify the "insecure" export option to allow NFS/RDMA clients access to an export. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=250 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-22 15:55:48 -04:00
Trond Myklebust	7a9a7b774f	SUNRPC: Fix a module reference issue in rpcsec_gss We're not taking a reference in the case where _gss_mech_get_by_pseudoflavor loops without finding the correct rpcsec_gss flavour, so why are we releasing it? Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-05-18 13:47:14 -04:00
Kinglong Mee	ecca063b31	SUNRPC: Fix printk that is not only for nfsd Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-05-08 14:59:51 -04:00
Peter Zijlstra	4e857c58ef	arch: Mass conversion of smp_mb__() Mostly scripted conversion of the smp_mb__ barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-arch@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-04-18 14:20:48 +02:00
Linus Torvalds	454fd351f2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull yet more networking updates from David Miller: 1) Various fixes to the new Redpine Signals wireless driver, from Fariya Fatima. 2) L2TP PPP connect code takes PMTU from the wrong socket, fix from Dmitry Petukhov. 3) UFO and TSO packets differ in whether they include the protocol header in gso_size, account for that in skb_gso_transport_seglen(). From Florian Westphal. 4) If VLAN untagging fails, we double free the SKB in the bridging output path. From Toshiaki Makita. 5) Several call sites of sk->sk_data_ready() were referencing an SKB just added to the socket receive queue in order to calculate the second argument via skb->len. This is dangerous because the moment the skb is added to the receive queue it can be consumed in another context and freed up. It turns out also that none of the sk->sk_data_ready() implementations even care about this second argument. So just kill it off and thus fix all these use-after-free bugs as a side effect. 6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti. 7) pktgen needs to do locking properly for LLTX devices, from Daniel Borkmann. 8) xen-netfront driver initializes TX array entries in RX loop :-) From Vincenzo Maffione. 9) After refactoring, some tunnel drivers allow a tunnel to be configured on top itself. Fix from Nicolas Dichtel. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) vti: don't allow to add the same tunnel twice gre: don't allow to add the same tunnel twice drivers: net: xen-netfront: fix array initialization bug pktgen: be friendly to LLTX devices r8152: check RTL8152_UNPLUG net: sun4i-emac: add promiscuous support net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO net: ipv6: Fix oif in TCP SYN+ACK route lookup. drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts drivers: net: cpsw: discard all packets received when interface is down net: Fix use after free by removing length arg from sk_data_ready callbacks. Drivers: net: hyperv: Address UDP checksum issues Drivers: net: hyperv: Negotiate suitable ndis version for offload support Drivers: net: hyperv: Allocate memory for all possible per-pecket information bridge: Fix double free and memory leak around br_allowed_ingress bonding: Remove debug_fs files when module init fails i40evf: program RSS LUT correctly i40evf: remove open-coded skb_cow_head ixgb: remove open-coded skb_cow_head igbvf: remove open-coded skb_cow_head ...	2014-04-12 17:31:22 -07:00
David S. Miller	676d23690f	net: Fix use after free by removing length arg from sk_data_ready callbacks. Several spots in the kernel perform a sequence like: skb_queue_tail(&sk->s_receive_queue, skb); sk->sk_data_ready(sk, skb->len); But at the moment we place the SKB onto the socket receive queue it can be consumed and freed up. So this skb->len access is potentially to freed up memory. Furthermore, the skb->len can be modified by the consumer so it is possible that the value isn't accurate. And finally, no actual implementation of this callback actually uses the length argument. And since nobody actually cared about it's value, lots of call sites pass arbitrary values in such as '0' and even '1'. So just remove the length argument from the callback, that way there is no confusion whatsoever and all of these use-after-free cases get fixed as a side effect. Based upon a patch by Eric Dumazet and his suggestion to audit this issue tree-wide. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-04-11 16:15:36 -04:00
Linus Torvalds	75ff24fa52	Merge branch 'for-3.15' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: "Highlights: - server-side nfs/rdma fixes from Jeff Layton and Tom Tucker - xdr fixes (a larger xdr rewrite has been posted but I decided it would be better to queue it up for 3.16). - miscellaneous fixes and cleanup from all over (thanks especially to Kinglong Mee)" * 'for-3.15' of git://linux-nfs.org/~bfields/linux: (36 commits) nfsd4: don't create unnecessary mask acl nfsd: revert v2 half of "nfsd: don't return high mode bits" nfsd4: fix memory leak in nfsd4_encode_fattr() nfsd: check passed socket's net matches NFSd superblock's one SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp SUNRPC: New helper for creating client with rpc_xprt NFSD: Free backchannel xprt in bc_destroy NFSD: Clear wcc data between compound ops nfsd: Don't return NFS4ERR_STALE_STATEID for NFSv4.1+ nfsd4: fix nfs4err_resource in 4.1 case nfsd4: fix setclientid encode size nfsd4: remove redundant check from nfsd4_check_resp_size nfsd4: use more generous NFS4_ACL_MAX nfsd4: minor nfsd4_replay_cache_entry cleanup nfsd4: nfsd4_replay_cache_entry should be static nfsd4: update comments with obsolete function name rpc: Allow xdr_buf_subsegment to operate in-place NFSD: Using free_conn free connection SUNRPC: fix memory leak of peer addresses in XPRT ...	2014-04-08 18:28:14 -07:00
Linus Torvalds	2b3a8fd735	NFS client updates for Linux 3.15 Highlights include: - Stable fix for a use after free issue in the NFSv4.1 open code - Fix the SUNRPC bi-directional RPC code to account for TCP segmentation - Optimise usage of readdirplus when confronted with 'ls -l' situations - Soft mount bugfixes - NFS over RDMA bugfixes - NFSv4 close locking fixes - Various NFSv4.x client state management optimisations - Rename/unlink code cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTQBayAAoJEGcL54qWCgDyUzgQAKzSlbcksMQT55M/KZJXabNW KSctJeDrkTkRxOXTNxuF9NbIgeqenLijCokXty6BIUgup0zkOPMzFfRfgdQvplnp YEj4sOEXEZ8CX+PoUTYOEayzt0ssEAOyidumiM+Gx2LD/E1d2xyCL7YaAOjIhVQS OnXcX1cZw+dZSUxC9vu5fVDjrphJTnp4CXdbvR5PiJiXeKqzZd9e5M3hXgpAQ/AS mWjYeUvM9mwyz7UmbLKkWEmzB3tFlGdTzDPxLRrkfcOSKI2Ham0lL3/Uv50/nRTu 99ts6KH8KLGcUuL9vD9KRebht2f71usBrWAdvpy1cUcf1Fh6lmEg4ktGfkqldaUu 9kNu9d5DCxJoGc6R2UTw5FeyPwYuDWoBwEGy1DcguJ5CeQn2R2nH4ps/P3J3DX4d DZsJqCY9idKZCQhtyR0iF9j3x2bNFoENaL6WHI6b0J+xjMedIbHgeUQzIQP0RLBJ h0IcjK0D+e7WdyC7jk4Nm3krtms5SNUG5/N9OUO36a7v8735PJBcbcgm9hZJt8Fh t/4vqUmKIBXHioHsMhaFslqTWlYIR9a3MYmN7QtHFYbqUfNxH69v9y3d6jb4Igck kqoEiui5aJOCR76s7oVdHCcm+klBwEPiACT+H9CUMzSoKzHSWsBSNZbJR3BEia4M 7dwScS1OfI2KuutshGQA =weNx -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.15-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: - Stable fix for a use after free issue in the NFSv4.1 open code - Fix the SUNRPC bi-directional RPC code to account for TCP segmentation - Optimise usage of readdirplus when confronted with 'ls -l' situations - Soft mount bugfixes - NFS over RDMA bugfixes - NFSv4 close locking fixes - Various NFSv4.x client state management optimisations - Rename/unlink code cleanups" * tag 'nfs-for-3.15-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits) nfs: pass string length to pr_notice message about readdir loops NFSv4: Fix a use-after-free problem in open() SUNRPC: rpc_restart_call/rpc_restart_call_prepare should clear task->tk_status SUNRPC: Don't let rpc_delay() clobber non-timeout errors SUNRPC: Ensure call_connect_status() deals correctly with SOFTCONN tasks SUNRPC: Ensure call_status() deals correctly with SOFTCONN tasks NFSv4: Ensure we respect soft mount timeouts during trunking discovery NFSv4: Schedule recovery if nfs40_walk_client_list() is interrupted NFS: advertise only supported callback netids SUNRPC: remove KERN_INFO from dprintk() call sites SUNRPC: Fix large reads on NFS/RDMA NFS: Clean up: revert increase in READDIR RPC buffer max size SUNRPC: Ensure that call_bind times out correctly SUNRPC: Ensure that call_connect times out correctly nfs: emit a fsnotify_nameremove call in sillyrename codepath nfs: remove synchronous rename code nfs: convert nfs_rename to use async_rename infrastructure nfs: make nfs_async_rename non-static nfs: abstract out code needed to complete a sillyrename NFSv4: Clear the open state flags if the new stateid does not match ...	2014-04-06 10:09:38 -07:00
Stanislav Kinsbursky	3064639423	nfsd: check passed socket's net matches NFSd superblock's one There could be a case, when NFSd file system is mounted in network, different to socket's one, like below: "ip netns exec" creates new network and mount namespace, which duplicates NFSd mount point, created in init_net context. And thus NFS server stop in nested network context leads to RPCBIND client destruction in init_net. Then, on NFSd start in nested network context, rpc.nfsd process creates socket in nested net and passes it into "write_ports", which leads to RPCBIND sockets creation in init_net context because of the same reason (NFSd monut point was created in init_net context). An attempt to register passed socket in nested net leads to panic, because no RPCBIND client present in nexted network namespace. This patch add check that passed socket's net matches NFSd superblock's one. And returns -EINVAL error to user psace otherwise. v2: Put socket on exit. Reported-by: Weng Meiling <wengmeiling.weng@huawei.com> Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-31 16:58:16 -04:00
Kinglong Mee	642aab58db	SUNRPC: Clear xpt_bc_xprt if xs_setup_bc_tcp failed Don't move the assign of args->bc_xprt->xpt_bc_xprt out of xs_setup_bc_tcp, because rpc_ping (which is in rpc_create) will using it. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-30 10:47:37 -04:00
Kinglong Mee	d531c008d7	NFSD/SUNRPC: Check rpc_xprt out of xs_setup_bc_tcp Besides checking rpc_xprt out of xs_setup_bc_tcp, increase it's reference (it's important). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-30 10:47:36 -04:00
Kinglong Mee	83ddfebdd2	SUNRPC: New helper for creating client with rpc_xprt Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-30 10:47:36 -04:00
Kinglong Mee	47f72efa8f	NFSD: Free backchannel xprt in bc_destroy Backchannel xprt isn't freed right now. Free it in bc_destroy, and put the reference of THIS_MODULE. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-30 10:47:35 -04:00
J. Bruce Fields	de4aee2e0a	rpc: Allow xdr_buf_subsegment to operate in-place Allow xdr_buf_subsegment(&buf, &buf, base, len) to modify an xdr_buf in-place. Also, none of the callers need the iov_base of head or tail to be zeroed out. Also add documentation. (As it turns out, I'm not really using this new guarantee, but it seems a simple way to make this function a bit more robust.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:24:49 -04:00
Kinglong Mee	315f3812db	SUNRPC: fix memory leak of peer addresses in XPRT Creating xprt failed after xs_format_peer_addresses, sunrpc must free those memory of peer addresses in xprt. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 21:23:39 -04:00
Jeff Layton	3cbe01a94c	svcrdma: fix offset calculation for non-page aligned sge entries The xdr_off value in dma_map_xdr gets passed to ib_dma_map_page as the offset into the page to be mapped. This calculation does not correctly take into account the case where the data starts at some offset into the page. Increment the xdr_off by the page_base to ensure that it is respected. Cc: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 18:02:13 -04:00
Jeff Layton	2e8c12e1b7	xprtrdma: add separate Kconfig options for NFSoRDMA client and server support There are two entirely separate modules under xprtrdma/ and there's no reason that enabling one should automatically enable the other. Add config options for each one so they can be enabled/disabled separately. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 18:02:12 -04:00
Tom Tucker	7e4359e261	Fix regression in NFSRDMA server The server regression was caused by the addition of rq_next_page (`afc59400d6`). There were a few places that were missed with the update of the rq_respages array. Signed-off-by: Tom Tucker <tom@ogc.us> Tested-by: Steve Wise <swise@ogc.us> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-28 18:02:11 -04:00
Rashika Kheria	4548120100	net: Mark functions as static in net/sunrpc/svc_xprt.c Mark functions as static in net/sunrpc/svc_xprt.c because they are not used outside this file. This eliminates the following warning in net/sunrpc/svc_xprt.c: net/sunrpc/svc_xprt.c:574:5: warning: no previous prototype for ‘svc_alloc_arg’ [-Wmissing-prototypes] net/sunrpc/svc_xprt.c:615:18: warning: no previous prototype for ‘svc_get_next_xprt’ [-Wmissing-prototypes] net/sunrpc/svc_xprt.c:694:6: warning: no previous prototype for ‘svc_add_new_temp_xprt’ [-Wmissing-prototypes] Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:57 -04:00
Jeff Layton	c42a01eee7	svcrdma: fix printk when memory allocation fails It retries in 1s, not 1000 jiffies. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-03-27 16:31:56 -04:00
Trond Myklebust	494314c415	SUNRPC: rpc_restart_call/rpc_restart_call_prepare should clear task->tk_status When restarting an rpc call, we should not be carrying over data from the previous call. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-20 13:38:44 -04:00
Trond Myklebust	6bd144160a	SUNRPC: Don't let rpc_delay() clobber non-timeout errors Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-20 13:38:43 -04:00
Steve Dickson	1fa3e2eb9d	SUNRPC: Ensure call_connect_status() deals correctly with SOFTCONN tasks Don't schedule an rpc_delay before checking to see if the task is a SOFTCONN because the tk_callback from the delay (__rpc_atrun) clears the task status before the rpc_exit_task can be run. Signed-off-by: Steve Dickson <steved@redhat.com> Fixes: `561ec16031` (SUNRPC: call_connect_status should recheck...) Link: http://lkml.kernel.org/r/5329CF7C.7090308@RedHat.com Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-20 11:46:52 -04:00
Trond Myklebust	9455e3f43b	SUNRPC: Ensure call_status() deals correctly with SOFTCONN tasks Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-19 17:19:42 -04:00
Chuck Lever	3a0799a94c	SUNRPC: remove KERN_INFO from dprintk() call sites The use of KERN_INFO causes garbage characters to appear when debugging is enabled. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-17 15:30:38 -04:00
Chuck Lever	2b7bbc963d	SUNRPC: Fix large reads on NFS/RDMA After commit `a11a2bf4`, "SUNRPC: Optimise away unnecessary data moves in xdr_align_pages", Thu Aug 2 13:21:43 2012, READs larger than a few hundred bytes via NFS/RDMA no longer work. This commit exposed a long-standing bug in rpcrdma_inline_fixup(). I reproduce this with an rsize=4096 mount using the cthon04 basic tests. Test 5 fails with an EIO error. For my reproducer, kernel log shows: NFS: server cheating in read reply: count 4096 > recvd 0 rpcrdma_inline_fixup() is zeroing the xdr_stream::page_len field, and xdr_align_pages() is now returning that value to the READ XDR decoder function. That field is set up by xdr_inline_pages() by the READ XDR encoder function. As far as I can tell, it is supposed to be left alone after that, as it describes the dimensions of the reply xdr_stream, not the contents of that stream. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=68391 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-17 15:30:38 -04:00
Trond Myklebust	bd0f725c4c	Merge branch 'devel' into linux-next	2014-03-17 15:15:21 -04:00
Trond Myklebust	fdb63dcdb5	SUNRPC: Ensure that call_bind times out correctly If the rpcbind server is unavailable, we still want the RPC client to respect the timeout. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-17 15:14:18 -04:00
Trond Myklebust	485f225178	SUNRPC: Ensure that call_connect times out correctly When the server is unavailable due to a networking error, etc, we want the RPC client to respect the timeout delays when attempting to reconnect. Reported-by: Neil Brown <neilb@suse.de> Fixes: `561ec16031` (SUNRPC: call_connect_status should recheck bind..) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-03-17 15:14:18 -04:00
Linus Torvalds	e95003c3f9	NFS client bugfixes for Linux 3.14 Highlights include stable fixes for the following bugs: - General performance regression due to NFS_INO_INVALID_LABEL being set when the server doesn't support labeled NFS - Hang in the RPC code due to a socket out-of-buffer race - Infinite loop when trying to establish the NFSv4 lease - Use-after-free bug in the RPCSEC gss code. - nfs4_select_rw_stateid is returning with a non-zero error value on success Other bug fixes: - Potential memory scribble in the RPC bi-directional RPC code - Pipe version reference leak - Use the correct net namespace in the new NFSv4 migration code -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJTBMBlAAoJEGcL54qWCgDyOakP+gKDh0VhKw8GziJFfY+6kHHI wej86M/coNnRPUv8n7s3N5TXMoV36qisYpxbIG/WQbOn6MfR3qjto6WoP7+vsrEq iNomtLivgEsWJePydTuIyAR/TK0du/zqP4zoPEDgdLDenucEVvkCzGIkqzg8Mddc duknEhIq918BkXIe3hRBWuxl+pRjwZur+TY0h/OR11oodqTYHxrE37f5PvREWOmB 08hhjOFBFYlyEnCjD3I1SFmcXQxkKzvACavvbhTyF6u/37oL/QC1/DZKL5mSdOJ6 novO8sv9gIpn/RhsEMOdaeYMYM5QTvkYIJQyLpKAYyaLZ42EMbRkczwNE7C0ZWyi F9MizDMNTie+DvsSHZPYwABTDOOQOuWPa9PO3Lyo8UtWxfmTOjYr2Rre1wyG0/+0 ywb3JiKQCtVDPnmSHhqxFfVp9XvS7D2/vz0udgKjrCLKyDC2OMMGLVa/acnrtZmz s94QpPiqhjnRqIuKo251HtbK3AVaLBNQxBCPszieYwPm9aZ04P7mjsGg1WuhhB2v +eSa9UkicGJwKWJWtIBr54qIAOlEXu2bXY+vio5UfbDb+5qfBHe0TmNrz5QNJ53A x0eUBocth9VpW1cv/Rf+o30tJyZGy6Jtv8kn2hfXkJXNL3N1Gn25rD3tpb+5TtdY b7gtHy13/oe8yi0tK91f =tll2 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.14-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Highlights include stable fixes for the following bugs: - General performance regression due to NFS_INO_INVALID_LABEL being set when the server doesn't support labeled NFS - Hang in the RPC code due to a socket out-of-buffer race - Infinite loop when trying to establish the NFSv4 lease - Use-after-free bug in the RPCSEC gss code. - nfs4_select_rw_stateid is returning with a non-zero error value on success Other bug fixes: - Potential memory scribble in the RPC bi-directional RPC code - Pipe version reference leak - Use the correct net namespace in the new NFSv4 migration code" * tag 'nfs-for-3.14-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS fix error return in nfs4_select_rw_stateid NFSv4: Use the correct net namespace in nfs4_update_server SUNRPC: Fix a pipe_version reference leak SUNRPC: Ensure that gss_auth isn't freed before its upcall messages SUNRPC: Fix potential memory scribble in xprt_free_bc_request() SUNRPC: Fix races in xs_nospace() SUNRPC: Don't create a gss auth cache unless rpc.gssd is running NFS: Do not set NFS_INO_INVALID_LABEL unless server supports labeled NFS	2014-02-19 12:13:02 -08:00
Trond Myklebust	e9776d0f4a	SUNRPC: Fix a pipe_version reference leak In gss_alloc_msg(), if the call to gss_encode_v1_msg() fails, we want to release the reference to the pipe_version that was obtained earlier in the function. Fixes: `9d3a2260f0` (SUNRPC: Fix buffer overflow checking in...) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-02-16 13:28:01 -05:00
Trond Myklebust	9eb2ddb48c	SUNRPC: Ensure that gss_auth isn't freed before its upcall messages Fix a race in which the RPC client is shutting down while the gss daemon is processing a downcall. If the RPC client manages to shut down before the gss daemon is done, then the struct gss_auth used in gss_release_msg() may have already been freed. Link: http://lkml.kernel.org/r/1392494917.71728.YahooMailNeo@web140002.mail.bf1.yahoo.com Reported-by: John <da_audiophile@yahoo.com> Reported-by: Borislav Petkov <bp@alien8.de> Cc: stable@vger.kernel.org # 3.12+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-02-16 13:06:06 -05:00
Linus Torvalds	16e5a2ed59	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking updates from David Miller: 1) Fix flexcan build on big endian, from Arnd Bergmann 2) Correctly attach cpsw to GPIO bitbang MDIO drive, from Stefan Roese 3) udp_add_offload has to use GFP_ATOMIC since it can be invoked from non-sleepable contexts. From Or Gerlitz 4) vxlan_gro_receive() does not iterate over all possible flows properly, fix also from Or Gerlitz 5) CAN core doesn't use a proper SKB destructor when it hooks up sockets to SKBs. Fix from Oliver Hartkopp 6) ip_tunnel_xmit() can use an uninitialized route pointer, fix from Eric Dumazet 7) Fix address family assignment in IPVS, from Michal Kubecek 8) Fix ath9k build on ARM, from Sujith Manoharan 9) Make sure fail_over_mac only applies for the correct bonding modes, from Ding Tianhong 10) The udp offload code doesn't use RCU correctly, from Shlomo Pongratz 11) Handle gigabit features properly in generic PHY code, from Florian Fainelli 12) Don't blindly invoke link operations in rtnl_link_get_slave_info_data_size, they are optional. Fix from Fernando Luis Vazquez Cao 13) Add USB IDs for Netgear Aircard 340U, from Bjørn Mork 14) Handle netlink packet padding properly in openvswitch, from Thomas Graf 15) Fix oops when deleting chains in nf_tables, from Patrick McHardy 16) Fix RX stalls in xen-netback driver, from Zoltan Kiss 17) Fix deadlock in mac80211 stack, from Emmanuel Grumbach 18) inet_nlmsg_size() forgets to consider ifa_cacheinfo, fix from Geert Uytterhoeven 19) tg3_change_mtu() can deadlock, fix from Nithin Sujir 20) Fix regression in setting SCTP local source addresses on accepted sockets, caused by some generic ipv6 socket changes. Fix from Matija Glavinic Pecotic 21) IPPROTO_* must be pure defines, otherwise module aliases don't get constructed properly. Fix from Jan Moskyto 22) IPV6 netconsole setup doesn't work properly unless an explicit source address is specified, fix from Sabrina Dubroca 23) Use __GFP_NORETRY for high order skb page allocations in sock_alloc_send_pskb and skb_page_frag_refill. From Eric Dumazet 24) Fix a regression added in netconsole over bridging, from Cong Wang 25) TCP uses an artificial offset of 1ms for SRTT, but this doesn't jive well with TCP pacing which needs the SRTT to be accurate. Fix from Eric Dumazet 26) Several cases of missing header file includes from Rashika Kheria 27) Add ZTE MF667 device ID to qmi_wwan driver, from Raymond Wanyoike 28) TCP Small Queues doesn't handle nonagle properly in some corner cases, fix from Eric Dumazet 29) Remove extraneous read_unlock in bond_enslave, whoops. From Ding Tianhong 30) Fix 9p trans_virtio handling of vmalloc buffers, from Richard Yao * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (136 commits) 6lowpan: fix lockdep splats alx: add missing stats_lock spinlock init 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers bonding: remove unwanted bond lock for enslave processing USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support tcp: tsq: fix nonagle handling bridge: Prevent possible race condition in br_fdb_change_mac_address bridge: Properly check if local fdb entry can be deleted when deleting vlan bridge: Properly check if local fdb entry can be deleted in br_fdb_delete_by_port bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address bridge: Fix the way to check if a local fdb entry can be deleted bridge: Change local fdb entries whenever mac address of bridge device changes bridge: Fix the way to find old local fdb entries in br_fdb_change_mac_address bridge: Fix the way to insert new local fdb entries in br_fdb_changeaddr bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr tcp: correct code comment stating 3 min timeout for FIN_WAIT2, we only do 1 min net: vxge: Remove unused device pointer net: qmi_wwan: add ZTE MF667 3c59x: Remove unused pointer in vortex_eisa_cleanup() net: fix 'ip rule' iif/oif device rename ...	2014-02-11 12:05:55 -08:00
Trond Myklebust	2ea24497a1	SUNRPC: RPC callbacks may be split across several TCP segments Since TCP is a stream protocol, our callback read code needs to take into account the fact that RPC callbacks are not always confined to a single TCP segment. This patch adds support for multiple TCP segments by ensuring that we only remove the rpc_rqst structure from the 'free backchannel requests' list once the data has been completely received. We rely on the fact that TCP data is ordered for the duration of the connection. Reported-by: shaobingqing <shaobingqing@bwstor.com.cn> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-02-11 14:01:20 -05:00
Trond Myklebust	628356791b	SUNRPC: Fix potential memory scribble in xprt_free_bc_request() The call to xprt_free_allocation() will call list_del() on req->rq_bc_pa_list, which is not attached to a list. This patch moves the list_del() out of xprt_free_allocation() and into those callers that need it. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-02-11 13:56:54 -05:00
Trond Myklebust	06ea0bfe6e	SUNRPC: Fix races in xs_nospace() When a send failure occurs due to the socket being out of buffer space, we call xs_nospace() in order to have the RPC task wait until the socket has drained enough to make it worth while trying again. The current patch fixes a race in which the socket is drained before we get round to setting up the machinery in xs_nospace(), and which is reported to cause hangs. Link: http://lkml.kernel.org/r/20140210170315.33dfc621@notabene.brown Fixes: `a9a6b52ee1` (SUNRPC: Don't start the retransmission timer...) Reported-by: Neil Brown <neilb@suse.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-02-11 09:34:30 -05:00
Trond Myklebust	a699d65ec4	SUNRPC: Don't create a gss auth cache unless rpc.gssd is running An infinite loop is caused when nfs4_establish_lease() fails with -EACCES. This causes nfs4_handle_reclaim_lease_error() to sleep a bit and resets the NFS4CLNT_LEASE_EXPIRED bit. This in turn causes nfs4_state_manager() to try and reestablished the lease, again, again, again... The problem is a valid RPCSEC_GSS client is being created when rpc.gssd is not running. Link: http://lkml.kernel.org/r/1392066375-16502-1-git-send-email-steved@redhat.com Fixes: `0ea9de0ea6` (sunrpc: turn warn_gssd() log message into a dprintk()) Reported-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-02-10 16:49:17 -05:00
Rashika Kheria	e1d83ee673	net: Mark functions as static in net/sunrpc/svc_xprt.c Mark functions as static in net/sunrpc/svc_xprt.c because they are not used outside this file. This eliminates the following warning in net/sunrpc/svc_xprt.c: net/sunrpc/svc_xprt.c:574:5: warning: no previous prototype for ‘svc_alloc_arg’ [-Wmissing-prototypes] net/sunrpc/svc_xprt.c:615:18: warning: no previous prototype for ‘svc_get_next_xprt’ [-Wmissing-prototypes] net/sunrpc/svc_xprt.c:694:6: warning: no previous prototype for ‘svc_add_new_temp_xprt’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-09 17:32:50 -08:00
Linus Torvalds	8a1f006ad3	NFS client bugfixes for Linux 3.14 Highlights: - Fix several races in nfs_revalidate_mapping - NFSv4.1 slot leakage in the pNFS files driver - Stable fix for a slot leak in nfs40_sequence_done - Don't reject NFSv4 servers that support ACLs with only ALLOW aces -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJS7Bb+AAoJEGcL54qWCgDyDuQP/17nKR5e6MLhixcAbvlcH+pN 8CGolAM3HmRXDWUW/PkBH3UguG8Tzx1Ex26vIxipPeTSwZabf6194Twj6L97DEGZ 2SouD158BW1TkAbhEN/alKB/4ZCPos05iXjZkrL7MRff+8FD0UvWR2pBT1F2jQdY ZftG76Q72qhZHfH07ZMxM/v4Oy2Ge98RDD35gfuuqMSjHpmN9tiB55PeheW33LVY fu6I/JEwmlJpgy2qUcDv7v0V4mDpjC7XbcjjHpMHL8zp/C5Rx/rdgt9OQPlwmjdV FD8MWNXLc5TWxIouLDFPVUv3WZPjyu449QHS9Wc95fSqsHcdl4j4SwLAoSvUIdHt vDI5PtWhw3WAezbtiuCQnT0xcoIOn5bLjOVP13taDcV9vlZLcFlyOpZ5gVE4/Yju zm4sCW2+imDc74ERGa4fukF6QhzzAVmD8RlCJwuNzwCfXiZ36+xSanMYiPoUiwLL OVNgymrm0fe7GVFQKWN2D+Vr68OQEmARO+KfA3UzP5rQV+9CU8zSHjbcoRWZ59QG VahOS5WDLQSrMp8W37yAHH9IiAWveAAKJJTHlOniRqH90QYPgyW18fTo7YcpW313 AQGFgr/1n4t27MWRLu5rdoN5v8+kwNi0UV6oboNIPoP1v15NkEMvc7HKFj5M883R qEYfe5wqN/eRNj68NT/+ =B7f0 -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: "Highlights: - Fix several races in nfs_revalidate_mapping - NFSv4.1 slot leakage in the pNFS files driver - Stable fix for a slot leak in nfs40_sequence_done - Don't reject NFSv4 servers that support ACLs with only ALLOW aces" * tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: nfs: initialize the ACL support bits to zero. NFSv4.1: Cleanup NFSv4.1: Clean up nfs41_sequence_done NFSv4: Fix a slot leak in nfs40_sequence_done NFSv4.1 free slot before resending I/O to MDS nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING NFS: Fix races in nfs_revalidate_mapping sunrpc: turn warn_gssd() log message into a dprintk() NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping nfs: handle servers that support only ALLOW ACE type.	2014-01-31 15:39:07 -08:00
Linus Torvalds	d9894c228b	Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux Pull nfsd updates from Bruce Fields: - Handle some loose ends from the vfs read delegation support. (For example nfsd can stop breaking leases on its own in a fewer places where it can now depend on the vfs to.) - Make life a little easier for NFSv4-only configurations (thanks to Kinglong Mee). - Fix some gss-proxy problems (thanks Jeff Layton). - miscellaneous bug fixes and cleanup * 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits) nfsd: consider CLAIM_FH when handing out delegation nfsd4: fix delegation-unlink/rename race nfsd4: delay setting current_fh in open nfsd4: minor nfs4_setlease cleanup gss_krb5: use lcm from kernel lib nfsd4: decrease nfsd4_encode_fattr stack usage nfsd: fix encode_entryplus_baggage stack usage nfsd4: simplify xdr encoding of nfsv4 names nfsd4: encode_rdattr_error cleanup nfsd4: nfsd4_encode_fattr cleanup minor svcauth_gss.c cleanup nfsd4: better VERIFY comment nfsd4: break only delegations when appropriate NFSD: Fix a memory leak in nfsd4_create_session sunrpc: get rid of use_gssp_lock sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt sunrpc: don't wait for write before allowing reads from use-gss-proxy file nfsd: get rid of unused function definition Define op_iattr for nfsd4_open instead using macro NFSD: fix compile warning without CONFIG_NFSD_V3 ...	2014-01-30 10:18:43 -08:00
Linus Torvalds	2b2b15c32a	NFS client updates for Linux 3.14 Highlights include: - Stable fix for an infinite loop in RPC state machine - Stable fix for a use after free situation in the NFSv4 trunking discovery - Stable fix for error handling in the NFSv4 trunking discovery - Stable fix for the page write update code - Stable fix for the NFSv4.1 mount time security negotiation - Stable fix for the NFSv4 open code. - O_DIRECT locking fixes - fix an Oops in the pnfs file commit code - RPC layer needs finer grained handling of connection errors - More RPC GSS upcall fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJS5ozQAAoJEGcL54qWCgDy8EIQAMKYX1E5qOal3oJCzWdHAPNz ZSQ7CbA3c66vgJwpxy5Mz4gEtTK1IEzfTX31gLgkCXkyw54As+0lOa/SvoXFUusN BdBtskkIcVjhcly56xP2dzWGMsVrS8Vt+nwhsPv1Qaor5El0zXwPv8YE5PuuxJK5 fyQdFEsywnCHtmFdyBdzsV8qHvAA0rxZTMmd6ZDBPCi9362D+pfp/1ESVOA6O14N rMBAbadF0pVM1UNvcvxSQaeqwCNqg5OuYKgyy9rhlH0WiQ6ijvKPrLVwg2pKZ2hj DCmwEqmKNEpxIFeOvmgFs/uhOEBx2IOF58xTc0+X81q96yTVm80anG1VTNFX577U gO8Ts0K/gWTD8ghxz4vh4/llc4yUv8ep8zB3qdSfL8C217UJIwnshkbPct7P1DTh 8vpWtUeVJPu6rwcxMQXy0NntNZjRo1aqrv+htvFzPAMicM2KEAp73eOjStefvtr5 JkdbvhhOR6dLwPrUEXM5FW5ewURegLjLcEqw3tq8kMnH0nEYjWOMBaB+uT0QFXun EXNqCpQHmHisem/3lGU+iVPc9lPf3C6tPIgjvoSplKcah1l3phVx6a5ReL22Zx2n qB2ePHfqToMjMcWiW3O3sbRpaDb+Br7xI4l8F3oeicvfv7SKB8k1u/w2IIoXKFIa FIdD6R0UIPgdnH5c03EC =abfY -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client updates from Trond Myklebust: "Highlights include: - stable fix for an infinite loop in RPC state machine - stable fix for a use after free situation in the NFSv4 trunking discovery - stable fix for error handling in the NFSv4 trunking discovery - stable fix for the page write update code - stable fix for the NFSv4.1 mount time security negotiation - stable fix for the NFSv4 open code. - O_DIRECT locking fixes - fix an Oops in the pnfs file commit code - RPC layer needs finer grained handling of connection errors - more RPC GSS upcall fixes" * tag 'nfs-for-3.14-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (30 commits) pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done pnfs: fix BUG in filelayout_recover_commit_reqs nfs4: fix discover_server_trunking use after free NFSv4.1: Handle errors correctly in nfs41_walk_client_list nfs: always make sure page is up-to-date before extending a write to cover the entire page nfs: page cache invalidation for dio nfs: take i_mutex during direct I/O reads nfs: merge nfs_direct_write into nfs_file_direct_write nfs: merge nfs_direct_read into nfs_file_direct_read nfs: increment i_dio_count for reads, too nfs: defer inode_dio_done call until size update is done nfs: fix size updates for aio writes nfs4.1: properly handle ENOTSUP in SECINFO_NO_NAME NFSv4.1: Fix a race in nfs4_write_inode NFSv4.1: Don't trust attributes if a pNFS LAYOUTCOMMIT is outstanding point to the right include file in a comment (left over from `a9004abc3`) NFS: dprintk() should not print negative fileids and inode numbers nfs: fix dead code of ipv6_addr_scope sunrpc: Fix infinite loop in RPC state machine SUNRPC: Add tracepoint for socket errors ...	2014-01-28 08:46:44 -08:00
Jeff Layton	0ea9de0ea6	sunrpc: turn warn_gssd() log message into a dprintk() The original printk() made sense when the GSSAPI codepaths were called only when sec=krb5* was explicitly requested. Now however, in many cases the nfs client will try to acquire GSSAPI credentials by default, even when it's not requested. Since we don't have a great mechanism to distinguish between the two cases, just turn the pr_warn into a dprintk instead. With this change we can also get rid of the ratelimiting. We do need to keep the EXPORT_SYMBOL(gssd_running) in place since auth_gss.ko needs it and sunrpc.ko provides it. We can however, eliminate the gssd_running call in the nfs code since that's a bit of a layering violation. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-01-27 15:36:05 -05:00

1 2 3 4 5 ...

2249 Коммитов