Declare the p_encode callbacks with the proper prototype instead of
casting to kxdreproc_t and losing all type safety.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct rpc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
p_count is the only writeable memeber of struct rpc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.
This patch moves it into out out struct rpc_procinfo, and into a
separate writable array that is pointed to by struct rpc_version and
indexed by p_statidx.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Declare the p_decode callbacks with the proper prototype instead of
casting to kxdrdproc_t and losing all type safety.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
Declare the p_encode callbacks with the proper prototype instead of
casting to kxdreproc_t and losing all type safety.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
bugfixes.
A couple changes could theoretically break working setups on upgrade. I
don't expect complaints in practice, but they seem worth calling out
just in case:
- NFS security labels are now off by default; a new
security_label export flag reenables it per export. But,
having them on by default is a disaster, as it generally only
makes sense if all your clients and servers have similar
enough selinux policies. Thanks to Jason Tibbitts for
pointing this out.
- NFSv4/UDP support is off. It was never really supported, and
the spec explicitly forbids it. We only ever left it on out
of laziness; thanks to Jeff Layton for finally fixing that.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYtejbAAoJECebzXlCjuG+JhEQAK3YTYYNrPY26Pfiu0FghLFV
4qOHK4DOkJzrWIom5uWyBo7yOwH6WnQtTe/gCx/voOEW3lsJO7F3IfTnTVp+Smp6
GJeVtsr1vI9EBnwhMlyoJ5hZ2Ju5kX3MBVnew6+momt6620ZO7a+EtT+74ePaY8Y
jxLzWVA1UqbWYoMabNQpqgKypKvNrhwst72iYyBhNuL/qtGeBDQWwcrA+TFeE9tv
Ad7qB53xL1mr0Wn1CNIOR/IzVAj4o2H0vdjqrPjAdvbfwf8YYLNpJXt5k591wx/j
1TpiWIPqnwLjMT3X5NkQN3agZKeD+2ZWPrClr35TgRRe62CK6JblK9/Wwc5BRSzV
paMP3hOm6/dQOBA5C+mqPaHdEI8VqcHyZpxU4VC/ttsVEGTgaLhGwHGzSn+5lYiM
Qx9Sh50yFV3oiBW/sb/y8lBDwYm/Cq0OyqAU277idbdzjcFerMg1qt06tjEQhYMY
K2V7rS8NuADUF6F1BwOONZzvg7Rr7iWHmLh+iSM9TeQoEmz2jIHSaLIFaYOET5Jr
PIZS3rOYoa0FaKOYVYnZMC74n/LqP/Aou8B+1rRcLy5YEdUIIIVpFvqpg1nGv6PI
sA3zx/f13IRte3g0CuQiY0/2cx7uXk/gXJ7s5+ejEzljF/aYWiomx3mr6HqPQETn
CWEtXlfyJCyX+A8hbO+U
=iLLz
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"The nfsd update this round is mainly a lot of miscellaneous cleanups
and bugfixes.
A couple changes could theoretically break working setups on upgrade.
I don't expect complaints in practice, but they seem worth calling out
just in case:
- NFS security labels are now off by default; a new security_label
export flag reenables it per export. But, having them on by default
is a disaster, as it generally only makes sense if all your clients
and servers have similar enough selinux policies. Thanks to Jason
Tibbitts for pointing this out.
- NFSv4/UDP support is off. It was never really supported, and the
spec explicitly forbids it. We only ever left it on out of
laziness; thanks to Jeff Layton for finally fixing that"
* tag 'nfsd-4.11' of git://linux-nfs.org/~bfields/linux: (34 commits)
nfsd: Fix display of the version string
nfsd: fix configuration of supported minor versions
sunrpc: don't register UDP port with rpcbind when version needs congestion control
nfs/nfsd/sunrpc: enforce transport requirements for NFSv4
sunrpc: flag transports as having congestion control
sunrpc: turn bitfield flags in svc_version into bools
nfsd: remove superfluous KERN_INFO
nfsd: special case truncates some more
nfsd: minor nfsd_setattr cleanup
NFSD: Reserve adequate space for LOCKT operation
NFSD: Get response size before operation for all RPCs
nfsd/callback: Drop a useless data copy when comparing sessionid
nfsd/callback: skip the callback tag
nfsd/callback: Cleanup callback cred on shutdown
nfsd/idmap: return nfserr_inval for 0-length names
SUNRPC/Cache: Always treat the invalid cache as unexpired
SUNRPC: Drop all entries from cache_detail when cache_purge()
svcrdma: Poll CQs in "workqueue" mode
svcrdma: Combine list fields in struct svc_rdma_op_ctxt
svcrdma: Remove unused sc_dto_q field
...
Now that %z is standartised in C99 there is no reason to support %Z.
Unlike %L it doesn't even make format strings smaller.
Use BUILD_BUG_ON in a couple ATM drivers.
In case anyone didn't notice lib/vsprintf.o is about half of SLUB which
is in my opinion is quite an achievement. Hopefully this patch inspires
someone else to trim vsprintf.c more.
Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We currently handle a client PROC_DESTROY request by turning it
CACHE_NEGATIVE, setting the expired time to now, and then waiting for
cache_clean to clean it up later. Since we forgot to set the cache's
nextcheck value, that could take up to 30 minutes. Also, though there's
probably no real bug in this case, setting CACHE_NEGATIVE directly like
this probably isn't a great idea in general.
So let's just remove the entry from the cache directly, and move this
bit of cache manipulation to a helper function.
Signed-off-by: Neil Brown <neilb@suse.com>
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."
The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize. Fix that.
[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual =
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task: ffff8800776f95c0 task.stack: ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946] ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901] gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017] svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257] ? __enqueue_entity+0x6c/0x70
[120408.589101] svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212] ? try_to_wake_up+0x4a/0x360
[120408.591036] ? wake_up_process+0x15/0x20
[120408.592093] ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177] svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168] svc_process_common+0x203/0x710 [sunrpc]
[120408.595220] svc_process+0x105/0x1c0 [sunrpc]
[120408.596278] nfsd+0xe9/0x160 [nfsd]
[120408.597060] kthread+0x101/0x140
[120408.597734] ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626] ? kthread_park+0x90/0x90
[120408.599448] ret_from_fork+0x22/0x30
Fixes: 1d658336b0 "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: stable@vger.kernel.org
Cc: Simo Sorce <simo@redhat.com>
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Context expiry times are in units of seconds since boot, not unix time.
The use of get_seconds() here therefore sets the expiry time decades in
the future. This prevents timely freeing of contexts destroyed by
client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
(when the module is unloaded or the container shut down), but a lot of
contexts could pile up before then.
Cc: stable@vger.kernel.org
Fixes: c5b29f885a "sunrpc: use seconds since boot in expiry cache"
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
that makes ACL inheritance a little more useful in environments that
default to restrictive umasks. Requires client-side support, also on
its way for 4.10.
Other than that, miscellaneous smaller fixes and cleanup, especially to
the server rdma code.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYVAEqAAoJECebzXlCjuG+VM0QAKaR+ibSM31Ahpnrgit5/wrb
n630KDFztO7iqEeuHfPQ4/n05T2QR0JWsLpjLMFvx88Gy4gyXYk9cuDPIrNKX1IS
3/nnhBo0+EVnjODjufommCrtbPZlqOSsS3N03vWkB7rTi8QYsWBOThh+XLRJYOXo
LZzJE1WmXNeCXV1kXPBsauryywql1fmwTXBzmIf1HbzoGAVROMEA2qqh4Z3nb7BP
sJuGchWx0STBOuAa278ighXQPUW2lUft9uzw2bssOtMwfNyOs/Pd6nx4F1Lg6WwD
1UQXoiR8K3PqelZfoeFJ05v0css/sbNKep+huWRdOXZj3Kjpa20lKBX8xHfat7sN
1OQ4FHx8ToigX3c+wwtlCqRMCcIxqUYkRjqzPHyeBiSSSp0rLrId44rI5x/K0yay
3bkGw7hFDSzc0Nq2uZgmtlbyTC71hLNhkWe7ThofcVG/pS0JtAqBiKIVwXJPh/e0
PLmVHYGU6Xowjag5edJlXY1tlIlxtWfqsWUarCXS5bfKUa3UjMVSjyuljsDqqJsn
96fEWu7DiUo4HeGYmf8MJoeZYV2y0DKSQGeguVkUKWp2DoTzinQHTfdKvrZVwNuu
hVE9/QeWzUvPY13HOUaKD2skozhbUChqv0NHESKUv8gxE3svTEpYZkXrE74WNqMk
l/WXAhw+RdKZof4+qdjU
=JANY
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.10' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"The one new feature is support for a new NFSv4.2 mode_umask attribute
that makes ACL inheritance a little more useful in environments that
default to restrictive umasks. Requires client-side support, also on
its way for 4.10.
Other than that, miscellaneous smaller fixes and cleanup, especially
to the server rdma code"
[ The client side of the umask attribute was merged yesterday ]
* tag 'nfsd-4.10' of git://linux-nfs.org/~bfields/linux:
nfsd: add support for the umask attribute
sunrpc: use DEFINE_SPINLOCK()
svcrdma: Further clean-up of svc_rdma_get_inv_rkey()
svcrdma: Break up dprintk format in svc_rdma_accept()
svcrdma: Remove unused variable in rdma_copy_tail()
svcrdma: Remove unused variables in xprt_rdma_bc_allocate()
svcrdma: Remove svc_rdma_op_ctxt::wc_status
svcrdma: Remove DMA map accounting
svcrdma: Remove BH-disabled spin locking in svc_rdma_send()
svcrdma: Renovate sendto chunk list parsing
svcauth_gss: Close connection when dropping an incoming message
svcrdma: Clear xpt_bc_xps in xprt_setup_rdma_bc() error exit arm
nfsd: constify reply_cache_stats_operations structure
nfsd: update workqueue creation
sunrpc: GFP_KERNEL should be GFP_NOFS in crypto code
nfsd: catch errors in decode_fattr earlier
nfsd: clean up supported attribute handling
nfsd: fix error handling for clients that fail to return the layout
nfsd: more robust allocation failure handling in nfsd_reply_cache_init
There are two problems with refcounting of auth_gss messages.
First, the reference on the pipe->pipe list (taken by a call
to rpc_queue_upcall()) is not counted. It seems to be
assumed that a message in pipe->pipe will always also be in
pipe->in_downcall, where it is correctly reference counted.
However there is no guaranty of this. I have a report of a
NULL dereferences in rpc_pipe_read() which suggests a msg
that has been freed is still on the pipe->pipe list.
One way I imagine this might happen is:
- message is queued for uid=U and auth->service=S1
- rpc.gssd reads this message and starts processing.
This removes the message from pipe->pipe
- message is queued for uid=U and auth->service=S2
- rpc.gssd replies to the first message. gss_pipe_downcall()
calls __gss_find_upcall(pipe, U, NULL) and it finds the
*second* message, as new messages are placed at the head
of ->in_downcall, and the service type is not checked.
- This second message is removed from ->in_downcall and freed
by gss_release_msg() (even though it is still on pipe->pipe)
- rpc.gssd tries to read another message, and dereferences a pointer
to this message that has just been freed.
I fix this by incrementing the reference count before calling
rpc_queue_upcall(), and decrementing it if that fails, or normally in
gss_pipe_destroy_msg().
It seems strange that the reply doesn't target the message more
precisely, but I don't know all the details. In any case, I think the
reference counting irregularity became a measureable bug when the
extra arg was added to __gss_find_upcall(), hence the Fixes: line
below.
The second problem is that if rpc_queue_upcall() fails, the new
message is not freed. gss_alloc_msg() set the ->count to 1,
gss_add_msg() increments this to 2, gss_unhash_msg() decrements to 1,
then the pointer is discarded so the memory never gets freed.
Fixes: 9130b8dbc6 ("SUNRPC: allow for upcalls for same uid but different gss service")
Cc: stable@vger.kernel.org
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1011250
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
S5.3.3.1 of RFC 2203 requires that an incoming GSS-wrapped message
whose sequence number lies outside the current window is dropped.
The rationale is:
The reason for discarding requests silently is that the server
is unable to determine if the duplicate or out of range request
was due to a sequencing problem in the client, network, or the
operating system, or due to some quirk in routing, or a replay
attack by an intruder. Discarding the request allows the client
to recover after timing out, if indeed the duplication was
unintentional or well intended.
However, clients may rely on the server dropping the connection to
indicate that a retransmit is needed. Without a connection reset, a
client can wait forever without retransmitting, and the workload
just stops dead. I've reproduced this behavior by running xfstests
generic/323 on an NFSv4.0 mount with proto=rdma and sec=krb5i.
To address this issue, have the server close the connection when it
silently discards an incoming message due to a GSS sequence number
problem.
There are a few other places where the server will never reply.
Change those spots in a similar fashion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Writes may depend on the auth_gss crypto code, so we shouldn't be
allocating with GFP_KERNEL there.
This still leaves some crypto_alloc_* calls which end up doing
GFP_KERNEL allocations in the crypto code. Those could probably done at
crypto import time.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
As of ac4e97abce "scatterlist: sg_set_buf() argument must be in linear
mapping", sg_set_buf hits a BUG when make_checksum_v2->xdr_process_buf,
among other callers, passes it memory on the stack.
We only need a scatterlist to pass this to the crypto code, and it seems
like overkill to require kmalloc'd memory just to encrypt a few bytes,
but for now this seems the best fix.
Many of these callers are in the NFS write paths, so we allocate with
GFP_NOFS. It might be possible to do without allocations here entirely,
but that would probably be a bigger project.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Highlights include:
Stable bugfixes:
- sunrpc: fix writ espace race causing stalls
- NFS: Fix inode corruption in nfs_prime_dcache()
- NFSv4: Don't report revoked delegations as valid in
nfs_have_delegation()
- NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is
invalid
- NFSv4: Open state recovery must account for file permission changes
- NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic
Features:
- Add support for tracking multiple layout types with an ordered list
- Add support for using multiple backchannel threads on the client
- Add support for pNFS file layout session trunking
- Delay xprtrdma use of DMA API (for device driver removal)
- Add support for xprtrdma remote invalidation
- Add support for larger xprtrdma inline thresholds
- Use a scatter/gather list for sending xprtrdma RPC calls
- Add support for the CB_NOTIFY_LOCK callback
- Improve hashing sunrpc auth_creds by using both uid and gid
Bugfixes:
- Fix xprtrdma use of DMA API
- Validate filenames before adding to the dcache
- Fix corruption of xdr->nwords in xdr_copy_to_scratch
- Fix setting buffer length in xdr_set_next_buffer()
- Don't deadlock the state manager on the SEQUENCE status flags
- Various delegation and stateid related fixes
- Retry operations if an interrupted slot receives EREMOTEIO
- Make nfs boot time y2038 safe
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJX/+ZfAAoJENfLVL+wpUDr5MUP/16s2Kp9ZZZZ7ICi3yrHOzb0
9WpCOmbKUIELXl8YgkxlvPUYMzTQTIc32TwbVgdFV0g41my/0+O3z3+IiTrUGxH5
8LgouMWBZ9KKmyUB//+KQAXr3j/bvDdF6Li6wJfz8a2o+9xT4oTkK1+Js8p0kn6e
HNKfRknfCKwvE+j4tPCLfs2RX5qDyBFILXwWhj1fAbmT3rbnp+QqkXD4mWUrXb9z
DBgxciXRhOkOQQAD2KQBFd2kUqWDZ5ED23b+aYsu9D3VCW45zitBqQFAxkQWL0hp
x8Mp+MDCxlgdEaGQPUmUiDtPkG1X9ZxUJCAwaJWWsZaItwR2Il+en2sETctnTZ1X
0IAxZVFdolzSeLzIfNx3OG32JdWJdaNjUzkIZam8gO6i1f6PAmK4alR0J3CT31nJ
/OEN76o1E7acGWRMmj+MAZ2U5gPfR7EitOzyE8ZUPcHgyeGMiynjwi56WIpeSvT2
F/Sp5kRe5+D5gtnYuppGp7Srp5vYdtFaz1zgPDUKpDLcxfDweO8AHGjJf3Zmrunx
X24yia4A14CnfcUy4vKpISXRykmkG/3Z0tpWwV53uXZm4nlQfRc7gPibiW7Ay521
af8sDoItW98K3DK5NQU7IUn83ua1TStzpoqlAEafRw//g9zPMTbhHvNvOyrRfrcX
kjWn6hNblMu9M34JOjtu
=XOrF
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"Highlights include:
Stable bugfixes:
- sunrpc: fix writ espace race causing stalls
- NFS: Fix inode corruption in nfs_prime_dcache()
- NFSv4: Don't report revoked delegations as valid in nfs_have_delegation()
- NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid
- NFSv4: Open state recovery must account for file permission changes
- NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic
Features:
- Add support for tracking multiple layout types with an ordered list
- Add support for using multiple backchannel threads on the client
- Add support for pNFS file layout session trunking
- Delay xprtrdma use of DMA API (for device driver removal)
- Add support for xprtrdma remote invalidation
- Add support for larger xprtrdma inline thresholds
- Use a scatter/gather list for sending xprtrdma RPC calls
- Add support for the CB_NOTIFY_LOCK callback
- Improve hashing sunrpc auth_creds by using both uid and gid
Bugfixes:
- Fix xprtrdma use of DMA API
- Validate filenames before adding to the dcache
- Fix corruption of xdr->nwords in xdr_copy_to_scratch
- Fix setting buffer length in xdr_set_next_buffer()
- Don't deadlock the state manager on the SEQUENCE status flags
- Various delegation and stateid related fixes
- Retry operations if an interrupted slot receives EREMOTEIO
- Make nfs boot time y2038 safe"
* tag 'nfs-for-4.9-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (100 commits)
NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic
fs: nfs: Make nfs boot time y2038 safe
sunrpc: replace generic auth_cred hash with auth-specific function
sunrpc: add RPCSEC_GSS hash_cred() function
sunrpc: add auth_unix hash_cred() function
sunrpc: add generic_auth hash_cred() function
sunrpc: add hash_cred() function to rpc_authops struct
Retry operation on EREMOTEIO on an interrupted slot
pNFS: Fix atime updates on pNFS clients
sunrpc: queue work on system_power_efficient_wq
NFSv4.1: Even if the stateid is OK, we may need to recover the open modes
NFSv4: If recovery failed for a specific open stateid, then don't retry
NFSv4: Fix retry issues with nfs41_test/free_stateid
NFSv4: Open state recovery must account for file permission changes
NFSv4: Mark the lock and open stateids as invalid after freeing them
NFSv4: Don't test open_stateid unless it is set
NFSv4: nfs4_do_handle_exception() handle revoke/expiry of a single stateid
NFS: Always call nfs_inode_find_state_and_recover() when revoking a delegation
NFSv4: Fix a race when updating an open_stateid
NFSv4: Fix a race in nfs_inode_reclaim_delegation()
...
Current supplementary groups code can massively overallocate memory and
is implemented in a way so that access to individual gid is done via 2D
array.
If number of gids is <= 32, memory allocation is more or less tolerable
(140/148 bytes). But if it is not, code allocates full page (!)
regardless and, what's even more fun, doesn't reuse small 32-entry
array.
2D array means dependent shifts, loads and LEAs without possibility to
optimize them (gid is never known at compile time).
All of the above is unnecessary. Switch to the usual
trailing-zero-len-array scheme. Memory is allocated with
kmalloc/vmalloc() and only as much as needed. Accesses become simpler
(LEA 8(gi,idx,4) or even without displacement).
Maximum number of gids is 65536 which translates to 256KB+8 bytes. I
think kernel can handle such allocation.
On my usual desktop system with whole 9 (nine) aux groups, struct
group_info shrinks from 148 bytes to 44 bytes, yay!
Nice side effects:
- "gi->gid[i]" is shorter than "GROUP_AT(gi, i)", less typing,
- fix little mess in net/ipv4/ping.c
should have been using GROUP_AT macro but this point becomes moot,
- aux group allocation is persistent and should be accounted as such.
Link: http://lkml.kernel.org/r/20160817201927.GA2096@p183.telecom.by
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a hash_cred() function for RPCSEC_GSS, using only the
uid from the auth_cred.
Signed-off-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Highlights include:
- Stable patch from Olga to fix RPCSEC_GSS upcalls when the same user needs
multiple different security services (e.g. krb5i and krb5p).
- Stable patch to fix a regression introduced by the use of SO_REUSEPORT,
and that prevented the use of multiple different NFS versions to the
same server.
- TCP socket reconnection timer fixes.
- Patch from Neil to disable the use of IPv6 temporary addresses.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXrh03AAoJEGcL54qWCgDyp4EQALwZpmYCxWJE5xSHW95Fs124
HYM8g4LznOfs3/ohInb1ja2FaQqUy0XEk3pSjNKfyYgjuwB4qJSOpnAqoIKxJFGB
h4582leYZOZYMMCGslS2I4zcElBYO1WjnKNyb7MpZjCHmN0AdFfIcOXd2K7eL9hM
/poImcs5KfMGIEJqmKqMUxmJ3RjxpK3LySQAes/Y5odOiHC4SGJdGUmSeuPGTbQd
YjFWVHRFU6kVAzPd2Jl46Sgy6SpDaVz82HodXCSY+8lklmIkbIsVqJs0VWo3WkfL
r5WLQ3PzZvloQ7o/E9tZGiB/LEi7roa51hYsG4sleN6Kap5vwyWg0QIKjqyJdFxB
JmFanlCMfae3zNz4cusvgu1okvMnNqO4uRXJIAKfk64k775N9ebY7TXAZUK4/UbY
4nxCHcxygamP/k/8HYFpc4964tMaimIs9JUdojad5a3dzffwXcgEC/0HPUih9R+i
DO/cbVtWeDkmQPLrUqFfOAbmQdyAjELrv48d5BVIst49uuCULU2LlDlVLiAvaZvq
s2YNmr7lkHowvgaH4ShL89wuyyD14Xu5/f49oFBFNKEQay9YthQ8s3XmdZBG7Zl0
oyA1XJjWEq3p8nvPGIqFD26w75ppUbAWLTHsyoU0YfEYrZJrF9jPxowI7WlHgfVo
Io79x1sbgTrckjG+osAf
=UHph
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
- Stable patch from Olga to fix RPCSEC_GSS upcalls when the same user
needs multiple different security services (e.g. krb5i and krb5p).
- Stable patch to fix a regression introduced by the use of
SO_REUSEPORT, and that prevented the use of multiple different NFS
versions to the same server.
- TCP socket reconnection timer fixes.
- Patch from Neil to disable the use of IPv6 temporary addresses"
* tag 'nfs-for-4.8-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: Cap the transport reconnection timer at 1/2 lease period
NFSv4: Cleanup the setting of the nfs4 lease period
SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout
SUNRPC: Fix reconnection timeouts
NFSv4.2: LAYOUTSTATS may return NFS4ERR_ADMIN/DELEG_REVOKED
SUNRPC: disable the use of IPv6 temporary addresses.
SUNRPC: allow for upcalls for same uid but different gss service
SUNRPC: Fix up socket autodisconnect
SUNRPC: Handle EADDRNOTAVAIL on connection failures
It's possible to have simultaneous upcalls for the same UIDs but
different GSS service. In that case, we need to allow for the
upcall to gssd to proceed so that not the same context is used
by two different GSS services. Some servers lock the use of context
to the GSS service.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Trond made a change to the server's tcp logic that allows a fast
client to better take advantage of high bandwidth networks, but
may increase the risk that a single client could starve other
clients; a new sunrpc.svc_rpc_per_connection_limit parameter
should help mitigate this in the (hopefully unlikely) event this
becomes a problem in practice.
Tom Haynes added a minimal flex-layout pnfs server, which is of
no use in production for now--don't build it unless you're doing
client testing or further server development.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXo7HNAAoJECebzXlCjuG+zqUP/RxO5jZjBhNI8/ayGdDW/Jnq
s0Fu6B+aNRV3GnugmIeI4tWNGnPyERNzFtjLKlnwaasz/oW4qBLqGbNUWC5xKARS
erODs0hM/1aCYWwNBEc5qXP2u23HrWVuQ+B5fg42ACyliKFGq5faDRmf6XGU/1kB
8unXGWPAiLiNZD/bWP91fYhThlLgpfHBFZ7M3G2IqmzWZTSELPzwp1bpRWt7yWQQ
z1oYtXToycbwz3yPVk3cXtaoqpjDUVZf2Guqgqi1BwEyEtYOSaYo1VHNsKDf4OId
QXQh64AqIK4uszpvtNhvsEaAECN7IiB+N4n2laFiQVmAf8Hfl3AnV/gKeD4lKmTj
TY6knnjZO/X88wn80MB7JR1H1WXvvzNIHwNR95qfub/lVKX+C+0AORRtYhi5F9ec
ixNs/z1ImLpYxAjiP/T5anD5xcX2S+LcSv7kRjhEufqNFtRAIqBZO9ZWbCdXAAyE
tcH9Cru4jeIlFO/y6O61EVrn9FFj2+0uu+7urefNRQ2Y9pmKeculJrLF6WO8WHms
4IzXMmjZK+358RVdX2Ji5Hw6rBDvfgP+LjB8Jn8CeIiNRONEjT+2/AYQcfk61aLb
INUbk6G6Vfd8iMO4aaRI9tmW+vKCOZa0IbnrNE1oHKp/AKBDr25i5YPSCsnl3r4Q
iR7rRe9FIkfqBpbfjVFv
=mo54
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Highlights:
- Trond made a change to the server's tcp logic that allows a fast
client to better take advantage of high bandwidth networks, but may
increase the risk that a single client could starve other clients;
a new sunrpc.svc_rpc_per_connection_limit parameter should help
mitigate this in the (hopefully unlikely) event this becomes a
problem in practice.
- Tom Haynes added a minimal flex-layout pnfs server, which is of no
use in production for now--don't build it unless you're doing
client testing or further server development"
* tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux: (32 commits)
nfsd: remove some dead code in nfsd_create_locked()
nfsd: drop unnecessary MAY_EXEC check from create
nfsd: clean up bad-type check in nfsd_create_locked
nfsd: remove unnecessary positive-dentry check
nfsd: reorganize nfsd_create
nfsd: check d_can_lookup in fh_verify of directories
nfsd: remove redundant zero-length check from create
nfsd: Make creates return EEXIST instead of EACCES
SUNRPC: Detect immediate closure of accepted sockets
SUNRPC: accept() may return sockets that are still in SYN_RECV
nfsd: allow nfsd to advertise multiple layout types
nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock
nfsd/blocklayout: Make sure calculate signature/designator length aligned
xfs: abstract block export operations from nfsd layouts
SUNRPC: Remove unused callback xpo_adjust_wspace()
SUNRPC: Change TCP socket space reservation
SUNRPC: Add a server side per-connection limit
SUNRPC: Micro optimisation for svc_data_ready
SUNRPC: Call the default socket callbacks instead of open coding
SUNRPC: lock the socket while detaching it
...
A generic_cred can be used to look up a unx_cred or a gss_cred, so it's
not really safe to use the the generic_cred->acred->ac_flags to store
the NO_CRKEY_TIMEOUT flag. A lookup for a unx_cred triggered while the
KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and
KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated
with the auth_cred to be in a state where they're perpetually doing 4K
NFS_FILE_SYNC writes.
This can be reproduced as follows:
1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys.
They do not need to be the same export, nor do they even need to be from
the same NFS server. Also, v3 is fine.
$ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5
$ sudo mount -o v3,sec=sys server2:/export /mnt/sys
2. As the normal user, before accessing the kerberized mount, kinit with
a short lifetime (but not so short that renewing the ticket would leave
you within the 4-minute window again by the time the original ticket
expires), e.g.
$ kinit -l 10m -r 60m
3. Do some I/O to the kerberized mount and verify that the writes are
wsize, UNSTABLE:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1
4. Wait until you're within 4 minutes of key expiry, then do some more
I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets
set. Verify that the writes are 4K, FILE_SYNC:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1
5. Now do some I/O to the sec=sys mount. This will cause
RPC_CRED_NO_CRKEY_TIMEOUT to be set:
$ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1
6. Writes for that user will now be permanently 4K, FILE_SYNC for that
user, regardless of which mount is being written to, until you reboot
the client. Renewing the kerberos ticket (assuming it hasn't already
expired) will have no effect. Grabbing a new kerberos ticket at this
point will have no effect either.
Move the flag to the auth->au_flags field (which is currently unused)
and rename it slightly to reflect that it's no longer associated with
the auth_cred->ac_flags. Add the rpc_auth to the arg list of
rpcauth_cred_key_to_expire and check the au_flags there too. Finally,
add the inode to the arg list of nfs_ctx_key_to_expire so we can
determine the rpc_auth to pass to rpcauth_cred_key_to_expire.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
GSS-Proxy doesn't produce very much debug logging at all. Printing out
the gss minor status will aid in troubleshooting if the
GSS_Accept_sec_context upcall fails.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Direct data placement is not allowed when using flavors that
guarantee integrity or privacy. When such security flavors are in
effect, don't allow the use of Read and Write chunks for moving
individual data items. All messages larger than the inline threshold
are sent via Long Call or Long Reply.
On my systems (CX-3 Pro on FDR), for small I/O operations, the use
of Long messages adds only around 5 usecs of latency in each
direction.
Note that when integrity or encryption is used, the host CPU touches
every byte in these messages. Even if it could be used, data
movement offload doesn't buy much in this case.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Highlights include:
Features:
- Add support for the NFS v4.2 COPY operation
- Add support for NFS/RDMA over IPv6
Bugfixes and cleanups:
- Avoid race that crashes nfs_init_commit()
- Fix oops in callback path
- Fix LOCK/OPEN race when unlinking an open file
- Choose correct stateids when using delegations in setattr, read and write
- Don't send empty SETATTR after OPEN_CREATE
- xprtrdma: Prevent server from writing a reply into memory client has released
- xprtrdma: Support using Read list and Reply chunk in one RPC call
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXRu76AAoJENfLVL+wpUDrDVoQAKPKv1tEVJMRUQA3UVoKoixd
KjmmZMjl6GfpISwTZl+a8W549jyGuYH7Gl8vSbMaE9/FI+kJW6XZQniTYfFqY8/a
LbMSdNx1+yURisbkyO0vPqqwKw9r6UmsfGeUT8SpS3ff61yp4Oj436ra2qcPJsZ3
cWl/lHItzX7oKFAWmr0Nmq2X8ac/8+NFyK29+V/QGfwtp3qAPbpA8XM5HrHw3rA2
uk5uNSr3hwqz7P3+Hi7ZoO2m4nQTAbQnEunfYpxlOwz4IaM7qcGnntT6Jhwq1pGE
/1YasG7bHeiWjhynmZZ4CWuMkogau2UJ/G68Cz7ehLhPNr8rH/ZFCJZ+XX0e0CgI
1d+AwxZvgszIQVBY3S7sg8ezVSCPBXRFJ8rtzggGscqC53aP7L+rLfUFH+OKrhMg
6n7RQiq4EmGDJGviB/R2HixI9CpdOf2puNhDKSJmPOqiSS7UuHMw8QCq++vdru+1
GLGunGyO7D70yTV92KtsdzJlFlnfa/g+FIJrmaMpL3HH1h0stTctWX5xlTYmqEL3
z3aUuT8RySk2t1FTabSj6KRWqE/krK5BMZbX91kpF27WL4c/olXFaZPqBDsj0q4u
2rm1fIrc8RxLXctJan9ro092s/e9dup/1JxV5XWMq/EGS1ezvf+0XkCOtURaAWp3
2aPHlx7M8iuq2SouL6f7
=QMmY
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.7-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"Highlights include:
Features:
- Add support for the NFS v4.2 COPY operation
- Add support for NFS/RDMA over IPv6
Bugfixes and cleanups:
- Avoid race that crashes nfs_init_commit()
- Fix oops in callback path
- Fix LOCK/OPEN race when unlinking an open file
- Choose correct stateids when using delegations in setattr, read and
write
- Don't send empty SETATTR after OPEN_CREATE
- xprtrdma: Prevent server from writing a reply into memory client
has released
- xprtrdma: Support using Read list and Reply chunk in one RPC call"
* tag 'nfs-for-4.7-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (61 commits)
pnfs: pnfs_update_layout needs to consider if strict iomode checking is on
nfs/flexfiles: Use the layout segment for reading unless it a IOMODE_RW and reading is disabled
nfs/flexfiles: Helper function to detect FF_FLAGS_NO_READ_IO
nfs: avoid race that crashes nfs_init_commit
NFS: checking for NULL instead of IS_ERR() in nfs_commit_file()
pnfs: make pnfs_layout_process more robust
pnfs: rework LAYOUTGET retry handling
pnfs: lift retry logic from send_layoutget to pnfs_update_layout
pnfs: fix bad error handling in send_layoutget
flexfiles: add kerneldoc header to nfs4_ff_layout_prepare_ds
flexfiles: remove pointless setting of NFS_LAYOUT_RETURN_REQUESTED
pnfs: only tear down lsegs that precede seqid in LAYOUTRETURN args
pnfs: keep track of the return sequence number in pnfs_layout_hdr
pnfs: record sequence in pnfs_layout_segment when it's created
pnfs: don't merge new ff lsegs with ones that have LAYOUTRETURN bit set
pNFS/flexfiles: When initing reads or writes, we might have to retry connecting to DSes
pNFS/flexfiles: When checking for available DSes, conditionally check for MDS io
pNFS/flexfile: Fix erroneous fall back to read/write through the MDS
NFS: Reclaim writes via writepage are opportunistic
NFSv4: Use the right stateid for delegations in setattr, read and write
...
The length of the GSS MIC token need not be a multiple of four bytes.
It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data()
would previously only trim mic.len + 4 B. The remaining up to three
bytes would then trigger a check in nfs4svc_decode_compoundargs(),
leading to a "garbage args" error and mount failure:
nfs4svc_decode_compoundargs: compound not properly padded!
nfsd: failed to decode arguments!
This would prevent older clients using the pre-RFC 4121 MIC format
(37-byte MIC including a 9-byte OID) from mounting exports from v3.9+
servers using krb5i.
The trimming was introduced by commit 4c190e2f91 ("sunrpc: trim off
trailing checksum before returning decrypted or integrity authenticated
buffer").
Fixes: 4c190e2f91 "unrpc: trim off trailing checksum..."
Signed-off-by: Tomáš Trnka <ttrnka@mail.muni.cz>
Cc: stable@vger.kernel.org
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We need to be able to call the generic_cred creator from different
contexts. Add a gfp_t parm to the crcreate operation and to
rpcauth_lookup_credcache. For now, we just push the gfp_t parms up
one level to the *_lookup_cred functions.
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pull crypto fixes from Herbert Xu:
"This fixes an NFS regression caused by the skcipher/hash conversion in
sunrpc. It also fixes a build problem in certain configurations with
bcm63xx"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: bcm63xx - fix device tree compilation
sunrpc: Fix skcipher/shash conversion
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The skcpiher/shash conversion introduced a number of bugs in the
sunrpc code:
1) Missing calls to skcipher_request_set_tfm lead to crashes.
2) The allocation size of shash_desc is too small which leads to
memory corruption.
Fixes: 3b5cf20cf4 ("sunrpc: Use skcipher and ahash/shash")
Reported-by: J. Bruce Fields <bfields@fieldses.org>
Tested-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Highlights include:
Features:
- Add support for multiple NFSv4.1 callbacks in flight
- Initial patchset for RPC multipath support
- Adapt RPC/RDMA to use the new completion queue API
Bugfixes and cleanups:
- nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
- Cleanups to remove nfs_inode_dio_wait and nfs4_file_fsync
- Fix RPC/RDMA credit accounting
- Properly handle RDMA_ERROR replies
- xprtrdma: Do not wait if ib_post_send() fails
- xprtrdma: Segment head and tail XDR buffers on page boundaries
- xprtrdma cleanups for dprintk, physical_op_map and unused macros
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW8Y7MAAoJEGcL54qWCgDyMsMP+we8JSgfVqI5X1lKpU9aPWkI
D912ybtV58Kv0elKwYvQMqm+mRvdMNz1hZNJa4sAEaPVBOGfFjyZLy3xlDlr0HTf
M+Juh0FNLTcUh1obxJamjsbpNxfg4b6f/Z29KWRzahv/MlpMJVS3hLjpAEzCcTYr
WfOOovV6mragtsBINegGl/6jk/x2D22JDnKcTU+8ltVZGJtZe+HoqTFhUOrLO5qm
wR3YO22fbOuiZxCPoST06kMNiksYnYXnOju8RwlKwFYq3bWke0jWstQtIC4vKs6K
4u5o74aTBL5zMkJPnJuIfN2if4LJPptSr1n7pItbv3MLmgY1mWjE6N2BATpijfhQ
p+Gt/GHTAvswuWrmwySZKLj/Q8EkBuw4ohPFwLQ9eFHl2USoV3G9KQw7H0odR4d1
IyQPCag+suN2lWBreFkPIV48dZyeCVk6JmJmy3SN+d0L1t3jd6gwSO2UBgG7S2Gd
qVbdxYRiU/zYP6E5wFouLhIc1beSfe4vnJqvnuWZrIId+haTE2+OLi7772WGIkSe
xoZVTg7AX4Wu79ZyWoH+e9FnDvEsRkRVv7HQfpsMq2gynBWj70/KemEoeZnjqWaB
UOWcH8/vNLrnwlXTh0VHG6I8t3s0EXgqQB4//tYRLI42oIj35W2pIMnjYt52DeVB
Mo5mbYghtR9bgeoRQ6V4
=kC3t
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Features:
- Add support for multiple NFSv4.1 callbacks in flight
- Initial patchset for RPC multipath support
- Adapt RPC/RDMA to use the new completion queue API
Bugfixes and cleanups:
- nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
- Cleanups to remove nfs_inode_dio_wait and nfs4_file_fsync
- Fix RPC/RDMA credit accounting
- Properly handle RDMA_ERROR replies
- xprtrdma: Do not wait if ib_post_send() fails
- xprtrdma: Segment head and tail XDR buffers on page boundaries
- xprtrdma cleanups for dprintk, physical_op_map and unused macros"
* tag 'nfs-for-4.6-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (35 commits)
nfs/blocklayout: make sure making a aligned read request
nfs4: nfs4_ff_layout_prepare_ds should return NULL if connection failed
nfs: remove nfs_inode_dio_wait
nfs: remove nfs4_file_fsync
xprtrdma: Use new CQ API for RPC-over-RDMA client send CQs
xprtrdma: Use an anonymous union in struct rpcrdma_mw
xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQs
xprtrdma: Serialize credit accounting again
xprtrdma: Properly handle RDMA_ERROR replies
rpcrdma: Add RPCRDMA_HDRLEN_ERR
xprtrdma: Do not wait if ib_post_send() fails
xprtrdma: Segment head and tail XDR buffers on page boundaries
xprtrdma: Clean up dprintk format string containing a newline
xprtrdma: Clean up physical_op_map()
xprtrdma: Clean up unused RPCRDMA_INLINE_PAD_THRESH macro
NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback
pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3
SUNRPC: Allow addition of new transports to a struct rpc_clnt
NFSv4.1: nfs4_proc_bind_conn_to_session must iterate over all connections
SUNRPC: Make NFS swap work with multipath
...
Pull crypto update from Herbert Xu:
"Here is the crypto update for 4.6:
API:
- Convert remaining crypto_hash users to shash or ahash, also convert
blkcipher/ablkcipher users to skcipher.
- Remove crypto_hash interface.
- Remove crypto_pcomp interface.
- Add crypto engine for async cipher drivers.
- Add akcipher documentation.
- Add skcipher documentation.
Algorithms:
- Rename crypto/crc32 to avoid name clash with lib/crc32.
- Fix bug in keywrap where we zero the wrong pointer.
Drivers:
- Support T5/M5, T7/M7 SPARC CPUs in n2 hwrng driver.
- Add PIC32 hwrng driver.
- Support BCM6368 in bcm63xx hwrng driver.
- Pack structs for 32-bit compat users in qat.
- Use crypto engine in omap-aes.
- Add support for sama5d2x SoCs in atmel-sha.
- Make atmel-sha available again.
- Make sahara hashing available again.
- Make ccp hashing available again.
- Make sha1-mb available again.
- Add support for multiple devices in ccp.
- Improve DMA performance in caam.
- Add hashing support to rockchip"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
crypto: qat - remove redundant arbiter configuration
crypto: ux500 - fix checks of error code returned by devm_ioremap_resource()
crypto: atmel - fix checks of error code returned by devm_ioremap_resource()
crypto: qat - Change the definition of icp_qat_uof_regtype
hwrng: exynos - use __maybe_unused to hide pm functions
crypto: ccp - Add abstraction for device-specific calls
crypto: ccp - CCP versioning support
crypto: ccp - Support for multiple CCPs
crypto: ccp - Remove check for x86 family and model
crypto: ccp - memset request context to zero during import
lib/mpi: use "static inline" instead of "extern inline"
lib/mpi: avoid assembler warning
hwrng: bcm63xx - fix non device tree compatibility
crypto: testmgr - allow rfc3686 aes-ctr variants in fips mode.
crypto: qat - The AE id should be less than the maximal AE number
lib/mpi: Endianness fix
crypto: rockchip - add hash support for crypto engine in rk3288
crypto: xts - fix compile errors
crypto: doc - add skcipher API documentation
crypto: doc - update AEAD AD handling
...
* multipath:
NFS add callback_ops to nfs4_proc_bind_conn_to_session_callback
pnfs/NFSv4.1: Add multipath capabilities to pNFS flexfiles servers over NFSv3
SUNRPC: Allow addition of new transports to a struct rpc_clnt
NFSv4.1: nfs4_proc_bind_conn_to_session must iterate over all connections
SUNRPC: Make NFS swap work with multipath
SUNRPC: Add a helper to apply a function to all the rpc_clnt's transports
SUNRPC: Allow caller to specify the transport to use
SUNRPC: Use the multipath iterator to assign a transport to each task
SUNRPC: Make rpc_clnt store the multipath iterators
SUNRPC: Add a structure to track multiple transports
SUNRPC: Make freeing of struct xprt rcu-safe
SUNRPC: Uninline xprt_get(); It isn't performance critical.
SUNRPC: Reorder rpc_task to put waitqueue related info in same cachelines
SUNRPC: Remove unused function rpc_task_reset_client
On Mon, 15 Feb 2016, Trond Myklebust wrote:
> Hi Scott,
>
> On Mon, Feb 15, 2016 at 2:28 PM, Scott Mayhew <smayhew@redhat.com> wrote:
> > md5 is disabled in fips mode, and attempting to import a gss context
> > using md5 while in fips mode will result in crypto_alg_mod_lookup()
> > returning -ENOENT, which will make its way back up to
> > gss_pipe_downcall(), where the BUG() is triggered. Handling the -ENOENT
> > allows for a more graceful failure.
> >
> > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > ---
> > net/sunrpc/auth_gss/auth_gss.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> > index 799e65b..c30fc3b 100644
> > --- a/net/sunrpc/auth_gss/auth_gss.c
> > +++ b/net/sunrpc/auth_gss/auth_gss.c
> > @@ -737,6 +737,9 @@ gss_pipe_downcall(struct file *filp, const char __user *src, size_t mlen)
> > case -ENOSYS:
> > gss_msg->msg.errno = -EAGAIN;
> > break;
> > + case -ENOENT:
> > + gss_msg->msg.errno = -EPROTONOSUPPORT;
> > + break;
> > default:
> > printk(KERN_CRIT "%s: bad return from "
> > "gss_fill_context: %zd\n", __func__, err);
> > --
> > 2.4.3
> >
>
> Well debugged, but I unfortunately do have to ask if this patch is
> sufficient? In addition to -ENOENT, and -ENOMEM, it looks to me as if
> crypto_alg_mod_lookup() can also fail with -EINTR, -ETIMEDOUT, and
> -EAGAIN. Don't we also want to handle those?
You're right, I was focusing on the panic that I could easily reproduce.
I'm still not sure how I could trigger those other conditions.
>
> In fact, peering into the rats nest that is
> gss_import_sec_context_kerberos(), it looks as if that is just a tiny
> subset of all the errors that we might run into. Perhaps the right
> thing to do here is to get rid of the BUG() (but keep the above
> printk) and just return a generic error?
That sounds fine to me -- updated patch attached.
-Scott
>From d54c6b64a107a90a38cab97577de05f9a4625052 Mon Sep 17 00:00:00 2001
From: Scott Mayhew <smayhew@redhat.com>
Date: Mon, 15 Feb 2016 15:12:19 -0500
Subject: [PATCH] auth_gss: remove the BUG() from gss_pipe_downcall()
Instead return a generic error via gss_msg->msg.errno. None of the
errors returned by gss_fill_context() should necessarily trigger a
kernel panic.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This is a pre-patch for the RPC multipath code. It sets up the storage in
struct rpc_clnt for the multipath code.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
This patch replaces uses of blkcipher with skcipher and the long
obsolete hash interface with either shash (for non-SG users) and
ahash.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The principal name on a gss cred is used to setup the NFSv4.0 callback,
which has to have a client principal name to authenticate to.
That code wants the name to be in the form servicetype@hostname.
rpc.svcgssd passes down such names (and passes down no principal name at
all in the case the principal isn't a service principal).
gss-proxy always passes down the principal name, and passes it down in
the form servicetype/hostname@REALM. So we've been munging the name
gss-proxy passes down into the format the NFSv4.0 callback code expects,
or throwing away the name if we can't.
Since the introduction of the MACH_CRED enforcement in NFSv4.1, we've
also been using the principal name to verify that certain operations are
done as the same principal as was used on the original EXCHANGE_ID call.
For that application, the original name passed down by gss-proxy is also
useful.
Lack of that name in some cases was causing some kerberized NFSv4.1
mount failures in an Active Directory environment.
This fix only works in the gss-proxy case. The fix for legacy
rpc.svcgssd would be more involved, and rpc.svcgssd already has other
problems in the AD case.
Reported-and-tested-by: James Ralston <ralston@pobox.com>
Acked-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The gss_key_timeout() function causes a harmless warning in some
configurations, e.g. ARM imx_v6_v7_defconfig with gcc-5.2, if the
compiler cannot figure out the state of the 'expire' variable across
an rcu_read_unlock():
net/sunrpc/auth_gss/auth_gss.c: In function 'gss_key_timeout':
net/sunrpc/auth_gss/auth_gss.c:1422:211: warning: 'expire' may be used uninitialized in this function [-Wmaybe-uninitialized]
To avoid this warning without adding a bogus initialization, this
rewrites the function so the comparison is done inside of the
critical section. As a side-effect, it also becomes slightly
easier to understand because the implementation now more closely
resembles the comment above it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: c5e6aecd03 ("sunrpc: fix RCU handling of gc_ctx field")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
In an environment where the KDC is running Active Directory, the
exported composite name field returned in the context could be large
enough to span a page boundary. Attaching a scratch buffer to the
decoding xdr_stream helps deal with those cases.
The case where we saw this was actually due to behavior that's been
fixed in newer gss-proxy versions, but we're fixing it here too.
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
If we call groups_alloc() with invalid values then it's might lead to
memory corruption. For example, with a negative value then we might not
allocate enough for sizeof(struct group_info).
(We're doing this in the caller for consistency with other callers of
groups_alloc(). The other alternative might be to move the check out of
all the callers into groups_alloc().)
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Our UC-KLEE tool found a kernel memory leak of 512 bytes (on x86_64) for
each call to gssp_accept_sec_context_upcall()
(net/sunrpc/auth_gss/gss_rpc_upcall.c). Since it appears that this call
can be triggered by remote connections (at least, from a cursory a
glance at the call chain), it may be exploitable to cause kernel memory
exhaustion. We found the bug in kernel 3.16.3, but it appears to date
back to commit 9dfd87da1a (2013-08-20).
The gssp_accept_sec_context_upcall() function performs a pair of calls
to gssp_alloc_receive_pages() and gssp_free_receive_pages(). The first
allocates memory for arg->pages. The second then frees the pages
pointed to by the arg->pages array, but not the array itself.
Reported-by: David A. Ramos <daramos@stanford.edu>
Fixes: 9dfd87da1a ("rpc: fix huge kmalloc's in gss-proxy”)
Signed-off-by: David A. Ramos <daramos@stanford.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
It's always set to whatever CONFIG_SUNRPC_DEBUG is, so just use that.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Highlights include:
- Stable fix for a bug in nfs3_list_one_acl()
- Speed up NFS path walks by supporting LOOKUP_RCU
- More read/write code cleanups
- pNFS fixes for layout return on close
- Fixes for the RCU handling in the rpcsec_gss code
- More NFS/RDMA fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJT65zoAAoJEGcL54qWCgDyvq8QAJ+OKuC5dpngrZ13i4ZJIcK1
TJSkWCr44FhYPlrmkLCntsGX6C0376oFEtJ5uqloqK0+/QtvwRNVSQMKaJopKIVY
mR4En0WwpigxVQdW2lgto6bfOhzMVO+llVdmicEVrU8eeSThATxGNv7rxRzWorvL
RX3TwBkWSc0kLtPi66VRFQ1z+gg5I0kngyyhsKnLOaHHtpTYP2JDZlRPRkokXPUg
nmNedmC3JrFFkarroFIfYr54Qit2GW/eI2zVhOwHGCb45j4b2wntZ6wr7LpUdv3A
OGDBzw59cTpcx3Hij9CFvLYVV9IJJHBNd2MJqdQRtgWFfs+aTkZdk4uilUJCIzZh
f4BujQAlm/4X1HbPxsSvkCRKga7mesGM7e0sBDPHC1vu0mSaY1cakcj2kQLTpbQ7
gqa1cR3pZ+4shCq37cLwWU0w1yElYe1c4otjSCttPCrAjXbXJZSFzYnHm8DwKROR
t+yEDRL5BIXPu1nEtSnD2+xTQ3vUIYXooZWEmqLKgRtBTtPmgSn9Vd8P1OQXmMNo
VJyFXyjNx5WH06Wbc/jLzQ1/cyhuPmJWWyWMJlVROyv+FXk9DJUFBZuTkpMrIPcF
NlBXLV1GnA7PzMD9Xt9bwqteERZl6fOUDJLWS9P74kTk5c2kD+m+GaqC/rBTKKXc
ivr2s7aIDV48jhnwBSVL
=KE07
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
- stable fix for a bug in nfs3_list_one_acl()
- speed up NFS path walks by supporting LOOKUP_RCU
- more read/write code cleanups
- pNFS fixes for layout return on close
- fixes for the RCU handling in the rpcsec_gss code
- more NFS/RDMA fixes"
* tag 'nfs-for-3.17-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
nfs: reject changes to resvport and sharecache during remount
NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error
SUNRPC: remove all refcounting of groupinfo from rpcauth_lookupcred
NFS: fix two problems in lookup_revalidate in RCU-walk
NFS: allow lockless access to access_cache
NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU
NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU
NFS: support RCU_WALK in nfs_permission()
sunrpc/auth: allow lockless (rcu) lookup of credential cache.
NFS: prepare for RCU-walk support but pushing tests later in code.
NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used.
NFS: add checks for returned value of try_module_get()
nfs: clear_request_commit while holding i_lock
pnfs: add pnfs_put_lseg_async
pnfs: find swapped pages on pnfs commit lists too
nfs: fix comment and add warn_on for PG_INODE_REF
nfs: check wait_on_bit_lock err in page_group_lock
sunrpc: remove "ec" argument from encrypt_v2 operation
sunrpc: clean up sparse endianness warnings in gss_krb5_wrap.c
sunrpc: clean up sparse endianness warnings in gss_krb5_seal.c
...
Fix the endianness handling in gss_wrap_kerberos_v1 and drop the memset
call there in favor of setting the filler bytes directly.
In gss_wrap_kerberos_v2, get rid of the "ec" variable which is always
zero, and drop the endianness conversion of 0. Sparse handles 0 as a
special case, so it's not necessary.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Use u16 pointer in setup_token and setup_token_v2. None of the fields
are actually handled as __be16, so this simplifies the code a bit. Also
get rid of some unneeded pointer increments.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
The handling of the gc_ctx pointer only seems to be partially RCU-safe.
The assignment and freeing are done using RCU, but many places in the
code seem to dereference that pointer without proper RCU safeguards.
Fix them to use rcu_dereference and to rcu_read_lock/unlock, and to
properly handle the case where the pointer is NULL.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
...and add an new rpc_auth function to call it when it exists. This
is only applicable for AUTH_GSS mechanisms, so we only specify this
for those sorts of credentials.
Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
If rpc.gssd sends us an acceptor name string trailing the context token,
stash it as part of the context.
Signed-off-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
rq_usedeferral and rq_splice_ok are used as 0 and 1, just defined to bool.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Highlights include:
- Massive cleanup of the NFS read/write code by Anna and Dros
- Support multiple NFS read/write requests per page in order to deal with
non-page aligned pNFS striping. Also cleans up the r/wsize < page size
code nicely.
- stable fix for ensuring inode is declared uptodate only after all the
attributes have been checked.
- stable fix for a kernel Oops when remounting
- NFS over RDMA client fixes
- move the pNFS files layout driver into its own subdirectory
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTl3pmAAoJEGcL54qWCgDyraIP/08ZbbDowVTP9572bxl+VR2i
zNbrflBtl1R05D4Imi/IEySK0w6xj1CLsncNpXAT2bxTlyKPW70tpiiPlRKMPuO8
JW+iPiepR2t0mol6MEd46yuV8btXVk8I+7IYjPXANiMJG8O5dJzNQ8NiCQOERBNt
FQ7rzTCFO0ESGXnT6vYrT4I0bwqYVklBiJRTT4PQVzhhhDq9qUdq21BlQjQJFXP4
9aBLurxKptlHBvE6A2Quja6ObEC0s31CxcijqHIJ+Ue4GbKcFbMG1tgjY7ESE/AD
rqzDeF0jvWHT+frmvFEUUXWqzF1ReZ4x9pfDoOgeG6T9/K6DT91O0yMOgG8jvlbF
8DSATNYGDX5sSjpvaG5JokGG+cGCk9srVDx+itn7HlwzalRwn0PjKtIYwOJ7TJIr
o/j20nOsPrRGF0OqLf9phyocgRrlbMKOzj1IXldHHfAbNkRcISTK08lxvsz96Ddn
zRyDmbsbY6QFXdB3AVSeQmg5R0OOLtzNIcsFPmNdvy5eiy67qU0lsGg8UGNnoz8k
PHN1pcGejkctLhQ32ee3w/W6zkrgpJZcNC9JSoG8Dc3SeXus0c3IgumRknFCmiep
ssN+1jEITAGeS5a2aBxwLQLVI2JAr2lxs5e+R4D5EsQlFkCl6Mrgtzh/aToWTuFl
Qt7l2zI3r3VieKT9u7Bh
=OyXR
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
- massive cleanup of the NFS read/write code by Anna and Dros
- support multiple NFS read/write requests per page in order to deal
with non-page aligned pNFS striping. Also cleans up the r/wsize <
page size code nicely.
- stable fix for ensuring inode is declared uptodate only after all
the attributes have been checked.
- stable fix for a kernel Oops when remounting
- NFS over RDMA client fixes
- move the pNFS files layout driver into its own subdirectory"
* tag 'nfs-for-3.16-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
NFS: populate ->net in mount data when remounting
pnfs: fix lockup caused by pnfs_generic_pg_test
NFSv4.1: Fix typo in dprintk
NFSv4.1: Comment is now wrong and redundant to code
NFS: Use raw_write_seqcount_begin/end int nfs4_reclaim_open_state
xprtrdma: Disconnect on registration failure
xprtrdma: Remove BUG_ON() call sites
xprtrdma: Avoid deadlock when credit window is reset
SUNRPC: Move congestion window constants to header file
xprtrdma: Reset connection timeout after successful reconnect
xprtrdma: Use macros for reconnection timeout constants
xprtrdma: Allocate missing pagelist
xprtrdma: Remove Tavor MTU setting
xprtrdma: Ensure ia->ri_id->qp is not NULL when reconnecting
xprtrdma: Reduce the number of hardway buffer allocations
xprtrdma: Limit work done by completion handler
xprtrmda: Reduce calls to ib_poll_cq() in completion handlers
xprtrmda: Reduce lock contention in completion handlers
xprtrdma: Split the completion queue
xprtrdma: Make rpcrdma_ep_destroy() return void
...
Pull nfsd updates from Bruce Fields:
"The largest piece is a long-overdue rewrite of the xdr code to remove
some annoying limitations: for example, there was no way to return
ACLs larger than 4K, and readdir results were returned only in 4k
chunks, limiting performance on large directories.
Also:
- part of Neil Brown's work to make NFS work reliably over the
loopback interface (so client and server can run on the same
machine without deadlocks). The rest of it is coming through
other trees.
- cleanup and bugfixes for some of the server RDMA code, from
Steve Wise.
- Various cleanup of NFSv4 state code in preparation for an
overhaul of the locking, from Jeff, Trond, and Benny.
- smaller bugfixes and cleanup from Christoph Hellwig and
Kinglong Mee.
Thanks to everyone!
This summer looks likely to be busier than usual for knfsd. Hopefully
we won't break it too badly; testing definitely welcomed"
* 'for-3.16' of git://linux-nfs.org/~bfields/linux: (100 commits)
nfsd4: fix FREE_STATEID lockowner leak
svcrdma: Fence LOCAL_INV work requests
svcrdma: refactor marshalling logic
nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry
nfs4: remove unused CHANGE_SECURITY_LABEL
nfsd4: kill READ64
nfsd4: kill READ32
nfsd4: simplify server xdr->next_page use
nfsd4: hash deleg stateid only on successful nfs4_set_delegation
nfsd4: rename recall_lock to state_lock
nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound
nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open
nfsd4: use recall_lock for delegation hashing
nfsd: fix laundromat next-run-time calculation
nfsd: make nfsd4_encode_fattr static
SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING
nfsd: remove unused function nfsd_read_file
nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer
NFSD: Error out when getting more than one fsloc/secinfo/uuid
NFSD: Using type of uint32_t for ex_nflavors instead of int
...
RPC_MAX_AUTH_SIZE is scattered around several places. Better to set it
once in the auth code, where this kind of estimate should be made. And
while we're at it we can leave it zero when we're not using krb5i or
krb5p.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We're not taking a reference in the case where _gss_mech_get_by_pseudoflavor
loops without finding the correct rpcsec_gss flavour, so why are we
releasing it?
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
In gss_alloc_msg(), if the call to gss_encode_v1_msg() fails, we
want to release the reference to the pipe_version that was obtained
earlier in the function.
Fixes: 9d3a2260f0 (SUNRPC: Fix buffer overflow checking in...)
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Fix a race in which the RPC client is shutting down while the
gss daemon is processing a downcall. If the RPC client manages to
shut down before the gss daemon is done, then the struct gss_auth
used in gss_release_msg() may have already been freed.
Link: http://lkml.kernel.org/r/1392494917.71728.YahooMailNeo@web140002.mail.bf1.yahoo.com
Reported-by: John <da_audiophile@yahoo.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
An infinite loop is caused when nfs4_establish_lease() fails
with -EACCES. This causes nfs4_handle_reclaim_lease_error()
to sleep a bit and resets the NFS4CLNT_LEASE_EXPIRED bit.
This in turn causes nfs4_state_manager() to try and
reestablished the lease, again, again, again...
The problem is a valid RPCSEC_GSS client is being created when
rpc.gssd is not running.
Link: http://lkml.kernel.org/r/1392066375-16502-1-git-send-email-steved@redhat.com
Fixes: 0ea9de0ea6 (sunrpc: turn warn_gssd() log message into a dprintk())
Reported-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Highlights:
- Fix several races in nfs_revalidate_mapping
- NFSv4.1 slot leakage in the pNFS files driver
- Stable fix for a slot leak in nfs40_sequence_done
- Don't reject NFSv4 servers that support ACLs with only ALLOW aces
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJS7Bb+AAoJEGcL54qWCgDyDuQP/17nKR5e6MLhixcAbvlcH+pN
8CGolAM3HmRXDWUW/PkBH3UguG8Tzx1Ex26vIxipPeTSwZabf6194Twj6L97DEGZ
2SouD158BW1TkAbhEN/alKB/4ZCPos05iXjZkrL7MRff+8FD0UvWR2pBT1F2jQdY
ZftG76Q72qhZHfH07ZMxM/v4Oy2Ge98RDD35gfuuqMSjHpmN9tiB55PeheW33LVY
fu6I/JEwmlJpgy2qUcDv7v0V4mDpjC7XbcjjHpMHL8zp/C5Rx/rdgt9OQPlwmjdV
FD8MWNXLc5TWxIouLDFPVUv3WZPjyu449QHS9Wc95fSqsHcdl4j4SwLAoSvUIdHt
vDI5PtWhw3WAezbtiuCQnT0xcoIOn5bLjOVP13taDcV9vlZLcFlyOpZ5gVE4/Yju
zm4sCW2+imDc74ERGa4fukF6QhzzAVmD8RlCJwuNzwCfXiZ36+xSanMYiPoUiwLL
OVNgymrm0fe7GVFQKWN2D+Vr68OQEmARO+KfA3UzP5rQV+9CU8zSHjbcoRWZ59QG
VahOS5WDLQSrMp8W37yAHH9IiAWveAAKJJTHlOniRqH90QYPgyW18fTo7YcpW313
AQGFgr/1n4t27MWRLu5rdoN5v8+kwNi0UV6oboNIPoP1v15NkEMvc7HKFj5M883R
qEYfe5wqN/eRNj68NT/+
=B7f0
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights:
- Fix several races in nfs_revalidate_mapping
- NFSv4.1 slot leakage in the pNFS files driver
- Stable fix for a slot leak in nfs40_sequence_done
- Don't reject NFSv4 servers that support ACLs with only ALLOW aces"
* tag 'nfs-for-3.14-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: initialize the ACL support bits to zero.
NFSv4.1: Cleanup
NFSv4.1: Clean up nfs41_sequence_done
NFSv4: Fix a slot leak in nfs40_sequence_done
NFSv4.1 free slot before resending I/O to MDS
nfs: add memory barriers around NFS_INO_INVALID_DATA and NFS_INO_INVALIDATING
NFS: Fix races in nfs_revalidate_mapping
sunrpc: turn warn_gssd() log message into a dprintk()
NFS: fix the handling of NFS_INO_INVALID_DATA flag in nfs_revalidate_mapping
nfs: handle servers that support only ALLOW ACE type.
Pull nfsd updates from Bruce Fields:
- Handle some loose ends from the vfs read delegation support.
(For example nfsd can stop breaking leases on its own in a
fewer places where it can now depend on the vfs to.)
- Make life a little easier for NFSv4-only configurations
(thanks to Kinglong Mee).
- Fix some gss-proxy problems (thanks Jeff Layton).
- miscellaneous bug fixes and cleanup
* 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits)
nfsd: consider CLAIM_FH when handing out delegation
nfsd4: fix delegation-unlink/rename race
nfsd4: delay setting current_fh in open
nfsd4: minor nfs4_setlease cleanup
gss_krb5: use lcm from kernel lib
nfsd4: decrease nfsd4_encode_fattr stack usage
nfsd: fix encode_entryplus_baggage stack usage
nfsd4: simplify xdr encoding of nfsv4 names
nfsd4: encode_rdattr_error cleanup
nfsd4: nfsd4_encode_fattr cleanup
minor svcauth_gss.c cleanup
nfsd4: better VERIFY comment
nfsd4: break only delegations when appropriate
NFSD: Fix a memory leak in nfsd4_create_session
sunrpc: get rid of use_gssp_lock
sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt
sunrpc: don't wait for write before allowing reads from use-gss-proxy file
nfsd: get rid of unused function definition
Define op_iattr for nfsd4_open instead using macro
NFSD: fix compile warning without CONFIG_NFSD_V3
...
The original printk() made sense when the GSSAPI codepaths were called
only when sec=krb5* was explicitly requested. Now however, in many cases
the nfs client will try to acquire GSSAPI credentials by default, even
when it's not requested.
Since we don't have a great mechanism to distinguish between the two
cases, just turn the pr_warn into a dprintk instead. With this change we
can also get rid of the ratelimiting.
We do need to keep the EXPORT_SYMBOL(gssd_running) in place since
auth_gss.ko needs it and sunrpc.ko provides it. We can however,
eliminate the gssd_running call in the nfs code since that's a bit of a
layering violation.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Replace hardcoded lowest common multiple algorithm by the lcm()
function in kernel lib.
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We can achieve the same result with a cmpxchg(). This also fixes a
potential race in use_gss_proxy(). The value of sn->use_gss_proxy could
go from -1 to 1 just after we check it in use_gss_proxy() but before we
acquire the spinlock. The procfile write would end up returning success
but the value would flip to 0 soon afterward. With this method we not
only avoid locking but the first "setter" always wins.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
An nfsd thread can call use_gss_proxy and find it set to '1' but find
gssp_clnt still NULL, so that when it attempts the upcall the result
will be an unnecessary -EIO.
So, ensure that gssp_clnt is created first, and set the use_gss_proxy
variable only if that succeeds.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
It doesn't make much sense to make reads from this procfile hang. As
far as I can tell, only gssproxy itself will open this file and it
never reads from it. Change it to just give the present setting of
sn->use_gss_proxy without waiting for anything.
Note that we do not want to call use_gss_proxy() in this codepath
since an inopportune read of this file could cause it to be disabled
prematurely.
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Now that we have a more reliable method to tell if gssd is running, we
can replace the sn->gssd_running flag with a function that will query to
see if it's up and running.
There's also no need to attempt an upcall that we know will fail, so
just return -EACCES if gssd isn't running. Finally, fix the warn_gss()
message not to claim that that the upcall timed out since we don't
necesarily perform one now when gssd isn't running, and remove the
extraneous newline from the message.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
- Stable fix for a NFSv4.1 delegation and state recovery deadlock
- Stable fix for a loop on irrecoverable errors when returning delegations
- Fix a 3-way deadlock between layoutreturn, open, and state recovery
- Update the MAINTAINERS file with contact information for Trond Myklebust
- Close needs to handle NFS4ERR_ADMIN_REVOKED
- Enabling v4.2 should not recompile nfsd and lockd
- Fix a couple of compile warnings
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJSoLTpAAoJEGcL54qWCgDy2dgQAIKkKAXccg3OG2b1SxJmiaja
PcrovNmgg3HvYQ7clUMqtrMByiXEpSybl6tAeXYUWE3sS1DISSBVEwO3MoOiASiM
951Ssx+CoyhsHYo5aH83sUIiWFl/YsRhpKmSr2cdQd13DQTFbPq896k64Inf6L2/
9fngoqOD7FunQHn8AiVPoDOQzObB0OuKhYCwuwLt47oPiwgmm12JQNCDxU1i4sxb
lkGUBLkPMs6D5IyI8XHaMyX3+8MvmPiIsjIKaNJRdhkuX/k7ollucTJXyvyEQKK0
PhBIWyUULmKcAXYwCfHf9UoyGZFvmj47YggyKcBd26OZUEFekcWrULfym46F1xak
EcO6D4mlTy5i5W0RBqYCj1oGud57rixZBmhLTbeq6sSJaiqBfGEs225Q17H7rsEB
YIghHiEFNnBmVWELhHxbJHQoY6HOugmZOuc0dxopaikN/7to8gnYoVyTIVlMfe/t
UNXZoer6GOOohJGtZ7s7v4Al7EzvwnVnBCBklEAKFJ7Ca2LEmq+b58oQW3nJ1mPn
y4TnihxYXsSEbqy+Lds9rumRhJLG1oVTpwficAm7N3HdK3abzCIPEt6iOHoCmXQz
J1B4gmwOKsDqVlCSpBsnc3ZiBlSJGOn6MmVQUCNFpzv/DetWn/BxEUPE8cNm8DaI
WioD0grC0/9bR8oD1m+w
=UZ51
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Stable fix for a NFSv4.1 delegation and state recovery deadlock
- Stable fix for a loop on irrecoverable errors when returning
delegations
- Fix a 3-way deadlock between layoutreturn, open, and state recovery
- Update the MAINTAINERS file with contact information for Trond
Myklebust
- Close needs to handle NFS4ERR_ADMIN_REVOKED
- Enabling v4.2 should not recompile nfsd and lockd
- Fix a couple of compile warnings
* tag 'nfs-for-3.13-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
nfs: fix do_div() warning by instead using sector_div()
MAINTAINERS: Update contact information for Trond Myklebust
NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery
SUNRPC: do not fail gss proc NULL calls with EACCES
NFSv4: close needs to handle NFS4ERR_ADMIN_REVOKED
NFSv4: Update list of irrecoverable errors on DELEGRETURN
NFSv4 wait on recovery for async session errors
NFS: Fix a warning in nfs_setsecurity
NFS: Enabling v4.2 should not recompile nfsd and lockd
Otherwise RPCSEC_GSS_DESTROY messages are not sent.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Pull nfsd changes from Bruce Fields:
"This includes miscellaneous bugfixes and cleanup and a performance fix
for write-heavy NFSv4 workloads.
(The most significant nfsd-relevant change this time is actually in
the delegation patches that went through Viro, fixing a long-standing
bug that can cause NFSv4 clients to miss updates made by non-nfs users
of the filesystem. Those enable some followup nfsd patches which I
have queued locally, but those can wait till 3.14)"
* 'nfsd-next' of git://linux-nfs.org/~bfields/linux: (24 commits)
nfsd: export proper maximum file size to the client
nfsd4: improve write performance with better sendspace reservations
svcrpc: remove an unnecessary assignment
sunrpc: comment typo fix
Revert "nfsd: remove_stid can be incorporated into nfs4_put_delegation"
nfsd4: fix discarded security labels on setattr
NFSD: Add support for NFS v4.2 operation checking
nfsd4: nfsd_shutdown_net needs state lock
NFSD: Combine decode operations for v4 and v4.1
nfsd: -EINVAL on invalid anonuid/gid instead of silent failure
nfsd: return better errors to exportfs
nfsd: fh_update should error out in unexpected cases
nfsd4: need to destroy revoked delegations in destroy_client
nfsd: no need to unhash_stid before free
nfsd: remove_stid can be incorporated into nfs4_put_delegation
nfsd: nfs4_open_delegation needs to remove_stid rather than unhash_stid
nfsd: nfs4_free_stid
nfsd: fix Kconfig syntax
sunrpc: trim off EC bytes in GSSAPI v2 unwrap
gss_krb5: document that we ignore sequence number
...
In gss_encode_v1_msg, it is pointless to BUG() after the overflow has
happened. Replace the existing sprintf()-based code with scnprintf(),
and warn if an overflow is ever triggered.
In gss_encode_v0_msg, replace the runtime BUG_ON() with an appropriate
compile-time BUILD_BUG_ON.
Reported-by: Bruce Fields <bfields@fieldses.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add the missing 'break' to ensure that we don't corrupt a legacy 'v0' type
message by appending the 'v1'.
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
As Bruce points out in RFC 4121, section 4.2.3:
"In Wrap tokens that provide for confidentiality, the first 16 octets
of the Wrap token (the "header", as defined in section 4.2.6), SHALL
be appended to the plaintext data before encryption. Filler octets
MAY be inserted between the plaintext data and the "header.""
...and...
"In Wrap tokens with confidentiality, the EC field SHALL be used to
encode the number of octets in the filler..."
It's possible for the client to stuff different data in that area on a
retransmission, which could make the checksum come out wrong in the DRC
code.
After decrypting the blob, we should trim off any extra count bytes in
addition to the checksum blob.
Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
A couple times recently somebody has noticed that we're ignoring a
sequence number here and wondered whether there's a bug.
In fact, there's not. Thanks to Andy Adamson for pointing out a useful
explanation in rfc 2203. Add comments citing that rfc, and remove
"seqnum" to prevent static checkers complaining about unused variables.
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
For every other problem here we bail out with an error, but here for
some reason we're setting a negative cache entry (with, note, an
undefined expiry).
It seems simplest just to bail out in the same way as we do in other
cases.
Cc: Simo Sorce <simo@redhat.com>
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We depend on the xdr decoder to set this pointer, but if we error out
before we decode this piece it could be left NULL.
I think this is probably tough to hit without a buggy gss-proxy.
Reported-by: Andi Kleen <andi@firstfloor.org>
Cc: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
- Fix a few credential reference leaks resulting from the SP4_MACH_CRED
NFSv4.1 state protection code.
- Fix the SUNRPC bloatometer footprint: convert a 256K hashtable into the
intended 64 byte structure.
- Fix a long standing XDR issue with FREE_STATEID
- Fix a potential WARN_ON spamming issue
- Fix a missing dprintk() kuid conversion
New features:
- Enable the NFSv4.1 state protection support for the WRITE and COMMIT
operations.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSMiO+AAoJEGcL54qWCgDyuwEQALNAMpcRhASpqrRSuX94aKn3
ATENr87ov2FCXcTP/OBjdlcryyjp+0e5JBW5T0nHn90Uylz4p/87eOILlqIq4ax2
4QldKAuHdk5gLwiX5ebWpDtlwjTwyth1PRD7iPHT8lvIlO0IT7S/VDaa/04J37PL
Lw1zaTD0cpdRkdTnA12RDJ5oTW0YwmSBb5qJQROjinwa/ALuIZJpoBNCV01lIP2k
VaW0Yd8A+hqtawmxnf3G14r50Ds269AZ5K4hcRjQMEWeetlwfXFSTSjx8dzgsQkx
4VF6wiCSwsKEdrp8csRv+fsHiGRjNfzdSTrQxcJa+ssP6qX0KWHYPdw2jgbozX+2
kUQw2bFgxug+zdNjp+z1daJzw4QAfkjfNBWzt4w7a+8VOnR+/fydJzmka4mlJUKB
IDy8l/KrSCjCHi9VYal27+IQs/bcLAIvASUF14cZ/+ZY9MUsWhYXVPHNLhwTPds2
jFvawh77V6MHg/wA2+D7yHbHmOOmZaH2/Af9v3HKsVhhoLwqr5LO9qfAq63KSxzW
udzmjlSEhlOiJKDMZo9HigjKhU+Ndujr7RqsP6WFjTPa4yn6499cbTy7izze6MPB
JZDlmkInnZAtLDOuHAwxSNuNfBD6Yrzk1PV8Gv2xMEdp41bxgAg//K3WXx2vSGWa
4TQMHjaegAkdHyTK0rJD
=IdGo
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes (part 2) from Trond Myklebust:
"Bugfixes:
- Fix a few credential reference leaks resulting from the
SP4_MACH_CRED NFSv4.1 state protection code.
- Fix the SUNRPC bloatometer footprint: convert a 256K hashtable into
the intended 64 byte structure.
- Fix a long standing XDR issue with FREE_STATEID
- Fix a potential WARN_ON spamming issue
- Fix a missing dprintk() kuid conversion
New features:
- Enable the NFSv4.1 state protection support for the WRITE and
COMMIT operations"
* tag 'nfs-for-3.12-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: No, I did not intend to create a 256KiB hashtable
sunrpc: Add missing kuids conversion for printing
NFSv4.1: sp4_mach_cred: WARN_ON -> WARN_ON_ONCE
NFSv4.1: sp4_mach_cred: no need to ref count creds
NFSv4.1: fix SECINFO* use of put_rpccred
NFSv4.1: sp4_mach_cred: ask for WRITE and COMMIT
NFSv4.1 fix decode_free_stateid
Fix the declaration of the gss_auth_hash_table so that it creates
a 16 bucket hashtable, as I had intended.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Pull nfsd updates from Bruce Fields:
"This was a very quiet cycle! Just a few bugfixes and some cleanup"
* 'nfsd-next' of git://linux-nfs.org/~bfields/linux:
rpc: let xdr layer allocate gssproxy receieve pages
rpc: fix huge kmalloc's in gss-proxy
rpc: comment on linux_cred encoding, treat all as unsigned
rpc: clean up decoding of gssproxy linux creds
svcrpc: remove unused rq_resused
nfsd4: nfsd4_create_clid_dir prints uninitialized data
nfsd4: fix leak of inode reference on delegation failure
Revert "nfsd: nfs4_file_get_access: need to be more careful with O_RDWR"
sunrpc: prepare NFS for 2038
nfsd4: fix setlease error return
nfsd: nfs4_file_get_access: need to be more careful with O_RDWR
Highlights include:
- Fix NFSv4 recovery so that it doesn't recover lost locks in cases such as
lease loss due to a network partition, where doing so may result in data
corruption. Add a kernel parameter to control choice of legacy behaviour
or not.
- Performance improvements when 2 processes are writing to the same file.
- Flush data to disk when an RPCSEC_GSS session timeout is imminent.
- Implement NFSv4.1 SP4_MACH_CRED state protection to prevent other
NFS clients from being able to manipulate our lease and file lockingr
state.
- Allow sharing of RPCSEC_GSS caches between different rpc clients
- Fix the broken NFSv4 security auto-negotiation between client and server
- Fix rmdir() to wait for outstanding sillyrename unlinks to complete
- Add a tracepoint framework for debugging NFSv4 state recovery issues.
- Add tracing to the generic NFS layer.
- Add tracing for the SUNRPC socket connection state.
- Clean up the rpc_pipefs mount/umount event management.
- Merge more patches from Chuck in preparation for NFSv4 migration support.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIcBAABAgAGBQJSLelVAAoJEGcL54qWCgDyo2IQAKOfRJyZVnf4ipxi3xLNl1QF
w/70DVSIF1S1djWN7G3vgkxj/R8KCvJ8CcvkAD2BEgRDeZJ9TtyKAdM/jYLZ+W05
7k2QKk8fkwZmc1Y2qDqFwKHzP5ZgP5L2nGx7FNhi/99wEAe47yFG3qd3rUWKrcOf
mnd863zgGDE2Q10slhoq/bywwMJo6tKZNeaIE8kPjgFbBEh/jslpAWr8dSA4QgvJ
nZ8VB5XU8L+XJ0GpHHdjYm9LvQ51DbQ6omOF+0P4fI093azKmf4ZsrjMDWT8+iu3
XkXlnQmKLGTi7yB43hHtn2NiRqwGzCcZ1Amo9PpCFaHUt1RP9cc37UhG1T+x1xWJ
STEKDbvCdQ3FU9FvbgrGEwBR0e8fNS4fZY3ToDBflIcfwre0aWs5RCodZMUD0nUI
4wY5J9NsQR/bL+v8KeUR4V4cXK8YrgL0zB4u4WYzH5Npxr5KD0NEKDNqRPhrB9l2
LLF9Haql8j76Ff0ek6UGFIZjDE0h6Fs71wLBpLj+ZWArOJ7vBuLMBSOVqNpld9+9
f2fEG7qoGF4FGTY4myH/eakMPaWnk9Ol4Ls/svSIapJ9+rePD+a93e/qnmdofIMf
4TuEYk6ERib1qXgaeDRQuCsm2YE1Co5skGMaOsRFWgReE1c12QoJQVst2nMtEKp3
uV2w8LgX18aZOZXJVkCM
=ZuW+
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
- Fix NFSv4 recovery so that it doesn't recover lost locks in cases
such as lease loss due to a network partition, where doing so may
result in data corruption. Add a kernel parameter to control
choice of legacy behaviour or not.
- Performance improvements when 2 processes are writing to the same
file.
- Flush data to disk when an RPCSEC_GSS session timeout is imminent.
- Implement NFSv4.1 SP4_MACH_CRED state protection to prevent other
NFS clients from being able to manipulate our lease and file
locking state.
- Allow sharing of RPCSEC_GSS caches between different rpc clients.
- Fix the broken NFSv4 security auto-negotiation between client and
server.
- Fix rmdir() to wait for outstanding sillyrename unlinks to complete
- Add a tracepoint framework for debugging NFSv4 state recovery
issues.
- Add tracing to the generic NFS layer.
- Add tracing for the SUNRPC socket connection state.
- Clean up the rpc_pipefs mount/umount event management.
- Merge more patches from Chuck in preparation for NFSv4 migration
support"
* tag 'nfs-for-3.12-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (107 commits)
NFSv4: use mach cred for SECINFO_NO_NAME w/ integrity
NFS: nfs_compare_super shouldn't check the auth flavour unless 'sec=' was set
NFSv4: Allow security autonegotiation for submounts
NFSv4: Disallow security negotiation for lookups when 'sec=' is specified
NFSv4: Fix security auto-negotiation
NFS: Clean up nfs_parse_security_flavors()
NFS: Clean up the auth flavour array mess
NFSv4.1 Use MDS auth flavor for data server connection
NFS: Don't check lock owner compatability unless file is locked (part 2)
NFS: Don't check lock owner compatibility in writes unless file is locked
nfs4: Map NFS4ERR_WRONG_CRED to EPERM
nfs4.1: Add SP4_MACH_CRED write and commit support
nfs4.1: Add SP4_MACH_CRED stateid support
nfs4.1: Add SP4_MACH_CRED secinfo support
nfs4.1: Add SP4_MACH_CRED cleanup support
nfs4.1: Add state protection handler
nfs4.1: Minimal SP4_MACH_CRED implementation
SUNRPC: Replace pointer values with task->tk_pid and rpc_clnt->cl_clid
SUNRPC: Add an identifier for struct rpc_clnt
SUNRPC: Ensure rpc_task->tk_pid is available for tracepoints
...
In theory the linux cred in a gssproxy reply can include up to
NGROUPS_MAX data, 256K of data. In the common case we expect it to be
shorter. So do as the nfsv3 ACL code does and let the xdr code allocate
the pages as they come in, instead of allocating a lot of pages that
won't typically be used.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The reply to a gssproxy can include up to NGROUPS_MAX gid's, which will
take up more than a page. We therefore need to allocate an array of
pages to hold the reply instead of trying to allocate a single huge
buffer.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The encoding of linux creds is a bit confusing.
Also: I think in practice it doesn't really matter whether we treat any
of these things as signed or unsigned, but unsigned seems more
straightforward: uid_t/gid_t are unsigned and it simplifies the ngroups
overflow check.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We can use the normal coding infrastructure here.
Two minor behavior changes:
- we're assuming no wasted space at the end of the linux cred.
That seems to match gss-proxy's behavior, and I can't see why
it would need to do differently in the future.
- NGROUPS_MAX check added: note groups_alloc doesn't do this,
this is the caller's responsibility.
Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Most of the time an error from the credops crvalidate function means the
server has sent us a garbage verifier. The gss_validate function is the
exception where there is an -EACCES case if the user GSS_context on the client
has expired.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch provides the RPC layer helper functions to allow NFS to manage
data in the face of expired credentials - such as avoiding buffered WRITEs
and COMMITs when the gss context will expire before the WRITEs are flushed
and COMMITs are sent.
These helper functions enable checking the expiration of an underlying
credential key for a generic rpc credential, e.g. the gss_cred gss context
gc_expiry which for Kerberos is set to the remaining TGT lifetime.
A new rpc_authops key_timeout is only defined for the generic auth.
A new rpc_credops crkey_to_expire is only defined for the generic cred.
A new rpc_credops crkey_timeout is only defined for the gss cred.
Set a credential key expiry watermark, RPC_KEY_EXPIRE_TIMEO set to 240 seconds
as a default and can be set via a module parameter as we need to ensure there
is time for any dirty data to be flushed.
If key_timeout is called on a credential with an underlying credential key that
will expire within watermark seconds, we set the RPC_CRED_KEY_EXPIRE_SOON
flag in the generic_cred acred so that the NFS layer can clean up prior to
key expiration.
Checking a generic credential's underlying credential involves a cred lookup.
To avoid this lookup in the normal case when the underlying credential has
a key that is valid (before the watermark), a notify flag is set in
the generic credential the first time the key_timeout is called. The
generic credential then stops checking the underlying credential key expiry, and
the underlying credential (gss_cred) match routine then checks the key
expiration upon each normal use and sets a flag in the associated generic
credential only when the key expiration is within the watermark.
This in turn signals the generic credential key_timeout to perform the extra
credential lookup thereafter.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure that all struct rpc_clnt for any given socket/rdma channel
share the same RPCSEC_GSS/krb5,krb5i,krb5p caches.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Ensure that if an rpc_clnt owns more than one RPCSEC_GSS-based authentication
mechanism, then those caches will share the same 'gssd' upcall pipe.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If an error condition occurs on rpc_pipefs creation, or the user mounts
rpc_pipefs and then unmounts it, then the dentries in struct gss_auth
need to be reset to NULL so that a second call to gss_pipes_dentries_destroy
doesn't try to free them again.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The clnt->cl_principal is being used exclusively to store the service
target name for RPCSEC_GSS/krb5 callbacks. Replace it with something that
is stored only in the RPCSEC_GSS-specific code.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Optimise away gss_encode_msg: we don't need to look up the pipe
version a second time.
Save the gss target name in struct gss_auth. It is a property of the
auth cache itself, and doesn't really belong in the rpc_client.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The change made to rsc_parse() in
0dc1531aca "svcrpc: store gss mech in
svc_cred" should also have been propagated to the gss-proxy codepath.
This fixes a crash in the gss-proxy case.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Uninitialized stack data was being used as the destination for memcpy's.
Longer term we'll just delete some of this code; all we're doing is
skipping over xdr that we don't care about.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull nfsd changes from Bruce Fields:
"Changes this time include:
- 4.1 enabled on the server by default: the last 4.1-specific issues
I know of are fixed, so we're not going to find the rest of the
bugs without more exposure.
- Experimental support for NFSv4.2 MAC Labeling (to allow running
selinux over NFS), from Dave Quigley.
- Fixes for some delicate cache/upcall races that could cause rare
server hangs; thanks to Neil Brown and Bodo Stroesser for extreme
debugging persistence.
- Fixes for some bugs found at the recent NFS bakeathon, mostly v4
and v4.1-specific, but also a generic bug handling fragmented rpc
calls"
* 'for-3.11' of git://linux-nfs.org/~bfields/linux: (31 commits)
nfsd4: support minorversion 1 by default
nfsd4: allow destroy_session over destroyed session
svcrpc: fix failures to handle -1 uid's
sunrpc: Don't schedule an upcall on a replaced cache entry.
net/sunrpc: xpt_auth_cache should be ignored when expired.
sunrpc/cache: ensure items removed from cache do not have pending upcalls.
sunrpc/cache: use cache_fresh_unlocked consistently and correctly.
sunrpc/cache: remove races with queuing an upcall.
nfsd4: return delegation immediately if lease fails
nfsd4: do not throw away 4.1 lock state on last unlock
nfsd4: delegation-based open reclaims should bypass permissions
svcrpc: don't error out on small tcp fragment
svcrpc: fix handling of too-short rpc's
nfsd4: minor read_buf cleanup
nfsd4: fix decoding of compounds across page boundaries
nfsd4: clean up nfs4_open_delegation
NFSD: Don't give out read delegations on creates
nfsd4: allow client to send no cb_sec flavors
nfsd4: fail attempts to request gss on the backchannel
nfsd4: implement minimal SP4_MACH_CRED
...
Store a pointer to the gss mechanism used in the rq_cred and cl_cred.
This will make it easier to enforce SP4_MACH_CRED, which needs to
compare the mechanism used on the exchange_id with that used on
protected operations.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull nfsd fixes from Bruce Fields:
"A couple minor fixes for the (new to 3.10) gss-proxy code.
And one regression from user-namespace changes. (XBMC clients were
doing something admittedly weird--sending -1 gid's--but something that
we used to allow.)"
* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
svcrpc: fix failures to handle -1 uid's and gid's
svcrpc: implement O_NONBLOCK behavior for use-gss-proxy
svcauth_gss: fix error code in use_gss_proxy()
Somebody noticed LTP was complaining about O_NONBLOCK opens of
/proc/net/rpc/use-gss-proxy succeeding and then a following read
hanging.
I'm not convinced LTP really has any business opening random proc files
and expecting them to behave a certain way. Maybe this isn't really a
bug.
But in any case the O_NONBLOCK behavior could be useful for someone that
wants to test whether gss-proxy is up without waiting.
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This seems to have been overlooked when we did the namespace
conversion. If a container is running a legacy version of rpc.gssd
then it will be disrupted if the global 'pipe_version' is set by a
container running the new version of rpc.gssd.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Recent changes to the NFS security flavour negotiation mean that
we have a stronger dependency on rpc.gssd. If the latter is not
running, because the user failed to start it, then we time out
and mark the container as not having an instance. We then
use that information to time out faster the next time.
If, on the other hand, the rpc.gssd successfully binds to an rpc_pipe,
then we mark the container as having an rpc.gssd instance.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If wait_event_interruptible_timeout() is successful, it returns
the number of seconds remaining until the timeout. In that
case, we should be retrying the upcall.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This should return zero on success and -EBUSY on error so the type
needs to be int instead of bool.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull nfsd fixes from Bruce Fields:
"Small fixes for two bugs and two warnings"
* 'for-3.10' of git://linux-nfs.org/~bfields/linux:
nfsd: fix oops when legacy_recdir_name_error is passed a -ENOENT error
SUNRPC: fix decoding of optional gss-proxy xdr fields
SUNRPC: Refactor gssx_dec_option_array() to kill uninitialized warning
nfsd4: don't allow owner override on 4.1 CLAIM_FH opens
- Ensure that we match the 'sec=' mount flavour against the server list
- Fix the NFSv4 byte range locking in the presence of delegations
- Ensure that we conform to the NFSv4.1 spec w.r.t. freeing lock stateids
- Fix a pNFS data server connection race
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJRit1yAAoJEGcL54qWCgDyD9EQAKgb37dXhGt7OXBRBP4EY/T8
xJZ2tmdDZ6etLFJVftqCv05hBvyfilPLK0E9zg/zW/kvkKxYQ/fykvpzBR/+Q7KF
quOmjDHLhDTXBnXzPg1HEoeTaXI2/a8CdjpxxEkthD4+FaKlyCXM+EFtA9orT9ZI
oM+aNaqEzTjoQyryTFMcHxAvsrqjnZBa0MT6Fh45HaLaijV7CdDWoj6gjy6Lc3Al
4wHeT8QrZTp/NfIN16uykFZjeWwul4N9upu+CI2V8ZDMEit6JDYX4sl5tB41PzYW
audDBcu0waSqoVQ2mJ5OHoYGZf0wopMUFaAst+tn0pQvwWUfTjD8XtO8uOgeMNoz
2S+XxUC2qhSMszwNBVSmwe2LtSAyHiw32Md4hqkLYDH2c7tk8bJPKDXZJACBzJS7
O1aMmOgWar8+nmzvmXFeU804SxBykV1V8UgtXWp5IwC36V0HAYnM5xtHwXBR7HWe
lnuVHVdux7ySeAyrs2aMdKk7SAw5OC//WW8qoEF5USDEIljeoBzA+IYu9n91Hg2b
ufnsyxumGJ6dZ0iU2nJVoLagRaZcm6kOhnxcegMpb9IH2+RLCQNef09lj2iklm2j
mJA4o2lkVEHOswg/NwKn/I4ho8tbNNb8v//S5KiqrYhiiqZhOzu3RRtFeZi91iac
P/g+hPzfuGnmwcoCEUSa
=5zpc
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull more NFS client bugfixes from Trond Myklebust:
- Ensure that we match the 'sec=' mount flavour against the server list
- Fix the NFSv4 byte range locking in the presence of delegations
- Ensure that we conform to the NFSv4.1 spec w.r.t. freeing lock
stateids
- Fix a pNFS data server connection race
* tag 'nfs-for-3.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS4.1 Fix data server connection race
NFSv3: match sec= flavor against server list
NFSv4.1: Ensure that we free the lock stateid on the server
NFSv4: Convert nfs41_free_stateid to use an asynchronous RPC call
SUNRPC: Don't spam syslog with "Pseudoflavor not found" messages
NFSv4.x: Fix handling of partially delegated locks
The current code works, but sort of by accident: it obviously didn't
intend the error return to be interpreted as "true".
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
net/sunrpc/auth_gss/gss_rpc_xdr.c: In function ‘gssx_dec_option_array’:
net/sunrpc/auth_gss/gss_rpc_xdr.c:258: warning: ‘creds’ may be used uninitialized in this function
Return early if count is zero, to make it clearer to the compiler (and the
casual reviewer) that no more processing is done.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull nfsd changes from J Bruce Fields:
"Highlights include:
- Some more DRC cleanup and performance work from Jeff Layton
- A gss-proxy upcall from Simo Sorce: currently krb5 mounts to the
server using credentials from Active Directory often fail due to
limitations of the svcgssd upcall interface. This replacement
lifts those limitations. The existing upcall is still supported
for backwards compatibility.
- More NFSv4.1 support: at this point, if a user with a current
client who upgrades from 4.0 to 4.1 should see no regressions. In
theory we do everything a 4.1 server is required to do. Patches
for a couple minor exceptions are ready for 3.11, and with those
and some more testing I'd like to turn 4.1 on by default in 3.11."
Fix up semantic conflict as per Stephen Rothwell and linux-next:
Commit 030d794bf4 ("SUNRPC: Use gssproxy upcall for server RPCGSS
authentication") adds two new users of "PDE(inode)->data", but we're
supposed to use "PDE_DATA(inode)" instead since commit d9dda78bad
("procfs: new helper - PDE_DATA(inode)").
The old PDE() macro is no longer available since commit c30480b92c
("proc: Make the PROC_I() and PDE() macros internal to procfs")
* 'for-3.10' of git://linux-nfs.org/~bfields/linux: (60 commits)
NFSD: SECINFO doesn't handle unsupported pseudoflavors correctly
NFSD: Simplify GSS flavor encoding in nfsd4_do_encode_secinfo()
nfsd: make symbol nfsd_reply_cache_shrinker static
svcauth_gss: fix error return code in rsc_parse()
nfsd4: don't remap EISDIR errors in rename
svcrpc: fix gss-proxy to respect user namespaces
SUNRPC: gssp_procedures[] can be static
SUNRPC: define {create,destroy}_use_gss_proxy_proc_entry in !PROC case
nfsd4: better error return to indicate SSV non-support
nfsd: fix EXDEV checking in rename
SUNRPC: Use gssproxy upcall for server RPCGSS authentication.
SUNRPC: Add RPC based upcall mechanism for RPCGSS auth
SUNRPC: conditionally return endtime from import_sec_context
SUNRPC: allow disabling idle timeout
SUNRPC: attempt AF_LOCAL connect on setup
nfsd: Decode and send 64bit time values
nfsd4: put_client_renew_locked can be static
nfsd4: remove unused macro
nfsd4: remove some useless code
nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED
...
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
- NLM: stable fix for NFSv2/v3 blocking locks
- NFSv4.x: stable fixes for the delegation recall error handling code
- NFSv4.x: Security flavour negotiation fixes and cleanups by Chuck Lever
- SUNRPC: A number of RPCSEC_GSS fixes and cleanups also from Chuck
- NFSv4.x assorted state management and reboot recovery bugfixes
- NFSv4.1: In cases where we have already looked up a file, and hold a
valid filehandle, use the new open-by-filehandle operation instead of
opening by name.
- Allow the NFSv4.1 callback thread to freeze
- NFSv4.x: ensure that file unlock waits for readahead to complete
- NFSv4.1: ensure that the RPC layer doesn't override the NFS session
table size negotiation by limiting the number of slots.
- NFSv4.x: Fix SETATTR spec compatibility issues
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQIcBAABAgAGBQJRfu+cAAoJEGcL54qWCgDylxkP/24cXOLHMKMYnIab0cQYIW2m
SQgADGE+MqgTVlWjGVWublVMDY1R51iINsksAjxMtXYt50FdBJEqV2uxIGi4VnbR
nR9eppqQ6vk5e6r5+aZyVmWKoLFnJ4MBF6OpPUZB5mf8iH/fiixmSYLvseopPdDj
bjHwCxg+xEgew5EhQF/xqkEfkAp2NN84xUksTWb9uDIW2c3SJweY/ZVR2Zsqpugm
oqYVtrSLvNKqINQG8OP10s+mRWULwoqapF+kEHlxNbRy26C0zlbXPaneSgYzqHsY
OyRkAT7uJJqStYlqdW7k+DhyNMB+T33WAGJpWQlfJGYk5d/n0rtBJDVo0hfhCSQr
VkOXiO9J08NMbelCu4+0CJii7h5GCaqpuJEEmNL6AlF/TJVkIQJuRaG2+WDmEtO2
oYd4UfXlAbUuts1SW7u/yyN/yrjVTm1tZYRBqn2VJdqh1s8dMxEWPct2Yn314mpS
ODAnbDkEhtWlc9cloSRnwKec5WcxMZb19IJeK9ZvHm7PfIu/QHtj6Ren8s1//bZI
OMQxC/Vf/wcjMdNtr7QdMNxWG1aK8DL9mYP5XwCrkZ560LIrtxmhyqYeoGAfgO5u
+K/gKmQwjsaPhEa8jbP2/wI0II9yKPWj/fVwqhbhqaBUx5GA2iAKcdpPP6JAMAti
+PXkLTtkyrIgSNwzl63S
=Hgot
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes and cleanups from Trond Myklebust:
- NLM: stable fix for NFSv2/v3 blocking locks
- NFSv4.x: stable fixes for the delegation recall error handling code
- NFSv4.x: Security flavour negotiation fixes and cleanups by Chuck
Lever
- SUNRPC: A number of RPCSEC_GSS fixes and cleanups also from Chuck
- NFSv4.x assorted state management and reboot recovery bugfixes
- NFSv4.1: In cases where we have already looked up a file, and hold a
valid filehandle, use the new open-by-filehandle operation instead of
opening by name.
- Allow the NFSv4.1 callback thread to freeze
- NFSv4.x: ensure that file unlock waits for readahead to complete
- NFSv4.1: ensure that the RPC layer doesn't override the NFS session
table size negotiation by limiting the number of slots.
- NFSv4.x: Fix SETATTR spec compatibility issues
* tag 'nfs-for-3.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (67 commits)
NFSv4: Warn once about servers that incorrectly apply open mode to setattr
NFSv4: Servers should only check SETATTR stateid open mode on size change
NFSv4: Don't recheck permissions on open in case of recovery cached open
NFSv4.1: Don't do a delegated open for NFS4_OPEN_CLAIM_DELEG_CUR_FH modes
NFSv4.1: Use the more efficient open_noattr call for open-by-filehandle
NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
NFSv4: Ensure that we clear the NFS_OPEN_STATE flag when appropriate
LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot
NFSv4: Ensure the LOCK call cannot use the delegation stateid
NFSv4: Use the open stateid if the delegation has the wrong mode
nfs: Send atime and mtime as a 64bit value
NFSv4: Record the OPEN create mode used in the nfs4_opendata structure
NFSv4.1: Set the RPC_CLNT_CREATE_INFINITE_SLOTS flag for NFSv4.1 transports
SUNRPC: Allow rpc_create() to request that TCP slots be unlimited
SUNRPC: Fix a livelock problem in the xprt->backlog queue
NFSv4: Fix handling of revoked delegations by setattr
NFSv4 release the sequence id in the return on close case
nfs: remove unnecessary check for NULL inode->i_flock from nfs_delegation_claim_locks
NFS: Ensure that NFS file unlock waits for readahead to complete
NFS: Add functionality to allow waiting on all outstanding reads to complete
...
Use preferable function name which implies using a pseudo-random
number generator.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Note conflict: Chuck's patches modified (and made static)
gss_mech_get_by_OID, which is still needed by gss-proxy patches.
The conflict resolution is a bit minimal; we may want some more cleanup.
The main advantge of this new upcall mechanism is that it can handle
big tickets as seen in Kerberos implementations where tickets carry
authorization data like the MS-PAC buffer with AD or the Posix Authorization
Data being discussed in IETF on the krbwg working group.
The Gssproxy program is used to perform the accept_sec_context call on the
kernel's behalf. The code is changed to also pass the input buffer straight
to upcall mechanism to avoid allocating and copying many pages as tokens can
be as big (potentially more in future) as 64KiB.
Signed-off-by: Simo Sorce <simo@redhat.com>
[bfields: containerization, negotiation api]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch implements a sunrpc client to use the services of the gssproxy
userspace daemon.
In particular it allows to perform calls in user space using an RPC
call instead of custom hand-coded upcall/downcall messages.
Currently only accept_sec_context is implemented as that is all is needed for
the server case.
File server modules like NFS and CIFS can use full gssapi services this way,
once init_sec_context is also implemented.
For the NFS server case this code allow to lift the limit of max 2k krb5
tickets. This limit is prevents legitimate kerberos deployments from using krb5
authentication with the Linux NFS server as they have normally ticket that are
many kilobytes large.
It will also allow to lift the limitation on the size of the credential set
(uid,gid,gids) passed down from user space for users that have very many groups
associated. Currently the downcall mechanism used by rpc.svcgssd is limited
to around 2k secondary groups of the 65k allowed by kernel structures.
Signed-off-by: Simo Sorce <simo@redhat.com>
[bfields: containerization, concurrent upcalls, misc. fixes and cleanup]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We expose this parameter for a future caller.
It will be used to extract the endtime from the gss-proxy upcall mechanism,
in order to set the rsc cache expiration time.
Signed-off-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Clean up: Reduce the symbol table footprint for auth_rpcgss.ko by
removing exported symbols for functions that are no longer used
outside of auth_rpcgss.ko.
The remaining two EXPORTs in gss_mech_switch.c get documenting
comments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
gss_mech_get() is no longer used outside of gss_mech_switch.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up. This matches a similar API for the client side, and
keeps ULP fingers out the of the GSS mech switch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The NFSv4 SECINFO operation returns a list of security flavors that
the server supports for a particular share. An NFSv4 client is
supposed to pick a pseudoflavor it supports that corresponds to one
of the flavors returned by the server.
GSS flavors in this list have a GSS tuple that identify a specific
GSS pseudoflavor.
Currently our client ignores the GSS tuple's "qop" value. A
matching pseudoflavor is chosen based only on the OID and service
value.
So far this omission has not had much effect on Linux. The NFSv4
protocol currently supports only one qop value: GSS_C_QOP_DEFAULT,
also known as zero.
However, if an NFSv4 server happens to return something other than
zero in the qop field, our client won't notice. This could cause
the client to behave in incorrect ways that could have security
implications.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The current GSS mech switch can find and load GSS pseudoflavor
modules by name ("krb5") or pseudoflavor number ("390003"), but
cannot find GSS modules by GSS tuple:
[ "1.2.840.113554.1.2.2", GSS_C_QOP_DEFAULT, RPC_GSS_SVC_NONE ]
This is important when dealing with a SECINFO request. A SECINFO
reply contains a list of flavors the server supports for the
requested export, but GSS flavors also have a GSS tuple that maps
to a pseudoflavor (like 390003 for krb5).
If the GSS module that supports the OID in the tuple is not loaded,
our client is not able to load that module dynamically to support
that pseudoflavor.
Add a way for the GSS mech switch to load GSS pseudoflavor support
by OID before searching for the pseudoflavor that matches the OID
and service.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A SECINFO reply may contain flavors whose kernel module is not
yet loaded by the client's kernel. A new RPC client API, called
rpcauth_get_pseudoflavor(), is introduced to do proper checking
for support of a security flavor.
When this API is invoked, the RPC client now tries to load the
module for each flavor first before performing the "is this
supported?" check. This means if a module is available on the
client, but has not been loaded yet, it will be loaded and
registered automatically when the SECINFO reply is processed.
The new API can take a full GSS tuple (OID, QoP, and service).
Previously only the OID and service were considered.
nfs_find_best_sec() is updated to verify all flavors requested in a
SECINFO reply, including AUTH_NULL and AUTH_UNIX. Previously these
two flavors were simply assumed to be supported without consulting
the RPC client.
Note that the replaced version of nfs_find_best_sec() can return
RPC_AUTH_MAXFLAVOR if the server returns a recognized OID but an
unsupported "service" value. nfs_find_best_sec() now returns
RPC_AUTH_UNIX in this case.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The NFSv4 SECINFO procedure returns a list of security flavors. Any
GSS flavor also has a GSS tuple containing an OID, a quality-of-
protection value, and a service value, which specifies a particular
GSS pseudoflavor.
For simplicity and efficiency, I'd like to return each GSS tuple
from the NFSv4 SECINFO XDR decoder and pass it straight into the RPC
client.
Define a data structure that is visible to both the NFS client and
the RPC client. Take structure and field names from the relevant
standards to avoid confusion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Commit f344f6df "SUNRPC: Auto-load RPC authentication kernel
modules", Mon Mar 20 13:44:08 2006, adds a request_module() call
in rpcauth_create() to auto-load RPC security modules when a ULP
tries to create a credential of that flavor.
In rpcauth_create(), the name of the module to load is built like
this:
request_module("rpc-auth-%u", flavor);
This means that for, say, RPC_AUTH_GSS, request_module() is looking
for a module or alias called "rpc-auth-6".
The GSS module is named "auth_rpcgss", and commit f344f6df does not
add any new module aliases. There is also no such alias provided in
/etc/modprobe.d on my system (Fedora 16). Without this alias, the
GSS module is not loaded on demand.
This is used by rpcauth_create(). The pseudoflavor_to_flavor() call
can return RPC_AUTH_GSS, which is passed to request_module().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
krb5 mounts started failing as of
683428fae8 "sunrpc: Update svcgss xdr
handle to rpsec_contect cache".
The problem is that mounts are usually done with some host principal
which isn't normally mapped to any user, in which case svcgssd passes
down uid -1, which the kernel is then expected to map to the
export-specific anonymous uid or gid.
The new uid_valid/gid_valid checks were therefore causing that downcall
to fail.
(Note the regression may not have been seen with older userspace that
tended to map unknown principals to an anonymous id on their own rather
than leaving it to the kernel.)
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull nfsd changes from J Bruce Fields:
"Miscellaneous bugfixes, plus:
- An overhaul of the DRC cache by Jeff Layton. The main effect is
just to make it larger. This decreases the chances of intermittent
errors especially in the UDP case. But we'll need to watch for any
reports of performance regressions.
- Containerized nfsd: with some limitations, we now support
per-container nfs-service, thanks to extensive work from Stanislav
Kinsbursky over the last year."
Some notes about conflicts, since there were *two* non-data semantic
conflicts here:
- idr_remove_all() had been added by a memory leak fix, but has since
become deprecated since idr_destroy() does it for us now.
- xs_local_connect() had been added by this branch to make AF_LOCAL
connections be synchronous, but in the meantime Trond had changed the
calling convention in order to avoid a RCU dereference.
There were a couple of more obvious actual source-level conflicts due to
the hlist traversal changes and one just due to code changes next to
each other, but those were trivial.
* 'for-3.9' of git://linux-nfs.org/~bfields/linux: (49 commits)
SUNRPC: make AF_LOCAL connect synchronous
nfsd: fix compiler warning about ambiguous types in nfsd_cache_csum
svcrpc: fix rpc server shutdown races
svcrpc: make svc_age_temp_xprts enqueue under sv_lock
lockd: nlmclnt_reclaim(): avoid stack overflow
nfsd: enable NFSv4 state in containers
nfsd: disable usermode helper client tracker in container
nfsd: use proper net while reading "exports" file
nfsd: containerize NFSd filesystem
nfsd: fix comments on nfsd_cache_lookup
SUNRPC: move cache_detail->cache_request callback call to cache_read()
SUNRPC: remove "cache_request" argument in sunrpc_cache_pipe_upcall() function
SUNRPC: rework cache upcall logic
SUNRPC: introduce cache_detail->cache_request callback
NFS: simplify and clean cache library
NFS: use SUNRPC cache creation and destruction helper for DNS cache
nfsd4: free_stid can be static
nfsd: keep a checksum of the first 256 bytes of request
sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer
sunrpc: fix comment in struct xdr_buf definition
...
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
Pull user namespace and namespace infrastructure changes from Eric W Biederman:
"This set of changes starts with a few small enhnacements to the user
namespace. reboot support, allowing more arbitrary mappings, and
support for mounting devpts, ramfs, tmpfs, and mqueuefs as just the
user namespace root.
I do my best to document that if you care about limiting your
unprivileged users that when you have the user namespace support
enabled you will need to enable memory control groups.
There is a minor bug fix to prevent overflowing the stack if someone
creates way too many user namespaces.
The bulk of the changes are a continuation of the kuid/kgid push down
work through the filesystems. These changes make using uids and gids
typesafe which ensures that these filesystems are safe to use when
multiple user namespaces are in use. The filesystems converted for
3.9 are ceph, 9p, afs, ocfs2, gfs2, ncpfs, nfs, nfsd, and cifs. The
changes for these filesystems were a little more involved so I split
the changes into smaller hopefully obviously correct changes.
XFS is the only filesystem that remains. I was hoping I could get
that in this release so that user namespace support would be enabled
with an allyesconfig or an allmodconfig but it looks like the xfs
changes need another couple of days before it they are ready."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (93 commits)
cifs: Enable building with user namespaces enabled.
cifs: Convert struct cifs_ses to use a kuid_t and a kgid_t
cifs: Convert struct cifs_sb_info to use kuids and kgids
cifs: Modify struct smb_vol to use kuids and kgids
cifs: Convert struct cifsFileInfo to use a kuid
cifs: Convert struct cifs_fattr to use kuid and kgids
cifs: Convert struct tcon_link to use a kuid.
cifs: Modify struct cifs_unix_set_info_args to hold a kuid_t and a kgid_t
cifs: Convert from a kuid before printing current_fsuid
cifs: Use kuids and kgids SID to uid/gid mapping
cifs: Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID to keyring_alloc
cifs: Use BUILD_BUG_ON to validate uids and gids are the same size
cifs: Override unmappable incoming uids and gids
nfsd: Enable building with user namespaces enabled.
nfsd: Properly compare and initialize kuids and kgids
nfsd: Store ex_anon_uid and ex_anon_gid as kuids and kgids
nfsd: Modify nfsd4_cb_sec to use kuids and kgids
nfsd: Handle kuids and kgids in the nfs4acl to posix_acl conversion
nfsd: Convert nfsxdr to use kuids and kgids
nfsd: Convert nfs3xdr to use kuids and kgids
...
Since commit 620038f6d2, gcc is throwing the following warning:
CC [M] net/sunrpc/auth_gss/auth_gss.o
In file included from include/linux/sunrpc/types.h:14:0,
from include/linux/sunrpc/sched.h:14,
from include/linux/sunrpc/clnt.h:18,
from net/sunrpc/auth_gss/auth_gss.c:45:
net/sunrpc/auth_gss/auth_gss.c: In function ‘gss_pipe_downcall’:
include/linux/sunrpc/debug.h:45:10: warning: ‘timeout’ may be used
uninitialized in this function [-Wmaybe-uninitialized]
printk(KERN_DEFAULT args); \
^
net/sunrpc/auth_gss/auth_gss.c:194:15: note: ‘timeout’ was declared here
unsigned int timeout;
^
If simple_get_bytes returns an error, then we'll end up calling printk
with an uninitialized timeout value. Reasonably harmless, but fairly
simple to fix by removing the printout of the uninitialised parameters.
Cc: Andy Adamson <andros@netapp.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
[Trond: just remove the parameters rather than initialising timeout]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
For most of SUNRPC caches (except NFS DNS cache) cache_detail->cache_upcall is
redundant since all that it's implementations are doing is calling
sunrpc_cache_pipe_upcall() with proper function address argument.
Cache request function address is now stored on cache_detail structure and
thus all the code can be simplified.
Now, for those cache details, which doesn't have cache_upcall callback (the
only one, which still has is nfs_dns_resolve_template)
sunrpc_cache_pipe_upcall will be called instead.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This callback will allow to simplify upcalls in further patches in this
series.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
For each received uid call make_kuid and validate the result.
For each received gid call make_kgid and validate the result.
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
- Use from_kuid when generating the on the wire uid values.
- Use make_kuid when reading on the wire values.
In gss_encode_v0_msg, since the uid in gss_upcall_msg is now a kuid_t
generate the necessary uid_t value on the stack copy it into
gss_msg->databuf where it can safely live until the message is no
longer needed.
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
When printing kuids and kgids for debugging purpropses convert them
to ordinary integers so their values can be fed to the oridnary
print functions.
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
When comparing uids use uid_eq instead of ==.
When comparing gids use gid_eq instead of ==.
And unfortunate cost of type safety.
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Convert variables that store uids and gids to be of type
kuid_t and kgid_t instead of type uid_t and gid_t.
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
When GSSAPI integrity signatures are in use, or when we're using GSSAPI
privacy with the v2 token format, there is a trailing checksum on the
xdr_buf that is returned.
It's checked during the authentication stage, and afterward nothing
cares about it. Ordinarily, it's not a problem since the XDR code
generally ignores it, but it will be when we try to compute a checksum
over the buffer to help prevent XID collisions in the duplicate reply
cache.
Fix the code to trim off the checksums after verifying them. Note that
in unwrap_integ_data, we must avoid trying to reverify the checksum if
the request was deferred since it will no longer be present when it's
revisited.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
tk_xprt is just a shortcut for tk_client->cl_xprt, however cl_xprt is
defined as an __rcu variable. Replace dereferences of tk_xprt with
non-rcu dereferences where it is safe to do so.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Only use the default GSSD_MIN_TIMEOUT if the gss downcall timeout is zero.
Store the full lifetime in gc_expiry (not 3/4 of the lifetime) as subsequent
patches will use the gc_expiry to determine buffered WRITE behavior in the
face of expired or soon to be expired gss credentials.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Clean up: Some function names have changed, but debugging messages
were never updated. Automate the construction of the function name
in debugging messages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The patch "SUNRPC: Add rpcauth_list_flavors()" introduces a new error
path in gss_mech_list_pseudoflavors, but fails to release the spin lock.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The gss_mech_list_pseudoflavors() function provides a list of
currently registered GSS pseudoflavors. This list does not include
any non-GSS flavors that have been registered with the RPC client.
nfs4_find_root_sec() currently adds these extra flavors by hand.
Instead, nfs4_find_root_sec() should be looking at the set of flavors
that have been explicitly registered via rpcauth_register(). And,
other areas of code will soon need the same kind of list that
contains all flavors the kernel currently knows about (see below).
Rather than cloning the open-coded logic in nfs4_find_root_sec() to
those new places, introduce a generic RPC function that generates a
full list of registered auth flavors and pseudoflavors.
A new rpc_authops method is added that lists a flavor's
pseudoflavors, if it has any. I encountered an interesting module
loader loop when I tried to get the RPC client to invoke
gss_mech_list_pseudoflavors() by name.
This patch is a pre-requisite for server trunking discovery, and a
pre-requisite for fixing up the in-kernel mount client to do better
automatic security flavor selection.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Pull the rest of the nfsd commits from Bruce Fields:
"... and then I cherry-picked the remainder of the patches from the
head of my previous branch"
This is the rest of the original nfsd branch, rebased without the
delegation stuff that I thought really needed to be redone.
I don't like rebasing things like this in general, but in this situation
this was the lesser of two evils.
* 'for-3.5' of git://linux-nfs.org/~bfields/linux: (50 commits)
nfsd4: fix, consolidate client_has_state
nfsd4: don't remove rebooted client record until confirmation
nfsd4: remove some dprintk's and a comment
nfsd4: return "real" sequence id in confirmed case
nfsd4: fix exchange_id to return confirm flag
nfsd4: clarify that renewing expired client is a bug
nfsd4: simpler ordering of setclientid_confirm checks
nfsd4: setclientid: remove pointless assignment
nfsd4: fix error return in non-matching-creds case
nfsd4: fix setclientid_confirm same_cred check
nfsd4: merge 3 setclientid cases to 2
nfsd4: pull out common code from setclientid cases
nfsd4: merge last two setclientid cases
nfsd4: setclientid/confirm comment cleanup
nfsd4: setclientid remove unnecessary terms from a logical expression
nfsd4: move rq_flavor into svc_cred
nfsd4: stricter cred comparison for setclientid/exchange_id
nfsd4: move principal name into svc_cred
nfsd4: allow removing clients not holding state
nfsd4: rearrange exchange_id logic to simplify
...
Pull nfsd update from Bruce Fields.
* 'for-3.5-take-2' of git://linux-nfs.org/~bfields/linux: (23 commits)
nfsd: trivial: use SEEK_SET instead of 0 in vfs_llseek
SUNRPC: split upcall function to extract reusable parts
nfsd: allocate id-to-name and name-to-id caches in per-net operations.
nfsd: make name-to-id cache allocated per network namespace context
nfsd: make id-to-name cache allocated per network namespace context
nfsd: pass network context to idmap init/exit functions
nfsd: allocate export and expkey caches in per-net operations.
nfsd: make expkey cache allocated per network namespace context
nfsd: make export cache allocated per network namespace context
nfsd: pass pointer to export cache down to stack wherever possible.
nfsd: pass network context to export caches init/shutdown routines
Lockd: pass network namespace to creation and destruction routines
NFSd: remove hard-coded dereferences to name-to-id and id-to-name caches
nfsd: pass pointer to expkey cache down to stack wherever possible.
nfsd: use hash table from cache detail in nfsd export seq ops
nfsd: pass svc_export_cache pointer as private data to "exports" seq file ops
nfsd: use exp_put() for svc_export_cache put
nfsd: use cache detail pointer from svc_export structure on cache put
nfsd: add link to owner cache detail to svc_export structure
nfsd: use passed cache_detail pointer expkey_parse()
...
Move the rq_flavor into struct svc_cred, and use it in setclientid and
exchange_id comparisons as well.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Instead of keeping the principal name associated with a request in a
structure that's private to auth_gss and using an accessor function,
move it to svc_cred.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
The data in Kerberos gss tokens can be rotated. But we were lazy and
rejected any nonzero rotation value. It wasn't necessary for the
implementations we were testing against at the time.
But it appears that Windows does use a nonzero value here.
So, implement rotation to bring ourselves into compliance with the spec
and to interoperate with Windows.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Pull user namespace enhancements from Eric Biederman:
"This is a course correction for the user namespace, so that we can
reach an inexpensive, maintainable, and reasonably complete
implementation.
Highlights:
- Config guards make it impossible to enable the user namespace and
code that has not been converted to be user namespace safe.
- Use of the new kuid_t type ensures the if you somehow get past the
config guards the kernel will encounter type errors if you enable
user namespaces and attempt to compile in code whose permission
checks have not been updated to be user namespace safe.
- All uids from child user namespaces are mapped into the initial
user namespace before they are processed. Removing the need to add
an additional check to see if the user namespace of the compared
uids remains the same.
- With the user namespaces compiled out the performance is as good or
better than it is today.
- For most operations absolutely nothing changes performance or
operationally with the user namespace enabled.
- The worst case performance I could come up with was timing 1
billion cache cold stat operations with the user namespace code
enabled. This went from 156s to 164s on my laptop (or 156ns to
164ns per stat operation).
- (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
Most uid/gid setting system calls treat these value specially
anyway so attempting to use -1 as a uid would likely cause
entertaining failures in userspace.
- If setuid is called with a uid that can not be mapped setuid fails.
I have looked at sendmail, login, ssh and every other program I
could think of that would call setuid and they all check for and
handle the case where setuid fails.
- If stat or a similar system call is called from a context in which
we can not map a uid we lie and return overflowuid. The LFS
experience suggests not lying and returning an error code might be
better, but the historical precedent with uids is different and I
can not think of anything that would break by lying about a uid we
can't map.
- Capabilities are localized to the current user namespace making it
safe to give the initial user in a user namespace all capabilities.
My git tree covers all of the modifications needed to convert the core
kernel and enough changes to make a system bootable to runlevel 1."
Fix up trivial conflicts due to nearby independent changes in fs/stat.c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
userns: Silence silly gcc warning.
cred: use correct cred accessor with regards to rcu read lock
userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
userns: Convert cgroup permission checks to use uid_eq
userns: Convert tmpfs to use kuid and kgid where appropriate
userns: Convert sysfs to use kgid/kuid where appropriate
userns: Convert sysctl permission checks to use kuid and kgids.
userns: Convert proc to use kuid/kgid where appropriate
userns: Convert ext4 to user kuid/kgid where appropriate
userns: Convert ext3 to use kuid/kgid where appropriate
userns: Convert ext2 to use kuid/kgid where appropriate.
userns: Convert devpts to use kuid/kgid where appropriate
userns: Convert binary formats to use kuid/kgid where appropriate
userns: Add negative depends on entries to avoid building code that is userns unsafe
userns: signal remove unnecessary map_cred_ns
userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
userns: Convert stat to return values mapped from kuids and kgids
userns: Convert user specfied uids and gids in chown into kuids and kgid
userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
...
gss_mech_list_pseudoflavors() parses a list of registered mechanisms.
On that list contains a list of pseudo flavors which was not being
parsed correctly, causing only the first pseudo flavor to be found.
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
As a first step to converting struct cred to be all kuid_t and kgid_t
values convert the group values stored in group_info to always be
kgid_t values. Unless user namespaces are used this change should
have no effect.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
This is needed to share code between the current server upcall mechanism
and the new gssproxy upcall mechanism introduced in a following patch.
Signed-off-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from
which the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow concurrent
access to idmapper entries. Start the process of migrating users from
the single-threaded daemon-based approach to the multi-threaded
request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each other
for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve debugging
of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail
to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT
xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki
dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV
oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73
xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu
5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH
7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn
yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK
3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6
WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O
odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1
VBm+EI00Pvmm5+fUjIlp
=LuHE
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates for Linux 3.4 from Trond Myklebust:
"New features include:
- Add NFS client support for containers.
This should enable most of the necessary functionality, including
lockd support, and support for rpc.statd, NFSv4 idmapper and
RPCSEC_GSS upcalls into the correct network namespace from which
the mount system call was issued.
- NFSv4 idmapper scalability improvements
Base the idmapper cache on the keyring interface to allow
concurrent access to idmapper entries. Start the process of
migrating users from the single-threaded daemon-based approach to
the multi-threaded request-key based approach.
- NFSv4.1 implementation id.
Allows the NFSv4.1 client and server to mutually identify each
other for logging and debugging purposes.
- Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
having to use the more counterintuitive 'vers=4,minorversion=1'.
- SUNRPC tracepoints.
Start the process of adding tracepoints in order to improve
debugging of the RPC layer.
- pNFS object layout support for autologin.
Important bugfixes include:
- Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
fail to wake up all tasks when applied to priority waitqueues.
- Ensure that we handle read delegations correctly, when we try to
truncate a file.
- A number of fixes for NFSv4 state manager loops (mostly to do with
delegation recovery)."
* tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
NFS: fix sb->s_id in nfs debug prints
xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
xprtrdma: The transport should not bug-check when a dup reply is received
pnfs-obj: autologin: Add support for protocol autologin
NFS: Remove nfs4_setup_sequence from generic rename code
NFS: Remove nfs4_setup_sequence from generic unlink code
NFS: Remove nfs4_setup_sequence from generic read code
NFS: Remove nfs4_setup_sequence from generic write code
NFS: Fix more NFS debug related build warnings
SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
nfs: non void functions must return a value
SUNRPC: Kill compiler warning when RPC_DEBUG is unset
SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
NFS: Use cond_resched_lock() to reduce latencies in the commit scans
NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
NFS: ncommit count is being double decremented
SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
Try using machine credentials for RENEW calls
NFSv4.1: Fix a few issues in filelayout_commit_pagelist
NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
...
Replace the variable length array in the RPCSEC_GSS crypto code with
a fixed length one. The size should be bounded by the variable
GSS_KRB5_MAX_BLOCKSIZE, so use that.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
net/sunrpc/svcsock.c:412:22: warning: incorrect type in assignment
(different address spaces)
- svc_partial_recvfrom now takes a struct kvec, so the variable
save_iovbase needs to be an ordinary (void *)
Make a bunch of variables in net/sunrpc/xprtsock.c static
Fix a couple of "warning: symbol 'foo' was not declared. Should it be
static?" reports.
Fix a couple of conflicting function declarations.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A migration event will replace the rpc_xprt used by an rpc_clnt. To
ensure this can be done safely, all references to cl_xprt must now use
a form of rcu_dereference().
Special care is taken with rpc_peeraddr2str(), which returns a pointer
to memory whose lifetime is the same as the rpc_xprt.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: fix lockdep splats and layering violations ]
[ cel: forward ported to 3.4 ]
[ cel: remove rpc_max_reqs(), add rpc_net_ns() ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch makes GSS auth cache details allocated and registered per network
namespace context.
Thus with this patch rsi_cache and rsc_cache contents for network namespace "X"
are controlled from proc file system mount for the same network namespace "X".
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
This is a cleanup patch. We don't need this reference anymore.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch subscribes RPC clients to RPC pipefs notifications. RPC clients
notifier block is registering with pipefs initialization during SUNRPC module
init.
This notifier callback is responsible for RPC client PipeFS directory and GSS
pipes creation. For pipes creation and destruction two additional callbacks
were added to struct rpc_authops.
Note that no locking required in notifier callback because PipeFS superblock
pointer is passed as an argument from it's creation or destruction routine and
thus we can be sure about it's validity.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch makes RPC GSS PipeFs pipes allocated in it's RPC client owner
network namespace context.
Pipes creation and destruction now done in separated functions, which takes
care about PipeFS superblock locking.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch is a final step towards to removing PipeFS inode references from
kernel code other than PipeFS itself. It makes all kernel SUNRPC PipeFS users
depends on pipe private data, which state depend on their specific operations,
etc.
This patch completes SUNRPC PipeFS preparations and allows to create pipe
private data and PipeFS dentries independently.
Next step will be making SUNPRC PipeFS dentries allocated by SUNRPC PipeFS
network namespace aware routines.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently gss auth holds RPC inode pointer which is now redundant since it
requires only pipes operations which takes private pipe data as an argument.
Thus this code can be cleaned and all references to RPC inode can be replaced
with privtae pipe data references.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
RPC pipe upcall doesn't requires only private pipe data. Thus RPC inode
references in this code can be removed.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Generally, pipe data is used only for pipes, and thus allocating space for it
on every RPC inode allocation is redundant. This patch splits private SUNRPC
PipeFS pipe data and inode, makes pipe data allocated only for pipe inodes.
This patch is also is a next step towards to to removing PipeFS inode
references from kernel code other than PipeFS itself.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currenly, inode i_lock is used to provide concurrent access to SUNPRC PipeFS
pipes. It looks redundant, since now other use of inode is present in most of
these places and thus can be easely replaced, which will allow to remove most
of inode references from PipeFS code. This is a first step towards to removing
PipeFS inode references from kernel code other than PipeFS itself.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
pptp: Accept packet with seq zero
RDS: Remove some unused iWARP code
net: fsl: fec: handle 10Mbps speed in RMII mode
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c: add missing iounmap
drivers/net/ethernet/tundra/tsi108_eth.c: add missing iounmap
ksz884x: fix mtu for VLAN
net_sched: sfq: add optional RED on top of SFQ
dp83640: Fix NOHZ local_softirq_pending 08 warning
gianfar: Fix invalid TX frames returned on error queue when time stamping
gianfar: Fix missing sock reference when processing TX time stamps
phylib: introduce mdiobus_alloc_size()
net: decrement memcg jump label when limit, not usage, is changed
net: reintroduce missing rcu_assign_pointer() calls
inet_diag: Rename inet_diag_req_compat into inet_diag_req
inet_diag: Rename inet_diag_req into inet_diag_req_v2
bond_alb: don't disable softirq under bond_alb_xmit
mac80211: fix rx->key NULL pointer dereference in promiscuous mode
nl80211: fix old station flags compatibility
mdio-octeon: use an unique MDIO bus name.
mdio-gpio: use an unique MDIO bus name.
...
commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).
We miss needed barriers, even on x86, when y is not NULL.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of hacking specific service names into gss_encode_v1_msg, we should
just allow the caller to specify the service name explicitly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
* 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (26 commits)
Check validity of cl_rpcclient in nfs_server_list_show
NFS: Get rid of the nfs_rdata_mempool
NFS: Don't rely on PageError in nfs_readpage_release_partial
NFS: Get rid of unnecessary calls to ClearPageError() in read code
NFS: Get rid of nfs_restart_rpc()
NFS: Get rid of the unused nfs_write_data->flags field
NFS: Get rid of the unused nfs_read_data->flags field
NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup
NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup()
NFS: Use the inode->i_version to cache NFSv4 change attribute information
SUNRPC: Remove unnecessary export of rpc_sockaddr2uaddr
SUNRPC: Fix rpc_sockaddr2uaddr
nfs/super.c: local functions should be static
pnfsblock: fix writeback deadlock
pnfsblock: fix NULL pointer dereference
pnfs: recoalesce when ld read pagelist fails
pnfs: recoalesce when ld write pagelist fails
pnfs: make _set_lo_fail generic
pnfsblock: add missing rpc_put_mount and path_put
SUNRPC/NFS: make rpc pipe upcall generic
...
The same function is used by idmap, gss and blocklayout code. Make it
generic.
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Cc: stable@kernel.org [3.0]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.
Convert all rcu_assign_pointer of NULL value.
//smpl
@@ expression P; @@
- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)
// </smpl>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-3.1' of git://linux-nfs.org/~bfields/linux:
nfsd: don't break lease on CLAIM_DELEGATE_CUR
locks: rename lock-manager ops
nfsd4: update nfsv4.1 implementation notes
nfsd: turn on reply cache for NFSv4
nfsd4: call nfsd4_release_compoundargs from pc_release
nfsd41: Deny new lock before RECLAIM_COMPLETE done
fs: locks: remove init_once
nfsd41: check the size of request
nfsd41: error out when client sets maxreq_sz or maxresp_sz too small
nfsd4: fix file leak on open_downgrade
nfsd4: remember to put RW access on stateid destruction
NFSD: Added TEST_STATEID operation
NFSD: added FREE_STATEID operation
svcrpc: fix list-corrupting race on nfsd shutdown
rpc: allow autoloading of gss mechanisms
svcauth_unix.c: quiet sparse noise
svcsock.c: include sunrpc.h to quiet sparse noise
nfsd: Remove deprecated nfsctl system call and related code.
NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND
Fix up trivial conflicts in Documentation/feature-removal-schedule.txt
Make the case labels the same indent as the switch.
git diff -w shows 80 column line reflowing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: Fix decode_secinfo_maxsz
NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test
NFSv4.1: Fix some issues with pnfs_generic_pg_test
NFSv4.1: file layout must consider pg_bsize for coalescing
pnfs-obj: No longer needed to take an extra ref at add_device
SUNRPC: Ensure the RPC client only quits on fatal signals
NFSv4: Fix a readdir regression
nfs4.1: mark layout as bad on error path in _pnfs_return_layout
nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout
NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path
NFS: (d)printks should use %zd for ssize_t arguments
NFSv4.1: fix break condition in pnfs_find_lseg
nfs4.1: fix several problems with _pnfs_return_layout
NFSv4.1: allow zero fh array in filelayout decode layout
NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
NFSv4.1: Fix a refcounting issue in the pNFS device id cache
NFSv4.1: deprecate headerpadsz in CREATE_SESSION
NFS41: do not update isize if inode needs layoutcommit
NLM: Don't hang forever on NLM unlock requests
NFS: fix umount of pnfs filesystems
Fix a couple of instances where we were exiting the RPC client on
arbitrary signals. We should only do so on fatal signals.
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>