Highlights:
- Improve support for re-exporting NFS mounts
- Replace NFSv4 XDR decoding C macros with xdr_stream helpers
- Support for multiple RPC/RDMA chunks per RPC transaction

Merge tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6

Pull nfsd updates from Chuck Lever:
 "Several substantial changes this time around:

  - Previously, exporting an NFS mount via NFSD was considered to be
    an unsupported feature. With v5.11, the community has attempted to
    make re-exporting a first-class feature of NFSD. This would enable
    the Linux in-kernel NFS server to be used as an intermediate cache
    for a remotely-located primary NFS server, for example, even with
    other NFS server implementations, like a NetApp filer, as the
    primary.

  - A short series of patches brings support for multiple RPC/RDMA
    data chunks per RPC transaction to the Linux NFS server's RPC/RDMA
    transport implementation. This is a part of the RPC/RDMA spec that
    the other premier NFS/RDMA implementation (Solaris) has had for a
    very long time, and completes the implementation of RPC/RDMA
    version 1 in the Linux kernel's NFS server.

  - Long ago, NFSv4 support was introduced to NFSD using a series of
    C macros that hid dprintk's and goto's. Over time, the kernel's
    XDR implementation has been greatly improved, but these C macros
    have remained and become fallow. A series of patches in this pull
    request completely replaces those macros with the use of current
    kernel XDR infrastructure. Benefits include:

      - More robust input sanitization in NFSD's NFSv4 XDR decoders.

      - Making it easier to use common kernel library functions that
        use XDR stream APIs (for example, GSS-API).

      - Aligning the structure of the source code with the RFCs so it
        is easier to learn, verify, and maintain our XDR
        implementation.

      - Removal of more than a hundred hidden dprintk() call sites.

      - Removal of some explicit manipulation of pages to help make
        the eventual transition to xdr->bvec smoother.

  - On top of several related fixes in 5.10-rc, there are a few more
    fixes to get the Linux NFSD implementation of NFSv4.2 inter-server
    copy up to speed.

  And as usual, there is a pinch of seasoning in the form of a
  collection of unrelated minor bug fixes and clean-ups. Many thanks
  to all who contributed this time around!"
* tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6: (131 commits)
  nfsd: Record NFSv4 pre/post-op attributes as non-atomic
  nfsd: Set PF_LOCAL_THROTTLE on local filesystems only
  nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE
  exportfs: Add a function to return the raw output from fh_to_dentry()
  nfsd: close cached files prior to a REMOVE or RENAME that would replace target
  nfsd: allow filesystems to opt out of subtree checking
  nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations
  Revert "nfsd4: support change_attr_type attribute"
  nfsd4: don't query change attribute in v2/v3 case
  nfsd: minor nfsd4_change_attribute cleanup
  nfsd: simplify nfsd4_change_info
  nfsd: only call inode_query_iversion in the I_VERSION case
  nfs_common: need lock during iterate through the list
  NFSD: Fix 5 seconds delay when doing inter server copy
  NFSD: Fix sparse warning in nfs4proc.c
  SUNRPC: Remove XDRBUF_SPARSE_PAGES flag in gss_proxy upcall
  sunrpc: clean-up cache downcall
  nfsd: Fix message level for normal termination
  NFSD: Remove macros that are no longer used
  NFSD: Replace READ* macros in nfsd4_decode_compound()
  ...
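The re-export work is configuration-visible, not just internal. As a hedged sketch of the intended deployment (host names, paths, and the fsid value below are illustrative, not from this merge):

    # On the intermediate (re-exporting) server: mount the primary,
    # then export that mount point like a local filesystem.
    mount -t nfs primary.example.com:/export /srv/reexport

    # /etc/exports on the intermediate server. An explicit fsid= is
    # required because an NFS mount has no stable device number from
    # which nfsd could derive a filehandle identity.
    /srv/reexport  *(rw,no_subtree_check,fsid=1000)

    # Clients then mount the intermediate server as usual:
    mount -t nfs intermediate.example.com:/srv/reexport /mnt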
This commit is contained in: commit 1a50ede2b3
@@ -154,6 +154,11 @@ struct which has the following members:
     to find potential names, and matches inode numbers to find the correct
     match.
 
+  flags
+    Some filesystems may need to be handled differently than others. The
+    export_operations struct also includes a flags field that allows the
+    filesystem to communicate such information to nfsd. See the Export
+    Operations Flags section below for more explanation.
 
 A filehandle fragment consists of an array of 1 or more 4byte words,
 together with a one byte "type".
@@ -163,3 +168,50 @@ generated by encode_fh, in which case it will have been padded with
 nuls. Rather, the encode_fh routine should choose a "type" which
 indicates the decode_fh how much of the filehandle is valid, and how
 it should be interpreted.
+
+Export Operations Flags
+-----------------------
+In addition to the operation vector pointers, struct export_operations also
+contains a "flags" field that allows the filesystem to communicate to nfsd
+that it may want to do things differently when dealing with it. The
+following flags are defined:
+
+  EXPORT_OP_NOWCC - disable NFSv3 WCC attributes on this filesystem
+    RFC 1813 recommends that servers always send weak cache consistency
+    (WCC) data to the client after each operation. The server should
+    atomically collect attributes about the inode, do an operation on it,
+    and then collect the attributes afterward. This allows the client to
+    skip issuing GETATTRs in some situations but means that the server
+    is calling vfs_getattr for almost all RPCs. On some filesystems
+    (particularly those that are clustered or networked) this is expensive
+    and atomicity is difficult to guarantee. This flag indicates to nfsd
+    that it should skip providing WCC attributes to the client in NFSv3
+    replies when doing operations on this filesystem. Consider enabling
+    this on filesystems that have an expensive ->getattr inode operation,
+    or when atomicity between pre and post operation attribute collection
+    is impossible to guarantee.
+
+  EXPORT_OP_NOSUBTREECHK - disallow subtree checking on this fs
+    Many NFS operations deal with filehandles, which the server must then
+    vet to ensure that they live inside of an exported tree. When the
+    export consists of an entire filesystem, this is trivial. nfsd can just
+    ensure that the filehandle live on the filesystem. When only part of a
+    filesystem is exported however, then nfsd must walk the ancestors of the
+    inode to ensure that it's within an exported subtree. This is an
+    expensive operation and not all filesystems can support it properly.
+    This flag exempts the filesystem from subtree checking and causes
+    exportfs to get back an error if it tries to enable subtree checking
+    on it.
+
+  EXPORT_OP_CLOSE_BEFORE_UNLINK - always close cached files before unlinking
+    On some exportable filesystems (such as NFS) unlinking a file that
+    is still open can cause a fair bit of extra work. For instance,
+    the NFS client will do a "sillyrename" to ensure that the file
+    sticks around while it's still open. When reexporting, that open
+    file is held by nfsd so we usually end up doing a sillyrename, and
+    then immediately deleting the sillyrenamed file just afterward when
+    the link count actually goes to zero. Sometimes this delete can race
+    with other operations (for instance an rmdir of the parent directory).
+    This flag causes nfsd to close any open files for this inode _before_
+    calling into the vfs to do an unlink or a rename that would replace
+    an existing file.
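For a filesystem author, opting in to these behaviors is a matter of setting bits in its export_operations. A minimal sketch (the myfs_* callbacks are hypothetical placeholders, not code from this merge):

	#include <linux/exportfs.h>

	static const struct export_operations myfs_export_ops = {
		.encode_fh    = myfs_encode_fh,     /* hypothetical callbacks */
		.fh_to_dentry = myfs_fh_to_dentry,
		/* Ask nfsd to skip NFSv3 WCC data and refuse subtree checks. */
		.flags        = EXPORT_OP_NOWCC | EXPORT_OP_NOSUBTREECHK,
	};

The NFS client's own nfs_export_ops, shown later in this diff, sets five of these flags at once.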
@@ -417,9 +417,11 @@ int exportfs_encode_fh(struct dentry *dentry, struct fid *fid, int *max_len,
 }
 EXPORT_SYMBOL_GPL(exportfs_encode_fh);
 
-struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
-		int fh_len, int fileid_type,
-		int (*acceptable)(void *, struct dentry *), void *context)
+struct dentry *
+exportfs_decode_fh_raw(struct vfsmount *mnt, struct fid *fid, int fh_len,
+		       int fileid_type,
+		       int (*acceptable)(void *, struct dentry *),
+		       void *context)
 {
 	const struct export_operations *nop = mnt->mnt_sb->s_export_op;
 	struct dentry *result, *alias;
@@ -432,10 +434,8 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
 	if (!nop || !nop->fh_to_dentry)
 		return ERR_PTR(-ESTALE);
 	result = nop->fh_to_dentry(mnt->mnt_sb, fid, fh_len, fileid_type);
-	if (PTR_ERR(result) == -ENOMEM)
-		return ERR_CAST(result);
 	if (IS_ERR_OR_NULL(result))
-		return ERR_PTR(-ESTALE);
+		return result;
 
 	/*
 	 * If no acceptance criteria was specified by caller, a disconnected
@@ -561,10 +561,26 @@ struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
 
 err_result:
 	dput(result);
 	if (err != -ENOMEM)
 		err = -ESTALE;
 	return ERR_PTR(err);
 }
-EXPORT_SYMBOL_GPL(exportfs_decode_fh);
+EXPORT_SYMBOL_GPL(exportfs_decode_fh_raw);
+
+struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
+				  int fh_len, int fileid_type,
+				  int (*acceptable)(void *, struct dentry *),
+				  void *context)
+{
+	struct dentry *ret;
+
+	ret = exportfs_decode_fh_raw(mnt, fid, fh_len, fileid_type,
+				     acceptable, context);
+	if (IS_ERR_OR_NULL(ret)) {
+		if (ret == ERR_PTR(-ENOMEM))
+			return ret;
+		return ERR_PTR(-ESTALE);
+	}
+	return ret;
+}
 EXPORT_SYMBOL_GPL(exportfs_decode_fh);
 
 MODULE_LICENSE("GPL");
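The split gives callers a choice: exportfs_decode_fh_raw() passes the filesystem's error through unchanged, while the existing exportfs_decode_fh() wrapper keeps squashing everything except -ENOMEM to -ESTALE. A hypothetical caller that wants to treat a re-export timeout as retryable rather than stale might do:

	dentry = exportfs_decode_fh_raw(mnt, fid, fh_len, fileid_type,
					acceptable, context);
	if (IS_ERR(dentry) && PTR_ERR(dentry) == -ETIMEDOUT)
		return nfserr_jukebox;	/* ask the client to retry later */

which is roughly what the nfsd_set_fh_dentry() hunk further down does by letting -ETIMEDOUT escape the default -ESTALE conversion.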
@@ -697,7 +697,7 @@ bl_alloc_lseg(struct pnfs_layout_hdr *lo, struct nfs4_layoutget_res *lgr,
 
 	xdr_init_decode_pages(&xdr, &buf,
 			lgr->layoutp->pages, lgr->layoutp->len);
-	xdr_set_scratch_buffer(&xdr, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(&xdr, scratch);
 
 	status = -EIO;
 	p = xdr_inline_decode(&xdr, 4);
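This and the next several hunks are one mechanical conversion: the new xdr_set_scratch_page() takes the scratch page itself instead of a (page_address(), PAGE_SIZE) pair. A condensed sketch of the decode pattern these call sites share (allocation and cleanup elided):

	struct xdr_stream xdr;
	struct xdr_buf buf;
	__be32 *p;

	xdr_init_decode_pages(&xdr, &buf, pages, buflen);
	/* was: xdr_set_scratch_buffer(&xdr, page_address(scratch), PAGE_SIZE); */
	xdr_set_scratch_page(&xdr, scratch);
	p = xdr_inline_decode(&xdr, 4);	/* may copy through the scratch page */
	if (!p)
		return -EIO;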
@@ -510,7 +510,7 @@ bl_alloc_deviceid_node(struct nfs_server *server, struct pnfs_device *pdev,
 		goto out;
 
 	xdr_init_decode_pages(&xdr, &buf, pdev->pages, pdev->pglen);
-	xdr_set_scratch_buffer(&xdr, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(&xdr, scratch);
 
 	p = xdr_inline_decode(&xdr, sizeof(__be32));
 	if (!p)
@@ -576,7 +576,7 @@ int nfs_readdir_page_filler(nfs_readdir_descriptor_t *desc, struct nfs_entry *en
 		goto out_nopages;
 
 	xdr_init_decode_pages(&stream, &buf, xdr_pages, buflen);
-	xdr_set_scratch_buffer(&stream, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(&stream, scratch);
 
 	do {
 		if (entry->label)
@@ -171,4 +171,7 @@ const struct export_operations nfs_export_ops = {
 	.encode_fh = nfs_encode_fh,
 	.fh_to_dentry = nfs_fh_to_dentry,
 	.get_parent = nfs_get_parent,
+	.flags = EXPORT_OP_NOWCC|EXPORT_OP_NOSUBTREECHK|
+		EXPORT_OP_CLOSE_BEFORE_UNLINK|EXPORT_OP_REMOTE_FS|
+		EXPORT_OP_NOATOMIC_ATTR,
 };
@@ -666,7 +666,7 @@ filelayout_decode_layout(struct pnfs_layout_hdr *flo,
 		return -ENOMEM;
 
 	xdr_init_decode_pages(&stream, &buf, lgr->layoutp->pages, lgr->layoutp->len);
-	xdr_set_scratch_buffer(&stream, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(&stream, scratch);
 
 	/* 20 = ufl_util (4), first_stripe_index (4), pattern_offset (8),
 	 * num_fh (4) */
@@ -82,7 +82,7 @@ nfs4_fl_alloc_deviceid_node(struct nfs_server *server, struct pnfs_device *pdev,
 		goto out_err;
 
 	xdr_init_decode_pages(&stream, &buf, pdev->pages, pdev->pglen);
-	xdr_set_scratch_buffer(&stream, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(&stream, scratch);
 
 	/* Get the stripe count (number of stripe index) */
 	p = xdr_inline_decode(&stream, 4);
@@ -378,7 +378,7 @@ ff_layout_alloc_lseg(struct pnfs_layout_hdr *lh,
 
 	xdr_init_decode_pages(&stream, &buf, lgr->layoutp->pages,
 			      lgr->layoutp->len);
-	xdr_set_scratch_buffer(&stream, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(&stream, scratch);
 
 	/* stripe unit and mirror_array_cnt */
 	rc = -EIO;
@@ -69,7 +69,7 @@ nfs4_ff_alloc_deviceid_node(struct nfs_server *server, struct pnfs_device *pdev,
 	INIT_LIST_HEAD(&dsaddrs);
 
 	xdr_init_decode_pages(&stream, &buf, pdev->pages, pdev->pglen);
-	xdr_set_scratch_buffer(&stream, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(&stream, scratch);
 
 	/* multipath count */
 	p = xdr_inline_decode(&stream, 4);
@@ -1539,7 +1539,7 @@ static int nfs4_xdr_dec_listxattrs(struct rpc_rqst *rqstp,
 	struct compound_hdr hdr;
 	int status;
 
-	xdr_set_scratch_buffer(xdr, page_address(res->scratch), PAGE_SIZE);
+	xdr_set_scratch_page(xdr, res->scratch);
 
 	status = decode_compound_hdr(xdr, &hdr);
 	if (status)
@@ -6403,10 +6403,8 @@ nfs4_xdr_dec_getacl(struct rpc_rqst *rqstp, struct xdr_stream *xdr,
 	struct compound_hdr hdr;
 	int status;
 
-	if (res->acl_scratch != NULL) {
-		void *p = page_address(res->acl_scratch);
-		xdr_set_scratch_buffer(xdr, p, PAGE_SIZE);
-	}
+	if (res->acl_scratch != NULL)
+		xdr_set_scratch_page(xdr, res->acl_scratch);
 	status = decode_compound_hdr(xdr, &hdr);
 	if (status)
 		goto out;
@@ -69,10 +69,14 @@ __state_in_grace(struct net *net, bool open)
 	if (!open)
 		return !list_empty(grace_list);
 
+	spin_lock(&grace_lock);
 	list_for_each_entry(lm, grace_list, list) {
-		if (lm->block_opens)
+		if (lm->block_opens) {
+			spin_unlock(&grace_lock);
 			return true;
+		}
 	}
+	spin_unlock(&grace_lock);
 	return false;
 }
 
@@ -408,6 +408,12 @@ static int check_export(struct inode *inode, int *flags, unsigned char *uuid)
 		return -EINVAL;
 	}
 
+	if (inode->i_sb->s_export_op->flags & EXPORT_OP_NOSUBTREECHK &&
+	    !(*flags & NFSEXP_NOSUBTREECHECK)) {
+		dprintk("%s: %s does not support subtree checking!\n",
+			__func__, inode->i_sb->s_type->name);
+		return -EINVAL;
+	}
 	return 0;
 
 }
@@ -685,6 +685,7 @@ nfsd_file_cache_init(void)
 	if (IS_ERR(nfsd_file_fsnotify_group)) {
 		pr_err("nfsd: unable to create fsnotify group: %ld\n",
 			PTR_ERR(nfsd_file_fsnotify_group));
+		ret = PTR_ERR(nfsd_file_fsnotify_group);
 		nfsd_file_fsnotify_group = NULL;
 		goto out_notifier;
 	}
@@ -185,10 +185,6 @@ out:
 /*
  * XDR decode functions
  */
-static int nfsaclsvc_decode_voidarg(struct svc_rqst *rqstp, __be32 *p)
-{
-	return 1;
-}
 
 static int nfsaclsvc_decode_getaclargs(struct svc_rqst *rqstp, __be32 *p)
 {
@@ -255,15 +251,6 @@ static int nfsaclsvc_decode_accessargs(struct svc_rqst *rqstp, __be32 *p)
  * XDR encode functions
  */
 
-/*
- * There must be an encoding function for void results so svc_process
- * will work properly.
- */
-static int nfsaclsvc_encode_voidres(struct svc_rqst *rqstp, __be32 *p)
-{
-	return xdr_ressize_check(rqstp, p);
-}
-
 /* GETACL */
 static int nfsaclsvc_encode_getaclres(struct svc_rqst *rqstp, __be32 *p)
 {
@@ -378,10 +365,10 @@ struct nfsd3_voidargs { int dummy; };
 static const struct svc_procedure nfsd_acl_procedures2[5] = {
 	[ACLPROC2_NULL] = {
 		.pc_func = nfsacld_proc_null,
-		.pc_decode = nfsaclsvc_decode_voidarg,
-		.pc_encode = nfsaclsvc_encode_voidres,
-		.pc_argsize = sizeof(struct nfsd3_voidargs),
-		.pc_ressize = sizeof(struct nfsd3_voidargs),
+		.pc_decode = nfssvc_decode_voidarg,
+		.pc_encode = nfssvc_encode_voidres,
+		.pc_argsize = sizeof(struct nfsd_voidargs),
+		.pc_ressize = sizeof(struct nfsd_voidres),
 		.pc_cachetype = RC_NOCACHE,
 		.pc_xdrressize = ST,
 	},
@@ -245,10 +245,10 @@ struct nfsd3_voidargs { int dummy; };
 static const struct svc_procedure nfsd_acl_procedures3[3] = {
 	[ACLPROC3_NULL] = {
 		.pc_func = nfsd3_proc_null,
-		.pc_decode = nfs3svc_decode_voidarg,
-		.pc_encode = nfs3svc_encode_voidres,
-		.pc_argsize = sizeof(struct nfsd3_voidargs),
-		.pc_ressize = sizeof(struct nfsd3_voidargs),
+		.pc_decode = nfssvc_decode_voidarg,
+		.pc_encode = nfssvc_encode_voidres,
+		.pc_argsize = sizeof(struct nfsd_voidargs),
+		.pc_ressize = sizeof(struct nfsd_voidres),
 		.pc_cachetype = RC_NOCACHE,
 		.pc_xdrressize = ST,
 	},
@@ -689,12 +689,9 @@ out:
 #define nfsd3_mkdirargs			nfsd3_createargs
 #define nfsd3_readdirplusargs		nfsd3_readdirargs
 #define nfsd3_fhandleargs		nfsd_fhandle
 #define nfsd3_fhandleres		nfsd3_attrstat
 #define nfsd3_attrstatres		nfsd3_attrstat
 #define nfsd3_wccstatres		nfsd3_attrstat
 #define nfsd3_createres			nfsd3_diropres
-#define nfsd3_voidres			nfsd3_voidargs
-struct nfsd3_voidargs { int dummy; };
-
 #define ST 1		/* status*/
 #define FH 17		/* filehandle with length */
@@ -705,10 +702,10 @@ struct nfsd3_voidargs { int dummy; };
 static const struct svc_procedure nfsd_procedures3[22] = {
 	[NFS3PROC_NULL] = {
 		.pc_func = nfsd3_proc_null,
-		.pc_decode = nfs3svc_decode_voidarg,
-		.pc_encode = nfs3svc_encode_voidres,
-		.pc_argsize = sizeof(struct nfsd3_voidargs),
-		.pc_ressize = sizeof(struct nfsd3_voidres),
+		.pc_decode = nfssvc_decode_voidarg,
+		.pc_encode = nfssvc_encode_voidres,
+		.pc_argsize = sizeof(struct nfsd_voidargs),
+		.pc_ressize = sizeof(struct nfsd_voidres),
 		.pc_cachetype = RC_NOCACHE,
 		.pc_xdrressize = ST,
 	},
@@ -206,7 +206,7 @@ static __be32 *
 encode_post_op_attr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *fhp)
 {
 	struct dentry *dentry = fhp->fh_dentry;
-	if (dentry && d_really_is_positive(dentry)) {
+	if (!fhp->fh_no_wcc && dentry && d_really_is_positive(dentry)) {
 		__be32 err;
 		struct kstat stat;
 
@@ -259,11 +259,11 @@ void fill_pre_wcc(struct svc_fh *fhp)
 {
 	struct inode *inode;
 	struct kstat stat;
+	bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE);
 	__be32 err;
 
-	if (fhp->fh_pre_saved)
+	if (fhp->fh_no_wcc || fhp->fh_pre_saved)
 		return;
 
 	inode = d_inode(fhp->fh_dentry);
 	err = fh_getattr(fhp, &stat);
 	if (err) {
@@ -272,11 +272,12 @@ void fill_pre_wcc(struct svc_fh *fhp)
 		stat.ctime = inode->i_ctime;
 		stat.size  = inode->i_size;
 	}
+	if (v4)
+		fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode);
 
 	fhp->fh_pre_mtime = stat.mtime;
 	fhp->fh_pre_ctime = stat.ctime;
 	fhp->fh_pre_size  = stat.size;
-	fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode);
 	fhp->fh_pre_saved = true;
 }
@@ -285,30 +286,30 @@ void fill_pre_wcc(struct svc_fh *fhp)
  */
 void fill_post_wcc(struct svc_fh *fhp)
 {
+	bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE);
+	struct inode *inode = d_inode(fhp->fh_dentry);
 	__be32 err;
 
+	if (fhp->fh_no_wcc)
+		return;
+
 	if (fhp->fh_post_saved)
 		printk("nfsd: inode locked twice during operation.\n");
 
 	err = fh_getattr(fhp, &fhp->fh_post_attr);
-	fhp->fh_post_change = nfsd4_change_attribute(&fhp->fh_post_attr,
-						     d_inode(fhp->fh_dentry));
 	if (err) {
 		fhp->fh_post_saved = false;
 		/* Grab the ctime anyway - set_change_info might use it */
-		fhp->fh_post_attr.ctime = d_inode(fhp->fh_dentry)->i_ctime;
+		fhp->fh_post_attr.ctime = inode->i_ctime;
 	} else
 		fhp->fh_post_saved = true;
+	if (v4)
+		fhp->fh_post_change =
+			nfsd4_change_attribute(&fhp->fh_post_attr, inode);
 }
 
 /*
  * XDR decode functions
  */
-int
-nfs3svc_decode_voidarg(struct svc_rqst *rqstp, __be32 *p)
-{
-	return 1;
-}
-
 int
 nfs3svc_decode_fhandle(struct svc_rqst *rqstp, __be32 *p)
@@ -642,12 +643,6 @@ nfs3svc_decode_commitargs(struct svc_rqst *rqstp, __be32 *p)
  * XDR encode functions
  */
 
-int
-nfs3svc_encode_voidres(struct svc_rqst *rqstp, __be32 *p)
-{
-	return xdr_ressize_check(rqstp, p);
-}
-
 /* GETATTR */
 int
 nfs3svc_encode_attrstat(struct svc_rqst *rqstp, __be32 *p)
@@ -707,6 +702,7 @@ int
 nfs3svc_encode_readlinkres(struct svc_rqst *rqstp, __be32 *p)
 {
 	struct nfsd3_readlinkres *resp = rqstp->rq_resp;
+	struct kvec *head = rqstp->rq_res.head;
 
 	*p++ = resp->status;
 	p = encode_post_op_attr(rqstp, p, &resp->fh);
|
|||
*p = 0;
|
||||
rqstp->rq_res.tail[0].iov_len = 4 - (resp->len&3);
|
||||
}
|
||||
if (svc_encode_result_payload(rqstp, head->iov_len, resp->len))
|
||||
return 0;
|
||||
return 1;
|
||||
} else
|
||||
return xdr_ressize_check(rqstp, p);
|
||||
|
@@ -730,6 +728,7 @@ int
 nfs3svc_encode_readres(struct svc_rqst *rqstp, __be32 *p)
 {
 	struct nfsd3_readres *resp = rqstp->rq_resp;
+	struct kvec *head = rqstp->rq_res.head;
 
 	*p++ = resp->status;
 	p = encode_post_op_attr(rqstp, p, &resp->fh);
@@ -746,6 +745,9 @@ nfs3svc_encode_readres(struct svc_rqst *rqstp, __be32 *p)
 			*p = 0;
 			rqstp->rq_res.tail[0].iov_len = 4 - (resp->count & 3);
 		}
+		if (svc_encode_result_payload(rqstp, head->iov_len,
+					      resp->count))
+			return 0;
 		return 1;
 	} else
 		return xdr_ressize_check(rqstp, p);
@@ -257,8 +257,8 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
 		 * in NFSv4 as in v3 except EXCLUSIVE4_1.
 		 */
 		current->fs->umask = open->op_umask;
-		status = do_nfsd_create(rqstp, current_fh, open->op_fname.data,
-					open->op_fname.len, &open->op_iattr,
+		status = do_nfsd_create(rqstp, current_fh, open->op_fname,
+					open->op_fnamelen, &open->op_iattr,
 					*resfh, open->op_createmode,
 					(u32 *)open->op_verf.data,
 					&open->op_truncate, &open->op_created);
@@ -283,7 +283,7 @@ do_open_lookup(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate, stru
 		 * a chance to an acquire a delegation if appropriate.
 		 */
 		status = nfsd_lookup(rqstp, current_fh,
-				     open->op_fname.data, open->op_fname.len, *resfh);
+				     open->op_fname, open->op_fnamelen, *resfh);
 		if (status)
 			goto out;
 		status = nfsd_check_obj_isreg(*resfh);
@@ -360,7 +360,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	bool reclaim = false;
 
 	dprintk("NFSD: nfsd4_open filename %.*s op_openowner %p\n",
-		(int)open->op_fname.len, open->op_fname.data,
+		(int)open->op_fnamelen, open->op_fname,
 		open->op_openowner);
 
 	/* This check required by spec. */
@@ -1023,8 +1023,8 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	write->wr_how_written = write->wr_stable_how;
 
-	nvecs = svc_fill_write_vector(rqstp, write->wr_pagelist,
-				      &write->wr_head, write->wr_buflen);
+	nvecs = svc_fill_write_vector(rqstp, write->wr_payload.pages,
+				      write->wr_payload.head, write->wr_buflen);
 	WARN_ON_ONCE(nvecs > ARRAY_SIZE(rqstp->rq_vec));
 
 	status = nfsd_vfs_write(rqstp, &cstate->current_fh, nf,
@@ -1425,7 +1425,7 @@ static __be32 nfsd4_do_copy(struct nfsd4_copy *copy, bool sync)
 	return status;
 }
 
-static int dup_copy_fields(struct nfsd4_copy *src, struct nfsd4_copy *dst)
+static void dup_copy_fields(struct nfsd4_copy *src, struct nfsd4_copy *dst)
 {
 	dst->cp_src_pos = src->cp_src_pos;
 	dst->cp_dst_pos = src->cp_dst_pos;
@@ -1444,8 +1444,6 @@ static int dup_copy_fields(struct nfsd4_copy *src, struct nfsd4_copy *dst)
 	memcpy(&dst->stateid, &src->stateid, sizeof(src->stateid));
 	memcpy(&dst->c_fh, &src->c_fh, sizeof(src->c_fh));
 	dst->ss_mnt = src->ss_mnt;
-
-	return 0;
 }
 
 static void cleanup_async_copy(struct nfsd4_copy *copy)
@@ -1539,9 +1537,7 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		refcount_set(&async_copy->refcount, 1);
 		memcpy(&copy->cp_res.cb_stateid, &copy->cp_stateid,
 			sizeof(copy->cp_stateid));
-		status = dup_copy_fields(copy, async_copy);
-		if (status)
-			goto out_err;
+		dup_copy_fields(copy, async_copy);
 		async_copy->copy_task = kthread_create(nfsd4_do_async_copy,
 				async_copy, "%s", "copy thread");
 		if (IS_ERR(async_copy->copy_task))
@@ -2276,7 +2272,7 @@ static void svcxdr_init_encode(struct svc_rqst *rqstp,
 	xdr->end = head->iov_base + PAGE_SIZE - rqstp->rq_auth_slack;
 	/* Tail and page_len should be zero at this point: */
 	buf->len = buf->head[0].iov_len;
-	xdr->scratch.iov_len = 0;
+	xdr_reset_scratch_buffer(xdr);
 	xdr->page_ptr = buf->pages - 1;
 	buf->buflen = PAGE_SIZE * (1 + rqstp->rq_page_end - buf->pages)
 		- rqstp->rq_auth_slack;
@@ -3282,7 +3278,7 @@ int nfsd4_max_reply(struct svc_rqst *rqstp, struct nfsd4_op *op)
 void warn_on_nonidempotent_op(struct nfsd4_op *op)
 {
 	if (OPDESC(op)->op_flags & OP_MODIFIES_SOMETHING) {
-		pr_err("unable to encode reply to nonidempotent op %d (%s)\n",
+		pr_err("unable to encode reply to nonidempotent op %u (%s)\n",
 			op->opnum, nfsd4_op_name(op->opnum));
 		WARN_ON_ONCE(1);
 	}
@@ -3295,16 +3291,13 @@ static const char *nfsd4_op_name(unsigned opnum)
 	return "unknown_operation";
 }
 
-#define nfsd4_voidres			nfsd4_voidargs
-struct nfsd4_voidargs { int dummy; };
-
 static const struct svc_procedure nfsd_procedures4[2] = {
 	[NFSPROC4_NULL] = {
 		.pc_func = nfsd4_proc_null,
-		.pc_decode = nfs4svc_decode_voidarg,
-		.pc_encode = nfs4svc_encode_voidres,
-		.pc_argsize = sizeof(struct nfsd4_voidargs),
-		.pc_ressize = sizeof(struct nfsd4_voidres),
+		.pc_decode = nfssvc_decode_voidarg,
+		.pc_encode = nfssvc_encode_voidres,
+		.pc_argsize = sizeof(struct nfsd_voidargs),
+		.pc_ressize = sizeof(struct nfsd_voidres),
 		.pc_cachetype = RC_NOCACHE,
 		.pc_xdrressize = 1,
 	},
@@ -769,6 +769,7 @@ static int nfs4_init_cp_state(struct nfsd_net *nn, copy_stateid_t *stid,
 	spin_lock(&nn->s2s_cp_lock);
 	new_id = idr_alloc_cyclic(&nn->s2s_cp_stateids, stid, 0, 0, GFP_NOWAIT);
 	stid->stid.si_opaque.so_id = new_id;
+	stid->stid.si_generation = 1;
 	spin_unlock(&nn->s2s_cp_lock);
 	idr_preload_end();
 	if (new_id < 0)
@@ -3066,7 +3067,7 @@ nfsd4_exchange_id(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	rpc_ntop(sa, addr_str, sizeof(addr_str));
 	dprintk("%s rqstp=%p exid=%p clname.len=%u clname.data=%p "
-		"ip_addr=%s flags %x, spa_how %d\n",
+		"ip_addr=%s flags %x, spa_how %u\n",
 		__func__, rqstp, exid, exid->clname.len, exid->clname.data,
 		addr_str, exid->flags, exid->spa_how);
 
fs/nfsd/nfs4xdr.c: 2665 lines changed
The diff for this file is not shown because of its size.
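The rewritten decoders are too large to show here, but the style they adopt is the stock xdr_stream pattern used elsewhere in the kernel: every read goes through the stream, which bounds-checks it, instead of through the old READ*/DECODE_TAIL macros. A simplified illustration (not the literal nfs4xdr.c code):

	static __be32
	nfsd4_decode_example(struct nfsd4_compoundargs *argp, u32 *val)
	{
		__be32 *p;

		p = xdr_inline_decode(argp->xdr, XDR_UNIT);
		if (!p)				/* short buffer: clean failure */
			return nfserr_bad_xdr;
		*val = be32_to_cpup(p);
		return nfs_ok;
	}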
|
@ -73,6 +73,14 @@ extern unsigned long nfsd_drc_mem_used;
|
|||
|
||||
extern const struct seq_operations nfs_exports_op;
|
||||
|
||||
/*
|
||||
* Common void argument and result helpers
|
||||
*/
|
||||
struct nfsd_voidargs { };
|
||||
struct nfsd_voidres { };
|
||||
int nfssvc_decode_voidarg(struct svc_rqst *rqstp, __be32 *p);
|
||||
int nfssvc_encode_voidres(struct svc_rqst *rqstp, __be32 *p);
|
||||
|
||||
/*
|
||||
* Function prototypes.
|
||||
*/
|
||||
|
@@ -387,7 +395,6 @@ void nfsd_lockd_shutdown(void);
 
 #define NFSD4_2_SUPPORTED_ATTRS_WORD2 \
 	(NFSD4_1_SUPPORTED_ATTRS_WORD2 | \
-	FATTR4_WORD2_CHANGE_ATTR_TYPE | \
 	FATTR4_WORD2_MODE_UMASK | \
 	NFSD4_2_SECURITY_ATTRS | \
 	FATTR4_WORD2_XATTR_SUPPORT)
@@ -268,12 +268,20 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp)
 		if (fileid_type == FILEID_ROOT)
 			dentry = dget(exp->ex_path.dentry);
 		else {
-			dentry = exportfs_decode_fh(exp->ex_path.mnt, fid,
-					data_left, fileid_type,
-					nfsd_acceptable, exp);
-			if (IS_ERR_OR_NULL(dentry))
+			dentry = exportfs_decode_fh_raw(exp->ex_path.mnt, fid,
+							data_left, fileid_type,
+							nfsd_acceptable, exp);
+			if (IS_ERR_OR_NULL(dentry)) {
 				trace_nfsd_set_fh_dentry_badhandle(rqstp, fhp,
 						dentry ?  PTR_ERR(dentry) : -ESTALE);
+				switch (PTR_ERR(dentry)) {
+				case -ENOMEM:
+				case -ETIMEDOUT:
+					break;
+				default:
+					dentry = ERR_PTR(-ESTALE);
+				}
+			}
 		}
 		if (dentry == NULL)
 			goto out;
@@ -291,6 +299,20 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp)
 
 	fhp->fh_dentry = dentry;
 	fhp->fh_export = exp;
+
+	switch (rqstp->rq_vers) {
+	case 4:
+		if (dentry->d_sb->s_export_op->flags & EXPORT_OP_NOATOMIC_ATTR)
+			fhp->fh_no_atomic_attr = true;
+		break;
+	case 3:
+		if (dentry->d_sb->s_export_op->flags & EXPORT_OP_NOWCC)
+			fhp->fh_no_wcc = true;
+		break;
+	case 2:
+		fhp->fh_no_wcc = true;
+	}
+
 	return 0;
 out:
 	exp_put(exp);
@@ -559,6 +581,9 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry,
 	 */
 	set_version_and_fsid_type(fhp, exp, ref_fh);
 
+	/* If we have a ref_fh, then copy the fh_no_wcc setting from it. */
+	fhp->fh_no_wcc = ref_fh ? ref_fh->fh_no_wcc : false;
+
 	if (ref_fh == fhp)
 		fh_put(ref_fh);
 
@@ -662,6 +687,7 @@ fh_put(struct svc_fh *fhp)
 		exp_put(exp);
 		fhp->fh_export = NULL;
 	}
+	fhp->fh_no_wcc = false;
 	return;
 }
 
@@ -35,6 +35,12 @@ typedef struct svc_fh {
 
 	bool			fh_locked;	/* inode locked by us */
 	bool			fh_want_write;	/* remount protection taken */
+	bool			fh_no_wcc;	/* no wcc data needed */
+	bool			fh_no_atomic_attr;
+						/*
+						 * wcc data is not atomic with
+						 * operation
+						 */
 	int			fh_flags;	/* FH flags */
 #ifdef CONFIG_NFSD_V3
 	bool			fh_post_saved;	/* post-op attrs saved */
@@ -54,7 +60,6 @@ typedef struct svc_fh {
 	struct kstat		fh_post_attr;	/* full attrs after operation */
 	u64			fh_post_change; /* nfsv4 change; see above */
 #endif /* CONFIG_NFSD_V3 */
-
 } svc_fh;
 #define NFSD4_FH_FOREIGN (1<<0)
 #define SET_FH_FLAG(c, f) ((c)->fh_flags |= (f))
@@ -259,13 +264,16 @@ fh_clear_wcc(struct svc_fh *fhp)
 static inline u64 nfsd4_change_attribute(struct kstat *stat,
 					 struct inode *inode)
 {
-	u64 chattr;
+	if (IS_I_VERSION(inode)) {
+		u64 chattr;
 
-	chattr =  stat->ctime.tv_sec;
-	chattr <<= 30;
-	chattr += stat->ctime.tv_nsec;
-	chattr += inode_query_iversion(inode);
-	return chattr;
+		chattr =  stat->ctime.tv_sec;
+		chattr <<= 30;
+		chattr += stat->ctime.tv_nsec;
+		chattr += inode_query_iversion(inode);
+		return chattr;
+	} else
+		return time_to_chattr(&stat->ctime);
 }
 
 extern void fill_pre_wcc(struct svc_fh *fhp);
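Concretely, the two branches pack the attribute differently; with illustrative numbers (ctime = 1000s + 5ns, i_version = 7):

	/* I_VERSION case: ctime seconds shifted by 30, plus nsec and i_version */
	u64 a = ((u64)1000 << 30) + 5 + 7;

	/* non-I_VERSION case, time_to_chattr(): seconds shifted by 32, plus nsec */
	u64 b = ((u64)1000 << 32) + 5;

Either way the value moves forward whenever ctime moves forward, which is all an NFSv4 client requires of a change attribute.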
@@ -609,7 +609,6 @@ nfsd_proc_statfs(struct svc_rqst *rqstp)
  * NFSv2 Server procedures.
  * Only the results of non-idempotent operations are cached.
  */
-struct nfsd_void { int dummy; };
 
 #define ST 1		/* status */
 #define FH 8		/* filehandle */
@@ -618,10 +617,10 @@ struct nfsd_void { int dummy; };
 static const struct svc_procedure nfsd_procedures2[18] = {
 	[NFSPROC_NULL] = {
 		.pc_func = nfsd_proc_null,
-		.pc_decode = nfssvc_decode_void,
-		.pc_encode = nfssvc_encode_void,
-		.pc_argsize = sizeof(struct nfsd_void),
-		.pc_ressize = sizeof(struct nfsd_void),
+		.pc_decode = nfssvc_decode_voidarg,
+		.pc_encode = nfssvc_encode_voidres,
+		.pc_argsize = sizeof(struct nfsd_voidargs),
+		.pc_ressize = sizeof(struct nfsd_voidres),
 		.pc_cachetype = RC_NOCACHE,
 		.pc_xdrressize = 0,
 	},
@@ -647,10 +646,10 @@ static const struct svc_procedure nfsd_procedures2[18] = {
 	},
 	[NFSPROC_ROOT] = {
 		.pc_func = nfsd_proc_root,
-		.pc_decode = nfssvc_decode_void,
-		.pc_encode = nfssvc_encode_void,
-		.pc_argsize = sizeof(struct nfsd_void),
-		.pc_ressize = sizeof(struct nfsd_void),
+		.pc_decode = nfssvc_decode_voidarg,
+		.pc_encode = nfssvc_encode_voidres,
+		.pc_argsize = sizeof(struct nfsd_voidargs),
+		.pc_ressize = sizeof(struct nfsd_voidres),
 		.pc_cachetype = RC_NOCACHE,
 		.pc_xdrressize = 0,
 	},
@@ -685,10 +684,10 @@ static const struct svc_procedure nfsd_procedures2[18] = {
 	},
 	[NFSPROC_WRITECACHE] = {
 		.pc_func = nfsd_proc_writecache,
-		.pc_decode = nfssvc_decode_void,
-		.pc_encode = nfssvc_encode_void,
-		.pc_argsize = sizeof(struct nfsd_void),
-		.pc_ressize = sizeof(struct nfsd_void),
+		.pc_decode = nfssvc_decode_voidarg,
+		.pc_encode = nfssvc_encode_voidres,
+		.pc_argsize = sizeof(struct nfsd_voidargs),
+		.pc_ressize = sizeof(struct nfsd_voidres),
 		.pc_cachetype = RC_NOCACHE,
 		.pc_xdrressize = 0,
 	},
@@ -29,6 +29,8 @@
 #include "netns.h"
 #include "filecache.h"
 
+#include "trace.h"
+
 #define NFSDDBG_FACILITY	NFSDDBG_SVC
 
 bool inter_copy_offload_enable;
@@ -527,8 +529,7 @@ static void nfsd_last_thread(struct svc_serv *serv, struct net *net)
 		return;
 
 	nfsd_shutdown_net(net);
-	printk(KERN_WARNING "nfsd: last server has exited, flushing export "
-			    "cache\n");
+	pr_info("nfsd: last server has exited, flushing export cache\n");
 	nfsd_export_flush(net);
 }
 
@@ -1009,17 +1010,16 @@ int nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp)
 	struct kvec *resv = &rqstp->rq_res.head[0];
 	__be32	*p;
 
-	dprintk("nfsd_dispatch: vers %d proc %d\n",
-				rqstp->rq_vers, rqstp->rq_proc);
-
 	if (nfs_request_too_big(rqstp, proc))
-		goto out_too_large;
+		goto out_decode_err;
 
 	/*
 	 * Give the xdr decoder a chance to change this if it wants
 	 * (necessary in the NFSv4.0 compound case)
 	 */
 	rqstp->rq_cachetype = proc->pc_cachetype;
 
+	svcxdr_init_decode(rqstp);
 	if (!proc->pc_decode(rqstp, argv->iov_base))
 		goto out_decode_err;
 
@@ -1050,29 +1050,51 @@ int nfsd_dispatch(struct svc_rqst *rqstp, __be32 *statp)
 out_cached_reply:
 	return 1;
 
-out_too_large:
-	dprintk("nfsd: NFSv%d argument too large\n", rqstp->rq_vers);
-	*statp = rpc_garbage_args;
-	return 1;
-
 out_decode_err:
-	dprintk("nfsd: failed to decode arguments!\n");
+	trace_nfsd_garbage_args_err(rqstp);
 	*statp = rpc_garbage_args;
 	return 1;
 
 out_update_drop:
-	dprintk("nfsd: Dropping request; may be revisited later\n");
 	nfsd_cache_update(rqstp, RC_NOCACHE, NULL);
 out_dropit:
 	return 0;
 
 out_encode_err:
-	dprintk("nfsd: failed to encode result!\n");
+	trace_nfsd_cant_encode_err(rqstp);
 	nfsd_cache_update(rqstp, RC_NOCACHE, NULL);
 	*statp = rpc_system_err;
 	return 1;
 }
 
+/**
+ * nfssvc_decode_voidarg - Decode void arguments
+ * @rqstp: Server RPC transaction context
+ * @p: buffer containing arguments to decode
+ *
+ * Return values:
+ *   %0: Arguments were not valid
+ *   %1: Decoding was successful
+ */
+int nfssvc_decode_voidarg(struct svc_rqst *rqstp, __be32 *p)
+{
+	return 1;
+}
+
+/**
+ * nfssvc_encode_voidres - Encode void results
+ * @rqstp: Server RPC transaction context
+ * @p: buffer in which to encode results
+ *
+ * Return values:
+ *   %0: Local error while encoding
+ *   %1: Encoding was successful
+ */
+int nfssvc_encode_voidres(struct svc_rqst *rqstp, __be32 *p)
+{
+	return xdr_ressize_check(rqstp, p);
+}
+
 int nfsd_pool_stats_open(struct inode *inode, struct file *file)
 {
 	int ret;
@@ -192,11 +192,6 @@ __be32 *nfs2svc_encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct svc_fh *f
 /*
  * XDR decode functions
  */
-int
-nfssvc_decode_void(struct svc_rqst *rqstp, __be32 *p)
-{
-	return xdr_argsize_check(rqstp, p);
-}
-
 int
 nfssvc_decode_fhandle(struct svc_rqst *rqstp, __be32 *p)
|
|||
/*
|
||||
* XDR encode functions
|
||||
*/
|
||||
int
|
||||
nfssvc_encode_void(struct svc_rqst *rqstp, __be32 *p)
|
||||
{
|
||||
return xdr_ressize_check(rqstp, p);
|
||||
}
|
||||
|
||||
int
|
||||
nfssvc_encode_stat(struct svc_rqst *rqstp, __be32 *p)
|
||||
|
@ -469,6 +459,7 @@ int
|
|||
nfssvc_encode_readlinkres(struct svc_rqst *rqstp, __be32 *p)
|
||||
{
|
||||
struct nfsd_readlinkres *resp = rqstp->rq_resp;
|
||||
struct kvec *head = rqstp->rq_res.head;
|
||||
|
||||
*p++ = resp->status;
|
||||
if (resp->status != nfs_ok)
|
||||
|
@@ -483,6 +474,8 @@ nfssvc_encode_readlinkres(struct svc_rqst *rqstp, __be32 *p)
 		*p = 0;
 		rqstp->rq_res.tail[0].iov_len = 4 - (resp->len&3);
 	}
+	if (svc_encode_result_payload(rqstp, head->iov_len, resp->len))
+		return 0;
 	return 1;
 }
 
@@ -490,6 +483,7 @@ int
 nfssvc_encode_readres(struct svc_rqst *rqstp, __be32 *p)
 {
 	struct nfsd_readres *resp = rqstp->rq_resp;
+	struct kvec *head = rqstp->rq_res.head;
 
 	*p++ = resp->status;
 	if (resp->status != nfs_ok)
|
@ -507,6 +501,8 @@ nfssvc_encode_readres(struct svc_rqst *rqstp, __be32 *p)
|
|||
*p = 0;
|
||||
rqstp->rq_res.tail[0].iov_len = 4 - (resp->count&3);
|
||||
}
|
||||
if (svc_encode_result_payload(rqstp, head->iov_len, resp->count))
|
||||
return 0;
|
||||
return 1;
|
||||
}
|
||||
|
||||
|
|
|
@@ -1,3 +1,4 @@
 // SPDX-License-Identifier: GPL-2.0
+
 #define CREATE_TRACE_POINTS
 #include "trace.h"
fs/nfsd/trace.h: 176 lines changed
@@ -12,6 +12,100 @@
 #include "export.h"
 #include "nfsfh.h"
 
+#define NFSD_TRACE_PROC_ARG_FIELDS \
+		__field(unsigned int, netns_ino) \
+		__field(u32, xid) \
+		__array(unsigned char, server, sizeof(struct sockaddr_in6)) \
+		__array(unsigned char, client, sizeof(struct sockaddr_in6))
+
+#define NFSD_TRACE_PROC_ARG_ASSIGNMENTS \
+		do { \
+			__entry->netns_ino = SVC_NET(rqstp)->ns.inum; \
+			__entry->xid = be32_to_cpu(rqstp->rq_xid); \
+			memcpy(__entry->server, &rqstp->rq_xprt->xpt_local, \
+			       rqstp->rq_xprt->xpt_locallen); \
+			memcpy(__entry->client, &rqstp->rq_xprt->xpt_remote, \
+			       rqstp->rq_xprt->xpt_remotelen); \
+		} while (0);
+
+#define NFSD_TRACE_PROC_RES_FIELDS \
+		__field(unsigned int, netns_ino) \
+		__field(u32, xid) \
+		__field(unsigned long, status) \
+		__array(unsigned char, server, sizeof(struct sockaddr_in6)) \
+		__array(unsigned char, client, sizeof(struct sockaddr_in6))
+
+#define NFSD_TRACE_PROC_RES_ASSIGNMENTS(error) \
+		do { \
+			__entry->netns_ino = SVC_NET(rqstp)->ns.inum; \
+			__entry->xid = be32_to_cpu(rqstp->rq_xid); \
+			__entry->status = be32_to_cpu(error); \
+			memcpy(__entry->server, &rqstp->rq_xprt->xpt_local, \
+			       rqstp->rq_xprt->xpt_locallen); \
+			memcpy(__entry->client, &rqstp->rq_xprt->xpt_remote, \
+			       rqstp->rq_xprt->xpt_remotelen); \
+		} while (0);
+
+TRACE_EVENT(nfsd_garbage_args_err,
+	TP_PROTO(
+		const struct svc_rqst *rqstp
+	),
+	TP_ARGS(rqstp),
+	TP_STRUCT__entry(
+		NFSD_TRACE_PROC_ARG_FIELDS
+
+		__field(u32, vers)
+		__field(u32, proc)
+	),
+	TP_fast_assign(
+		NFSD_TRACE_PROC_ARG_ASSIGNMENTS
+
+		__entry->vers = rqstp->rq_vers;
+		__entry->proc = rqstp->rq_proc;
+	),
+	TP_printk("xid=0x%08x vers=%u proc=%u",
+		__entry->xid, __entry->vers, __entry->proc
+	)
+);
+
+TRACE_EVENT(nfsd_cant_encode_err,
+	TP_PROTO(
+		const struct svc_rqst *rqstp
+	),
+	TP_ARGS(rqstp),
+	TP_STRUCT__entry(
+		NFSD_TRACE_PROC_ARG_FIELDS
+
+		__field(u32, vers)
+		__field(u32, proc)
+	),
+	TP_fast_assign(
+		NFSD_TRACE_PROC_ARG_ASSIGNMENTS
+
+		__entry->vers = rqstp->rq_vers;
+		__entry->proc = rqstp->rq_proc;
+	),
+	TP_printk("xid=0x%08x vers=%u proc=%u",
+		__entry->xid, __entry->vers, __entry->proc
+	)
+);
+
+#define show_nfsd_may_flags(x)						\
+	__print_flags(x, "|",						\
+		{ NFSD_MAY_EXEC,		"EXEC" },		\
+		{ NFSD_MAY_WRITE,		"WRITE" },		\
+		{ NFSD_MAY_READ,		"READ" },		\
+		{ NFSD_MAY_SATTR,		"SATTR" },		\
+		{ NFSD_MAY_TRUNC,		"TRUNC" },		\
+		{ NFSD_MAY_LOCK,		"LOCK" },		\
+		{ NFSD_MAY_OWNER_OVERRIDE,	"OWNER_OVERRIDE" },	\
+		{ NFSD_MAY_LOCAL_ACCESS,	"LOCAL_ACCESS" },	\
+		{ NFSD_MAY_BYPASS_GSS_ON_ROOT,	"BYPASS_GSS_ON_ROOT" },	\
+		{ NFSD_MAY_NOT_BREAK_LEASE,	"NOT_BREAK_LEASE" },	\
+		{ NFSD_MAY_BYPASS_GSS,		"BYPASS_GSS" },		\
+		{ NFSD_MAY_READ_IF_EXEC,	"READ_IF_EXEC" },	\
+		{ NFSD_MAY_64BIT_COOKIE,	"64BIT_COOKIE" })
+
 TRACE_EVENT(nfsd_compound,
 	TP_PROTO(const struct svc_rqst *rqst,
 		u32 args_opcnt),
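Once running, the new events behave like any other nfsd tracepoint. A typical tracefs session (on older systems the tree is under /sys/kernel/debug/tracing instead):

	echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_garbage_args_err/enable
	echo 1 > /sys/kernel/tracing/events/nfsd/nfsd_cant_encode_err/enable
	cat /sys/kernel/tracing/trace_pipe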
@@ -51,6 +145,56 @@ TRACE_EVENT(nfsd_compound_status,
 		__get_str(name), __entry->status)
 )
 
+TRACE_EVENT(nfsd_compound_decode_err,
+	TP_PROTO(
+		const struct svc_rqst *rqstp,
+		u32 args_opcnt,
+		u32 resp_opcnt,
+		u32 opnum,
+		__be32 status
+	),
+	TP_ARGS(rqstp, args_opcnt, resp_opcnt, opnum, status),
+	TP_STRUCT__entry(
+		NFSD_TRACE_PROC_RES_FIELDS
+
+		__field(u32, args_opcnt)
+		__field(u32, resp_opcnt)
+		__field(u32, opnum)
+	),
+	TP_fast_assign(
+		NFSD_TRACE_PROC_RES_ASSIGNMENTS(status)
+
+		__entry->args_opcnt = args_opcnt;
+		__entry->resp_opcnt = resp_opcnt;
+		__entry->opnum = opnum;
+	),
+	TP_printk("op=%u/%u opnum=%u status=%lu",
+		__entry->resp_opcnt, __entry->args_opcnt,
+		__entry->opnum, __entry->status)
+);
+
+TRACE_EVENT(nfsd_compound_encode_err,
+	TP_PROTO(
+		const struct svc_rqst *rqstp,
+		u32 opnum,
+		__be32 status
+	),
+	TP_ARGS(rqstp, opnum, status),
+	TP_STRUCT__entry(
+		NFSD_TRACE_PROC_RES_FIELDS
+
+		__field(u32, opnum)
+	),
+	TP_fast_assign(
+		NFSD_TRACE_PROC_RES_ASSIGNMENTS(status)
+
+		__entry->opnum = opnum;
+	),
+	TP_printk("opnum=%u status=%lu",
+		__entry->opnum, __entry->status)
+);
+
+
 DECLARE_EVENT_CLASS(nfsd_fh_err_class,
 	TP_PROTO(struct svc_rqst *rqstp,
 		struct svc_fh *fhp,
@@ -421,6 +565,9 @@ TRACE_EVENT(nfsd_clid_inuse_err,
 		__entry->cl_boot, __entry->cl_id)
 )
 
+/*
+ * from fs/nfsd/filecache.h
+ */
 TRACE_DEFINE_ENUM(NFSD_FILE_HASHED);
 TRACE_DEFINE_ENUM(NFSD_FILE_PENDING);
 TRACE_DEFINE_ENUM(NFSD_FILE_BREAK_READ);
@@ -435,13 +582,6 @@ TRACE_DEFINE_ENUM(NFSD_FILE_REFERENCED);
 		{ 1 << NFSD_FILE_BREAK_WRITE,	"BREAK_WRITE" },	\
 		{ 1 << NFSD_FILE_REFERENCED,	"REFERENCED"})
 
-/* FIXME: This should probably be fleshed out in the future. */
-#define show_nf_may(val)						\
-	__print_flags(val, "|",						\
-		{ NFSD_MAY_READ,		"READ" },		\
-		{ NFSD_MAY_WRITE,		"WRITE" },		\
-		{ NFSD_MAY_NOT_BREAK_LEASE,	"NOT_BREAK_LEASE" })
-
 DECLARE_EVENT_CLASS(nfsd_file_class,
 	TP_PROTO(struct nfsd_file *nf),
 	TP_ARGS(nf),
@@ -461,12 +601,12 @@ DECLARE_EVENT_CLASS(nfsd_file_class,
 		__entry->nf_may = nf->nf_may;
 		__entry->nf_file = nf->nf_file;
 	),
-	TP_printk("hash=0x%x inode=0x%p ref=%d flags=%s may=%s file=%p",
+	TP_printk("hash=0x%x inode=%p ref=%d flags=%s may=%s file=%p",
 		__entry->nf_hashval,
 		__entry->nf_inode,
 		__entry->nf_ref,
 		show_nf_flags(__entry->nf_flags),
-		show_nf_may(__entry->nf_may),
+		show_nfsd_may_flags(__entry->nf_may),
 		__entry->nf_file)
 )
 
@@ -492,10 +632,10 @@ TRACE_EVENT(nfsd_file_acquire,
 		__field(u32, xid)
 		__field(unsigned int, hash)
 		__field(void *, inode)
-		__field(unsigned int, may_flags)
+		__field(unsigned long, may_flags)
 		__field(int, nf_ref)
 		__field(unsigned long, nf_flags)
-		__field(unsigned char, nf_may)
+		__field(unsigned long, nf_may)
 		__field(struct file *, nf_file)
 		__field(u32, status)
 	),
@@ -512,12 +652,12 @@ TRACE_EVENT(nfsd_file_acquire,
 		__entry->status = be32_to_cpu(status);
 	),
 
-	TP_printk("xid=0x%x hash=0x%x inode=0x%p may_flags=%s ref=%d nf_flags=%s nf_may=%s nf_file=0x%p status=%u",
+	TP_printk("xid=0x%x hash=0x%x inode=%p may_flags=%s ref=%d nf_flags=%s nf_may=%s nf_file=%p status=%u",
 			__entry->xid, __entry->hash, __entry->inode,
-			show_nf_may(__entry->may_flags), __entry->nf_ref,
-			show_nf_flags(__entry->nf_flags),
-			show_nf_may(__entry->nf_may), __entry->nf_file,
-			__entry->status)
+			show_nfsd_may_flags(__entry->may_flags),
+			__entry->nf_ref, show_nf_flags(__entry->nf_flags),
+			show_nfsd_may_flags(__entry->nf_may),
+			__entry->nf_file, __entry->status)
 );
 
 DECLARE_EVENT_CLASS(nfsd_file_search_class,
@@ -533,7 +673,7 @@ DECLARE_EVENT_CLASS(nfsd_file_search_class,
 		__entry->hash = hash;
 		__entry->found = found;
 	),
-	TP_printk("hash=0x%x inode=0x%p found=%d", __entry->hash,
+	TP_printk("hash=0x%x inode=%p found=%d", __entry->hash,
 		__entry->inode, __entry->found)
 );
 
@@ -561,7 +701,7 @@ TRACE_EVENT(nfsd_file_fsnotify_handle_event,
 		__entry->mode = inode->i_mode;
 		__entry->mask = mask;
 	),
-	TP_printk("inode=0x%p nlink=%u mode=0%ho mask=0x%x", __entry->inode,
+	TP_printk("inode=%p nlink=%u mode=0%ho mask=0x%x", __entry->inode,
 		__entry->nlink, __entry->mode, __entry->mask)
 );
 
@@ -978,18 +978,25 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
 				__be32 *verf)
 {
 	struct file		*file = nf->nf_file;
+	struct super_block	*sb = file_inode(file)->i_sb;
 	struct svc_export	*exp;
 	struct iov_iter		iter;
 	__be32			nfserr;
 	int			host_err;
 	int			use_wgather;
 	loff_t			pos = offset;
+	unsigned long		exp_op_flags = 0;
 	unsigned int		pflags = current->flags;
 	rwf_t			flags = 0;
+	bool			restore_flags = false;
 
 	trace_nfsd_write_opened(rqstp, fhp, offset, *cnt);
 
-	if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
+	if (sb->s_export_op)
+		exp_op_flags = sb->s_export_op->flags;
+
+	if (test_bit(RQ_LOCAL, &rqstp->rq_flags) &&
+	    !(exp_op_flags & EXPORT_OP_REMOTE_FS)) {
 		/*
 		 * We want throttling in balance_dirty_pages()
 		 * and shrink_inactive_list() to only consider
@@ -998,6 +1005,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct nfsd_file *nf,
 		 * the client's dirty pages or its congested queue.
 		 */
 		current->flags |= PF_LOCAL_THROTTLE;
+		restore_flags = true;
+	}
 
 	exp = fhp->fh_export;
 	use_wgather = (rqstp->rq_vers == 2) && EX_WGATHER(exp);
@@ -1049,7 +1058,7 @@ out_nfserr:
 		trace_nfsd_write_err(rqstp, fhp, offset, host_err);
 		nfserr = nfserrno(host_err);
 	}
-	if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
+	if (restore_flags)
 		current_restore_flags(pflags, PF_LOCAL_THROTTLE);
 	return nfserr;
 }
@@ -1724,7 +1733,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct svc_fh *ffhp, char *fname, int flen,
 	struct inode	*fdir, *tdir;
 	__be32		err;
 	int		host_err;
-	bool		has_cached = false;
+	bool		close_cached = false;
 
 	err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
 	if (err)
@@ -1783,8 +1792,9 @@ retry:
 	if (ffhp->fh_export->ex_path.dentry != tfhp->fh_export->ex_path.dentry)
 		goto out_dput_new;
 
-	if (nfsd_has_cached_files(ndentry)) {
-		has_cached = true;
+	if ((ndentry->d_sb->s_export_op->flags & EXPORT_OP_CLOSE_BEFORE_UNLINK) &&
+	    nfsd_has_cached_files(ndentry)) {
+		close_cached = true;
 		goto out_dput_old;
 	} else {
 		host_err = vfs_rename(fdir, odentry, tdir, ndentry, NULL, 0);
@@ -1805,7 +1815,7 @@ retry:
 	 * as that would do the wrong thing if the two directories
 	 * were the same, so again we do it by hand.
 	 */
-	if (!has_cached) {
+	if (!close_cached) {
 		fill_post_wcc(ffhp);
 		fill_post_wcc(tfhp);
 	}
@@ -1819,8 +1829,8 @@ retry:
 	 * shouldn't be done with locks held however, so we delay it until this
 	 * point and then reattempt the whole shebang.
 	 */
-	if (has_cached) {
-		has_cached = false;
+	if (close_cached) {
+		close_cached = false;
 		nfsd_close_cached_files(ndentry);
 		dput(ndentry);
 		goto retry;
@@ -1872,7 +1882,8 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
 		type = d_inode(rdentry)->i_mode & S_IFMT;
 
 	if (type != S_IFDIR) {
-		nfsd_close_cached_files(rdentry);
+		if (rdentry->d_sb->s_export_op->flags & EXPORT_OP_CLOSE_BEFORE_UNLINK)
+			nfsd_close_cached_files(rdentry);
 		host_err = vfs_unlink(dirp, rdentry, NULL);
 	} else {
 		host_err = vfs_rmdir(dirp, rdentry);
@@ -144,7 +144,6 @@ union nfsd_xdrstore {
 #define NFS2_SVC_XDRSIZE	sizeof(union nfsd_xdrstore)
 
 
-int nfssvc_decode_void(struct svc_rqst *, __be32 *);
 int nfssvc_decode_fhandle(struct svc_rqst *, __be32 *);
 int nfssvc_decode_sattrargs(struct svc_rqst *, __be32 *);
 int nfssvc_decode_diropargs(struct svc_rqst *, __be32 *);
@@ -156,7 +155,6 @@ int nfssvc_decode_readlinkargs(struct svc_rqst *, __be32 *);
 int nfssvc_decode_linkargs(struct svc_rqst *, __be32 *);
 int nfssvc_decode_symlinkargs(struct svc_rqst *, __be32 *);
 int nfssvc_decode_readdirargs(struct svc_rqst *, __be32 *);
-int nfssvc_encode_void(struct svc_rqst *, __be32 *);
 int nfssvc_encode_stat(struct svc_rqst *, __be32 *);
 int nfssvc_encode_attrstat(struct svc_rqst *, __be32 *);
 int nfssvc_encode_diropres(struct svc_rqst *, __be32 *);
@@ -273,7 +273,6 @@ union nfsd3_xdrstore {
 
 #define NFS3_SVC_XDRSIZE		sizeof(union nfsd3_xdrstore)
 
-int nfs3svc_decode_voidarg(struct svc_rqst *, __be32 *);
 int nfs3svc_decode_fhandle(struct svc_rqst *, __be32 *);
 int nfs3svc_decode_sattrargs(struct svc_rqst *, __be32 *);
 int nfs3svc_decode_diropargs(struct svc_rqst *, __be32 *);
@@ -290,7 +289,6 @@ int nfs3svc_decode_symlinkargs(struct svc_rqst *, __be32 *);
 int nfs3svc_decode_readdirargs(struct svc_rqst *, __be32 *);
 int nfs3svc_decode_readdirplusargs(struct svc_rqst *, __be32 *);
 int nfs3svc_decode_commitargs(struct svc_rqst *, __be32 *);
-int nfs3svc_encode_voidres(struct svc_rqst *, __be32 *);
 int nfs3svc_encode_attrstat(struct svc_rqst *, __be32 *);
 int nfs3svc_encode_wccstat(struct svc_rqst *, __be32 *);
 int nfs3svc_encode_diropres(struct svc_rqst *, __be32 *);
@@ -76,12 +76,7 @@ static inline bool nfsd4_has_session(struct nfsd4_compound_state *cs)
 
 struct nfsd4_change_info {
 	u32		atomic;
-	bool		change_supported;
-	u32		before_ctime_sec;
-	u32		before_ctime_nsec;
 	u64		before_change;
-	u32		after_ctime_sec;
-	u32		after_ctime_nsec;
 	u64		after_change;
 };
|
@ -252,7 +247,8 @@ struct nfsd4_listxattrs {
|
|||
|
||||
struct nfsd4_open {
|
||||
u32 op_claim_type; /* request */
|
||||
struct xdr_netobj op_fname; /* request - everything but CLAIM_PREV */
|
||||
u32 op_fnamelen;
|
||||
char * op_fname; /* request - everything but CLAIM_PREV */
|
||||
u32 op_delegate_type; /* request - CLAIM_PREV only */
|
||||
stateid_t op_delegate_stateid; /* request - response */
|
||||
u32 op_why_no_deleg; /* response - DELEG_NONE_EXT only */
|
||||
|
@@ -385,13 +381,6 @@ struct nfsd4_setclientid_confirm {
 	nfs4_verifier	sc_confirm;
 };
 
-struct nfsd4_saved_compoundargs {
-	__be32 *p;
-	__be32 *end;
-	int pagelen;
-	struct page **pagelist;
-};
-
 struct nfsd4_test_stateid_id {
 	__be32		ts_id_status;
 	stateid_t	ts_id_stateid;
|
@ -419,8 +408,7 @@ struct nfsd4_write {
|
|||
u64 wr_offset; /* request */
|
||||
u32 wr_stable_how; /* request */
|
||||
u32 wr_buflen; /* request */
|
||||
struct kvec wr_head;
|
||||
struct page ** wr_pagelist; /* request */
|
||||
struct xdr_buf wr_payload; /* request */
|
||||
|
||||
u32 wr_bytes_written; /* response */
|
||||
u32 wr_how_written; /* response */
|
||||
|
@@ -433,7 +421,7 @@ struct nfsd4_exchange_id {
 	u32		flags;
 	clientid_t	clientid;
 	u32		seqid;
-	int		spa_how;
+	u32		spa_how;
 	u32             spo_must_enforce[3];
 	u32             spo_must_allow[3];
 	struct xdr_netobj nii_domain;
@@ -554,7 +542,7 @@ struct nfsd4_copy {
 	bool		cp_intra;
 
 	/* both */
-	bool		cp_synchronous;
+	u32		cp_synchronous;
 
 	/* response */
 	struct nfsd42_write_res	cp_res;
@@ -615,7 +603,7 @@ struct nfsd4_copy_notify {
 };
 
 struct nfsd4_op {
-	int					opnum;
+	u32					opnum;
 	const struct nfsd4_operation *		opdesc;
 	__be32					status;
 	union nfsd4_op_u {
@@ -696,15 +684,8 @@ struct svcxdr_tmpbuf {
 
 struct nfsd4_compoundargs {
 	/* scratch variables for XDR decode */
-	__be32 *			p;
-	__be32 *			end;
-	struct page **			pagelist;
-	int				pagelen;
-	bool				tail;
-	__be32				tmp[8];
-	__be32 *			tmpp;
+	struct xdr_stream		*xdr;
 	struct svcxdr_tmpbuf		*to_free;
-
 	struct svc_rqst			*rqstp;
 
 	u32				taglen;
@@ -767,22 +748,14 @@ static inline void
 set_change_info(struct nfsd4_change_info *cinfo, struct svc_fh *fhp)
 {
 	BUG_ON(!fhp->fh_pre_saved);
-	cinfo->atomic = (u32)fhp->fh_post_saved;
-	cinfo->change_supported = IS_I_VERSION(d_inode(fhp->fh_dentry));
+	cinfo->atomic = (u32)(fhp->fh_post_saved && !fhp->fh_no_atomic_attr);
 
 	cinfo->before_change = fhp->fh_pre_change;
 	cinfo->after_change = fhp->fh_post_change;
-	cinfo->before_ctime_sec = fhp->fh_pre_ctime.tv_sec;
-	cinfo->before_ctime_nsec = fhp->fh_pre_ctime.tv_nsec;
-	cinfo->after_ctime_sec = fhp->fh_post_attr.ctime.tv_sec;
-	cinfo->after_ctime_nsec = fhp->fh_post_attr.ctime.tv_nsec;
-
 }
 
 
 bool nfsd4_mach_creds_match(struct nfs4_client *cl, struct svc_rqst *rqstp);
-int nfs4svc_decode_voidarg(struct svc_rqst *, __be32 *);
-int nfs4svc_encode_voidres(struct svc_rqst *, __be32 *);
 int nfs4svc_decode_compoundargs(struct svc_rqst *, __be32 *);
 int nfs4svc_encode_compoundres(struct svc_rqst *, __be32 *);
 __be32 nfsd4_check_resp_size(struct nfsd4_compoundres *, u32);
|
@@ -213,12 +213,25 @@ struct export_operations {
 			  bool write, u32 *device_generation);
 	int (*commit_blocks)(struct inode *inode, struct iomap *iomaps,
 			     int nr_iomaps, struct iattr *iattr);
+#define	EXPORT_OP_NOWCC			(0x1) /* don't collect v3 wcc data */
+#define	EXPORT_OP_NOSUBTREECHK		(0x2) /* no subtree checking */
+#define	EXPORT_OP_CLOSE_BEFORE_UNLINK	(0x4) /* close files before unlink */
+#define EXPORT_OP_REMOTE_FS		(0x8) /* Filesystem is remote */
+#define EXPORT_OP_NOATOMIC_ATTR		(0x10) /* Filesystem cannot supply
+						  atomic attribute updates
+						*/
+	unsigned long	flags;
 };

 extern int exportfs_encode_inode_fh(struct inode *inode, struct fid *fid,
 				    int *max_len, struct inode *parent);
 extern int exportfs_encode_fh(struct dentry *dentry, struct fid *fid,
 			      int *max_len, int connectable);
+extern struct dentry *exportfs_decode_fh_raw(struct vfsmount *mnt,
+					     struct fid *fid, int fh_len,
+					     int fileid_type,
+					     int (*acceptable)(void *, struct dentry *),
+					     void *context);
 extern struct dentry *exportfs_decode_fh(struct vfsmount *mnt, struct fid *fid,
 	int fh_len, int fileid_type, int (*acceptable)(void *, struct dentry *),
 	void *context);

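For illustration, here is a minimal sketch, not part of this merge, of how a filesystem might use the new flags field; the examplefs names are hypothetical:

/* Hypothetical filesystem; all examplefs_* names are illustrative only. */
static struct dentry *examplefs_fh_to_dentry(struct super_block *sb,
					     struct fid *fid,
					     int fh_len, int fh_type)
{
	return NULL;	/* real code would map the fid to a dentry */
}

static const struct export_operations examplefs_export_ops = {
	.fh_to_dentry	= examplefs_fh_to_dentry,
	.flags		= EXPORT_OP_NOWCC | EXPORT_OP_NOSUBTREECHK |
			  EXPORT_OP_REMOTE_FS | EXPORT_OP_NOATOMIC_ATTR,
};

A remote filesystem being re-exported would typically set EXPORT_OP_REMOTE_FS so that NFSD can relax behaviors, such as wcc collection, that assume cheap local attribute access.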
@@ -328,6 +328,19 @@ inode_query_iversion(struct inode *inode)
 	return cur >> I_VERSION_QUERIED_SHIFT;
 }

+/*
+ * For filesystems without any sort of change attribute, the best we can
+ * do is fake one up from the ctime:
+ */
+static inline u64 time_to_chattr(struct timespec64 *t)
+{
+	u64 chattr = t->tv_sec;
+
+	chattr <<= 32;
+	chattr += t->tv_nsec;
+	return chattr;
+}
+
 /**
  * inode_eq_iversion_raw - check whether the raw i_version counter has changed
  * @inode: inode to check

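A sketch of the intended usage, similar in spirit to NFSD's change-attribute helper but with a hypothetical function name: prefer i_version when the filesystem maintains one, and otherwise fall back to the ctime-derived value.

/* Illustrative sketch only; example_change_attr is not in this series. */
static u64 example_change_attr(struct inode *inode)
{
	if (IS_I_VERSION(inode))
		return inode_query_iversion(inode);
	return time_to_chattr(&inode->i_ctime);
}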
@@ -385,13 +385,6 @@ enum lock_type4 {
 	NFS4_WRITEW_LT = 4
 };

-enum change_attr_type4 {
-	NFS4_CHANGE_TYPE_IS_MONOTONIC_INCR = 0,
-	NFS4_CHANGE_TYPE_IS_VERSION_COUNTER = 1,
-	NFS4_CHANGE_TYPE_IS_VERSION_COUNTER_NOPNFS = 2,
-	NFS4_CHANGE_TYPE_IS_TIME_METADATA = 3,
-	NFS4_CHANGE_TYPE_IS_UNDEFINED = 4
-};

 /* Mandatory Attributes */
 #define FATTR4_WORD0_SUPPORTED_ATTRS	(1UL << 0)

@@ -459,7 +452,6 @@ enum change_attr_type4 {
 #define FATTR4_WORD2_LAYOUT_BLKSIZE	(1UL << 1)
 #define FATTR4_WORD2_MDSTHRESHOLD	(1UL << 4)
 #define FATTR4_WORD2_CLONE_BLKSIZE	(1UL << 13)
-#define FATTR4_WORD2_CHANGE_ATTR_TYPE	(1UL << 15)
 #define FATTR4_WORD2_SECURITY_LABEL	(1UL << 16)
 #define FATTR4_WORD2_MODE_UMASK		(1UL << 17)
 #define FATTR4_WORD2_XATTR_SUPPORT	(1UL << 18)

@@ -247,6 +247,8 @@ struct svc_rqst {

 	size_t			rq_xprt_hlen;	/* xprt header len */
 	struct xdr_buf		rq_arg;
+	struct xdr_stream	rq_arg_stream;
+	struct page		*rq_scratch_page;
 	struct xdr_buf		rq_res;
 	struct page		*rq_pages[RPCSVC_MAXPAGES + 1];
 	struct page *		*rq_respages;	/* points into rq_pages */

@@ -519,9 +521,9 @@ void		   svc_wake_up(struct svc_serv *);
 void		   svc_reserve(struct svc_rqst *rqstp, int space);
 struct svc_pool *  svc_pool_for_cpu(struct svc_serv *serv, int cpu);
 char *		   svc_print_addr(struct svc_rqst *, char *, size_t);
-int		   svc_encode_read_payload(struct svc_rqst *rqstp,
-					   unsigned int offset,
-					   unsigned int length);
+int		   svc_encode_result_payload(struct svc_rqst *rqstp,
+					     unsigned int offset,
+					     unsigned int length);
 unsigned int	   svc_fill_write_vector(struct svc_rqst *rqstp,
 					 struct page **pages,
 					 struct kvec *first, size_t total);

@@ -557,4 +559,18 @@ static inline void svc_reserve_auth(struct svc_rqst *rqstp, int space)
 	svc_reserve(rqstp, space + rqstp->rq_auth_slack);
 }

+/**
+ * svcxdr_init_decode - Prepare an xdr_stream for svc Call decoding
+ * @rqstp: controlling server RPC transaction context
+ *
+ */
+static inline void svcxdr_init_decode(struct svc_rqst *rqstp)
+{
+	struct xdr_stream *xdr = &rqstp->rq_arg_stream;
+	struct kvec *argv = rqstp->rq_arg.head;
+
+	xdr_init_decode(xdr, &rqstp->rq_arg, argv->iov_base, NULL);
+	xdr_set_scratch_page(xdr, rqstp->rq_scratch_page);
+}
+
 #endif /* SUNRPC_SVC_H */

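To show how the new per-rqst stream is meant to be consumed, here is a minimal sketch, not from this series, of a dispatcher that prepares the argument stream once and then decodes with the xdr_stream helpers; the procedure layout is hypothetical.

/* Illustrative sketch only; the decoded fields are made up. */
static int example_dispatch(struct svc_rqst *rqstp)
{
	u32 proc;

	svcxdr_init_decode(rqstp);
	if (xdr_stream_decode_u32(&rqstp->rq_arg_stream, &proc) < 0)
		return 0;	/* garbage args */
	/* ... look up and invoke the decoder for @proc ... */
	return 1;
}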
@@ -47,6 +47,8 @@
 #include <linux/sunrpc/svcsock.h>
 #include <linux/sunrpc/rpc_rdma.h>
 #include <linux/sunrpc/rpc_rdma_cid.h>
+#include <linux/sunrpc/svc_rdma_pcl.h>
+
 #include <rdma/ib_verbs.h>
 #include <rdma/rdma_cm.h>

@@ -142,10 +144,15 @@ struct svc_rdma_recv_ctxt {
 	unsigned int		rc_page_count;
 	unsigned int		rc_hdr_count;
 	u32			rc_inv_rkey;
-	__be32			*rc_write_list;
-	__be32			*rc_reply_chunk;
-	unsigned int		rc_read_payload_offset;
-	unsigned int		rc_read_payload_length;
+	__be32			rc_msgtype;
+
+	struct svc_rdma_pcl	rc_call_pcl;
+
+	struct svc_rdma_pcl	rc_read_pcl;
+	struct svc_rdma_chunk	*rc_cur_result_payload;
+	struct svc_rdma_pcl	rc_write_pcl;
+	struct svc_rdma_pcl	rc_reply_pcl;

 	struct page		*rc_pages[RPCSVC_MAXPAGES];
 };

@@ -171,6 +178,8 @@ extern void svc_rdma_handle_bc_reply(struct svc_rqst *rqstp,
 /* svc_rdma_recvfrom.c */
 extern void svc_rdma_recv_ctxts_destroy(struct svcxprt_rdma *rdma);
 extern bool svc_rdma_post_recvs(struct svcxprt_rdma *rdma);
+extern struct svc_rdma_recv_ctxt *
+		svc_rdma_recv_ctxt_get(struct svcxprt_rdma *rdma);
 extern void svc_rdma_recv_ctxt_put(struct svcxprt_rdma *rdma,
 				   struct svc_rdma_recv_ctxt *ctxt);
 extern void svc_rdma_flush_recv_queues(struct svcxprt_rdma *rdma);

@@ -179,16 +188,15 @@ extern int svc_rdma_recvfrom(struct svc_rqst *);

 /* svc_rdma_rw.c */
 extern void svc_rdma_destroy_rw_ctxts(struct svcxprt_rdma *rdma);
-extern int svc_rdma_recv_read_chunk(struct svcxprt_rdma *rdma,
-				    struct svc_rqst *rqstp,
-				    struct svc_rdma_recv_ctxt *head, __be32 *p);
 extern int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma,
-				     __be32 *wr_ch, struct xdr_buf *xdr,
-				     unsigned int offset,
-				     unsigned long length);
+				     const struct svc_rdma_chunk *chunk,
+				     const struct xdr_buf *xdr);
 extern int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma,
 				     const struct svc_rdma_recv_ctxt *rctxt,
-				     struct xdr_buf *xdr);
+				     const struct xdr_buf *xdr);
+extern int svc_rdma_process_read_list(struct svcxprt_rdma *rdma,
+				      struct svc_rqst *rqstp,
+				      struct svc_rdma_recv_ctxt *head);

 /* svc_rdma_sendto.c */
 extern void svc_rdma_send_ctxts_destroy(struct svcxprt_rdma *rdma);

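The new signature takes a parsed chunk instead of raw header pointers, which is what enables multiple Write chunks per RPC. A sketch, not from this series, of how a sendto path could walk the parsed Write list:

/* Illustrative sketch only; the function name is made up. */
static int example_send_write_list(struct svcxprt_rdma *rdma,
				   struct svc_rdma_recv_ctxt *rctxt,
				   const struct xdr_buf *payload)
{
	struct svc_rdma_chunk *chunk;
	int ret;

	/* One RDMA Write per Write chunk the client provided */
	pcl_for_each_chunk(chunk, &rctxt->rc_write_pcl) {
		ret = svc_rdma_send_write_chunk(rdma, chunk, payload);
		if (ret < 0)
			return ret;
	}
	return 0;
}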
@@ -201,14 +209,14 @@ extern int svc_rdma_send(struct svcxprt_rdma *rdma,
 extern int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma,
 				  struct svc_rdma_send_ctxt *sctxt,
 				  const struct svc_rdma_recv_ctxt *rctxt,
-				  struct xdr_buf *xdr);
+				  const struct xdr_buf *xdr);
 extern void svc_rdma_send_error_msg(struct svcxprt_rdma *rdma,
 				    struct svc_rdma_send_ctxt *sctxt,
 				    struct svc_rdma_recv_ctxt *rctxt,
 				    int status);
 extern int svc_rdma_sendto(struct svc_rqst *);
-extern int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset,
-				 unsigned int length);
+extern int svc_rdma_result_payload(struct svc_rqst *rqstp, unsigned int offset,
+				   unsigned int length);

 /* svc_rdma_transport.c */
 extern struct svc_xprt_class svc_rdma_class;

@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2020, Oracle and/or its affiliates
+ */
+
+#ifndef SVC_RDMA_PCL_H
+#define SVC_RDMA_PCL_H
+
+#include <linux/list.h>
+
+struct svc_rdma_segment {
+	u32			rs_handle;
+	u32			rs_length;
+	u64			rs_offset;
+};
+
+struct svc_rdma_chunk {
+	struct list_head	ch_list;
+
+	u32			ch_position;
+	u32			ch_length;
+	u32			ch_payload_length;
+
+	u32			ch_segcount;
+	struct svc_rdma_segment	ch_segments[];
+};
+
+struct svc_rdma_pcl {
+	unsigned int		cl_count;
+	struct list_head	cl_chunks;
+};
+
+/**
+ * pcl_init - Initialize a parsed chunk list
+ * @pcl: parsed chunk list to initialize
+ *
+ */
+static inline void pcl_init(struct svc_rdma_pcl *pcl)
+{
+	INIT_LIST_HEAD(&pcl->cl_chunks);
+}
+
+/**
+ * pcl_is_empty - Return true if parsed chunk list is empty
+ * @pcl: parsed chunk list
+ *
+ */
+static inline bool pcl_is_empty(const struct svc_rdma_pcl *pcl)
+{
+	return list_empty(&pcl->cl_chunks);
+}
+
+/**
+ * pcl_first_chunk - Return first chunk in a parsed chunk list
+ * @pcl: parsed chunk list
+ *
+ * Returns the first chunk in the list, or NULL if the list is empty.
+ */
+static inline struct svc_rdma_chunk *
+pcl_first_chunk(const struct svc_rdma_pcl *pcl)
+{
+	if (pcl_is_empty(pcl))
+		return NULL;
+	return list_first_entry(&pcl->cl_chunks, struct svc_rdma_chunk,
+				ch_list);
+}
+
+/**
+ * pcl_next_chunk - Return next chunk in a parsed chunk list
+ * @pcl: a parsed chunk list
+ * @chunk: chunk in @pcl
+ *
+ * Returns the next chunk in the list, or NULL if @chunk is already last.
+ */
+static inline struct svc_rdma_chunk *
+pcl_next_chunk(const struct svc_rdma_pcl *pcl, struct svc_rdma_chunk *chunk)
+{
+	if (list_is_last(&chunk->ch_list, &pcl->cl_chunks))
+		return NULL;
+	return list_next_entry(chunk, ch_list);
+}
+
+/**
+ * pcl_for_each_chunk - Iterate over chunks in a parsed chunk list
+ * @pos: the loop cursor
+ * @pcl: a parsed chunk list
+ */
+#define pcl_for_each_chunk(pos, pcl) \
+	for (pos = list_first_entry(&(pcl)->cl_chunks, struct svc_rdma_chunk, ch_list); \
+	     &pos->ch_list != &(pcl)->cl_chunks; \
+	     pos = list_next_entry(pos, ch_list))
+
+/**
+ * pcl_for_each_segment - Iterate over segments in a parsed chunk
+ * @pos: the loop cursor
+ * @chunk: a parsed chunk
+ */
+#define pcl_for_each_segment(pos, chunk) \
+	for (pos = &(chunk)->ch_segments[0]; \
+	     pos <= &(chunk)->ch_segments[(chunk)->ch_segcount - 1]; \
+	     pos++)
+
+/**
+ * pcl_chunk_end_offset - Return offset of byte range following @chunk
+ * @chunk: chunk in @pcl
+ *
+ * Returns starting offset of the region just after @chunk
+ */
+static inline unsigned int
+pcl_chunk_end_offset(const struct svc_rdma_chunk *chunk)
+{
+	return xdr_align_size(chunk->ch_position + chunk->ch_payload_length);
+}
+
+struct svc_rdma_recv_ctxt;
+
+extern void pcl_free(struct svc_rdma_pcl *pcl);
+extern bool pcl_alloc_call(struct svc_rdma_recv_ctxt *rctxt, __be32 *p);
+extern bool pcl_alloc_read(struct svc_rdma_recv_ctxt *rctxt, __be32 *p);
+extern bool pcl_alloc_write(struct svc_rdma_recv_ctxt *rctxt,
+			    struct svc_rdma_pcl *pcl, __be32 *p);
+extern int pcl_process_nonpayloads(const struct svc_rdma_pcl *pcl,
+				   const struct xdr_buf *xdr,
+				   int (*actor)(const struct xdr_buf *,
+						void *),
+				   void *data);
+
+#endif	/* SVC_RDMA_PCL_H */

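A short sketch, not from this series, of how a consumer walks a parsed chunk list with the iteration helpers above:

/* Illustrative sketch only; example_dump_write_pcl is made up. */
static void example_dump_write_pcl(struct svc_rdma_pcl *pcl)
{
	struct svc_rdma_chunk *chunk;
	struct svc_rdma_segment *segment;

	pcl_for_each_chunk(chunk, pcl) {
		u32 total = 0;

		pcl_for_each_segment(segment, chunk)
			total += segment->rs_length;
		pr_info("chunk at position %u: %u bytes in %u segments\n",
			chunk->ch_position, total, chunk->ch_segcount);
	}
}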
@@ -21,8 +21,8 @@ struct svc_xprt_ops {
 	int		(*xpo_has_wspace)(struct svc_xprt *);
 	int		(*xpo_recvfrom)(struct svc_rqst *);
 	int		(*xpo_sendto)(struct svc_rqst *);
-	int		(*xpo_read_payload)(struct svc_rqst *, unsigned int,
-					    unsigned int);
+	int		(*xpo_result_payload)(struct svc_rqst *, unsigned int,
+					      unsigned int);
 	void		(*xpo_release_rqst)(struct svc_rqst *);
 	void		(*xpo_detach)(struct svc_xprt *);
 	void		(*xpo_free)(struct svc_xprt *);

@@ -183,7 +183,8 @@ xdr_adjust_iovec(struct kvec *iov, __be32 *p)
  */
 extern void xdr_shift_buf(struct xdr_buf *, size_t);
 extern void xdr_buf_from_iov(struct kvec *, struct xdr_buf *);
-extern int xdr_buf_subsegment(struct xdr_buf *, struct xdr_buf *, unsigned int, unsigned int);
+extern int xdr_buf_subsegment(const struct xdr_buf *buf, struct xdr_buf *subbuf,
+			      unsigned int base, unsigned int len);
 extern void xdr_buf_trim(struct xdr_buf *, unsigned int);
 extern int read_bytes_from_xdr_buf(struct xdr_buf *, unsigned int, void *, unsigned int);
 extern int write_bytes_to_xdr_buf(struct xdr_buf *, unsigned int, void *, unsigned int);

@@ -247,13 +248,57 @@ extern void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf,
 			    __be32 *p, struct rpc_rqst *rqst);
 extern void xdr_init_decode_pages(struct xdr_stream *xdr, struct xdr_buf *buf,
 			    struct page **pages, unsigned int len);
-extern void xdr_set_scratch_buffer(struct xdr_stream *xdr, void *buf, size_t buflen);
 extern __be32 *xdr_inline_decode(struct xdr_stream *xdr, size_t nbytes);
 extern unsigned int xdr_read_pages(struct xdr_stream *xdr, unsigned int len);
 extern void xdr_enter_page(struct xdr_stream *xdr, unsigned int len);
 extern int xdr_process_buf(struct xdr_buf *buf, unsigned int offset, unsigned int len, int (*actor)(struct scatterlist *, void *), void *data);
 extern uint64_t xdr_align_data(struct xdr_stream *, uint64_t, uint32_t);
 extern uint64_t xdr_expand_hole(struct xdr_stream *, uint64_t, uint64_t);
+extern bool xdr_stream_subsegment(struct xdr_stream *xdr, struct xdr_buf *subbuf,
+				  unsigned int len);
+
+/**
+ * xdr_set_scratch_buffer - Attach a scratch buffer for decoding data.
+ * @xdr: pointer to xdr_stream struct
+ * @buf: pointer to an empty buffer
+ * @buflen: size of 'buf'
+ *
+ * The scratch buffer is used when decoding from an array of pages.
+ * If an xdr_inline_decode() call spans across page boundaries, then
+ * we copy the data into the scratch buffer in order to allow linear
+ * access.
+ */
+static inline void
+xdr_set_scratch_buffer(struct xdr_stream *xdr, void *buf, size_t buflen)
+{
+	xdr->scratch.iov_base = buf;
+	xdr->scratch.iov_len = buflen;
+}
+
+/**
+ * xdr_set_scratch_page - Attach a scratch buffer for decoding data
+ * @xdr: pointer to xdr_stream struct
+ * @page: an anonymous page
+ *
+ * See xdr_set_scratch_buffer().
+ */
+static inline void
+xdr_set_scratch_page(struct xdr_stream *xdr, struct page *page)
+{
+	xdr_set_scratch_buffer(xdr, page_address(page), PAGE_SIZE);
+}
+
+/**
+ * xdr_reset_scratch_buffer - Clear scratch buffer information
+ * @xdr: pointer to xdr_stream struct
+ *
+ * See xdr_set_scratch_buffer().
+ */
+static inline void
+xdr_reset_scratch_buffer(struct xdr_stream *xdr)
+{
+	xdr_set_scratch_buffer(xdr, NULL, 0);
+}
+
 /**
  * xdr_stream_remaining - Return the number of bytes remaining in the stream

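A sketch, not from this series, of the scratch-page pattern that these helpers enable; the function name is hypothetical.

/* Illustrative sketch only. */
static int example_decode_with_scratch(struct xdr_stream *xdr)
{
	struct page *scratch;
	__be32 *p;
	int ret = -EBADMSG;

	scratch = alloc_page(GFP_KERNEL);
	if (!scratch)
		return -ENOMEM;
	xdr_set_scratch_page(xdr, scratch);

	/* Safe even if these 64 bytes straddle a page boundary */
	p = xdr_inline_decode(xdr, 64);
	if (p)
		ret = 0;

	xdr_reset_scratch_buffer(xdr);
	__free_page(scratch);
	return ret;
}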
@@ -505,6 +550,27 @@ static inline bool xdr_item_is_present(const __be32 *p)
 	return *p != xdr_zero;
 }

+/**
+ * xdr_stream_decode_bool - Decode a boolean
+ * @xdr: pointer to xdr_stream
+ * @ptr: pointer to a u32 in which to store the result
+ *
+ * Return values:
+ *   %0 on success
+ *   %-EBADMSG on XDR buffer overflow
+ */
+static inline ssize_t
+xdr_stream_decode_bool(struct xdr_stream *xdr, __u32 *ptr)
+{
+	const size_t count = sizeof(*ptr);
+	__be32 *p = xdr_inline_decode(xdr, count);
+
+	if (unlikely(!p))
+		return -EBADMSG;
+	*ptr = (*p != xdr_zero);
+	return 0;
+}
+
 /**
  * xdr_stream_decode_u32 - Decode a 32-bit integer
  * @xdr: pointer to xdr_stream

@@ -526,6 +592,27 @@ xdr_stream_decode_u32(struct xdr_stream *xdr, __u32 *ptr)
 	return 0;
 }

+/**
+ * xdr_stream_decode_u64 - Decode a 64-bit integer
+ * @xdr: pointer to xdr_stream
+ * @ptr: location to store 64-bit integer
+ *
+ * Return values:
+ *   %0 on success
+ *   %-EBADMSG on XDR buffer overflow
+ */
+static inline ssize_t
+xdr_stream_decode_u64(struct xdr_stream *xdr, __u64 *ptr)
+{
+	const size_t count = sizeof(*ptr);
+	__be32 *p = xdr_inline_decode(xdr, count);
+
+	if (unlikely(!p))
+		return -EBADMSG;
+	xdr_decode_hyper(p, ptr);
+	return 0;
+}
+
 /**
  * xdr_stream_decode_opaque_fixed - Decode fixed length opaque xdr data
  * @xdr: pointer to xdr_stream

@@ -1410,101 +1410,112 @@ DEFINE_BADREQ_EVENT(drop);
 DEFINE_BADREQ_EVENT(badproc);
 DEFINE_BADREQ_EVENT(parse);

-DECLARE_EVENT_CLASS(svcrdma_segment_event,
+TRACE_EVENT(svcrdma_encode_wseg,
 	TP_PROTO(
+		const struct svc_rdma_send_ctxt *ctxt,
+		u32 segno,
 		u32 handle,
 		u32 length,
 		u64 offset
 	),

-	TP_ARGS(handle, length, offset),
+	TP_ARGS(ctxt, segno, handle, length, offset),

 	TP_STRUCT__entry(
+		__field(u32, cq_id)
+		__field(int, completion_id)
+		__field(u32, segno)
 		__field(u32, handle)
 		__field(u32, length)
 		__field(u64, offset)
 	),

 	TP_fast_assign(
+		__entry->cq_id = ctxt->sc_cid.ci_queue_id;
+		__entry->completion_id = ctxt->sc_cid.ci_completion_id;
+		__entry->segno = segno;
 		__entry->handle = handle;
 		__entry->length = length;
 		__entry->offset = offset;
 	),

-	TP_printk("%u@0x%016llx:0x%08x",
-		__entry->length, (unsigned long long)__entry->offset,
-		__entry->handle
+	TP_printk("cq_id=%u cid=%d segno=%u %u@0x%016llx:0x%08x",
+		__entry->cq_id, __entry->completion_id,
+		__entry->segno, __entry->length,
+		(unsigned long long)__entry->offset, __entry->handle
 	)
 );

-#define DEFINE_SEGMENT_EVENT(name)					\
-		DEFINE_EVENT(svcrdma_segment_event, svcrdma_##name,	\
-				TP_PROTO(				\
-					u32 handle,			\
-					u32 length,			\
-					u64 offset			\
-				),					\
-				TP_ARGS(handle, length, offset))
-
-DEFINE_SEGMENT_EVENT(decode_wseg);
-DEFINE_SEGMENT_EVENT(encode_rseg);
-DEFINE_SEGMENT_EVENT(send_rseg);
-DEFINE_SEGMENT_EVENT(encode_wseg);
-DEFINE_SEGMENT_EVENT(send_wseg);
-
-DECLARE_EVENT_CLASS(svcrdma_chunk_event,
+TRACE_EVENT(svcrdma_decode_rseg,
 	TP_PROTO(
-		u32 length
+		const struct rpc_rdma_cid *cid,
+		const struct svc_rdma_chunk *chunk,
+		const struct svc_rdma_segment *segment
 	),

-	TP_ARGS(length),
+	TP_ARGS(cid, chunk, segment),

 	TP_STRUCT__entry(
-		__field(u32, length)
-	),
-
-	TP_fast_assign(
-		__entry->length = length;
-	),
-
-	TP_printk("length=%u",
-		__entry->length
-	)
-);
-
-#define DEFINE_CHUNK_EVENT(name)					\
-		DEFINE_EVENT(svcrdma_chunk_event, svcrdma_##name,	\
-				TP_PROTO(				\
-					u32 length			\
-				),					\
-				TP_ARGS(length))
-
-DEFINE_CHUNK_EVENT(send_pzr);
-DEFINE_CHUNK_EVENT(encode_write_chunk);
-DEFINE_CHUNK_EVENT(send_write_chunk);
-DEFINE_CHUNK_EVENT(encode_read_chunk);
-DEFINE_CHUNK_EVENT(send_reply_chunk);
-
-TRACE_EVENT(svcrdma_send_read_chunk,
-	TP_PROTO(
-		u32 length,
-		u32 position
-	),
-
-	TP_ARGS(length, position),
-
-	TP_STRUCT__entry(
-		__field(u32, length)
+		__field(u32, cq_id)
+		__field(int, completion_id)
+		__field(u32, segno)
 		__field(u32, position)
+		__field(u32, handle)
+		__field(u32, length)
+		__field(u64, offset)
 	),

 	TP_fast_assign(
-		__entry->length = length;
-		__entry->position = position;
+		__entry->cq_id = cid->ci_queue_id;
+		__entry->completion_id = cid->ci_completion_id;
+		__entry->segno = chunk->ch_segcount;
+		__entry->position = chunk->ch_position;
+		__entry->handle = segment->rs_handle;
+		__entry->length = segment->rs_length;
+		__entry->offset = segment->rs_offset;
 	),

-	TP_printk("length=%u position=%u",
-		__entry->length, __entry->position
+	TP_printk("cq_id=%u cid=%d segno=%u position=%u %u@0x%016llx:0x%08x",
+		__entry->cq_id, __entry->completion_id,
+		__entry->segno, __entry->position, __entry->length,
+		(unsigned long long)__entry->offset, __entry->handle
 	)
 );

+TRACE_EVENT(svcrdma_decode_wseg,
+	TP_PROTO(
+		const struct rpc_rdma_cid *cid,
+		const struct svc_rdma_chunk *chunk,
+		u32 segno
+	),
+
+	TP_ARGS(cid, chunk, segno),
+
+	TP_STRUCT__entry(
+		__field(u32, cq_id)
+		__field(int, completion_id)
+		__field(u32, segno)
+		__field(u32, handle)
+		__field(u32, length)
+		__field(u64, offset)
+	),
+
+	TP_fast_assign(
+		const struct svc_rdma_segment *segment =
+			&chunk->ch_segments[segno];
+
+		__entry->cq_id = cid->ci_queue_id;
+		__entry->completion_id = cid->ci_completion_id;
+		__entry->segno = segno;
+		__entry->handle = segment->rs_handle;
+		__entry->length = segment->rs_length;
+		__entry->offset = segment->rs_offset;
+	),
+
+	TP_printk("cq_id=%u cid=%d segno=%u %u@0x%016llx:0x%08x",
+		__entry->cq_id, __entry->completion_id,
+		__entry->segno, __entry->length,
+		(unsigned long long)__entry->offset, __entry->handle
+	)
+);

@@ -1581,6 +1592,7 @@ DECLARE_EVENT_CLASS(svcrdma_dma_map_class,
 		TP_ARGS(rdma, dma_addr, length))

 DEFINE_SVC_DMA_EVENT(dma_map_page);
+DEFINE_SVC_DMA_EVENT(dma_map_err);
 DEFINE_SVC_DMA_EVENT(dma_unmap_page);

 TRACE_EVENT(svcrdma_dma_map_rw_err,

@@ -1699,20 +1711,30 @@ TRACE_EVENT(svcrdma_small_wrch_err,

 TRACE_EVENT(svcrdma_send_pullup,
 	TP_PROTO(
-		unsigned int len
+		const struct svc_rdma_send_ctxt *ctxt,
+		unsigned int msglen
 	),

-	TP_ARGS(len),
+	TP_ARGS(ctxt, msglen),

 	TP_STRUCT__entry(
-		__field(unsigned int, len)
+		__field(u32, cq_id)
+		__field(int, completion_id)
+		__field(unsigned int, hdrlen)
+		__field(unsigned int, msglen)
 	),

 	TP_fast_assign(
-		__entry->len = len;
+		__entry->cq_id = ctxt->sc_cid.ci_queue_id;
+		__entry->completion_id = ctxt->sc_cid.ci_completion_id;
+		__entry->hdrlen = ctxt->sc_hdrbuf.len,
+		__entry->msglen = msglen;
 	),

-	TP_printk("len=%u", __entry->len)
+	TP_printk("cq_id=%u cid=%d hdr=%u msg=%u (total %u)",
+		__entry->cq_id, __entry->completion_id,
+		__entry->hdrlen, __entry->msglen,
+		__entry->hdrlen + __entry->msglen)
 );

 TRACE_EVENT(svcrdma_send_err,

@@ -1819,7 +1841,7 @@ TRACE_EVENT(svcrdma_rq_post_err,
 	)
 );

-TRACE_EVENT(svcrdma_post_chunk,
+DECLARE_EVENT_CLASS(svcrdma_post_chunk_class,
 	TP_PROTO(
 		const struct rpc_rdma_cid *cid,
 		int sqecount

@@ -1845,6 +1867,19 @@ TRACE_EVENT(svcrdma_post_chunk,
 	)
 );

+#define DEFINE_POST_CHUNK_EVENT(name)					\
+		DEFINE_EVENT(svcrdma_post_chunk_class,			\
+			     svcrdma_post_##name##_chunk,		\
+			     TP_PROTO(					\
+				     const struct rpc_rdma_cid *cid,	\
+				     int sqecount			\
+			     ),						\
+			     TP_ARGS(cid, sqecount))
+
+DEFINE_POST_CHUNK_EVENT(read);
+DEFINE_POST_CHUNK_EVENT(write);
+DEFINE_POST_CHUNK_EVENT(reply);
+
 DEFINE_COMPLETION_EVENT(svcrdma_wc_read);
 DEFINE_COMPLETION_EVENT(svcrdma_wc_write);

@@ -1500,30 +1500,6 @@ SVC_RQST_FLAG_LIST
 #define show_rqstp_flags(flags)						\
 		__print_flags(flags, "|", SVC_RQST_FLAG_LIST)

-TRACE_EVENT(svc_recv,
-	TP_PROTO(struct svc_rqst *rqst, int len),
-
-	TP_ARGS(rqst, len),
-
-	TP_STRUCT__entry(
-		__field(u32, xid)
-		__field(int, len)
-		__field(unsigned long, flags)
-		__string(addr, rqst->rq_xprt->xpt_remotebuf)
-	),
-
-	TP_fast_assign(
-		__entry->xid = be32_to_cpu(rqst->rq_xid);
-		__entry->len = len;
-		__entry->flags = rqst->rq_flags;
-		__assign_str(addr, rqst->rq_xprt->xpt_remotebuf);
-	),
-
-	TP_printk("addr=%s xid=0x%08x len=%d flags=%s",
-		__get_str(addr), __entry->xid, __entry->len,
-		show_rqstp_flags(__entry->flags))
-);
-
 TRACE_DEFINE_ENUM(SVC_GARBAGE);
 TRACE_DEFINE_ENUM(SVC_SYSERR);
 TRACE_DEFINE_ENUM(SVC_VALID);

@@ -200,7 +200,7 @@ static int gssp_call(struct net *net, struct rpc_message *msg)

 static void gssp_free_receive_pages(struct gssx_arg_accept_sec_context *arg)
 {
-	int i;
+	unsigned int i;

 	for (i = 0; i < arg->npages && arg->pages[i]; i++)
 		__free_page(arg->pages[i]);

@@ -210,14 +210,19 @@ static void gssp_free_receive_pages(struct gssx_arg_accept_sec_context *arg)

 static int gssp_alloc_receive_pages(struct gssx_arg_accept_sec_context *arg)
 {
+	unsigned int i;
+
 	arg->npages = DIV_ROUND_UP(NGROUPS_MAX * 4, PAGE_SIZE);
 	arg->pages = kcalloc(arg->npages, sizeof(struct page *), GFP_KERNEL);
-	/*
-	 * XXX: actual pages are allocated by xdr layer in
-	 * xdr_partial_copy_from_skb.
-	 */
 	if (!arg->pages)
 		return -ENOMEM;
+	for (i = 0; i < arg->npages; i++) {
+		arg->pages[i] = alloc_page(GFP_KERNEL);
+		if (!arg->pages[i]) {
+			gssp_free_receive_pages(arg);
+			return -ENOMEM;
+		}
+	}
 	return 0;
 }

@@ -771,7 +771,6 @@ void gssx_enc_accept_sec_context(struct rpc_rqst *req,
 	xdr_inline_pages(&req->rq_rcv_buf,
 		PAGE_SIZE/2 /* pretty arbitrary */,
 		arg->pages, 0 /* page base */, arg->npages * PAGE_SIZE);
-	req->rq_rcv_buf.flags |= XDRBUF_SPARSE_PAGES;
 done:
 	if (err)
 		dprintk("RPC: gssx_enc_accept_sec_context: %d\n", err);

@@ -789,7 +788,7 @@ int gssx_dec_accept_sec_context(struct rpc_rqst *rqstp,
 	scratch = alloc_page(GFP_KERNEL);
 	if (!scratch)
 		return -ENOMEM;
-	xdr_set_scratch_buffer(xdr, page_address(scratch), PAGE_SIZE);
+	xdr_set_scratch_page(xdr, scratch);

 	/* res->status */
 	err = gssx_dec_status(xdr, &res->status);

@@ -778,7 +778,6 @@ void cache_clean_deferred(void *owner)
  */

 static DEFINE_SPINLOCK(queue_lock);
-static DEFINE_MUTEX(queue_io_mutex);

 struct cache_queue {
 	struct list_head	list;

@@ -906,44 +905,26 @@ static ssize_t cache_do_downcall(char *kaddr, const char __user *buf,
 	return ret;
 }

-static ssize_t cache_slow_downcall(const char __user *buf,
-				   size_t count, struct cache_detail *cd)
-{
-	static char write_buf[32768]; /* protected by queue_io_mutex */
-	ssize_t ret = -EINVAL;
-
-	if (count >= sizeof(write_buf))
-		goto out;
-	mutex_lock(&queue_io_mutex);
-	ret = cache_do_downcall(write_buf, buf, count, cd);
-	mutex_unlock(&queue_io_mutex);
-out:
-	return ret;
-}
-
 static ssize_t cache_downcall(struct address_space *mapping,
 			      const char __user *buf,
 			      size_t count, struct cache_detail *cd)
 {
-	struct page *page;
-	char *kaddr;
+	char *write_buf;
 	ssize_t ret = -ENOMEM;

-	if (count >= PAGE_SIZE)
-		goto out_slow;
+	if (count >= 32768) { /* 32k is max userland buffer, lets check anyway */
+		ret = -EINVAL;
+		goto out;
+	}

-	page = find_or_create_page(mapping, 0, GFP_KERNEL);
-	if (!page)
-		goto out_slow;
+	write_buf = kvmalloc(count + 1, GFP_KERNEL);
+	if (!write_buf)
+		goto out;

-	kaddr = kmap(page);
-	ret = cache_do_downcall(kaddr, buf, count, cd);
-	kunmap(page);
-	unlock_page(page);
-	put_page(page);
+	ret = cache_do_downcall(write_buf, buf, count, cd);
+	kvfree(write_buf);
 out:
 	return ret;
-out_slow:
-	return cache_slow_downcall(buf, count, cd);
 }

 static ssize_t cache_write(struct file *filp, const char __user *buf,

@@ -614,6 +614,10 @@ svc_rqst_alloc(struct svc_serv *serv, struct svc_pool *pool, int node)
 	rqstp->rq_server = serv;
 	rqstp->rq_pool = pool;

+	rqstp->rq_scratch_page = alloc_pages_node(node, GFP_KERNEL, 0);
+	if (!rqstp->rq_scratch_page)
+		goto out_enomem;
+
 	rqstp->rq_argp = kmalloc_node(serv->sv_xdrsize, GFP_KERNEL, node);
 	if (!rqstp->rq_argp)
 		goto out_enomem;

@@ -842,6 +846,7 @@ void
 svc_rqst_free(struct svc_rqst *rqstp)
 {
 	svc_release_buffer(rqstp);
+	put_page(rqstp->rq_scratch_page);
 	kfree(rqstp->rq_resp);
 	kfree(rqstp->rq_argp);
 	kfree(rqstp->rq_auth_data);

@@ -1622,7 +1627,7 @@ u32 svc_max_payload(const struct svc_rqst *rqstp)
 EXPORT_SYMBOL_GPL(svc_max_payload);

 /**
- * svc_encode_read_payload - mark a range of bytes as a READ payload
+ * svc_encode_result_payload - mark a range of bytes as a result payload
  * @rqstp: svc_rqst to operate on
  * @offset: payload's byte offset in rqstp->rq_res
  * @length: size of payload, in bytes

@@ -1630,12 +1635,13 @@ EXPORT_SYMBOL_GPL(svc_max_payload);
  * Returns zero on success, or a negative errno if a permanent
  * error occurred.
  */
-int svc_encode_read_payload(struct svc_rqst *rqstp, unsigned int offset,
-			    unsigned int length)
+int svc_encode_result_payload(struct svc_rqst *rqstp, unsigned int offset,
+			      unsigned int length)
 {
-	return rqstp->rq_xprt->xpt_ops->xpo_read_payload(rqstp, offset, length);
+	return rqstp->rq_xprt->xpt_ops->xpo_result_payload(rqstp, offset,
+							   length);
 }
-EXPORT_SYMBOL_GPL(svc_encode_read_payload);
+EXPORT_SYMBOL_GPL(svc_encode_result_payload);

 /**
  * svc_fill_write_vector - Construct data argument for VFS write call

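A sketch, not from this series, of how an upper-layer encoder is expected to use the renamed API: after placing READ data in rq_res, it marks that byte range so an RDMA transport can move it with RDMA Write instead of inline Sends.

/* Illustrative sketch only; the function name is made up. */
static int example_mark_read_result(struct svc_rqst *rqstp,
				    unsigned int offset, unsigned int len)
{
	/* Offsets are relative to the start of rqstp->rq_res */
	return svc_encode_result_payload(rqstp, offset, len);
}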
@@ -813,8 +813,6 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
 		len = svc_deferred_recv(rqstp);
 	else
 		len = xprt->xpt_ops->xpo_recvfrom(rqstp);
-	if (len > 0)
-		trace_svc_xdr_recvfrom(rqstp, &rqstp->rq_arg);
 	rqstp->rq_stime = ktime_get();
 	rqstp->rq_reserved = serv->sv_max_mesg;
 	atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);

@@ -868,7 +866,7 @@ int svc_recv(struct svc_rqst *rqstp, long timeout)

 	if (serv->sv_stats)
 		serv->sv_stats->netcnt++;
-	trace_svc_recv(rqstp, len);
+	trace_svc_xdr_recvfrom(rqstp, &rqstp->rq_arg);
 	return len;
 out_release:
 	rqstp->rq_res.len = 0;

@@ -181,8 +181,8 @@ static void svc_set_cmsg_data(struct svc_rqst *rqstp, struct cmsghdr *cmh)
 	}
 }

-static int svc_sock_read_payload(struct svc_rqst *rqstp, unsigned int offset,
-				 unsigned int length)
+static int svc_sock_result_payload(struct svc_rqst *rqstp, unsigned int offset,
+				   unsigned int length)
 {
 	return 0;
 }

@@ -635,7 +635,7 @@ static const struct svc_xprt_ops svc_udp_ops = {
 	.xpo_create = svc_udp_create,
 	.xpo_recvfrom = svc_udp_recvfrom,
 	.xpo_sendto = svc_udp_sendto,
-	.xpo_read_payload = svc_sock_read_payload,
+	.xpo_result_payload = svc_sock_result_payload,
 	.xpo_release_rqst = svc_udp_release_rqst,
 	.xpo_detach = svc_sock_detach,
 	.xpo_free = svc_sock_free,

@@ -1123,7 +1123,7 @@ static const struct svc_xprt_ops svc_tcp_ops = {
 	.xpo_create = svc_tcp_create,
 	.xpo_recvfrom = svc_tcp_recvfrom,
 	.xpo_sendto = svc_tcp_sendto,
-	.xpo_read_payload = svc_sock_read_payload,
+	.xpo_result_payload = svc_sock_result_payload,
 	.xpo_release_rqst = svc_tcp_release_rqst,
 	.xpo_detach = svc_tcp_sock_detach,
 	.xpo_free = svc_sock_free,

@@ -669,7 +669,7 @@ void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p,
 	struct kvec *iov = buf->head;
 	int scratch_len = buf->buflen - buf->page_len - buf->tail[0].iov_len;

-	xdr_set_scratch_buffer(xdr, NULL, 0);
+	xdr_reset_scratch_buffer(xdr);
 	BUG_ON(scratch_len < 0);
 	xdr->buf = buf;
 	xdr->iov = iov;

@@ -713,7 +713,7 @@ inline void xdr_commit_encode(struct xdr_stream *xdr)
 	page = page_address(*xdr->page_ptr);
 	memcpy(xdr->scratch.iov_base, page, shift);
 	memmove(page, page + shift, (void *)xdr->p - page);
-	xdr->scratch.iov_len = 0;
+	xdr_reset_scratch_buffer(xdr);
 }
 EXPORT_SYMBOL_GPL(xdr_commit_encode);

@@ -743,8 +743,7 @@ static __be32 *xdr_get_next_encode_buffer(struct xdr_stream *xdr,
 	 * the "scratch" iov to track any temporarily unused fragment of
 	 * space at the end of the previous buffer:
 	 */
-	xdr->scratch.iov_base = xdr->p;
-	xdr->scratch.iov_len = frag1bytes;
+	xdr_set_scratch_buffer(xdr, xdr->p, frag1bytes);
 	p = page_address(*xdr->page_ptr);
 	/*
 	 * Note this is where the next encode will start after we've

@@ -1052,8 +1051,7 @@ void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p,
 		     struct rpc_rqst *rqst)
 {
 	xdr->buf = buf;
-	xdr->scratch.iov_base = NULL;
-	xdr->scratch.iov_len = 0;
+	xdr_reset_scratch_buffer(xdr);
 	xdr->nwords = XDR_QUADLEN(buf->len);
 	if (buf->head[0].iov_len != 0)
 		xdr_set_iov(xdr, buf->head, buf->len);

@@ -1101,24 +1099,6 @@ static __be32 * __xdr_inline_decode(struct xdr_stream *xdr, size_t nbytes)
 	return p;
 }

-/**
- * xdr_set_scratch_buffer - Attach a scratch buffer for decoding data.
- * @xdr: pointer to xdr_stream struct
- * @buf: pointer to an empty buffer
- * @buflen: size of 'buf'
- *
- * The scratch buffer is used when decoding from an array of pages.
- * If an xdr_inline_decode() call spans across page boundaries, then
- * we copy the data into the scratch buffer in order to allow linear
- * access.
- */
-void xdr_set_scratch_buffer(struct xdr_stream *xdr, void *buf, size_t buflen)
-{
-	xdr->scratch.iov_base = buf;
-	xdr->scratch.iov_len = buflen;
-}
-EXPORT_SYMBOL_GPL(xdr_set_scratch_buffer);
-
 static __be32 *xdr_copy_to_scratch(struct xdr_stream *xdr, size_t nbytes)
 {
 	__be32 *p;

@@ -1379,9 +1359,8 @@ EXPORT_SYMBOL_GPL(xdr_buf_from_iov);
  *
  * Returns -1 if base of length are out of bounds.
  */
-int
-xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf,
-		   unsigned int base, unsigned int len)
+int xdr_buf_subsegment(const struct xdr_buf *buf, struct xdr_buf *subbuf,
+		       unsigned int base, unsigned int len)
 {
 	subbuf->buflen = subbuf->len = len;
 	if (base < buf->head[0].iov_len) {

@@ -1428,6 +1407,51 @@ xdr_buf_subsegment(struct xdr_buf *buf, struct xdr_buf *subbuf,
 }
 EXPORT_SYMBOL_GPL(xdr_buf_subsegment);

+/**
+ * xdr_stream_subsegment - set @subbuf to a portion of @xdr
+ * @xdr: an xdr_stream set up for decoding
+ * @subbuf: the result buffer
+ * @nbytes: length of @xdr to extract, in bytes
+ *
+ * Sets up @subbuf to represent a portion of @xdr. The portion
+ * starts at the current offset in @xdr, and extends for a length
+ * of @nbytes. If this is successful, @xdr is advanced to the next
+ * position following that portion.
+ *
+ * Return values:
+ *   %true: @subbuf has been initialized, and @xdr has been advanced.
+ *   %false: a bounds error has occurred
+ */
+bool xdr_stream_subsegment(struct xdr_stream *xdr, struct xdr_buf *subbuf,
+			   unsigned int nbytes)
+{
+	unsigned int remaining, offset, len;
+
+	if (xdr_buf_subsegment(xdr->buf, subbuf, xdr_stream_pos(xdr), nbytes))
+		return false;
+
+	if (subbuf->head[0].iov_len)
+		if (!__xdr_inline_decode(xdr, subbuf->head[0].iov_len))
+			return false;
+
+	remaining = subbuf->page_len;
+	offset = subbuf->page_base;
+	while (remaining) {
+		len = min_t(unsigned int, remaining, PAGE_SIZE) - offset;
+
+		if (xdr->p == xdr->end && !xdr_set_next_buffer(xdr))
+			return false;
+		if (!__xdr_inline_decode(xdr, len))
+			return false;
+
+		remaining -= len;
+		offset = 0;
+	}
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(xdr_stream_subsegment);
+
 /**
  * xdr_buf_trim - lop at most "len" bytes off the end of "buf"
  * @buf: buf to be trimmed

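A sketch in the spirit of NFSD's WRITE argument decoder, not from this series, showing why xdr_stream_subsegment() matters: the payload is captured as an xdr_buf without copying the pages.

/* Illustrative sketch only; field layout is hypothetical. */
static int example_decode_write(struct xdr_stream *xdr,
				struct xdr_buf *payload)
{
	u64 offset;
	u32 stable, buflen;

	if (xdr_stream_decode_u64(xdr, &offset) < 0)
		return -EBADMSG;
	if (xdr_stream_decode_u32(xdr, &stable) < 0)
		return -EBADMSG;
	if (xdr_stream_decode_u32(xdr, &buflen) < 0)
		return -EBADMSG;
	/* Map the next @buflen bytes of the Call message, zero-copy */
	if (!xdr_stream_subsegment(xdr, payload, buflen))
		return -EBADMSG;
	return 0;
}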
@@ -4,5 +4,5 @@ obj-$(CONFIG_SUNRPC_XPRT_RDMA) += rpcrdma.o
 rpcrdma-y := transport.o rpc_rdma.o verbs.o frwr_ops.o \
 	svc_rdma.o svc_rdma_backchannel.o svc_rdma_transport.o \
 	svc_rdma_sendto.o svc_rdma_recvfrom.o svc_rdma_rw.o \
-	module.o
+	svc_rdma_pcl.o module.o
 rpcrdma-$(CONFIG_SUNRPC_BACKCHANNEL) += backchannel.o

@@ -74,11 +74,17 @@ out_unlock:
  */
 static int svc_rdma_bc_sendto(struct svcxprt_rdma *rdma,
 			      struct rpc_rqst *rqst,
-			      struct svc_rdma_send_ctxt *ctxt)
+			      struct svc_rdma_send_ctxt *sctxt)
 {
+	struct svc_rdma_recv_ctxt *rctxt;
 	int ret;

-	ret = svc_rdma_map_reply_msg(rdma, ctxt, NULL, &rqst->rq_snd_buf);
+	rctxt = svc_rdma_recv_ctxt_get(rdma);
+	if (!rctxt)
+		return -EIO;
+
+	ret = svc_rdma_map_reply_msg(rdma, sctxt, rctxt, &rqst->rq_snd_buf);
+	svc_rdma_recv_ctxt_put(rdma, rctxt);
 	if (ret < 0)
 		return -EIO;

@@ -86,8 +92,8 @@ static int svc_rdma_bc_sendto(struct svcxprt_rdma *rdma,
 	 * the rq_buffer before all retransmits are complete.
 	 */
 	get_page(virt_to_page(rqst->rq_buffer));
-	ctxt->sc_send_wr.opcode = IB_WR_SEND;
-	return svc_rdma_send(rdma, ctxt);
+	sctxt->sc_send_wr.opcode = IB_WR_SEND;
+	return svc_rdma_send(rdma, sctxt);
 }

 /* Server-side transport endpoint wants a whole page for its send

@@ -0,0 +1,306 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2020 Oracle. All rights reserved.
+ */
+
+#include <linux/sunrpc/svc_rdma.h>
+#include <linux/sunrpc/rpc_rdma.h>
+
+#include "xprt_rdma.h"
+#include <trace/events/rpcrdma.h>
+
+/**
+ * pcl_free - Release all memory associated with a parsed chunk list
+ * @pcl: parsed chunk list
+ *
+ */
+void pcl_free(struct svc_rdma_pcl *pcl)
+{
+	while (!list_empty(&pcl->cl_chunks)) {
+		struct svc_rdma_chunk *chunk;
+
+		chunk = pcl_first_chunk(pcl);
+		list_del(&chunk->ch_list);
+		kfree(chunk);
+	}
+}
+
+static struct svc_rdma_chunk *pcl_alloc_chunk(u32 segcount, u32 position)
+{
+	struct svc_rdma_chunk *chunk;
+
+	chunk = kmalloc(struct_size(chunk, ch_segments, segcount), GFP_KERNEL);
+	if (!chunk)
+		return NULL;
+
+	chunk->ch_position = position;
+	chunk->ch_length = 0;
+	chunk->ch_payload_length = 0;
+	chunk->ch_segcount = 0;
+	return chunk;
+}
+
+static struct svc_rdma_chunk *
+pcl_lookup_position(struct svc_rdma_pcl *pcl, u32 position)
+{
+	struct svc_rdma_chunk *pos;
+
+	pcl_for_each_chunk(pos, pcl) {
+		if (pos->ch_position == position)
+			return pos;
+	}
+	return NULL;
+}
+
+static void pcl_insert_position(struct svc_rdma_pcl *pcl,
+				struct svc_rdma_chunk *chunk)
+{
+	struct svc_rdma_chunk *pos;
+
+	pcl_for_each_chunk(pos, pcl) {
+		if (pos->ch_position > chunk->ch_position)
+			break;
+	}
+	__list_add(&chunk->ch_list, pos->ch_list.prev, &pos->ch_list);
+	pcl->cl_count++;
+}
+
+static void pcl_set_read_segment(const struct svc_rdma_recv_ctxt *rctxt,
+				 struct svc_rdma_chunk *chunk,
+				 u32 handle, u32 length, u64 offset)
+{
+	struct svc_rdma_segment *segment;
+
+	segment = &chunk->ch_segments[chunk->ch_segcount];
+	segment->rs_handle = handle;
+	segment->rs_length = length;
+	segment->rs_offset = offset;
+
+	trace_svcrdma_decode_rseg(&rctxt->rc_cid, chunk, segment);
+
+	chunk->ch_length += length;
+	chunk->ch_segcount++;
+}
+
+/**
+ * pcl_alloc_call - Construct a parsed chunk list for the Call body
+ * @rctxt: Ingress receive context
+ * @p: Start of an un-decoded Read list
+ *
+ * Assumptions:
+ * - The incoming Read list has already been sanity checked.
+ * - cl_count is already set to the number of segments in
+ *   the un-decoded list.
+ * - The list might not be in order by position.
+ *
+ * Return values:
+ *   %true: Parsed chunk list was successfully constructed, and
+ *	    cl_count is updated to be the number of chunks (ie.
+ *	    unique positions) in the Read list.
+ *   %false: Memory allocation failed.
+ */
+bool pcl_alloc_call(struct svc_rdma_recv_ctxt *rctxt, __be32 *p)
+{
+	struct svc_rdma_pcl *pcl = &rctxt->rc_call_pcl;
+	unsigned int i, segcount = pcl->cl_count;
+
+	pcl->cl_count = 0;
+	for (i = 0; i < segcount; i++) {
+		struct svc_rdma_chunk *chunk;
+		u32 position, handle, length;
+		u64 offset;
+
+		p++;	/* skip the list discriminator */
+		p = xdr_decode_read_segment(p, &position, &handle,
+					    &length, &offset);
+		if (position != 0)
+			continue;
+
+		if (pcl_is_empty(pcl)) {
+			chunk = pcl_alloc_chunk(segcount, position);
+			if (!chunk)
+				return false;
+			pcl_insert_position(pcl, chunk);
+		} else {
+			chunk = list_first_entry(&pcl->cl_chunks,
+						 struct svc_rdma_chunk,
+						 ch_list);
+		}
+
+		pcl_set_read_segment(rctxt, chunk, handle, length, offset);
+	}
+
+	return true;
+}
+
+/**
+ * pcl_alloc_read - Construct a parsed chunk list for normal Read chunks
+ * @rctxt: Ingress receive context
+ * @p: Start of an un-decoded Read list
+ *
+ * Assumptions:
+ * - The incoming Read list has already been sanity checked.
+ * - cl_count is already set to the number of segments in
+ *   the un-decoded list.
+ * - The list might not be in order by position.
+ *
+ * Return values:
+ *   %true: Parsed chunk list was successfully constructed, and
+ *	    cl_count is updated to be the number of chunks (ie.
+ *	    unique position values) in the Read list.
+ *   %false: Memory allocation failed.
+ *
+ * TODO:
+ * - Check for chunk range overlaps
+ */
+bool pcl_alloc_read(struct svc_rdma_recv_ctxt *rctxt, __be32 *p)
+{
+	struct svc_rdma_pcl *pcl = &rctxt->rc_read_pcl;
+	unsigned int i, segcount = pcl->cl_count;
+
+	pcl->cl_count = 0;
+	for (i = 0; i < segcount; i++) {
+		struct svc_rdma_chunk *chunk;
+		u32 position, handle, length;
+		u64 offset;
+
+		p++;	/* skip the list discriminator */
+		p = xdr_decode_read_segment(p, &position, &handle,
+					    &length, &offset);
+		if (position == 0)
+			continue;
+
+		chunk = pcl_lookup_position(pcl, position);
+		if (!chunk) {
+			chunk = pcl_alloc_chunk(segcount, position);
+			if (!chunk)
+				return false;
+			pcl_insert_position(pcl, chunk);
+		}
+
+		pcl_set_read_segment(rctxt, chunk, handle, length, offset);
+	}
+
+	return true;
+}
+
+/**
+ * pcl_alloc_write - Construct a parsed chunk list from a Write list
+ * @rctxt: Ingress receive context
+ * @pcl: Parsed chunk list to populate
+ * @p: Start of an un-decoded Write list
+ *
+ * Assumptions:
+ * - The incoming Write list has already been sanity checked, and
+ * - cl_count is set to the number of chunks in the un-decoded list.
+ *
+ * Return values:
+ *   %true: Parsed chunk list was successfully constructed.
+ *   %false: Memory allocation failed.
+ */
+bool pcl_alloc_write(struct svc_rdma_recv_ctxt *rctxt,
+		     struct svc_rdma_pcl *pcl, __be32 *p)
+{
+	struct svc_rdma_segment *segment;
+	struct svc_rdma_chunk *chunk;
+	unsigned int i, j;
+	u32 segcount;
+
+	for (i = 0; i < pcl->cl_count; i++) {
+		p++;	/* skip the list discriminator */
+		segcount = be32_to_cpup(p++);
+
+		chunk = pcl_alloc_chunk(segcount, 0);
+		if (!chunk)
+			return false;
+		list_add_tail(&chunk->ch_list, &pcl->cl_chunks);
+
+		for (j = 0; j < segcount; j++) {
+			segment = &chunk->ch_segments[j];
+			p = xdr_decode_rdma_segment(p, &segment->rs_handle,
+						    &segment->rs_length,
+						    &segment->rs_offset);
+			trace_svcrdma_decode_wseg(&rctxt->rc_cid, chunk, j);
+
+			chunk->ch_length += segment->rs_length;
+			chunk->ch_segcount++;
+		}
+	}
+	return true;
+}
+
+static int pcl_process_region(const struct xdr_buf *xdr,
+			      unsigned int offset, unsigned int length,
+			      int (*actor)(const struct xdr_buf *, void *),
+			      void *data)
+{
+	struct xdr_buf subbuf;
+
+	if (!length)
+		return 0;
+	if (xdr_buf_subsegment(xdr, &subbuf, offset, length))
+		return -EMSGSIZE;
+	return actor(&subbuf, data);
+}
+
+/**
+ * pcl_process_nonpayloads - Process non-payload regions inside @xdr
+ * @pcl: Chunk list to process
+ * @xdr: xdr_buf to process
+ * @actor: Function to invoke on each non-payload region
+ * @data: Arguments for @actor
+ *
+ * This mechanism must ignore not only result payloads that were already
+ * sent via RDMA Write, but also XDR padding for those payloads that
+ * the upper layer has added.
+ *
+ * Assumptions:
+ *  The xdr->len and ch_position fields are aligned to 4-byte multiples.
+ *
+ * Returns:
+ *   On success, zero,
+ *   %-EMSGSIZE on XDR buffer overflow, or
+ *   The return value of @actor
+ */
+int pcl_process_nonpayloads(const struct svc_rdma_pcl *pcl,
+			    const struct xdr_buf *xdr,
+			    int (*actor)(const struct xdr_buf *, void *),
+			    void *data)
+{
+	struct svc_rdma_chunk *chunk, *next;
+	unsigned int start;
+	int ret;
+
+	chunk = pcl_first_chunk(pcl);
+
+	/* No result payloads were generated */
+	if (!chunk || !chunk->ch_payload_length)
+		return actor(xdr, data);
+
+	/* Process the region before the first result payload */
+	ret = pcl_process_region(xdr, 0, chunk->ch_position, actor, data);
+	if (ret < 0)
+		return ret;
+
+	/* Process the regions between each middle result payload */
+	while ((next = pcl_next_chunk(pcl, chunk))) {
+		if (!next->ch_payload_length)
+			break;
+
+		start = pcl_chunk_end_offset(chunk);
+		ret = pcl_process_region(xdr, start, next->ch_position - start,
+					 actor, data);
+		if (ret < 0)
+			return ret;
+
+		chunk = next;
+	}
+
+	/* Process the region after the last result payload */
+	start = pcl_chunk_end_offset(chunk);
+	ret = pcl_process_region(xdr, start, xdr->len - start, actor, data);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}

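A sketch, not from this series, of an actor passed to pcl_process_nonpayloads(); a real caller would DMA-map each region, but here the actor just measures them.

/* Illustrative sketch only; example_* names are made up. */
static int example_count_region(const struct xdr_buf *region, void *data)
{
	unsigned int *total = data;

	*total += region->len;
	return 0;
}

static int example_count_nonpayloads(const struct svc_rdma_pcl *pcl,
				     const struct xdr_buf *xdr,
				     unsigned int *total)
{
	*total = 0;
	return pcl_process_nonpayloads(pcl, xdr, example_count_region,
				       total);
}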
@ -93,6 +93,7 @@
|
|||
* (see rdma_read_complete() below).
|
||||
*/
|
||||
|
||||
#include <linux/slab.h>
|
||||
#include <linux/spinlock.h>
|
||||
#include <asm/unaligned.h>
|
||||
#include <rdma/ib_verbs.h>
|
||||
|
@ -143,6 +144,10 @@ svc_rdma_recv_ctxt_alloc(struct svcxprt_rdma *rdma)
|
|||
goto fail2;
|
||||
|
||||
svc_rdma_recv_cid_init(rdma, &ctxt->rc_cid);
|
||||
pcl_init(&ctxt->rc_call_pcl);
|
||||
pcl_init(&ctxt->rc_read_pcl);
|
||||
pcl_init(&ctxt->rc_write_pcl);
|
||||
pcl_init(&ctxt->rc_reply_pcl);
|
||||
|
||||
ctxt->rc_recv_wr.next = NULL;
|
||||
ctxt->rc_recv_wr.wr_cqe = &ctxt->rc_cqe;
|
||||
|
@ -189,8 +194,13 @@ void svc_rdma_recv_ctxts_destroy(struct svcxprt_rdma *rdma)
|
|||
}
|
||||
}
|
||||
|
||||
static struct svc_rdma_recv_ctxt *
|
||||
svc_rdma_recv_ctxt_get(struct svcxprt_rdma *rdma)
|
||||
/**
|
||||
* svc_rdma_recv_ctxt_get - Allocate a recv_ctxt
|
||||
* @rdma: controlling svcxprt_rdma
|
||||
*
|
||||
* Returns a recv_ctxt or (rarely) NULL if none are available.
|
||||
*/
|
||||
struct svc_rdma_recv_ctxt *svc_rdma_recv_ctxt_get(struct svcxprt_rdma *rdma)
|
||||
{
|
||||
struct svc_rdma_recv_ctxt *ctxt;
|
||||
struct llist_node *node;
|
||||
|
@ -202,7 +212,6 @@ svc_rdma_recv_ctxt_get(struct svcxprt_rdma *rdma)
|
|||
|
||||
out:
|
||||
ctxt->rc_page_count = 0;
|
||||
ctxt->rc_read_payload_length = 0;
|
||||
return ctxt;
|
||||
|
||||
out_empty:
|
||||
|
@ -226,6 +235,11 @@ void svc_rdma_recv_ctxt_put(struct svcxprt_rdma *rdma,
|
|||
for (i = 0; i < ctxt->rc_page_count; i++)
|
||||
put_page(ctxt->rc_pages[i]);
|
||||
|
||||
pcl_free(&ctxt->rc_call_pcl);
|
||||
pcl_free(&ctxt->rc_read_pcl);
|
||||
pcl_free(&ctxt->rc_write_pcl);
|
||||
pcl_free(&ctxt->rc_reply_pcl);
|
||||
|
||||
if (!ctxt->rc_temp)
|
||||
llist_add(&ctxt->rc_node, &rdma->sc_recv_ctxts);
|
||||
else
|
||||
|
@ -385,100 +399,123 @@ static void svc_rdma_build_arg_xdr(struct svc_rqst *rqstp,
|
|||
arg->len = ctxt->rc_byte_len;
|
||||
}
|
||||
|
||||
/* This accommodates the largest possible Write chunk.
|
||||
*/
|
||||
#define MAX_BYTES_WRITE_CHUNK ((u32)(RPCSVC_MAXPAGES << PAGE_SHIFT))
|
||||
|
||||
/* This accommodates the largest possible Position-Zero
|
||||
* Read chunk or Reply chunk.
|
||||
*/
|
||||
#define MAX_BYTES_SPECIAL_CHUNK ((u32)((RPCSVC_MAXPAGES + 2) << PAGE_SHIFT))
|
||||
|
||||
/* Sanity check the Read list.
|
||||
/**
|
||||
* xdr_count_read_segments - Count number of Read segments in Read list
|
||||
* @rctxt: Ingress receive context
|
||||
* @p: Start of an un-decoded Read list
|
||||
*
|
||||
* Implementation limits:
|
||||
* - This implementation supports only one Read chunk.
|
||||
* Before allocating anything, ensure the ingress Read list is safe
|
||||
* to use.
|
||||
*
|
||||
* Sanity checks:
|
||||
* - Read list does not overflow Receive buffer.
|
||||
* - Segment size limited by largest NFS data payload.
|
||||
*
|
||||
* The segment count is limited to how many segments can
|
||||
* fit in the transport header without overflowing the
|
||||
* buffer. That's about 40 Read segments for a 1KB inline
|
||||
* threshold.
|
||||
* The segment count is limited to how many segments can fit in the
|
||||
* transport header without overflowing the buffer. That's about 40
|
||||
* Read segments for a 1KB inline threshold.
|
||||
*
|
||||
* Return values:
|
||||
* %true: Read list is valid. @rctxt's xdr_stream is updated
|
||||
* to point to the first byte past the Read list.
|
||||
* %false: Read list is corrupt. @rctxt's xdr_stream is left
|
||||
* in an unknown state.
|
||||
* %true: Read list is valid. @rctxt's xdr_stream is updated to point
|
||||
* to the first byte past the Read list. rc_read_pcl and
|
||||
* rc_call_pcl cl_count fields are set to the number of
|
||||
* Read segments in the list.
|
||||
* %false: Read list is corrupt. @rctxt's xdr_stream is left in an
|
||||
* unknown state.
|
||||
*/
|
||||
static bool xdr_check_read_list(struct svc_rdma_recv_ctxt *rctxt)
|
||||
static bool xdr_count_read_segments(struct svc_rdma_recv_ctxt *rctxt, __be32 *p)
|
||||
{
|
||||
u32 position, len;
|
||||
bool first;
|
||||
__be32 *p;
|
||||
|
||||
p = xdr_inline_decode(&rctxt->rc_stream, sizeof(*p));
|
||||
if (!p)
|
||||
return false;
|
||||
|
||||
len = 0;
|
||||
first = true;
|
||||
rctxt->rc_call_pcl.cl_count = 0;
|
||||
rctxt->rc_read_pcl.cl_count = 0;
|
||||
while (xdr_item_is_present(p)) {
|
||||
u32 position, handle, length;
|
||||
u64 offset;
|
||||
|
||||
p = xdr_inline_decode(&rctxt->rc_stream,
|
||||
rpcrdma_readseg_maxsz * sizeof(*p));
|
||||
if (!p)
|
||||
return false;
|
||||
|
||||
if (first) {
|
||||
position = be32_to_cpup(p);
|
||||
first = false;
|
||||
} else if (be32_to_cpup(p) != position) {
|
||||
return false;
|
||||
xdr_decode_read_segment(p, &position, &handle,
|
||||
&length, &offset);
|
||||
if (position) {
|
||||
if (position & 3)
|
||||
return false;
|
||||
++rctxt->rc_read_pcl.cl_count;
|
||||
} else {
|
||||
++rctxt->rc_call_pcl.cl_count;
|
||||
}
|
||||
p += 2;
|
||||
len += be32_to_cpup(p);
|
||||
|
||||
p = xdr_inline_decode(&rctxt->rc_stream, sizeof(*p));
|
||||
if (!p)
|
||||
return false;
|
||||
}
|
||||
return len <= MAX_BYTES_SPECIAL_CHUNK;
|
||||
return true;
|
||||
}
|
||||
|
||||
/* The segment count is limited to how many segments can
|
||||
* fit in the transport header without overflowing the
|
||||
* buffer. That's about 60 Write segments for a 1KB inline
|
||||
* threshold.
|
||||
/* Sanity check the Read list.
|
||||
*
|
||||
* Sanity checks:
|
||||
* - Read list does not overflow Receive buffer.
|
||||
* - Chunk size limited by largest NFS data payload.
|
||||
*
|
||||
* Return values:
|
||||
* %true: Read list is valid. @rctxt's xdr_stream is updated
|
||||
* to point to the first byte past the Read list.
|
||||
* %false: Read list is corrupt. @rctxt's xdr_stream is left
|
||||
* in an unknown state.
|
||||
*/
|
||||
static bool xdr_check_write_chunk(struct svc_rdma_recv_ctxt *rctxt, u32 maxlen)
|
||||
static bool xdr_check_read_list(struct svc_rdma_recv_ctxt *rctxt)
|
||||
{
|
||||
u32 i, segcount, total;
|
||||
__be32 *p;
|
||||
|
||||
p = xdr_inline_decode(&rctxt->rc_stream, sizeof(*p));
|
||||
if (!p)
|
||||
return false;
|
||||
segcount = be32_to_cpup(p);
|
||||
if (!xdr_count_read_segments(rctxt, p))
|
||||
return false;
|
||||
if (!pcl_alloc_call(rctxt, p))
|
||||
return false;
|
||||
return pcl_alloc_read(rctxt, p);
|
||||
}
|
||||
|
||||
total = 0;
|
||||
for (i = 0; i < segcount; i++) {
|
||||
u32 handle, length;
|
||||
u64 offset;
|
||||
static bool xdr_check_write_chunk(struct svc_rdma_recv_ctxt *rctxt)
|
||||
{
|
||||
u32 segcount;
|
||||
__be32 *p;
|
||||
|
||||
p = xdr_inline_decode(&rctxt->rc_stream,
|
||||
rpcrdma_segment_maxsz * sizeof(*p));
|
||||
if (xdr_stream_decode_u32(&rctxt->rc_stream, &segcount))
|
||||
return false;
|
||||
|
||||
/* A bogus segcount causes this buffer overflow check to fail. */
|
||||
p = xdr_inline_decode(&rctxt->rc_stream,
|
||||
segcount * rpcrdma_segment_maxsz * sizeof(*p));
|
||||
return p != NULL;
|
||||
}
|
||||
|
||||
/**
 * xdr_count_write_chunks - Count number of Write chunks in Write list
 * @rctxt: Received header and decoding state
 * @p: start of an un-decoded Write list
 *
 * Before allocating anything, ensure the ingress Write list is
 * safe to use.
 *
 * Return values:
 *   %true: Write list is valid. @rctxt's xdr_stream is updated
 *          to point to the first byte past the Write list, and
 *          the number of Write chunks is in rc_write_pcl.cl_count.
 *   %false: Write list is corrupt. @rctxt's xdr_stream is left
 *          in an indeterminate state.
 */
static bool xdr_count_write_chunks(struct svc_rdma_recv_ctxt *rctxt, __be32 *p)
{
	rctxt->rc_write_pcl.cl_count = 0;
	while (xdr_item_is_present(p)) {
		if (!xdr_check_write_chunk(rctxt))
			return false;
		++rctxt->rc_write_pcl.cl_count;
		p = xdr_inline_decode(&rctxt->rc_stream, sizeof(*p));
		if (!p)
			return false;

		xdr_decode_rdma_segment(p, &handle, &length, &offset);
		trace_svcrdma_decode_wseg(handle, length, offset);

		total += length;
	}
	return total <= maxlen;
	return true;
}

/* Sanity check the Write list.

@@ -498,24 +535,18 @@ static bool xdr_check_write_chunk(struct svc_rdma_recv_ctxt *rctxt, u32 maxlen)
 */
static bool xdr_check_write_list(struct svc_rdma_recv_ctxt *rctxt)
{
	u32 chcount = 0;
	__be32 *p;

	p = xdr_inline_decode(&rctxt->rc_stream, sizeof(*p));
	if (!p)
		return false;
	rctxt->rc_write_list = p;
	while (xdr_item_is_present(p)) {
		if (!xdr_check_write_chunk(rctxt, MAX_BYTES_WRITE_CHUNK))
			return false;
		++chcount;
		p = xdr_inline_decode(&rctxt->rc_stream, sizeof(*p));
		if (!p)
			return false;
	}
	if (!chcount)
		rctxt->rc_write_list = NULL;
	return chcount < 2;
	if (!xdr_count_write_chunks(rctxt, p))
		return false;
	if (!pcl_alloc_write(rctxt, &rctxt->rc_write_pcl, p))
		return false;

	rctxt->rc_cur_result_payload = pcl_first_chunk(&rctxt->rc_write_pcl);
	return true;
}

/* Sanity check the Reply chunk.

@@ -537,13 +568,14 @@ static bool xdr_check_reply_chunk(struct svc_rdma_recv_ctxt *rctxt)
	p = xdr_inline_decode(&rctxt->rc_stream, sizeof(*p));
	if (!p)
		return false;
	rctxt->rc_reply_chunk = NULL;
	if (xdr_item_is_present(p)) {
		if (!xdr_check_write_chunk(rctxt, MAX_BYTES_SPECIAL_CHUNK))
			return false;
		rctxt->rc_reply_chunk = p;
	}
	return true;

	if (!xdr_item_is_present(p))
		return true;
	if (!xdr_check_write_chunk(rctxt))
		return false;

	rctxt->rc_reply_pcl.cl_count = 1;
	return pcl_alloc_write(rctxt, &rctxt->rc_reply_pcl, p);
}

/* RPC-over-RDMA Version One private extension: Remote Invalidation.

@@ -552,60 +584,53 @@ static bool xdr_check_reply_chunk(struct svc_rdma_recv_ctxt *rctxt)
 *
 * If there is exactly one distinct R_key in the received transport
 * header, set rc_inv_rkey to that R_key. Otherwise, set it to zero.
 *
 * Perform this operation while the received transport header is
 * still in the CPU cache.
 */
static void svc_rdma_get_inv_rkey(struct svcxprt_rdma *rdma,
				  struct svc_rdma_recv_ctxt *ctxt)
{
	__be32 inv_rkey, *p;
	u32 i, segcount;
	struct svc_rdma_segment *segment;
	struct svc_rdma_chunk *chunk;
	u32 inv_rkey;

	ctxt->rc_inv_rkey = 0;

	if (!rdma->sc_snd_w_inv)
		return;

	inv_rkey = xdr_zero;
	p = ctxt->rc_recv_buf;
	p += rpcrdma_fixed_maxsz;

	/* Read list */
	while (xdr_item_is_present(p++)) {
		p++;	/* position */
		if (inv_rkey == xdr_zero)
			inv_rkey = *p;
		else if (inv_rkey != *p)
			return;
		p += 4;
	}

	/* Write list */
	while (xdr_item_is_present(p++)) {
		segcount = be32_to_cpup(p++);
		for (i = 0; i < segcount; i++) {
			if (inv_rkey == xdr_zero)
				inv_rkey = *p;
			else if (inv_rkey != *p)
				inv_rkey = 0;
	pcl_for_each_chunk(chunk, &ctxt->rc_call_pcl) {
		pcl_for_each_segment(segment, chunk) {
			if (inv_rkey == 0)
				inv_rkey = segment->rs_handle;
			else if (inv_rkey != segment->rs_handle)
				return;
			p += 4;
		}
	}

	/* Reply chunk */
	if (xdr_item_is_present(p++)) {
		segcount = be32_to_cpup(p++);
		for (i = 0; i < segcount; i++) {
			if (inv_rkey == xdr_zero)
				inv_rkey = *p;
			else if (inv_rkey != *p)
	pcl_for_each_chunk(chunk, &ctxt->rc_read_pcl) {
		pcl_for_each_segment(segment, chunk) {
			if (inv_rkey == 0)
				inv_rkey = segment->rs_handle;
			else if (inv_rkey != segment->rs_handle)
				return;
			p += 4;
		}
	}

	ctxt->rc_inv_rkey = be32_to_cpu(inv_rkey);
	pcl_for_each_chunk(chunk, &ctxt->rc_write_pcl) {
		pcl_for_each_segment(segment, chunk) {
			if (inv_rkey == 0)
				inv_rkey = segment->rs_handle;
			else if (inv_rkey != segment->rs_handle)
				return;
		}
	}
	pcl_for_each_chunk(chunk, &ctxt->rc_reply_pcl) {
		pcl_for_each_segment(segment, chunk) {
			if (inv_rkey == 0)
				inv_rkey = segment->rs_handle;
			else if (inv_rkey != segment->rs_handle)
				return;
		}
	}
	ctxt->rc_inv_rkey = inv_rkey;
}

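The Remote Invalidation rule above is a simple fold over every segment handle: remember the first R_key seen, and give up as soon as a different one appears. A minimal sketch of that reduction over a flat array, for illustration only (the function and its use of 0 as "none seen" are hypothetical simplifications, mirroring the kernel code's own convention):

#include <stddef.h>
#include <stdint.h>

/* Returns the single distinct R_key, or 0 if there are zero or many. */
static uint32_t single_rkey(const uint32_t *handles, size_t count)
{
	uint32_t inv_rkey = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (inv_rkey == 0)
			inv_rkey = handles[i];
		else if (inv_rkey != handles[i])
			return 0;	/* more than one distinct R_key */
	}
	return inv_rkey;
}
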
/**

@@ -641,7 +666,8 @@ static int svc_rdma_xdr_decode_req(struct xdr_buf *rq_arg,
	if (*p != rpcrdma_version)
		goto out_version;
	p += 2;
	switch (*p) {
	rctxt->rc_msgtype = *p;
	switch (rctxt->rc_msgtype) {
	case rdma_msg:
		break;
	case rdma_nomsg:

@@ -735,30 +761,28 @@ static void svc_rdma_send_error(struct svcxprt_rdma *rdma,
 * the RPC/RDMA header small and fixed in size, so it is
 * straightforward to check the RPC header's direction field.
 */
static bool svc_rdma_is_backchannel_reply(struct svc_xprt *xprt,
					  __be32 *rdma_resp)
static bool svc_rdma_is_reverse_direction_reply(struct svc_xprt *xprt,
						struct svc_rdma_recv_ctxt *rctxt)
{
	__be32 *p;
	__be32 *p = rctxt->rc_recv_buf;

	if (!xprt->xpt_bc_xprt)
		return false;

	p = rdma_resp + 3;
	if (*p++ != rdma_msg)
	if (rctxt->rc_msgtype != rdma_msg)
		return false;

	if (*p++ != xdr_zero)
	if (!pcl_is_empty(&rctxt->rc_call_pcl))
		return false;
	if (*p++ != xdr_zero)
	if (!pcl_is_empty(&rctxt->rc_read_pcl))
		return false;
	if (*p++ != xdr_zero)
	if (!pcl_is_empty(&rctxt->rc_write_pcl))
		return false;
	if (!pcl_is_empty(&rctxt->rc_reply_pcl))
		return false;

	/* XID sanity */
	if (*p++ != *rdma_resp)
		return false;
	/* call direction */
	if (*p == cpu_to_be32(RPC_CALL))
	/* RPC call direction */
	if (*(p + 8) == cpu_to_be32(RPC_CALL))
		return false;

	return true;

@@ -800,7 +824,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
	struct svcxprt_rdma *rdma_xprt =
		container_of(xprt, struct svcxprt_rdma, sc_xprt);
	struct svc_rdma_recv_ctxt *ctxt;
	__be32 *p;
	int ret;

	rqstp->rq_xprt_ctxt = NULL;

@@ -833,7 +856,6 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
	rqstp->rq_respages = rqstp->rq_pages;
	rqstp->rq_next_page = rqstp->rq_respages;

	p = (__be32 *)rqstp->rq_arg.head[0].iov_base;
	ret = svc_rdma_xdr_decode_req(&rqstp->rq_arg, ctxt);
	if (ret < 0)
		goto out_err;

@@ -841,14 +863,14 @@ int svc_rdma_recvfrom(struct svc_rqst *rqstp)
		goto out_drop;
	rqstp->rq_xprt_hlen = ret;

	if (svc_rdma_is_backchannel_reply(xprt, p))
	if (svc_rdma_is_reverse_direction_reply(xprt, ctxt))
		goto out_backchannel;

	svc_rdma_get_inv_rkey(rdma_xprt, ctxt);

	p += rpcrdma_fixed_maxsz;
	if (*p != xdr_zero)
		goto out_readchunk;
	if (!pcl_is_empty(&ctxt->rc_read_pcl) ||
	    !pcl_is_empty(&ctxt->rc_call_pcl))
		goto out_readlist;

complete:
	rqstp->rq_xprt_ctxt = ctxt;

@@ -856,10 +878,10 @@ complete:
	svc_xprt_copy_addrs(rqstp, xprt);
	return rqstp->rq_arg.len;

out_readchunk:
	ret = svc_rdma_recv_read_chunk(rdma_xprt, rqstp, ctxt, p);
out_readlist:
	ret = svc_rdma_process_read_list(rdma_xprt, rqstp, ctxt);
	if (ret < 0)
		goto out_postfail;
		goto out_readfail;
	return 0;

out_err:

@@ -867,7 +889,7 @@ out_err:
	svc_rdma_recv_ctxt_put(rdma_xprt, ctxt);
	return 0;

out_postfail:
out_readfail:
	if (ret == -EINVAL)
		svc_rdma_send_error(rdma_xprt, ctxt, ret);
	svc_rdma_recv_ctxt_put(rdma_xprt, ctxt);

@@ -190,14 +190,14 @@ static void svc_rdma_cc_release(struct svc_rdma_chunk_ctxt *cc,
 * - Stores arguments for the SGL constructor functions
 */
struct svc_rdma_write_info {
	const struct svc_rdma_chunk *wi_chunk;

	/* write state of this chunk */
	unsigned int wi_seg_off;
	unsigned int wi_seg_no;
	unsigned int wi_nsegs;
	__be32 *wi_segs;

	/* SGL constructor arguments */
	struct xdr_buf *wi_xdr;
	const struct xdr_buf *wi_xdr;
	unsigned char *wi_base;
	unsigned int wi_next_off;

@@ -205,7 +205,8 @@ struct svc_rdma_write_info {
};

static struct svc_rdma_write_info *
svc_rdma_write_info_alloc(struct svcxprt_rdma *rdma, __be32 *chunk)
svc_rdma_write_info_alloc(struct svcxprt_rdma *rdma,
			  const struct svc_rdma_chunk *chunk)
{
	struct svc_rdma_write_info *info;

@@ -213,10 +214,9 @@ svc_rdma_write_info_alloc(struct svcxprt_rdma *rdma, __be32 *chunk)
	if (!info)
		return info;

	info->wi_chunk = chunk;
	info->wi_seg_off = 0;
	info->wi_seg_no = 0;
	info->wi_nsegs = be32_to_cpup(++chunk);
	info->wi_segs = ++chunk;
	svc_rdma_cc_init(rdma, &info->wi_cc);
	info->wi_cc.cc_cqe.done = svc_rdma_write_done;
	return info;

@@ -258,11 +258,11 @@ static void svc_rdma_write_done(struct ib_cq *cq, struct ib_wc *wc)
/* State for pulling a Read chunk.
 */
struct svc_rdma_read_info {
	struct svc_rqst *ri_rqst;
	struct svc_rdma_recv_ctxt *ri_readctxt;
	unsigned int ri_position;
	unsigned int ri_pageno;
	unsigned int ri_pageoff;
	unsigned int ri_chunklen;
	unsigned int ri_totalbytes;

	struct svc_rdma_chunk_ctxt ri_cc;
};

@@ -358,7 +358,6 @@ static int svc_rdma_post_chunk_ctxt(struct svc_rdma_chunk_ctxt *cc)
	do {
		if (atomic_sub_return(cc->cc_sqecount,
				      &rdma->sc_sq_avail) > 0) {
			trace_svcrdma_post_chunk(&cc->cc_cid, cc->cc_sqecount);
			ret = ib_post_send(rdma->sc_qp, first_wr, &bad_wr);
			if (ret)
				break;

@@ -405,7 +404,7 @@ static void svc_rdma_pagelist_to_sg(struct svc_rdma_write_info *info,
				    struct svc_rdma_rw_ctxt *ctxt)
{
	unsigned int sge_no, sge_bytes, page_off, page_no;
	struct xdr_buf *xdr = info->wi_xdr;
	const struct xdr_buf *xdr = info->wi_xdr;
	struct scatterlist *sg;
	struct page **page;

@@ -443,40 +442,36 @@ svc_rdma_build_writes(struct svc_rdma_write_info *info,
{
	struct svc_rdma_chunk_ctxt *cc = &info->wi_cc;
	struct svcxprt_rdma *rdma = cc->cc_rdma;
	const struct svc_rdma_segment *seg;
	struct svc_rdma_rw_ctxt *ctxt;
	__be32 *seg;
	int ret;

	seg = info->wi_segs + info->wi_seg_no * rpcrdma_segment_maxsz;
	do {
		unsigned int write_len;
		u32 handle, length;
		u64 offset;

		if (info->wi_seg_no >= info->wi_nsegs)
		seg = &info->wi_chunk->ch_segments[info->wi_seg_no];
		if (!seg)
			goto out_overflow;

		xdr_decode_rdma_segment(seg, &handle, &length, &offset);
		offset += info->wi_seg_off;

		write_len = min(remaining, length - info->wi_seg_off);
		write_len = min(remaining, seg->rs_length - info->wi_seg_off);
		if (!write_len)
			goto out_overflow;
		ctxt = svc_rdma_get_rw_ctxt(rdma,
					    (write_len >> PAGE_SHIFT) + 2);
		if (!ctxt)
			return -ENOMEM;

		constructor(info, write_len, ctxt);
		ret = svc_rdma_rw_ctx_init(rdma, ctxt, offset, handle,
		offset = seg->rs_offset + info->wi_seg_off;
		ret = svc_rdma_rw_ctx_init(rdma, ctxt, offset, seg->rs_handle,
					   DMA_TO_DEVICE);
		if (ret < 0)
			return -EIO;

		trace_svcrdma_send_wseg(handle, write_len, offset);

		list_add(&ctxt->rw_list, &cc->cc_rwctxts);
		cc->cc_sqecount += ret;
		if (write_len == length - info->wi_seg_off) {
			seg += 4;
		if (write_len == seg->rs_length - info->wi_seg_off) {
			info->wi_seg_no++;
			info->wi_seg_off = 0;
		} else {
@@ -489,31 +484,46 @@ svc_rdma_build_writes(struct svc_rdma_write_info *info,

out_overflow:
	trace_svcrdma_small_wrch_err(rdma, remaining, info->wi_seg_no,
				     info->wi_nsegs);
				     info->wi_chunk->ch_segcount);
	return -E2BIG;
}

/* Send one of an xdr_buf's kvecs by itself. To send a Reply
 * chunk, the whole RPC Reply is written back to the client.
 * This function writes either the head or tail of the xdr_buf
 * containing the Reply.
/**
 * svc_rdma_iov_write - Construct RDMA Writes from an iov
 * @info: pointer to write arguments
 * @iov: kvec to write
 *
 * Returns:
 *   On success, returns zero
 *   %-E2BIG if the client-provided Write chunk is too small
 *   %-ENOMEM if a resource has been exhausted
 *   %-EIO if an rdma-rw error occurred
 */
static int svc_rdma_send_xdr_kvec(struct svc_rdma_write_info *info,
				  struct kvec *vec)
static int svc_rdma_iov_write(struct svc_rdma_write_info *info,
			      const struct kvec *iov)
{
	info->wi_base = vec->iov_base;
	info->wi_base = iov->iov_base;
	return svc_rdma_build_writes(info, svc_rdma_vec_to_sg,
				     vec->iov_len);
				     iov->iov_len);
}

/* Send an xdr_buf's page list by itself. A Write chunk is just
 * the page list. A Reply chunk is @xdr's head, page list, and
 * tail. This function is shared between the two types of chunk.
/**
 * svc_rdma_pages_write - Construct RDMA Writes from pages
 * @info: pointer to write arguments
 * @xdr: xdr_buf with pages to write
 * @offset: offset into the content of @xdr
 * @length: number of bytes to write
 *
 * Returns:
 *   On success, returns zero
 *   %-E2BIG if the client-provided Write chunk is too small
 *   %-ENOMEM if a resource has been exhausted
 *   %-EIO if an rdma-rw error occurred
 */
static int svc_rdma_send_xdr_pagelist(struct svc_rdma_write_info *info,
				      struct xdr_buf *xdr,
				      unsigned int offset,
				      unsigned long length)
static int svc_rdma_pages_write(struct svc_rdma_write_info *info,
				const struct xdr_buf *xdr,
				unsigned int offset,
				unsigned long length)
{
	info->wi_xdr = xdr;
	info->wi_next_off = offset - xdr->head[0].iov_len;

@@ -521,13 +531,49 @@ static int svc_rdma_send_xdr_pagelist(struct svc_rdma_write_info *info,
				      length);
}

/**
 * svc_rdma_xb_write - Construct RDMA Writes to write an xdr_buf
 * @xdr: xdr_buf to write
 * @data: pointer to write arguments
 *
 * Returns:
 *   On success, returns the number of bytes written
 *   %-E2BIG if the client-provided Write chunk is too small
 *   %-ENOMEM if a resource has been exhausted
 *   %-EIO if an rdma-rw error occurred
 */
static int svc_rdma_xb_write(const struct xdr_buf *xdr, void *data)
{
	struct svc_rdma_write_info *info = data;
	int ret;

	if (xdr->head[0].iov_len) {
		ret = svc_rdma_iov_write(info, &xdr->head[0]);
		if (ret < 0)
			return ret;
	}

	if (xdr->page_len) {
		ret = svc_rdma_pages_write(info, xdr, xdr->head[0].iov_len,
					   xdr->page_len);
		if (ret < 0)
			return ret;
	}

	if (xdr->tail[0].iov_len) {
		ret = svc_rdma_iov_write(info, &xdr->tail[0]);
		if (ret < 0)
			return ret;
	}

	return xdr->len;
}

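svc_rdma_xb_write() above is the canonical shape of an xdr_buf walker: head kvec, then the page list, then the tail kvec, stopping at the first failure. For illustration, a hedged userspace sketch of that control flow over a simplified three-part buffer; every name here (struct xdr_buf_, xb_for_each_region) is invented for the sketch and is not the kernel's xdr_buf API:

#include <stddef.h>

struct kvec_ { void *iov_base; size_t iov_len; };
struct xdr_buf_ {
	struct kvec_ head, tail;
	void *pages;		/* stand-in for the real page list */
	size_t page_len;
};

typedef int (*region_fn)(void *base, size_t len, void *data);

/* Visit head, pages, tail in order; stop on the first error. */
static int xb_for_each_region(const struct xdr_buf_ *xdr,
			      region_fn fn, void *data)
{
	int ret;

	if (xdr->head.iov_len) {
		ret = fn(xdr->head.iov_base, xdr->head.iov_len, data);
		if (ret < 0)
			return ret;
	}
	if (xdr->page_len) {
		ret = fn(xdr->pages, xdr->page_len, data);
		if (ret < 0)
			return ret;
	}
	if (xdr->tail.iov_len) {
		ret = fn(xdr->tail.iov_base, xdr->tail.iov_len, data);
		if (ret < 0)
			return ret;
	}
	return 0;
}
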
/**
 * svc_rdma_send_write_chunk - Write all segments in a Write chunk
 * @rdma: controlling RDMA transport
 * @wr_ch: Write chunk provided by client
 * @chunk: Write chunk provided by the client
 * @xdr: xdr_buf containing the data payload
 * @offset: payload's byte offset in @xdr
 * @length: size of payload, in bytes
 *
 * Returns a non-negative number of bytes the chunk consumed, or
 *	%-E2BIG if the payload was larger than the Write chunk,

@@ -536,30 +582,28 @@ static int svc_rdma_send_xdr_pagelist(struct svc_rdma_write_info *info,
 *	%-ENOTCONN if posting failed (connection is lost),
 *	%-EIO if rdma_rw initialization failed (DMA mapping, etc).
 */
int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma, __be32 *wr_ch,
			      struct xdr_buf *xdr,
			      unsigned int offset, unsigned long length)
int svc_rdma_send_write_chunk(struct svcxprt_rdma *rdma,
			      const struct svc_rdma_chunk *chunk,
			      const struct xdr_buf *xdr)
{
	struct svc_rdma_write_info *info;
	struct svc_rdma_chunk_ctxt *cc;
	int ret;

	if (!length)
		return 0;

	info = svc_rdma_write_info_alloc(rdma, wr_ch);
	info = svc_rdma_write_info_alloc(rdma, chunk);
	if (!info)
		return -ENOMEM;
	cc = &info->wi_cc;

	ret = svc_rdma_send_xdr_pagelist(info, xdr, offset, length);
	if (ret < 0)
	ret = svc_rdma_xb_write(xdr, info);
	if (ret != xdr->len)
		goto out_err;

	ret = svc_rdma_post_chunk_ctxt(&info->wi_cc);
	trace_svcrdma_post_write_chunk(&cc->cc_cid, cc->cc_sqecount);
	ret = svc_rdma_post_chunk_ctxt(cc);
	if (ret < 0)
		goto out_err;

	trace_svcrdma_send_write_chunk(xdr->page_len);
	return length;
	return xdr->len;

out_err:
	svc_rdma_write_info_free(info);

@@ -581,62 +625,62 @@ out_err:
 */
int svc_rdma_send_reply_chunk(struct svcxprt_rdma *rdma,
			      const struct svc_rdma_recv_ctxt *rctxt,
			      struct xdr_buf *xdr)
			      const struct xdr_buf *xdr)
{
	struct svc_rdma_write_info *info;
	int consumed, ret;
	struct svc_rdma_chunk_ctxt *cc;
	struct svc_rdma_chunk *chunk;
	int ret;

	info = svc_rdma_write_info_alloc(rdma, rctxt->rc_reply_chunk);
	if (pcl_is_empty(&rctxt->rc_reply_pcl))
		return 0;

	chunk = pcl_first_chunk(&rctxt->rc_reply_pcl);
	info = svc_rdma_write_info_alloc(rdma, chunk);
	if (!info)
		return -ENOMEM;
	cc = &info->wi_cc;

	ret = svc_rdma_send_xdr_kvec(info, &xdr->head[0]);
	if (ret < 0)
		goto out_err;
	consumed = xdr->head[0].iov_len;

	/* Send the page list in the Reply chunk only if the
	 * client did not provide Write chunks.
	 */
	if (!rctxt->rc_write_list && xdr->page_len) {
		ret = svc_rdma_send_xdr_pagelist(info, xdr,
						 xdr->head[0].iov_len,
						 xdr->page_len);
		if (ret < 0)
			goto out_err;
		consumed += xdr->page_len;
	}

	if (xdr->tail[0].iov_len) {
		ret = svc_rdma_send_xdr_kvec(info, &xdr->tail[0]);
		if (ret < 0)
			goto out_err;
		consumed += xdr->tail[0].iov_len;
	}

	ret = svc_rdma_post_chunk_ctxt(&info->wi_cc);
	ret = pcl_process_nonpayloads(&rctxt->rc_write_pcl, xdr,
				      svc_rdma_xb_write, info);
	if (ret < 0)
		goto out_err;

	trace_svcrdma_send_reply_chunk(consumed);
	return consumed;
	trace_svcrdma_post_reply_chunk(&cc->cc_cid, cc->cc_sqecount);
	ret = svc_rdma_post_chunk_ctxt(cc);
	if (ret < 0)
		goto out_err;

	return xdr->len;

out_err:
	svc_rdma_write_info_free(info);
	return ret;
}

/**
 * svc_rdma_build_read_segment - Build RDMA Read WQEs to pull one RDMA segment
 * @info: context for ongoing I/O
 * @segment: co-ordinates of remote memory to be read
 *
 * Returns:
 *   %0: the Read WR chain was constructed successfully
 *   %-EINVAL: there were not enough rq_pages to finish
 *   %-ENOMEM: allocating local resources failed
 *   %-EIO: a DMA mapping error occurred
 */
static int svc_rdma_build_read_segment(struct svc_rdma_read_info *info,
				       struct svc_rqst *rqstp,
				       u32 rkey, u32 len, u64 offset)
				       const struct svc_rdma_segment *segment)
{
	struct svc_rdma_recv_ctxt *head = info->ri_readctxt;
	struct svc_rdma_chunk_ctxt *cc = &info->ri_cc;
	struct svc_rqst *rqstp = info->ri_rqst;
	struct svc_rdma_rw_ctxt *ctxt;
	unsigned int sge_no, seg_len;
	unsigned int sge_no, seg_len, len;
	struct scatterlist *sg;
	int ret;

	len = segment->rs_length;
	sge_no = PAGE_ALIGN(info->ri_pageoff + len) >> PAGE_SHIFT;
	ctxt = svc_rdma_get_rw_ctxt(cc->cc_rdma, sge_no);
	if (!ctxt)

@@ -670,8 +714,8 @@ static int svc_rdma_build_read_segment(struct svc_rdma_read_info *info,
		goto out_overrun;
	}

	ret = svc_rdma_rw_ctx_init(cc->cc_rdma, ctxt, offset, rkey,
				   DMA_FROM_DEVICE);
	ret = svc_rdma_rw_ctx_init(cc->cc_rdma, ctxt, segment->rs_offset,
				   segment->rs_handle, DMA_FROM_DEVICE);
	if (ret < 0)
		return -EIO;

@@ -684,54 +728,177 @@ out_overrun:
	return -EINVAL;
}

/* Walk the segments in the Read chunk starting at @p and construct
 * RDMA Read operations to pull the chunk to the server.
/**
 * svc_rdma_build_read_chunk - Build RDMA Read WQEs to pull one RDMA chunk
 * @info: context for ongoing I/O
 * @chunk: Read chunk to pull
 *
 * Return values:
 *   %0: the Read WR chain was constructed successfully
 *   %-EINVAL: there were not enough resources to finish
 *   %-ENOMEM: allocating local resources failed
 *   %-EIO: a DMA mapping error occurred
 */
static int svc_rdma_build_read_chunk(struct svc_rqst *rqstp,
				     struct svc_rdma_read_info *info,
				     __be32 *p)
static int svc_rdma_build_read_chunk(struct svc_rdma_read_info *info,
				     const struct svc_rdma_chunk *chunk)
{
	const struct svc_rdma_segment *segment;
	int ret;

	ret = -EINVAL;
	info->ri_chunklen = 0;
	while (*p++ != xdr_zero && be32_to_cpup(p++) == info->ri_position) {
		u32 handle, length;
		u64 offset;

		p = xdr_decode_rdma_segment(p, &handle, &length, &offset);
		ret = svc_rdma_build_read_segment(info, rqstp, handle, length,
						  offset);
	pcl_for_each_segment(segment, chunk) {
		ret = svc_rdma_build_read_segment(info, segment);
		if (ret < 0)
			break;

		trace_svcrdma_send_rseg(handle, length, offset);
		info->ri_chunklen += length;
		info->ri_totalbytes += segment->rs_length;
	}

	return ret;
}

/* Construct RDMA Reads to pull over a normal Read chunk. The chunk
 * data lands in the page list of head->rc_arg.pages.
/**
 * svc_rdma_copy_inline_range - Copy part of the inline content into pages
 * @info: context for RDMA Reads
 * @offset: offset into the Receive buffer of region to copy
 * @remaining: length of region to copy
 *
 * Take a page at a time from rqstp->rq_pages and copy the inline
 * content from the Receive buffer into that page. Update
 * info->ri_pageno and info->ri_pageoff so that the next RDMA Read
 * result will land contiguously with the copied content.
 *
 * Return values:
 *   %0: Inline content was successfully copied
 *   %-EINVAL: offset or length was incorrect
 */
static int svc_rdma_copy_inline_range(struct svc_rdma_read_info *info,
				      unsigned int offset,
				      unsigned int remaining)
{
	struct svc_rdma_recv_ctxt *head = info->ri_readctxt;
	unsigned char *dst, *src = head->rc_recv_buf;
	struct svc_rqst *rqstp = info->ri_rqst;
	unsigned int page_no, numpages;

	numpages = PAGE_ALIGN(info->ri_pageoff + remaining) >> PAGE_SHIFT;
	for (page_no = 0; page_no < numpages; page_no++) {
		unsigned int page_len;

		page_len = min_t(unsigned int, remaining,
				 PAGE_SIZE - info->ri_pageoff);

		head->rc_arg.pages[info->ri_pageno] =
			rqstp->rq_pages[info->ri_pageno];
		if (!info->ri_pageoff)
			head->rc_page_count++;

		dst = page_address(head->rc_arg.pages[info->ri_pageno]);
		memcpy(dst + info->ri_pageoff, src + offset, page_len);

		info->ri_totalbytes += page_len;
		info->ri_pageoff += page_len;
		if (info->ri_pageoff == PAGE_SIZE) {
			info->ri_pageno++;
			info->ri_pageoff = 0;
		}
		remaining -= page_len;
		offset += page_len;
	}

	return 0;
}

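The loop above advances a (page number, page offset) pair while copying an arbitrary byte range into fixed-size pages. A self-contained sketch of the same bookkeeping, for illustration only; the types, 4KB page size, and names are assumptions of the sketch, not the kernel's:

#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

struct copy_state {
	unsigned char **pages;	/* array of SKETCH_PAGE_SIZE buffers */
	unsigned int pageno;
	unsigned int pageoff;
};

static void copy_to_pages(struct copy_state *st,
			  const unsigned char *src, size_t remaining)
{
	while (remaining) {
		size_t len = SKETCH_PAGE_SIZE - st->pageoff;

		if (len > remaining)
			len = remaining;	/* last partial page */
		memcpy(st->pages[st->pageno] + st->pageoff, src, len);
		src += len;
		remaining -= len;
		st->pageoff += len;
		if (st->pageoff == SKETCH_PAGE_SIZE) {
			st->pageno++;	/* next copy starts on a fresh page */
			st->pageoff = 0;
		}
	}
}
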
/**
 * svc_rdma_read_multiple_chunks - Construct RDMA Reads to pull data item Read chunks
 * @info: context for RDMA Reads
 *
 * The chunk data lands in head->rc_arg as a series of contiguous pages,
 * like an incoming TCP call.
 *
 * Return values:
 *   %0: RDMA Read WQEs were successfully built
 *   %-EINVAL: client provided too many chunks or segments,
 *   %-ENOMEM: rdma_rw context pool was exhausted,
 *   %-ENOTCONN: posting failed (connection is lost),
 *   %-EIO: rdma_rw initialization failed (DMA mapping, etc).
 */
static noinline int svc_rdma_read_multiple_chunks(struct svc_rdma_read_info *info)
{
	struct svc_rdma_recv_ctxt *head = info->ri_readctxt;
	const struct svc_rdma_pcl *pcl = &head->rc_read_pcl;
	struct svc_rdma_chunk *chunk, *next;
	struct xdr_buf *buf = &head->rc_arg;
	unsigned int start, length;
	int ret;

	start = 0;
	chunk = pcl_first_chunk(pcl);
	length = chunk->ch_position;
	ret = svc_rdma_copy_inline_range(info, start, length);
	if (ret < 0)
		return ret;

	pcl_for_each_chunk(chunk, pcl) {
		ret = svc_rdma_build_read_chunk(info, chunk);
		if (ret < 0)
			return ret;

		next = pcl_next_chunk(pcl, chunk);
		if (!next)
			break;

		start += length;
		length = next->ch_position - info->ri_totalbytes;
		ret = svc_rdma_copy_inline_range(info, start, length);
		if (ret < 0)
			return ret;
	}

	start += length;
	length = head->rc_byte_len - start;
	ret = svc_rdma_copy_inline_range(info, start, length);
	if (ret < 0)
		return ret;

	buf->len += info->ri_totalbytes;
	buf->buflen += info->ri_totalbytes;

	head->rc_hdr_count = 1;
	buf->head[0].iov_base = page_address(head->rc_pages[0]);
	buf->head[0].iov_len = min_t(size_t, PAGE_SIZE, info->ri_totalbytes);
	buf->page_len = info->ri_totalbytes - buf->head[0].iov_len;
	return 0;
}

/**
 * svc_rdma_read_data_item - Construct RDMA Reads to pull data item Read chunks
 * @info: context for RDMA Reads
 *
 * The chunk data lands in the page list of head->rc_arg.pages.
 *
 * Currently NFSD does not look at the head->rc_arg.tail[0] iovec.
 * Therefore, XDR round-up of the Read chunk and trailing
 * inline content must both be added at the end of the pagelist.
 *
 * Return values:
 *   %0: RDMA Read WQEs were successfully built
 *   %-EINVAL: client provided too many chunks or segments,
 *   %-ENOMEM: rdma_rw context pool was exhausted,
 *   %-ENOTCONN: posting failed (connection is lost),
 *   %-EIO: rdma_rw initialization failed (DMA mapping, etc).
 */
static int svc_rdma_build_normal_read_chunk(struct svc_rqst *rqstp,
					    struct svc_rdma_read_info *info,
					    __be32 *p)
static int svc_rdma_read_data_item(struct svc_rdma_read_info *info)
{
	struct svc_rdma_recv_ctxt *head = info->ri_readctxt;
	struct xdr_buf *buf = &head->rc_arg;
	struct svc_rdma_chunk *chunk;
	unsigned int length;
	int ret;

	ret = svc_rdma_build_read_chunk(rqstp, info, p);
	chunk = pcl_first_chunk(&head->rc_read_pcl);
	ret = svc_rdma_build_read_chunk(info, chunk);
	if (ret < 0)
		goto out;

	trace_svcrdma_send_read_chunk(info->ri_chunklen, info->ri_position);

	head->rc_hdr_count = 0;

	/* Split the Receive buffer between the head and tail

@@ -739,11 +906,9 @@ static int svc_rdma_build_normal_read_chunk(struct svc_rqst *rqstp,
	 * chunk is not included in either the pagelist or in
	 * the tail.
	 */
	head->rc_arg.tail[0].iov_base =
		head->rc_arg.head[0].iov_base + info->ri_position;
	head->rc_arg.tail[0].iov_len =
		head->rc_arg.head[0].iov_len - info->ri_position;
	head->rc_arg.head[0].iov_len = info->ri_position;
	buf->tail[0].iov_base = buf->head[0].iov_base + chunk->ch_position;
	buf->tail[0].iov_len = buf->head[0].iov_len - chunk->ch_position;
	buf->head[0].iov_len = chunk->ch_position;

	/* Read chunk may need XDR roundup (see RFC 8166, s. 3.4.5.2).
	 *

@@ -754,50 +919,149 @@ static int svc_rdma_build_normal_read_chunk(struct svc_rqst *rqstp,
	 * Currently these chunks always start at page offset 0,
	 * thus the rounded-up length never crosses a page boundary.
	 */
	info->ri_chunklen = XDR_QUADLEN(info->ri_chunklen) << 2;

	head->rc_arg.page_len = info->ri_chunklen;
	head->rc_arg.len += info->ri_chunklen;
	head->rc_arg.buflen += info->ri_chunklen;
	length = XDR_QUADLEN(info->ri_totalbytes) << 2;
	buf->page_len = length;
	buf->len += length;
	buf->buflen += length;

out:
	return ret;
}

/* Construct RDMA Reads to pull over a Position Zero Read chunk.
 * The start of the data lands in the first page just after
 * the Transport header, and the rest lands in the page list of
/**
 * svc_rdma_read_chunk_range - Build RDMA Read WQEs for portion of a chunk
 * @info: context for RDMA Reads
 * @chunk: parsed Call chunk to pull
 * @offset: offset of region to pull
 * @length: length of region to pull
 *
 * Return values:
 *   %0: RDMA Read WQEs were successfully built
 *   %-EINVAL: there were not enough resources to finish
 *   %-ENOMEM: rdma_rw context pool was exhausted,
 *   %-ENOTCONN: posting failed (connection is lost),
 *   %-EIO: rdma_rw initialization failed (DMA mapping, etc).
 */
static int svc_rdma_read_chunk_range(struct svc_rdma_read_info *info,
				     const struct svc_rdma_chunk *chunk,
				     unsigned int offset, unsigned int length)
{
	const struct svc_rdma_segment *segment;
	int ret;

	ret = -EINVAL;
	pcl_for_each_segment(segment, chunk) {
		struct svc_rdma_segment dummy;

		if (offset > segment->rs_length) {
			offset -= segment->rs_length;
			continue;
		}

		dummy.rs_handle = segment->rs_handle;
		dummy.rs_length = min_t(u32, length, segment->rs_length) - offset;
		dummy.rs_offset = segment->rs_offset + offset;

		ret = svc_rdma_build_read_segment(info, &dummy);
		if (ret < 0)
			break;

		info->ri_totalbytes += dummy.rs_length;
		length -= dummy.rs_length;
		offset = 0;
	}
	return ret;
}

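svc_rdma_read_chunk_range() above carves an arbitrary (offset, length) window out of a list of segments: whole segments before the window are skipped, the first overlapping segment is trimmed, and the remainder is consumed segment by segment. A hedged sketch of just that arithmetic; the types and callback shape are assumptions of the sketch (only the rs_* field names mirror the code above):

#include <stddef.h>
#include <stdint.h>

struct seg { uint32_t rs_length; uint64_t rs_offset; };

/* Invoke cb() once per trimmed sub-segment covering [offset, offset+length). */
static int for_each_subrange(const struct seg *segs, size_t nsegs,
			     uint32_t offset, uint32_t length,
			     int (*cb)(uint64_t off, uint32_t len))
{
	size_t i;
	int ret = -1;	/* stays -1 if the window was never reached */

	for (i = 0; i < nsegs && length; i++) {
		uint32_t take;

		if (offset >= segs[i].rs_length) {
			offset -= segs[i].rs_length;	/* window starts later */
			continue;
		}
		take = segs[i].rs_length - offset;
		if (take > length)
			take = length;	/* window ends inside this segment */
		ret = cb(segs[i].rs_offset + offset, take);
		if (ret < 0)
			break;
		length -= take;
		offset = 0;	/* later segments start inside the window */
	}
	return ret;
}
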
/**
 * svc_rdma_read_call_chunk - Build RDMA Read WQEs to pull a Long Message
 * @info: context for RDMA Reads
 *
 * Return values:
 *   %0: RDMA Read WQEs were successfully built
 *   %-EINVAL: there were not enough resources to finish
 *   %-ENOMEM: rdma_rw context pool was exhausted,
 *   %-ENOTCONN: posting failed (connection is lost),
 *   %-EIO: rdma_rw initialization failed (DMA mapping, etc).
 */
static int svc_rdma_read_call_chunk(struct svc_rdma_read_info *info)
{
	struct svc_rdma_recv_ctxt *head = info->ri_readctxt;
	const struct svc_rdma_chunk *call_chunk =
		pcl_first_chunk(&head->rc_call_pcl);
	const struct svc_rdma_pcl *pcl = &head->rc_read_pcl;
	struct svc_rdma_chunk *chunk, *next;
	unsigned int start, length;
	int ret;

	if (pcl_is_empty(pcl))
		return svc_rdma_build_read_chunk(info, call_chunk);

	start = 0;
	chunk = pcl_first_chunk(pcl);
	length = chunk->ch_position;
	ret = svc_rdma_read_chunk_range(info, call_chunk, start, length);
	if (ret < 0)
		return ret;

	pcl_for_each_chunk(chunk, pcl) {
		ret = svc_rdma_build_read_chunk(info, chunk);
		if (ret < 0)
			return ret;

		next = pcl_next_chunk(pcl, chunk);
		if (!next)
			break;

		start += length;
		length = next->ch_position - info->ri_totalbytes;
		ret = svc_rdma_read_chunk_range(info, call_chunk,
						start, length);
		if (ret < 0)
			return ret;
	}

	start += length;
	length = call_chunk->ch_length - start;
	return svc_rdma_read_chunk_range(info, call_chunk, start, length);
}

/**
 * svc_rdma_read_special - Build RDMA Read WQEs to pull a Long Message
 * @info: context for RDMA Reads
 *
 * The start of the data lands in the first page just after the
 * Transport header, and the rest lands in the page list of
 * head->rc_arg.pages.
 *
 * Assumptions:
 *	- A PZRC has an XDR-aligned length (no implicit round-up).
 *	- There can be no trailing inline content (IOW, we assume
 *	  a PZRC is never sent in an RDMA_MSG message, though it's
 *	  allowed by spec).
 *	- A PZRC is never sent in an RDMA_MSG message, though it's
 *	  allowed by spec.
 *
 * Return values:
 *   %0: RDMA Read WQEs were successfully built
 *   %-EINVAL: client provided too many chunks or segments,
 *   %-ENOMEM: rdma_rw context pool was exhausted,
 *   %-ENOTCONN: posting failed (connection is lost),
 *   %-EIO: rdma_rw initialization failed (DMA mapping, etc).
 */
static int svc_rdma_build_pz_read_chunk(struct svc_rqst *rqstp,
					struct svc_rdma_read_info *info,
					__be32 *p)
static noinline int svc_rdma_read_special(struct svc_rdma_read_info *info)
{
	struct svc_rdma_recv_ctxt *head = info->ri_readctxt;
	struct xdr_buf *buf = &head->rc_arg;
	int ret;

	ret = svc_rdma_build_read_chunk(rqstp, info, p);
	ret = svc_rdma_read_call_chunk(info);
	if (ret < 0)
		goto out;

	trace_svcrdma_send_pzr(info->ri_chunklen);

	head->rc_arg.len += info->ri_chunklen;
	head->rc_arg.buflen += info->ri_chunklen;
	buf->len += info->ri_totalbytes;
	buf->buflen += info->ri_totalbytes;

	head->rc_hdr_count = 1;
	head->rc_arg.head[0].iov_base = page_address(head->rc_pages[0]);
	head->rc_arg.head[0].iov_len = min_t(size_t, PAGE_SIZE,
					     info->ri_chunklen);

	head->rc_arg.page_len = info->ri_chunklen -
		head->rc_arg.head[0].iov_len;
	buf->head[0].iov_base = page_address(head->rc_pages[0]);
	buf->head[0].iov_len = min_t(size_t, PAGE_SIZE, info->ri_totalbytes);
	buf->page_len = info->ri_totalbytes - buf->head[0].iov_len;

out:
	return ret;

@@ -824,26 +1088,34 @@ static void svc_rdma_save_io_pages(struct svc_rqst *rqstp,
}

/**
 * svc_rdma_recv_read_chunk - Pull a Read chunk from the client
 * svc_rdma_process_read_list - Pull list of Read chunks from the client
 * @rdma: controlling RDMA transport
 * @rqstp: set of pages to use as Read sink buffers
 * @head: pages under I/O collect here
 * @p: pointer to start of Read chunk
 *
 * Returns:
 *	%0 if all needed RDMA Reads were posted successfully,
 *	%-EINVAL if client provided too many segments,
 *	%-ENOMEM if rdma_rw context pool was exhausted,
 *	%-ENOTCONN if posting failed (connection is lost),
 *	%-EIO if rdma_rw initialization failed (DMA mapping, etc).
 * The RPC/RDMA protocol assumes that the upper layer's XDR decoders
 * pull each Read chunk as they decode an incoming RPC message.
 *
 * Assumptions:
 * - All Read segments in @p have the same Position value.
 * On Linux, however, the server needs to have a fully-constructed RPC
 * message in rqstp->rq_arg when there is a positive return code from
 * ->xpo_recvfrom. So the Read list is safety-checked immediately when
 * it is received, then here the whole Read list is pulled all at once.
 * The ingress RPC message is fully reconstructed once all associated
 * RDMA Reads have completed.
 *
 * Return values:
 *   %1: all needed RDMA Reads were posted successfully,
 *   %-EINVAL: client provided too many chunks or segments,
 *   %-ENOMEM: rdma_rw context pool was exhausted,
 *   %-ENOTCONN: posting failed (connection is lost),
 *   %-EIO: rdma_rw initialization failed (DMA mapping, etc).
 */
int svc_rdma_recv_read_chunk(struct svcxprt_rdma *rdma, struct svc_rqst *rqstp,
			     struct svc_rdma_recv_ctxt *head, __be32 *p)
int svc_rdma_process_read_list(struct svcxprt_rdma *rdma,
			       struct svc_rqst *rqstp,
			       struct svc_rdma_recv_ctxt *head)
{
	struct svc_rdma_read_info *info;
	struct svc_rdma_chunk_ctxt *cc;
	int ret;

	/* The request (with page list) is constructed in

@@ -861,23 +1133,29 @@ int svc_rdma_recv_read_chunk(struct svcxprt_rdma *rdma, struct svc_rqst *rqstp,
	info = svc_rdma_read_info_alloc(rdma);
	if (!info)
		return -ENOMEM;
	cc = &info->ri_cc;
	info->ri_rqst = rqstp;
	info->ri_readctxt = head;
	info->ri_pageno = 0;
	info->ri_pageoff = 0;
	info->ri_totalbytes = 0;

	info->ri_position = be32_to_cpup(p + 1);
	if (info->ri_position)
		ret = svc_rdma_build_normal_read_chunk(rqstp, info, p);
	else
		ret = svc_rdma_build_pz_read_chunk(rqstp, info, p);
	if (pcl_is_empty(&head->rc_call_pcl)) {
		if (head->rc_read_pcl.cl_count == 1)
			ret = svc_rdma_read_data_item(info);
		else
			ret = svc_rdma_read_multiple_chunks(info);
	} else
		ret = svc_rdma_read_special(info);
	if (ret < 0)
		goto out_err;

	ret = svc_rdma_post_chunk_ctxt(&info->ri_cc);
	trace_svcrdma_post_read_chunk(&cc->cc_cid, cc->cc_sqecount);
	ret = svc_rdma_post_chunk_ctxt(cc);
	if (ret < 0)
		goto out_err;
	svc_rdma_save_io_pages(rqstp, 0, head->rc_page_count);
	return 0;
	return 1;

out_err:
	svc_rdma_read_info_free(info);

@@ -358,49 +358,42 @@ static ssize_t svc_rdma_encode_read_list(struct svc_rdma_send_ctxt *sctxt)

/**
 * svc_rdma_encode_write_segment - Encode one Write segment
 * @src: matching Write chunk in the RPC Call header
 * @sctxt: Send context for the RPC Reply
 * @chunk: Write chunk to push
 * @remaining: remaining bytes of the payload left in the Write chunk
 * @segno: which segment in the chunk
 *
 * Return values:
 *   On success, returns length in bytes of the Reply XDR buffer
 *   that was consumed by the Write segment
 *   that was consumed by the Write segment, and updates @remaining
 *   %-EMSGSIZE on XDR buffer overflow
 */
static ssize_t svc_rdma_encode_write_segment(__be32 *src,
					     struct svc_rdma_send_ctxt *sctxt,
					     unsigned int *remaining)
static ssize_t svc_rdma_encode_write_segment(struct svc_rdma_send_ctxt *sctxt,
					     const struct svc_rdma_chunk *chunk,
					     u32 *remaining, unsigned int segno)
{
	const struct svc_rdma_segment *segment = &chunk->ch_segments[segno];
	const size_t len = rpcrdma_segment_maxsz * sizeof(__be32);
	u32 length;
	__be32 *p;
	const size_t len = rpcrdma_segment_maxsz * sizeof(*p);
	u32 handle, length;
	u64 offset;

	p = xdr_reserve_space(&sctxt->sc_stream, len);
	if (!p)
		return -EMSGSIZE;

	xdr_decode_rdma_segment(src, &handle, &length, &offset);

	if (*remaining < length) {
		/* segment only partly filled */
		length = *remaining;
		*remaining = 0;
	} else {
		/* entire segment was consumed */
		*remaining -= length;
	}
	xdr_encode_rdma_segment(p, handle, length, offset);

	trace_svcrdma_encode_wseg(handle, length, offset);
	length = min_t(u32, *remaining, segment->rs_length);
	*remaining -= length;
	xdr_encode_rdma_segment(p, segment->rs_handle, length,
				segment->rs_offset);
	trace_svcrdma_encode_wseg(sctxt, segno, segment->rs_handle, length,
				  segment->rs_offset);
	return len;
}

/**
 * svc_rdma_encode_write_chunk - Encode one Write chunk
 * @src: matching Write chunk in the RPC Call header
 * @sctxt: Send context for the RPC Reply
 * @remaining: size in bytes of the payload in the Write chunk
 * @chunk: Write chunk to push
 *
 * Copy a Write chunk from the Call transport header to the
 * Reply transport header. Update each segment's length field

@@ -411,33 +404,28 @@ static ssize_t svc_rdma_encode_write_segment(__be32 *src,
 *   that was consumed by the Write chunk
 *   %-EMSGSIZE on XDR buffer overflow
 */
static ssize_t svc_rdma_encode_write_chunk(__be32 *src,
					   struct svc_rdma_send_ctxt *sctxt,
					   unsigned int remaining)
static ssize_t svc_rdma_encode_write_chunk(struct svc_rdma_send_ctxt *sctxt,
					   const struct svc_rdma_chunk *chunk)
{
	unsigned int i, nsegs;
	u32 remaining = chunk->ch_payload_length;
	unsigned int segno;
	ssize_t len, ret;

	len = 0;
	trace_svcrdma_encode_write_chunk(remaining);

	src++;
	ret = xdr_stream_encode_item_present(&sctxt->sc_stream);
	if (ret < 0)
		return -EMSGSIZE;
		return ret;
	len += ret;

	nsegs = be32_to_cpup(src++);
	ret = xdr_stream_encode_u32(&sctxt->sc_stream, nsegs);
	ret = xdr_stream_encode_u32(&sctxt->sc_stream, chunk->ch_segcount);
	if (ret < 0)
		return -EMSGSIZE;
		return ret;
	len += ret;

	for (i = nsegs; i; i--) {
		ret = svc_rdma_encode_write_segment(src, sctxt, &remaining);
	for (segno = 0; segno < chunk->ch_segcount; segno++) {
		ret = svc_rdma_encode_write_segment(sctxt, chunk, &remaining, segno);
		if (ret < 0)
			return -EMSGSIZE;
		src += rpcrdma_segment_maxsz;
			return ret;
		len += ret;
	}

@@ -448,32 +436,25 @@ static ssize_t svc_rdma_encode_write_chunk(__be32 *src,
 * svc_rdma_encode_write_list - Encode RPC Reply's Write chunk list
 * @rctxt: Reply context with information about the RPC Call
 * @sctxt: Send context for the RPC Reply
 * @length: size in bytes of the payload in the first Write chunk
 *
 * The client provides a Write chunk list in the Call message. Fill
 * in the segments in the first Write chunk in the Reply's transport
 * header with the number of bytes consumed in each segment.
 * Remaining chunks are returned unused.
 *
 * Assumptions:
 *  - Client has provided only one Write chunk
 *
 * Return values:
 *   On success, returns length in bytes of the Reply XDR buffer
 *   that was consumed by the Reply's Write list
 *   %-EMSGSIZE on XDR buffer overflow
 */
static ssize_t
svc_rdma_encode_write_list(const struct svc_rdma_recv_ctxt *rctxt,
			   struct svc_rdma_send_ctxt *sctxt,
			   unsigned int length)
static ssize_t svc_rdma_encode_write_list(struct svc_rdma_recv_ctxt *rctxt,
					  struct svc_rdma_send_ctxt *sctxt)
{
	struct svc_rdma_chunk *chunk;
	ssize_t len, ret;

	ret = svc_rdma_encode_write_chunk(rctxt->rc_write_list, sctxt, length);
	if (ret < 0)
		return ret;
	len = ret;
	len = 0;
	pcl_for_each_chunk(chunk, &rctxt->rc_write_pcl) {
		ret = svc_rdma_encode_write_chunk(sctxt, chunk);
		if (ret < 0)
			return ret;
		len += ret;
	}

	/* Terminate the Write list */
	ret = xdr_stream_encode_item_absent(&sctxt->sc_stream);

@@ -489,56 +470,174 @@ svc_rdma_encode_write_list(const struct svc_rdma_recv_ctxt *rctxt,
 * @sctxt: Send context for the RPC Reply
 * @length: size in bytes of the payload in the Reply chunk
 *
 * Assumptions:
 * - Reply can always fit in the client-provided Reply chunk
 *
 * Return values:
 *   On success, returns length in bytes of the Reply XDR buffer
 *   that was consumed by the Reply's Reply chunk
 *   %-EMSGSIZE on XDR buffer overflow
 *   %-E2BIG if the RPC message is larger than the Reply chunk
 */
static ssize_t
svc_rdma_encode_reply_chunk(const struct svc_rdma_recv_ctxt *rctxt,
svc_rdma_encode_reply_chunk(struct svc_rdma_recv_ctxt *rctxt,
			    struct svc_rdma_send_ctxt *sctxt,
			    unsigned int length)
{
	return svc_rdma_encode_write_chunk(rctxt->rc_reply_chunk, sctxt,
					   length);
	struct svc_rdma_chunk *chunk;

	if (pcl_is_empty(&rctxt->rc_reply_pcl))
		return xdr_stream_encode_item_absent(&sctxt->sc_stream);

	chunk = pcl_first_chunk(&rctxt->rc_reply_pcl);
	if (length > chunk->ch_length)
		return -E2BIG;

	chunk->ch_payload_length = length;
	return svc_rdma_encode_write_chunk(sctxt, chunk);
}

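The new encoder above records how much of each Write segment was actually consumed: a segment is filled up to its registered length while payload remains, so trailing segments report zero. A hedged sketch of just that accounting, with illustrative names that are not part of this patch:

#include <stddef.h>
#include <stdint.h>

/* Rewrite seg_lens[] in place to the number of bytes each segment carried. */
static void account_write_payload(uint32_t *seg_lens, size_t nsegs,
				  uint32_t remaining)
{
	size_t i;

	for (i = 0; i < nsegs; i++) {
		uint32_t length = seg_lens[i];

		if (length > remaining)
			length = remaining;	/* segment only partly filled */
		remaining -= length;
		seg_lens[i] = length;		/* trailing segments become 0 */
	}
}
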
static int svc_rdma_dma_map_page(struct svcxprt_rdma *rdma,
				 struct svc_rdma_send_ctxt *ctxt,
				 struct page *page,
				 unsigned long offset,
				 unsigned int len)
struct svc_rdma_map_data {
	struct svcxprt_rdma *md_rdma;
	struct svc_rdma_send_ctxt *md_ctxt;
};

/**
 * svc_rdma_page_dma_map - DMA map one page
 * @data: pointer to arguments
 * @page: struct page to DMA map
 * @offset: offset into the page
 * @len: number of bytes to map
 *
 * Returns:
 *   %0 if DMA mapping was successful
 *   %-EIO if the page cannot be DMA mapped
 */
static int svc_rdma_page_dma_map(void *data, struct page *page,
				 unsigned long offset, unsigned int len)
{
	struct svc_rdma_map_data *args = data;
	struct svcxprt_rdma *rdma = args->md_rdma;
	struct svc_rdma_send_ctxt *ctxt = args->md_ctxt;
	struct ib_device *dev = rdma->sc_cm_id->device;
	dma_addr_t dma_addr;

	++ctxt->sc_cur_sge_no;

	dma_addr = ib_dma_map_page(dev, page, offset, len, DMA_TO_DEVICE);
	trace_svcrdma_dma_map_page(rdma, dma_addr, len);
	if (ib_dma_mapping_error(dev, dma_addr))
		goto out_maperr;

	trace_svcrdma_dma_map_page(rdma, dma_addr, len);
	ctxt->sc_sges[ctxt->sc_cur_sge_no].addr = dma_addr;
	ctxt->sc_sges[ctxt->sc_cur_sge_no].length = len;
	ctxt->sc_send_wr.num_sge++;
	return 0;

out_maperr:
	trace_svcrdma_dma_map_err(rdma, dma_addr, len);
	return -EIO;
}

/* ib_dma_map_page() is used here because svc_rdma_dma_unmap()
/**
 * svc_rdma_iov_dma_map - DMA map an iovec
 * @data: pointer to arguments
 * @iov: kvec to DMA map
 *
 * ib_dma_map_page() is used here because svc_rdma_dma_unmap()
 * handles DMA-unmap and it uses ib_dma_unmap_page() exclusively.
 *
 * Returns:
 *   %0 if DMA mapping was successful
 *   %-EIO if the iovec cannot be DMA mapped
 */
static int svc_rdma_dma_map_buf(struct svcxprt_rdma *rdma,
				struct svc_rdma_send_ctxt *ctxt,
				unsigned char *base,
				unsigned int len)
static int svc_rdma_iov_dma_map(void *data, const struct kvec *iov)
{
	return svc_rdma_dma_map_page(rdma, ctxt, virt_to_page(base),
				     offset_in_page(base), len);
	if (!iov->iov_len)
		return 0;
	return svc_rdma_page_dma_map(data, virt_to_page(iov->iov_base),
				     offset_in_page(iov->iov_base),
				     iov->iov_len);
}

/**
 * svc_rdma_xb_dma_map - DMA map all segments of an xdr_buf
 * @xdr: xdr_buf containing portion of an RPC message to transmit
 * @data: pointer to arguments
 *
 * Returns:
 *   %0 if DMA mapping was successful
 *   %-EIO if DMA mapping failed
 *
 * On failure, any DMA mappings that have been already done must be
 * unmapped by the caller.
 */
static int svc_rdma_xb_dma_map(const struct xdr_buf *xdr, void *data)
{
	unsigned int len, remaining;
	unsigned long pageoff;
	struct page **ppages;
	int ret;

	ret = svc_rdma_iov_dma_map(data, &xdr->head[0]);
	if (ret < 0)
		return ret;

	ppages = xdr->pages + (xdr->page_base >> PAGE_SHIFT);
	pageoff = offset_in_page(xdr->page_base);
	remaining = xdr->page_len;
	while (remaining) {
		len = min_t(u32, PAGE_SIZE - pageoff, remaining);

		ret = svc_rdma_page_dma_map(data, *ppages++, pageoff, len);
		if (ret < 0)
			return ret;

		remaining -= len;
		pageoff = 0;
	}

	ret = svc_rdma_iov_dma_map(data, &xdr->tail[0]);
	if (ret < 0)
		return ret;

	return xdr->len;
}

struct svc_rdma_pullup_data {
	u8 *pd_dest;
	unsigned int pd_length;
	unsigned int pd_num_sges;
};

/**
 * svc_rdma_xb_count_sges - Count how many SGEs will be needed
 * @xdr: xdr_buf containing portion of an RPC message to transmit
 * @data: pointer to arguments
 *
 * Returns:
 *   Number of SGEs needed to Send the contents of @xdr inline
 */
static int svc_rdma_xb_count_sges(const struct xdr_buf *xdr,
				  void *data)
{
	struct svc_rdma_pullup_data *args = data;
	unsigned int remaining;
	unsigned long offset;

	if (xdr->head[0].iov_len)
		++args->pd_num_sges;

	offset = offset_in_page(xdr->page_base);
	remaining = xdr->page_len;
	while (remaining) {
		++args->pd_num_sges;
		remaining -= min_t(u32, PAGE_SIZE - offset, remaining);
		offset = 0;
	}

	if (xdr->tail[0].iov_len)
		++args->pd_num_sges;

	args->pd_length += xdr->len;
	return 0;
}

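Counting Send SGEs is pure arithmetic: one SGE for a non-empty head or tail kvec, plus one per page the page list touches. For illustration, a minimal sketch of the page-crossing count for a region that starts at some offset within a page; the 4KB page size and names are assumptions of the sketch:

#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* power of two, as on most platforms */

static unsigned int count_page_sges(size_t offset, size_t remaining)
{
	unsigned int sges = 0;

	offset &= SKETCH_PAGE_SIZE - 1;	/* offset within the first page */
	while (remaining) {
		size_t len = SKETCH_PAGE_SIZE - offset;

		if (len > remaining)
			len = remaining;
		sges++;			/* one SGE per page fragment */
		remaining -= len;
		offset = 0;		/* later fragments start page-aligned */
	}
	return sges;
}
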
/**

@@ -549,48 +648,71 @@ static int svc_rdma_dma_map_buf(struct svcxprt_rdma *rdma,
 * @xdr: xdr_buf containing RPC message to transmit
 *
 * Returns:
 *	%true if pull-up must be used
 *	%false otherwise
 *   %true if pull-up must be used
 *   %false otherwise
 */
static bool svc_rdma_pull_up_needed(struct svcxprt_rdma *rdma,
				    struct svc_rdma_send_ctxt *sctxt,
static bool svc_rdma_pull_up_needed(const struct svcxprt_rdma *rdma,
				    const struct svc_rdma_send_ctxt *sctxt,
				    const struct svc_rdma_recv_ctxt *rctxt,
				    struct xdr_buf *xdr)
				    const struct xdr_buf *xdr)
{
	int elements;
	/* Resources needed for the transport header */
	struct svc_rdma_pullup_data args = {
		.pd_length = sctxt->sc_hdrbuf.len,
		.pd_num_sges = 1,
	};
	int ret;

	/* For small messages, copying bytes is cheaper than DMA mapping.
	 */
	if (sctxt->sc_hdrbuf.len + xdr->len < RPCRDMA_PULLUP_THRESH)
	ret = pcl_process_nonpayloads(&rctxt->rc_write_pcl, xdr,
				      svc_rdma_xb_count_sges, &args);
	if (ret < 0)
		return false;

	if (args.pd_length < RPCRDMA_PULLUP_THRESH)
		return true;
	return args.pd_num_sges >= rdma->sc_max_send_sges;
}

	/* Check whether the xdr_buf has more elements than can
	 * fit in a single RDMA Send.
	 */
	/* xdr->head */
	elements = 1;
/**
 * svc_rdma_xb_linearize - Copy region of xdr_buf to flat buffer
 * @xdr: xdr_buf containing portion of an RPC message to copy
 * @data: pointer to arguments
 *
 * Returns:
 *   Always zero.
 */
static int svc_rdma_xb_linearize(const struct xdr_buf *xdr,
				 void *data)
{
	struct svc_rdma_pullup_data *args = data;
	unsigned int len, remaining;
	unsigned long pageoff;
	struct page **ppages;

	/* xdr->pages */
	if (!rctxt || !rctxt->rc_write_list) {
		unsigned int remaining;
		unsigned long pageoff;

		pageoff = xdr->page_base & ~PAGE_MASK;
		remaining = xdr->page_len;
		while (remaining) {
			++elements;
			remaining -= min_t(u32, PAGE_SIZE - pageoff,
					   remaining);
			pageoff = 0;
		}
	if (xdr->head[0].iov_len) {
		memcpy(args->pd_dest, xdr->head[0].iov_base, xdr->head[0].iov_len);
		args->pd_dest += xdr->head[0].iov_len;
	}

	/* xdr->tail */
	if (xdr->tail[0].iov_len)
		++elements;
	ppages = xdr->pages + (xdr->page_base >> PAGE_SHIFT);
	pageoff = offset_in_page(xdr->page_base);
	remaining = xdr->page_len;
	while (remaining) {
		len = min_t(u32, PAGE_SIZE - pageoff, remaining);
		memcpy(args->pd_dest, page_address(*ppages) + pageoff, len);
		remaining -= len;
		args->pd_dest += len;
		pageoff = 0;
		ppages++;
	}

	/* assume 1 SGE is needed for the transport header */
	return elements >= rdma->sc_max_send_sges;
	if (xdr->tail[0].iov_len) {
		memcpy(args->pd_dest, xdr->tail[0].iov_base, xdr->tail[0].iov_len);
		args->pd_dest += xdr->tail[0].iov_len;
	}

	args->pd_length += xdr->len;
	return 0;
}

/**

@@ -603,54 +725,30 @@ static bool svc_rdma_pull_up_needed(struct svcxprt_rdma *rdma,
* The device is not capable of sending the reply directly.
|
||||
* Assemble the elements of @xdr into the transport header buffer.
|
||||
*
|
||||
* Returns zero on success, or a negative errno on failure.
|
||||
* Assumptions:
|
||||
* pull_up_needed has determined that @xdr will fit in the buffer.
|
||||
*
|
||||
* Returns:
|
||||
* %0 if pull-up was successful
|
||||
* %-EMSGSIZE if a buffer manipulation problem occurred
|
||||
*/
|
||||
static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma,
|
||||
static int svc_rdma_pull_up_reply_msg(const struct svcxprt_rdma *rdma,
|
||||
struct svc_rdma_send_ctxt *sctxt,
|
||||
const struct svc_rdma_recv_ctxt *rctxt,
|
||||
const struct xdr_buf *xdr)
|
||||
{
|
||||
unsigned char *dst, *tailbase;
|
||||
unsigned int taillen;
|
||||
struct svc_rdma_pullup_data args = {
|
||||
.pd_dest = sctxt->sc_xprt_buf + sctxt->sc_hdrbuf.len,
|
||||
};
|
||||
int ret;
|
||||
|
||||
dst = sctxt->sc_xprt_buf + sctxt->sc_hdrbuf.len;
|
||||
memcpy(dst, xdr->head[0].iov_base, xdr->head[0].iov_len);
|
||||
dst += xdr->head[0].iov_len;
|
||||
ret = pcl_process_nonpayloads(&rctxt->rc_write_pcl, xdr,
|
||||
svc_rdma_xb_linearize, &args);
|
||||
if (ret < 0)
|
||||
return ret;
|
||||
|
||||
tailbase = xdr->tail[0].iov_base;
|
||||
taillen = xdr->tail[0].iov_len;
|
||||
if (rctxt && rctxt->rc_write_list) {
|
||||
u32 xdrpad;
|
||||
|
||||
xdrpad = xdr_pad_size(xdr->page_len);
|
||||
if (taillen && xdrpad) {
|
||||
tailbase += xdrpad;
|
||||
taillen -= xdrpad;
|
||||
}
|
||||
} else {
|
||||
unsigned int len, remaining;
|
||||
unsigned long pageoff;
|
||||
struct page **ppages;
|
||||
|
||||
ppages = xdr->pages + (xdr->page_base >> PAGE_SHIFT);
|
||||
pageoff = xdr->page_base & ~PAGE_MASK;
|
||||
remaining = xdr->page_len;
|
||||
while (remaining) {
|
||||
len = min_t(u32, PAGE_SIZE - pageoff, remaining);
|
||||
|
||||
memcpy(dst, page_address(*ppages) + pageoff, len);
|
||||
remaining -= len;
|
||||
dst += len;
|
||||
pageoff = 0;
|
||||
ppages++;
|
||||
}
|
||||
}
|
||||
|
||||
if (taillen)
|
||||
memcpy(dst, tailbase, taillen);
|
||||
|
||||
sctxt->sc_sges[0].length += xdr->len;
|
||||
trace_svcrdma_send_pullup(sctxt->sc_sges[0].length);
|
||||
sctxt->sc_sges[0].length = sctxt->sc_hdrbuf.len + args.pd_length;
|
||||
trace_svcrdma_send_pullup(sctxt, args.pd_length);
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
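Both new callbacks communicate through a small argument block defined near the top of svc_rdma_sendto.c, outside this excerpt. Reconstructed here from the fields the hunks above use; the exact declaration is an assumption:

struct svc_rdma_pullup_data {
	u8		*pd_dest;	/* next free byte in the pull-up buffer */
	unsigned int	pd_length;	/* running count of linearized bytes */
	unsigned int	pd_num_sges;	/* SGEs a direct Send would consume */
};

svc_rdma_xb_count_sges() fills in pd_length and pd_num_sges to answer "should this reply be pulled up?"; svc_rdma_xb_linearize() then consumes pd_dest and accumulates pd_length once the answer is yes.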
@@ -660,22 +758,22 @@ static int svc_rdma_pull_up_reply_msg(struct svcxprt_rdma *rdma,
  * @rctxt: Write and Reply chunks provided by client
  * @xdr: prepared xdr_buf containing RPC message
  *
- * Load the xdr_buf into the ctxt's sge array, and DMA map each
- * element as it is added. The Send WR's num_sge field is set.
+ * Returns:
+ *	%0 if DMA mapping was successful.
+ *	%-EMSGSIZE if a buffer manipulation problem occurred
+ *	%-EIO if DMA mapping failed
  *
- * Returns zero on success, or a negative errno on failure.
+ * The Send WR's num_sge field is set in all cases.
  */
 int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma,
 			   struct svc_rdma_send_ctxt *sctxt,
 			   const struct svc_rdma_recv_ctxt *rctxt,
-			   struct xdr_buf *xdr)
+			   const struct xdr_buf *xdr)
 {
-	unsigned int len, remaining;
-	unsigned long page_off;
-	struct page **ppages;
-	unsigned char *base;
-	u32 xdr_pad;
-	int ret;
+	struct svc_rdma_map_data args = {
+		.md_rdma	= rdma,
+		.md_ctxt	= sctxt,
+	};
 
 	/* Set up the (persistently-mapped) transport header SGE. */
 	sctxt->sc_send_wr.num_sge = 1;
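Likewise, the DMA-mapping callback carries its two pieces of context in a struct rather than through extra parameters. A reconstruction from the initializer above (the declaration itself is an assumption):

struct svc_rdma_map_data {
	struct svcxprt_rdma		*md_rdma;	/* controlling transport */
	struct svc_rdma_send_ctxt	*md_ctxt;	/* Send WR being built */
};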
@@ -684,7 +782,7 @@ int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma,
 	/* If there is a Reply chunk, nothing follows the transport
 	 * header, and we're done here.
 	 */
-	if (rctxt && rctxt->rc_reply_chunk)
+	if (!pcl_is_empty(&rctxt->rc_reply_pcl))
 		return 0;
 
 	/* For pull-up, svc_rdma_send() will sync the transport header.
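The new test expresses the same condition declaratively: a Reply chunk was provided exactly when the parsed reply chunk list is non-empty. pcl_is_empty() is a trivial inline from include/linux/sunrpc/svc_rdma_pcl.h, roughly:

static inline bool pcl_is_empty(const struct svc_rdma_pcl *pcl)
{
	return list_empty(&pcl->cl_chunks);
}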
@@ -693,58 +791,8 @@ int svc_rdma_map_reply_msg(struct svcxprt_rdma *rdma,
 	if (svc_rdma_pull_up_needed(rdma, sctxt, rctxt, xdr))
 		return svc_rdma_pull_up_reply_msg(rdma, sctxt, rctxt, xdr);
 
-	++sctxt->sc_cur_sge_no;
-	ret = svc_rdma_dma_map_buf(rdma, sctxt,
-				   xdr->head[0].iov_base,
-				   xdr->head[0].iov_len);
-	if (ret < 0)
-		return ret;
-
-	/* If a Write chunk is present, the xdr_buf's page list
-	 * is not included inline. However the Upper Layer may
-	 * have added XDR padding in the tail buffer, and that
-	 * should not be included inline.
-	 */
-	if (rctxt && rctxt->rc_write_list) {
-		base = xdr->tail[0].iov_base;
-		len = xdr->tail[0].iov_len;
-		xdr_pad = xdr_pad_size(xdr->page_len);
-
-		if (len && xdr_pad) {
-			base += xdr_pad;
-			len -= xdr_pad;
-		}
-
-		goto tail;
-	}
-
-	ppages = xdr->pages + (xdr->page_base >> PAGE_SHIFT);
-	page_off = xdr->page_base & ~PAGE_MASK;
-	remaining = xdr->page_len;
-	while (remaining) {
-		len = min_t(u32, PAGE_SIZE - page_off, remaining);
-
-		++sctxt->sc_cur_sge_no;
-		ret = svc_rdma_dma_map_page(rdma, sctxt, *ppages++,
-					    page_off, len);
-		if (ret < 0)
-			return ret;
-
-		remaining -= len;
-		page_off = 0;
-	}
-
-	base = xdr->tail[0].iov_base;
-	len = xdr->tail[0].iov_len;
-tail:
-	if (len) {
-		++sctxt->sc_cur_sge_no;
-		ret = svc_rdma_dma_map_buf(rdma, sctxt, base, len);
-		if (ret < 0)
-			return ret;
-	}
-
-	return 0;
+	return pcl_process_nonpayloads(&rctxt->rc_write_pcl, xdr,
+				       svc_rdma_xb_dma_map, &args);
 }
 
 /* The svc_rqst and all resources it owns are released as soon as
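svc_rdma_xb_dma_map() itself was added earlier in this series and is not visible in this excerpt. It is the DMA-mapping counterpart of svc_rdma_xb_linearize(): the same head/pages/tail walk, but each piece becomes a Send SGE instead of a memcpy. A condensed sketch, reusing the svc_rdma_dma_map_buf() and svc_rdma_dma_map_page() helpers visible in the removed code above; it may differ in detail from the version merged here:

static int svc_rdma_xb_dma_map(const struct xdr_buf *xdr, void *data)
{
	struct svc_rdma_map_data *args = data;
	struct svcxprt_rdma *rdma = args->md_rdma;
	struct svc_rdma_send_ctxt *ctxt = args->md_ctxt;
	unsigned int len, remaining;
	unsigned long pageoff;
	struct page **ppages;
	int ret;

	/* xdr->head: one SGE */
	if (xdr->head[0].iov_len) {
		++ctxt->sc_cur_sge_no;
		ret = svc_rdma_dma_map_buf(rdma, ctxt,
					   xdr->head[0].iov_base,
					   xdr->head[0].iov_len);
		if (ret < 0)
			return ret;
	}

	/* xdr->pages: one SGE per page */
	ppages = xdr->pages + (xdr->page_base >> PAGE_SHIFT);
	pageoff = offset_in_page(xdr->page_base);
	remaining = xdr->page_len;
	while (remaining) {
		len = min_t(u32, PAGE_SIZE - pageoff, remaining);
		++ctxt->sc_cur_sge_no;
		ret = svc_rdma_dma_map_page(rdma, ctxt, *ppages++,
					    pageoff, len);
		if (ret < 0)
			return ret;
		remaining -= len;
		pageoff = 0;
	}

	/* xdr->tail: one SGE */
	if (xdr->tail[0].iov_len) {
		++ctxt->sc_cur_sge_no;
		ret = svc_rdma_dma_map_buf(rdma, ctxt,
					   xdr->tail[0].iov_base,
					   xdr->tail[0].iov_len);
		if (ret < 0)
			return ret;
	}
	return 0;
}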
@@ -894,9 +942,6 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 		container_of(xprt, struct svcxprt_rdma, sc_xprt);
 	struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt;
 	__be32 *rdma_argp = rctxt->rc_recv_buf;
-	__be32 *wr_lst = rctxt->rc_write_list;
-	__be32 *rp_ch = rctxt->rc_reply_chunk;
-	struct xdr_buf *xdr = &rqstp->rq_res;
 	struct svc_rdma_send_ctxt *sctxt;
 	__be32 *p;
 	int ret;
@@ -914,45 +959,22 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 			      rpcrdma_fixed_maxsz * sizeof(*p));
 	if (!p)
 		goto err0;
+
+	ret = svc_rdma_send_reply_chunk(rdma, rctxt, &rqstp->rq_res);
+	if (ret < 0)
+		goto err2;
+
 	*p++ = *rdma_argp;
 	*p++ = *(rdma_argp + 1);
 	*p++ = rdma->sc_fc_credits;
-	*p = rp_ch ? rdma_nomsg : rdma_msg;
+	*p = pcl_is_empty(&rctxt->rc_reply_pcl) ? rdma_msg : rdma_nomsg;
 
 	if (svc_rdma_encode_read_list(sctxt) < 0)
 		goto err0;
-	if (wr_lst) {
-		/* XXX: Presume the client sent only one Write chunk */
-		unsigned long offset;
-		unsigned int length;
-
-		if (rctxt->rc_read_payload_length) {
-			offset = rctxt->rc_read_payload_offset;
-			length = rctxt->rc_read_payload_length;
-		} else {
-			offset = xdr->head[0].iov_len;
-			length = xdr->page_len;
-		}
-		ret = svc_rdma_send_write_chunk(rdma, wr_lst, xdr, offset,
-						length);
-		if (ret < 0)
-			goto err2;
-		if (svc_rdma_encode_write_list(rctxt, sctxt, length) < 0)
-			goto err0;
-	} else {
-		if (xdr_stream_encode_item_absent(&sctxt->sc_stream) < 0)
-			goto err0;
-	}
-	if (rp_ch) {
-		ret = svc_rdma_send_reply_chunk(rdma, rctxt, &rqstp->rq_res);
-		if (ret < 0)
-			goto err2;
-		if (svc_rdma_encode_reply_chunk(rctxt, sctxt, ret) < 0)
-			goto err0;
-	} else {
-		if (xdr_stream_encode_item_absent(&sctxt->sc_stream) < 0)
-			goto err0;
-	}
+	if (svc_rdma_encode_write_list(rctxt, sctxt) < 0)
+		goto err0;
+	if (svc_rdma_encode_reply_chunk(rctxt, sctxt, ret) < 0)
+		goto err0;
 
 	ret = svc_rdma_send_reply_msg(rdma, sctxt, rctxt, rqstp);
 	if (ret < 0)
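Two structural points in this hunk are worth noting. First, svc_rdma_send_reply_chunk() now runs before the RPC/RDMA header is encoded, so its return value -- the number of payload bytes conveyed by the Reply chunk -- is available to hand to svc_rdma_encode_reply_chunk() as the final argument. Second, the if/else arms disappear because the encode helpers now handle the "chunk absent" case themselves: when the corresponding parsed chunk list is empty, they emit the XDR item-absent marker that the removed branches open-coded. The convention looks roughly like this (a sketch, not necessarily the merged code):

	if (pcl_is_empty(pcl))
		return xdr_stream_encode_item_absent(&sctxt->sc_stream);
	/* otherwise, encode each chunk from the parsed list */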
@@ -979,28 +1001,46 @@ int svc_rdma_sendto(struct svc_rqst *rqstp)
 }
 
 /**
- * svc_rdma_read_payload - special processing for a READ payload
+ * svc_rdma_result_payload - special processing for a result payload
  * @rqstp: svc_rqst to operate on
  * @offset: payload's byte offset in @xdr
  * @length: size of payload, in bytes
  *
- * Returns zero on success.
- *
- * For the moment, just record the xdr_buf location of the READ
- * payload. svc_rdma_sendto will use that location later when
- * we actually send the payload.
+ * Return values:
+ *	%0 if successful or nothing needed to be done
+ *	%-EMSGSIZE on XDR buffer overflow
+ *	%-E2BIG if the payload was larger than the Write chunk
+ *	%-EINVAL if client provided too many segments
+ *	%-ENOMEM if rdma_rw context pool was exhausted
+ *	%-ENOTCONN if posting failed (connection is lost)
+ *	%-EIO if rdma_rw initialization failed (DMA mapping, etc)
  */
-int svc_rdma_read_payload(struct svc_rqst *rqstp, unsigned int offset,
-			  unsigned int length)
+int svc_rdma_result_payload(struct svc_rqst *rqstp, unsigned int offset,
+			    unsigned int length)
 {
 	struct svc_rdma_recv_ctxt *rctxt = rqstp->rq_xprt_ctxt;
+	struct svc_rdma_chunk *chunk;
+	struct svcxprt_rdma *rdma;
+	struct xdr_buf subbuf;
+	int ret;
 
-	/* XXX: Just one READ payload slot for now, since our
-	 * transport implementation currently supports only one
-	 * Write chunk.
-	 */
-	rctxt->rc_read_payload_offset = offset;
-	rctxt->rc_read_payload_length = length;
+	chunk = rctxt->rc_cur_result_payload;
+	if (!length || !chunk)
+		return 0;
+	rctxt->rc_cur_result_payload =
+		pcl_next_chunk(&rctxt->rc_write_pcl, chunk);
+	if (length > chunk->ch_length)
+		return -E2BIG;
+
+	chunk->ch_position = offset;
+	chunk->ch_payload_length = length;
+
+	if (xdr_buf_subsegment(&rqstp->rq_res, &subbuf, offset, length))
+		return -EMSGSIZE;
+
+	rdma = container_of(rqstp->rq_xprt, struct svcxprt_rdma, sc_xprt);
+	ret = svc_rdma_send_write_chunk(rdma, chunk, &subbuf);
+	if (ret < 0)
+		return ret;
 	return 0;
 }
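For callers outside the transport, this hook is reached through a generic helper in net/sunrpc/svc.c, renamed from svc_encode_read_payload() in the same series. Its shape in this release is approximately the following (quoted from memory, so best treated as a sketch):

int svc_encode_result_payload(struct svc_rqst *rqstp, unsigned int offset,
			      unsigned int length)
{
	return rqstp->rq_xprt->xpt_ops->xpo_result_payload(rqstp, offset,
							   length);
}

The NFSD encoders call this after placing a READ result (or similar result payload) into rq_res, so that on RDMA the payload can be pushed with RDMA Writes rather than copied inline; each call consumes one Write chunk from the parsed list, which is how multiple result payloads per reply become possible.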
@@ -80,7 +80,7 @@ static const struct svc_xprt_ops svc_rdma_ops = {
 	.xpo_create = svc_rdma_create,
 	.xpo_recvfrom = svc_rdma_recvfrom,
 	.xpo_sendto = svc_rdma_sendto,
-	.xpo_read_payload = svc_rdma_read_payload,
+	.xpo_result_payload = svc_rdma_result_payload,
 	.xpo_release_rqst = svc_rdma_release_rqst,
 	.xpo_detach = svc_rdma_detach,
 	.xpo_free = svc_rdma_free,