There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
handler triggers a completion and another thread does rdma_connect() or
the handler directly calls rdma_connect().
In all cases rdma_connect() needs to hold the handler_mutex, but when
handler's are invoked this is already held by the core code. This causes
ULPs using the 2nd method to deadlock.
Provide a rdma_connect_locked() and have all ULPs call it from their
handlers.
Link: https://lore.kernel.org/r/0-v2-53c22d5c1405+33-rdma_connect_locking_jgg@nvidia.com
Reported-and-tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Fixes: 2a7cec5381 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The only usage of these is to pass their address to sysfs_create_group()
and sysfs_remove_group(), both which takes const pointers. Make it const
to allow the compiler to put them in read-only memory.
Link: https://lore.kernel.org/r/20200930224004.24279-3-rikard.falkeborn@gmail.com
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Leon Romanovsky says:
====================
IBTA declares speed as 16 bits, but kernel stores it in u8. This series
fixes in-kernel declaration while keeping external interface intact.
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.
* branch 'mlx5_active_speed':
RDMA: Fix link active_speed size
RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
net/mlx5: Refactor query port speed functions
The rnbd_server module's communication manager (cm) initialization depends
on the registration of the "network namespace subsystem" of the RDMA CM
agent module. As such, when the kernel is configured to load the
rnbd_server and the RDMA cma module during initialization; and if the
rnbd_server module is initialized before RDMA cma module, a null ptr
dereference occurs during the RDMA bind operation.
Call trace:
Call Trace:
? xas_load+0xd/0x80
xa_load+0x47/0x80
cma_ps_find+0x44/0x70
rdma_bind_addr+0x782/0x8b0
? get_random_bytes+0x35/0x40
rtrs_srv_cm_init+0x50/0x80
rtrs_srv_open+0x102/0x180
? rnbd_client_init+0x6e/0x6e
rnbd_srv_init_module+0x34/0x84
? rnbd_client_init+0x6e/0x6e
do_one_initcall+0x4a/0x200
kernel_init_freeable+0x1f1/0x26e
? rest_init+0xb0/0xb0
kernel_init+0xe/0x100
ret_from_fork+0x22/0x30
Modules linked in:
CR2: 0000000000000015
All this happens cause the cm init is in the call chain of the module
init, which is not a preferred practice.
So remove the call to rdma_create_id() from the module init call chain.
Instead register rtrs-srv as an ib client, which makes sure that the
rdma_create_id() is called only when an ib device is added.
Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200907103106.104530-1-haris.iqbal@cloud.ionos.com
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The device .release function was not being set during the device
initialization. This was leading to the below warning, in error cases when
put_srv was called before device_add was called.
Warning:
Device '(null)' does not have a release() function, it is broken and must
be fixed. See Documentation/kobject.txt.
So, set the device .release function during device initialization in the
__alloc_srv() function.
Fixes: baa5b28b7a ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add")
Link: https://lore.kernel.org/r/20200907102216.104041-1-haris.iqbal@cloud.ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
There are error cases when we will call free_srv before device kobject is
initialized; in such cases calling put_device generates the following
warning:
kobject: '(null)' (000000009f5445ed): is not initialized, yet
kobject_put() is being called.
So call device_initialize() only once when the server is allocated. If we
end up calling put_srv() and subsequently free_srv(), our call to
put_device() would result in deletion of the obj. Call device_add() later
when we actually have a connection. Correspondingly, call device_del()
instead of device_unregister() when srv->dev_ref falls to 0.
Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200811092722.2450-1-haris.iqbal@cloud.ionos.com
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
lockdep triggers a warning from time to time when running a regression
test:
rnbd_client L685: </dev/nullb0@bla> Device disconnected.
rnbd_client L1756: Unloading module
workqueue: WQ_MEM_RECLAIM rtrs_client_wq:rtrs_clt_reconnect_work [rtrs_client] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
WARNING: CPU: 2 PID: 18824 at kernel/workqueue.c:2517 check_flush_dependency+0xad/0x130
The root cause is workqueue core expect flushing should not be done for a
!WQ_MEM_RECLAIM wq from a WQ_MEM_RECLAIM workqueue.
In above case ib_addr workqueue without WQ_MEM_RECLAIM, but rtrs_wq
WQ_MEM_RECLAIM.
To avoid the warning, remove the WQ_MEM_RECLAIM flag.
Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-4-haris.iqbal@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
In order to avoid all the clients to start reconnecting at the same time
schedule the reconnect dwork with a random jitter of +[0,8] seconds.
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200724111508.15734-2-haris.iqbal@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
IBTA declares "vendor option not supported" reject reason in REJ messages
if passive side doesn't want to accept proposed ECE options.
Due to the fact that ECE is managed by userspace, there is a need to let
users to provide such rejected reason.
Link: https://lore.kernel.org/r/20200526103304.196371-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The macros do_each_path/while_each_path lead to a smatch warning:
drivers/infiniband/ulp/rtrs/rtrs-clt.c:1196 rtrs_clt_failover_req() warn: inconsistent indenting
drivers/infiniband/ulp/rtrs/rtrs-clt.c:2890 rtrs_clt_request() warn: inconsistent indenting
Also checkpatch complains:
ERROR: Macros with multiple statements should be enclosed in a do - while loop
The macros are used only in two places: for a normal IO path and for the
failover path triggered after errors.
Get rid of the macros and just use a for loop iterating over the list of
paths in both places. It is easier to read and also less lines of code.
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200522053924.528980-1-danil.kipnis@cloud.ionos.com
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The rtrs_sess structure has already been extracted above from the
rtrs_srv_sess structure. Use that to avoid redundant dereferencing.
Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200522082833.1480551-1-haris.phnx@gmail.com
Signed-off-by: Md Haris Iqbal <haris.phnx@gmail.com>
Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
When Block Device Layer is disabled, BLK_MAX_SEGMENT_SIZE is undefined.
The rtrs is a transport library and should compile independently of the
block layer. The desired max segment size should be passed down by the
user.
Introduce max_segment_size parameter for the rtrs_clt_open() call.
Fixes: f7a7a5c228 ("block/rnbd: client: main functionality")
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Fixes: cb80329c94 ("RDMA/rtrs: client: private header with client structs and functions")
Fixes: b5c27cdb09 ("RDMA/rtrs: public interface header to establish RDMA connections")
Link: https://lore.kernel.org/r/20200519111419.924170-1-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Fix to return negative error code -ENOMEM from the some error handling
cases instead of 0, as done elsewhere in this function.
Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Fixes: 91b11610af ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20200519091912.134358-1-weiyongjun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Remove the if-statement and return the value contained in _err_,
unconditionally.
Link: https://lore.kernel.org/r/20200519163612.GA6043@embeddedor
Addresses-Coverity-ID: 1493753 ("Identical code for different branches")
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
These > comparisons should be >= to prevent accessing one element beyond
the end of the buffer.
Fixes: 9cb8374804 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20200519154525.GA66801@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
The problem is that "req->sg_cnt" is an unsigned int so if "nr" is
negative, it gets type promoted to a high positive value and the condition
is false. This patch fixes it by handling negatives separately.
Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20200519133223.GN2078@kadam
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Add rtrs Makefile, Kconfig and also corresponding lines into upper layer
infiniband/ulp files.
Link: https://lore.kernel.org/r/20200511135131.27580-14-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is the sysfs interface to rtrs sessions on server side:
/sys/class/rtrs-server/<SESS-NAME>/
*** rtrs session accepted from a client peer
|
|- paths/<SRC@DST>/
*** established paths from a client in a session
|
|- disconnect
| *** disconnect path
|
|- hca_name
| *** HCA name
|
|- hca_port
| *** HCA port
|
|- stats/
*** current path statistics
|
|- rdma
Link: https://lore.kernel.org/r/20200511135131.27580-13-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This introduces set of functions used on server side to account statistics
of RDMA data sent/received.
Link: https://lore.kernel.org/r/20200511135131.27580-12-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is main functionality of rtrs-server module, which accepts set of
RDMA connections (so called rtrs session), creates/destroys sysfs entries
associated with rtrs session and notifies upper layer
(user of RTRS API) about RDMA requests or link events.
Link: https://lore.kernel.org/r/20200511135131.27580-11-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This header describes main structs and functions used by rtrs-server
module, mainly for accepting rtrs sessions, creating/destroying sysfs
entries, accounting statistics on server side.
Link: https://lore.kernel.org/r/20200511135131.27580-10-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is the sysfs interface to rtrs sessions on client side:
/sys/class/rtrs-client/<SESS-NAME>/
*** rtrs session created by rtrs_clt_open() API call
|
|- max_reconnect_attempts
| *** number of reconnect attempts for session
|
|- add_path
| *** adds another connection path into rtrs session
|
|- paths/<SRC@DST>/
*** established paths to server in a session
|
|- disconnect
| *** disconnect path
|
|- reconnect
| *** reconnect path
|
|- remove_path
| *** remove current path
|
|- state
| *** retrieve current path state
|
|- hca_port
| *** HCA port number
|
|- hca_name
| *** HCA name
|
|- stats/
*** current path statistics
|
|- cpu_migration
|- rdma
|- reconnects
|- reset_all
Link: https://lore.kernel.org/r/20200511135131.27580-9-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This introduces set of functions used on client side to account statistics
of RDMA data sent/received, amount of IOs inflight, latency, cpu
migrations, etc. Almost all statistics are collected using percpu
variables.
Link: https://lore.kernel.org/r/20200511135131.27580-8-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is main functionality of rtrs-client module, which manages set of
RDMA connections for each rtrs session, does multipathing, load balancing
and failover of RDMA requests.
Link: https://lore.kernel.org/r/20200511135131.27580-7-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This header describes main structs and functions used by rtrs-client
module, mainly for managing rtrs sessions, creating/destroying sysfs
entries, accounting statistics on client side.
Link: https://lore.kernel.org/r/20200511135131.27580-6-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
This is a set of library functions existing as a rtrs-core module, used by
client and server modules.
Mainly these functions wrap IB and RDMA calls and provide a bit higher
abstraction for implementing of RTRS protocol on client or server sides.
Link: https://lore.kernel.org/r/20200511135131.27580-5-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
These are common private headers with rtrs protocol structures, logging,
sysfs and other helper functions, which are used on both client and server
sides.
Link: https://lore.kernel.org/r/20200511135131.27580-4-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Introduce public header which provides set of API functions to establish
RDMA connections from client to server machine using RTRS protocol, which
manages RDMA connections for each session, does multipathing and load
balancing.
Main functions for client (active) side:
rtrs_clt_open() - Creates set of RDMA connections incapsulated
in IBTRS session and returns pointer on RTRS
session object.
rtrs_clt_close() - Closes RDMA connections associated with RTRS
session.
rtrs_clt_request() - Requests zero-copy RDMA transfer to/from
server.
Main functions for server (passive) side:
rtrs_srv_open() - Starts listening for RTRS clients on specified
port and invokes RTRS callbacks for incoming
RDMA requests or link events.
rtrs_srv_close() - Closes RTRS server context.
Link: https://lore.kernel.org/r/20200511135131.27580-3-danil.kipnis@cloud.ionos.com
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>