From 3f6b3143031b678a8577df1f24ca977510aefcf5 Mon Sep 17 00:00:00 2001 From: "santosh.shilimkar@oracle.com" Date: Tue, 25 Aug 2015 12:01:59 -0700 Subject: [PATCH] RDS: Fix rds MR reference count in rds_rdma_unuse() rds_rdma_unuse() drops the mr reference count which it hasn't taken. Correct way of removing mr is to remove mr from the tree and then rdma_destroy_mr() it first, then rds_mr_put() to decrement its reference count. Whichever thread holds last reference will free the mr via rds_mr_put() This bug was triggering weird null pointer crashes. One if the trace for it is captured below. BUG: unable to handle kernel NULL pointer dereference at 0000000000000104 IP: [] rds_ib_free_mr+0x31/0x130 [rds_rdma] PGD 4366fa067 PUD 4366f9067 PMD 0 Oops: 0000 [#1] SMP [...] task: ffff88046da6a000 ti: ffff88046da6c000 task.ti: ffff88046da6c000 RIP: 0010:[] [] rds_ib_free_mr+0x31/0x130 [rds_rdma] RSP: 0018:ffff88046fa43bd8 EFLAGS: 00010286 RAX: 0000000071d38b80 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880079e7ff40 RBP: ffff88046fa43bf8 R08: 0000000000000000 R09: 0000000000000000 R10: ffff88046fa43ca8 R11: ffff88046a802ed8 R12: ffff880079e7fa40 R13: 0000000000000000 R14: ffff880079e7ff40 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88046fa40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000104 CR3: 00000004366fb000 CR4: 00000000000006e0 Stack: ffff880079e7fa40 ffff880671d38f08 ffff880079e7ff40 0000000000000296 ffff88046fa43c28 ffffffffa087a38b ffff880079e7fa40 ffff880671d38f10 0000000000000000 0000000000000292 ffff88046fa43c48 ffffffffa087a3b6 Call Trace: [] rds_destroy_mr+0x8b/0xa0 [rds] [] __rds_put_mr_final+0x16/0x30 [rds] [] rds_rdma_unuse+0xc2/0x120 [rds] [] rds_recv_incoming_exthdrs+0x83/0xa0 [rds] [] rds_recv_incoming+0x92/0x200 [rds] [] rds_ib_process_recv+0x259/0x320 [rds_rdma] [] rds_ib_recv_tasklet_fn+0x1a8/0x490 [rds_rdma] [] ? __remove_hrtimer+0x58/0x90 [] tasklet_action+0xb1/0xc0 [] __do_softirq+0xe2/0x290 [] irq_exit+0xa6/0xb0 [] do_IRQ+0x65/0xf0 [] common_interrupt+0x6b/0x6b Signed-off-by: Santosh Shilimkar Signed-off-by: Santosh Shilimkar Signed-off-by: David S. Miller --- net/rds/rdma.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/net/rds/rdma.c b/net/rds/rdma.c index c1df9b1cf3b2..4c93badeabf2 100644 --- a/net/rds/rdma.c +++ b/net/rds/rdma.c @@ -435,9 +435,10 @@ void rds_rdma_unuse(struct rds_sock *rs, u32 r_key, int force) /* If the MR was marked as invalidate, this will * trigger an async flush. */ - if (zot_me) + if (zot_me) { rds_destroy_mr(mr); - rds_mr_put(mr); + rds_mr_put(mr); + } } void rds_rdma_free_op(struct rm_rdma_op *ro)