Граф коммитов

811772 Коммитов

Автор SHA1 Сообщение Дата
Devesh Sharma 37f91cff2d RDMA/bnxt_re: Add extended psn structure for 57500 adapters
The new 57500 series of adapter has bigger psn search structure.  The size
of new structure is 16B. Changing the control path memory allocation and
fast path code to accommodate the new psn structure while maintaining the
backward compatibility.

There are few additional changes listed below:
 - For 57500 chip max-sge are limited to 6 for now.
 - For 57500 chip max-receive-sge should be set to 6 for now.
 - Add driver/hardware interface structure for new chip.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma 374c5285ab RDMA/bnxt_re: Enable GSI QP support for 57500 series
In the new 57500 series of adapters the GSI qp is a UD type QP unlike the
previous generation where it was a Raw Eth QP. Changing the control and
data path to support the same. Listing all the significant diffs:

 - AH creation resolve network type unconditionally
 - Add check at relevant places to distinguish from Raw Eth
   processing flow.
 - bnxt_re_process_res_ud_wc report completion with GRH flag
   when qp is GSI.
 - Change length, cfa_meta and smac to match new driver/hardware
   interface.
 - Add new driver/hardware interface.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma e0387e1dd4 RDMA/bnxt_re: Skip backing store allocation for 57500 series
The backing store to keep HW context data structures is allocated and
initialized by L2 driver. For 57500 chip RoCE driver do not require to
allocate and initialize additional memory. Changing to skip duplicate
allocation and initialization for 57500 adapters. Driver continues as
before for older chips.

This patch also takes care of stats context memory alignment to 128
boundary, a requirement for 57500 series of chip. Older chips do not care
of alignment, thus the change is unconditional.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma b353ce556d RDMA/bnxt_re: Add 64bit doorbells for 57500 series
The new chip series has 64 bit doorbell for notification queues. Thus,
both control and data path event queues need new routines to write 64 bit
doorbell. Adding the same. There is new doorbell interface between the
chip and driver. Changing the chip specific data structure definitions.

Additional significant changes are listed below
- bnxt_re_net_ring_free/alloc takes a new argument
- bnxt_qplib_enable_nq and enable_rcfw uses new doorbell offset
  for new chip.
- DB mapping for NQ and CREQ now maps 8 bytes.
- DBR_DBR_* macros renames to DBC_DBC_*
- store nq_db_offset in a 32bit data type.
- got rid of __iowrite64_copy, used writeq instead.
- changed the DB header initialization to simpler scheme.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma ae8637e131 RDMA/bnxt_re: Add chip context to identify 57500 series
Adding setup and destroy routines for chip-context. The chip context would
be used frequently in control and data path to take execution flow
depending on the chip type.  chip context structure pointer is added to
the relevant data structures.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Gal Pressman af8b38ed0b IB/mlx5: Simplify WQE count power of two check
Use is_power_of_2() instead of hard coding it in the driver. While at it,
fix the meaningless error print.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:14:55 -07:00
Davidlohr Bueso 1a7a05e88f Documentation/infiniband: update from locked to pinned_vm
We are really talking about pinned_vm here.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 12:56:23 -07:00
Davidlohr Bueso b95df5e3e4 drivers/IB,core: reduce scope of mmap_sem
ib_umem_get() uses gup_longterm() and relies on the lock to stabilze the
vma_list, so we cannot really get rid of mmap_sem altogether, but now that
the counter is atomic, we can get of some complexity that mmap_sem brings
with only pinned_vm.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 12:54:02 -07:00
Davidlohr Bueso 8ea1f989aa drivers/IB,usnic: reduce scope of mmap_sem
usnic_uiom_get_pages() uses gup_longterm() so we cannot really get rid of
mmap_sem altogether in the driver, but we can get rid of some complexity
that mmap_sem brings with only pinned_vm.  We can get rid of the wq
altogether as we no longer need to defer work to unpin pages as the
counter is now atomic. We also share the lock.

Acked-by: Parvi Kaustubhi <pkaustub@cisco.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 12:54:02 -07:00
Davidlohr Bueso 0e15c25336 drivers/IB,hfi1: do not se mmap_sem
This driver already uses gup_fast() and thus we can just drop the mmap_sem
protection around the pinned_vm counter. Note that the window between when
hfi1_can_pin_pages() is called and the actual counter is incremented
remains the same as mmap_sem was _only_ used for when ->pinned_vm was
touched.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.det>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 12:54:02 -07:00
Davidlohr Bueso 3a2a1e9056 drivers/IB,qib: optimize mmap_sem usage
The driver uses mmap_sem for both pinned_vm accounting and
get_user_pages(). Because rdma drivers might want to use gup_longterm() in
the future we still need some sort of mmap_sem serialization (as opposed
to removing it entirely by using gup_fast()). Now that pinned_vm is atomic
the writer lock can therefore be converted to reader.

This also fixes a bug that __qib_get_user_pages was not taking into
account the current value of pinned_vm.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 12:54:02 -07:00
Davidlohr Bueso 4f564ff3d4 drivers/mic/scif: do not use mmap_sem
The driver uses mmap_sem for both pinned_vm accounting and
get_user_pages(). By using gup_fast() and letting the mm handle the lock
if needed, we can no longer rely on the semaphore and simplify the whole
thing.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 12:54:02 -07:00
Davidlohr Bueso 70f8a3ca68 mm: make mm->pinned_vm an atomic64 counter
Taking a sleeping lock to _only_ increment a variable is quite the
overkill, and pretty much all users do this. Furthermore, some drivers
(ie: infiniband and scif) that need pinned semantics can go to quite
some trouble to actually delay via workqueue (un)accounting for pinned
pages when not possible to acquire it.

By making the counter atomic we no longer need to hold the mmap_sem and
can simply some code around it for pinned_vm users. The counter is 64-bit
such that we need not worry about overflows such as rdma user input
controlled from userspace.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 12:54:02 -07:00
Steve Wise a2bfd708b1 RDMA/iwpm: move kdoc comments to functions
Move the iwpm kdoc comments from the prototype declarations to above
the function bodies.  There are no functional changes in this patch.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05 15:40:41 -07:00
Leon Romanovsky a78e8723a5 RDMA/cma: Remove CM_ID statistics provided by rdma-cm module
Netlink statistics exported by rdma-cm never had any working user space
component published to the mailing list or to any open source
project. Canvassing various proprietary users, and the original requester,
we find that there are no real users of this interface.

This patch simply removes all occurrences of RDMA CM netlink in favour of
modern nldev implementation, which provides the same information and
accompanied by widely used user space component.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05 15:30:33 -07:00
Bart Van Assche bf3b4f066d IB/mlx5: Do not use hw_access_flags for be and CPU data
Avoid that sparse reports the following for the mlx5 driver:

drivers/infiniband/hw/mlx5/qp.c:2671:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2671:34:    left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2671:34:    right side has type int
drivers/infiniband/hw/mlx5/qp.c:2679:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2679:34:    left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2679:34:    right side has type int
drivers/infiniband/hw/mlx5/qp.c:2680:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2680:34:    left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2680:34:    right side has type int
drivers/infiniband/hw/mlx5/qp.c:2684:34: warning: invalid assignment: |=
drivers/infiniband/hw/mlx5/qp.c:2684:34:    left side has type restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2684:34:    right side has type int
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: incorrect type in argument 1 (different base types)
drivers/infiniband/hw/mlx5/qp.c:2686:28:    expected unsigned int [usertype] val
drivers/infiniband/hw/mlx5/qp.c:2686:28:    got restricted __be32 [usertype]
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32
drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32

This patch does not change any functionality.

Fixes: a60109dc9a ("IB/mlx5: Add support for extended atomic operations")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05 15:22:45 -07:00
Steve Wise b0bad9ad51 RDMA/IWPM: Support no port mapping requirements
A soft iwarp driver that uses the host TCP stack via a kernel mode socket
does not need port mapping.  In fact, if the port map daemon, iwpmd, is
running, then iwpmd must not try and create/bind a socket to the actual
port for a soft iwarp connection, since the driver already has that socket
bound.

Yet if the soft iwarp driver wants to interoperate with hard iwarp devices
that -are- using port mapping, then the soft iwarp driver's mappings still
need to be maintained and advertised by the iwpm protocol.

This patch enhances the rdma driver<->iwcm interface to allow an iwarp
driver to specify that it does not want port mapping.  The iwpm
kernel<->iwpmd interface is also enhanced to pass up this information on
map requests.

Care is taken to interoperate with the current iwpmd version (ABI version
3) and only use the new NL attributes if iwpmd supports ABI version 4.

The ABI version define has also been created in rdma_netlink.h so both
kernel and user code can share it.  The iwcm and iwpmd negotiate the ABI
version to use with a new HELLO netlink message.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:26:02 -07:00
Steve Wise f76903d574 RDMA/IWPM: refactor the IWPM message attribute names
In order to add new IWPM_NL attributes, the enums for the IWPM commands
attributes are refactored such that a new attribute can be added without
breaking ABI version 3. Instead of sharing nl attribute enums for both
request and response messages, we create separate enums for each IWPM
message request and reply.  This allows us to extend any given IWPM
message by adding new attributes for just that message.  These new enums
are created, though, in a way to avoid breaking ABI version 3.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:26:02 -07:00
Steve Wise 95b8e384d8 iw_cxgb*: kzalloc the iwcm verbs struct
So future additions to that struct get initialized by default.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:26:02 -07:00
Wei Hu (Xavier) d3743fa94c RDMA/hns: Fix the chip hanging caused by sending doorbell during reset
On hi08 chip, There is a possibility of chip hanging when sending doorbell
during reset. We can fix it by prohibiting doorbell during reset.

Fixes: 2d40788825 ("RDMA/hns: Add support for processing send wr and receive wr")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:13:50 -07:00
Wei Hu (Xavier) 6a04aed6af RDMA/hns: Fix the chip hanging caused by sending mailbox&CMQ during reset
On hi08 chip, There is a possibility of chip hanging and some errors when
sending mailbox & doorbell during reset.  We can fix it by prohibiting
mailbox and doorbell during reset and reset occurred to ensure that
hardware can work normally.

Fixes: a04ff739f2 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:13:50 -07:00
Wei Hu (Xavier) d061effc36 RDMA/hns: Fix the Oops during rmmod or insmod ko when reset occurs
In the reset process, the hns3 NIC driver notifies the RoCE driver to
perform reset related processing by calling the .reset_notify() interface
registered by the RoCE driver in hip08 SoC.

In the current version, if a reset occurs simultaneously during the
execution of rmmod or insmod ko, there may be Oops error as below:

 Internal error: Oops: 86000007 [#1] PREEMPT SMP
 Modules linked in: hns_roce(O) hns3(O) hclge(O) hnae3(O) [last unloaded: hns_roce_hw_v2]
 CPU: 0 PID: 14 Comm: kworker/0:1 Tainted: G           O      4.19.0-ge00d540 #1
 Hardware name: Huawei Technologies Co., Ltd.
 Workqueue: events hclge_reset_service_task [hclge]
 pstate: 60c00009 (nZCv daif +PAN +UAO)
 pc : 0xffff00000100b0b8
 lr : 0xffff00000100aea0
 sp : ffff000009afbab0
 x29: ffff000009afbab0 x28: 0000000000000800
 x27: 0000000000007ff0 x26: ffff80002f90c004
 x25: 00000000000007ff x24: ffff000008f97000
 x23: ffff80003efee0a8 x22: 0000000000001000
 x21: ffff80002f917ff0 x20: ffff8000286ea070
 x19: 0000000000000800 x18: 0000000000000400
 x17: 00000000c4d3225d x16: 00000000000021b8
 x15: 0000000000000400 x14: 0000000000000400
 x13: 0000000000000000 x12: ffff80003fac6e30
 x11: 0000800036303000 x10: 0000000000000001
 x9 : 0000000000000000 x8 : ffff80003016d000
 x7 : 0000000000000000 x6 : 000000000000003f
 x5 : 0000000000000040 x4 : 0000000000000000
 x3 : 0000000000000004 x2 : 00000000000007ff
 x1 : 0000000000000000 x0 : 0000000000000000
 Process kworker/0:1 (pid: 14, stack limit = 0x00000000af8f0ad9)
 Call trace:
  0xffff00000100b0b8
  0xffff00000100b3a0
  hns_roce_init+0x624/0xc88 [hns_roce]
  0xffff000001002df8
  0xffff000001006960
  hclge_notify_roce_client+0x74/0xe0 [hclge]
  hclge_reset_service_task+0xa58/0xbc0 [hclge]
  process_one_work+0x1e4/0x458
  worker_thread+0x40/0x450
  kthread+0x12c/0x130
  ret_from_fork+0x10/0x18
 Code: bad PC value

In the reset process, we will release the resources firstly, and after the
hardware reset is completed, we will reapply resources and reconfigure the
hardware.

We can solve this problem by modifying both the NIC and the RoCE
driver. We can modify the concurrent processing in the NIC driver to avoid
calling the .reset_notify and .uninit_instance ops at the same time. And
we need to modify the RoCE driver to record the reset stage and the
driver's init/uninit state, and check the state in the .reset_notify,
.init_instance. and uninit_instance functions to avoid NULL pointer
operation.

Fixes: cb7a94c9c8 ("RDMA/hns: Add reset process for RoCE in hip08")
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 16:13:50 -07:00
Kamal Heib 668aa15b5b RDMA/rxe: Improve loopback marking
Currently a packet is marked for loopback only if the source and
destination addresses equals. This is not enough when multiple gids are
present in rxe device's gid table and the traffic is from one gid to
another. Fix it by marking the packet for loopback if the destination MAC
address is equal to the source MAC address.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 15:57:49 -07:00
Kamal Heib fa40718804 RDMA/rxe: Move rxe_init_av() to rxe_av.c
Move the function rxe_init_av() to rxe_av.c file and use it instead of
calling rxe_av_from_attr() and rxe_av_fill_ip_info(), also remove the
unused rxe_dev parameter from rxe_init_av().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 15:57:49 -07:00
YueHaibing c3c668e742 RDMA/hns: Make some function static
Fixes the following sparse warnings:

drivers/infiniband/hw/hns/hns_roce_hw_v2.c:5822:5: warning:
 symbol 'hns_roce_v2_query_srq' was not declared. Should it be static?
drivers/infiniband/hw/hns/hns_roce_srq.c:158:6: warning:
 symbol 'hns_roce_srq_free' was not declared. Should it be static?
drivers/infiniband/hw/hns/hns_roce_srq.c:81:5: warning:
 symbol 'hns_roce_srq_alloc' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 15:33:58 -07:00
Jason Gunthorpe 6a8a2aa62d Linux 5.0-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxXYaEeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGkSQH/2yrfnviNPFYpZOR
 QQdc71Bfhkd8m85SmWIsSebkxmi3hKFVj15sGbWXd6+0/VxjEEGvQCZpvVwJceke
 LwDxtkKGg/74wAqJvlSAWxFNZ+Had4jDeoSoeQChddsBVXBBCxQx2v6ECg3o2x7W
 k8Z8t4+3RijDf8fYXY9ETyO2zW8R/wgT+dnl+DPgUH7u4dxh7FzAUfc4bgZIDg+i
 FzBQfbTJuz4BU7uRZ9IJiwhWKv0Iyi2DR3BY8Z1pqEpRaUMJMrCs2WGytHbTgt9e
 0EtO1airbVneU4eumU/ZaF9cyEbah9HousEPnP7J09WG4s/Odxc4zE+uK1QqS2im
 5Xv88is=
 =dVd1
 -----END PGP SIGNATURE-----

Merge tag 'v5.0-rc5' into rdma.git for-next

Linux 5.0-rc5

Needed to merge the include/uapi changes so we have an up to date
single-tree for these files. Patches already posted are also expected to
need this for dependencies.
2019-02-04 14:53:42 -07:00
Bart Van Assche a163afc885 IB/core: Remove ib_sg_dma_address() and ib_sg_dma_len()
Keeping single line wrapper functions is not useful. Hence remove the
ib_sg_dma_address() and ib_sg_dma_len() functions. This patch does not
change any functionality.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:07 -07:00
Moni Shoua 6141f8fa5b IB/mlx5: Advertise XRC ODP support
Query all per transport caps for XRC and set the appropriate bits in the
per transport field of the advertised struct.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:07 -07:00
Moni Shoua 2e68daceac IB/mlx5: Advertise SRQ ODP support for supported transports
ODP support in SRQ is per transport capability. Based on device
capabilities set this flag in device structure for future queries.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:07 -07:00
Moni Shoua 08100fad5c IB/mlx5: Add ODP SRQ support
Add changes to the WQE page-fault handler to

1. Identify that the event is for a SRQ WQE
2. Pass SRQ object instead of a QP to the function that reads the WQE
3. Parse the SRQ WQE with respect to its structure

The rest is handled as for regular RQ WQE.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:07 -07:00
Moni Shoua fbeb4075c6 IB/mlx5: Let read user wqe also from SRQ buffer
Reading a WQE from SRQ is almost identical to reading from regular RQ.
The differences are the size of the queue, the size of a WQE and buffer
location.

Make necessary changes to mlx5_ib_read_user_wqe() to let it read a WQE
from a SRQ or RQ by caller choice.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:07 -07:00
Moni Shoua 29917f4750 IB/mlx5: Add XRC initiator ODP support
Skip XRC segment in the beginning of a send WQE and fetch ODP XRC
capabilities when QP type is IB_QPT_XRC_INI. The rest of the handling is
the same as in RC QP.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:07 -07:00
Moni Shoua 6ff7414a17 IB/mlx5: Clean mlx5_ib_mr_responder_pfault_handler() signature
In the function mlx5_ib_mr_responder_pfault_handler()

1. The parameter wqe is used as read-only so there is no need to pass it
   by reference.
2. Remove the unused argument pfault from list of arguments.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:06 -07:00
Moni Shoua 586f4e95c7 IB/mlx5: Remove useless check in ODP handler
When handling an ODP event for a receive WQE in SRQ the target QP is
unknown. Therefore, it is wrong to ask if QP has a SRQ in the page-fault
handler.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:06 -07:00
Moni Shoua 52a72e2a39 IB/uverbs: Expose XRC ODP device capabilities
Expose XRC ODP capabilities as part of the extended device capabilities.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:06 -07:00
Moni Shoua da82334219 IB/core: Allocate a bit for SRQ ODP support
The ODP support matrix is per operation and per transport. The support for
each transport (RC, UD, etc.) is described with a bit field.

ODP for SRQ WQEs is considered a different kind of support from ODP for RQ
WQs and therefore needs a different capability bit.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:06 -07:00
Moni Shoua 10f56242e3 IB/mlx5: Fix the locking of SRQ objects in ODP events
QP and SRQ objects are stored in different containers so the action to get
and lock a common resource during ODP event needs to address that.

While here get rid of 'refcount' and 'free' fields in mlx5_core_srq struct
and use the fields with same semantics in common structure.

Fixes: 032080ab43 ("IB/mlx5: Lock QP during page fault handling")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04 14:34:06 -07:00
Jason Gunthorpe e431a80a54 Merge branch 'mlx5-next into rdma.git for-next
From
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

For dependencies needed in the next ODP patches.

* branch 'mlx5-next':
  net/mlx5: Set ODP SRQ support in firmware
  net/mlx5: Add XRC transport to ODP device capabilities layout
2019-02-04 14:33:02 -07:00
Linus Torvalds 8834f5600c Linux 5.0-rc5 2019-02-03 13:48:04 -08:00
Linus Torvalds 24b888d8d5 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A few updates for x86:

   - Fix an unintended sign extension issue in the fault handling code

   - Rename the new resource control config switch so it's less
     confusing

   - Avoid setting up EFI info in kexec when the EFI runtime is
     disabled.

   - Fix the microcode version check in the AMD microcode loader so it
     only loads higher version numbers and never downgrades

   - Set EFER.LME in the 32bit trampoline before returning to long mode
     to handle older AMD/KVM behaviour properly.

   - Add Darren and Andy as x86/platform reviewers"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Avoid confusion over the new X86_RESCTRL config
  x86/kexec: Don't setup EFI info if EFI runtime is not enabled
  x86/microcode/amd: Don't falsely trick the late loading mechanism
  MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewers
  x86/fault: Fix sign-extend unintended sign extension
  x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode
  x86/cpu: Add Atom Tremont (Jacobsville)
2019-02-03 09:08:12 -08:00
Linus Torvalds cc6810e36b Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu hotplug fixes from Thomas Gleixner:
 "Two fixes for the cpu hotplug machinery:

   - Replace the overly clever 'SMT disabled by BIOS' detection logic as
     it breaks KVM scenarios and prevents speculation control updates
     when the Hyperthreads are brought online late after boot.

   - Remove a redundant invocation of the speculation control update
     function"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM
  x86/speculation: Remove redundant arch_smt_update() invocation
2019-02-03 09:02:03 -08:00
Linus Torvalds 58f6d4287a Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "A pile of perf updates:

   - Fix broken sanity check in the /proc/sys/kernel/perf_cpu_time_max_percent
     write handler

   - Cure a perf script crash which caused by an unitinialized data
     structure

   - Highlight the hottest instruction in perf top and not a random one

   - Cure yet another clang issue when building perf python

   - Handle topology entries with no CPU correctly in the tools

   - Handle perf data which contains both tracepoints and performance
     counter entries correctly.

   - Add a missing NULL pointer check in perf ordered_events_free()"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf script: Fix crash when processing recorded stat data
  perf top: Fix wrong hottest instruction highlighted
  perf tools: Handle TOPOLOGY headers with no CPU
  perf python: Remove -fstack-clash-protection when building with some clang versions
  perf core: Fix perf_proc_update_handler() bug
  perf script: Fix crash with printing mixed trace point and other events
  perf ordered_events: Fix crash in ordered_events__free
2019-02-03 08:59:51 -08:00
Linus Torvalds 89401be658 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Thomas Gleixner:
 "The dump info for the efi page table debugging lacks a terminator
  which causes the kernel to crash when the debugfile is read"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/arm64: Fix debugfs crash by adding a terminator for ptdump marker
2019-02-03 08:57:05 -08:00
Linus Torvalds 312b3a93dd for-5.0-rc4-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlxWtJgACgkQxWXV+ddt
 WDsDow//ZpnyDwQWvSIfF2UUQOPlcBjbHKuBA7rU0wdybW635QYGR0mqnI+1VnMj
 7ssUkeN6N0a2gQzrUG4Y+zpdzOWv2xQ4jKZ9GMOp9gwyzEFyPkcFXOnmM8UfYtVu
 e3fK65e8BZHmTeu0kGKah4Dt1g0t4fUmhsKR4Pfp5YNJC+zuuGTwUW1K/ZQHXJ+3
 kTHc7WP1lsF7wgaZ+Gl+Kvp8fVrHVdygMVTdRBW8QaBgPLa/KExvK62jW+NmCYhj
 7OIkWdew7e8IXc3Ie5IbOomHAv7IgqqgiO9VO9+n0EpyV4UobUgxrgBKJ+0yc1Ya
 eidbKhMslwUE50y00JVm+vw0gwQHkR+hZDn/mRB6xiIeI8tu/yQIJZ6AhYJXoByR
 cu8+SNO5Z5dOZ1f146ZH8lnkr24tuSnkDUhbRDR5pdb4tAHHej2ALzhbfbwbPEpF
 IverYLw5fOMGeRU/mBsjkVadpSZ4S0HVNU85ERdhLtVLK1PSaY2UkUaA+Ii5y7au
 EYDjaGMflmJ8cAVqgtgedEff2n8OKDnzRZlz4IPLI73MVSITGZkM7PmYmYsLm2Bs
 mDPnmyqR8kzcd1RRtSeZTvqOpAIZG+QUOmD2jiKrchmp54Sz0V/HJWRs3aQybD6Q
 ph0yAcbkvgp/ewe5IFgaI0kcyH7zYdL6GtiI2WUE3/8DObgrgsA=
 =E2PP
 -----END PGP SIGNATURE-----

Merge tag 'for-5.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - regression fix: transaction commit can run away due to delayed ref
   waiting heuristic, this is not necessary now because of the proper
   reservation mechanism introduced in 5.0

 - regression fix: potential crash due to use-before-check of an ERR_PTR
   return value

 - fix for transaction abort during transaction commit that needs to
   properly clean up pending block groups

 - fix deadlock during b-tree node/leaf splitting, when this happens on
   some of the fundamental trees, we must prevent new tree block
   allocation to re-enter indirectly via the block group flushing path

 - potential memory leak after errors during mount

* tag 'for-5.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: On error always free subvol_name in btrfs_mount
  btrfs: clean up pending block groups when transaction commit aborts
  btrfs: fix potential oops in device_list_add
  btrfs: don't end the transaction for delayed refs in throttle
  Btrfs: fix deadlock when allocating tree block during leaf/node split
2019-02-03 08:48:33 -08:00
Moni Shoua 46861e3e88 net/mlx5: Set ODP SRQ support in firmware
To avoid compatibility issue with older kernels the firmware doesn't
allow SRQ to work with ODP unless kernel asks for it.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-02-03 12:49:59 +02:00
Moni Shoua dda7a817f2 net/mlx5: Add XRC transport to ODP device capabilities layout
The device capabilities for ODP structure was missing the field for XRC
transport so add it here.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-02-03 12:49:50 +02:00
Linus Torvalds 12491ed354 A single fix for building DT bindings in-tree.
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAlxVtl0QHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhw7wHD/4qMBS9JpAki/RV3Y9NN/FwWRYC4HS59K7s
 XnCt0oQBdGeiu6hxEaZLsUMurIxsfP87YE6sM9iwiE1JqwKnPl8727BNMd8AGeGP
 dT0Q8MXY/VYksegw+IG2E2L7ILAh2TDuHBt5t8ZKFgAzxvebVo9Cjj/hCp/c9j4l
 AUAONZ/sSvyl/ejmkQWTuJpA1ns+GN/OUzHgiNytGat/1gfcNaiKJlZxChVdN7SX
 Vqwxoih/J44LYQibUEIM/WyeNwqfEA440bI0xxoIVomfEH1lV3RNmXg2lrmZrdgi
 yax9+iUzH81wzWOIlWYYLV06bqt77wyxUTOMA4thFH5pXoI9FoH4IvqC+W75Z4xN
 DSQ9rSz7Gx5uDa01ttWI5kiFMcWVqPSuzsx+6dzpmJWII5WPEqSGFA3WniiJe+fG
 pc/s2xNVXQxhvfXi+WSjFZBxv7cKtvBtIOi2oOLlmOy4MDu1BiJ8St3+1eMv4laN
 7/ZmyDMVhIL45cRn28rLZ0NZY5340nit6QcihOwpQ9LElBXPOQ1NVTEjtNlw6SPd
 RHhYIF4bVGuJKYaIB8aU4QobIKIsL0rHfr2NY0xnJNzcJ02Jr+dv1brX6Nqd9msH
 ybnQByAIBG5zrxQPZvik5v4BxEHr4JPITUrxNLkP1HNc7/Vq0HBcCJjsw/ruRLQi
 67/Zo8Ijvg==
 =cHtH
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-fixes-for-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull Devicetree fix from Rob Herring:
 "A single fix for building DT bindings in-tree"

* tag 'devicetree-fixes-for-5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: Fix dt_binding_check target for in tree builds
2019-02-02 10:34:32 -08:00
Linus Torvalds 74b13e7efe RISC-V Fixes for 5.0-rc5
This patch set contains a handful of mostly-independent patches:
 
 * A patch that causes our port to respect TIF_NEED_RESCHED, which fixes
   CONFIG_PREEMPT=y kernels.
 * A fix to avoid double-put on OF nodes.
 * Fix a misspelling of target in our Kconfig.
 * Generic PCIe is enabled in our defconfig.
 * A fix to our SBI early console to properly handle line endings.
 * A fix such that max_low_pfn is counted in PFNs.
 * A change to TASK_UNMAPPED_BASE to match what other arches do.
 
 This has passed by standard "boot Fedora" flow.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlxR5B4THHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRDvTKFQLMurQf7aEACfy+2EHqgTo5lzRZCwxmjzb4rm5gyH
 1GuR+2NPiwjPADxCqpO7tLFwSoLWU46K7GKfTrjXWbW2hNfBiHfPv62mPSX7dWR0
 kbUMRK0z5qm4mE1K8mmDFWPntGchuuRz1rfPI2Dv8McVybz2MbrvlrJ3zXXK/Qcu
 07cWbmeGd2RWIfj+tSYo+i5Aq4VPWQ6X00ToZlIZ4/4v69mFVgjqTIcc3Jx7unt1
 vGFRjP8jDVibwFAE1yDcUl9OCxqW/wUuwTpPW2HgN+55dktjBxvESl/pMl0AeGYk
 ehKdsaBVk/BkQT2XkYDIPxqK4Yb0VsuPOoH0Jsj+l3tSGjSeRA/LwlHiY7hALIVj
 LTg2FvEKs++hLnzwV6D5Qqj8TcdowSm/hDJttjEewM+IhLPOVGLz16yRcM9jIC7E
 1Jc6ByTo6HgVxhasF/ehT5SowJc1nOR3nS9VVd2P0yYysdN0m/kcfSgxQ1GoXMmC
 2tnMlwJO7pm8akuV2ifxUsM6KlRP47auLtTbRxuebpYnd6YKng7NRLF8Um2tgck4
 wtULBEXZCLYhHB55Lyy+wWDr7Pi3XK7qyvoTNP+AEJHcPtI23TsGKzB0Hu2unvxE
 s9r+0NGuiiuLAsR6Jba9FFSeZR7mJ9u1Bm0fz89Ob6dr1lYf0UUjqqQfHTfVZr40
 F7SqTyD+pe7IoQ==
 =Z3YY
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V fixes from Palmer Dabbelt:
 "This contains a handful of mostly-independent patches:

   - make our port respect TIF_NEED_RESCHED, which fixes
     CONFIG_PREEMPT=y kernels

   - fix double-put of OF nodes

   - fix a misspelling of target in our Kconfig

   - generic PCIe is enabled in our defconfig

   - fix our SBI early console to properly handle line
     endings

   - fix max_low_pfn being counted in PFNs

   - a change to TASK_UNMAPPED_BASE to match what other
     arches do

  This has passed my standard 'boot Fedora' flow"

* tag 'riscv-for-linus-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
  riscv: Adjust mmap base address at a third of task size
  riscv: fixup max_low_pfn with PFN_DOWN.
  tty/serial: use uart_console_write in the RISC-V SBL early console
  RISC-V: defconfig: Add CRYPTO_DEV_VIRTIO=y
  RISC-V: defconfig: Enable Generic PCIE by default
  RISC-V: defconfig: Move CONFIG_PCI{,E_XILINX}
  RISC-V: Kconfig: fix spelling mistake "traget" -> "target"
  RISC-V: asm/page.h: fix spelling mistake "CONFIG_64BITS" -> "CONFIG_64BIT"
  RISC-V: fix bad use of of_node_put
  RISC-V: Add _TIF_NEED_RESCHED check for kernel thread when CONFIG_PREEMPT=y
2019-02-02 10:26:14 -08:00
Linus Torvalds c8864cb70f for-linus-20190202
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlxVxeIQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpq35EADCoVRFF8mi7wBhNvfN/yLqC9sgnqJM7vF3
 gPpm/E4MsTzCXrhgRmulrikfM8ywOT5ZBgIQp+BrhQgDZIlJ9fcyinFjU7o/gRm5
 R32IMJ4o7uh+YKWQyRSQu1WCaF3hvbNGUT7duMfTKjZ6t9TxTBgy46wYhb7YmNAP
 Ur8C1+4NfF/aHp59n6COM70KdYfswRxtgdEHGfmHAbmKDpvAeC+7I2LPpjTf5bH/
 YULnxk5sQFvCINDJH4zprZ0lSDy+63qk+Q1xxqpFNR1tnmRIVNvKhPPB5xfRLvzB
 mw3JdtwnvoY8Yv6eCs0u7mZs5L3I8zTI+4RtHH9nPD7ykIiUHpejgaRc4TGy2tox
 Dpgfc/Nvdsscpuy4QcFNHbWWBUu5pa5Li+KVS0FEP9FmD5UhcvVmZaZK+V6NuGO4
 9G9wraASFasPK7I0FMHlWLMIWKdj4s4n/H55QnP6yvFsnsrFaqnDwybUFiicjFkv
 hmNQmq8+5p0n3ZBVGQ/SI//vPUjuaFUKU2MhhW0NVz+KkgmEnOJ+W+C77//U33l+
 zhBURtdlnqfImFejEayhhtCtMATJcf2E1rHlA3nM6JyVWMRvQR6asb78QhAKZd6w
 el7rqRdWMwryUYFaROVurfOBMPFdhhyDB1qrOzAZIkhFqWwM9GX6+MH/y8+00HSA
 aA/rXg/daw==
 =8Zz+
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20190202' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few fixes that should go into this release. This contains:

   - MD pull request from Song, fixing a recovery OOM issue (Alexei)

   - Fix for a sync related stall (Jianchao)

   - Dummy callback for timeouts (Tetsuo)

   - IDE atapi sense ordering fix (me)"

* tag 'for-linus-20190202' of git://git.kernel.dk/linux-block:
  ide: ensure atapi sense request aren't preempted
  blk-mq: fix a hung issue when fsync
  block: pass no-op callback to INIT_WORK().
  md/raid5: fix 'out of memory' during raid cache recovery
2019-02-02 10:16:28 -08:00
Linus Torvalds 3cde55ee79 SCSI fixes on 20190201
Five minor bug fixes.  The libfc one is a tiny memory leak, the zfcp
 one is an incorrect user visible parameter and the rest are on error
 legs or obscure features.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXFTUrCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishe/hAQCDic6S
 FcD8TPv4jAt3qWun5vVCFAGCEuGet73p89sxpQEA51KcGrFjR9oCtZkd9nIzJL7M
 enGaQ0RtT8wPMSUNklE=
 =t6p4
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Five minor bug fixes.

  The libfc one is a tiny memory leak, the zfcp one is an incorrect user
  visible parameter and the rest are on error legs or obscure features"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: 53c700: pass correct "dev" to dma_alloc_attrs()
  scsi: bnx2fc: Fix error handling in probe()
  scsi: scsi_debug: fix write_same with virtual_gb problem
  scsi: libfc: free skb when receiving invalid flogi resp
  scsi: zfcp: fix sysfs block queue limit output for max_segment_size
2019-02-02 10:12:53 -08:00