Граф коммитов

1427 Коммитов

Автор SHA1 Сообщение Дата
James Smart 4c47efc140 scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures
Many io statistics were being sampled and saved using adapter-based data
structures. This was creating a lot of contention and cache thrashing in
the I/O path.

Move the statistics to the hardware queue data structures.  Given the
per-queue data structures, use of atomic types is lessened.

Add new sysfs and debugfs stat routines to collate the per hardware queue
values and report at an adapter level.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:08 -05:00
James Smart 63df6d637e scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues
Similar to the io execution path that reports cpu context information, the
debugfs routines for cpu information needs to be aligned with new hardware
queue implementation.

Convert debugfs cnd nvme cpucheck statistics to report information per
Hardware Queue.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:28:11 -05:00
James Smart 18c27a6216 scsi: lpfc: cleanup: Remove unused FCP_XRI_ABORT_EVENT slowpath event
Both NVME and SCSI aborts are now processed off the CQ workqueue and do not
generate events for the slowpath any more.

Remove the unused event code.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:24:22 -05:00
James Smart 5e5b511d8b scsi: lpfc: Partition XRI buffer list across Hardware Queues
Once the IO buff allocations were made shared, there was a single XRI
buffer list shared by all hardware queues.  A single list isn't great for
performance when shared across the per-cpu hardware queues.

Create a separate XRI IO buffer get/put list for each Hardware Queue.  As
SGLs and associated IO buffers get allocated/posted to the firmware; round
robin their assignment across all available hardware Queues so that there
is an equitable assignment.

Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic
for XRI allocation.

Add a debugfs interface to display hardware queue statistics

Added new empty_io_bufs counter to track if a cpu runs out of XRIs.

Replace common_ variables/names with io_ to make meanings clearer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:24:22 -05:00
James Smart cdb42becdd scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu
Currently, both nvme and fcp each have their own concept of an io_channel,
which is a combination wq/cq and associated msix.  Different cpus would
share an io_channel.

The driver is now moving to per-cpu wq/cq pairs and msix vectors.  The
driver will still use separate wq/cq pairs per protocol on each cpu, but
the protocols will share the msix vector.

Given the elimination of the nvme and fcp io channels, the module
parameters will be removed.  A new parameter, lpfc_hdw_queue is added which
allows the wq/cq pair allocation per cpu to be overridden and allocated to
lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will
be based on the number of cpus. If non-zero, the parameter specifies the
number of queues to allocate. At this time, the maximum non-zero value is
64.

To manage this new paradigm, a new hardware queue structure is created to
track queue activity and relationships.

As MSIX vector allocation must be known before setting up the
relationships, msix allocation now occurs before queue datastructures are
allocated. If the number of vectors allocated is less than the desired
hardware queues, the hardware queue counts will be reduced to the number of
vectors

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart 7370d10ac9 scsi: lpfc: Remove extra vector and SLI4 queue for Expresslane
There is a extra queue and msix vector for expresslane. Now that the driver
will be doing queues per cpu, this oddball queue is no longer needed.
Expresslane will utilize the normal per-cpu queues.

Updated debugfs sli4 queue output to go along with the change

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart 0794d601d1 scsi: lpfc: Implement common IO buffers between NVME and SCSI
Currently, both NVME and SCSI get their IO buffers from separate
pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split
between protocols.

Eliminate the independent pools and use a single pool. Each buffer
structure now has a common section and a protocol section. Per protocol
routines for SGL initialization are removed and replaced by common
routines. Initialization of the buffers is only done on the common area.
All other fields, which are protocol specific, are initialized when the
buffer is allocated for use in the per-protocol allocation routine.

In the past, the SCSI side allocated IO buffers as part of slave_alloc
calls until the maximum XRIs for SCSI was reached. As all XRIs are now
common and may be used for either protocol, allocation for everything is
done as part of adapter initialization and the scsi side has no action in
slave alloc.

As XRI's are no longer split, the lpfc_xri_split module parameter is
removed.

Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put
routines.  All SLI4 adapters utilize the new IO buffer scheme

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart e960f5ab40 scsi: lpfc: cleanup: Remove excess check on NVME io submit code path
lpfc_nvme_prep_io_cmd() checks for null pnode, but caller
lpfc_nvme_fcp_io_submit() has already ensured it's non-null.

Remove the pnode null check.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart 0b05e9fe1f scsi: lpfc: cleanup: remove nrport from nvme command structure
An hba-wide lock is taken in the nvme io completion routine. The lock
covers null'ing of the nrport pointer in the cmd structure.

The nrport member isn't necessary. After extracting the pointer from the
command, the pointer was dereferenced to get the fc discovery node
pointer. But the fc discovery node pointer is alrady in the command
structure so the dereferrence was unnecessary.

Eliminated the nrport structure member and its use, which also eliminates
the port-wide lock.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
Greg Kroah-Hartman 50e931679a scsi: lpfc: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the return
value.  The function can work or not, but the code logic should never do
something different based on this.

Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-01-29 00:40:54 -05:00
Linus Torvalds 938edb8a31 SCSI misc on 20181224
This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
 megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.  Additionally, we have
 a pile of annotation, unused variable and minor updates.  The big API
 change is the updates for Christoph's DMA rework which include
 removing the DISABLE_CLUSTERING flag.  And finally there are a couple
 of target tree updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXCEUNiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdjKAP9vrTTv
 qFaYmAoRSbPq9ZiixaXLMy0K/6o76Uay0gnBqgD/fgn3jg/KQ6alNaCjmfeV3wAj
 u1j3H7tha9j1it+4pUw=
 =GDa+
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
  megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.

  Additionally, we have a pile of annotation, unused variable and minor
  updates.

  The big API change is the updates for Christoph's DMA rework which
  include removing the DISABLE_CLUSTERING flag.

  And finally there are a couple of target tree updates"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits)
  scsi: isci: request: mark expected switch fall-through
  scsi: isci: remote_node_context: mark expected switch fall-throughs
  scsi: isci: remote_device: Mark expected switch fall-throughs
  scsi: isci: phy: Mark expected switch fall-through
  scsi: iscsi: Capture iscsi debug messages using tracepoints
  scsi: myrb: Mark expected switch fall-throughs
  scsi: megaraid: fix out-of-bound array accesses
  scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
  scsi: fcoe: remove set but not used variable 'port'
  scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
  scsi: smartpqi: fix build warnings
  scsi: smartpqi: update driver version
  scsi: smartpqi: add ofa support
  scsi: smartpqi: increase fw status register read timeout
  scsi: smartpqi: bump driver version
  scsi: smartpqi: add smp_utils support
  scsi: smartpqi: correct lun reset issues
  scsi: smartpqi: correct volume status
  scsi: smartpqi: do not offline disks for transient did no connect conditions
  scsi: smartpqi: allow for larger raid maps
  ...
2018-12-28 14:48:06 -08:00
James Smart 9e1f03e4d3 scsi: lpfc: Update lpfc version to 12.0.0.10
Update lpfc version to 12.0.0.10

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:08 -05:00
James Smart 5021267af1 scsi: lpfc: Adding ability to reset chip via pci bus reset
This patch adds a "pci_bus_reset" option to the board_mode sysfs attribute.
This option uses the pci_reset_bus() api to reset the PCIe link the adapter
is on, which will reset the chip/adapter.  Prior to issuing this option,
all functions on the same chip must be placed in the offline state by the
admin. After the reset, all of the instances may be brought online again.

The primary purpose of this functionality is to support cases where
firmware update required a chip reset but the admin did not want to reboot
the machine in order to instantiate the firmware update.

Sanity checks take place prior to the reset to ensure the adapter is the
sole entity on the PCIe bus and that all functions are in the offline
state.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:08 -05:00
James Smart 72ca6b2220 scsi: lpfc: Add log messages to aid in debugging fc4type discovery issues
Current messages report generic actions (like send GID_FT), but misses
reporting for what protocol type the action is taken.

Revise the messages to reflect the FC4 protocol type being worked on.

[mkp: typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:07 -05:00
James Smart 00292e0306 scsi: lpfc: Fix discovery failure when PLOGI is defered
When a target's link dropped, an RSCN was received to communicate the
change. The driver detected the loss of the target and issued and UNREG_RPI
mailbox command.  While that was being processed, another RSCN was received
to communicate the port coming back.  The driver deferred the PLOGI to the
port until the mailbox command finishes. When the mailbox command completed
it saw the pending port and called the routines to issue the
PLOGI. However, it forgot to clear the UNREG_INP state flag, so the PLOGI
xmt routine nooped the PLOGI request assuming it needed to wait for the
mailbox command.  At this point, login would never be re-attempted.

Clear UNREG_INP before issuing the deferred PLOGI.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:07 -05:00
James Smart 529b3ddcff scsi: lpfc: update fault value on successful trunk events.
Currently, when a trunk link goes down due to some fault, the driver
snapshots the fault code.  If the link then comes back up, meaning there is
no fault, the driver is not clearing the fault code so the sysfs link_state
entry reports old/stale data.

Revise the logic so that on successful link up the fault code is cleared.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:07 -05:00
James Smart e817e5d703 scsi: lpfc: Correct MDS loopback diagnostics support
The existing MDS loopback diagnostics support processing received frames in
the slowpath work thread. It caps the number of frames it will process at
64, before waiting for another event to indicate additional frame
reception. The net-net is this results in very slow frame processing during
loopback tests and sometimes orphans an io, causing the loopback test to
report failure by the switch.

Move MDS loopback frame processing out of the slow path worker thread and
into the normal RQ processing routines.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:07 -05:00
James Smart 2977a09512 scsi: lpfc: Fix link state reporting for trunking when adapter is offline
If the adapter is taken offline, the trunk link port attributes continue to
report trunk links as up even though all links are down as the adapter is
offline.

Clear the trunk links state as part of taking the adapter offline.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:07 -05:00
Ewan D. Milne 4e87eb2f46 scsi: lpfc: do not set queue->page_count to 0 if pc_sli4_params.wqpcnt is invalid
Certain older adapters such as the OneConnect OCe10100 may not have a valid
wqpcnt value.  In this case, do not set queue->page_count to 0 in
lpfc_sli4_queue_alloc() as this will prevent the driver from initializing.

Fixes: 895427bd01 ("scsi: lpfc: NVME Initiator: Base modifications")
Cc: stable@vger.kernel.org # 4.11+
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by:   Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-18 23:25:35 -05:00
Christoph Hellwig 2a3d4eb8e2 scsi: flip the default on use_clustering
Most SCSI drivers want to enable "clustering", that is merging of
segments so that they might span more than a single page.  Remove the
ENABLE_CLUSTERING define, and require drivers to explicitly set
DISABLE_CLUSTERING to disable this feature.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-18 23:13:12 -05:00
James Smart 719162bd5b scsi: lpfc: Enable Management features for IF_TYPE=6
Addition of support for if_type=6 missed several checks for interface type,
resulting in the failure of several key management features such as
firmware dump and loopback testing.

Correct the checks on the if_type so that both SLI4 IF_TYPE's 2 and 6 are
supported.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-12 20:33:08 -05:00
Martin K. Petersen 2d1036aea4 Revert "scsi: lpfc: ls_rjt erroneus FLOGIs"
This reverts commit 287aba2592.

We killed the bad firmware and this mod is no longer necessary.

Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-12 20:26:56 -05:00
Jens Axboe 96f774106e Linux 4.20-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlwNpb0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGwGwH/00UHnXfxww3ixxz
 zwTVDzptA6SPm6s84yJOWatM5fXhPiAltZaHSYF9lzRzNU71NCq7Frhq3fQUIXKM
 OxqDn9nfSTWcjWTk2q5n2keyRV/KIn67YX7UgqFc1bO/mqtVjEgNWaMyblhI+e9E
 giu1ZXayHr43jK1cDOmGExZubXUq7Vsc9TOlrd+d2SwIqeEP7TCMrPhnHDwCNvX2
 UU5dtANpVzGtHaBcr37wJj+L8kODCc0f+PQ3g2ar5jTHst5SLlHp2u0AMRnUmgdi
 VkGx+mu/uk8mtwUqMIMqhplklVoqK6LTeLqsY5Xt32SKruw9UqyJGdphLjW2QP/g
 MkmA1lI=
 =7kaD
 -----END PGP SIGNATURE-----

Merge tag 'v4.20-rc6' into for-4.21/block

Pull in v4.20-rc6 to resolve the conflict in NVMe, but also to get the
two corruption fixes. We're going to be overhauling the direct dispatch
path, and we need to do that on top of the changes we made for that
in mainline.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-09 17:45:40 -07:00
James Smart de55b786b8 scsi: lpfc: update driver version to 12.0.0.9
Update the driver version to 12.0.0.9

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:33 -05:00
James Smart 7c4042a4d0 scsi: lpfc: Fix dif and first burst use in write commands
When dif and first burst is used in a write command wqe, the driver was not
properly setting fields in the io command request. This resulted in no dif
bytes being sent and invalid xfer_rdy's, resulting in the io being aborted
by the hardware.

Correct the wqe initializaton when both dif and first burst are used.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:33 -05:00
James Smart 1165a5c220 scsi: lpfc: Fix driver release of fw-logging buffers
On driver termination, after the driver stops fw logging by writing a
register on the chip, the driver immediately unmaps and frees the logging
buffer, without confirming in any way that the chip has received the write
and terminated the logging. As termination on the chip is not immediate,
the chip may issue a dma request to the now unmapped dma buffer, resulting
in a iommu fault.

Change the driver to receive a confirmation that logging ahs been
terminated. As the driver always issues an SLI reset with the device as
part of shutdown, and as part of that is receiving confirmation that the
reset is complete - the driver was modified to perform the write to disable
fw logging prior to the SLI reset and only free the fw log buffer after the
SLI reset is complete. That guarantees use of the fw log buffer is fully
terminated when it is unmapped.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:33 -05:00
James Smart 76558b2573 scsi: lpfc: Correct topology type reporting on G7 adapters
Driver missed classifying the chip type for G7 when reporting supported
topologies. This resulted in loop being shown as supported on FC links that
are not supported per the standard.

Add the chip classifications to the topology checks in the driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:33 -05:00
James Smart 1c36833d82 scsi: lpfc: Correct code setting non existent bits in sli4 ABORT WQE
Driver is setting bits in word 10 of the SLI4 ABORT WQE (the wqid).  The
field was a carry over from a prior SLI revision. The field does not exist
in SLI4, and the action may result in an overlap with future definition of
the WQE.

Remove the setting of WQID in the ABORT WQE.

Also cleaned up WQE field settings - initialize to zero, don't bother to
set fields to zero.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 0a9e9687ac scsi: lpfc: Defer LS_ACC to FLOGI on point to point logins
The current discovery state machine the driver treated FLOGI oddly.  When
point to point, an FLOGI is to be exchanged by the two ports, with the port
with the most significant WWN then proceeding with PLOGI.  The
implementation in the driver was keyed to closely with "what have I sent",
not with what has happened between the two endpoints. Thus, it blatantly
would ACC an FLOGI, but reject PLOGI's until it had its FLOGI ACC'd. The
problem is - the sending of FLOGI may be delayed for some reason, or the
response to FLOGI held off by the other side. In the failing situation the
other side sent an FLOGI, which was ACC'd, then sent PLOGIs which were then
rjt'd until the retry count for the PLOGIs were exceeded and the port gave
up. The FLOGI may have been very late in transmit, or the response held off
until the PLOGIs failed. Given the other port had the higher WWN, no PLOGIs
would occur and communication stopped.

Correct the situation by changing the FLOGI handling. Defer any response to
an FLOGI until the driver has sent its FLOGI as well. Then, upon either
completion of the sent FLOGI, or upon sending an ACC to a received FLOGI
(which may be received before or just after FLOGI was sent). the driver
will act on who has the higher WWN. if the other port does, the driver will
noop any handling of an FLOGI response (if outstanding) and wait for PLOGI.
If the local port does, the driver will transition to sending PLOGI and
will noop any action on responding to an FLOGI (if not yet received).

Fortunately, to implement this, it only took another state flag and
deferring any FLOGI response if the FLOGI has yet to be transmit. All
subsequent actions were already in place.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 287aba2592 scsi: lpfc: ls_rjt erroneus FLOGIs
In some link initialization sequences, the fw generates an erroneous FLOGI
payload to the driver without an intervening link bounce.  The driver, when
it sees a 2nd FLOGI without an intervening link bounce, automatically
performs a link bounce. In this, the link bounce causes the situate to
repeat and in a nasty loop of link bounces.

Resolve the issue by validating the FLOGI payload. The erroneous FLOGI will
contain VVL signatures that are not normal. When the driver sees these, it
will simply reject the flogi rather than bouncing the link.  The reject is
consumed within the firmware.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 92ea83a878 scsi: lpfc: rport port swap discovery issue.
Two initiator ports were cable swapped and after swap both went down.  The
driver internally swaps the nlp nodes based on matching node wwn's but not
the same nport id as before. After detecting a change in the nodes RPI, the
driver sends an UNREG_RPI command and clears the NLP_RPI_REGISTERED flag,
then swaps the node information with the other node. But the other node's
NLP_RPI_REGISTERED flag is also cleared, but it is done so without an
UNREG_RPI being sent, which causes the later REG_RPI for that other node to
fail as the hardware believes its still registered.

Additionally, if the node swap occurred while the two nodes had PLOGI's in
flight, the fc4_types weren't properly getting swapped such that when the
PLOGIs commpleted and PRLI's were then sent, the PRLI's acted on bad
protocol types so the PRLI was for the wrong protocol. NVME devices saw
SCSI FCP PRLIs and vice versa.

Clean up the node swap so that the NLP_RPI_REGISTERED flag is handled
properly.

Fix the handling of the fc4_types when the nodes are swapped as well

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 8b47ae69e0 scsi: lpfc: Cap NPIV vports to 256
Depending on the chipset, the number of NPIV vports may vary and be in
excess of what most switches support (256). To avoid confusion with the
users, limit the reported NPIV vports to 256.

Additionally correct the 16G adapter which is reporting a bogus NPIV vport
number if the link is down.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 5a9eeff57f scsi: lpfc: Fix kernel Oops due to null pring pointers
Driver is hitting null pring pointers in lpfc_do_work().

Pointer assignment occurs based on SLI-revision. If recovering after an
error, its possible the sli revision for the port was cleared, making the
lpfc_phba_elsring() not return a ring pointer, thus the null pointer.

Add SLI revision checking to lpfc_phba_elsring() and status checking to all
callers.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 2c4c91415a scsi: lpfc: Fix a duplicate 0711 log message number.
Renumber one of the 0711 log messages so there isn't a duplication.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart dea16bdae2 scsi: lpfc: Fix discovery failures during port failovers with lots of vports
The driver is getting hit with 100s of RSCNs during remote port address
changes. Each of those RSCN's ends up generating UNREG_RPI and REG_PRI
mailbox commands.  The discovery engine within the driver doesn't wait for
the mailbox command completions. Instead it sets state flags and moves
forward. At some point, there's a massive backlog of mailbox commands which
take time for the adapter to process. Additionally, it appears there were
duplicate events from the switch so the driver generated duplicate mailbox
commands for the same remote port.  During this window, failures on PLOGI
and PRLI ELS's are see as the adapter is rejecting them as they are for
remote ports that still have pending mailbox commands.

Streamline the discovery engine so that PLOGI log checks for outstanding
UNREG_RPIs and defer the processing until the commands complete. This
better synchronizes the ELS transmission vs the RPI registrations.

Filter out multiple UNREG_RPIs being queued up for the same remote port.

Beef up log messages in this area.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 3e1f071892 scsi: lpfc: refactor mailbox structure context fields
The driver data structure for managing a mailbox command contained two
context fields. Unfortunately, the context were considered "generic" to be
used at the whim of the command code.  Of course, one section of code used
fields this way, while another did it that way, and eventually there were
mixups.

Refactored the structure so that the generic contexts become a node context
and a buffer context and all code standardizes on their use.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart 0f31e9593a scsi: lpfc: update manufacturer attribute to reflect Broadcom
Update manufacturer attribute to reflect Broadcom Inc, not Emulex

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:31 -05:00
James Smart cb34990b90 scsi: lpfc: Fix panic when FW-log buffsize is not initialized
While trying to get adapter fw-log for a function whose buffsize was set to
0, kernel panic occurred.

When buffsize is 0, the kernel buffer for the log won't be allocated.  When
fw log usage was enabled, it failed to check the buffer size, and log usage
was started. Eventually the driver referenced the unallocated log buffer.

Added checks of the buffer size before allowing fw logging to be enabled
and added check for valid buffer if enabling fw log.

Performed a couple other minor cleanups while fixing this:
 - clarified log messages
 - re-evaluated log message severity
 - treat any error as an error, not only a couple codes

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:31 -05:00
Martin Wilck dfb7513374 scsi: lpfc: fix block guard enablement on SLI3 adapters
Since f44ac12f1d, BG enablement is tracked with the LPFC_SLI3_BG_ENABLED
bit, which is set in lpfc_get_cfgparam before lpfc_sli_config_sli_port() is
called. The bit shouldn't be cleared before checking the feature.  Based on
problem analysis by David Bond.

Fixes: f44ac12f1d "scsi: lpfc: Memory allocation error during driver start-up on power8"
Tested-by: David Bond <dbond@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Cc: stable@vger.kernel.org # 4.17.x
Cc: stable@vger.kernel.org # 4.18.x
Cc: stable@vger.kernel.org # 4.19.x
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-28 12:14:25 -05:00
Sabyasachi Gupta 359d0ac1e8 scsi: lpfc: Use dma_zalloc_coherent
Replaced dma_alloc_coherent + memset with dma_zalloc_coherent.

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-21 22:31:22 -05:00
Jens Axboe a78b03bc73 Linux 4.20-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlvx2sAeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGycgIAIuxobwt0RRKa0zO
 ROS+34JGoC2yU2P9VdEGWdtxS6ANMVQgKPBhWL6s+xR89Kd+V4xSdJLD1pNTxxqP
 0DCva0np1/Q4juH+JbU50v/lykoLgteZ0P0LBRGf1y8p3WiLPv45IbnNsMDNYhB2
 7a8rOmZYakRY9CPznRDw3X8cJt3sddKgFJHIOGz1OQJVWtCD0KPGcJmQNsbDSagY
 Zx6Z5BKSIdjRqaAdN5gDa1Pft3WQo7TpaQGl80lSsgr5LcjmscXA3sClOCy+25Mo
 FZLx0PcwP+Efq8RTGzNK51WSOMa6d37hvjDqUAdQBOR0KbyjRyXQwyQVw/MGbPJs
 7J3Pzm0=
 =56Mt
 -----END PGP SIGNATURE-----

Merge tag 'v4.20-rc3' into for-4.21/block

Merge in -rc3 to resolve a few conflicts, but also to get a few
important fixes that have gone into mainline since the block
4.21 branch was forked off (most notably the SCSI queue issue,
which is both a conflict AND needed fix).

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-18 15:46:03 -07:00
Christoph Hellwig f30e1bfd61 scsi: lpfc: use dma_set_mask_and_coherent
The driver currently uses pci_set_dma_mask despite otherwise using the
generic DMA API.  Switch it over to the better generic DMA API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-15 14:27:08 -05:00
Jens Axboe f664a3cc17 scsi: kill off the legacy IO path
This removes the legacy (non-mq) IO path for SCSI.

Cc: linux-scsi@vger.kernel.org
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-07 13:42:32 -07:00
James Smart ed5b3994c6 scsi: lpfc: update driver version to 12.0.0.8
Update the driver version to 12.0.0.8

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart 1dc5ec2452 scsi: lpfc: add Trunking support
Add trunking support to the driver. Trunking is found on more recent
asics. In general, trunking appears as a single "port" to the driver
and overall behavior doesn't differ. Link speed is reported as an
aggregate value, while link speed control is done on a per-physical
link basis with all links in the trunk symmetrical. Some commands
returning port information are updated to additionally provide
trunking information. And new ACQEs are generated to report physical
link events relative to the trunk.

This patch contains the following modifications:

- Added link speed settings of 128GB and 256GB.

- Added handling of trunk-related ACQEs, mainly logging and trapping
  of physical link statuses.

- Added additional bsg interface to query trunk state by applications.

- Augment link_state sysfs attribtute to display trunk link status

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart 7ea92eb458 scsi: lpfc: Implement GID_PT on Nameserver query to support faster failover
The switches seem to respond faster to GID_PT vs GID_FT NameServer
queries.  Add support for GID_PT to be used over GID_FT to enable
faster storage failover detection. Includes addition of new module
parameter to select between GID_PT and GID_FT (GID_FT is default).

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart d83ca3ea83 scsi: lpfc: Correct loss of fc4 type on remote port address change
An address change for a remote port cause PRLI for the wrong protocol
to be sent.  The node copy done in the discovery code skipped copying
the fc4 protocols supported as well.

Fix the copy logic for the address change.  Beefed up log messages in
this area as well.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart d496b9a724 scsi: lpfc: Fix odd recovery in duplicate FLOGIs in point-to-point
Testing a point-to-point topology and a case of re-FLOGI without
intervening link bouncing, showed an odd interaction with firmware and
a resulting scenario where the driver no longer probed after accepting
the new FLOGI.

Work around the firmware issue by issuing a link bounce if a FLOGI is
received after the link is already up and FLOGI's accepted.

While debugging the issue, realized that some debug traces should be
clarified to help in the future.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart b114d9009d scsi: lpfc: Correct LCB RJT handling
When LCB's are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. Should have been set to
command in progress.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart 036cad1f1a scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces
On FCoE adapters, when running link bounce test in a loop, initiator
failed to login with switch switch and required driver reload to
recover. Switch reached a point where all subsequent FLOGIs would be
LS_RJT'd. Further testing showed the condition to be related to not
performing FCF discovery between FLOGI's.

Fix by monitoring FLOGI failures and once a repeated error is seen
repeat FCF discovery.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00