Updates to the usual drivers (megaraid_sas, scsi_debug, lpfc, target,
mpi3mr, hisi_sas, arcmsr). The major core change is the
constification of the host templates (which touches everything) along
with other minor fixups and clean ups.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZEmJACYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishU4FAP0WYhFC
rkbY203/+ErUuwvOKum0VwJKUowCaUD0MBwScAD+Ok/NWobmjdXUBbPUbvVkr+hE
8B/xs9hodX+1fVJcVG0=
=fS/j
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (megaraid_sas, scsi_debug, lpfc, target,
mpi3mr, hisi_sas, arcmsr).
The major core change is the constification of the host templates
(which touches everything) along with other minor fixups and clean
ups"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (207 commits)
scsi: ufs: mcq: Use pointer arithmetic in ufshcd_send_command()
scsi: ufs: mcq: Annotate ufshcd_inc_sq_tail() appropriately
scsi: cxlflash: s/semahpore/semaphore/
scsi: lpfc: Silence an incorrect device output
scsi: mpi3mr: Use IRQ save variants of spinlock to protect chain frame allocation
scsi: scsi_debug: Fix missing error code in scsi_debug_init()
scsi: hisi_sas: Work around build failure in suspend function
scsi: lpfc: Fix ioremap issues in lpfc_sli4_pci_mem_setup()
scsi: mpt3sas: Fix an issue when driver is being removed
scsi: mpt3sas: Remove HBA BIOS version in the kernel log
scsi: target: core: Fix invalid memory access
scsi: scsi_debug: Drop sdebug_queue
scsi: scsi_debug: Only allow sdebug_max_queue be modified when no shosts
scsi: scsi_debug: Use scsi_host_busy() in delay_store() and ndelay_store()
scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in stop_all_queued()
scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in sdebug_blk_mq_poll()
scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd
scsi: scsi_debug: Use scsi_block_requests() to block queues
scsi: scsi_debug: Protect block_unblock_all_queues() with mutex
scsi: scsi_debug: Change shost list lock to a mutex
...
In lpfc_sli4_pci_mem_unset(), case LPFC_SLI_INTF_IF_TYPE_1 does not have a
break statement, resulting in an incorrect device output.
Fix this by adding a break statement before the default option.
Signed-off-by: Jun Chen <jun_c@hust.edu.cn>
Link: https://lore.kernel.org/r/20230410023724.3209455-1-jun_c@hust.edu.cn
Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When if_type equals zero and pci_resource_start(pdev, PCI_64BIT_BAR4)
returns false, drbl_regs_memmap_p is not remapped. This passes a NULL
pointer to iounmap(), which can trigger a WARN() on certain arches.
When if_type equals six and pci_resource_start(pdev, PCI_64BIT_BAR4)
returns true, drbl_regs_memmap_p may has been remapped and
ctrl_regs_memmap_p is not remapped. This is a resource leak and passes a
NULL pointer to iounmap().
To fix these issues, we need to add null checks before iounmap(), and
change some goto labels.
Fixes: 1351e69fc6 ("scsi: lpfc: Add push-to-adapter support to sli4")
Signed-off-by: Shuchang Li <lishuchang@hust.edu.cn>
Link: https://lore.kernel.org/r/20230404072133.1022-1-lishuchang@hust.edu.cn
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
20 Fixes all in drivers except the one zone storage revalidation fix
to sd. The megaraid_sas fixes are more on the level of a driver
update (enabling crash dump and increasing lun number) but I thought
you could let this slide on -rc1 and the next most extensive update is
a load of fixes to mpi3mr.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZAuqwyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVm/AP4t1E3w
Jc19sSzMV6OZc5x1eTSGzso80fkTeFgVXWOmcQD9G6ymH78hS0cb7FRIwtnHcV5r
6TwbInHF2LwNKzPhuMI=
=BdqN
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Twenty fixes all in drivers except the one zone storage revalidation
fix to sd.
The megaraid_sas fixes are more on the level of a driver update
(enabling crash dump and increasing lun number) but I thought you
could let this slide on -rc1 and the next most extensive update is a
load of fixes to mpi3mr"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Fix wrong zone_write_granularity value during revalidate
scsi: storvsc: Handle BlockSize change in Hyper-V VHD/VHDX file
scsi: megaraid_sas: Driver version update to 07.725.01.00-rc1
scsi: megaraid_sas: Add crash dump mode capability bit in MFI capabilities
scsi: megaraid_sas: Update max supported LD IDs to 240
scsi: mpi3mr: Bad drive in topology results kernel crash
scsi: mpi3mr: NVMe command size greater than 8K fails
scsi: mpi3mr: Return proper values for failures in firmware init path
scsi: mpi3mr: Wait for diagnostic save during controller init
scsi: mpi3mr: Driver unload crashes host when enhanced logging is enabled
scsi: mpi3mr: ioctl timeout when disabling/enabling interrupt
scsi: lpfc: Avoid usage of list iterator variable after loop
scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read()
scsi: ufs: mcq: qcom: Clean the return path of ufs_qcom_mcq_config_resource()
scsi: ufs: mcq: qcom: Fix passing zero to PTR_ERR
scsi: ufs: ufs-qcom: Remove impossible check
scsi: ufs: core: Add soft dependency on governor_simpleondemand
scsi: hisi_sas: Check devm_add_action() return value
scsi: qla2xxx: Add option to disable FC2 Target support
scsi: target: iscsi: Fix an error message in iscsi_check_key()
Bjorn Helgaas <helgaas@kernel.org> says:
Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is native"),
which appeared in v6.0, the PCI core has enabled PCIe error reporting for
all devices during enumeration.
Remove driver code to do this and remove unnecessary includes of
<linux/aer.h> from several other drivers.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver. Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-8-helgaas@kernel.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Extended status reason code errors should mask off the IOERR_PARAM_MASK
before checking strict equalities for IOERR values.
Update the lpfc_error_lost_link() routine as such.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During tolerance tests that force an HBA to become unresponsive, rmmod
hangs resulting in the inability to remove the driver.
The lpfc_pci_remove_one_s4() routine attempts to submit a clean up mailbox
command via the lpfc_sli4_post_sync_mbox() routine, but ends up waiting
forever for a mailbox register to set its ready bit. Because the HBA is in
an unrecoverable and unresponsive state, the ready bit will never be set.
Create a new routine called lpfc_sli4_unrecoverable_port(), which checks a
port status register's error notification bits.
Use the lpfc_sli4_unrecoverable_port() routine in ready bit check routines
to early return error if port is deemed unrecoverable.
Also, when the lpfc_handle_eratt_s4() handler detects an unrecoverable
state, call the lpfc_sli4_offline_eratt() routine to kick off flushing
outstanding I/O.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A fabric controller can sometimes send an RDP request right before a link
down event. Because of this outstanding RDP request, the driver does not
remove the last reference count on its ndlp causing a potential leak of RPI
resources when devloss tmo fires.
In lpfc_cmpl_els_rsp(), modify the NPIV clause to always allow the
lpfc_drop_node() routine to execute when not registered with SCSI
transport.
This relaxes the contraint that an NPIV ndlp must be in a specific state in
order to call lpfc_drop node. Logic is revised such that the
lpfc_drop_node() routine is always called to ensure the last ndlp decrement
occurs.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When mapped to a target with multiple virtual ports, a link bounce
sometimes results in unsuccessful rediscovery of all of the target's
virtual ports. This is because a succession of repeat RSCNs for the
virtual target ports leaves ndlps in the REG_LOGIN state with the
NLP_REG_LOGIN_SEND flag set. With NLP_REG_LOGIN_SEND set, during the next
PLOGI, the driver will UNREG_RPI. When UNREG_RPI is processed, the driver
can be in the middle of PRLI_ISSUE or MAPPED state resulting in an illegal
state transition by the discovery engine and stalling.
Fix by calling the discovery state machine with DEVICE_RECOVERY event
during RSCN processing. This will set the NLP_IGNR_REG_CMPL bit and
prevent the old REG_LOGIN state from advancing. Then for the new PLOGI
issue, add the check for the NLP_IGNR_REG_CMPL bit to delay issuing the new
PLOGI until the queued REG_LOGIN and UNREG_LOGIN have been processed.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A target vendor array reboot in P2P topology can sometimes result in
unsuccessful rediscovery.
Rework the lpfc_cmpl_els_logo() routine such that when the LOGO completes
as a failure because of driver abort, the LOGO state is still recorded with
the discovery state machine.
This is a small rework to set LOGO completion without forcing a device
removal state change.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Lockdep enabled kernels report a theoretical deadlock state where the
cmf_timer interrupt occurs while the rx_monitor ring is being destroyed.
During rmmod, the cmf_timer is cancelled prior to the
lpfc_rx_monitor_destroy_ring call. This actually eliminates the need to
take the rx_monitor ring lock in lpfc_rx_monitor_destroy_ring. Thus, just
remove lock/unlock of rx_monitor in lpfc_rx_monitor_destroy_ring.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Code sections where DMA resources are freed before list removal are
reworked to ensure item removal before being freed.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A static code analysis tool flagged the possibility of buffer overflow when
using copy_from_user() for a debugfs entry.
Currently, it is possible that copy_from_user() copies more bytes than what
would fit in the mybuf char array. Add a min() restriction check between
sizeof(mybuf) - 1 and nbytes passed from the userspace buffer to protect
against buffer overflow.
Link: https://lore.kernel.org/r/20230301231626.9621-2-justintee8345@gmail.com
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the &epd_pool->list is empty when executing
lpfc_get_io_buf_from_expedite_pool() the function would return an invalid
pointer. Even in the case if the list is guaranteed to be populated, the
iterator variable should not be used after the loop to be more robust for
future changes.
Linus proposed to avoid any use of the list iterator variable after the
loop, in the attempt to move the list iterator variable declaration into
the macro to avoid any potential misuse after the loop [1].
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel <jkl820.git@gmail.com>
Link: https://lore.kernel.org/r/20230301-scsi-lpfc-avoid-list-iterator-after-loop-v1-1-325578ae7561@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If kzalloc() fails in lpfc_sli4_cgn_params_read(), then we rely on
lpfc_read_object()'s routine to NULL check pdata.
Currently, an early return error is thrown from lpfc_read_object() to
protect us from NULL ptr dereference, but the errno code is -ENODEV.
Change the errno code to a more appropriate -ENOMEM.
Reported-by: Kang Chen <void0red@gmail.com>
Link: https://lore.kernel.org/all/20230226102338.3362585-1-void0red@gmail.com
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230228044336.5195-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It turns out that commit 596ff4a09b ("cpumask: re-introduce
constant-sized cpumask optimizations") exposed a number of cases of
drivers not checking the result of "cpumask_next()" and friends
correctly.
The documented correct check for "no more cpus in the cpumask" is to
check for the result being equal or larger than the number of possible
CPU ids, exactly _because_ we've always done those constant-sized
cpumask scans using a widened type before. So the return value of a
cpumask scan should be checked with
if (cpu >= nr_cpu_ids)
...
because the cpumask scan did not necessarily stop exactly *at* that
maximum CPU id.
But a few cases ended up instead using checks like
if (cpu == nr_cpumask_bits)
...
which used that internal "widened" number of bits. And that used to
work pretty much by accident (ok, in this case "by accident" is simply
because it matched the historical internal implementation of the cpumask
scanning, so it was more of a "intentionally using implementation
details rather than an accident").
But the extended constant-sized optimizations then did that internal
implementation differently, and now that code that did things wrong but
matched the old implementation no longer worked at all.
Which then causes subsequent odd problems due to using what ends up
being an invalid CPU ID.
Most of these cases require either unusual hardware or special uses to
hit, but the random.c one triggers quite easily.
All you really need is to have a sufficiently small CONFIG_NR_CPUS value
for the bit scanning optimization to be triggered, but not enough CPUs
to then actually fill that widened cpumask. At that point, the cpumask
scanning will return the NR_CPUS constant, which is _not_ the same as
nr_cpumask_bits.
This just does the mindless fix with
sed -i 's/== nr_cpumask_bits/>= nr_cpu_ids/'
to fix the incorrect uses.
The ones in the SCSI lpfc driver in particular could probably be fixed
more cleanly by just removing that repeated pattern entirely, but I am
not emptionally invested enough in that driver to care.
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/
Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/lkml/CAMuHMdUKo_Sf7TjKzcNDa8Ve+6QrK+P8nSQrSQ=6LTRmcBKNww@mail.gmail.com/
Reported-by: Vernon Yang <vernon2gm@gmail.com>
Link: https://lore.kernel.org/lkml/20230306160651.2016767-1-vernon2gm@gmail.com/
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Updates that missed the first pull, mostly because of needing more
soak time. Driver updates (zfcp, ufs, mpi3mr, plus two ipr bug
fixes), an enclosure services (ses) update (mostly bug fixes) and
other minor bug fixes and changes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZAJf2SYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWYeAP9u6umX
5Trq6mtfdPyhSxIhgD6AwJDgVkeApSKAHZRu0AD/dfMTQmeaJory4LD4JW+uAgVl
yFPVz9VPlKTiaM2lwUI=
=BKzr
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Updates that missed the first pull, mostly because of needing more
soak time.
Driver updates (zfcp, ufs, mpi3mr, plus two ipr bug fixes), an
enclosure services (ses) update (mostly bug fixes) and other minor bug
fixes and changes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
scsi: zfcp: Trace when request remove fails after qdio send fails
scsi: zfcp: Change the type of all fsf request id fields and variables to u64
scsi: zfcp: Make the type for accessing request hashtable buckets size_t
scsi: ufs: core: Simplify ufshcd_execute_start_stop()
scsi: ufs: core: Rely on the block layer for setting RQF_PM
scsi: core: Extend struct scsi_exec_args
scsi: lpfc: Fix double word in comments
scsi: core: Remove the /proc/scsi/${proc_name} directory earlier
scsi: core: Fix a source code comment
scsi: cxgbi: Remove unneeded version.h include
scsi: qedi: Remove unneeded version.h include
scsi: mpi3mr: Remove unneeded version.h include
scsi: mpi3mr: Fix missing mrioc->evtack_cmds initialization
scsi: mpi3mr: Use number of bits to manage bitmap sizes
scsi: mpi3mr: Remove unnecessary memcpy() to alltgt_info->dmi
scsi: mpi3mr: Fix issues in mpi3mr_get_all_tgt_info()
scsi: mpi3mr: Fix an issue found by KASAN
scsi: mpi3mr: Replace 1-element array with flex-array
scsi: ipr: Work around fortify-string warning
scsi: ipr: Make ipr_probe_ioa_part2() return void
...
Updates to the usual drivers (ufs, lpfc, qla2xxx, libsas). The major
core change is a rework to remove the two helpers around
scsi_execute_cmd and use it as the only submission interface along
with other minor fixes and updates.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY/Z7BSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfF9AP9HzZaZ
0XumuxjchcJHRntcIAzb9kE62SSWFxBrgZQCtAEAyhAO1zK284esgCa+arkzEC7p
uhxpufQSGnYvbasStsU=
=VsH1
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (ufs, lpfc, qla2xxx, libsas).
The major core change is a rework to remove the two helpers around
scsi_execute_cmd and use it as the only submission interface along
with other minor fixes and updates"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (142 commits)
scsi: ufs: core: Fix an error handling path in ufshcd_read_desc_param()
scsi: ufs: core: Fix device management cmd timeout flow
scsi: aic94xx: Add missing check for dma_map_single()
scsi: smartpqi: Replace one-element array with flexible-array member
scsi: mpt3sas: Fix a memory leak
scsi: qla2xxx: Remove the unused variable wwn
scsi: ufs: core: Fix kernel-doc syntax
scsi: ufs: core: Add hibernation callbacks
scsi: snic: Fix memory leak with using debugfs_lookup()
scsi: ufs: core: Limit DMA alignment check
scsi: Documentation: Correct spelling
scsi: Documentation: Correct spelling
scsi: target: Documentation: Correct spelling
scsi: aacraid: Allocate cmd_priv with scsicmd
scsi: ufs: qcom: dt-bindings: Add SM8550 compatible string
scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW version major 5
scsi: ufs: qcom: fix platform_msi_domain_free_irqs() reference
scsi: ufs: core: Enable DMA clustering
scsi: ufs: exynos: Fix the maximum segment size
scsi: ufs: exynos: Fix DMA alignment for PAGE_SIZE != 4096
...
Remove the repeated word "the" in comments.
[mkp: fixed additional typos in the changed lines]
Link: https://lore.kernel.org/r/20230217083046.4090-1-liubo03@inspur.com
Signed-off-by: Bo Liu <liubo03@inspur.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The LLDD and the stack currently process FPINs received from the fabric,
but the stack is not aware of any action taken by the driver to alleviate
congestion. The current interface between the driver and the SCSI stack is
limited to passing the notification mainly for statistics and heuristics.
The reaction to an FPIN could be handled either by the driver or by the
stack (marginal path and failover). Amend the interface to indicate if
action on an FPIN has already been reacted to by the LLDDs or not. Add an
additional flag to fc_host_fpin_rcv() to indicate if the FPIN has been
acknowledged/reacted to by the driver.
Also added a new event code FCH_EVT_LINK_FPIN_ACK to notify to the user
that the event has been acknowledged/reacted by the LLDD driver
Link: https://lore.kernel.org/r/20230209034326.882514-1-muneendra.kumar@broadcom.com
Co-developed-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Co-developed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Number of files depend on linux/sched/clock.h getting included
by linux/skbuff.h which soon will no longer be the case.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update copyrights to 2023 for files modified in the 14.2.0.10 patch set.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Define new FC Link ACQE with new attention types 0x8 (Link Activation
Failure) and 0x9 (Link Reset Protocol Event).
Both attention types are meant to be informational-only type ACQEs with no
action required.
0x8 is reported for diagnostic purposes, while 0x9 is posted during a
normal link up transition when activating BB Credit Recovery feature.
As such, modify lpfc_sli4_async_fc_evt() logic to log the attention types
according to its severity and early return when informational-only
attention types are encountered.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
After enabling VMID, an issue LIP test was erasing fabric switch VMID
information.
Introduce a lpfc_reinit_vmid() routine, which reinitializes all VMID data
structures upon FLOGI completion in fabric topology.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During the sysfs firmware write process, a use-after-free read warning is
logged from the lpfc_wr_object() routine:
BUG: KFENCE: use-after-free read in lpfc_wr_object+0x235/0x310 [lpfc]
Use-after-free read at 0x0000000000cf164d (in kfence-#111):
lpfc_wr_object+0x235/0x310 [lpfc]
lpfc_write_firmware.cold+0x206/0x30d [lpfc]
lpfc_sli4_request_firmware_update+0xa6/0x100 [lpfc]
lpfc_request_firmware_upgrade_store+0x66/0xb0 [lpfc]
kernfs_fop_write_iter+0x121/0x1b0
new_sync_write+0x11c/0x1b0
vfs_write+0x1ef/0x280
ksys_write+0x5f/0xe0
do_syscall_64+0x59/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
The driver accessed wr_object pointer data, which was initialized into
mailbox payload memory, after the mailbox object was released back to the
mailbox pool.
Fix by moving the mailbox free calls to the end of the routine ensuring
that we don't reference internal mailbox memory after release.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In a large SAN testing configuration, frequent target port toggle tests are
occasionally resulting in missing lun path rediscoveries. An outstanding
PRLI can be inflight when a target RSCN dissappearance occurs, causing the
driver to retry PRLIs using invalid rpi contexts.
Fix by verifying that an ndlp's state was not restarted from PRLI_ISSUE
due to an intermediate RSCN. If not in a valid state, early exit PRLI
completion handling.
The last follow up RSCN indicating target reappearance retriggers
PLOGI/PRLI with a valid rpi context and is expected to succeed in LUN path
rediscovery.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With faulty cables in PT2PT topology, an unintentional ndlp double kref
decrement can occur.
If a FLOGI request is outstanding before the link goes down, the missing
FLOGI_ACC causes an F_Port ndlp to remain in the UNUSED state. During link
down, lpfc_cleanup_rpis() is called and decrements an ndlp kref.
Additionally, when the driver later decides to abort the FLOGI, the FLOGI
completion handler decrements the ndlp kref a second time.
Remove duplicate clean up logic in lpfc_cleanup_rpis() because the updated
FLOGI completion handler already handles the ndlp kref decrement.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The disable_vport() path calls the discovery state machine on all ndlps,
puts them into NPR state, and then calls lpfc_cleanup_rpis() with the
remove flag set. This unintentionally decrements an ndlp's kref twice and
can result in premature release of an ndlp because
lpfc_dev_loss_tmo_handler() triggers clean up of the ndlp again later.
Remove redundant code in disable_vport() that sets all the ndlps to NPR,
and change the call to lpfc_cleanup_rpis() to not remove the ndlps.
lpfc_dev_loss_tmo_handler() will handle final removal of the ndlps.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During I/O, the following warning message occasionally appears:
DMA-API: lpfc 0000:04:00.0: mapping sg segment longer than device claims to
support [len=131072] [max=65536]
The HBA is capable of supporting 131,072 bytes, so notify DMA layer via the
dma_set_max_seg_size() API during hba initialization.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The local variables called curr_data are incremented, but not actually used
for anything so they are removed.
The return value of lpfc_sli4_poll_eq is not used anywhere and is not
called outside of lpfc_sli.c. Thus, its declaration is removed from
lpfc_crtn.h Also, lpfc_sli4_poll_eq's path argument is not used in the
routine so it is removed along with corresponding macros.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The kernel test robot pointed out non-NULL terminated string possibilities
when using strncpy() in lpfc_xcvr_data_show() routine. Although we
manually set the NULL character after strncpy(), strncpy() usage is
outdated.
Replace all strncpy() usages with the preferred strscpy() API.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The kernel test robot detected inconsistent indentations for an if
statement block in the lpfc_xcvr_data_show() routine.
This patch reduces the extraneous tabs used for the if statement block in
question.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Updates to the usual drivers (target, ufs, smartpqi, lpfc). There are
some core changes, mostly around reworking some of our user context
assumptions in device put and moving some code around. The remaining
updates are bug fixes and minor changes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY5jjrSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishR9iAPwN++uF
BNlCD36duS8LslKQMPAmFxWt3d/4RWAHsXj2WQEAtu9q8K9PSe1ueb4y+rAEG4oj
2AUQhR3v9ciWBBKlDog=
=JYJC
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (target, ufs, smartpqi, lpfc).
There are some core changes, mostly around reworking some of our user
context assumptions in device put and moving some code around.
The remaining updates are bug fixes and minor changes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (138 commits)
scsi: sg: Fix get_user() in call sg_scsi_ioctl()
scsi: megaraid_sas: Fix some spelling mistakes in comment
scsi: core: Use SCSI_SCAN_INITIAL in do_scsi_scan_host()
scsi: core: Use SCSI_SCAN_RESCAN in __scsi_add_device()
scsi: ufs: ufs-mediatek: Remove unnecessary return code
scsi: ufs: core: Fix the polling implementation
scsi: libsas: Do not export sas_ata_wait_after_reset()
scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset
scsi: libsas: Add smp_ata_check_ready_type()
scsi: Revert "scsi: hisi_sas: Don't send bcast events from HW during nexus HA reset"
scsi: Revert "scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology()"
scsi: ufs: ufs-mediatek: Modify the return value
scsi: ufs: ufs-mediatek: Remove unneeded code
scsi: device_handler: alua: Call scsi_device_put() from non-atomic context
scsi: device_handler: alua: Revert "Move a scsi_device_put() call out of alua_check_vpd()"
scsi: snic: Fix possible UAF in snic_tgt_create()
scsi: qla2xxx: Initialize vha->unknown_atio_[list, work] for NPIV hosts
scsi: qla2xxx: Remove duplicate of vha->iocb_work initialization
scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails
scsi: sd: Use 16-byte SYNCHRONIZE CACHE on ZBC devices
...
Nothing in this file needs anything from linux/msi.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20221113202428.436270297@linutronix.de
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a FLOGI completes with a sequence timeout error, a freed kref ptr
dereference crash can occur due to a timing race involving ndlp referencing
in lpfc_dev_loss_tmo_callbk.
Fix by ensuring the driver accounts for an outstanding FLOGI when dev_loss
is active. Also, don't remove the HBA_FLOGI_OUTSTANDING flag when the
FLOGI is retried to allow the driver to handle the reference counts
correctly in lpfc_dev_loss_tmo_handler.
Reported-by: Dietmar Hahn <dietmar.hahn@fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@fujitsu.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221116011921.105995-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The dynamic mi_ver value holds the currently configured MI setting. mi_ver
was being displayed as part of the cmf_info sysfs attribute, when the
output string meant to display MI capabilities instead.
Add a mi_cap member in the lpfc_pc_sli4_params structure that will store MI
capabilities during initialization so that cmf_info prints out capabilities
instead of current configuration.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221116011921.105995-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The lpfc_cmf_timer adjusts phba->cmf_link_byte_count periodically and can
artifically inflate bandwidth percent.
During bandwidth calculation, correct for this by setting a cap of logging
a maximum of 100%.
Bandwidth calculation is only used for display under LOG_CGN_MGMT so there
is no expectation of impacts on performance.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221116011921.105995-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Adapter configurations with limited EQ resources may fail to initialize.
Firmware resources are queried in lpfc_sli4_read_config(). The driver
parameters cfg_irq_chann and cfg_hdw_queue are adjusted from defaults if
constrained by firmware resources.
The minimum resource check includes a special allocation for queues such as
ELS, MBOX, NVME LS. However the additional reservation was also incorrectly
applied to EQ resources.
Reordered WQ|CQ|EQ resource checks to apply the special allocation
adjustment to WQ and CQ resources only.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221116011921.105995-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use memset_startat() helper to simplify the code, no functional changes in
this patch.
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Link: https://lore.kernel.org/r/20221111074310.132125-1-xiujianfeng@huawei.com
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pointer lp is being initialized and incremented but the result is never
read. The pointer is redundant and can be removed.
Once lp is removed, pcmd is not longer used. So remove pcmd as well
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Link: https://lore.kernel.org/r/20221108183620.93978-1-jsmart2021@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The DUMP_MEMORY mailbox command is implemented for page A0 and A2 to
retrieve transceiver information from firmware.
The mailbox command output is then formatted to print raw data values for
userspace to parse via sysfs.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221017164323.14536-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When bandwidth reduces from or recovers back to 100% due to congestion
management, log the event.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221017164323.14536-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During I/O and simultaneous cat of /sys/kernel/debug/lpfc/fnX/rx_monitor, a
hard lockup similar to the call trace below may occur.
The spin_lock_bh in lpfc_rx_monitor_report is not protecting from timer
interrupts as expected, so change the strength of the spin lock to _irq.
Kernel panic - not syncing: Hard LOCKUP
CPU: 3 PID: 110402 Comm: cat Kdump: loaded
exception RIP: native_queued_spin_lock_slowpath+91
[IRQ stack]
native_queued_spin_lock_slowpath at ffffffffb814e30b
_raw_spin_lock at ffffffffb89a667a
lpfc_rx_monitor_record at ffffffffc0a73a36 [lpfc]
lpfc_cmf_timer at ffffffffc0abbc67 [lpfc]
__hrtimer_run_queues at ffffffffb8184250
hrtimer_interrupt at ffffffffb8184ab0
smp_apic_timer_interrupt at ffffffffb8a026ba
apic_timer_interrupt at ffffffffb8a01c4f
[End of IRQ stack]
apic_timer_interrupt at ffffffffb8a01c4f
lpfc_rx_monitor_report at ffffffffc0a73c80 [lpfc]
lpfc_rx_monitor_read at ffffffffc0addde1 [lpfc]
full_proxy_read at ffffffffb83e7fc3
vfs_read at ffffffffb833fe71
ksys_read at ffffffffb83402af
do_syscall_64 at ffffffffb800430b
entry_SYSCALL_64_after_hwframe at ffffffffb8a000ad
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20221017164323.14536-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>