Enable for explicit FCP channel FC Endpoint Security error reporting and
handle any FSF security errors according to specification. Take the
following recovery actions when a FSF_SECURITY_ERROR is reported for the
specified FSF commands:
- Open Port: Retry the command if possible
- Send FCP : Physically close the remote port and reopen
For Open Port the command status is set to error, which triggers a retry.
For Send FCP the command status is set to error and recovery is triggered
to physically reopen the remote port.
Link: https://lore.kernel.org/r/20200312174505.51294-10-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Trace changes in Fibre Channel Endpoint Security capabilities of FCP
devices as well as changes in Fibre Channel Endpoint Security state of
their connections to FC remote ports as FC Endpoint Security changes with
trace level 3 in HBA DBF.
A change in FC Endpoint Security capabilities of FCP devices is traced as
response to FSF command FSF_QTCB_EXCHANGE_PORT_DATA with a trace tag of
"fsfcesa" and a WWPN of ZFCP_DBF_INVALID_WWPN = 0x0000000000000000 (see
FC-FS-4 §18 "Name_Identifier Formats", NAA field).
A change in FC Endpoint Security state of connections between FCP devices
and FC remote ports is traced as response to FSF command
FSF_QTCB_OPEN_PORT_WITH_DID with a trace tag of "fsfcesp".
Example trace record of FC Endpoint Security capability change of FCP
device formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : HBA
Subarea : 00
Level : 3
Exception : -
CPU ID : ...
Caller : 0x...
Record ID : 5 ZFCP_DBF_HBA_FCES
Tag : fsfcesa FSF FC Endpoint Security adapter
Request ID : 0x...
Request status : 0x00000010
FSF cmnd : 0x0000000e FSF_QTCB_EXCHANGE_PORT_DATA
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000000 FSF_GOOD
FSF stat qual : n/a
Prot stat : n/a
Prot stat qual : n/a
Port handle : 0x00000000 none (invalid)
LUN handle : n/a
WWPN : 0x0000000000000000 ZFCP_DBF_INVALID_WWPN
FCES old : 0x00000000 old FC Endpoint Security
FCES new : 0x00000007 new FC Endpoint Security
Example trace record of FC Endpoint Security change of connection to
FC remote port formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : HBA
Subarea : 00
Level : 3
Exception : -
CPU ID : ...
Caller : 0x...
Record ID : 5 ZFCP_DBF_HBA_FCES
Tag : fsfcesp FSF FC Endpoint Security port
Request ID : 0x...
Request status : 0x00000010
FSF cmnd : 0x00000005 FSF_QTCB_OPEN_PORT_WITH_DID
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000000 FSF_GOOD
FSF stat qual : n/a
Prot stat : n/a
Prot stat qual : n/a
Port handle : 0x...
WWPN : 0x500507630401120c WWPN
FCES old : 0x00000000 old FC Endpoint Security
FCES new : 0x00000004 new FC Endpoint Security
Link: https://lore.kernel.org/r/20200312174505.51294-9-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Log the usage of and subsequent changes in FC Endpoint Security of
connections between FCP devices and FC remote ports to the kernel ring
buffer. Activation of FC Endpoint Security is logged as informational.
Change and deactivation are logged as warning.
No logging takes place, if FC Endpoint Security is not used (i.e. never
activated) on a connection or if it does not change during reopen of a port
(e.g. due to adapter or port recovery).
Link: https://lore.kernel.org/r/20200312174505.51294-8-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add an interface to read Fibre Channel Endpoint Security information of FCP
channels and their connections to FC remote ports. It comes in the form of
new sysfs attributes that are attached to the CCW device representing the
FCP device and its zfcp port objects.
The read-only sysfs attribute "fc_security" of a CCW device representing a
FCP device shows the FC Endpoint Security capabilities of the device.
Possible values are: "unknown", "unsupported", "none", or a comma-
separated list of one or more mnemonics and/or one hexadecimal value
representing the supported FC Endpoint Security:
Authentication: Authentication supported
Encryption : Encryption supported
The read-only sysfs attribute "fc_security" of a zfcp port object shows the
FC Endpoint Security used on the connection between its parent FCP device
and the FC remote port. Possible values are: "unknown", "unsupported",
"none", or a mnemonic or hexadecimal value representing the FC Endpoint
Security used:
Authentication: Connection has been authenticated
Encryption : Connection is encrypted
Both sysfs attributes may return hexadecimal values instead of mnemonics,
if the mnemonic lookup table does not contain an entry for the FC Endpoint
Security reported by the FCP device.
Link: https://lore.kernel.org/r/20200312174505.51294-7-maier@linux.ibm.com
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce automatic variables for adapter and QTCB bottom in
zfcp_fsf_open_port_handler(). This facilitates subsequent changes to meet
the 80 character per line limit.
Link: https://lore.kernel.org/r/20200312174505.51294-6-maier@linux.ibm.com
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When we get an unsolicited notification on local link went down,
zfcp_fsf_status_read_link_down() calls zfcp_fsf_link_down_info_eval().
This only blocks rports, and sets ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED and
ZFCP_STATUS_COMMON_ERP_FAILED. Only the fc_host port_state changes to
"Linkdown", because zfcp_scsi_get_host_port_state() is an active callback
and uses the adapter status.
Other fc_host attributes model, port_id, port_type, speed, fabric_name (and
zfcp device attributes card_version, peer_wwpn, peer_wwnn, peer_d_id) which
depend on a local link, continued to show their last known "good" value.
Only if something triggered an exchange config data, some values were
updated to their unknown equivalent via case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE due to local link down. Triggers for
exchange config data are adapter recovery, or reading any of the following
zfcp-specific scsi host sysfs attributes "requests", "megabytes", or
"seconds_active" in /sys/devices/css*/*.*.*/*.*.*/host*/scsi_host/host*/.
The other fc_host attributes active_fc4s and permanent_port_name continued
to show their last known "good" value. Only if something triggered an
exchange port data, some values changed. Active_fc4s became all zeros as
unknown equivalent during link down. Permanent_port_name does not depend
on a local link. But for non-NPIV FCP devices, permanent_port_name
erroneously became whatever value fc_host port_name had at that point in
time (see previous paragraph). Triggers for exchange port data are the
zfcp-specific scsi host sysfs attribute "utilization", or
[{reset,get}_fc_host_stats] write anything into "reset_statistics" or read
any of the other attributes under
/sys/devices/css*/*.*.*/*.*.*/host*/fc_host/host*/statistics/.
(cf. v4.9 commit bd77befa5b ("zfcp: fix fc_host port_type with NPIV"))
This is particularly confusing when using "lszfcp -b <fcpdevbusid> -Ha" or
dbginfo.sh which read fc_host attributes and also scsi_host attributes.
After link down, the first invocation produces (abbreviated):
Class = "fc_host"
active_fc4s = "0x00 0x00 0x01 0x00 ..."
...
fabric_name = "0x10000027f8e04c49"
...
permanent_port_name = "0xc05076e4588059c1"
port_id = "0x244800"
port_state = "Linkdown"
port_type = "NPort (fabric via point-to-point)"
...
speed = "16 Gbit"
Class = "scsi_host"
...
megabytes = "0 0"
...
requests = "0 0 0"
seconds_active = "37"
...
utilization = "0 0 0"
The second and next invocations produce (abbreviated):
Class = "fc_host"
active_fc4s = "0x00 0x00 0x00 0x00 ..."
...
fabric_name = "0x0"
...
permanent_port_name = "0x0"
port_id = "0x000000"
port_state = "Linkdown"
port_type = "Unknown"
...
speed = "unknown"
Class = "scsi_host"
...
megabytes = "0 0"
...
requests = "0 0 0"
seconds_active = "38"
...
utilization = "0 0 0"
Factor out the resetting of local link dependent fc_host attributes from
zfcp_fsf_exchange_config_data_handler() case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE into a new helper function
zfcp_fsf_fc_host_link_down(). All code places that detect local link down
(SRB, FSF_PROT_LINK_DOWN, xconf data/port incomplete) call
zfcp_fsf_link_down_info_eval(). Call the new helper from there. This works
because zfcp_fsf_link_down_info_eval() and thus the helper is called before
zfcp_fsf_exchange_{config,port}_evaluate().
Port_name and node_name are always valid, so never reset them.
Get the permanent_port_name from exchange port data unconditionally as it
always has a valid known good value, even during link down.
Note: Rather than hardcode in zfcp_fsf_exchange_config_evaluate(), fc_host
supported_classes could theoretically get its value from
fsf_qtcb_bottom_port.class_of_service in zfcp_fsf_exchange_port_evaluate().
When the link comes back, we get a different notification, perform adapter
recovery, and this triggers an implicit exchange config data followed by
exchange port data filling in the link dependent fc_host attributes with
known good values again.
Link: https://lore.kernel.org/r/20200312174505.51294-5-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Manufacturer, HBA model, firmware version, and hardware version. Use the
same value format as for the driver-specific attributes. Keep the
driver-specific attributes for stable user space sysfs API.
Link: https://lore.kernel.org/r/20200312174505.51294-4-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FICON Express8S or older, as well as card features newer than FICON
Express16S+ have no certain firmware level requirement.
FICON Express16S or FICON Express16S+ have the following
minimum firmware level requirements to show a proper fabric name value:
z13 machine
FICON Express16S , MCL P08424.005 , LIC version 0x00000721
z14 machine
FICON Express16S , MCL P42611.008 , LIC version 0x10200069
FICON Express16S+ , MCL P42625.010 , LIC version 0x10300147
Otherwise, the read value is not the fabric name.
Each FCP channel of these card features might need one SAN fabric re-login
after concurrent microcode update in order to show the proper fabric name.
Possible ways to trigger a SAN fabric re-login are one of: Pull fibres
between FCP channel port and SAN switch port on either side and re-plug,
disable SAN switch port adjacent to FCP channel port and re-enable switch
port, or at Service Element toggle off all CHPIDs of FCP channel over all
LPARs and toggle CHPIDs on again. Zfcp operating subchannels (FCP devices)
on such FCP channel recovers a fabric re-login.
Initialize fabric name for any topology and have it an invalid WWPN 0x0 for
anything but fabric topology. Otherwise for e.g. point-to-point topology
one could see the initial -1 from fc_host_setup() and after a link unplug
our fabric name would turn to 0x0 (with subsequent commit ("zfcp: fix
fc_host attributes that should be unknown on local link down") and stay 0x0
on link replug. I did not initialize to 0x0 somewhere even earlier in the
code path such that it would not flap from real to 0x0 to real on e.g. an
exchange config data with fabric topology.
Link: https://lore.kernel.org/r/20200312174505.51294-3-maier@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
v2.6.27 commit cc8c282963 ("[SCSI] zfcp: Automatically attach remote
ports") introduced zfcp automatic port scan.
Before that, the user had to use the sysfs attribute "port_add" of an FCP
device (adapter) to add and open remote (target) ports, even for the remote
peer port in point-to-point topology. That code path did a proper port open
recovery trigger taking the erp_lock.
Since above commit, a new helper function zfcp_erp_open_ptp_port()
performed an UNlocked port open recovery trigger. This can race with other
parallel recovery triggers. In zfcp_erp_action_enqueue() this could corrupt
e.g. adapter->erp_total_count or adapter->erp_ready_head.
As already found for fabric topology in v4.17 commit fa89adba19 ("scsi:
zfcp: fix infinite iteration on ERP ready list"), there was an endless loop
during tracing of rport (un)block. A subsequent v4.18 commit 9e156c54ac
("scsi: zfcp: assert that the ERP lock is held when tracing a recovery
trigger") introduced a lockdep assertion for that case.
As a side effect, that lockdep assertion now uncovered the unlocked code
path for PtP. It is from within an adapter ERP action:
zfcp_erp_strategy[1479] intentionally DROPs erp lock around
zfcp_erp_strategy_do_action()
zfcp_erp_strategy_do_action[1441] NO erp lock
zfcp_erp_adapter_strategy[876] NO erp lock
zfcp_erp_adapter_strategy_open[855] NO erp lock
zfcp_erp_adapter_strategy_open_fsf[806]NO erp lock
zfcp_erp_adapter_strat_fsf_xconf[772] erp lock only around
zfcp_erp_action_to_running(),
BUT *_not_* around
zfcp_erp_enqueue_ptp_port()
zfcp_erp_enqueue_ptp_port[728] BUG: *_not_* taking erp lock
_zfcp_erp_port_reopen[432] assumes to be called with erp lock
zfcp_erp_action_enqueue[314] assumes to be called with erp lock
zfcp_dbf_rec_trig[288] _checks_ to be called with erp lock:
lockdep_assert_held(&adapter->erp_lock);
It causes the following lockdep warning:
WARNING: CPU: 2 PID: 775 at drivers/s390/scsi/zfcp_dbf.c:288
zfcp_dbf_rec_trig+0x16a/0x188
no locks held by zfcperp0.0.17c0/775.
Fix this by using the proper locked recovery trigger helper function.
Link: https://lore.kernel.org/r/20200312174505.51294-2-maier@linux.ibm.com
Fixes: cc8c282963 ("[SCSI] zfcp: Automatically attach remote ports")
Cc: <stable@vger.kernel.org> #v2.6.27+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This makes the SCSI tracing code slightly easier to read.
Link: https://lore.kernel.org/r/20200313203102.16613-6-bvanassche@acm.org
Fixes: bf81623542 ("[SCSI] add scsi trace core functions and put trace points")
Cc: Christoph Hellwig <hch@lst.de>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use these functions instead of open-coding them.
Link: https://lore.kernel.org/r/20200313203102.16613-5-bvanassche@acm.org
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Move the get_unaligned_be24(), get_unaligned_le24() and
put_unaligned_le24() definitions from various drivers into
include/linux/unaligned/generic.h. Add a put_unaligned_be24()
implementation.
Link: https://lore.kernel.org/r/20200313203102.16613-4-bvanassche@acm.org
Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jens Axboe <axboe@fb.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> # For drivers/usb
Reviewed-by: Felipe Balbi <balbi@kernel.org> # For drivers/usb/gadget
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use the generic __{get,put}_unaligned_[bl]e() definitions instead of
duplicating these. Since a later patch will add more definitions into
<linux/unaligned/generic.h>, this patch ensures that these definitions have
to be added only once. See also commit a7f626c194 ("C6X: headers"). See
also commit 6510d41954 ("kernel: Move arches to use common unaligned
access").
Link: https://lore.kernel.org/r/20200313203102.16613-3-bvanassche@acm.org
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The C language supports implicitly casting a void pointer into a non-void
pointer. Remove explicit void pointer to non-void pointer casts because
these are superfluous.
Link: https://lore.kernel.org/r/20200313203102.16613-2-bvanassche@acm.org
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allow users with read permissions to issue REPORT ZONE commands and users
with write permissions to manage zones on block devices supporting the ZBC
specification.
Link: https://lore.kernel.org/r/20200226170518.92963-2-ryanattard@ryanattard.info
Signed-off-by: Ryan Attard <ryanattard@ryanattard.info>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The print of pr_err() does not come with device information, so replace it
with dev_err(). Also improve the grammar in the message.
Link: https://lore.kernel.org/r/1583940144-230800-1-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Large queues of I/O to offline devices that are eventually submitted when
devices are unblocked result in a many repeated "rejecting I/O to offline
device" messages. These messages can fill up the dmesg buffer in crash
dumps so no useful prior messages remain. In addition, if a serial console
is used, the flood of messages can cause a hard lockup in the console code.
Introduce a flag indicating the message has already been logged for the
device, and reset the flag when scsi_device_set_state() changes the device
state.
Link: https://lore.kernel.org/r/20200311143930.20674-1-emilne@redhat.com
Reviewed-by: Bart van Assche <bvanassche@acm.org>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This file had its own peculiar style, not following any other
files inside the Kernel (as far as I saw).
Had to do a number of changes here, starting by removing the two
leading asterisks from each line, adding table and literal
block markups and changing whitespace and blank lines.
The end result is that (IMHO), it is now a lot easier to read
it as a text file, while producing a good html output.
Link: https://lore.kernel.org/r/6f8e4da4ea643adbe048f55504a59427c5e50c97.1583136624.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>