nvme: avoid fallback to sequential scan due to transient issues

Currently, if nvme_scan_ns_list fails, nvme_scan_work will fall back to
a sequential scan. nvme_scan_ns_list can fail for a variety of reasons,
e.g. a transient transport issue, and the resulting sequential scan can
be extremely expensive on controllers reporting an NN value close to the
maximum allowed (> 4 billion). Avoid sequential scans wherever possible
by only falling back to them in two cases:

- When the NVMe version supported (VS) value reported by the device is
  older than NVME_VS(1, 1, 0), before which support of Identify NS List
  is not required.
- When the Identify NS List command fails with the DNR bit set in the
  status. This is to accommodate (non-compliant) devices which report a
  VS value which implies support for Identify NS List, but nevertheless
  do not support the command. Such devices will most likely fail the
  command with the DNR bit set.

In the remaining case, the device claims support for Identify NS List
but the command fails with DNR not set. Falling back to a sequential
scan there is potentially expensive and likely unnecessary, as a retry
of the list scan should succeed. So this change skips the fallback in
that case.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Uday Shankar 2022-11-14 17:23:59 -07:00 committed by Christoph Hellwig
Parent 91c11d5f32
Commit 811f4de034
1 changed file with 11 additions and 4 deletions

@@ -4460,9 +4460,6 @@ static int nvme_scan_ns_list(struct nvme_ctrl *ctrl)
 	u32 prev = 0;
 	int ret = 0, i;

-	if (nvme_ctrl_limited_cns(ctrl))
-		return -EOPNOTSUPP;
-
 	ns_list = kzalloc(NVME_IDENTIFY_DATA_SIZE, GFP_KERNEL);
 	if (!ns_list)
 		return -ENOMEM;
@@ -4570,8 +4567,18 @@ static void nvme_scan_work(struct work_struct *work)
 	}

 	mutex_lock(&ctrl->scan_lock);
-	if (nvme_scan_ns_list(ctrl) != 0)
+	if (nvme_ctrl_limited_cns(ctrl)) {
 		nvme_scan_ns_sequential(ctrl);
+	} else {
+		/*
+		 * Fall back to sequential scan if DNR is set to handle broken
+		 * devices which should support Identify NS List (as per the VS
+		 * they report) but don't actually support it.
+		 */
+		ret = nvme_scan_ns_list(ctrl);
+		if (ret > 0 && ret & NVME_SC_DNR)
+			nvme_scan_ns_sequential(ctrl);
+	}
 	mutex_unlock(&ctrl->scan_lock);
 }
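
The decision made in nvme_scan_work after this change can be modelled
outside the kernel. The following is a minimal userspace sketch, not the
driver code: struct fake_ctrl, enum scan_action, limited_cns() and
decide_scan() are hypothetical names introduced for illustration, the
NVME_VS() encoding and the 0x4000 value for NVME_SC_DNR are assumed to
match the kernel headers, and the real nvme_ctrl_limited_cns() helper may
also take controller quirks into account, which this sketch ignores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's encoding of the VS register. */
#define NVME_VS(major, minor, tertiary) \
	(((major) << 16) | ((minor) << 8) | (tertiary))

/* Assumed value of the Do Not Retry bit in a positive NVMe status. */
#define NVME_SC_DNR 0x4000

/* Hypothetical stand-in for the controller state the decision needs. */
struct fake_ctrl {
	uint32_t vs;		/* version reported by the device */
	int list_scan_ret;	/* simulated result of the list scan */
};

/* Hypothetical summary of what the scan work would end up doing. */
enum scan_action {
	SCAN_SEQUENTIAL,		/* list scan never attempted */
	SCAN_LIST_ONLY,			/* list scan attempted, no fallback */
	SCAN_LIST_THEN_SEQUENTIAL,	/* list scan failed with DNR set */
};

/* Simplified version check: Identify NS List is optional before 1.1. */
static bool limited_cns(const struct fake_ctrl *ctrl)
{
	return ctrl->vs < NVME_VS(1, 1, 0);
}

static enum scan_action decide_scan(const struct fake_ctrl *ctrl)
{
	int ret;

	if (limited_cns(ctrl))
		return SCAN_SEQUENTIAL;

	/*
	 * Positive values are NVMe status codes, negative values are
	 * errnos (e.g. a transport error). Only a status with DNR set
	 * triggers the sequential fallback.
	 */
	ret = ctrl->list_scan_ret;
	if (ret > 0 && (ret & NVME_SC_DNR))
		return SCAN_LIST_THEN_SEQUENTIAL;

	return SCAN_LIST_ONLY;
}
```

Under these assumptions, a 1.0 device always gets a sequential scan, a
1.3 device whose list scan returns a status such as 0x4002 (DNR set)
falls back to it, and a 1.3 device whose list scan returns a negative
errno or a status without DNR does not fall back.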