This series consists of the usual driver updates (ufs, pm80xx, lpfc,
mpi3mr, mpt3sas, hisi_sas, libsas) and minor updates and bug fixes.
The most impactful change is likely the switch from GFP_DMA to
GFP_KERNEL in a bunch of drivers, but even that shouldn't affect too
many people.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYeCJZyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfMnAQCsERG9
V4yX8LDpBjD7leIccf+6krJNNWaIWYYkEdxpzQD9FShB7/yDakFq3erW2y5mVqac
dZ065M0ckE4bxk9uMIE=
=gPHF
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This series consists of the usual driver updates (ufs, pm80xx, lpfc,
mpi3mr, mpt3sas, hisi_sas, libsas) and minor updates and bug fixes.
The most impactful change is likely the switch from GFP_DMA to
GFP_KERNEL in a bunch of drivers, but even that shouldn't affect too
many people"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (121 commits)
scsi: mpi3mr: Bump driver version to 8.0.0.61.0
scsi: mpi3mr: Fixes around reply request queues
scsi: mpi3mr: Enhanced Task Management Support Reply handling
scsi: mpi3mr: Use TM response codes from MPI3 headers
scsi: mpi3mr: Add io_uring interface support in I/O-polled mode
scsi: mpi3mr: Print cable mngnt and temp threshold events
scsi: mpi3mr: Support Prepare for Reset event
scsi: mpi3mr: Add Event acknowledgment logic
scsi: mpi3mr: Gracefully handle online FW update operation
scsi: mpi3mr: Detect async reset that occurred in firmware
scsi: mpi3mr: Add IOC reinit function
scsi: mpi3mr: Handle offline FW activation in graceful manner
scsi: mpi3mr: Code refactor of IOC init - part2
scsi: mpi3mr: Code refactor of IOC init - part1
scsi: mpi3mr: Fault IOC when internal command gets timeout
scsi: mpi3mr: Display IOC firmware package version
scsi: mpi3mr: Handle unaligned PLL in unmap cmnds
scsi: mpi3mr: Increase internal cmnds timeout to 60s
scsi: mpi3mr: Do access status validation before adding devices
scsi: mpi3mr: Add support for PCIe Managed Switch SES device
...
Pull in the 5.16 fixes branch to resolve a conflict in the UFS driver
core.
Conflicts:
drivers/scsi/ufs/ufshcd.c
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove this include directive from code that does not use any functionality
from kernel/async.c.
Link: https://lore.kernel.org/r/20211129194609.3466071-13-bvanassche@acm.org
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove the gendisk aregument to blk_execute_rq and blk_execute_rq_nowait
given that it is unused now. Also convert the boolean at_head parameter
to actually use the bool type while touching the prototype.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20211126121802.2090656-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fix the following sparse warnings in ufshpb_set_hpb_read_to_upiu():
sparse warnings: (new ones prefixed by >>)
drivers/scsi/ufs/ufshpb.c:335:27: sparse: sparse: cast from restricted __be64
drivers/scsi/ufs/ufshpb.c:335:25: sparse: expected restricted __be64 [usertype] ppn_tmp
drivers/scsi/ufs/ufshpb.c:335:25: sparse: got unsigned long long [usertype]
Link: https://lore.kernel.org/r/20211111222452.384089-1-huobean@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series is all the stragglers that didn't quite make the first
merge window pull. It's mostly minor updates and bug fixes of merge
window code but it also has two driver updates: ufs and qla2xxx.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYY5mOyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpjAQDboVkH
7RQblJf8AKDMjN2baSIrmbk7qEUqzRgo6Ef3egEAi044Gx4KqBwzBLiCREcFW/Mt
F95pt5udsLypGhpfZlE=
=fiv8
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This series is all the stragglers that didn't quite make the first
merge window pull. It's mostly minor updates and bug fixes of merge
window code but it also has two driver updates: ufs and qla2xxx"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (46 commits)
scsi: scsi_debug: Don't call kcalloc() if size arg is zero
scsi: core: Remove command size deduction from scsi_setup_scsi_cmnd()
scsi: scsi_ioctl: Validate command size
scsi: ufs: ufshpb: Properly handle max-single-cmd
scsi: core: Avoid leaving shost->last_reset with stale value if EH does not run
scsi: bsg: Fix errno when scsi_bsg_register_queue() fails
scsi: sr: Remove duplicate assignment
scsi: ufs: ufs-exynos: Introduce ExynosAuto v9 virtual host
scsi: ufs: ufs-exynos: Multi-host configuration for ExynosAuto v9
scsi: ufs: ufs-exynos: Support ExynosAuto v9 UFS
scsi: ufs: ufs-exynos: Add pre/post_hce_enable drv callbacks
scsi: ufs: ufs-exynos: Factor out priv data init
scsi: ufs: ufs-exynos: Add EXYNOS_UFS_OPT_SKIP_CONFIG_PHY_ATTR option
scsi: ufs: ufs-exynos: Support custom version of ufs_hba_variant_ops
scsi: ufs: ufs-exynos: Add setup_clocks callback
scsi: ufs: ufs-exynos: Add refclkout_stop control
scsi: ufs: ufs-exynos: Simplify drv_data retrieval
scsi: ufs: ufs-exynos: Change pclk available max value
scsi: ufs: Add quirk to enable host controller without PH configuration
scsi: ufs: Add quirk to handle broken UIC command
...
This series consists of the usual driver updates (ufs, smartpqi, lpfc,
target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
fixes. Notable core changes are the removal of scsi->tag which caused
some churn in obsolete drivers and a sweep through all drivers to call
scsi_done() directly instead of scsi->done() which removes a pointer
indirection from the hot path and a move to register core sysfs files
earlier, which means they're available to KOBJ_ADD processing, which
necessitates switching all drivers to using attribute groups.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYYUfBCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbUJAQDZt4oc
vUx9JpyrdHxxTCuOzVFd8W1oJn0k5ltCBuz4yAD8DNbGhGm93raMSJ3FOOlzLEbP
RG8vBdpxMudlvxAPi/A=
=BSFz
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, smartpqi, lpfc,
target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
fixes.
Notable core changes are the removal of scsi->tag which caused some
churn in obsolete drivers and a sweep through all drivers to call
scsi_done() directly instead of scsi->done() which removes a pointer
indirection from the hot path and a move to register core sysfs files
earlier, which means they're available to KOBJ_ADD processing, which
necessitates switching all drivers to using attribute groups"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
scsi: lpfc: Update lpfc version to 14.0.0.3
scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss
scsi: lpfc: Fix link down processing to address NULL pointer dereference
scsi: lpfc: Allow PLOGI retry if previous PLOGI was aborted
scsi: lpfc: Fix use-after-free in lpfc_unreg_rpi() routine
scsi: lpfc: Correct sysfs reporting of loop support after SFP status change
scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset
scsi: lpfc: Revert LOG_TRACE_EVENT back to LOG_INIT prior to driver_resource_setup()
scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
scsi: ufs: mediatek: Avoid sched_clock() misuse
scsi: mpt3sas: Make mpt3sas_dev_attrs static
scsi: scsi_transport_sas: Add 22.5 Gbps link rate definitions
scsi: target: core: Stop using bdevname()
scsi: aha1542: Use memcpy_{from,to}_bvec()
scsi: sr: Add error handling support for add_disk()
scsi: sd: Add error handling support for add_disk()
scsi: target: Perform ALUA group changes in one step
scsi: target: Replace lun_tg_pt_gp_lock with rcu in I/O path
scsi: target: Fix alua_tg_pt_gps_count tracking
scsi: target: Fix ordered tag handling
...
The spec recommends that for transfer length larger than the max-single-cmd
attribute (bMAX_DATA_SIZE_FOR_HPB_SINGLE_CMD) it is possible to couple
pre-requests with the HPB-READ command. Being a recommendation, using
pre-requests can be perceived merely as a means of optimization. A common
practice was to send pre-requests for chunks within some interval, and
leave the READ10 untouched if larger.
Now that the pre-request flows have been removed, all the commands are
single commands. Properly handle this attribute and do not send HPB-READ
for transfer lengths larger than max-single-cmd.
[mkp: resolve conflict]
Fixes: 09d9e4d041 ("scsi: ufs: ufshpb: Remove HPB2.0 flows")
Link: https://lore.kernel.org/r/20211031123654.17719-1-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The partial UFS revert in 5.15 is needed for some additional fixes in
the 5.16 SCSI tree. Merge the fixes branch.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8MnsQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpuBpEACzrzbUfkTQ33bwF60mZQaqbR0ha7TrP/hp
oAqthmf1S2U+7mzXHQ+6MN7p4+TVPa/ITxQZtLTw7U/68+w68tTUZfZHJ5H6tSXu
92OHFDDP4ZeqATRTcJBij/5Si9BiKBHexMqeyVYPw0DWdEukAko9f7Z81GonFbTu
EIdIWivBc76bLiK/X3w7lhLcaNyUv9cKalwjbI4xtwcHtcIYj5d2jIc9PF2I9Xtl
3oqNT4GOSv7s3mW7syB1UEPrzbhVIzCSNbMSviCoK7GA5g8EN5KMEGQQoUJ942Zv
bHMjMpGrXsWebPto9maXycGY/9WsVcpNB7opyQRpyG8yDDZq0AFNJxD/NBMkQo4S
Sfp0fxpVXDRWu7zX0EktwGyOp4YNwfS6pDeAhqhnSl2uPWTsxGZ0kXvlMpR9Rt/t
TjEKZe6lmcC7s42rPVRBRw5HEzEsVovf0z4lyvC4M223CV3c5cuYkAAtCcqLdVWq
JkceHSb7EKu7QY6jf3sBud14HaAj+sub7kffOWhhAxObg3Ytsql61AGzbhandnxT
AtN3n9PBHNGmrSv4MiiuP+Dq5jeT5NspFkf1FvnRcfmZMJtH1VXHKr84JbAy4VHr
5cZoDJzL9Zm1d865f+VWkZeYd3b2kKP8C0dm6tAn4VweT6eb8bu6tgB7wFQwLIFK
aRxz5vQ1AQ==
=dLYJ
-----END PGP SIGNATURE-----
Merge tag 'for-5.16/passthrough-flag-2021-10-29' of git://git.kernel.dk/linux-block
Pull QUEUE_FLAG_SCSI_PASSTHROUGH removal from Jens Axboe:
"This contains a series leading to the removal of the
QUEUE_FLAG_SCSI_PASSTHROUGH queue flag"
* tag 'for-5.16/passthrough-flag-2021-10-29' of git://git.kernel.dk/linux-block:
block: remove blk_{get,put}_request
block: remove QUEUE_FLAG_SCSI_PASSTHROUGH
block: remove the initialize_rq_fn blk_mq_ops method
scsi: add a scsi_alloc_request helper
bsg-lib: initialize the bsg_job in bsg_transport_sg_io_fn
nfsd/blocklayout: use ->get_unique_id instead of sending SCSI commands
sd: implement ->get_unique_id
block: add a ->get_unique_id method
The Host Performance Buffer feature allows UFS read commands to carry the
physical media addresses along with the LBAs, thus allowing less internal
L2P-table switches in the device. HPB1.0 allowed a single LBA, while
HPB2.0 increases this capacity up to 255 blocks.
Carrying more than a single record, the read operation is no longer purely
of type "read" but a "hybrid" command: Writing the physical address to the
device in one operation and reading back the required payload in another.
The JEDEC HPB spec defines two commands for this operation:
HPB-WRITE-BUFFER (0x2) to write the physical addresses to device, and
HPB-READ to read the payload.
With the current HPB design the UFS driver has no alternative but to divide
the READ request into 2 separate commands: HPB-WRITE-BUFFER and HPB-READ.
This causes a great deal of aggravation to the block layer guys who
demanded that we completely revert the entire HPB driver regardless of the
huge amount of corporate effort already invested in it.
As a compromise, remove only the pieces that implement the 2.0
specification. This is done as a matter of urgency for the final 5.15
release.
Link: https://lore.kernel.org/r/20211030062301.248-1-avri.altman@wdc.com
Tested-by: Avri Altman <avri.altman@wdc.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Co-developed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
These are now pointless wrappers around blk_mq_{alloc,free}_request,
so remove them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20211025070517.1548584-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge the 5.15/scsi-fixes branch into the staging tree to resolve UFS
conflict reported by sfr.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In ufshpb, pm_runtime_{get,put}_sync() are used to avoid unwanted runtime
suspend during query requests. Whereas commit b294ff3e34 ("scsi: ufs:
core: Enable power management for wlun") modified the driver core to use
ufshcd_rpm_{get,put}_sync() APIs.
Switch to these APIs in HPB module as well.
Link: https://lore.kernel.org/r/20210902003534epcms2p1937a0f0eeb48a441cb69f5ef13ff8430@epcms2p1
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following parameters are not used in the function. Remove them.
*func(): ufshpb_set_hpb_read_to_upiu
-> struct ufshpb_lu *hpb
-> u32 lpn
Link: https://lore.kernel.org/r/20210901025617.31174-1-cw9316.lee@samsung.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: ChanWoo Lee <cw9316.lee@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When HPB pinned region exists and mctx allocation for this region fails, a
memory leak is possible because memory is not released for the subregion
table of the current region.
Free memory for the subregion table of the current region.
Link: https://lore.kernel.org/r/1891546521.01629711601304.JavaMail.epsvc@epcpadp3
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In host control mode, eviction is perceived as an extreme measure. There
are several conditions that both the entering and exiting regions should
meet, so that eviction will take place.
The common case however, is that those conditions are rarely met, so it is
normal that the act of eviction fails. Therefore, do not report an error
in host control mode if eviction fails.
Link: https://lore.kernel.org/r/20210808090024.21721-5-avri.altman@wdc.com
Fixes: 6c59cb501b (scsi: ufs: ufshpb: Make eviction depend on region's reads)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
'num_inflight_map_req' should not be negative. It is incremented and
decremented without any protection, allowing it theoretically to be
negative, should some weird unbalanced count occur.
Verify that the those calls are properly serialized.
Link: https://lore.kernel.org/r/20210808090024.21721-4-avri.altman@wdc.com
Fixes: 33845a2d84 (scsi: ufs: ufshpb: Limit the number of in-flight map requests)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The purpose of the "cold"-timer is not to hang-on to active regions with no
reads. Therefore the read timeout should be rewound on every read, and not
just when the region is activated.
Link: https://lore.kernel.org/r/20210808090024.21721-2-avri.altman@wdc.com
Fixes: 13c044e916 (scsi: ufs: ufshpb: Add "cold" regions timer)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For Micron UFS devices the L2P entry need to be byteswapped before sending
an HPB READ command to the UFS device. Add the quirk
UFS_DEVICE_QUIRK_SWAP_L2P_ENTRY_FOR_HPB_READ to address this.
Link: https://lore.kernel.org/r/20210804182128.458356-2-huobean@gmail.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Elaborate some more on the host control mode logic parameters, explaining
what they do and how to configure them.
Link: https://lore.kernel.org/r/20210712095039.8093-13-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Support devices that report they are using host control mode.
Link: https://lore.kernel.org/r/20210712095039.8093-12-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In host control mode the host is the originator of map requests. To not
flood the device with map requests, use a simple throttling mechanism that
limits the number of in-flight map requests.
Link: https://lore.kernel.org/r/20210712095039.8093-10-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In order not to hang on to "cold" regions, we inactivate a region that has
had no READ access for a predefined amount of time - READ_TO_MS. For that
purpose monitor the active regions list, polling it on every
POLLING_INTERVAL_MS. On timeout expiry add the region to the
"to-be-inactivated" list unless it is clean and did not exhaust its
READ_TO_EXPIRIES - another parameter.
None of this applies to pinned regions.
Link: https://lore.kernel.org/r/20210712095039.8093-9-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The spec does not define what the host's recommended response is when the
device sends HPB dev reset response (oper 0x2).
Update all active HPB regions.
Link: https://lore.kernel.org/r/20210712095039.8093-8-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In host mode, the host is expected to send HPB WRITE BUFFER with buffer-id
= 0x1 when it inactivates a region.
Use the map-requests pool as there is no point in assigning a designated
cache for umap-requests.
[mkp: REQ_OP_DRV_*]
Link: https://lore.kernel.org/r/20210712095039.8093-7-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In host mode, eviction is considered an extreme measure. Verify that the
entering region has enough reads, and the exiting region has fewer reads.
Link: https://lore.kernel.org/r/20210712095039.8093-6-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In host control mode, reads are the major source of activation trials.
Keep track of those reads counters, for both active as well inactive
regions.
We reset the read counter upon write - we are only interested in "clean"
reads.
Keep those counters normalized, as we are using those reads as a
comparative score, to make various decisions. If during consecutive
normalizations an active region has exhaust its reads - inactivate it.
While at it, protect the {active,inactive}_count stats by adding them into
the applicable handler.
Link: https://lore.kernel.org/r/20210712095039.8093-5-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Given a transfer length, set_dirty meticulously iterates over all the
entries, across subregions and regions if needed. Currently its only use is
to mark dirty blocks, but HCM may benefit from it as well to manage its
read counters.
Link: https://lore.kernel.org/r/20210712095039.8093-4-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In device control mode, the device may recommend the host to either
activate or inactivate a region, and the host should follow. Meaning those
are not actually recommendations, but more of instructions.
Conversely, in host control mode, the recommendation protocol is slightly
changed:
a) The device may only recommend the host to update a subregion of an
already-active region. And,
b) The device may *not* recommend to inactivate a region.
Furthermore, in host control mode, the host may choose not to follow any of
the device's recommendations. However, in case of a recommendation to
update an active and clean subregion, it is better to follow those
recommendation because otherwise the host has no other way to know that
some internal relocation took place.
Link: https://lore.kernel.org/r/20210712095039.8093-3-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
We will use control_mode later when we need to differentiate between device
and host control modes.
Link: https://lore.kernel.org/r/20210712095039.8093-2-avri.altman@wdc.com
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Version 2.0 of HBP supports reads of varying sizes from 4KB to 1MB.
A read operation <= 32KB is supported as single HPB read. A read between
36KB and 1MB is supported by a combination of write buffer command and HPB
read command to deliver more PPN. The write buffer commands may not be
issued immediately due to busy tags. To use HPB read more aggressively, the
driver can requeue the write buffer command. The requeue threshold is
implemented as timeout and can be modified with requeue_timeout_ms entry in
sysfs.
[mkp: REQ_OP_DRV_* and blk_rq_is_passthrough()]
Link: https://lore.kernel.org/r/20210712090025epcms2p3b3d94f6f1b2cfa394e3d9ba130ca0fa7@epcms2p3
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the logical address of a read I/O belongs to an active sub-region, the
HPB driver modifies the read I/O command to an HPB read. The driver
modifies the UFS UPIU instead of modifying the existing SCSI command.
In HPB version 1.0, the maximum read I/O size that can be converted to HPB
read is 4KB.
The dirty map of the active sub-region prevents an incorrect HPB read that
has stale physical page number which is updated by previous write I/O.
[mkp: REQ_OP_DRV_* and blk_rq_is_passthrough()]
Link: https://lore.kernel.org/r/20210712085936epcms2p4b0ec5c8cecdeea6cc043d684363842b6@epcms2p4
Tested-by: Bean Huo <beanhuo@micron.com>
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Implement L2P map management in HPB.
The HPB divides logical addresses into several regions. A region consists
of several sub-regions. The sub-region is a basic unit where L2P mapping is
managed. The driver loads L2P mapping data of each sub-region. The loaded
sub-region is called active-state. The HPB driver unloads L2P mapping data
as region unit. The unloaded region is called inactive-state.
Sub-region/region candidates to be loaded and unloaded are delivered from
the UFS device. The UFS device delivers the recommended active sub-region
and inactivate region to the driver using sense data. The HPB module
performs L2P mapping management on the host through the delivered
information.
A pinned region is a preset region on the UFS device that is always
in activate-state.
The data structures for map data requests and L2P mappings use the mempool
API, minimizing allocation overhead while avoiding static allocation.
The mininum size of the memory pool used in the HPB is implemented
as a module parameter so that it can be configurable by the user.
To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096.
The map_work manages active/inactive via 2 "to-do" lists:
- hpb->lh_inact_rgn: regions to be inactivated
- hpb->lh_act_srgn: subregions to be activated
These lists are maintained on I/O completion.
[mkp: switch to REQ_OP_DRV_*]
Link: https://lore.kernel.org/r/20210712085859epcms2p36e420f19564f6cd0c4a45d54949619eb@epcms2p3
Tested-by: Bean Huo <beanhuo@micron.com>
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Implement Host Performance Buffer (HPB) initialization and add function
calls to UFS core driver.
NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of I/O requests to the corresponding physical
addresses of the flash storage. In UFS, logical-to-physical-address (L2P)
map data, which is required to identify the physical address for the
requested I/Os, can only be partially stored in SRAM from NAND flash. Due
to this partial loading, accessing the flash address area, where the L2P
information for that address is not loaded in the SRAM, can result in
serious performance degradation.
The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both physical block address (PBA) and logical block address
(LBA) can be delivered in HPB read command. The HPB read command allows to
read data faster than a regular read command in UFS since it provides the
physical address (HPB Entry) of the desired logical block in addition to
its logical address. The UFS device can access the physical block in NAND
directly without searching and uploading L2P mapping table. This improves
read performance because the NAND read operation for uploading L2P mapping
table is removed.
In HPB initialization, the host checks if the UFS device supports HPB
feature and retrieves related device capabilities. Then, HPB parameters are
configured in the device.
Total start-up time of popular applications was measured and the difference
observed between HPB being enabled and disabled. Popular applications are
12 game apps and 24 non-game apps. Each test cycle consists of running 36
applications in sequence. We repeated the cycle for observing performance
improvement by L2P mapping cache hit in HPB.
The following is the test environment:
- kernel version: 4.4.0
- RAM: 8GB
- UFS 2.1 (64GB)
Results:
+-------+----------+----------+-------+
| cycle | baseline | with HPB | diff |
+-------+----------+----------+-------+
| 1 | 272.4 | 264.9 | -7.5 |
| 2 | 250.4 | 248.2 | -2.2 |
| 3 | 226.2 | 215.6 | -10.6 |
| 4 | 230.6 | 214.8 | -15.8 |
| 5 | 232.0 | 218.1 | -13.9 |
| 6 | 231.9 | 212.6 | -19.3 |
+-------+----------+----------+-------+
We also measured HPB performance using iozone:
$ iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16 -s $IO_RANGE/16 -F \
mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5 mnt/tmp_6 mnt/tmp_7 \
mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12 mnt/tmp_13 \
mnt/tmp_14 mnt/tmp_15 mnt/tmp_16
Results:
+----------+--------+---------+
| IO range | HPB on | HPB off |
+----------+--------+---------+
| 1 GB | 294.8 | 300.87 |
| 4 GB | 293.51 | 179.35 |
| 8 GB | 294.85 | 162.52 |
| 16 GB | 293.45 | 156.26 |
| 32 GB | 277.4 | 153.25 |
+----------+--------+---------+
Link: https://lore.kernel.org/r/20210712085830epcms2p8c1288b7f7a81b044158a18232617b572@epcms2p8
Reported-by: kernel test robot <lkp@intel.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Tested-by: Can Guo <cang@codeaurora.org>
Tested-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Acked-by: Avri Altman <Avri.Altman@wdc.com>
Signed-off-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>