WSL2-Linux-Kernel/drivers/md
Gal Ofri 97ae27252f md/raid5: avoid device_lock in read_one_chunk()
There is lock contention on device_lock in read_one_chunk().
device_lock is taken to sync conf->active_aligned_reads and
conf->quiesce.
read_one_chunk() takes the lock, then waits for quiesce=0 (resumed)
before incrementing active_aligned_reads.
raid5_quiesce() takes the lock, sets quiesce=2 (in-progress), then waits
for active_aligned_reads to be zero before setting quiesce=1
(suspended).

Introduce a fast (lockless) path in read_one_chunk(): activate aligned
read without taking device_lock.  In case quiesce starts while
activating the aligned-read in fast path, deactivate it and revert to
old behavior (take device_lock and wait for quiesce to finish).

Add smp store/load in raid5_quiesce()/read_one_chunk() respectively to
guarantee that read_one_chunk() does not miss an ongoing quiesce.
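
The fast path described above can be illustrated with a small userspace
sketch. This is not the kernel code: struct conf_sketch,
activate_aligned_read_fast() and begin_quiesce() are hypothetical names,
and C11 sequentially consistent atomics stand in for the kernel's atomic
and smp store/load primitives. The slow path (take device_lock and wait
for quiesce to finish) is left to the caller.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two fields of the raid5 conf discussed above. */
struct conf_sketch {
	atomic_int active_aligned_reads;
	atomic_int quiesce;	/* 0 = resumed, 2 = in-progress, 1 = suspended */
};

/*
 * Try to activate an aligned read without taking any lock.
 * Returns true if the read was activated on the fast path.
 * Returns false if a quiesce was observed; the activation is then undone
 * and the caller is expected to fall back to the locked slow path.
 */
bool activate_aligned_read_fast(struct conf_sketch *c)
{
	/* Quiesce already in progress or done: do not even try. */
	if (atomic_load(&c->quiesce) != 0)
		return false;

	/* Publish the activation first ... */
	atomic_fetch_add(&c->active_aligned_reads, 1);

	/* ... then re-check, so a quiesce that started meanwhile is not missed. */
	if (atomic_load(&c->quiesce) != 0) {
		atomic_fetch_sub(&c->active_aligned_reads, 1);
		return false;
	}
	return true;
}

/*
 * Quiesce side: publish "in-progress" before looking at the reader count.
 * A real implementation would then wait for active_aligned_reads to reach
 * zero before setting quiesce to 1.
 */
void begin_quiesce(struct conf_sketch *c)
{
	atomic_store(&c->quiesce, 2);
}
```

The reader increments before re-checking quiesce and the quiescer stores
before checking the counter, so in this sketch at least one side should
observe the other. The default sequentially consistent ordering is used
here for simplicity rather than to mirror the barriers chosen in the
patch.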

My setups:
1. 8 local nvme drives (each up to 250k iops).
2. 8 ram disks (brd).

Each setup uses raid6 (6+2) with 1024 io threads on a system with 96
cpu cores (48 per socket). I recorded both iops and the cpu spent on
this contention with rand-read-4k, and bw with sequential-read-128k.
Note: in most cases the cpu is still busy, but due to "new" bottlenecks.

nvme:
              | iops           | cpu  | bw
-----------------------------------------------
without patch | 1.6M           | ~50% | 5.5GB/s
with patch    | 2M (throttled) | 0%   | 16GB/s (throttled)

ram (brd):
              | iops           | cpu  | bw
-----------------------------------------------
without patch | 2M             | ~80% | 24GB/s
with patch    | 4M             | 0%   | 55GB/s

CC: Song Liu <song@kernel.org>
CC: Neil Brown <neilb@suse.de>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Gal Ofri <gal.ofri@storing.io>
Signed-off-by: Song Liu <song@kernel.org>
2021-06-14 22:32:07 -07:00
bcache include: remove pagemap.h from blkdev.h 2021-05-06 19:24:11 -07:00
persistent-data dm space map common: fix division bug in sm_ll_find_free_block() 2021-04-19 12:48:13 -04:00
Kconfig md: mark some personalities as deprecated 2021-06-14 22:32:07 -07:00
Makefile
dm-bio-prison-v1.c
dm-bio-prison-v1.h
dm-bio-prison-v2.c
dm-bio-prison-v2.h
dm-bio-record.h block: store a block_device pointer in struct bio 2021-01-24 18:17:20 -07:00
dm-bufio.c dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size 2021-03-04 14:53:54 -05:00
dm-builtin.c
dm-cache-background-tracker.c
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c dm: use bdev_read_only to check if a device is read-only 2021-01-24 18:15:57 -07:00
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c dm cache: remove needless request_queue NULL pointer checks 2021-03-26 14:53:42 -04:00
dm-clone-metadata.c dm clone metadata: remove unused function 2021-04-19 13:20:31 -04:00
dm-clone-metadata.h
dm-clone-target.c dm-clone: use blkdev_issue_flush in commit_metadata 2021-01-27 09:51:48 -07:00
dm-core.h dm: fix deadlock when swapping to encrypted device 2021-02-11 09:45:28 -05:00
dm-crypt.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
dm-delay.c
dm-dust.c dm dust: remove h from printk format specifier 2021-02-03 10:10:04 -05:00
dm-ebs-target.c dm ebs: fix a few typos 2021-03-26 14:53:42 -04:00
dm-era-target.c dm era: only resize metadata in preresume 2021-02-11 09:45:22 -05:00
dm-exception-store.c
dm-exception-store.h
dm-flakey.c dm: simplify target code conditional on CONFIG_BLK_DEV_ZONED 2021-02-11 09:45:27 -05:00
dm-init.c
dm-integrity.c dm integrity: fix sparse warnings 2021-05-13 14:53:49 -04:00
dm-io.c block: Add bio_max_segs 2021-02-26 15:49:51 -07:00
dm-ioctl.c dm ioctl: filter the returned values according to name or uuid prefix 2021-03-26 14:53:41 -04:00
dm-kcopyd.c
dm-linear.c dm: simplify target code conditional on CONFIG_BLK_DEV_ZONED 2021-02-11 09:45:27 -05:00
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c block: Add bio_max_segs 2021-02-26 15:49:51 -07:00
dm-log.c
dm-mpath.c
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-ps-historical-service-time.c
dm-ps-io-affinity.c
dm-ps-queue-length.c
dm-ps-round-robin.c
dm-ps-service-time.c
dm-raid.c dm raid: remove unnecessary discard limits for raid0 and raid10 2021-04-30 14:38:37 -04:00
dm-raid1.c block: store a block_device pointer in struct bio 2021-01-24 18:17:20 -07:00
dm-region-hash.c
dm-rq.c dm rq: fix double free of blk_mq_tag_set in dev remove after table load fails 2021-04-30 14:19:08 -04:00
dm-rq.h
dm-snap-persistent.c dm: replace dm_vcalloc() 2021-04-19 13:13:26 -04:00
dm-snap-transient.c
dm-snap.c dm snapshot: fix crash with transient storage and zero chunk size 2021-05-13 14:42:52 -04:00
dm-stats.c
dm-stats.h
dm-stripe.c
dm-switch.c
dm-sysfs.c
dm-table.c dm: replace dm_vcalloc() 2021-04-19 13:13:26 -04:00
dm-target.c
dm-thin-metadata.c dm: use bdev_read_only to check if a device is read-only 2021-01-24 18:15:57 -07:00
dm-thin-metadata.h
dm-thin.c dm thin: remove needless request_queue NULL pointer check 2021-03-26 14:53:42 -04:00
dm-uevent.c
dm-uevent.h
dm-unstripe.c
dm-verity-fec.c dm verity fec: fix misaligned RS roots IO 2021-04-14 14:28:29 -04:00
dm-verity-fec.h dm verity fec: fix misaligned RS roots IO 2021-04-14 14:28:29 -04:00
dm-verity-target.c dm verity: allow only one error handling mode 2021-03-26 14:53:41 -04:00
dm-verity-verify-sig.c
dm-verity-verify-sig.h
dm-verity.h
dm-writecache.c dm writecache: fix flexible_array.cocci warnings 2021-03-26 14:53:41 -04:00
dm-zero.c
dm-zoned-metadata.c block: use an on-stack bio in blkdev_issue_flush 2021-01-27 09:51:48 -07:00
dm-zoned-reclaim.c
dm-zoned-target.c dm table: Fix zoned model check and zone sectors check 2021-03-22 12:32:31 -04:00
dm-zoned.h
dm.c dm: unexport dm_{get,put}_table_device 2021-03-26 14:53:42 -04:00
dm.h dm table: fix DAX iterate_devices based device capability checks 2021-02-09 08:45:30 -05:00
md-autodetect.c
md-bitmap.c md: Constify attribute_group structs 2021-06-14 22:32:07 -07:00
md-bitmap.h
md-cluster.c
md-cluster.h
md-faulty.c md: mark some personalities as deprecated 2021-06-14 22:32:07 -07:00
md-linear.c md: mark some personalities as deprecated 2021-06-14 22:32:07 -07:00
md-linear.h
md-multipath.c md: mark some personalities as deprecated 2021-06-14 22:32:07 -07:00
md-multipath.h
md.c md: add comments in md_integrity_register 2021-06-14 22:32:07 -07:00
md.h md: Constify attribute_group structs 2021-06-14 22:32:07 -07:00
raid0.c md: add io accounting for raid0 and raid5 2021-06-14 22:32:06 -07:00
raid0.h
raid1-10.c
raid1.c md/raid1: enable io accounting 2021-06-14 22:32:07 -07:00
raid1.h md/raid1: enable io accounting 2021-06-14 22:32:07 -07:00
raid5-cache.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
raid5-log.h
raid5-ppl.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
raid5.c md/raid5: avoid device_lock in read_one_chunk() 2021-06-14 22:32:07 -07:00
raid5.h
raid10.c md/raid10: enable io accounting 2021-06-14 22:32:07 -07:00
raid10.h md/raid10: enable io accounting 2021-06-14 22:32:07 -07:00