WSL2-Linux-Kernel/block
Yu Kuai 1340f02773 block, bfq: fix null pointer dereference in bfq_bio_bfqg()
[ Upstream commit f02be9002c ]

Out test found a following problem in kernel 5.10, and the same problem
should exist in mainline:

BUG: kernel NULL pointer dereference, address: 0000000000000094
PGD 0 P4D 0
Oops: 0000 [#1] SMP
CPU: 7 PID: 155 Comm: kworker/7:1 Not tainted 5.10.0-01932-g19e0ace2ca1d-dirty 4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-b4
Workqueue: kthrotld blk_throtl_dispatch_work_fn
RIP: 0010:bfq_bio_bfqg+0x52/0xc0
Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b
RSP: 0018:ffffc90001a1fba0 EFLAGS: 00010002
RAX: ffff888100d60400 RBX: ffff8881132e7000 RCX: 0000000000000000
RDX: 0000000000000017 RSI: ffff888103580a18 RDI: ffff888103580a18
RBP: ffff8881132e7000 R08: 0000000000000000 R09: ffffc90001a1fe10
R10: 0000000000000a20 R11: 0000000000034320 R12: 0000000000000000
R13: ffff888103580a18 R14: ffff888114447000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88881fdc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000094 CR3: 0000000100cdb000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 bfq_bic_update_cgroup+0x3c/0x350
 ? ioc_create_icq+0x42/0x270
 bfq_init_rq+0xfd/0x1060
 bfq_insert_requests+0x20f/0x1cc0
 ? ioc_create_icq+0x122/0x270
 blk_mq_sched_insert_requests+0x86/0x1d0
 blk_mq_flush_plug_list+0x193/0x2a0
 blk_flush_plug_list+0x127/0x170
 blk_finish_plug+0x31/0x50
 blk_throtl_dispatch_work_fn+0x151/0x190
 process_one_work+0x27c/0x5f0
 worker_thread+0x28b/0x6b0
 ? rescuer_thread+0x590/0x590
 kthread+0x153/0x1b0
 ? kthread_flush_work+0x170/0x170
 ret_from_fork+0x1f/0x30
Modules linked in:
CR2: 0000000000000094
---[ end trace e2e59ac014314547 ]---
RIP: 0010:bfq_bio_bfqg+0x52/0xc0
Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b
RSP: 0018:ffffc90001a1fba0 EFLAGS: 00010002
RAX: ffff888100d60400 RBX: ffff8881132e7000 RCX: 0000000000000000
RDX: 0000000000000017 RSI: ffff888103580a18 RDI: ffff888103580a18
RBP: ffff8881132e7000 R08: 0000000000000000 R09: ffffc90001a1fe10
R10: 0000000000000a20 R11: 0000000000034320 R12: 0000000000000000
R13: ffff888103580a18 R14: ffff888114447000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88881fdc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000094 CR3: 0000000100cdb000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Root cause is quite complex:

1) use bfq elevator for the test device.
2) create a cgroup CG
3) config blk throtl in CG

   blkg_conf_prep
    blkg_create

4) create a thread T1 and issue async io in CG:

   bio_init
    bio_associate_blkg
   ...
   submit_bio
    submit_bio_noacct
     blk_throtl_bio -> io is throttled
     // io submit is done

5) switch elevator:

   bfq_exit_queue
    blkcg_deactivate_policy
     list_for_each_entry(blkg, &q->blkg_list, q_node)
      blkg->pd[] = NULL
      // bfq policy is removed

5) thread t1 exist, then remove the cgroup CG:

   blkcg_unpin_online
    blkcg_destroy_blkgs
     blkg_destroy
      list_del_init(&blkg->q_node)
      // blkg is removed from queue list

6) switch elevator back to bfq

 bfq_init_queue
  bfq_create_group_hierarchy
   blkcg_activate_policy
    list_for_each_entry_reverse(blkg, &q->blkg_list)
     // blkg is removed from list, hence bfq policy is still NULL

7) throttled io is dispatched to bfq:

 bfq_insert_requests
  bfq_init_rq
   bfq_bic_update_cgroup
    bfq_bio_bfqg
     bfqg = blkg_to_bfqg(blkg)
     // bfqg is NULL because bfq policy is NULL

The problem is only possible in bfq because only bfq can be deactivated and
activated while queue is online, while others can only be deactivated while
the device is removed.

Fix the problem in bfq by checking if blkg is online before calling
blkg_to_bfqg().

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221108103434.2853269-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-02 17:41:02 +01:00
..
partitions block: drop unused includes in <linux/genhd.h> 2022-03-16 14:23:46 +01:00
Kconfig SCSI misc on 20210902 2021-09-02 15:09:46 -07:00
Kconfig.iosched Revert "block/mq-deadline: Add cgroup support" 2021-08-11 13:47:26 -06:00
Makefile block-5.15-2021-09-11 2021-09-11 10:19:51 -07:00
badblocks.c
bdev.c block: simplify the block device syncing code 2022-04-27 14:38:50 +02:00
bfq-cgroup.c block, bfq: fix null pointer dereference in bfq_bio_bfqg() 2022-12-02 17:41:02 +01:00
bfq-iosched.c block, bfq: protect 'bfqd->queued' by 'bfqd->lock' 2022-11-10 18:15:37 +01:00
bfq-iosched.h bfq: Get rid of __bio_blkcg() usage 2022-06-09 10:23:19 +02:00
bfq-wf2q.c block/bfq_wf2q: correct weight to ioprio 2022-04-08 14:23:55 +02:00
bio-integrity.c block: bio-integrity: Advance seed correctly for larger interval sizes 2022-02-08 18:34:05 +01:00
bio.c block: ensure iov_iter advances for added pages 2022-08-17 14:24:01 +02:00
blk-cgroup-rwstat.c
blk-cgroup-rwstat.h
blk-cgroup.c block: fix bio_clone_blkg_association() to associate with proper blkcg_gq 2022-06-09 10:23:32 +02:00
blk-core.c block: blk_queue_enter() / __bio_queue_enter() must return -EAGAIN for nowait 2022-09-23 14:15:48 +02:00
blk-crypto-fallback.c
blk-crypto-internal.h
blk-crypto.c blk-crypto: fix check for too-large dun_bytes 2021-08-25 06:45:00 -06:00
blk-exec.c block: return errors from blk_execute_rq() 2021-06-30 15:35:45 -06:00
blk-flush.c block: Fix fsync always failed if once failed 2022-01-27 11:05:25 +01:00
blk-integrity.c block: flush the integrity workqueue in blk_integrity_unregister 2021-09-14 20:03:30 -06:00
blk-ioc.c block: fix default IO priority handling again 2022-08-11 13:07:50 +02:00
blk-iocost.c block: don't allow the same type rq_qos add more than once 2022-08-17 14:24:24 +02:00
blk-iolatency.c block: don't allow the same type rq_qos add more than once 2022-08-17 14:24:24 +02:00
blk-ioprio.c block: Introduce the ioprio rq-qos policy 2021-06-21 15:03:40 -06:00
blk-ioprio.h block: Introduce the ioprio rq-qos policy 2021-06-21 15:03:40 -06:00
blk-lib.c block: export blk_next_bio() 2021-06-17 15:51:20 +02:00
blk-map.c block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern 2022-03-08 19:12:31 +01:00
blk-merge.c block: don't merge across cgroup boundaries if blkcg is enabled 2022-04-08 14:22:59 +02:00
blk-mq-cpumap.c
blk-mq-debugfs-zoned.c
blk-mq-debugfs.c blk-mq: don't create hctx debugfs dir until q->debugfs_dir is created 2022-08-17 14:23:12 +02:00
blk-mq-debugfs.h
blk-mq-pci.c
blk-mq-rdma.c
blk-mq-sched.c block: limit request dispatch loop duration 2022-04-08 14:22:59 +02:00
blk-mq-sched.h blk: Fix lock inversion between ioc lock and bfqd lock 2021-06-24 18:43:55 -06:00
blk-mq-sysfs.c block: remove blk-mq-sysfs dead code 2021-08-02 13:37:29 -06:00
blk-mq-tag.c blk-mq: avoid to iterate over stale request 2021-09-12 19:32:43 -06:00
blk-mq-tag.h blk-mq: Some tag allocation code refactoring 2021-05-24 06:47:22 -06:00
blk-mq-virtio.c
blk-mq.c blk-mq: fix io hung due to missing commit_rqs 2022-08-31 17:16:50 +02:00
blk-mq.h blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() 2021-12-01 09:04:56 +01:00
blk-pm.c scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() 2022-01-27 11:04:15 +01:00
blk-pm.h
blk-rq-qos.c rq-qos: fix missed wake-ups in rq_qos_throttle try two 2021-06-08 15:12:57 -06:00
blk-rq-qos.h block: don't allow the same type rq_qos add more than once 2022-08-17 14:24:24 +02:00
blk-settings.c block: Fix partition check for host-aware zoned block devices 2021-10-27 06:58:01 -06:00
blk-stat.c
blk-stat.h
blk-sysfs.c block: don't delete queue kobject before its children 2022-04-08 14:23:07 +02:00
blk-throttle.c blk-throttle: prevent overflow while calculating wait time 2022-10-26 12:35:47 +02:00
blk-timeout.c
blk-wbt.c blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init() 2022-10-26 12:35:54 +02:00
blk-wbt.h blk-wbt: introduce a new disable state to prevent false positive by rwb_enabled() 2021-06-21 15:03:41 -06:00
blk-zoned.c block: Hold invalidate_lock in BLKRESETZONE ioctl 2021-11-18 19:17:15 +01:00
blk.h block: bump max plugged deferred size from 16 to 32 2021-11-18 19:16:16 +01:00
bounce.c block: use memcpy_from_bvec in __blk_queue_bounce 2021-08-02 13:37:28 -06:00
bsg-lib.c scsi: bsg-lib: Fix commands without data transfer in bsg_transport_sg_io_fn() 2021-08-01 13:21:40 -04:00
bsg.c scsi: bsg: Fix device unregistration 2021-09-14 00:22:15 -04:00
disk-events.c block: return errors from disk_alloc_events 2021-08-23 12:55:45 -06:00
elevator.c block/wbt: fix negative inflight counter when remove scsi device 2022-02-23 12:03:15 +01:00
fops.c block: hold ->invalidate_lock in blkdev_fallocate 2021-09-24 11:06:58 -06:00
genhd.c block: Fix possible memory leak for rq_wb on add_disk failure 2022-11-10 18:15:36 +01:00
holder.c block: drop unused includes in <linux/genhd.h> 2022-03-16 14:23:46 +01:00
ioctl.c block/compat_ioctl: fix range check in BLKGETSIZE 2022-04-27 14:39:02 +02:00
ioprio.c block: fix default IO priority handling again 2022-08-11 13:07:50 +02:00
keyslot-manager.c
kyber-iosched.c kyber: avoid q->disk dereferences in trace points 2021-10-15 21:02:57 -06:00
mq-deadline.c block: fix async_depth sysfs interface for mq-deadline 2022-01-27 11:05:25 +01:00
opal_proto.h
sed-opal.c block: sed-opal: kmalloc the cmd/resp buffers 2022-11-26 09:24:35 +01:00
t10-pi.c block: use bvec_kmap_local in t10_pi_type1_{prepare,complete} 2021-08-02 13:37:28 -06:00