WSL2-Linux-Kernel/drivers
Mike Christie 308cec14e6 [SCSI] libiscsi: Fix scsi command timeout oops in iscsi_eh_timed_out
Yanling Qi from LSI found the root cause of the panic. Below is his
analysis:

Problem description: the open-iscsi driver installs its eh_timed_out
handler in the blank_transport_template of the scsi middle level, which
causes a panic when a command on another host times out.

Here are the details

iSCSI session creation

During iSCSI session creation, iscsi_tcp_session_create() in
iscsi_tcp.c creates a scsi host for the session. See the statement
marked with the label A. Statement B replaces the shost->transportt
pointer with a file-local transport template (iscsi_tcp_scsi_transport).

static struct iscsi_cls_session *
iscsi_tcp_session_create(struct iscsi_endpoint *ep, uint16_t cmds_max,
                         uint16_t qdepth, uint32_t initial_cmdsn,
                         uint32_t *hostno)
{
        struct iscsi_cls_session *cls_session;
        struct iscsi_session *session;
        struct Scsi_Host *shost;
        int cmd_i;
        if (ep) {
                printk(KERN_ERR "iscsi_tcp: invalid ep %p.\n", ep);
                return NULL;
        }

A       shost = iscsi_host_alloc(&iscsi_sht, 0, qdepth);
        if (!shost)
                return NULL;
B       shost->transportt = iscsi_tcp_scsi_transport;
        shost->max_lun = iscsi_max_lun;

Please note that the scsi host is allocated by invoking
iscsi_host_alloc() in libiscsi.c.

Polluting the middle level blank_transport_template in
iscsi_host_alloc() of libiscsi.c

iscsi_host_alloc() invokes the middle level function scsi_host_alloc()
in hosts.c to allocate a scsi_host. The statement marked with C then
assigns the iscsi_eh_cmd_timed_out handler to the eh_timed_out callback.

struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
                                   int dd_data_size, uint16_t qdepth)
{
        struct Scsi_Host *shost;
        struct iscsi_host *ihost;

        shost = scsi_host_alloc(sht, sizeof(struct iscsi_host) + dd_data_size);
        if (!shost)
                return NULL;
C       shost->transportt->eh_timed_out = iscsi_eh_cmd_timed_out;

Please note that at this point shost->transportt is the middle level
blank_transport_template, as shown in the code segment below. We see two
problems here:
1. iscsi_eh_cmd_timed_out is installed in the shared
   blank_transport_template, which causes problems for every other host
   that uses that template.
2. iscsi_eh_cmd_timed_out is never invoked when an iSCSI command times
   out, because statement B resets the pointer.
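The two problems can be reproduced in a small user-space model. This is
only a sketch: the struct and function names below (transport_template,
host, host_init, buggy_iscsi_setup, iscsi_timed_out) are hypothetical
stand-ins, not the kernel definitions. It assumes only what the excerpts
show: every new host points at one shared default template, and the
handler is written through that pointer before the pointer is replaced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct transport_template {
        int (*eh_timed_out)(void *cmd);
};

struct host {
        struct transport_template *transportt;
};

/* One shared default template, playing the role of blank_transport_template. */
struct transport_template blank_template;
/* A driver-private template, playing the role of iscsi_tcp_scsi_transport. */
struct transport_template iscsi_template;

/* Placeholder handler; the real one dereferences iSCSI session data. */
int iscsi_timed_out(void *cmd)
{
        (void)cmd;
        return 1;
}

/* Mirrors statement D: every new host starts on the shared template. */
void host_init(struct host *h)
{
        assert(h != NULL);
        h->transportt = &blank_template;
}

/* Mirrors statements C and B in the buggy order. */
void buggy_iscsi_setup(struct host *h)
{
        host_init(h);
        /* C: writes through the pointer, so the shared template is modified. */
        h->transportt->eh_timed_out = iscsi_timed_out;
        /* B: the host switches templates and leaves the handler behind. */
        h->transportt = &iscsi_template;
}
```

After buggy_iscsi_setup() runs, any other host initialized with
host_init() sees iscsi_timed_out through the shared template, while the
iSCSI host's own template has no handler at all.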

Middle level blank_transport_template

In the middle level function scsi_host_alloc() in hosts.c, the middle
level assigns blank_transport_template to hosts that do not implement
their own transport layer. All HBAs that do not support a specific
scsi_transport share the middle level blank_transport_template. Please
see statement D.

struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int privsize)
{
        struct Scsi_Host *shost;
        gfp_t gfp_mask = GFP_KERNEL;
        int rval;

        if (sht->unchecked_isa_dma && privsize)
                gfp_mask |= __GFP_DMA;

        shost = kzalloc(sizeof(struct Scsi_Host) + privsize, gfp_mask);
        if (!shost)
                return NULL;

        shost->host_lock = &shost->default_lock;
        spin_lock_init(shost->host_lock);
        shost->shost_state = SHOST_CREATED;
        INIT_LIST_HEAD(&shost->__devices);
        INIT_LIST_HEAD(&shost->__targets);
        INIT_LIST_HEAD(&shost->eh_cmd_q);
        INIT_LIST_HEAD(&shost->starved_list);
        init_waitqueue_head(&shost->host_wait);

        mutex_init(&shost->scan_mutex);

        shost->host_no = scsi_host_next_hn++; /* XXX(hch): still racy */
        shost->dma_channel = 0xff;

        /* These three are default values which can be overridden */
        shost->max_channel = 0;
        shost->max_id = 8;
        shost->max_lun = 8;

        /* Give each shost a default transportt */
D       shost->transportt = &blank_transport_template;

Why we see a panic in iscsi_eh_cmd_timed_out()

The MPP virtual HBA doesn’t have a specific scsi_transport. Therefore,
the SCSI middle level assigns blank_transport_template to the virtual
host of the MPP virtual HBA. Please note that statement C has already
installed the iSCSI transport's eh_timed_out handler in
blank_transport_template. When an MPP virtual command times out, the
middle level scsi_times_out() function in scsi_error.c therefore invokes
iscsi_eh_cmd_timed_out() to handle it (statement E).

enum blk_eh_timer_return scsi_times_out(struct request *req)
{
        struct scsi_cmnd *scmd = req->special;
        enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
        enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;

        scsi_log_completion(scmd, TIMEOUT_ERROR);

        if (scmd->device->host->transportt->eh_timed_out)
E               eh_timed_out = scmd->device->host->transportt->eh_timed_out;
        else if (scmd->device->host->hostt->eh_timed_out)
                eh_timed_out = scmd->device->host->hostt->eh_timed_out;
        else
                eh_timed_out = NULL;

        if (eh_timed_out) {
                rtn = eh_timed_out(scmd);
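
The lookup order in the excerpt above can be isolated in a short
user-space sketch. The types and the names pick_timeout_handler,
own_handler and foreign_handler are hypothetical and exist only for
illustration. The sketch assumes nothing beyond the order shown in
scsi_times_out(): the transport template's handler wins over the host
template's handler.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*timeout_fn)(void *cmd);

/* Hypothetical stand-ins for the transport and host templates. */
struct transport_template {
        timeout_fn eh_timed_out;
};

struct host_template {
        timeout_fn eh_timed_out;
};

struct host {
        struct transport_template *transportt;
        struct host_template *hostt;
};

/* Placeholder handlers: one owned by the host's driver, one foreign. */
int own_handler(void *cmd)
{
        (void)cmd;
        return 0;
}

int foreign_handler(void *cmd)
{
        (void)cmd;
        return 1;
}

/* Same precedence as the excerpt: transport first, then host template. */
timeout_fn pick_timeout_handler(const struct host *h)
{
        assert(h != NULL && h->transportt != NULL && h->hostt != NULL);
        if (h->transportt->eh_timed_out)
                return h->transportt->eh_timed_out;
        if (h->hostt->eh_timed_out)
                return h->hostt->eh_timed_out;
        return NULL;
}
```

With an empty shared transport template the host's own handler is
chosen. As soon as another driver stores a handler in that shared
template, the foreign handler takes precedence for this host as well.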

It is easy to see why we get a panic in iscsi_eh_cmd_timed_out().
A scsi_cmnd from a non-iSCSI device cannot resolve to a valid session
and session->lock. The panic can happen anywhere during the
dereferencing.

static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *scmd)
{
        struct iscsi_cls_session *cls_session;
        struct iscsi_session *session;
        struct iscsi_conn *conn;
        enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED;

        cls_session = starget_to_session(scsi_target(scmd->device));
        session = cls_session->dd_data;

        debug_scsi("scsi cmd %p timedout\n", scmd);

        spin_lock(&session->lock);

This patch fixes the problem by moving the assignment of
iscsi_eh_cmd_timed_out to iscsi_add_host(), which runs after the LLDs
have set shost->transportt to their own transport template.
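
The effect of the reordering can be sketched in user space. This is not
the patch itself: the names below (host_alloc, add_host,
fixed_iscsi_setup, iscsi_timed_out and the two templates) are
hypothetical stand-ins, and the model assumes only the ordering
described above, where the handler is installed after the driver has
switched the host to its own template.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct transport_template {
        int (*eh_timed_out)(void *cmd);
};

struct host {
        struct transport_template *transportt;
};

struct transport_template blank_template;  /* shared default template */
struct transport_template iscsi_template;  /* driver-private template */

/* Placeholder for the iSCSI timeout handler. */
int iscsi_timed_out(void *cmd)
{
        (void)cmd;
        return 1;
}

/* Allocation only assigns the shared default; it installs no handler. */
void host_alloc(struct host *h)
{
        assert(h != NULL);
        h->transportt = &blank_template;
}

/* Stand-in for the add-host step: install the handler at this point. */
void add_host(struct host *h)
{
        assert(h != NULL && h->transportt != NULL);
        h->transportt->eh_timed_out = iscsi_timed_out;
}

/* Fixed order: allocate, switch to the private template, then add. */
void fixed_iscsi_setup(struct host *h)
{
        host_alloc(h);
        h->transportt = &iscsi_template;
        add_host(h);
}
```

In this order the write lands in the driver's own template, so the
shared default stays untouched and the iSCSI host actually has its
handler installed.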

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2009-02-10 11:15:19 -05:00
accessibility
acpi Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', 'ec', 'misc', 'printk' and 'processor' into release 2009-02-07 01:34:56 -05:00
amba [ARM] Fix realview build 2009-01-08 16:29:41 +00:00
ata Fix my email address in qd65xx.[ch]/pata_qdi.c 2009-02-03 16:53:56 -08:00
atm generic swap(): iphase: rename swap() to swap_byte_order() 2009-01-08 08:31:14 -08:00
auxdisplay
base driver-core: fix kernel-doc parameter name 2009-01-28 15:55:48 -08:00
block powerpc/ps3: Printing fixups for l64 to ll64 conversion drivers/block 2009-01-16 16:15:13 +11:00
bluetooth
cdrom
char sx.c: fix missed unlock_kernel() on error path in sx_fw_ioctl() 2009-02-05 12:56:48 -08:00
clocksource
connector
cpufreq [CPUFREQ] Make ignore_nice_load setting of ondemand work as expected. 2009-02-05 12:25:26 -05:00
cpuidle
crypto
dca dca: redesign locks to fix deadlocks 2009-02-02 23:26:57 -08:00
dio m68k: dio - Kill resource_size_t format warnings 2009-01-12 20:56:42 +01:00
dma i.MX31: Image Processing Unit DMA and IRQ drivers 2009-01-19 15:36:21 -07:00
edac powerpc: More printing warning fixes for the l64 to ll64 conversion 2009-01-28 17:15:52 +11:00
eisa
firewire firewire: core: Remove card from list of cards when enable fails 2009-02-01 11:17:24 +01:00
firmware DMI: Introduce dmi_first_match to make the interface more flexible 2009-01-27 02:15:47 -05:00
gpio gpiolib: fix request related issue 2009-01-29 18:04:43 -08:00
gpu i915: Fix more size_t format string warnings 2009-02-09 08:57:29 -08:00
hid HID: document difference between hid_blacklist and hid_ignore_list 2009-01-29 11:23:12 +01:00
hwmon lis3lv02d: add axes knowledge for HP 6710 2009-02-05 12:56:47 -08:00
i2c i2c: Move old eeprom driver to /drivers/misc/eeprom 2009-01-26 21:19:53 +01:00
ide Fix my email address in qd65xx.[ch]/pata_qdi.c 2009-02-03 16:53:56 -08:00
idle i7300_idle: struct device - replace bus_id with dev_name(), dev_set_name() 2009-01-06 10:44:39 -08:00
ieee1394 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 2009-02-06 08:48:16 -08:00
infiniband Merge branches 'ehca', 'ipoib' and 'mlx4' into for-linus 2009-01-16 15:05:54 -08:00
input input: PCF50633 input driver 2009-01-11 01:34:25 +01:00
isdn isdn: Fix missing ifdef in isdn_ppp 2009-01-26 12:24:38 -08:00
leds lis3lv02d: merge with leds hp disk 2009-01-15 16:39:40 -08:00
lguest lguest: Fix a memory leak with the lg object during launcher close 2009-01-30 11:34:11 +10:30
macintosh Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2009-01-07 11:31:52 -08:00
mca
md md: Ensure an md array never has too many devices. 2009-02-06 18:02:46 +11:00
media V4L/DVB (10411): s5h1409: Perform s5h1409 soft reset after tuning 2009-02-01 10:41:02 -02:00
memstick memstick: annotate endianness of attribute structs 2009-01-09 16:54:41 -08:00
message [SCSI] mpt fusion: Add Firmware debug support 2009-01-13 10:36:02 -06:00
mfd mfd: Remove non exported references from pcf50633 2009-01-15 11:50:58 +01:00
misc sgi-xp: fix writing past the end of kzalloc()'d space 2009-02-05 12:56:49 -08:00
mmc pxamci: enable DMA for write ops after CMD/RESP 2009-02-02 20:57:07 +01:00
mtd Merge master.kernel.org:/home/rmk/linux-2.6-arm 2009-02-03 16:52:10 -08:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2009-02-05 16:11:32 -08:00
nubus
of drivers/of: Add the of_find_i2c_device_by_node function. 2009-01-09 15:49:06 -07:00
oprofile oprofile: fix uninitialized use of struct op_entry 2009-01-17 17:26:39 +01:00
parisc Documentation: move DMA-mapping.txt to Doc/PCI/ 2009-01-29 18:19:29 -08:00
parport parport: ieee1284: use del_timer_sync() in parport_wait_event() 2009-01-06 15:59:31 -08:00
pci PCI PM: make the PM core more careful with drivers using the new PM framework 2009-02-04 17:22:35 -08:00
pcmcia powerpc: Change u64/s64 to a long long integer type 2009-01-13 14:47:59 +11:00
platform Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', 'ec', 'misc', 'printk' and 'processor' into release 2009-02-07 01:34:56 -05:00
pnp Merge branch 'linus' into release 2009-01-09 03:39:43 -05:00
power power_supply: pda_power: Don't request shared IRQs w/ IRQF_DISABLED 2009-01-26 02:09:26 +03:00
ps3 powerpc/ps3: Printing fixups for l64 to ll64 conversion drivers/ps3 2009-01-16 16:15:14 +11:00
rapidio rapidio: remove excess kernel-doc notation 2009-01-06 15:59:28 -08:00
regulator leds: Fix bounds checking of wm8350->pmic.led 2009-01-30 21:50:49 +00:00
rtc rtc-ds1390: fix compilation warnings in drivers/rtc/rtc-ds1390.c 2009-02-05 12:56:48 -08:00
s390 lcs: fix compilation for !CONFIG_IP_MULTICAST 2009-01-25 17:59:26 -08:00
sbus sparc64: Fix unsigned long long warnings in drivers. 2009-01-06 13:20:38 -08:00
scsi [SCSI] libiscsi: Fix scsi command timeout oops in iscsi_eh_timed_out 2009-02-10 11:15:19 -05:00
serial Add enable_ms to jsm driver 2009-01-30 08:40:54 -08:00
sh
sn
spi spi: Move at25 (for SPI eeproms) to /drivers/misc/eeprom 2009-01-26 21:19:54 +01:00
ssb
staging Staging: panel: fix lcd panel driver build failure 2009-02-09 11:26:18 -08:00
tc
telephony
thermal thermal: struct device - replace bus_id with dev_name(), dev_set_name() 2009-01-06 10:44:37 -08:00
uio UIO: Pass information about ioports to userspace (V2) 2009-01-06 10:44:44 -08:00
usb USB: Storage: Update unusual_devs entry for Datafab KECF-USB 2009-02-09 11:19:49 -08:00
uwb uwb: lock rc->rsvs_lock with spin_lock_bh() 2009-01-23 12:57:20 +00:00
video Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 2009-02-09 08:52:28 -08:00
virtio virtio-pci: do not oops on config change if driver not loaded 2009-02-02 19:17:56 -08:00
w1 w1: send status messages after command processing 2009-01-08 08:31:14 -08:00
watchdog [ARM] omap: watchdog: allow OMAP watchdog driver on OMAP34xx platforms 2009-01-24 16:48:42 +00:00
xen xen: make sysfs files behave as their names suggest 2009-01-29 13:20:36 +01:00
zorro m68k: zorro - Use %pR to print resources 2009-01-12 20:56:43 +01:00
Kconfig
Makefile Merge branch 'drivers-platform' into release 2009-01-09 04:56:56 -05:00