Commit graph

126473 commits

Author SHA1 Message Date
Dan Williams 52d7463433 [SCSI] isci: revert bcn filtering
The initial bcn filtering implementation was validated on a kernel
baseline that predated the switch to new libata error handling.  Also,
prior to that conversion we borrowed the mvsas MVS_DEV_EH approach to
prevent the unwanted extra ap->ops->phy_reset(ap) that occurred in the
ata_bus_probe() path.

After the conversion to new libata eh, resets at discovery are more
frequent and get filtered prematurely by IDEV_EH.  The result is that
our bcn filtering has been blocked from running at discovery, and it
appears to stall discovery completion to the point of triggering hung
task timeouts.  So, revert the implementation for now.  When it returns
it will go into libsas proper.

The domain rediscovery that takes place due to ->lldd_I_T_nexus_reset()
events should now be properly waited for by the ata_port_wait_eh() call
in ata_port_probe().  So the hard coded delay in the isci
->lldd_I_T_nexus_reset() and other libsas drivers should help debounce
the libsas thread from seeing temporary device removals.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:23:01 +04:00
Jeff Skirvin 8e35a1398c [SCSI] isci: Fix hard reset timeout conditions.
A hard reset can time out before or after the last phy in the
port goes away.  If after, then notify the OS that the last
phy has failed.

The recovery for the failed hard reset has been removed.
This recovery code was unnecessary in that the link would
recover from the failure normally by a new link reset sequence
or hotplug of the remote device.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:22:41 +04:00
Jeff Skirvin 5412e25c55 [SCSI] isci: No need to manage the pending reset bit on pending requests.
The lldd does not need to look at or manage the pending device
reset bit in pending sas_tasks.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:20:28 +04:00
Jeff Skirvin 3b34c169f8 [SCSI] isci: Remove redundant isci_request.ttype field.
Use the existing IREQ_TMF flag as a request type indicator.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:19:47 +04:00
Jeff Skirvin 98145cb722 [SCSI] isci: Fix task management for SMP, SATA and on dev remove.
libsas uses the LLDD abort task interface to handle I/O timeouts
in the SATA/STP and SMP discovery paths, so this change will terminate
STP/SMP requests. Also, if the device is gone, the lldd will prevent
libsas from further escalations in the error handler.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:17:48 +04:00
Jeff Skirvin db49c2d037 [SCSI] isci: No task_done callbacks in error handler paths.
libsas will clean up pending sas_tasks after error handler
path functions are called; do not call task_done callbacks.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:17:04 +04:00
Jeff Skirvin b343dff1a2 [SCSI] isci: Handle task request timeouts correctly.
In the case where "task" requests time out (note that this class of
requests can also include SATA/STP soft reset FIS transmissions),
handle the case where the task was being managed by some call to
terminate the task request by completing both the tmf and the aborting
process.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:16:23 +04:00
Jeff Skirvin d689168222 [SCSI] isci: Fix tag leak in tasks and terminated requests.
Make sure terminated requests and completed task tags are freed.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:16:04 +04:00
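
The tag-leak fix above comes down to releasing a tag on every path a request can leave by. The following is a minimal user-space sketch of that idea, not the isci code: the pool size, the bitmap allocator, and all names (`tag_alloc`, `tag_free`, `run_request`) are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

#define NUM_TAGS 4              /* hypothetical pool size */

static uint32_t tags_in_use;    /* bit i set => tag i is allocated */

/* Returns a free tag number, or -1 if the pool is exhausted. */
int tag_alloc(void)
{
    int i;

    for (i = 0; i < NUM_TAGS; i++) {
        if (!(tags_in_use & (1u << i))) {
            tags_in_use |= 1u << i;
            return i;
        }
    }
    return -1;
}

void tag_free(int tag)
{
    tags_in_use &= ~(1u << tag);
}

/* Counts the tags that are currently free. */
int tags_available(void)
{
    int i, n = 0;

    for (i = 0; i < NUM_TAGS; i++)
        if (!(tags_in_use & (1u << i)))
            n++;
    return n;
}

enum outcome { REQ_COMPLETED, REQ_TERMINATED };

/* Runs one request; returns 0 on success, -1 if no tag was available.
 * The tag is released on both the completed and the terminated path. */
int run_request(enum outcome how)
{
    int tag = tag_alloc();

    if (tag < 0)
        return -1;

    (void)how;          /* the outcome must not decide whether we free */
    tag_free(tag);
    return 0;
}
```

If the terminated path skipped `tag_free`, this four-entry pool would be exhausted after four terminated requests and every later `run_request` would fail.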
Jeff Skirvin c2cb8a5fd7 [SCSI] isci: Immediately fail I/O to removed devices.
In the case where an I/O fails to start in isci_request_execute,
only allow retries if the device is not already gone.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:15:17 +04:00
Jeff Skirvin 0e2e27990e [SCSI] isci: Lookup device references through requests in completions.
The LLDD needs to obtain a reference to the device through the request
itself and not through the domain_device, because the
domain_device.lldd_dev is set to NULL early in the lldd_dev_gone call.
This relies on the fact that the isci_remote_device object is keeping a
separate reference count of outstanding requests.  TODO: unify the
request count tracking with the isci_remote_device kref.

The failure signature of this condition looks like the following
log, where the important bits are the call to lldd_dev_gone followed
by a crash in isci_terminate_request_core:

[  229.151541] isci 0000:0b:00.0: isci_remote_device_gone: domain_device = ffff8801492d4800, isci_device = ffff880143c657d0, isci_port = ffff880143c63658
[  229.166007] isci 0000:0b:00.0: isci_remote_device_stop: isci_device = ffff880143c657d0
[  229.175317] isci 0000:0b:00.0: isci_terminate_pending_requests: idev=ffff880143c657d0 request=ffff88014741f000; task=ffff8801470f46c0 old_state=2
[  229.189702] isci 0000:0b:00.0: isci_terminate_request_core: device = ffff880143c657d0; request = ffff88014741f000
[  229.201339] isci 0000:0b:00.0: isci_terminate_request_core: before completion wait (ffff88014741f000/ffff880149715ad0)
[  229.213414] isci 0000:0b:00.0: sci_controller_process_completions: completion queue entry:0x8000a0e9
[  229.214401] BUG: unable to handle kernel NULL pointer dereference at 0000000000000228
[  229.214401] IP: [<ffffffffa00a58be>] sci_request_completed_state_enter+0x50/0xafb [isci]
[  229.214401] PGD 13d19e067 PUD 13d104067 PMD 0
[  229.214401] Oops: 0000 [#1] SMP
[  229.214401] CPU 0
[  229.214401] Modules linked in: ipv6 dm_multipath uinput nouveau snd_hda_codec_realtek snd_hda_intel ttm drm_kms_helper drm snd_hda_codec snd_hwdep snd_pcm snd_timer i2c_algo_bit isci snd libsas ioatdma mxm_wmi iTCO_wdt soundcore snd_page_alloc scsi_transport_sas iTCO_vendor_support wmi dca video i2c_i801 i2c_core [last unloaded: speedstep_lib]
[  229.214401]
[  229.214401] Pid: 5, comm: kworker/u:0 Not tainted 3.0.0-isci-11.7.29+ #30 Intel Corporation Stoakley/Pearlcity Workstation
[  229.214401] RIP: 0010:[<ffffffffa00a58be>]  [<ffffffffa00a58be>] sci_request_completed_state_enter+0x50/0xafb [isci]
[  229.214401] RSP: 0018:ffff88014fc03d20  EFLAGS: 00010046
[  229.214401] RAX: 0000000000000000 RBX: ffff88014741f000 RCX: 0000000000000000
[  229.214401] RDX: ffffffffa00b2c90 RSI: 0000000000000017 RDI: ffff88014741f0a0
[  229.214401] RBP: ffff88014fc03d90 R08: 0000000000000018 R09: 0000000000000000
[  229.214401] R10: 0000000000000000 R11: ffffffff81a17d98 R12: 000000000000001d
[  229.214401] R13: ffff8801470f46c0 R14: 0000000000000000 R15: 0000000000008000
[  229.214401] FS:  0000000000000000(0000) GS:ffff88014fc00000(0000) knlGS:0000000000000000
[  229.214401] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  229.214401] CR2: 0000000000000228 CR3: 000000013ceaa000 CR4: 00000000000406f0
[  229.214401] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  229.214401] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  229.214401] Process kworker/u:0 (pid: 5, threadinfo ffff880149714000, task ffff880149718000)
[  229.214401] Call Trace:
[  229.214401]  <IRQ>
[  229.214401]  [<ffffffffa00aa6ce>] sci_change_state+0x4a/0x4f [isci]
[  229.214401]  [<ffffffffa00a4ca6>] sci_io_request_tc_completion+0x79c/0x7a0 [isci]
[  229.214401]  [<ffffffffa00acf35>] sci_controller_process_completions+0x14f/0x396 [isci]
[  229.214401]  [<ffffffffa00abbda>] ? spin_lock_irq+0xe/0x10 [isci]
[  229.214401]  [<ffffffffa00ad2cf>] isci_host_completion_routine+0x71/0x2be [isci]
[  229.214401]  [<ffffffff8107c6b3>] ? mark_held_locks+0x52/0x70
[  229.214401]  [<ffffffff810538e8>] tasklet_action+0x90/0xf1
[  229.214401]  [<ffffffff81054050>] __do_softirq+0xe5/0x1bf
[  229.214401]  [<ffffffff8106d9d1>] ? hrtimer_interrupt+0x129/0x1bb
[  229.214401]  [<ffffffff814ff69c>] call_softirq+0x1c/0x30
[  229.214401]  [<ffffffff8100bb67>] do_softirq+0x4b/0xa3
[  229.214401]  [<ffffffff81053d84>] irq_exit+0x53/0xb4
[  229.214401]  [<ffffffff814fffe7>] smp_apic_timer_interrupt+0x83/0x91
[  229.214401]  [<ffffffff814fee53>] apic_timer_interrupt+0x13/0x20
[  229.214401]  <EOI>
[  229.214401]  [<ffffffff814f7ad4>] ? retint_restore_args+0x13/0x13
[  229.214401]  [<ffffffff8107af29>] ? trace_hardirqs_off+0xd/0xf
[  229.214401]  [<ffffffff8104ea71>] ? vprintk+0x40b/0x452
[  229.214401]  [<ffffffff814f4b5a>] printk+0x41/0x47
[  229.214401]  [<ffffffff81314484>] __dev_printk+0x78/0x7a
[  229.214401]  [<ffffffff8131471e>] dev_printk+0x45/0x47
[  229.214401]  [<ffffffffa00ae2a3>] isci_terminate_request_core+0x15d/0x317 [isci]
[  229.214401]  [<ffffffffa00af1ad>] isci_terminate_pending_requests+0x1a4/0x204 [isci]
[  229.214401]  [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
[  229.214401]  [<ffffffffa00a7d9e>] isci_remote_device_nuke_requests+0xa6/0xff [isci]
[  229.214401]  [<ffffffffa00a811a>] isci_remote_device_stop+0x7c/0x166 [isci]
[  229.214401]  [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
[  229.214401]  [<ffffffffa00a827a>] isci_remote_device_gone+0x76/0x7e [isci]
[  229.214401]  [<ffffffffa002363e>] sas_notify_lldd_dev_gone+0x34/0x36 [libsas]
[  229.214401]  [<ffffffffa0023945>] sas_unregister_dev+0x57/0x9c [libsas]
[  229.214401]  [<ffffffffa00239c0>] sas_unregister_domain_devices+0x36/0x65 [libsas]
[  229.214401]  [<ffffffffa0022cb8>] sas_deform_port+0x72/0x1ac [libsas]
[  229.214401]  [<ffffffffa00229f6>] ? sas_phye_oob_error+0xc3/0xc3 [libsas]
[  229.214401]  [<ffffffffa0022a34>] sas_phye_loss_of_signal+0x3e/0x42 [libsas]

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:14:44 +04:00
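
The lookup change described in the commit above can be illustrated with a small user-space sketch. All struct and function names here are hypothetical stand-ins, not the driver's definitions. The sketch only shows why a pointer carried by the request survives the point where the domain device's `lldd_dev` is cleared. As the commit notes, this is safe only because the device object is kept alive by a count of outstanding requests, which the sketch does not model.

```c
#include <stddef.h>

struct fake_idev {
    int id;
};

struct fake_domain_device {
    struct fake_idev *lldd_dev;     /* cleared early in the "gone" path */
};

struct fake_request {
    struct fake_domain_device *dev;
    struct fake_idev *target;       /* set once, when the request is built */
};

/* Stand-in for the early part of the device-gone handling. */
void fake_dev_gone(struct fake_domain_device *dev)
{
    dev->lldd_dev = NULL;
}

/* Fragile: yields NULL for completions that arrive after dev_gone. */
struct fake_idev *idev_via_domain_device(const struct fake_request *req)
{
    return req->dev->lldd_dev;
}

/* Robust: the request keeps its own pointer to the device. */
struct fake_idev *idev_via_request(const struct fake_request *req)
{
    return req->target;
}
```

A completion handler that used the first lookup after `fake_dev_gone` would dereference NULL, which matches the shape of the oops quoted in the commit message.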
Wayne Boyer 5a918353ec [SCSI] ipr: add definitions for additional adapter
Add the appropriate definition and table entry for an additional adapter.

Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:12:50 +04:00
Moger, Babu a18a920c70 [SCSI] scsi_dh: check queuedata pointer before proceeding further
This patch validates sdev pointer in scsi_dh_activate before proceeding further.

Without this check we might see the panic as below. I have seen this
panic multiple times.

Call trace:

 #0 [ffff88007d647b50] machine_kexec at ffffffff81020902
 #1 [ffff88007d647ba0] crash_kexec at ffffffff810875b0
 #2 [ffff88007d647c70] oops_end at ffffffff8139c650
 #3 [ffff88007d647c90] __bad_area_nosemaphore at ffffffff8102dd15
 #4 [ffff88007d647d50] page_fault at ffffffff8139b8cf
    [exception RIP: scsi_dh_activate+0x82]
    RIP: ffffffffa0041922  RSP: ffff88007d647e00  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 00000000000093c5
    RDX: 00000000000093c5  RSI: ffffffffa02e6640  RDI: ffff88007cc88988
    RBP: 000000000000000f   R8: ffff88007d646000   R9: 0000000000000000
    R10: ffff880082293790  R11: 00000000ffffffff  R12: ffff88007cc88988
    R13: 0000000000000000  R14: 0000000000000286  R15: ffff880037b845e0
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
 #5 [ffff88007d647e38] run_workqueue at ffffffff81060268
 #6 [ffff88007d647e78] worker_thread at ffffffff81060386
 #7 [ffff88007d647ee8] kthread at ffffffff81064436
 #8 [ffff88007d647f48] kernel_thread at ffffffff81003fba

Signed-off-by: Babu Moger <babu.moger@netapp.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-31 13:10:36 +04:00
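
The guard described in the scsi_dh commit above is a plain NULL check made before the first dereference. A minimal sketch, assuming hypothetical stand-in types (`fake_queue`, `fake_sdev`) and made-up return codes rather than the kernel's structures and errno values:

```c
#include <stddef.h>

struct fake_sdev {
    int handler_ready;
};

struct fake_queue {
    struct fake_sdev *queuedata;    /* may be NULL once the device is gone */
};

/* Returns 0 on success, -1 if no device is attached to the queue,
 * -2 if a device is attached but has no handler ready. */
int activate_sketch(struct fake_queue *q)
{
    struct fake_sdev *sdev = q->queuedata;

    if (sdev == NULL)
        return -1;      /* bail out before touching sdev */

    if (!sdev->handler_ready)
        return -2;

    return 0;
}
```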
Stephen M. Cameron a0c124137a [SCSI] hpsa: detect controller lockup
When controller lockup condition is detected,
we should fail all outstanding commands and disable
the controller.  This will enable multipath solutions
to recover gracefully.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:35:01 +04:00
Stephen M. Cameron bb158eabda [SCSI] hpsa: fix flush cache transfer length
We weren't filling in the transfer length of the
flush cache command (it transfers 4 bytes of zeroes).
Firmware didn't seem to be bothered by this, but it
should be fixed.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:34:27 +04:00
Scott Teel b7ec021fe6 [SCSI] hpsa: fix potential array overflow in hpsa_update_scsi_devices
The currentsd[] array in hpsa_update_scsi_devices had room for
256 devices.  The code was iterating over however many physical
and logical devices plus an additional number of possible external
MSA2XXX controllers, which together could potentially exceed 256.

We increased the size of the currentsd array to 1024 + 1024 + 32 + 1
elements to reflect a reasonable maximum possible number of devices
which might be encountered.  We also don't just walk off the end
of the array if the array controller reports more devices than we
are prepared to handle; we just ignore the excess devices.

Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:34:04 +04:00
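
The second half of the hpsa fix above, ignoring devices beyond the array's capacity instead of writing past its end, can be sketched as follows. This is an illustration under assumptions, not the driver's code: the capacity of 8 and the names `device_table` and `fill_device_table` are hypothetical (the commit sizes the real array at 1024 + 1024 + 32 + 1 elements).

```c
#include <stddef.h>

#define MAX_DEVICES_SKETCH 8    /* hypothetical capacity for the sketch */

struct device_table {
    int ids[MAX_DEVICES_SKETCH];
    size_t count;       /* entries actually stored */
    size_t ignored;     /* reported entries that did not fit */
};

/* Copies at most MAX_DEVICES_SKETCH reported ids into the table and
 * records how many were dropped, so the caller can log the overflow. */
void fill_device_table(struct device_table *t,
                       const int *reported, size_t nreported)
{
    size_t i;

    t->count = 0;
    t->ignored = 0;

    for (i = 0; i < nreported; i++) {
        if (t->count >= MAX_DEVICES_SKETCH) {
            t->ignored = nreported - i;
            break;
        }
        t->ids[t->count++] = reported[i];
    }
}
```

Checking the bound inside the loop keeps the function correct even when the count reported by the controller cannot be trusted.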
Scott Teel cfe5badcab [SCSI] hpsa: rename HPSA_MAX_SCSI_DEVS_PER_HBA
Rename HPSA_MAX_SCSI_DEVS_PER_HBA to HPSA_MAX_DEVICES

Signed-off-by: Scott Teel <scott.teel@hp.com>
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:16:38 +04:00
Stephen M. Cameron 03ab31f4c1 [SCSI] hpsa: remove unused busy_initializing and busy_scanning
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:09:59 +04:00
Stephen M. Cameron c0d6a4d17b [SCSI] hpsa: set max sectors instead of taking the default
Set the max hardware sectors in the SCSI host template to 8192
to allow for larger i/o's (8192 is the same limit the cciss
driver currently has.)

Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:09:24 +04:00
Andrew Vasquez d424754cbe [SCSI] qla2xxx: Correct inadvertent clearing of RISC_INTR status.
During heavy I/O (CPU-affinity mode enabled) and CLI/Agent
interactions, the driver would report periodic mailbox command
timeout statuses.  Within the CPU-affinity ISR handler, the
driver should check the 'disable-msix-handshake' flag in deciding
whether or not to clear HCCRX_CLR_RISC_INT.  The mode is not
specific to a dedicated queue; instead, it applies to the current
'ha' context.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:08:15 +04:00
Dave Jones a63ec37629 [SCSI] pmcraid: pmcraid_chr_ioctl uses incorrect argument order to kmalloc()
Size is 1st arg, not second.

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:06:58 +04:00
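
The argument-order mistake fixed in the pmcraid commit above is easy to reproduce outside the kernel. The following is a minimal user-space sketch, not the driver's code: `fake_kmalloc` and `FAKE_GFP_KERNEL` are hypothetical stand-ins that only mimic the `kmalloc(size, flags)` parameter order, and the flag value is arbitrary.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag value, chosen only for illustration. */
#define FAKE_GFP_KERNEL 0xd0u

/* Size requested by the most recent fake_kmalloc() call, recorded so the
 * effect of swapped arguments can be observed. */
size_t last_requested_size;

/* Hypothetical stand-in that mirrors the (size, flags) parameter order. */
void *fake_kmalloc(size_t size, unsigned int flags)
{
    (void)flags;
    last_requested_size = size;
    return malloc(size);
}

/* Wrong: the flags value lands in the size parameter, so the buffer is
 * FAKE_GFP_KERNEL bytes long regardless of what the caller wanted. */
void *alloc_swapped(size_t size)
{
    return fake_kmalloc(FAKE_GFP_KERNEL, (unsigned int)size);
}

/* Right: size first, flags second. */
void *alloc_correct(size_t size)
{
    return fake_kmalloc(size, FAKE_GFP_KERNEL);
}
```

Because both parameters are integer types, the compiler accepts the swapped call silently, which is why this class of bug tends to be found by review rather than by the build.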
Bhanu Prakash Gollapudi fd2541893d [SCSI] bnx2fc: Bumped version to 1.0.9
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:05:52 +04:00
Bhanu Prakash Gollapudi 32c3045450 [SCSI] bnx2fc: Handle SRR LS_ACC drop scenario
When SRR LS_ACC is dropped, the driver was not issuing ABTS for SRR when it
times out. Since the target received SRR, it was able to send the XFER_RDY and
the original IO request completed successfully. In this condition ABTS was
not sent during bnx2fc_srr_compl(). Fix this by first checking for ELS timeout
and issuing ABTS before checking if original IO request is complete.

Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 14:04:01 +04:00
Bhanu Prakash Gollapudi 99cc600cdd [SCSI] bnx2fc: Handle ABTS timeout during ulp timeout
If the IO and the corresponding ABTS are not responded to by a target, clean up the
IO and issue explicit logout when ulp timer expires while waiting for ABTS to
complete. Wait for the session to be ready before returning to the SCSI layer.
If the session is not ready let the SCSI-ml escalate the error recovery.

Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 13:28:55 +04:00
Petr Uzel c68bf8eeaa [SCSI] st: fix race in st_scsi_execute_end
The call to complete() in st_scsi_execute_end() wakes up sleeping thread
in write_behind_check(), which frees the st_request, thus invalidating
the pointer to the associated bio structure, which is then passed to the
blk_rq_unmap_user(). Fix by storing pointer to bio structure into
temporary local variable.

This bug has been present since at least linux-2.6.32.

CC: stable@kernel.org
Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
Reported-by: Juergen Groß <juergen.gross@ts.fujitsu.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 13:27:28 +04:00
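
The pattern behind the st fix above is: copy whatever you still need out of an object before you signal the party that may free it. A minimal single-threaded sketch of that ordering, with hypothetical names throughout (`fake_request`, `fake_bio`, `execute_end_sketch`) and a callback standing in for the woken thread:

```c
#include <stdlib.h>

struct fake_bio {
    int id;
};

struct fake_request {
    struct fake_bio *bio;
    /* Stand-in for complete(): the woken party may free the request. */
    void (*complete)(struct fake_request *req);
};

/* Models a waiter that frees the request as soon as it is woken. */
void free_on_complete(struct fake_request *req)
{
    free(req);
}

/* Returns the bio that should be handed to the later unmap step. */
struct fake_bio *execute_end_sketch(struct fake_request *req)
{
    struct fake_bio *bio = req->bio;    /* copy before signalling */

    req->complete(req);                 /* req may be freed from here on */

    return bio;                         /* safe: req is not touched again */
}
```

Reading `req->bio` after the `complete` call would be a use-after-free in this model, which corresponds to the invalid pointer the commit message describes.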
Bart Van Assche 3308511c93 [SCSI] Make scsi_free_queue() kill pending SCSI commands
Make sure that SCSI device removal via scsi_remove_host() does finish
all pending SCSI commands. Currently that's not the case and hence
removal of a SCSI host during I/O can cause a deadlock. See also
"blkdev_issue_discard() hangs forever if underlying storage device is
removed" (http://bugzilla.kernel.org/show_bug.cgi?id=40472). See also
http://lkml.org/lkml/2011/8/27/6.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: <stable@kernel.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 13:20:28 +04:00
Dave Kleikamp 21208ae5a2 [SCSI] sd: remove arbitrary SD_MAX_DISKS namespace limit
There is no reason to limit the SCSI disk namespace to sdXXX.

Add new error messages to sd_probe() in the unlikely event that either
ida_get_new() or sd_format_disk_name() fail.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:58:11 +04:00
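
The commit above does not spell out the naming scheme. The sketch below assumes the familiar progression sda ... sdz, sdaa ... sdzz, sdaaa and so on, and shows one way to generate such names for an arbitrary index. `format_disk_name_sketch` is a hypothetical helper written for this illustration; it is not the kernel's `sd_format_disk_name()`, and it assumes a character set in which 'a' to 'z' are contiguous.

```c
#include <stddef.h>
#include <string.h>

/* Writes prefix followed by a letter suffix for the given index into buf.
 * Returns 0 on success, -1 if buf is too small. */
int format_disk_name_sketch(const char *prefix, unsigned int index,
                            char *buf, size_t buflen)
{
    char suffix[16];                /* ample for any 32-bit index */
    size_t pos = sizeof(suffix);
    size_t plen = strlen(prefix);
    size_t slen;
    long long n = (long long)index;

    suffix[--pos] = '\0';

    /* Build the suffix right to left. The "- 1" makes this a bijective
     * base-26 numbering, so "z" is followed by "aa" rather than "ba". */
    do {
        if (pos == 0)
            return -1;
        suffix[--pos] = (char)('a' + (int)(n % 26));
        n = n / 26 - 1;
    } while (n >= 0);

    slen = sizeof(suffix) - pos;    /* includes the terminating NUL */
    if (plen + slen > buflen)
        return -1;

    memcpy(buf, prefix, plen);
    memcpy(buf + plen, suffix + pos, slen);
    return 0;
}
```

With this scheme the suffix simply grows by one letter whenever the shorter names run out, so no fixed upper bound on the number of disks follows from the name format itself.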
nagalakshmi.nandigama@lsi.com 6e88020025 [SCSI] mpt2sas: Bump driver version to 10.100.00.00
Bump driver version to 10.100.00.00

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:57:19 +04:00
nagalakshmi.nandigama@lsi.com 35116db95c [SCSI] mpt2sas: Fix for Panic when inactive volume is tried deleting
The driver was setting the action to MPI2_CONFIG_ACTION_PAGE_READ_CURRENT,
which only returns active volumes. In order to get info on inactive volumes,
the driver needs to change the action to
MPI2_RAID_PGAD_FORM_GET_NEXT_CONFIGNUM, and traverse each config until the
iocstatus MPI2_IOCSTATUS_CONFIG_INVALID_PAGE is returned.
Added a change in the driver to remove the instance of
sas_device object when the driver returns "1" from the slave_configure callback.
Also fixed code to report the hot spares to the operating system with a /dev/sg
assigned.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:56:55 +04:00
nagalakshmi.nandigama@lsi.com 6faace2a0e [SCSI] mpt2sas: Fix for issue Port Reset taking long time(around 5 mins) to complete while issued during creating a volume
This is due to the slave_configuration routine getting called while
host reset is active, when config page reads are failing, so the driver
attempts to add a device with stale config data.

To fix the issue, added error checking in slave_configure to check
for configuration pages failing, and return "1" so the device is
not configured.  The config pages are failing if raid volume is
configured while issuing a host reset, thus driver is reading stale
data and proceeding to attempt to add.  The fix is to return error
so the volume is not configured.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:55:23 +04:00
nagalakshmi.nandigama@lsi.com 918134efe9 [SCSI] mpt2sas: Fix for deadlock between hot plug worker threads and host reset context
This is due to driver reporting a device missing to the OS then the OS sending
a SYNC_CACHE request to driver while the IO queues are locked due to host reset.

To fix the issue, the driver will be waking up the port enable context
immediately when the driver receives the reply message, instead of waiting
on the hot plug worker threads.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:54:42 +04:00
nagalakshmi.nandigama@lsi.com f3db032f1a [SCSI] mpt2sas: Fix for dead lock occurring between host_lock and sas_device_lock
Fix for dead lock occurring between host_lock and sas_device_lock.

The deadlock is between two spin locks, between the shost->host_lock
and driver ioc->sas_device_lock.

The fix is to rearrange the code in the FW/Driver device removal
handshake so the ioc->sas_device_lock is not held when the
shost->host_lock is taken.

[jejb: zero initialise sas_address to fix spurious compiler warning]
Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:53:45 +04:00
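
The commit above does not say how the removal handshake was rearranged. One common way to avoid holding one lock while taking another is to snapshot the needed data under the first lock, drop it, and only then take the second. The sketch below illustrates that approach under stated assumptions: it is single-threaded, `fake_lock` merely tracks a held flag instead of being a real spinlock, and every name is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock: it only tracks whether the lock
 * is currently held, which is enough to detect nesting. */
struct fake_lock {
    bool held;
};

struct fake_lock host_lock;     /* stands in for shost->host_lock */
struct fake_lock device_lock;   /* stands in for ioc->sas_device_lock */

bool nested_seen;       /* set when a lock is taken while another is held */
int device_entry = 7;   /* data protected by device_lock */
int host_state;         /* data protected by host_lock */

void fake_lock_take(struct fake_lock *l)
{
    if (host_lock.held || device_lock.held)
        nested_seen = true;
    l->held = true;
}

void fake_lock_release(struct fake_lock *l)
{
    l->held = false;
}

/* Nested version: host_lock is taken while device_lock is still held.
 * If another path takes the two locks in the opposite order, the two
 * paths can deadlock. */
void remove_device_nested(void)
{
    fake_lock_take(&device_lock);
    fake_lock_take(&host_lock);
    host_state = device_entry;
    fake_lock_release(&host_lock);
    fake_lock_release(&device_lock);
}

/* Rearranged version: snapshot under device_lock, release it, then take
 * host_lock, so the two locks are never held at the same time. */
void remove_device_rearranged(void)
{
    int snapshot;

    fake_lock_take(&device_lock);
    snapshot = device_entry;
    fake_lock_release(&device_lock);

    fake_lock_take(&host_lock);
    host_state = snapshot;
    fake_lock_release(&host_lock);
}
```

The trade-off is that the snapshot may be stale by the time the second lock is taken, so this only fits cases where acting on a slightly old value is acceptable.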
nagalakshmi.nandigama@lsi.com f881ceadd4 [SCSI] mpt2sas: Fix drives not getting properly deleted if sas cable is removed while host reset is active
The fix is in the driver-firmware handshake device removal code. We
need to read the controller ioc_state to see if controller is OPERATIONAL
prior to sending target reset and OP_REMOVE. Previously it was checking
the ioc->shost_recovery flag, which is always set when host reset is
active, thus preventing drives from getting properly deleted.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:48:34 +04:00
nagalakshmi.nandigama@lsi.com 24f09b598d [SCSI] mpt2sas: Fix failure message displayed during diag reset
The fix is to inhibit the warning message in _scsih_get_sas_address
when the MPI2_IOCSTATUS_CONFIG_INVALID_PAGE ioc status is returned.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:47:22 +04:00
nagalakshmi.nandigama@lsi.com 0167ac67ff [SCSI] mpt2sas: Fix for system hang when discovery in progress
Fix for issue : While discovery is in progress, hot unplug and hot plug of
enclosure connected to the controller card is causing system to hang.

When a device is in the process of being detected at driver load time then
if it is removed, the device that is no longer present will not be added
to the list. So the code in _scsih_probe_sas() is rearranged so that
the devices that failed to be detected are not added to the list.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:46:26 +04:00
nagalakshmi.nandigama@lsi.com 921cd8024b [SCSI] mpt2sas: New feature - Fast Load Support
New feature Fast Load Support.

(1) Asynchronous SCSI scanning: This will allow the drivers to scan
for devices in parallel while other device drivers are loading at
the same time. This will improve the amount of time it takes for the
OS to load.

(2) Reporting Devices while port enable is active: This feature will
allow devices to be reported to OS immediately while port enable is
active. The previous implementation waits for port enable to complete,
and then reports devices. This feature is only enabled on IT firmware
configurations when there is no boot device configured in BIOS Configuration
Utility, else the driver will wait till port enable completes before reporting
devices. For IR firmware, this feature is turned off. This feature is to
address large SAS topologies (>100 drives) when the boot OS is using onboard
SATA device, in other words, the boot device is not
connected to our controller.

(3) Scanning for devices after diagnostic reset completes: A new routine
_scsih_scan_start is added. This will scan the expander pages, IR pages,
and sas device pages, then report new devices to SCSI Mid layer. It
seems the driver is not supporting adding devices while diagnostic reset
is active. Apparently this is due to the sanity checks on
ioc->shost_recovery flag throughout the context of kernel work thread FIFO,
and the mpt2sas_fw_work.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:35:57 +04:00
nagalakshmi.nandigama@lsi.com f9d979ce10 [SCSI] mpt2sas: MPI next revision header update
1) Added ProxyVF_ID field to Configuration Request message.
2) Added IO Unit Page 8, IO Unit Page 9, and IO Unit Page 10.
3) Added SASNotifyPrimitiveMasks field to IOC Page 7.
4) Added SAS NOTIFY Primitive event.
5) Added Temperature Threshold Event.
6) Added Host Message Event.
7) Added Send Host Message request and reply.

Signed-off-by: Nagalakshmi Nandigama <nagalakshmi.nandigama@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-10-30 12:30:52 +04:00
Linus Torvalds ec7ae51753 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits)
  [SCSI] qla4xxx: export address/port of connection (fix udev disk names)
  [SCSI] ipr: Fix BUG on adapter dump timeout
  [SCSI] megaraid_sas: Fix instance access in megasas_reset_timer
  [SCSI] hpsa: change confusing message to be more clear
  [SCSI] iscsi class: fix vlan configuration
  [SCSI] qla4xxx: fix data alignment and use nl helpers
  [SCSI] iscsi class: fix link local mispelling
  [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
  [SCSI] aacraid: use lower snprintf() limit
  [SCSI] lpfc 8.3.27: Change driver version to 8.3.27
  [SCSI] lpfc 8.3.27: T10 additions for SLI4
  [SCSI] lpfc 8.3.27: Fix queue allocation failure recovery
  [SCSI] lpfc 8.3.27: Change algorithm for getting physical port name
  [SCSI] lpfc 8.3.27: Changed worst case mailbox timeout
  [SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes
  [SCSI] megaraid_sas: Changelog and version update
  [SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic
  [SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support
  [SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers
  [SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts
  ...
2011-10-28 16:44:18 -07:00
Linus Torvalds 97d2eb13a0 Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-client
* 'for-linus' of git://ceph.newdream.net/git/ceph-client:
  libceph: fix double-free of page vector
  ceph: fix 32-bit ino numbers
  libceph: force resend of osd requests if we skip an osdmap
  ceph: use kernel DNS resolver
  ceph: fix ceph_monc_init memory leak
  ceph: let the set_layout ioctl set single traits
  Revert "ceph: don't truncate dirty pages in invalidate work thread"
  ceph: replace leading spaces with tabs
  libceph: warn on msg allocation failures
  libceph: don't complain on msgpool alloc failures
  libceph: always preallocate mon connection
  libceph: create messenger with client
  ceph: document ioctls
  ceph: implement (optional) max read size
  ceph: rename rsize -> rasize
  ceph: make readpages fully async
2011-10-28 16:42:18 -07:00
Linus Torvalds 68d99b2c8e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (549 commits)
  ALSA: hda - Fix ADC input-amp handling for Cx20549 codec
  ALSA: hda - Keep EAPD turned on for old Conexant chips
  ALSA: hda/realtek - Fix missing volume controls with ALC260
  ASoC: wm8940: Properly set codec->dapm.bias_level
  ALSA: hda - Fix pin-config for ASUS W90V
  ALSA: hda - Fix surround/CLFE headphone and speaker pins order
  ALSA: hda - Fix typo
  ALSA: Update the sound git tree URL
  ALSA: HDA: Add new revision for ALC662
  ASoC: max98095: Convert codec->hw_write to snd_soc_write
  ASoC: keep pointer to resource so it can be freed
  ASoC: sgtl5000: Fix wrong mask in some snd_soc_update_bits calls
  ASoC: wm8996: Fix wrong mask for setting WM8996_AIF_CLOCKING_2
  ASoC: da7210: Add support for line out and DAC
  ASoC: da7210: Add support for DAPM
  ALSA: hda/realtek - Fix DAC assignments of multiple speakers
  ASoC: Use SGTL5000_LINREG_VDDD_MASK instead of hardcoded mask value
  ASoC: Set sgtl5000->ldo in ldo_regulator_register
  ASoC: wm8996: Use SND_SOC_DAPM_AIF_OUT for AIF2 Capture
  ASoC: wm8994: Use SND_SOC_DAPM_AIF_OUT for AIF3 Capture
  ...
2011-10-28 14:25:01 -07:00
Linus Torvalds 0e59e7e7fe Merge branch 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
* 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
  PCI: Clean-up MPS debug output
  pci: Clamp pcie_set_readrq() when using "performance" settings
  PCI: enable MPS "performance" setting to properly handle bridge MPS
  PCI: Workaround for Intel MPS errata
  PCI: Add support for PASID capability
  PCI: Add implementation for PRI capability
  PCI: Export ATS functions to modules
  PCI: Move ATS implementation into own file
  PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
  PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
  PCI / PM: Extend PME polling to all PCI devices
  PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
  PCI: Make pci_setup_bridge() non-static for use by arch code
  x86: constify PCI raw ops structures
  PCI: Add quirk for known incorrect MPSS
  PCI: Add Solarflare vendor ID and SFC4000 device IDs
2011-10-28 14:20:44 -07:00
Linus Torvalds 46b51ea209 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (83 commits)
  mmc: fix compile error when CONFIG_BLOCK is not enabled
  mmc: core: Cleanup eMMC4.5 conditionals
  mmc: omap_hsmmc: if multiblock reads are broken, disable them
  mmc: core: add workaround for controllers with broken multiblock reads
  mmc: core: Prevent too long response times for suspend
  mmc: recognise SDIO cards with SDIO_CCCR_REV 3.00
  mmc: sd: Handle SD3.0 cards not supporting UHS-I bus speed mode
  mmc: core: support HPI send command
  mmc: core: Add cache control for eMMC4.5 device
  mmc: core: Modify the timeout value for writing power class
  mmc: core: new discard feature support at eMMC v4.5
  mmc: core: mmc sanitize feature support for v4.5
  mmc: dw_mmc: modify DATA register offset
  mmc: sdhci-pci: add flag for devices that can support runtime PM
  mmc: omap_hsmmc: ensure pbias configuration is always done
  mmc: core: Add Power Off Notify Feature eMMC 4.5
  mmc: sdhci-s3c: fix potential NULL dereference
  mmc: replace printk with appropriate display macro
  mmc: core: Add default timeout value for CMD6
  mmc: sdhci-pci: add runtime pm support
  ...
2011-10-28 14:16:11 -07:00
Linus Torvalds 1fdb24e969 Merge branch 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm
* 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: (178 commits)
  ARM: 7139/1: fix compilation with CONFIG_ARM_ATAG_DTB_COMPAT and large TEXT_OFFSET
  ARM: gic, local timers: use the request_percpu_irq() interface
  ARM: gic: consolidate PPI handling
  ARM: switch from NO_MACH_MEMORY_H to NEED_MACH_MEMORY_H
  ARM: mach-s5p64x0: remove mach/memory.h
  ARM: mach-s3c64xx: remove mach/memory.h
  ARM: plat-mxc: remove mach/memory.h
  ARM: mach-prima2: remove mach/memory.h
  ARM: mach-zynq: remove mach/memory.h
  ARM: mach-bcmring: remove mach/memory.h
  ARM: mach-davinci: remove mach/memory.h
  ARM: mach-pxa: remove mach/memory.h
  ARM: mach-ixp4xx: remove mach/memory.h
  ARM: mach-h720x: remove mach/memory.h
  ARM: mach-vt8500: remove mach/memory.h
  ARM: mach-s5pc100: remove mach/memory.h
  ARM: mach-tegra: remove mach/memory.h
  ARM: plat-tcc: remove mach/memory.h
  ARM: mach-mmp: remove mach/memory.h
  ARM: mach-cns3xxx: remove mach/memory.h
  ...

Fix up mostly pretty trivial conflicts in:
 - arch/arm/Kconfig
 - arch/arm/include/asm/localtimer.h
 - arch/arm/kernel/Makefile
 - arch/arm/mach-shmobile/board-ap4evb.c
 - arch/arm/mach-u300/core.c
 - arch/arm/mm/dma-mapping.c
 - arch/arm/mm/proc-v7.S
 - arch/arm/plat-omap/Kconfig
largely due to some CONFIG option renaming (i.e. CONFIG_PM_SLEEP ->
CONFIG_ARM_CPU_SUSPEND for the arm-specific suspend code etc) and
addition of NEED_MACH_MEMORY_H next to HAVE_IDE.
2011-10-28 12:02:27 -07:00
Linus Torvalds 37be944a02 Merge branch 'drm-core-next' of git://people.freedesktop.org/~airlied/linux
* 'drm-core-next' of git://people.freedesktop.org/~airlied/linux: (290 commits)
  Revert "drm/ttm: add a way to bo_wait for either the last read or last write"
  Revert "drm/radeon/kms: add a new gem_wait ioctl with read/write flags"
  vmwgfx: Don't pass unused arguments to do_dirty functions
  vmwgfx: Emulate depth 32 framebuffers
  drm/radeon: Lower the severity of the radeon lockup messages.
  drm/i915/dp: Fix eDP on PCH DP on CPT/PPT
  drm/i915/dp: Introduce is_cpu_edp()
  drm/i915: use correct SPD type value
  drm/i915: fix ILK+ infoframe support
  drm/i915: add DP test request handling
  drm/i915: read full receiver capability field during DP hot plug
  drm/i915/dp: Remove eDP special cases from bandwidth checks
  drm/i915/dp: Fix the math in intel_dp_link_required
  drm/i915/panel: Always record the backlight level again (but cleverly)
  i915: Move i915_read/write out of line
  drm/i915: remove transcoder PLL mashing from mode_set per specs
  drm/i915: if transcoder disable fails, say which
  drm/i915: set watermarks for third pipe on IVB
  drm/i915: export a CPT mode set verification function
  drm/i915: fix transcoder PLL select masking
  ...
2011-10-28 05:54:23 -07:00
Linus Torvalds 8e6d539e0f Merge branch 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-rdrand-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, random: Verify RDRAND functionality and allow it to be disabled
  x86, random: Architectural inlines to get random integers with RDRAND
  random: Add support for architectural random hooks

Fix up trivial conflicts in drivers/char/random.c: the architectural
random hooks touched "get_random_int()" that was simplified to use MD5
and not do the keyptr thing any more (see commit 6e5714eaf77d: "net:
Compute protocol sequence numbers and fragment IDs using MD5").
2011-10-28 05:29:07 -07:00
Linus Torvalds 8237eb946a Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode, AMD: Add microcode revision to /proc/cpuinfo
  x86, microcode: Correct microcode revision format
  coretemp: Get microcode revision from cpu_data
  x86, intel: Use c->microcode for Atom errata check
  x86, intel: Output microcode revision in /proc/cpuinfo
  x86, microcode: Don't request microcode from userspace unnecessarily

Fix up trivial conflicts in arch/x86/kernel/cpu/amd.c (conflict between
moving AMD BSP code to cpu_dev helper function and adding AMD microcode
revision to /proc/cpuinfo code)
2011-10-28 05:14:48 -07:00
Linus Torvalds cc21fe518a Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Hyper-V: Integrate the clocksource with Hyper-V detection code

Fix up conflicts in drivers/staging/hv/Makefile manually (some of the hv
code has moved out of staging to drivers/hv/)
2011-10-28 05:08:40 -07:00
Linus Torvalds a93f3e9f42 Merge branch 'x86-geode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-geode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: geode: New PCEngines Alix system driver
2011-10-28 05:04:26 -07:00
Jon Mason a513a99a7c PCI: Clean-up MPS debug output
Clean up the MPS debug output so that it is a single, aligned line,
making it more readable on systems with a large number of buses and
devices.

Suggested by Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:44 -07:00
Benjamin Herrenschmidt a1c473aa11 pci: Clamp pcie_set_readrq() when using "performance" settings
When configuring the PCIe settings for "performance", we allow parents
to have a larger Max Payload Size (MPS) than their children and rely on
each child's Max Read Request Size not being larger than its own MPS, so
that the host bridge never generates responses the child can't cope with.

However, various drivers in Linux call pcie_set_readrq() with arbitrary
values, assuming this to be a simple performance tweak. This breaks
under our "performance" configuration.

Fix that by making sure the value programmed by pcie_set_readrq() is
never larger than the configured MPS for that device.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:44 -07:00
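The clamping described in the pcie_set_readrq() commit above can be sketched in standalone C as follows. This is an illustration only, not the kernel's code: `enum bus_config`, `is_power_of_two()` and `clamp_readrq()` are hypothetical names, and the 128..4096 power-of-two range check is an assumption based on the sizes PCIe allows for read requests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's bus-wide PCIe tuning mode. */
enum bus_config { BUS_DEFAULT, BUS_PERFORMANCE };

static bool is_power_of_two(int v)
{
	return v > 0 && (v & (v - 1)) == 0;
}

/*
 * Return the Max Read Request Size (in bytes) that would be programmed,
 * or -1 if the request is not a power of two in the range 128..4096.
 * mps is the device's already configured Max Payload Size in bytes.
 */
static int clamp_readrq(enum bus_config mode, int mps, int rq)
{
	/* The configured MPS is assumed to be a valid power of two. */
	assert(is_power_of_two(mps));

	if (rq < 128 || rq > 4096 || !is_power_of_two(rq))
		return -1;

	/* In "performance" mode the request must never exceed the MPS. */
	if (mode == BUS_PERFORMANCE && rq > mps)
		rq = mps;

	return rq;
}
```

With this sketch, a driver asking for 4096 bytes on a device whose MPS is 256 would end up with 256 in "performance" mode and with the unmodified 4096 otherwise.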
Jon Mason 62f392ea5b PCI: enable MPS "performance" setting to properly handle bridge MPS
Rework the "performance" MPS option to configure the device MPS with the
smaller of the device MPSS or the bridge MPS (which is assumed to be
properly configured at this point to the largest allowable MPS based on
its parent bus).

Also, rework the MRRS setting to report an inability to set the MRRS to
a valid setting.

Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-10-27 12:45:43 -07:00
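The MPS selection rule from the "performance" commit above (the smaller of the device MPSS and the bridge MPS) can be sketched as below. The function names `mpss_to_bytes()` and `performance_mps()` are hypothetical, and the sketch assumes the usual PCIe encoding in which an MPSS field value of n means a supported payload of 128 << n bytes. It is not the kernel's implementation.

```c
#include <assert.h>

/* Decode the MPSS capability field: 0 -> 128 bytes ... 5 -> 4096 bytes. */
static int mpss_to_bytes(int mpss)
{
	assert(mpss >= 0 && mpss <= 5);
	return 128 << mpss;
}

/*
 * Pick the device MPS for the "performance" setting: the smaller of what
 * the device supports and what its upstream bridge is configured for.
 * bridge_mps is given in bytes and is assumed to be already valid.
 */
static int performance_mps(int dev_mpss, int bridge_mps)
{
	int dev_max = mpss_to_bytes(dev_mpss);

	return dev_max < bridge_mps ? dev_max : bridge_mps;
}
```

For example, a device that supports up to 256 bytes behind a bridge configured for 512 bytes would be set to 256, while a device that supports 4096 bytes behind the same bridge would be limited to 512.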