When a cpu really is stuck in the kernel, it can often be
impossible to figure out which cpu is stuck where. The
worst case is when the stuck cpu has interrupts disabled.
Therefore, implement a global cpu state capture that uses
SMP message interrupts which are not disabled by the
normal IRQ enable/disable APIs of the kernel.
As long as we can get a sysrq 'y' to the kernel, we can
get a dump. Even if the console interrupt cpu is wedged,
we can trigger it from userspace using /proc/sysrq-trigger.
The output is made compact so that this facility is more
useful on high cpu count systems, which is where this
facility will likely find itself the most useful :)
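For illustration only, a minimal userspace sketch of the
/proc/sysrq-trigger path mentioned above. The helper name
sysrq_trigger() and its path parameter are assumptions made for this
example; on a live system the path would be /proc/sysrq-trigger, the
caller would need root, and the kernel must have sysrq enabled.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a single sysrq command character to a trigger file.
 * Hypothetical helper: the path is a parameter so the function can be
 * tried against an ordinary file first.
 * Returns 0 on success, -1 on failure.
 */
int sysrq_trigger(const char *path, char key)
{
	FILE *f;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -1;

	if (fputc(key, f) == EOF) {
		fclose(f);
		return -1;
	}

	/* fclose() flushes the byte; report a failed flush as an error. */
	return fclose(f) == 0 ? 0 : -1;
}
```

Calling sysrq_trigger("/proc/sysrq-trigger", 'y') would then request the
dump without relying on the console interrupt.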
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
at91_mci: minor cleanup
mmc: mmc host test driver
mmc: Fix omap compile by replacing dev_name with dma_dev_name
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
Blackfin SPORTS UART Driver: converting BFIN->BLACKFIN
Blackfin serial driver: add extra IRQ flag for 8250 serial driver
8250 Serial Driver: Added support for 8250-class UARTs in HV Sistemas H8606 board
Blackfin arch: Fix bug - USB fails to build for BF524/BF526
Blackfin arch: update boards defconfig files
Blackfin arch: IO Port functions to read/write unalligned memory
Blackfin arch: enable a choice to provide 4M DMA memory
Blackfin arch: cleanup the icplb/dcplb multiple hit checks
Blackfin arch: Add workaround to read edge triggered GPIOs
Blackfin arch: Sync channel defines with struct dma_register dma_io_base_addr.
Blackfin arch: Check for Anomaly 05000182
[Blackfin] arch: rename bf5xx-flash to bfin-async-flash
[Blackfin] arch: Blackfin checksum annotations
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Fix up restorer in debug_trap exception return path.
sh: Make is_valid_bugaddr() more intelligent on nommu.
sh: use the common ascii hex helpers
sh: fix sh7785 master clock value
sh: Fix up thread info pointer in syscall_badsys resume path.
sh: Fix up optimized SH-4 memcpy on big endian.
sh: disable initrd defaults in .empty_zero_page.
sh: display boot params by default on entry.
The libata-acpi.c code currently accepts hotplug messages from both the
port and the device. This does not match the behaviour of the bay
driver, and may result in confusion when two hotplug requests are
received for the same device. This patch limits the hotplug notification
to removable ACPI devices, which in turn allows it to use the _STA
method to determine whether the device has been removed or inserted.
On removal, devices are marked as detached. On insertion, a hotplug scan
is started. This should avoid lockups caused by the ata layer attempting
to scan devices which have been removed. The uevent sending is moved
outside the spinlock in order to avoid a warning generated by it firing
when interrupts are disabled.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
I was hoping ATA_HORKAGE_NODMA | ATA_HORKAGE_SKIP_PM could keep it
happy, but no, even this doesn't work under certain configurations, and
it's not like we can do anything useful with the config device anyway.
Replace ATA_HORKAGE_SKIP_PM with ATA_HORKAGE_DISABLE and use it for
the config device. This makes the device completely ignored by
libata.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When 4140 PMP is attached to sil24, NCQ commands to fan out port 1 and
2 (0 based) often stall if commands are in progress to other ports.
I've tried a number of things but can't tell what's going on. It
never happens w/ ahci and reportedly sata_mv which can issue NCQ
commands to multiple devices simultaneously like sil24 does.
Disable NCQ for devices behind 4140 PMP for the time being.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Mark Lord <liml@rtr.ca>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
There's no reason to schedule LPM action after probing is complete
causing another EH iteration. Just schedule it together with probing
itself.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
PMP notification during reset can make some controllers fail reset
processing and needs to be turned off during resets. PMP attach and
full-revalidation path did this via sata_pmp_configure() but the quick
revalidation path didn't. Move the notification disable code right above
fan-out port recovery so that it's always turned off.
This fixes obscure reset failures.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This timeout was set low because previously PMP register access was
done via polling and register access timeouts could stack up. This is
no longer the case. One timeout will make all following accesses fail
immediately.
In rare cases both Marvell and SIMG PMPs need almost a second. Bump
it to 3s.
While at it, rename it to SATA_PMP_RW_TIMEOUT. It's not specific to
SCR access.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
No reason to get overzealous about recovered comm and data errors.
Some PHYs habitually set them for no good reason and being draconian
about these soft error conditions doesn't seem to help anybody.
If the need ever arises, we might need to add a soft PHY error condition, say
AC_ERR_MAYBE_ATA_BUS and use it only to determine whether speed down
is necessary but I don't think that's very likely to happen. It's far
more likely we'll get timeouts or fatal transmission errors if
recovered errors are so prominent that they hamper operation.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Originally, the whole reset processing was done while the port was frozen
and SError was cleared during @postreset(). This had two race
conditions. 1: hotplug could occur after reset but before SError is
cleared and libata won't know about it. 2: hotplug could occur after
all the reset is complete but before the port is thawed. As all
events are cleared on thaw, the hotplug event would be lost.
Commit ac371987a8 kills the first race
by clearing SError during link resume but before link onlineness test.
However, this doesn't fix race #2 and in some cases clearing SError
after SRST is a good idea.
This patch solves this problem by cross checking link onlineness with
classification result after SError is cleared and port is thawed.
Reset is retried if link is online but all devices attached to the
link are unknown. As all devices will be revalidated, this one-way
check is enough to ensure that all devices are detected and
revalidated reliably.
This, luckily, also fixes the cases where host controller returns
bogus status while harddrive is spinning up after hotplug making
classification run before the device sends the first FIS and thus
causes misdetection.
Low level drivers can bypass the logic by setting class explicitly to
ATA_DEV_NONE if ever necessary (currently none requires this).
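The cross check described above (retry when the link is online but every
device on it is still unknown) can be sketched roughly as follows. The
enum, its values and the function name are assumptions made for this
example and are not libata's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative class codes; names and values are hypothetical. */
enum dev_class { DEV_UNKNOWN = 0, DEV_ATA, DEV_ATAPI, DEV_NONE };

/*
 * Return true when the reset should be retried: the link reports
 * online, yet classification left every device on it unknown.
 * A driver that explicitly sets a class to DEV_NONE bypasses the retry.
 */
bool reset_needs_retry(bool link_online, const enum dev_class *classes,
		       size_t n)
{
	size_t i;

	assert(classes != NULL || n == 0);

	if (!link_online)
		return false;

	for (i = 0; i < n; i++)
		if (classes[i] != DEV_UNKNOWN)
			return false;

	return true;
}
```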
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Previously reset freeze/thaw handling lived outside of ata_eh_reset()
mainly because the original PMP reset code needed the port frozen
while resetting all the fan-out ports, which is no longer the case.
This patch moves freeze/thaw handling into ata_eh_reset().
@prereset() and @postreset() are now called w/o freezing the port
although @prereset() can be called frozen if the port is frozen prior
to entering ata_eh_reset().
This makes code simpler and will help removing hotplug event related
races.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Reorganize ata_eh_reset() such that @prereset() is called even when no
reset method is available and if block is used instead of goto to skip
actual reset. This makes the no-reset case behave better (readiness wait)
and makes future changes easier.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The @online out parameter is supposed to be set to true iff link is
online and reset succeeded as advertised in the function description
and callers are coded expecting that. However, sata_link_reset()
didn't behave this way on device readiness test failure. Fix it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Minor coding-style fixes for sata_promise:
- remove stray blank lines
- fix checkpatch.pl errors; warnings about long lines
remain, but I don't intend to address those at this time
- remove two inline directives: neither is essential and
both functions are trivially inlinable anyway by virtue
of being static and having a single unique call site
- fix comment in pdc_interrupt(): the bits in PDC_INT_SEQMASK
denote SEQIDs not tags, the distinction becomes important
when NCQ gets implemented
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch cleans up sata_promise's mmio accesses.
In sata_promise there are three distinct mmio address spaces:
1. global registers, offsets from host->iomap[PDC_MMIO_BAR]
2. per-port ATA registers, offsets from ap->ioaddr.cmd_addr
3. per-port SATA registers, offsets from ap->ioaddr.scr_addr
The driver currently often fails to indicate which address space
a given mmio base pointer refers to, which is a source of bugs
and confusion (see recent pdc_thaw() irq clearing bug; it's also
been an obstacle for the pending NCQ extensions).
To reduce these problems, adopt a coding style where the name of
a base pointer always indicates which address space it refers to:
1. global registers: host_mmio
2. per-port ATA registers: ata_mmio
3. per-port SATA registers: sata_mmio
Also rearrange register offset definitions to clearly indicate
which address space they belong to, and add a symbolic definition
for the previously hard-coded PHYMODE4 register.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch fixes two bugs in sata_promise's irq status clearing paths:
1. When clearing the irq status for a specific port, the driver
read the global SEQMASK register. This is wrong because that
clears the irq status for _all_ ports.
2. pdc_thaw() incorrectly added the PDC_INT_SEQMASK host register
offset to a per-port ata engine base address. This resulted in
it reading the unrelated PDC_PKT_SUBMIT register, which did not
have the desired irq status clearing effect.
In both cases the fix is to read from the port's Command/Status
register. This also matches what Promise's own driver does.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Use the kernel-provided clamp_val() macro.
FIT was always applied to a member of struct ata_timing (unsigned short)
and two constants. clamp_val will not cast to short anymore.
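To illustrate the point about types, here is a small userspace stand-in
that clamps in the value's own unsigned type. clamp_u16() is a
hypothetical helper written for this example, not the kernel's
clamp_val() macro. A large value such as 40000 stays above the range and
clamps to the upper bound, whereas routing it through a signed short
first could make it negative on common platforms.

```c
#include <assert.h>

/* Clamp val into [lo, hi] without leaving unsigned short. */
unsigned short clamp_u16(unsigned short val, unsigned short lo,
			 unsigned short hi)
{
	assert(lo <= hi);

	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}
```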
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Check for an empty request queue before stopping EDMA after a FBS-NCQ error,
as per recommendation from the Marvell datasheet.
This ensures that the EDMA won't suddenly become active again
just after our subsequent check of the empty/idle bits.
Also bump DRV_VERSION.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Part five of simplifying/fixing handling of the main_irq_mask register
to resolve unexpected interrupt issues observed in 2.6.26-rc*.
Keep a cached copy of the main_irq_mask so that we don't have
to stall the CPU to read it on every pass through mv_interrupt.
This significantly speeds up interrupt handling, both for sata_mv,
and for any other driver/device sharing the same PCI IRQ line.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Part four of simplifying/fixing handling of the main_irq_mask register
to resolve unexpected interrupt issues observed in 2.6.26-rc*.
Ignore masked IRQs in mv_interrupt().
This prevents "unexpected device interrupt while idle" messages.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Part three of simplifying/fixing handling of the main_irq_mask register
to resolve unexpected interrupt issues observed in 2.6.26-rc*.
Partially fix a reported bug whereby we sometimes miss seeing drives on
a port-multiplier, as reported by Gwendal Grignou <gwendal@google.com>.
The problem was that we were receiving unexpected interrupts
during EH from POLLed commands while accessing port-multiplier registers.
These unexpected interrupts can be prevented by masking the DONE_IRQ bit
for the port whenever not operating in EDMA mode.
Also fix port_stop() to mask all port interrupts.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Part two of simplifying/fixing handling of the main_irq_mask register
to resolve unexpected interrupt issues observed in 2.6.26-rc*.
Consolidate all updates of the host main_irq_mask register
into a single function. This simplifies maintenance,
and also prepares the way for caching it (later).
No functionality changes in this update.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Part one of simplifying/fixing handling of the main_irq_mask register
to resolve unexpected interrupt issues observed in 2.6.26-rc*.
Don't blindly enable port IRQs at host init time.
Instead, enable only the bits that we want,
which in this case is simply the PCI_ERR bit.
The per-port bits can wait until the ports are reset/probed for devices.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Now that we handle the FIS_IRQ_CAUSE register correctly,
we can also now handle SATA asynchronous notification events.
So enable them, but only for the more modern GenIIe chips.
(older chips have unaddressed errata issues related to this).
This fixes hot plug/unplug for port-multiplier ports.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Group all of the flags for GenIIe devices into a common definition,
to ensure that any updates to them are shared by all GenIIe devices.
This will help make future maintenance somewhat simpler.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix handling of the FIS_IRQ_CAUSE register in sata_mv.
This register exists *only* on GenIIe devices, so don't bother
writing to it on older chips. Also, it has to be read/cleared
in mv_err_intr() before clearing the main ERR_IRQ_CAUSE register.
This keeps sata_mv from getting stuck forever on certain error types.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Always request a softreset after hardreset succeeds.
This fixes a regression reported by Martin Michlmayr <tbm@cyrius.com>.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Set ATAPI host state machine to control IDE device terminate sequence.
Some IDE hard disks may assert the terminate sequence in the middle of a
formal DMA transaction and resume later. Bit DETECT_TERM in ATAPI_CTRL
register determines whether the ATAPI host state machine or the kernel
driver should take care of this case.
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: LAPIC: ignore pending timers if LVTT is disabled
KVM: Update MAINTAINERS for new mailing lists
KVM: Fix kvm_vcpu_block() task state race
KVM: ia64: Set KVM_IOAPIC_NUM_PINS to 48
KVM: ia64: fix GVMM module including position-dependent objects
KVM: ia64: Define new kvm_fpreg struture to replace ia64_fpreg
KVM: PIT: take inject_pending into account when emulating hlt
s390: KVM guest: fix compile error
KVM: x86 emulator: fix writes to registers with modrm encodings
* 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: save and restore dsparb and d_state registers.
drm/i915: fix off by one in VGA save/restore of AR & CR regs.
drm: disable tasklets not IRQs when taking the drm lock spinlock
Revert "drm/vbl rework: rework how the drm deals with vblank."
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c/max6875: Really prevent 24RF08 corruption
i2c-amd756: Fix functionality flags
i2c: Kill the old driver matching scheme
i2c: Convert remaining new-style drivers to use module aliasing
i2c: Switch pasemi to the new device/driver matching scheme
i2c: Clean up Blackfin BF527 I2C device declarations
i2c-nforce2: Disable the second SMBus channel on the DFI Lanparty NF4 Expert
i2c: New co-maintainer
According to the tests in do_initcalls(), the proper error code in case no
device is found is -ENODEV, not -ENXIO or -EIO.
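A small userspace sketch of why the error code matters. It assumes, as
the message implies, that a caller treats -ENODEV as "no hardware,
nothing to report" and any other nonzero result as a real failure. The
runner and both init routines are hypothetical and are not the kernel's
do_initcalls().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*initcall_fn)(void);

/* Hypothetical init routines for the sketch. */
int init_no_device(void)  { return -ENODEV; }	/* hardware absent */
int init_io_failure(void) { return -EIO; }	/* genuine failure */

/*
 * Run each init routine and count the results worth reporting.
 * Assumption: -ENODEV stays quiet, any other nonzero result counts.
 */
size_t run_initcalls(const initcall_fn *calls, size_t n)
{
	size_t i, reported = 0;

	assert(calls != NULL || n == 0);

	for (i = 0; i < n; i++) {
		int rc = calls[i]();

		if (rc != 0 && rc != -ENODEV)
			reported++;
	}
	return reported;
}
```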
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some input drivers do not check whether they're actually running on the
correct platform, causing multi-platform kernels to crash if they are not.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some network drivers do not check whether they're actually running on the
correct platform, causing multi-platform kernels to crash if they are not.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Apollo frame buffer device driver (dnfb) doesn't check whether it's
actually running on Apollo hardware, causing a crash if it isn't.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Macintosh IDE driver (macide) doesn't check whether it's actually running
on Mac hardware, causing a crash if it isn't.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Correct FB_HP300 dependencies:
- FB_HP300 doesn't depend only on HP300, but also on DIO (which depends on
HP300)
- FB_HP300 does not need FB_CFB_FILLRECT
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_FB_DAFB is a leftover from pre-Kconfig
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
i2c-core has taken care of the possible corruption of 24RF08 chips for
quite some time, so device drivers no longer need to do it. And they
really should not, as applying the prevention twice voids it.
I thought that I had fixed all drivers long ago but apparently I had
missed that one.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Ben Gardner <bgardner@wabtec.com>
The i2c-amd756 driver pretends to support SMBus process call
transactions but actually does not. Fix it.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Remove the old driver_name/type scheme for i2c driver matching. Only the
standard aliasing model will be used from now on.
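For illustration, matching by a name-keyed id table can be sketched in
userspace as below. The struct is a simplified stand-in for an i2c
device id entry (a pointer and a NULL terminator are used here for
brevity), and the function name and chip names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified id entry: device name plus driver-private data. */
struct id_entry {
	const char *name;
	unsigned long driver_data;
};

/*
 * Look up a client name in a driver's id table.  The table ends with
 * an entry whose name is NULL.  Returns the matching entry or NULL.
 */
const struct id_entry *match_id(const struct id_entry *table,
				const char *client_name)
{
	assert(table != NULL && client_name != NULL);

	for (; table->name != NULL; table++)
		if (strcmp(table->name, client_name) == 0)
			return table;

	return NULL;
}
```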
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Update all the remaining new-style i2c drivers to use standard module
aliasing instead of the old driver_name/type driver matching scheme.
Note that the tuner driver is a bit quirky at the moment, as it
overwrites i2c_client.name with arbitrary strings. We write "tuner"
back on remove, to make sure that driver cycling will work properly,
but there may still be troublesome corner cases.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
There is a strange chip at 0x2e on the second SMBus channel of the
DFI Lanparty NF4 Expert motherboard. Accessing the chip reboots the
system. As there's nothing interesting on this SMBus channel, the
easiest and safest thing to do is to disable it on that board.
This is a better fix to bug #5889 than the it87 driver update that was
done originally:
http://bugzilla.kernel.org/show_bug.cgi?id=5889
Signed-off-by: Jean Delvare <khali@linux-fr.org>