security_capable takes ns, cred, cap. But the LSM capable() hook takes
cred, ns, cap. The capability helper functions also take cred, ns, cap.
Rather than flip argument order just to flip it back, leave them alone.
Heck, this should be a little faster since argument will be in the right
place!
Signed-off-by: Eric Paris <eparis@redhat.com>
The 'name', 'owner', and 'mod_name' members are redundant with the
identically named fields in the 'driver' sub-structure. Rather than
switching each instance to specify these fields explicitly, introduce
a macro to simplify this.
Eliminate further redundancy by allowing the drvname argument to
DEFINE_XENBUS_DRIVER() to be blank (in which case the first entry from
the ID table will be used for .driver.name).
Also eliminate the questionable xenbus_register_{back,front}end()
wrappers - their sole remaining purpose was the checking of the
'owner' field, proper setting of which shouldn't be an issue anymore
when the macro gets used.
v2: Restore DRV_NAME for the driver name in xen-pciback.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
I noticed that hotplug of one setup does not work with recent change in
pci tree.
After checking the bridge conf setup, I noticed that the bridges get
assigned but do not get enabled.
The reason is the following commit, while simply ignores bridge
resources when enabling a pci device:
| commit bbef98ab0f
| Author: Ram Pai <linuxram@us.ibm.com>
| Date: Sun Nov 6 10:33:10 2011 +0800
|
| PCI: defer enablement of SRIOV BARS
|...
| NOTE: Note, there is subtle change in the pci_enable_device() API. Any
| driver that depends on SRIOV BARS to be enabled in pci_enable_device()
| can fail.
Put back bridge resource and ROM resource checking to fix the problem.
That should fix regression like BIOS does not assign correct resource to
bridge.
Discussion can be found at:
http://www.spinics.net/lists/linux-pci/msg12874.html
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During test of one IB card with guest VM, found that, msi is not
initialized properly.
It turns out __write_msi_msg will do nothing if device current_state is
not PCI_D0. And, that pci device does not have pm_cap in guest VM.
There is an error in setting of power state to PCI_D0 in
pci_enable_device(), but error is not returned for this. Following is
code flow:
pci_enable_device() --> __pci_enable_device_flags() -->
do_pci_enable_device() --> pci_set_power_state() -->
__pci_start_power_transition()
We have following condition inside __pci_start_power_transition():
if (platform_pci_power_manageable(dev)) {
error = platform_pci_set_power_state(dev, state);
if (!error)
pci_update_current_state(dev, state);
} else {
error = -ENODEV;
/* Fall back to PCI_D0 if native PM is not supported */
if (!dev->pm_cap)
dev->current_state = PCI_D0;
}
Here, from platform_pci_set_power_state(), acpi_pci_set_power_state() is
getting called and that is failing with ENODEV because of following
condition:
if (!handle || ACPI_SUCCESS(acpi_get_handle(handle, "_EJ0",&tmp)))
return -ENODEV;
Because of that, pci_update_current_state() is not getting called.
With this patch, if device power state can not be set via
platform_pci_set_power_state and that device does not have native pm
support, then PCI device power state will be set to PCI_D0.
-v2: This also reverts 47e9037ac1, as it's
not needed after this change.
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Ajaykumar Hotchandani<ajaykumar.hotchandani@oracle.com>
Signed-off-by: Yinghai Lu<yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 0d52f54e2e (PCI / ACPI: Make
acpiphp ignore root bridges using PCIe native hotplug) added code
that made the acpiphp driver completely ignore PCIe root complexes
for which the kernel had been granted control of the native PCIe
hotplug feature by the BIOS through _OSC. Unfortunately, however,
this was a mistake, because on some systems there were PCI bridges
supporting PCI (non-PCIe) hotplug under such root complexes and
those bridges should have been handled by acpiphp.
For this reason, revert the changes made by the commit mentioned
above and make register_slot() in drivers/pci/hotplug/acpiphp_glue.c
avoid registering hotplug slots for PCIe ports that belong to
root complexes with native PCIe hotplug enabled (which means that
the BIOS has granted the kernel control of this feature for the
given root complex). This is reported to address the original
issue fixed by commit 0d52f54e2e and
to work on the system where that commit broke things.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This adjusts PCI_IOAPIC to be user configurable (possibly as a
module) on x86, since the base architecture code for adding
IO-APICs dynamically isn't there yet (and hence having the code
present everywhere is pretty pointless).
To make this consistent, a MODULE_DEVICE_TABLE() declaration
gets added, the class specifications get corrected (by properly
using PCI_DEVICE_CLASS() intended for purposes like this), and
the probe and remove functions get their sections adjusted.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: http://lkml.kernel.org/r/4EDDD71A02000078000659F1@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I get this compile failure on parisc:
drivers/pci/ats.c: In function 'ats_alloc_one':
drivers/pci/ats.c:29: error: implicit declaration of function 'kzalloc'
drivers/pci/ats.c:29: warning: assignment makes pointer from integer without a cast
drivers/pci/ats.c: In function 'ats_free_one':
drivers/pci/ats.c:45: error: implicit declaration of function 'kfree'
Because ats.c is missing linux/slab.h as an include. This patch fixes it
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
All the PCI BARs of a device are enabled when the device is enabled
using pci_enable_device(). This unnecessarily enables SRIOV BARs of the
device.
On some platforms, which do not support SRIOV as yet, the
pci_enable_device() fails to enable the device if its SRIOV BARs are not
allocated resources correctly.
The following patch fixes the above problem. The SRIOV BARs are now
enabled when IOV capability of the device is enabled in sriov_enable().
NOTE: Note, there is subtle change in the pci_enable_device() API. Any
driver that depends on SRIOV BARS to be enabled in pci_enable_device()
can fail.
The patch has been touch tested on power and x86 platform.
Tested-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
More consistency cleanups. Drop the _OFF, separate and indent
CTRL/CAP/STATUS bit definitions. This helped find the previous
mis-use of bit 0 in the PASID capability register.
Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PASID ECN indicates bit 0 is reserved in the capability register.
Switch pci_enable_pasid() to error if PASID is already enabled and
don't expose enable as a feature in pci_pasid_features().
Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I traced a nasty kexec on panic boot failure to the fact that we had
screaming msi interrupts and we were not disabling the msi messages at
kernel startup. The booting kernel had not enabled those interupts so
was not prepared to handle them.
I can see no reason why we would ever want to leave the msi interrupts
enabled at boot if something else has enabled those interrupts. The pci
spec specifies that msi interrupts should be off by default. Drivers
are expected to enable the msi interrupts if they want to use them. Our
interrupt handling code reprograms the interrupt handlers at boot and
will not be be able to do anything useful with an unexpected interrupt.
This patch applies cleanly all of the way back to 2.6.32 where I noticed
the problem.
Cc: stable@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Modify pci_acpi_wake_dev() to avoid resuming PME-capable devices
whose PME Status bits are not set, which may happen currently if
several devices are associated with the same wakeup GPE and all
of them are notified whenever at least one of them signals PME.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If the kernel has requested control of the SHPC native hotplug
feature for a given root bridge, the acpiphp driver should not try
to handle that root bridge and it should leave it to shpchp.
Failing to do so causes problems to happen if shpchp is loaded
and unloaded before loading acpiphp (ACPI-based hotplug won't work
in that case anyway).
To address this issue make find_root_bridges() ignore PCI root
bridges with SHPC native hotplug enabled and make add_bridge()
return error code if SHPC native hotplug is enabled for the given
root bridge. This causes acpiphp to refuse to load if SHPC native
hotplug is enabled for all root bridges and to refuse binding to
the root bridges with SHPC native hotplug enabled.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use non-ordered workqueue for attention button events.
Attention button events on each slot can be handled asynchronously. So
we should use non-ordered workqueue. This patch also removes ordered
workqueue in pciehp as a result.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix improper workqueue cleanup.
In the current pciehp, pcied_cleanup() calls destroy_workqueue()
before calling pcie_port_service_unregister(). This causes kernel oops
because flush_workqueue() is called in the pcie_port_service_unregister()
code path after the workqueue was destroyed. So pcied_cleanup() must call
pcie_port_service_unregister() first before calling destroy_workqueue().
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Right now we forcibly clear ASPM state on all devices if the BIOS indicates
that the feature isn't supported. Based on the Microsoft presentation
"PCI Express In Depth for Windows Vista and Beyond", I'm starting to think
that this may be an error. The implication is that unless the platform
grants full control via _OSC, Windows will not touch any PCIe features -
including ASPM. In that case clearing ASPM state would be an error unless
the platform has granted us that control.
This patch reworks the ASPM disabling code such that the actual clearing
of state is triggered by a successful handoff of PCIe control to the OS.
The general ASPM code undergoes some changes in order to ensure that the
ability to clear the bits isn't overridden by ASPM having already been
disabled. Further, this theoretically now allows for situations where
only a subset of PCIe roots hand over control, leaving the others in the
BIOS state.
It's difficult to know for sure that this is the right thing to do -
there's zero public documentation on the interaction between all of these
components. But enough vendors enable ASPM on platforms and then set this
bit that it seems likely that they're expecting the OS to leave them alone.
Measured to save around 5W on an idle Thinkpad X220.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These are extended capabilities, rename and move to proper
group for consistency.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs
This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs. For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
PCI hotplug: shpchp: don't blindly claim non-AMD 0x7450 device IDs
PCI: pciehp: wait 100 ms after Link Training check
PCI: pciehp: wait 1000 ms before Link Training check
PCI: pciehp: Retrieve link speed after link is trained
PCI: Let PCI_PRI depend on PCI
PCI: Fix compile errors with PCI_ATS and !PCI_IOV
PCI / ACPI: Make acpiphp ignore root bridges using PCIe native hotplug
Previously we claimed device ID 0x7450, regardless of the vendor, which is
clearly wrong. Now we'll claim that device ID only for AMD.
I suspect this was just a typo in the original code, but it's possible this
change will break shpchp on non-7450 AMD bridges. If so, we'll have to fix
them as we find them.
Reference: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638863
Reported-by: Ralf Jung <ralfjung-e@gmx.de>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If the port supports Link speeds greater than 5.0 GT/s, we must wait
for 100 ms after Link training completes before sending configuration
request.
Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We need to wait for 1000 ms after Data Link Layer Link Active (DLLLA)
bit reads 1b before sending configuration request. Currently pciehp
does this wait after checking Link Training (LT) bit. But we need it
before checking LT bit because LT is still set even after DLLLA bit is
set on some platforms.
Acked-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During hot plug, board_added will call pciehp_power_on_slot().
But link speed is updated in pciehp_power_on_slot().
We should not update link speed there, because that is too early.
So move the link speed update to pciehp_check_link_status() after making
sure the link has been trained.
-v2: fix compile warning that Kenji found.
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include <linux/module.h>
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include <linux/module.h>
net: sch_generic remove redundant use of <linux/module.h>
net: inet_timewait_sock doesnt need <linux/module.h>
...
Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
scsi: drop unused Kconfig symbol
pci: drop unused Kconfig symbol
stmmac: drop unused Kconfig symbol
x86: drop unused Kconfig symbol
powerpc: drop unused Kconfig symbols
powerpc: 40x: drop unused Kconfig symbol
mips: drop unused Kconfig symbols
openrisc: drop unused Kconfig symbols
arm: at91: drop unused Kconfig symbol
samples: drop unused Kconfig symbol
m32r: drop unused Kconfig symbol
score: drop unused Kconfig symbols
sh: drop unused Kconfig symbol
um: drop unused Kconfig symbol
sparc: drop unused Kconfig symbol
alpha: drop unused Kconfig symbol
Fix up trivial conflict in drivers/net/ethernet/stmicro/stmmac/Kconfig
as per Michal: the STMMAC_DUAL_MAC config variable is still unused and
should be deleted.
These were getting module.h implicitly from device.h but we want
to clean that up, so we fix it here to avoid things like:
pci/slot.c: In function ‘pci_hp_create_module_link’:
pci/slot.c:383: error: ‘module_kset’ undeclared (first use in this function)
Similarly, rpadlpar_core.c is modular, so add module.h to its includes.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
They were implicitly getting it from device.h --> module.h but
we want to clean that up. So add the minimal header for these
macros.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
There's no other Kconfig symbol that depends on XEN_PCIDEV_FE_DEBUG.
Neither is there anything that uses CONFIG_XEN_PCIDEV_FE_DEBUG.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
This avoids the PCI_PRI question in 'make config' when PCI
is not selected.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If the kernel has requested control of the PCIe native hotplug
feature for a given root complex, the acpiphp driver should not try
to handle that root complex and it should leave it to pciehp.
Failing to do so causes problems to happen if acpiphp is loaded
before pciehp on such systems.
To address this issue make find_root_bridges() ignore PCIe root
complexes with PCIe native hotplug enabled and make add_bridge()
return error code if PCIe native hotplug is enabled for the given
root port. This causes acpiphp to refuse to load if PCIe native
hotplug is enabled for all complexes and to refuse binding to
the root complexes with PCIe native hotplug is enabled.
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
PCI: Clean-up MPS debug output
pci: Clamp pcie_set_readrq() when using "performance" settings
PCI: enable MPS "performance" setting to properly handle bridge MPS
PCI: Workaround for Intel MPS errata
PCI: Add support for PASID capability
PCI: Add implementation for PRI capability
PCI: Export ATS functions to modules
PCI: Move ATS implementation into own file
PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
PCI / PM: Extend PME polling to all PCI devices
PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
PCI: Make pci_setup_bridge() non-static for use by arch code
x86: constify PCI raw ops structures
PCI: Add quirk for known incorrect MPSS
PCI: Add Solarflare vendor ID and SFC4000 device IDs
Clean-up MPS debug output to make it a single line and aligned, thus
making it more readable for a large number of buses and devices in a
single system.
Suggested by Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When configuring the PCIe settings for "performance", we allow parents
to have a larger Max Payload Size than children and rely on children
Max Read Request Size to not be larger than their own MPS to avoid
having the host bridge generate responses they can't cope with.
However, various drivers in Linux call pci_set_readrq() with arbitrary
values, assuming this to be a simple performance tweak. This breaks
under our "performance" configuration.
Fix that by making sure the value programmed by pcie_set_readrq() is
never larger than the configured MPS for that device.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Rework the "performance" MPS option to configure the device MPS with the
smaller of the device MPSS or the bridge MPS (which is assumed to be
properly configured at this point to the largest allowable MPS based on
its parent bus).
Also, rework the MRRS setting to report an inability to set the MRRS to
a valid setting.
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, ioapic: Consolidate the explicit EOI code
x86, ioapic: Restore the mask bit correctly in eoi_ioapic_irq()
x86, kdump, ioapic: Reset remote-IRR in clear_IO_APIC
iommu: Rename the DMAR and INTR_REMAP config options
x86, ioapic: Define irq_remap_modify_chip_defaults()
x86, msi, intr-remap: Use the ioapic set affinity routine
iommu: Cleanup ifdefs in detect_intel_iommu()
iommu: No need to set dmar_disabled in check_zero_address()
iommu: Move IOMMU specific code to intel-iommu.c
intr_remap: Call dmar_dev_scope_init() explicitly
x86, x2apic: Enable the bios request for x2apic optout
* 'stable/drivers-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xenbus: don't rely on xen_initial_domain to detect local xenstore
xenbus: Fix loopback event channel assuming domain 0
xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain'
xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel
xen/pv-on-hvm kexec: update xs_wire.h:xsd_sockmsg_type from xen-unstable
xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel
xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports
xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive
* 'stable/drivers.bugfixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pciback: Check if the device is found instead of blindly assuming so.
xen/pciback: Do not dereference psdev during printk when it is NULL.
xen: remove XEN_PLATFORM_PCI config option
xen: XEN_PVHVM depends on PCI
xen/pciback: double lock typo
xen/pciback: use mutex rather than spinlock in vpci backend
xen/pciback: Use mutexes when working with Xenbus state transitions.
xen/pciback: miscellaneous adjustments
xen/pciback: use mutex rather than spinlock in passthrough backend
xen/pciback: use resource_size()
* 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pci: support multi-segment systems
xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.
xen/pci: make bus notifier handler return sane values
xen-swiotlb: fix printk and panic args
xen-swiotlb: Fix wrong panic.
xen-swiotlb: Retry up three times to allocate Xen-SWIOTLB
xen-pcifront: Update warning comment to use 'e820_host' option.
Devices supporting Process Address Space Identifiers
(PASIDs) can use an IOMMU to access multiple IO address
spaces at the same time. A PCIe device indicates support for
this feature by implementing the PASID capability. This
patch adds support for the capability to the Linux kernel.
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Implement the necessary functions to handle PRI capabilities
on PCIe devices. With PRI devices behind an IOMMU can signal
page fault conditions to software and recover from such
faults.
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch makes the ATS functions usable for modules.
They will be used by a module implementing some advanced
AMD IOMMU features.
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
ATS does not depend on IOV support, so move the code into
its own file. This file will also include support for the
PRI and PASID capabilities later.
Also give ATS its own Kconfig variable to allow selecting it
without IOV support.
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The result returned by acpi_dev_run_wake() is always either -EINVAL
or -ENODEV, while obviously it should return 0 on success. The
problem is that the leftover error variable, that's not really used
in the function, is initialized with -ENODEV and then returned
without modification.
To fix this issue remove the error variable from acpi_dev_run_wake()
and make the function return 0 on success as appropriate.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I originally submitted a patch to workaround this by pushing all Ejection
Requests and Device Checks onto the kacpi_hotplug queue.
http://marc.info/?l=linux-acpi&m=131678270930105&w=2
The patch is still insufficient in that Bus Checks also need to be added.
Rather than add all events, including non-PCI-hotplug events, to the
hotplug queue, mjg suggested that a better approach would be to modify
the acpiphp driver so only acpiphp events would be added to the
kacpi_hotplug queue.
It's a longer patch, but at least we maintain the benefit of having separate
queues in ACPI. This, of course, is still only a workaround the problem.
As Bjorn and mjg pointed out, we have to refactor a lot of this code to do
the right thing but at this point it is a better to have this code working.
The acpi core places all events on the kacpi_notify queue. When the acpiphp
driver is loaded and a PCI card with a PCI-to-PCI bridge is removed the
following call sequence occurs:
cleanup_p2p_bridge()
-> cleanup_bridge()
-> acpi_remove_notify_handler()
-> acpi_os_wait_events_complete()
-> flush_workqueue(kacpi_notify_wq)
which is the queue we are currently executing on and the process will hang.
Move all hotplug acpiphp events onto the kacpi_hotplug workqueue. In
handle_hotplug_event_bridge() and handle_hotplug_event_func() we can simply
push the rest of the work onto the kacpi_hotplug queue and then avoid the
deadlock.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: mjg@redhat.com
Cc: bhelgaas@google.com
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The land of PCI power management is a land of sorrow and ugliness,
especially in the area of signaling events by devices. There are
devices that set their PME Status bits, but don't really bother
to send a PME message or assert PME#. There are hardware vendors
who don't connect PME# lines to the system core logic (they know
who they are). There are PCI Express Root Ports that don't bother
to trigger interrupts when they receive PME messages from the devices
below. There are ACPI BIOSes that forget to provide _PRW methods for
devices capable of signaling wakeup. Finally, there are BIOSes that
do provide _PRW methods for such devices, but then don't bother to
call Notify() for those devices from the corresponding _Lxx/_Exx
GPE-handling methods. In all of these cases the kernel doesn't have
a chance to receive a proper notification that it should wake up a
device, so devices stay in low-power states forever. Worse yet, in
some cases they continuously send PME Messages that are silently
ignored, because the kernel simply doesn't know that it should clear
the device's PME Status bit.
This problem was first observed for "parallel" (non-Express) PCI
devices on add-on cards and Matthew Garrett addressed it by adding
code that polls PME Status bits of such devices, if they are enabled
to signal PME, to the kernel. Recently, however, it has turned out
that PCI Express devices are also affected by this issue and that it
is not limited to add-on devices, so it seems necessary to extend
the PME polling to all PCI devices, including PCI Express and planar
ones. Still, it would be wasteful to poll the PME Status bits of
devices that are known to receive proper PME notifications, so make
the kernel (1) poll the PME Status bits of all PCI and PCIe devices
enabled to signal PME and (2) disable the PME Status polling for
devices for which correct PME notifications are received.
Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 15bed0f2f added a quirk for the e823 Ricoh card reader to lower the
base frequency. However, the quirk first checks to see if the proprietary
MMC controller is disabled, and returns if so. On some devices, such as the
Lenovo X220, the MMC controller is already disabled by firmware it seems,
but the frequency change is still needed so sdhci-pci can talk to the cards.
Since the MMC controller is disabled, the frequency fixup was never being run
on these machines.
This moves the e823 check above the MMC controller check so that it always
gets run.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=722509
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "powernv" platform of the powerpc architecture needs to assign PCI
resources using a specific algorithm to fit some HW constraints of
the IBM "IODA" architecture (related to the ability to create error
handling domains that encompass specific segments of MMIO space).
For doing so, it wants to call pci_setup_bridge() from architecture
specific resource management in order to configure bridges after all
resources have been assigned. So make it non-static.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Using legacy interrupts and TLPs > 256 bytes on the SFC4000 (all
revisions) may cause interrupt messages to be replayed. In some
systems this results in a non-recoverable MCE. Early boards using the
SFC4000 set the maximum payload size supported (MPSS) to 1024 bytes
and we should override that.
There are probably other devices with similar issues, so give this
quirk a generic name.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add the ability to disable PCI-E MPS turning and using the BIOS
configured MPS defaults. Due to the number of issues recently
discovered on some x86 chipsets, make this the default behavior.
Also, add the option for peer to peer DMA MPS configuration. Peer to
peer DMA is outside the scope of this patch, but MPS configuration could
prevent it from working by having the MPS on one root port different
than the MPS on another. To work around this, simply make the system
wide MPS the smallest possible value (128B).
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In pcie_find_smpss(), we have the following statement:
if (dev->is_hotplug_bridge && (!list_is_singular(&dev->bus->devices) ||
dev->bus->self->pcie_type != PCI_EXP_TYPE_ROOT_PORT))
The problem is that at least on my machine, this gets called for the
root complex (virtual P2P bridge), and dev->bus->self is NULL since
the parent bus for this is not itself anchor to a PCI device.
This adds the necessary NULL check.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jon Mason <mason@myri.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
massive negative ramifications on some devices. Without knowing which
devices have this issue, do not modify from the default value when
walking the PCI-E bus in pcie_bus_safe mode. Also, make pcie_bus_safe
the default procedure.
Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Simon Kirby <sim@hostway.ca>
Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b03e7495a8 ("PCI: Set PCI-E Max Payload Size on fabric")
introduced a potential NULL pointer dereference in calls to
pcie_bus_configure_settings due to attempts to access pci_bus self
variables when the self pointer is NULL.
To correct this, verify that the self pointer in pci_bus is non-NULL
before dereferencing it.
Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With Xen changeset 23428 "libxl: Add 'e820_host' option to config file"
the E820 as seen from the host can now be passed into the guest.
This means that a PV guest can now:
- Use the correct PCI I/O gap. Before these patches, Linux guest would
boot up and would tell:
[ 0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:c0000000)
while in actuality the PCI I/O gap should have been:
[ 0.000000] Allocating PCI resources starting at b0000000 (gap: b0000000:4c000000)
- The PV domain with PCI devices was limited to 3GB. It now can be booted
with 4GB, 8GB, or whatever number you want. The PCI devices will now _not_ conflict
with System RAM. Meaning the drivers can load.
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: linux-pci@vger.kernel.org
CC: stable@kernel.org
[v2: Made the string less broken up. Suggested by Joe Perches]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fix new kernel-doc warning in pci.c:
Warning(drivers/pci/pci.c:3259): No description found for parameter 'mps'
Warning(drivers/pci/pci.c:3259): Excess function parameter 'rq' description in 'pcie_set_mps'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: OF: Don't crash when bridge parent is NULL.
PCI: export pcie_bus_configure_settings symbol
PCI: code and comments cleanup
PCI: make cardbus-bridge resources optional
PCI: make SRIOV resources optional
PCI : ability to relocate assigned pci-resources
PCI: honor child buses add_size in hot plug configuration
PCI: Set PCI-E Max Payload Size on fabric
In pcibios_get_phb_of_node(), we will crash while booting if
bus->bridge->parent is NULL.
Check for this case and avoid dereferencing the NULL pointer.
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (28 commits)
ACPI: delete stale reference in kernel-parameters.txt
ACPI: add missing _OSI strings
ACPI: remove NID_INVAL
thermal: make THERMAL_HWMON implementation fully internal
thermal: split hwmon lookup to a separate function
thermal: hide CONFIG_THERMAL_HWMON
ACPI print OSI(Linux) warning only once
ACPI: DMI workaround for Asus A8N-SLI Premium and Asus A8N-SLI DELUX
ACPI / Battery: propagate sysfs error in acpi_battery_add()
ACPI / Battery: avoid acpi_battery_add() use-after-free
ACPI: introduce "acpi_rsdp=" parameter for kdump
ACPI: constify ops structs
ACPI: fix CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS
ACPI: fix 80 char overflow
ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()
ACPI / Battery: Add the check before refresh sysfs in the battery_notify()
ACPI / Battery: Add the hibernation process in the battery_notify()
ACPI / Battery: Rename acpi_battery_quirks2 with acpi_battery_quirks
ACPI / Battery: Change 16-bit signed negative battery current into correct value
ACPI / Battery: Add the power unit macro
...
pcie_bus_configure_settings needs to be exported if the PCI hotplug
driver is being compiled as a module.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
a) adjust_resource_sorted() is now called reassign_resource_sorted()
b) nice-to-have is now called optional
c) add_list is now called realloc_list.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Allocate resources to cardbus bridge only after all other genuine
resources requests are satisfied. Dont retry if resource allocation
for cardbus-bridges fail.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
From: Yinghai Lu <yinghai@kernel.org>
Allocate resources to SRIOV BARs only after all other required
resource-requests are satisfied. Dont retry if resource allocation for SRIOV
BARs fail.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Currently pci-bridges are allocated enough resources to satisfy their immediate
requirements. Any additional resource-requests fail if additional free space,
contiguous to the one already allocated, is not available. This behavior is not
reasonable since sufficient contiguous resources, that can satisfy the request,
are available at a different location.
This patch provides the ability to expand and relocate a allocated resource.
v2: Changelog: Fixed size calculation in pci_reassign_resource()
v3: Changelog : Split this patch. The resource.c changes are already
upstream. All the pci driver changes are in here.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
git commit c8adf9a3e8
"PCI: pre-allocate additional resources to devices only after
successful allocation of essential resources."
fails to take into consideration the optional-resources needed by children
devices while calculating the optional-resource needed by the bridge.
This can be a problem on some setup. For example, if a hotplug bridge has 8
children hotplug bridges, the bridge should have enough resources to accomodate
the hotplug requirements for each of its children hotplug bridges. Currently
this is not the case.
This patch fixes the problem.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
On a given PCI-E fabric, each device, bridge, and root port can have a
different PCI-E maximum payload size. There is a sizable performance
boost for having the largest possible maximum payload size on each PCI-E
device. However, if improperly configured, fatal bus errors can occur.
Thus, it is important to ensure that PCI-E payloads sends by a device
are never larger than the MPS setting of all devices on the way to the
destination.
This can be achieved two ways:
- A conservative approach is to use the smallest common denominator of
the entire tree below a root complex for every device on that fabric.
This means for example that having a 128 bytes MPS USB controller on one
leg of a switch will dramatically reduce performances of a video card or
10GE adapter on another leg of that same switch.
It also means that any hierarchy supporting hotplug slots (including
expresscard or thunderbolt I suppose, dbl check that) will have to be
entirely clamped to 128 bytes since we cannot predict what will be
plugged into those slots, and we cannot change the MPS on a "live"
system.
- A more optimal way is possible, if it falls within a couple of
constraints:
* The top-level host bridge will never generate packets larger than the
smallest TLP (or if it can be controlled independently from its MPS at
least)
* The device will never generate packets larger than MPS (which can be
configured via MRRS)
* No support of direct PCI-E <-> PCI-E transfers between devices without
some additional code to specifically deal with that case
Then we can use an approach that basically ignores downstream requests
and focuses exclusively on upstream requests. In that case, all we need
to care about is that a device MPS is no larger than its parent MPS,
which allows us to keep all switches/bridges to the max MPS supported by
their parent and eventually the PHB.
In this case, your USB controller would no longer "starve" your 10GE
Ethernet and your hotplug slots won't affect your global MPS.
Additionally, the hotplugged devices themselves can be configured to a
larger MPS up to the value configured in the hotplug bridge.
To choose between the two available options, two PCI kernel boot args
have been added to the PCI calls. "pcie_bus_safe" will provide the
former behavior, while "pcie_bus_perf" will perform the latter behavior.
By default, the latter behavior is used.
NOTE: due to the location of the enablement, each arch will need to add
calls to this function. This patch only enables x86.
This patch includes a number of changes recommended by Benjamin
Herrenschmidt.
Tested-by: Jordan_Hargrave@dell.com
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: remove printks about disabled bridge windows
PCI: fold pci_calc_resource_flags() into decode_bar()
PCI: treat mem BAR type "11" (reserved) as 32-bit, not 64-bit, BAR
PCI: correct pcie_set_readrq write size
PCI: pciehp: change wait time for valid configuration access
x86/PCI: Preserve existing pci=bfsort whitelist for Dell systems
PCI: ARI is a PCIe v2 feature
x86/PCI: quirks: Use pci_dev->revision
PCI: Make the struct pci_dev * argument of pci_fixup_irqs const.
PCI hotplug: cpqphp: use pci_dev->vendor
PCI hotplug: cpqphp: use pci_dev->subsystem_{vendor|device}
x86/PCI: config space accessor functions should not ignore the segment argument
PCI: Assign values to 'pci_obff_signal_type' enumeration constants
x86/PCI: reduce severity of host bridge window conflict warnings
PCI: enumerate the PCI device only removed out PCI hieratchy of OS when re-scanning PCI
PCI: PCIe AER: add aer_recover_queue
x86/PCI: select direct access mode for mmconfig option
PCI hotplug: Rename is_ejectable which also exists in dock.c
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
merge fchmod() and fchmodat() guts, kill ancient broken kludge
xfs: fix misspelled S_IS...()
xfs: get rid of open-coded S_ISREG(), etc.
vfs: document locking requirements for d_move, __d_move and d_materialise_unique
omfs: fix (mode & S_IFDIR) abuse
btrfs: S_ISREG(mode) is not mode & S_IFREG...
ima: fmode_t misspelled as mode_t...
pci-label.c: size_t misspelled as mode_t
jffs2: S_ISLNK(mode & S_IFMT) is pointless
snd_msnd ->mode is fmode_t, not mode_t
v9fs_iop_get_acl: get rid of unused variable
vfs: dont chain pipe/anon/socket on superblock s_inodes list
Documentation: Exporting: update description of d_splice_alias
fs: add missing unlock in default_llseek()
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>
Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
fs: Merge split strings
treewide: fix potentially dangerous trailing ';' in #defined values/expressions
uwb: Fix misspelling of neighbourhood in comment
net, netfilter: Remove redundant goto in ebt_ulog_packet
trivial: don't touch files that are removed in the staging tree
lib/vsprintf: replace link to Draft by final RFC number
doc: Kconfig: `to be' -> `be'
doc: Kconfig: Typo: square -> squared
doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
drivers/net: static should be at beginning of declaration
drivers/media: static should be at beginning of declaration
drivers/i2c: static should be at beginning of declaration
XTENSA: static should be at beginning of declaration
SH: static should be at beginning of declaration
MIPS: static should be at beginning of declaration
ARM: static should be at beginning of declaration
rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
Update my e-mail address
PCIe ASPM: forcedly -> forcibly
gma500: push through device driver tree
...
Fix up trivial conflicts:
- arch/arm/mach-ep93xx/dma-m2p.c (deleted)
- drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
- drivers/net/r8169.c (just context changes)
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
iommu/core: Fix build with INTR_REMAP=y && CONFIG_DMAR=n
iommu/amd: Don't use MSI address range for DMA addresses
iommu/amd: Move missing parts to drivers/iommu
iommu: Move iommu Kconfig entries to submenu
x86/ia64: intel-iommu: move to drivers/iommu/
x86: amd_iommu: move to drivers/iommu/
msm: iommu: move to drivers/iommu/
drivers: iommu: move to a dedicated folder
x86/amd-iommu: Store device alias as dev_data pointer
x86/amd-iommu: Search for existind dev_data before allocting a new one
x86/amd-iommu: Allow dev_data->alias to be NULL
x86/amd-iommu: Use only dev_data in low-level domain attach/detach functions
x86/amd-iommu: Use only dev_data for dte and iotlb flushing routines
x86/amd-iommu: Store ATS state in dev_data
x86/amd-iommu: Store devid in dev_data
x86/amd-iommu: Introduce global dev_data_list
x86/amd-iommu: Remove redundant device_flush_dte() calls
iommu-api: Add missing header file
Fix up trivial conflicts (independent additions close to each other) in
drivers/Makefile and include/linux/pci.h
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (51 commits)
PM: Improve error code of pm_notifier_call_chain()
PM: Add "RTC" to PM trace time stamps to avoid confusion
PM / Suspend: Export suspend_set_ops, suspend_valid_only_mem
PM / Suspend: Add .suspend_again() callback to suspend_ops
PM / OPP: Introduce function to free cpufreq table
ARM / shmobile: Return -EBUSY from A4LC power off if A3RV is active
PM / Domains: Take .power_off() error code into account
ARM / shmobile: Use genpd_queue_power_off_work()
ARM / shmobile: Use pm_genpd_poweroff_unused()
PM / Domains: Introduce function to power off all unused PM domains
OMAP: PM: disable idle on suspend for GPIO and UART
OMAP: PM: omap_device: add API to disable idle on suspend
OMAP: PM: omap_device: add system PM methods for PM domain handling
OMAP: PM: omap_device: conditionally use PM domain runtime helpers
PM / Runtime: Add new helper function: pm_runtime_status_suspended()
PM / Domains: Queue up power off work only if it is not pending
PM / Domains: Improve handling of wakeup devices during system suspend
PM / Domains: Do not restore all devices on power off error
PM / Domains: Allow callbacks to execute all runtime PM helpers
PM / Domains: Do not execute device callbacks under locks
...
* 'of-pci' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
pci/of: Consolidate pci_bus_to_OF_node()
pci/of: Consolidate pci_device_to_OF_node()
x86/devicetree: Use generic PCI <-> OF matching
microblaze/pci: Move the remains of pci_32.c to pci-common.c
microblaze/pci: Remove powermac originated cruft
pci/of: Match PCI devices to OF nodes dynamically
I don't think there's enough value in the fact of a bridge window
being disabled to justify cluttering the dmesg log with it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
decode_bar() and pci_calc_resource_flags() both looked at the PCI BAR
type information, and it's simpler to just do it all in one place.
decode_bar() sets IORESOURCE_IO, IORESOURCE_MEM, and IORESOURCE_MEM_64
as appropriate, so res->flags contains all the information pci_bar_type
does, so we don't need to test the pci_bar_type return value.
decode_bar() used to return pci_bar_type, which we no longer need. We
can simplify it a bit by returning the struct resource flags rather than
updating them internally.
In pci_update_resource(), there's no need to decode the BAR type bits
again; we can just test for IORESOURCE_MEM_64 directly.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This fixes a minor regression where broken PCI devices that use the
reserved "11" memory BAR type worked before e354597cce but not after.
The low four bits of a memory BAR are "PTT0" where P=1 for prefetchable
BARs, and TT is as follows:
00 32-bit BAR, anywhere in lower 4GB
01 anywhere below 1MB (reserved as of PCI 2.2)
10 64-bit BAR
11 reserved
Prior to e354597cce, we treated "0100" as a 64-bit BAR and all others,
including prefetchable 64-bit BARs ("1100") as 32-bit BARs. The e354597cce
fix, which appeared in 2.6.28, treats "x1x0" as 64-bit BARs, so the
reserved "x110" types are treated as 64-bit instead of 32-bit.
This patch returns to treating the reserved "11" type as a 32-bit BAR and
adds a warning if we see it.
It also logs a note if we see a 1M BAR. This is not a warning, because
such hardware conforms to pre-PCI 2.2 spec, but I think it's worth noting
because Linux ignores the 1M restriction if it ever has to assign the BAR.
CC: Peter Chubb <peterc@gelato.unsw.edu.au>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=35952
Reported-by: Jan Zwiegers <jan@radicalsystems.co.za>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When setting the PCI-E MRRS, pcie_set_readrq queries the current
settings via a pci_read_config_word call but writes the modified result
via a pci_write_config_dword. This results in writing 16 more bits than
were queried.
Also, the function description comment is slightly incorrect.
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Naoki Yanagimoto reported that configuration read on some hot-added
PCIe device returns invalid value. This patch fixes this problem.
According to the PCIe spec, software must wait for at least 1 second
to judge if the hot-added device is broken after Data Link Layer State
Changed Event. This patch changes pciehp driver to wait for 1 second
after the Data Link Layer State Changed Event is detected before
initiating a configuration access instead of 100 ms.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Naoki Yanagimoto <yanagimoto@np.css.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The function pci_enable_ari() may mistakenly set the downstream port
of a v1 PCIe switch in ARI Forwarding mode. This is a PCIe v2 feature,
and with an SR-IOV device on that switch port believing the switch above
is ARI capable it may attempt to use functions 8-255, translating into
invalid (non-zero) device numbers for that bus. This has been seen
to cause Completion Timeouts and general misbehaviour including hangs
and panics.
Cc: stable@kernel.org
Acked-by: Don Dutile <ddutile@redhat.com>
Tested-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Aside of the usual motivation for constification, this function has a
history of being abused a hook for interrupt and other fixups so I turned
this function const ages ago in the MIPS code but it should be done
treewide.
Due to function pointer passing in varous places a few other functions
had to be constified as well.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
To: Anton Vorontsov <avorontsov@mvista.com>
To: Chris Metcalf <cmetcalf@tilera.com>
To: Colin Cross <ccross@android.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
To: Eric Miao <eric.y.miao@gmail.com>
To: Erik Gilling <konkers@android.com>
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
To: "H. Peter Anvin" <hpa@zytor.com>
To: Imre Kaloz <kaloz@openwrt.org>
To: Ingo Molnar <mingo@redhat.com>
To: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
To: Jesse Barnes <jbarnes@virtuousgeek.org>
To: Krzysztof Halasa <khc@pm.waw.pl>
To: Lennert Buytenhek <kernel@wantstofly.org>
To: Matt Turner <mattst88@gmail.com>
To: Nicolas Pitre <nico@fluxnic.net>
To: Olof Johansson <olof@lixom.net>
Acked-by: Paul Mundt <lethal@linux-sh.org>
To: Richard Henderson <rth@twiddle.net>
To: Russell King <linux@arm.linux.org.uk>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-pci@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-tegra@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The driver reads PCI vendor ID from the PCI configuration register while it is
already stored by the PCI subsystem in the 'vendor' field of 'struct pci_dev'...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The driver reads PCI subsystem IDs from the PCI configuration registers while
they are already stored by the PCI subsystem in the 'subsystem_{vendor|device}'
fields of 'struct pci_dev'...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When hot-plugging a root bridge, we always prevent assigning a bus number
that already exists. This makes sure we don't step over an existing bus.
But sometimes we only remove PCI device in PCI hieratchy of OS, i,e.
echo 1 > /sys/bus/pci/devices/.../remove
but actually don't hotplug this device out the platform, so in this case
we still should re-scan this bus to enumerate this device when re-scanning
PCI again.
Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In addition to native PCIe AER, now APEI (ACPI Platform Error
Interface) GHES (Generic Hardware Error Source) can be used to report
PCIe AER errors too. To add support to APEI GHES PCIe AER recovery,
aer_recover_queue is added to export the recovery function in native
PCIe AER driver.
Recoverable PCIe AER errors are reported via NMI in APEI GHES. Then
APEI GHES uses irq_work to delay the error processing into an IRQ
handler. But PCIe AER recovery can be very time-consuming, so
aer_recover_queue, which can be used in IRQ handler, delays the real
recovery action into the process context, that is, work queue.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
While it's declared static, etags points you to the wrong function
in drivers/acpi/dock.c and acpiphp_glue.c for example also makes
use of some (exported..) functions from this file.
If you trust etags and oversee the static declaration (what happened
to me) one gets totally confused...
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Ricoh 1180:e823 does not recognize certain types of SD/MMC cards,
as reported at http://launchpad.net/bugs/773524. Lowering the SD
base clock frequency from 200Mhz to 50Mhz fixes this issue. This
solution was suggest by Koji Matsumuro, Ricoh Company, Ltd.
This change has no negative performance effect on standard SD
cards, though it's quite possible that there will be one on
UHS-1 cards.
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Daniel Manrique <daniel.manrique@canonical.com>
Cc: Koji Matsumuro <matsumur@nts.ricoh.co.jp>
Cc: <stable@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Structs battery_file, acpi_dock_ops, file_operations,
thermal_cooling_device_ops, thermal_zone_device_ops, kernel_param_ops
are not changed in runtime. It is safe to make them const.
register_hotplug_dock_device() was altered to take const "ops" argument
to respect acpi_dock_ops' const notion.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Acked-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* pm-runtime:
OMAP: PM: disable idle on suspend for GPIO and UART
OMAP: PM: omap_device: add API to disable idle on suspend
OMAP: PM: omap_device: add system PM methods for PM domain handling
OMAP: PM: omap_device: conditionally use PM domain runtime helpers
PM / Runtime: Add new helper function: pm_runtime_status_suspended()
PM / Runtime: Consistent utilization of deferred_resume
PM / Runtime: Prevent runtime_resume from racing with probe
PM / Runtime: Replace "run-time" with "runtime" in documentation
PM / Runtime: Improve documentation of enable, disable and barrier
PM: Limit race conditions between runtime PM and system sleep (v2)
PCI / PM: Detect early wakeup in pci_pm_prepare()
PM / Runtime: Return special error code if runtime PM is disabled
PM / Runtime: Update documentation of interactions with system sleep
Multiple attempts to dynamically reallocate pci resources have
unfortunately lead to regressions. Though we continue to fix the
regressions and fine tune the dynamic-reallocation behavior, we have not
reached a acceptable state yet.
This patch provides a interim solution. It disables dynamic reallocation
by default, but adds the ability to enable it through pci=realloc kernel
command line parameter.
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
A subsequent patch is going to move the invocation of
pm_runtime_barrier() from dpm_prepare() to __device_suspend().
Consequently, early wakeup events resulting from runtime resume
requests for wakeup devices queued up right before system suspend
will only be detected after all of the subsystem-level .prepare()
callbacks have run. However, the PCI bus type calls
pm_runtime_get_sync() from its pci_pm_prepare() callback routine,
so it would destroy the early wakeup events information regarding PCI
devices. To prevent this from happening add an early wakeup
detection mechanism, analogous to the one currently in dpm_prepare(),
to pci_pm_prepare().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Merriam-Webster tells us that the word exists. However ...
* Google suggests `forcibly' because it doesn't recognize `forcedly'.
* Google lists 494 thousand results for `forcedly'.
* Google lists 13.7 million results for `forcibly'.
* Linus's repo contains 1 occurrence of `forcedly' ( 0 after my change).
* Linus's repo contains 60 occurrences of `forcibly' (61 after my change).
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
mmc: queue: bring discard_granularity/alignment into line with SCSI
mmc: queue: append partition subname to queue thread name
mmc: core: make erase timeout calculation allow for gated clock
mmc: block: switch card to User Data Area when removing the block driver
mmc: sdio: reset card during power_restore
mmc: cb710: fix #ifdef HAVE_EFFICIENT_UNALIGNED_ACCESS
mmc: sdhi: DMA slave ID 0 is invalid
mmc: tmio: fix regression in TMIO_MMC_WRPROTECT_DISABLE handling
mmc: omap_hsmmc: use original sg_len for dma_unmap_sg
mmc: omap_hsmmc: fix ocr mask usage
mmc: sdio: fix runtime PM path during driver removal
mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
mmc: sdhi: fix module unloading
mmc: of_mmc_spi: add NO_IRQ define to of_mmc_spi.c
mmc: vub300: fix null dereferences in error handling
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI/ACPI: fix type mismatch
PCI: fix new kernel-doc warning
PCI: Fix warning in drivers/pci/probe.c on sparc64
After commit e866500247
(PM: Allow pm_runtime_suspend() to succeed during system suspend) it
is possible that a device resumed by the pm_runtime_resume(dev) in
pci_pm_prepare() will be suspended immediately from a work item,
timer function or otherwise, defeating the very purpose of calling
pm_runtime_resume(dev) from there. To prevent that from happening
it is necessary to increment the runtime PM usage counter of the
device by replacing pm_runtime_resume() with pm_runtime_get_sync().
Moreover, the incremented runtime PM usage counter has to be
decremented by the corresponding pci_pm_complete(), via
pm_runtime_put_sync().
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.
Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.
As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.
Compile-tested on x86_64.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: Compare only lower 32 bits of framebuffer map offsets
drm/i915: Don't leak in i915_gem_shmem_pread_slow()
drm/radeon/kms: do bounds checking for 3D_LOAD_VBPNTR and bump array limit
drm/radeon/kms: fix mac g5 quirk
x86/uv/x2apic: update for change in pci bridge handling.
alpha, drm: Remove obsolete Alpha support in MGA DRM code
alpha/drm: Cleanup Alpha support in DRM generic code
savage: remove unnecessary if statement
drm/radeon: fix GUI idle IH debug statements
drm/radeon/kms: check modes against max pixel clock
drm: fix fbs in DRM_IOCTL_MODE_GETRESOURCES ioctl
When I added 3448a19da4
I forgot about the special uv handling code for this, so this
patch fixes it up.
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Ingo Molnar
Signed-off-by: Dave Airlie <airlied@redhat.com>
Several fixes as well where the +1 was missing.
Done via coccinelle scripts like:
@@
struct resource *ptr;
@@
- ptr->end - ptr->start + 1
+ resource_size(ptr)
and some grep and typing.
Mostly uncompiled, no cross-compilers.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc32, leon: bugfix in LEON SMP interrupt init
sparc32, sun4m: bugfix in SMP IPI traphandler
sparc: Remove unnecessary semicolons
Add support for allocating irqs for bootbus devices
Do not skip interrupt sources in sun4d interrupt handler and acknowledge interrupts correctly
Restructure sun4d_build_device_irq so that timer interrupts can be allocated
sparc: PCIC_PCI needs SPARC32 dependency
sparc: Do not select GENERIC_HARDIRQS_NO_DEPRECATED
sparc32,leon: add GRPCI2 PCI Host driver
sparc32,leon: added LEON-common low-level PCI routines
sparc32: added CONFIG_PCIC_PCI Kconfig setting
powerpc has two different ways of matching PCI devices to their
corresponding OF node (if any) for historical reasons. The ppc64 one
does a scan looking for matching bus/dev/fn, while the ppc32 one does a
scan looking only for matching dev/fn on each level in order to be
agnostic to busses being renumbered (which Linux does on some
platforms).
This removes both and instead moves the matching code to the PCI core
itself. It's the most logical place to do it: when a pci_dev is created,
we know the parent and thus can do a single level scan for the matching
device_node (if any).
The benefit is that all archs now get the matching for free. There's one
hook the arch might want to provide to match a PHB bus to its device
node. A default weak implementation is provided that looks for the
parent device device node, but it's not entirely reliable on powerpc for
various reasons so powerpc provides its own.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If CONFIG_PM is not set, init_iommu_pm_ops() introduced by commit
134fac3f45 (PCI / Intel IOMMU: Use
syscore_ops instead of sysdev class and sysdev) is not defined
appropriately. Fix this issue.
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The LEON architecture does not have a BIOS or bootloader that
initializes PCI for us, instead Linux generic PCI layer is used
to set up resources and IRQ.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.infradead.org/iommu-2.6:
intel-iommu: Fix off-by-one in RMRR setup
intel-iommu: Add domain check in domain_remove_one_dev_info
intel-iommu: Remove Host Bridge devices from identity mapping
intel-iommu: Use coherent DMA mask when requested
intel-iommu: Dont cache iova above 32bit
intel-iommu: Speed up processing of the identity_mapping function
intel-iommu: Check for identity mapping candidate using system dma mask
intel-iommu: Only unlink device domains from iommu
intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
intel-iommu: Flush unmaps at domain_exit
intel-iommu: Remove obsolete comment from detect_intel_iommu
intel-iommu: fix VT-d PMR disable for TXT on S3 resume
Fix pci.c kernel-doc warnings:
Warning(drivers/pci/pci.c:3292): No description found for parameter 'flags'
Warning(drivers/pci/pci.c:3292): Excess function parameter 'change_bridge_flags' description in 'pci_set_vga_state'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We were mapping an extra byte (and hence usually an extra page):
iommu_prepare_identity_map() expects to be given an 'end' argument which
is the last byte to be mapped; not the first byte *not* to be mapped.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The comment in domain_remove_one_dev_info() states "No need to compare
PCI domain; it has to be the same". But for the si_domain that isn't
going to be true, as it consists of all the PCI devices that are
identity mapped thus multiple PCI domains can be in si_domain. The
code needs to validate the PCI domain too.
Signed-off-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When using the 1:1 (identity) PCI DMA remapping, PCI Host Bridge devices
that do not use the IOMMU causes a kernel panic. Fix that by not
inserting those devices into the si_domain.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The __intel_map_single function is not honoring the passed in DMA mask.
This results in not using the coherent DMA mask when called from
intel_alloc_coherent().
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Mike Travis and Mike Habeck reported an issue where iova allocation
would return a range that was larger than a device's dma mask.
https://lkml.org/lkml/2011/3/29/423
The dmar initialization code will reserve all PCI MMIO regions and copy
those reservations into a domain specific iova tree. It is possible for
one of those regions to be above the dma mask of a device. It is typical
to allocate iovas with a 32bit mask (despite device's dma mask possibly
being larger) and cache the result until it exhausts the lower 32bit
address space. Freeing the iova range that is >= the last iova in the
lower 32bit range when there is still an iova above the 32bit range will
corrupt the cached iova by pointing it to a region that is above 32bit.
If that region is also larger than the device's dma mask, a subsequent
allocation will return an unusable iova and cause dma failure.
Simply don't cache an iova that is above the 32bit caching boundary.
Reported-by: Mike Travis <travis@sgi.com>
Reported-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Acked-by: Mike Travis <travis@sgi.com>
Tested-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When there are a large count of PCI devices, and the pass through
option for iommu is set, much time is spent in the identity_mapping
function hunting though the iommu domains to check if a specific
device is "identity mapped".
Speed up the function by checking the cached info to see if
it's mapped to the static identity domain.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The identity mapping code appears to make the assumption that if the
devices dma_mask is greater than 32bits the device can use identity
mapping. But that is not true: take the case where we have a 40bit
device in a 44bit architecture. The device can potentially receive a
physical address that it will truncate and cause incorrect addresses
to be used.
Instead check to see if the device's dma_mask is large enough
to address the system's dma_mask.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Mike Habeck <habeck@sgi.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Commit a97590e5 added unlinking domains from iommus to reciprocate the
iommu from domains unlinking that was already done. We actually want
to only do this for device domains and never for the static
identity map domain or VM domains. The SI domain is special and
never freed, while VM domain->id lives in their own special address
space, separate from iommu->domain_ids.
In the current code, a VM can get domain->id zero, then mark that
domain unused when unbound from pci-stub. This leads to DMAR
write faults when the device is re-bound to the host driver.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@kernel.org
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
There are no externally-visible changes with this. In the loop in the
internal __domain_mapping() function, we simply detect if we are mapping:
- size >= 2MiB, and
- virtual address aligned to 2MiB, and
- physical address aligned to 2MiB, and
- on hardware that supports superpages.
(and likewise for larger superpages).
We automatically use a superpage for such mappings. We never have to
worry about *breaking* superpages, since we trust that we will always
*unmap* the same range that was mapped. So all we need to do is ensure
that dma_pte_clear_range() will also cope with superpages.
Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
it can return a PTE at the appropriate level rather than always
extending the page tables all the way down to level 1. Again, this is
simplified by the fact that we should never encounter existing small
pages when we're creating a mapping; any old mapping that used the same
virtual range will have been entirely removed and its obsolete page
tables freed.
Provide an 'intel_iommu=sp_off' argument on the command line as a
chicken bit. Not that it should ever be required.
==
The original commit seen in the iommu-2.6.git was Youquan's
implementation (and completion) of my own half-baked code which I'd
typed into an email. Followed by half a dozen subsequent 'fixes'.
I've taken the unusual step of rewriting history and collapsing the
original commits in order to keep the main history simpler, and make
life easier for the people who are going to have to backport this to
older kernels. And also so I can give it a more coherent commit comment
which (hopefully) gives a better explanation of what's going on.
The original sequence of commits leading to identical code was:
Youquan Song (3):
intel-iommu: super page support
intel-iommu: Fix superpage alignment calculation error
intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()
David Woodhouse (4):
intel-iommu: Precalculate superpage support for dmar_domain
intel-iommu: Fix hardware_largepage_caps()
intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
IO_SPACE_LIMIT is currently used in two ways:
1) As a way to mask I/O port values read out of PCI base address
registers. This value should be 64-bit.
2) As a value which is the upper limit for all I/O "ports" in the
system.
On sparc64 we store the full 64-bit physical I/O address in the
resources. For this reason we define IO_SPACE_LIMIT at a 64-bit
"all 1's".
This is the right value to use for ioport_resource.end and for the
check made in drivers/pcmcia/rsrc_nonstatic.c:adjust_io().
But in driver/pci/probe.c:__pci_read_base() we mask this against
a "u32" variable and thus get the following warning:
drivers/pci/probe.c: In function ¡__pci_read_base¢:
drivers/pci/probe.c:207: warning: large integer implicitly truncated to unsigned type
Fix this by using an explicit "u32" cast.
I considered changing sparc64 to define a 32-bit "all 1's" like
most other systems do, but this wouldn't work because the checks
in PCMCIA's rsrc_nonstatic.c would no longer be right since they
are testing against fully formed 64-bit resources. As described
above, on sparc64 such resources will hold full 64-bit physical
I/O addresses, not bus-centric 32-bit ones.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI EC: remove redundant code
ACPI: Add D3 cold state
ACPI: processor: fix processor_physically_present in UP kernel
ACPI: Split out custom_method functionality into an own driver
ACPI: Cleanup custom_method debug stuff
ACPI EC: enable MSI workaround for Quanta laptops
ACPICA: Update to version 20110413
ACPICA: Execute an orphan _REG method under the EC device
ACPICA: Move ACPI_NUM_PREDEFINED_REGIONS to a more appropriate place
ACPICA: Update internal address SpaceID for DataTable regions
ACPICA: Add more methods eligible for NULL package element removal
ACPICA: Split all internal Global Lock functions to new file - evglock
ACPI: EC: add another DMI check for ASUS hardware
ACPI EC: remove dead code
ACPICA: Fix code divergence of global lock handling
ACPICA: Use acpi_os_create_lock interface
ACPI: osl, add acpi_os_create_lock interface
ACPI:Fix goto flows in thermal-sys
_SxW returns an Integer containing the lowest D-state supported in state
Sx. If OSPM has not indicated that it supports _PR3, then the value “3”
corresponds to D3. If it has indicated _PR3 support, the value “3”
represents D3hot and the value “4” represents D3cold.
Linux does set _OSC._PR3, so we should fix it to expect that _SxW can
return 4.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (169 commits)
drivers/gpu/drm/radeon/atom.c: fix warning
drm/radeon/kms: bump kms version number
drm/radeon/kms: properly set num banks for fusion asics
drm/radeon/kms/atom: move dig phy init out of modesetting
drm/radeon/kms/cayman: fix typo in register mask
drm/radeon/kms: fix typo in spread spectrum code
drm/radeon/kms: fix tile_config value reported to userspace on cayman.
drm/radeon/kms: fix incorrect comparison in cayman setup code.
drm/radeon/kms: add wait idle ioctl for eg->cayman
drm/radeon/cayman: setup hdp to invalidate and flush when asked
drm/radeon/evergreen/btc/fusion: setup hdp to invalidate and flush when asked
agp/uninorth: Fix lockups with radeon KMS and >1x.
drm/radeon/kms: the SS_Id field in the LCD table if for LVDS only
drm/radeon/kms: properly set the CLK_REF bit for DCE3 devices
drm/radeon/kms: fixup eDP connector handling
drm/radeon/kms: bail early for eDP in hotplug callback
drm/radeon/kms: simplify hotplug handler logic
drm/radeon/kms: rewrite DP handling
drm/radeon/kms/atom: add support for setting DP panel mode
drm/radeon/kms: atombios.h updates for DP panel mode
...
We typically batch unmaps to be lazily flushed out at
regular intervals. When we destroy a domain, we need
to force a flush of these lazy unmaps to be sure none
reference the domain we're about to free.
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=35062
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Since cacd4213d8, this comment no longer applies.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This patch is a follow on to https://lkml.org/lkml/2011/3/21/239, which
was merged as commit 51a63e67da.
This patch adds support for S3, as pointed out by Chris Wright.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
PCI: Don't use dmi_name_in_vendors in quirk
PCI: remove unused AER functions
PCI/sysfs: move bus cpuaffinity to class dev_attrs
PCI: add rescan to /sys/.../pci_bus/.../
PCI: update bridge resources to get more big ranges when allocating space (again)
KVM: Use pci_store/load_saved_state() around VM device usage
PCI: Add interfaces to store and load the device saved state
PCI: Track the size of each saved capability data area
PCI/e1000e: Add and use pci_disable_link_state_locked()
x86/PCI: derive pcibios_last_bus from ACPI MCFG
PCI: add latency tolerance reporting enable/disable support
PCI: add OBFF enable/disable support
PCI: add ID-based ordering enable/disable support
PCI hotplug: acpiphp: assume device is in state D0 after powering on a slot.
PCI: Set PCIE maxpayload for card during hotplug insertion
PCI/ACPI: Report _OSC control mask returned on failure to get control
x86/PCI: irq and pci_ids patch for Intel Panther Point DeviceIDs
PCI: handle positive error codes
PCI: check pci_vpd_pci22_wait() return
PCI: Use ICH6_GPIO_EN in ich6_lpc_acpi_gpio
...
Fix up trivial conflicts in include/linux/pci_ids.h: commit a6e5e2be44
moved the intel SMBUS ID definitons to the i2c-i801.c driver.
Don't use the costly dmi_name_in_vendors() when we know the string we
are looking for can only be in the DMI board name field. This is more
robust and, more importantly, much faster.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the commit 28eb5f2, aer_osc_setup is removed but corresponding
definiton information in the aerdrv.h is missed.
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Requested by Greg KH to fix a race condition in the creating of PCI bus
cpuaffinity files.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After remove the device from /sys, we have to rescan all or
find out the bridge and access /sys../device/rescan there.
this patch add /sys/.../pci_bus/.../rescan. So user can rescan more easy.
that is more clean and easy to understand.
like after remove 0000:c4:00.0, you can rescan 0000:c4 directly.
-v2: According to Jesse, use function instead of exposing attr, so could hide
#ifdef in header file.
also add code to remove rescan file in remove path.
-v3: GregKH pointed out that we should use dev_attrs to avoid racing.
So add pcibus_attrs and make it to be member of pcibus_attrs.
-v4: Change name to pcibus_dev_attrs according to GregKH
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
With Ram's fixes, this should be safe to do again. So let's give it
another try.
BIOS separates IO ranges between several IOHs, and on some slots, BIOS
assigns resources to a bridge, but stops assigning resources to the
device under that bridge, because the device needs a big resource.
So:
1. allocate resources and record the failed device resources
2. clear the BIOS assigned resources of the parent bridge of failing device
3. go back and call pci assign unassigned
4. if it still fails, go up the tree, clear more bridges. and try again
Now Ram's allocate requested resource already got into mainline. could
put this one again.
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For KVM device assignment, we'd like to save off the state of a device
prior to passing it to the guest and restore it later. We also want
to allow pci_reset_funciton() to be called while the device is owned
by the guest. This however overwrites and invalidates the struct pci_dev
buffers, so we can't just manually call save and restore. Add generic
interfaces for the saved state to be stored and reloaded back into
struct pci_dev at a later time.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This will allow us to store and load it later.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, gart: Rename pci-gart_64.c to amd_gart_64.c
x86/amd-iommu: Use threaded interupt handler
arch/x86/kernel/pci-iommu_table.c: Convert sprintf_symbol to %pS
x86/amd-iommu: Add support for invalidate_all command
x86/amd-iommu: Add extended feature detection
x86/amd-iommu: Add ATS enable/disable code
x86/amd-iommu: Add flag to indicate IOTLB support
x86/amd-iommu: Flush device IOTLB if ATS is enabled
x86/amd-iommu: Select PCI_IOV with AMD IOMMU driver
PCI: Move ATS declarations in seperate header file
dma-debug: print information about leaked entry
x86/amd-iommu: Flush all internal TLBs when IOMMUs are enabled
x86/amd-iommu: Rename iommu_flush_device
x86/amd-iommu: Improve handling of full command buffer
x86/amd-iommu: Rename iommu_flush* to domain_flush*
x86/amd-iommu: Remove command buffer resetting logic
x86/amd-iommu: Cleanup completion-wait handling
x86/amd-iommu: Cleanup inv_pages command handling
x86/amd-iommu: Move inv-dte command building to own function
x86/amd-iommu: Move compl-wait command building to own function
During pci remove/rescan testing found:
pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
pci 0000:c0:03.0: bridge window [io 0x1000-0x0fff]
pci 0000:c0:03.0: bridge window [mem 0xf0000000-0xf00fffff]
pci 0000:c0:03.0: bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
pci 0000:c0:03.0: device not available (can't reserve [io 0x1000-0x0fff])
pci 0000:c0:03.0: Error enabling bridge (-22), continuing
pci 0000:c0:03.0: enabling bus mastering
pci 0000:c0:03.0: setting latency timer to 64
pcieport 0000:c0:03.0: device not available (can't reserve [io 0x1000-0x0fff])
pcieport: probe of 0000:c0:03.0 failed with error -22
This bug was caused by commit c8adf9a3e8 ("PCI: pre-allocate
additional resources to devices only after successful allocation of
essential resources.")
After that commit, pci_hotplug_io_size is changed to additional_io_size
from minium size. So it will not go through resource_size(res) != 0
path, and will not be reset.
The root cause is: pci_bridge_check_ranges will set RESOURCE_IO flag for
pci bridge, and later if children do not need IO resource. those bridge
resources will not need to be allocated. but flags is still there.
that will confuse the the pci_enable_bridges later.
related code:
static void assign_requested_resources_sorted(struct resource_list *head,
struct resource_list_x *fail_head)
{
struct resource *res;
struct resource_list *list;
int idx;
for (list = head->next; list; list = list->next) {
res = list->res;
idx = res - &list->dev->resource[0];
if (resource_size(res) && pci_assign_resource(list->dev, idx)) {
...
reset_resource(res);
}
}
}
At last, We have to clear the flags in pbus_size_mem/io when requested
size == 0 and !add_head. becasue this case it will not go through
adjust_resources_sorted().
Just make size1 = size0 when !add_head. it will make flags get cleared.
At the same time when requested size == 0, add_size != 0, will still
have in head and add_list. because we do not clear the flags for it.
After this, we will get right result:
pci 0000:c0:03.0: PCI bridge to [bus c4-c9]
pci 0000:c0:03.0: bridge window [io disabled]
pci 0000:c0:03.0: bridge window [mem 0xf0000000-0xf00fffff]
pci 0000:c0:03.0: bridge window [mem 0xfc180000000-0xfc197ffffff 64bit pref]
pci 0000:c0:03.0: enabling bus mastering
pci 0000:c0:03.0: setting latency timer to 64
pcieport 0000:c0:03.0: setting latency timer to 64
pcieport 0000:c0:03.0: irq 160 for MSI/MSI-X
pcieport 0000:c0:03.0: Signaling PME through PCIe PME interrupt
pci 0000:c4:00.0: Signaling PME through PCIe PME interrupt
pcie_pme 0000:c0:03.0:pcie01: service driver pcie_pme loaded
aer 0000:c0:03.0:pcie02: service driver aer loaded
pciehp 0000:c0:03.0:pcie04: Hotplug Controller:
v3: more simple fix. also fix one typo in pbus_size_mem
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Latency tolerance reporting allows devices to send messages to the root
complex indicating their latency tolerance for snooped & unsnooped
memory transactions. Add support for enabling & disabling this
feature, along with a routine to set the max latencies a device should
send upstream.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
OBFF (optimized buffer flush/fill), where supported, can help improve
energy efficiency by giving devices information about when interrupts
and other activity will have a reduced power impact. It requires
support from both the device and system (i.e. not only does the device
need to respond to OBFF messages, but the platform must be capable of
generating and routing them to the end point).
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add support to allow drivers to enable/disable ID-based ordering. Where
supported, ID-based ordering can significantly improve the latency of
individual requests by preventing them from queueing up behind unrelated
traffic.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Devices which do not support PCI configuration space based power
management may not otherwise be enabled.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The following patch sets the MaxPayload setting to match the parent
reading when inserting a PCIE card into a hotplug slot. On our system,
the upstream bridge is set to 256, but when inserting a card, the card
setting defaults to 128. As soon as I/O is performed to the card it
starts receiving errors since the payload size is too small.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Callers expect pci_user_{read,write}_config_*() to indicate errors by
returning negative values. Prior to this change, the indicated routines
could return positive error codes (e.g. PCIBIOS_BAD_REGISTER_NUMBER)
which callers would mistakenly interpret as success.
This change converts any non-zero return from the mentioned routines
into unambiguous negative value return codes.
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_vpd_pci22_write() calls pci_vpd_pci22_wait() after writing
PCI_VPD_DATA and PCI_VPD_ADDR to wait for the VPD operation to complete.
The result pci_vpd_pci22_wait() was not checked for error.
This change checks for error.
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We were just lucky that ICH4_GPIO_EN and ICH6_GPIO_EN happen to have
the same value.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
TI816X (common name for DM816x/C6A816x/AM389x family) devices configured
to boot as PCIe Endpoint have class code = 0. This makes kernel PCI bus
code to skip allocating BARs to these devices resulting into following
type of error when trying to enable them:
"Device 0000:01:00.0 not available because of resource collisions"
The device cannot be operated because of the above issue.
This patch adds a ID specific (TI VENDOR ID and 816X DEVICE ID based)
'early' fixup quirk to replace class code with
PCI_CLASS_MULTIMEDIA_VIDEO as class.
Signed-off-by: Hemant Pedanekar <hemantp@ti.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If it was preempted, and the variable aer_mask_override is changed
after the spin_unlock_irqrestore it will write an uninitialized
variable by the pci_write_config_dword() function.
Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The pci_pm_reset() function is not a very nice interface due to its
limitations and conditional behavior (e.g. it doesn't affect devices
in low-power states), but it cannot be simply dropped, because
existing device drivers may depend on it. However, its behavior and
limitations should be well documented, so add an appropriate
kerneldoc comment to it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 2f671e2d allowed us to clear ASPM state when the FADT
tells us it isn't supported, but we don't put this into effect
if the aspm_policy is set to POLICY_POWERSAVE. Enable the
state to be cleared regardless of policy.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
So in a lot of modern systems, a GPU will always be below a parent bridge that won't share with any other GPUs. This means VGA arbitration on those GPUs can be controlled by using the bridge routing instead of io/mem decodes.
The problem is locating which GPUs share which upstream bridges. This patch attempts to identify all the GPUs which can be controlled via bridges, and ones that can't. This patch endeavours to work out the bridge sharing semantics.
When disabling GPUs via a bridge, it doesn't do irq callbacks or touch the io/mem decodes for the gpu.
Signed-off-by: Dave Airlie <airlied@redhat.com>
* git://git.infradead.org/iommu-2.6:
intel_iommu: disable all VT-d PMRs when TXT launched
intel-iommu: Fix get_domain_for_dev() error path
intel-iommu: Unlink domain from iommu
intel-iommu: Fix use after release during device attach
Intel VT-d Protected Memory Regions (PMRs) are supposed to be disabled,
on each VT-d engine, after DMA remapping is enabled on the engines.
This is because the behavior of having both enabled is not deterministic
and because, if TXT has been used to launch the kernel, the PMRs may be
programmed to cover memory regions that will be used for DMA.
Under some circumstances (certain quirks detected, lack of multiple
devices, etc.), the current code does not set up DMA remapping on some
VT-d engines. In such cases it also skips disabling the PMRs. This
causes failures when the kernel is launched with TXT (most often this
occurs on the graphics engine and results in colored vertical bars on
the display).
This patch detects when the kernel has been launched with TXT and then
disables the PMRs on all VT-d engines. In some cases where the reason
that remapping is not being enabled is due to possible ACPI DMAR table
errors, the VT-d engine addresses may not be correct and thus not able
to be safely programmed even to disable PMRs. Because part of the TXT
launch process is the verification of these addresses, it will always be
safe to disable PMRs if the TXT launch has succeeded and hence only
doing this in such cases.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: pci-label: Fix build failure when CONFIG_NLS is set to 'm' by allmodconfig
Create a kconfig option symbol for PCI_LABEL and enable it
when DMI || ACPI are enabled.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Xen save/restore is going to use hibernate device callbacks for
quiescing devices and putting them back to normal operations and it
would need to select CONFIG_HIBERNATION for this purpose. However,
that also would cause the hibernate interfaces for user space to be
enabled, which might confuse user space, because the Xen kernels
don't support hibernation. Moreover, it would be wasteful, as it
would make the Xen kernels include a substantial amount of code that
they would never use.
To address this issue introduce new power management Kconfig option
CONFIG_HIBERNATE_CALLBACKS, such that it will only select the code
that is necessary for the hibernate device callbacks to work and make
CONFIG_HIBERNATION select it. Then, Xen save/restore will be able to
select CONFIG_HIBERNATE_CALLBACKS without dragging the entire
hibernate code along with it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
In commit 13583b1659 ("PCI: refactor io size calculation code") Ram
had a thinko in the refactorization of the code: the end result used the
variable 'align' for the bus alignment, but the original code used
'min_align'.
Since then, another use of that 'align' variable got introduced by
commit c8adf9a3e8 ("PCI: pre-allocate additional resources to devices
only after successful allocation of essential resources.")
Fix both of those uses to use 'min_align' as they should.
Daniel Hellstrom <daniel@gaisler.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves the relevant declarations from the local
header file in drivers/pci to a more accessible locations so
that it can be used by the AMD IOMMU driver too.
The file is named pci-ats.h because support for the PCI PRI
capability will also be added there in a later patch-set.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'syscore' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
Introduce ARCH_NO_SYSDEV_OPS config option (v2)
cpufreq: Use syscore_ops for boot CPU suspend/resume (v2)
KVM: Use syscore_ops instead of sysdev class and sysdev
PCI / Intel IOMMU: Use syscore_ops instead of sysdev class and sysdev
timekeeping: Use syscore_ops instead of sysdev class and sysdev
x86: Use syscore_ops instead of sysdev classes and sysdevs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: Disable ASPM when _OSC control is not granted for PCIe services
PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs
PCI: PCIe links may not get configured for ASPM under POWERSAVE mode
PCI/ACPI: Report ASPM support to BIOS if not disabled from command line
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (42 commits)
ACPI: minor printk format change in acpi_pad
ACPI: make acpi_pad /sys output more readable
ACPICA: Update version to 20110316
ACPICA: Header support for SLIC table
ACPI: Make sure the FADT is at least rev 2 before using the reset register
ACPI: Bug compatibility for Windows on the ACPI reboot vector
ACPICA: Fix access width for reset vector
ACPI battery: fribble sysfs files from a resume notifier
ACPI button: remove unused procfs I/F
ACPI, APEI, Add PCIe AER error information printing support
PCIe, AER, use pre-generated prefix in error information printing
ACPI, APEI, Add ERST record ID cache
ACPI: Use syscore_ops instead of sysdev class and sysdev
ACPI: Remove the unused EC sysdev class
ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree
ACPI: use __init where possible in processor driver
Thermal_Framework-Fix_crash_during_hwmon_unregister
ACPICA: Update version to 20110211.
ACPICA: Add mechanism to defer _REG methods for some installed handlers
ACPICA: Add support for FunctionalFixedHW in acpi_ut_get_region_name
...
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Intel IOMMU subsystem uses a sysdev class and a sysdev for
executing iommu_suspend() after interrupts have been turned off
on the boot CPU (during system suspend) and for executing
iommu_resume() before turning on interrupts on the boot CPU
(during system resume). However, since both of these functions
ignore their arguments, the entire mechanism may be replaced with a
struct syscore_ops object which is simpler.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
The AER error information printing support is implemented in
drivers/pci/pcie/aer/aer_print.c. So some string constants, functions
and macros definitions can be re-used without being exported.
The original PCIe AER error information printing function is not
re-used directly because the overall format is quite different. And
changing the original printing format may make some original users'
scripts broken.
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
When printing PCIe AER error information, each line is prefixed with
PCIe device and driver information. In original implementation, the
prefix is generated when each line is printed. In fact, all lines
share the same prefix. So this patch pre-generated the prefix, and
use that one when each line is printed.
In addition to common prefix can be pre-generated, the trailing white
spaces in string constants and NULLs in char * array constants can be
removed too. These can reduce the object file size further.
The size of object file before and after changing is as follow:
text data bss dec
before: 3038 0 0 3038
after: 2118 0 0 2118
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
v3 -> v2: Added text to describe the problem
v2 -> v1: Split this patch from v1
v1 : Part of: http://marc.info/?l=linux-pci&m=130042212003242&w=2
Disable ASPM when no _OSC control for PCIe services is granted
by the BIOS. This is to protect systems with a buggy BIOS that
did not set the ACPI FADT "ASPM Controls" bit even though the
underlying HW can't do ASPM.
To turn "on" ASPM the minimum the BIOS needs to do:
1. Clear the ACPI FADT "ASPM Controls" bit.
2. Support _OSC appropriately
There is no _OSC Control bit for ASPM. However, we expect the BIOS to
support _OSC for a Root Bridge that originates a PCIe hierarchy. If this
is not the case - we are better off not enabling ASPM on that server.
Commit 852972acff (ACPI: Disable ASPM if the
Platform won't provide _OSC control for PCIe) describes the above scenario.
To quote verbatim from there:
[The PCI SIG documentation for the _OSC OS/firmware handshaking interface
states:
"If the _OSC control method is absent from the scope of a host bridge
device, then the operating system must not enable or attempt to use any
features defined in this section for the hierarchy originated by the host
bridge."
The obvious interpretation of this is that the OS should not attempt to use
PCIe hotplug, PME or AER - however, the specification also notes that an
_OSC method is *required* for PCIe hierarchies, and experimental validation
with An Alternative OS indicates that it doesn't use any PCIe functionality
if the _OSC method is missing. That arguably means we shouldn't be using
MSI or extended config space, but right now our problems seem to be limited
to vendors being surprised when ASPM gets enabled on machines when other
OSs refuse to do so. So, for now, let's just disable ASPM if the _OSC
method doesn't exist or refuses to hand over PCIe capability control.]
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
v3 -> v2: Modified the text that describes the problem
v2 -> v1: Returned -EPERM
v1 : http://marc.info/?l=linux-pci&m=130013194803727&w=2
For servers whose hardware cannot handle ASPM the BIOS ought to set the
FADT bit shown below:
In Sec 5.2.9.3 (IA-PC Boot Arch. Flags) of ACPI4.0a Specification, please
see Table 5-11:
PCIe ASPM Controls: If set, indicates to OSPM that it must not enable
OPSM ASPM control on this platform.
However there are shipping servers whose BIOS did not set this bit. (An
example is the HP ProLiant DL385 G6. A Maintenance BIOS will fix that).
For such servers even if a call is made via pci_no_aspm(), based on _OSC
support in the BIOS, it may be too late because the ASPM code may have
already allocated and filled its "link_list".
So if a user sets the ASPM "policy" to "powersave" via /sys then
pcie_aspm_set_policy() will run through the "link_list" and re-configure
ASPM policy on devices that advertise ASPM L0s/L1 capability:
# echo powersave > /sys/module/pcie_aspm/parameters/policy
# cat /sys/module/pcie_aspm/parameters/policy
default performance [powersave]
That can cause NMIs since the hardware doesn't play well with ASPM:
[ 1651.906015] NMI: PCI system error (SERR) for reason b1 on CPU 0.
[ 1651.906015] Dazed and confused, but trying to continue
Ideally, the BIOS should have set that FADT bit in the first place but we
could be more robust - especially given the fact that Windows doesn't
cause NMIs in the above scenario.
There should be a sanity check to not allow a user to modify ASPM policy
when aspm_disabled is set.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
v3 -> v2: Moved ASPM enabling logic to pci_set_power_state()
v2 -> v1: Preserved the logic in pci_raw_set_power_state()
: Added ASPM enabling logic after scanning Root Bridge
: http://marc.info/?l=linux-pci&m=130046996216391&w=2
v1 : http://marc.info/?l=linux-pci&m=130013164703283&w=2
The assumption made in commit 41cd766b06
(PCI: Don't enable aspm before drivers have had a chance to veto it) that
pci_enable_device() will result in re-configuring ASPM when aspm_policy is
POWERSAVE is no longer valid. This is due to commit
97c145f7c8 (PCI: read current power state
at enable time) which resets dev->current_state to D0. Due to this the
call to pcie_aspm_pm_state_change() is never made. Note the equality check
(below) that returns early:
./drivers/pci/pci.c: pci_raw_set_pci_power_state()
546 /* Check if we're already there */
547 if (dev->current_state == state)
548 return 0;
Therefore OSPM never configures the PCIe links for ASPM to turn them "on".
Fix it by configuring ASPM from the pci_enable_device() code path. This
also allows a driver such as the e1000e networking driver a chance to
disable ASPM (L0s, L1), if need be, prior to enabling the device. A
driver may perform this action if the device is known to mis-behave
wrt ASPM.
Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We need to distinguish the situation in which ASPM support is
disabled from the command line or through .config from the situation
in which it is disabled, because the hardware or BIOS can't handle
it. In the former case we should not report ASPM support to the BIOS
through ACPI _OSC, but in the latter case we should do that.
Introduce pcie_aspm_support_enabled() that can be used by
acpi_pci_root_add() to determine whether or not it should report ASPM
support to the BIOS through _OSC.
Cc: stable@kernel.org
References: https://bugzilla.kernel.org/show_bug.cgi?id=29722
References: https://bugzilla.kernel.org/show_bug.cgi?id=20232
Reported-and-tested-by: Ortwin Glück <odi@odi.ch>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Tested-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: label: remove #include of ACPI header to avoid warnings
PCI: label: Fix compilation error when CONFIG_ACPI is unset
PCI: pre-allocate additional resources to devices only after successful allocation of essential resources.
PCI: introduce reset_resource()
PCI: data structure agnostic free list function
PCI: refactor io size calculation code
PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH
PCI hotplug: acpiphp: set current_state to D0 in register_slot
PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs
PCI: add more checking to ICH region quirks
PCI: aer-inject: Override PCIe AER Mask Registers
PCI: fix tlan build when CONFIG_PCI is not enabled
PCI: remove quirk for pre-production systems
PCI: Avoid potential NULL pointer dereference in pci_scan_bridge
PCI/lpc: irq and pci_ids patch for Intel DH89xxCC DeviceIDs
PCI: sysfs: Fix failure path for addition of "vpd" attribute
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/epip/linux-2.6-unicore32: (40 commits)
unicore32: rewrite arch-specific tlb.h to use asm-generic version
unicore32: modify io_p2v and io_v2p macros, and adjust PKUNITY_mmio_BASEs
unicore32: replace unicore32-specific iomap functions with generic lib implementation
unicore32 machine related: add frame buffer driver for pkunity-v3 soc
unicore32 machine related files: add i2c bus drivers for pkunity-v3 soc
unicore32 io: redefine __REG(x) and re-use readl/writel funcs
unicore32 i8042 upgrade and bugfix: adjust resource request region type
unicore32 upgrade to v2.6.38-rc5: add one more paramter for pte_alloc_map call
unicore32 i8042: adjust io funcs of i8042-unicore32io.h
unicore32: rename PKUNITY_IOSPACE_BASE to PKUNITY_MMIO_BASE
unicore32: modify function names and parameters for irq_chips
unicore32: remove unused lines in arch/unicore32/include/asm/irq.h
unicore32 time.c: change calculate method for clock_event_device
unicore32: ADD MAINTAINER for unicore32 architecture
unicore32 machine related files: ps2 driver
unicore32 machine related files: pci bus handling
unicore32 machine related files: hardware registers
unicore32 machine related files: core files
unicore32 additional architecture files: boot process
unicore32 additional architecture files: low-level lib: misc
...
Acked-by: Arnd Bergmann <arnd@arndb.de>
I found that including acpi/apci_drivers.h is not necessary and
introduces these warnings:
In file included from drivers/pci/pci-label.c:32:
include/acpi/acpi_drivers.h:103: warning: ‘struct acpi_device’ declared inside parameter list
include/acpi/acpi_drivers.h:103: warning: its scope is only this definition or declaration, which is probably not what you want
include/acpi/acpi_drivers.h:107: warning: ‘struct acpi_pci_root’ declared inside parameter list
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch fixes compilation error descibed below introduced by
the commit 6058989bad
drivers/pci/pci-label.c: In function ‘pci_create_firmware_label_files’:
drivers/pci/pci-label.c:366:2: error: implicit declaration of function ‘device_has_dsm’
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (21 commits)
PM / Hibernate: Reduce autotuned default image size
PM / Core: Introduce struct syscore_ops for core subsystems PM
PM QoS: Make pm_qos settings readable
PM / OPP: opp_find_freq_exact() documentation fix
PM: Documentation/power/states.txt: fix repetition
PM: Make system-wide PM and runtime PM treat subsystems consistently
PM: Simplify kernel/power/Kconfig
PM: Add support for device power domains
PM: Drop pm_flags that is not necessary
PM: Allow pm_runtime_suspend() to succeed during system suspend
PM: Clean up PM_TRACE dependencies and drop unnecessary Kconfig option
PM: Remove CONFIG_PM_OPS
PM: Reorder power management Kconfig options
PM: Make CONFIG_PM depend on (CONFIG_PM_SLEEP || CONFIG_PM_RUNTIME)
PM / ACPI: Remove references to pm_flags from bus.c
PM: Do not create wakeup sysfs files for devices that cannot wake up
USB / Hub: Do not call device_set_wakeup_capable() under spinlock
PM: Use appropriate printk() priority level in trace.c
PM / Wakeup: Don't update events_check_enabled in pm_get_wakeup_count()
PM / Wakeup: Make pm_save_wakeup_count() work as documented
...
After redefining CONFIG_PM to depend on (CONFIG_PM_SLEEP ||
CONFIG_PM_RUNTIME) the CONFIG_PM_OPS option is redundant and can be
replaced with CONFIG_PM.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
If we run out of domain_ids and fail iommu_attach_domain(), we
fall into domain_exit() without having setup enough of the
domain structure for this to do anything useful. In fact, it
typically runs off into the weeds walking the bogus domain->devices
list. Just free the domain.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
When we remove a device, we unlink the iommu from the domain, but
we never do the reverse unlinking of the domain from the iommu.
This means that we never clear iommu->domain_ids, eventually leading
to resource exhaustion if we repeatedly bind and unbind a device
to a driver. Also free empty domains to avoid a resource leak.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Linux tries to pre-allocate minimal resources to hotplug bridges. This
works fine as long as there are enough resources to satisfy all other
genuine resource requirements. However if enough resources are not
available to satisfy any of these nice-to-have pre-allocations, the
resource-allocator reports errors and returns failure.
This patch distinguishes between must-have resource from nice-to-have
resource. Any failure to allocate nice-to-have resources are ignored.
This behavior can be particularly useful to trigger automatic
reallocation when the OS discovers genuine allocation-conflicts or
genuine unallocated-requests caused by buggy allocation behavior of the
native BIOS/uEFI.
https://bugzilla.kernel.org/show_bug.cgi?id=15960 captures the
movitation behind the patch. This patch is verified to resolve the above
bug.
changelog v2: o fixed a bug where pci_assign_resource() was called on a
resource of zero resource size.
changelog v3: addressed Bjorn's comment
o "Please don't indent and right-justify the changelog".
o removed add_size from struct resource. The additional
size is now tracked using a linked list.
changelog v4: o moved freeing up of elements in head list from
assign_requested_resources_sorted() to
__assign_resources_sorted().
o removed a wrong reference to 'add_size' in
pbus_size_mem().
o some code optimizations in adjust_resources_sorted()
and assign_requested_resources_sorted()
changelog v5: o moved freeing up of elements in head list from
assign_requested_resources_sorted() to
__assign_resources_sorted().
o removed a wrong reference to 'add_size' in
pbus_size_mem().
o some code optimizations in adjust_resources_sorted()
and assign_requested_resources_sorted()
changelog v5: o factored out common code and made them into
separate independent patches
o added comments in kdoc format
o added a BUG_ON in pci_assign_unassigned_resources()
to catch for memory leak.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Replace free_failed_list() with a free_list() call. free_list() can
handle 'resource_list_x', 'resource_list' and any linked list linked
through ->next
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Refactor code that calculates the io size in pbus_size_io() and
pbus_mem_io() into separate functions.
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some broken BIOSes on ICH4 chipset report an ACPI region which is in
conflict with legacy IDE ports when ACPI is disabled. Even though the
regions overlap, IDE ports are working correctly (we cannot find out
the decoding rules on chipsets).
So the only problem is the reported region itself, if we don't reserve
the region in the quirk everything works as expected.
This patch avoids reserving any quirk regions below PCIBIOS_MIN_IO
which is 0x1000. Some regions might be (and are by a fast google
query) below this border, but the only difference is that they won't
be reserved anymore. They should still work though the same as before.
The conflicts look like (1f.0 is bridge, 1f.1 is IDE ctrl):
pci 0000:00:1f.1: address space collision: [io 0x0170-0x0177] conflicts with 0000:00:1f.0 [io 0x0100-0x017f]
At 0x0100 a 128 bytes long ACPI region is reported in the quirk for
ICH4. ata_piix then fails to find disks because the IDE legacy ports
are zeroed:
ata_piix 0000:00:1f.1: device not available (can't reserve [io 0x0000-0x0007])
References: https://bugzilla.novell.com/show_bug.cgi?id=558740
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If a device doesn't support power management (pm_cap == 0) but it is
acpi_pci_power_manageable() because there is a _PS0 method declared for
it and _EJ0 is also declared for the slot then nobody is going to set
current_state = PCI_D0 for this device. This is what I think it is
happening:
pci_enable_device
|
__pci_enable_device_flags
/* here we do not set current_state because !pm_cap */
|
do_pci_enable_device
|
pci_set_power_state
|
__pci_start_power_transition
|
pci_platform_power_transition
/* platform_pci_power_manageable() calls acpi_pci_power_manageable that
* returns true */
|
platform_pci_set_power_state
/* acpi_pci_set_power_state gets called and does nothing because the
* acpi device has _EJ0, see the comment "If the ACPI device has _EJ0,
* ignore the device" */
at this point if we refer to the commit message that introduced the
comment above (10b3dcae0f), it is up to
the hotplug driver to set the state to D0.
However AFAICT the pci hotplug driver never does, in fact
drivers/pci/hotplug/acpiphp_glue.c:register_slot sets the slot flags to
(SLOT_ENABLED | SLOT_POWEREDON) but it does not set the pci device
current state to PCI_D0.
So my proposed fix is also to set current_state = PCI_D0 in
register_slot.
Comments are very welcome.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch exports ACPI _DSM (Device Specific Method) provided firmware
instance number and string name of PCI devices as defined by 'PCI
Firmware Specification Revision 3.1' section 4.6.7.( DSM for Naming a
PCI or PCI Express Device Under Operating Systems) to sysfs.
New files created are:
/sys/bus/pci/devices/.../label which contains the firmware name for
the device in question, and
/sys/bus/pci/devices/.../acpi_index which contains the firmware device type
instance for the given device.
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/acpi_index
1
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/label
Embedded Broadcom 5709C NIC 1
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/acpi_index
2
cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/label
Embedded Broadcom 5709C NIC 2
The ACPI _DSM provided firmware 'instance number' and 'string name' will
be given priority if the firmware also provides 'SMBIOS type 41 device
type instance and string'.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Per ICH4 and ICH6 specs, ACPI and GPIO regions are valid iff ACPI_EN
and GPIO_EN bits are set to 1. Add checks for these bits into the
quirks prior to the region creation.
While at it, name the constants by macros.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Renninger <trenn@suse.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I have several systems which have the same problem: The PCIe AER
corrected and uncorrected masks have all the error bits set. This
results in the inablility to test with the aer_inject module & utility
on those systems.
Add the 'aer_mask_override' module parameter which will override the
corrected or uncorrected masks for a PCI device. The mask will have the
bit corresponding to the status passed into the aer_inject() function.
After this patch it is possible to successfully use the aer_inject
utility on those PCI slots.
Successfully tested by me on a Dell and Intel whitebox which exhibited
the mask problem.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The wakeup.run_wake_count ACPI device field is only used by the PCI
runtime PM code to "protect" devices from being prepared for
generating wakeup signals more than once in a row. However, it
really doesn't provide any protection, because (1) all of the
functions it is supposed to protect use their own reference counters
effectively ensuring that the device will be set up for generating
wakeup signals just once and (2) the PCI runtime PM code uses
wakeup.run_wake_count in a racy way, since nothing prevents
acpi_dev_run_wake() from being called concurrently from two different
threads for the same device.
Remove the wakeup.run_wake_count ACPI device field which is
unnecessary, confusing and used in a wrong way.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cleanup code. Cosmetic change to make the code look easier
to read.
Reviewed-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Check the returned vector values for any values that are
odd or plain incorrect (say vector value zero), and if so
print a warning. Also fixup the return values.
This patch was precipiated by the Xen PCIBack returning the
incorrect values due to how it was retrieving PIRQ values.
This has been fixed in the xen-pciback by
"xen/pciback: Utilize 'xen_pirq_from_irq' to get PIRQ value"
patch.
Reviewed-by: Ian Campbell <Ian.Campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
flush_scheduled_work() is scheduled for deprecation. Cancel ->op_work
directly instead.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ryan Wilson <hap9@epoch.ncsc.mil>
Cc: Jan Beulich <JBeulich@novell.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Revert commit 7eb93b175d
Author: Yu Zhao <yu.zhao@intel.com>
Date: Fri Apr 3 15:18:11 2009 +0800
PCI: SR-IOV quirk for Intel 82576 NIC
If BIOS doesn't allocate resources for the SR-IOV BARs, zero the Flash
BAR and program the SR-IOV BARs to use the old Flash Memory Space.
Please refer to Intel 82576 Gigabit Ethernet Controller Datasheet
section 7.9.2.14.2 for details.
http://download.intel.com/design/network/datashts/82576_Datasheet.pdf
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This quirk was added before SR-IOV was in production and now all machines that
originally had this issue alreayd have bios updates to correct the issue. The
quirk itself is no longer needed and in fact causes bugs if run. Remove it.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Yu Zhao <yu.zhao@intel.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This reintroduces commit 47970b1b which was subsequently reverted
as f00eaeea. The original change was broken and caused X startup
failures and generally made privileged processes incapable of reading
device dependent config space. The normal capable() interface returns
true on success, but the LSM interface returns 0 on success. This thinko
is now fixed in this patch, and has been confirmed to work properly.
So, once again...Eric Paris noted that commit de139a3 ("pci: check caps
from sysfs file open to read device dependent config space") caused the
capability check to bypass security modules and potentially auditing.
Rectify this by calling security_capable() when checking the open file's
capabilities for config space reads.
Reported-by: Eric Paris <eparis@redhat.com>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Alex Riesen <raa.lkml@gmail.com>
Cc: Sedat Dilek <sedat.dilek@googlemail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
This reverts commit 47970b1b2a.
It turns out it breaks several distributions. Looks like the stricter
selinux checks fail due to selinux policies not being set to allow the
access - breaking X, but also lspci.
So while the change was clearly the RightThing(tm) to do in theory, in
practice we have backwards compatibility issues making it not work.
Reported-by: Dave Young <hidave.darkstar@gmail.com>
Acked-by: David Airlie <airlied@linux.ie>
Acked-by: Alex Riesen <raa.lkml@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Chris Wright <chrisw@sous-sol.org>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Paris noted that commit de139a3 ("pci: check caps from sysfs file
open to read device dependent config space") caused the capability check
to bypass security modules and potentially auditing. Rectify this by
calling security_capable() when checking the open file's capabilities
for config space reads.
Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
pci_add_new_bus() calls pci_alloc_child_bus() which calls pci_alloc_bus()
that allocates memory dynamically with kzalloc(). The return value of
kzalloc() is the pointer that's eventually returned from
pci_add_new_bus(), so since kzalloc() can fail and return NULL so can
pci_add_new_bus(). Thus we may end up dereferencing a NULL pointer in
drivers/pci/probe.c::pci_scan_bridge(). Seems to me we should test for
this and bail out if it happens rather than crashing.
Also removed some trailing whitespace that bugged me while looking at
this.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 280c73d ("PCI: centralize the capabilities code in
pci-sysfs.c") changed the initialisation of the "rom" and "vpd"
attributes, and made the failure path for the "vpd" attribute
incorrect. We must free the new attribute structure (attr), but
instead we currently free dev->vpd->attr. That will normally be NULL,
resulting in a memory leak, but it might be a stale pointer, resulting
in a double-free.
Found by inspection; compile-tested only.
Cc: stable@kernel.org
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
is used to configure any non-standard kernel with a much larger scope than
only small devices.
This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
references to the option throughout the kernel. A new CONFIG_EMBEDDED
option is added that automatically selects CONFIG_EXPERT when enabled and
can be used in the future to isolate options that should only be
considered for embedded systems (RISC architectures, SLOB, etc).
Calling the option "EXPERT" more accurately represents its intention: only
expert users who understand the impact of the configuration changes they
are making should enable it.
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David Woodhouse <david.woodhouse@intel.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Robin Holt <holt@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Obtain the new pgd pointer before releasing the page containing this
value.
Cc: stable@kernel.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI/PM: Report wakeup events before resuming devices
PCI/PM: Use pm_wakeup_event() directly for reporting wakeup events
PCI: sysfs: Update ROM to include default owner write access
x86/PCI: make Broadcom CNB20LE driver EMBEDDED and EXPERIMENTAL
x86/PCI: don't use native Broadcom CNB20LE driver when ACPI is available
PCI/ACPI: Request _OSC control once for each root bridge (v3)
PCI: enable pci=bfsort by default on future Dell systems
PCI/PCIe: Clear Root PME Status bits early during system resume
PCI: pci-stub: ignore zero-length id parameters
x86/PCI: irq and pci_ids patch for Intel Patsburg
PCI: Skip id checking if no id is passed
PCI: fix __pci_device_probe kernel-doc warning
PCI: make pci_restore_state return void
PCI: Disable ASPM if BIOS asks us to
PCI: Add mask bit definition for MSI-X table
PCI: MSI: Move MSI-X entry definition to pci_regs.h
Fix up trivial conflicts in drivers/net/{skge.c,sky2.c} that had in the
meantime been converted to not use legacy PCI power management, and thus
no longer use pci_restore_state() at all (and that caused trivial
conflicts with the "make pci_restore_state return void" patch)
Make wakeup events be reported by the PCI subsystem before attempting to
resume devices or queuing up runtime resume requests for them, because
wakeup events should be reported as soon as they have been detected.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After recent changes related to wakeup events pm_wakeup_event()
automatically checks if the given device is configured to signal wakeup,
so pci_wakeup_event() may be a static inline function calling
pm_wakeup_event() directly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCI sysfs ROM interface requires an enabling write to access the ROM
image, but the default file mode is 0400. The original proposed patch
adding sysfs ROM support was a true read-only interface, with the
enabling bit coming in as a feature request. I suspect it was simply an
oversight that the file mode didn't get updated to match the API.
Acked-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Move the evaluation of acpi_pci_osc_control_set() (to request control of
PCI Express native features) into acpi_pci_root_add() to avoid calling
it many times for the same root complex with the same arguments.
Additionally, check if all of the requisite _OSC support bits are set
before calling acpi_pci_osc_control_set() for a given root complex.
References: https://bugzilla.kernel.org/show_bug.cgi?id=20232
Reported-by: Ozan Caglayan <ozan@pardus.org.tr>
Tested-by: Ozan Caglayan <ozan@pardus.org.tr>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'stable/xenbus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/xenbus: making backend support modular is too complex
xen/pci: Make xen-pcifront be dependent on XEN_XENBUS_FRONTEND
xen/xenbus: fixup checkpatch issues in xenbus_probe*
xen/netfront: select XEN_XENBUS_FRONTEND
xen/xenbus: clean up noise in xenbus_probe_frontend.c
xen/xenbus: clean up noise in xenbus_probe_backend.c
xen/xenbus: clean up noise in xenbus_probe.c
xen/xenbus: cleanup debug noise in xenbus_comms.c
xen/xenbus: clean up error handling
xen/xenbus: make frontend bus GPL
xen/xenbus: make sure backend bus is registered earlier
xenbus/frontend: register bus earlier
xen: remove xen/evtchn.h
xen: add backend driver support
xen: separate out frontend xenbus
Remove kobject.h from files which don't need it, notably,
sched.h and fs.h.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I noticed that PCI Express PMEs don't work on my Toshiba Portege R500
after the system has been woken up from a sleep state by a PME
(through Wake-on-LAN). After some investigation it turned out that
the BIOS didn't clear the Root PME Status bit in the root port that
received the wakeup PME and since the Requester ID was also set in
the port's Root Status register, any subsequent PMEs didn't trigger
interrupts.
This problem can be avoided by clearing the Root PME Status bits in
all PCI Express root ports during early resume. For this purpose,
add an early resume routine to the PCIe port driver and make this
driver be always registered, even if pci_ports_disable is set (in
which case the driver's only function is to provide the early
resume callback).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci-stub uses strsep() to separate list of ids and generates a warning
message when it fails to parse an id. However, not specifying the
parameter results in ids set to an empty string. strsep() happily
returns the empty string as the first token and thus triggers the
warning message spuriously.
Make the tokner ignore zero length ids.
Reported-by: Chris Wright <chrisw@sous-sol.org>
Reported-by: Prasad Joshi <P.G.Joshi@student.reading.ac.uk>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Will get warning when pci stub driver is built-in kenel like:
pci-stub: invalid id string ""
So stop early if no id is passed.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_restore_state only ever returns 0, thus there is no benefit in
having it return any value. Also, a large majority of the callers do
not check the return code of pci_restore_state. Make the
pci_restore_state a void return and avoid the overhead.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We currently refuse to touch the ASPM registers if the BIOS tells us that
ASPM isn't supported. This can cause problems if the BIOS has (for any
reason) enabled ASPM on some devices anyway. Change the code such that we
explicitly clear ASPM if the FADT indicates that ASPM isn't supported,
and make sure we tidy up appropriately on device removal in order to deal
with the hotplug case. If ASPM is disabled because the BIOS doesn't hand
over control then we won't touch the registers.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Then we can use it instead of magic number 1.
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Then it can be used by others.
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If pcie_ports_disabled is set, pcie_port_service_register() returns
error code and select_detection_mode() should not attempt to
unregister dummy_driver and use dummy_slots. It should return
PCIEHP_DETECT_ACPI immediately instead.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32: Make sure we can map all of lowmem if we need to
x86, vt-d: Handle previous faults after enabling fault handling
x86: Enable the intr-remap fault handling after local APIC setup
x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode
x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic
x86, xsave: Use alloc_bootmem_align() instead of alloc_bootmem()
bootmem: Add alloc_bootmem_align()
x86, gcc-4.6: Use gcc -m options when building vdso
x86: HPET: Chose a paranoid safe value for the ETIME check
x86: io_apic: Avoid unused variable warning when CONFIG_GENERIC_PENDING_IRQ=n
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf: Fix off by one in perf_swevent_init()
perf: Fix duplicate events with multiple-pmu vs software events
ftrace: Have recordmcount honor endianness in fn_ELF_R_INFO
scripts/tags.sh: Add magic for trace-events
tracing: Fix panic when lseek() called on "trace" opened for writing
This reverts commit b126b4703a.
We're going back to the old behavior of allocating from bus resources
in _CRS order.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This reverts commit 82e3e767c2.
We're going back to considering bus resources in the order we found
them (in _CRS order, when we're using _CRS), so we don't need to
define any ordering.
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I wrote this quirk awhile ago to properly setup MCP55 chips on hypertransport
busses so that interrupts reached whatever cpu happend to boot the kdump kernel.
while that works well, it was recently shown to me that a a non-hypertransport
variant of the MCP55 exists, and on those system the register that this quirk
manipulates causes hangs if you write to it. Since the quirk was only meant to
handle errors found on MCP55 chips that have a HT interface, this patch adds a
filter to make sure the chip is an HT capable before making the needed register
adjustment. This lets the broken MCP55s work with kdump while not breaking the
non-HT variants.
Resolves https://bugzilla.kernel.org/show_bug.cgi?id=23952
Tested successfully by the reporter and myself.
Cc: stable@kernel.org
Reported-by: Mathieu Bérard <mathieu@mberard.eu>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fault handling is getting enabled after enabling the interrupt-remapping (as
the success of interrupt-remapping can affect the apic mode and hence the
fault handling mode).
Hence there can potentially be some faults between the window of enabling
interrupt-remapping in the vt-d and the fault-handling of the vt-d units.
Handle any previous faults after enabling the vt-d fault handling.
For v2.6.38 cleanup, need to check if we can remove the dmar_fault() in the
enable_intr_remapping() and see if we can enable fault handling along with
enabling intr-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20101201062244.630417138@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
On platforms with Intel 7500 chipset, there were some reports of system
hang/NMI's during kexec/kdump in the presence of interrupt-remapping enabled.
During kdump, there is a window where the devices might be still using old
kernel's interrupt information, while the kdump kernel is coming up. This can
cause vt-d faults as the interrupt configuration from the old kernel map to
null IRTE entries in the new kernel etc. (with out interrupt-remapping enabled,
we still have the same issue but in this case we will see benign spurious
interrupt hit the new kernel).
Based on platform config settings, these platforms seem to generate NMI/SMI
when a vt-d fault happens and there were reports that the resulting SMI causes
the system to hang.
Fix it by masking vt-d spec defined errors to platform error reporting logic.
VT-d spec related errors are already handled by the VT-d OS code, so need to
report the same error through other channels.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1291667190.2675.8.camel@sbsiddha-MOBL3.sc.intel.com>
Cc: stable@kernel.org [v2.6.32+]
Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Reported-and-tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* 'drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
pci root complex: support for tile architecture
drivers/net/tile/: on-chip network drivers for the tile architecture
MAINTAINERS: add drivers/char/hvc_tile.c as maintained by tile
This change enables PCI root complex support for TILEPro. Unlike
TILE-Gx, TILEPro has no support for memory-mapped I/O, so the PCI
support consists of hypervisor upcalls for PIO, DMA, etc. However,
the performance is fine for the devices we have tested with so far
(1Gb Ethernet, SATA, etc.).
The <asm/io.h> header was tweaked to be a little bit more aggressive
about disabling attempts to map/unmap IO port space. The hacky
<asm/pci-bridge.h> header was rolled into the <asm/pci.h> header
and the result was simplified. Both of the latter two headers were
preliminary versions not meant for release before now - oh well.
There is one quirk for our TILEmpower platform, which accidentally
negotiates up to 5GT and needs to be kicked down to 2.5GT.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.
Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I just loaded 2.6.37-rc2 on my machines, and I noticed that X no longer starts.
Running an strace of the X server shows that it's doing this:
open("/sys/bus/pci/devices/0000:07:00.0/resource0", O_RDWR) = 10
mmap(NULL, 16777216, PROT_READ|PROT_WRITE, MAP_SHARED, 10, 0) = -1 EINVAL (Invalid argument)
This code seems to be asking for a shared read/write mapping of 16MB worth of
BAR0 starting at file offset 0, and letting the kernel assign a starting
address. Unfortunately, this -EINVAL causes X not to start. Looking into
dmesg, there's a complaint like so:
process "Xorg" tried to map 0x01000000 bytes at page 0x00000000 on 0000:07:00.0 BAR 0 (start 0x 96000000, size 0x 1000000)
...with the following code in pci_mmap_fits:
pci_start = (mmap_api == PCI_MMAP_SYSFS) ?
pci_resource_start(pdev, resno) >> PAGE_SHIFT : 0;
if (start >= pci_start && start < pci_start + size &&
start + nr <= pci_start + size)
It looks like the logic here is set up such that when the mmap call comes via
sysfs, the check in pci_mmap_fits wants vma->vm_pgoff to be between the
resource's start and end address, and the end of the vma to be no farther than
the end. However, the sysfs PCI resource files always start at offset zero,
which means that this test always fails for programs that mmap the sysfs files.
Given the comment in the original commit
3b519e4ea6, I _think_ the old procfs files
require that the file offset be equal to the resource's base address when
mmapping.
I think what we want here is for pci_start to be 0 when mmap_api ==
PCI_MMAP_PROCFS. The following patch makes that change, after which the Matrox
and Mach64 X drivers work again.
Acked-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: sysfs: fix printk warnings
PCI: fix pci_bus_alloc_resource() hang, prefer positive decode
PCI: read current power state at enable time
PCI: fix size checks for mmap() on /proc/bus/pci files
x86/PCI: coalesce overlapping host bridge windows
PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area
Cast pci_resource_start() and pci_resource_len() to u64 for printk.
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t'
drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When a PCI bus has two resources with the same start/end, e.g.,
pci_bus 0000:04: resource 2 [mem 0xd0000000-0xd7ffffff pref]
pci_bus 0000:04: resource 7 [mem 0xd0000000-0xd7ffffff]
the previous pci_bus_find_resource_prev() implementation would alternate
between them forever:
pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff pref])
returns [mem 0xd0000000-0xd7ffffff]
pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff])
returns [mem 0xd0000000-0xd7ffffff pref]
pci_bus_find_resource_prev(... [mem 0xd0000000-0xd7ffffff pref])
returns [mem 0xd0000000-0xd7ffffff]
...
This happened because there was no ordering between two resources with the
same start and end. A resource that had the same start and end as the
cursor, but was not itself the cursor, was considered to be before the
cursor.
This patch fixes the hang by making a fixed ordering between any two
resources.
In addition, it tries to allocate from positively decoded regions before
using any subtractively decoded resources. This means we will use a
positive decode region before a subtractive decode one, even if it means
using a smaller address.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=22062
Reported-by: Borislav Petkov <bp@amd64.org>
Tested-by: Borislav Petkov <bp@amd64.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When we enable a PCI device, we avoid doing a lot of the initial setup
work if the device's enable count is non-zero. If we don't fetch the
power state though, we may later fail to set up MSI due to the unknown
status. So pick it up before we short circuit the rest due to a
pre-existing enable or mismatched enable/disable pair (as happens with
VGA devices, which are special in a special way).
Tested-by: Jesse Brandeburg <jesse.brandeburg@gmail.com>
Reported-by: Dave Airlie <airlied@linux.ie>
Tested-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The checks for valid mmaps of PCI resources made through /proc/bus/pci files
that were introduced in 9eff02e204 have several
problems:
1. mmap() calls on /proc/bus/pci files are made with real file offsets > 0,
whereas under /sys/bus/pci/devices, the start of the resource corresponds
to offset 0. This may lead to false negatives in pci_mmap_fits(), which
implicitly assumes the /sys/bus/pci/devices layout.
2. The loop in proc_bus_pci_mmap doesn't skip empty resouces. This leads
to false positives, because pci_mmap_fits() doesn't treat empty resources
correctly (the calculated size is 1 << (8*sizeof(resource_size_t)-PAGE_SHIFT)
in this case!).
3. If a user maps resources with BAR > 0, pci_mmap_fits will emit bogus
WARNINGS for the first resources that don't fit until the correct one is found.
On many controllers the first 2-4 BARs are used, and the others are empty.
In this case, an mmap attempt will first fail on the non-empty BARs
(including the "right" BAR because of 1.) and emit bogus WARNINGS because
of 3., and finally succeed on the first empty BAR because of 2.
This is certainly not the intended behaviour.
This patch addresses all 3 issues.
Updated with an enum type for the additional parameter for pci_mmap_fits().
Cc: stable@kernel.org
Signed-off-by: Martin Wilck <martin.wilck@ts.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
While testing various randconfigs with ktest.pl, I hit the following panic:
BUG: unable to handle kernel paging request at f7e54b03
IP: [<c0d63409>] ibmphp_access_ebda+0x101/0x19bb
Adding printks, I found that the loop that reads the ebda blocks
can move out of the mapped section.
ibmphp_access_ebda: start=f7e44c00 size=5120 end=f7e46000
ibmphp_access_ebda: io_mem=f7e44d80 offset=384
ibmphp_access_ebda: io_mem=f7e54b03 offset=65283
The start of the iomap was at f7e44c00 and had a size of 5120,
making the end f7e46000. We start with an offset of 0x180 or
384, giving the first read at 0xf7e44d80. Reading that location
yields 65283, which is much bigger than the 5120 that was allocated
and makes the next read at f7e54b03 which is outside the mapped area.
Perhaps this is a bug in the driver, or buggy hardware, but this patch
is more about not crashing my box on start up and just giving a warning
if it detects this error.
This patch at least lets my box boot with just a warning.
Cc: Chandru Siddalingappa <chandru@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Stanse found that when pdev is found and has no driver a reference is
leaked in pcifront_common_process. So add pci_dev_put there. For the
pdev == NULL case, pci_dev_put(NULL) is fine.
[v2: Updated to not dereference pcidev->dev per Milton's observation]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Milton Miller <miltonm@bga.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
In drivers/pci/xen-pcifront.c the xen/xenbus.h header is included twice -
once is enough.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
xen: register xen pci notifier
xen: initialize cpu masks for pv guests in xen_smp_init
xen: add a missing #include to arch/x86/pci/xen.c
xen: mask the MTRR feature from the cpuid
xen: make hvc_xen console work for dom0.
xen: add the direct mapping area for ISA bus access
xen: Initialize xenbus for dom0.
xen: use vcpu_ops to setup cpu masks
xen: map a dummy page for local apic and ioapic in xen_set_fixmap
xen: remap MSIs into pirqs when running as initial domain
xen: remap GSIs as pirqs when running as initial domain
xen: introduce XEN_DOM0 as a silent option
xen: map MSIs into pirqs
xen: support GSI -> pirq remapping in PV on HVM guests
xen: add xen hvm acpi_register_gsi variant
acpi: use indirect call to register gsi in different modes
xen: implement xen_hvm_register_pirq
xen: get the maximum number of pirqs from xen
xen: support pirq != irq
* 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits)
X86/PCI: Remove the dependency on isapnp_disable.
xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c
MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright.
x86: xen: Sanitse irq handling (part two)
swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it.
MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer.
xen/pci: Request ACS when Xen-SWIOTLB is activated.
xen-pcifront: Xen PCI frontend driver.
xenbus: prevent warnings on unhandled enumeration values
xenbus: Xen paravirtualised PCI hotplug support.
xen/x86/PCI: Add support for the Xen PCI subsystem
x86: Introduce x86_msi_ops
msi: Introduce default_[teardown|setup]_msi_irqs with fallback.
x86/PCI: Export pci_walk_bus function.
x86/PCI: make sure _PAGE_IOMAP it set on pci mappings
x86/PCI: Clean up pci_cache_line_size
xen: fix shared irq device passthrough
xen: Provide a variant of xen_poll_irq with timeout.
xen: Find an unbound irq number in reverse order (high to low).
xen: statically initialize cpu_evtchn_mask_p
...
Fix up trivial conflicts in drivers/pci/Makefile
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (27 commits)
x86: allocate space within a region top-down
x86: update iomem_resource end based on CPU physical address capabilities
x86/PCI: allocate space from the end of a region, not the beginning
PCI: allocate bus resources from the top down
resources: support allocating space within a region from the top down
resources: handle overflow when aligning start of available area
resources: ensure callback doesn't allocate outside available space
resources: factor out resource_clip() to simplify find_resource()
resources: add a default alignf to simplify find_resource()
x86/PCI: MMCONFIG: fix region end calculation
PCI: Add support for polling PME state on suspended legacy PCI devices
PCI: Export some PCI PM functionality
PCI: fix message typo
PCI: log vendor/device ID always
PCI: update Intel chipset names and defines
PCI: use new ccflags variable in Makefile
PCI: add PCI_MSIX_TABLE/PBA defines
PCI: add PCI vendor id for STmicroelectronics
x86/PCI: irq and pci_ids patch for Intel Patsburg DeviceIDs
PCI: OLPC: Only enable PCI configuration type override on XO-1
...
The BKL was pushed into this function when it was converted to use the
unlocked_ioctl interface, but nothing that the function touches is
actually protected by the BKL. So just remove the BKL entirely, so that
we finally can get a realistic system build without the BKL being
enabled at all.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allocate space from the highest-address PCI bus resource first, then work
downward.
Previously, we looked for space in PCI host bridge windows in the order
we discovered the windows. For example, given the following windows
(discovered via an ACPI _CRS method):
pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff]
pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000effff]
pci_root PNP0A03:00: host bridge window [mem 0x000f0000-0x000fffff]
pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xf7ffffff]
pci_root PNP0A03:00: host bridge window [mem 0xff980000-0xff980fff]
pci_root PNP0A03:00: host bridge window [mem 0xff97c000-0xff97ffff]
pci_root PNP0A03:00: host bridge window [mem 0xfed20000-0xfed9ffff]
we attempted to allocate from [mem 0x000a0000-0x000bffff] first, then
[mem 0x000c0000-0x000effff], and so on.
With this patch, we allocate from [mem 0xff980000-0xff980fff] first, then
[mem 0xff97c000-0xff97ffff], [mem 0xfed20000-0xfed9ffff], etc.
Allocating top-down follows Windows practice, so we're less likely to
trip over BIOS defects in the _CRS description.
On the machine above (a Dell T3500), the [mem 0xbff00000-0xbfffffff] region
doesn't actually work and is likely a BIOS defect. The symptom is that we
move the AHCI controller to 0xbff00000, which leads to "Boot has failed,
sleeping forever," a BUG in ahci_stop_engine(), or some other boot failure.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c43
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=620313
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=629933
Reported-by: Brian Bloniarz <phunge0@hotmail.com>
Reported-and-tested-by: Stefan Becker <chemobejk@gmail.com>
Reported-by: Denys Vlasenko <dvlasenk@redhat.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: remove in_workqueue_context()
workqueue: Clarify that schedule_on_each_cpu is synchronous
memory_hotplug: drop spurious calls to flush_scheduled_work()
shpchp: update workqueue usage
pciehp: update workqueue usage
isdn/eicon: don't call flush_scheduled_work() from diva_os_remove_soft_isr()
workqueue: add and use WQ_MEM_RECLAIM flag
workqueue: fix HIGHPRI handling in keep_working()
workqueue: add queue_work and activate_work trace points
workqueue: prepare for more tracepoints
workqueue: implement flush[_delayed]_work_sync()
workqueue: factor out start_flush_work()
workqueue: cleanup flush/cancel functions
workqueue: implement alloc_ordered_workqueue()
Fix up trivial conflict in fs/gfs2/main.c as per Tejun
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
vfs: make no_llseek the default
vfs: don't use BKL in default_llseek
llseek: automatically add .llseek fop
libfs: use generic_file_llseek for simple_attr
mac80211: disallow seeks in minstrel debug code
lirc: make chardev nonseekable
viotape: use noop_llseek
raw: use explicit llseek file operations
ibmasmfs: use generic_file_llseek
spufs: use llseek in all file operations
arm/omap: use generic_file_llseek in iommu_debug
lkdtm: use generic_file_llseek in debugfs
net/wireless: use generic_file_llseek in debugfs
drm: use noop_llseek
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
block: autoconvert trivial BKL users to private mutex
drivers: autoconvert trivial BKL users to private mutex
ipmi: autoconvert trivial BKL users to private mutex
mac: autoconvert trivial BKL users to private mutex
mtd: autoconvert trivial BKL users to private mutex
scsi: autoconvert trivial BKL users to private mutex
Fix up trivial conflicts (due to addition of private mutex right next to
deletion of a version string) in drivers/char/pcmcia/cm40[04]0_cs.c
This is a port of the 2.6.18 Xen PCI front driver with fixes
to make it build under 2.6.34 and later (for the full list of
changes: git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
historic/xen-pcifront-0.1). It also includes the fixes
to make it work properly.
[v2: Updated Kconfig, removed crud, added Reviewed-by]
[v3: Added 'static', fixed grant table leak, redid Kconfig]
[v4: Added one more 'static' and removed comments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Jan Beulich <JBeulich@novell.com>
Introduce an override for the arch_[teardown|setup]_msi_irqs
that can be utilized to fallback to the default arch_* code.
If a platform wants to utilize the code paths defined
in driver/pci/msi.c it has to define HAVE_DEFAULT_MSI_TEARDOWN_IRQS
or HAVE_DEFAULT_MSI_SETUP_IRQS. Otherwise the old mechanism
of over-ridding the arch_* works fine.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: x86@kernel.org
In preperation of modularizing Xen-pcifront the pci_walk_bus
needs to be exported so that the xen-pcifront module can walk
call the pci subsystem to walk the PCI devices and claim them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> [http://marc.info/?l=linux-pci&m=126149958010298&w=2]
The patch below updates broken web addresses in the kernel
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Finn Thain <fthain@telegraphics.com.au>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Dimitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Acked-by: Ben Pfaff <blp@cs.stanford.edu>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* Rename shpchp_wq to shpchp_ordered_wq and add non-ordered shpchp_wq
which is used instead of the system workqueue. This is to remove
the use of flush_scheduled_work() which is deprecated and scheduled
for removal.
* With cmwq in place, there's no point in creating workqueues lazily.
Create both shpchp_wq and shpchp_ordered_wq upfront.
* Include workqueue.h from shpchp.h.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* Rename pciehp_wq to pciehp_ordered_wq and add non-ordered pciehp_wq
which is used instead of the system workqueue. This is to remove
the use of flush_scheduled_work() which is deprecated and scheduled
for removal.
* With cmwq in place, there's no point in creating workqueues lazily.
Create both pciehp_wq and pciehp_ordered_wq upfront.
* Include workqueue.h from pciehp.h.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Not all hardware vendors hook up the PME line for legacy PCI devices,
meaning that wakeup events get lost. The only way around this is to poll
the devices to see if their state has changed, so add support for doing
that on legacy PCI devices that aren't part of the core chipset.
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
It's helpful to have some extra PCI power management functions available to
platform code, so move the declarations to an exported header.
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Previously we had to have CONFIG_PCI_DEBUG=y or CONFIG_DYNAMIC_DEBUG=y
to turn on this printk, but I think the IDs are valuable enough that it's
worth putting them in the log always.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These are already defined in pcilib's pci/header.h but not in kernel's
linux/pci_regs.h. Copy them to avoid using magic numbers.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
A long time ago I worked on a RHEL5 bug in which kdump hung during boot
on a set of systems. The systems hung because they never received timer
interrupts during calibrate_delay. These systems also all had Opteron
processors on a hypertransport bus, bridged to a pci bus via an Nvidia
MCP55 northbridge chip. After much wrangling I managed to learn from
Nvidia that they have an undocumented register in some versions of that
chip which control how legacy interrupts are send to the cpu complex
when the ioapic isn't active. Nvidia defaults this register to only
send legacy interrupts to the BSP, so if kdump happens to boot on an AP,
we never get timer interrupts and boom. I had initially used this quirk
as a workaround, with my intent being to move apic initalization to an
earlier point in the boot process, so the setting of the register would
be irrelevant. Given the work involved in doing that however, the
fragile nature of the apic initalization code, and the fact that, over
the 2 years since we found this bug, the MCP55 is the only chip which
seems to have this issue, I've figure at this point its likely safer to
just carry the quirk around. By setting the referenced bits in this
hidden register, interrupts will be broadcast to all cpus when the
ioapic isn't active on the above described systems.
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
There is a design issue related to PCIe AER and _OSC that the BIOS
may be asked to grant control of the AER service even if some
Hardware Error Source Table (HEST) entries contain information
meaning that the BIOS really should control it. Namely,
pcie_port_acpi_setup() calls pcie_aer_get_firmware_first() that
determines whether or not the AER service should be controlled by
the BIOS on the basis of the HEST information for the given PCIe
port. The BIOS is asked to grant control of the AER service for
a PCIe Root Complex if pcie_aer_get_firmware_first() returns 'false'
for at least one root port in that complex, even if all of the other
root ports' HEST entries have the FIRMWARE_FIRST flag set (and none
of them has the GLOBAL flag set). However, if the AER service is
controlled by the kernel, that may interfere with the BIOS' handling
of the error sources having the FIRMWARE_FIRST flag. Moreover,
there may be PCIe endpoints that have the FIRMWARE_FIRST flag set in
HEST and are attached to the root ports in question, in which case it
also may be unsafe to ask the BIOS for control of the AER service.
For this reason, introduce a function checking if there's at least
one PCIe-related HEST entry with the FIRMWARE_FIRST flag set and
disable the native AER service altogether if this function returns
'true'.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Get rid of init_MUTEX[_LOCKED]() and use sema_init() instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
quiet the warning about use of uninitialized e_src in
aer_isr() e_src is initialized by get_e_source()
Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
All operations in the pci procfs ioctl functions are
atomic, so no lock is needed here.
Also add a compat_ioctl method, since all the commands
are compatible in 32 bit mode.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Indent the branch of an if.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@r disable braces4@
position p1,p2;
statement S1,S2;
@@
(
if (...) { ... }
|
if (...) S1@p1 S2@p2
)
@script:python@
p1 << r.p1;
p2 << r.p2;
@@
if (p1[0].column == p2[0].column):
cocci.print_main("branch",p1)
cocci.print_secs("after",p2)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.
The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.
New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time. Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.
The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.
Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.
Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.
===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
// but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}
@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}
@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}
@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}
@ fops0 @
identifier fops;
@@
struct file_operations fops = {
...
};
@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
.llseek = llseek_f,
...
};
@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
.read = read_f,
...
};
@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
.write = write_f,
...
};
@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
.open = open_f,
...
};
// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
... .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};
@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
... .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};
// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
... .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};
// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};
// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};
@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+ .llseek = default_llseek, /* write accesses f_pos */
};
// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////
@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
.write = write_f,
.read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};
@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};
@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};
@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
irq_2_iommu is in struct irq_cfg, so we can do the irq_remapped check
based on irq_cfg instead of going through a lookup function. That's
especially interesting in the eoi_ioapic_irq() hotpath.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Switch the intr_remapping code to use the irq_2_iommu struct in
irg_cfg.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
That interrupt remapping code is x86 specific and tied to the io_apic
code. No need for separate allocator functions in the interrupt
remapping code. This allows to simplify the code and irq_2_iommu is
small (13 bytes on 64bit) so it's not a real problem even if interrupt
remapping is runtime disabled. If it's compile time disabled the
impact is zero.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
No need to dereference irq_desc.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
With SPARSE_IRQ=y the irte descriptors are dynamically allocated, but not
freed in free_irte().
That was ok as long as the sparse irq core was not freeing irq descriptors on
destroy_irq(). Now we leak the irte descriptor. Free it in free_irte().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Handing down irq_desc to msi just so that msi can access
irq_desc.irq_data.msi_desc is a pretty stupid idea. The calling code
can hand down a pointer to msi_desc so msi code does not need to know
about the irq descriptor at all.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
All these files use the big kernel lock in a trivial
way to serialize their private file operations,
typically resulting from an earlier semi-automatic
pushdown from VFS.
None of these drivers appears to want to lock against
other code, and they all use the BKL as the top-level
lock in their file operations, meaning that there
is no lock-order inversion problem.
Consequently, we can remove the BKL completely,
replacing it with a per-file mutex in every case.
Using a scripted approach means we can avoid
typos.
These drivers do not seem to be under active
maintainance from my brief investigation. Apologies
to those maintainers that I have missed.
file=$1
name=$2
if grep -q lock_kernel ${file} ; then
if grep -q 'include.*linux.mutex.h' ${file} ; then
sed -i '/include.*<linux\/smp_lock.h>/d' ${file}
else
sed -i 's/include.*<linux\/smp_lock.h>.*$/include <linux\/mutex.h>/g' ${file}
fi
sed -i ${file} \
-e "/^#include.*linux.mutex.h/,$ {
1,/^\(static\|int\|long\)/ {
/^\(static\|int\|long\)/istatic DEFINE_MUTEX(${name}_mutex);
} }" \
-e "s/\(un\)*lock_kernel\>[ ]*()/mutex_\1lock(\&${name}_mutex)/g" \
-e '/[ ]*cycle_kernel_lock();/d'
else
sed -i -e '/include.*\<smp_lock.h\>/d' ${file} \
-e '/cycle_kernel_lock()/d'
fi
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* git://git.infradead.org/iommu-2.6:
intel-iommu: Use symbolic values instead of magic numbers in Lenovo w/a
intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space
When the Lenovo Ideapad S10-3 is booted with HT enabled,
it hits a boot hang in the intel_idle driver.
This occurs when entering ATM-C4 for the first time,
unless BM_STS is first cleared.
acpi_idle doesn't see this because it first checks
and clears BM_STS, but it would hit the same hang
if that check were disabled.
http://bugs.meego.com/show_bug.cgi?id=7093https://bugs.launchpad.net/ubuntu/+source/linux/+bug/634702
Signed-off-by: Len Brown <len.brown@intel.com>
drivers/pci/intel-iommu.c: In function `__iommu_calculate_agaw':
drivers/pci/intel-iommu.c:437: sorry, unimplemented: inlining failed in call to 'width_to_agaw': function body not available
drivers/pci/intel-iommu.c:445: sorry, unimplemented: called from here
Move the offending function (and its siblings) to top-of-file, remove the
forward declaration.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=17441
Reported-by: Martin Mokrejs <mmokrejs@ribosome.natur.cuni.cz>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 9eecabcb9a ("intel-iommu: Abort
IOMMU setup for igfx if BIOS gave no shadow GTT space") uses a bunch of
magic numbers. Provide #defines for those to make it look slightly saner.
Signed-off-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This fixes the prototype for both pci_resource_alignment() and
pci_sriov_resource_alignment().
Patch started as debugging effort from Cam Macdonell.
Cc: Cam Macdonell <cam@cs.ualberta.ca>
Cc: Avi Kivity <avi@redhat.com>
[chrisw: add iov bits]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: bus speed strings should be const
PCI hotplug: Fix build with CONFIG_ACPI unset
PCI: PCIe: Remove the port driver module exit routine
PCI: PCIe: Move PCIe PME code to the pcie directory
PCI: PCIe: Disable PCIe port services during port initialization
PCI: PCIe: Ask BIOS for control of all native services at once
ACPI/PCI: Negotiate _OSC control bits before requesting them
ACPI/PCI: Do not preserve _OSC control bits returned by a query
ACPI/PCI: Make acpi_pci_query_osc() return control bits
ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
PCI: PCIe: Introduce commad line switch for disabling port services
PCI: PCIe AER: Introduce pci_aer_available()
x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs
We return 1 if the IOMMU has been detected. Zero or an error number
if we failed to find it. This is in preperation of using the IOMMU_INIT
so that we can detect whether an IOMMU is present. I have not
tested this for regression on Calgary, nor on AMD Vi chipsets as
I don't have that hardware.
CC: Muli Ben-Yehuda <muli@il.ibm.com>
CC: "Jon D. Mason" <jdmason@kudzu.us>
CC: "Darrick J. Wong" <djwong@us.ibm.com>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
CC: David Woodhouse <David.Woodhouse@intel.com>
CC: Chris Wright <chrisw@sous-sol.org>
CC: Yinghai Lu <yinghai@kernel.org>
CC: Joerg Roedel <joerg.roedel@amd.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Fujita Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
LKML-Reference: <1282845485-8991-3-git-send-email-konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
One of the recent changes caused complilation of
drivers/pci/hotplug/pciehp_core.c to fail. Fix this issue.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCIe port driver's module exit routine is never used, so drop it.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCIe PME code only consists of one file, so it doesn't need to
occupy its own directory. Move it to drivers/pci/pcie/pme.c and
remove the contents of drivers/pci/pcie/pme .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In principle PCIe port services may be enabled by the BIOS, so it's
better to disable them during port initialization to avoid spurious
events from being generated.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After commit 852972acff (ACPI: Disable
ASPM if the platform won't provide _OSC control for PCIe) control of
the PCIe Capability Structure is unconditionally requested by
acpi_pci_root_add(), which in principle may cause problems to
happen in two ways. First, the BIOS may refuse to give control of
the PCIe Capability Structure if it is not asked for any of the
_OSC features depending on it at the same time. Second, the BIOS may
assume that control of the _OSC features depending on the PCIe
Capability Structure will be requested in the future and may behave
incorrectly if that doesn't happen. For this reason, control of
the PCIe Capability Structure should always be requested along with
control of any other _OSC features that may depend on it (ie. PCIe
native PME, PCIe native hot-plug, PCIe AER).
Rework the PCIe port driver so that (1) it checks which native PCIe
port services can be enabled, according to the BIOS, and (2) it
requests control of all these services simultaneously. In
particular, this causes pcie_portdrv_probe() to fail if the BIOS
refuses to grant control of the PCIe Capability Structure, which
means that no native PCIe port services can be enabled for the PCIe
Root Complex the given port belongs to. If that happens, ASPM is
disabled to avoid problems with mishandling it by the part of the
PCIe hierarchy for which control of the PCIe Capability Structure
has not been received.
Make it possible to override this behavior using 'pcie_ports=native'
(use the PCIe native services regardless of the BIOS response to the
control request), or 'pcie_ports=compat' (do not use the PCIe native
services at all).
Accordingly, rework the existing PCIe port service drivers so that
they don't request control of the services directly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
It is possible that the BIOS will not grant control of all _OSC
features requested via acpi_pci_osc_control_set(), so it is
recommended to negotiate the final set of _OSC features with the
query flag set before calling _OSC to request control of these
features.
To implement it, rework acpi_pci_osc_control_set() so that the caller
can specify the mask of _OSC control bits to negotiate and the mask
of _OSC control bits that are absolutely necessary to it. Then,
acpi_pci_osc_control_set() will run _OSC queries in a loop until
the mask of _OSC control bits returned by the BIOS is equal to the
mask passed to it. Also, before running the _OSC request
acpi_pci_osc_control_set() will check if the caller's required
control bits are present in the final mask.
Using this mechanism we will be able to avoid situations in which the
BIOS doesn't grant control of certain _OSC features, because they
depend on some other _OSC features that have not been requested.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce kernel command line switch pcie_ports= allowing one to
disable all of the native PCIe port services, so that PCIe ports
are treated like PCI-to-PCI bridges.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce a function allowing the caller to check whether to try to
enable PCIe AER.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
drivers/pci/intel-iommu.c: In function 'dma_pte_addr':
drivers/pci/intel-iommu.c:239: warning: passing argument 1 of '__cmpxchg64' from incompatible pointer type
It seems that __cmpxchg64() now cares about the type of its pointer argument,
so give it a (uint64_t *) instead of a pointer to a structure which contains
only that.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Allow disabling the source id checking while programming the interrupt
remap table entry. Useful for debugging or working around the broken
source id checks on some platforms.
Signed-off-by: Chris Wright <chrisw@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'acpica' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (27 commits)
ACPI / ACPICA: Simplify acpi_ev_initialize_gpe_block()
ACPI / ACPICA: Fail acpi_gpe_wakeup() if ACPI_GPE_CAN_WAKE is unset
ACPI / ACPICA: Do not execute _PRW methods during initialization
ACPI: Fix bogus GPE test in acpi_bus_set_run_wake_flags()
ACPICA: Update version to 20100702
ACPICA: Fix for Alias references within Package objects
ACPICA: Fix lint warning for 64-bit constant
ACPICA: Remove obsolete GPE function
ACPICA: Update debug output components
ACPICA: Add support for WDDT - Watchdog Descriptor Table
ACPICA: Drop acpi_set_gpe
ACPICA: Use low-level GPE enable during GPE block initialization
ACPI / EC: Do not use acpi_set_gpe
ACPI / EC: Drop suspend and resume routines
ACPICA: Remove wakeup GPE reference counting which is not used
ACPICA: Introduce acpi_gpe_wakeup()
ACPICA: Rename acpi_hw_gpe_register_bit
ACPICA: Update version to 20100528
ACPICA: Add signatures for undefined tables: ATKG, GSCI, IEIT
ACPICA: Optimization: Reduce the number of namespace walks
...
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (30 commits)
PCI: update for owner removal from struct device_attribute
PCI: Fix warnings when CONFIG_DMI unset
PCI: Do not run NVidia quirks related to MSI with MSI disabled
x86/PCI: use for_each_pci_dev()
PCI: use for_each_pci_dev()
PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()
PCI: export SMBIOS provided firmware instance and label to sysfs
PCI: Allow read/write access to sysfs I/O port resources
x86/PCI: use host bridge _CRS info on ASRock ALiveSATA2-GLAN
PCI: remove unused HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_{SIZE|BOUNDARY}
PCI: disable mmio during bar sizing
PCI: MSI: Remove unsafe and unnecessary hardware access
PCI: Default PCIe ASPM control to on and require !EMBEDDED to disable
PCI: kernel oops on access to pci proc file while hot-removal
PCI: pci-sysfs: remove casts from void*
ACPI: Disable ASPM if the platform won't provide _OSC control for PCIe
PCI hotplug: make sure child bridges are enabled at hotplug time
PCI hotplug: shpchp: Removed check for hotplug of display devices
PCI hotplug: pciehp: Fixed return value sign for pciehp_unconfigure_device
PCI: Don't enable aspm before drivers have had a chance to veto it
...
Commit 69309a0590 ("x86, asm: Clean up and simplify set_64bit()")
sanitized the x86-64 types to set_64bit(), and incidentally resulted in
warnings like
drivers/pci/intr_remapping.c: In function 'modify_irte':
drivers/pci/intr_remapping.c:314: warning: passing argument 1 of 'set_64bit' from incompatible pointer type
arch/x86/include/asm/cmpxchg_64.h:6: note:expected 'volatile u64 *' but argument is of type 'long unsigned int *'
It turns out that the change to set_64bit() really does clean up things,
and the PCI intr_remapping.c file did a rather ugly cast in order to
avoid warnings with the previous set_64bit() type model.
Removing the ugly cast fixes the warning, and makes everybody happy and
expects a set_64bit() to take the logical "u64 *" argument.
Pointed-out-by: Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Export cache-coherency capability
iommu-api: Extension to check for interrupt remapping
x86/amd-iommu: Use for_each_pci_dev()
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_fsl,mv,nv: prepare for NCQ command completion update
ata: Convert pci_table entries to PCI_VDEVICE (if PCI_ANY_ID is used)
libata: more PCI IDs for jmicron controllers
ata_piix: fix locking around SIDPR access
[libata] update blacklist for new hyphenated pattern ranges (v2)
libata: allow hyphenated pattern ranges
ata_generic: drop hard coded DMA force logic for CENATEK
[libata] ahci: Fix warning: comparison between 'enum <anonymous>' and 'enum <anonymous>'
[libata] add ATA_CMD_DSM to ata_get_cmd_descript
[libata] Add Samsung PATA controller driver, pata_samsung_cf
[libata] Add 460EX on-chip SATA driver, sata_dwc_460ex
libata: reduce blacklist size even more (v2)
libata: reduce blacklist size (v2)
libata: glob_match for ata_device_blacklist (v2)
ahci_platform: Remove unneeded ahci_driver.probe assignment
ahci_platform: Provide for vendor specific init
On some platforms (MacPro3,1) the BIOS assigns the ioatdma device to the
incorrect iommu causing faults when the driver initializes. Add a quirk
to catch this misconfiguration and try falling back to untranslated
operation (which works in the MacPro3,1 case).
Assuming there are other platforms with misconfigured iommus teach the
ioatdma driver to treat initialization failures as non-fatal (just fail
the driver load and emit a warning instead of triggering a BUG_ON).
This can be classified as a boot regression since 2.6.32 on affected
platforms since the ioatdma module did not autoload prior to that
kernel.
Cc: <stable@kernel.org>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Reported-by: Chris Li <lkml@chrisli.org>
Tested-by: Chris Li <lkml@chrisli.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This patch fixes the below warnings introduced by the commit
911e1c9b05 ("PCI:
export SMBIOS provided firmware instance and label to sysfs").
drivers/pci/pci.h: In function ‘pci_create_firmware_label_files’:
drivers/pci/pci.h:16: warning: ‘return’ with a value, in function returning void
drivers/pci/pci.h: In function ‘pci_remove_firmware_label_files’:
drivers/pci/pci.h:18: warning: ‘return’ with a value, in function returning void
The warnings are seen because of the below code, doing a retun 0
from the functions 'pci_create_firmware_label_files' and
'pci_remove_firmware_label_files' defined as void.
+#ifndef CONFIG_DMI
+static inline void pci_create_firmware_label_files(struct pci_dev *pdev)
+{ return 0; }
+static inline void pci_remove_firmware_label_files(struct pci_dev *pdev)
+{ return 0; }
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Add support for JMB364 and 369.
Patch-originally-from: Aries Lee <arieslee@jmicron.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
There is no reason to run NVidia-specific quirks related to HT MSI
mappings with MSI disabled via pci=nomsi, so make
__nv_msi_ht_cap_quirk() return immediately in that case.
This allows at least one machine to boot 100% of the time with
pci=nomsi (it still doesn't boot reliably without that).
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16443 .
Cc: stable@kernel.org
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
return the last MSI message written instead of reading it from the
device, since it may be called while the device is in a reduced
power state.
However, the pSeries platform code really does need to read messages
from the device, since they are initially written by firmware.
Therefore:
- Restore the previous behaviour of read_msi_msg_desc()
- Add new functions get_cached_msi_msg{,_desc}() which return the
last MSI message written
- Use the new functions where appropriate
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch exports SMBIOS provided firmware instance and label of
onboard PCI devices to sysfs. New files are:
/sys/bus/pci/devices/.../label which contains the firmware name for
the device in question, and
/sys/bus/pci/devices/.../index which contains the firmware device type
instance for the given device.
Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com>
Signed-off-by: Narendra K <narendra_k@dell.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI sysfs resource files currently only allow mmap'ing. On x86 this
works fine for memory backed BARs, but doesn't work at all for I/O
port backed BARs. Add read/write to I/O port PCI sysfs resource
files to allow userspace access to these device regions.
Acked-by: Chris Wright <chrisw@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In 2.6.34, we transformed the PCI DMA API into the generic device
mode. The PCI DMA API is just the wrapper of the DMA API.
So we don't need HAVE_ARCH_PCI_SET_DMA_MAX_SEGMENT_SIZE or
HAVE_ARCH_PCI_SET_DMA_SEGMENT_BOUNDARY (which enable architectures to
have the own implementations). Both haven't been used anyway.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
It is a known issue that mmio decoding shall be disabled while doing PCI
bar sizing. Host bridge and other devices (PCI PIC) shall be excluded for
certain platforms. This patch mainly comes from Mathew Willcox's
patch in http://kerneltrap.org/mailarchive/linux-kernel/2007/9/13/258969.
A new flag bit "mmio_alway_on" is added to pci_dev with the intention that
devices with their mmio decoding cannot be disabled during BAR sizing shall
have this bit set, preferrablly in their quirks.
Without this patch, Intel Moorestown platform graphics unit will be
corrupted during bar sizing activities.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
During suspend on an SMP system, {read,write}_msi_msg_desc() may be
called to mask and unmask interrupts on a device that is already in a
reduced power state. At this point memory-mapped registers including
MSI-X tables are not accessible, and config space may not be fully
functional either.
While a device is in a reduced power state its interrupts are
effectively masked and its MSI(-X) state will be restored when it is
brought back to D0. Therefore these functions can simply read and
write msi_desc::msg for devices not in D0.
Further, read_msi_msg_desc() should only ever be used to update a
previously written message, so it can always read msi_desc::msg
and never needs to touch the hardware.
Tested-by: "Michael Chan" <mchan@broadcom.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The CONFIG_PCIEASPM option is confusing and potentially dangerous. ASPM is
a hardware mediated feature rather than one under direct OS control, and
even if the config option is disabled the system firmware may have turned
on ASPM on various bits of hardware. This can cause problems later -
various hardware that claims to support ASPM does a poor job of it and may
hang or cause other difficulties. The kernel is able to recognise this in
many cases and disable the ASPM functionality, but only if CONFIG_PCIEASPM
is enabled.
Given that in its default configuration this option will either leave the
hardware as it was originally or disable hardware functionality that may
cause problems, it should by default y. The only reason to disable it
ought to be to reduce code size, so make it dependent on CONFIG_EMBEDDED.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Cc: lrodriguez@atheros.com
Cc: maximlevitsky@gmail.com
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Found one PCIe Module with several bridges built-in where a "cold"
hotadd doesn't work.
If we end up reassigning bridge windows at hotadd time, and have to loop
through assigning new ranges, we won't end up enabling the child bridges
because the first assignment pass already tried to enable them, which
prevents __pci_bridge_assign_resource from updating the windows.
So try to move enabling of child bridges to the end, and only do it
once.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Removed check to prevent hotplug of display devices within shpchp.
Originally this was thought to have been required within the PCI
Hotplug specification for some legacy devices. However there is
no such requirement in the most recent revision. The check prevents
hotplug of not only display devices but also computational GPUs
which require serviceability.
Signed-off-by: Praveen Kalamegham <praveen@nextio.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The aspm code will currently set the configured aspm policy before drivers
have had an opportunity to indicate that their hardware doesn't support it.
Unfortunately, putting some hardware in L0 or L1 can result in the hardware
no longer responding to any requests, even after aspm is disabled. It makes
more sense to leave aspm policy at the BIOS defaults at initial setup time,
reconfiguring it after pci_enable_device() is called. This allows the
driver to blacklist individual devices beforehand.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use resource_size_t for MMIO address instead of unsigned long. Otherwise,
higher 32-bits of MMIO address are cleared unexpectedly in x86-32 PAE.
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_enable_device can fail. In that case, a printed warning would be
more appropriate.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Junchang Wang <junchangwang@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Assigning zero where NULL should be used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
MSI delivery from on-board ahci controller doesn't work on K8M800. At
this point, it's unclear whether the culprit is with the ahci
controller or the host bridge. Given the track record and considering
the rather minimal impact of MSI, disabling it seems reasonable.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Rainer Hurtado Navarro <publio.escipion.el.africano@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In all AMD 780 family northbridges, the vendor ID of the internal
graphics PCI/PCI bridge reads not as AMD but as that of the mainboard
vendor, because the hardware actually returns the value of the subsystem
vendor ID (erratum 18).
We currently have additional quirk entries for Asus and Acer, but it is
likely that we will encounter more systems with other vendor IDs.
Since we do not know in advance all possible vendor IDs, a better way to
find the device is to declare the quirk on the host bridge, whose ID is
always correct, and use that device as a stepping stone to find the PCI/
PCI bridge, if present.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The SLOT_REG_RSVDZ_MASK macro is normally used like this:
slot_reg &= ~SLOT_REG_RSVDZ_MASK;
The ~ operator has higher precedence than the | operator from inside the
macro, so it needs parenthesis.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some compiler generates following warnings:
In function 'aer_isr':
warning: 'e_src.id' may be used uninitialized in this function
warning: 'e_src.status' may be used uninitialized in this function
Avoid status flag "int ret" and return constants instead, so that
gcc sees the return value matching "it is initialized" better.
Acked-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch (as1388) changes the way the PCI core handles runtime PM
settings when probing or unbinding drivers. Now the core will make
sure the device is enabled for runtime PM, with a usage count >= 1,
when a driver is probed. It does the same when calling a driver's
remove method.
If the driver wants to use runtime PM, all it has to do is call
pm_runtime_pu_noidle() near the end of its probe routine (to cancel
the core's usage increment) and pm_runtime_get_noresume() near the
start of its remove routine (to restore the usage count). It does not
need to mess around with setting the runtime state to enabled,
disabled, active, or suspended.
The patch updates e1000e and r8169, the only PCI drivers that already
use the existing runtime PM interface.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch allows IOMMU users to determine whether the
hardware and software support safe, isolated interrupt
remapping. Not all Intel IOMMUs have the hardware, and the
software for AMD is not there yet.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
One of the arguments during the suspend blockers discussion was that
the mainline kernel didn't contain any mechanisms making it possible
to avoid races between wakeup and system suspend.
Generally, there are two problems in that area. First, if a wakeup
event occurs exactly when /sys/power/state is being written to, it
may be delivered to user space right before the freezer kicks in, so
the user space consumer of the event may not be able to process it
before the system is suspended. Second, if a wakeup event occurs
after user space has been frozen, it is not generally guaranteed that
the ongoing transition of the system into a sleep state will be
aborted.
To address these issues introduce a new global sysfs attribute,
/sys/power/wakeup_count, associated with a running counter of wakeup
events and three helper functions, pm_stay_awake(), pm_relax(), and
pm_wakeup_event(), that may be used by kernel subsystems to control
the behavior of this attribute and to request the PM core to abort
system transitions into a sleep state already in progress.
The /sys/power/wakeup_count file may be read from or written to by
user space. Reads will always succeed (unless interrupted by a
signal) and return the current value of the wakeup events counter.
Writes, however, will only succeed if the written number is equal to
the current value of the wakeup events counter. If a write is
successful, it will cause the kernel to save the current value of the
wakeup events counter and to abort the subsequent system transition
into a sleep state if any wakeup events are reported after the write
has returned.
[The assumption is that before writing to /sys/power/state user space
will first read from /sys/power/wakeup_count. Next, user space
consumers of wakeup events will have a chance to acknowledge or
veto the upcoming system transition to a sleep state. Finally, if
the transition is allowed to proceed, /sys/power/wakeup_count will
be written to and if that succeeds, /sys/power/state will be written
to as well. Still, if any wakeup events are reported to the PM core
by kernel subsystems after that point, the transition will be
aborted.]
Additionally, put a wakeup events counter into struct dev_pm_info and
make these per-device wakeup event counters available via sysfs,
so that it's possible to check the activity of various wakeup event
sources within the kernel.
To illustrate how subsystems can use pm_wakeup_event(), make the
low-level PCI runtime PM wakeup-handling code use it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: markgross <markgross@thegnar.org>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
If we fail to assign resources to a PCI BAR, this patch makes us try the
original address from BIOS rather than leaving it disabled.
Linux tries to make sure all PCI device BARs are inside the upstream
PCI host bridge or P2P bridge apertures, reassigning BARs if necessary.
Windows does similar reassignment.
Before this patch, if we could not move a BAR into an aperture, we left
the resource unassigned, i.e., at address zero. Windows leaves such BARs
at the original BIOS addresses, and this patch makes Linux do the same.
This is a bit ugly because we disable the resource long before we try to
reassign it, so we have to keep track of the BIOS BAR address somewhere.
For lack of a better place, I put it in the struct pci_dev.
I think it would be cleaner to attempt the assignment immediately when the
claim fails, so we could easily remember the original address. But we
currently claim motherboard resources in the middle, after attempting to
claim PCI resources and before assigning new PCI resources, and changing
that is a fairly big job.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16263
Reported-by: Andrew <nitr0@seti.kr.ua>
Tested-by: Andrew <nitr0@seti.kr.ua>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After the previous patch that introduced acpi_gpe_wakeup() and
modified the ACPI suspend and wakeup code to use it, the third
argument of acpi_{enable|disable}_gpe() and the GPE wakeup
reference counter are not necessary any more. Remove them and
modify all of the users of acpi_{enable|disable}_gpe()
accordingly. Also drop GPE type constants that aren't used
any more.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
virtio-pci resets the device at startup by writing to the status
register, but this does not clear the pci config space,
specifically msi enable status which affects register
layout.
This breaks things like kdump when they try to use e.g. virtio-blk.
Fix by forcing msi off at startup. Since pci.c already has
a routine to do this, we export and use it instead of duplicating code.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Commit c7f486567c
(PCI PM: PCIe PME root port service driver) causes the native PCIe
PME signaling to be used by default, if the BIOS allows the kernel to
control the standard configuration registers of PCIe root ports.
However, the native PCIe PME is coupled to the native PCIe hotplug
and calling pcie_pme_acpi_setup() makes some BIOSes expect that
the native PCIe hotplug will be used as well. That, in turn, causes
problems to appear on systems where the PCIe hotplug driver is not
loaded. The usual symptom, as reported by Jaroslav Kameník and
others, is that the ACPI GPE associated with PCIe hotplug keeps
firing continuously causing kacpid to take substantial percentage
of CPU time.
To work around this issue, change the default so that the native
PCIe PME signaling is only used if directly requested with the help
of the pcie_pme= command line switch.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=15924 , which is
a listed regression from 2.6.33.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Jaroslav Kameník <jaroslav@kamenik.cz>
Tested-by: Antoni Grzymala <antekgrzymala@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Certain revisions of this chipset appear to be broken. There is a shadow
GTT which mirrors the real GTT but contains pre-translated physical
addresses, for performance reasons. When a GTT update happens, the
translations are done once and the resulting physical addresses written
back to the shadow GTT.
Except sometimes, the physical address is actually written back to the
_real_ GTT, not the shadow GTT. Thus we start to see faults when that
physical address is fed through translation again.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
stanse found the following double lock.
In get_domain_for_dev:
spin_lock_irqsave(&device_domain_lock, flags);
domain_exit(domain);
domain_remove_dev_info(domain);
spin_lock_irqsave(&device_domain_lock, flags);
spin_unlock_irqrestore(&device_domain_lock, flags);
spin_unlock_irqrestore(&device_domain_lock, flags);
This happens when the domain is created by another CPU at the same time
as this function is creating one, and the other CPU wins the race to
attach it to the device in question, so we have to destroy our own
newly-created one.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Commit a99c47a2 "intel-iommu: errors with smaller iommu widths" replace the
dmar_domain->pgd with the first entry of page table when iommu's supported
width is smaller than dmar_domain's. But it use physical address directly
for new dmar_domain->pgd...
This result in KVM oops with VT-d on some machines.
Reported-by: Allen Kay <allen.m.kay@intel.com>
Cc: Tom Lyon <pugs@cisco.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: clear bridge resource range if BIOS assigned bad one
PCI: hotplug/cpqphp, fix NULL dereference
Revert "PCI: create function symlinks in /sys/bus/pci/slots/N/"
PCI: change resource collision messages from KERN_ERR to KERN_INFO
There are devices out there which are PCI Hot-plug controllers with
compaq PCI IDs, but are not bridges, hence have pdev->subordinate
NULL. But cpqphp expects the pointer to be non-NULL.
Add a check to the probe function to avoid oopses like:
BUG: unable to handle kernel NULL pointer dereference at 00000050
IP: [<f82e3c41>] cpqhpc_probe+0x951/0x1120 [cpqphp]
*pdpt = 0000000033779001 *pde = 0000000000000000
...
The device here was:
00:0b.0 PCI Hot-plug controller [0804]: Compaq Computer Corporation PCI Hotplug Controller [0e11:a0f7] (rev 11)
Subsystem: Compaq Computer Corporation Device [0e11:a2f8]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This reverts commit 75568f8094.
Since they're just a convenience anyway, remove these symlinks since
they're causing duplicate filename errors in the wild.
Acked-by: Alex Chiang <achiang@canonical.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We can often deal with PCI resource issues by moving devices around. In
that case, there's no point in alarming the user with messages like these.
There are many bug reports where the message itself is the only problem,
e.g., https://bugs.launchpad.net/ubuntu/+source/linux/+bug/413419 .
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
JMB362 is a new variant of jmicron controller which is similar to
JMB360 but has two SATA ports instead of one. As there is no PATA
port, single function AHCI mode can be used as in JMB360. Add pci
quirk for JMB362.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Aries Lee <arieslee@jmicron.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.infradead.org/iommu-2.6:
intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables
intel-iommu: Combine the BIOS DMAR table warning messages
panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I')
panic: Allow warnings to set different taint flags
intel-iommu: intel_iommu_map_range failed at very end of address space
intel-iommu: errors with smaller iommu widths
intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled
intel-iommu: use physfn to search drhd for VF
intel-iommu: Print out iommu seq_id
intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported
intel-iommu: Avoid global flushes with caching mode.
intel-iommu: Use correct domain ID when caching mode is enabled
intel-iommu mistakenly uses offset_pfn when caching mode is enabled
intel-iommu: use for_each_set_bit()
intel-iommu: Fix section mismatch dmar_ir_support() uses dmar_tbl.
Removed check to prevent hotplug of display devices within pciehp.
Originally this was thought to have been required within the PCI
Hotplug specification for some legacy devices. However there is
no such requirement in the most recent revision. The check prevents
hotplug of not only display devices but also computational GPUs
which require serviceability.
Signed-off-by: Praveen Kalamegham <praveen@nextio.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
At the moment only PCI-E briges can be flagged as hotplug, thus
allowing manual resource preallocation via pci=hpmemsize=nnM and
pci=hpiosize=nnM kernel parameters. Some PCI hotplug bridges, e.g.
PLX 6254 can also benefit from this functionalily, as kernel fails
to properly allocate their resources when hotplug device is added
and PCI bus is rescanned.
This patch adds header quirk for PLX 6254 that marks this bridge
as hotplug. Other PCI bridges with similar problems can use it
as well.
Signed-off-by: Felix Radensky <felix@embedded-sol.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCI config space bin_attr read handler has a hardcoded CAP_SYS_ADMIN
check to verify privileges before allowing a user to read device
dependent config space. This is meant to protect from an unprivileged
user potentially locking up the box.
When assigning a PCI device directly to a guest with libvirt and KVM,
the sysfs config space file is chown'd to the unprivileged user that
the KVM guest will run as. The guest needs to have full access to the
device's config space since it's responsible for driving the device.
However, despite being the owner of the sysfs file, the CAP_SYS_ADMIN
check will not allow read access beyond the config header.
With this patch we check privileges against the capabilities used when
openining the sysfs file. The allows a privileged process to open the
file and hand it to an unprivileged process, and the unprivileged process
can still read all of the config space.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allows bin_attr->read,write,mmap callbacks to check file specific data
(such as inode owner) as part of any privilege validation.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
vlynq: make whole Kconfig-menu dependant on architecture
add descriptive comment for TIF_MEMDIE task flag declaration.
EEPROM: max6875: Header file cleanup
EEPROM: 93cx6: Header file cleanup
EEPROM: Header file cleanup
agp: use NULL instead of 0 when pointer is needed
rtc-v3020: make bitfield unsigned
PCI: make bitfield unsigned
jbd2: use NULL instead of 0 when pointer is needed
cciss: fix shadows sparse warning
doc: inode uses a mutex instead of a semaphore.
uml: i386: Avoid redefinition of NR_syscalls
fix "seperate" typos in comments
cocbalt_lcdfb: correct sections
doc: Change urls for sparse
Powerpc: wii: Fix typo in comment
i2o: cleanup some exit paths
Documentation/: it's -> its where appropriate
UML: Fix compiler warning due to missing task_struct declaration
UML: add kernel.h include to signal.c
...
Now, a dedicated HEST tabling parsing code is used for PCIE AER
firmware_first setup. It is rebased on general HEST tabling parsing
code of APEI. The firmware_first setup code is moved from PCI core to
AER driver too, because it is only AER related.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Len Brown <len.brown@intel.com>
We now know how to deal with these tables so that they are harmless.
Set TAINT_FIRMWARE_WORKAROUND instead of the default TAINT_WARN.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We have nearly the same code for warnings repeated four times. Move
it into a separate function.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Use kmemdup when some other buffer is immediately copied into the
allocated region.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression from,to,size,flag;
statement S;
@@
- to = \(kmalloc\|kzalloc\)(size,flag);
+ to = kmemdup(from,size,flag);
if (to==NULL || ...) S
- memcpy(to, from, size);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_read/write_vpd() can fail due to a timeout. Usually the command
times out because of firmware issues (incorrect vpd length, etc.) on the
PCI card. Currently, the timeout occurs silently.
Output a message to the user indicating that they should check with
their vendor for new firmware.
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This fixes all occurrences of pci_enable_device and pci_disable_device
in all comments. There are no code changes involved.
Signed-off-by: Roman Fietze <roman.fietze@telemotive.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
As reported in <http://bugs.debian.org/552299>, MSI appears to be
broken for this on-board device. We already have a quirk for the
P5N32-SLI Premium; extend it to cover both variants of the board.
Reported-by: Romain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Add amd_iommu=off command line option
iommu-api: Remove iommu_{un}map_range functions
x86/amd-iommu: Implement ->{un}map callbacks for iommu-api
x86/amd-iommu: Make amd_iommu_iova_to_phys aware of multiple page sizes
x86/amd-iommu: Make iommu_unmap_page and fetch_pte aware of page sizes
x86/amd-iommu: Make iommu_map_page and alloc_pte aware of page sizes
kvm: Change kvm_iommu_map_pages to map large pages
VT-d: Change {un}map_range functions to implement {un}map interface
iommu-api: Add ->{un}map callbacks to iommu_ops
iommu-api: Add iommu_map and iommu_unmap functions
iommu-api: Rename ->{un}map function pointers to ->{un}map_range
intel_iommu_map_range() doesn't allow allocation at the very end of the
address space; that code has been simplified and corrected.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When using iommu_domain_alloc with the Intel iommu, the domain address
width is always initialized to 48 bits (agaw 2). This domain->agaw value
is then used by pfn_to_dma_pte to (always) build a 4 level page table.
However, not all systems support iommu width of 48 or 4 level page tables.
In particular, the Core i5-660 and i5-670 support an address width of 36
bits (not 39!), an agaw of only 1, and only 3 level page tables.
This version of the patch simply lops off extra levels of the page tables
if the agaw value of the iommu is less than what is currently allocated
for the domain (in intel_iommu_attach_device). If there were already
allocated addresses above what the new iommu can handle, EFAULT is
returned.
Signed-off-by: Tom Lyon <pugs@cisco.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This reverts commit 977d17bb17, because it
can cause problems with some devices not getting any resources at all
when the resource tree is re-allocated.
For an example of this, see
https://bugzilla.kernel.org/show_bug.cgi?id=15960
(originally https://bugtrack.alsa-project.org/alsa-bug/view.php?id=4982)
(lkml thread: http://lkml.org/lkml/2010/4/19/20)
where Peter Henriksson reported his Xonar DX sound card gone, because
the IO port region was no longer allocated.
Reported-bisected-and-tested-by: Peter Henriksson <peter.henriksson@gmail.com>
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Requested-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Skip zero-ing in aer_alloc_rpc() since it is allocated by kzalloc().
The closing comment marker "*/" is recommended for kernel-doc comments.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I noticed that when I inject a fatal error to an endpoint via
aer-inject, aer_root_reset() is called as reset_link for a
downstream port at upstream of the endpoint:
pcieport 0000:00:06.0: AER: Uncorrected (Fatal) error received: id=5401
:
pcieport 0000:52:02.0: Root Port link has been reset
It externally appears to be working, but internally issues some
accesses to PCI_ERR_ROOT_COMMAND/STATUS registers that is for
root port so not available on downstream port.
This patch introduces default_downstream_reset_link that is
a version of aer_root_reset() with no accesses to root port's
register. It is used for downstream ports that has no reset_link
function its specific.
This patch also updates related description in pcieaer-howto.txt.
Some minor fixes are included.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The pcie->port of port service device points the port associated
the service with. The find_aer_service iterates over children of
given port udev.
So it is clear that the pcie->port of port service of given port
udev must always point the udev.
Therefore we can know the type of udev without checking its children.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make it clear that we only interest in 2 *_RCV bits.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Current get_e_source() returns pointer to an element of array.
However since it also progress consume counter, it is possible
that the element is overwritten by newly produced data before
the element is really consumed.
This patch changes get_e_source() to copy contents of the element
to address pointed by its caller. Once copied the element in
array can be consumed.
And relocate this function to more innocuous place.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Divide tricky for-loop into readable if-blocks.
The logic to set multi_error_valid (to force walking pci bus
hierarchy to find 2nd~ error devices) is changed too, to check
MULTI_{,_UN}COR_RCV bit individually and to force walk only when
it is required.
And rework setting e_info->severity for uncorrectable, not to use
magic numbers.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Stop iteration if we cannot register any more.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Take core part of find_device_iter() to make a new function
is_error_source() that checks given device has report an error
or not.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Return bool to indicate that the source device is found or not.
This allows us to skip calling aer_process_err_devices() if we can.
And move dev_printk for debug into this function.
v2: return bool instead of int
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
These functions are only called from init/remove path of aerdrv,
so move them from aerdrv_core.c to aerdrv.c, to make them static.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This cleanup solves some minor naming issues by removing unuseful
function aer_delete_rootport() and by renaming disable_root_aer()
to aer_disable_rootport().
- Inconsistent location of alloc & free:
The struct rpc is allocated in aer_alloc_rpc() at aerdrv.c
while it is implicitly freed in aer_delete_rootport() at
aerdrv_core.c.
- Inconsistent function name:
It makes a bit confusion that aer_delete_rootport() is seemed
to be paired with aer_enable_rootport(), i.e. there is neither
"add" against "delete" nor "disable" against "enable".
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
A successful write() to the "reset" sysfs attribute should return the
number of bytes written, not 0. Otherwise userspace (bash) retries the
write over and over again.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Most current machines have no problem with this, and in fact many devices and
features work best (or only!) with MSI.
Reported-by: Petteri Räty <betelgeuse@gentoo.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pci_lock must be a real spinlock in preempt-rt. Convert it to
raw_spinlock. No change for !RT kernels.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch (as1353) removes a couple of unnecessary assignments from
the PCI core. The should_wakeup flag is naturally initialized to 0;
there's no need to clear it.
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Create convenience symlinks in sysfs, linking slots to device
functions, and vice versa. These links make it easier for users to
figure out which devices actually live in what slots.
For example:
sapphire:/sys/bus/pci/slots # ls
1 10 2 3 4 5 6 7 8 9
sapphire:/sys/bus/pci/slots # ls -l 3
total 0
-r--r--r-- 1 root root 65536 Aug 18 14:10 address
lrwxrwxrwx 1 root root 0 Aug 18 14:10 function0 ->
../../../../devices/pci0000:23/0000:23:01.0
lrwxrwxrwx 1 root root 0 Aug 18 14:10 function1 ->
../../../../devices/pci0000:23/0000:23:01.1
sapphire:/sys/bus/pci/slots # ls -l 3/function0/slot
lrwxrwxrwx 1 root root 0 Aug 18 14:13 3/function0/slot ->
../../../bus/pci/slots/3
The original form of this patch was written by Matthew Wilcox,
and was enhanced to include links from the sysfs slots/ directory
pointing back at the device functions.
Cc: willy@linux.intel.com
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This ensures that the translations for unmapped IO mappings or
unmapped memory are properly removed from the MMU hash table
before such an unplug. Without this, the hypervisor refuses the
unplug operations due to those resources still being mapped by
the partition.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If the firmware puts a device back into D0 state at resume time, we'll
update its state in resume_noirq and thus skip the platform resume code.
Calling that code twice should be safe and we ought to avoid getting to
that point anyway, so remove the check and also allow the platform pci
code to be called for D0.
Fixes USB not being powered after resume on recent Lenovo machines.
Acked-by: Alex Chiang <achiang@canonical.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This reverts c519a5a7da. That change added a warning about devices that
didn't respond correctly when sizing BARs, which helped diagnose broken
devices. But the test wasn't specific enough, so it also complained about
working devices with zero-size BARs, e.g.,
https://bugzilla.kernel.org/show_bug.cgi?id=15822
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Commit 074835f014 ("intel-iommu: Fix
kernel hand if interrupt remapping disabled in BIOS") is adding a check
for interrupt remapping disabled and is dereferencing the dmar_tbl
pointer without checking its value.
Unfortunately, this value is null when booting inside a 64bit virtual
box guest with io-apic disabled, leading to a crash. With a check on it,
the guest is now booting. It's triggering a WARN() in
clockevent_delta2ns but it's better than not booting at all and allows
the user to see there's something wrong on their io-apic setup.
Signed-off-by: Arnaud Patard <apatard@mandriva.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
When virtfn is used, we should use physfn to find correct drhd
-v2: add pci_physfn() Suggested by Roland Dreier <rdreier@cisco.com>
do can remove ifdef in dmar.c
-v3: Chris pointed out we need that for dma_find_matched_atsr_unit too
also change dmar_pci_device_match() static
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Roland Dreier <rdreier@cisco.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
While it may be efficient on real hardware, emulation of global
invalidations is very expensive as all shadow entries must be examined.
This patch changes the behaviour when caching mode is enabled (which is
the case when IOMMU emulation takes place). In this case, page specific
invalidation is used instead.
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In caching-mode mappings of pages (changes from non-present to present)
require invalidation.
Currently, this IOTLB flush is performed with domain ID of zero.
This is not according to the VT-d spec and causes big problems for
emulating software.
This patch uses the correct domain ID in IOTLB flushes.
Device IOTLB invalidation is performed only on present to non-present
changes. This decision is now based on explicit parameter instead of
zero domain-ID.
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
intel_map_sg used offset_pfn which was set to zero when invalidating the IOTLB.
intel_map_sg now uses size variable for this matter.
Signed-off-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Replace open-coded loop with for_each_set_bit().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
While testing completion timeouts I found that hardware was not recovering.
It looks like the hot reset was never being propagated to the endpoint
devices on the bus due to the fact that we were clearing the bit too
quickly.
The documentation I have states that we should be transmitting hot reset
TS1s for 2ms. To achieve this I have added a 2ms delay from the time we
set the secondary bus reset bit to the time we clear it. In addition I
changed the define used for the secondary bus reset bit to match the
register define that was being used.
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The missing initialization of the nb_cntl.strap_msi_enable does not
seem to be the only problem that prevents MSI, so that quirk is not
sufficient to enable MSI on all machines. To be safe, disable MSI
unconditionally for the internal graphics and HDMI audio on these
chipsets.
[rjw: Added the PCI_VENDOR_ID_AI quirk.]
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI: truncate _CRS windows with _LEN > _MAX - _MIN + 1
x86/PCI: for host bridge address space collisions, show conflicting resource
frv/PCI: remove redundant warnings
x86/PCI: remove redundant warnings
PCI: don't say we claimed a resource if we failed
PCI quirk: Disable MSI on VIA K8T890 systems
PCI quirk: RS780/RS880: work around missing MSI initialization
PCI quirk: only apply CX700 PCI bus parking quirk if external VT6212L is present
PCI: complain about devices that seem to be broken
PCI: print resources consistently with %pR
PCI: make disabled window printk style match the enabled ones
PCI: break out primary/secondary/subordinate for readability
PCI: for address space collisions, show conflicting resource
resources: add interfaces that return conflict information
PCI: cleanup error return for pcix get and set mmrbc functions
PCI: fix access of PCI_X_CMD by pcix get and set mmrbc functions
PCI: kill off pci_register_set_vga_state() symbol export.
PCI: fix return value from pcix_get_max_mmrbc()
pci_claim_resource() can fail, so pay attention and only claim success
when it actually succeeded. If pci_claim_resource() fails, it prints a
useful diagnostic.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bugzilla 15287 indicates that there's a problem with Message Signalled
Interrupts on VIA K8T890 systems. Add a quirk to disable MSI on these
systems.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Tested-by: Jan Kreuzer <kontrollator@gmx.de>
Tested-by: lh <jarryson@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
AMD says in section 2.5.4 (GFX MSI Enable) of #43291 (AMD 780G Family
Register Programming Requirements):
The SBIOS must enable internal graphics MSI capability in GCCFG by
setting the following: NBCFG.NB_CNTL.STRAP_MSI_ENABLE='1'
Quite a few BIOS writers misinterpret this sentence and think that
enabling MSI is an optional feature. However, clearing that bit just
prevents delivery of MSI messages but does not remove the MSI PCI
capabilities registers, and so leaves these devices unusable for any
driver that attempts to use MSI.
Setting that bit is not possible after the BIOS has locked down the
configuration registers, so we have to manually disable MSI for the
affected devices.
This fixes the codec communication errors in the HDA driver when
accessing the HDMI audio device, and allows us to get rid of the
overcautious quirk in radeon_irq_kms.c.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Tested-by: Alex Deucher <alexdeucher@gamil.com>
Cc: <stable@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Apply the CX700 quirk only when an external VT6212L is present (which
is the case for the errant hardware the quirk was written for), don't
touch the settings otherwise -- Hauppage PVR-500 tuners need PCI Bus
Parking in order to work and when that's turned on everything seems
to behave fine.
I guess the underlying problem is a combination of an external VT6212L
and the CX700 rather than the CX700's PCI being broken completely for
all cases...
Reported-by: Jeroen Roos <jeroen@roosnl.com>
Signed-off-by: Tim Yamin <plasm@roo.me.uk>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If we can tell that a device isn't working correctly, we should tell
the user to make debugging easier. Otherwise, it can take a lot of
work to determine whether the problem is in the driver, PCMCIA, PCI,
hardware, etc., as in http://bugzilla.kernel.org/show_bug.cgi?id=12006
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No functional change; just print resources in the conventional style.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No functional change; this just tweaks the changes from 349e1823a405
so the new printks for disabled PCI-to-PCI bridge windows match the
ones for the enabled windows.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
CC: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
No functional change; just add names for the primary/secondary/subordinate
bus numbers read from config space rather than repeatedly masking/shifting.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
With request_resource_conflict(), we can learn what the actual conflict is,
so print that info for debugging purposes.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
driver core: numa: fix BUILD_BUG_ON for node_read_distance
driver-core: document ERR_PTR() return values
kobject: documentation: Update to refer to kset-example.c.
sysdev: the cpu probe/release attributes should be sysdev_class_attributes
kobject: documentation: Fix erroneous example in kobject doc.
driver-core: fix missing kernel-doc in firmware_class
Driver core: Early platform kernel-doc update
sysfs: fix sysfs lockdep warning in mlx4 code
sysfs: fix sysfs lockdep warning in infiniband code
sysfs: fix sysfs lockdep warning in ipmi code
sysfs: Initialised pci bus legacy_mem field before use
sysfs: use sysfs_bin_attr_init in firmware class driver
pcix_get_mmrbc() returns the maximum memory read byte count (mmrbc), if
successful, or an appropriate error value, if not.
Distinguishing errors from correct values and understanding the meaning of an
error can be somewhat confusing in that:
correct values: 512, 1024, 2048, 4096
errors: -EINVAL -22
PCIBIOS_FUNC_NOT_SUPPORTED 0x81
PCIBIOS_BAD_VENDOR_ID 0x83
PCIBIOS_DEVICE_NOT_FOUND 0x86
PCIBIOS_BAD_REGISTER_NUMBER 0x87
PCIBIOS_SET_FAILED 0x88
PCIBIOS_BUFFER_TOO_SMALL 0x89
The PCIBIOS_ errors are returned from the PCI functions generated by the
PCI_OP_READ() and PCI_OP_WRITE() macros.
In a similar manner, pcix_set_mmrbc() also returns the PCIBIOS_ error values
returned from pci_read_config_[word|dword]() and pci_write_config_word().
Following pcix_get_max_mmrbc()'s example, the following patch simply returns
-EINVAL for all PCIBIOS_ errors encountered by pcix_get_mmrbc(), and -EINVAL
or -EIO for those encountered by pcix_set_mmrbc().
This simplification was chosen in light of the fact that none of the current
callers of these functions are interested in the specific type of error
encountered. In the future, should this change, one could simply create a
function that maps each PCIBIOS_ error to a corresponding unique errno value,
which could be called by pcix_get_max_mmrbc(), pcix_get_mmrbc(), and
pcix_set_mmrbc().
Additionally, this patch eliminates some unnecessary variables.
Cc: stable@kernel.org
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
An e1000 driver on a system with a PCI-X bus was always being returned
a value of 135 from both pcix_get_mmrbc() and pcix_set_mmrbc(). This
value reflects an error return of PCIBIOS_BAD_REGISTER_NUMBER from
pci_bus_read_config_dword(,, cap + PCI_X_CMD,).
This is because for a dword, the following portion of the PCI_OP_READ()
macro:
if (PCI_##size##_BAD) return PCIBIOS_BAD_REGISTER_NUMBER;
expands to:
if (pos & 3) return PCIBIOS_BAD_REGISTER_NUMBER;
And is always true for 'cap + PCI_X_CMD', which is 0xe4 + 2 = 0xe6. ('cap' is
the result of calling pci_find_capability(, PCI_CAP_ID_PCIX).)
The same problem exists for pci_bus_write_config_dword(,, cap + PCI_X_CMD,).
In both cases, instead of calling _dword(), _word() should be called.
Cc: stable@kernel.org
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
When pci_register_set_vga_state() was made __init, the EXPORT_SYMBOL() was
retained, which now leaves us with a section mismatch.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Cc: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
For the PCI_X_STATUS register, pcix_get_max_mmrbc() is returning an incorrect
value, which is based on:
(stat & PCI_X_STATUS_MAX_READ) >> 12
Valid return values are 512, 1024, 2048, 4096, which correspond to a 'stat'
(masked and right shifted by 21) of 0, 1, 2, 3, respectively.
A right shift by 11 would generate the correct return value when 'stat' (masked
and right shifted by 21) has a value of 1 or 2. But for a value of 0 or 3 it's
not possible to generate the correct return value by only right shifting.
Fix is based on pcix_get_mmrbc()'s similar dealings with the PCI_X_CMD register.
Cc: stable@kernel.org
Signed-off-by: Dean Nelson <dnelson@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PPC64 is failing to boot the latest mmotm due to an uninitialised pointer in
pci_create_legacy_files(). The surprise is that machines boot at all and it
would appear to affect current mainline as well. This patch fixes the problem.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'for-linus' of git://git.monstr.eu/linux-2.6-microblaze: (27 commits)
microblaze: entry.S use delay slot for return handlers
microblaze: Save current task directly
microblaze: Simplify entry.S - save/restore r3/r4 - ret_from_trap
microblaze: PCI early support for noMMU system
microblaze: Fix dma alloc and free coherent dma functions
microblaze: Add consistent code
microblaze: pgtable.h: move consistent functions
microblaze: Remove ancient Kconfig option for consistent mapping
microblaze: Remove VMALLOC_VMADDR
microblaze: Add define for ASM_LOOP
microblaze: Preliminary support for dma drivers
microblaze: remove trailing space in messages
microblaze: Use generic show_mem()
microblaze: Change temp register for cmdline
microblaze: Preliminary support for dma drivers
microblaze: Move cache function to cache.c
microblaze: Add support from PREEMPT
microblaze: Add support for Xilinx PCI host bridge
microblaze: Enable PCI, missing files
microblaze: Add core PCI files
...
Per ACPI spec, _ERG method should be executed before device driver
gets control for hotpluged device. Firmware might do some configuration
there. See http://bugzilla.kernel.org/show_bug.cgi?id=10805. In this
machine, _REG method of docked device will configure cardbus bridge.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: Paul Martin <pm@debian.org>
Signed-off-by: Len Brown <len.brown@intel.com>
We can use pci-dma-compat.h to implement pci_set_dma_mask and
pci_set_consistent_dma_mask as we do with the other PCI DMA API.
We can remove HAVE_ARCH_PCI_SET_DMA_MASK too.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dma_set_coherent_mask corresponds to pci_set_consistent_dma_mask. This is
necessary to move to the generic device model DMA API from the PCI bus
specific API in the long term.
dma_set_coherent_mask works in the exact same way that
pci_set_consistent_dma_mask does. So this patch also changes
pci_set_consistent_dma_mask to call dma_set_coherent_mask.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@suse.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This changes pci_set_dma_mask to call the generic DMA API, dma_set_mask.
pci_set_dma_mask (in drivers/pci/pci.c) does the same things that
dma_set_mask does on all the architectures that use pci_set_dma_mask;
calls dma_supprted and sets dev->dma_mask. So we safely change
pci_set_dma_mask to simply call dma_set_mask.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@suse.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are two parts of changes. The first is just enable
PCI in Makefiles and in Kconfig. The second is the rest of
missing files. I didn't want to add it with previous patch
because that patch is too big.
Current Microblaze toolchain has problem with weak symbols
that's why is necessary to apply this changes to be possible
to compile pci support.
Xilinx knows about this problem.
Signed-off-by: Michal Simek <monstr@monstr.eu>
In the future, we are going to be changing the lock type for struct
device (once we get the lockdep infrastructure properly worked out) To
make that changeover easier, and to possibly burry the lock in a
different part of struct device, let's create some functions to lock and
unlock a device so that no out-of-core code needs to be changed in the
future.
This patch creates the device_lock/unlock/trylock() functions, and
converts all in-tree users to them.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Phil Carmody <ext-phil.2.carmody@nokia.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Magnus Damm <damm@igel.co.jp>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alex Chiang <achiang@hp.com>
Cc: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrew Patterson <andrew.patterson@hp.com>
Cc: Yu Zhao <yu.zhao@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Wolfram Sang <w.sang@pengutronix.de>
Cc: CHENG Renquan <rqcheng@smu.edu.sg>
Cc: Oliver Neukum <oliver@neukum.org>
Cc: Frans Pop <elendil@planet.nl>
Cc: David Vrabel <david.vrabel@csr.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
After merging the final tree, today's linux-next build (powerpc
allyesconfig) failed like this:
drivers/pci/pci-sysfs.c: In function 'pci_create_legacy_files':
drivers/pci/pci-sysfs.c:645: error: lvalue required as unary '&' operand
drivers/pci/pci-sysfs.c:658: error: lvalue required as unary '&' operand
Caused by commit "sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on
dynamic attributes" interacting with commit "sysfs: Use one lockdep
class per sysfs attribute") both from the driver-core tree.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
These are the non-static sysfs attributes that exist on
my test machine. Fix them to use sysfs_attr_init or
sysfs_bin_attr_init as appropriate. It simply requires
making a sysfs attribute present to see this. So this
is a little bit tedious but otherwise not too bad.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Constify struct sysfs_ops.
This is part of the ops structure constification
effort started by Arjan van de Ven et al.
Benefits of this constification:
* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime
* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime
* potentially better optimized code as the compiler
can assume that the const data cannot be changed
* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing
Signed-off-by: Emese Revfy <re.emese@gmail.com>
Acked-by: David Teigland <teigland@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch changes the iommu-api functions for mapping and
unmapping page ranges to use the new page-size based
interface. This allows to remove the range based functions
later.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The new function pointer names match better with the
top-level functions of the iommu-api which are using them.
Main intention of this change is to make the ->{un}map
pointer names free for two new mapping functions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This patch solves nasty problem original driver has.
Original goal of the ricoh_mmc was to disable this device because then,
mmc cards can be read using standard SDHCI controller, thus avoiding
writing of yet another driver.
However, the act of disablement, makes other pci functions that belong to
this controller (xD and memstick) shift up one level, thus pci core has
now wrong idea about these devices.
To fix this issue, this patch moves the driver into the pci quirk section,
thus it is executes before the pci is enumerated, and therefore solving
that issue, also same sequence of commands is performed on resume for same
reasons.
Also regardless of the above, this way is cleaner. You still need to set
CONFIG_MMC_RICOH_MMC to enable this quirk
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Acked-by: Philip Langdale <philipl@overt.org>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make the run-time power management of PCI devices be inactive by
default by calling pm_runtime_forbid() for each PCI device during its
initialization. This setting may be overriden by the user space with
the help of the /sys/devices/.../power/control interface.
That's necessary to avoid breakage on systems where ACPI-based
wake-up is known to fail for some devices.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'x86-bootmem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
early_res: Need to save the allocation name in drop_range_partial()
sparsemem: Fix compilation on PowerPC
early_res: Add free_early_partial()
x86: Fix non-bootmem compilation on PowerPC
core: Move early_res from arch/x86 to kernel/
x86: Add find_fw_memmap_area
Move round_up/down to kernel.h
x86: Make 32bit support NO_BOOTMEM
early_res: Enhance check_and_double_early_res
x86: Move back find_e820_area to e820.c
x86: Add find_early_area_size
x86: Separate early_res related code from e820.c
x86: Move bios page reserve early to head32/64.c
sparsemem: Put mem map for one node together.
sparsemem: Put usemap for one node together
x86: Make 64 bit use early_res instead of bootmem before slab
x86: Only call dma32_reserve_bootmem 64bit !CONFIG_NUMA
x86: Make early_node_mem get mem > 4 GB if possible
x86: Dynamically increase early_res array size
x86: Introduce max_early_res and early_res_count
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1341 commits)
virtio_net: remove forgotten assignment
be2net: fix tx completion polling
sis190: fix cable detect via link status poll
net: fix protocol sk_buff field
bridge: Fix build error when IGMP_SNOOPING is not enabled
bnx2x: Tx barriers and locks
scm: Only support SCM_RIGHTS on unix domain sockets.
vhost-net: restart tx poll on sk_sndbuf full
vhost: fix get_user_pages_fast error handling
vhost: initialize log eventfd context pointer
vhost: logging thinko fix
wireless: convert to use netdev_for_each_mc_addr
ethtool: do not set some flags, if others failed
ipoib: returned back addrlen check for mc addresses
netlink: Adding inode field to /proc/net/netlink
axnet_cs: add new id
bridge: Make IGMP snooping depend upon BRIDGE.
bridge: Add multicast count/interval sysfs entries
bridge: Add hash elasticity/max sysfs entries
bridge: Add multicast_snooping sysfs toggle
...
Trivial conflicts in Documentation/feature-removal-schedule.txt