The conditions to allow runtime PM on PCIe ports are currently spread
across two different files: The condition relating to hotplug ports is
located in portdrv_pci.c whereas all other conditions are located in pci.c.
Consolidate all conditions in a single place in pci.c, thus making it
easier to follow the logic and amend conditions down the road.
Note that the condition relating to hotplug ports is inserted *before* the
condition relating to the "pcie_port_pm=force" command line option, so
runtime PM is not afforded to hotplug ports even if this option is given.
That's exactly how the code behaved up until now. If this is not desired,
the ordering of the conditions can simply be reversed.
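The effect of the ordering can be sketched in a small userspace model. This is not the kernel's pci_bridge_d3_possible(); struct port_model, enum port_pm_opt and port_d3_possible() are hypothetical names, and placing the "off" check first is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "pcie_port_pm=" command line option. */
enum port_pm_opt { PORT_PM_DEFAULT, PORT_PM_OFF, PORT_PM_FORCE };

/* Hypothetical, simplified model of a port; not struct pci_dev. */
struct port_model {
	bool is_pcie_port;	/* port type that may runtime suspend at all */
	bool is_hotplug;	/* port is hotplug capable */
	bool bios_recent;	/* firmware is considered new enough */
};

/* All conditions live in one function and are evaluated top to bottom. */
static bool port_d3_possible(const struct port_model *port,
			     enum port_pm_opt opt)
{
	assert(port != NULL);

	if (!port->is_pcie_port)
		return false;
	if (opt == PORT_PM_OFF)
		return false;
	/* Checked before "force", so "force" cannot override it. */
	if (port->is_hotplug)
		return false;
	if (opt == PORT_PM_FORCE)
		return true;
	return port->bios_recent;
}
```

Swapping the is_hotplug and PORT_PM_FORCE checks in this model is all it would take to let "force" apply to hotplug ports as well.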
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently pcie_portdrv_probe() activates runtime PM on a PCIe port even
if it will never actually suspend because the BIOS is too old or the
"pcie_port_pm=off" option was specified on the kernel command line.
A few CPU cycles can be saved by not activating runtime PM at all in these
cases, because rpm_idle() and rpm_suspend() will bail out right at the
beginning when calling rpm_check_suspend_allowed(), instead of carrying out
various locking and assignments, invoking rpm_callback(), getting back
-EBUSY and rolling everything back.
The conditions checked in pci_bridge_d3_possible() are all static: they
never change during uptime of the system, hence it's safe to call this
function to determine if runtime PM should be activated.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After a device has been added, removed or had its D3cold attributes
changed, we recheck whether its parent bridge may runtime suspend to D3hot
with pci_bridge_d3_update().
The most naive algorithm would be to iterate over the bridge's children and
check if any of them are blocking D3.
The function already tries to be a bit smarter than that by first checking
the device that was changed. If this device already blocks D3 on the
bridge, then walking over all the other children can be skipped. A
drawback of this approach is that if the device is *not* blocking D3, it
will be checked a second time by pci_walk_bus(). But that's cheap and is
outweighed by the performance gain of potentially skipping pci_walk_bus()
altogether.
The algorithm can be optimized further by taking into account if D3 is
currently allowed for the bridge, as shown in the following truth table:
(a) remove && bridge_d3:    D3 is currently allowed for the bridge and
                            removing one of its children won't change
                            that. No action necessary.
(b) remove && !bridge_d3:   D3 may now be allowed for the bridge if the
                            removed child was the only one blocking it.
                            Check all its siblings to verify that.
(c) !remove && bridge_d3:   D3 may now be disallowed but this can only
                            be caused by the added/changed child, not
                            any of its siblings. Check only that single
                            device.
(d) !remove && !bridge_d3:  D3 may now be allowed for the bridge if the
                            changed child was the only one blocking it.
                            Check all its siblings to verify that.
                            By checking beforehand if the changed child
                            is blocking D3, we may be able to skip
                            checking its siblings.
Currently we do not special-case option (a) and in case of option (c) we
gratuitously call pci_walk_bus(). Speed up the algorithm by adding these
optimizations. Reword the comments a bit in an attempt to improve clarity.
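The four cases of the truth table can be written down as a small decision function. This is only a sketch of the case analysis; enum d3_action and d3_update_action() are hypothetical names and do not appear in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the work required in each case. */
enum d3_action {
	D3_NO_ACTION,			/* case (a) */
	D3_CHECK_SIBLINGS,		/* case (b) */
	D3_CHECK_DEVICE,		/* case (c) */
	D3_CHECK_DEVICE_THEN_SIBLINGS	/* case (d) */
};

/*
 * remove:    the update was caused by the removal of a device
 * bridge_d3: D3 is currently allowed for the bridge
 */
static enum d3_action d3_update_action(bool remove, bool bridge_d3)
{
	if (remove)
		return bridge_d3 ? D3_NO_ACTION : D3_CHECK_SIBLINGS;

	return bridge_d3 ? D3_CHECK_DEVICE : D3_CHECK_DEVICE_THEN_SIBLINGS;
}
```

Only cases (b) and (d) can require a walk over the whole bus, and in case (d) the walk is skipped if the changed device itself still blocks D3.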
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The algorithm to update the flag indicating whether a bridge may go to D3
makes a few optimizations based on whether the update was caused by the
removal of a device on the one hand, versus the addition of a device or the
change of its D3cold flags on the other hand.
The information whether the update pertains to a removal is currently
passed in by the caller, but the function may as well determine that itself
by examining the device in question, thereby allowing for a considerable
simplification and reduction of the code.
Out of several options to determine removal, I've chosen the function
device_is_registered() because it's cheap: It merely returns the
dev->kobj.state_in_sysfs flag. That flag is set through device_add() when
the root bus is scanned and cleared through device_del(). The call to
pci_bridge_d3_update() happens after each of these calls, respectively, so
the ordering is correct.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This function is always called with an existing pci_dev struct, which
holds a reference on the pci_bus struct it resides on, which in turn
holds a reference on pci_bus->bridge, which is the pci_dev's parent.
Hence there's no need to acquire an additional ref on the parent.
More specifically, the pci_dev exists until pci_destroy_dev() drops the
final reference on it, so all calls to pci_bridge_d3_update() must be
finished before that. It is arguably the caller's responsibility to ensure
that it doesn't call pci_bridge_d3_update() with a pci_dev that might
suddenly disappear, but in any case the existing callers are all safe:
- The call in pci_destroy_dev() happens before the call to put_device().
- The call in pci_bus_add_device() is synchronized with pci_destroy_dev()
using pci_lock_rescan_remove().
- The calls to pci_d3cold_disable() from the xhci and nouveau drivers
are safe because a ref on the pci_dev is held as long as it's bound to
a driver.
- The calls to pci_d3cold_enable() / pci_d3cold_disable() when modifying
the sysfs "d3cold_allowed" entry are also safe because kernfs_drain()
waits for existing sysfs users to finish before removing the entry,
and pci_destroy_dev() is called way after that.
No functional change intended.
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On some systems, the firmware does not allow certain PCI devices to be put
in deep D-states. This can cause problems for wakeup signalling, if the
device does not support PME# in the deepest allowed suspend state. For
example, Pierre reports that on his system, ACPI does not permit his xHCI
host controller to go into D3 during runtime suspend -- but D3 is the only
state in which the controller can generate PME# signals. As a result, the
controller goes into runtime suspend but never wakes up, so it doesn't work
properly. USB devices plugged into the controller are never detected.
If the device relies on PME# for wakeup signals but is not capable of
generating PME# in the target state, the PCI core should accurately report
that it cannot do wakeup from runtime suspend. This patch modifies the
pci_dev_run_wake() routine to add this check.
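A minimal sketch of the added check, assuming PME# support is tracked as a bitmask with one bit per power state (modeled on the kernel's pme_support field). The enum values and the function names pme_capable() and run_wake_possible() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state numbering; bit N of pme_support covers state N. */
enum { ST_D0, ST_D1, ST_D2, ST_D3HOT, ST_D3COLD };

/* Can the device signal PME# while in the given state? */
static bool pme_capable(unsigned int pme_support, int state)
{
	assert(state >= ST_D0 && state <= ST_D3COLD);
	return (pme_support >> state) & 1u;
}

/*
 * Report wakeup from runtime suspend as possible only if PME# works in
 * the state the device will actually be put into (target_state).
 */
static bool run_wake_possible(unsigned int pme_support, int target_state)
{
	if (!pme_support)
		return false;	/* no PME# support at all */

	return pme_capable(pme_support, target_state);
}
```

For a controller that signals PME# only from D3hot but is limited by firmware to D2, run_wake_possible() returns false in this model, which corresponds to the reported xHCI case.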
Reported-by: Pierre de Villemereuil <flyos@mailoo.org>
Tested-by: Pierre de Villemereuil <flyos@mailoo.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org
CC: Lukas Wunner <lukas@wunner.de>
Resource allocation for VFs is done via the VF BARx registers in the PF's
SR-IOV Capability, and the BARs in the VFs themselves are read-only zeros
(see SR-IOV spec r1.1, secs 3.3.14 and 3.4.1.11).
Even though the actual VF BARs are read-only zeros, the VF dev->resource[]
structs describe the space allocated for the VF (this is a piece of the
space described by the VF BARx register in the PF's SR-IOV capability).
It's meaningless to request additional alignment for a VF: the VF BAR
alignment is completely determined by the alignment of the VF BARx in the
PF and the size of the VF BAR.
Ignore the user's alignment requests for VF devices.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Users may request additional alignment of PCI resources, e.g., to align
BARs on page boundaries so they can be shared with guests via VFIO. This
of course may require reallocation if firmware has already assigned the
BARs with smaller alignments.
If the platform has requested PCI_PROBE_ONLY, we should never change any
PCI BARs, so we can't provide any additional alignment. Also, if a BAR is
marked as IORESOURCE_PCI_FIXED, e.g., for PCI Enhanced Allocation or if the
firmware depends on the current BAR value, we can't change the alignment.
In these cases, log a message and ignore the user's alignment requests.
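The rule can be sketched as a small helper. The function name effective_alignment() and its parameters are hypothetical; the real code operates on struct pci_dev and struct resource and also logs a message, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the alignment to use for a BAR.
 *
 * probe_only:      the platform requested PCI_PROBE_ONLY
 * bar_fixed:       the BAR is marked IORESOURCE_PCI_FIXED
 * current_align:   alignment the BAR already has
 * requested_align: additional alignment requested by the user (0 = none)
 */
static uint64_t effective_alignment(bool probe_only, bool bar_fixed,
				    uint64_t current_align,
				    uint64_t requested_align)
{
	/* Alignments are assumed to be zero or a power of two. */
	assert((requested_align & (requested_align - 1)) == 0);

	/* The BAR must not be moved: ignore the user's request. */
	if (probe_only || bar_fixed)
		return current_align;

	return requested_align > current_align ? requested_align
					       : current_align;
}
```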
[bhelgaas: changelog, use goto to simplify PCI_PROBE_ONLY check]
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Whenever a device is resumed or its power state is changed using the
platform, its new power state is read from the PM Control & Status Register
and cached in pci_dev->current_state by calling pci_update_current_state().
If the device is in D3cold, reading from config space typically results in
a fabricated "all ones" response. But if it's in D3hot, the two bits
representing the power state in the PMCSR are *also* set to 1. Thus D3hot
and D3cold are not discernible by just reading the PMCSR.
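The ambiguity is easy to reproduce in isolation. The sketch below decodes the two power state bits of a PMCSR value; the mask value 0x0003 matches the kernel's PCI_PM_CTRL_STATE_MASK, while the helper name pmcsr_to_state() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 1:0 of the PMCSR hold the power state. */
#define PM_CTRL_STATE_MASK	0x0003u

/* Decode the power state field: 0 = D0, 1 = D1, 2 = D2, 3 = D3hot. */
static unsigned int pmcsr_to_state(uint16_t pmcsr)
{
	return pmcsr & PM_CTRL_STATE_MASK;
}
```

A fabricated "all ones" read (0xffff) from a device in D3cold decodes to 3, exactly the same result as a genuine read from a device in D3hot.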
To account for this, pci_update_current_state() uses two workarounds:
- When transitioning to D3cold using pci_platform_power_transition(), the
new power state is set blindly by pci_update_current_state(), i.e.
without verifying that the device actually *is* in D3cold. This is
achieved by setting the "state" argument to PCI_D3cold. The "state"
argument was originally intended to convey the new state in case the
device doesn't have the PM capability. It is *also* used to convey the
device state if the PM capability is present and the new state is D3cold,
but this was never explained in the kerneldoc.
- Once the current_state is set to D3cold, further invocations of
pci_update_current_state() will blindly assume that the device is still
in D3cold and leave the current_state unmodified. To get out of this
impasse, the current_state has to be set directly, typically by calling
pci_raw_set_power_state() or pci_enable_device().
It would be desirable if pci_update_current_state() could reliably detect
D3cold by itself. That would allow us to do away with these workarounds,
and it would allow for a smarter, more energy conserving runtime resume
strategy after system sleep: Currently devices which utilize
direct_complete are mandatorily runtime resumed in their ->complete stage.
This can be avoided if their power state after system sleep is the same as
before, but it requires a mechanism to detect the power state reliably.
We've just gained the ability to query the platform firmware for its
opinion on the device's power state. On platforms conforming to ACPI 4.0
or newer, this allows recognition of D3cold. Pre-4.0 platforms lack _PR3
and therefore the deepest power state that will ever be reported is D3hot,
even though the device may actually be in D3cold. To detect D3cold in
those cases, accessibility of the vendor ID in config space is probed using
pci_device_is_present(). This also works for devices which are not
platform-power-manageable at all, but can be suspended to D3cold using a
nonstandard mechanism (e.g. some hybrid graphics laptops or Thunderbolt on
the Mac).
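The detection strategy can be sketched as follows for a device that has the PM capability. The inputs are passed in as plain values so the example is self-contained; struct probe_result, enum pwr_state and detect_power_state() are hypothetical names, and in the kernel the inputs would come from the platform ->get_power callback, pci_device_is_present() and a PMCSR read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state numbering; D0..D3hot match the PMCSR encoding. */
enum pwr_state { PWR_D0 = 0, PWR_D1, PWR_D2, PWR_D3HOT, PWR_D3COLD };

/* Hypothetical bundle of everything the detection needs. */
struct probe_result {
	enum pwr_state platform_state;	/* firmware's opinion */
	bool vendor_id_readable;	/* config space is accessible */
	uint16_t pmcsr;			/* raw PM Control & Status value */
};

static enum pwr_state detect_power_state(const struct probe_result *r)
{
	assert(r != NULL);

	/*
	 * Trust the platform if it reports D3cold. Otherwise fall back to
	 * probing the vendor ID, which also covers pre-4.0 ACPI platforms
	 * and devices powered down by nonstandard means.
	 */
	if (r->platform_state == PWR_D3COLD || !r->vendor_id_readable)
		return PWR_D3COLD;

	/* Config space is accessible, so the PMCSR can be believed. */
	return (enum pwr_state)(r->pmcsr & 0x3u);
}
```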
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Usually the most accurate way to determine a PCI device's power state is to
read its PM Control & Status Register. There are two cases however when
this is not an option: If the device doesn't have the PM capability at
all, or if it is in D3cold (in which case its config space is
inaccessible).
In both cases, we can alternatively query the platform firmware for its
opinion on the device's power state. To facilitate this, augment struct
pci_platform_pm_ops with a ->get_power callback and implement it for
acpi_pci_platform_pm (the only pci_platform_pm_ops existing so far).
It is used by a forthcoming commit to let pci_update_current_state()
recognize D3cold.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are devices not power-manageable by the platform, but still able to
runtime suspend to D3cold with a non-standard mechanism. One example is
laptop hybrid graphics where the discrete GPU and its built-in HDA
controller are power-managed either with a _DSM (AMD PowerXpress, Nvidia
Optimus) or a separate gmux controller (MacBook Pro). Another example is
Thunderbolt on Macs which is power-managed with custom ACPI methods.
When putting the system to sleep, we currently handle such devices
improperly by transitioning them from D3cold to D3hot (the default power
state defined at the top of pci_target_state()). This wastes energy and
prolongs the suspend sequence (powering up the Thunderbolt controller takes
2 seconds).
Avoid that by assuming that a non-standard PM mechanism is at work if the
device is not platform-power-manageable but currently in D3cold.
If the device is wakeup enabled, we might still have to wake it up from
D3cold if PME cannot be signaled from that power state.
The check for devices without PM capability comes before the check for
D3cold since such devices could in theory also be powered down by
non-standard means and should then be afforded direct-complete as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add a new helper function pci_find_resource() that can be used to find out
whether a given resource (for example from a child device) is contained
within a given PCI device's standard resources.
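The containment test at the heart of such a helper can be sketched in userspace. struct mem_range and find_parent_range() are hypothetical simplifications; the kernel works with struct resource instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range with inclusive bounds. */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the first range in bars[] that fully contains *res, or NULL if
 * none does. Partial overlap does not count as containment.
 */
static const struct mem_range *
find_parent_range(const struct mem_range *bars, size_t nbars,
		  const struct mem_range *res)
{
	size_t i;

	assert(bars != NULL && res != NULL);

	for (i = 0; i < nbars; i++) {
		if (bars[i].start <= res->start && res->end <= bars[i].end)
			return &bars[i];
	}
	return NULL;
}
```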
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/resource:
unicore32/PCI: Remove pci=firmware command line parameter handling
ARM/PCI: Remove arch-specific pcibios_enable_device()
ARM64/PCI: Remove arch-specific pcibios_enable_device()
MIPS/PCI: Claim bus resources on PCI_PROBE_ONLY set-ups
ARM/PCI: Claim bus resources on PCI_PROBE_ONLY set-ups
PCI: generic: Claim bus resources on PCI_PROBE_ONLY set-ups
PCI: Add generic pci_bus_claim_resources()
alx: Use pci_(request|release)_mem_regions
ethernet/intel: Use pci_(request|release)_mem_regions
GenWQE: Use pci_(request|release)_mem_regions
lpfc: Use pci_(request|release)_mem_regions
NVMe: Use pci_(request|release)_mem_regions
PCI: Add helpers to request/release memory and I/O regions
PCI: Extending pci=resource_alignment to specify device/vendor IDs
sparc/PCI: Implement pci_resource_to_user() with pcibios_resource_to_bus()
powerpc/pci: Implement pci_resource_to_user() with pcibios_resource_to_bus()
microblaze/PCI: Implement pci_resource_to_user() with pcibios_resource_to_bus()
PCI: Unify pci_resource_to_user() declarations
microblaze/PCI: Remove useless __pci_mmap_set_pgprot()
powerpc/pci: Remove __pci_mmap_set_pgprot()
PCI: Ignore write combining when mapping I/O port space
* pci/aspm:
PCI/ASPM: Remove redundant check of pcie_set_clkpm
* pci/dpc:
PCI: Remove DPC tristate module option
PCI: Bind DPC to Root Ports as well as Downstream Ports
PCI: Fix whitespace in struct dpc_dev
PCI: Convert Downstream Port Containment driver to use devm_* functions
* pci/hotplug:
PCI: Allow additional bus numbers for hotplug bridges
* pci/misc:
PCI: Include <asm/dma.h> for isa_dma_bridge_buggy
PCI: Make bus_attr_resource_alignment static
MAINTAINERS: Add file patterns for PCI device tree bindings
PCI: Fix comment typo
* pci/msi:
PCI/MSI: irqchip: Fix PCI_MSI dependencies
* pci/pm:
PCI: pciehp: Ignore interrupts during D3cold
PCI: Document connection between pci_power_t and hardware PM capability
PCI: Add runtime PM support for PCIe ports
ACPI / hotplug / PCI: Runtime resume bridge before rescan
PCI: Power on bridges before scanning new devices
PCI: Put PCIe ports into D3 during suspend
PCI: Don't clear d3cold_allowed for PCIe ports
PCI / PM: Enforce type casting for pci_power_t
* pci/virtualization:
PCI: Add ACS quirk for Solarflare SFC9220
PCI: Add DMA alias quirk for Adaptec 3805
PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset
PCI: Add function 1 DMA alias quirk for Marvell 88SE9182
A user may hot add a switch requiring more than one bus to enumerate. This
previously required a system reboot if the BIOS did not sufficiently pad
the bus resource, which BIOSes frequently fail to do.
Add a kernel parameter so a user can specify the minimum number of bus
numbers to reserve for a hotplug bridge's subordinate buses so rebooting
won't be necessary.
The default is 1, which is equivalent to previous behavior.
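The arithmetic behind such a reservation can be sketched as below. This is an illustration under assumptions, not the code in the patch: subordinate_with_reserve() and its parameters are hypothetical, and the clamp to bus 255 reflects the 8-bit bus number space.

```c
#include <assert.h>

/*
 * secondary:     first bus number behind the hotplug bridge
 * highest_found: highest bus number found while scanning behind it
 * min_buses:     minimum number of bus numbers to reserve (>= 1)
 *
 * Returns the subordinate bus number to program into the bridge.
 */
static unsigned int subordinate_with_reserve(unsigned int secondary,
					     unsigned int highest_found,
					     unsigned int min_buses)
{
	unsigned int reserved_end;

	assert(secondary <= 255 && highest_found <= 255);

	if (min_buses == 0)
		min_buses = 1;	/* treat 0 like the default */

	reserved_end = secondary + min_buses - 1;
	if (reserved_end > 255)
		reserved_end = 255;

	return highest_found > reserved_end ? highest_found : reserved_end;
}
```

With min_buses set to 1 the result is simply the highest bus found, so no padding is added.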
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
When assigning new PCI platform PM operations, check for all mandatory
fields to prevent a NULL pointer dereference.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
At least on arm, <asm/dma.h> does not get included when building
drivers/pci/pci.o. This causes the following build warning which can be
fixed by including <asm/dma.h>:
drivers/pci/pci.c:37:5: warning: symbol 'isa_dma_bridge_buggy' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some uio-based PCI drivers, e.g., uio_cif, do not work if the assigned PCI
memory resources are not page aligned.
By using the kernel option "pci=resource_alignment" it is possible to force
single PCI boards to use page alignment for their memory resources.
However, this is fairly cumbersome if several of these boards are in use
as the specification of the cards has to be done via PCI bus/slot/function
number which might change, e.g., by adding another board.
Extend the kernel option "pci=resource_alignment" to allow specification of
relevant devices via PCI device/vendor (and subdevice/subvendor) IDs. The
specification of the devices via device/vendor is indicated by a leading
string "pci:" as argument to "pci=resource_alignment". The format of the
specification is pci:<vendor>:<device>[:<subvendor>:<subdevice>]
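Parsing a specification in this format can be sketched with sscanf(). This is a userspace illustration, not the kernel's parser; struct pci_id_spec and parse_pci_id_spec() are hypothetical names, and the sketch does not reject trailing characters after the last field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result of parsing one "pci:..." specification. */
struct pci_id_spec {
	unsigned int vendor;
	unsigned int device;
	unsigned int subvendor;
	unsigned int subdevice;
	bool has_subsystem;	/* subvendor/subdevice were given */
};

/* Parse "pci:<vendor>:<device>[:<subvendor>:<subdevice>]" (hex IDs). */
static bool parse_pci_id_spec(const char *arg, struct pci_id_spec *out)
{
	unsigned int v = 0, d = 0, sv = 0, sd = 0;
	int n;

	assert(arg != NULL && out != NULL);

	n = sscanf(arg, "pci:%x:%x:%x:%x", &v, &d, &sv, &sd);
	/* Either two IDs or all four; a lone subvendor is rejected. */
	if (n != 2 && n != 4)
		return false;
	/* PCI IDs are 16 bits wide. */
	if (v > 0xffff || d > 0xffff || sv > 0xffff || sd > 0xffff)
		return false;

	out->vendor = v;
	out->device = d;
	out->subvendor = sv;
	out->subdevice = sd;
	out->has_subsystem = (n == 4);
	return true;
}
```

A string without the leading "pci:" fails to match and is left for the existing bus/slot/function parsing.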
Signed-off-by: Mathias Koehrer <mathias.koehrer@etas.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently the Linux PCI core does not touch power state of PCI bridges and
PCIe ports when system suspend is entered. Leaving them in D0 consumes
power unnecessarily and may prevent the CPU from entering deeper C-states.
With recent PCIe hardware we can power down the ports to save power given
that we take into account a few restrictions:
- The PCIe port hardware is recent enough, starting from 2015.
- Devices connected to PCIe ports are effectively in D3cold once the port
is transitioned to D3 (the config space is not accessible anymore and
the link may be powered down).
- Devices behind the PCIe port need to be allowed to transition to D3cold
and back. There is a way both drivers and userspace can forbid this.
- If the device behind the PCIe port is capable of waking the system it
needs to be able to do so from D3cold.
This patch adds a new flag to struct pci_dev called 'bridge_d3'. This
flag is set and cleared by the PCI core whenever there is a change in power
management state of any of the devices behind the PCIe port. When the
system is later suspended, we only need to check this flag: if it is true,
we transition the port to D3; otherwise we leave it in D0.
Also provide an override mechanism via command line parameter
"pcie_port_pm=[off|force]" that can be used to disable or enable the
feature regardless of the BIOS manufacturing date.
Tested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The symbol bus_attr_resource_alignment is not exported or declared
elsewhere, so make it static to fix the following warning:
drivers/pci/pci.c:4900:1: warning: symbol 'bus_attr_resource_alignment' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Extend pci_bus_find_domain_nr() so it can find the domain from either:
- ACPI, via the new acpi_pci_bus_find_domain_nr() interface, or
- DT, via of_pci_bus_find_domain_nr()
Note that this is only used for CONFIG_PCI_DOMAINS_GENERIC=y, so it does
not affect x86 or ia64.
[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_bus_find_domain_nr() retrieves the host bridge domain number in a
DT-specific way. Rename it to of_pci_bus_find_domain_nr() to reflect that,
so we can add a corresponding function for ACPI.
[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Instead of assigning bus->domain_nr inside pci_bus_assign_domain_nr(),
return the domain and let the caller do the assignment. Rename
pci_bus_assign_domain_nr() to pci_bus_find_domain_nr() to reflect this.
No functional change intended.
[bhelgaas: changelog]
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Add pci_unmap_iospace() to undo what pci_remap_iospace() did.
This is needed to support hotplug removal of host bridges that use
pci_remap_iospace().
[bhelgaas: changelog]
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
* pci/arm64:
PCI, of: Move PCI I/O space management to PCI core code
PCI: generic, thunder: Use generic ECAM API
PCI: Provide common functions for ECAM mapping
* pci/host-hv:
PCI: hv: Add explicit barriers to config space access
* pci/hotplug:
PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit
* pci/resource:
PCI: Disable all BAR sizing for devices with non-compliant BARs
x86/PCI: Mark Broadwell-EP Home Agent 1 as having non-compliant BARs
PCI: Identify Enhanced Allocation (EA) BAR Equivalent resources in sysfs
Resource flags are exposed to userspace via the sysfs "resource" file.
lspci reads the sysfs file to determine resource properties.
Add a "BAR Equivalent Indicator" flag so lspci can distinguish between
[virtual] and [enhanced] resources.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sean O. Stalley <sean.stalley@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No functional changes in this patch.
PCI I/O space mapping code does not depend on OF; therefore it can be moved
to PCI core code. This way we will be able to use it, e.g., in ACPI PCI
code.
Suggested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Liviu Dudau <Liviu.Dudau@arm.com>
The original thought was that if a device implemented ACS, then surely
we want to use that... well, it turns out that devices can make an ACS
capability so broken that we still need to fall back to quirks.
Reverse the order of ACS enabling to give quirks first shot at it.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Solve IOMMU support issues with PCIe non-transparent bridges that use
Requester ID look-up tables (RID-LUT), e.g., the PEX8733.
The NTB connects devices in two independent PCI domains. Devices separated
by the NTB are not able to discover each other. A PCI packet being
forwarded from one domain to another has to have its RID modified so it
appears on the correct bus and completions are forwarded back to the original
domain through the NTB. The RID is translated using a preprogrammed table
(LUT) and the PCI packet propagates upstream away from the NTB. If the
destination system has IOMMU enabled, the packet will be discarded because
the new RID is unknown to the IOMMU. Adding a DMA alias for the new RID
allows IOMMU to properly recognize the packet.
Each device behind the NTB has a unique RID assigned in the RID-LUT. The
current DMA alias implementation supports only a single alias, so it's not
possible to support multiple devices behind the NTB when IOMMU is enabled.
Enable all possible aliases on a given bus (256) that are stored in a
bitset. Alias devfn is directly translated to a bit number. The bitset is
not allocated for devices that have no need for DMA aliases.
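A minimal userspace sketch of such a lazily allocated alias bitset is shown below. struct alias_dev, alias_add() and alias_test() are hypothetical names; the kernel uses its own bitmap helpers and allocation functions instead of calloc().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define ALIAS_BITS	256	/* every possible devfn on one bus */
#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define ALIAS_WORDS	((ALIAS_BITS + WORD_BITS - 1) / WORD_BITS)

/* Hypothetical device model; the mask stays NULL until first use. */
struct alias_dev {
	unsigned long *alias_mask;
};

/* Record devfn as a DMA alias; the devfn is used directly as bit number. */
static bool alias_add(struct alias_dev *dev, uint8_t devfn)
{
	assert(dev != NULL);

	if (!dev->alias_mask) {
		dev->alias_mask = calloc(ALIAS_WORDS, sizeof(unsigned long));
		if (!dev->alias_mask)
			return false;
	}
	dev->alias_mask[devfn / WORD_BITS] |= 1UL << (devfn % WORD_BITS);
	return true;
}

/* Devices that never added an alias have no mask and no aliases. */
static bool alias_test(const struct alias_dev *dev, uint8_t devfn)
{
	assert(dev != NULL);

	if (!dev->alias_mask)
		return false;
	return (dev->alias_mask[devfn / WORD_BITS] >> (devfn % WORD_BITS)) & 1UL;
}
```

Because a devfn is an 8-bit value, 256 bits cover every device and function number on the bus, and a device without aliases costs only one NULL pointer.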
More details can be found in the following article:
http://www.plxtech.com/files/pdf/technical/expresslane/RTC_Enabling%20MulitHostSystemDesigns.pdf
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
One of the quirks that adds DMA aliases logs an informational message in
dmesg. Move that to pci_add_dma_alias() so all users log the message
consistently. No functional change intended (except extra message).
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Add a pci_add_dma_alias() interface to encapsulate the details of adding an
alias. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Fix spelling of "initalization".
[bhelgaas: also fix pci/pci.c]
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/aer:
PCI/AER: Log aer_inject error injections
PCI/AER: Log actual error causes in aer_inject
PCI/AER: Use dev_warn() in aer_inject
PCI/AER: Fix aer_inject error codes
* pci/enumeration:
PCI: Fix broken URL for Dell biosdevname
* pci/kconfig:
PCI: Cleanup pci/pcie/Kconfig whitespace
PCI: Include pci/hotplug Kconfig directly from pci/Kconfig
PCI: Include pci/pcie/Kconfig directly from pci/Kconfig
* pci/misc:
PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
PCI: Add QEMU top-level IDs for (sub)vendor & device
unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition
PCI: Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h
PCI: Move pci_dma_* helpers to common code
frv/PCI: Remove stray pci_{alloc,free}_consistent() declaration
* pci/virtualization:
PCI: Wait for up to 1000ms after FLR reset
PCI: Support SR-IOV on any function type
* pci/vpd:
PCI: Prevent VPD access for buggy devices
PCI: Sleep rather than busy-wait for VPD access completion
PCI: Fold struct pci_vpd_pci22 into struct pci_vpd
PCI: Rename VPD symbols to remove unnecessary "pci22"
PCI: Remove struct pci_vpd_ops.release function pointer
PCI: Move pci_vpd_release() from header file to pci/access.c
PCI: Move pci_read_vpd() and pci_write_vpd() close to other VPD code
PCI: Determine actual VPD size on first access
PCI: Use bitfield instead of bool for struct pci_vpd_pci22.busy
PCI: Allow access to VPD attributes with size 0
PCI: Update VPD definitions
Some devices take longer than the spec indicates to return from FLR reset.
A notable case is Intel integrated graphics (IGD), which can often take an
additional 300ms powering down an attached LCD panel as part of the FLR.
Allow devices up to 1000ms, testing every 100ms whether the second dword of
config space is read as -1.
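The polling scheme can be sketched with the config read and the delay abstracted behind callbacks, so that it runs without hardware. Everything here (struct flr_io, flr_wait(), the fake device) is hypothetical and only models the 100ms/1000ms timing described above.

```c
#include <assert.h>
#include <stdint.h>

#define FLR_POLL_MS	100
#define FLR_TIMEOUT_MS	1000

/* Hypothetical I/O hooks so the loop can run without real hardware. */
struct flr_io {
	uint32_t (*read_second_dword)(void *ctx); /* config offset 0x04 */
	void (*delay_ms)(void *ctx, unsigned int ms);
	void *ctx;
};

/* Returns the time waited in ms, or -1 if the device never responded. */
static int flr_wait(const struct flr_io *io)
{
	unsigned int waited = 0;

	while (waited < FLR_TIMEOUT_MS) {
		io->delay_ms(io->ctx, FLR_POLL_MS);
		waited += FLR_POLL_MS;
		/* All ones (-1) means the device is not back yet. */
		if (io->read_second_dword(io->ctx) != UINT32_MAX)
			return (int)waited;
	}
	return -1;
}

/* Fake device for demonstration: responds once enough time has passed. */
struct fake_dev {
	unsigned int elapsed_ms;
	unsigned int ready_after_ms;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_dev *d = ctx;

	return d->elapsed_ms >= d->ready_after_ms ? 0x00100006u : UINT32_MAX;
}

static void fake_delay(void *ctx, unsigned int ms)
{
	struct fake_dev *d = ctx;

	d->elapsed_ms += ms;	/* simulate the passage of time */
}
```

A fake device that needs 300ms is detected on the third poll, while one that never answers makes flr_wait() give up after ten polls.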
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_create_root_bus() passes a "parent" pointer to
pci_bus_assign_domain_nr(). When CONFIG_PCI_DOMAINS_GENERIC is defined,
pci_bus_assign_domain_nr() dereferences that pointer. Many callers of
pci_create_root_bus() supply a NULL "parent" pointer, which leads to a NULL
pointer dereference error.
7c67470009 ("PCI: Move domain assignment from arm64 to generic code")
moved the "parent" dereference from arm64 to generic code. Only arm64 used
that code (because only arm64 defined CONFIG_PCI_DOMAINS_GENERIC), and it
always supplied a valid "parent" pointer. Other arches supplied NULL
"parent" pointers but didn't define CONFIG_PCI_DOMAINS_GENERIC, so they
used a no-op version of pci_bus_assign_domain_nr().
8c7d14746a ("ARM/PCI: Move to generic PCI domains") defined
CONFIG_PCI_DOMAINS_GENERIC on ARM, and many ARM platforms use
pci_common_init(), which supplies a NULL "parent" pointer.
These platforms (cns3xxx, dove, footbridge, iop13xx, etc.) crash
with a NULL pointer dereference like this while probing PCI:
Unable to handle kernel NULL pointer dereference at virtual address 000000a4
PC is at pci_bus_assign_domain_nr+0x10/0x84
LR is at pci_create_root_bus+0x48/0x2e4
Kernel panic - not syncing: Attempted to kill init!
[bhelgaas: changelog, add "Reported:" and "Fixes:" tags]
Reported: http://forum.doozan.com/read.php?2,17868,22070,quote=1
Fixes: 8c7d14746a ("ARM/PCI: Move to generic PCI domains")
Fixes: 7c67470009 ("PCI: Move domain assignment from arm64 to generic code")
Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: stable@vger.kernel.org # v4.0+
Christoph added a generic include/linux/pci-dma-compat.h, so now there's
one place with most of the PCI DMA interfaces. Move more PCI DMA-related
things there:
- The PCI_DMA_* direction constants from linux/pci.h
- The pci_set_dma_max_seg_size() and pci_set_dma_seg_boundary()
CONFIG_PCI implementations from drivers/pci/pci.c
- The pci_set_dma_max_seg_size() and pci_set_dma_seg_boundary()
!CONFIG_PCI stubs from linux/pci.h
- The pci_set_dma_mask() and pci_set_consistent_dma_mask()
!CONFIG_PCI stubs from linux/pci.h
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
include/asm-generic/pci-bridge.h is now empty, so remove every #include of
it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Will Deacon <will.deacon@arm.com> (arm64)
Fix all whitespace issues (missing or needed whitespace) in all files in
drivers/pci. The code was compiled with allyesconfig before and after the
changes; the resulting objects were compared with objdiff and are unchanged
by this commit.
Signed-off-by: Bogicevic Sasa <brutallesale@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pci_platform_pm_ops structure is never modified, so declare it as
const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* pci/aer:
PCI/AER: Clear error status registers during enumeration and restore
* pci/hotplug:
PCI: pciehp: Queue power work requests in dedicated function
* pci/misc:
PCI: Turn off Request Attributes to avoid Chelsio T5 Completion erratum
x86/PCI: Make pci_subsys_init() static
PCI: Add builtin_pci_driver() to avoid registration boilerplate
PCI: Remove unnecessary "if" statement
* pci/msi:
x86/PCI: Don't alloc pcibios-irq when MSI is enabled
PCI/MSI: Export all remapped MSIs to sysfs attributes
PCI: Disable MSI on SiS 761
* pci/resource:
sparc/PCI: Add mem64 resource parsing for root bus
PCI: Expand Enhanced Allocation BAR output
PCI: Make Enhanced Allocation bitmasks more obvious
PCI: Handle Enhanced Allocation capability for SR-IOV devices
PCI: Add support for Enhanced Allocation devices
PCI: Add Enhanced Allocation register entries
PCI: Handle IORESOURCE_PCI_FIXED when assigning resources
PCI: Handle IORESOURCE_PCI_FIXED when sizing resources
PCI: Clear IORESOURCE_UNSET when reverting to firmware-assigned address
* pci/virtualization:
PCI: Fix sriov_enable() error path for pcibios_enable_sriov() failures
PCI: Wait 1 second between disabling VFs and clearing NumVFs
PCI: Reorder pcibios_sriov_disable()
PCI: Remove VFs in reverse order if virtfn_add() fails
PCI: Remove redundant validation of SR-IOV offset/stride registers
PCI: Set SR-IOV NumVFs to zero after enumeration
PCI: Enable SR-IOV ARI Capable Hierarchy before reading TotalVFs
PCI: Don't try to restore VF BARs
An Enhanced Allocation Capability entry with BEI 0 fills in
dev->resource[0] just like a real BAR 0 would, but non-EA experts might not
connect "EA - BEI 0" with BAR 0.
Decode the EA jargon a little bit, e.g., change this:
pci 0002:01:00.0: EA - BEI 0, Prop 0x00: [mem 0x84300000-0x84303fff]
to this:
pci 0002:01:00.0: BAR 0: [mem 0x84300000-0x84303fff] (from Enhanced Allocation, properties 0x00)
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>