The aspm code will currently set the configured aspm policy before drivers
have had an opportunity to indicate that their hardware doesn't support it.
Unfortunately, putting some hardware in L0 or L1 can result in the hardware
no longer responding to any requests, even after aspm is disabled. It makes
more sense to leave aspm policy at the BIOS defaults at initial setup time,
reconfiguring it after pci_enable_device() is called. This allows the
driver to blacklist individual devices beforehand.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This problem happened when removing PCIe root port using PCI logical
hotplug operation.
The immediate cause of this problem is that the pointer to invalid
data structure is passed to pcie_update_aspm_capable() by
pcie_aspm_exit_link_state(). When pcie_aspm_exit_link_state() received
a pointer to root port link, it unconfigures the root port link and
frees its data structure at first. At this point, there are not links
to configure under the root port and the data structure for root port
link is already freed. So pcie_aspm_exit_link_state() must not call
pcie_update_aspm_capable() and pcie_config_aspm_path().
This patch fixes the problem by changing pcie_aspm_exit_link_state()
not to call pcie_update_aspm_capable() and pcie_config_aspm_path() if
the specified link is root port link.
------------[ cut here ]------------
kernel BUG at drivers/pci/pcie/aspm.c:606!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:40/0000:40:13.0/remove
CPU 1
Modules linked in: shpchp
Pid: 9345, comm: sysfsd Not tainted 2.6.32-rc5 #98 ProLiant DL785 G6
RIP: 0010:[<ffffffff811df69b>] [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
RSP: 0018:ffff88082a2f5ca0 EFLAGS: 00010202
RAX: 0000000000000e77 RBX: ffff88182cc3e000 RCX: ffff88082a33d006
RDX: 0000000000000001 RSI: ffffffff811dff4a RDI: ffff88182cc3e000
RBP: ffff88082a2f5cc0 R08: ffff88182cc3e000 R09: 0000000000000000
R10: ffff88182fc00180 R11: ffff88182fc00198 R12: ffff88182cc3e000
R13: 0000000000000000 R14: ffff88182cc3e000 R15: ffff88082a2f5e20
FS: 00007f259a64b6f0(0000) GS:ffff880864600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007feb53f73da0 CR3: 000000102cc94000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process sysfsd (pid: 9345, threadinfo ffff88082a2f4000, task ffff88082a33cf00)
Stack:
ffff88182cc3e000 ffff88182cc3e000 0000000000000000 ffff88082a33cf00
<0> ffff88082a2f5cf0 ffffffff811dff52 ffff88082a2f5cf0 ffff88082c525168
<0> ffff88402c9fd2f8 ffff88402c9fd2f8 ffff88082a2f5d20 ffffffff811d7db2
Call Trace:
[<ffffffff811dff52>] pcie_aspm_exit_link_state+0xf5/0x11e
[<ffffffff811d7db2>] pci_stop_bus_device+0x76/0x7e
[<ffffffff811d7d67>] pci_stop_bus_device+0x2b/0x7e
[<ffffffff811d7e4f>] pci_remove_bus_device+0x15/0xb9
[<ffffffff811dcb8c>] remove_callback+0x29/0x3a
[<ffffffff81135aeb>] sysfs_schedule_callback_work+0x15/0x6d
[<ffffffff81072790>] worker_thread+0x19d/0x298
[<ffffffff8107273b>] ? worker_thread+0x148/0x298
[<ffffffff81135ad6>] ? sysfs_schedule_callback_work+0x0/0x6d
[<ffffffff810765c0>] ? autoremove_wake_function+0x0/0x38
[<ffffffff810725f3>] ? worker_thread+0x0/0x298
[<ffffffff8107629e>] kthread+0x7d/0x85
[<ffffffff8102eafa>] child_rip+0xa/0x20
[<ffffffff8102e4bc>] ? restore_args+0x0/0x30
[<ffffffff81076221>] ? kthread+0x0/0x85
[<ffffffff8102eaf0>] ? child_rip+0x0/0x20
Code: 89 e5 8a 50 48 31 c0 c0 ea 03 83 e2 07 e8 b2 de fe ff c9 48 98 c3 55 48 89 e5 41 56 49 89 fe 41 55 41 54 53 48 83 7f 10 00 74 04 <0f> 0b eb fe 48 8b 05 da 7d 63 00 4c 8d 60 e8 4c 89 e1 eb 24 4c
RIP [<ffffffff811df69b>] pcie_update_aspm_capable+0x15/0xbe
RSP <ffff88082a2f5ca0>
---[ end trace 6ae0f65bdeab8555 ]---
Reported-by: Alex Chiang <achiang@hp.com>
Tested-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Change for PCIe ASPM driver to use pci_is_pcie() instead of checking
pci_dev->is_pcie.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Use pci_pcie_cap() instead of pci_find_capability() to get PCIe capability
offset in PCIe ASPM driver. This avoids unnecessary search in PCI
configuration space.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The definition of the ASPM support field in the Link Capabilities
Register had been changed by the "ASPM optionality ECN" as follows:
<Before>
00b Reserved
01b L0s Supported
10b Reserved
11b L0s and L1 Supported
<After>
00b No ASPM Support
01b L0s Supported
10b L1 Supported
11b L0s and L1 Supported
Current linux ASPM driver doesn't enable ASPM if the support field is
00b or 10b. So there is no impact about 00b. But current linux ASPM
driver doesn't enable L1 if the support field is 10b. With this patch,
10b (L1 support) is handled properly.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The L0s state can be managed separately for each direction (upstream
direction and downstream direction) of the link. But in the current
implementation, those are mixed up. With this patch, L0s for each
direction are managed separately.
To maintain three states (upstream direction L0s, downstream L0s and
L1), 'aspm_support', 'aspm_enabled', 'aspm_capable', 'aspm_disable'
and 'aspm_default' fields in struct pcie_link_state are changed to
3-bit from 2-bit. The 'latency' field is separated to two 'latency_up'
and 'latency_dw' fields to maintain exit latencies for each direction
of the link. For L0, 'latency_up.l0' and 'latency_dw.l0' are used to
configure upstream direction L0s and downstream direction L0s
respectively. For L1, larger value of 'latency_up.l1' and
'latency_dw.l1' is considered as L1 exit latency.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current implementation, ASPM L0s/L1 is disabled for all links
in the hierarchy if one of the link doesn't meet latency requirement.
But we can partially enable ASPM L0s/L1 on sub-tree in the hierarchy.
This patch allows partial L0s/L1 enablement in the hierarchy. And it
also reduce the calculation cost of ASPM configuration very much.
In the previous implementation, all links were enabled with the same
state. With this patch, enabled state for each link is determined
simply as follows (the 'requested' is from policy_to_aspm_state()).
enabled = requested & (link->aspm_capable & link->aspm_disable)
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce 'aspm_capable' field to maintain the capable ASPM setting of
the link. By the 'aspm_capable', we don't need to recheck latency
every time ASPM policy is changed.
Each bit in 'aspm_capable' is associated to ASPM state (L0S/L1). The
bit is set if the associated ASPM state is supported by the link and
it satisfies the latency requirement (i.e. exit latency < endpoint
acceptable latency). The 'aspm_capable' is updated when
- an endpoint device is added (boot time or hot-plug time)
- an endpoint device is removed (hot-unplug time)
- PCI power state is changed.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce 'aspm_disable' flag to manage disabled ASPM state more
robust way.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix possible NULL dereference in pcie_aspm_exit_link_state(). This
patch also cleanup some code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Remove the following check in __pcie_aspm_config_link() because it
nerver be true.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We must not clear bits in 'aspm_enabled' using 'aspm_support', or
'aspm_enabled' and 'aspm_default' might be different from the actual
state. In addtion, 'aspm_default' should be intialized even if
'aspm_support' is 0.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
By having a pointer to the root port link, we can remove loops in
get_root_port_link() to search the root port link.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We don't need the 'has_switch' field in the struct pcie_link_state.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cleanup for calc_L0S_latency() and calc_L1_latency().
- Separate exit latency and acceptable latency calculation.
- Some minor cleanups.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current ASPM implementation, callers of pcie_set_clock_pm() check
Clock PM capability of the link or current Clock PM state of the link.
This check should be done in pcie_set_clock_pm() itself.
This patch moves those checks into pcie_set_clock_pm(). It also
introduces pcie_set_clkpm_nocheck() that is equivalent to old
pcie_set_clock_pm(), for the caller who wants to change Clocl PM state
regardless of the Clock PM capability or current Clock PM state. In
addition, this patch changes the function name from
pcie_set_clock_pm() to pcie_set_clkpm() for consistency.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up ASPM initialization by refactoring some functionality, renaming
functions, and moving things around.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the current ASPM implementation, there are many functions that
take a pointer to struct pci_dev corresponding to the upstream component
of the link as a parameter. But, since those functions handle PCI
express link state, a pointer to struct pcie_link_state is more
suitable than a pointer to struct pci_dev. Changing a parameter to a
pointer to struct pcie_link_state makes ASPM code much simpler and
easier to read. This patch also contains some minor cleanups. This patch
doesn't have any functional change.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cleanup for some fields in pcie_link_state.
- Add comments.
- make "downstream_has_switch" field 1-bit.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "clk_pm_capable", "clk_pm_enable" and "bios_clk_state" fields in
the struct pcie_link_state only take 1-bit value. So those fields
don't need to be defined as unsigned int. This patch makes those
fields 1-bit, and cleans up some related code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Clean up latency related data structures for ASPM.
- Introduce struct acpi_latency for exit latency and acceptable
latency management. With this change, struct endpoint_state is no
longer needed.
- We don't need to hold both upstream latency and downstream latency
in the current implementation.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The "support_state", "enabled_state" and "bios_aspm_state" fields in
the struct pcie_link_state take 2-bit value. So those fields don't
need to be defined as unsigned int. This patch makes those fields
2-bit, and cleans up some related code.
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix a typo in struct pcie_link_state.
The "sibiling" field in the struct pcie_link_state should be
"sibling".
Acked-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
VIA has a strange chipset, it has root port under a bridge. Disable ASPM
for such strange chipset.
Cc: stable@kernel.org
Tested-by: Wolfgang Denk <wd@denx.de>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We only want to disable ASPM when the last function is removed from
the parent's device list. We determine this by checking to see if
the parent's device list is completely empty.
Unfortunately, we never hit that code because the parent is considered
an upstream port, and never had an ASPM link_state associated with it.
The early check for !link_state causes us to return early, we never
discover that our device list is empty, and thus we never remove the
downstream ports' link_state nodes.
Instead of checking to see if the parent's device list is empty, we can
check to see if we are the last device on the list, and if so, then we
know that we can clean up properly.
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Alex Chiang <achiang@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The cpu_relax() function can be a noop on certain architectures like
IA-64 when CPU threads are disabled, so use msleep instead during link
retraining busy/wait loop.
Introduce define LINK_RETRAIN_TIMEOUT instead of hard-coding timeout in
pcie_aspm_configure_common_clock.
Use time_after() to avoid jiffy wraparound when checking for expired
timeout.
After timeout expires, recheck link status register link training bit
instead of checking for expired timeout to avoid possible false
positive.
Note that Matthew Wilcox came up with the first rough version of this
patch.
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In a PCIe hierarchy with a switch present, if the link state of an
endpoint device is changed, we must check the whole hierarchy from the
endpoint device to root port, and for each link in the hierarchy, the new
link state should be configured. Previously, the implementation checked
the state but forgot to configure the links between root port to switch.
Fixes Novell bz #448987.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The _OSC capabilities OSC_ACTIVE_STATE_PWR_SUPPORT and
OSC_CLOCK_PWR_CAPABILITY_SUPPORT are set when the root bridge is added
with pci_acpi_osc_support(), so we no longer need to do it in the ASPM
driver. Also add the function pcie_aspm_enabled, which returns true if
pcie_aspm=off is not on the kernel command-line.
Signed-off-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Makes a Compaq 6735s boot reliably again. It used to hang in the loop
on some boots. Give the link one second to train, otherwise break out
of the loop and reset the previously set clock bits.
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This patch uniformizes PCI probing debug boot messages with dev_printk()
intead of manual printk()
It changes adress range output from [%llx, %llx] to [%#llx-%#llx], like
in pci_request_region().
For example, it goes from the mixed-style:
PCI: 0000:00:1b.0 reg 10 64bit mmio: [f4280000, f4283fff]
pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
to uniform:
pci 0000:00:1b.0: reg 10 64bit mmio: [0xf4280000-0xf4283fff]
pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
This patch has been runtime tested, boot log messages diffed, everything
looks OK.
Acked-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
pcie_aspm=force did not work because aspm_force was being double negated
leading to the sanity check failing. Moving a bracket should fix this.
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
A new option, pcie_aspm=force, will force ASPM to be enabled, even on system
with PCIe 1.0 devices.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Disable ASPM on pre-1.1 PCIe devices, as many of them don't implement it
correctly.
Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The ACPI FADT table includes an ASPM control bit. If the bit is set, do
not enable ASPM since it may indicate that the platform doesn't actually
support the feature.
Tested-by: Jack Howarth <howarth@bromo.msbb.uc.edu>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The Slot 03:00.* of JMicron controller has two functions, but one is
PCIE endpoint the other isn't PCIE device, very strange. PCIE spec
defines all functions should have the same config for ASPM, so disable
ASPM for the whole slot in this case.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
PCI Express ASPM defines a protocol for PCI Express components in the D0
state to reduce Link power by placing their Links into a low power state
and instructing the other end of the Link to do likewise. This
capability allows hardware-autonomous, dynamic Link power reduction
beyond what is achievable by software-only controlled power management.
However, The device should be configured by software appropriately.
Enabling ASPM will save power, but will introduce device latency.
This patch adds ASPM support in Linux. It introduces a global policy for
ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
it. The interface can be used as a boot option too. Currently we have
below setting:
-default, BIOS default setting
-powersave, highest power saving mode, enable all available ASPM
state and clock power management
-performance, highest performance, disable ASPM and clock power
management
By default, the 'default' policy is used currently.
In my test, power difference between powersave mode and performance mode
is about 1.3w in a system with 3 PCIE links.
Note: some devices might not work well with aspm, either because chipset
issue or device issue. The patch provide API (pci_disable_link_state),
driver can disable ASPM for specific device.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This reverts commit 6c723d5bd8.
It caused build errors on non-x86 platforms, config file confusion, and
even some boot errors on some x86-64 boxes. All around, not quite ready
for prime-time :(
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
PCI Express ASPM defines a protocol for PCI Express components in the D0
state to reduce Link power by placing their Links into a low power state
and instructing the other end of the Link to do likewise. This
capability allows hardware-autonomous, dynamic Link power reduction
beyond what is achievable by software-only controlled power management.
However, The device should be configured by software appropriately.
Enabling ASPM will save power, but will introduce device latency.
This patch adds ASPM support in Linux. It introduces a global policy for
ASPM, a sysfs file /sys/module/pcie_aspm/parameters/policy can control
it. The interface can be used as a boot option too. Currently we have
below setting:
-default, BIOS default setting
-powersave, highest power saving mode, enable all available ASPM
state
and clock power management
-performance, highest performance, disable ASPM and clock power
management
By default, the 'default' policy is used currently.
In my test, power difference between powersave mode and performance mode
is about 1.3w in a system with 3 PCIE links.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>