Граф коммитов

551 Коммитов

Автор SHA1 Сообщение Дата
Anton Blanchard 0baa91cb73 cpuidle: powernv: Avoid a branch in the core snooze_loop() loop
When in the snooze_loop() we want to take up the least amount of
resources. On my version of gcc (6.3), we end up with an extra
branch because it predicts snooze_timeout_en to be false, whereas it
is almost always true.

Use likely() to avoid the branch and be a little nicer to the
other non idle threads on the core.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:17:19 +02:00
Anton Blanchard 26eb48a9fa cpuidle: powernv: Don't continually set thread priority in snooze_loop()
The powerpc64 kernel exception handlers have preserved thread priorities
for a long time now, so there is no need to continually set it.

Just set it once on entry and once exit.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:17:18 +02:00
Anton Blanchard 79b578111f cpuidle: powernv: Don't bounce between low and very low thread priority
The core of snooze_loop() continually bounces between low and very
low thread priority. Changing thread priorities is an expensive
operation that can negatively impact other threads on a core.

All CPUs that can run PowerNV support very low priority, so we can
avoid the change completely.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:17:18 +02:00
Marcin Nowakowski 02018b3929 cpuidle: cpuidle-cps: remove unused variable
'core' in cps_cpuidle_init has never been used and is unnecessary, so
remove the dead code.

Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-04-19 23:04:54 +02:00
Vaidyanathan Srinivasan 293d264f13 cpuidle: powernv: Pass correct drv->cpumask for registration
drv->cpumask defaults to cpu_possible_mask in __cpuidle_driver_init().
On PowerNV platform cpu_present could be less than cpu_possible in cases
where firmware detects the cpu, but it is not available to the OS.  When
CONFIG_HOTPLUG_CPU=n, such cpus are not hotplugable at runtime and hence
we skip creating cpu_device.

This breaks cpuidle on powernv where register_cpu() is not called for
cpus in cpu_possible_mask that cannot be hot-added at runtime.

Trying cpuidle_register_device() on cpu without cpu_device will cause
crash like this:

cpu 0xf: Vector: 380 (Data SLB Access) at [c000000ff1503490]
    pc: c00000000022c8bc: string+0x34/0x60
    lr: c00000000022ed78: vsnprintf+0x284/0x42c
    sp: c000000ff1503710
   msr: 9000000000009033
   dar: 6000000060000000
  current = 0xc000000ff1480000
  paca    = 0xc00000000fe82d00   softe: 0        irq_happened: 0x01
    pid   = 1, comm = swapper/8
Linux version 4.11.0-rc2 (sv@sagarika) (gcc version 4.9.4
(Buildroot 2017.02-00004-gc28573e) ) #15 SMP Fri Mar 17 19:32:02 IST 2017
enter ? for help
[link register   ] c00000000022ed78 vsnprintf+0x284/0x42c
[c000000ff1503710] c00000000022ebb8 vsnprintf+0xc4/0x42c (unreliable)
[c000000ff1503800] c00000000022ef40 vscnprintf+0x20/0x44
[c000000ff1503830] c0000000000ab61c vprintk_emit+0x94/0x2cc
[c000000ff15038a0] c0000000000acc9c vprintk_func+0x60/0x74
[c000000ff15038c0] c000000000619694 printk+0x38/0x4c
[c000000ff15038e0] c000000000224950 kobject_get+0x40/0x60
[c000000ff1503950] c00000000022507c kobject_add_internal+0x60/0x2c4
[c000000ff15039e0] c000000000225350 kobject_init_and_add+0x70/0x78
[c000000ff1503a60] c00000000053c288 cpuidle_add_sysfs+0x9c/0xe0
[c000000ff1503ae0] c00000000053aeac cpuidle_register_device+0xd4/0x12c
[c000000ff1503b30] c00000000053b108 cpuidle_register+0x98/0xcc
[c000000ff1503bc0] c00000000085eaf0 powernv_processor_idle_init+0x140/0x1e0
[c000000ff1503c60] c00000000000cd60 do_one_initcall+0xc0/0x15c
[c000000ff1503d20] c000000000833e84 kernel_init_freeable+0x1a0/0x25c
[c000000ff1503dc0] c00000000000d478 kernel_init+0x24/0x12c
[c000000ff1503e30] c00000000000b564 ret_from_kernel_thread+0x5c/0x78

This patch fixes the bug by passing correct cpumask from
powernv-cpuidle driver.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-29 22:55:36 +02:00
Gautham R. Shenoy ecad4502d0 powernv-cpuidle: Validate DT property array size
The various properties associated with powernv idle states such as
names, flags, residency-ns, latencies-ns, psscr, psscr-mask are
exposed in the device-tree as property arrays such the pointwise
entries in each of these arrays correspond to the properties of the
same idle state.

This patch validates that the lengths of the property arrays are the
same. If there is a mismatch, the patch will ensure that we bail out
and not expose the platform idle states via cpuidle.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-29 00:14:00 +02:00
Vaidyanathan Srinivasan ad0a45fd9c cpuidle: Validate cpu_dev in cpuidle_add_sysfs()
If a given cpu is not in cpu_present and cpu hotplug
is disabled, arch can skip setting up the cpu_dev.

Arch cpuidle driver should pass correct cpu mask
for registration, but failing to do so by the driver
causes error to propagate and crash like this:

[   30.076045] Unable to handle kernel paging request for data at address 0x00000048
[   30.076100] Faulting instruction address: 0xc0000000007b2f30
cpu 0x4d: Vector: 300 (Data Access) at [c000003feb18b670]
    pc: c0000000007b2f30: kobject_get+0x20/0x70
    lr: c0000000007b3c94: kobject_add_internal+0x54/0x3f0
    sp: c000003feb18b8f0
   msr: 9000000000009033
   dar: 48
 dsisr: 40000000
  current = 0xc000003fd2ed8300
  paca    = 0xc00000000fbab500   softe: 0        irq_happened: 0x01
    pid   = 1, comm = swapper/0
Linux version 4.11.0-rc2-svaidy+ (sv@sagarika) (gcc version 6.2.0
20161005 (Ubuntu 6.2.0-5ubuntu12) ) #10 SMP Sun Mar 19 00:08:09 IST 2017
enter ? for help
[c000003feb18b960] c0000000007b3c94 kobject_add_internal+0x54/0x3f0
[c000003feb18b9f0] c0000000007b43a4 kobject_init_and_add+0x64/0xa0
[c000003feb18ba70] c000000000e284f4 cpuidle_add_sysfs+0xb4/0x130
[c000003feb18baf0] c000000000e26038 cpuidle_register_device+0x118/0x1c0
[c000003feb18bb30] c000000000e26c48 cpuidle_register+0x78/0x120
[c000003feb18bbc0] c00000000168fd9c powernv_processor_idle_init+0x110/0x1c4
[c000003feb18bc40] c00000000000cff8 do_one_initcall+0x68/0x1d0
[c000003feb18bd00] c0000000016242f4 kernel_init_freeable+0x280/0x360
[c000003feb18bdc0] c00000000000d864 kernel_init+0x24/0x160
[c000003feb18be30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74

Validating cpu_dev fixes the crash and reports correct error message like:

[   30.163506] Failed to register cpuidle device for cpu136
[   30.173329] Registration of powernv driver failed.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-21 22:26:37 +01:00
Linus Torvalds 1827adb11a Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull sched.h split-up from Ingo Molnar:
 "The point of these changes is to significantly reduce the
  <linux/sched.h> header footprint, to speed up the kernel build and to
  have a cleaner header structure.

  After these changes the new <linux/sched.h>'s typical preprocessed
  size goes down from a previous ~0.68 MB (~22K lines) to ~0.45 MB (~15K
  lines), which is around 40% faster to build on typical configs.

  Not much changed from the last version (-v2) posted three weeks ago: I
  eliminated quirks, backmerged fixes plus I rebased it to an upstream
  SHA1 from yesterday that includes most changes queued up in -next plus
  all sched.h changes that were pending from Andrew.

  I've re-tested the series both on x86 and on cross-arch defconfigs,
  and did a bisectability test at a number of random points.

  I tried to test as many build configurations as possible, but some
  build breakage is probably still left - but it should be mostly
  limited to architectures that have no cross-compiler binaries
  available on kernel.org, and non-default configurations"

* 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (146 commits)
  sched/headers: Clean up <linux/sched.h>
  sched/headers: Remove #ifdefs from <linux/sched.h>
  sched/headers: Remove the <linux/topology.h> include from <linux/sched.h>
  sched/headers, hrtimer: Remove the <linux/wait.h> include from <linux/hrtimer.h>
  sched/headers, x86/apic: Remove the <linux/pm.h> header inclusion from <asm/apic.h>
  sched/headers, timers: Remove the <linux/sysctl.h> include from <linux/timer.h>
  sched/headers: Remove <linux/magic.h> from <linux/sched/task_stack.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/init.h>
  sched/core: Remove unused prefetch_stack()
  sched/headers: Remove <linux/rculist.h> from <linux/sched.h>
  sched/headers: Remove the 'init_pid_ns' prototype from <linux/sched.h>
  sched/headers: Remove <linux/signal.h> from <linux/sched.h>
  sched/headers: Remove <linux/rwsem.h> from <linux/sched.h>
  sched/headers: Remove the runqueue_is_locked() prototype
  sched/headers: Remove <linux/sched.h> from <linux/sched/hotplug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/debug.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/nohz.h>
  sched/headers: Remove <linux/sched.h> from <linux/sched/stat.h>
  sched/headers: Remove the <linux/gfp.h> include from <linux/sched.h>
  sched/headers: Remove <linux/rtmutex.h> from <linux/sched.h>
  ...
2017-03-03 10:16:38 -08:00
Linus Torvalds 080e4168c0 More power management updates for v4.11-rc1
- Fix for a cpuidle menu governor problem that started to take an
    unnecessary spinlock after one of the recent updates and that
    did not play well with the RT patch (Rafael Wysocki).
 
  - Fix for the new intel_pstate operation mode switching feature
    added recently that did not reinitialize P-state limits properly
    when switching operation modes (Rafael Wysocki).
 
  - Removal of unused global notifiers from the PM QoS framework
    (Viresh Kumar).
 
  - Generic power domains framework update to make it handle
    asynchronous invocations of PM callbacks in the "noirq" phases
    of system suspend/hibernation correctly (Ulf Hansson).
 
  - Two hibernation core cleanups (Rafael Wysocki).
 
  - intel_idle cleanup related to the sysfs interface (Len Brown).
 
  - Off-by-one bug fix in the OPP (Operating Performance Points)
    framework (Andrzej Hajda).
 
  - OPP framework's documentation fix (Viresh Kumar).
 
  - cpufreq qoriq driver cleanup (Tang Yuantian).
 
  - Fixes for typos in comments in the device runtime PM framework
    (Christophe Jaillet).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYuLeGAAoJEILEb/54YlRxJvcP/0BRmh8Hn4Itx/NIWNg71X6j
 U+v8Pn8T3MP33gCcleLYlgre2JUIAUDmhdK99+UOx+/abhjMhQSaF3HhTOwYaPtQ
 6njoHVS0NnfqUf+x5kp+EpRxBVNucYVbdRTVd1DsHIeLLz/96DFOzb/R7tko/pKx
 pFMWvNdotHLLgXOG1UvdRimwTDlFMffxFzD8Se53LPjRXS0S73A5VWfqZOye44Re
 j3W1AJ0Idgq5uduA6J8x1MWbaxDq1h+j6CSUm05yvqrINzxXwXt0Hv6stCQTo+Gb
 YMdiBd8MujNyAgcchw3jiDQ8Vp+zmfLPcHrfPe//SSefj26eB8LyVNSYelvbUdOz
 cNjvyErva37MmaegCL9QC7WbLM+A7VE6bU6YzDCi/rR8jYMJ51Fb9jGiYb/oimry
 OLlblEekikUsskWv4hGV1JVt5VhmUMlagWtexxn+lMszATcZro0tfXu/vgQWksYs
 noUnwuWJWxvj2aNMsvbzW3HLlTGSmYl2UxJ7IymQQaTDblwF9Kg61rm3+5coUctd
 ifceynDVp9Gju25faYgZ+Dq9+o8ktlOGOHRRPdLIRNJ/T+4tUDnlGkdbPb+Tfn03
 XUIzYCu74U8/oW8gOk6t0WpmWzvxEXNgdirdEIR6y3loYIC0Jr3v4gyD975Eug74
 Hzfrdg7ignAmWV+nf6UY
 =SeUF
 -----END PGP SIGNATURE-----

Merge tag 'pm-extra-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates deom Rafael Wysocki:
 "These fix two bugs introduced by recent power management updates (in
  the cpuidle menu governor and intel_pstate) and a few other issues,
  clean up things and remove unused code.

  Specifics:

   - Fix for a cpuidle menu governor problem that started to take an
     unnecessary spinlock after one of the recent updates and that did
     not play well with the RT patch (Rafael Wysocki).

   - Fix for the new intel_pstate operation mode switching feature added
     recently that did not reinitialize P-state limits properly when
     switching operation modes (Rafael Wysocki).

   - Removal of unused global notifiers from the PM QoS framework
     (Viresh Kumar).

   - Generic power domains framework update to make it handle
     asynchronous invocations of PM callbacks in the "noirq" phases of
     system suspend/hibernation correctly (Ulf Hansson).

   - Two hibernation core cleanups (Rafael Wysocki).

   - intel_idle cleanup related to the sysfs interface (Len Brown).

   - Off-by-one bug fix in the OPP (Operating Performance Points)
     framework (Andrzej Hajda).

   - OPP framework's documentation fix (Viresh Kumar).

   - cpufreq qoriq driver cleanup (Tang Yuantian).

   - Fixes for typos in comments in the device runtime PM framework
     (Christophe Jaillet)"

* tag 'pm-extra-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / OPP: Documentation: Fix opp-microvolt in examples
  intel_idle: stop exposing platform acronyms in sysfs
  cpufreq: intel_pstate: Fix limits issue with operation mode switching
  PM / hibernate: Define pr_fmt() and use pr_*() instead of printk()
  PM / hibernate: Untangle power_down()
  cpuidle: menu: Avoid taking spinlock for accessing QoS values
  PM / QoS: Remove global notifiers
  PM / runtime: Fix some typos
  cpufreq: qoriq: clean up unused code
  PM / OPP: fix off-by-one bug in dev_pm_opp_get_max_volt_latency loop
  PM / Domains: Power off masters immediately in the power off sequence
  PM / Domains: Rename is_async to one_dev_on for genpd_power_off()
  PM / Domains: Move genpd_power_off() above genpd_power_on()
2017-03-02 17:33:52 -08:00
Ingo Molnar 03441a3482 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/stat.h>
We are going to split <linux/sched/stat.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/stat.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:34 +01:00
Ingo Molnar 4f17722c72 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/loadavg.h>
We are going to split <linux/sched/loadavg.h> out of <linux/sched.h>, which
will have to be picked up from a couple of .c files.

Create a trivial placeholder <linux/sched/topology.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:27 +01:00
Ingo Molnar e601757102 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h>
We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which
will have to be picked up from other headers and .c files.

Create a trivial placeholder <linux/sched/clock.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:27 +01:00
Ingo Molnar 4c822698cb sched/headers: Prepare for new header dependencies before moving code to <linux/sched/idle.h>
We are going to split  <linux/sched/idle.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/idle.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:26 +01:00
Rafael J. Wysocki 6dbf5cea05 cpuidle: menu: Avoid taking spinlock for accessing QoS values
After commit 9908859aca (cpuidle/menu: add per CPU PM QoS resume
latency consideration) the cpuidle menu governor calls
dev_pm_qos_read_value() on CPU devices to read the current resume
latency QoS constraint values for them.  That function takes a spinlock
to prevent the device's power.qos pointer from becoming NULL during
the access which is a problem for the RT patchset where spinlocks are
converted into mutexes and the idle loop stops working.

However, it is not even necessary for the menu governor to take
that spinlock, because the power.qos pointer accessed under it
cannot be modified during the access anyway.

For this reason, introduce a "raw" routine for accessing device
QoS resume latency constraints without locking and use it in the
menu governor.

Fixes: 9908859aca (cpuidle/menu: add per CPU PM QoS resume latency consideration)
Acked-by: Alex Shi <alex.shi@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-02-27 15:07:38 +01:00
Linus Torvalds 38705613b7 powerpc updates for 4.11 part 1.
Highlights include:
 
  - Support for direct mapped LPC on POWER9, giving Linux direct access to
    devices that may be on there such as a UART.
 
  - Memory hotplug support for the Power9 Radix MMU.
 
  - Add new AUX vectors describing the processor's cache geometry, to be used by
    glibc.
 
  - The ability for a guest to ask the hypervisor to resize the guest's hash
    table, and in addition support for doing so automatically when memory is
    hotplugged into/out-of the guest. This allows the hash table to be sized
    based on the current memory usage of the guest, rather than the maximum
    possible memory usage.
 
  - Implementation of optprobes (kprobe optimisation) for powerpc.
 
 In addition there's the topic branch shared with the KVM tree, which includes
 support for guests to use the Radix MMU on Power9.
 
 Thanks to:
   Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton Blanchard,
   Benjamin Herrenschmidt, Chris Packham, Daniel Axtens, Daniel Borkmann, David
   Gibson, Finn Thain, Gautham R. Shenoy, Gavin Shan, Greg Kurz, Joel Stanley,
   John Allen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
   Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi
   Bangoria, Reza Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYrWj0AAoJEFHr6jzI4aWAsn4P/08Kz3TtOvDuuPGVNoO7fWOn
 ag5/zVt8R4FuCALqWpAZbVqMuUU4wLxG0RuWlmBNYYhrjMC6JxHpOSjQXxM2D7YT
 CdGTJxG414r6HMOeToL9i/z33o0m+KT07tscer+QMKlXVKCR2z0fEJch+zoPBHYA
 5eVBFpLLTtbNiX9UnrcM/vYz61d56kT4YJey9/8qbAkTAc1rMPa8ucU5UiKYJ7yX
 mF8cd7WE+7aqif9V8yN59G2rcbz+h3pbMw/gzImiYsYrUj4fLjU+VTKL5PPT+UFy
 WWBXD3MAMm1dksMMZi3hgoo2BZhDn3RkymeYi6Jo4kDknNMPZzkMxGyvaJ8Eq5H/
 bIYXdS1AbTtvaaEEuWqDFjpnChOEvj/8IeqitU0jjlql8BVjNKg/ESaaKucZr+pO
 pk2Mfvw0Tb/lxJT5qj27yq4aRsxJwdFOoPYCN7MquPp/wV2Tg5M6h4nVQ4T6Wl0S
 tMFQeCqXflhDWh0Xgr2UXpF66/YTj3Du5LasOTkgGeU30Z8TcNGFEmDWShKP3cEm
 e0oQE+OHhIGN4WSBXBAto/Gw8/0v3pXlMs+VEfeHqenwPss2sWtSwXWe8khmiy9e
 DJ48sTzj75/Zx1fiqRldw9YEnrL+NK0eOOpvzxeyKpfvdUytc+chFfEqUmO6kl3Z
 DW2UmlZxmW+b0SfexCHL
 =Icle
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Support for direct mapped LPC on POWER9, giving Linux direct access
     to devices that may be on there such as a UART.

   - Memory hotplug support for the Power9 Radix MMU.

   - Add new AUX vectors describing the processor's cache geometry, to
     be used by glibc.

   - The ability for a guest to ask the hypervisor to resize the guest's
     hash table, and in addition support for doing so automatically when
     memory is hotplugged into/out-of the guest. This allows the hash
     table to be sized based on the current memory usage of the guest,
     rather than the maximum possible memory usage.

   - Implementation of optprobes (kprobe optimisation) for powerpc.

  In addition there's the topic branch shared with the KVM tree, which
  includes support for guests to use the Radix MMU on Power9.

  Thanks to:
    Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton
    Blanchard, Benjamin Herrenschmidt, Chris Packham, Daniel Axtens,
    Daniel Borkmann, David Gibson, Finn Thain, Gautham R. Shenoy, Gavin
    Shan, Greg Kurz, Joel Stanley, John Allen, Madhavan Srinivasan,
    Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Nathan Fontenot,
    Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Reza
    Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun"

* tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (129 commits)
  powerpc/mm/radix: Skip ptesync in pte update helpers
  powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
  powerpc/mm/radix: Update pte update sequence for pte clear case
  powerpc/mm: Update PROTFAULT handling in the page fault path
  powerpc/xmon: Fix data-breakpoint
  powerpc/mm: Fix build break with BOOK3S_64=n and MEMORY_HOTPLUG=y
  powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y
  powerpc/mm: Fix build break with RADIX=y & HUGETLBFS=n
  powerpc/pseries: Fix typo in parameter description
  powerpc/kprobes: Remove kprobe_exceptions_notify()
  kprobes: Introduce weak variant of kprobe_exceptions_notify()
  powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL
  powerpc/powernv: Fix opal_exit tracepoint opcode
  powerpc: Add a prototype for mcount() so it can be versioned
  powerpc: Drop GPL from of_node_to_nid() export to match other arches
  powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()
  powerpc/kprobes: Implement Optprobes
  powerpc/kprobes: Fixes for kprobe_lookup_name() on BE
  powerpc: Add helper to check if offset is within relative branch range
  powerpc/bpf: Introduce __PPC_SH64()
  ...
2017-02-22 10:30:38 -08:00
Gautham R. Shenoy 09206b600c powernv: Pass PSSCR value and mask to power9_idle_stop
The power9_idle_stop method currently takes only the requested stop
level as a parameter and picks up the rest of the PSSCR bits from a
hand-coded macro. This is not a very flexible design, especially when
the firmware has the capability to communicate the psscr value and the
mask associated with a particular stop state via device tree.

This patch modifies the power9_idle_stop API to take as parameters the
PSSCR value and the PSSCR mask corresponding to the stop state that
needs to be set. These PSSCR value and mask are respectively obtained
by parsing the "ibm,cpu-idle-state-psscr" and
"ibm,cpu-idle-state-psscr-mask" fields from the device tree.

In addition to this, the patch adds support for handling stop states
for which ESL and EC bits in the PSSCR are zero. As per the
architecture, a wakeup from these stop states resumes execution from
the subsequent instruction as opposed to waking up at the System
Vector.

The older firmware sets only the Requested Level (RL) field in the
psscr and psscr-mask exposed in the device tree. For older firmware
where psscr-mask=0xf, this patch will set the default sane values that
the set for for remaining PSSCR fields (i.e PSLL, MTL, ESL, EC, and
TR). For the new firmware, the patch will validate that the invariants
required by the ISA for the psscr values are maintained by the
firmware.

This skiboot patch that exports fully populated PSSCR values and the
mask for all the stop states can be found here:
https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html

[Optimize the number of instructions before entering STOP with
ESL=EC=0, validate the PSSCR values provided by the firimware
maintains the invariants required as per the ISA suggested by Balbir
Singh]

Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 08:32:13 +11:00
Gautham R. Shenoy 9e9fc6f00a cpuidle:powernv: Add helper function to populate powernv idle states.
In the current code for powernv_add_idle_states, there is a lot of code
duplication while initializing an idle state in powernv_states table.

Add an inline helper function to populate the powernv_states[] table
for a given idle state. Invoke this for populating the "Nap",
"Fastsleep" and the stop states in powernv_add_idle_states.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-31 08:32:13 +11:00
Alex Shi 9908859aca cpuidle/menu: add per CPU PM QoS resume latency consideration
There may be special requirements on CPU response time, like if a
interrupt is pinned to a CPU, that CPU should not go into excessively
deep idle states.  For this reason, add a mechanism for adding
PM QoS resume latency constraints for individual CPUs and modify the
menu governor to take them into account.

To that end, extend the device PM QoS pm_qos_resume_latency attribute
to CPUs, which is possible, because the exit latency for CPUs is
effectively equivalent to the resume latency for devices.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
Acked-by: Rik van Riel <riel@redhat.com>
[ rjw : Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-30 11:03:32 +01:00
Alex Shi 8e37e1a2a3 cpuidle/menu: stop seeking deeper idle if current state is deep enough
Obsolete commit 71abbbf856 (cpuidle: extend cpuidle and menu governor
to handle dynamic states) wanted to introduce dynamic C-states, but that
idea was dropped long ago.  The nonsense deeper C-state checking
remained, though.

Since both target_residency and exit_latency are longer for deeper
idle state, there's no need to waste CPU time on useless checks.

Signed-off-by: Alex Shi <alex.shi@linaro.org>
Acked-by: Rik van Riel <riel@redhat.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-01-30 10:56:00 +01:00
Linus Torvalds 7c0f6ba682 Replace <asm/uaccess.h> with <linux/uaccess.h> globally
This was entirely automated, using the script by Al:

  PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
  sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
        $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-24 11:46:01 -08:00
Rafael J. Wysocki 0e7414b7aa cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state()
Since cpuidle_use_deepest_state() is not static, add a proper
kerneldoc comment to it to document its purpose.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-06 02:25:03 +01:00
Pan Bian 8f6040cebd cpuidle: fix improper return value on error
In function cpuidle_add_state_sysfs(), variable ret takes the return
value. Its value should be negative on errors. Because ret is reset in
the loop, its value will be 0 during the second and after repeat of the
loop. If kzalloc() returns a NULL pointer then, it will return 0. It may
be better to explicitly assign "-ENOMEM" when the call to kzalloc()
fails.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=188901
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-06 02:24:14 +01:00
Jacob Pan bb8313b603 cpuidle: Allow enforcing deepest idle state selection
When idle injection is used to cap power, we need to override the
governor's choice of idle states.

For this reason, make it possible the deepest idle state selection to
be enforced by setting a flag on a given CPU to achieve the maximum
potential power draw reduction.

Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-29 14:02:21 +01:00
Andrew Donnellan ed61390bff cpuidle/powernv: staticise powernv_idle_driver
powernv_idle_driver isn't exported, it can be made static. Found by sparse.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-24 22:18:43 +01:00
Sudeep Holla a94e502c22 cpuidle: dt: assign ->enter_freeze to same as ->enter callback function
enter_freeze() callback is expected atleast to do the same as enter()
but it has to guarantee that interrupts aren't enabled at any point
in its execution, as the tick is frozen.

CPUs execute ->enter_freeze with the local tick or entire timekeeping
suspended, so it must not re-enable interrupts at any point (even
temporarily) or attempt to change states of clock event devices.

It will be called when the system goes to suspend-to-idle and will
reduce power usage because CPUs won't be awaken for unnecessary IRQs
(i.e. woken up only on IRQs from "wakeup sources")

We can reuse the same code for both the enter() and enter_freeze()
callbacks as along as they don't re-enable interrupts. Only "coupled"
cpuidle mechanism enables interrupts and doing that with timekeeping
suspended is generally not safe.

Since this generic DT based idle driver doesn't support "coupled"
states, it is safe to assume that the interrupts are not re-enabled.

This patch assign enter_freeze to same as enter callback function which
helps to save power without any intermittent spurious wakeups from
suspend-to-idle.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-11-23 02:03:11 +01:00
Daniel Lezcano e5f1b24587 cpuidle: governors: Remove remaining old module code
The governor's code use try_module_get() and put_module() to refcount
the governor's module. But the governors are not compiled as module.

The refcount does not prevent to switch the governor or unload
a module as they aren't compiled as modules. The code is pointless,
so remove it.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-10-21 14:49:51 +02:00
Linus Torvalds 133d970e0d Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS updates from Ralf Baechle:
 "This is the main MIPS pull request for 4.9:

  MIPS core arch code:
   - traps: 64bit kernels should read CP0_EBase 64bit
   - traps: Convert ebase to KSEG0
   - c-r4k: Drop bc_wback_inv() from icache flush
   - c-r4k: Split user/kernel flush_icache_range()
   - cacheflush: Use __flush_icache_user_range()
   - uprobes: Flush icache via kernel address
   - KVM: Use __local_flush_icache_user_range()
   - c-r4k: Fix flush_icache_range() for EVA
   - Fix -mabi=64 build of vdso.lds
   - VDSO: Drop duplicated -I*/-E* aflags
   - tracing: move insn_has_delay_slot to a shared header
   - tracing: disable uprobe/kprobe on compact branch instructions
   - ptrace: Fix regs_return_value for kernel context
   - Squash lines for simple wrapper functions
   - Move identification of VP(E) into proc.c from smp-mt.c
   - Add definitions of SYNC barrierstype values
   - traps: Ensure full EBase is written
   - tlb-r4k: If there are wired entries, don't use TLBINVF
   - Sanitise coherentio semantics
   - dma-default: Don't check hw_coherentio if device is non-coherent
   - Support per-device DMA coherence
   - Adjust MIPS64 CAC_BASE to reflect Config.K0
   - Support generating Flattened Image Trees (.itb)
   - generic: Introduce generic DT-based board support
   - generic: Convert SEAD-3 to a generic board
   - Enable hardened usercopy
   - Don't specify STACKPROTECTOR in defconfigs

  Octeon:
   - Delete dead code and files across the platform.
   - Change to use all memory into use by default.
   - Rename upper case variables in setup code to lowercase.
   - Delete legacy hack for broken bootloaders.
   - Leave maintaining the link state to the actual ethernet/PHY drivers.
   - Add DTS for D-Link DSR-500N.
   - Fix PCI interrupt routing on D-Link DSR-500N.

  Pistachio:
   - Remove ANDROID_TIMED_OUTPUT from defconfig

  TX39xx:
   - Move GPIO setup from .mem_setup() to .arch_init()
   - Convert to Common Clock Framework

  TX49xx:
   - Move GPIO setup from .mem_setup() to .arch_init()
   - Convert to Common Clock Framework

  txx9wdt:
   - Add missing clock (un)prepare calls for CCF

  BMIPS:
   - Add PW, GPIO SDHCI and NAND device node names
   - Support APPENDED_DTB
   - Add missing bcm97435svmb to DT_NONE
   - Rename bcm96358nb4ser to bcm6358-neufbox4-sercom
   - Add DT examples for BCM63268, BCM3368 and BCM6362
   - Add support for BCM3368 and BCM6362

  PCI
   - Reduce stack frame usage
   - Use struct list_head lists
   - Support for CONFIG_PCI_DOMAINS_GENERIC
   - Make pcibios_set_cache_line_size an initcall
   - Inline pcibios_assign_all_busses
   - Split pci.c into pci.c & pci-legacy.c
   - Introduce CONFIG_PCI_DRIVERS_LEGACY
   - Support generic drivers

  CPC
   - Convert bare 'unsigned' to 'unsigned int'
   - Avoid lock when MIPS CM >= 3 is present

  GIC:
   - Delete unused file smp-gic.c

  mt7620:
   - Delete unnecessary assignment for the field "owner" from PCI

  BCM63xx:
   - Let clk_disable() return immediately if clk is NULL

  pm-cps:
   - Change FSB workaround to CPU blacklist
   - Update comments on barrier instructions
   - Use MIPS standard lightweight ordering barrier
   - Use MIPS standard completion barrier
   - Remove selection of sync types
   - Add MIPSr6 CPU support
   - Support CM3 changes to Coherence Enable Register

  SMP:
   - Wrap call to mips_cpc_lock_other in mips_cm_lock_other
   - Introduce mechanism for freeing and allocating IPIs

  cpuidle:
   - cpuidle-cps: Enable use with MIPSr6 CPUs.

  SEAD3:
   - Rewrite to use DT and generic kernel feature.

  USB:
   - host: ehci-sead3: Remove SEAD-3 EHCI code

  FBDEV:
   - cobalt_lcdfb: Drop SEAD3 support

  dt-bindings:
   -  Document a binding for simple ASCII LCDs

  auxdisplay:
   - img-ascii-lcd: driver for simple ASCII LCD displays

  irqchip i8259:
   - i8259: Add domain before mapping parent irq
   - i8259: Allow platforms to override poll function
   - i8259: Remove unused i8259A_irq_pending

  Malta:
   - Rewrite to use DT

  of/platform:
   - Probe "isa" busses by default

  CM:
   - Print CM error reports upon bus errors

  Module:
   - Migrate exception table users off module.h and onto extable.h
   - Make various drivers explicitly non-modular:
   - Audit and remove any unnecessary uses of module.h

  mailmap:
   - Canonicalize to Qais' current email address.

  Documentation:
   - MIPS supports HAVE_REGS_AND_STACK_ACCESS_API

  Loongson1C:
   - Add CPU support for Loongson1C
   - Add board support
   - Add defconfig
   - Add RTC support for Loongson1C board

  All this except one Documentation fix has sat in linux-next and has
  survived Imagination's automated build test system"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (127 commits)
  Documentation: MIPS supports HAVE_REGS_AND_STACK_ACCESS_API
  MIPS: ptrace: Fix regs_return_value for kernel context
  MIPS: VDSO: Drop duplicated -I*/-E* aflags
  MIPS: Fix -mabi=64 build of vdso.lds
  MIPS: Enable hardened usercopy
  MIPS: generic: Convert SEAD-3 to a generic board
  MIPS: generic: Introduce generic DT-based board support
  MIPS: Support generating Flattened Image Trees (.itb)
  MIPS: Adjust MIPS64 CAC_BASE to reflect Config.K0
  MIPS: Print CM error reports upon bus errors
  MIPS: Support per-device DMA coherence
  MIPS: dma-default: Don't check hw_coherentio if device is non-coherent
  MIPS: Sanitise coherentio semantics
  MIPS: PCI: Support generic drivers
  MIPS: PCI: Introduce CONFIG_PCI_DRIVERS_LEGACY
  MIPS: PCI: Split pci.c into pci.c & pci-legacy.c
  MIPS: PCI: Inline pcibios_assign_all_busses
  MIPS: PCI: Make pcibios_set_cache_line_size an initcall
  MIPS: PCI: Support for CONFIG_PCI_DOMAINS_GENERIC
  MIPS: PCI: Use struct list_head lists
  ...
2016-10-15 09:26:12 -07:00
Chris Metcalf 6727ad9e20 nmi_backtrace: generate one-line reports for idle cpus
When doing an nmi backtrace of many cores, most of which are idle, the
output is a little overwhelming and very uninformative.  Suppress
messages for cpus that are idling when they are interrupted and just
emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

We do this by grouping all the cpuidle code together into a new
.cpuidle.text section, and then checking the address of the interrupted
PC to see if it lies within that section.

This commit suitably tags x86 and tile idle routines, and only adds in
the minimal framework for other architectures.

Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
Tested-by: Petr Mladek <pmladek@suse.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:30 -07:00
Matt Redfearn 72bc8c75ea cpuidle: cpuidle-cps: Enable use with MIPSr6 CPUs.
This patch enables the MIPS CPS driver for MIPSr6 CPUs.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reviewed-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: linux-mips@linux-mips.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14228/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-10-04 16:13:57 +02:00
Linus Torvalds 597f03f9d1 Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug updates from Thomas Gleixner:
 "Yet another batch of cpu hotplug core updates and conversions:

   - Provide core infrastructure for multi instance drivers so the
     drivers do not have to keep custom lists.

   - Convert custom lists to the new infrastructure. The block-mq custom
     list conversion comes through the block tree and makes the diffstat
     tip over to more lines removed than added.

   - Handle unbalanced hotplug enable/disable calls more gracefully.

   - Remove the obsolete CPU_STARTING/DYING notifier support.

   - Convert another batch of notifier users.

   The relayfs changes which conflicted with the conversion have been
   shipped to me by Andrew.

   The remaining lot is targeted for 4.10 so that we finally can remove
   the rest of the notifiers"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  cpufreq: Fix up conversion to hotplug state machine
  blk/mq: Reserve hotplug states for block multiqueue
  x86/apic/uv: Convert to hotplug state machine
  s390/mm/pfault: Convert to hotplug state machine
  mips/loongson/smp: Convert to hotplug state machine
  mips/octeon/smp: Convert to hotplug state machine
  fault-injection/cpu: Convert to hotplug state machine
  padata: Convert to hotplug state machine
  cpufreq: Convert to hotplug state machine
  ACPI/processor: Convert to hotplug state machine
  virtio scsi: Convert to hotplug state machine
  oprofile/timer: Convert to hotplug state machine
  block/softirq: Convert to hotplug state machine
  lib/irq_poll: Convert to hotplug state machine
  x86/microcode: Convert to hotplug state machine
  sh/SH-X3 SMP: Convert to hotplug state machine
  ia64/mca: Convert to hotplug state machine
  ARM/OMAP/wakeupgen: Convert to hotplug state machine
  ARM/shmobile: Convert to hotplug state machine
  arm64/FP/SIMD: Convert to hotplug state machine
  ...
2016-10-03 19:43:08 -07:00
Rafael J. Wysocki e35db92b4f Merge branches 'pm-cpuidle', 'pm-opp' and 'pm-avs'
* pm-cpuidle:
  ARM: cpuidle: Fix error return code

* pm-opp:
  PM / OPP: Don't support OPP if it provides supported-hw but platform does not
  PM / OPP: avoid maybe-uninitialized warning

* pm-avs:
  PM / AVS: SmartReflex: Neaten logging
2016-10-02 01:43:16 +02:00
Sebastian Andrzej Siewior dfc616d8b3 cpuidle/coupled: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-pm@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160824091444.brdr5zpbxjvh6n3f@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-06 18:30:24 +02:00
Sebastian Andrzej Siewior 10fcca9d87 cpuidle/powernv: Convert to hotplug state machine
Install the callbacks via the state machine.

v1…v2: - Use only CPUHP_CPUIDLE_DEAD (requested by Daniel Lezcano)

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-pm@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160824091259.ozyslcopxvbfdqzy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-06 18:30:24 +02:00
Sebastian Andrzej Siewior 529351fd3c cpuidle/pseries: Convert to hotplug state machine
Install the callbacks via the state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-pm@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160818125731.27256-11-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-09-06 18:30:24 +02:00
Christophe Jaillet af48d7bc37 ARM: cpuidle: Fix error return code
We know that 'ret = 0' because it has been tested a few lines above.
So, if 'kzalloc' fails, 0 will be returned instead of an error code.
Return -ENOMEM instead.

Fixes: a0d46a3dfd ("ARM: cpuidle: Register per cpuidle device")
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-12 02:33:14 +02:00
Linus Torvalds bad60e6f25 powerpc updates for 4.8 # 1
Highlights:
  - PowerNV PCI hotplug support.
  - Lots more Power9 support.
  - eBPF JIT support on ppc64le.
  - Lots of cxl updates.
  - Boot code consolidation.
 
 Bug fixes:
  - Fix spin_unlock_wait() from Boqun Feng
  - Fix stack pointer corruption in __tm_recheckpoint() from Michael Neuling
  - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
  - mm: Ensure "special" zones are empty from Oliver O'Halloran
  - ftrace: Separate the heuristics for checking call sites from Michael Ellerman
  - modules: Never restore r2 for a mprofile-kernel style mcount() call from Michael Ellerman
  - Fix endianness when reading TCEs from Alexey Kardashevskiy
  - start rtasd before PCI probing from Greg Kurz
  - PCI: rpaphp: Fix slot registration for multiple slots under a PHB from Tyrel Datwyler
  - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev Bhattiprolu
 
 Cleanups & fixes:
  - Drop support for MPIC in pseries from Rashmica Gupta
  - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
  - Remove unused symbols in asm-offsets.c from Rashmica Gupta
  - Fix SRIOV not building without EEH enabled from Russell Currey
  - Remove kretprobe_trampoline_holder. from Thiago Jung Bauermann
  - Reduce log level of PCI I/O space warning from Benjamin Herrenschmidt
  - Add array bounds checking to crash_shutdown_handlers from Suraj Jitindar Singh
  - Avoid -maltivec when using clang integrated assembler from Anton Blanchard
  - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
  - Fix error return value in cmm_mem_going_offline() from Rasmus Villemoes
  - export cpu_to_core_id() from Mauricio Faria de Oliveira
  - Remove old symbols from defconfigs from Andrew Donnellan
  - Update obsolete comments in setup_32.c about entry conditions from Benjamin Herrenschmidt
  - Add comment explaining the purpose of setup_kdump_trampoline() from Benjamin Herrenschmidt
  - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin Hao
  - Remove RELOCATABLE_PPC32 from Kevin Hao
  - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh
 
 Minor cleanups & fixes:
  - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
    Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King, Geliang
    Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman, Michael Ellerman,
    Stephen Rothwell, Stewart Smith.
 
 Freescale updates from Scott:
  - "Highlights include more 8xx optimizations, device tree updates,
    and MVME7100 support."
 
 PowerNV PCI hotplug from Gavin Shan:
  - PCI: Add pcibios_setup_bridge()
  - Override pcibios_setup_bridge()
  - Remove PCI_RESET_DELAY_US
  - Move pnv_pci_ioda_setup_opal_tce_kill() around
  - Increase PE# capacity
  - Allocate PE# in reverse order
  - Create PEs in pcibios_setup_bridge()
  - Setup PE for root bus
  - Extend PCI bridge resources
  - Make pnv_ioda_deconfigure_pe() visible
  - Dynamically release PE
  - Update bridge windows on PCI plug
  - Delay populating pdn
  - Support PCI slot ID
  - Use PCI slot reset infrastructure
  - Introduce pnv_pci_get_slot_id()
  - Functions to get/set PCI slot state
  - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
  - Print correct PHB type names
 
 Power9 idle support from Shreyas B. Prabhu:
  - set power_save func after the idle states are initialized
  - Use PNV_THREAD_WINKLE macro while requesting for winkle
  - make hypervisor state restore a function
  - Rename idle_power7.S to idle_book3s.S
  - Rename reusable idle functions to hardware agnostic names
  - Make pnv_powersave_common more generic
  - abstraction for saving SPRs before entering deep idle states
  - Add platform support for stop instruction
  - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
  - cpuidle/powernv: cleanup cpuidle-powernv.c
  - cpuidle/powernv: Add support for POWER ISA v3 idle states
  - Use deepest stop state when cpu is offlined
 
 Power9 PMU from Madhavan Srinivasan:
  - factor out power8 pmu macros and defines
  - factor out power8 pmu functions
  - factor out power8 __init_pmu code
  - Add power9 event list macros for generic and cache events
  - Power9 PMU support
  - Export Power9 generic and cache events to sysfs
 
 Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
  - Add XICS emulation APIs
  - Move a few exception common handlers to make room
  - Add support for HV virtualization interrupts
  - Add mechanism to force a replay of interrupts
  - Add ICP OPAL backend
  - Discover IODA3 PHBs
  - pci: Remove obsolete SW invalidate
  - opal: Add real mode call wrappers
  - Rename TCE invalidation calls
  - Remove SWINV constants and obsolete TCE code
  - Rework accessing the TCE invalidate register
  - Fallback to OPAL for TCE invalidations
  - Use the device-tree to get available range of M64's
  - Check status of a PHB before using it
  - pci: Don't try to allocate resources that will be reassigned
 
 Other Power9:
  - Send SIGBUS on unaligned copy and paste from Chris Smart
  - Large Decrementer support from Oliver O'Halloran
  - Load Monitor Register Support from Jack Miller
 
 Performance improvements from Anton Blanchard:
  - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
  - Avoid load hit store in setup_sigcontext()
  - Remove assembly versions of strcpy, strcat, strlen and strcmp
  - Align hot loops of some string functions
 
 eBPF JIT from Naveen N. Rao:
  - Fix/enhance 32-bit Load Immediate implementation
  - Optimize 64-bit Immediate loads
  - Introduce rotate immediate instructions
  - A few cleanups
  - Isolate classic BPF JIT specifics into a separate header
  - Implement JIT compiler for extended BPF
 
 Operator Panel driver from Suraj Jitindar Singh:
  - devicetree/bindings: Add binding for operator panel on FSP machines
  - Add inline function to get rc from an ASYNC_COMP opal_msg
  - Add driver for operator panel on FSP machines
 
 Sparse fixes from Daniel Axtens:
  - make some things static
  - Introduce asm-prototypes.h
  - Include headers containing prototypes
  - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
  - kvm: Clarify __user annotations
  - Pass endianness to sparse
  - Make ppc_md.{halt, restart} __noreturn
 
 MM fixes & cleanups from Aneesh Kumar K.V:
  - radix: Update LPCR HR bit as per ISA
  - use _raw variant of page table accessors
  - Compile out radix related functions if RADIX_MMU is disabled
  - Clear top 16 bits of va only on older cpus
  - Print formation regarding the the MMU mode
  - hash: Update SDR1 size encoding as documented in ISA 3.0
  - radix: Update PID switch sequence
  - radix: Update machine call back to support new HCALL.
  - radix: Add LPID based tlb flush helpers
  - radix: Add a kernel command line to disable radix
  - Cleanup LPCR defines
 
 Boot code consolidation from Benjamin Herrenschmidt:
  - Move epapr_paravirt_early_init() to early_init_devtree()
  - cell: Don't use flat device-tree after boot
  - ge_imp3a: Don't use the flat device-tree after boot
  - mpc85xx_ds: Don't use the flat device-tree after boot
  - mpc85xx_rdb: Don't use the flat device-tree after boot
  - Don't test for machine type in rtas_initialize()
  - Don't test for machine type in smp_setup_cpu_maps()
  - dt: Add of_device_compatible_match()
  - Factor do_feature_fixup calls
  - Move 64-bit feature fixup earlier
  - Move 64-bit memory reserves to setup_arch()
  - Use a cachable DART
  - Move FW feature probing out of pseries probe()
  - Put exception configuration in a common place
  - Remove early allocation of the SMU command buffer
  - Move MMU backend selection out of platform code
  - pasemi: Remove IOBMAP allocation from platform probe()
  - mm/hash: Don't use machine_is() early during boot
  - Don't test for machine type to detect HEA special case
  - pmac: Remove spurrious machine type test
  - Move hash table ops to a separate structure
  - Ensure that ppc_md is empty before probing for machine type
  - Move 64-bit probe_machine() to later in the boot process
  - Move 32-bit probe() machine to later in the boot process
  - Get rid of ppc_md.init_early()
  - Move the boot time info banner to a separate function
  - Move setting of {i,d}cache_bsize to initialize_cache_info()
  - Move the content of setup_system() to setup_arch()
  - Move cache info inits to a separate function
  - Re-order the call to smp_setup_cpu_maps()
  - Re-order setup_panic()
  - Make a few boot functions __init
  - Merge 32-bit and 64-bit setup_arch()
 
 Other new features:
  - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
  - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
  - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
  - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
  - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
  - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
  - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
  - Add a parameter to disable 1TB segs from Oliver O'Halloran
  - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
  - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
  - pseries: Add pseries hotplug workqueue from John Allen
  - pseries: Add support for hotplug interrupt source from John Allen
  - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
  - pseries: Move property cloning into its own routine from Nathan Fontenot
  - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
  - pseries: Auto-online hotplugged memory from Nathan Fontenot
  - pseries: Remove call to memblock_add() from Nathan Fontenot
 
 cxl:
  - Add set and get private data to context struct from Michael Neuling
  - make base more explicitly non-modular from Paul Gortmaker
  - Use for_each_compatible_node() macro from Wei Yongjun
  - Frederic Barrat
    - Abstract the differences between the PSL and XSL
    - Make vPHB device node match adapter's
  - Philippe Bergheaud
    - Add mechanism for delivering AFU driver specific events
    - Ignore CAPI adapters misplaced in switched slots
    - Refine slice error debug messages
  - Andrew Donnellan
    - static-ify variables to fix sparse warnings
    - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
    - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
    - Add cxl_check_and_switch_mode() API to switch bi-modal cards
    - remove dead Kconfig options
    - fix potential NULL dereference in free_adapter()
  - Ian Munsie
    - Update process element after allocating interrupts
    - Add support for CAPP DMA mode
    - Fix allowing bogus AFU descriptors with 0 maximum processes
    - Fix allocating a minimum of 2 pages for the SPA
    - Fix bug where AFU disable operation had no effect
    - Workaround XSL bug that does not clear the RA bit after a reset
    - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
    - powerpc/powernv: Split cxl code out into a separate file
    - Add cxl_slot_is_supported API
    - Enable bus mastering for devices using CAPP DMA mode
    - Move cxl_afu_get / cxl_afu_put to base
    - Allow a default context to be associated with an external pci_dev
    - Do not create vPHB if there are no AFU configuration records
    - powerpc/powernv: Add support for the cxl kernel api on the real phb
    - Add support for using the kernel API with a real PHB
    - Add kernel APIs to get & set the max irqs per context
    - Add preliminary workaround for CX4 interrupt limitation
    - Add support for interrupts on the Mellanox CX4
    - Workaround PE=0 hardware limitation in Mellanox CX4
    - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n
 
 selftests:
  - Test unaligned copy and paste from Chris Smart
  - Load Monitor Register Tests from Jack Miller
  - Cyril Bur
    - exec() with suspended transaction
    - Use signed long to read perf_event_paranoid
    - Fix usage message in context_switch
    - Fix generation of vector instructions/types in context_switch
  - Michael Ellerman
    - Use "Delta" rather than "Error" in normal output
    - Import Anton's mmap & futex micro benchmarks
    - Add a test for PROT_SAO
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXnWchAAoJEFHr6jzI4aWAe64P/36Vd9yJLptjkoyZp8/IQtu1
 Cv8buQwGdKuSMzdkcUAOXcC3fe2u70ZWXMKKLfY3koIV1IAiqdWk5/XWRKMP2XmE
 dG0LhSf0uu7uh+mE0WvQnRu46ImeKtQ+mPp4Hbs/s9SxMSeYjruv3vdWWmgUq0cl
 Gac2qJSRtAMmgLuHWMjf7N5mxOTOnKejU4o2i9cJ+YHmWKOdCigv2Ge1UadOQFlC
 E7tRPiUR3asfDfj+e+LVTTdToH6p8pk+mOUzIoZ8jIkQ+IXzi62UDl5+Rw9mqiuX
 1CtqEMUXxo2qwX+d4TcV/QUOp0YKPuIcUZ9NMMS+S3lOyJ4NFt+j2Izk7QJp5kNP
 gKVqB68TjDQsBuDr3P9ynlHbduxTIhZAqopbTrLe0FIg48nUe4n1yHJBVzqaVajX
 rFBJSsSUffBLAARNPSXJJhIgc2C1/qOC8dgMeDMcR2kPirDHaQZ/lY1yEpq1yiqR
 q6e3v5hvIAm4IjbYk0mF7TUxBrPGVE/ExyBINyASRoYxAJ1PyeD/iljZ9vI3asRA
 s+hhxT8H3f7lnqTrmJqMjHgAdGkmag07EdmvFNX4xK4aADSy7Y6g4dw25ffRopo9
 p9Jf9HX+dZv65Y3UjbV/6HuXcaSEBJJLSVWvii65PebqSN0LuHEFvNeIJ6Iblx0B
 AWh/hd0Iin2gdkcG39Mr
 =Z5kM
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - PowerNV PCI hotplug support.
   - Lots more Power9 support.
   - eBPF JIT support on ppc64le.
   - Lots of cxl updates.
   - Boot code consolidation.

  Bug fixes:
   - Fix spin_unlock_wait() from Boqun Feng
   - Fix stack pointer corruption in __tm_recheckpoint() from Michael
     Neuling
   - Fix multiple bugs in memory_hotplug_max() from Bharata B Rao
   - mm: Ensure "special" zones are empty from Oliver O'Halloran
   - ftrace: Separate the heuristics for checking call sites from
     Michael Ellerman
   - modules: Never restore r2 for a mprofile-kernel style mcount() call
     from Michael Ellerman
   - Fix endianness when reading TCEs from Alexey Kardashevskiy
   - start rtasd before PCI probing from Greg Kurz
   - PCI: rpaphp: Fix slot registration for multiple slots under a PHB
     from Tyrel Datwyler
   - powerpc/mm: Add memory barrier in __hugepte_alloc() from Sukadev
     Bhattiprolu

  Cleanups & fixes:
   - Drop support for MPIC in pseries from Rashmica Gupta
   - Define and use PPC64_ELF_ABI_v2/v1 from Michael Ellerman
   - Remove unused symbols in asm-offsets.c from Rashmica Gupta
   - Fix SRIOV not building without EEH enabled from Russell Currey
   - Remove kretprobe_trampoline_holder from Thiago Jung Bauermann
   - Reduce log level of PCI I/O space warning from Benjamin
     Herrenschmidt
   - Add array bounds checking to crash_shutdown_handlers from Suraj
     Jitindar Singh
   - Avoid -maltivec when using clang integrated assembler from Anton
     Blanchard
   - Fix array overrun in ppc_rtas() syscall from Andrew Donnellan
   - Fix error return value in cmm_mem_going_offline() from Rasmus
     Villemoes
   - export cpu_to_core_id() from Mauricio Faria de Oliveira
   - Remove old symbols from defconfigs from Andrew Donnellan
   - Update obsolete comments in setup_32.c about entry conditions from
     Benjamin Herrenschmidt
   - Add comment explaining the purpose of setup_kdump_trampoline() from
     Benjamin Herrenschmidt
   - Merge the RELOCATABLE config entries for ppc32 and ppc64 from Kevin
     Hao
   - Remove RELOCATABLE_PPC32 from Kevin Hao
   - Fix .long's in tlb-radix.c to more meaningful from Balbir Singh

  Minor cleanups & fixes:
   - Andrew Donnellan, Anna-Maria Gleixner, Anton Blanchard, Benjamin
     Herrenschmidt, Bharata B Rao, Christophe Leroy, Colin Ian King,
     Geliang Tang, Greg Kurz, Madhavan Srinivasan, Michael Ellerman,
     Michael Ellerman, Stephen Rothwell, Stewart Smith.

  Freescale updates from Scott:
   - "Highlights include more 8xx optimizations, device tree updates,
     and MVME7100 support."

  PowerNV PCI hotplug from Gavin Shan:
   - PCI: Add pcibios_setup_bridge()
   - Override pcibios_setup_bridge()
   - Remove PCI_RESET_DELAY_US
   - Move pnv_pci_ioda_setup_opal_tce_kill() around
   - Increase PE# capacity
   - Allocate PE# in reverse order
   - Create PEs in pcibios_setup_bridge()
   - Setup PE for root bus
   - Extend PCI bridge resources
   - Make pnv_ioda_deconfigure_pe() visible
   - Dynamically release PE
   - Update bridge windows on PCI plug
   - Delay populating pdn
   - Support PCI slot ID
   - Use PCI slot reset infrastructure
   - Introduce pnv_pci_get_slot_id()
   - Functions to get/set PCI slot state
   - PCI/hotplug: PowerPC PowerNV PCI hotplug driver
   - Print correct PHB type names

  Power9 idle support from Shreyas B. Prabhu:
   - set power_save func after the idle states are initialized
   - Use PNV_THREAD_WINKLE macro while requesting for winkle
   - make hypervisor state restore a function
   - Rename idle_power7.S to idle_book3s.S
   - Rename reusable idle functions to hardware agnostic names
   - Make pnv_powersave_common more generic
   - abstraction for saving SPRs before entering deep idle states
   - Add platform support for stop instruction
   - cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
   - cpuidle/powernv: cleanup cpuidle-powernv.c
   - cpuidle/powernv: Add support for POWER ISA v3 idle states
   - Use deepest stop state when cpu is offlined

  Power9 PMU from Madhavan Srinivasan:
   - factor out power8 pmu macros and defines
   - factor out power8 pmu functions
   - factor out power8 __init_pmu code
   - Add power9 event list macros for generic and cache events
   - Power9 PMU support
   - Export Power9 generic and cache events to sysfs

  Power9 preliminary interrupt & PCI support from Benjamin Herrenschmidt:
   - Add XICS emulation APIs
   - Move a few exception common handlers to make room
   - Add support for HV virtualization interrupts
   - Add mechanism to force a replay of interrupts
   - Add ICP OPAL backend
   - Discover IODA3 PHBs
   - pci: Remove obsolete SW invalidate
   - opal: Add real mode call wrappers
   - Rename TCE invalidation calls
   - Remove SWINV constants and obsolete TCE code
   - Rework accessing the TCE invalidate register
   - Fallback to OPAL for TCE invalidations
   - Use the device-tree to get available range of M64's
   - Check status of a PHB before using it
   - pci: Don't try to allocate resources that will be reassigned

  Other Power9:
   - Send SIGBUS on unaligned copy and paste from Chris Smart
   - Large Decrementer support from Oliver O'Halloran
   - Load Monitor Register Support from Jack Miller

  Performance improvements from Anton Blanchard:
   - Avoid load hit store in __giveup_fpu() and __giveup_altivec()
   - Avoid load hit store in setup_sigcontext()
   - Remove assembly versions of strcpy, strcat, strlen and strcmp
   - Align hot loops of some string functions

  eBPF JIT from Naveen N. Rao:
   - Fix/enhance 32-bit Load Immediate implementation
   - Optimize 64-bit Immediate loads
   - Introduce rotate immediate instructions
   - A few cleanups
   - Isolate classic BPF JIT specifics into a separate header
   - Implement JIT compiler for extended BPF

  Operator Panel driver from Suraj Jitindar Singh:
   - devicetree/bindings: Add binding for operator panel on FSP machines
   - Add inline function to get rc from an ASYNC_COMP opal_msg
   - Add driver for operator panel on FSP machines

  Sparse fixes from Daniel Axtens:
   - make some things static
   - Introduce asm-prototypes.h
   - Include headers containing prototypes
   - Use #ifdef __BIG_ENDIAN__ #else for REG_BYTE
   - kvm: Clarify __user annotations
   - Pass endianness to sparse
   - Make ppc_md.{halt, restart} __noreturn

  MM fixes & cleanups from Aneesh Kumar K.V:
   - radix: Update LPCR HR bit as per ISA
   - use _raw variant of page table accessors
   - Compile out radix related functions if RADIX_MMU is disabled
   - Clear top 16 bits of va only on older cpus
   - Print formation regarding the the MMU mode
   - hash: Update SDR1 size encoding as documented in ISA 3.0
   - radix: Update PID switch sequence
   - radix: Update machine call back to support new HCALL.
   - radix: Add LPID based tlb flush helpers
   - radix: Add a kernel command line to disable radix
   - Cleanup LPCR defines

  Boot code consolidation from Benjamin Herrenschmidt:
   - Move epapr_paravirt_early_init() to early_init_devtree()
   - cell: Don't use flat device-tree after boot
   - ge_imp3a: Don't use the flat device-tree after boot
   - mpc85xx_ds: Don't use the flat device-tree after boot
   - mpc85xx_rdb: Don't use the flat device-tree after boot
   - Don't test for machine type in rtas_initialize()
   - Don't test for machine type in smp_setup_cpu_maps()
   - dt: Add of_device_compatible_match()
   - Factor do_feature_fixup calls
   - Move 64-bit feature fixup earlier
   - Move 64-bit memory reserves to setup_arch()
   - Use a cachable DART
   - Move FW feature probing out of pseries probe()
   - Put exception configuration in a common place
   - Remove early allocation of the SMU command buffer
   - Move MMU backend selection out of platform code
   - pasemi: Remove IOBMAP allocation from platform probe()
   - mm/hash: Don't use machine_is() early during boot
   - Don't test for machine type to detect HEA special case
   - pmac: Remove spurrious machine type test
   - Move hash table ops to a separate structure
   - Ensure that ppc_md is empty before probing for machine type
   - Move 64-bit probe_machine() to later in the boot process
   - Move 32-bit probe() machine to later in the boot process
   - Get rid of ppc_md.init_early()
   - Move the boot time info banner to a separate function
   - Move setting of {i,d}cache_bsize to initialize_cache_info()
   - Move the content of setup_system() to setup_arch()
   - Move cache info inits to a separate function
   - Re-order the call to smp_setup_cpu_maps()
   - Re-order setup_panic()
   - Make a few boot functions __init
   - Merge 32-bit and 64-bit setup_arch()

  Other new features:
   - tty/hvc: Use IRQF_SHARED for OPAL hvc consoles from Sam Mendoza-Jonas
   - tty/hvc: Use opal irqchip interface if available from Sam Mendoza-Jonas
   - powerpc: Add module autoloading based on CPU features from Alastair D'Silva
   - crypto: vmx - Convert to CPU feature based module autoloading from Alastair D'Silva
   - Wake up kopald polling thread before waiting for events from Benjamin Herrenschmidt
   - xmon: Dump ISA 2.06 SPRs from Michael Ellerman
   - xmon: Dump ISA 2.07 SPRs from Michael Ellerman
   - Add a parameter to disable 1TB segs from Oliver O'Halloran
   - powerpc/boot: Add OPAL console to epapr wrappers from Oliver O'Halloran
   - Assign fixed PHB number based on device-tree properties from Guilherme G. Piccoli
   - pseries: Add pseries hotplug workqueue from John Allen
   - pseries: Add support for hotplug interrupt source from John Allen
   - pseries: Use kernel hotplug queue for PowerVM hotplug events from John Allen
   - pseries: Move property cloning into its own routine from Nathan Fontenot
   - pseries: Dynamic add entires to associativity lookup array from Nathan Fontenot
   - pseries: Auto-online hotplugged memory from Nathan Fontenot
   - pseries: Remove call to memblock_add() from Nathan Fontenot

  cxl:
   - Add set and get private data to context struct from Michael Neuling
   - make base more explicitly non-modular from Paul Gortmaker
   - Use for_each_compatible_node() macro from Wei Yongjun
   - Frederic Barrat
   - Abstract the differences between the PSL and XSL
   - Make vPHB device node match adapter's
   - Philippe Bergheaud
   - Add mechanism for delivering AFU driver specific events
   - Ignore CAPI adapters misplaced in switched slots
   - Refine slice error debug messages
   - Andrew Donnellan
   - static-ify variables to fix sparse warnings
   - PCI/hotplug: pnv_php: export symbols and move struct types needed by cxl
   - PCI/hotplug: pnv_php: handle OPAL_PCI_SLOT_OFFLINE power state
   - Add cxl_check_and_switch_mode() API to switch bi-modal cards
   - remove dead Kconfig options
   - fix potential NULL dereference in free_adapter()
   - Ian Munsie
   - Update process element after allocating interrupts
   - Add support for CAPP DMA mode
   - Fix allowing bogus AFU descriptors with 0 maximum processes
   - Fix allocating a minimum of 2 pages for the SPA
   - Fix bug where AFU disable operation had no effect
   - Workaround XSL bug that does not clear the RA bit after a reset
   - Fix NULL pointer dereference on kernel contexts with no AFU interrupts
   - powerpc/powernv: Split cxl code out into a separate file
   - Add cxl_slot_is_supported API
   - Enable bus mastering for devices using CAPP DMA mode
   - Move cxl_afu_get / cxl_afu_put to base
   - Allow a default context to be associated with an external pci_dev
   - Do not create vPHB if there are no AFU configuration records
   - powerpc/powernv: Add support for the cxl kernel api on the real phb
   - Add support for using the kernel API with a real PHB
   - Add kernel APIs to get & set the max irqs per context
   - Add preliminary workaround for CX4 interrupt limitation
   - Add support for interrupts on the Mellanox CX4
   - Workaround PE=0 hardware limitation in Mellanox CX4
   - powerpc/powernv: Fix pci-cxl.c build when CONFIG_MODULES=n

  selftests:
   - Test unaligned copy and paste from Chris Smart
   - Load Monitor Register Tests from Jack Miller
   - Cyril Bur
   - exec() with suspended transaction
   - Use signed long to read perf_event_paranoid
   - Fix usage message in context_switch
   - Fix generation of vector instructions/types in context_switch
   - Michael Ellerman
   - Use "Delta" rather than "Error" in normal output
   - Import Anton's mmap & futex micro benchmarks
   - Add a test for PROT_SAO"

* tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (263 commits)
  powerpc/mm: Parenthesise IS_ENABLED() in if condition
  tty/hvc: Use opal irqchip interface if available
  tty/hvc: Use IRQF_SHARED for OPAL hvc consoles
  selftests/powerpc: exec() with suspended transaction
  powerpc: Improve comment explaining why we modify VRSAVE
  powerpc/mm: Drop unused externs for hpte_init_beat[_v3]()
  powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header
  powerpc/mm: Fix build break when PPC_NATIVE=n
  crypto: vmx - Convert to CPU feature based module autoloading
  powerpc: Add module autoloading based on CPU features
  powerpc/powernv/ioda: Fix endianness when reading TCEs
  powerpc/mm: Add memory barrier in __hugepte_alloc()
  powerpc/modules: Never restore r2 for a mprofile-kernel style mcount() call
  powerpc/ftrace: Separate the heuristics for checking call sites
  powerpc: Merge 32-bit and 64-bit setup_arch()
  powerpc/64: Make a few boot functions __init
  powerpc: Re-order setup_panic()
  powerpc: Re-order the call to smp_setup_cpu_maps()
  powerpc/32: Move cache info inits to a separate function
  powerpc/64: Move the content of setup_system() to setup_arch()
  ...
2016-07-30 21:01:36 -07:00
Sudeep Holla 220276e09b cpuidle: introduce CPU_PM_CPU_IDLE_ENTER macro for ARM{32, 64}
The function arm_enter_idle_state is exactly the same in both generic
ARM{32,64} CPUIdle driver and will be the same even on ARM64 backend
for ACPI processor idle driver. So we can unify it and move it to a
common place by introducing CPU_PM_CPU_IDLE_ENTER macro that can be
used in all places avoiding duplication.

This is in preparation of reuse of the generic cpuidle entry function
for ACPI LPI support on ARM64.

Suggested-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-21 23:29:38 +02:00
Shreyas B. Prabhu 3005c597ba cpuidle/powernv: Add support for POWER ISA v3 idle states
POWER ISA v3 defines a new idle processor core mechanism. In summary,
 a) new instruction named stop is added.
 b) new per thread SPR named PSSCR is added which controls the behavior
	of stop instruction.

Supported idle states and value to be written to PSSCR register to enter
any idle state is exposed via ibm,cpu-idle-state-names and
ibm,cpu-idle-state-psscr respectively. To enter an idle state,
platform provided power_stop() needs to be invoked with the appropriate
PSSCR value.

This patch adds support for this new mechanism in cpuidle powernv driver.

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Cc: linux-pm@vger.kernel.org
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: linuxppc-dev@lists.ozlabs.org
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:42 +10:00
Shreyas B. Prabhu 957efcedee cpuidle/powernv: cleanup cpuidle-powernv.c
- Use stack instead of kzalloc'ed memory for variables while probing
   device tree for idle states.
 - Set cap for number of idle states that can be added to
   cpuidle_state_table
 - Minor change in way we check of_property_read_u32_array for error
   for sake of consistency
 - Drop unnecessary "&" while assigning function pointer

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:42 +10:00
Shreyas B. Prabhu 169f3fae0c cpuidle/powernv: Use CPUIDLE_STATE_MAX instead of MAX_POWERNV_IDLE_STATES
Use cpuidle's CPUIDLE_STATE_MAX macro instead of powernv specific
MAX_POWERNV_IDLE_STATES.

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-pm@vger.kernel.org
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-15 20:18:41 +10:00
Shreyas B. Prabhu dbd1b8ea43 cpuidle: Fix last_residency division
Snooze is a poll idle state in powernv and pseries platforms. Snooze
has a timeout so that if a CPU stays in snooze for more than target
residency of the next available idle state, then it would exit
thereby giving chance to the cpuidle governor to re-evaluate and
promote the CPU to a deeper idle state. Therefore whenever snooze
exits due to this timeout, its last_residency will be target_residency
of the next deeper state.

Commit e93e59ce5b "cpuidle: Replace ktime_get() with local_clock()"
changed the math around last_residency calculation. Specifically,
while converting last_residency value from nano- to microseconds, it
carries out right shift by 10. Because of that, in snooze timeout
exit scenarios last_residency calculated is roughly 2.3% less than
target_residency of the next available state. This pattern is picked
up by get_typical_interval() in the menu governor and therefore
expected_interval in menu_select() is frequently less than the
target_residency of any state other than snooze.

Due to this we are entering snooze at a higher rate, thereby
affecting the single thread performance.

Fix this by using more precise division via ktime_us_delta().

Fixes: e93e59ce5b "cpuidle: Replace ktime_get() with local_clock()"
Reported-by: Anton Blanchard <anton@samba.org>
Bisected-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-04 14:17:34 +02:00
Daniel Lezcano e7387da520 cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter()
Commit 0b89e9aa28 (cpuidle: delay enabling interrupts until all
coupled CPUs leave idle) rightfully fixed a regression by letting
the coupled idle state framework to handle local interrupt enabling
when the CPU is exiting an idle state.

The current code checks if the idle state is coupled and, if so, it
will let the coupled code to enable interrupts. This way, it can
decrement the ready-count before handling the interrupt. This
mechanism prevents the other CPUs from waiting for a CPU which is
handling interrupts.

But the check is done against the state index returned by the back
end driver's ->enter functions which could be different from the
initial index passed as parameter to the cpuidle_enter_state()
function.

 entered_state = target_state->enter(dev, drv, index);

 [ ... ]

 if (!cpuidle_state_is_coupled(drv, entered_state))
	local_irq_enable();

 [ ... ]

If the 'index' is referring to a coupled idle state but the
'entered_state' is *not* coupled, then the interrupts are enabled
again. All CPUs blocked on the sync barrier may busy loop longer
if the CPU has interrupts to handle before decrementing the
ready-count. That's consuming more energy than saving.

Fixes: 0b89e9aa28 (cpuidle: delay enabling interrupts until all coupled CPUs leave idle)
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 3.15+ <stable@vger.kernel.org> # 3.15+
[ rjw: Subject & changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-05-18 02:48:37 +02:00
Rafael J. Wysocki fd7c3c29f9 Merge back new cpuidle material for v4.7. 2016-05-06 22:08:46 +02:00
James Morse 625fe4f8ff ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
arm_cpuidle_suspend() may return -EOPNOTSUPP, or any value returned
by the cpu_ops/cpuidle_ops suspend call. arm_enter_idle_state() doesn't
update 'ret' with this value, meaning we always signal success to
cpuidle_enter_state(), causing it to update the usage counters as if we
succeeded.

Fixes: 191de17aa3 ("ARM64: cpuidle: Replace cpu_suspend by the common ARM/ARM64 function")
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 4.1+ <stable@vger.kernel.org> # 4.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-28 15:15:14 +02:00
Daniel Lezcano e93e59ce5b cpuidle: Replace ktime_get() with local_clock()
The ktime_get() can have a non negligeable overhead, use local_clock()
instead.

In order to test the difference between ktime_get() and local_clock(),
a quick hack has been added to trigger, via debugfs, 10000 times a
call to ktime_get() and local_clock() and measure the elapsed time.

Then the average value, the min and max is computed for each call.

From userspace, the test above was called 100 times every 2 seconds.

So, ktime_get() and local_clock() have been called 1000000 times in
total.

The results are:

ktime_get():
============
 * average: 101 ns (stddev: 27.4)
 * maximum: 38313 ns
 * minimum: 65 ns

local_clock():
==============
 * average: 60 ns (stddev: 9.8)
 * maximum: 13487 ns
 * minimum: 46 ns

The local_clock() is faster and more stable.

Even if it is a drop in the ocean, changing the ktime_get() by the
local_clock() allows to save 80ns at idle time (entry + exit). And
in some circumstances, especially when there are several CPUs racing
for the clock access, we save tens of microseconds.

The idle duration resulting from a diff is converted from nanosec to
microsec. This could be done with integer division (div 1000) - which is
an expensive operation or by 10 bits shifting (div 1024) - which is fast
but unprecise.

The following table gives some results at the limits.

 ------------------------------------------
|   nsec   |   div(1000)   |   div(1024)   |
 ------------------------------------------
|   1e3    |        1 usec |      976 nsec |
 ------------------------------------------
|   1e6    |     1000 usec |      976 usec |
 ------------------------------------------
|   1e9    |  1000000 usec |   976562 usec |
 ------------------------------------------

There is a linear deviation of 2.34%. This loss of precision is acceptable
in the context of the resulting diff which is used for statistics. These
ones are processed to guess estimate an approximation of the duration of the
next idle period which ends up into an idle state selection. The selection
criteria takes into account the next duration based on large intervals,
represented by the idle state's target residency.

The 2^10 division is enough because the approximation regarding the 1e3
division is lost in all the approximations done for the next idle duration
computation.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-26 02:38:49 +02:00
Dave Gerlach c998c07836 cpuidle: Indicate when a device has been unregistered
Currently the 'registered' member of the cpuidle_device struct is set
to 1 during cpuidle_register_device. In this same function there are
checks to see if the device is already registered to prevent duplicate
calls to register the device, but this value is never set to 0 even on
unregister of the device. Because of this, any attempt to call
cpuidle_register_device after a call to cpuidle_unregister_device will
fail which shouldn't be the case.

To prevent this, set registered to 0 when the device is unregistered.

Fixes: c878a52d3c (cpuidle: Check if device is already registered)
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-09 02:14:39 +02:00
Rafael J. Wysocki 0c313cb207 cpuidle: menu: Fall back to polling if next timer event is near
Commit a9ceb78bc7 (cpuidle,menu: use interactivity_req to disable
polling) changed the behavior of the fallback state selection part
of menu_select() so it looks at interactivity_req instead of
data->next_timer_us when it makes its decision.  That effectively
caused polling to be used more often as fallback idle which led to
significant increases of energy consumption in some cases.

Commit e132b9b3bc (cpuidle: menu: use high confidence factors
only when considering polling) changed that logic again to be more
predictable, but that didn't help with the increased energy
consumption problem.

For this reason, go back to making decisions on which state to fall
back to based on data->next_timer_us which is the time we know for
sure something will happen rather than a prediction (which may be
inaccurate and turns out to be so often enough to be problematic).
However, take the target residency of the first proper idle state
(C1) into account, so that state is not used as the fallback one
if its target residency is greater than data->next_timer_us.

Fixes: a9ceb78bc7 (cpuidle,menu: use interactivity_req to disable polling)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-and-tested-by: Doug Smythies <dsmythies@telus.net>
2016-03-21 15:50:28 +01:00
Rik van Riel e132b9b3bc cpuidle: menu: use high confidence factors only when considering polling
The menu governor uses five different factors to pick the
idle state:
 - the user configured latency_req
 - the time until the next timer (next_timer_us)
 - the typical sleep interval, as measured recently
 - an estimate of sleep time by dividing next_timer_us by an observed factor
 - a load corrected version of the above, divided again by load

Only the first three items are known with enough confidence that
we can use them to consider polling, instead of an actual CPU
idle state, because the cost of being wrong about polling can be
excessive power use.

The latter two are used in the menu governor's main selection
loop, and can result in choosing a shallower idle state when
the system is expected to be busy again soon.

This pushes a busy system in the "performance" direction of
the performance<>power tradeoff, when choosing between idle
states, but stays more strictly on the "power" state when
deciding between polling and C1.

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-17 02:40:32 +01:00
Rasmus Villemoes 3b99669b75 cpuidle: menu: help gcc generate slightly better code
We know that the avg variable actually ends up holding a 32 bit
quantity, since it's an average of such numbers. It is only a u64
because it is temporarily used to hold the sum. Making it an actual
u32 allows gcc to generate slightly better code, e.g. when computing
the square, it can do a 32x32->64 multiply.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-02-17 00:28:15 +01:00
Rasmus Villemoes 7024b18ca4 cpuidle: menu: avoid expensive square root computation
Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -> 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance <=
U64_MAX/36).

On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev <= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-02-17 00:27:16 +01:00
Rafael J. Wysocki ad1ac94767 Merge branches 'pm-cpuidle', 'pm-cpufreq', 'pm-domains' and 'pm-sleep'
* pm-cpuidle:
  cpuidle: coupled: remove unused define cpuidle_coupled_lock
  cpuidle: fix fallback mechanism for suspend to idle in absence of enter_freeze

* pm-cpufreq:
  cpufreq: cpufreq-dt: avoid uninitialized variable warnings:
  cpufreq: pxa2xx: fix pxa_cpufreq_change_voltage prototype
  cpufreq: Use list_is_last() to check last entry of the policy list
  cpufreq: Fix NULL reference crash while accessing policy->governor_data

* pm-domains:
  PM / Domains: Fix typo in comment
  PM / Domains: Fix potential deadlock while adding/removing subdomains
  PM / domains: fix lockdep issue for all subdomains

* pm-sleep:
  PM: APM_EMULATION does not depend on PM
2016-01-29 21:45:17 +01:00
Anders Roxell 75274b33e7 cpuidle: coupled: remove unused define cpuidle_coupled_lock
This was found with the -RT patch enabled, but the fix should apply to
non-RT also.

Used multi_v7_defconfig+PREEMPT_RT_FULL=y and this caused a compilation
warning without this fix:
../drivers/cpuidle/coupled.c:122:21: warning: 'cpuidle_coupled_lock'
defined but not used [-Wunused-variable]

Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-27 23:08:46 +01:00
Sudeep Holla 6f16886b7c cpuidle: fix fallback mechanism for suspend to idle in absence of enter_freeze
Commit 51164251f5 "sched / idle: Drop default_idle_call() fallback
from call_cpuidle()" made find_deepest_state() return non-negative
value and check all the states with index > 0.  Also as a result,
find_deepest_state() returns 0 even when enter_freeze callbacks are not
implemented and enter_freeze_proper() is called which ends up crashing
the kernel.

This patch updates the check for index > 0 in cpuidle_enter_freeze and
cpuidle_idle_call(when idle_should_freeze is true) to restore the
suspend-to-idle functionality in absence of enter_freeze callback.

Fixes: 51164251f5 "sched / idle: Drop default_idle_call() fallback from call_cpuidle()"
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-22 02:35:49 +01:00
Linus Torvalds 30f05309bd More power management and ACPI updates for v4.5-rc1
- Modify the driver core and the USB subsystem to allow USB devices
    to stay suspended over system suspend/resume cycles if they have
    been runtime-suspended already beforehand and fix some bugs on
    top of these changes (Tomeu Vizoso, Rafael Wysocki).
 
  - Update ACPICA to upstream revision 20160108, including updates
    of the ACPICA's copyright notices, a code fixup resulting from
    a regression fix that was necessary in the upstream code only
    (the regression fixed by it has never been present in Linux)
    and a compiler warning fix (Bob Moore, Lv Zheng).
 
  - Fix a recent regression in the cpuidle menu governor that broke
    it on practically all architectures other than x86 and make a
    couple of optimizations on top of that fix (Rafael Wysocki).
 
  - Clean up the selection of cpuidle governors depending on whether
    or not the kernel is configured for tickless systems (Jean Delvare).
 
  - Revert a recent commit that introduced a regression in the ACPI
    backlight driver, address the problem it attempted to fix in a
    different way and revert one more cosmetic change depending on
    the problematic commit (Hans de Goede).
 
  - Add two more ACPI backlight quirks (Hans de Goede).
 
  - Fix a few minor problems in the core devfreq code, clean it up
    a bit and update the MAINTAINERS information related to it
    (Chanwoo Choi, MyungJoo Ham).
 
  - Improve an error message in the ACPI fan driver (Andy Lutomirski).
 
  - Fix a recent build regression in the cpupower tool (Shreyas Prabhu).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJWoCQ4AAoJEILEb/54YlRxLscQALEFVKSRnNaco72OqqRZs9Bu
 1RI6TgHTpZxR+Ef0+QWqE1QMnDwfImGhKDbSRm/t3S2sMYYZbAOL8cu4y6GmkBv4
 bOon/f9WEoPlQCFoo/6U4u8H45rNT5W9zX5+Bva8x+4Wu3n2J1QdvirnS5JHeHe1
 o6tGLaHuZXSwX8SLnCk8gJYK1VhATxbubJtpcVtvlnAhO11qUAwsscCrkUmB60i7
 5hLyrZb06hoa/hZVcIefGFuSd9qPhzDMQE2M20EohQ7UVkNJQdY9QNHMqCk2P42T
 nMWCNSwGnwfiO1p9ByXqunOFBCmyL7P+KV/DHsz6TFCVjz+jeG53Kqey9SkSJ/2W
 iaAE80K9MfOMvg8j7rib6fTn5uXBwRfqdeUDF/Hr64QqJoRn3R2LX4HmZe4L8ufb
 zA1rece67o8FD+7p7GkNbT3rPV/kA62tn/moFk446X5N+b261Kz90t1DVci8kRVf
 k+1gcvEdqO0GPpEHoirfXrBvQFixqkXakKj4r2aAob/DldQeLX7CkOUuRRJ1ykec
 bxwI9R0v8MlVe5rDxg+rPB0I9EFxRDmxqxpU5j0MRWxKnMRzLvBtHuk8YNVS/eU1
 xwyJOGcwF6yI0PaCFggPqmhebSrWLE7wJxaK+3bC+yiDTvHYPjB+4MfQrmkRAwwM
 azgb+ZgXDYx5wXeb8EjB
 =bKJ9
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management and ACPI updates from Rafael Wysocki:
 "This includes fixes on top of the previous batch of PM+ACPI updates
  and some new material as well.

  From the new material perspective the most significant are the driver
  core changes that should allow USB devices to stay suspended over
  system suspend/resume cycles if they have been runtime-suspended
  already beforehand.  Apart from that, ACPICA is updated to upstream
  revision 20160108 (cosmetic mostly, but including one fixup on top of
  the previous ACPICA update) and there are some devfreq updates the
  didn't make it before (due to timing).

  A few recent regressions are fixed, most importantly in the cpuidle
  menu governor and in the ACPI backlight driver and some x86 platform
  drivers depending on it.

  Some more bugs are fixed and cleanups are made on top of that.

  Specifics:

   - Modify the driver core and the USB subsystem to allow USB devices
     to stay suspended over system suspend/resume cycles if they have
     been runtime-suspended already beforehand and fix some bugs on top
     of these changes (Tomeu Vizoso, Rafael Wysocki).

   - Update ACPICA to upstream revision 20160108, including updates of
     the ACPICA's copyright notices, a code fixup resulting from a
     regression fix that was necessary in the upstream code only (the
     regression fixed by it has never been present in Linux) and a
     compiler warning fix (Bob Moore, Lv Zheng).

   - Fix a recent regression in the cpuidle menu governor that broke it
     on practically all architectures other than x86 and make a couple
     of optimizations on top of that fix (Rafael Wysocki).

   - Clean up the selection of cpuidle governors depending on whether or
     not the kernel is configured for tickless systems (Jean Delvare).

   - Revert a recent commit that introduced a regression in the ACPI
     backlight driver, address the problem it attempted to fix in a
     different way and revert one more cosmetic change depending on the
     problematic commit (Hans de Goede).

   - Add two more ACPI backlight quirks (Hans de Goede).

   - Fix a few minor problems in the core devfreq code, clean it up a
     bit and update the MAINTAINERS information related to it (Chanwoo
     Choi, MyungJoo Ham).

   - Improve an error message in the ACPI fan driver (Andy Lutomirski).

   - Fix a recent build regression in the cpupower tool (Shreyas
     Prabhu)"

* tag 'pm+acpi-4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
  cpuidle: menu: Avoid pointless checks in menu_select()
  sched / idle: Drop default_idle_call() fallback from call_cpuidle()
  cpupower: Fix build error in cpufreq-info
  cpuidle: Don't enable all governors by default
  cpuidle: Default to ladder governor on ticking systems
  time: nohz: Expose tick_nohz_enabled
  ACPICA: Update version to 20160108
  ACPICA: Silence a -Wbad-function-cast warning when acpi_uintptr_t is 'uintptr_t'
  ACPICA: Additional 2016 copyright changes
  ACPICA: Reduce regression fix divergence from upstream ACPICA
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Satellite R830
  ACPI / video: Revert "thinkpad_acpi: Use acpi_video_handles_brightness_key_presses()"
  ACPI / video: Document acpi_video_handles_brightness_key_presses() a bit
  ACPI / video: Fix using an uninitialized mutex / list_head in acpi_video_handles_brightness_key_presses()
  ACPI / video: Revert "ACPI / video: driver must be registered before checking for keypresses"
  ACPI / fan: Improve acpi_device_update_power error message
  ACPI / video: Add disable_backlight_sysfs_if quirk for the Toshiba Portege R700
  cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
  MAINTAINERS: Add devfreq-event entry
  MAINTAINERS: Add missing git repository and directory for devfreq
  ...
2016-01-20 19:06:49 -08:00
Rafael J. Wysocki 5bb1729cbd cpuidle: menu: Avoid pointless checks in menu_select()
If menu_select() cannot find a suitable state to return, it will
return the state index stored in data->last_state_idx.  This
means that it is pointless to look at the states whose indices
are less than or equal to data->last_state_idx in the main loop,
so don't do that.

Given that those checks are done on every idle state selection, this
change can save quite a bit of completely unnecessary overhead.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
2016-01-19 15:28:23 +01:00
Rafael J. Wysocki 51164251f5 sched / idle: Drop default_idle_call() fallback from call_cpuidle()
After commit 9c4b2867ed (cpuidle: menu: Fix menu_select() for
CPUIDLE_DRIVER_STATE_START == 0) it is clear that menu_select()
cannot return negative values.  Moreover, ladder_select_state()
will never return a negative value too, so make find_deepest_state()
return non-negative values too and drop the default_idle_call()
fallback from call_cpuidle().

This eliminates one branch from the idle loop and makes the governors
and find_deepest_state() handle the case when all states have been
disabled from sysfs consistently.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
2016-01-19 15:27:49 +01:00
Jean Delvare 10475b34f4 cpuidle: Don't enable all governors by default
Since commit d6f346f2d2 (cpuidle: improve governor Kconfig options)
the best cpuidle governor is selected automatically. There is little
point in additionally selecting the other one as it will not be used,
so don't select both governors by default.

Being able to select more than one governor is still good for
developers booting with cpuidle_sysfs_switch, though.

This fixes the second half of kernel BZ #65531.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=65531
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-15 22:39:58 +01:00
Jean Delvare 66a5f6b639 cpuidle: Default to ladder governor on ticking systems
The menu governor is currently the default on all systems. However the
documentation claims that the ladder governor is preferred on ticking
systems. So bump the rating of the ladder governor when NO_HZ is
disabled, or when booting with nohz=off.

This fixes the first half of kernel BZ #65531.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=65531
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-15 22:36:19 +01:00
Linus Torvalds f689b742f2 powerpc updates for 4.5
- Ground work for the new Power9 MMU from Aneesh Kumar K.V
  - Optimise FP/VMX/VSX context switching from Anton Blanchard
 
  - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica Gupta,
    Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling, Andrew Donnellan
  - Allow wrapper to work on non-english system from Laurent Vivier
  - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
  - Fix module autoload for rackmeter & axonram drivers from Luis de Bethencourt
  - Include KVM guest test in all interrupt vectors from Paul Mackerras
  - Fix DSCR inheritance over fork() from Anton Blanchard
  - Make value-returning atomics & {cmp}xchg* & their atomic_ versions fully ordered from Boqun Feng
  - Print MSR TM bits in oops messages from Michael Neuling
  - Add TM signal return & invalid stack selftests from Michael Neuling
  - Limit EPOW reset event warnings from Vipin K Parashar
  - Remove the Cell QPACE code from Rashmica Gupta
  - Append linux_banner to exception information in xmon from Rashmica Gupta
  - Add selftest to check if VSRs are corrupted from Rashmica Gupta
  - Remove broken GregorianDay() from Daniel Axtens
  - Import Anton's context_switch2 benchmark into selftests from Michael Ellerman
  - Add selftest script to test HMI functionality from Daniel Axtens
  - Remove obsolete OPAL v2 support from Stewart Smith
  - Make enter_rtas() private from Michael Ellerman
  - PPR exception cleanups from Michael Ellerman
  - Add page soft dirty tracking from Laurent Dufour
  - Add support for Nvlink NPUs from Alistair Popple
  - Add support for kexec on 476fpe from Alistair Popple
  - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
  - Copy only required pieces of the mm_context_t to the paca from Michael Neuling
  - Add a kmsg_dumper that flushes OPAL console output on panic from Russell Currey
  - Implement save_stack_trace_regs() to enable kprobe stack tracing from Steven Rostedt
  - Add HWCAP bits for Power9 from Michael Ellerman
  - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
  - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
  - scripts/recordmcount.pl: support data in text section on powerpc from Ulrich Weigand
  - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand
 
  - cxl: Fix possible idr warning when contexts are released from Vaibhav Jain
  - cxl: use correct operator when writing pcie config space values from Andrew Donnellan
  - cxl: Fix DSI misses when the context owning task exits from Vaibhav Jain
  - cxl: fix build for GCC 4.6.x from Brian Norris
  - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
  - cxl: Enable PCI device ID for future IBM CXL adapter from Uma Krishnan
 
  - Freescale updates from Scott: Highlights include moving QE code out of
    arch/powerpc (to be shared with arm), device tree updates, and minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWmIxeAAoJEFHr6jzI4aWAA+cQAIXAw4WfVWJ2V4ZK+1eKfB57
 fdXG71PuXG+WYIWy71ly8keLHdzzD1NQ2OUB64bUVRq202nRgVc15ZYKRJ/FE/sP
 SkxaQ2AG/2kI2EflWshOi0Lu9qaZ+LMHJnszIqE/9lnGSB2kUI/cwsSXgziiMKXR
 XNci9v14SdDd40YV/6BSZXoxApwyq9cUbZ7rnzFLmz4hrFuKmB/L3LABDF8QcpH7
 sGt/YaHGOtqP0UX7h5KQTFLGe1OPvK6NWixSXeZKQ71ED6cho1iKUEOtBA9EZeIN
 QM5JdHFWgX8MMRA0OHAgidkSiqO38BXjmjkVYWoIbYz7Zax3ThmrDHB4IpFwWnk3
 l7WBykEXY7KEqpZzbh0GFGehZWzVZvLnNgDdvpmpk/GkPzeYKomBj7ZZfm3H1yGD
 BTHPwuWCTX+/K75yEVNO8aJO12wBg7DRl4IEwBgqhwU8ga4FvUOCJkm+SCxA1Dnn
 qlpS7qPwTXNIEfKMJcxp5X0KiwDY1EoOotd4glTN0jbeY5GEYcxe+7RQ302GrYxP
 zcc8EGLn8h6BtQvV3ypNHF5l6QeTW/0ZlO9c236tIuUQ5gQU39SQci7jQKsYjSzv
 BB1XdLHkbtIvYDkmbnr1elbeJCDbrWL9rAXRUTRyfuCzaFWTfZmfVNe8c8qwDMLk
 TUxMR/38aI7bLcIQjwj9
 =R5bX
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Core:
   - Ground work for the new Power9 MMU from Aneesh Kumar K.V
   - Optimise FP/VMX/VSX context switching from Anton Blanchard

  Misc:
   - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
     Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
     Andrew Donnellan
   - Allow wrapper to work on non-english system from Laurent Vivier
   - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
   - Fix module autoload for rackmeter & axonram drivers from Luis de
     Bethencourt
   - Include KVM guest test in all interrupt vectors from Paul Mackerras
   - Fix DSCR inheritance over fork() from Anton Blanchard
   - Make value-returning atomics & {cmp}xchg* & their atomic_ versions
     fully ordered from Boqun Feng
   - Print MSR TM bits in oops messages from Michael Neuling
   - Add TM signal return & invalid stack selftests from Michael Neuling
   - Limit EPOW reset event warnings from Vipin K Parashar
   - Remove the Cell QPACE code from Rashmica Gupta
   - Append linux_banner to exception information in xmon from Rashmica
     Gupta
   - Add selftest to check if VSRs are corrupted from Rashmica Gupta
   - Remove broken GregorianDay() from Daniel Axtens
   - Import Anton's context_switch2 benchmark into selftests from
     Michael Ellerman
   - Add selftest script to test HMI functionality from Daniel Axtens
   - Remove obsolete OPAL v2 support from Stewart Smith
   - Make enter_rtas() private from Michael Ellerman
   - PPR exception cleanups from Michael Ellerman
   - Add page soft dirty tracking from Laurent Dufour
   - Add support for Nvlink NPUs from Alistair Popple
   - Add support for kexec on 476fpe from Alistair Popple
   - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
   - Copy only required pieces of the mm_context_t to the paca from
     Michael Neuling
   - Add a kmsg_dumper that flushes OPAL console output on panic from
     Russell Currey
   - Implement save_stack_trace_regs() to enable kprobe stack tracing
     from Steven Rostedt
   - Add HWCAP bits for Power9 from Michael Ellerman
   - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
   - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
   - scripts/recordmcount.pl: support data in text section on powerpc
     from Ulrich Weigand
   - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand

  cxl:
   - cxl: Fix possible idr warning when contexts are released from
     Vaibhav Jain
   - cxl: use correct operator when writing pcie config space values
     from Andrew Donnellan
   - cxl: Fix DSI misses when the context owning task exits from Vaibhav
     Jain
   - cxl: fix build for GCC 4.6.x from Brian Norris
   - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
   - cxl: Enable PCI device ID for future IBM CXL adapter from Uma
     Krishnan

  Freescale:
   - Freescale updates from Scott: Highlights include moving QE code out
     of arch/powerpc (to be shared with arm), device tree updates, and
     minor fixes"

* tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
  powerpc/module: Handle R_PPC64_ENTRY relocations
  scripts/recordmcount.pl: support data in text section on powerpc
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  powerpc/mm: Fix _PAGE_PTE breaking swapoff
  cxl: Enable PCI device ID for future IBM CXL adapter
  cxl: use -Werror only with CONFIG_PPC_WERROR
  cxl: fix build for GCC 4.6.x
  powerpc: Add HWCAP bits for Power9
  powerpc/powernv: Reserve PE#0 on NPU
  powerpc/powernv: Change NPU PE# assignment
  powerpc/powernv: Fix update of NVLink DMA mask
  powerpc/powernv: Remove misleading comment in pci.c
  powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
  powerpc: Fix build break due to paca mm_context_t changes
  cxl: Fix DSI misses when the context owning task exits
  MAINTAINERS: Update Scott Wood's e-mail address
  powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
  powerpc: Fix style of self-test config prompts
  powerpc/powernv: Only delay opal_rtc_read() retry when necessary
  ...
2016-01-15 13:18:47 -08:00
Rafael J. Wysocki 9c4b2867ed cpuidle: menu: Fix menu_select() for CPUIDLE_DRIVER_STATE_START == 0
Commit a9ceb78bc7 (cpuidle,menu: use interactivity_req to disable
polling) exposed a bug in menu_select() causing it to return -1
on systems with CPUIDLE_DRIVER_STATE_START equal to zero, although
it should have returned 0.  As a result, idle states are not entered
by CPUs on those systems.

Namely, on the systems in question data->last_state_idx is initially
equal to -1 and the above commit modified the condition that would
have caused it to be changed to 0 to be less likely to trigger which
exposed the problem.  However, setting data->last_state_idx initially
to -1 doesn't make sense at all and on the affected systems it should
always be set to CPUIDLE_DRIVER_STATE_START (ie. 0) unconditionally,
so make that happen.

Fixes: a9ceb78bc7 (cpuidle,menu: use interactivity_req to disable polling)
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-01-14 23:24:22 +01:00
Stewart Smith e4d54f71d2 powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL
Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OPALv3 to
be cared about or supported by mainline kernels.

So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL
exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-17 22:40:54 +11:00
Paul Gortmaker 84599238ea drivers/cpuidle: make cpuidle-exynos.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

cpuidle/Kconfig.arm:config ARM_EXYNOS_CPUIDLE
cpuidle/Kconfig.arm:    bool "Cpu Idle Driver for the Exynos processors"

...meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modularity so that when reading the
driver there is no doubt it is builtin-only.

Since module_platform_driver() uses the same init level priority as
builtin_platform_driver() the init ordering remains unchanged with
this commit.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-15 00:22:22 +01:00
Paul Gortmaker fdc7d515ad drivers/cpuidle: make cpuidle-ux500.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

cpuidle/Kconfig.arm:config ARM_U8500_CPUIDLE
cpuidle/Kconfig.arm:    bool "Cpu Idle Driver for the ST-E u8500 processors"

...meaning that it currently is not being built as a module by anyone.

Lets remove the couple traces of modularity so that when reading the
driver there is no doubt it is builtin-only.

Since module_platform_driver() uses the same init level priority as
builtin_platform_driver() the init ordering remains unchanged with
this commit.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-15 00:22:22 +01:00
Paul Gortmaker 94e8057b84 drivers/cpuidle: make cpuidle-clps711x.c explicitly non-modular
The Kconfig currently controlling compilation of this code is:

drivers/cpuidle/Kconfig.arm:config ARM_CLPS711X_CPUIDLE
drivers/cpuidle/Kconfig.arm:    bool "CPU Idle Driver for CLPS711X processors"

...meaning that it currently is not being built as a module by anyone.

Lets remove the modular code that is essentially orphaned, so that
when reading the driver there is no doubt it is builtin-only.

Since module_platform_driver() uses the same init level priority as
builtin_platform_driver() the init ordering remains unchanged with
this commit.

Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

We also delete the MODULE_LICENSE tag etc. since all that information
is already contained at the top of the file in the comments.

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-12-15 00:22:21 +01:00
Rik van Riel efddfd90fb cpuidle,menu: smooth out measured_us calculation
The cpuidle state tables contain the maximum exit latency for each
cpuidle state. On x86, that is the exit latency for when the entire
package goes into that same idle state.

However, a lot of the time we only go into the core idle state,
not the package idle state. This means we see a much smaller exit
latency.

We have no way to detect whether we went into the core or package
idle state while idle, and that is ok.

However, the current menu_update logic does have the potential to
trip up the repeating pattern detection in get_typical_interval.
If the system is experiencing an exit latency near the idle state's
exit latency, some of the samples will have exit_us subtracted,
while others will not. This turns a repeating pattern into mush,
potentially breaking get_typical_interval.

Furthermore, for smaller sleep intervals, we know the chance that
all the cores in the package went to the same idle state are fairly
small. Dividing the measured_us by two, instead of subtracting the
full exit latency when hitting a small measured_us, will reduce the
error.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-17 02:24:25 +01:00
Rik van Riel a9ceb78bc7 cpuidle,menu: use interactivity_req to disable polling
The menu governor carefully figures out how much time we typically
sleep for an estimated sleep interval, or whether there is a repeating
pattern going on, and corrects that estimate for the CPU load.

Then it proceeds to ignore that information when determining whether
or not to consider polling. This is not a big deal on most x86 CPUs,
which have very low C1 latencies, and the patch should not have any
effect on those CPUs.

However, certain CPUs (eg. Atom) have much higher C1 latencies, and
it would be good to not waste performance and power on those CPUs if
we are expecting a very low wakeup latency.

Disable polling based on the estimated interactivity requirement, not
on the time to the next timer interrupt.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-17 02:24:25 +01:00
Rik van Riel 7884084f3b cpuidle,x86: increase forced cut-off for polling to 20us
The cpuidle menu governor has a forced cut-off for polling at 5us,
in order to deal with firmware that gives the OS bad information
on cpuidle states, leading to the system spending way too much time
in polling.

However, at least one x86 CPU family (Atom) has chips that have
a 20us break-even point for C1. Forcing the polling cut-off to
less than that wastes performance and power.

Increase the polling cut-off to 20us.

Systems with a lower C1 latency will be found in the states table by
the menu governor, which will pick those states as appropriate.

Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-17 02:24:24 +01:00
Russell King ab319939a5 cpuidle: mvebu: disable the bind/unbind attributes and use builtin_platform_driver
As the driver doesn't support unbinding, nor does it support arbitary
binding of devices, disable the bind/unbind attributes for this driver.
Also, as the driver has no remove function, it can never be modular,
so use builtin_platform_driver() to avoid the module exit boilerplate.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-10-23 12:40:48 +02:00
Russell King da1a64f80d cpuidle: mvebu: clean up multiple platform drivers
There's no need to use multiple platform drivers, especially when we
want to do something different in the probe, but we still use a common
probe function.

We can use the platform ID system to only register one platform driver,
but have it match several devices, and give us the CPU idle driver via
the ID's driver_data.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-10-23 12:40:47 +02:00
Linus Torvalds fa9a67ef9d Additional power management and ACPI material for v4.3-rc1
- Build fix for the new Mediatek MT8173 cpufreq driver (Guenter Roeck).
 
  - Generic power domains framework fixes (power on error code
    path, subdomain removal) and cleanup of a deprecated API user
    (Geert Uytterhoeven, Jon Hunter, Ulf Hansson).
 
  - cpufreq-dt driver fixes including two fixes for bugs related to
    the new Operating Performance Points Device Tree bindings
    introduced recently (Viresh Kumar).
 
  - Suspend frequency support for the cpufreq-dt driver
    (Bartlomiej Zolnierkiewicz, Viresh Kumar).
 
  - cpufreq core cleanups (Viresh Kumar).
 
  - intel_pstate driver fixes (Chen Yu, Kristen Carlson Accardi).
 
  - Additional sanity check in the cpuidle core (Xunlei Pang).
 
  - Fix for a comment related to CPU power management (Lina Iyer).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJV80XwAAoJEILEb/54YlRxIAkP/RrT4mDO9QZkocO+XYDBB739
 oi1aaSDmmVVKlFtFiyX4njS7S+U5UC8PYn5o/fFtVnP1kD9yuZkoak65U3t7bsKK
 +rjun1SzJeooaLXW4ZAJKyPldi/RrsfRJ2JOaEt2vK03ppjx2mouK1VpjNj+qmYd
 wwFb4q5S5Rcpph0Sx+xxhpZKoOgBmXk7nWIe5bWDxuORdx4yoxp4lus/P223wFJy
 r0kJnNyY55mnY4Yx5e3L8W1rMC/Sf6mt8pyMvqRpsKdBHaEbhmGsiERSi0eCZJwr
 AmL666TVgPk6zNM/yc3mFOTRZeAF3soSWF5R6+n2TswWvWFH+ksM14g9ndb7fTv3
 FUuOU3jyfSUj2T4xr0qUZLaQUFB3xNrnWBNnm/CzPwjJtH1u1SK8uBSRqdZnEdGq
 EeqwnwDLFeSAbG6lQVsKVVAPZHolDCnJPsvZebKdnJH09kjgUdm60BLOnif4QDgw
 ULPZ6NToFZkrQSMAX1Uqm42iv/qWcHIgS5x43jAmblY3TsuByqRrI7gBCy+hbGSN
 cQfeCxTdVPi/tUPYlJFmWn43dgEsnceSFKvlQRXGrZJM29BFvhVzJQjTFC8Oj79q
 OP/Rv28+VeV5J531BQu+w8cvg73GJ3DDab+i91ZkIsd7ZNjRYQKXBfR9TImTSD4l
 BPjvaPXwDg6EPhwjcpK/
 =C7+4
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management and ACPI updates from Rafael Wysocki:
 "These are mostly fixes and cleanups on top of the previous PM+ACPI
  pull request (cpufreq core and drivers, cpuidle, generic power domains
  framework).  Some of them didn't make to that pull request and some
  fix issues introduced by it.

  The only really new thing is the support for suspend frequency in the
  cpufreq-dt driver, but it is needed to fix an issue with Exynos
  platforms.

  Specifics:

   - build fix for the new Mediatek MT8173 cpufreq driver (Guenter
     Roeck).

   - generic power domains framework fixes (power on error code path,
     subdomain removal) and cleanup of a deprecated API user (Geert
     Uytterhoeven, Jon Hunter, Ulf Hansson).

   - cpufreq-dt driver fixes including two fixes for bugs related to the
     new Operating Performance Points Device Tree bindings introduced
     recently (Viresh Kumar).

   - suspend frequency support for the cpufreq-dt driver (Bartlomiej
     Zolnierkiewicz, Viresh Kumar).

   - cpufreq core cleanups (Viresh Kumar).

   - intel_pstate driver fixes (Chen Yu, Kristen Carlson Accardi).

   - additional sanity check in the cpuidle core (Xunlei Pang).

   - fix for a comment related to CPU power management (Lina Iyer)"

* tag 'pm+acpi-4.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_pstate: fix PCT_TO_HWP macro
  intel_pstate: Fix user input of min/max to legal policy region
  PM / OPP: Return suspend_opp only if it is enabled
  cpufreq-dt: add suspend frequency support
  cpufreq: allow cpufreq_generic_suspend() to work without suspend frequency
  PM / OPP: add dev_pm_opp_get_suspend_opp() helper
  staging: board: Migrate away from __pm_genpd_name_add_device()
  cpufreq: Use __func__ to print function's name
  cpufreq: staticize cpufreq_cpu_get_raw()
  PM / Domains: Ensure subdomain is not in use before removing
  cpufreq: Add ARM_MT8173_CPUFREQ dependency on THERMAL
  cpuidle/coupled: Add sanity check for safe_state_index
  PM / Domains: Try power off masters in error path of __pm_genpd_poweron()
  cpufreq: dt: Tolerance applies on both sides of target voltage
  cpufreq: dt: Print error on failing to mark OPPs as shared
  cpufreq: dt: Check OPP count before marking them shared
  kernel/cpu_pm: fix cpu_cluster_pm_exit comment
2015-09-11 19:11:06 -07:00
Rafael J. Wysocki 4614e0cc66 Merge branches 'pm-cpu', 'pm-cpuidle' and 'pm-domains'
* pm-cpu:
  kernel/cpu_pm: fix cpu_cluster_pm_exit comment

* pm-cpuidle:
  cpuidle/coupled: Add sanity check for safe_state_index

* pm-domains:
  staging: board: Migrate away from __pm_genpd_name_add_device()
  PM / Domains: Ensure subdomain is not in use before removing
  PM / Domains: Try power off masters in error path of __pm_genpd_poweron()
2015-09-11 15:37:36 +02:00
Linus Torvalds c706c7eb0d Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM development updates from Russell King:
 "Included in this update:

   - moving PSCI code from ARM64/ARM to drivers/

   - removal of some architecture internals from global kernel view

   - addition of software based "privileged no access" support using the
     old domains register to turn off the ability for kernel
     loads/stores to access userspace.  Only the proper accessors will
     be usable.

   - addition of early fixup support for early console

   - re-addition (and reimplementation) of OMAP special interconnect
     barrier

   - removal of finish_arch_switch()

   - only expose cpuX/online in sysfs if hotpluggable

   - a number of code cleanups"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (41 commits)
  ARM: software-based priviledged-no-access support
  ARM: entry: provide uaccess assembly macro hooks
  ARM: entry: get rid of multiple macro definitions
  ARM: 8421/1: smp: Collapse arch_cpu_idle_dead() into cpu_die()
  ARM: uaccess: provide uaccess_save_and_enable() and uaccess_restore()
  ARM: mm: improve do_ldrd_abort macro
  ARM: entry: ensure that IRQs are enabled when calling syscall_trace_exit()
  ARM: entry: efficiency cleanups
  ARM: entry: get rid of asm_trace_hardirqs_on_cond
  ARM: uaccess: simplify user access assembly
  ARM: domains: remove DOMAIN_TABLE
  ARM: domains: keep vectors in separate domain
  ARM: domains: get rid of manager mode for user domain
  ARM: domains: move initial domain setting value to asm/domains.h
  ARM: domains: provide domain_mask()
  ARM: domains: switch to keeping domain value in register
  ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE
  ARM: 8417/1: refactor bitops functions with BIT_MASK() and BIT_WORD()
  ARM: 8416/1: Feroceon: use of_iomap() to map register base
  ARM: 8415/1: early fixmap support for earlycon
  ...
2015-09-03 16:27:01 -07:00
Xunlei Pang abceaa9cde cpuidle/coupled: Add sanity check for safe_state_index
Since we are using cpuidle_driver::safe_state_index directly as the
target state index, it is better to add the sanity check at the point
of registering the driver.

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-09-03 03:05:47 +02:00
Linus Torvalds ae98207309 Power management and ACPI material for v4.3-rc1
- ACPICA update to upstream revision 20150818 including method
    tracing extensions to allow more in-depth AML debugging in the
    kernel and a number of assorted fixes and cleanups (Bob Moore,
    Lv Zheng, Markus Elfring).
 
  - ACPI sysfs code updates and a documentation update related to
    AML method tracing (Lv Zheng).
 
  - ACPI EC driver fix related to serialized evaluations of _Qxx
    methods and ACPI tools updates allowing the EC userspace tool
    to be built from the kernel source (Lv Zheng).
 
  - ACPI processor driver updates preparing it for future
    introduction of CPPC support and ACPI PCC mailbox driver
    updates (Ashwin Chaugule).
 
  - ACPI interrupts enumeration fix for a regression related
    to the handling of IRQ attribute conflicts between MADT
    and the ACPI namespace (Jiang Liu).
 
  - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar).
 
  - ACPI device registration code reorganization to separate the
    sysfs-related code and bus type operations from the rest (Rafael
    J Wysocki).
 
  - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
    Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).
 
  - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups
    (Pan Xinhui, Rafael J Wysocki).
 
  - cpufreq core cleanups on top of the previous changes allowing it
    to preseve its sysfs directories over system suspend/resume (Viresh
    Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).
 
  - cpufreq fixes and cleanups related to governors (Viresh Kumar).
 
  - cpufreq updates (core and the cpufreq-dt driver) related to the
    turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).
 
  - New DT bindings for Operating Performance Points (OPP), support
    for them in the OPP framework and in the cpufreq-dt driver plus
    related OPP framework fixes and cleanups (Viresh Kumar).
 
  - cpufreq powernv driver updates (Shilpasri G Bhat).
 
  - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).
 
  - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
    and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).
 
  - intel_pstate driver updates including Skylake-S support, support
    for enabling HW P-states per CPU and an additional vendor bypass
    list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).
 
  - cpuidle core fixes related to the handling of coupled idle states
    (Xunlei Pang).
 
  - intel_idle driver updates including Skylake Client support and
    support for freeze-mode-specific idle states (Len Brown).
 
  - Driver core updates related to power management (Andy Shevchenko,
    Rafael J Wysocki).
 
  - Generic power domains framework fixes and cleanups (Jon Hunter,
    Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).
 
  - Device PM QoS framework update to allow the latency tolerance
    setting to be exposed to user space via sysfs (Mika Westerberg).
 
  - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
    exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).
 
  - System sleep support updates (Alan Stern, Len Brown, SungEun Kim).
 
  - rockchip-io AVS support updates (Heiko Stuebner).
 
  - PM core clocks support fixup (Colin Ian King).
 
  - Power capping RAPL driver update including support for Skylake H/S
    and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).
 
  - Generic device properties framework fixes related to the handling
    of static (driver-provided) property sets (Andy Shevchenko).
 
  - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
    Shreyas B Prabhu).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJV5hhGAAoJEILEb/54YlRxs+EQAK51iFk48+IbpHYaZZ50Yo4m
 ZZc2zBcbwRcBlU9vKERrhG+jieSl8J/JJNxT8vBjKqyvNw038mCjewQh02ol0HuC
 R7nlDiVJkmZ50sLO4xwE/1UBZr/XqbddwCUnYzvFMkMTA0ePzFtf8BrJ1FXpT8S/
 fkwSXQty6hvJDwxkfrbMSaA730wMju9lahx8D6MlmUAedWYZOJDMQKB4WKa/St5X
 9uckBPHUBB2KiKlXxdbFPwKLNxHvLROq5SpDLc6cM/7XZB+QfNFy85CUjCUtYo1O
 1W8k0qnztvZ6UEv27qz5dejGyAGOarMWGGNsmL9evoeGeHRpQL+dom7HcTnbAfUZ
 walyhYSm/zKkdy7Vl3xWUUQkMG48+PviMI6K0YhHXb3Rm5wlR/yBNZTwNIty9SX/
 fKCHEa8QynWwLxgm53c3xRkiitJxMsHNK03moLD9zQMjshTyTNvpNbZoahyKQzk6
 H+9M1DBRHhkkREDWSwGutukxfEMtWe2vcZcyERrFiY7l5k1j58DwDBMPqjPhRv6q
 P/1NlCzr0XYf83Y86J18LbDuPGDhTjjIEn6CqbtI2mmWqTg3+rF7zvS2ux+FzMnA
 gisv8l6GT9JiWhxKFqqL/rrVpwtyHebWLYE/RpNUW6fEzLziRNj1qyYO9dqI/GGi
 I3rfxlXoc/5xJWCgNB8f
 =fTgI
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "From the number of commits perspective, the biggest items are ACPICA
  and cpufreq changes with the latter taking the lead (over 50 commits).

  On the cpufreq front, there are many cleanups and minor fixes in the
  core and governors, driver updates etc.  We also have a new cpufreq
  driver for Mediatek MT8173 chips.

  ACPICA mostly updates its debug infrastructure and adds a number of
  fixes and cleanups for a good measure.

  The Operating Performance Points (OPP) framework is updated with new
  DT bindings and support for them among other things.

  We have a few updates of the generic power domains framework and a
  reorganization of the ACPI device enumeration code and bus type
  operations.

  And a lot of fixes and cleanups all over.

  Included is one branch from the MFD tree as it contains some
  PM-related driver core and ACPI PM changes a few other commits are
  based on.

  Specifics:

   - ACPICA update to upstream revision 20150818 including method
     tracing extensions to allow more in-depth AML debugging in the
     kernel and a number of assorted fixes and cleanups (Bob Moore, Lv
     Zheng, Markus Elfring).

   - ACPI sysfs code updates and a documentation update related to AML
     method tracing (Lv Zheng).

   - ACPI EC driver fix related to serialized evaluations of _Qxx
     methods and ACPI tools updates allowing the EC userspace tool to be
     built from the kernel source (Lv Zheng).

   - ACPI processor driver updates preparing it for future introduction
     of CPPC support and ACPI PCC mailbox driver updates (Ashwin
     Chaugule).

   - ACPI interrupts enumeration fix for a regression related to the
     handling of IRQ attribute conflicts between MADT and the ACPI
     namespace (Jiang Liu).

   - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi
     Kasagar).

   - ACPI device registration code reorganization to separate the
     sysfs-related code and bus type operations from the rest (Rafael J
     Wysocki).

   - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
     Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).

   - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan
     Xinhui, Rafael J Wysocki).

   - cpufreq core cleanups on top of the previous changes allowing it to
     preseve its sysfs directories over system suspend/resume (Viresh
     Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).

   - cpufreq fixes and cleanups related to governors (Viresh Kumar).

   - cpufreq updates (core and the cpufreq-dt driver) related to the
     turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).

   - New DT bindings for Operating Performance Points (OPP), support for
     them in the OPP framework and in the cpufreq-dt driver plus related
     OPP framework fixes and cleanups (Viresh Kumar).

   - cpufreq powernv driver updates (Shilpasri G Bhat).

   - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).

   - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
     and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).

   - intel_pstate driver updates including Skylake-S support, support
     for enabling HW P-states per CPU and an additional vendor bypass
     list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).

   - cpuidle core fixes related to the handling of coupled idle states
     (Xunlei Pang).

   - intel_idle driver updates including Skylake Client support and
     support for freeze-mode-specific idle states (Len Brown).

   - Driver core updates related to power management (Andy Shevchenko,
     Rafael J Wysocki).

   - Generic power domains framework fixes and cleanups (Jon Hunter,
     Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).

   - Device PM QoS framework update to allow the latency tolerance
     setting to be exposed to user space via sysfs (Mika Westerberg).

   - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
     exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).

   - System sleep support updates (Alan Stern, Len Brown, SungEun Kim).

   - rockchip-io AVS support updates (Heiko Stuebner).

   - PM core clocks support fixup (Colin Ian King).

   - Power capping RAPL driver update including support for Skylake H/S
     and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).

   - Generic device properties framework fixes related to the handling
     of static (driver-provided) property sets (Andy Shevchenko).

   - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
     Shreyas B Prabhu)"

* tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (180 commits)
  cpufreq: speedstep-lib: Use monotonic clock
  cpufreq: powernv: Increase the verbosity of OCC console messages
  cpufreq: sfi: use kmemdup rather than duplicating its implementation
  cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor()
  cpufreq: rename cpufreq_real_policy as cpufreq_user_policy
  cpufreq: remove redundant 'policy' field from user_policy
  cpufreq: remove redundant 'governor' field from user_policy
  cpufreq: update user_policy.* on success
  cpufreq: use memcpy() to copy policy
  cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event
  cpufreq: mediatek: Add MT8173 cpufreq driver
  dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings
  PM / Domains: Fix typo in description of genpd_dev_pm_detach()
  PM / Domains: Remove unusable governor dummies
  PM / Domains: Make pm_genpd_init() available to modules
  PM / domains: Align column headers and data in pm_genpd_summary output
  powercap / RAPL: disable the 2nd power limit properly
  tools: cpupower: Fix error when running cpupower monitor
  PM / OPP: Drop unlikely before IS_ERR(_OR_NULL)
  PM / OPP: Fix static checker warning (broken 64bit big endian systems)
  ...
2015-09-01 19:45:46 -07:00
Linus Torvalds a1d8561172 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The biggest change in this cycle is the rewrite of the main SMP load
  balancing metric: the CPU load/utilization.  The main goal was to make
  the metric more precise and more representative - see the changelog of
  this commit for the gory details:

    9d89c257df ("sched/fair: Rewrite runnable load and utilization average tracking")

  It is done in a way that significantly reduces complexity of the code:

    5 files changed, 249 insertions(+), 494 deletions(-)

  and the performance testing results are encouraging.  Nevertheless we
  need to keep an eye on potential regressions, since this potentially
  affects every SMP workload in existence.

  This work comes from Yuyang Du.

  Other changes:

   - SCHED_DL updates.  (Andrea Parri)

   - Simplify architecture callbacks by removing finish_arch_switch().
     (Peter Zijlstra et al)

   - cputime accounting: guarantee stime + utime == rtime.  (Peter
     Zijlstra)

   - optimize idle CPU wakeups some more - inspired by Facebook server
     loads.  (Mike Galbraith)

   - stop_machine fixes and updates.  (Oleg Nesterov)

   - Introduce the 'trace_sched_waking' tracepoint.  (Peter Zijlstra)

   - sched/numa tweaks.  (Srikar Dronamraju)

   - misc fixes and small cleanups"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  sched/deadline: Fix comment in enqueue_task_dl()
  sched/deadline: Fix comment in push_dl_tasks()
  sched: Change the sched_class::set_cpus_allowed() calling context
  sched: Make sched_class::set_cpus_allowed() unconditional
  sched: Fix a race between __kthread_bind() and sched_setaffinity()
  sched: Ensure a task has a non-normalized vruntime when returning back to CFS
  sched/numa: Fix NUMA_DIRECT topology identification
  tile: Reorganize _switch_to()
  sched, sparc32: Update scheduler comments in copy_thread()
  sched: Remove finish_arch_switch()
  sched, tile: Remove finish_arch_switch
  sched, sh: Fold finish_arch_switch() into switch_to()
  sched, score: Remove finish_arch_switch()
  sched, avr32: Remove finish_arch_switch()
  sched, MIPS: Get rid of finish_arch_switch()
  sched, arm: Remove finish_arch_switch()
  sched/fair: Clean up load average references
  sched/fair: Provide runnable_load_avg back to cfs_rq
  sched/fair: Remove task and group entity load when they are dead
  sched/fair: Init cfs_rq's sched_entity load average
  ...
2015-08-31 20:26:22 -07:00
Xunlei Pang 4c1ed5a607 cpuidle/coupled: Remove redundant 'dev' argument of cpuidle_state_is_coupled()
For cpuidle_state_is_coupled(), 'dev' is not used, so remove it.

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-08-28 15:14:54 +02:00
Xunlei Pang ba6a860d41 cpuidle/coupled: Remove cpuidle_device::safe_state_index
cpuidle_device::safe_state_index need to be initialized before
use, it should be the same as cpuidle_driver::safe_state_index.

We tackled this issue by removing the safe_state_index from the
cpuidle_device structure and use the one in the cpuidle_driver
structure instead.

Suggested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-08-28 15:14:54 +02:00
Mark Rutland be120397e7 ARM: migrate to common PSCI client code
Now that the common PSCI client code has been factored out to
drivers/firmware, and made safe for 32-bit use, move the 32-bit ARM code
over to it. This results in a moderate reduction of duplicated lines,
and will prevent further duplication as the PSCI client code is updated
for PSCI 1.0 and beyond.

The two legacy platform users of the PSCI invocation code are updated to
account for interface changes. In both cases the power state parameter
(which is constant) is now generated using macros, so that the
pack/unpack logic can be killed in preparation for PSCI 1.0 power state
changes.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2015-08-03 15:38:39 +01:00
Lucas Stach 63caae8480 sched/idle: Move latency tracing stop/start calls deeper inside the idle loop
Make sure to stop tracing only once we are past a point where
all latency tracing events have been processed (irqs are not
enabled again). This has the slight advantage of capturing more
latency related events in the idle path, but most importantly it
makes sure that latency tracing doesn't get re-enabled
inadvertently when new events are coming in.

This makes the irqsoff latency tracer useful again, as we stop
capturing CPU sleep time as IRQ latency.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel@pengutronix.de
Cc: patchwork-lst@pengutronix.de
Link: http://lkml.kernel.org/r/1437410090-3747-1-git-send-email-l.stach@pengutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-07-21 08:18:51 +02:00
Rafael J. Wysocki ae0afb4f5d suspend-to-idle: Prevent RCU from complaining about tick_freeze()
Put tick_freeze() under RCU_NONIDLE() to prevent RCU from complaining
about suspicious RCU usage in idle by trace_suspend_resume() called
from there.

While at it, fix a comment related to another usage of RCU_NONIDLE()
in enter_freeze_proper().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-09 22:59:49 +02:00
Linus Torvalds 75462c8a87 Replace module_platform_driver with builtin_platform driver in non modules.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVkO9lAAoJEOvOhAQsB9HWRV4P/jYrQm/S14ZfbwzqwV2w5xh+
 E1SHk+kjcLyIvG6JXknp8mlNFGFhsIZNhTq8wvYBmFHlkop9jlMqT3IwaV7y9baV
 NmxltPHVIgFhnPMBF6+nvMJVFe0oBXh3adwc02h/LcXauEPK98Na/BtAfX5nxmoy
 DO/9R+R3SxqShSHvQqM6JNu3M/xAxU7RRSMsthF3nZJfZEm5i7Sl9w6Zcmu67gEn
 KbAPmthHSzDvJZGPt6xQiR2OPvhdA2Ddxjey0/cLyl/IVd2DdUTUUHDY0lUpPd3A
 Ba6C6OaWoHbCoAVzGvXEJLP1CfuF5upTmo53FZ2+1fERzX7Co4E2xInq6qkpWK5+
 cLcqCZaxHXvmvmidrfTaJQ52dLseGAH5KsiDoR8m5RcsCMrK367V6ja5/A2UG+xW
 FVJzU7/1LRHzw17si/AcrD0Q3hFR0n6klEGS3E964fsyOuCYlSc77IspxZ7nF4QW
 cFKKweyAUdrmrlduS7rKxX4z/ne4ljbR1M82YxFVPWqg/n2cqQ4e9RQFeK8ogBe6
 ASXu6pmz03X5xoD7xPQEsVzjDDGPzGFdD/601j9cRJ0+TR9udECP776gXt+5Ml0L
 jWlhVGbt7BN64UFZ/kInGo1h6cS+JjlrBfNq6eZVQP78bZ5UWdyiupGzcLcixefN
 bnkl2MHHY/d6yk2Rs7zh
 =WLBw
 -----END PGP SIGNATURE-----

Merge tag 'module-builtin_driver-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux

Pull module_platform_driver replacement from Paul Gortmaker:
 "Replace module_platform_driver with builtin_platform driver in non
  modules.

  We see an increasing number of non-modular drivers using
  modular_driver() type register functions.  There are several downsides
  to letting this continue unchecked:

   - The code can appear modular to a reader of the code, and they won't
     know if the code really is modular without checking the Makefile
     and Kconfig to see if compilation is governed by a bool or
     tristate.

   - Coders of drivers may be tempted to code up an __exit function that
     is never used, just in order to satisfy the required three args of
     the modular registration function.

   - Non-modular code ends up including the <module.h> which increases
     CPP overhead that they don't need.

   - It hinders us from performing better separation of the module init
     code and the generic init code.

  So here we introduce similar macros for builtin drivers.  Then we
  convert builtin drivers (controlled by a bool Kconfig) by making the
  following type of mapping:

    module_platform_driver()       --->  builtin_platform_driver()
    module_platform_driver_probe() --->  builtin_platform_driver_probe().

  The set of drivers that are converted here are just the ones that
  showed up as relying on an implicit include of <module.h> during a
  pending header cleanup.  So we convert them here vs adding an include
  of <module.h> to non-modular code to avoid compile fails.  Additonal
  conversions can be done asynchronously at any time.

  Once again, an unused module_exit function that is removed here
  appears in the diffstat as an outlier wrt all the other changes"

* tag 'module-builtin_driver-v4.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  drivers/clk: convert sunxi/clk-mod0.c to use builtin_platform_driver
  drivers/power: Convert non-modular syscon-reboot to use builtin_platform_driver
  drivers/soc: Convert non-modular soc-realview to use builtin_platform_driver
  drivers/soc: Convert non-modular tegra/pmc to use builtin_platform_driver
  drivers/cpufreq: Convert non-modular s5pv210-cpufreq.c to use builtin_platform_driver
  drivers/cpuidle: Convert non-modular drivers to use builtin_platform_driver
  drivers/platform: Convert non-modular pdev_bus to use builtin_platform_driver
  platform_device: better support builtin boilerplate avoidance
2015-07-02 10:42:13 -07:00
Linus Torvalds 5c3950970b Power management and ACPI fixes for v4.2-rc1
- Fix a recently added memory leak in an error path in the ACPI
    resources management code (Dan Carpenter).
 
  - Fix a build warning triggered by an ACPI video header function
    that should be static inline (Borislav Petkov).
 
  - Change names of helper function converting struct fwnode_handle
    pointers to either struct device_node or struct acpi_device
    pointers so they don't conflict with local variable names
    (Alexander Sverdlin).
 
  - Make the hibernate core re-enable nonboot CPUs on failures to
    disable them as expected (Vitaly Kuznetsov).
 
  - Increase the default timeout of the device suspend watchdog to
    prevent it from triggering too early on some systems (Takashi Iwai).
 
  - Prevent the cpuidle powernv driver from registering idle
    states with CPUIDLE_FLAG_TIMER_STOP set if CONFIG_TICK_ONESHOT
    is unset which leads to boot hangs (Preeti U Murthy).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJVksL4AAoJEILEb/54YlRxY9cP/jnXE12Jv2aYQAram5Fd7nY7
 LSiuKAzVCqJbdBa3sRILKjMwgxciYABXypw6Zapa8TimAV374GZh6W4VXgIAifDf
 gdicSSxB4A5cViUEte3uebzdaM2QMcOcQ6A+UOe849q3emfbu91f0LXTWwahR0og
 Hjs7QCcvZS/swQIIY0JaIivC5mRwxrx141oPVn4GNf0tnOzH7eOUktYCZwh4U4Qo
 FwUn66XI+ttlrRxs/IV5QSQ9S5qDBHOKdv6MgmGwMzUXINTL4w2Zawg+Pw0m3Puf
 2VnT9jsz4anD1jZMJGLpeKNAjFJZ6ODiv06HjYhscaUZuMSECUiE9f6auUkiSU1F
 r463ujaIcCW5MWUGjRBq5GwP15IuIL/NxjWAVyyMxeYcjOKrpQeEJ2liWnXPv/Bh
 JVOoktFvFu1iOojlWfwnaC83bjZiJIJ6BFEywIm7l0wD1VAmk8IFepMqIrTp4U9R
 ptKD8Go9GripftHryqSQA5wgp64hIQKxSHi+fEM6RfcLskXVKxEYSButva3wnFHx
 2lQSxro42jsaP8O9T1wmb+br0h3efPeo9qyewmjld+vISeUhmsCILta55VtXI0wu
 ektWoAuaA97d+u+mTee4U2kONXo+yUuXe0YqlCb3x/7m8oMg6hoblWZPnWFSeqeH
 T/vI5cRS+a3NqAjctwmX
 =vkde
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI fixes from Rafael Wysocki:
 "These are fixes that didn't make it to the previous PM+ACPI pull
  request or are fixing issues introduced by it.

  Specifics:

   - Fix a recently added memory leak in an error path in the ACPI
     resources management code (Dan Carpenter)

   - Fix a build warning triggered by an ACPI video header function that
     should be static inline (Borislav Petkov)

   - Change names of helper function converting struct fwnode_handle
     pointers to either struct device_node or struct acpi_device
     pointers so they don't conflict with local variable names
     (Alexander Sverdlin)

   - Make the hibernate core re-enable nonboot CPUs on failures to
     disable them as expected (Vitaly Kuznetsov)

   - Increase the default timeout of the device suspend watchdog to
     prevent it from triggering too early on some systems (Takashi Iwai)

   - Prevent the cpuidle powernv driver from registering idle states
     with CPUIDLE_FLAG_TIMER_STOP set if CONFIG_TICK_ONESHOT is unset
     which leads to boot hangs (Preeti U Murthy)"

* tag 'pm+acpi-4.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode
  PM / sleep: Increase default DPM watchdog timeout to 60
  PM / hibernate: re-enable nonboot cpus on disable_nonboot_cpus() failure
  ACPI / OF: Rename of_node() and acpi_node() to to_of_node() and to_acpi_node()
  ACPI / video: Inline acpi_video_set_dmi_backlight_type
  ACPI / resources: free memory on error in add_region_before()
2015-07-01 14:17:44 -07:00
Linus Torvalds e8a0b37d28 Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
 "Bigger items included in this update are:

   - A series of updates from Arnd for ARM randconfig build failures
   - Updates from Dmitry for StrongARM SA-1100 to move IRQ handling to
     drivers/irqchip/
   - Move ARMs SP804 timer to drivers/clocksource/
   - Perf updates from Mark Rutland in preparation to move the ARM perf
     code into drivers/ so it can be shared with ARM64.
   - MCPM updates from Nicolas
   - Add support for taking platform serial number from DT
   - Re-implement Keystone2 physical address space switch to conform to
     architecture requirements
   - Clean up ARMv7 LPAE code, which goes in hand with the Keystone2
     changes.
   - L2C cleanups to avoid unlocking caches if we're prevented by the
     secure support to unlock.
   - Avoid cleaning a potentially dirty cache containing stale data on
     CPU initialisation
   - Add ARM-only entry point for secondary startup (for machines that
     can only call into a Thumb kernel in ARM mode).  Same thing is also
     done for the resume entry point.
   - Provide arch_irqs_disabled via asm-generic
   - Enlarge ARMv7M vector table
   - Always use BFD linker for VDSO, as gold doesn't accept some of the
     options we need.
   - Fix an incorrect BSYM (for Thumb symbols) usage, and convert all
     BSYM compiler macros to a "badr" (for branch address).
   - Shut up compiler warnings provoked by our cmpxchg() implementation.
   - Ensure bad xchg sizes fail to link"

* 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: (75 commits)
  ARM: Fix build if CLKDEV_LOOKUP is not configured
  ARM: fix new BSYM() usage introduced via for-arm-soc branch
  ARM: 8383/1: nommu: avoid deprecated source register on mov
  ARM: 8391/1: l2c: add options to overwrite prefetching behavior
  ARM: 8390/1: irqflags: Get arch_irqs_disabled from asm-generic
  ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap
  ARM: 8388/1: tcm: Don't crash when TCM banks are protected by TrustZone
  ARM: 8384/1: VDSO: force use of BFD linker
  ARM: 8385/1: VDSO: group link options
  ARM: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations
  ARM: remove __bad_xchg definition
  ARM: 8369/1: ARMv7M: define size of vector table for Vybrid
  ARM: 8382/1: clocksource: make ARM_TIMER_SP804 depend on GENERIC_SCHED_CLOCK
  ARM: 8366/1: move Dual-Timer SP804 driver to drivers/clocksource
  ARM: 8365/1: introduce sp804_timer_disable and remove arm_timer.h inclusion
  ARM: 8364/1: fix BE32 module loading
  ARM: 8360/1: add secondary_startup_arm prototype in header file
  ARM: 8359/1: correct secondary_startup_arm mode
  ARM: proc-v7: sanitise and document registers around errata
  ARM: proc-v7: clean up MIDR access
  ...
2015-06-26 12:20:00 -07:00
Rafael J. Wysocki 132c242d95 Merge branches 'acpi-video', 'device-properties', 'pm-sleep' and 'pm-cpuidle'
* acpi-video:
  ACPI / video: Inline acpi_video_set_dmi_backlight_type

* device-properties:
  ACPI / OF: Rename of_node() and acpi_node() to to_of_node() and to_acpi_node()

* pm-sleep:
  PM / sleep: Increase default DPM watchdog timeout to 60
  PM / hibernate: re-enable nonboot cpus on disable_nonboot_cpus() failure

* pm-cpuidle:
  tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode
2015-06-26 03:30:37 +02:00
preeti cc5a2f7b8f tick/idle/powerpc: Do not register idle states with CPUIDLE_FLAG_TIMER_STOP set in periodic mode
On some archs, the local clockevent device stops in deep cpuidle states.
The broadcast framework is used to wakeup cpus in these idle states, in
which either an external clockevent device is used to send wakeup ipis
or the hrtimer broadcast framework kicks in in the absence of such a
device. One cpu is nominated as the broadcast cpu and this cpu sends
wakeup ipis to sleeping cpus at the appropriate time. This is the
implementation in the oneshot mode of broadcast.

In periodic mode of broadcast however, the presence of such cpuidle
states results in the cpuidle driver calling tick_broadcast_enable()
which shuts down the local clockevent devices of all the cpus and
appoints the tick broadcast device as the clockevent device for each of
them. This works on those archs where the tick broadcast device is a
real clockevent device.  But on archs which depend on the hrtimer mode
of broadcast, the tick broadcast device hapens to be a pseudo device.
The consequence is that the local clockevent devices of all cpus are
shutdown and the kernel hangs at boot time in periodic mode.

Let us thus not register the cpuidle states which have
CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the hrtimer
mode of broadcast in periodic mode. This patch takes care of doing this
on powerpc. The cpus would not have entered into such deep cpuidle
states in periodic mode on powerpc anyway. So there is no loss here.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: 3.19+ <stable@vger.kernel.org> # 3.19+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-26 03:28:49 +02:00
Rafael J. Wysocki d461003574 Merge branch 'pm-cpuidle'
* pm-cpuidle:
  cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
2015-06-22 15:15:36 +02:00
Shilpasri G Bhat 78eaa10f02 cpuidle: powernv/pseries: Auto-promotion of snooze to deeper idle state
The idle cpus which stay in snooze for a long period can degrade the
perfomance of the sibling cpus. If the cpu stays in snooze for more
than target residency of the next available idle state, then exit from
snooze. This gives a chance to the cpuidle governor to re-evaluate the
last idle state of the cpu to promote it to deeper idle states.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-22 15:15:15 +02:00
Rafael J. Wysocki ab232ba570 Merge branches 'pm-sleep' and 'pm-runtime'
* pm-sleep:
  PM / sleep: trace_device_pm_callback coverage in dpm_prepare/complete
  PM / wakeup: add a dummy wakeup_source to record statistics
  PM / sleep: Make suspend-to-idle-specific code depend on CONFIG_SUSPEND
  PM / sleep: Return -EBUSY from suspend_enter() on wakeup detection
  PM / tick: Add tracepoints for suspend-to-idle diagnostics
  PM / sleep: Fix symbol name in a comment in kernel/power/main.c
  leds / PM: fix hibernation on arm when gpio-led used with CPU led trigger
  ARM: omap-device: use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS
  bus: omap_l3_noc: add missed callbacks for suspend-to-disk
  PM / sleep: Add macro to define common noirq system PM callbacks
  PM / sleep: Refine diagnostic messages in enter_state()
  PM / wakeup: validate wakeup source before activating it.

* pm-runtime:
  PM / Runtime: Update last_busy in rpm_resume
  PM / runtime: add note about re-calling in during device probe()
2015-06-19 01:18:02 +02:00
Paul Gortmaker 090d1cf103 drivers/cpuidle: Convert non-modular drivers to use builtin_platform_driver
All these drivers are configured with Kconfig options that are
declared as bool.  Hence it is not possible for the code
to be built as modular.  However the code is currently using the
module_platform_driver() macro for driver registration.

While this currently works, we really don't want to be including
the module.h header in non-modular code, which we'll be forced
to do, pending some upcoming code relocation from init.h into
module.h.  So we fix it now by using the non-modular equivalent.

Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2015-06-16 14:12:38 -04:00
Rafael J. Wysocki 7d51d97925 cpuidle: Do not use CPUIDLE_DRIVER_STATE_START in cpuidle.c
The CPUIDLE_DRIVER_STATE_START symbol is defined as 1 only if
CONFIG_ARCH_HAS_CPU_RELAX is set, otherwise it is defined as 0.
However, if CONFIG_ARCH_HAS_CPU_RELAX is set, the first (index 0)
entry in the cpuidle driver's table of states is overwritten with
the default "poll" entry by the core.  The "state" defined by the
"poll" entry doesn't provide ->enter_dead and ->enter_freeze
callbacks and its exit_latency is 0.

For this reason, it is not necessary to use CPUIDLE_DRIVER_STATE_START
in cpuidle_play_dead() (->enter_dead is NULL, so the "poll state"
will be skipped by the loop).

It also is arguably unuseful to return states with exit_latency
equal to 0 from find_deepest_state(), so the function can be modified
to start the loop from index 0 and the "poll state" will be skipped by
it as a result of the check against latency_req.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
2015-05-30 02:15:21 +02:00
Rafael J. Wysocki 87e9b9f1d8 PM / sleep: Make suspend-to-idle-specific code depend on CONFIG_SUSPEND
Since idle_should_freeze() is defined to always return 'false'
for CONFIG_SUSPEND unset, all of the code depending on it in
cpuidle_idle_call() is not necessary in that case.

Make that code depend on CONFIG_SUSPEND too to avoid building it
when it is not going to be used.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-19 02:44:24 +02:00
Rafael J. Wysocki 0d94039fab cpuidle: Select a different state on tick_broadcast_enter() failures
If tick_broadcast_enter() fails in cpuidle_enter_state(),
try to find another idle state to enter instead of invoking
default_idle_call() immediately and returning -EBUSY which
should increase the chances of saving some energy in those
cases.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
2015-05-14 21:37:52 +02:00
Rafael J. Wysocki 827a5aefc5 sched / idle: Call default_idle_call() from cpuidle_enter_state()
The check of the cpuidle_enter() return value against -EBUSY
made in call_cpuidle() will not be necessary any more if
cpuidle_enter_state() calls default_idle_call() directly when it
is about to return -EBUSY, so make that happen and eliminate the
check.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
2015-05-14 21:37:47 +02:00
Rafael J. Wysocki faad384928 sched / idle: Call idle_set_state() from cpuidle_enter_state()
Introduce a wrapper function around idle_set_state() called
sched_idle_set_state() that will pass this_rq() to it as the
first argument and make cpuidle_enter_state() call the new
function before and after entering the target state.

At the same time, remove direct invocations of idle_set_state()
from call_cpuidle().

This will allow the invocation of default_idle_call() to be
moved from call_cpuidle() to cpuidle_enter_state() safely
and call_cpuidle() to be simplified a bit as a result.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
2015-05-14 21:35:10 +02:00
Rafael J. Wysocki 7312280bd2 cpuidle: Fix the kerneldoc comment for cpuidle_enter_state()
The kerneldoc comment for cpuidle_enter_state() doesn't match the
function's header any more, so fix it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-09 21:50:32 +02:00
Nicolas Pitre 7895f73169 ARM: MCPM: remove residency argument from mcpm_cpu_suspend()
This is currently unused.

If a suspend must be limited to CPU level only by preventing the last man
from triggering a cluster level suspend then this should be determined
according to many other criteria the MCPM layer is currently not aware of.
It is unlikely that mcpm_cpu_suspend() would be the proper conduit for
that information anyway.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Dave Martin <Dave.Martin@arm.com>
2015-05-06 11:47:10 -04:00
Rafael J. Wysocki a802ea9645 cpuidle: Check the sign of index in cpuidle_reflect()
Avoid calling the governor's ->reflect method if the state index
passed to cpuidle_reflect() is negative.

This allows the analogous check to be dropped from menu_reflect(),
so do that too, and ensures that arbitrary error codes can be
passed to cpuidle_reflect() as the index with no adverse
consequences.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-05-04 22:53:28 +02:00
Rafael J. Wysocki df8d9eeadd cpuidle: Run tick_broadcast_exit() with disabled interrupts
Commit 335f49196f (sched/idle: Use explicit broadcast oneshot
control function) replaced clockevents_notify() invocations in
cpuidle_idle_call() with direct calls to tick_broadcast_enter()
and tick_broadcast_exit(), but it overlooked the fact that
interrupts were already enabled before calling the latter which
led to functional breakage on systems using idle states with the
CPUIDLE_FLAG_TIMER_STOP flag set.

Fix that by moving the invocations of tick_broadcast_enter()
and tick_broadcast_exit() down into cpuidle_enter_state() where
interrupts are still disabled when tick_broadcast_exit() is
called.  Also ensure that interrupts will be disabled before
running tick_broadcast_exit() even if they have been enabled by
the idle state's ->enter callback.  Trigger a WARN_ON_ONCE() in
that case, as we generally don't want that to happen for states
with CPUIDLE_FLAG_TIMER_STOP set.

Fixes: 335f49196f (sched/idle: Use explicit broadcast oneshot control function)
Reported-and-tested-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reported-and-tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-29 15:19:21 +02:00
Linus Torvalds 38eb1dbb0d ARM: SoC fixes for v4.1
Here's the usual "low-priority fixes that didn't make it into the last
 few -rcs, with a twist: We had a fixes pull request that I didn't send
 in time to get into 4.0, so we'll send some of them to Greg for -stable as
 well.
 
 Contents here is as usual not all that controversial:
 
 - A handful of randconfig fixes from Arnd, in particular for older Samsung
   platforms
 - Exynos fixes, !SMP building, DTS updates for MMC and lid switch
 - Kbuild fix to create output subdirectory for DTB files
 - Misc minor fixes for OMAP
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVNzItAAoJEIwa5zzehBx34T0QAID8eUwb0dAy23y2vlszQIhG
 5zxvkDgINGj2LOBvlO3afOf3ErKXfNBGJ4UJYyqWxBsQCvY0AcWWmTVXGERhDF3/
 ZY+Fn7txhSf63CKaFa0SVwMq4qM9V3oe1EiR2kaXwNb1sk2vniCFA9RLCznSEYsi
 wy2cwECfNKljnMVl6CiCihOXsY3ZBTq+7nbp4X6ewUoQkvqde2RDSAHKZG2Y43p6
 tKhNEPqSKZ9sYwkNrnQ5F92zlkrc7nUdttKel4Q21+3MR132taC/cqnSsVjjFRL6
 cb3xM50iy2sHZBdzqkHNSBTAgP5L+PCGOLqkByUrNronOpUYdmqsz4M0Tm1SwEX3
 NvmsLjEkeZTsVd9go0/uyrJlZ71UYkSE+YwE4o+mhg55tTW3ZPvb4i+3uXIJnVA7
 +nGT5AhmfgrR1w3sQvniQp/hEwY6+qgt2ygQrpn7zSqsWMrwC9QxlJUbikJECKYC
 SCGJBvbFBsLTiLZf7aJV2L0rtvYe27E+b1LYMos+DIj9llGSWAbWy+hHj1kY2AbP
 aa93gx3xT8TOwNYFnSa9CuiA1eiWOpS5HFY5Pp7er4+Mf5PibB/Xqdq/krXLBAo3
 nWMiRMWPF0Jmn2o8J1gR3Qrb2TLodHXitpHE2Beh0mG/Un148cG2MrjaUnjNo+NG
 Ox+2BuDNqtXuTL7Y0B9l
 =o/4N
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Here's the usual "low-priority fixes that didn't make it into the last
  few -rcs, with a twist: We had a fixes pull request that I didn't send
  in time to get into 4.0, so we'll send some of them to Greg for
  -stable as well.

  Contents here is as usual not all that controversial:

   - a handful of randconfig fixes from Arnd, in particular for older
     Samsung platforms

   - Exynos fixes, !SMP building, DTS updates for MMC and lid switch

   - Kbuild fix to create output subdirectory for DTB files

   - misc minor fixes for OMAP"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits)
  ARM: at91/dt: sama5d3 xplained: add phy address for macb1
  kbuild: Create directory for target DTB
  ARM: mvebu: Disable CPU Idle on Armada 38x
  ARM: DRA7: Enable Cortex A15 errata 798181
  ARM: dts: am57xx-beagle-x15: Add thermal map to include fan and tmp102
  ARM: dts: DRA7: Add bandgap and related thermal nodes
  bus: ocp2scp: SYNC2 value should be changed to 0x6
  ARM: dts: am4372: Add "ti,am437x-ocp2scp" as compatible string for OCP2SCP
  ARM: OMAP2+: remove superfluous NULL pointer check
  ARM: EXYNOS: Fix build breakage cpuidle on !SMP
  ARM: dts: fix lid and power pin-functions for exynos5250-spring
  ARM: dts: fix mmc node updates for exynos5250-spring
  ARM: OMAP4: remove dead kconfig option OMAP4_ERRATA_I688
  MAINTAINERS: add OMAP defconfigs under OMAP SUPPORT
  ARM: OMAP1: PM: fix some build warnings on 1510-only Kconfigs
  ARM: cns3xxx: don't export static symbol
  ARM: S3C24XX: avoid a Kconfig warning
  ARM: S3C24XX: fix header file inclusions
  ARM: S3C24XX: fix building without PM_SLEEP
  ARM: S3C24XX: use SAMSUNG_WAKEMASK for s3c2416
  ...
2015-04-22 09:03:30 -07:00
Linus Torvalds 6496edfce9 This is the final removal (after several years!) of the obsolete cpus_*
functions, prompted by their mis-use in staging.
 
 With these function removed, all cpu functions should only iterate to
 nr_cpu_ids, so we finally only allocate that many bits when cpumasks
 are allocated offstack.
 
 Thanks,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVNPMuAAoJENkgDmzRrbjx7ZIP/j65e6xs1jEyXR3WOYSdTU1x
 bMo6JcII6O1oEZLgyKXgx9KiBg6uIIDta1NG/H/XIe354dwfHVsHvj5HHHQR5Xof
 iRrjLOaHj4XglI3hvsk0eEEl3/OBBLgyo9bUwDvMF1fmr/9tW4caMs3Op6n7Evzm
 YIvoAyeJ0A8BfEtOU5lXhcVIGmnHtSw0x6mdGXpXIBmWYQPCtdQP868s4lnl44w0
 bSNpAYdzEqg64Ph3SK0prgWPrn5+5EiaAhV7HZzENZ5+o0DAdIXWq/W7uHyCWPKH
 536cJDojec+nSUQkPYngngGprxrKO02aBcMw/3JGJ0tdCDj8yw3XAyVAFzz4hmMb
 Lkmyv4QHHIILLvJ4ZRH5KHWCjjVBg41LNCs2H3HnoxFACdm0lZYKHsUAh2ucBVtU
 Wb/eHmLxOG43UIkpX4yrhy3SfE1ZdnOVzEzOzPXtr51t8ojqk+bLFe/hJ6EkzrQX
 X+90qHfBq+PMJlAnc3zdXHjxoJrL6KPWVwVvFrNeibgEKtVvy/BiwZkS6QceC1Ea
 TatOYA5r6awFVHHQCooN1DGAxN5Juvu2SmdnTUA9ymsCNDghj1YUoAKRNP81u8Sa
 pe3hco/63iCuPna+vlwNDU6SgsaMk9m0p+1n1BiDIfVJIkWYCNeG+u2gQkzbDKlQ
 AJuKKQv1QuZiF0ylZ0wq
 =VAgA
 -----END PGP SIGNATURE-----

Merge tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell:
 "This is the final removal (after several years!) of the obsolete
  cpus_* functions, prompted by their mis-use in staging.

  With these function removed, all cpu functions should only iterate to
  nr_cpu_ids, so we finally only allocate that many bits when cpumasks
  are allocated offstack"

* tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
  cpumask: remove __first_cpu / __next_cpu
  cpumask: resurrect CPU_MASK_CPU0
  linux/cpumask.h: add typechecking to cpumask_test_cpu
  cpumask: only allocate nr_cpumask_bits.
  Fix weird uses of num_online_cpus().
  cpumask: remove deprecated functions.
  mips: fix obsolete cpumask_of_cpu usage.
  x86: fix more deprecated cpu function usage.
  ia64: remove deprecated cpus_ usage.
  powerpc: fix deprecated CPU_MASK_CPU0 usage.
  CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
  staging/lustre/o2iblnd: Don't use cpus_weight
  staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_
  staging/lustre/ptlrpc: Do not use deprecated cpus_* functions
  blackfin: fix up obsolete cpu function usage.
  parisc: fix up obsolete cpu function usage.
  tile: fix up obsolete cpu function usage.
  arm64: fix up obsolete cpu function usage.
  mips: fix up obsolete cpu function usage.
  x86: fix up obsolete cpu function usage.
  ...
2015-04-20 10:19:03 -07:00
Javi Merino ee3c86f356 cpuidle: menu: use DIV_ROUND_CLOSEST_ULL()
Now that the kernel provides DIV_ROUND_CLOSEST_ULL(), drop the internal
implementation and use the kernel one.

Signed-off-by: Javi Merino <javi.merino@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17 09:03:55 -04:00
Linus Torvalds 2481bc7528 Power management and ACPI updates for v4.1-rc1
- Generic PM domains support update including new PM domain
    callbacks to handle device initialization better (Russell King,
    Rafael J Wysocki, Kevin Hilman).
 
  - Unified device properties API update including a new mechanism
    for accessing data provided by platform initialization code
    (Rafael J Wysocki, Adrian Hunter).
 
  - ARM cpuidle update including ARM32/ARM64 handling consolidation
    (Daniel Lezcano).
 
  - intel_idle update including support for the Silvermont Core in
    the Baytrail SOC and for the Airmont Core in the Cherrytrail and
    Braswell SOCs (Len Brown, Mathias Krause).
 
  - New cpufreq driver for Hisilicon ACPU (Leo Yan).
 
  - intel_pstate update including support for the Knights Landing
    chip (Dasaratharaman Chandramouli, Kristen Carlson Accardi).
 
  - QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann).
 
  - powernv cpufreq driver update (Shilpasri G Bhat).
 
  - devfreq update including Tegra support changes (Tomeu Vizoso,
    MyungJoo Ham, Chanwoo Choi).
 
  - powercap RAPL (Running-Average Power Limit) driver update
    including support for Intel Broadwell server chips (Jacob Pan,
    Mathias Krause).
 
  - ACPI device enumeration update related to the handling of the
    special PRP0001 device ID allowing DT-style 'compatible' property
    to be used for ACPI device identification (Rafael J Wysocki).
 
  - ACPI EC driver update including limited _DEP support (Lan Tianyu,
    Lv Zheng).
 
  - ACPI backlight driver update including a new mechanism to allow
    native backlight handling to be forced on non-Windows 8 systems
    and a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede).
 
  - New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu).
 
  - Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
    Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki).
 
  - Fixes related to suspend-to-idle for the iTCO watchdog driver and
    the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu).
 
  - PM tracing support for the suspend phase of system suspend/resume
    transitions (Zhonghui Fu).
 
  - Configurable delay for the system suspend/resume testing facility
    (Brian Norris).
 
  - PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki).
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJVLbO+AAoJEILEb/54YlRx5N4QAJXsmEW1FL2l6mMAyTQkEsVj
 nbqjF9I6aJgYM9+i8GKaZJxpN17SAZ7Ii7aCAXjPwX8AvjT70+gcZr+KDWtPir61
 B75VNVEcUYOR4vOF5Z6rQcQMlhGPkfMOJYXFMahpOG6DdPbVh1x2/tuawfc6IC0V
 a6S/fln6WqHrXQ+8swDSv1KuZsav6+8AQaTlNUQkkuXdY9b3k/3xiy5C2K26APP8
 x1B39iAF810qX6ipnK0gEOC3Vs29dl7hvNmgOVmmkBGVS7+pqTuy5n1/9M12cDRz
 78IQ7DXB0NcSwr5tdrmGVUyH0Q6H9lnD3vO7MJkYwKDh5a/2MiBr2GZc4KHDKDWn
 E1sS27f1Pdn9qnpWLzTcY+yYNV3EEyre56L2fc+sh+Xq9sNOjUah+Y/eAej/IxYD
 XYRf+GAj768yCJgNP+Y3PJES/PRh+0IZ/dn5k0Qq2iYvc8mcObyG6zdQIvCucv/i
 70uV1Z2GWEb31cI9TUV8o5GrMW3D0KI9EsCEEpiFFUnhjNog3AWcerGgFQMHxu7X
 ZnNSzudvek+XJ3NtpbPgTiJAmnMz8bDvBQm3G1LUO2TQdjYTU6YMUHsfzXs8DL6c
 aIMWO4stkVuDtWrlT/hfzIXepliccyXmSP6sbH+zNNCepulXe5C4M2SftaDi4l/B
 uIctXWznvHoGys+EFL+v
 =erd3
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management and ACPI updates from Rafael Wysocki:
 "These are mostly fixes and cleanups all over, although there are a few
  items that sort of fall into the new feature category.

  First off, we have new callbacks for PM domains that should help us to
  handle some issues related to device initialization in a better way.

  There also is some consolidation in the unified device properties API
  area allowing us to use that inferface for accessing data coming from
  platform initialization code in addition to firmware-provided data.

  We have some new device/CPU IDs in a few drivers, support for new
  chips and a new cpufreq driver too.

  Specifics:

   - Generic PM domains support update including new PM domain callbacks
     to handle device initialization better (Russell King, Rafael J
     Wysocki, Kevin Hilman)

   - Unified device properties API update including a new mechanism for
     accessing data provided by platform initialization code (Rafael J
     Wysocki, Adrian Hunter)

   - ARM cpuidle update including ARM32/ARM64 handling consolidation
     (Daniel Lezcano)

   - intel_idle update including support for the Silvermont Core in the
     Baytrail SOC and for the Airmont Core in the Cherrytrail and
     Braswell SOCs (Len Brown, Mathias Krause)

   - New cpufreq driver for Hisilicon ACPU (Leo Yan)

   - intel_pstate update including support for the Knights Landing chip
     (Dasaratharaman Chandramouli, Kristen Carlson Accardi)

   - QorIQ cpufreq driver update (Tang Yuantian, Arnd Bergmann)

   - powernv cpufreq driver update (Shilpasri G Bhat)

   - devfreq update including Tegra support changes (Tomeu Vizoso,
     MyungJoo Ham, Chanwoo Choi)

   - powercap RAPL (Running-Average Power Limit) driver update including
     support for Intel Broadwell server chips (Jacob Pan, Mathias Krause)

   - ACPI device enumeration update related to the handling of the
     special PRP0001 device ID allowing DT-style 'compatible' property
     to be used for ACPI device identification (Rafael J Wysocki)

   - ACPI EC driver update including limited _DEP support (Lan Tianyu,
     Lv Zheng)

   - ACPI backlight driver update including a new mechanism to allow
     native backlight handling to be forced on non-Windows 8 systems and
     a new quirk for Lenovo Ideapad Z570 (Aaron Lu, Hans de Goede)

   - New Windows Vista compatibility quirk for Sony VGN-SR19XN (Chen Yu)

   - Assorted ACPI fixes and cleanups (Aaron Lu, Martin Kepplinger,
     Masanari Iida, Mika Westerberg, Nan Li, Rafael J Wysocki)

   - Fixes related to suspend-to-idle for the iTCO watchdog driver and
     the ACPI core system suspend/resume code (Rafael J Wysocki, Chen Yu)

   - PM tracing support for the suspend phase of system suspend/resume
     transitions (Zhonghui Fu)

   - Configurable delay for the system suspend/resume testing facility
     (Brian Norris)

   - PNP subsystem cleanups (Peter Huewe, Rafael J Wysocki)"

* tag 'pm+acpi-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (74 commits)
  ACPI / scan: Fix NULL pointer dereference in acpi_companion_match()
  ACPI / scan: Rework modalias creation when "compatible" is present
  intel_idle: mark cpu id array as __initconst
  powercap / RAPL: mark rapl_ids array as __initconst
  powercap / RAPL: add ID for Broadwell server
  intel_pstate: Knights Landing support
  intel_pstate: remove MSR test
  cpufreq: fix qoriq uniprocessor build
  ACPI / scan: Take the PRP0001 position in the list of IDs into account
  ACPI / scan: Simplify acpi_match_device()
  ACPI / scan: Generalize of_compatible matching
  device property: Introduce firmware node type for platform data
  device property: Make it possible to use secondary firmware nodes
  PM / watchdog: iTCO: stop watchdog during system suspend
  cpufreq: hisilicon: add acpu driver
  ACPI / EC: Call acpi_walk_dep_device_list() after installing EC opregion handler
  cpufreq: powernv: Report cpu frequency throttling
  intel_idle: Add support for the Airmont Core in the Cherrytrail and Braswell SOCs
  intel_idle: Update support for Silvermont Core in Baytrail SOC
  PM / devfreq: tegra: Register governor on module init
  ...
2015-04-14 20:21:54 -07:00
Linus Torvalds 7fd56474db Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Ingo Molnar:
 "The main changes in this cycle were:

   - clockevents state machine cleanups and enhancements (Viresh Kumar)

   - clockevents broadcast notifier horror to state machine conversion
     and related cleanups (Thomas Gleixner, Rafael J Wysocki)

   - clocksource and timekeeping core updates (John Stultz)

   - clocksource driver updates and fixes (Ben Dooks, Dmitry Osipenko,
     Hans de Goede, Laurent Pinchart, Maxime Ripard, Xunlei Pang)

   - y2038 fixes (Xunlei Pang, John Stultz)

   - NMI-safe ktime_get_raw_fast() and general refactoring of the clock
     code, in preparation to perf's per event clock ID support (Peter
     Zijlstra)

   - generic sched/clock fixes, optimizations and cleanups (Daniel
     Thompson)

   - clockevents cpu_down() race fix (Preeti U Murthy)"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits)
  timers/PM: Drop unnecessary braces from tick_freeze()
  timers/PM: Fix up tick_unfreeze()
  timekeeping: Get rid of stale comment
  clockevents: Cleanup dead cpu explicitely
  clockevents: Make tick handover explicit
  clockevents: Remove broadcast oneshot control leftovers
  sched/idle: Use explicit broadcast oneshot control function
  ARM: Tegra: Use explicit broadcast oneshot control function
  ARM: OMAP: Use explicit broadcast oneshot control function
  intel_idle: Use explicit broadcast oneshot control function
  ACPI/idle: Use explicit broadcast control function
  ACPI/PAD: Use explicit broadcast oneshot control function
  x86/amd/idle, clockevents: Use explicit broadcast oneshot control functions
  clockevents: Provide explicit broadcast oneshot control functions
  clockevents: Remove the broadcast control leftovers
  ARM: OMAP: Use explicit broadcast control function
  intel_idle: Use explicit broadcast control function
  cpuidle: Use explicit broadcast control function
  ACPI/processor: Use explicit broadcast control function
  ACPI/PAD: Use explicit broadcast control function
  ...
2015-04-13 11:08:28 -07:00
Rafael J. Wysocki cdde51b9fe Merge back earlier cpuidle material for v4.1. 2015-04-10 12:01:24 +02:00
Olof Johansson 03a1e74cad Samsung 2nd fixes for v4.0
- Fix build breakage exynos cpuidle driver on !SMP
   because it is coupled built-in so added check for SMP.
 
 - Fix lid, power pin-functions and mmc node updates
   for exynos5250-spring: Fixes commit ID 53dd4138bb
   ("ARM: dts: Add exynos5250-spring device tree")
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJVHCgYAAoJEA0Cl+kVi2xqOc0P/iRf+6vc1LLwbhJIjKnIkzqK
 UEov7aWc4gBauP8oaQaaSV9DMShr0MVZlOgvQpjg5xYHYBcPAzVn+dR/vOxoIo5/
 pQFMljNYDZONU+xkTGe8wFwCXk5ixBC8rhuGLCbqPXGa91zQrbRiEYPlFfsv4Req
 dYtReh8+Uf8ATdqouvZ9FxP0q41LeeVthGvJwHy3folfom8T34UFtq38ecC10jqU
 UlIgudYI5GOGBY1IZh3F0p+PTGug8BUTeapt4QcC6GlMDsB+ylvhRnk2cto85waD
 2RBhyBMi1S419WT2G+h/Pr/Np9K7nwKiVD62eBoldYzUSOveufvMsDmtOTlxLumo
 Jj+3/EDGMof74TeXeylZxr3wfj65yn9XAk2LQTQT8xh/wJjqZ1EHexKtikmdO84o
 tSisyIBr7Bp6J2TREZm8CQbyPgY5ciyLlxT0mTWrWs7rt44ZELAv8nqyibfZ/ce4
 TVTpdSWH8VPFteZrYXmbP+3va8RQvEyeg2mAtyEMJHX73UgYdxKt6bf66KypNDi7
 zQVx26vDcbVCP2dQuwMO7LcIy9LyFnEEsN8ZqpRD+TSFkq/+fJPWSvH2kJr0wdUv
 Nvxz0YkA+1E9agiFrh6WQzoWpubvngWUV4KCCyScB8Yta0py3uSqmu3IB0fHm4CJ
 ALjWdQ9AKlEyd7o6rUxm
 =pPV1
 -----END PGP SIGNATURE-----

Merge tag 'samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes

Merge "Samsung 2nd fixes for v4.0" from Kukjin Kim:

- Fix build breakage exynos cpuidle driver on !SMP
  because it is coupled built-in so added check for SMP.

- Fix lid, power pin-functions and mmc node updates
  for exynos5250-spring: Fixes commit ID 53dd4138bb
  ("ARM: dts: Add exynos5250-spring device tree")

* tag 'samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: Fix build breakage cpuidle on !SMP
  ARM: dts: fix lid and power pin-functions for exynos5250-spring
  ARM: dts: fix mmc node updates for exynos5250-spring

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-04-03 14:51:10 -07:00
Bartlomiej Zolnierkiewicz d75e4af14e cpuidle: remove state_count field from struct cpuidle_device
Thomas Schlichter reports the following issue on his Samsung NC20:

"The C-states C1 and C2 to the OS when connected to AC, and additionally
 provides the C3 C-state when disconnected from AC.  However, the number
 of C-states shown in sysfs is fixed to the number of C-states present
 at boot.
   If I boot with AC connected, I always only see the C-states up to C2
   even if I disconnect AC.

   The reason is commit 130a5f6924 (ACPI / cpuidle: remove dev->state_count
   setting).  It removes the update of dev->state_count, but sysfs uses
   exactly this variable to show the C-states.

   The fix is to use drv->state_count in sysfs.  As this is currently the
   last user of dev->state_count, this variable can be completely removed."

Remove dev->state_count as per the above.

Reported-by: Thomas Schlichter <thomas.schlichter@web.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: 3.14+ <stable@vger.kernel.org> # 3.14+
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-03 13:15:50 +02:00
Thomas Gleixner ee7a1438b5 cpuidle: Use explicit broadcast control function
Replace the clockevents_notify() call with an explicit function call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/2106401.cYdJzzA6Ic@vostro.rjw.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-03 08:44:32 +02:00
Daniel Lezcano a0d46a3dfd ARM: cpuidle: Register per cpuidle device
If the cpuidle init cpu operation returns -ENXIO, therefore reporting HW
failure or misconfiguration, the CPUidle driver skips the respective
cpuidle device initialization because the associated platform back-end HW
is not operational.

That prevents the system to crash and allows to handle the error gracefully.

For example, on Qcom's platform, each core has a SPM. The device associated
with this SPM is initialized before the cpuidle framework. If there is an error
in the initialization (eg. error in the DT), the system continues to boot but
in degraded mode as some SPM may not be correctly initialized.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-03-24 14:46:25 +01:00
Daniel Lezcano 0e0870448a ARM: cpuidle: Enable the ARM64 driver for both ARM32/ARM64
ARM32 and ARM64 have the same DT definitions and the same approaches.

The generic ARM cpuidle driver can be put in common for those two
architectures.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Rob Herring <robherring2@gmail.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-03-24 10:16:11 +01:00
Daniel Lezcano 69e6cb3d2f ARM64: cpuidle: Remove arm64 reference
In the next patch, this driver will be common across ARM/ARM64. Remove all refs
to ARM64 as it will be shared with ARM32.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Rob Herring <robherring2@gmail.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-03-24 10:16:10 +01:00
Daniel Lezcano c9d6216149 ARM64: cpuidle: Rename cpu_init_idle to a common function name
With this change the cpuidle-arm64.c file calls the same function name
for both ARM and ARM64.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Rob Herring <robherring2@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-03-24 10:16:09 +01:00
Daniel Lezcano 191de17aa3 ARM64: cpuidle: Replace cpu_suspend by the common ARM/ARM64 function
Call the common ARM/ARM64 'arm_cpuidle_suspend' instead of cpu_suspend function
which is specific to ARM64.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Rob Herring <robherring2@gmail.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-03-24 10:16:09 +01:00
Daniel Lezcano eeebc3bb4d ARM: cpuidle: Remove duplicate header inclusion
The cpu_do_idle() function is always used by the cpuidle drivers.

That led to have each driver including cpuidle.h and proc-fns.h, they are
always paired. That makes a lot of duplicate headers inclusion. Instead of
including both in each .c file, move the proc-fns.h header inclusion in the
cpuidle.h header file directly, so we can save some line of code.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2015-03-23 18:03:11 +01:00
Bartlomiej Zolnierkiewicz cfdda3535f ARM: EXYNOS: Fix build breakage cpuidle on !SMP
The Exynos cpuidle driver has coupled cpuidle built-in so it cannot be
built without SMP:

arch/arm/mach-exynos/pm.c: In function 'exynos_cpu0_enter_aftr':
arch/arm/mach-exynos/pm.c:246:4: error: implicit declaration of function 'arch_send_wakeup_ipi_mask' [-Werror=implicit-function-declaration]
arch/arm/mach-exynos/built-in.o: In function 'exynos_pre_enter_aftr':
../arch/arm/mach-exynos/pm.c:300: undefined reference to 'cpu_boot_reg_base'
arch/arm/mach-exynos/built-in.o: In function 'exynos_cpu1_powerdown':
../arch/arm/mach-exynos/pm.c:282: undefined reference to 'exynos_cpu_power_down'

Fix it by adding missing checks for SMP.

Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-03-18 03:26:11 +09:00
Sebastien Rannou ce6031c89a cpuidle: mvebu: Update cpuidle thresholds for Armada XP SOCs
Originally, the thresholds used in the cpuidle driver for Armada SOCs
were temporarily chosen, leaving room for improvements.

This commit updates the thresholds for the Armada XP SOCs with values
that positively impact performances:

                                without patch  with patch   vendor kernel
 - iperf localhost (gbit/sec)   ~3.7           ~6.4         ~5.4
 - ioping tmpfs (iops)          ~163k          ~206k        ~179k
 - ioping tmpfs (mib/s)         ~636           ~805         ~699

The idle power consumption is negatively impacted (proportionally less
than the performance gain), and we are still performing better than
the vendor kernel here:

                                without patch   with patch  vendor kernel
 - power consumption idle (W)   ~2.4            ~3.2        ~4.4
 - power consumption busy (W)   ~8.6            ~8.3        ~8.6

There is still room for improvement regarding the value of these
thresholds, they were chosen to mimic the vendor kernel.

This patch only impacts Armada XP SOCs and was tested on Online Labs
C1 boards. A similar approach can be taken to improve the performances
of the Armada 370 and Armada 38x SOCs.

Thanks a lot to Thomas Petazzoni, Gregory Clement and Willy Tarreau
for the discussions and tips around this topic.

Signed-off-by: Sebastien Rannou <mxs@sbrk.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
2015-03-13 18:31:29 +01:00
Gregory CLEMENT 43b68879de cpuidle: mvebu: Fix the CPU PM notifier usage
As stated in kernel/cpu_pm.c, "Platform is responsible for ensuring
that cpu_pm_enter is not called twice on the same CPU before
cpu_pm_exit is called.". In the current code in case of failure when
calling mvebu_v7_cpu_suspend, the function cpu_pm_exit() is never
called whereas cpu_pm_enter() was called just before.

This patch moves the cpu_pm_exit() in order to balance the
cpu_pm_enter() calls.

Cc: stable@vger.kernel.org
Reported-by: Fulvio Benini <fbf@libero.it>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-03-13 18:26:06 +01:00
Rafael J. Wysocki ef2b22ac54 cpuidle / sleep: Use broadcast timer for states that stop local timer
Commit 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
overlooked the fact that entering some sufficiently deep idle states
by CPUs may cause their local timers to stop and in those cases it
is necessary to switch over to a broadcast timer prior to entering
the idle state.  If the cpuidle driver in use does not provide
the new ->enter_freeze callback for any of the idle states, that
problem affects suspend-to-idle too, but it is not taken into account
after the changes made by commit 3810631332.

Fix that by changing the definition of cpuidle_enter_freeze() and
re-arranging of the code in cpuidle_idle_call(), so the former does
not call cpuidle_enter() any more and the fallback case is handled
by cpuidle_idle_call() directly.

Fixes: 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
Reported-and-tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-03-05 23:13:19 +01:00
Rusty Russell f9b531fe14 drivers: fix up obsolete cpu function usage.
Thanks to spatch, plus manual removal of "&*".  Then a sweep for
for_each_cpu_mask => for_each_cpu.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: netdev@vger.kernel.org
2015-03-05 13:37:02 +10:30
Rafael J. Wysocki 31a3409065 cpuidle / sleep: Do sanity checks in cpuidle_enter_freeze() too
Modify cpuidle_enter_freeze() to do the sanity checks done by
cpuidle_select() to avoid crashing the suspend-to-idle code
path in case something is missing.

Fixes: 3810631332 (PM / sleep: Re-implement suspend-to-idle handling)
Original-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-28 23:46:39 +01:00
Rafael J. Wysocki 01e04f466e idle / sleep: Avoid excessive disabling and enabling interrupts
Disabling interrupts at the end of cpuidle_enter_freeze() is not
useful, because its caller, cpuidle_idle_call(), re-enables them
right away after invoking it.

To avoid that unnecessary back and forth dance with interrupts,
make cpuidle_enter_freeze() enable interrupts after calling
enter_freeze_proper() and drop the local_irq_disable() at its
end, so that all of the code paths in it end up with interrupts
enabled.  Then, cpuidle_idle_call() will not need to re-enable
interrupts after calling cpuidle_enter_freeze() any more, because
the latter will return with interrupts enabled, in analogy with
cpuidle_enter().

Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-28 23:46:24 +01:00
Rafael J. Wysocki 3466b547e3 Merge branches 'pnp', 'pm-cpuidle' and 'pm-cpufreq'
* pnp:
  PNP: Switch from __check_region() to __request_region()

* pm-cpuidle:
  cpuidle: powernv: Avoid endianness conversions while parsing DT
  cpuidle: powernv: Read target_residency value of idle states from DT if available

* pm-cpufreq:
  cpufreq: s3c: remove last use of resume_clocks callback
  cpufreq: s3c: remove incorrect __init annotations
2015-02-21 04:29:16 +01:00
Preeti U Murthy 70734a786a cpuidle: powernv: Avoid endianness conversions while parsing DT
We currently read the information about idle states from the DT
so as to populate the cpuidle table. Use those APIs to read from
the DT that can avoid endianness conversions of the property values
in the cpuidle driver.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-19 23:44:38 +01:00
Preeti U Murthy 92c83ff5b4 cpuidle: powernv: Read target_residency value of idle states from DT if available
The device tree now exposes the residency values for different idle states. Read
these values instead of calculating residency from the latency values. The values
exposed in the DT are validated for optimal power efficiency. However to maintain
compatibility with the older firmware code which does not expose residency
values, use default values as a fallback mechanism. While at it, use better
APIs to parse the powermgmt device tree node.

Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-18 06:34:17 +01:00
Linus Torvalds 99fa0ad92c Suspend-to-idle timer quiescing support for v3.20-rc1
Till now suspend-to-idle has not been able to save much more energy
 than runtime PM because of timer interrupts that periodically bring
 CPUs out of idle while they are waiting for a wakeup interrupt.  Of
 course, the timer interrupts are not wakeup ones, so the handling of
 them can be deferred until a real wakeup interrupt happens, but at
 the same time we don't want to mass-expire timers at that point.
 
 The solution is to suspend the entire timekeeping when the last CPU
 is entering an idle state and resume it when the first CPU goes out
 of idle.  That has to be done with care, though, so as to avoid
 accessing suspended clocksources etc. end we need extra support
 from idle drivers for that.
 
 This series of commits adds support for quiescing timers during
 suspend-to-idle and adds the requisite callbacks to intel_idle
 and the ACPI cpuidle driver.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABCAAGBQJU4PNaAAoJEILEb/54YlRxgjsP/0UbDGbltVyM8VFhsobqhOni
 thKJTJsqWqYgsPfTufbOGyvP6zskbsarDlzCXoKXuHaynIqcxY8xfZvMdcQr1j0S
 nhKdqv4R6qlP3w2cFxXVZwhw21X3YO1zIxpi5Do1HdVuWoOvxq8Dk4cU8MrgOJC0
 6ThC9Q7klheV4tY6Narlmmf6sX5O+S/EaqnupESSG4cqxNmlPw5AguLviBaUNVAY
 RSjUX8LAce05bOIGEpaFY+vUws+jlU7/T/GEajquEsGF9zalh2CsWso5nQvilxrJ
 22MVqXUyHaXmTC+U7nV78qRkavR6zyr3v/JBDse56qRI1mFlmyvGh8mE5ukmpqJE
 Cg5rRC68o71xlBSVGhKW3Os2ks2Nenj2NLltrTyuh43OBJ691TaLsZnKh5nYt/MW
 MZdqRRjIDTMF+/P1u4wY8S63labrrmp7w4T720CgaZCLJ/9VfZQuqKXTTm2R5/II
 eDhFvdYXoP2748uUOn5sOr5/o0xhnMdaxykZZxE3IkSpOpIV1Mo2HWTIyDYXlILP
 0OuJUUZFZtFOjWGCPn3YgoFT94C3nlO1vkXw//44okTUiUaaOZz+VWDF4fxdVeLR
 8NGTe+/QzEq+2lbs+ZWRSM1hPukOntFcwCgWXFiqh9x2F00LAw9JpkiKBujxTjUV
 m2WstYaML3W7gBMyhxg0
 =55Jb
 -----END PGP SIGNATURE-----

Merge tag 'suspend-to-idle-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull suspend-to-idle updates from Rafael Wysocki:
 "Suspend-to-idle timer quiescing support for v3.20-rc1

  Until now suspend-to-idle has not been able to save much more energy
  than runtime PM because of timer interrupts that periodically bring
  CPUs out of idle while they are waiting for a wakeup interrupt.  Of
  course, the timer interrupts are not wakeup ones, so the handling of
  them can be deferred until a real wakeup interrupt happens, but at the
  same time we don't want to mass-expire timers at that point.

  The solution is to suspend the entire timekeeping when the last CPU is
  entering an idle state and resume it when the first CPU goes out of
  idle.  That has to be done with care, though, so as to avoid accessing
  suspended clocksources etc.  end we need extra support from idle
  drivers for that.

  This series of commits adds support for quiescing timers during
  suspend-to-idle and adds the requisite callbacks to intel_idle and the
  ACPI cpuidle driver"

* tag 'suspend-to-idle-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / idle: Implement ->enter_freeze callback routine
  intel_idle: Add ->enter_freeze callbacks
  PM / sleep: Make it possible to quiesce timers during suspend-to-idle
  timekeeping: Make it safe to use the fast timekeeper while suspended
  timekeeping: Pass readout base to update_fast_timekeeper()
  PM / sleep: Re-implement suspend-to-idle handling
2015-02-17 14:17:51 -08:00
Linus Torvalds 18656782a8 ARM: SoC driver updates
These are changes for drivers that are intimately tied to some SoC
 and for some reason could not get merged through the respective
 subsystem maintainer tree.
 
 This time around, much of this is for at91, with the bulk of it being syscon
 and udc drivers.
 
 Also, there's:
 - coupled cpuidle support for Samsung Exynos4210
 - Renesas 73A0 common-clk work
 - of/platform changes to tear down DMA mappings on device destruction
 - a few updates to the TI Keystone knav code
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU4upSAAoJEIwa5zzehBx3HkUP/Rc4B1yZChNIFNfVq4dbei6w
 dT9WdFmxOIj2JLeXEypFBiNf1nSHmsxrZe9/IDACz2fYQOnaZZ6/786utUJP/PtC
 2GDJy9cjL2Xh03We3nQp5z6J33XvpEni1t82cOpCl8wLBOQNnkjEks8UvLgi1LHW
 CNLcMm8JtDQ2aB/gRTjzetp9liZluESY5+Mig+loE2F/rzbMbNQDcWDDgUPyIQIS
 1onL+Bad3BnGFdo/+qnkurGc81pxoKiQJty06VWFftzvIwxXhsNjrqls2+KzstAx
 0lLvW1tqaDhXvUBImRM8GgfbldZslsgoFVmgESS9MpPMBNENYrkAiQNvJUnM7kd9
 qHDQNq+zRNsz/k4fVvp/YUp7xEiAo4rLcFmp/dBr535jS2LNyiZnB94q+kXsin/m
 tiyEMx+RWxEHTEHN9WdKE61Ty1RbzOa5UTLSzOKFAkF+m2nvuQsJvb97n19coAq9
 SSsj/wJgesfqrDEegphCDh1fyVxUzlAjjhTAyvPS155WvPzkbxZxuBbSqRuriRKA
 2aCfVne2ELimHAr3LEPgPW2kFBcONHckOGe6MvrTX4zPHU8bb9WIeg+iGdQChnr3
 nclT9jq+ZnQro5XTgUtPtadq100oEXlJbqpAzhd+cJbvgzSNbcWfcgE6kOWqd9uK
 oeWQWFLCdXLmXf9zCwmk
 =T7R2
 -----END PGP SIGNATURE-----

Merge tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC driver updates from Olof Johansson:
 "These are changes for drivers that are intimately tied to some SoC and
  for some reason could not get merged through the respective subsystem
  maintainer tree.

  This time around, much of this is for at91, with the bulk of it being
  syscon and udc drivers.

  Also, there's:
   - coupled cpuidle support for Samsung Exynos4210
   - Renesas 73A0 common-clk work
   - of/platform changes to tear down DMA mappings on device destruction
   - a few updates to the TI Keystone knav code"

* tag 'drivers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (26 commits)
  cpuidle: exynos: add coupled cpuidle support for exynos4210
  ARM: EXYNOS: apply S5P_CENTRAL_SEQ_OPTION fix only when necessary
  soc: ti: knav_qmss_queue: change knav_range_setup_acc_irq to static
  soc: ti: knav_qmss_queue: makefile tweak to build as dynamic module
  pcmcia: at91_cf: depend on !ARCH_MULTIPLATFORM
  soc: ti: knav_qmss_queue: export API calls for use by user driver
  of/platform: teardown DMA mappings on device destruction
  usb: gadget: at91_udc: Allocate udc instance
  usb: gadget: at91_udc: Update DT binding documentation
  usb: gadget: at91_udc: Rework for multi-platform kernel support
  usb: gadget: at91_udc: Simplify probe and remove functions
  usb: gadget: at91_udc: Remove non-DT handling code
  usb: gadget: at91_udc: Document DT clocks and clock-names property
  usb: gadget: at91_udc: Drop uclk clock
  usb: gadget: at91_udc: Fix clock names
  mfd: syscon: Add Atmel SMC binding doc
  mfd: syscon: Add atmel-smc registers definition
  mfd: syscon: Add Atmel Matrix bus DT binding documentation
  mfd: syscon: Add atmel-matrix registers definition
  clk: shmobile: fix sparse NULL pointer warning
  ...
2015-02-17 09:38:59 -08:00
Rafael J. Wysocki 124cf9117c PM / sleep: Make it possible to quiesce timers during suspend-to-idle
The efficiency of suspend-to-idle depends on being able to keep CPUs
in the deepest available idle states for as much time as possible.
Ideally, they should only be brought out of idle by system wakeup
interrupts.

However, timer interrupts occurring periodically prevent that from
happening and it is not practical to chase all of the "misbehaving"
timers in a whack-a-mole fashion.  A much more effective approach is
to suspend the local ticks for all CPUs and the entire timekeeping
along the lines of what is done during full suspend, which also
helps to keep suspend-to-idle and full suspend reasonably similar.

The idea is to suspend the local tick on each CPU executing
cpuidle_enter_freeze() and to make the last of them suspend the
entire timekeeping.  That should prevent timer interrupts from
triggering until an IO interrupt wakes up one of the CPUs.  It
needs to be done with interrupts disabled on all of the CPUs,
though, because otherwise the suspended clocksource might be
accessed by an interrupt handler which might lead to fatal
consequences.

Unfortunately, the existing ->enter callbacks provided by cpuidle
drivers generally cannot be used for implementing that, because some
of them re-enable interrupts temporarily and some idle entry methods
cause interrupts to be re-enabled automatically on exit.  Also some
of these callbacks manipulate local clock event devices of the CPUs
which really shouldn't be done after suspending their ticks.

To overcome that difficulty, introduce a new cpuidle state callback,
->enter_freeze, that will be guaranteed (1) to keep interrupts
disabled all the time (and return with interrupts disabled) and (2)
not to touch the CPU timer devices.  Modify cpuidle_enter_freeze() to
look for the deepest available idle state with ->enter_freeze present
and to make the CPU execute that callback with suspended tick (and the
last of the online CPUs to execute it with suspended timekeeping).

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-15 19:40:09 +01:00
Rafael J. Wysocki 3810631332 PM / sleep: Re-implement suspend-to-idle handling
In preparation for adding support for quiescing timers in the final
stage of suspend-to-idle transitions, rework the freeze_enter()
function making the system wait on a wakeup event, the freeze_wake()
function terminating the suspend-to-idle loop and the mechanism by
which deep idle states are entered during suspend-to-idle.

First of all, introduce a simple state machine for suspend-to-idle
and make the code in question use it.

Second, prevent freeze_enter() from losing wakeup events due to race
conditions and ensure that the number of online CPUs won't change
while it is being executed.  In addition to that, make it force
all of the CPUs re-enter the idle loop in case they are in idle
states already (so they can enter deeper idle states if possible).

Next, drop cpuidle_use_deepest_state() and replace use_deepest_state
checks in cpuidle_select() and cpuidle_reflect() with a single
suspend-to-idle state check in cpuidle_idle_call().

Finally, introduce cpuidle_enter_freeze() that will simply find the
deepest idle state available to the given CPU and enter it using
cpuidle_enter().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-13 23:49:36 +01:00
Linus Torvalds 6b00f7efb5 arm64 updates for 3.20:
- reimplementation of the virtual remapping of UEFI Runtime Services in
   a way that is stable across kexec
 - emulation of the "setend" instruction for 32-bit tasks (user
   endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
   accordingly)
 - compat_sys_call_table implemented in C (from asm) and made it a
   constant array together with sys_call_table
 - export CPU cache information via /sys (like other architectures)
 - DMA API implementation clean-up in preparation for IOMMU support
 - macros clean-up for KVM
 - dropped some unnecessary cache+tlb maintenance
 - CONFIG_ARM64_CPU_SUSPEND clean-up
 - defconfig update (CPU_IDLE)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU25v3AAoJEGvWsS0AyF7xYjcP/j8ESvs+z0BPgeJ6XREfOnCh
 cp+w/1rJ5BafJ5RRkibrciwTNOIJS4FGMivWyURtoh430lS0Rh7fxZ3Ouna3xjrT
 Nf7AxenWoA8Lo6wHh+FlNUeGk3iWfX6WwA2tYrbKudK+LBJ1wHjwpE7cWQO0FgwJ
 aFDahu+QD5/u45p/VcVctMtiEDvOxBdO8gfat6r+YkLm7pbRxQkZnpA/JE4Gps1p
 Td5jvMNH9pXI5pffSbeR9Q+vs/r0yqKLXQg01Eb2bZgGDgwf9yzADrHuaKamZt35
 X5flmLiTGC6swJCJvUkZC1Nuue33bXcvW5+vgvar+MNGyXsxv+B/wARLqGhiWhQZ
 nLGwFpuNu6wdY9tGHb/XR8khcewkw1/lRH1hHKhchrmRyUqHvXcPgC5tamjLrY8C
 BV3BAeQvRho8OKwWUmbXIlyON1vPux6CJdj4D/A5NL+qph2WHeVWJCXg6nVFx0Wc
 Eb3bXbI4QRwTFL7pGRF8RyZJBAQtgYhQMKWMW2GHgUgn+r1EixG73BZoSwvpHrrw
 FOR9AVNfVBqmNON8xiIb3DN4EViq76EF0jrsZh5I9EoWS2w5qtk60kJQgXE+M4EE
 vOlmh3dhEVfCN2SxOn0bgoQmTulyjqGauTSSJKQbIBuinPFveukrJfGNFIWt0SZs
 f38FBMo6sgU4VG85B+Fr
 =X5x/
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
 "arm64 updates for 3.20:

   - reimplementation of the virtual remapping of UEFI Runtime Services
     in a way that is stable across kexec
   - emulation of the "setend" instruction for 32-bit tasks (user
     endianness switching trapped in the kernel, SCTLR_EL1.E0E bit set
     accordingly)
   - compat_sys_call_table implemented in C (from asm) and made it a
     constant array together with sys_call_table
   - export CPU cache information via /sys (like other architectures)
   - DMA API implementation clean-up in preparation for IOMMU support
   - macros clean-up for KVM
   - dropped some unnecessary cache+tlb maintenance
   - CONFIG_ARM64_CPU_SUSPEND clean-up
   - defconfig update (CPU_IDLE)

  The EFI changes going via the arm64 tree have been acked by Matt
  Fleming.  There is also a patch adding sys_*stat64 prototypes to
  include/linux/syscalls.h, acked by Andrew Morton"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (47 commits)
  arm64: compat: Remove incorrect comment in compat_siginfo
  arm64: Fix section mismatch on alloc_init_p[mu]d()
  arm64: Avoid breakage caused by .altmacro in fpsimd save/restore macros
  arm64: mm: use *_sect to check for section maps
  arm64: drop unnecessary cache+tlb maintenance
  arm64:mm: free the useless initial page table
  arm64: Enable CPU_IDLE in defconfig
  arm64: kernel: remove ARM64_CPU_SUSPEND config option
  arm64: make sys_call_table const
  arm64: Remove asm/syscalls.h
  arm64: Implement the compat_sys_call_table in C
  syscalls: Declare sys_*stat64 prototypes if __ARCH_WANT_(COMPAT_)STAT64
  compat: Declare compat_sys_sigpending and compat_sys_sigprocmask prototypes
  arm64: uapi: expose our struct ucontext to the uapi headers
  smp, ARM64: Kill SMP single function call interrupt
  arm64: Emulate SETEND for AArch32 tasks
  arm64: Consolidate hotplug notifier for instruction emulation
  arm64: Track system support for mixed endian EL0
  arm64: implement generic IOMMU configuration
  arm64: Combine coherent and non-coherent swiotlb dma_ops
  ...
2015-02-11 18:03:54 -08:00
Olof Johansson 6f4554bdff Samsung CPUIdle updates for v3.20
- adds coupled cpuidle support for exynos4210
   : fix for Exynos platform PM code preparing it for the coupled
   cpuidle support and adds coupled cpuidle AFTR mode on exynos4210
 
 Note this is mostrly based on earlier cpuidle-exynos4210 driver
 from Daniel Lezcano and Bart updated.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJU0ikrAAoJEA0Cl+kVi2xqLoUQAI2JZi6XyeyJ+0YNGn+TpGEp
 OBMks8hVyUXDIk+juuijRLh1MTzOkguMNKjS8y84IHpM8SvVUiPzh/b4pPBr7i14
 c6zaocm17HePZwF9tR5VwLAAiyV+0f9rRgUUBSaVTyc5/tMFJES6iuNVzPwa7fqf
 5u7YjAuV1N349NC8148wvvrg/ICktpdCydYmfOfOTsjgVNwiyd4kF3ofC7Xt9MrN
 fNgdl+OQZxOIWFKkBw3p3+exWybgjd9GE5K9kuASmGWrIOXjRO4KxO/UqRBNJIhH
 bVjnkEq88kczf4Z999YHO33fMaMC3y/h++luRgkzxOZge0KVbqusikq+dCc5je3D
 k/TPn0tnc8A2nbbcvziuXOe/EhudqXWD+QZXhRcmUAeRFDA1DNsEfSHzujD1fbyn
 TA9+RjLodCv2LT5wQq/EHs5yiAYChiUz/TGOBwhHzMUNvXKs3cSBtATabXgWrCtd
 pYtI+o1FYekccafF1bUWMIR6BDHxAOb41FDQVZYzi7+ez5yHHYEwZGqPXLmY3x/f
 8Nvyy2PgOouB5cHSxn9r0wtMM0nATWB1WNUIf3cPRCHZKpxpHOhrH4CEDsQG02FS
 qhs8QARvDA2pCV+m9hFFKDt/VNMpzHnw6uC0XWlICrDk9sHPzwXVuB1MHcjr/n6w
 wJhgNOVyfTEnIDQi2mGt
 =nadx
 -----END PGP SIGNATURE-----

Merge tag 'samsung-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/drivers

Merge "Samsung CPUIdle updates for v3.20" from Kukjin Kim:

- adds coupled cpuidle support for exynos4210
  : fix for Exynos platform PM code preparing it for the coupled
  cpuidle support and adds coupled cpuidle AFTR mode on exynos4210

Note this is mostrly based on earlier cpuidle-exynos4210 driver
from Daniel Lezcano and Bart updated.

* tag 'samsung-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  cpuidle: exynos: add coupled cpuidle support for exynos4210
  ARM: EXYNOS: apply S5P_CENTRAL_SEQ_OPTION fix only when necessary

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-02-06 01:07:33 -08:00
Bartlomiej Zolnierkiewicz 712eddf702 cpuidle: exynos: add coupled cpuidle support for exynos4210
The following patch adds coupled cpuidle support for Exynos4210 to
an existing cpuidle-exynos driver.  As a result it enables AFTR mode
to be used by default on Exynos4210 without the need to hot unplug
CPU1 first.

The patch is heavily based on earlier cpuidle-exynos4210 driver from
Daniel Lezcano:

http://www.spinics.net/lists/linux-samsung-soc/msg28134.html

Changes from Daniel's code include:
- porting code to current kernels
- fixing it to work on my setup (by using S5P_INFORM register
  instead of S5P_VA_SYSRAM one on Revison 1.1 and retrying poking
  CPU1 out of the BOOT ROM if necessary)
- fixing rare lockup caused by waiting for CPU1 to get stuck in
  the BOOT ROM (CPU hotplug code in arch/arm/mach-exynos/platsmp.c
  doesn't require this and works fine)
- moving Exynos specific code to arch/arm/mach-exynos/pm.c
- using cpu_boot_reg_base() helper instead of BOOT_VECTOR macro
- using exynos_cpu_*() helpers instead of accessing registers
  directly
- using arch_send_wakeup_ipi_mask() instead of dsb_sev()
  (this matches CPU hotplug code in arch/arm/mach-exynos/platsmp.c)
- integrating separate exynos4210-cpuidle driver into existing
  exynos-cpuidle one

Cc: Colin Cross <ccross@google.com>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
2015-01-30 08:39:15 +09:00
Lorenzo Pieralisi af3cfdbf56 arm64: kernel: remove ARM64_CPU_SUSPEND config option
ARM64_CPU_SUSPEND config option was introduced to make code providing
context save/restore selectable only on platforms requiring power
management capabilities.

Currently ARM64_CPU_SUSPEND depends on the PM_SLEEP config option which
in turn is set by the SUSPEND config option.

The introduction of CPU_IDLE for arm64 requires that code configured
by ARM64_CPU_SUSPEND (context save/restore) should be compiled in
in order to enable the CPU idle driver to rely on CPU operations
carrying out context save/restore.

The ARM64_CPUIDLE config option (ARM64 generic idle driver) is therefore
forced to select ARM64_CPU_SUSPEND, even if there may be (ie PM_SLEEP)
failed dependencies, which is not a clean way of handling the kernel
configuration option.

For these reasons, this patch removes the ARM64_CPU_SUSPEND config option
and makes the context save/restore dependent on CPU_PM, which is selected
whenever either SUSPEND or CPU_IDLE are configured, cleaning up dependencies
in the process.

This way, code previously configured through ARM64_CPU_SUSPEND is
compiled in whenever a power management subsystem requires it to be
present in the kernel (SUSPEND || CPU_IDLE), which is the behaviour
expected on ARM64 kernels.

The cpu_suspend and cpu_init_idle CPU operations are added only if
CPU_IDLE is selected, since they are CPU_IDLE specific methods and
should be grouped and defined accordingly.

PSCI CPU operations are updated to reflect the introduced changes.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-01-27 11:35:33 +00:00
Sudeep Holla 194fe6f28e drivers: cpuidle: Don't initialize big.LITTLE driver if MCPM is unavailable
If big.LITTLE driver is initialized even when MCPM is unavailable,
we get the below warning the first time cpu tries to enter deeper
C-states.

------------[ cut here ]------------
WARNING: CPU: 4 PID: 0 at kernel/arch/arm/common/mcpm_entry.c:130 mcpm_cpu_suspend+0x6d/0x74()
Modules linked in:
CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.19.0-rc3-00007-gaf5a2cb1ad5c-dirty #11
Hardware name: ARM-Versatile Express
[<c0013fa5>] (unwind_backtrace) from [<c001084d>] (show_stack+0x11/0x14)
[<c001084d>] (show_stack) from [<c04fe7f1>] (dump_stack+0x6d/0x78)
[<c04fe7f1>] (dump_stack) from [<c0020645>] (warn_slowpath_common+0x69/0x90)
[<c0020645>] (warn_slowpath_common) from [<c00206db>] (warn_slowpath_null+0x17/0x1c)
[<c00206db>] (warn_slowpath_null) from [<c001cbdd>] (mcpm_cpu_suspend+0x6d/0x74)
[<c001cbdd>] (mcpm_cpu_suspend) from [<c03c6919>] (bl_powerdown_finisher+0x21/0x24)
[<c03c6919>] (bl_powerdown_finisher) from [<c001218d>] (cpu_suspend_abort+0x1/0x14)
[<c001218d>] (cpu_suspend_abort) from [<00000000>] (  (null))
---[ end trace d098e3fd00000008 ]---

This patch fixes the issue by checking for the availability of MCPM
before initializing the big.LITTLE cpuidle driver

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2015-01-23 15:05:48 +01:00
Rafael J. Wysocki ff23ab2441 Merge branches 'pm-cpufreq' and 'pm-cpuidle'
* pm-cpufreq:
  cpufreq: fix a NULL pointer dereference in __cpufreq_governor()
  cpufreq-dt: defer probing if OPP table is not ready

* pm-cpuidle:
  cpuidle / ACPI: remove unused CPUIDLE_FLAG_TIME_INVALID
  cpuidle: ladder: Better idle duration measurement without using CPUIDLE_FLAG_TIME_INVALID
  cpuidle: menu: Better idle duration measurement without using CPUIDLE_FLAG_TIME_INVALID
2014-12-29 21:23:41 +01:00
Linus Torvalds 34b85e3574 powerpc updates for 3.19 batch 2
The highlight is the series that reworks the idle management on powernv, which
 allows us to use deeper idle states on those machines.
 
 There's the fix from Anton for the "BUG at kernel/smpboot.c:134!" problem.
 
 An i2c driver for powernv. This is acked by Wolfram Sang, and he asked that we
 take it through the powerpc tree.
 
 A fix for audit from rgb at Red Hat, acked by Paul Moore who is one of the audit
 maintainers.
 
 A patch from Ben to export the symbol map of our OPAL firmware as a sysfs file,
 so that tools can use it.
 
 Also some CXL fixes, a couple of powerpc perf fixes, a fix for smt-enabled, and
 the patch to add __force to get_user() so we can use bitwise types.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUk+oCAAoJEFHr6jzI4aWADBAP/i/CJ+cu6o4mzNDdfs8bnxqn
 RGZCSV+SrkTZPcoLbLiM9iaqq34ORVIn7hwkhkTz2/koluMVfTsqtVulMoFf+hVd
 GTVt81MjMFzA3hM3bXEV58KRT79+64K54dLCe0F7OaD6f4AikKR4LLz/PY0EBMiZ
 2h13uQlfglaMeYTsaD9eeUpIIKs7+PwsNqUknmN9We07WWfxWqnRpiTR4TYTMXx4
 3lQPvCnnHokwDqjuKgwiqDVSaCfCl8laS1i+BPk0G0aRV1AnPDvR3MhgVb2IpNxX
 Joxy2D1HSawwDhqHOsId8dkGZXOM4vzo+Y658qnC1XfThqE0MhA+kCfa5/b6xlOR
 K7nDO5A41B6nXB3mMOQh/szTXSIa8KJRTR3ibbJJrMdF6F0TN0JLLQNUcmM4j/5D
 vvgZEzvFNZhWX98ktlQLde2E4ClWJg6mWESCGSgJeVjIXaxe/6GneIa8vLKm5QMu
 OoykNsASMDGqddYMGoYeX/mSsvjPjK0PDO2q19sPbkP8xpyDLx6J8xo+5hO4l8xc
 0Cdb38ECfeno+w5oKAnjidHnz0KYBsuYFLeS+rV0b8sUSWAzfdEjSn2AVIQ8gLOv
 IOCAqwZ5tL9EcUs+AKru5EHtBEV+2XB54xPRxfdFS/k+vYRE7MpS3ipxveIynN2l
 eRxf9hsSO7ASNDd0b3ID
 =GXdK
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull second batch of powerpc updates from Michael Ellerman:
 "The highlight is the series that reworks the idle management on
  powernv, which allows us to use deeper idle states on those machines.

  There's the fix from Anton for the "BUG at kernel/smpboot.c:134!"
  problem.

  An i2c driver for powernv.  This is acked by Wolfram Sang, and he
  asked that we take it through the powerpc tree.

  A fix for audit from rgb at Red Hat, acked by Paul Moore who is one of
  the audit maintainers.

  A patch from Ben to export the symbol map of our OPAL firmware as a
  sysfs file, so that tools can use it.

  Also some CXL fixes, a couple of powerpc perf fixes, a fix for
  smt-enabled, and the patch to add __force to get_user() so we can use
  bitwise types"

* tag 'powerpc-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc/powernv: Ignore smt-enabled on Power8 and later
  powerpc/uaccess: Allow get_user() with bitwise types
  powerpc/powernv: Expose OPAL firmware symbol map
  powernv/powerpc: Add winkle support for offline cpus
  powernv/cpuidle: Redesign idle states management
  powerpc/powernv: Enable Offline CPUs to enter deep idle states
  powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode
  i2c: Driver to expose PowerNV platform i2c busses
  powerpc: add little endian flag to syscall_get_arch()
  power/perf/hv-24x7: Use kmem_cache_free() instead of kfree
  powerpc/perf/hv-24x7: Use per-cpu page buffer
  cxl: Unmap MMIO regions when detaching a context
  cxl: Add timeout to process element commands
  cxl: Change contexts_lock to a mutex to fix sleep while atomic bug
  powerpc: Secondary CPUs must set cpu_callin_map after setting active and online
2014-12-19 12:57:45 -08:00
Len Brown b73026b9c9 cpuidle: ladder: Better idle duration measurement without using CPUIDLE_FLAG_TIME_INVALID
When the ladder governor sees the CPUIDLE_FLAG_TIME_INVALID flag,
it unconditionally causes a state promotion by setting last_residency
to a number higher than the state's promotion_time:

last_residency = last_state->threshold.promotion_time + 1

It does this for fear that cpuidle_get_last_residency()
will be in-accurate, because cpuidle_enter_state() invoked
a state with CPUIDLE_FLAG_TIME_INVALID.

But the only state with CPUIDLE_FLAG_TIME_INVALID is
acpi_safe_halt(), which may return well after its actual
idle duration because it enables interrupts, so cpuidle_enter_state()
also measures interrupt service time.

So what?  In ladder, a huge invalid last_residency has exactly
the same effect as the current code -- it unconditionally
causes a state promotion.

In the case where the idle residency plus measured interrupt
handling time is less than the state's demotion_time -- we should
use that timestamp to give ladder a chance to demote, rather than
unconditionally promoting.

This can be done by simply ignoring the CPUIDLE_FLAG_TIME_INVALID,
and using the "invalid" time, as it is either equal to what we are
doing today, or better.

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-17 02:26:28 +01:00
Len Brown 4108b3d962 cpuidle: menu: Better idle duration measurement without using CPUIDLE_FLAG_TIME_INVALID
When menu sees CPUIDLE_FLAG_TIME_INVALID, it ignores its timestamps,
and assumes that idle lasted as long as the time till next predicted
timer expiration.

But if an interrupt was seen and serviced before that duration,
it would actually be more accurate to use the measured time
rather than rounding up to the next predicted timer expiration.

And if an interrupt is seen and serviced such that the mesured time
exceeds the time till next predicted timer expiration, then
truncating to that expiration is the right thing to do --
since we can never stay idle past that timer expiration.

So the code can do a better job without
checking for CPUIDLE_FLAG_TIME_INVALID.

Signed-off-by: Len Brown <len.brown@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Tuukka Tikkanen <tuukka.tikkanen@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-12-17 02:26:28 +01:00
Linus Torvalds e6b5be2be4 Driver core patches for 3.19-rc1
Here's the set of driver core patches for 3.19-rc1.
 
 They are dominated by the removal of the .owner field in platform
 drivers.  They touch a lot of files, but they are "simple" changes, just
 removing a line in a structure.
 
 Other than that, a few minor driver core and debugfs changes.  There are
 some ath9k patches coming in through this tree that have been acked by
 the wireless maintainers as they relied on the debugfs changes.
 
 Everything has been in linux-next for a while.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlSOD20ACgkQMUfUDdst+ylLPACg2QrW1oHhdTMT9WI8jihlHVRM
 53kAoLeteByQ3iVwWurwwseRPiWa8+MI
 =OVRS
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core update from Greg KH:
 "Here's the set of driver core patches for 3.19-rc1.

  They are dominated by the removal of the .owner field in platform
  drivers.  They touch a lot of files, but they are "simple" changes,
  just removing a line in a structure.

  Other than that, a few minor driver core and debugfs changes.  There
  are some ath9k patches coming in through this tree that have been
  acked by the wireless maintainers as they relied on the debugfs
  changes.

  Everything has been in linux-next for a while"

* tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (324 commits)
  Revert "ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries"
  fs: debugfs: add forward declaration for struct device type
  firmware class: Deletion of an unnecessary check before the function call "vunmap"
  firmware loader: fix hung task warning dump
  devcoredump: provide a one-way disable function
  device: Add dev_<level>_once variants
  ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries
  ath: use seq_file api for ath9k debugfs files
  debugfs: add helper function to create device related seq_file
  drivers/base: cacheinfo: remove noisy error boot message
  Revert "core: platform: add warning if driver has no owner"
  drivers: base: support cpu cache information interface to userspace via sysfs
  drivers: base: add cpu_device_create to support per-cpu devices
  topology: replace custom attribute macros with standard DEVICE_ATTR*
  cpumask: factor out show_cpumap into separate helper function
  driver core: Fix unbalanced device reference in drivers_probe
  driver core: fix race with userland in device_add()
  sysfs/kernfs: make read requests on pre-alloc files use the buffer.
  sysfs/kernfs: allow attributes to request write buffer be pre-allocated.
  fs: sysfs: return EGBIG on write if offset is larger than file size
  ...
2014-12-14 16:10:09 -08:00
Shreyas B. Prabhu 7cba160ad7 powernv/cpuidle: Redesign idle states management
Deep idle states like sleep and winkle are per core idle states. A core
enters these states only when all the threads enter either the
particular idle state or a deeper one. There are tasks like fastsleep
hardware bug workaround and hypervisor core state save which have to be
done only by the last thread of the core entering deep idle state and
similarly tasks like timebase resync, hypervisor core register restore
that have to be done only by the first thread waking up from these
state.

The current idle state management does not have a way to distinguish the
first/last thread of the core waking/entering idle states. Tasks like
timebase resync are done for all the threads. This is not only is
suboptimal, but can cause functionality issues when subcores and kvm is
involved.

This patch adds the necessary infrastructure to track idle states of
threads in a per-core structure. It uses this info to perform tasks like
fastsleep workaround and timebase resync only once per core.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Originally-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: linux-pm@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-15 10:46:40 +11:00
Shreyas B. Prabhu 8eb8ac89a3 powerpc/powernv: Enable Offline CPUs to enter deep idle states
The secondary threads should enter deep idle states so as to gain maximum
powersavings when the entire core is offline. To do so the offline path
must be made aware of the available deepest idle state. Hence probe the
device tree for the possible idle states in powernv core code and
expose the deepest idle state through flags.

Since the  device tree is probed by the cpuidle driver as well, move
the parameters required to discover the idle states into an appropriate
common place to both the driver and the powernv core code.

Another point is that fastsleep idle state may require workarounds in
the kernel to function properly. This workaround is introduced in the
subsequent patches. However neither the cpuidle driver or the hotplug
path need be bothered about this workaround.

They will be taken care of by the core powernv code.

Originally-by: Srivatsa S. Bhat <srivatsa@mit.edu>
Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: Paul Mackerras <paulus@samba.org>

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: linux-pm@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-15 10:46:40 +11:00
Rafael J. Wysocki 0a924200ae Merge back earlier cpuidle material for 3.19-rc1.
Conflicts:
	drivers/cpuidle/dt_idle_states.c
2014-11-21 16:31:42 +01:00
Lorenzo Pieralisi 18f95a3640 drivers: cpuidle: Remove cpuidle-arm64 duplicate error messages
Current CPUidle driver for arm64 machines spits errors upon idle
state initialization and cpuidle driver registration failures.
These error messages are already printed in core code so there is
no need to print them again.

This patch removes the duplicate print messages from the cpuidle-arm64
driver.

Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-11-19 10:16:28 +01:00
Lorenzo Pieralisi c00bc5df7c drivers: cpuidle: Add idle-state-name description to ARM idle states
On ARM machines, where generally speaking the idle state numbering has
no fixed and standard meaning it is useful to provide a description
of the idle state inner workings for benchmarking and monitoring purposes.

This patch adds a property to the idle states bindings that if present
gives platform firmware a means of describing the idle state and export
the string description to user space.

The patch updates the DT parsing code accordingly to take the description,
if present, into consideration.

Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-11-19 10:16:23 +01:00
Lorenzo Pieralisi 97735da074 drivers: cpuidle: Add status property to ARM idle states
On some platforms the device tree bindings must provide the kernel
with a status flag for idle states, that defines whether the idle
state is operational or not in the current configuration.

This patch adds a status property to the ARM idle states compliant
with ePAPR v1.1 and updates the DT parsing code accordingly.

Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2014-11-19 10:16:15 +01:00
Daniel Lezcano b82b6cca48 cpuidle: Invert CPUIDLE_FLAG_TIME_VALID logic
The only place where the time is invalid is when the ACPI_CSTATE_FFH entry
method is not set. Otherwise for all the drivers, the time can be correctly
measured.

Instead of duplicating the CPUIDLE_FLAG_TIME_VALID flag in all the drivers
for all the states, just invert the logic by replacing it by the flag
CPUIDLE_FLAG_TIME_INVALID, hence we can set this flag only for the acpi idle
driver, remove the former flag from all the drivers and invert the logic with
this flag in the different governor.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-11-12 21:17:27 +01:00
Greg Kroah-Hartman a8a93c6f99 Merge branch 'platform/remove_owner' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into driver-core-next
Remove all .owner fields from platform drivers
2014-11-03 19:53:56 -08:00
Linus Torvalds 2cc91884b6 Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "This is the first round of fixes and tying up loose ends for MIPS.

   - plenty of fixes for build errors in specific obscure configurations
   - remove redundant code on the Lantiq platform
   - removal of a useless SEAD I2C driver that was causing a build issue
   - fix an earlier TLB exeption handler fix to also work on Octeon.
   - fix ISA level dependencies in FPU emulator's instruction decoding.
   - don't hardcode kernel command line in Octeon software emulator.
   - fix an earlier fix for the Loondson 2 clock setting"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: SEAD3: Fix I2C device registration.
  MIPS: SEAD3: Nuke PIC32 I2C driver.
  MIPS: ftrace: Fix a microMIPS build problem
  MIPS: MSP71xx: Fix build error
  MIPS: Malta: Do not build the malta-amon.c file if CMP is not enabled
  MIPS: Prevent compiler warning from cop2_{save,restore}
  MIPS: Kconfig: Add missing MIPS_CPS dependencies to PM and cpuidle
  MIPS: idle: Remove leftover __pastwait symbol and its references
  MIPS: Sibyte: Include the swarm subdir to the sb1250 LittleSur builds
  MIPS: ptrace.h: Add a missing include
  MIPS: ath79: Fix compilation error when CONFIG_PCI is disabled
  MIPS: MSP71xx: Remove compilation error when CONFIG_MIPS_MT is present
  MIPS: Octeon: Remove special case for simulator command line.
  MIPS: tlbex: Properly fix HUGE TLB Refill exception handler
  MIPS: loongson2_cpufreq: Fix CPU clock rate setting mismerge
  pci: pci-lantiq: remove duplicate check on resource
  MIPS: Lasat: Add missing CONFIG_PROC_FS dependency to PICVUE_PROC
  MIPS: cp1emu: Fix ISA restrictions for cop1x_op instructions
2014-10-24 12:48:47 -07:00
Markos Chandras 39a5959355 MIPS: Kconfig: Add missing MIPS_CPS dependencies to PM and cpuidle
The MIPS_CPS_PM and MIPS_CPS_CPUIDLE implementation should depend
on the MIPS_CPS symbol to avoid the following build problem

arch/mips/kernel/pm-cps.c: In function 'cps_pm_enter_state':
arch/mips/kernel/pm-cps.c:164:26: error: 'cpu_coherent_mask' undeclared
(first use in this function)
cpumask_clear_cpu(cpu, &cpu_coherent_mask);
                            ^
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/7798/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2014-10-23 19:58:05 +02:00
Preeti U. Murthy 74aa51b5cc cpuidle: powernv: Populate cpuidle state details by querying the device-tree
We hard code the metrics relevant for cpuidle states in the kernel today.
Instead pick them up from the device tree so that they remain relevant
and updated for the system that the kernel is running on.

Signed-off-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-21 00:58:58 +02:00
Wolfram Sang c6ec883212 cpuidle: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:24 +02:00
Linus Torvalds 0429fbc0bd Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu consistent-ops changes from Tejun Heo:
 "Way back, before the current percpu allocator was implemented, static
  and dynamic percpu memory areas were allocated and handled separately
  and had their own accessors.  The distinction has been gone for many
  years now; however, the now duplicate two sets of accessors remained
  with the pointer based ones - this_cpu_*() - evolving various other
  operations over time.  During the process, we also accumulated other
  inconsistent operations.

  This pull request contains Christoph's patches to clean up the
  duplicate accessor situation.  __get_cpu_var() uses are replaced with
  with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

  Unfortunately, the former sometimes is tricky thanks to C being a bit
  messy with the distinction between lvalues and pointers, which led to
  a rather ugly solution for cpumask_var_t involving the introduction of
  this_cpu_cpumask_var_ptr().

  This converts most of the uses but not all.  Christoph will follow up
  with the remaining conversions in this merge window and hopefully
  remove the obsolete accessors"

* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
  irqchip: Properly fetch the per cpu offset
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
  ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
  Revert "powerpc: Replace __get_cpu_var uses"
  percpu: Remove __this_cpu_ptr
  clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
  sparc: Replace __get_cpu_var uses
  avr32: Replace __get_cpu_var with __this_cpu_write
  blackfin: Replace __get_cpu_var uses
  tile: Use this_cpu_ptr() for hardware counters
  tile: Replace __get_cpu_var uses
  powerpc: Replace __get_cpu_var uses
  alpha: Replace __get_cpu_var
  ia64: Replace __get_cpu_var uses
  s390: cio driver &__get_cpu_var replacements
  s390: Replace __get_cpu_var uses
  mips: Replace __get_cpu_var uses
  MIPS: Replace __get_cpu_var uses in FPU emulator.
  arm: Replace __this_cpu_ptr with raw_cpu_ptr
  ...
2014-10-15 07:48:18 +02:00