Граф коммитов

17 Коммитов

Автор SHA1 Сообщение Дата
Rafael J. Wysocki b3ca7aff3c intel: thermal: PCH: Drop ACPI_FADT_LOW_POWER_S0 check
If ACPI_FADT_LOW_POWER_S0 is not set, this doesn't mean that low-power
S0 idle is not usable.  It merely means that using S3 on the given
system is more beneficial from the energy saving perspective than using
low-power S0 idle, as long as S3 is supported.

Suspend-to-idle is still a valid suspend mode if ACPI_FADT_LOW_POWER_S0
is not set and the pm_suspend_via_firmware() check in pch_wpt_suspend()
is sufficient to distinguish suspend-to-idle from S3, so drop the
confusing ACPI_FADT_LOW_POWER_S0 check.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
2022-07-22 21:32:47 +02:00
Zhang Rui bd30d075ee thermal: intel: pch: improve the cooling delay log
Previously, during suspend, intel_pch_thermal driver logs for every
cooling iteration, about the current PCH temperature and number of cooling
iterations that have been tried, like below

[  100.955526] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 1 times for 100 ms duration
[  101.064156] intel_pch_thermal 0000:00:14.2: CPU-PCH current temp [53C] higher than the threshold temp [50C], sleep 2 times for 100 ms duration

After changing the default delay_cnt to 600, in practice, it is common to
see tens of the above messages if the system is suspended when PCH
overheats. Thus, change this log message from dev_warn to dev_dbg because
it is only useful when we want to check the temperature trend.

At the same time, there is always a one-line message given by the driver
with the patch applied, with below four possibilities.

1. PCH is cool, no cooling delay needed
[ 1791.902853] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [48C]

2. PCH overheats and becomes cool after the cooling delays
[ 1475.511617] intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [49C] after 30700 ms delay

3. PCH still overheats after the overall cooling timeout
[ 2250.157487] intel_pch_thermal 0000:00:12.0: CPU-PCH is hot [60C] after 60000 ms delay. S0ix might fail

4. PCH aborts cooling because of wakeup event detected during the delay
[ 1933.639509] intel_pch_thermal 0000:00:12.0: Wakeup event detected, abort cooling

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19 19:40:25 +02:00
Zhang Rui 92923028e9 thermal: intel: pch: enhance overheat handling
Commit ef63b043ac ("thermal: intel: pch: fix S0ix failure due to PCH
temperature above threshold") introduces delay loop mechanism that allows
PCH temperature to go down below threshold during suspend so it won't
block S0ix. And the default overall delay timeout is 1 second.

However, in practice, we found that the time it takes to cool the PCH down
below threshold highly depends on the initial PCH temperature when the
delay starts, as well as the ambient temperature.
And in some cases, the 1 second delay is not sufficient. As a result, the
system stays in a shallower power state like PCx instead of S0ix, and
drains the battery power, without user' notice.

To make sure S0ix is not blocked by the PCH overheating, we
1. expand the default overall timeout to 60 seconds.
2. make sure the temperature is below threshold rather than equal to it.

At the same time, as the cooling delay can be much longer and many wakeup
events (ACPI Power Button press, USB mouse move, etc) becomes valid in the
suspend_noirq phase, add detection of wakeup event so that the driver
does not delay blindly when the system suspend is likely to abort soon.

This patch may introduce longer suspend time, but only in the cases when
the system overheats and Linux used to enter a shallower S2idle state,
say, PCx instead of S0ix.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19 19:40:25 +02:00
Zhang Rui 28708e1937 thermal: intel: pch: move cooling delay to suspend_noirq phase
Move the PCH Thermal driver suspend callback to suspend_noirq to do
cooling while the system is more quiescent.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-05-19 19:40:25 +02:00
Kai-Heng Feng 03671968d0 thermal: intel: pch: Fix unexpected shutdown at critical temperature
Like previous patch, the intel_pch_thermal device is not in ACPI
ThermalZone namespace, so a critical trip doesn't mean shutdown.

Override the default .critical callback to prevent surprising thermal
shutdoown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201221172345.36976-2-kai.heng.feng@canonical.com
2021-01-19 22:30:25 +01:00
Sumeet Pawnikar 8639ff4194 thermal: intel: pch: use macro for temperature calculation
Use macro for temperature calculation

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201210124801.13850-1-sumeet.r.pawnikar@intel.com
2020-12-10 14:30:44 +01:00
Randy Dunlap be133722df thermal: intel_pch_thermal: fix build for ACPI not enabled
The reference to acpi_gbl_FADT causes a build error when ACPI is not
enabled. Fix by making that conditional on CONFIG_ACPI.

../drivers/thermal/intel/intel_pch_thermal.c: In function 'pch_wpt_suspend':
../drivers/thermal/intel/intel_pch_thermal.c:217:8: error: 'acpi_gbl_FADT' undeclared (first use in this function); did you mean 'acpi_get_type'?
  if (!(acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0))
        ^~~~~~~~~~~~~

Fixes: ef63b043ac ("thermal: intel: pch: fix S0ix failure due to PCH temperature above threshold")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Amit Kucheria <amitk@kernel.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201117023807.8266-1-rdunlap@infradead.org
2020-11-17 10:05:43 +01:00
Andres Freund e78acf7efe thermal: intel_pch_thermal: Add PCI ids for Lewisburg PCH.
I noticed that I couldn't read the PCH temperature on my workstation
(C620 series chipset, w/ 2x Xeon Gold 5215 CPUs) directly, but had to go
through IPMI. Looking at the data sheet, it looks to me like the
existing intel PCH thermal driver should work without changes for
Lewisburg.

I suspect there's some other PCI IDs missing. But I hope somebody at
Intel would have an easier time figuring that out than I...

Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Tushar Dave <tushar.n.dave@intel.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/lkml/20200115184415.1726953-1-andres@anarazel.de/
Signed-off-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Pandruvada, Srinivas <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201113204916.1144907-1-andres@anarazel.de
2020-11-14 19:44:39 +01:00
Sumeet Pawnikar ef63b043ac thermal: intel: pch: fix S0ix failure due to PCH temperature above threshold
When system tries to enter S0ix suspend state, just after active load
scenarios, it fails due to PCH current temperature is higher than set
threshold.
This patch introduces delay loop mechanism that allows PCH temperature
to go down below threshold during suspend so it won't fail to enter S0ix.
Add delay loop timeout and count as module parameters for user to tune it,
if required based on system design. This change notifies the different
warning messages like when PCH temperature above the threshold and
executing delay loop. Also, notify the messages when it success or
failure for S0ix entry.
Previously out of 1000 runs around 3 to 5 times it might fail to enter
S0ix just after heavy workload. With this change, S0ix failures reduced
as PCH cools down below threshold.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201106170633.20838-1-sumeet.r.pawnikar@intel.com
2020-11-07 19:02:31 +01:00
Sumeet Pawnikar c569e805c7 thermal: intel: intel_pch_thermal: Add Cannon Lake Low Power PCH support
Add LP (Low Power) PCH id for Cannon Lake (CNL) based platforms.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/1596097503-27924-1-git-send-email-sumeet.r.pawnikar@intel.com
2020-08-04 10:43:03 +02:00
Andrzej Pietrasiewicz bbcf90c064 thermal: Explicitly enable non-changing thermal zone devices
Some thermal zone devices never change their state, so they should be
always enabled.

Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@collabora.com>
Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20200629122925.21729-9-andrzej.p@collabora.com
2020-06-29 20:26:37 +02:00
Akinobu Mita 97a0a49d55 thermal: intel_pch: switch to use <linux/units.h> helpers
This switches the intel pch thermal driver to use
deci_kelvin_to_millicelsius() in <linux/units.h> instead of helpers in
<linux/thermal.h>.

This is preparation for centralizing the kelvin to/from Celsius
conversion helpers in <linux/units.h>.

Link: http://lkml.kernel.org/r/1576386975-7941-7-git-send-email-akinobu.mita@gmail.com
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sujith Thomas <sujith.thomas@intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Andy Shevchenko <andy@infradead.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Amit Kucheria <amit.kucheria@verdurent.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: Hartmut Knaack <knaack.h@gmx.de>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-01-31 10:30:40 -08:00
Gayatri Kammela 35709c4ee7 thermal: intel: intel_pch_thermal: Add Comet Lake (CML) platform support
Add Comet Lake to the list of the platforms to support intel_pch_thermal
driver.

Cc: Zhang rui <rui.zhang@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20191211200043.4985-1-gayatri.kammela@intel.com
2020-01-27 11:43:24 +01:00
Chuhong Yuan 66dd8b802c thermal: intel: Fix unmatched pci_release_region
The driver calls pci_request_regions() in probe and uses
pci_release_regions() in probe failure.
However, it calls pci_release_region() in remove, which does
match the other two calls.
Use pci_release_regions() instead to unify them.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20191206075531.18637-1-hslester96@gmail.com
2020-01-27 11:43:24 +01:00
Chuhong Yuan 97e9cafe85 thermal: intel: Use dev_get_drvdata
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2019-09-24 09:55:51 +08:00
Thomas Gleixner 2025cf9e19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Amit Kucheria 3e8c4d31f8 drivers: thermal: Move various drivers for intel platforms into a subdir
This cleans up the directory a bit, now that we have several other
platforms using platform-specific sub-directories. Compile-tested with
ARCH=x86 defconfig and the drivers explicitly enabled with menuconfig.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
2018-12-07 16:48:47 +08:00