During suspend/resume usecases and tests, it is common to see issues
such as lockups either in suspend path or resume path because of the
bugs in the corresponding device driver pm handling code. In such cases,
it is important that watchdog is active to make sure that we either
receive a watchdog pretimeout notification or a bite causing reset
instead of a hang causing us to hard reset the machine.
There are good reasons as to why we need this because:
* We can have a watchdog pretimeout governor set to panic in which
case we can have a backtrace which would help identify the issue
with the particular driver and cause a normal reboot.
* Even in case where there is no pretimeout support, a watchdog
bite is still useful because some firmware has debug support to dump
CPU core context on watchdog bite for post-mortem analysis.
* One more usecase which comes to mind is of warm reboot. In case we
hard reset the target, a cold reboot could be induced resulting in
lose of ddr contents thereby losing all the debug info.
Currently, the watchdog pm callback just invokes the usual suspend
and resume callback which do not have any special ordering in the
sense that a watchdog can be suspended before the buggy device driver
suspend callback and watchdog resume can happen after the buggy device
driver resume callback. This would mean that the watchdog will not be
active when the buggy driver cause the lockups thereby hanging the
system. So to make sure this doesn't happen, move the watchdog pm to
use late/early system pm callbacks which will ensure that the watchdog
is suspended late and resumed early so that it can catch such issues.
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20210310202004.1436-1-saiprakash.ranjan@codeaurora.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>