WSL2-Linux-Kernel/drivers/clocksource
Niklas Söderlund 2bb27b956a clocksource/drivers/sh_cmt: Address race condition for clock events
[ Upstream commit db19d3aa77612983a02bd223b3f273f896b243cf ]

There is a race condition in the CMT interrupt handler. In the interrupt
handler the driver sets a driver-private flag, FLAG_IRQCONTEXT. This
flag indicates that any call to set_next_event() should not be
propagated directly to the device, but cached instead. This is done
because the interrupt handler itself reprograms the device when needed
before it completes, which avoids performing the operation twice.

It is unclear why this design was chosen. My suspicion is that it lets
the struct clock_event_device.event_handler callback, which is called
while FLAG_IRQCONTEXT is set, update the next event without having to
write to the device twice.

Unfortunately there is a race window between the point where the
interrupt handler has already started to write the next event to the
device and the point where FLAG_IRQCONTEXT is cleared. If
set_next_event() is called in this window the value is only cached in
the driver but not written. This causes the board to misbehave or,
worse, lock up and produce a splat.

   rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
   rcu:     0-...!: (0 ticks this GP) idle=f5e0/0/0x0 softirq=519/519 fqs=0 (false positive?)
   rcu:     (detected by 1, t=6502 jiffies, g=-595, q=77 ncpus=2)
   Sending NMI from CPU 1 to CPUs 0:
   NMI backtrace for cpu 0
   CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-rc5-arm64-renesas-00019-g74a6f86eaf1c-dirty #20
   Hardware name: Renesas Salvator-X 2nd version board based on r8a77965 (DT)
   pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
   pc : tick_check_broadcast_expired+0xc/0x40
   lr : cpu_idle_poll.isra.0+0x8c/0x168
   sp : ffff800081c63d70
   x29: ffff800081c63d70 x28: 00000000580000c8 x27: 00000000bfee5610
   x26: 0000000000000027 x25: 0000000000000000 x24: 0000000000000000
   x23: ffff00007fbb9100 x22: ffff8000818f1008 x21: ffff8000800ef07c
   x20: ffff800081c79ec0 x19: ffff800081c70c28 x18: 0000000000000000
   x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffc2c717d8
   x14: 0000000000000000 x13: ffff000009c18080 x12: ffff8000825f7fc0
   x11: 0000000000000000 x10: ffff8000818f3cd4 x9 : 0000000000000028
   x8 : ffff800081c79ec0 x7 : ffff800081c73000 x6 : 0000000000000000
   x5 : 0000000000000000 x4 : ffff7ffffe286000 x3 : 0000000000000000
   x2 : ffff7ffffe286000 x1 : ffff800082972900 x0 : ffff8000818f1008
   Call trace:
    tick_check_broadcast_expired+0xc/0x40
    do_idle+0x9c/0x280
    cpu_startup_entry+0x34/0x40
    kernel_init+0x0/0x11c
    do_one_initcall+0x0/0x260
    __primary_switched+0x80/0x88
   rcu: rcu_preempt kthread timer wakeup didn't happen for 6501 jiffies! g-595 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
   rcu:     Possible timer handling issue on cpu=0 timer-softirq=262
   rcu: rcu_preempt kthread starved for 6502 jiffies! g-595 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=0
   rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
   rcu: RCU grace-period kthread stack dump:
   task:rcu_preempt     state:I stack:0     pid:15    tgid:15    ppid:2      flags:0x00000008
   Call trace:
    __switch_to+0xbc/0x100
    __schedule+0x358/0xbe0
    schedule+0x48/0x148
    schedule_timeout+0xc4/0x138
    rcu_gp_fqs_loop+0x12c/0x764
    rcu_gp_kthread+0x208/0x298
    kthread+0x10c/0x110
    ret_from_fork+0x10/0x20

The design has been part of the driver since it was first merged in
early 2009. The issue becomes increasingly hard to trigger the older
the kernel version tested. It only takes a few boots on v6.10-rc5,
while hundreds of boots are needed to trigger it on v5.10.

Close the race condition by using the CMT channel lock for the two
competing sections. The channel lock was added to the driver after its
initial design.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Link: https://lore.kernel.org/r/20240702190230.3825292-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19 05:45:37 +02:00
Kconfig clocksource/drivers/timer-ti-dm: Select TIMER_OF 2021-11-18 19:16:39 +01:00
Makefile clocksource/drivers/prima: Remove sirf prima driver 2021-02-03 09:13:46 +01:00
acpi_pm.c clocksource: acpi_pm: fix return value of __setup handler 2022-04-08 14:23:09 +02:00
arc_timer.c clocksource/drivers/arc_timer: Remove duplicate error message 2020-05-22 23:58:56 +02:00
arm_arch_timer.c clocksource/arm_arch_timer: Improve Allwinner A64 timer workaround 2021-06-16 17:33:04 +02:00
arm_global_timer.c clocksource/drivers/arm_global_timer: Fix maximum prescaler value 2024-04-10 16:18:46 +02:00
armv7m_systick.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 194 2019-05-30 11:29:22 -07:00
asm9260_timer.c clocksource/drivers/asm9260: Add a check for of_clk_get 2019-11-04 10:40:10 +01:00
bcm2835_timer.c clocksource: Replace setup_irq() by request_irq() 2020-02-27 12:15:24 +01:00
bcm_kona_timer.c clocksource: Replace setup_irq() by request_irq() 2020-02-27 12:15:24 +01:00
clksrc-dbx500-prcmu.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
clksrc_st_lpc.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
clps711x-timer.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
dummy_timer.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
dw_apb_timer.c clocksource: dw_apb_timer: Make CPU-affiliation being optional 2020-05-23 00:02:41 +02:00
dw_apb_timer_of.c clocksource/drivers/dw_apb_timer_of: Fix probe failure 2021-12-14 10:57:23 +01:00
em_sti.c clocksource/drivers/em_sti: Fix variable declaration in em_sti_probe 2020-01-16 19:06:57 +01:00
exynos_mct.c clocksource/drivers/exynos_mct: Handle DTS with higher number of interrupts 2022-04-08 14:23:08 +02:00
h8300_timer8.c clocksource/drivers/h8300_timer8: Fix wrong return value in h8300_8timer_init() 2020-08-24 13:01:38 +02:00
h8300_timer16.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
h8300_tpu.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
hyperv_timer.c clocksource: hyper-v: unexport __init-annotated hv_init_clocksource() 2022-06-22 14:21:59 +02:00
i8253.c clockevents/drivers/i8253: Add support for PIT shutdown quirk 2018-11-04 11:04:46 +01:00
ingenic-ost.c clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe() 2021-04-08 13:24:15 +02:00
ingenic-sysost.c clocksource/drivers/ingenic: Use bitfield macro helpers 2021-08-14 02:44:35 +02:00
ingenic-timer.c clocksource/drivers/ingenic: Add support for the JZ4760 2021-04-08 13:23:22 +02:00
jcore-pit.c clocksource/drivers: Rename CLOCKSOURCE_OF_DECLARE to TIMER_OF_DECLARE 2017-06-14 11:58:45 +02:00
mips-gic-timer.c clocksource: mips-gic-timer: Mark GIC timer as unstable if ref clock changes 2020-05-23 00:03:16 +02:00
mmio.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
mps2-timer.c clocksource/drivers/mps2-timer: Use semicolons rather than commas to separate statements 2020-10-01 10:07:26 +02:00
mxs_timer.c clocksource/drivers/mxs_timer: Add missing semicolon when DEBUG is defined 2021-01-18 22:28:59 +01:00
nomadik-mtu.c clocksource/drivers/nomadik-mtu: Handle 32kHz clock 2020-07-23 16:57:43 +02:00
numachip.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282 2019-06-05 17:36:37 +02:00
renesas-ostm.c clocksource/drivers/renesas-ostm: Use unique device name instead of ostm 2019-11-04 10:38:46 +01:00
samsung_pwm_timer.c clocksource/drivers/samsung_pwm: Constify source IO memory 2021-06-04 10:12:13 +02:00
scx200_hrt.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
sh_cmt.c clocksource/drivers/sh_cmt: Address race condition for clock events 2024-08-19 05:45:37 +02:00
sh_mtu2.c PM: domains: Rename pm_genpd_syscore_poweroff|poweron() 2020-11-10 20:42:01 +01:00
sh_tmu.c PM: domains: Rename pm_genpd_syscore_poweroff|poweron() 2020-11-10 20:42:01 +01:00
timer-armada-370-xp.c clocksource/drivers/armada-370-xp: Use semicolons rather than commas to separate statements 2020-10-02 16:27:28 +02:00
timer-atcpit100.c clocksource/drivers: Set clockevent device cpumask to cpu_possible_mask 2018-07-26 11:26:30 +02:00
timer-atmel-pit.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
timer-atmel-st.c clocksource/drivers/atmel-st: Remove useless 'status' 2020-04-15 10:57:15 +02:00
timer-atmel-tcb.c clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware 2023-11-28 16:56:15 +00:00
timer-cadence-ttc.c clocksource/drivers/cadence-ttc: Fix memory leak in ttc_timer_probe 2023-07-23 13:46:45 +02:00
timer-clint.c clocksource: clint: Export clint_time_val for modules 2020-09-29 23:55:27 -07:00
timer-cs5535.c clocksource/drivers/timer-cs5535: Request irq with non-NULL dev_id 2020-03-12 19:23:06 +01:00
timer-davinci.c clocksource/drivers/davinci: Fix memory leak in davinci_timer_register when init fails 2023-05-11 23:00:37 +09:00
timer-digicolor.c clocksource/drivers: Rename CLOCKSOURCE_OF_DECLARE to TIMER_OF_DECLARE 2017-06-14 11:58:45 +02:00
timer-fsl-ftm.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
timer-fttmr010.c clocksource/drivers/fttmr010: Pass around less pointers 2021-08-14 10:49:49 +02:00
timer-gx6605s.c clocksource/drivers/timer-gx6605s: Fixup counter reload 2020-08-24 13:01:39 +02:00
timer-imx-gpt.c clocksource/drivers/timer-imx-gpt: Fix potential memory leak 2023-11-28 16:56:15 +00:00
timer-imx-sysctr.c clocksource/drivers/imx-sysctr: Remove unused includes 2020-03-17 10:11:45 +01:00
timer-imx-tpm.c clocksource/drivers/imx-tpm: Add support for ARM64 2020-04-09 16:24:50 +02:00
timer-integrator-ap.c clocksource: Replace setup_irq() by request_irq() 2020-02-27 12:15:24 +01:00
timer-ixp4xx.c clocksource/drivers/ixp4xx: remove EXPORT_SYMBOL_GPL from ixp4xx_timer_setup() 2022-07-07 17:53:32 +02:00
timer-keystone.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
timer-lpc32xx.c clocksource/drivers: Unify the names to timer-* format 2018-10-03 14:37:02 +02:00
timer-mediatek.c clocksource/drivers/mediatek: Optimize systimer irq clear flow on shutdown 2021-08-14 02:44:35 +02:00
timer-meson6.c clocksource: Replace setup_irq() by request_irq() 2020-02-27 12:15:24 +01:00
timer-microchip-pit64b.c clocksource/drivers/timer-microchip-pit64b: Use notrace 2022-04-08 14:23:08 +02:00
timer-milbeaut.c clocksource/drivers/timer-milbeaut: Cleanup common register accesses 2019-05-02 21:55:58 +02:00
timer-mp-csky.c clocksource/drivers/c-sky: fixup ftrace call-graph panic 2018-12-31 23:17:23 +08:00
timer-npcm7xx.c clocksource/drivers/npcm: Add support for WPCM450 2021-04-08 13:24:16 +02:00
timer-of.c clocksource/drivers/timer-of: Check return value of of_iomap in timer_of_base_init() 2022-04-08 14:23:09 +02:00
timer-of.h clocksource/drivers/timer-of: Store the device node pointer in 'struct timer_of' 2018-01-08 17:57:24 +01:00
timer-orion.c clocksource/drivers/orion: Add missing clk_disable_unprepare() on error path 2020-12-03 19:16:26 +01:00
timer-owl.c clocksource/drivers/owl: Improve owl_timer_init fail messages 2020-02-27 09:42:00 +01:00
timer-oxnas-rps.c clocksource/drivers/oxnas-rps: Fix irq_of_parse_and_map() return value 2022-06-14 18:36:09 +02:00
timer-pistachio.c clocksource/drivers/pistachio: Fix trivial typo 2021-04-08 13:24:15 +02:00
timer-probe.c treewide: Convert macro and uses of __section(foo) to __section("foo") 2020-10-25 14:51:49 -07:00
timer-pxa.c clocksource: Replace setup_irq() by request_irq() 2020-02-27 12:15:24 +01:00
timer-qcom.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282 2019-06-05 17:36:37 +02:00
timer-rda.c clocksource/drivers/rda: Add clock driver for RDA8810PL SoC 2018-12-18 22:22:23 +01:00
timer-riscv.c Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend" 2022-12-08 11:28:45 +01:00
timer-rockchip.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
timer-sp.h clocksource/drivers/sp804: Enable Hisilicon sp804 timer 64bit mode 2020-09-24 10:51:04 +02:00
timer-sp804.c clocksource/drivers/sp804: Avoid error on multiple instances 2022-06-14 18:36:22 +02:00
timer-sprd.c clocksource/drivers/sprd: Register one always-on timer to compensate suspend time 2018-07-26 11:26:34 +02:00
timer-stm32-lp.c clocksource: Add Low Power STM32 timers driver 2020-06-18 11:19:58 +01:00
timer-stm32.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 194 2019-05-30 11:29:22 -07:00
timer-sun4i.c clocksource: sun4i: Add missing compatibles 2019-08-27 00:31:39 +02:00
timer-sun5i.c clocksource/drivers/sun5i: Fail gracefully when clock rate is unavailable 2019-02-23 12:13:45 +01:00
timer-tegra.c clocksource/drivers/tegra: Set up maximum-ticks limit properly 2019-06-25 19:49:18 +02:00
timer-ti-32k.c clocksource/drivers: Replace HTTP links with HTTPS ones 2020-07-23 16:57:43 +02:00
timer-ti-dm-systimer.c clocksource/drivers/timer-ti-dm: Fix missing clk_disable_unprepare in dmtimer_systimer_init_clock() 2022-12-31 13:14:04 +01:00
timer-ti-dm.c clocksource/drivers/timer-ti-dm: Drop unnecessary restore 2021-06-16 17:33:04 +02:00
timer-versatile.c clocksource/drivers/timer-versatile: Clear OF_POPULATED flag 2020-05-23 00:03:25 +02:00
timer-vf-pit.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
timer-vt8500.c clocksource: Replace setup_irq() by request_irq() 2020-02-27 12:15:24 +01:00
timer-zevio.c clocksource: Replace setup_irq() by request_irq() 2020-02-27 12:15:24 +01:00