Commit e201260081 ("arm64: perf: Add userspace counter access disable
switch") introduced a new 'perf_user_access' sysctl file to enable and
disable direct userspace access to the PMU counters. Sadly, Geert
reports that on his big.LITTLE SoC ('Renesas Salvator-XS w/ R-Car H3'),
the file is created for each PMU type probed, resulting in a splat
during boot:
| hw perfevents: enabled with armv8_cortex_a53 PMU driver, 7 counters available
| sysctl duplicate entry: /kernel//perf_user_access
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.16.0-rc3-arm64-renesas-00003-ge2012600810c #1420
| Hardware name: Renesas Salvator-X 2nd version board based on r8a77951 (DT)
| Call trace:
| dump_backtrace+0x0/0x190
| show_stack+0x14/0x20
| dump_stack_lvl+0x88/0xb0
| dump_stack+0x14/0x2c
| __register_sysctl_table+0x384/0x818
| register_sysctl+0x20/0x28
| armv8_pmu_init.constprop.0+0x118/0x150
| armv8_a57_pmu_init+0x1c/0x28
| arm_pmu_device_probe+0x1b4/0x558
| armv8_pmu_device_probe+0x18/0x20
| platform_probe+0x64/0xd0
| hw perfevents: enabled with armv8_cortex_a57 PMU driver, 7 counters available
Introduce a state variable to track creation of the sysctl file and
ensure that it is only created once.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: e201260081 ("arm64: perf: Add userspace counter access disable switch")
Link: https://lore.kernel.org/r/CAMuHMdVcDxR9sGzc5pcnORiotonERBgc6dsXZXMd6wTvLGA9iw@mail.gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
* for-next/perf-cpu:
arm64: perf: Support new DT compatibles
arm64: perf: Simplify registration boilerplate
arm64: perf: Support Denver and Carmel PMUs
With the trend for per-core events moving to userspace JSON, registering
names for PMUv3 implementations is increasingly a pure boilerplate
exercise. Let's wrap things a step further so we can generate the basic
PMUv3 init function with a macro invocation, and reduce further new
addition to just 2 lines each.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/b79477ea3b97f685d00511d4ecd2f686184dca34.1639490264.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Add support for the NVIDIA Denver and Carmel PMUs using the generic
PMUv3 event map for now.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
[ rm: reorder entries alphabetically ]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/5f0f69d47acca78a9e479501aa4d8b429e23cf11.1639490264.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Arm PMUs can support direct userspace access of counters which allows for
low overhead (i.e. no syscall) self-monitoring of tasks. The same feature
exists on x86 called 'rdpmc'. Unlike x86, userspace access will only be
enabled for thread bound events. This could be extended if needed, but
simplifies the implementation and reduces the chances for any
information leaks (which the x86 implementation suffers from).
PMU EL0 access will be enabled when an event with userspace access is
part of the thread's context. This includes when the event is not
scheduled on the PMU. There's some additional overhead clearing
dirty counters when access is enabled in order to prevent leaking
disabled counter data from other tasks.
Unlike x86, enabling of userspace access must be requested with a new
attr bit: config1:1. If the user requests userspace access with 64-bit
counters, then the event open will fail if the h/w doesn't support
64-bit counters. Chaining is not supported with userspace access. The
modes for config1 are as follows:
config1 = 0 : user access disabled and always 32-bit
config1 = 1 : user access disabled and always 64-bit (using chaining if needed)
config1 = 2 : user access enabled and always 32-bit
config1 = 3 : user access enabled and always 64-bit
Based on work by Raphael Gault <raphael.gault@arm.com>, but has been
completely re-written.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211208201124.310740-5-robh@kernel.org
[will: Made armv8pmu_proc_user_access_handler() static]
Signed-off-by: Will Deacon <will@kernel.org>
Like x86, some users may want to disable userspace PMU counter
altogether. Add a sysctl 'perf_user_access' file to control userspace
counter access. The default is '0' which is disabled. Writing '1'
enables access.
Note that x86 supports globally enabling user access by writing '2' to
/sys/bus/event_source/devices/cpu/rdpmc. As there's not existing
userspace support to worry about, this shouldn't be necessary for Arm.
It could be added later if the need arises.
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-perf-users@vger.kernel.org
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211208201124.310740-4-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
ID_AA64DFR0_PMUVER_IMP_DEF, indicating an "implementation defined" PMU,
never actually gets used although there are '0xf' instances scattered
all around. Use the symbolic name instead of the raw hex constant.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/1628652427-24695-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Use common macro PMU_EVENT_ATTR_ID to simplify ARMV8_EVENT_ATTR
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1623220863-58233-8-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Armv8.7 has introduced BUS_SLOTS and BUS_WIDTH in PMMIR_EL1 register,
add two entries in caps for bus_slots and bus_width under sysfs. It
will return the true slots and width if the information is available,
otherwise it will return 0.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Link: https://lore.kernel.org/r/1622704502-63951-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
Use sysfs_emit instead of snprintf to avoid buf overrun,because in
sysfs_emit it strictly checks whether buf is null or buf whether
pagesize aligned, otherwise it returns an error.
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Link: https://lore.kernel.org/r/1621497585-30887-1-git-send-email-tiantao6@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
The initialization of value in function armv8pmu_read_hw_counter()
and armv8pmu_read_counter() seem redundant, as they are soon updated.
So, We can remove them.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1617275801-1980-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Commit 0fdf1bb759 ("arm64: perf: Avoid PMXEV* indirection") changed
armv8pmu_read_evcntr() to return a u32 instead of u64. The result is
silent truncation of the event counter when using 64-bit counters. Given
the offending commit appears to have passed thru several folks, it seems
likely this was a bad rebase after v8.5 PMU 64-bit counters landed.
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: <stable@vger.kernel.org>
Fixes: 0fdf1bb759 ("arm64: perf: Avoid PMXEV* indirection")
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210310004412.1450128-1-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
In a few places we don't have whitespace between macro parameters,
which makes them hard to read. This patch adds whitespace to clearly
separate the parameters.
In a few places we have unnecessary whitespace around unary operators,
which is confusing, This patch removes the unnecessary whitespace.
Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn>
Link: https://lore.kernel.org/r/1612403029-5011-1-git-send-email-daizhiyuan@phytium.com.cn
Signed-off-by: Will Deacon <will@kernel.org>
The only usage of these is to put their addresses in an array of
pointers to const attribute_group structs. Make them const to allow the
compiler to put them in read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
This reverts commit 367c820ef0.
lockup_detector_init() makes heavy use of per-cpu variables and must be
called with preemption disabled. Usually, it's handled early during boot
in kernel_init_freeable(), before SMP has been initialised.
Since we do not know whether or not our PMU interrupt can be signalled
as an NMI until considerably later in the boot process, the Arm PMU
driver attempts to re-initialise the lockup detector off the back of a
device_initcall(). Unfortunately, this is called from preemptible
context and results in the following splat:
| BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
| caller is debug_smp_processor_id+0x20/0x2c
| CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.10.0+ #276
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace+0x0/0x3c0
| show_stack+0x20/0x6c
| dump_stack+0x2f0/0x42c
| check_preemption_disabled+0x1cc/0x1dc
| debug_smp_processor_id+0x20/0x2c
| hardlockup_detector_event_create+0x34/0x18c
| hardlockup_detector_perf_init+0x2c/0x134
| watchdog_nmi_probe+0x18/0x24
| lockup_detector_init+0x44/0xa8
| armv8_pmu_driver_init+0x54/0x78
| do_one_initcall+0x184/0x43c
| kernel_init_freeable+0x368/0x380
| kernel_init+0x1c/0x1cc
| ret_from_fork+0x10/0x30
Rather than bodge this with raw_smp_processor_id() or randomly disabling
preemption, simply revert the culprit for now until we figure out how to
do this properly.
Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20201221162249.3119-1-lecopzer.chen@mediatek.com
Link: https://lore.kernel.org/r/20210112221855.10666-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
With the recent feature added to enable perf events to use pseudo NMIs
as interrupts on platforms which support GICv3 or later, its now been
possible to enable hard lockup detector (or NMI watchdog) on arm64
platforms. So enable corresponding support.
One thing to note here is that normally lockup detector is initialized
just after the early initcalls but PMU on arm64 comes up much later as
device_initcall(). So we need to re-initialize lockup detection once
PMU has been initialized.
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Acked-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/1602060704-10921-1-git-send-email-sumit.garg@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
When handling events, armv8pmu_handle_irq() calls perf_event_overflow(),
and subsequently calls irq_work_run() to handle any work queued by
perf_event_overflow(). As perf_event_overflow() raises IPI_IRQ_WORK when
queuing the work, this isn't strictly necessary and the work could be
handled as part of the IPI_IRQ_WORK handler.
In the common case the IPI handler will run immediately after the PMU IRQ
handler, and where the PE is heavily loaded with interrupts other handlers
may run first, widening the window where some counters are disabled.
In practice this window is unlikely to be a significant issue, and removing
the call to irq_work_run() would make the PMU IRQ handler NMI safe in
addition to making it simpler, so let's do that.
[Alexandru E.: Reworded commit message]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-5-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
The PMU is disabled and enabled, and the counters are programmed from
contexts where interrupts or preemption is disabled.
The functions to toggle the PMU and to program the PMU counters access the
registers directly and don't access data modified by the interrupt handler.
That, and the fact that they're always called from non-preemptible
contexts, means that we don't need to disable interrupts or use a spinlock.
[Alexandru E.: Explained why locking is not needed, removed WARN_ONs]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-4-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Currently we access the counter registers and their respective type
registers indirectly. This requires us to write to PMSELR, issue an ISB,
then access the relevant PMXEV* registers.
This is unfortunate, because:
* Under virtualization, accessing one register requires two traps to
the hypervisor, even though we could access the register directly with
a single trap.
* We have to issue an ISB which we could otherwise avoid the cost of.
* When we use NMIs, the NMI handler will have to save/restore the select
register in case the code it preempted was attempting to access a
counter or its type register.
We can avoid these issues by directly accessing the relevant registers.
This patch adds helpers to do so.
In armv8pmu_enable_event() we still need the ISB to prevent the PE from
reordering the write to PMINTENSET_EL1 register. If the interrupt is
enabled before we disable the counter and the new event is configured,
we might get an interrupt triggered by the previously programmed event
overflowing, but which we wrongly attribute to the event that we are
enabling. Execute an ISB after we disable the counter.
In the process, remove the comment that refers to the ARMv7 PMU.
[Julien T.: Don't inline read/write functions to avoid big code-size
increase, remove unused read_pmevtypern function,
fix counter index issue.]
[Alexandru E.: Removed comment, removed trailing semicolons in macros,
added ISB]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-3-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Writes to the PMXEVTYPER_EL0 register are not self-synchronising. In
armv8pmu_enable_event(), the PE can reorder configuring the event type
after we have enabled the counter and the interrupt. This can lead to an
interrupt being asserted because of the previous event type that we were
counting using the same counter, not the one that we've just configured.
The same rationale applies to writes to the PMINTENSET_EL1 register. The PE
can reorder enabling the interrupt at any point in the future after we have
enabled the event.
Prevent both situations from happening by adding an ISB just before we
enable the event counter.
Fixes: 030896885a ("arm64: Performance counters support")
Reported-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20200924110706.254996-2-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
ARMv8.4-PMU introduces the PMMIR_EL1 registers and some new PMU events,
like STALL_SLOT etc, are related to it. Let's add a caps directory to
/sys/bus/event_source/devices/armv8_pmuv3_0/ and support slots from
PMMIR_EL1 registers in this entry. The user programs can get the slots
from sysfs directly.
/sys/bus/event_source/devices/armv8_pmuv3_0/caps/slots is exposed
under sysfs. Both ARMv8.4-PMU and STALL_SLOT event are implemented,
it returns the slots from PMMIR_EL1, otherwise it will return 0.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1600754025-53535-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
event_idx is obtained from armv8pmu_get_event_idx(), and this idx must be
between ARMV8_IDX_CYCLE_COUNTER and cpu_pmu->num_events. So it's unnecessary
to do this check. Let's remove it.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Link: https://lore.kernel.org/r/1599213458-28394-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
This patch is to add the general hardware last level cache (LLC) events
for PMUv3: one event is for LLC access and another is for LLC miss.
With this change, perf tool can support last level cache profiling,
below is an example to demonstrate the usage on Arm64:
$ perf stat -e LLC-load-misses -e LLC-loads -- \
perf bench mem memcpy -s 1024MB -l 100 -f default
[...]
Performance counter stats for 'perf bench mem memcpy -s 1024MB -l 100 -f default':
35,534,262 LLC-load-misses # 2.16% of all LL-cache hits
1,643,946,443 LLC-loads
[...]
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200811053505.21223-1-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
Some new PMU events can been detected by PMCEID1_EL0, but it can't
be listed, Let's expose these through sysfs.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/1595328573-12751-2-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon <will@kernel.org>
This completes the ARM64 cap_user_time support.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200716051130.4359-7-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
When sched_clock is running on anything other than arch_timer, don't
advertise cap_user_time*.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200716051130.4359-5-leo.yan@linaro.org
Requested-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
As reported by Leo; the existing implementation is broken when the
clock and counter don't intersect at 0.
Use the sched_clock's struct clock_read_data information to correctly
implement cap_user_time and cap_user_time_zero.
Note that the ARM64 counter is architecturally only guaranteed to be
56bit wide (implementations are allowed to be wider) and the existing
perf ABI cannot deal with wrap-around.
This implementation should also be faster than the old; seeing how we
don't need to recompute mult and shift all the time.
[leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow]
Reported-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200716051130.4359-4-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
When PMU event ID is equal or greater than 0x4000, it will be reduced
by 0x4000 and it is not the raw number in the sysfs. Let's correct it
and obtain the raw event ID.
Before this patch:
cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed
event=0x001
After this patch:
cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed
event=0x4001
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1592487344-30555-3-git-send-email-zhangshaokun@hisilicon.com
[will: fixed formatting of 'if' condition]
Signed-off-by: Will Deacon <will@kernel.org>
At present ARMv8 event counters are limited to 32-bits, though by
using the CHAIN event it's possible to combine adjacent counters to
achieve 64-bits. The perf config1:0 bit can be set to use such a
configuration.
With the introduction of ARMv8.5-PMU support, all event counters can
now be used as 64-bit counters.
Let's enable 64-bit event counters where support exists. Unless the
user sets config1:0 we will adjust the counter value such that it
overflows upon 32-bit overflow. This follows the same behaviour as
the cycle counter which has always been (and remains) 64-bits.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[Mark: fix ID field names, compare with 8.5 value]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Reading this code bordered on painful, what with all the repetition and
pointless return values. More fundamentally, dribbling the hardware
enables and disables in one bit at a time incurs needless system
register overhead for chained events and on reset. We already use
bitmask values for the KVM hooks, so consolidate all the register
accesses to match, and make a reasonable saving in both source and
object code.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Add support for matching the new PMUs. For now, this just wires them up
as generic PMUv3 such that people writing DTs for new SoCs can do the
right thing, and at least have architectural and raw events be usable.
We can come back and fill in event maps for sysfs and/or perf tools at
a later date.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
The PMU init callbacks are already drowning in boilerplate, so before
doubling the number of supported PMU models, give it a sensible refactor
to significantly reduce the bloat, both in source and object code.
Although nobody uses non-default sysfs attributes today, there's minimal
impact to preserving the notion that maybe, some day, somebody might, so
we may as well keep up appearances.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
For each PMU event, there is a ARMV8_EVENT_ATTR(xx, XX) and
&armv8_event_attr_xx.attr.attr. Let's redefine the ARMV8_EVENT_ATTR
to simplify the armv8_pmuv3_event_attrs.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
[will: Dropped unnecessary array syntax]
Signed-off-by: Will Deacon <will@kernel.org>
In perf_event.c we use smp_processor_id(), but we haven't included
<linux/smp.h> where it is defined, and rely on this being pulled in
via a transitive include. Let's make this more robust by including
<linux.smp.h> explicitly.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
ARMV8_EVENT_ATTR_RESOLVE became unused after commit <4b1a9e6934ec>
("arm64/perf: Filter common events based on PMCEIDn_EL0").
Remove it.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will@kernel.org>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 503 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* POWER: support for direct access to the POWER9 XIVE interrupt controller,
memory and performance optimizations.
* x86: support for accessing memory not backed by struct page, fixes and refactoring
* Generic: dirty page tracking improvements
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJc3qV/AAoJEL/70l94x66Dn3QH/jX1Bn0P/RZAIt4w0SySklSg
PqxUKDyBQqB9vN9Qeb9jWXAKPH2CtM3+up/rz7oRnBWp7qA6vXcC/R/QJYAvzdXE
nklsR/oYCsflR1KdlVYuDvvPCPP2fLBU5zfN83OsaBQ8fNRkm3gN+N5XQ2SbXbLy
Mo9tybS4otY201UAC96e8N0ipwwyCRpDneQpLcl+F5nH3RBt63cVbs04O+70MXn7
eT4I+8K3+Go7LATzT8hglD21D/7uvE31qQb6yr5L33IfhU4GB51RZzBXTNaAdY8n
hT1rMrRkAMAFWYZPQDfoMadjWU3i5DIfstKjDxOr9oTfuOEp5Z+GvJwvVnUDg1I=
=D0+p
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- support for SVE and Pointer Authentication in guests
- PMU improvements
POWER:
- support for direct access to the POWER9 XIVE interrupt controller
- memory and performance optimizations
x86:
- support for accessing memory not backed by struct page
- fixes and refactoring
Generic:
- dirty page tracking improvements"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits)
kvm: fix compilation on aarch64
Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU"
kvm: x86: Fix L1TF mitigation for shadow MMU
KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible
KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device
KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing"
KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs
kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete
tests: kvm: Add tests for KVM_SET_NESTED_STATE
KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state
tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID
tests: kvm: Add tests to .gitignore
KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one
KVM: Fix the bitmap range to copy during clear dirty
KVM: arm64: Fix ptrauth ID register masking logic
KVM: x86: use direct accessors for RIP and RSP
KVM: VMX: Use accessors for GPRs outside of dedicated caching logic
KVM: x86: Omit caching logic for always-available GPRs
kvm, x86: Properly check whether a pfn is an MMIO or not
...
With VHE different exception levels are used between the host (EL2) and
guest (EL1) with a shared exception level for userpace (EL0). We can take
advantage of this and use the PMU's exception level filtering to avoid
enabling/disabling counters in the world-switch code. Instead we just
modify the counter type to include or exclude EL0 at vcpu_{load,put} time.
We also ensure that trapped PMU system register writes do not re-enable
EL0 when reconfiguring the backing perf events.
This approach completely avoids blackout windows seen with !VHE.
Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Add support for the :G and :H attributes in perf by handling the
exclude_host/exclude_guest event attributes.
We notify KVM of counters that we wish to be enabled or disabled on
guest entry/exit and thus defer from starting or stopping events based
on their event attributes.
With !VHE we switch the counters between host/guest at EL2. We are able
to eliminate counters counting host events on the boundaries of guest
entry/exit when using :G by filtering out EL2 for exclude_host. When
using !exclude_hv there is a small blackout window at the guest
entry/exit where host events are not captured.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The armv8pmu_enable_event_counter function issues an isb instruction
after enabling a pair of counters - this doesn't provide any value
and is inconsistent with the armv8pmu_disable_event_counter.
In any case armv8pmu_enable_event_counter is always called with the
PMU stopped. Starting the PMU with armv8pmu_start results in an isb
instruction being issued prior to writing to PMCR_EL0.
Let's remove the unnecessary isb instruction.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The functions armv8pmu_read_counter() and armv8pmu_write_counter()
are `static inline` while they are only referenced when assigned
to a function pointer field in a `struct arm_pmu` instance.
The inline keyword is thus counter intuitive and shouldn't be used.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The comment for the armv8pmu_set_event_filter function suggests that
it only works for PMUv2 PMUs - this is incorrect.
Let's remove the incorrect comment.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Fix up one typos: Onl -> Only
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
There have been some additional events added to the PMU architecture
since Armv8.0, so expose them via our sysfs infrastructure.
Signed-off-by: Will Deacon <will.deacon@arm.com>
The PMU event numbers are split between perf_event.h and perf_event.c,
which makes it difficult to spot any gaps in the numbers which may be
allocated in the future.
This patch sorts the events numerically, adds some missing events and
moves the definitions into perf_event.h.
Signed-off-by: Will Deacon <will.deacon@arm.com>
We cannot distinguish reads from writes in our generic cache events, so
drop the WRITE entries and leave the READ entries pointing to the combined
read/write events, as is done by other CPUs and architectures.
Reported-by: Ganapatrao Kulkarni <Ganapatrao.Kulkarni@cavium.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>