perf_mem_events__name() generates the mem-load event name.
It uses the variable 'mem_loads_name__init' to avoid regenerating the
event name on every call (because perf_pmu__scan takes some time).
perf_mem_events__name() assumes the pmu is "cpu", which is not
correct for hybrid platforms. On Alderlake, the pmu is "cpu_core" or
"cpu_atom".
Introduce a new parameter 'pmu_name' in perf_mem_events__name()
to let the caller specify a pmu name.
Since such event names are x86 specific, move
perf_mem_events[] to arch/x86/util/mem-events.c.
The variable 'mem_loads_name__init' is kept, but it is only
used when pmu_name is NULL (compatible with the original behavior). When
pmu_name is not NULL (e.g. "cpu_core"), this patch does not cache the
generated name. That can be implemented in a follow-up patch.
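The caching rule described above can be sketched as follows. This is only an illustration, not the code from the patch: sketch_mem_loads_name(), sketch_format(), SKETCH_LDLAT and the buffer sizes are hypothetical, and the event string layout is assumed for the example. The sketch keeps separate buffers for the cached "cpu" name and for names built on demand, so that a call with a pmu name cannot overwrite the cached one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative load-latency threshold; not taken from the patch. */
#define SKETCH_LDLAT 30u

/* Format "<pmu>/mem-loads,ldlat=<n>/P" into buf and return buf. */
static const char *sketch_format(char *buf, size_t len, const char *pmu)
{
	assert(buf && len > 0 && pmu);
	snprintf(buf, len, "%s/mem-loads,ldlat=%u/P", pmu, SKETCH_LDLAT);
	return buf;
}

/*
 * Hypothetical analogue of perf_mem_events__name():
 * pmu_name == NULL keeps the original behaviour (pmu "cpu", name
 * generated once and then served from the cache); a non-NULL pmu_name
 * is formatted on every call, without caching.
 */
const char *sketch_mem_loads_name(const char *pmu_name)
{
	static char cached[64];
	static bool cached__init;
	static char scratch[64];

	if (pmu_name)
		return sketch_format(scratch, sizeof(scratch), pmu_name);

	if (!cached__init) {
		sketch_format(cached, sizeof(cached), "cpu");
		cached__init = true;
	}
	return cached;
}
```

A caller on a hybrid system would pass "cpu_core" or "cpu_atom"; existing callers pass NULL and keep getting the cached "cpu" name.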
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For some platforms, an auxiliary event has to be enabled
simultaneously with the load latency event.
For Alderlake, the auxiliary event is created in the "cpu_core" pmu.
So first check for the existence of the "cpu_core" pmu,
and then check whether this pmu has the auxiliary event.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The option "opts->full_auxtrace" is checked earlier in the function; if
it is false the function bails out directly. So remove the redundant
check for "opts->full_auxtrace".
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519041546.1574961-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For per-cpu mmap, timestamp tracing should be enabled for Arm SPE; this
helps with sample correlation.
To enable the timestamp automatically, introduce a helper
arm_spe_set_timestamp() that sets the "ts_enable" format bit.
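A minimal sketch of what setting a one-bit format field in the event config amounts to. The names sketch_attr, sketch_set_timestamp() and sketch_config() are hypothetical, and the bit position is a plain parameter here; how the real helper obtains it is not described in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the event attribute; only the config word matters. */
struct sketch_attr {
	uint64_t config;
};

/*
 * Hypothetical analogue of arm_spe_set_timestamp(): OR the "ts_enable"
 * format bit into config, leaving all other bits untouched.
 */
void sketch_set_timestamp(struct sketch_attr *attr, unsigned int ts_enable_bit)
{
	assert(attr && ts_enable_bit < 64);
	attr->config |= UINT64_C(1) << ts_enable_bit;
}

/* Enable timestamps only for per-cpu mmap, as described above. */
void sketch_config(struct sketch_attr *attr, bool per_cpu_mmap,
		   unsigned int ts_enable_bit)
{
	if (per_cpu_mmap)
		sketch_set_timestamp(attr, ts_enable_bit);
}
```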
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210519041546.1574961-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dummy event is mainly used for mmap; the TIME sample is only needed
for the per-cpu case so that the perf tool can rely on the correct timing
for parsing symbols. The CPU sample is useless for mmap.
The BRANCH_STACK sample bit is always reset for the dummy event in
evsel__config(), so there is no need to reset it again in the Arm SPE
specific code.
So this patch only enables the TIME sample for per-cpu mmap.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519041546.1574961-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the CPU, TIME and TID sample flags are hard coded for the SPE
event, which is pointless.
The CPU sample is only useful for the per-cpu mmap case, where it
indicates which CPU the AUX trace is associated with.
The TIME sample is not needed for the AUX event, since the time for the
AUX event is not really used, and this time is a different thing from the
timestamp in the Arm SPE trace; timestamp tracing is controlled
by Arm SPE's config bit.
The TID sample is not useful for the AUX event.
This patch corrects the sample flags for the SPE event: it only sets the
CPU sample bit for the per-cpu mmap case.
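The resulting rule for the SPE event can be sketched like this. The SKETCH_SAMPLE_* constants are local stand-ins for the PERF_SAMPLE_* bits and sketch_spe_sample_type() is a hypothetical name; the real code sets the bits on the evsel instead of returning a mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the PERF_SAMPLE_* bits; values are illustrative. */
enum {
	SKETCH_SAMPLE_TID  = 1u << 1,
	SKETCH_SAMPLE_TIME = 1u << 2,
	SKETCH_SAMPLE_CPU  = 1u << 7,
};

/*
 * Start from whatever sample bits the event already has and add only what
 * the SPE event needs: CPU for per-cpu mmap, nothing otherwise.
 * TIME and TID are no longer forced on.
 */
uint64_t sketch_spe_sample_type(uint64_t base, bool per_cpu_mmap)
{
	if (per_cpu_mmap)
		base |= SKETCH_SAMPLE_CPU;
	return base;
}
```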
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519041546.1574961-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Function declarations don't need __maybe_unused annotations, only the
implementations do. Drop them on the perf x86 tests.
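For illustration, with a hypothetical test function test__example() and a locally defined __maybe_unused (the tree provides its own definition), the annotation is only needed where the parameters have bodies that could use them:

```c
#include <assert.h>

#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

struct test;	/* opaque in this sketch */

/* Declaration: the compiler never warns here, so no annotation. */
int test__example(struct test *test, int subtest);

/* Definition: unused parameters are annotated to silence the warning. */
int test__example(struct test *test __maybe_unused,
		  int subtest __maybe_unused)
{
	return 0;
}
```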
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: masayoshi mizuma <msys.mizuma@gmail.com>
Link: http://lore.kernel.org/lkml/20210513174614.2242210-2-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no reason for making the test__arch_unwind_sample declaration per
arch. Currently that's done in 2 different ways: either with a declaration
in arch-tests.h or with an arch define. Unify all this with an
unconditional declaration in tests.h.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: masayoshi mizuma <msys.mizuma@gmail.com>
Link: http://lore.kernel.org/lkml/20210513174614.2242210-1-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in this cset:
5b9fedb31e ("quota: Disable quotactl_path syscall")
That silences these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
usage:
- kvm stat
  run a command and gather performance counter statistics
- show the result:
  perf kvm stat report --event=msr
See the msr events:
Analyze events for all VMs, all VCPUs:
  MSR Access    Samples  Samples%    Time%   Min Time    Max Time   Avg time
     0x6e0:W      67007    98.17%   98.31%     0.59us     10.69us     0.90us ( +-  0.10% )
     0x830:W       1186     1.74%    1.60%     0.53us    108.34us     0.82us ( +- 11.02% )
      0x3b:R         66     0.10%    0.09%     0.56us      1.26us     0.80us ( +-  3.24% )
Total Samples:68259, Total events handled time:61150.95us.
Signed-off-by: Lei Zhao <zhaolei27@baidu.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1618470001-7239-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in these csets:
a49f4f81cb ("arch: Wire up Landlock syscalls")
2a1867219c ("fs: add mount_setattr()")
fa8b90070a ("quota: wire up quotactl_path")
That silences these perf build warnings and adds support for those new
syscalls in tools such as 'perf trace'.
For instance, this is now possible:
# ~acme/bin/perf trace -v -e landlock*
event qualifier tracepoint filter: (common_pid != 129365 && common_pid != 3502) && (id == 444 || id == 445 || id == 446)
^C#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep landlock tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
444 common landlock_create_ruleset sys_landlock_create_ruleset
445 common landlock_add_rule sys_landlock_add_rule
446 common landlock_restrict_self sys_landlock_restrict_self
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: James Morris <jamorris@linux.microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Mickaël Salaün <mic@linux.microsoft.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since clang's -Wmissing-field-initializers warns if a data
structure is initialized with a single NULL as below,
----
tools/perf $ make CC=clang LLVM=1
...
arch/arm64/util/kvm-stat.c:74:9: error: missing field 'ops' initializer [-Werror,-Wmissing-field-initializers]
{ NULL },
^
1 error generated.
----
add another field initializer explicitly, the same as the other
archs' kvm-stat.c code does.
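A reduced sketch of the pattern, using a hypothetical two-field struct and table (sketch_reg_events_ops is a simplified stand-in, not the real definition). Spelling out both fields in the terminating entry keeps -Wmissing-field-initializers quiet while the table walk still stops at the NULL name:

```c
#include <assert.h>
#include <stddef.h>

struct sketch_events_ops;	/* opaque in this sketch */

struct sketch_reg_events_ops {
	const char *name;
	struct sketch_events_ops *ops;
};

struct sketch_reg_events_ops sketch_reg_events_ops[] = {
	{ .name = "example", .ops = NULL },
	{ NULL, NULL },	/* terminator: both fields initialized explicitly */
};

/* Count entries up to, but not including, the NULL-name terminator. */
size_t sketch_count_ops(const struct sketch_reg_events_ops *table)
{
	size_t n = 0;

	assert(table);
	while (table[n].name)
		n++;
	return n;
}
```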
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Link: http://lore.kernel.org/lkml/162037767540.94840.15758657049033010518.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat:
- Add support for hybrid PMUs to support systems such as Intel Alderlake
and its BIG/little core/atom cpus.
- Introduce 'bperf' to share hardware PMCs with BPF.
- New --iostat option to collect and present IO stats on Intel hardware.
This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP):
commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
It is intended to provide four I/O performance metrics in MB per each
PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
- Align CSV output for summary.
- Clarify --null use cases: Assess raw overhead of 'perf stat' or
measure just wall clock time.
- Improve readability of shadow stats.
perf record:
- Change the COMM when starting the workload so that --exclude-perf
does not appear to be ignored.
- Improve 'Workload failed' message printing events + what was exec'ed.
- Fix cross-arch support for TIME_CONV.
perf report:
- Add option to disable raw event ordering.
- Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.
- Improvements to --stat output, that shows information about PERF_RECORD_ events.
- Preserve identifier id in OCaml demangler.
perf annotate:
- Show full source location with 'l' hotkey in the 'perf annotate' TUI.
- Add line number like in TUI and source location at EOL to the 'perf annotate' --stdio mode.
- Add --demangle and --demangle-kernel to 'perf annotate'.
- Allow configuring annotate.demangle{,_kernel} in 'perf config'.
- Fix sample events lost in stdio mode.
perf data:
- Allow converting a perf.data file to JSON.
libperf:
- Add support for user space counter access.
- Update topdown documentation to permit rdpmc calls.
perf test:
- Add 'perf test' for 'perf stat' CSV output.
- Add 'perf test' entries to test the hybrid PMU support.
- Cleanup 'perf test daemon' if its 'perf test' is interrupted.
- Handle metric reuse in pmu-events parsing 'perf test' entry.
- Add test for PE executable support.
- Add timeout for wait for daemon start in its 'perf test' entries.
Build:
- Enable libtraceevent dynamic linking.
- Improve feature detection output.
- Fix caching of feature checks caching.
- First round of updates for tools copies of kernel headers.
- Enable warnings when compiling BPF programs.
Vendor specific events:
Intel:
- Add missing skylake & icelake model numbers.
arm64:
- Add Hisi hip08 L1, L2 and L3 metrics.
- Add Fujitsu A64FX PMU events.
PowerPC:
- Initial JSON/events list for power10 platform.
- Remove unsupported power9 metrics.
AMD:
- Add Zen3 events.
- Fix broken L2 Cache Hits from L2 HWPF metric.
- Use lowercase for all the eventcodes and umasks.
Hardware tracing:
arm64:
- Update CoreSight ETM metadata format.
- Fix bitmap for CS-ETM option.
- Support PID tracing in config.
- Detect pid in VMID for kernel running at EL2.
Arch specific:
MIPS:
- Support MIPS unwinding and dwarf-regs.
- Generate mips syscalls_n64.c syscall table.
PowerPC:
- Add support for PERF_SAMPLE_WEIGHT_STRUCT on PowerPC.
- Support pipeline stage cycles for powerpc.
libbeauty:
- Fix fsconfig generator.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYIshAwAKCRCyPKLppCJ+
J8oWAP9c1POclDQ7AZDe5/t/InZYSQKJFIku1sE1SNCSOupy7wEAuPBtaN7wDaRj
BFBibfUGd4MNzLPvMMHneIhSY3DgJwg=
=FLLr
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
"perf stat:
- Add support for hybrid PMUs to support systems such as Intel
Alderlake and its BIG/little core/atom cpus.
- Introduce 'bperf' to share hardware PMCs with BPF.
- New --iostat option to collect and present IO stats on Intel
hardware.
This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP)
in commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore
unit to IIO PMON mapping")
It is intended to provide four I/O performance metrics in MB per
each PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
- Align CSV output for summary.
- Clarify --null use cases: Assess raw overhead of 'perf stat' or
measure just wall clock time.
- Improve readability of shadow stats.
perf record:
- Change the COMM when starting the workload so that --exclude-perf
does not appear to be ignored.
- Improve 'Workload failed' message printing events + what was
exec'ed.
- Fix cross-arch support for TIME_CONV.
perf report:
- Add option to disable raw event ordering.
- Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.
- Improvements to --stat output, that shows information about
PERF_RECORD_ events.
- Preserve identifier id in OCaml demangler.
perf annotate:
- Show full source location with 'l' hotkey in the 'perf annotate'
TUI.
- Add line number like in TUI and source location at EOL to the 'perf
annotate' --stdio mode.
- Add --demangle and --demangle-kernel to 'perf annotate'.
- Allow configuring annotate.demangle{,_kernel} in 'perf config'.
- Fix sample events lost in stdio mode.
perf data:
- Allow converting a perf.data file to JSON.
libperf:
- Add support for user space counter access.
- Update topdown documentation to permit rdpmc calls.
perf test:
- Add 'perf test' for 'perf stat' CSV output.
- Add 'perf test' entries to test the hybrid PMU support.
- Cleanup 'perf test daemon' if its 'perf test' is interrupted.
- Handle metric reuse in pmu-events parsing 'perf test' entry.
- Add test for PE executable support.
- Add timeout for wait for daemon start in its 'perf test' entries.
Build:
- Enable libtraceevent dynamic linking.
- Improve feature detection output.
- Fix caching of feature checks.
- First round of updates for tools copies of kernel headers.
- Enable warnings when compiling BPF programs.
Vendor specific events:
- Intel:
- Add missing skylake & icelake model numbers.
- arm64:
- Add Hisi hip08 L1, L2 and L3 metrics.
- Add Fujitsu A64FX PMU events.
- PowerPC:
- Initial JSON/events list for power10 platform.
- Remove unsupported power9 metrics.
- AMD:
- Add Zen3 events.
- Fix broken L2 Cache Hits from L2 HWPF metric.
- Use lowercase for all the eventcodes and umasks.
Hardware tracing:
- arm64:
- Update CoreSight ETM metadata format.
- Fix bitmap for CS-ETM option.
- Support PID tracing in config.
- Detect pid in VMID for kernel running at EL2.
Arch specific updates:
- MIPS:
- Support MIPS unwinding and dwarf-regs.
- Generate mips syscalls_n64.c syscall table.
- PowerPC:
- Add support for PERF_SAMPLE_WEIGHT_STRUCT on PowerPC.
- Support pipeline stage cycles for powerpc.
libbeauty:
- Fix fsconfig generator"
* tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (132 commits)
perf build: Defer printing detected features to the end of all feature checks
tools build: Allow deferring printing the results of feature detection
perf build: Regenerate the FEATURE_DUMP file after extra feature checks
perf session: Dump PERF_RECORD_TIME_CONV event
perf session: Add swap operation for event TIME_CONV
perf jit: Let convert_timestamp() to be backwards-compatible
perf tools: Change fields type in perf_record_time_conv
perf tools: Enable libtraceevent dynamic linking
perf Documentation: Document intel-hybrid support
perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
perf tests: Support 'Convert perf time to TSC' test for hybrid
perf tests: Support 'Session topology' test for hybrid
perf tests: Support 'Parse and process metrics' test for hybrid
perf tests: Support 'Track with sched_switch' test for hybrid
perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
perf tests: Add hybrid cases for 'Parse event definition strings' test
perf record: Uniquify hybrid event name
perf stat: Warn group events from different hybrid PMU
perf stat: Filter out unmatched aggregation for hybrid event
...
- Improve Intel uncore PMU support:
- Parse uncore 'discovery tables' - a new hardware capability enumeration method
introduced on the latest Intel platforms. This table is in a well-defined PCI
namespace location and is read via MMIO. It is organized in an rbtree.
These uncore tables will allow the discovery of standard counter blocks, but
fancier counters still need to be enumerated explicitly.
- Add Alder Lake support
- Improve IIO stacks to PMON mapping support on Skylake servers
- Add Intel Alder Lake PMU support - which requires the introduction of 'hybrid' CPUs
and PMUs. Alder Lake is a mix of Golden Cove ('big') and Gracemont ('small' - Atom derived)
cores.
The CPU-side feature set is entirely symmetrical - but on the PMU side there's
core type dependent PMU functionality.
- Reduce data loss with CPU level hardware tracing on Intel PT / AUX profiling, by
fixing the AUX allocation watermark logic.
- Improve ring buffer allocation on NUMA systems
- Put 'struct perf_event' into their separate kmem_cache pool
- Add support for synchronous signals for select perf events. The immediate motivation
is to support low-overhead sampling-based race detection for user-space code. The
feature consists of the following main changes:
- Add thread-only event inheritance via perf_event_attr::inherit_thread, which limits
inheritance of events to CLONE_THREAD.
- Add the ability for events to not leak through exec(), via perf_event_attr::remove_on_exec.
- Allow the generation of SIGTRAP via perf_event_attr::sigtrap, extend siginfo with an u64
::si_perf, and add the breakpoint information to ::si_addr and ::si_perf if the event is
PERF_TYPE_BREAKPOINT.
The siginfo support is adequate for breakpoints right now - but the new field can be used
to introduce support for other types of metadata passed over siginfo as well.
- Misc fixes, cleanups and smaller updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJGpERHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1j9zBAAuVbG2snV6SBSdXLhQcM66N3NckOXvSY5
QjjhQcuwJQEK/NJB3266K5d8qSmdyRBsWf3GCsrmyBT67P1V28K44Pu7oCV0UDtf
mpVRjEP0oR7hNsANSSgo8Fa4ZD7H5waX7dK7925Tvw8By3mMoZoddiD/84WJHhxO
NDF+GRFaRj+/dpbhV8cdCoXTjYdkC36vYuZs3b9lu0tS9D/AJgsNy7TinLvO02Cs
5peP+2y29dgvCXiGBiuJtEA6JyGnX3nUJCvfOZZ/DWDc3fdduARlRrc5Aiq4n/wY
UdSkw1VTZBlZ1wMSdmHQVeC5RIH3uWUtRoNqy0Yc90lBm55AQ0EENwIfWDUDC5zy
USdBqWTNWKMBxlEilUIyqKPQK8LW/31TRzqy8BWKPNcZt5yP5YS1SjAJRDDjSwL/
I+OBw1vjLJamYh8oNiD5b+VLqNQba81jFASfv+HVWcULumnY6ImECCpkg289Fkpi
BVR065boifJDlyENXFbvTxyMBXQsZfA+EhtxG7ju2Ni+TokBbogyCb3L2injPt9g
7jjtTOqmfad4gX1WSc+215iYZMkgECcUd9E+BfOseEjBohqlo7yNKIfYnT8mE/Xq
nb7eHjyvLiE8tRtZ+7SjsujOMHv9LhWFAbSaxU/kEVzpkp0zyd6mnnslDKaaHLhz
goUMOL/D0lg=
=NhQ7
-----END PGP SIGNATURE-----
Merge tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event updates from Ingo Molnar:
- Improve Intel uncore PMU support:
- Parse uncore 'discovery tables' - a new hardware capability
enumeration method introduced on the latest Intel platforms. This
table is in a well-defined PCI namespace location and is read via
MMIO. It is organized in an rbtree.
These uncore tables will allow the discovery of standard counter
blocks, but fancier counters still need to be enumerated
explicitly.
- Add Alder Lake support
- Improve IIO stacks to PMON mapping support on Skylake servers
- Add Intel Alder Lake PMU support - which requires the introduction of
'hybrid' CPUs and PMUs. Alder Lake is a mix of Golden Cove ('big')
and Gracemont ('small' - Atom derived) cores.
The CPU-side feature set is entirely symmetrical - but on the PMU
side there's core type dependent PMU functionality.
- Reduce data loss with CPU level hardware tracing on Intel PT / AUX
profiling, by fixing the AUX allocation watermark logic.
- Improve ring buffer allocation on NUMA systems
- Put 'struct perf_event' into their separate kmem_cache pool
- Add support for synchronous signals for select perf events. The
immediate motivation is to support low-overhead sampling-based race
detection for user-space code. The feature consists of the following
main changes:
- Add thread-only event inheritance via
perf_event_attr::inherit_thread, which limits inheritance of
events to CLONE_THREAD.
- Add the ability for events to not leak through exec(), via
perf_event_attr::remove_on_exec.
- Allow the generation of SIGTRAP via perf_event_attr::sigtrap,
extend siginfo with an u64 ::si_perf, and add the breakpoint
information to ::si_addr and ::si_perf if the event is
PERF_TYPE_BREAKPOINT.
The siginfo support is adequate for breakpoints right now - but the
new field can be used to introduce support for other types of
metadata passed over siginfo as well.
- Misc fixes, cleanups and smaller updates.
* tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
signal, perf: Add missing TRAP_PERF case in siginfo_layout()
signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architectures
perf/x86: Allow for 8<num_fixed_counters<16
perf/x86/rapl: Add support for Intel Alder Lake
perf/x86/cstate: Add Alder Lake CPU support
perf/x86/msr: Add Alder Lake CPU support
perf/x86/intel/uncore: Add Alder Lake support
perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
perf/x86/intel: Add Alder Lake Hybrid support
perf/x86: Support filter_match callback
perf/x86/intel: Add attr_update for Hybrid PMUs
perf/x86: Add structures for the attributes of Hybrid PMUs
perf/x86: Register hybrid PMUs
perf/x86: Factor out x86_pmu_show_pmu_cap
perf/x86: Remove temporary pmu assignment in event_init
perf/x86/intel: Factor out intel_pmu_check_extra_regs
perf/x86/intel: Factor out intel_pmu_check_event_constraints
perf/x86/intel: Factor out intel_pmu_check_num_counters
perf/x86: Hybrid PMU support for extra_regs
perf/x86: Hybrid PMU support for event constraints
...
A relative path include works in the regular build due to -I paths but
may break in other situations.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210416214113.552252-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This functionality is based on recently introduced sysfs attributes for
Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
The mode is intended to provide four I/O performance metrics, in MB, for
each PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
Each metric requires only one uncore event, which increments on every 4B
transfer in the corresponding direction. The formula to compute each
metric is generic:
#EventCount * 4B / (1024 * 1024)
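As a worked example of the formula, 262144 events correspond to 262144 * 4B = 1048576 B, i.e. 1 MB. A small sketch, with the hypothetical helper name sketch_iostat_mb():

```c
#include <assert.h>
#include <stdint.h>

/* Each uncore event counts one 4-byte transfer; report the total in MB. */
double sketch_iostat_mb(uint64_t event_count)
{
	return (double)event_count * 4.0 / (1024.0 * 1024.0);
}
```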
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce helper functions to manage the list of PCIe root ports.
These helpers will be used in the follow-up patch.
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-3-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Turns out, the default setting of attr.aux_watermark to half of the total
buffer size is not very useful, especially with smaller buffers. The
problem is that, after half of the buffer is filled up, the kernel updates
->aux_head and sets up the next "transaction", while observing that
->aux_tail is still zero (as userspace hasn't had the chance to update
it), meaning that the trace will have to stop at the end of this second
"transaction". This means, for example, that the second PERF_RECORD_AUX in
every trace comes with the TRUNCATED flag set.
Setting attr.aux_watermark to a quarter of the buffer gives enough space for
the ->aux_tail update to be observed and prevents the data loss.
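The arithmetic behind the two settings, as a sketch. The helper name sketch_aux_watermark() is hypothetical and a 4096-byte page is assumed for the numbers; with an 8-page AUX buffer (-m,8), half is 16384 bytes and a quarter is 8192 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Watermark in bytes for an AUX buffer of aux_pages pages:
 * divisor 2 models the old default (half the buffer),
 * divisor 4 models the setting described above (a quarter).
 */
uint32_t sketch_aux_watermark(size_t aux_pages, size_t page_size,
			      unsigned int divisor)
{
	assert(divisor > 0);
	return (uint32_t)(aux_pages * page_size / divisor);
}
```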
The obligatory before/after showcase:
> # perf_before record -e intel_pt//u -m,8 uname
> Linux
> [ perf record: Woken up 6 times to write data ]
> Warning:
> AUX data lost 4 times out of 10!
>
> [ perf record: Captured and wrote 0.099 MB perf.data ]
> # perf record -e intel_pt//u -m,8 uname
> Linux
> [ perf record: Woken up 4 times to write data ]
> [ perf record: Captured and wrote 0.039 MB perf.data ]
The effect is still visible with large workloads and large buffers,
although less pronounced.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210414154955.49603-3-alexander.shishkin@linux.intel.com
Add a function to find the common PMU map for the system.
For arm64, a special variant is added. This is because arm64 supports
heterogeneous CPU systems. As such, it cannot be guaranteed that the
cpumap is the same for all CPUs. So in case of heterogeneous systems, don't
return a cpumap.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Paul A. Clarke <pc@us.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1617791570-165223-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sort dimension "p_stage_cyc" is used to represent pipeline
stage cycle information. Presently, this is used only in powerpc.
For unsupported platforms, we don't want to display it
in the perf report output columns. Hence add a check in sort_dimension__add()
and skip the sort key in case it is not applicable for the particular arch.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/1616425047-1666-6-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The pipeline stage cycles details can be recorded on powerpc from the
contents of Performance Monitor Unit (PMU) registers. On ISA v3.1
platforms, sampling registers expose the cycles spent in different
pipeline stages. This patch adds perf tools support to present two of the
cycle counters along with memory latency (weight).
Re-use the field 'ins_lat' for storing the first pipeline stage cycle.
This is stored in 'var2_w' field of 'perf_sample_weight'.
Add a new field 'p_stage_cyc' to store the second pipeline stage cycle
which is stored in 'var3_w' field of perf_sample_weight.
Add new sort function 'Pipeline Stage Cycle' and include this in
default_mem_sort_order[]. This new sort function may be used to denote
some other pipeline stage in another architecture. So add this to list
of sort entries that can have dynamic header string.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/1616425047-1666-5-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add arch specific arch_evsel__set_sample_weight() to set the new
sample type for powerpc.
Add arch specific arch_perf_parse_sample_weight() to store the
sample->weight values depending on the sample type applied.
If the new sample type (PERF_SAMPLE_WEIGHT_STRUCT) is applied,
store only the lower 32 bits to sample->weight. If sample type
is 'PERF_SAMPLE_WEIGHT', store the full 64-bit to sample->weight.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/1616425047-1666-4-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the fixes sent for v5.12 and continue development based on
v5.12-rc2, i.e. without the swap on file bug.
This also gets a slightly newer and better tools/perf/arch/arm/util/cs-etm.c
patch version, using the BIT() macro, that had already been slated to
v5.13 but ended up going to v5.12-rc1 on an older version.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When setting options with the macros ETM_OPT_CTXTID and ETM_OPT_TS, the
code wrongly takes these two values (14 and 28 respectively) as bit masks,
but both are actually bit offsets. This doesn't lead to further failure
because the AND logic operation is always true for
ETM_OPT_CTXTID / ETM_OPT_TS.
This patch defines new independent macros (rather than using the
"config" bits) for requesting the "contextid" and "timestamp" for
cs_etm_set_option().
Signed-off-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Daniel Kiss <daniel.kiss@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210206150833.42120-5-leo.yan@linaro.org
[ Extract the change as a separate patch for easier review ]
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In some versions of Alpine Linux the perf build is broken since commit
1d509f2a6e ("x86/insn: Support big endian cross-compiles"):
In file included from /usr/include/linux/byteorder/little_endian.h:13,
from /usr/include/asm/byteorder.h:5,
from arch/x86/util/../../../../arch/x86/include/asm/insn.h:10,
from arch/x86/util/archinsn.c:2:
/usr/include/linux/swab.h:161:8: error: unknown type name '__always_inline'
static __always_inline __u16 __swab16p(const __u16 *p)
So move the inclusion of arch/x86/include/asm/insn.h to after the
places where linux/stddef.h (which conditionally defines
__always_inline) is included, to work around this problem on Alpine
Linux 3.9 to 3.11; 3.12 onwards works.
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ins_lat of PERF_SAMPLE_WEIGHT_STRUCT stands for the instruction
latency, which is only available for X86. Add a X86 specific test for
the ins_lat and PERF_SAMPLE_WEIGHT_STRUCT type.
The test__x86_sample_parsing() uses the same approach as
test__sample_parsing() to verify a sample type. Since ins_lat and
PERF_SAMPLE_WEIGHT_STRUCT are the only X86 specific sample type for now,
test__x86_sample_parsing() only verifies the PERF_SAMPLE_WEIGHT_STRUCT
type. Other sample types are still verified in the generic test.
$ perf test 77 -v
77: x86 Sample parsing :
--- start ---
test child forked, pid 102370
test child finished with 0
---- end ----
x86 Sample parsing: Ok
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1614787285-104151-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
9caccd4154 ("fs: introduce MOUNT_ATTR_IDMAP")
This adds this new syscall to the tables used by tools such as 'perf
trace', so that one can specify it by name and have it filtered, etc.
Addressing these perf build warnings:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YD6Wsxr9ByUbab/a@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
fbcee2ebe8 ("powerpc/32: Always save non volatile GPRs at syscall entry")
That shouldn't cause any change in tooling, just silences the following
tools/perf/ build warning:
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the kernel is running at EL2, the pid of a task is exposed via VMID
instead of the CONTEXTID. Add support for this in the perf tool.
This patch respects user setting if user has specified any configs
from "contextid", "contextid1" or "contextid2"; otherwise, it
dynamically sets config based on PMU format "contextid".
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Co-developed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Link: https://lore.kernel.org/r/20210213113220.292229-4-leo.yan@linaro.org
Link: https://lore.kernel.org/r/20210224164835.3497311-5-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When setting options with the macros ETM_OPT_CTXTID and ETM_OPT_TS, the
code wrongly takes these two values (14 and 28 respectively) as bit masks,
but both are actually bit offsets. This doesn't lead to further failure
because the AND logic operation is always true for
ETM_OPT_CTXTID / ETM_OPT_TS.
This patch uses the BIT() macro for option bits, thus it can request the
correct bitmaps for "contextid" and "timestamp" when calling
cs_etm_set_option().
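The difference between using a bit offset as a mask and using BIT() can be
sketched as below. The function names are hypothetical and only serve the
illustration; the offsets 14 and 28 are the values quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(nr)		(1ULL << (nr))

/* Bit offsets, not masks. */
#define ETM_OPT_CTXTID	14
#define ETM_OPT_TS	28

/* Buggy check: treats the bit offset itself as a mask. */
bool wants_ts_buggy(uint64_t option)
{
	return (option & ETM_OPT_TS) != 0;
}

/* Fixed check: converts the offset to a mask first. */
bool wants_ts_fixed(uint64_t option)
{
	return (option & BIT(ETM_OPT_TS)) != 0;
}

/* Small self-check showing why the buggy test is always true. */
void etm_option_demo(void)
{
	/* 14 & 28 == 12, so asking for contextid alone also "sets" timestamp. */
	assert(wants_ts_buggy(ETM_OPT_CTXTID));

	/* With proper masks the two options are independent. */
	assert(!wants_ts_fixed(BIT(ETM_OPT_CTXTID)));
	assert(wants_ts_fixed(BIT(ETM_OPT_CTXTID) | BIT(ETM_OPT_TS)));
}
```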
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20210213113220.292229-3-leo.yan@linaro.org
Link: https://lore.kernel.org/r/20210224164835.3497311-4-mathieu.poirier@linaro.org
[Extract the change as a separate patch for easier review]
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current fixed metadata version format (version 0), means that adding
metadata parameter items renders files from a previous version of perf
unreadable. Per CPU parameters appear in a fixed order, but there is no
field to indicate the number of ETM parameters per CPU.
This patch updates the per CPU parameter blocks to include a NR_PARAMs
value which indicates the number of parameters in the block.
The header version is incremented to 1. Fixed ordering is retained,
new ETM parameters are added to the end of the list.
The reader code is updated to be able to read current version 0 files.
For version 1, the reader will read the number of parameters in the
per CPU block. This allows the reader to process older or newer files
that may have different numbers of parameters than in use at the
time perf was built.
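A reader along these lines could look like the sketch below. The block
layout, the V0_NR_PARAMS value and the function name are assumptions made
for the illustration; they are not the real CoreSight metadata format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of per-CPU parameters a version 0 file always has. */
#define V0_NR_PARAMS	4

/*
 * Read one per-CPU block from 'buf', which holds 'len' 64-bit words.
 * Assumed layout, for illustration only:
 *   version 0: cpu, param[0 .. V0_NR_PARAMS-1]
 *   version 1: cpu, nr_params, param[0 .. nr_params-1]
 * At most 'max' parameters are copied to 'params'; extra ones written by
 * a newer perf are skipped. '*nr' is set to the number copied.
 * Returns the number of words consumed, or 0 if the block is malformed.
 */
size_t read_cpu_block(const uint64_t *buf, size_t len, int version,
		      uint64_t *params, size_t max, size_t *nr)
{
	uint64_t n;
	size_t hdr, i;

	assert(buf != NULL && params != NULL && nr != NULL);

	if (version == 0) {
		hdr = 1;
		if (len < hdr)
			return 0;
		n = V0_NR_PARAMS;
	} else if (version == 1) {
		hdr = 2;
		if (len < hdr)
			return 0;
		n = buf[1];	/* the NR_PARAMs value of this block */
	} else {
		return 0;
	}

	if (n > len - hdr)
		return 0;

	*nr = n < max ? (size_t)n : max;
	for (i = 0; i < *nr; i++)
		params[i] = buf[hdr + i];

	return hdr + (size_t)n;
}
```

Because parameters are only ever appended, copying the first min(nr_params,
max) entries is enough for both older and newer files.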
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20210202214040.32349-1-mike.leach@linaro.org
Link: https://lore.kernel.org/r/20210224164835.3497311-2-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Grab a copy of arch/mips/kernel/syscalls/syscall_n64.tbl and use it to
generate tools/perf/arch/mips/include/generated/asm/syscalls_n64.c file,
this is similar to commit 1b700c9975 ("perf tools: Build syscall
table .c header from kernel's syscall_64.tbl")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Juxin Gao <gaojuxin@loongson.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Cc: linux-mips@vger.kernel.org
Link: http://lore.kernel.org/lkml/1612409724-3516-4-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Map perf APIs (perf_reg_name/get_arch_regstr/unwind__arch_reg_id) with
MIPS specific registers.
[ayan@wavecomp.com: repick this patch for unwinding userstack backtrace
by perf and libunwind on MIPS based CPU.]
[yangtiezhu@loongson.cn: Add sample_reg_masks[] to fix build error,
silence some checkpatch errors and warnings, and also separate the
original patches into two parts (MIPS kernel and perf tools) to merge
easily.]
The original patches:
https://lore.kernel.org/patchwork/patch/1126521/
https://lore.kernel.org/patchwork/patch/1126520/
Committer notes:
Do it as __perf_reg_name() to cope with:
067012974c ("perf tools: Fix arm64 build error with gcc-11")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Archer Yan <ayan@wavecomp.com>
Cc: David Daney <david.daney@cavium.com>
Cc: Jianlin Lv <Jianlin.Lv@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Juxin Gao <gaojuxin@loongson.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Link: http://lore.kernel.org/lkml/1612409724-3516-3-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Archer Yan <ayan@wavecomp.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
fbcee2ebe8 ("powerpc/32: Always save non volatile GPRs at syscall entry")
That shouldn't cause any change in tooling, just silences the following
tools/perf/ build warning:
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Retain the PIP packet payload as is, instead of just the CR3, because it
contains also the VMX NR flag which is needed to track VM-Entry.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation to support Intel PT decoding of virtual machine traces, add
vmlaunch and vmresume as branch instructions.
Note, sample flags will show "VMentry" even if the VM-Entry fails.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For X86, the var2_w field of PERF_SAMPLE_WEIGHT_STRUCT stands for the
instruction latency. Current perf forces the var2_w to the data->ins_lat
in the generic code. It works well for now because X86 is the only
architecture that supports the PERF_SAMPLE_WEIGHT_STRUCT, but it may
bring problems once other architectures support the sample type. For
example, the var2_w may be used to capture something else on PowerPC.
Create two architecture specific functions to parse and synthesize the
weight related samples. Move the X86 specific codes to the X86 version
functions. Other architectures can implement their own functions later
separately.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/1612540912-6562-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to use "%#" PRIx64 for u64 values, not "%lx". In the arm64 and
s390x cases the compiler doesn't complain, but let's fix this in case
this code gets copied to a 32-bit arch, like with powerpc 32-bit that
got fixed in the previous patch.
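A minimal sketch of the portable form, assuming a hypothetical helper
format_sym_end() and a local u64 typedef standing in for perf's own type:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

/*
 * Format a symbol end address. PRIx64 expands to the right length
 * modifier for a 64-bit value on both 32-bit and 64-bit targets,
 * whereas "%lx" only matches u64 where long is 64 bits wide.
 */
int format_sym_end(char *buf, size_t size, const char *name, u64 end)
{
	assert(buf != NULL && name != NULL);

	return snprintf(buf, size, "sym:%s end:%#" PRIx64, name, end);
}
```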
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hewenliang <hewenliang4@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to use "%#" PRIx64 for u64 values, not "%lx", fixing this build
problem on powerpc 32-bit:
72 13.69 ubuntu:18.04-x-powerpc : FAIL powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
arch/powerpc/util/machine.c: In function 'arch__symbols__fixup_end':
arch/powerpc/util/machine.c:23:12: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'u64 {aka long long unsigned int}' [-Werror=format=]
pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end);
^
/git/linux/tools/perf/util/debug.h:18:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
/git/linux/tools/perf/util/debug.h:33:29: note: in expansion of macro 'pr_debugN'
#define pr_debug4(fmt, ...) pr_debugN(4, pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~
/git/linux/tools/perf/util/debug.h:33:42: note: in expansion of macro 'pr_fmt'
#define pr_debug4(fmt, ...) pr_debugN(4, pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~
arch/powerpc/util/machine.c:23:2: note: in expansion of macro 'pr_debug4'
pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end);
^~~~~~~~~
cc1: all warnings being treated as errors
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[5]: *** [util] Error 2
/git/linux/tools/build/Makefile.build:139: recipe for target 'powerpc' failed
make[4]: *** [powerpc] Error 2
/git/linux/tools/build/Makefile.build:139: recipe for target 'arch' failed
make[3]: *** [arch] Error 2
73 30.47 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Fixes: 557c3eadb7 ("perf powerpc: Fix gap between kernel end and module start")
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The new sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. Users can apply either the
PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample
type to retrieve the sample weight, but they cannot apply both sample
types simultaneously.
The new sample type shares the same space as the PERF_SAMPLE_WEIGHT
sample type. The lower 32 bits are exactly the same for both sample
types. The higher 32 bits may be different for different architectures.
Add arch specific arch_evsel__set_sample_weight() to set the new sample
type for X86. Only store the lower 32 bits for the sample->weight if the
new sample type is applied. In practice, no memory access could last
longer than 4G cycles, so no data will be lost.
If the kernel doesn't support the new sample type, fall back to the
PERF_SAMPLE_WEIGHT sample type.
There is no impact for other architectures.
Committer notes:
Fixup related to PERF_SAMPLE_CODE_PAGE_SIZE, present in acme/perf/core
but not upstream yet.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/1612296553-21962-6-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On the Intel Sapphire Rapids server, an auxiliary event has to be
enabled simultaneously with the load latency event to retrieve complete
Memory Info.
Add X86 specific perf_mem_events__name() to handle the auxiliary event.
- Users are only interested in the samples of the mem-loads event.
Sample read the auxiliary event.
- The auxiliary event must be in front of the load latency event in a
group. Assume the second event to sample if the auxiliary event is the
leader.
- Add a weak is_mem_loads_aux_event() to check the auxiliary event for
X86. For other ARCHs, it always returns false.
Parse the unique event name, mem-loads-aux, for the auxiliary event.
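The grouping rules listed above can be illustrated with the sketch below.
The helper name and both format strings are assumptions chosen for the
illustration and are not a copy of perf's definitions; only the ordering
(auxiliary event first) and the sample read modifier follow the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical name builder for the mem-loads event.
 * 'has_aux' says whether the PMU exposes the mem-loads-aux event,
 * 'ldlat' is the load latency threshold.
 */
int mem_loads_event_name(char *buf, size_t size, bool has_aux,
			 unsigned int ldlat)
{
	assert(buf != NULL);

	if (has_aux) {
		/*
		 * The auxiliary event leads the group and the load latency
		 * event follows it; the ":S" group modifier asks for
		 * sample read.
		 */
		return snprintf(buf, size,
				"{cpu/mem-loads-aux/,cpu/mem-loads,ldlat=%u/pp}:S",
				ldlat);
	}

	/* No auxiliary event needed: a plain load latency event. */
	return snprintf(buf, size, "cpu/mem-loads,ldlat=%u/P", ldlat);
}
```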
Committer notes:
According to 61b985e3e7 ("perf/x86/intel: Add perf core PMU
support for Sapphire Rapids"), ENODATA is only returned by
sys_perf_event_open() when used with these auxiliary events, with this
in evsel__open_strerror():
case ENODATA:
return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. "
"Please add an auxiliary event in front of the load latency event.");
This is Ok at this point in time, but fragile long term, I pointed this
out in the e-mail thread, requesting a follow up patch to check if
ENODATA is really for this specific case.
Fixed up sizeof(MEM_LOADS_AUX_NAME) bug pointed out by Namhyung.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210205152648.GC920417@kernel.org
Link: http://lore.kernel.org/lkml/1612296553-21962-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To enable presenting of Performance Monitor Counter Registers (PMC1 to
PMC6) as part of extended registers, this patch adds these to
sample_reg_mask in the tool side (to use with -I? option).
Simplified the PERF_REG_PMU_MASK_300/31 definition. Excluded the
unsupported SPRs (MMCR3, SIER2, SIER3) from extended mask value for
CPU_FTR_ARCH_300.
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Running "perf mem report" in TUI mode fails with ENOMEM message in
powerpc:
failed to process sample
Running with debug and verbose options shows that the issue occurs while
allocating memory for sample histograms.
The error path is:
symbol__inc_addr_samples() ->
__symbol__inc_addr_samples() ->
annotated_source__histogram()
symbol__inc_addr_samples() calls annotated_source__alloc_histograms()
to allocate memory for sample histograms using calloc(). Here calloc()
fails since the size of the symbol is huge. The size of a symbol is
calculated as the difference between its start and end address.
Example histogram allocation that fails is:
sym->name is _end
sym->start is 0xc0000000027a0000
sym->end is 0xc008000003890000
symbol__size(sym) is 0x80000010f0000
In the above case, the difference between sym->start
(0xc0000000027a0000) and sym->end (0xc008000003890000) is huge.
This is the same problem as on s390 and arm64, which was fixed in commits:
b9c0a64901 ("perf annotate: Fix s390 gap between kernel end and module start")
78886f3ed3 ("perf symbols: Fix arm64 gap between kernel start and module end")
When this symbol was first read, its start and end addresses were set to
the address that matches the data from /proc/kallsyms.
After symbol__new():
symbol__new: _end 0xc0000000027a0000-0xc0000000027a0000
From /proc/kallsyms:
...
c000000002799370 b backtrace_flag
c000000002799378 B radix_tree_node_cachep
c000000002799380 B __bss_stop
c0000000027a0000 B _end
c008000003890000 t icmp_checkentry [ip_tables]
c008000003890038 t ipt_alloc_initial_table [ip_tables]
c008000003890468 T ipt_do_table [ip_tables]
c008000003890de8 T ipt_unregister_table_pre_exit [ip_tables]
...
Perf calls symbols__fixup_end(), which sets the end of the symbol to
0xc008000003890000, the next address, which is the start address of the
first module (icmp_checkentry above). This produces the huge symbol size
of 0x80000010f0000.
After symbols__fixup_end:
symbols__fixup_end: sym->name: _end
sym->start: 0xc0000000027a0000
sym->end: 0xc008000003890000
On powerpc, kernel text segment is located at 0xc000000000000000 whereas
the modules are located at very high memory addresses,
0xc00800000xxxxxxx. Since the gap between end of kernel text segment and
beginning of first module's address is high, histogram allocation using
calloc fails.
Fix this by detecting the kernel's last symbol and limiting the range of
last kernel symbol to pagesize.
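The idea can be sketched as below. This is a simplified illustration with a
hypothetical struct and function name; it assumes that module symbols can
be recognised by a '[' in their name, as in the /proc/kallsyms excerpt
above, and it takes the page size as a parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced symbol representation. */
struct sym {
	const char *name;
	uint64_t start;
	uint64_t end;
};

/* Set the end of symbol 'p' using the symbol 'c' that follows it. */
void fixup_end(struct sym *p, const struct sym *c, uint64_t page_size)
{
	assert(p != NULL && c != NULL);

	if (strchr(p->name, '[') == NULL && strchr(c->name, '[') != NULL)
		/* Last kernel symbol: do not stretch it up to the first module. */
		p->end = p->start + page_size;
	else
		p->end = c->start;
}
```

With the addresses from the example, _end would span one page instead of
0x80000010f0000 bytes.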
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-By: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1609208054-1566-1-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Topdown Microarchitecture Analysis (TMA) Method is a structured
analysis methodology to identify critical performance bottlenecks in
out-of-order processors. From the Ice Lake and later platforms, the
Topdown information can be retrieved from the dedicated "metrics"
register, which isn't impacted by other events. Also, the Topdown
metrics support both per thread/process and per core measuring. Adding
Topdown metrics events as default events can enrich the default
measuring information, and would not cost any extra multiplexing.
Introduce arch_evlist__add_default_attrs() to allow architecture
specific default events. Add the Topdown metrics events in the X86
specific arch_evlist__add_default_attrs(). Other architectures can add
their own default events later separately.
With the patch:
$ perf stat sleep 1
Performance counter stats for 'sleep 1':
0.82 msec task-clock:u # 0.001 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
61 page-faults:u # 0.074 M/sec
319,941 cycles:u # 0.388 GHz
242,802 instructions:u # 0.76 insn per cycle
54,380 branches:u # 66.028 M/sec
4,043 branch-misses:u # 7.43% of all branches
1,585,555 slots:u # 1925.189 M/sec
238,941 topdown-retiring:u # 15.0% retiring
410,378 topdown-bad-spec:u # 25.8% bad speculation
634,222 topdown-fe-bound:u # 39.9% frontend bound
304,675 topdown-be-bound:u # 19.2% backend bound
1.001791625 seconds time elapsed
0.000000000 seconds user
0.001572000 seconds sys
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210121133752.118327-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Two OP formats are currently used for SDT marker arguments in Arm64 ELF:
one is a general register xNUM (e.g. x1, x2, etc), the other uses the
stack pointer to access local variables (e.g. [sp], [sp, 8]).
This patch adds support for SDT marker arguments on Arm64: it parses the
OP and converts it to a uprobe compatible format.
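The conversion of the two OP formats can be sketched as below. This is a
lenient illustration using sscanf() rather than the parser in the patch;
the function name is hypothetical, and the output forms "%xNUM" and
"+NUM(%sp)" are assumed from the uprobe fetch argument syntax.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Convert an Arm64 SDT operand to a uprobe style argument:
 *   xNUM      -> %xNUM
 *   [sp]      -> +0(%sp)
 *   [sp, NUM] -> +NUM(%sp)
 * Returns 0 on success, -1 if the operand is not recognised or does
 * not fit in 'out'.
 */
int sdt_arm64_op_to_uprobe(const char *op, char *out, size_t size)
{
	unsigned int reg = 0;
	long off = 0;
	int n = 0, len;

	assert(op != NULL && out != NULL);

	/* General register. */
	if (sscanf(op, "x%u%n", &reg, &n) == 1 && op[n] == '\0' && reg <= 30) {
		len = snprintf(out, size, "%%x%u", reg);
		return (len > 0 && (size_t)len < size) ? 0 : -1;
	}

	/* Stack pointer based access, with or without an offset. */
	n = 0;
	if (strcmp(op, "[sp]") == 0 ||
	    (sscanf(op, "[sp, %ld]%n", &off, &n) == 1 &&
	     n > 0 && op[n] == '\0')) {
		len = snprintf(out, size, "%+ld(%%sp)", off);
		return (len > 0 && (size_t)len < size) ? 0 : -1;
	}

	return -1;
}
```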
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: He Zhe <zhe.he@windriver.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20201225052751.24513-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This silences the following tools/perf/ build warning:
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
Just make them the same:
cp arch/s390/kernel/syscalls/syscall.tbl tools/perf/arch/s390/entry/syscalls/syscall.tbl
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1608278364-6733-5-git-send-email-yangtiezhu@loongson.cn
[ There were updates after Tiezhu's post, so I just updated the copy ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This silences the following tools/perf/ build warning:
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
Just make them the same:
cp arch/powerpc/kernel/syscalls/syscall.tbl tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1608278364-6733-4-git-send-email-yangtiezhu@loongson.cn
[ There were updates after Tiezhu's post, so I just updated the copy ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is better to check syscall.tbl for s390 in check-headers.sh. This is
similar to commit c9b51a0170 ("perf tools: Move syscall_64.tbl check
into check-headers.sh").
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1608278364-6733-3-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is better to check syscall.tbl for powerpc in check-headers.sh. This is
similar to commit c9b51a0170 ("perf tools: Move syscall_64.tbl check
into check-headers.sh").
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1608278364-6733-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
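As a rough illustration of the kind of check the s390 and powerpc commits above move into check-headers.sh, the sketch below compares a tools/ copy of a file against the kernel original and produces the warning text quoted earlier when they differ. The function names `check_abi_copy` and `demo`, the placeholder file contents, and the use of Python are all assumptions made for illustration; the real check is a shell script and may deliberately ignore some differences.

```python
import os
import tempfile


def read_bytes(path):
    """Read a whole file as bytes."""
    with open(path, "rb") as f:
        return f.read()


def check_abi_copy(tools_copy, kernel_file):
    """Return a warning string if the copy has drifted, else None."""
    if read_bytes(tools_copy) == read_bytes(kernel_file):
        return None
    return ("Warning: Kernel ABI header at '%s' differs from latest "
            "version at '%s'" % (tools_copy, kernel_file))


def demo():
    """Check a stale copy, then sync it (like the cp above) and re-check."""
    with tempfile.TemporaryDirectory() as tmp:
        kernel_file = os.path.join(tmp, "kernel_syscall.tbl")
        tools_copy = os.path.join(tmp, "tools_syscall.tbl")
        # Placeholder contents, not real syscall table entries.
        with open(kernel_file, "w") as f:
            f.write("placeholder entry A\nplaceholder entry B\n")
        with open(tools_copy, "w") as f:
            f.write("placeholder entry A\n")
        before = check_abi_copy(tools_copy, kernel_file)
        # Equivalent of: cp <kernel_file> <tools_copy>
        with open(tools_copy, "wb") as f:
            f.write(read_bytes(kernel_file))
        after = check_abi_copy(tools_copy, kernel_file)
        return before, after
```

Calling `demo()` returns the warning for the stale copy and `None` once the copy has been refreshed, which mirrors how the `cp` in the sync commits silences the build warning.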
To pick the changes from:
b0a0c2615f ("epoll: wire up syscall epoll_pwait2")
That addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previously, this command returned no help message on aarch64:
-> ./perf record --user-regs=?
available registers:
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
With this change, the registers are listed.
-> ./perf record --user-regs=?
available registers: x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 lr sp pc
It's also now possible to record subsets of registers on aarch64:
-> ./perf record --user-regs=x4,x5 ls
-> ./perf report --dump-raw-trace
12801163749305260 0xc70 [0x40]: PERF_RECORD_SAMPLE(IP, 0x2): 51956/51956: 0xffffaa6571f0 period: 145785 addr: 0
... user regs: mask 0x30 ABI 64-bit
.... x4 0x000000000000006c
.... x5 0x0000001001000001
... thread: ls:51956
...... dso: /usr/lib64/ld-2.17.so
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Acked-by: John Garry <john.garry@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20201127153923.26717-1-alexandre.truong@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
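The "mask 0x30" shown in the raw dump of the aarch64 --user-regs commit above can be reproduced with a small sketch. It assumes that each register's bit position equals its position in the "available registers" list printed by that commit (x0 is bit 0, x4 is bit 4, and so on); the names `AARCH64_REGS`, `user_regs_mask` and `regs_in_mask` are hypothetical helpers, not perf functions.

```python
# Register names in the order printed by `perf record --user-regs=?` above.
# Assumption: the list position is also the bit position in the mask.
AARCH64_REGS = ["x%d" % i for i in range(30)] + ["lr", "sp", "pc"]


def user_regs_mask(spec):
    """Turn a comma-separated register list such as "x4,x5" into a bit mask."""
    mask = 0
    for name in spec.split(","):
        name = name.strip()
        if name not in AARCH64_REGS:
            raise ValueError("unknown register: %r" % name)
        mask |= 1 << AARCH64_REGS.index(name)
    return mask


def regs_in_mask(mask):
    """List the register names whose bits are set in the mask."""
    return [name for bit, name in enumerate(AARCH64_REGS) if mask & (1 << bit)]
```

With these helpers, `user_regs_mask("x4,x5")` evaluates to 0x30, matching the dump, and `regs_in_mask(0x30)` gives back the two register names.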
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This fix is for a failure that occurred in the DWARF unwind perf test.
Stack unwinders may probe memory when looking for frames.
Memory sanitizer will poison and track uninitialized memory on the
stack, and on the heap if the value is copied to the heap.
This can lead to false memory sanitizer failures for the use of an
uninitialized value.
Avoid this problem by removing the poison on the copied stack.
The full msan failure with track origins looks like:
==2168==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22
#1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13
#2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#24 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10
#1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13
#2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18
#3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#25 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
#1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2
#2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9
#3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6
#4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#17 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events'
#0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445
SUMMARY: MemorySanitizer: use-of-uninitialized-value elfutils/libdwfl/frame_unwind.c:648:8 in handle_cfi
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201113182053.754625-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds ARM SPE events for perf memory profiling:
'spe-load': event for only recording memory load ops;
'spe-store': event for only recording memory store ops;
'spe-ldst': event for recording memory load and store ops.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20201106094853.21082-10-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The x86 arch provides a test for conversion between TSC and perf time,
located in the x86 arch folder. Move this test out of the x86 arch folder
and into the common tests folder, so that the TSC test can be run on
other architectures (e.g. Arm64).
This patch also removes the inclusion of "arch-tests.h" from the test
code, which avoids a build failure on any arch that lacks this header
file.
Committer testing:
$ perf test -v tsc
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
70: Convert perf time to TSC :
--- start ---
test child forked, pid 4032834
mmap size 528384B
1st event perf time 165409788843605 tsc 336578703793868
rdtsc time 165409788854986 tsc 336578703837038
2nd event perf time 165409788855487 tsc 336578703838935
test child finished with 0
---- end ----
Convert perf time to TSC: Ok
$
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20201019100236.23675-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
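For readers unfamiliar with what the TSC test in the commit above exercises, here is a minimal sketch of a mult/shift conversion between TSC cycles and perf time, assuming the formula documented for the time_zero, time_mult and time_shift fields in the perf_event_open(2) man page. The Python function names and the parameter values used in the usage note are illustrative assumptions only; real values come from the kernel. The `ordered` helper expresses the kind of sanity check visible in the committer's output, where the rdtsc reading falls between the two events.

```python
def tsc_to_perf_time(cyc, time_zero, time_mult, time_shift):
    """Convert a TSC cycle count to perf time (ns)."""
    quot = cyc >> time_shift
    rem = cyc & ((1 << time_shift) - 1)
    return time_zero + quot * time_mult + ((rem * time_mult) >> time_shift)


def perf_time_to_tsc(ns, time_zero, time_mult, time_shift):
    """Convert perf time (ns) back to a TSC cycle count."""
    t = ns - time_zero
    quot = t // time_mult
    rem = t % time_mult
    return (quot << time_shift) + (rem << time_shift) // time_mult


def ordered(first, middle, last):
    """True if the middle reading lies between the two event readings."""
    return first <= middle <= last
```

With the made-up parameters time_zero=1000, time_mult=512 and time_shift=10, 2000 cycles map to a perf time of 2000 and back to 2000 cycles. The three TSC values from the output above (336578703793868, 336578703837038, 336578703838935) satisfy `ordered`, as do the three perf time values.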
To pick the changes from:
ecb8ac8b1f ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
That addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- cgroup improvements for 'perf stat', allowing for compact specification of events
and cgroups in the command line.
- Support per thread topdown metrics in 'perf stat'.
- Support sample-read topdown metric group in 'perf record'
- Show start of latency in addition to its end in 'perf sched latency'.
- Add min, max to 'perf script' futex-contention output, in addition to avg.
- Allow usage of 'perf_event_attr->exclusive' attribute via the new ':e' event
modifier.
- Add 'snapshot' command to 'perf record --control', using it with Intel PT.
- Support FIFO file names as alternative options to 'perf record --control'.
- Introduce branch history "streams", to compare 'perf record' runs with
'perf diff' based on branch records and report hot streams.
- Support PE executable symbol tables using libbfd, to profile, for instance, wine binaries.
- Add filter support for option 'perf ftrace -F/--funcs'.
- Allow configuring the 'disassembler_style' 'perf annotate' knob via 'perf config'
- Update CascadelakeX and SkylakeX JSON vendor events files.
- Add support for parsing perchip/percore JSON vendor events.
- Add power9 hv_24x7 core level metric events.
- Add L2 prefetch, ITLB instruction fetch hits JSON events for AMD zen1.
- Enable Family 19h users by matching Zen2 AMD vendor events.
- Use debuginfod in 'perf probe' when required debug files not found locally.
- Display negative tid in non-sample events in 'perf script'.
- Make GTK2 support opt-in
- Add build test with GTK+
- Add missing -lzstd to the fast path feature detection
- Add scripts to auto generate 'mmap', 'mremap' string<->id tables for use in 'perf trace'.
- Show python test script in verbose mode.
- Fix uncore metric expressions
- Msan uninitialized use fixes.
- Use condition variables in 'perf bench numa'
- Autodetect python3 binary in systems without python2.
- Support md5 build ids in addition to sha1.
- Add build id 'perf test' regression test.
- Fix printable strings in python3 scripts.
- Fix off by ones in 'perf trace' in arches using libaudit.
- Fix JSON event code for events referencing std arch events.
- Introduce 'perf test' shell script for Arm CoreSight testing.
- Add rdtsc() for Arm64 for use in the PERF_RECORD_TIME_CONV metadata
event and in 'perf test tsc'.
- 'perf c2c' improvements: Add "RMT Load Hit" metric, "Total Stores", fixes
and documentation update.
- Fix usage of reloc_sym in 'perf probe' when using both kallsyms and debuginfo files.
- Do not print 'Metric Groups:' unnecessarily in 'perf list'
- Refcounting fixes in the event parsing code.
- Add expand cgroup event 'perf test' entry.
- Fix out of bounds CPU map access when handling armv8_pmu events in 'perf stat'.
- Add build-id injection 'perf bench' benchmark.
- Enter namespace when reading build-id in 'perf inject'.
- Do not load map/dso when injecting build-id speeding up the 'perf inject' process.
- Add --buildid-all option to avoid processing all samples, just the mmap metadata events.
- Add feature test to check if libbfd has buildid support
- Add 'perf test' entry for PE binary format support.
- Fix typos in power8 PMU vendor events JSON files.
- Hide libtraceevent non API functions.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test results:
The first ones are container based builds of tools/perf with and without libelf
support. Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if you are interested in helping to put this in place.
$ grep "model name" -m1 /proc/cpuinfo
model name: AMD Ryzen 9 3900X 12-Core Processor
$ export PERF_TARBALL=http://192.168.122.1/perf/perf-5.9.0-rc7.tar.xz
$ dm
Thu 15 Oct 2020 01:10:56 PM -03
1 67.40 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
2 69.01 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
3 70.79 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
4 79.89 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
5 80.88 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
6 83.88 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
7 107.87 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
8 115.43 alpine:3.11 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0)
9 106.80 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
10 114.06 alpine:edge : Ok gcc (Alpine 10.2.0) 10.2.0, Alpine clang version 10.0.1
11 70.42 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1), clang version 3.8.0 (tags/RELEASE_380/final)
12 98.70 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1), clang version 10.0.0
13 80.37 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 9.3.1 20200518 (ALT Sisyphus 9.3.1-alt1), clang version 10.0.1
14 64.12 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
15 97.64 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-9), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
16 22.70 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
17 22.72 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
18 26.70 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
19 31.86 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
20 113.19 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), clang version 9.0.1 (Red Hat 9.0.1-2.module_el8.2.0+309+0c7b6b03)
21 57.23 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 10.2.1 20200908 releases/gcc-10.2.0-203-g127d693955, clang version 10.0.1
22 64.98 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
23 76.08 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
24 74.49 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)
25 78.50 debian:experimental : Ok gcc (Debian 10.2.0-15) 10.2.0, Debian clang version 11.0.0-2
26 33.30 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 10.2.0-3) 10.2.0
27 30.96 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 9.3.0-8) 9.3.0
28 32.63 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 9.3.0-8) 9.3.0
29 30.12 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
30 30.99 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final)
31 68.60 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
32 78.92 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
33 26.15 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
34 80.13 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
35 90.68 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
36 90.45 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
37 100.88 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
38 105.99 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
39 111.05 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc30)
40 29.96 fedora:30-x-ARC-glibc : Ok arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225
41 27.02 fedora:30-x-ARC-uClibc : Ok arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
42 110.47 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 9.0.1 (Fedora 9.0.1-2.fc31)
43 88.78 fedora:32 : Ok gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), clang version 10.0.0 (Fedora 10.0.0-2.fc32)
44 15.92 fedora:rawhide : FAIL gcc (GCC) 10.2.1 20200916 (Red Hat 10.2.1-4), clang version 11.0.0 (Fedora 11.0.0-0.4.rc3.fc34)
45 33.58 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0
46 65.32 mageia:5 : Ok gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
47 81.35 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
48 103.94 mageia:7 : Ok gcc (Mageia 8.4.0-1.mga7) 8.4.0, clang version 8.0.0 (Mageia 8.0.0-1.mga7)
49 91.62 manjaro:latest : Ok gcc (GCC) 10.2.0, clang version 10.0.1
50 219.87 openmandriva:cooker : Ok gcc (GCC) 10.2.0 20200723 (OpenMandriva), OpenMandriva 11.0.0-0.20200909.1 clang version 11.0.0 (/builddir/build/BUILD/llvm-project-release-11.x/clang 5cb8ffbab42358a7cdb0a67acfadb84df0779579)
51 111.76 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407], clang version 5.0.1 (tags/RELEASE_501/final 312548)
52 118.03 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0, clang version 7.0.1 (tags/RELEASE_701/final 349238)
53 107.91 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0, clang version 9.0.1
54 102.34 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c], clang version 10.0.1
55 25.33 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
56 30.45 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44.0.3)
57 104.65 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.3), clang version 9.0.1 (Red Hat 9.0.1-2.0.1.module+el8.2.0+5599+9ed9ef6d)
58 26.04 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
59 29.49 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
60 72.95 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
61 26.03 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
62 25.15 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
63 24.88 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
64 25.72 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
65 25.39 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
66 25.34 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
67 84.84 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
68 27.15 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
69 26.68 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
70 22.38 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
71 26.35 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
72 28.58 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
73 28.18 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
74 178.55 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
75 24.58 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
76 26.89 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
77 24.81 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
78 68.90 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, clang version 8.0.1-3build1 (tags/RELEASE_801/final)
79 69.31 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, clang version 10.0.0-4ubuntu1
80 30.00 ubuntu:20.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 10-20200411-0ubuntu1) 10.0.1 20200411 (experimental) [master revision bb87d5cc77d:75961caccb7:f883c46b4877f637e0fa5025b4d6b5c9040ec566]
81 70.34 ubuntu:20.10 : Ok gcc (Ubuntu 10.2.0-5ubuntu2) 10.2.0, Ubuntu clang version 10.0.1-1
$
# uname -a
Linux five 5.9.0+ #1 SMP Thu Oct 15 09:06:41 -03 2020 x86_64 x86_64 x86_64 GNU/Linux
# git log --oneline -1
744aec4df2 perf c2c: Update documentation for metrics reorganization
# perf version --build-options
perf version 5.9.rc7.g744aec4df2c5
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
aio: [ on ] # HAVE_AIO_SUPPORT
zstd: [ on ] # HAVE_ZSTD_SUPPORT
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
11: DSO data read : Ok
12: DSO data cache : Ok
13: DSO data reopen : Ok
14: Roundtrip evsel->name : Ok
15: Parse sched tracepoints fields : Ok
16: syscalls:sys_enter_openat event fields : Ok
17: Setup struct perf_event_attr : Ok
18: Match and link multiple hists : Ok
19: 'import perf' in python : Ok
20: Breakpoint overflow signal handler : Ok
21: Breakpoint overflow sampling : Ok
22: Breakpoint accounting : Ok
23: Watchpoint :
23.1: Read Only Watchpoint : Skip
23.2: Write Only Watchpoint : Ok
23.3: Read / Write Watchpoint : Ok
23.4: Modify Watchpoint : Ok
24: Number of exit events of a simple workload : Ok
25: Software clock events period values : Ok
26: Object code reading : Ok
27: Sample parsing : Ok
28: Use a dummy software event to keep tracking : Ok
29: Parse with no sample_id_all bit set : Ok
30: Filter hist entries : Ok
31: Lookup mmap thread : Ok
32: Share thread maps : Ok
33: Sort output of hist entries : Ok
34: Cumulate child hist entries : Ok
35: Track with sched_switch : Ok
36: Filter fds with revents mask in a fdarray : Ok
37: Add fd to a fdarray, making it autogrow : Ok
38: kmod_path__parse : Ok
39: Thread map : Ok
40: LLVM search and compile :
40.1: Basic BPF llvm compile : Ok
40.2: kbuild searching : Ok
40.3: Compile source for BPF prologue generation : Ok
40.4: Compile source for BPF relocation : Ok
41: Session topology : Ok
42: BPF filter :
42.1: Basic BPF filtering : Ok
42.2: BPF pinning : Ok
42.3: BPF prologue generation : Ok
42.4: BPF relocation checker : Ok
43: Synthesize thread map : Ok
44: Remove thread map : Ok
45: Synthesize cpu map : Ok
46: Synthesize stat config : Ok
47: Synthesize stat : Ok
48: Synthesize stat round : Ok
49: Synthesize attr update : Ok
50: Event times : Ok
51: Read backward ring buffer : Ok
52: Print cpu map : Ok
53: Merge cpu map : Ok
54: Probe SDT events : Ok
55: is_printable_array : Ok
56: Print bitmap : Ok
57: perf hooks : Ok
58: builtin clang support : Skip (not compiled in)
59: unit_number__scnprintf : Ok
60: mem2node : Ok
61: time utils : Ok
62: Test jit_write_elf : Ok
63: Test libpfm4 support : Skip (not compiled in)
64: Test api io : Ok
65: maps__merge_in : Ok
66: Demangle Java : Ok
67: Parse and process metrics : Ok
68: PE file support : Ok
69: Event expansion for cgroups : Ok
70: x86 rdpmc : Ok
71: Convert perf time to TSC : Ok
72: DWARF unwind : Ok
73: x86 instruction decoder - new instructions : Ok
74: Intel PT packet decoder : Ok
75: x86 bp modify : Ok
76: probe libc's inet_pton & backtrace it with ping : Ok
77: Check Arm CoreSight trace data recording and synthesized samples: Skip
78: Use vfs_getname probe to get syscall args filenames : Ok
79: Check open filename arg using perf trace + vfs_getname : Ok
80: Zstd perf.data compression/decompression : Ok
81: Add vfs_getname probe to get syscall args filenames : Ok
82: build id cache operations : Ok
#
$ git log --oneline -1
744aec4df2 (HEAD -> perf/core, quaco/perf/core) perf c2c: Update documentation for metrics reorganization
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_install_bin_O: make install-bin
make_static_O: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_newt_O: make NO_NEWT=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_sdt_O: make NO_SDT=1
make_debug_O: make DEBUG=1
make_perf_o_O: make perf.o
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libbpf_DEBUG_O: make NO_LIBBPF=1 DEBUG=1
make_clean_all_O: make clean all
make_tags_O: make tags
make_with_babeltrace_O: make LIBBABELTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libelf_O: make NO_LIBELF=1
make_no_libcrypto_O: make NO_LIBCRYPTO=1
make_with_libpfm4_O: make LIBPFM4=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_util_map_o_O: make util/map.o
make_no_slang_O: make NO_SLANG=1
make_with_gtk2_O: make GTK2=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_help_O: make help
make_pure_O: make
make_no_gtk2_O: make NO_GTK2=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1
make_install_O: make install
make_doc_O: make doc
make_no_libperl_O: make NO_LIBPERL=1
make_no_syscall_tbl_O: make NO_SYSCALL_TABLE=1
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
Merge tag 'perf-tools-for-v5.10-2020-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- cgroup improvements for 'perf stat', allowing for compact
specification of events and cgroups in the command line.
- Support per thread topdown metrics in 'perf stat'.
- Support sample-read topdown metric group in 'perf record'
- Show start of latency in addition to its end in 'perf sched
latency'.
- Add min, max to 'perf script' futex-contention output, in addition to
avg.
- Allow usage of 'perf_event_attr->exclusive' attribute via the new
':e' event modifier.
- Add 'snapshot' command to 'perf record --control', using it with
Intel PT.
- Support FIFO file names as alternative options to 'perf record
--control'.
- Introduce branch history "streams", to compare 'perf record' runs
with 'perf diff' based on branch records and report hot streams.
- Support PE executable symbol tables using libbfd, to profile, for
instance, wine binaries.
- Add filter support for option 'perf ftrace -F/--funcs'.
- Allow configuring the 'disassembler_style' 'perf annotate' knob via
'perf config'
- Update CascadelakeX and SkylakeX JSON vendor events files.
- Add support for parsing perchip/percore JSON vendor events.
- Add power9 hv_24x7 core level metric events.
- Add L2 prefetch, ITLB instruction fetch hits JSON events for AMD
zen1.
- Enable Family 19h users by matching Zen2 AMD vendor events.
- Use debuginfod in 'perf probe' when required debug files not found
locally.
- Display negative tid in non-sample events in 'perf script'.
- Make GTK2 support opt-in
- Add build test with GTK+
- Add missing -lzstd to the fast path feature detection
- Add scripts to auto generate 'mmap', 'mremap' string<->id tables for
use in 'perf trace'.
- Show python test script in verbose mode.
- Fix uncore metric expressions
- Msan uninitialized use fixes.
- Use condition variables in 'perf bench numa'
- Autodetect python3 binary in systems without python2.
- Support md5 build ids in addition to sha1.
- Add build id 'perf test' regression test.
- Fix printable strings in python3 scripts.
- Fix off by ones in 'perf trace' in arches using libaudit.
- Fix JSON event code for events referencing std arch events.
- Introduce 'perf test' shell script for Arm CoreSight testing.
- Add rdtsc() for Arm64 for use in the PERF_RECORD_TIME_CONV metadata
event and in 'perf test tsc'.
- 'perf c2c' improvements: Add "RMT Load Hit" metric, "Total Stores",
fixes and documentation update.
- Fix usage of reloc_sym in 'perf probe' when using both kallsyms and
debuginfo files.
- Do not print 'Metric Groups:' unnecessarily in 'perf list'
- Refcounting fixes in the event parsing code.
- Add expand cgroup event 'perf test' entry.
- Fix out of bounds CPU map access when handling armv8_pmu events in
'perf stat'.
- Add build-id injection 'perf bench' benchmark.
- Enter namespace when reading build-id in 'perf inject'.
- Do not load map/dso when injecting build-id speeding up the 'perf
inject' process.
- Add --buildid-all option to avoid processing all samples, just the
mmap metadata events.
- Add feature test to check if libbfd has buildid support
- Add 'perf test' entry for PE binary format support.
- Fix typos in power8 PMU vendor events JSON files.
- Hide libtraceevent non API functions.
* tag 'perf-tools-for-v5.10-2020-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (113 commits)
perf c2c: Update documentation for metrics reorganization
perf c2c: Add metrics "RMT Load Hit"
perf c2c: Correct LLC load hit metrics
perf c2c: Change header for LLC local hit
perf c2c: Use more explicit headers for HITM
perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
perf c2c: Organize metrics based on memory hierarchy
perf c2c: Display "Total Stores" as a standalone metrics
perf c2c: Display the total numbers continuously
perf bench: Use condition variables in numa.
perf jevents: Fix event code for events referencing std arch events
perf diff: Support hot streams comparison
perf streams: Report hot streams
perf streams: Calculate the sum of total streams hits
perf streams: Link stream pair
perf streams: Compare two streams
perf streams: Get the evsel_streams by evsel_idx
perf streams: Introduce branch history "streams"
perf intel-pt: Improve PT documentation slightly
perf tools: Add support for exclusive groups/events
...
Pull compat mount cleanups from Al Viro:
"The last remnants of mount(2) compat buried by Christoph.
Buried into NFS, that is.
Generally I'm less enthusiastic about "let's use in_compat_syscall()
deep in call chain" kind of approach than Christoph seems to be, but
in this case it's warranted - that had been an NFS-specific wart,
hopefully not to be repeated in any other filesystems (read: any new
filesystem introducing non-text mount options will get NAKed even if
it doesn't mess the layout up).
IOW, not worth trying to grow an infrastructure that would avoid that
use of in_compat_syscall()..."
* 'compat.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: remove compat_sys_mount
fs,nfs: lift compat nfs4 mount data handling into the nfs code
nfs: simplify nfs4_parse_monolithic
Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
compat_sys_mount is identical to the regular sys_mount now, so remove it
and use the native version everywhere.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The system register CNTVCT_EL0 can be used to retrieve the counter from
user space. Add rdtsc() for Arm64.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200914115311.2201-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Functions perf_read_tsc_conversion() and perf_event__synth_time_conv()
should work as common functions rather than x86 specific, so move these
two functions out from arch/x86 folder and place them into util/tsc.c.
Since the function perf_event__synth_time_conv() will be linked in
util/tsc.c, remove its weak version.
Committer testing:
Before/after:
# perf test tsc
70: Convert perf time to TSC : Ok
#
# perf test -v tsc
70: Convert perf time to TSC :
--- start ---
test child forked, pid 8520
mmap size 528384B
1st event perf time 592110439891 tsc 2317172044331
rdtsc time 592110441915 tsc 2317172052010
2nd event perf time 592110442336 tsc 2317172053605
test child finished with 0
---- end ----
Convert perf time to TSC: Ok
#
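The ordering visible in the verbose output above (1st event, rdtsc, 2nd event) depends on converting between perf time and TSC cycles. What follows is only a minimal sketch of that arithmetic, assuming the time_zero, time_mult and time_shift fields that the kernel exposes in the perf_event_mmap_page. The class name, the function names and the sample parameters are illustrative assumptions, not the code that lives in util/tsc.c.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TscConversion:
    """Conversion parameters, as read from the perf mmap control page."""
    time_shift: int
    time_mult: int
    time_zero: int


def tsc_to_perf_time(cyc, tc):
    """Convert a TSC cycle count to perf time (nanoseconds)."""
    quot = cyc >> tc.time_shift
    rem = cyc & ((1 << tc.time_shift) - 1)
    return tc.time_zero + quot * tc.time_mult + ((rem * tc.time_mult) >> tc.time_shift)


def perf_time_to_tsc(ns, tc):
    """Convert perf time (nanoseconds) back to a TSC cycle count."""
    quot, rem = divmod(ns - tc.time_zero, tc.time_mult)
    return (quot << tc.time_shift) + (rem << tc.time_shift) // tc.time_mult


def rdtsc_is_between(first_ns, rdtsc_cyc, second_ns, tc):
    """Check that a raw counter read falls between two event timestamps."""
    return perf_time_to_tsc(first_ns, tc) <= rdtsc_cyc <= perf_time_to_tsc(second_ns, tc)


# Made-up parameters: 512 / 2**10 = 0.5 ns per cycle, i.e. a 2 GHz counter.
EXAMPLE = TscConversion(time_shift=10, time_mult=512, time_zero=1000)
```

With the made-up EXAMPLE parameters, 2048 cycles map to perf time 2024 and back to 2048 cycles. The split into quotient and remainder keeps the intermediate products small, which matters in C where the operands are 64-bit integers.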
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Link: http://lore.kernel.org/lkml/20200914115311.2201-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the hardware TopDown metrics feature, sample-read feature should be
supported for a topdown group, e.g., sample a non-topdown event and read
a topdown metric group. But the current perf record code errors out.
For a topdown metric group, the slots event must be the leader of the
group, but the leader slots event doesn't support sampling.
To support sample-read of the topdown metric group, use the 2nd event of
the group as the "leader" for the purposes of sampling.
Only platforms with the TopDown metrics feature support sample-read of
the topdown group. Add arch_topdown_sample_read() to indicate whether the
topdown group supports sample-read.
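The leader selection described above can be sketched as follows. This is a simplified illustration only: the function name, the list-of-names representation of a group and the event names in the usage note are assumptions, and the boolean parameter merely stands in for what arch_topdown_sample_read() reports.

```python
def sampling_leader(group, topdown_sample_read):
    """Pick the event that acts as "leader" for the purposes of sampling.

    group: event names in group order, with the real group leader first.
    topdown_sample_read: True when the platform supports sample-read of
        a topdown group (assumed to come from an arch-specific check).
    """
    leader = group[0]
    # The slots event has to lead a topdown group but cannot be sampled,
    # so the 2nd event of the group takes over the sampling role.
    if topdown_sample_read and leader == "slots" and len(group) > 1:
        return group[1]
    return leader
```

For a group such as ["slots", "cycles", "topdown-retiring"] the sketch returns "cycles" when the platform supports the feature, and "slots" otherwise. Groups that are not led by slots are unaffected.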
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The group.h/c only include TopDown group related functions. The name
"group" is too generic and inaccurate. Use the name "topdown" to replace
it.
Move topdown related functions to a dedicated file, topdown.c.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds passing of pmu_event as a parameter in function
'arch_get_runtimeparam' which can be used to get details like if the
event is percore/perchip.
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200907064133.75090-5-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we use the 'intel' disassembler style we get 'ret' instead of
'retq', so add that as an alias.
# perf annotate --disassembler-style=intel --stdio2 acpi_processor_ffh_cstate_enter > before
Apply this patch and then:
# perf annotate --disassembler-style=intel --stdio2 acpi_processor_ffh_cstate_enter > after
# diff -u before after
--- before 2020-09-04 14:10:47.768414634 -0300
+++ after 2020-09-04 14:10:59.116681039 -0300
@@ -33,7 +33,7 @@
test al,0x8
↓ je 97
and DWORD PTR gs:[rip+0x7e548509],0x7fffffff
- 97: ret
+ 97: ← ret
mov rax,QWORD PTR gs:0x17bc0
lock or BYTE PTR [rax+0x2],0x20
mov rax,QWORD PTR [rax]
#
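To illustrate why the alias changes the output in the diff above: the annotation code decides how to decorate a line by looking its mnemonic up in a table of known instructions, and a mnemonic missing from the table gets no decoration. A minimal sketch, assuming a small sorted table searched by bisection; the table contents and the function names are hypothetical and much shorter than the real x86 table.

```python
import bisect

# Hypothetical, abbreviated table of (mnemonic, kind of operation),
# kept sorted by mnemonic so that it can be binary-searched.
INSTRUCTIONS = [
    ("call", "call"),
    ("je", "jump"),
    ("jmp", "jump"),
    ("ret", "ret"),   # alias, as printed with the 'intel' disassembler style
    ("retq", "ret"),  # spelling printed with the default style
]
_NAMES = [name for name, _ in INSTRUCTIONS]


def find_ops(mnemonic):
    """Return the kind of operation for a mnemonic, or None if unknown."""
    i = bisect.bisect_left(_NAMES, mnemonic)
    if i < len(_NAMES) and _NAMES[i] == mnemonic:
        return INSTRUCTIONS[i][1]
    return None


def decorate(mnemonic):
    """Prefix return instructions with the arrow seen in the 'after' output."""
    return "← " + mnemonic if find_ops(mnemonic) == "ret" else mnemonic
```

Without the "ret" entry the lookup would return None for the intel spelling and the line would be printed undecorated, which matches the 'before' output.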
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Matt P. Dziubinski <matdzb@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit 61a47c1ad3 ("sysctl: Remove the sysctl system call"),
sys_sysctl is actually unavailable: any input can only return an error.
We have been warning about people using the sysctl system call for years
and believe there are no more users. Even if there are users of this
interface, if they have not complained or fixed their code by now, they
probably are not going to, so there is no point in warning them any
longer.
So completely remove sys_sysctl on all architectures.
[nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will@kernel.org> [arm/arm64]
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bin Meng <bin.meng@windriver.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: chenzefeng <chenzefeng2@huawei.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Diego Elio Pettenò <flameeyes@flameeyes.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kars de Jong <jongk@linux-m68k.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Paul Burton <paulburton@kernel.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Sven Schnelle <svens@stackframe.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com>
Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
New features:
- Introduce controlling how 'perf stat' and 'perf record' works via a
control file descriptor, allowing starting with events configured but
disabled until commands are received via the control file descriptor.
This allows, for instance, tools such as Intel VTune to make further
use of perf as their Linux platform driver.
- Improve 'perf record' to register in a perf.data file header the clockid
used to help later correlate things like syslog files and perf events
recorded.
- Add basic syscall and find_next_bit benchmarks to 'perf bench'.
- Allow using computed metrics in calculating other metrics. For instance:
{
.metric_expr = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit",
.metric_name = "DCache_L2_All_Hits",
},
{
.metric_expr = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss",
.metric_name = "DCache_L2_All_Miss",
},
{
.metric_expr = "dcache_l2_all_hits + dcache_l2_all_miss",
.metric_name = "DCache_L2_All",
}
- Add support for 'd_ratio', '>' and '<' operators to the expression resolver used
in calculating metrics in 'perf stat'.
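The idea of a metric referring to other metrics, together with the 'd_ratio' operator, can be sketched as below. This is a toy resolver, not perf's expression parser: the function names, the event counts and the "dcache_l2_hit_ratio" metric are made up for illustration, metric names are assumed to match case-insensitively (as "dcache_l2_all_hits" matches "DCache_L2_All_Hits" in the example above), and d_ratio is assumed to yield 0 when the divisor is 0.

```python
import re

# Event and metric names may contain dots, so match them as whole tokens.
_IDENT = re.compile(r"(?<![\w.])[A-Za-z_][\w.]*")


def d_ratio(num, den):
    """Division that yields 0 instead of failing on a zero divisor."""
    return 0.0 if den == 0 else num / den


_FUNCS = {"max": max, "min": min, "d_ratio": d_ratio}


def evaluate(name, metrics, counts, _active=()):
    """Evaluate a metric, resolving references to other metrics first."""
    key = name.lower()
    if key in _active:
        raise ValueError(f"metric {name!r} refers to itself")

    def substitute(match):
        ident = match.group(0).lower()
        if ident in _FUNCS:
            return ident
        if ident in metrics:
            value = evaluate(ident, metrics, counts, _active + (key,))
        else:
            value = counts[ident]
        return f"({float(value)!r})"

    arithmetic = _IDENT.sub(substitute, metrics[key])
    # Only numbers, operators and the helper functions remain at this point.
    return float(eval(arithmetic, {"__builtins__": {}}, _FUNCS))


METRICS = {
    "dcache_l2_all_hits": "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit",
    "dcache_l2_all_miss": "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0)"
                          " + l2_rqsts.pf_miss + l2_rqsts.rfo_miss",
    "dcache_l2_all": "dcache_l2_all_hits + dcache_l2_all_miss",
    # Hypothetical metric, added here only to exercise d_ratio.
    "dcache_l2_hit_ratio": "d_ratio(dcache_l2_all_hits, dcache_l2_all)",
}

# Made-up event counts.
COUNTS = {
    "l2_rqsts.demand_data_rd_hit": 100, "l2_rqsts.pf_hit": 20,
    "l2_rqsts.rfo_hit": 30, "l2_rqsts.all_demand_data_rd": 130,
    "l2_rqsts.pf_miss": 5, "l2_rqsts.rfo_miss": 15,
}
```

With these made-up counts, evaluate("DCache_L2_All", METRICS, COUNTS) first computes the hits (150) and misses (50) metrics and then adds them. The tuple of active names guards against a metric that refers back to itself.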
Support for new kernel features:
- Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope with
things like ftrace, trampolines, i.e. changes in the kernel text that gets
in the way of properly decoding Intel PT hardware traces, for instance.
Intel PT:
- Add various knobs to reduce the volume of Intel PT traces by reducing the
level of details such as decoding just some types of packets (e.g., FUP/TIP,
PSB+), also filtering by time range.
- Add new itrace options (log flags to the 'd' option, error flags to the 'e'
one, etc), controlling how Intel PT is transformed into perf events, document
some missing options (e.g., how to synthesize callchains).
BPF:
- Properly report BPF errors when parsing events.
- Do not setup side-band events if LIBBPF is not linked, fixing a segfault.
Libraries:
- Improvements on the libtraceevent plugin mechanism.
- Improve libtraceevent support for KVM trace events SVM exit reasons.
- Add libtraceevent plugins for decoding syscalls/sys_enter_futex and for tlb_flush.
- Ensure sample_period is set for libpfm4 events in 'perf test'.
- Fixup libperf namespacing, to make sure what is in libperf has the perf_
namespace while what is now only in tools/perf/ doesn't use that prefix.
Arch specific:
- Improve the testing of vendor events and metrics in 'perf test'.
- Allow no ARM CoreSight hardware tracer sink to be specified on command line.
- Fix arm_spe_x recording when mixed with other perf events.
- Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to list of idle symbols.
- List kernel supplied event aliases for arm64 in 'perf list'.
- Add support for extended register capability in PowerPC 9 and 10.
- Add nest IMC power9 metric events.
Miscellaneous:
- No need to setup sample_regs_intr/sample_regs_user for dummy events.
- Update various copies of kernel headers, some causing perf to handle new
syscalls, MSRs, etc.
- Improve usage of flex and yacc, enabling warnings and addressing the fallout.
- Add missing '--output' option to 'perf kmem' so that it can pass it along to 'perf record'.
- 'perf probe' fixes related to adding multiple probes on the same address for
the same event.
- Make 'perf probe' warn if the target function is a GNU indirect function.
- Remove //anon mmap events from 'perf inject jit' to fix supporting both using
ELF files for generated functions and the perf-PID.map approaches.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test results:
The first ones are container based builds of tools/perf with and without libelf
support. Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to the lack of multi-arch devel
packages, which are available and used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if you are interested in helping to put this in place.
fedora:rawhide with python3 and gcc 10.1.1-2 is failing (10.1.1-1 on fedora:32
works), fixes will be provided soon.
clearlinux:latest is failing on libbpf, there is a fix already in the bpf tree.
The ones failing when linking with libllvm, not the default build, were
restricted to clang-9/llvm-9, working with anything before or after, e.g.,
using clang-8 on ubuntu:19.10 and clang-11 on debian:experimental fixed the
build in those environments.
# export PERF_TARBALL=http://192.168.124.1/perf/perf-5.8.0.tar.xz
# dm
1 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
2 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
3 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
4 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
5 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
6 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
7 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
8 alpine:3.11 : Ok gcc (Alpine 9.2.0) 9.2.0, Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0)
9 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
10 alpine:edge : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (git://git.alpinelinux.org/aports 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
11 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1), clang version 3.8.0 (tags/RELEASE_380/final)
12 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1), clang version 7.0.1
13 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 9.2.1 20200123 (ALT Sisyphus 9.2.1-alt3), clang version 10.0.0
14 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
15 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
16 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
17 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
18 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
19 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
20 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), clang version 9.0.1 (Red Hat 9.0.1-2.module_el8.2.0+309+0c7b6b03)
21 clearlinux:latest : FAIL gcc (Clear Linux OS for Intel Architecture) 10.2.1 20200723 releases/gcc-10.2.0-3-g677b80db41, clang version 10.0.1
gcc (Clear Linux OS for Intel Architecture) 10.2.1 20200723 releases/gcc-10.2.0-3-g677b80db41
btf.c: In function 'btf__parse_raw':
btf.c:625:28: error: 'btf' may be used uninitialized in this function [-Werror=maybe-uninitialized]
625 | return err ? ERR_PTR(err) : btf;
| ~~~~~~~~~~~~~~~~~~~^~~~~
22 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
23 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
24 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8 (tags/RELEASE_701/final)
25 debian:experimental : Ok gcc (Debian 10.2.0-3) 10.2.0, Debian clang version 11.0.0-+rc1-1
26 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 9.3.0-8) 9.3.0
27 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0
28 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 9.3.0-8) 9.3.0
29 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909
30 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
31 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final)
32 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
33 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
34 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
35 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
36 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
37 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
38 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
39 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
40 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc30)
41 fedora:30-x-ARC-glibc : Ok arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225
42 fedora:30-x-ARC-uClibc : Ok arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
43 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 9.0.1 (Fedora 9.0.1-2.fc31)
44 fedora:32 : Ok gcc (GCC) 10.1.1 20200507 (Red Hat 10.1.1-1), clang version 10.0.0 (Fedora 10.0.0-2.fc32)
45 fedora:rawhide : FAIL gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), clang version 10.0.0 (Fedora 10.0.0-10.fc33)
gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1)
util/scripting-engines/trace-event-python.c: In function 'python_start_script':
util/scripting-engines/trace-event-python.c:1595:2: error: 'visibility' attribute ignored [-Werror=attributes]
1595 | PyMODINIT_FUNC (*initfunc)(void);
| ^~~~~~~~~~~~~~
46 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0
47 mageia:5 : Ok gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
48 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
49 mageia:7 : Ok gcc (Mageia 8.3.1-0.20190524.1.mga7) 8.3.1 20190524, clang version 8.0.0 (Mageia 8.0.0-1.mga7)
50 manjaro:latest : Ok gcc (GCC) 9.2.0, clang version 9.0.0 (tags/RELEASE_900/final)
51 openmandriva:cooker : Ok gcc (GCC) 10.0.0 20200502 (OpenMandriva), clang version 10.0.1
52 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538], clang version 5.0.1 (tags/RELEASE_501/final 312548)
53 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0, clang version 7.0.1 (tags/RELEASE_701/final 349238)
54 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0, clang version 9.0.1
55 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553)
56 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200728 [revision c0438ced53bcf57e4ebb1c38c226e41571aca892], clang version 10.0.1
57 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
58 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39.0.5)
59 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.3), clang version 9.0.1 (Red Hat 9.0.1-2.0.1.module+el8.2.0+5599+9ed9ef6d)
60 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
61 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
62 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
63 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
64 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
65 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
66 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
67 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
68 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
69 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
70 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
71 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
72 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
73 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
74 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
75 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
76 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
77 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
78 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
79 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
80 ubuntu:18.10 : Ok gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, clang version 7.0.0-3 (tags/RELEASE_700/final)
81 ubuntu:19.04 : Ok gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, clang version 8.0.0-3 (tags/RELEASE_800/final)
82 ubuntu:19.04-x-alpha : Ok alpha-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
83 ubuntu:19.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0
84 ubuntu:19.04-x-hppa : Ok hppa-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0
85 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, clang version 8.0.1-3build1 (tags/RELEASE_801/final)
86 219.74 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, clang version 10.0.0-4ubuntu1
#
# uname -a
Linux quaco 5.7.12-200.fc32.x86_64 #1 SMP Sat Aug 1 16:13:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
# git log --oneline -1
1101c872c8 perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
# perf version --build-options
perf version 5.8.g1101c872c8c7
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
gtk2: [ on ] # HAVE_GTK2_SUPPORT
syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
aio: [ on ] # HAVE_AIO_SUPPORT
zstd: [ on ] # HAVE_ZSTD_SUPPORT
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Skip (some metrics failed)
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
11: DSO data read : Ok
12: DSO data cache : Ok
13: DSO data reopen : Ok
14: Roundtrip evsel->name : Ok
15: Parse sched tracepoints fields : Ok
16: syscalls:sys_enter_openat event fields : Ok
17: Setup struct perf_event_attr : Ok
18: Match and link multiple hists : Ok
19: 'import perf' in python : Ok
20: Breakpoint overflow signal handler : Ok
21: Breakpoint overflow sampling : Ok
22: Breakpoint accounting : Ok
23: Watchpoint :
23.1: Read Only Watchpoint : Skip
23.2: Write Only Watchpoint : Ok
23.3: Read / Write Watchpoint : Ok
23.4: Modify Watchpoint : Ok
24: Number of exit events of a simple workload : Ok
25: Software clock events period values : Ok
26: Object code reading : FAILED!
Fix being evaluated
27: Sample parsing : Ok
28: Use a dummy software event to keep tracking : Ok
29: Parse with no sample_id_all bit set : Ok
30: Filter hist entries : Ok
31: Lookup mmap thread : Ok
32: Share thread maps : Ok
33: Sort output of hist entries : Ok
34: Cumulate child hist entries : Ok
35: Track with sched_switch : Ok
36: Filter fds with revents mask in a fdarray : Ok
37: Add fd to a fdarray, making it autogrow : Ok
38: kmod_path__parse : Ok
39: Thread map : Ok
40: LLVM search and compile :
40.1: Basic BPF llvm compile : Ok
40.2: kbuild searching : Ok
40.3: Compile source for BPF prologue generation : Ok
40.4: Compile source for BPF relocation : Ok
41: Session topology : Ok
42: BPF filter :
42.1: Basic BPF filtering : Ok
42.2: BPF pinning : Ok
42.3: BPF prologue generation : Ok
42.4: BPF relocation checker : Ok
43: Synthesize thread map : Ok
44: Remove thread map : Ok
45: Synthesize cpu map : Ok
46: Synthesize stat config : Ok
47: Synthesize stat : Ok
48: Synthesize stat round : Ok
49: Synthesize attr update : Ok
50: Event times : Ok
51: Read backward ring buffer : Ok
52: Print cpu map : Ok
53: Merge cpu map : Ok
54: Probe SDT events : Ok
55: is_printable_array : Ok
56: Print bitmap : Ok
57: perf hooks : Ok
58: builtin clang support : Skip (not compiled in)
59: unit_number__scnprintf : Ok
60: mem2node : Ok
61: time utils : Ok
62: Test jit_write_elf : Ok
63: Test libpfm4 support : Skip (not compiled in)
64: Test api io : Ok
65: maps__merge_in : Ok
66: Demangle Java : Ok
67: Parse and process metrics : Ok
68: x86 rdpmc : Ok
69: Convert perf time to TSC : Ok
70: DWARF unwind : Ok
71: x86 instruction decoder - new instructions : Ok
72: Intel PT packet decoder : Ok
73: x86 bp modify : Ok
74: probe libc's inet_pton & backtrace it with ping : Ok
75: Use vfs_getname probe to get syscall args filenames : Ok
76: Add vfs_getname probe to get syscall args filenames : Ok
77: Check open filename arg using perf trace + vfs_getname: Ok
78: Zstd perf.data compression/decompression : Ok
#
$ cd ~acme/git/perf ; git log --oneline -1; time make -C tools/perf build-test
1101c872c8 (HEAD -> perf/core, quaco/perf/core) perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libcrypto_O: make NO_LIBCRYPTO=1
make_no_sdt_O: make NO_SDT=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_syscall_tbl_O: make NO_SYSCALL_TABLE=1
make_no_newt_O: make NO_NEWT=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libbpf_DEBUG_O: make NO_LIBBPF=1 DEBUG=1
make_static_O: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1
make_pure_O: make
make_install_bin_O: make install-bin
make_no_libelf_O: make NO_LIBELF=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_with_babeltrace_O: make LIBBABELTRACE=1
make_debug_O: make DEBUG=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_tags_O: make tags
make_doc_O: make doc
make_no_gtk2_O: make NO_GTK2=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_slang_O: make NO_SLANG=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libperl_O: make NO_LIBPERL=1
make_clean_all_O: make clean all
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_with_libpfm4_O: make LIBPFM4=1
make_help_O: make help
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_util_map_o_O: make util/map.o
make_install_O: make install
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_perf_o_O: make perf.o
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXzFq5QAKCRCyPKLppCJ+
J46OAP40WV9uE1L+3NznUF5D+zh7++SquzEBoABZiYNAXNhrGQEA2QZqAspkbLoo
hCM/yo7lO1XixiTGlp533b14OvE5oQk=
=n4VQ
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
"New features:
- Introduce controlling how 'perf stat' and 'perf record' work via a
control file descriptor, allowing them to start with events configured
but disabled until commands are received via the control file
descriptor. This allows, for instance, tools such as Intel VTune
to make further use of perf as their Linux platform driver.
- Improve 'perf record' to register in a perf.data file header the
clockid used, to help later correlate things like syslog files and
perf events recorded.
- Add basic syscall and find_next_bit benchmarks to 'perf bench'.
- Allow using computed metrics in calculating other metrics. For
instance:
{
.metric_expr = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit",
.metric_name = "DCache_L2_All_Hits",
},
{
.metric_expr = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss",
.metric_name = "DCache_L2_All_Miss",
},
{
.metric_expr = "dcache_l2_all_hits + dcache_l2_all_miss",
.metric_name = "DCache_L2_All",
}
- Add support for 'd_ratio', '>' and '<' operators to the expression
resolver used in calculating metrics in 'perf stat'.
Support for new kernel features:
- Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope
with things like ftrace, trampolines, i.e. changes in the kernel
text that gets in the way of properly decoding Intel PT hardware
traces, for instance.
Intel PT:
- Add various knobs to reduce the volume of Intel PT traces by
reducing the level of details such as decoding just some types of
packets (e.g., FUP/TIP, PSB+), also filtering by time range.
- Add new itrace options (log flags to the 'd' option, error flags to
the 'e' one, etc), controlling how Intel PT is transformed into
perf events, document some missing options (e.g., how to synthesize
callchains).
BPF:
- Properly report BPF errors when parsing events.
- Do not setup side-band events if LIBBPF is not linked, fixing a
segfault.
Libraries:
- Improvements to the libtraceevent plugin mechanism.
- Improve libtraceevent support for KVM trace events SVM exit reasons.
- Add libtraceevent plugins for decoding syscalls/sys_enter_futex
and for tlb_flush.
- Ensure sample_period is set for libpfm4 events in 'perf test'.
- Fixup libperf namespacing, to make sure what is in libperf has the
perf_ namespace while what is now only in tools/perf/ doesn't use
that prefix.
Arch specific:
- Improve the testing of vendor events and metrics in 'perf test'.
- Allow no ARM CoreSight hardware tracer sink to be specified on
command line.
- Fix arm_spe_x recording when mixed with other perf events.
- Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to list of
idle symbols.
- List kernel supplied event aliases for arm64 in 'perf list'.
- Add support for extended register capability in PowerPC 9 and 10.
- Added nest IMC power9 metric events.
Miscellaneous:
- No need to setup sample_regs_intr/sample_regs_user for dummy
events.
- Update various copies of kernel headers, some causing perf to
handle new syscalls, MSRs, etc.
- Improve usage of flex and yacc, enabling warnings and addressing
the fallout.
- Add missing '--output' option to 'perf kmem' so that it can pass it
along to 'perf record'.
- 'perf probe' fixes related to adding multiple probes on the same
address for the same event.
- Make 'perf probe' warn if the target function is a GNU indirect
function.
- Remove //anon mmap events from 'perf inject jit' to fix supporting
both using ELF files for generated functions and the perf-PID.map
approaches"
* tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (144 commits)
perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
perf tools powerpc: Add support for extended regs in power10
perf tools powerpc: Add support for extended register capability
tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
tools arch x86: Sync asm/cpufeatures.h with the kernel sources
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers UAPI: update linux/in.h copy
tools headers API: Update close_range affected files
perf script: Add 'tod' field to display time of day
perf script: Change the 'enum perf_output_field' enumerators to be 64 bits
perf data: Add support to store time of day in CTF data conversion
perf tools: Move clockid_res_ns under clock struct
perf header: Store clock references for -k/--clockid option
perf tools: Add clockid_name function
perf clockid: Move parse_clockid() to new clockid object
tools lib traceevent: Handle possible strdup() error in tep_add_plugin_path() API
libtraceevent: Fixed description of tep_add_plugin_path() API
libtraceevent: Fixed type in PRINT_FMT_STING
libtraceevent: Fixed broken indentation in parse_ip4_print_args()
libtraceevent: Improve error handling of tep_plugin_add_option() API
...
Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add support for (optionally) using queued spinlocks & rwlocks.
- Support for a new faster system call ABI using the scv instruction on
Power9 or later.
- Drop support for the PROT_SAO mmap/mprotect flag as it will be
unsupported on Power10 and future processors, leaving us with no way
to implement the functionality it requests. This risks breaking
userspace, though we believe it is unused in practice.
- A bug fix for, and then the removal of, our custom stack expansion
checking. We now allow stack expansion up to the rlimit, like other
architectures.
- Remove the remnants of our (previously disabled) topology update
code, which tried to react to NUMA layout changes on virtualised
systems, but was prone to crashes and other problems.
- Add PMU support for Power10 CPUs.
- A change to our signal trampoline so that we don't unbalance the link
stack (branch return predictor) in the signal delivery path.
- Lots of other cleanups, refactorings, smaller features and so on as
usual.
Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey
Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju
T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan
S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris
Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan
Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn
Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand,
Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel
Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh
Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan
Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton
Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan
Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran,
Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud,
Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy
Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh
Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar
Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza
Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov,
Wei Yongjun, Wen Xiong, YueHaibing.
* tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (337 commits)
selftests/powerpc: Fix pkey syscall redefinitions
powerpc: Fix circular dependency between percpu.h and mmu.h
powerpc/powernv/sriov: Fix use of uninitialised variable
selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUs
powerpc/40x: Fix assembler warning about r0
powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
cpuidle: pseries: Fixup exit latency for CEDE(0)
cpuidle: pseries: Add function to parse extended CEDE records
cpuidle: pseries: Set the latency-hint before entering CEDE
selftests/powerpc: Fix online CPU selection
powerpc/perf: Consolidate perf_callchain_user_[64|32]()
powerpc/pseries/hotplug-cpu: Remove double free in error path
powerpc/pseries/mobility: Add pr_debug() for device tree changes
powerpc/pseries/mobility: Set pr_fmt()
powerpc/cacheinfo: Warn if cache object chain becomes unordered
powerpc/cacheinfo: Improve diagnostics about malformed cache lists
powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages
powerpc/cacheinfo: Set pr_fmt()
powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
...
Add support for the extended regs that are new in power10 (MMCR3,
SIER2, SIER3) to sample_reg_mask on the tool side, for use with the `-I?`
option. Also add a PVR check to send the extended mask for power10 to the
kernel while capturing extended regs in each sample.
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add extended regs to sample_reg_mask on the tool side for use with the `-I?`
option. The perf tools side uses the extended mask to display the platform
supported register names (with the -I? option) to the user and also sends
this mask to the kernel to capture the extended registers in each
sample. Hence decide the mask value based on the processor version.
Currently the definitions for `mfspr` and `SPRN_PVR` are part of
`arch/powerpc/util/header.c`. Move them to a header file so that they
can be reused in other source files as well.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
[Decide extended mask at run time based on platform]
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
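The run-time choice described in the two powerpc commits above (pick the extended register mask from the processor version) can be sketched roughly as follows. This is only an illustration: the PVR is passed in as a parameter rather than read with mfspr, the function name is hypothetical, and the two mask values are placeholders, not the masks used by the kernel headers.

```c
#include <stdint.h>

/* The processor version lives in the upper 16 bits of the PVR. */
#define PVR_VER(pvr)	(((pvr) >> 16) & 0xFFFF)
#define PVR_POWER9	0x004E
#define PVR_POWER10	0x0080

/* Placeholder masks for illustration only (assumed, not the real values). */
#define EXT_MASK_POWER9		0x0007000000000000ULL
#define EXT_MASK_POWER10	0x003F000000000000ULL

/*
 * Hypothetical helper: return the extended regs mask for a given PVR,
 * or 0 when the processor has no extended regs known to this sketch.
 */
uint64_t extended_mask_for_pvr(uint32_t pvr)
{
	switch (PVR_VER(pvr)) {
	case PVR_POWER9:
		return EXT_MASK_POWER9;
	case PVR_POWER10:
		return EXT_MASK_POWER10;
	default:
		return 0;
	}
}
```

Taking the PVR as an argument keeps the sketch buildable on any architecture; on powerpc the caller would pass the value read from SPRN_PVR.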
To pick the changes from:
55db9c0e85 ("net: remove compat_sys_{get,set}sockopt")
9b4feb630e ("arch: wire-up close_range()")
That automagically adds the 'close_range' syscall to tools such as 'perf
trace'.
Before:
# perf trace -e close_range
event syntax error: 'close_range'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
#
After, system wide strace like tracing for this syscall:
# perf trace -e close_range
^C#
No calls, I need some test proggie :-)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
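A test proggie of the kind mentioned above could be built around a small wrapper like the one below. It is a sketch under assumptions: the helper name is made up, and the fallback number 436 is what the syscall tables are understood to assign to close_range, used only when the installed headers do not define __NR_close_range. On kernels without the syscall the wrapper is expected to return -ENOSYS.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_close_range
#define __NR_close_range 436	/* assumed fallback for older headers */
#endif

/*
 * Hypothetical helper: close every fd in [first, last].
 * Returns 0 on success or a negative errno value.
 */
int try_close_range(unsigned int first, unsigned int last)
{
	if (syscall(__NR_close_range, first, last, 0U) == -1)
		return -errno;
	return 0;
}
```

Calling this from a tiny program while 'perf trace -e close_range' runs system wide should then produce some output to look at.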
To sync headers, for instance, in this case tools/perf was ahead of
upstream till Linus merged tip/perf/core to get the
PERF_RECORD_TEXT_POKE changes:
Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When recording with cache-misses and an arm_spe_x event, I found that it
will just fail without showing any error info if I put cache-misses
after the 'arm_spe_x' event.
[root@localhost 0620]# perf record -e cache-misses \
-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.067 MB perf.data ]
[root@localhost 0620]#
[root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
-e cache-misses sleep 1
[root@localhost 0620]#
The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().
We currently don't support multiple concurrent arm_spe_x events; that
is checked in arm_spe_recording_options(), which shows the relevant
info. So check for and record the first 'arm_spe_pmu' found to
fix this issue here.
Fixes: ffd3d18c20 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
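The idea of the fix described above, remembering the first SPE PMU that matches any event instead of relying on the loop index after the last event, can be sketched like this. The struct and function names are simplified stand-ins, not the actual perf data structures.

```c
#include <stddef.h>

/* Simplified stand-ins for the perf structures (hypothetical). */
struct pmu {
	unsigned int type;
};

struct event {
	unsigned int type;
};

/*
 * Return the first SPE PMU whose type matches any event in the list,
 * or NULL when none of the events is an SPE event. Returning as soon
 * as a match is found avoids using an index that ran past the end of
 * the PMU array because a later, unrelated event matched nothing.
 */
const struct pmu *find_first_spe_pmu(const struct event *events,
				     size_t nr_events,
				     const struct pmu *spe_pmus,
				     size_t nr_pmus)
{
	for (size_t e = 0; e < nr_events; e++) {
		for (size_t i = 0; i < nr_pmus; i++) {
			if (events[e].type == spe_pmus[i].type)
				return &spe_pmus[i];
		}
	}
	return NULL;
}
```

With this shape the order of the -e options no longer matters: the SPE event is found whether it comes first or last.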
- auxtrace_record__init() is called only once, so there is no point in
using a static variable to cache the results of
find_all_arm_spe_pmus(); make it local and free the results after use.
- Also, even though SPE is micro-architecture dependent, so far it only
supports "statistical-profiling-extension-v1" and we have no chance to
use multiple SPE PMU events in a perf command.
So remove the useless check code to make it clear.
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-3-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When recording with cache-misses and arm_spe_x event, I found that it
will just fail without showing any error info if i put cache-misses
after 'arm_spe_x' event.
[root@localhost 0620]# perf record -e cache-misses \
-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.067 MB perf.data ]
[root@localhost 0620]#
[root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
-e cache-misses sleep 1
[root@localhost 0620]#
The current code can only work if the only event to be traced is an
'arm_spe_x', or if it is the last event to be specified. Otherwise the
last event type will be checked against all the arm_spe_pmus[i]->types,
none will match and an out of bound 'i' index will be used in
arm_spe_recording_init().
Multiple concurrent arm_spe_x events are not currently supported; that
is checked in arm_spe_recording_options(), which reports the relevant
information. So check for and record the first 'arm_spe_pmu' found to
fix this issue here.
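The indexing problem described above can be illustrated with a minimal
sketch. The types and names used here (struct pmu_stub, struct evsel_stub,
find_first_spe_pmu) are hypothetical stand-ins, not the real perf
structures. The sketch only shows the idea of remembering the first
matching PMU instead of reusing a loop index after the loop has ended.

```c
#include <stddef.h>

/* Hypothetical stand-ins for perf's PMU and event selector types. */
struct pmu_stub {
	unsigned int type;
};

struct evsel_stub {
	unsigned int type;
};

/*
 * Return the first PMU whose type matches any event in the list, or NULL
 * when no event is an SPE event. Returning the match itself avoids
 * indexing the PMU array with a stale or out-of-bound loop counter.
 */
const struct pmu_stub *
find_first_spe_pmu(const struct evsel_stub *evsels, size_t nr_evsels,
		   const struct pmu_stub *pmus, size_t nr_pmus)
{
	for (size_t e = 0; e < nr_evsels; e++) {
		for (size_t p = 0; p < nr_pmus; p++) {
			if (evsels[e].type == pmus[p].type)
				return &pmus[p];
		}
	}
	return NULL;
}
```

With this shape the position of the SPE event on the command line does
not matter, and a NULL result can be handled explicitly by the caller.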
Fixes: ffd3d18c20 ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On PAPR+ the hcall() on 0x1B0 is called H_DISABLE_AND_GET, but got
defined as H_DISABLE_AND_GETC instead.
This define was introduced with a typo in commit <b13a96cfb055>
("[PATCH] powerpc: Extends HCALL interface for InfiniBand usage"), and was
later used without having the typo noticed.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200707004812.190765-1-leobras.c@gmail.com
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.
This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.
It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong in
a consolidation patch like this one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Select text poke events when available and the kernel is being traced.
Process text poke events to invalidate entries in Intel PT's instruction
cache.
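As a rough illustration of the invalidation step, the sketch below drops
cached entries that overlap a poked address range. The structure and
function names (struct cache_entry_stub, invalidate_poked_range) are
hypothetical and are not taken from the Intel PT decoder; the actual
cache layout and invalidation granularity may well differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached decode result covering [addr, addr + len). */
struct cache_entry_stub {
	uint64_t addr;
	uint32_t len;
	bool valid;
};

/*
 * Mark every valid entry that overlaps [poke_addr, poke_addr + poke_len)
 * as invalid, so it is decoded again from the new bytes. Returns the
 * number of entries that were invalidated.
 */
size_t invalidate_poked_range(struct cache_entry_stub *entries, size_t nr,
			      uint64_t poke_addr, uint64_t poke_len)
{
	size_t dropped = 0;

	for (size_t i = 0; i < nr; i++) {
		struct cache_entry_stub *e = &entries[i];

		if (!e->valid)
			continue;
		if (e->addr < poke_addr + poke_len &&
		    poke_addr < e->addr + e->len) {
			e->valid = false;
			dropped++;
		}
	}
	return dropped;
}
```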
Example:
The example requires kernel config:
CONFIG_PROC_SYSCTL=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
Before:
# perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.341 MB perf.data.before ]
[1]+ Terminated perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.before --itrace=e >/dev/null
Warning:
474 instruction trace errors
After:
# perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.646 MB perf.data.after ]
[1]+ Terminated perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.after --itrace=e >/dev/null
Example:
The example requires kernel config:
# CONFIG_FUNCTION_TRACER is not set
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.026 MB perf.data (68 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.268 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
207 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (107 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 39.978 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
6 565303693547 0x291f18 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027a000 len 4096 type 2 flags 0x0 name kprobe_insn_page
6 565303697010 0x291f68 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 0 new len 6
6 565303838278 0x291fa8 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027c000 len 4096 type 2 flags 0x0 name kprobe_optinsn_page
6 565303848286 0x291ff8 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 0 new len 106
6 565369336743 0x292af8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
7 566434327704 0x217c208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
6 566456313475 0x293198 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 106 new len 0
6 566456314935 0x293238 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 6 new len 0
Example:
The example requires kernel config:
CONFIG_FUNCTION_TRACER=y
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.022 MB perf.data (6 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 43.850 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
8 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.037 MB perf.data (206 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.442 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
5 312216133258 0x8bafe0 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc0360000 len 415 type 2 flags 0x0 name ftrace_trampoline
5 312216133494 0x8bb030 [0x1d8]: PERF_RECORD_TEXT_POKE addr 0xffffffffc0360000 old len 0 new len 415
5 312216229563 0x8bb208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216239063 0x8bb248 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
5 312216727230 0x8bb288 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
5 312216739322 0x8bb2c8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216748321 0x8bb308 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287163462 0x2817430 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287174890 0x2817470 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287818979 0x28174b0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
7 313287829357 0x28174f0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287841246 0x2817530 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up fixes and move perf/core forward, minor conflict as
perf_evlist__add_dummy() lost its 'perf_' prefix as it operates on a
'struct evlist', not on a 'struct perf_evlist', i.e. it's tools/perf/
specific, it is not in libperf.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When recording PEBS-via-PT, the kernel will not accept the intel_pt
event with register sampling e.g.
# perf record --kcore -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
Error:
intel_pt/branch=0/: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
Fix by suppressing register sampling on the intel_pt evsel.
Committer notes:
Adrian informed that this is only available from Tremont onwards, so on
older processors the error remains the same as before.
Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adjust the handling of the session sink selection to allow no sink to be
selected on the command line. This then forwards the sink selection to
the CoreSight infrastructure which will attempt to select a sink based
on the default sink select priorities.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the copies of files affected by:
c8ffd8bcdd ("vfs: add faccessat2 syscall")
To address this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fcntl.h' differs from latest version at 'include/uapi/linux/fcntl.h'
diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Which results in 'perf trace' gaining support for the 'faccessat2'
syscall, now one can use:
# perf trace -e faccessat2
And have system wide tracing of this syscall. This will also include
it:
# perf trace -e faccess*
Together with the other variants.
How it affects building/usage (on an x86_64 system):
$ cp /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp/syscalls_64.c.before
$
[root@five ~]# perf trace -e faccessat2
event syntax error: 'faccessat2'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
[root@five ~]#
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
$ git diff
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index 37b844f839bc..78847b32e137 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -359,6 +359,7 @@
435 common clone3 sys_clone3
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
+439 common faccessat2 sys_faccessat2
#
# x32-specific system call numbers start at 512 to avoid cache impact
$
$ make -C tools/perf O=/tmp/build/perf/ install-bin
<SNIP>
CC /tmp/build/perf/util/syscalltbl.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
[root@five ~]# perf trace -e faccessat2
^C[root@five ~]#
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is currently working due to extra include paths in the build.
Before:
$ cd tools/perf/arch/arm64/util
$ ls -la ../../util/unwind-libdw.h
ls: cannot access '../../util/unwind-libdw.h': No such file or directory
After:
$ ls -la ../../../util/unwind-libdw.h
-rw-r----- 1 irogers irogers 553 Apr 17 14:31 ../../../util/unwind-libdw.h
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200529225232.207532-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Avoid a false positive caused by assembly code in arch/x86.
In tests, zero the perf_event to avoid uninitialized memory uses.
Warnings were caught using clang with -fsanitize=memory.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200530082015.39162-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stop the message displaying when user space is not being traced.
Example:
Prerequisites:
sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf
sudo chmod +r /proc/kcore
Before:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
Warning:
Intel Processor Trace decoding will not be possible except for kernel tracing!
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.838 MB perf.data ]
After:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.068 MB perf.data ]
$ sudo chmod go-r /proc/kcore
$ sudo setcap -r ~/bin/perf
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lore.kernel.org/lkml/20200528120859.21604-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Context switch events are added automatically by Intel PT and Coresight.
Make it possible to suppress them. That is useful for tracing the
scheduler without the disturbance that the switch event processing
creates.
Example:
Prerequisites:
$ which perf
~/bin/perf
$ sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf
$ sudo chmod +r /proc/kcore
Before:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.938 MB perf.data ]
$ perf script -D | grep PERF_RECORD_SWITCH | wc -l
572
After:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
Warning:
Intel Processor Trace decoding will not be possible except for kernel tracing!
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.838 MB perf.data ]
$ perf script -D | grep PERF_RECORD_SWITCH | wc -l
0
$ sudo chmod go-r /proc/kcore
$ sudo setcap -r ~/bin/perf
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lore.kernel.org/lkml/20200528120859.21604-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On a CPU like skylakex an uncore_iio_0 PMU may alias with
uncore_iio_free_running_0. The latter PMU doesn't support fc_mask as a
parameter and so pmu_config_term fails. Typically parse_events_add_pmu
is called in a loop where if one alias succeeds errors are ignored,
however, if multiple errors occur parse_events__handle_error will
currently give a WARN_ONCE.
This change removes the WARN_ONCE in parse_events__handle_error and
makes it a pr_debug. It adds verbose messages to parse_events_add_pmu
warning that non-fatal errors may occur, while giving details on the pmu
and config terms for useful context. pmu_config_term is altered so the
failing term and pmu are present in the case of the 'unknown term' error
which makes spotting the free_running case more straightforward.
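The "one alias succeeding is enough" behaviour can be sketched as
follows. Everything here is illustrative: struct pmu_stub, the term
lists and both function names are assumptions made for the example and
do not mirror parse_events_add_pmu() or pmu_config_term().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical PMU description: a name and a NULL-terminated term list. */
struct pmu_stub {
	const char *name;
	const char *const *terms;
};

/* Return 1 when the PMU knows the given config term, 0 otherwise. */
int pmu_accepts_term(const struct pmu_stub *pmu, const char *term)
{
	for (const char *const *t = pmu->terms; *t; t++) {
		if (strcmp(*t, term) == 0)
			return 1;
	}
	return 0;
}

/*
 * Try the term on every candidate PMU. The return value is the number of
 * PMUs that accepted it; *nr_errors counts the ones that rejected it.
 * A caller would treat the result as a failure only when nothing was
 * added, so rejections from aliased PMUs stay non-fatal.
 */
int add_event_to_matching_pmus(const struct pmu_stub *pmus, size_t nr,
			       const char *term, int *nr_errors)
{
	int added = 0;

	*nr_errors = 0;
	for (size_t i = 0; i < nr; i++) {
		if (pmu_accepts_term(&pmus[i], term))
			added++;
		else
			(*nr_errors)++;
	}
	return added;
}
```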
Before:
$ perf --debug verbose=3 stat -M llc_misses.pcie_read sleep 1
Using CPUID GenuineIntel-6-55-4
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
adding {unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W,{unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
WARNING: multiple event parsing errors
...
Invalid event/parameter 'fc_mask'
...
After:
$ perf --debug verbose=3 stat -M llc_misses.pcie_read sleep 1
Using CPUID GenuineIntel-6-55-4
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
adding {unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W,{unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
Attempting to add event pmu 'uncore_iio_free_running_5' with 'unc_iio_data_req_of_cpu.mem_read.part0,' that may result in non-fatal errors
After aliases, add event pmu 'uncore_iio_free_running_5' with 'fc_mask,ch_mask,umask,event,' that may result in non-fatal errors
Attempting to add event pmu 'uncore_iio_free_running_3' with 'unc_iio_data_req_of_cpu.mem_read.part0,' that may result in non-fatal errors
After aliases, add event pmu 'uncore_iio_free_running_3' with 'fc_mask,ch_mask,umask,event,' that may result in non-fatal errors
Attempting to add event pmu 'uncore_iio_free_running_1' with 'unc_iio_data_req_of_cpu.mem_read.part0,' that may result in non-fatal errors
After aliases, add event pmu 'uncore_iio_free_running_1' with 'fc_mask,ch_mask,umask,event,' that may result in non-fatal errors
Multiple errors dropping message: unknown term 'fc_mask' for pmu 'uncore_iio_free_running_3' (valid terms: event,umask,config,config1,config2,name,period,percore)
...
So before, you see a 'WARNING: multiple event parsing errors' and
'Invalid event/parameter'. After, you see 'Attempting... that may result
in non-fatal errors' then 'Multiple errors...' with details that
'fc_mask' wasn't known to a free running counter. While not completely
clean, this makes it clearer that an error hasn't really occurred.
v2. addresses review feedback from Jiri Olsa <jolsa@redhat.com>.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200513220635.54700-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 7eec00a747 ("perf symbols: Consolidate symbol fixup issue")
removed powerpc specific sym-handling.c file from Build. This wasn't
caught by build CI because all functions in this file are declared
as __weak in common code. Fix it.
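Why a missing arch file does not break the link can be shown with a
small sketch. It assumes GCC or Clang, where __weak corresponds to
__attribute__((weak)); the function names are hypothetical and are not
the ones in sym-handling.c.

```c
/*
 * Common code: a weak default. A strong definition of the same symbol in
 * another object file (for example an arch specific one) replaces it at
 * link time. If that object file is dropped from the build, the link
 * still succeeds and this default is silently used instead.
 */
__attribute__((weak)) int arch_fixup_name_stub(char *name)
{
	(void)name;
	return 0;	/* 0: nothing was changed */
}

/* Callers cannot tell which definition the linker picked. */
int fixup_symbol_name(char *name)
{
	return arch_fixup_name_stub(name);
}
```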
Fixes: 7eec00a747 ("perf symbols: Consolidate symbol fixup issue")
Reported-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20200509112113.174745-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is a 'struct evsel' method, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As these are 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As those are not 'struct evsel' methods, not part of tools/lib/perf/,
aka libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As they are not 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As they are 'struct evsel' methods or related routines, not part of
tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Trying to disentangle this a bit further; unfortunately it uses
parse_events(), but it's interesting to have it separated anyway, so do it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch enhances the current metric infrastructure to handle "?" in a
metric expression. The "?" can be used for parameters whose value is not
known when the metric events are created and which can be replaced later,
at runtime, with the proper value. It also adds the flexibility to create
multiple events out of a single metric event added in the JSON file.
The patch adds the function 'arch_get_runtimeparam', an arch specific
function that returns the number of metric events that need to be
created. By default it returns 1.
This infrastructure is needed for hv_24x7 socket/chip level events.
"hv_24x7" chip level events need the specific chip-id for which the data
is requested. The function 'arch_get_runtimeparam' is implemented in
header.c and extracts the number of sockets from the sysfs file "sockets"
under "/sys/devices/hv_24x7/interface/".
With this patch we basically create as many metric events as defined by
runtime_param.
For that, a loop is added to the function 'metricgroup__add_metric',
which creates multiple events at run time depending on the return value
of 'arch_get_runtimeparam' and merges those events into 'group_list'.
To achieve that, the parameter value is passed to the
`expr__find_other` function, and each "?" present in the metric
expression is replaced with this value.
The JSON file contains a single metric event, out of which multiple
events are created.
To make clear which data count belongs to which parameter value, the
param value is also printed in the generic_metric function.
For example,
command:# ./perf stat -M PowerBUS_Frequency -C 0 -I 1000
1.000101867 9,356,933 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0
1.000101867 9,366,134 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1
2.000314878 9,365,868 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0
2.000314878 9,366,092 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1
So, here the _0 and _1 suffixes after PowerBUS_Frequency specify the
parameter value.
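The "?" substitution can be sketched as below. This is a minimal
illustration under stated assumptions, not the code in
expr__find_other(): the helper name expand_runtime_param and the
template string used in the usage note are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy 'expr' into 'out', replacing every '?' with the decimal value of
 * 'param'. Returns 0 on success, or -1 when 'out' is too small.
 */
int expand_runtime_param(const char *expr, int param, char *out,
			 size_t out_sz)
{
	size_t used = 0;

	if (out_sz == 0)
		return -1;

	for (const char *p = expr; *p; p++) {
		int n;

		if (*p == '?')
			n = snprintf(out + used, out_sz - used, "%d", param);
		else
			n = snprintf(out + used, out_sz - used, "%c", *p);

		/* snprintf reports the length it needed; reject truncation. */
		if (n < 0 || (size_t)n >= out_sz - used)
			return -1;
		used += (size_t)n;
	}
	out[used] = '\0';
	return 0;
}
```

Calling it once per value in the range returned by the arch hook, for
example with a template such as "hv_24x7/pm_pb_cyc,chip=?/" and the
values 0 and 1, would produce one event string per chip.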
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20200401203340.31402-5-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
d3b1b776ee ("x86/entry/64: Remove ptregs qualifier from syscall table")
cab56d3484 ("x86/entry: Remove ABI prefixes from functions in syscall tables")
27dd84fafc ("x86/entry/64: Use syscall wrappers for x32_rt_sigreturn")
Addressing this tools/perf build warning:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
That didn't result in any tooling changes, as what is extracted are just
the first two columns, and these patches touched only the third.
$ cp /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
$ make -C tools/perf O=/tmp/build/perf install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
DESCEND plugins
CC /tmp/build/perf/util/syscalltbl.o
INSTALL trace_plugins
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
$ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp/syscalls_64.c
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
During execution of the 'perf report' command in my arm64 virtual
machine, this error message is shown:
failed to process sample
__symbol__inc_addr_samples(860): ENOMEM! sym->name=__this_module,
start=0x1477100, addr=0x147dbd8, end=0x80002000, func: 0
The error occurs along this call path:
cmd_report
__cmd_report
perf_session__process_events
__perf_session__process_events
ordered_events__flush
__ordered_events__flush
oe->deliver (ordered_events__deliver_event)
perf_session__deliver_event
machines__deliver_event
perf_evlist__deliver_sample
tool->sample (process_sample_event)
hist_entry_iter__add
iter->add_entry_cb(hist_iter__report_callback)
hist_entry__inc_addr_samples
symbol__inc_addr_samples
__symbol__inc_addr_samples
h = annotated_source__histogram(src, evidx) (NULL)
The annotated_source__histogram failure occurs along this path:
...
hist_entry__inc_addr_samples
symbol__inc_addr_samples
symbol__hists
annotated_source__alloc_histograms
src->histograms = calloc(nr_hists, sizeof_sym_hist) (failed)
calloc() failed because symbol__size(sym) is too large. As shown in the
error message (start=0x1477100, end=0x80002000), the size of the symbol
is about 2G.
This is the same problem as 'perf annotate: Fix s390 gap between kernel
end and module start (b9c0a64901)'. Perf gets symbol information from
/proc/kallsyms in __dso__load_kallsyms. Part of the symbols in
/proc/kallsyms from my virtual machine is as follows:
#cat /proc/kallsyms | sort
...
ffff000001475080 d rpfilter_mt_reg [ip6t_rpfilter]
ffff000001475100 d $d [ip6t_rpfilter]
ffff000001475100 d __this_module [ip6t_rpfilter]
ffff000080080000 t _head
ffff000080080000 T _text
ffff000080080040 t pe_header
...
Take line 'ffff000001475100 d __this_module [ip6t_rpfilter]' as an example.
The start and end of the symbol are both set to ffff000001475100 in
dso__load_all_kallsyms. Then symbols__fixup_end sets the end of the symbol
to the next larger address after ffff000001475100 in /proc/kallsyms,
ffff000080080000 in this example. The size of the symbol is then about 2G,
which causes the problem.
The start of module in my machine is
ffff000000a62000 t $x [dm_mod]
The start of kernel in my machine is
ffff000080080000 t _head
There is a big gap between the end of the modules and the beginning of the
kernel if only a small amount of memory is used by modules. The last symbol
in a module then gets a large address range because it contains the big gap.
Given that the module and kernel text segment sequence may change in
the future, fix this by limiting the range of the last symbol in a module
and in the kernel to 4K on arm64.
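The range limit described above can be sketched roughly as follows. This is
a minimal illustration, not the code from the patch: 'struct ksym',
KSYM_RANGE_LIMIT, ksym_in_module() and ksym_fixup_end() are hypothetical
names, and treating a '[' in the name as the module marker is an assumption
based on the kallsyms listing shown earlier.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a kallsyms symbol record. */
struct ksym {
	uint64_t start;
	uint64_t end;
	const char *name;	/* module symbols carry a "[module]" suffix */
};

/* 4K cap on the last symbol before a module/kernel boundary. */
#define KSYM_RANGE_LIMIT (1ULL << 12)

/* Assumption: a name containing '[' belongs to a module. */
static int ksym_in_module(const struct ksym *s)
{
	return strchr(s->name, '[') != NULL;
}

/* Fix up the end of 'prev' given 'next', the following symbol by address. */
static void ksym_fixup_end(struct ksym *prev, const struct ksym *next)
{
	if (ksym_in_module(prev) != ksym_in_module(next))
		/* Crossing the module/kernel boundary: do not swallow the gap. */
		prev->end = prev->start + KSYM_RANGE_LIMIT;
	else
		prev->end = next->start;
}
```

With the addresses from the listing above, __this_module would end 4K after
its start instead of stretching up to _head.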
Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hewenliang <hewenliang4@huawei.com>
Cc: Hu Shiyuan <hushiyuan@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/33fd24c4-0d5a-9d93-9b62-dffa97c992ca@huawei.com
[ refreshed the patch on current codebase, added string.h include as strchr() is used ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add to the "x86 instruction decoder - new instructions" test the
following instructions:
incsspd
incsspq
rdsspd
rdsspq
saveprevssp
rstorssp
wrssd
wrssq
wrussd
wrussq
setssbsy
clrssbsy
endbr32
endbr64
And the "notrack" prefix for indirect calls and jumps.
For information about the instructions, refer to the Intel Control-flow
Enforcement Technology Specification May 2019 (334525-003).
Committer testing:
$ perf test instr
67: x86 instruction decoder - new instructions : Ok
$
Then use verbose mode and check one of those new instructions:
$ perf test -v instr |& grep saveprevssp
Decoded ok: f3 0f 01 ea saveprevssp
Decoded ok: f3 0f 01 ea saveprevssp
$
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi v. Shankar <ravi.v.shankar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200204171425.28073-3-yu-cheng.yu@intel.com
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After copying an Arm64 perf archive with object files and the perf.data file
to an x86 laptop, kernel symbol resolution in the x86 perf tool fails. It
outputs 'unknown' for all symbols.
The root cause is the function elf__needs_adjust_symbols(): the x86 perf
tool uses the weak version, while Arm64 (and powerpc) override it with
their own version. elf__needs_adjust_symbols() decides whether symbols need
to be parsed with the relative offset address; but the x86 build uses the
weak function, which does not check for the ELF type 'ET_DYN', so it
cannot parse symbols in Arm DSOs due to the wrong result from
elf__needs_adjust_symbols().
DSO parsing should not depend on the architecture perf was built for;
e.g. an x86 perf tool should be able to parse Arm and Arm64 DSOs, and vice
versa. Naveen N. Rao also confirmed that powerpc64 kernels are no longer
built as ET_DYN and have changed to ET_EXEC.
This patch removes the arch specific functions for Arm64 and powerpc and
turns elf__needs_adjust_symbols() into a common function.
The common elf__needs_adjust_symbols() checks an extra condition,
'ET_DYN', for the ELF header type. With this fix, Arm64 DSOs can be
parsed properly with the x86 perf tool.
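A minimal sketch of such an architecture-independent check is shown below.
It is illustrative only: the function name needs_adjust_symbols() is
hypothetical, it takes an Elf64_Ehdr from <elf.h> rather than the type perf
itself uses, and it assumes the pre-existing conditions were ET_EXEC and
ET_REL, with ET_DYN being the extra one mentioned above.

```c
#include <elf.h>
#include <stdbool.h>

/*
 * Sketch: decide from the ELF header type alone, independent of the
 * architecture the tool was built for. ET_DYN is the added condition.
 */
static bool needs_adjust_symbols(const Elf64_Ehdr *ehdr)
{
	return ehdr->e_type == ET_EXEC ||
	       ehdr->e_type == ET_REL ||
	       ehdr->e_type == ET_DYN;
}
```

Because the decision depends only on e_type, the same result is obtained
whether the DSO is inspected natively or on a different host architecture.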
Before:
# perf script
main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c46670 [unknown] ([kernel.kallsyms]) => ffff800010c4eaec [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eaec [unknown] ([kernel.kallsyms]) => ffff800010c4eb00 [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eb08 [unknown] ([kernel.kallsyms]) => ffff800010c4e780 [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4e7a0 [unknown] ([kernel.kallsyms]) => ffff800010c4eeac [unknown] ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eebc [unknown] ([kernel.kallsyms]) => ffff800010c4ed80 [unknown] ([kernel.kallsyms])
After:
# perf script
main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c coresight_timeout+0x54 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c46670 coresight_timeout+0x68 ([kernel.kallsyms]) => ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms]) => ffff800010c4eb00 etm4_enable_hw+0x3e0 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eb08 etm4_enable_hw+0x3e8 ([kernel.kallsyms]) => ffff800010c4e780 etm4_enable_hw+0x60 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4e7a0 etm4_enable_hw+0x80 ([kernel.kallsyms]) => ffff800010c4eeac etm4_enable+0x2d4 ([kernel.kallsyms])
main 3258 1 branches: ffff800010c4eebc etm4_enable+0x2e4 ([kernel.kallsyms]) => ffff800010c4ed80 etm4_enable+0x1a8 ([kernel.kallsyms])
v3: Changed to check for ET_DYN across all architectures.
v2: Fixed Arm64 and powerpc native building.
Reported-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Enrico Weigelt <info@metux.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20200306015759.10084-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is currently working due to extra include paths in the build.
Committer testing:
$ cd tools/include/uapi/asm/
Before this patch:
$ ls -la ../../arch/x86/include/uapi/asm/errno.h
ls: cannot access '../../arch/x86/include/uapi/asm/errno.h': No such file or directory
$
After this patch:
$ ls -la ../../../arch/x86/include/uapi/asm/errno.h
-rw-rw-r--. 1 acme acme 31 Feb 20 12:42 ../../../arch/x86/include/uapi/asm/errno.h
$
Check that that is still under tools/, i.e. hasn't escaped into the main
kernel sources:
$ cd ../../../arch/x86/include/uapi/asm/
$ pwd
/home/acme/git/perf/tools/arch/x86/include/uapi/asm
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexios Zavras <alexios.zavras@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Li <liwei391@huawei.com>
Link: http://lore.kernel.org/lkml/20200306071110.130202-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Copy over powerpc syscall.tbl to grab changes from the below commits
fddb5d430a ("open: introduce openat2(2) syscall")
9a2cef09c8 ("arch: wire up pidfd_getfd syscall")
Now 'perf trace' on powerpc will be able to map from those syscall
strings to the right syscall numbers, i.e.
perf trace -e pidfd*
Will include 'pidfd_getfd' as well as:
perf trace open*
Will cover all 'open' variants.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All ->read_finish() implementations are doing the same thing. Add a
helper function so that they can share the same implementation.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Link: http://lore.kernel.org/lkml/20200217082300.6301-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __cmd_record(), when receiving SIGINT (ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
However, in auxtrace_record.read_finish() the related events will be
enabled again, so if they are continuous, the recording appears to be
endless.
If the event is disabled, don't enable it again here.
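The shared read_finish logic with the "don't re-enable a disabled event"
check can be sketched as below. This is a simplified model under stated
assumptions, not the helper from the patch: 'struct ev', ev_enable() and
read_finish() are hypothetical names, and the event list is modelled as a
plain array.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified event: PMU type plus a 'disabled' flag. */
struct ev {
	unsigned int type;
	bool disabled;
	int enable_calls;	/* counts re-enables, for illustration only */
};

/* Stand-in for re-enabling the event on a given mmap index. */
static int ev_enable(struct ev *e)
{
	e->enable_calls++;
	return 0;
}

/* Find the event that belongs to the AUX PMU and re-enable it if allowed. */
static int read_finish(struct ev *evs, size_t n, unsigned int pmu_type)
{
	for (size_t i = 0; i < n; i++) {
		if (evs[i].type != pmu_type)
			continue;
		if (evs[i].disabled)
			return 0;	/* already disabled: leave it off */
		return ev_enable(&evs[i]);
	}
	return -EINVAL;		/* no event for this PMU */
}
```

Once the event list has been disabled after SIGINT, read_finish() returns
without touching the event, so the recording can terminate.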
Based-on-patch-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __cmd_record(), when receiving SIGINT (ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
However, in auxtrace_record.read_finish() the related events will be
enabled again, so if they are continuous, the recording appears to be
endless.
If the cs_etm event is disabled, we don't enable it again here.
Note: This patch is NOT tested since I don't have a machine with the
coresight feature, but the code seems to have the same bug as arm-spe and
intel-pt.
Tester notes:
Thanks for looping, Adrian. Applied this patch and tested with
CoreSight on juno board, it works well.
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-4-adrian.hunter@intel.com
[ahunter: removed redundant 'else' after 'return']
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __cmd_record(), when receiving SIGINT (ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
However, in auxtrace_record.read_finish() the related events will be
enabled again, so if they are continuous, the recording appears to be
endless.
If the intel_bts event is disabled, we don't enable it again here.
Note: This patch is NOT tested since I don't have a machine with the
intel_bts feature, but the code seems to have the same bug as arm-spe and
intel-pt.
Signed-off-by: Wei Li <liwei391@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-3-adrian.hunter@intel.com
[ahunter: removed redundant 'else' after 'return']
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __cmd_record(), when receiving SIGINT (ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
However, in auxtrace_record.read_finish() the related events will be
enabled again, so if they are continuous, the recording appears to be
endless.
If the intel_pt event is disabled, we don't enable it again here.
Before the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 46803
^C^C^C^C^C^C
After the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 48591
^C[ perf record: Woken up 0 times to write data ]
Warning:
AUX data lost 504 times out of 4816!
[ perf record: Captured and wrote 2024.405 MB perf.data ]
Signed-off-by: Wei Li <liwei391@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-2-adrian.hunter@intel.com
[ ahunter: removed redundant 'else' after 'return' ]
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an arm64 version of get_cpuid(), which is used for various annotation
and headers - for example, I now get the CPUID in "perf report --header",
as shown in this snippet:
# hostname : ubuntu
# os release : 5.5.0-rc1-dirty
# perf version : 5.5.rc1.gbf8a13dc9851
# arch : aarch64
# nrcpus online : 96
# nrcpus avail : 96
# cpuid : 0x00000000480fd010
Since much of the code to read the MIDR is already in get_cpuid_str(),
factor out this code.
Tester notes:
I tested this patch on my new ARM64 Kunpeng 920 server.
[root@node1 zsk]# ./perf --version
perf version 5.6.rc1.g2cdb955b7252
Both perf list and perf stat can work.
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxarm@huawei.com
Link: http://lore.kernel.org/lkml/1576245255-210926-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
fddb5d430a ("open: introduce openat2(2) syscall")
9a2cef09c8 ("arch: wire up pidfd_getfd syscall")
We also need to grab a copy of uapi/linux/openat2.h since it is now
needed by fcntl.h, add it to tools/perf/check_headers.h.
$ diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
--- tools/perf/arch/x86/entry/syscalls/syscall_64.tbl 2019-12-20 16:43:57.662429958 -0300
+++ arch/x86/entry/syscalls/syscall_64.tbl 2020-02-10 16:36:22.070012468 -0300
@@ -357,6 +357,8 @@
433 common fspick __x64_sys_fspick
434 common pidfd_open __x64_sys_pidfd_open
435 common clone3 __x64_sys_clone3/ptregs
+437 common openat2 __x64_sys_openat2
+438 common pidfd_getfd __x64_sys_pidfd_getfd
#
# x32-specific system call numbers start at 512 to avoid cache impact
$
Update tools/'s copy of that file:
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
See the result:
$ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
--- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before 2020-02-10 16:42:59.010636041 -0300
+++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2020-02-10 16:43:24.149958337 -0300
@@ -346,5 +346,7 @@
[433] = "fspick",
[434] = "pidfd_open",
[435] = "clone3",
+ [437] = "openat2",
+ [438] = "pidfd_getfd",
};
-#define SYSCALLTBL_x86_64_MAX_ID 435
+#define SYSCALLTBL_x86_64_MAX_ID 438
$
Now one can use:
perf trace -e openat2,pidfd_getfd
To get just those syscalls or use in things like:
perf trace -e open*
To get all the open variants (open, openat, openat2, etc.) or:
perf trace pidfd*
To get the pidfd syscalls.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The struct perf_evsel_config_term::val is a union which contains the fields
'callgraph', 'drv_cfg' and 'branch' as string pointers. This leads to
complex code that handles every type's string separately, and
makes it hard to release the strings in a general way.
This patch refactors the structure to add a common field 'str' in the
'val' union as the string pointer and removes the other three fields
'callgraph', 'drv_cfg' and 'branch'. Since no field name has to be passed,
the patch simplifies string handling with the macro ADD_CONFIG_TERM_STR()
for string pointer assignment.
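The idea of a single 'str' member can be sketched as follows. This is a
simplified illustration, not the perf structure: 'struct config_term', the
'free_str' flag, term_set_str() and term_release() are hypothetical names
chosen for the example; only the 'val' union and its 'str' field come from
the description above.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum term_type { TERM_PERIOD, TERM_CALLGRAPH, TERM_DRV_CFG, TERM_BRANCH };

/* Hypothetical, reduced config term: one 'str' for all string values. */
struct config_term {
	enum term_type type;
	union {
		unsigned long long period;
		char *str;	/* shared by callgraph, drv_cfg and branch */
	} val;
	bool free_str;		/* hypothetical: set when 'str' is owned */
};

/* Assign a copy of 's' regardless of which string-valued type it is. */
static int term_set_str(struct config_term *t, enum term_type type,
			const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, s, len);
	t->type = type;
	t->val.str = copy;
	t->free_str = true;
	return 0;
}

/* One release path for every string-valued term. */
static void term_release(struct config_term *t)
{
	if (t->free_str) {
		free(t->val.str);
		t->val.str = NULL;
		t->free_str = false;
	}
}
```

With one shared member, neither assignment nor release has to switch on the
term type.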
This patch also fixes multiple checkpatch warnings about lines over 80
characters.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200117055251.24058-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And update linux/linkage.h, which requires in turn that we make these
files switch from ENTRY()/ENDPROC() to SYM_FUNC_START()/SYM_FUNC_END():
tools/perf/arch/arm64/tests/regs_load.S
tools/perf/arch/arm/tests/regs_load.S
tools/perf/arch/powerpc/tests/regs_load.S
tools/perf/arch/x86/tests/regs_load.S
We also need to switch SYM_FUNC_START_LOCAL() to SYM_FUNC_START() for
the functions used directly by 'perf bench', and update
tools/perf/check_headers.sh to ignore those changes when checking if the
kernel original files drifted from the copies we carry.
This is to get the changes from:
6dcc5627f6 ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*")
ef1e03152c ("x86/asm: Make some functions local")
e9b9d020c4 ("x86/asm: Annotate aliases")
And address these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tay3l8x8k11p7y3qcpqh9qh5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One more step in the merge of 'struct maps' with 'struct map_groups'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9ibtn3vua76f934t7woyf26w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One more step on the merge of 'struct maps' with 'struct map_groups'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-61rra2wg392rhvdgw421wzpt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One more step on the merge of 'struct maps' with 'struct map_groups'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-69vcr8pubpym90skxhmbwhiw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And pick the shortest name: 'struct maps'.
The split existed because we used to have two groups of maps, one for
functions and one for variables, but that only complicated things,
sometimes we needed to figure out what was at some address and then had
to first try it on the functions group and if that failed, fall back to
the variables one.
That split is long gone, so for quite a while we had only one struct
maps per struct map_groups, simplify things by combining those structs.
First patch is the minimum needed to merge both, follow up patches will
rename 'thread->mg' to 'thread->maps', etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add to the "x86 instruction decoder - new instructions" test the following
instructions:
v4fmaddps
v4fmaddss
v4fnmaddps
v4fnmaddss
vaesdec
vaesdeclast
vaesenc
vaesenclast
vcvtne2ps2bf16
vcvtneps2bf16
vdpbf16ps
gf2p8affineinvqb
vgf2p8affineinvqb
gf2p8affineqb
vgf2p8affineqb
gf2p8mulb
vgf2p8mulb
vp2intersectd
vp2intersectq
vp4dpwssd
vp4dpwssds
vpclmulqdq
vpcompressb
vpcompressw
vpdpbusd
vpdpbusds
vpdpwssd
vpdpwssds
vpexpandb
vpexpandw
vpopcntb
vpopcntd
vpopcntq
vpopcntw
vpshldd
vpshldq
vpshldvd
vpshldvq
vpshldvw
vpshldw
vpshrdd
vpshrdq
vpshrdvd
vpshrdvq
vpshrdvw
vpshrdw
vpshufbitqmb
For information about the instructions, refer to the Intel SDM May 2019
(325462-070US) and Intel Architecture Instruction Set Extensions May
2019 (319433-037).
Committer testing:
$ perf test x86
61: x86 rdpmc : Ok
64: x86 instruction decoder - new instructions : Ok
66: x86 bp modify : Ok
$
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20191125125044.31879-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an error message because Intel BTS does not support AUX area
sampling.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Set up the default number of mmap pages, default sample size and default
psb_period for AUX area sampling. Add documentation also.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Record the first event parsing error and report it. This implements
feedback from Jiri Olsa:
https://lkml.org/lkml/2019/10/28/680
An example error is:
$ tools/perf/perf stat -e c/c/
WARNING: multiple event parsing errors
event syntax error: 'c/c/'
\___ unknown term
valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore
Initial error:
event syntax error: 'c/c/'
\___ Cannot find PMU `c'. Missing kernel support?
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
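Keeping the first error alongside the most recent one can be sketched as
below. This is a minimal model, not the perf implementation: 'struct
parse_error', parse_error_record() and parse_error_free() are hypothetical
names, and only an index and a message are stored per error.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical error record: the first error is never overwritten. */
struct parse_error {
	int num_errors;
	int first_idx;
	char *first_str;	/* initial error, kept for the report */
	int last_idx;
	char *last_str;		/* most recent error after the first */
};

static char *dup_str(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Record an error; later errors replace only the "last" slot. */
static void parse_error_record(struct parse_error *err, int idx,
			       const char *msg)
{
	char *copy = dup_str(msg);

	if (!copy)
		return;
	if (err->num_errors == 0) {
		err->first_idx = idx;
		err->first_str = copy;
	} else {
		free(err->last_str);
		err->last_idx = idx;
		err->last_str = copy;
	}
	err->num_errors++;
}

static void parse_error_free(struct parse_error *err)
{
	free(err->first_str);
	free(err->last_str);
	memset(err, 0, sizeof(*err));
}
```

A reporter built on this could print the latest error, warn when num_errors
is greater than one, and then print the initial error, similar to the
output shown above.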
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20191116074652.9960-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add to the "x86 instruction decoder - new instructions" test the following
instructions:
cldemote
tpause
umonitor
umwait
movdiri
movdir64b
enqcmd
enqcmds
encls
enclu
enclv
pconfig
wbnoinvd
For information about the instructions, refer to the Intel SDM May 2019
(325462-070US) and Intel Architecture Instruction Set Extensions
May 2019 (319433-037).
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20191115135447.6519-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These were the last uses of map->groups, next cset will nuke it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-n3g0foos7l7uxq9nar0zo0vj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we pass that substructure around and with it consolidate lots of
functions that receive a (map, symbol) pair and now can receive just a
'struct map_symbol' pointer.
This further paves the way to add 'struct map_groups' to 'struct
map_symbol' so that we have all we need for annotation and
can ditch 'struct map'->groups, i.e. have the map_groups pointer in a
more central place, avoiding the pointer in 'struct map', which has
tons of instances.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fs90ttd9q12l7989fo7pw81q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were just passing a map to look for and reuse its map->groups member,
but the idea is that this is going away, as a map can be in multiple
rb_trees when being reused via a map_node, so do as all the other
map_groups methods and pass as its first arg the object being operated
on.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-nmi2pbggqloogwl6vxrvex5a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently when cross compiling perf tool for ARM64 on my x86 machine I
get this error:
arch/arm64/util/sym-handling.c:9:10: fatal error: gelf.h: No such file or directory
#include <gelf.h>
For the build, libelf is reported off:
Auto-detecting system features:
...
... libelf: [ OFF ]
Indeed, test-libelf is not built successfully:
more ./build/feature/test-libelf.make.output
test-libelf.c:2:10: fatal error: libelf.h: No such file or directory
#include <libelf.h>
^~~~~~~~~~
compilation terminated.
I have no such problems natively compiling on ARM64, and I did not
previously have this issue for cross compiling. Fix by relocating the
gelf.h include.
Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/1573045254-39833-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce boilerplate, provide a more compact form using an idiom
present in other trees of data structures.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-59gmq4kg1r68ou1wknyjl78x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__read_event() from tools/perf to libperf and export it in
the perf/mmap.h header.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__read_init() from tools/perf to libperf and export it in
the perf/mmap.h header.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__read_init() from tools/perf to libperf and export it in
perf/mmap.h header.
And add pr_debug2()/pr_debug3() macros support, because the code is
using them.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__consume() from tools/perf to libperf and export it in
the perf/mmap.h header.
Move also the needed helpers perf_mmap__write_tail(),
perf_mmap__read_head() and perf_mmap__empty().
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Being const + weak breaks with some compilers that constant-propagate
from the weak symbol. This behavior is outside of the specification, but
LLVM chose it to match GCC's behavior.
LLVM's implementation was set in this patch:
f49573d1ee
A const + weak symbol is set to be weak_odr:
https://llvm.org/docs/LangRef.html
ODR is the one definition rule, and given there is one constant definition,
constant-propagation is possible. It is possible to get this code to
miscompile with LLVM when applying link time optimization. As compilers
become more aggressive, this is likely to break in more instances.
Move the definition of sample_reg_masks to the conditional part of
perf_regs.h and guard usage with HAVE_PERF_REGS_SUPPORT. This avoids the
weak symbol.
Fix an issue when HAVE_PERF_REGS_SUPPORT isn't defined from patch v1.
In v3, add perf_regs.c for architectures that HAVE_PERF_REGS_SUPPORT but
don't declare sample_reg_masks.
Further notes:
Jiri asked:
"Is this just a precaution or you actually saw some breakage?"
Ian answered:
"We saw a breakage with clang with thinlto enabled for linking. Our
compiler team had recently seen, and were surprised by, a similar issue
and were able to dig out the weak ODR issue."
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Guo Ren <guoren@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linux-riscv@lists.infradead.org
Cc: Mao Han <han_mao@c-sky.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20191001003623.255186-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
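A minimal sketch of the guard pattern described in the sample_reg_masks
commit above: the table is only declared when HAVE_PERF_REGS_SUPPORT is
defined, so no weak const symbol is needed. The struct is simplified,
and sketch_count_sample_regs() and the NULL-name terminator are
assumptions made for illustration, not perf's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a register-name table entry. */
struct sample_reg {
	const char *name;
	uint64_t mask;
};

#ifdef HAVE_PERF_REGS_SUPPORT
/*
 * Only declared when the architecture supports it; exactly one ordinary
 * (non-weak) definition is then expected in an arch-specific file.
 */
extern const struct sample_reg sample_reg_masks[];
#endif

/* Count table entries; the table is assumed to end with a NULL name. */
size_t sketch_count_sample_regs(void)
{
	size_t n = 0;
#ifdef HAVE_PERF_REGS_SUPPORT
	const struct sample_reg *r;

	for (r = sample_reg_masks; r->name; r++)
		n++;
#endif
	return n;
}
```

Because every use sits behind the same macro, the compiler never sees a
const definition that could later be replaced at link time.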
They are called from symbol__annotate() and, to propagate errors that
can help understand the problem, make them return what
symbol__strerror_disassemble() knows, i.e. errno codes and other
annotation-specific errors in a special range outside the errno values.
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-pqx7srcv7tixgid251aeboj6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
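A minimal sketch of the error convention described in the
symbol__annotate() commit above, assuming annotation-specific codes live
in a negative range far from any errno value while ordinary errno codes
are passed through as non-negative numbers. The enum, the -10000 base
and the message text are hypothetical and chosen only for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical annotation-specific codes, kept far away from errno values. */
enum sketch_annotate_errno {
	SKETCH_ANNOTATE__START = -10000,
	SKETCH_ANNOTATE__NO_VMLINUX = SKETCH_ANNOTATE__START,
	SKETCH_ANNOTATE__END,
};

/* Format either kind of error into buf; returns 0, or -1 if unknown. */
int sketch_strerror_disassemble(int err, char *buf, size_t len)
{
	if (err >= 0) {
		/* Non-negative value: an ordinary errno code. */
		snprintf(buf, len, "%s", strerror(err));
		return 0;
	}

	switch (err) {
	case SKETCH_ANNOTATE__NO_VMLINUX:
		snprintf(buf, len, "no vmlinux file was found");
		return 0;
	default:
		snprintf(buf, len, "unknown error %d", err);
		return -1;
	}
}
```

With a single integer carrying both kinds of error, a caller can hand
whatever it received straight to the formatting routine.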
For consistency, propagate the exact cause for get_cpuid() to have
failed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9ig269f7ktnhh99g4l15vpu2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Naresh Kamboju reported, that on the i386 build pr_err()
doesn't get defined properly due to header ordering:
perf-in.o: In function `libunwind__x86_reg_id':
tools/perf/util/libunwind/../../arch/x86/util/unwind-libunwind.c:109:
undefined reference to `pr_err'
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Further reducing the size of util/evsel.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-20zr7di9eynm0272mtjfdhfc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need the 'page_size' variable in libperf, so move it there.
Add a libperf_init() as a global libperf init function to obtain this
value via sysconf() at tool start.
Committer notes:
Add internal/lib.h to tools/perf/ files using 'page_size', sometimes
replacing util.h with it if that was the only reason for having util.h
included.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-33-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
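A minimal sketch of the idea in the 'page_size' commit above: query the
page size once through sysconf() at start-up and keep it in a global.
The names sketch_page_size and sketch_lib_init() are hypothetical
stand-ins, not the actual libperf symbols or signatures.

```c
#include <unistd.h>

/* Hypothetical global, mirroring the 'page_size' variable named above. */
unsigned int sketch_page_size;

/* Query the page size once, so later users only read the global. */
void sketch_lib_init(void)
{
	long sz = sysconf(_SC_PAGE_SIZE);

	/* sysconf() returns -1 on error; leave the global at 0 then. */
	if (sz > 0)
		sketch_page_size = (unsigned int)sz;
}
```

The init function has to run before any code that reads the global,
which is why the commit calls it at tool start.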
Add perf_evlist__first()/last() functions to libperf, as internal
functions and rename perf's origins to evlist__first/last.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-29-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist', it will
be used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the 'system_wide' member from perf's evsel to libperf's perf_evsel.
Committer notes:
Added stdbool.h as we now use bool here.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the perf_mmap struct to libperf.
The definition is added into:
include/internal/mmap.h
which is not to be included by users, but shared within perf and
libperf.
Committer notes:
Remove unnecessary includes from tools/perf/lib/include/internal/mmap.h,
those will be readded as they become necessary, later in the series.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As this isn't used at all in mmap.h but in evlist.h, so to cut down the
header dependency tree, move it to where it is used.
Also add mmap.h to the places using it but previously getting it
indirectly via evlist.h.
Add missing pthread.h to evlist.h, as it has a pthread_t struct member
and was getting the header via mmap.h.
Noticed while processing one of Jiri's libperf batches touching mmap.h, where
almost everything gets rebuilt because evlist.h is so popular, so cut
down on this rebuild-the-world party.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/n/tip-he0uljeftl0xfveh3d6vtode@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__mmap() to evlist__mmap(), so we don't have a name
clash when we add perf_evlist__mmap() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename 'struct perf_evlist' to 'struct evlist', so we don't have a name
clash when we add 'struct perf_mmap' to libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo-5.4-20190920-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf stat:
Srikar Dronamraju:
- Fix a segmentation fault when using repeat forever.
- Reset previous counts on repeat with interval.
aarch64:
James Clark:
- Add PMU event JSON files for Cortex-A76 and Neoverse N1.
PowerPC:
Anju T Sudhakar:
- Make 'trace_cycles' the default event for 'perf kvm record' in PowerPC.
S/390:
- Link libjvmti to tools/lib/string.o to have a weak strlcpy()
implementation, providing previously unresolved symbol on s/390.
perf test:
Jiri Olsa:
- Add libperf automated tests to 'make -C tools/perf build-test'.
Colin Ian King:
- Fix spelling mistake.
Tree wide:
Arnaldo Carvalho de Melo:
- Some more header file sanitization.
libperf:
Jiri Olsa:
- Add dependency on libperf for python.so binding.
libtraceevent:
Sakari Ailus:
- Convert remaining %p[fF] users to %p[sS].
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use 'trace_imc/trace_cycles' as the default event for 'perf kvm record'
in powerpc.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20190718181749.30612-3-anju@linux.vnet.ibm.com
[ Add missing pmu.h header, needed because this patch uses pmu_have_event() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf kvm record' uses 'cycles' (if the user did not specify any event)
as the default event to profile the guest.
This will not provide any proper samples from the guest in case of
powerpc architecture, since in powerpc the PMUs are controlled by the
guest rather than the host.
Patch adds a function to pick an arch specific event for 'perf kvm
record', instead of selecting 'cycles' as a default event for all
architectures.
For powerpc this function checks for any user specified event, and if
there isn't any it returns invalid instead of proceeding with 'cycles'
event.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20190718181749.30612-2-anju@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
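A minimal sketch of the powerpc policy described in the 'perf kvm
record' commit above: accept the command line only if the user named an
event, otherwise report an invalid argument instead of falling back to
'cycles'. Both function names are hypothetical, the option matching is
deliberately crude, and how perf selects the per-architecture variant is
not shown.

```c
#include <errno.h>
#include <string.h>

/* Return 1 if any argument begins with "-e" or "--event", else 0. */
static int sketch_has_event_option(int argc, const char **argv)
{
	int i;

	for (i = 0; i < argc; i++) {
		if (!strncmp(argv[i], "-e", 2) ||
		    !strncmp(argv[i], "--event", 7))
			return 1;
	}
	return 0;
}

/* powerpc-style policy: never silently fall back to 'cycles'. */
int sketch_powerpc_default_event(int argc, const char **argv)
{
	return sketch_has_event_option(argc, argv) ? 0 : -EINVAL;
}
```

Other architectures would keep a variant that leaves the arguments alone
and lets the caller add 'cycles' as before.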
Those are the only routines using the perf_event__handler_t typedef and
are all related, so move to a separate header to reduce the header
dependency tree, lots of places were getting event.h and even stdio.h,
limits.h indirectly, so fix those as well.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All we need is a bunch of struct forward declarations and then add
event.h to the only place that was getting it indirectly via
callchain.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-qq2xhyuxcvx5vmxha9otjd8d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
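A minimal sketch of why forward declarations, as in the callchain.h
commit above, are enough for a header that only passes pointers around.
The struct and function names are hypothetical; in a real tree the two
halves would live in a header and in a .c file respectively.

```c
#include <stddef.h>

/*
 * Header side: a forward declaration is sufficient to declare functions
 * taking pointers, so the defining header need not be included here.
 */
struct sketch_event;

size_t sketch_event__size(const struct sketch_event *ev);

/*
 * Source side: only the file that looks inside the struct needs the full
 * definition, normally obtained by including the defining header directly.
 */
struct sketch_event {
	size_t size;
};

size_t sketch_event__size(const struct sketch_event *ev)
{
	return ev ? ev->size : 0;
}
```

Files that used to get the full definition indirectly then have to
include the defining header themselves, which is the fallout these
commits fix up.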
Only a 'struct perf_cpu_map' forward declaration is necessary, fix the
places that need the header but were getting it indirectly, by luck,
from env.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-3sj3n534zghxhk7ygzeaqlx9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check that it is not needed and remove, fixing up some fallout for
places where it was only serving to get something else.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9h6dg6lsqe2usyqjh5rrues4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pruning a bit more the includes dependency tree. Building this thing on
lots of containers takes time, we better reduce the time per build, each
container is doing 6 builds when clang and clang-devel are available,
and the plan is to do a 'make -C tools/perf build-test' that has many
more.
Also helps when doing normal development, as touching some random file
will have a much reduced chance of triggering lots of rebuilds.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-r889ur2cxe16m91m2a4pl15p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Improved kprobes robustness
- Intel PEBS support for PT hardware tracing
- Other Intel PT improvements: high order pages memory footprint
reduction and various related cleanups
- Misc cleanups
The perf tooling side has been very busy in this cycle, with over 300
commits. This is an incomplete high-level summary of the many
improvements done by over 30 developers:
- Lots of updates to the following tools:
'perf c2c'
'perf config'
'perf record'
'perf report'
'perf script'
'perf test'
'perf top'
'perf trace'
- Updates to libperf and libtraceevent, and a consolidation of the
proliferation of x86 instruction decoder libraries.
- Vendor event updates for Intel and PowerPC CPUs,
- Updates to hardware tracing tooling for ARM and Intel CPUs,
- ... and lots of other changes and cleanups - see the shortlog and
Git log for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (322 commits)
kprobes: Prohibit probing on BUG() and WARN() address
perf/x86: Make more stuff static
x86, perf: Fix the dependency of the x86 insn decoder selftest
objtool: Ignore intentional differences for the x86 insn decoder
objtool: Update sync-check.sh from perf's check-headers.sh
perf build: Ignore intentional differences for the x86 insn decoder
perf intel-pt: Use shared x86 insn decoder
perf intel-pt: Remove inat.c from build dependency list
perf: Update .gitignore file
objtool: Move x86 insn decoder to a common location
perf metricgroup: Support multiple events for metricgroup
perf metricgroup: Scale the metric result
perf pmu: Change convert_scale from static to global
perf symbols: Move mem_info and branch_info out of symbol.h
perf auxtrace: Uninline functions that touch perf_session
perf tools: Remove needless evlist.h include directives
perf tools: Remove needless evlist.h include directives
perf tools: Remove needless thread_map.h include directives
perf tools: Remove needless thread.h include directives
perf tools: Remove needless map.h include directives
...
This patch adds support for DWARF register mappings and libdw registers
initialization, which is used by perf callchain analyzing when
--call-graph=dwarf is given.
Signed-off-by: Mao Han <han_mao@c-sky.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: linux-riscv <linux-riscv@lists.infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Guo Ren <guoren@kernel.org>
Tested-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Now that there's a common version of the decoder for all tools, use it
instead of the local copy.
Also use perf's check-headers.sh script to diff the decoder files to
make sure they remain in sync with the kernel version. Objtool has a
similar check.
Committer notes:
Had to keep this all pointing explicitly to x86 headers/files, i.e.
instead of asm/insn.h we had to use ../include/asm/insn.h when the files
were in different dirs, or just replace "<asm/foo.h>" with "foo.h".
This way we continue to be able to process perf.data files with Intel PT
traces in distros other than x86.
Also fixed up the awk script paths to use $(srcdir)/tools/arch instead
of relative directories so that we keep detached tarballs (make help |
grep perf) working.
For now the include lines in these headers are being ignored so as not
to flag false reports of kernel/tools out of sync.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/8a37e615d2880f039505d693d1e068a009358a2b.1567118001.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The mem_info struct goes to mem-events.h and branch_info goes to
branch.h, where they belong, this way we can remove several headers from
symbols.h and trim the include dependency tree more.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-aupw71xnravcsu2xoabfmhpc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we don't carry the session.h include directive in auxtrace.h,
which in turn opens a can of worms of files that were getting all sorts
of things via that include, fix them all.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-d2d83aovpgri2z75wlitquni@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that thread_map.h isn't included by any other header, we can check where
it is really needed, i.e. we can remove it and be sure that it isn't
being obtained indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fyzvg64cz1ikvyxp8d6nrhz1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that map.h isn't included by any other header, we can check where
it is really needed, i.e. we can remove it and be sure that it isn't
being obtained indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-iu8ylqky7g1i9i54v3y7qovw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we can remove dso.h from symbol.h and reduce the header
dependency tree.
Fixup cases where struct dso guts are needed but were obtained via
symbol.h, indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-ip683cegt306ncu3gsz7ii21@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
bpf.h and build-id.h are not needed at all in event.h, remove them.
And fixup the fallout of files that were getting needed stuff from this
now pruned include.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-rdm3dgtlrndmmnlc4bafsg3b@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And fixup the fallout of c files not building due to now missing
headers.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-sw8k3kpla98pr3rqypbjk9hf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All we need there is a forward declaration for 'union perf_event', so
remove it from there and add missing header directives in places using
things from this indirect include.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And fix the fallout, adding it to places that must have it since they
use its definitions.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1s3jel4i26chq2g0lydoz7i3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the movement of lots of stuff out of perf.h to other headers we
ended up not needing it in lots of places, remove it from those places.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c718m0sxxwp73lp9d8vpihb4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And remove unneeded include directives from perf-sys.h to prune the
header dependency tree.
Fixup the fallout in places where definitions were being used without
the needed include directives that were being satisfied because they
were in perf-sys.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7b1zvugiwak4ibfa3j6ott7f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Even more, to have a "perf_record_" prefix, so that they match the
PERF_RECORD_ enum they map to.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190828135717.7245-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the PERF_RECORD_AUXTRACE_INFO event definition to libperf's
event.h.
In order to keep libperf simple, we switch 'u64/u32/u16/u8' types used
in events to their generic '__u*' versions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190828135717.7245-9-jolsa@kernel.org
[ Fix cs_etm__print_auxtrace_info() arg to be __u64 too to fix the CORESIGHT=1 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
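A minimal sketch of the type switch described in the
PERF_RECORD_AUXTRACE_INFO commit above, assuming the exported '__u*'
types from <linux/types.h> are available. The record layout and the
names are hypothetical and much simpler than the real event definition.

```c
#include <linux/types.h>

/*
 * Hypothetical event layout using the exported '__u*' types, so that no
 * tool-private 'u64'/'u32' typedefs are needed by library users.
 */
struct sketch_record_info {
	__u32 type;
	__u32 reserved;
	__u64 priv[2];
};

/* Total size in bytes of the fixed-size sketch record. */
unsigned long sketch_record_info__size(void)
{
	return sizeof(struct sketch_record_info);
}
```

Code that passes such fields to helpers has to use the same '__u*' types
in its prototypes, which is the kind of fix-up noted in the bracketed
committer remark.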
There is no need for that util/util.h include there, so remove it,
pruning the include tree, and fix the fallout by adding necessary headers to
places that were getting needed includes indirectly from evlist.h ->
util.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-s9f7uve8wvykr5itcm7m7d8q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel is using CAP_SYS_ADMIN instead of euid==0 to override
perf_event_paranoid check. Make perf do the same.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> # coresight part
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1566869956-7154-3-git-send-email-ilubashe@akamai.com
Signed-off-by: Igor Lubashev <ilubashe@akamai.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
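A minimal sketch of checking for CAP_SYS_ADMIN rather than euid == 0, as
the commit above asks perf to do. This is not how the patch itself
queries capabilities: as a dependency-free illustration it reads the
"CapEff:" line of /proc/self/status, and all 'sketch_' names are
hypothetical.

```c
#include <stdio.h>

/* Numeric value of CAP_SYS_ADMIN in <linux/capability.h>. */
#define SKETCH_CAP_SYS_ADMIN 21

/* Return 1 if capability number 'cap' is set in the bitmask, else 0. */
int sketch_cap_mask_has(unsigned long long mask, int cap)
{
	return (int)((mask >> cap) & 1ULL);
}

/*
 * Read the effective capability set of the current process.
 * Returns 1 or 0, or -1 if the "CapEff:" line could not be read.
 */
int sketch_have_cap_sys_admin(void)
{
	char line[256];
	unsigned long long mask;
	int ret = -1;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "CapEff: %llx", &mask) == 1) {
			ret = sketch_cap_mask_has(mask, SKETCH_CAP_SYS_ADMIN);
			break;
		}
	}
	fclose(f);
	return ret;
}
```

A process can hold CAP_SYS_ADMIN without being root, and a root process
can have dropped it, which is why the two checks can disagree.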
Copy over powerpc syscall.tbl to grab changes from the below commits:
commit cee3536d24 ("powerpc: Wire up clone3 syscall")
commit 1a271a68e0 ("arch: mark syscall number 435 reserved for clone3")
commit 7615d9e178 ("arch: wire-up pidfd_open()")
commit d8076bdb56 ("uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]")
commit 39036cd272 ("arch: add pidfd and io_uring syscalls everywhere")
commit 48166e6ea4 ("y2038: add 64-bit time_t syscalls to all 32-bit architectures")
commit d33c577ccc ("y2038: rename old time and utime syscalls")
commit 00bf25d693 ("y2038: use time32 syscall names on 32-bit")
commit 8dabe7245b ("y2038: syscalls: rename y2038 compat syscalls")
commit 0d6040d468 ("arch: add split IPC system calls where needed")
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20190827071458.19897-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And into a separate util/record.h, to better isolate things and make
sure that those who use record_opts and the other moved declarations
are explicitly including the necessary header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-31q8mei1qkh74qvkl9nwidfq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The util/cpumap.h file doesn't use anything in refcount.h nor in
debug.h, it needs just a forward reference to 'struct cpu_map_data',
that is defined in util/event.h, which cpumap.h was getting indirectly
via, of all things, debug.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-mtjww98yptt4ppo6g2blavg5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It uses strcmp(), strstr() and was getting the required string.h header
by luck, from evsel.h -> cpumap.h -> debug.h -> string.h, add the
missing header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-qrz8hhvrhwnmt5ocfwk4br5d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It uses strstr(), needs to include string.h or it's not going to build
when we remove string.h from the place it is getting from indirectly, by
luck.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-72y0i0uiaqght5b83e3ae7p4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This file uses pr_debug() but isn't including debug.h, getting it by
luck, fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-t7pisnsdfh88kclpw52jcwl7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So it's part of the libperf library as one of basic functions operating
on the perf_cpu_map class.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch the rest of the perf code to use libperf's perf_cpu_map__nr(),
which is the same as current cpu_map__nr() and remove the cpu_map__nr()
function.
Link: http://lkml.kernel.org/n/tip-6e0guy75clis7nm0xpuz9fga@git.kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Process synth_opts.other_events and attr.aux_output to set up for
synthesizing PEBS via Intel PT events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190806084606.4021-6-alexander.shishkin@linux.intel.com
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
[ Fixed up libperf clashes, i.e. some places using perf_evsel (now in libperf)
need to use instead 'evsel' (a tools/perf only abstraction) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These paths point to the wrong location but still work because they get
picked up by a -I flag that happens to direct to the correct file. Fix
paths to lead to the actual file location without help from include
flags.
Signed-off-by: Luke Mujica <lukemujica@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190719202253.220261-1-lukemujica@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get closer to upstream and check if we need to sync more UAPI
headers, pick up fixes for libbpf that prevent perf's container tests
from completing successfully, etc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
During execution of command 'perf top' the error message:
Not enough memory for annotating '__irf_end' symbol!)
is emitted from this call sequence:
__cmd_top
perf_top__mmap_read
perf_top__mmap_read_idx
perf_event__process_sample
hist_entry_iter__add
hist_iter__top_callback
perf_top__record_precise_ip
hist_entry__inc_addr_samples
symbol__inc_addr_samples
symbol__get_annotation
symbol__alloc_hist
In this function the size of symbol __irf_end is calculated. The size of
a symbol is the difference between its start and end address.
When the symbol was read the first time, its start and end was set to:
symbol__new: __irf_end 0xe954d0-0xe954d0
which is correct and maps with /proc/kallsyms:
root@s8360046:~/linux-4.15.0/tools/perf# fgrep _irf_end /proc/kallsyms
0000000000e954d0 t __irf_end
root@s8360046:~/linux-4.15.0/tools/perf#
In function symbol__alloc_hist() the end of symbol __irf_end is
symbol__alloc_hist sym:__irf_end start:0xe954d0 end:0x3ff80045a8
which is identical with the first module entry in /proc/kallsyms
This results in a symbol size of __irf_end for histogram analyses of
70334140059072 bytes and a malloc() for this requested size fails.
The root cause of this is function
__dso__load_kallsyms()
+-> symbols__fixup_end()
Function symbols__fixup_end() enlarges the last symbol in the kallsyms
map:
# fgrep __irf_end /proc/kallsyms
0000000000e954d0 t __irf_end
#
to the start address of the first module:
# cat /proc/kallsyms | sort | egrep ' [tT] '
....
0000000000e952d0 T __security_initcall_end
0000000000e954d0 T __initramfs_size
0000000000e954d0 t __irf_end
000003ff800045a8 T fc_get_event_number [scsi_transport_fc]
000003ff800045d0 t store_fc_vport_disable [scsi_transport_fc]
000003ff800046a8 T scsi_is_fc_rport [scsi_transport_fc]
000003ff800046d0 t fc_target_setup [scsi_transport_fc]
On s390 the kernel is located around memory address 0x200, 0x10000 or
0x100000, depending on linux version. Modules however start somewhere
around 0x3ff xxxx xxxx.
This is different than x86 and produces a large gap for which histogram
allocation fails.
Fix this by detecting the kernel's last symbol and not adjusting it.
Introduce a weak function and handle s390 specifics.
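The effect can be reproduced with a small model. The sketch below is
illustrative Python, not the C code of this patch: Symbol, fixup_end and the
skip_kernel_to_module_gap flag are hypothetical names, and the guard (leave
the last kernel symbol alone when the next symbol belongs to a module) is an
assumption about one way to apply the fix described above. The addresses are
taken from the /proc/kallsyms listing.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    start: int
    end: int
    module: str = ""  # empty for kernel symbols, module name otherwise


def fixup_end(symbols, skip_kernel_to_module_gap):
    """Extend zero-sized symbols up to the start of the next symbol.

    With skip_kernel_to_module_gap=True (hypothetical flag) the last kernel
    symbol is not stretched across the gap to the first module symbol.
    """
    ordered = sorted(symbols, key=lambda s: s.start)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.end != prev.start:
            continue  # symbol already has a size
        crosses_gap = not prev.module and bool(cur.module)
        if skip_kernel_to_module_gap and crosses_gap:
            continue  # keep the last kernel symbol as read from kallsyms
        prev.end = cur.start
    return ordered


def kallsyms():
    # Three entries from the sorted /proc/kallsyms output shown above.
    return [
        Symbol("__security_initcall_end", 0xe952d0, 0xe952d0),
        Symbol("__irf_end", 0xe954d0, 0xe954d0),
        Symbol("fc_get_event_number", 0x3ff800045a8, 0x3ff800045a8,
               "scsi_transport_fc"),
    ]


naive = {s.name: s.end - s.start for s in fixup_end(kallsyms(), False)}
guarded = {s.name: s.end - s.start for s in fixup_end(kallsyms(), True)}
print(hex(naive["__irf_end"]), guarded["__irf_end"])
```

Without the guard, __irf_end spans the whole distance to the first module
symbol; with it, the symbol keeps the size it had when it was read.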
Reported-by: Klaus Theurich <klaus.theurich@de.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190724122703.3996-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390 the modules loaded in memory have the text segment located after
the GOT and Relocation table. This can be seen with this output:
[root@m35lp76 perf]# fgrep qeth /proc/modules
qeth 151552 1 qeth_l2, Live 0x000003ff800b2000
...
[root@m35lp76 perf]# cat /sys/module/qeth/sections/.text
0x000003ff800b3990
[root@m35lp76 perf]#
There is an offset of 0x1990 bytes. The size of the qeth module is
151552 bytes (0x25000 in hex).
The location of the GOT/relocation table at the beginning of a module is
unique to s390.
commit 203d8a4aa6 ("perf s390: Fix 'start' address of module's map")
adjusts the start address of a module in the map structures, but does
not adjust the size of the modules. This leads to overlapping of module
maps as this example shows:
[root@m35lp76 perf] # ./perf report -D
0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x25000)
@ 0]: x /lib/modules/.../qeth.ko.xz
0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x8000)
@ 0]: x /lib/modules/.../ip6_tables.ko.xz
The module qeth.ko has an adjusted start address modified to b3990, but
its size is unchanged and the module ends at 0x3ff800d8990. This end
address overlaps with the next module's start address of 0x3ff800d85a0.
When the size of the leading GOT/Relocation table stored in the
beginning of the text segment (0x1990 bytes) is subtracted from module
qeth end address, there are no overlaps anymore:
0x3ff800d8990 - 0x1990 = 0x3ff800d7000
which is the same as
0x3ff800b2000 + 0x25000 = 0x3ff800d7000.
To fix this issue, also adjust the module's size in function
arch__fix_module_text_start(). Add another function parameter named size
and reduce the size of the module when the text segment start address is
changed.
Output after:
0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x23670)
@ 0]: x /lib/modules/.../qeth.ko.xz
0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x7a60)
@ 0]: x /lib/modules/.../ip6_tables.ko.xz
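The arithmetic behind the qeth numbers can be checked with a few lines. This
is a minimal Python sketch of the adjustment, not the C implementation of
arch__fix_module_text_start(); the function name and signature used here are
hypothetical, and the input values come from the /proc/modules and
/sys/module/qeth/sections/.text output above.

```python
def fix_module_text_start(start, size, text_start):
    """Return (start, size) of a module map after moving start to .text.

    The size shrinks by the same offset that the start address grows, so
    the end address of the map stays where it was.
    """
    offset = text_start - start
    return text_start, size - offset


qeth_start, qeth_size = fix_module_text_start(
    0x3ff800b2000,  # module load address from /proc/modules
    0x25000,        # module size (151552 bytes)
    0x3ff800b3990,  # .text address from /sys/module/qeth/sections/.text
)
qeth_end = qeth_start + qeth_size
next_start = 0x3ff800d85a0  # start of the ip6_tables map in the report

print(hex(qeth_size), hex(qeth_end), qeth_end <= next_start)
```

The resulting size matches the 0x23670 shown in the output after the fix,
and the qeth map now ends before the ip6_tables map begins.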
Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 203d8a4aa6 ("perf s390: Fix 'start' address of module's map")
Link: http://lkml.kernel.org/r/20190724122703.3996-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the evlist__set_maps() function from tools/perf to libperf.
Committer notes:
Fix up reject due to earlier inversion in calling perf_evlist__init().
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-57-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving the following functions from tools/perf:
cpu_map__new()
cpu_map__read()
to libperf with the following names:
perf_cpu_map__new()
perf_cpu_map__read()
Committer notes:
Fixed up this one:
tools/perf/arch/arm/util/cs-etm.c
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-44-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the perf_event_attr struct from 'struct evsel' to 'struct perf_evsel'.
Committer notes:
Fixed up these:
tools/perf/arch/arm/util/auxtrace.c
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/arm-spe.c
tools/perf/arch/s390/util/auxtrace.c
tools/perf/util/cs-etm.c
Also
cc1: warnings being treated as errors
tests/sample-parsing.c: In function 'do_test':
tests/sample-parsing.c:162: error: missing initializer
tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')
struct evsel evsel = {
.needs_swap = false,
- .core.attr = {
- .sample_type = sample_type,
- .read_format = read_format,
+ .core = {
+ . attr = {
+ .sample_type = sample_type,
+ .read_format = read_format,
+ },
[perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
gcc (GCC) 4.4.7
Also we don't need to include perf_event.h in
tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
perf_event_attr' is enough. And this even fixes the build in some
systems where things are used somewhere down the include path from
perf_event.h without defining __always_inline.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move nr_entries count from 'struct perf' into perf_evlist struct.
Committer notes:
Fix tools/perf/arch/s390/util/auxtrace.c case. And also the comment in
tools/perf/util/annotate.h.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-42-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving the following functions:
cpu_map__get()
cpu_map__put()
to libperf with following names:
perf_cpu_map__get()
perf_cpu_map__put()
Committer notes:
Added fixes for arm/arm64
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-31-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__disable() to evlist__disable(), so we don't have a
name clash when we add perf_evlist__disable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__enable() to evlist__enable(), so we don't have a
name clash when we add perf_evlist__enable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__open() to evlist__open(), so we don't have a name
clash when we add perf_evlist__open() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Renaming perf_evsel__disable() to evsel__disable(), so we don't have a
name clash when we add perf_evsel__disable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evsel__enable() to evsel__enable(), so we don't have a name
clash when we add perf_evsel__enable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__delete() to evlist__delete(), so we don't have a
name clash when we add perf_evlist__delete() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__new() to evlist__new(), so we don't have a name
clash when we add perf_evlist__new() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.
Committer notes:
Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct thread_map to struct perf_thread_map, so it could be part
of libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct cpu_map to struct perf_cpu_map, so it could be part of
libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Eroding a bit more the tools/perf/util/util.h hodgepodge header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Part of the erosion of util/util.h, that will lose its include stdlib.h,
we need to add it to places where it is needed but was getting it
indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1imnqezw99ahc07fjeb51qby@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf fixes from Ingo Molnar:
"Various fixes, most of them related to bugs perf fuzzing found in the
x86 code"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/regs: Use PERF_REG_EXTENDED_MASK
perf/x86: Remove pmu->pebs_no_xmm_regs
perf/x86: Clean up PEBS_XMM_REGS
perf/x86/regs: Check reserved bits
perf/x86: Disable extended registers for non-supported PMUs
perf/ioctl: Add check for the sample_period value
perf/core: Fix perf_sample_regs_user() mm check
There were a few places where we still were using the libc version of
ctype.h, switch to the one in tools/lib/ctype.c that the rest of perf
uses.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wa4nz4kt61eze88eprk20tfd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We got the sane_ctype.h headers from git and kept using it so far, but
since that code originally came from the kernel sources to the git
sources, perhaps it's better to just use the one in the kernel, so that
we can leverage tools/perf/check_headers.sh to be notified when our copy
gets out of sync, i.e. when fixes or goodies are added to the code we've
copied.
This will help with things like tools/lib/string.c where we want to have
more things in common with the kernel, such as strim(), skip_spaces(),
etc so as to go on removing the things that we have in tools/perf/util/
and instead using the code in the kernel, indirectly and removing things
like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
are made to the original code.
Hopefully this also should help with reducing the difference of code
hosted in tools/ to the one in the kernel proper.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So as not to depend on getting it indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tirjsmvu4ektw0k7lm8k9lhu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We're getting it by sheer luck, add that util.h to get the 'page_size'
definition.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-347078mgj3d2jfygtxs4ntti@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the macro defined in kernel ABI header to replace the local name.
No functional change.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1559081314-9714-5-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Call function cs_etm_set_option() once with all relevant options set
rather than multiple times to avoid going through the list of CPUs more
than once.
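The idea of combining options into one call can be sketched as follows. This
is illustrative Python, not the cs-etm C code: OPT_CTXTID, OPT_TS and
set_option() are hypothetical stand-ins, and the bit values are assumptions
chosen only to show the pattern of OR-ing options together so the CPU list is
walked a single time.

```python
# Hypothetical option bits; the real cs-etm option values may differ.
OPT_CTXTID = 1 << 0
OPT_TS = 1 << 1


def set_option(cpus, config, options):
    """Apply every requested option to every CPU in a single pass."""
    for cpu in cpus:
        enabled = config.setdefault(cpu, set())
        if options & OPT_CTXTID:
            enabled.add("ctxtid")
        if options & OPT_TS:
            enabled.add("ts")
    return config


# One call with both bits set instead of one call per option.
config = set_option(range(4), {}, OPT_CTXTID | OPT_TS)
print(config[0])
```

Calling the function once per option would give the same result but would
iterate over the CPUs once for each option.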
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190611204528.20093-1-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In order to subsequently add more tests for the arm64 architecture we
compile the tests target for arm64 systematically.
Further explanation provided by Mark Rutland:
Given prior questions regarding this commit, it's probably worth
spelling things out more explicitly, e.g.
Currently we only build the arm64/tests directory if
CONFIG_DWARF_UNWIND is selected, which is fine as the only test we
have is arm64/tests/dwarf-unwind.o.
So that we can add more tests to the test directory, let's
unconditionally build the directory, but conditionally build
dwarf-unwind.o depending on CONFIG_DWARF_UNWIND.
There should be no functional change as a result of this patch.
Signed-off-by: Raphael Gault <raphael.gault@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190611125315.18736-2-raphael.gault@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo-5.3-20190611' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf record:
Alexey Budankov:
- Allow mixing --user-regs with --call-graph=dwarf, making sure that
the minimal set of registers for DWARF unwinding is present in the
set of user registers requested to be present in each sample, while
warning the user that this may make callchains unreliable if more
than the minimal set of registers is needed to unwind.
yuzhoujian:
- Add support to collect callchains from kernel or user space only,
IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user}
bits from the command line.
perf trace:
Arnaldo Carvalho de Melo:
- Remove x86_64 specific syscall numbers from the augmented_raw_syscalls
BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit}
payloads, use instead the syscall numbers obtained either by the
arch specific syscalltbl generators or from audit-libs.
- Allow 'perf trace' to ask for the number of bytes to collect for
string arguments, for now ask for PATH_MAX, i.e. the whole
pathnames, which ends up being just a way to specify which syscall
args are pathnames and thus should be read using bpf_probe_read_str().
- Skip unknown syscalls when expanding strace like syscall groups.
This helps using the 'string' group of syscalls to work in arm64,
where some of the syscalls present in x86_64 that deal with
strings, for instance 'access', are deprecated and this should not
be asked for tracing.
Leo Yan:
- Exit when failing to build eBPF program.
perf config:
Arnaldo Carvalho de Melo:
- Bail out when a handler returns failure for a key-value pair. This
helps with cases where processing a key-value pair is not just a
matter of setting some tool specific knob, involving, for instance
building a BPF program to then attach to the list of events 'perf
trace' will use, e.g. augmented_raw_syscalls.c.
perf.data:
Kan Liang:
- Read and store die ID information available in new Intel processors
in CPUID.1F in the CPU topology written in the perf.data header.
perf stat:
Kan Liang:
- Support per-die aggregation.
Documentation:
Arnaldo Carvalho de Melo:
- Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY,
CLOCKID and DIR_FORMAT headers.
Song Liu:
- Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF.
Leo Yan:
- Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'.
JVMTI:
Jiri Olsa:
- Address gcc string overflow warning for strncpy()
core:
- Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd().
Intel PT:
Adrian Hunter:
- Add support for samples to contain IPC ratio, collecting cycles
information from CYC packets, showing the IPC info periodically, because
Intel PT does not update the cycle count on every branch or instruction,
the incremental values will often be zero. When there are values, they
will be the number of instructions and number of cycles since the last
update, and thus represent the average IPC since the last IPC value.
E.g.:
# perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001
rounding mmap pages size to 1024M (262144 pages)
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 2.208 MB perf.data ]
# perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
#
<SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)>
1 cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f jnz 0x7f5219ac2af0 IPC: 0.81 (36/44)
2 cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45 cmp $0x1f, %rbp
3 cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49 jbe 0x7f5219ac2b00
4 cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f test $0x8, %al
5 cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51 jnz 0x7f5219ac2b00
6 cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57 movq 0x13c58a(%rip), %rcx
7 cc1 63501.650479626: 7f5219ac27de _int_free+0x5e mov %rdi, %r12
8 cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61 movq %fs:(%rcx), %rax
9 cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65 test %rax, %rax
10 cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68 jz 0x7f5219ac2821
11 cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a leaq -0x11(%rbp), %rdi
12 cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e mov %rdi, %rsi
13 cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71 shr $0x4, %rsi
14 cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75 cmpq %rsi, 0x13caf4(%rip)
15 cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c jbe 0x7f5219ac2821
16 cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1 cmpq 0x13f138(%rip), %rbp
17 cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8 jnbe 0x7f5219ac28d8
18 cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158 testb $0x2, 0x8(%rbx)
19 cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c jnz 0x7f5219ac2ab0 IPC: 6.00 (18/3)
<SNIP>
- Allow using time ranges with Intel PT, i.e. these features, already
present but not optimally usable with Intel PT, should be now:
Select the second 10% time slice:
$ perf script --time 10%/2
Select from 0% to 10% time slice:
$ perf script --time 0%-10%
Select the first and second 10% time slices:
$ perf script --time 10%/1,10%/2
Select from 0% to 10% and 30% to 40% slices:
$ perf script --time 0%-10%,30%-40%
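The IPC figures in the Intel PT sample output above, e.g. "IPC: 0.81 (36/44)"
and "IPC: 6.00 (18/3)", are the instruction count divided by the cycle count
since the previous IPC value. The sketch below is illustrative Python with a
hypothetical format_ipc() helper; it assumes the ratio is truncated to two
decimals with integer arithmetic, which is consistent with 36/44 being shown
as 0.81 rather than 0.82.

```python
def format_ipc(insn_cnt, cyc_cnt):
    """Format an IPC value from instruction and cycle counts.

    Uses integer arithmetic, so the ratio is truncated (not rounded) to two
    decimal places. Returns an empty string when no cycles were counted.
    """
    if cyc_cnt == 0:
        return ""  # no cycle count update since the last IPC value
    ipc = insn_cnt * 100 // cyc_cnt
    return f"IPC: {ipc // 100}.{ipc % 100:02d} ({insn_cnt}/{cyc_cnt})"


print(format_ipc(36, 44))  # IPC: 0.81 (36/44)
print(format_ipc(18, 3))   # IPC: 6.00 (18/3)
```

In the numbered output, the 18 instructions on lines 2 to 19 took 3 cycles,
which gives the 6.00 reported on line 19.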
cs-etm (ARM):
Mathieu Poirier:
- Add support for CPU-wide trace scenarios.
s390:
Thomas Richter:
- Fix missing kvm module load for s390.
- Fix OOM error in TUI mode on s390
- Support s390 diag event display when doing analysis on !s390
architectures.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds the necessary intelligence to properly compute the value
of 'old' and 'head' when operating in snapshot mode. That way we can
get the latest information in the AUX buffer and be compatible with the
generic AUX ring buffer mechanic.
Tester notes:
> Leo, have you had the chance to test/review this one? Suzuki?
Sure. I applied this patch on the perf/core branch (with latest
commit 3e4fbf36c1e3 'perf augmented_raw_syscalls: Move reading
filename to the loop') and passed testing with below steps:
# perf record -e cs_etm/@tmc_etr0/ -S -m,64 --per-thread ./sort &
[1] 19097
Bubble sorting array of 30000 elements
# kill -USR2 19097
# kill -USR2 19097
# kill -USR2 19097
[ perf record: Woken up 4 times to write data ]
[ perf record: Captured and wrote 0.753 MB perf.data ]
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190605161633.12245-1-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ask the perf core to generate an event when processes are swapped in/out
of context. That way proper action can be taken by the decoding code
when faced with such an event.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190524173508.29044-4-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When operating in CPU-wide mode tracers need to generate timestamps in
order to correlate the code being traced on one CPU with what is executed
on other CPUs.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190524173508.29044-3-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When operating in CPU-wide mode being notified of contextID changes is
required so that the decoding mechanic is aware of the process context
switch.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190524173508.29044-2-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 only
as published by the free software foundation
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Armijn Hemel <armijn@tjaldur.nl>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531081036.798138318@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 263 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull perf fixes from Ingo Molnar:
"On the kernel side there's a bunch of ring-buffer ordering fixes for a
reproducible bug, plus a PEBS constraints regression fix.
Plus tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools headers UAPI: Sync kvm.h headers with the kernel sources
perf record: Fix s390 missing module symbol and warning for non-root users
perf machine: Read also the end of the kernel
perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms
perf session: Add missing swap ops for namespace events
perf namespace: Protect reading thread's namespace
tools headers UAPI: Sync drm/drm.h with the kernel
tools headers UAPI: Sync drm/i915_drm.h with the kernel
tools headers UAPI: Sync linux/fs.h with the kernel
tools headers UAPI: Sync linux/sched.h with the kernel
tools arch x86: Sync asm/cpufeatures.h with the kernel
tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls
perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel
perf data: Fix 'strncat may truncate' build failure with recent gcc
perf/ring-buffer: Use regular variables for nesting
perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data
perf/ring_buffer: Add ordering to rb->nest increment
perf/ring_buffer: Fix exposing a temporarily decreased data_head
perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not write to the free software foundation inc
59 temple place suite 330 boston ma 02111 1307 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 1334 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.113240726@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'perf record' and 'perf report' commands, on a system without kernel
debuginfo packages, use /proc/kallsyms and /proc/modules to find
addresses for kernel and module symbols. On x86 this works for root and
non-root users.
On s390, when invoked as non-root user, many of the following warnings
are shown and module symbols are missing:
proc/{kallsyms,modules} inconsistency while looking for
"[sha1_s390]" module!
Command 'perf record' creates a list of module start addresses by
parsing the output of /proc/modules and creates a PERF_RECORD_MMAP
record for the kernel and each module. The following function call
sequence is executed:
machine__create_kernel_maps
machine__create_module
modules__parse
machine__create_module --> for each line in /proc/modules
arch__fix_module_text_start
Function arch__fix_module_text_start() is s390 specific. It opens
file /sys/module/<name>/sections/.text to extract the module's .text
section start address. On s390 the module loader prepends a header
before the first section, whereas on x86 the module's text section
address is identical to the module's load address.
However module section files are root readable only. For non-root the
read operation fails and machine__create_module() returns an error.
Command perf record does not generate any PERF_RECORD_MMAP record
for loaded modules. Later command perf report complains about missing
module maps.
To fix this, arch__fix_module_text_start() now always returns
success. For root users there is no change; for non-root users
the module's load address is used as the module's text start address
(the prepended header then counts as part of the text section).
This enables non-root users to use module symbols and avoids the
warning when perf report is executed.
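The fallback described above can be sketched as follows. This is an illustration in Python, not the C code of the patch; the function name `module_text_start` and the `sysfs_root` parameter are hypothetical, and the real function has a different signature:

```python
from pathlib import Path


def module_text_start(name: str, load_addr: int, sysfs_root: str = "/sys/module") -> int:
    """Return the module's .text start address, or its load address as a fallback.

    The sections file is readable by root only, so for non-root users the
    read fails and the load address is returned instead of an error.
    """
    path = Path(sysfs_root) / name / "sections" / ".text"
    try:
        # The file holds a single hexadecimal address such as 0x3ff80001000.
        return int(path.read_text().strip(), 16)
    except (OSError, ValueError):
        # Unreadable or malformed: keep the load address and report success.
        return load_addr
```

With this shape the caller never sees a failure, so a map can still be created for every module listed in /proc/modules.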
Output before:
[tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP
0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text
Output after:
[tmricht@m83lp54 perf]$ ./perf report -D | fgrep MMAP
0 0x168 [0x50]: PERF_RECORD_MMAP ... x [kernel.kallsyms]_text
0 0x1b8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../autofs4.ko.xz
0 0x250 [0xa8]: PERF_RECORD_MMAP ... x /lib/modules/.../sha_common.ko.xz
0 0x2f8 [0x98]: PERF_RECORD_MMAP ... x /lib/modules/.../des_generic.ko.xz
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Link: http://lkml.kernel.org/r/20190522144601.50763-4-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Copy the headers changed by these csets:
d8076bdb56 ("uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]")
9c8ad7a2ff ("uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]")
cf3cba4a42 ("vfs: syscall: Add fspick() to select a superblock for reconfiguration")
93766fbd26 ("vfs: syscall: Add fsmount() to create a mount for a superblock")
ecdab150fd ("vfs: syscall: Add fsconfig() for configuring and managing a context")
24dcb3d90a ("vfs: syscall: Add fsopen() to prepare for superblock creation")
2db154b3ea ("vfs: syscall: Add move_mount(2) to move mounts around")
a07b200047 ("vfs: syscall: Add open_tree(2) to reference or clone a mount")
We need to create tables for all the flags arguments in the new syscalls,
in followup patches.
This silences these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/linux/mount.h' differs from latest version at 'include/uapi/linux/mount.h'
diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
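The warnings above come from comparing each copy under tools/ with its original in the kernel tree. A minimal sketch of such a check, assuming a plain byte-for-byte comparison (the real build check may ignore some differences); the function name `abi_header_warnings` is hypothetical:

```python
import filecmp


def abi_header_warnings(pairs):
    """Return one warning per (tools_copy, kernel_original) pair whose contents differ."""
    warnings = []
    for tools_copy, kernel_original in pairs:
        # shallow=False compares file contents rather than os.stat() signatures.
        if not filecmp.cmp(tools_copy, kernel_original, shallow=False):
            warnings.append(
                f"Warning: Kernel ABI header at '{tools_copy}' differs "
                f"from latest version at '{kernel_original}'"
            )
    return warnings
```

Once the copies are synced, the list comes back empty and the build stays quiet.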
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-knpqr1u2ffvz6641056z2mwu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a host system has kernel headers that are newer than the kernel
being compiled, mksyscalltbl fails with errors such as:
<stdin>: In function 'main':
<stdin>:271:44: error: '__NR_kexec_file_load' undeclared (first use in this function)
<stdin>:271:44: note: each undeclared identifier is reported only once for each function it appears in
<stdin>:272:46: error: '__NR_pidfd_send_signal' undeclared (first use in this function)
<stdin>:273:43: error: '__NR_io_uring_setup' undeclared (first use in this function)
<stdin>:274:43: error: '__NR_io_uring_enter' undeclared (first use in this function)
<stdin>:275:46: error: '__NR_io_uring_register' undeclared (first use in this function)
tools/perf/arch/arm64/entry/syscalls//mksyscalltbl: line 48: /tmp/create-table-xvUQdD: Permission denied
mksyscalltbl is compiled with the default host includes, but run with
the includes of the kernel tree being compiled, causing some syscall
numbers to be undeclared.
Committer testing:
Before this patch there were no build problems in my cross build
environment, but these new syscalls were missing from the syscalls.c
generated from the unistd.h file, which is a bug. This patch fixes it:
perfbuilder@6e20056ed532:/git/perf$ tail /tmp/build/perf/arch/arm64/include/generated/asm/syscalls.c
[292] = "io_pgetevents",
[293] = "rseq",
[294] = "kexec_file_load",
[424] = "pidfd_send_signal",
[425] = "io_uring_setup",
[426] = "io_uring_enter",
[427] = "io_uring_register",
[428] = "syscalls",
};
perfbuilder@6e20056ed532:/git/perf$ strings /tmp/build/perf/perf | egrep '^(io_uring_|pidfd_|kexec_file)'
kexec_file_load
pidfd_send_signal
io_uring_setup
io_uring_enter
io_uring_register
perfbuilder@6e20056ed532:/git/perf$
$
Well, there is that last "syscalls" thing, but that looks like some
other bug.
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Michael Petlan <mpetlan@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190521030203.1447-1-vt@altlinux.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
XMM registers can be collected on Icelake and later platforms.
Add a specific arch__intr_reg_mask(), which creates an event to check if
the kernel and hardware can collect XMM registers.
Tested on Skylake, which doesn't support XMM register collection. Nothing
changes there.
#perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9
R10 R11 R12 R13 R14 R15
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on
interrupt, use '-I?' to list register names
#perf record -I
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.905 MB perf.data (2520 samples) ]
#perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
1, bpf_event: 1, sample_regs_intr: 0xff0fff
Tested on Icelake, which supports XMM register collection.
#perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on
interrupt, use '-I?' to list register names
#perf record -I
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.800 MB perf.data (318 samples) ]
#perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type:
IP|TID|TIME|CPU|PERIOD|REGS_INTR, read_format: ID, disabled: 1,
inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3,
sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol:
1, bpf_event: 1, sample_regs_intr: 0xffffffff00ff0fff
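The two sample_regs_intr values can be decoded back into register names. The sketch below assumes the x86 perf register numbering in which the general purpose registers start at bit 0 in the order shown, and each XMM register occupies two consecutive bits starting at bit 32; that layout is an assumption stated here, and the function name is hypothetical:

```python
# Assumed bit order of the x86 general purpose registers, starting at bit 0.
GPRS = [
    "AX", "BX", "CX", "DX", "SI", "DI", "BP", "SP", "IP", "FLAGS",
    "CS", "SS", "DS", "ES", "FS", "GS",
    "R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15",
]
XMM_BASE = 32   # assumed first bit of XMM0
XMM_COUNT = 16  # XMM0 through XMM15, two bits each


def decode_intr_regs(mask: int) -> list[str]:
    """Return the register names selected by a sample_regs_intr mask."""
    names = [name for bit, name in enumerate(GPRS) if mask & (1 << bit)]
    for i in range(XMM_COUNT):
        # A 128-bit XMM register is represented by a pair of mask bits.
        if (mask >> (XMM_BASE + 2 * i)) & 0b11 == 0b11:
            names.append(f"XMM{i}")
    return names


if __name__ == "__main__":
    print(" ".join(decode_intr_regs(0xff0fff)))
    print(" ".join(decode_intr_regs(0xffffffff00ff0fff)))
```

Under these assumptions the Skylake mask decodes to the twenty general purpose registers listed by -I?, and the Icelake mask adds XMM0 through XMM15.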
Committer notes:
Don't set attr.sample_period as a named struct init, as it is part of an
unnamed union in 'struct perf_event_attr', and doing so breaks the build
on older gcc versions, such as:
gcc version 4.1.2 20080704 (Red Hat 4.1.2-55)
gcc version 4.4.7 20120313 (Red Hat 4.4.7-23) (GCC)
arch/x86/util/perf_regs.c: In function 'arch__intr_reg_mask':
arch/x86/util/perf_regs.c:279: error: unknown field 'sample_period' specified in initializer
cc1: warnings being treated as errors
arch/x86/util/perf_regs.c:279: warning: missing braces around initializer
arch/x86/util/perf_regs.c:279: warning: (near initialization for 'attr.<anonymous>')
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
[ Only on a lenovo t480s, a skylake machine, where the XMM registers didn't show up in -I?/--user-regs=? as expected ]
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1557865174-56264-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Icelake and later platforms support collecting XMM registers with PEBS
events.
Add support for 'perf script' to dump them, and support for the register
parser in 'perf record -I=' ... to configure them.
For now they are just printed in hex, we could potentially later add
other formats too.
Committer testing:
Before:
# perf record -IXMM0
Warning:
unknown register XMM0, check man page or run 'perf record -I?'
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
#
# perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
#
After:
# perf record -IXMM0
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cycles).
/bin/dmesg | grep -i perf may provide additional information.
#
# perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9 XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on interrupt, use -I ? to list register names
#
More work is needed to warn the user, when faced with such an error,
that the register is not available on the running platform.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190506141926.13659-1-kan.liang@linux.intel.com
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds support for DWARF register mappings and libdw register
initialization, which are used by perf callchain analysis when
--call-graph=dwarf is given.
Here is the elfutils csky backend patch set:
https://sourceware.org/ml/elfutils-devel/2019-q2/msg00007.html
Signed-off-by: Mao Han <han_mao@c-sky.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555860794-10572-1-git-send-email-guoren@kernel.org
Signed-off-by: Guo Ren <ren_guo@c-sky.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes introduced in the following csets:
2b188cc1bb ("Add io_uring IO interface")
edafccee56 ("io_uring: add support for pre-mapped user IO buffers")
3eb39f4793 ("signal: add pidfd_send_signal() syscall")
This makes 'perf trace' aware of these new syscalls, so that
one can use them like 'perf trace -e io_uring*,*signal' to do a system
wide strace-like session looking at those syscalls, for instance.
For example:
# perf trace -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla
Summary of events:
io_uring-cp (383), 1208866 events, 100.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
-------------- ------ -------- ------ ------- ------- ------
io_uring_enter 605780 2955.615 0.000 0.005 33.804 1.94%
openat 4 459.446 0.004 114.861 459.435 100.00%
munmap 4 0.073 0.009 0.018 0.042 44.03%
mmap 10 0.054 0.002 0.005 0.026 43.24%
brk 28 0.038 0.001 0.001 0.003 7.51%
io_uring_setup 1 0.030 0.030 0.030 0.030 0.00%
mprotect 4 0.014 0.002 0.004 0.005 14.32%
close 5 0.012 0.001 0.002 0.004 28.87%
fstat 3 0.006 0.001 0.002 0.003 35.83%
read 4 0.004 0.001 0.001 0.002 13.58%
access 1 0.003 0.003 0.003 0.003 0.00%
lseek 3 0.002 0.001 0.001 0.001 9.00%
arch_prctl 2 0.002 0.001 0.001 0.001 0.69%
execve 1 0.000 0.000 0.000 0.000 0.00%
#
# perf trace -e io_uring* -s io_uring-cp ~acme/isos/RHEL-x86_64-dvd1.iso ~/bla
Summary of events:
io_uring-cp (390), 1191250 events, 100.0%
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
-------------- ------ -------- ------ ------ ------ ------
io_uring_enter 597093 2706.060 0.001 0.005 14.761 1.10%
io_uring_setup 1 0.038 0.038 0.038 0.038 0.00%
#
More work needed to make the tools/perf/examples/bpf/augmented_raw_syscalls.c
BPF program to copy the 'struct io_uring_params' arguments to perf's ring
buffer so that 'perf trace' can use the BTF info put in place by pahole's
conversion of the kernel DWARF and then auto-beautify those arguments.
This patch produces the expected change in the generated syscalls table
for x86_64:
--- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before 2019-03-26 13:37:46.679057774 -0300
+++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2019-03-26 13:38:12.755990383 -0300
@@ -334,5 +334,9 @@ static const char *syscalltbl_x86_64[] =
[332] = "statx",
[333] = "io_pgetevents",
[334] = "rseq",
+ [424] = "pidfd_send_signal",
+ [425] = "io_uring_setup",
+ [426] = "io_uring_enter",
+ [427] = "io_uring_register",
};
-#define SYSCALLTBL_x86_64_MAX_ID 334
+#define SYSCALLTBL_x86_64_MAX_ID 427
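The generated table in the diff above follows mechanically from the entries in syscall_64.tbl. A rough sketch of that generation step, assuming the usual "<number> <abi> <name> <entry point>" line format and that x32 entries are skipped; this is not the script perf actually uses, and the function name is hypothetical:

```python
def gen_syscall_table(tbl_text: str) -> str:
    """Build the C string table and MAX_ID define from syscall table text."""
    entries = {}
    for line in tbl_text.splitlines():
        fields = line.split()
        # Skip comments, blank lines and anything not starting with a number.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        nr, abi, name = int(fields[0]), fields[1], fields[2]
        if abi == "x32":
            continue
        entries[nr] = name
    if not entries:
        raise ValueError("no syscall entries found")
    out = ["static const char *syscalltbl_x86_64[] = {"]
    for nr in sorted(entries):
        out.append(f'\t[{nr}] = "{entries[nr]}",')
    out.append("};")
    out.append(f"#define SYSCALLTBL_x86_64_MAX_ID {max(entries)}")
    return "\n".join(out)
```

Adding the four new lines to the input table is enough to move the MAX_ID define from 334 to 427, which matches the diff.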
This silences these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-p0ars3otuc52x5iznf21shhw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in 7948450d45 ("x86/x32: use time64 versions of
sigtimedwait and recvmmsg"), that doesn't cause any change in behaviour
in tools/perf/ as it deals just with the x32 entries.
This silences this tools/perf build warning:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-mqpvshayeqidlulx5qpioa59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>