The event converter scripts at:
https://github.com/intel/event-converter-for-linux-perf
passes Filter values from data on 01.org that is bogus in a perf command
line and can cause perf to infinitely recurse in parse events. Remove
such events or filters using the updated patch:
afd779df99
Fixes: ef908a1925 ("perf vendor events: Update Intel broadwellde")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805013856.1842878-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow the architecture built into pmu-events.c to be set on the make
command line with JEVENTS_ARCH.
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previous implementation wanted variable order and '(null)' string output
to match the C implementation. The '(null)' string output was a
quirk/bug and so there is no need to carry it forward.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Improve type hints to clean up pytype warnings.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220804221816.1802790-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Switch to new EVP API for detecting libcrypto, as Fedora 36 returns an
error when it encounters the deprecated function MD5_Init() and the others.
The error would be interpreted as missing libcrypto, while in reality it is
not.
Fixes: 6e8ccb4f62 ("tools/bpf: properly account for libbfd variations")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: llvm@lists.linux.dev
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220719170555.2576993-4-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As the building mechanism is now able to retry detection with different
combinations of linking flags, setting
FEATURE_CHECK_LDFLAGS-disassembler-four-args and
FEATURE_CHECK_LDFLAGS-disassembler-init-styled is not necessary anymore,
so remove it.
Committer notes:
Use the same technique to find the set of bfd-related libraries to link as in:
3308ffc5016e6136 ("tools, build: Retry detection of bfd-related features")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220719170555.2576993-3-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 6e8ccb4f62 ("tools/bpf: properly account for libbfd variations")
sets the linking flags depending on which flavor of the libbfd feature was
detected.
However, the flavors except libbfd cannot be detected, as they are not in
the feature list.
Complete the list of features to detect by adding libbfd-liberty and
libbfd-liberty-z.
Committer notes:
Adjust conflict with with:
1e1613f64c ("tools bpftool: Don't display disassembler-four-args feature test")
600b7b26c0 ("tools bpftool: Fix compilation error with new binutils")
Fixes: 6e8ccb4f62 ("tools/bpf: properly account for libbfd variations")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: llvm@lists.linux.dev
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220719170555.2576993-2-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While separate features have been defined to determine which linking flags
are required to use libbfd depending on the distribution (libbfd,
libbfd-liberty and libbfd-liberty-z), the same has not been done for other
features requiring linking to libbfd.
For example, disassembler-four-args requires linking to libbfd too, but it
should use the right linking flags. If not all the required ones are
specified, e.g. -liberty, detection will always fail even if the feature is
available.
Instead of creating new features, similarly to libbfd, simply retry
detection with the different set of flags until detection succeeds (or
fails, if the libraries are missing). In this way, feature detection is
transparent for the users of this building mechanism (e.g. perf), and those
users don't have for example to set an appropriate value for the
FEATURE_CHECK_LDFLAGS-disassembler-four-args variable.
The number of retries and features for which the retry mechanism is
implemented is low enough to make the increase in the complexity of
Makefile negligible.
Tested with perf and bpftool on Ubuntu 20.04.4 LTS, Fedora 36 and openSUSE
Tumbleweed.
Committer notes:
Do the retry for disassembler-init-styled as well.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220719170555.2576993-1-roberto.sassu@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add field checking tests for perf stat JSON output.
Sanity checks the expected number of fields are present, that the
expected keys are present and they have the correct values.
Committer notes:
Had to fix this:
- $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib' \
+ $(INSTALL) tests/shell/lib/*.sh '$(DESTDIR_SQ)$(perfexec_instdir_SQ)/tests/shell/lib'; \
Committer testing:
[root@quaco ~]# perf test json
90: perf stat JSON output linter : Ok
[root@quaco ~]# set -o vi
[root@quaco ~]# perf test -v json
90: perf stat JSON output linter :
--- start ---
test child forked, pid 560794
Checking json output: no args [Success]
Checking json output: system wide [Success]
Checking json output: system wide Checking json output: system wide no aggregation [Success]
Checking json output: interval [Success]
Checking json output: event [Success]
Checking json output: per core [Success]
Checking json output: per thread [Success]
Checking json output: per die [Success]
Checking json output: per node [Success]
Checking json output: per socket [Success]
test child finished with 0
---- end ----
perf stat JSON output linter: Ok
[root@quaco ~]#
Signed-off-by: Claire Jensen <cjense@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alyssa Ross <hi@alyssa.is>
Cc: Claire Jensen <clairej735@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220805200105.2020995-3-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a regression test to check against invalid check_and_init_map_value
call inside prealloc_lru_pop.
The kptr should not be reset to NULL once we set it after deleting the
map element. Hence, we trigger a program that updates the element
causing its reuse, and checks whether the unref kptr is reset or not.
If it is, prealloc_lru_pop does an incorrect check_and_init_map_value
call and the test fails.
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220809213033.24147-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
* An optimization in memblock_add_range() to reduce array traversals
* Improvements to the memblock test suite
-----BEGIN PGP SIGNATURE-----
iQFMBAABCAA2FiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmLyCbgYHG1pa2UucmFw
b3BvcnRAZ21haWwuY29tAAoJEDkDhibLDv2RBxMH/1uIcfERl3Cbw25zluWSVn4O
mrnr+JPqUkyeVLQDEGzk/VWIM1WT11s7fFpoTpIwu3dq/fVoD3HZlZQkWS0ANFDL
V3xf6Xz17R5ZNoZmacczhNaBqkJSi+dcvoAevjyBHPpKEaCLC/rNrISpDdCD0Lz0
5fgv2F4sISBUVc6FVIFB+9zKC/neI9ewemCABSFTIa5mmQvZZwX1Tj5BrxIsESwN
DwX5u1Q65SoFBbAk6F5+aoClJ7wMGz8OlZoFw106HTvxq8sNne27KXW9mKugBzJr
yAZ/TWrjXigNAr8dcXQEZuxagFSB1PQ4aNgU8phiAwE7/5z3j1KLa65hDRzc9t4=
=JMiG
-----END PGP SIGNATURE-----
Merge tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock updates from Mike Rapoport:
- An optimization in memblock_add_range() to reduce array traversals
- Improvements to the memblock test suite
* tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock test: Modify the obsolete description in README
memblock tests: fix compilation errors
memblock tests: change build options to run-time options
memblock tests: remove completed TODO items
memblock tests: set memblock_debug to enable memblock_dbg() messages
memblock tests: add verbose output to memblock tests
memblock tests: Makefile: add arguments to control verbosity
memblock: avoid some repeat when add new range
Intel eIBRS machines do not sufficiently mitigate against RET
mispredictions when doing a VM Exit therefore an additional RSB,
one-entry stuffing is needed.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLqsGsACgkQEsHwGGHe
VUpXGg//ZEkxhf3Ri7X9PknAWNG6eIEqigKqWcdnOw+Oq/GMVb6q7JQsqowK7KBZ
AKcY5c/KkljTJNohditnfSOePyCG5nDTPgfkjzIawnaVdyJWMRCz/L4X2cv6ykDl
2l2EvQm4Ro8XAogYhE7GzDg/osaVfx93OkLCQj278VrEMWgM/dN2RZLpn+qiIkNt
DyFlQ7cr5UASh/svtKLko268oT4JwhQSbDHVFLMJ52VaLXX36yx4rValZHUKFdox
ZDyj+kiszFHYGsI94KAD0dYx76p6mHnwRc4y/HkVcO8vTacQ2b9yFYBGTiQatITf
0Nk1RIm9m3rzoJ82r/U0xSIDwbIhZlOVNm2QtCPkXqJZZFhopYsZUnq2TXhSWk4x
GQg/2dDY6gb/5MSdyLJmvrTUtzResVyb/hYL6SevOsIRnkwe35P6vDDyp15F3TYK
YvidZSfEyjtdLISBknqYRQD964dgNZu9ewrj+WuJNJr+A2fUvBzUebXjxHREsugN
jWp5GyuagEKTtneVCvjwnii+ptCm6yfzgZYLbHmmV+zhinyE9H1xiwVDvo5T7DDS
ZJCBgoioqMhp5qR59pkWz/S5SNGui2rzEHbAh4grANy8R/X5ASRv7UHT9uAo6ve1
xpw6qnE37CLzuLhj8IOdrnzWwLiq7qZ/lYN7m+mCMVlwRWobbOo=
=a8em
-----END PGP SIGNATURE-----
Merge tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 eIBRS fixes from Borislav Petkov:
"More from the CPU vulnerability nightmares front:
Intel eIBRS machines do not sufficiently mitigate against RET
mispredictions when doing a VM Exit therefore an additional RSB,
one-entry stuffing is needed"
* tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation: Add LFENCE to RSB fill sequence
x86/speculation: Add RSB VM Exit protections
- Fix NULL pointer dereference in the thermal sysfs interface that
results from an error code path mishandling (Rafael Wysocki).
- Drop COMPILE_TEST dependency that's not needed any more from two
thermal Kconfig entries (Jean Delvare).
- Make the Intel TCC cooling driver support Alder Lake-N and Raptor
Lake-P (Sumeet Pawnikar).
- Fix possible path truncations in the tmon utility (Florian Fainelli).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLxT1USHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxj5UP/j2uxSAZC0OpyS/wRP9PmzTJKaUdofD8
BhQ6jvVu1wcQ+zG2MGrLUw9mOTEfd6dDfDUMnwiSAoHKvtUpEM7hsRa9wEGjXZoN
mJYzRFtWT8cqNA4WaVV9UjmTLVRvswXQ25ExumydCoXMHzk8ZSwkslq0rIfPlHWa
NpqraiXdmqG5qFlfQR1AhnLF3IXbogrxWbXqJRDatxwb7m4VEqy7TKQYKBozpWu6
0SfnGOrW6WqL5YaNCO2Q2cbRNRlVnS1gaGTfowMGAR5IRRz0MYmoHAvRIQzpigJI
VaTShIh3LQi+ud2dHsyeBZn95r35JgUQB716CHsZkkPB7WJx+Jj68K/u+TpsV8bV
KHTS7H1if2jugv0duC/ZSMqrN9Kqam2rjPnAtjK+Eo6gzWYP77Ha+yjkHTcWfICB
b7WargRB3tqj9Rsczl6bozQ4x6djYZAB4I0dZffdenbuees7U4qsmrSi2Z/3jta1
zZz5yPM15ZaESGAKr0ECiHD45Lgf7+ZCqS0oMMVNIaCXQyDUInpaR28t9YLQeIg/
h8uPIJ+AONdgNs6XKZ96AGpOZHJG+pszSIMVM8bPYCsy56bXBDI8m2NVlKXp58WF
N4fimjv97MCDLxxua6SuwWVhy8FHlv+OW3zCAiGJ2mOHvoO2E/nLoeOQkC//03zn
MIayMXpCTYRM
=8JEE
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These fix an error code path issue leading to a NULL pointer
dereference, drop Kconfig dependencies that are not needed any more
after recent changes, add CPU IDs for new chips to a driver and fix up
the tmon utility.
Specifics:
- Fix NULL pointer dereference in the thermal sysfs interface that
results from an error code path mishandling (Rafael Wysocki).
- Drop COMPILE_TEST dependency that's not needed any more from two
thermal Kconfig entries (Jean Delvare).
- Make the Intel TCC cooling driver support Alder Lake-N and Raptor
Lake-P (Sumeet Pawnikar).
- Fix possible path truncations in the tmon utility (Florian
Fainelli)"
* tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/thermal: Fix possible path truncations
thermal: Drop obsolete dependency on COMPILE_TEST
thermal: sysfs: Fix cooling_device_stats_setup() error code path
thermal: intel: Add TCC cooling support for Alder Lake-N and Raptor Lake-P
Apparently, no existing selftest covers it. Add a new one where
we load cgroup/bind4 program and attach fentry to it. Calling
bpf_obj_get_info_by_fd on the fentry program should return non-zero
btf_id/btf_obj_id instead of crashing the kernel.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220804201140.1340684-2-sdf@google.com
This branch consists of:
Qu Wenruo:
lib: bitmap: fix the duplicated comments on bitmap_to_arr64()
https://lore.kernel.org/lkml/0d85e1dbad52ad7fb5787c4432bdb36cbd24f632.1656063005.git.wqu@suse.com/
Alexander Lobakin:
bitops: let optimize out non-atomic bitops on compile-time constants
https://lore.kernel.org/lkml/20220624121313.2382500-1-alexandr.lobakin@intel.com/T/
Yury Norov:
lib: cleanup bitmap-related headers
https://lore.kernel.org/linux-arm-kernel/YtCVeOGLiQ4gNPSf@yury-laptop/T/#m305522194c4d38edfdaffa71fcaaf2e2ca00a961
Alexander Lobakin:
x86/olpc: fix 'logical not is only applied to the left hand side'
https://www.spinics.net/lists/kernel/msg4440064.html
Yury Norov:
lib/nodemask: inline wrappers around bitmap
https://lore.kernel.org/all/20220723214537.2054208-1-yury.norov@gmail.com/
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmLpVvwACgkQsUSA/Tof
vsiAHgwAwS9pl8GJ+fKYnue2CYo9349d2oT6BBUs/Rv8uqYEa4QkpYsR7NS733TG
pos0hhoRvSOzrUP4qppXUjfJ+NkzLgpnKFOeWfFoNAKlHuaaMRvF3Y0Q/P8g0/Kg
HPWcCQLHyCH9Wjs3e2TTgRjxTrHuruD2VJ401/PX/lw0DicUhmev5mUFa10uwFkP
ZJRprjoFn9HJ0Hk16pFZDi36d3YumhACOcWRiJdoBDrEPV3S6lm9EeOy/yHBNp5k
9bKj+RboeT2t70KaZcKv+M5j1nu0cAhl7kRkjcxcmGyimI0l82Vgq9yFxhGqvWg8
RnCrJ5EaO08FGCAKG9GEwzdiNa24Gdq5XZSpQA7JZHmhmchpnnlNenJicyv0gOQi
abChZeWSEsyA+78l2+kk9nezfVKUOnKDEZQxBVTOyWsmZYxHZV94oam340VjQDaY
4/fETdOy/qqPIxnpxAeFGWxZjcVaYiYPLj7KLPMsB0aAAF7pZrem465vSfgbrE81
+gCdqrWd
=4dTW
-----END PGP SIGNATURE-----
Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linux
Pull bitmap updates from Yury Norov:
- fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo)
- optimize out non-atomic bitops on compile-time constants (Alexander
Lobakin)
- cleanup bitmap-related headers (Yury Norov)
- x86/olpc: fix 'logical not is only applied to the left hand side'
(Alexander Lobakin)
- lib/nodemask: inline wrappers around bitmap (Yury Norov)
* tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits)
lib/nodemask: inline next_node_in() and node_random()
powerpc: drop dependency on <asm/machdep.h> in archrandom.h
x86/olpc: fix 'logical not is only applied to the left hand side'
lib/cpumask: move some one-line wrappers to header file
headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure
headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>
headers/deps: mm: Optimize <linux/gfp.h> header dependencies
lib/cpumask: move trivial wrappers around find_bit to the header
lib/cpumask: change return types to unsigned where appropriate
cpumask: change return types to bool where appropriate
lib/bitmap: change type of bitmap_weight to unsigned long
lib/bitmap: change return types to bool where appropriate
arm: align find_bit declarations with generic kernel
iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
lib/test_bitmap: test the tail after bitmap_to_arr64()
lib/bitmap: fix off-by-one in bitmap_to_arr64()
lib: test_bitmap: add compile-time optimization/evaluations assertions
bitmap: don't assume compiler evaluates small mem*() builtins calls
net/ice: fix initializing the bitmap in the switch code
bitops: let optimize out non-atomic bitops on compile-time constants
...
fatfs, autofs, squashfs, procfs, etc.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYu9BeQAKCRDdBJ7gKXxA
jp1DAP4mjCSvAwYzXklrIt+Knv3CEY5oVVdS+pWOAOGiJpldTAD9E5/0NV+VmlD9
kwS/13j38guulSlXRzDLmitbg81zAAI=
=Zfum
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc updates from Andrew Morton:
"Updates to various subsystems which I help look after. lib, ocfs2,
fatfs, autofs, squashfs, procfs, etc. A relatively small amount of
material this time"
* tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
scripts/gdb: ensure the absolute path is generated on initial source
MAINTAINERS: kunit: add David Gow as a maintainer of KUnit
mailmap: add linux.dev alias for Brendan Higgins
mailmap: update Kirill's email
profile: setup_profiling_timer() is moslty not implemented
ocfs2: fix a typo in a comment
ocfs2: use the bitmap API to simplify code
ocfs2: remove some useless functions
lib/mpi: fix typo 'the the' in comment
proc: add some (hopefully) insightful comments
bdi: remove enum wb_congested_state
kernel/hung_task: fix address space of proc_dohung_task_timeout_secs
lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t()
squashfs: support reading fragments in readahead call
squashfs: implement readahead
squashfs: always build "file direct" version of page actor
Revert "squashfs: provide backing_dev_info in order to disable read-ahead"
fs/ocfs2: Fix spelling typo in comment
ia64: old_rr4 added under CONFIG_HUGETLB_PAGE
proc: fix test for "vsyscall=xonly" boot option
...
- Rework copy_oldmem_page() callback to take an iov_iter.
This includes few prerequisite updates and fixes to the
oldmem reading code.
- Rework cpufeature implementation to allow for various CPU feature
indications, which is not only limited to hardware capabilities,
but also allows CPU facilities.
- Use the cpufeature rework to autoload Ultravisor module when CPU
facility 158 is available.
- Add ELF note type for encrypted CPU state of a protected virtual CPU.
The zgetdump tool from s390-tools package will decrypt the CPU state
using a Customer Communication Key and overwrite respective notes to
make the data accessible for crash and other debugging tools.
- Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto test.
- Fix incorrect recovery of kretprobe modified return address in stacktrace.
- Switch the NMI handler to use generic irqentry_nmi_enter() and
irqentry_nmi_exit() helper functions.
- Rework the cryptographic Adjunct Processors (AP) pass-through design
to support dynamic changes to the AP matrix of a running guest as well
as to implement more of the AP architecture.
- Minor boot code cleanups.
- Grammar and typo fixes to hmcdrv and tape drivers.
-----BEGIN PGP SIGNATURE-----
iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCYu4dRBccYWdvcmRlZXZA
bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8DnlAP45Sk4cE35T+Z0vdHE2f0uMXE/p
uHNjS3fDZOQVFJ2jZwEA99xPF5qPCttbR/b1VHsMSb30684IT1A4PC7y05kgfAw=
=jCc3
-----END PGP SIGNATURE-----
Merge tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:
- Rework copy_oldmem_page() callback to take an iov_iter.
This includes a few prerequisite updates and fixes to the oldmem
reading code.
- Rework cpufeature implementation to allow for various CPU feature
indications, which is not only limited to hardware capabilities, but
also allows CPU facilities.
- Use the cpufeature rework to autoload Ultravisor module when CPU
facility 158 is available.
- Add ELF note type for encrypted CPU state of a protected virtual CPU.
The zgetdump tool from s390-tools package will decrypt the CPU state
using a Customer Communication Key and overwrite respective notes to
make the data accessible for crash and other debugging tools.
- Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto
test.
- Fix incorrect recovery of kretprobe modified return address in
stacktrace.
- Switch the NMI handler to use generic irqentry_nmi_enter() and
irqentry_nmi_exit() helper functions.
- Rework the cryptographic Adjunct Processors (AP) pass-through design
to support dynamic changes to the AP matrix of a running guest as
well as to implement more of the AP architecture.
- Minor boot code cleanups.
- Grammar and typo fixes to hmcdrv and tape drivers.
* tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits)
Revert "s390/smp: enforce lowcore protection on CPU restart"
Revert "s390/smp: rework absolute lowcore access"
Revert "s390/smp,ptdump: add absolute lowcore markers"
s390/unwind: fix fgraph return address recovery
s390/nmi: use irqentry_nmi_enter()/irqentry_nmi_exit()
s390: add ELF note type for encrypted CPU state of a PV VCPU
s390/smp,ptdump: add absolute lowcore markers
s390/smp: rework absolute lowcore access
s390/setup: rearrange absolute lowcore initialization
s390/boot: cleanup adjust_to_uv_max() function
s390/smp: enforce lowcore protection on CPU restart
s390/tape: fix comment typo
s390/hmcdrv: fix Kconfig "its" grammar
s390/docs: fix warnings for vfio_ap driver doc
s390/docs: fix warnings for vfio_ap driver lock usage doc
s390/crash: support multi-segment iterators
s390/crash: use static swap buffer for copy_to_user_real()
s390/crash: move copy_to_user_real() to crash_dump.c
s390/zcore: fix race when reading from hardware system area
s390/crash: fix incorrect number of bytes to copy to user space
...
- Add support for syscall stack randomization.
- Add support for atomic operations to the 32 & 64-bit BPF JIT.
- Full support for KASAN on 64-bit Book3E.
- Add a watchdog driver for the new PowerVM hypervisor watchdog.
- Add a number of new selftests for the Power10 PMU support.
- Add a driver for the PowerVM Platform KeyStore.
- Increase the NMI watchdog timeout during live partition migration, to avoid timeouts
due to increased memory access latency.
- Add support for using the 'linux,pci-domain' device tree property for PCI domain
assignment.
- Many other small features and fixes.
Thanks to: Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira Rajeev, Bagas
Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Greg Kurz,
Haowen Bai, Hari Bathini, Jason A. Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg
Haefliger, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch, Naveen N. Rao, Nayna
Jain, Nicholas Piggin, Ning Qiang, Pali Rohár, Petr Mladek, Rashmica Gupta, Sachin Sant,
Scott Cheloha, Segher Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, Zhouyi Zhou.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmLuAPgTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBPpD/9kY/T0qlOXABxlZCgtqeAjPX+2xpnY
BF+TlsN1TS1auFcEZL2BapmVacsvOeGEFDVuZHZvZJc69Hx+gSjnjFCnZjp6n+Yz
wt6y9w9Pu0t/sjD5vNQ46O15/dXqm6RoVI7um12j/WLMN8Ko5+x3gKAyQONjQd2/
1kPcxVH6FUosAdnCuvIcqCX4e4IIHl2ZkitHOTXoQUvUy9oAK/mOBnwqZ6zLGUKC
E5M+Zyt4RFGxhPs48FkX6Nq6crDGU/P0VJpDKkR/t7GHnE67Bm70gZougAPrzrgP
nx8zoTWgDKpqDeuqK7pFcyKgNS3dKbxsN3sAfKHOWu/YnV4wMyy+7fmwagMauki7
lXccKN6F/r+8JcMNx80Jp/dAw3ZdLceP38M3Ryf8IL6lTfkNySumUvrKJn6r1Cu1
wvzhgyEuDawss9KHdEmXcA2i3+XVZvitaipO7JWUC8pblrP1SJMoPfIIe9zh3y3M
pyZj0TcGJ8XaK+badvI+PW/K/KeRgXEY8HpC3wDHSoIkli3OE4jDwXn6TiZgvm3n
k0sKL8YSmQZ8hP8QAkR+r8NQKYqLlfyPxdslK5omDPxfub5Uzk9ZV2Ep7svkaiQn
Wqjq27Dpz8+w0XPjsQ0Tkv+ByTkOhrawOH7x9SpFLHpv9g5otcYmS79NkO/htx8C
6LyPNx1VYn5IRA==
=tRkm
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add support for syscall stack randomization
- Add support for atomic operations to the 32 & 64-bit BPF JIT
- Full support for KASAN on 64-bit Book3E
- Add a watchdog driver for the new PowerVM hypervisor watchdog
- Add a number of new selftests for the Power10 PMU support
- Add a driver for the PowerVM Platform KeyStore
- Increase the NMI watchdog timeout during live partition migration, to
avoid timeouts due to increased memory access latency
- Add support for using the 'linux,pci-domain' device tree property for
PCI domain assignment
- Many other small features and fixes
Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira
Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas,
Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A.
Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol
Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada,
Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch,
Naveen N. Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár,
Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher
Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu
Jianfeng, and Zhouyi Zhou.
* tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits)
powerpc/64e: Fix kexec build error
EDAC/ppc_4xx: Include required of_irq header directly
powerpc/pci: Fix PHB numbering when using opal-phbid
powerpc/64: Init jump labels before parse_early_param()
selftests/powerpc: Avoid GCC 12 uninitialised variable warning
powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address
powerpc/xive: Fix refcount leak in xive_get_max_prio
powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader
powerpc/perf: Include caps feature for power10 DD1 version
powerpc: add support for syscall stack randomization
powerpc: Move system_call_exception() to syscall.c
powerpc/powernv: rename remaining rng powernv_ functions to pnv_
powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
powerpc/powernv: Avoid crashing if rng is NULL
selftests/powerpc: Fix matrix multiply assist test
powerpc/signal: Update comment for clarity
powerpc: make facility_unavailable_exception 64s
powerpc/platforms/83xx/suspend: Remove write-only global variable
powerpc/platforms/83xx/suspend: Prevent unloading the driver
powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration
...
- Introduce 'perf lock contention' subtool, using new lock contention
tracepoints and using BPF for in kernel aggregation and then userspace
processing using the perf tooling infrastructure for resolving symbols, target
specification, etc.
Since the new lock contention tracepoints don't provide lock names, get up to
8 stack traces and display the first non-lock function symbol name as a caller:
$ perf lock report -F acquired,contended,avg_wait,wait_total
Name acquired contended avg wait total wait
update_blocked_a... 40 40 3.61 us 144.45 us
kernfs_fop_open+... 5 5 3.64 us 18.18 us
_nohz_idle_balance 3 3 2.65 us 7.95 us
tick_do_update_j... 1 1 6.04 us 6.04 us
ep_scan_ready_list 1 1 3.93 us 3.93 us
Supports the usual 'perf record' + 'perf report' workflow as well as a
BCC/bpftrace like mode where you start the tool and then press control+C to get
results:
$ sudo perf lock contention -b
^C
contended total wait max wait avg wait type caller
42 192.67 us 13.64 us 4.59 us spinlock queue_work_on+0x20
23 85.54 us 10.28 us 3.72 us spinlock worker_thread+0x14a
6 13.92 us 6.51 us 2.32 us mutex kernfs_iop_permission+0x30
3 11.59 us 10.04 us 3.86 us mutex kernfs_dop_revalidate+0x3c
1 7.52 us 7.52 us 7.52 us spinlock kthread+0x115
1 7.24 us 7.24 us 7.24 us rwlock:W sys_epoll_wait+0x148
2 7.08 us 3.99 us 3.54 us spinlock delayed_work_timer_fn+0x1b
1 6.41 us 6.41 us 6.41 us spinlock idle_balance+0xa06
2 2.50 us 1.83 us 1.25 us mutex kernfs_iop_lookup+0x2f
1 1.71 us 1.71 us 1.71 us mutex kernfs_iop_getattr+0x2c
...
- Add new 'perf kwork' tool to trace time properties of kernel work (such as
softirq, and workqueue), uses eBPF skeletons to collect info in kernel space,
aggregating data that then gets processed by the userspace tool, e.g.:
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
----------------------------------------------------------------------------------------------------
nvme0q5:130 | 004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s |
amdgpu:162 | 002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s |
nvme0q24:149 | 023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s |
nvme0q20:145 | 019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s |
nvme0q31:156 | 030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s |
nvme0q8:133 | 007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s |
nvme0q6:131 | 005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s |
nvme0q19:144 | 018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s |
nvme0q7:132 | 006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s |
nvme0q18:143 | 017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s |
nvme0q17:142 | 016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s |
enp5s0-rx-0:164 | 006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s |
enp5s0-tx-0:166 | 008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s |
--------------------------------------------------------------------------------------------------------
See commit log messages for more examples with extra options to limit the events time window, etc.
- Add support for new AMD IBS (Instruction Based Sampling) features:
With the DataSrc extensions, the source of data can be decoded among:
- Local L3 or other L1/L2 in CCX.
- A peer cache in a near CCX.
- Data returned from DRAM.
- A peer cache in a far CCX.
- DRAM address map with "long latency" bit set.
- Data returned from MMIO/Config/PCI/APIC.
- Extension Memory (S-Link, GenZ, etc - identified by the CS target
and/or address map at DF's choice).
- Peer Agent Memory.
- Support hardware tracing with Intel PT on guest machines, combining the
traces with the ones in the host machine.
- Add a "-m" option to 'perf buildid-list' to show kernel and modules
build-ids, to display all of the information needed to do external
symbolization of kernel stack traces, such as those collected by
bpf_get_stackid().
- Add arch TSC frequency information to perf.data file headers.
- Handle changes in the binutils disassembler function signatures in
perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).
- Fix building the perf perl binding with the newest gcc in distros such
as fedora rawhide, where some new warnings were breaking the build as
perf uses -Werror.
- Add 'perf test' entry for branch stack sampling.
- Add ARM SPE system wide 'perf test' entry.
- Add user space counter reading tests to 'perf test'.
- Build with python3 by default, if available.
- Add python converter script for the vendor JSON event files.
- Update vendor JSON files for alderlake, bonnell, broadwell, broadwellde,
broadwellx, cascadelakex, elkhartlake, goldmont, goldmontplus, haswell,
haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, knightslanding,
nehalemep, nehalemex, sandybridge, sapphirerapids, silvermont, skylake,
skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp and westmereex.
- Add vendor JSON File for Intel meteorlake.
- Add Arm Cortex-A78C and X1C JSON vendor event files.
- Add workaround to symbol address reading from ELF files without phdr,
falling back to the previoous equation.
- Convert legacy map definition to BTF-defined in the perf BPF script test.
- Rework prologue generation code to stop using libbpf deprecated APIs.
- Add default hybrid events for 'perf stat' on x86.
- Add topdown metrics in the default 'perf stat' on the hybrid machines (big/little cores).
- Prefer sampled CPU when exporting JSON in 'perf data convert'
- Fix ('perf stat CSV output linter') and ("Check branch stack sampling") 'perf test' entries on s390.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYuw6gwAKCRCyPKLppCJ+
J5+iAP0RL6sKMhzdkRjRYfG8CluJ401YaPHadzv5jxP8gOZz2gEAsuYDrMF9t1zB
4DqORfobdX9UQEJjP9oRltU73GM0swI=
=2/M0
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Introduce 'perf lock contention' subtool, using new lock contention
tracepoints and using BPF for in kernel aggregation and then
userspace processing using the perf tooling infrastructure for
resolving symbols, target specification, etc.
Since the new lock contention tracepoints don't provide lock names,
get up to 8 stack traces and display the first non-lock function
symbol name as a caller:
$ perf lock report -F acquired,contended,avg_wait,wait_total
Name acquired contended avg wait total wait
update_blocked_a... 40 40 3.61 us 144.45 us
kernfs_fop_open+... 5 5 3.64 us 18.18 us
_nohz_idle_balance 3 3 2.65 us 7.95 us
tick_do_update_j... 1 1 6.04 us 6.04 us
ep_scan_ready_list 1 1 3.93 us 3.93 us
Supports the usual 'perf record' + 'perf report' workflow as well as
a BCC/bpftrace like mode where you start the tool and then press
control+C to get results:
$ sudo perf lock contention -b
^C
contended total wait max wait avg wait type caller
42 192.67 us 13.64 us 4.59 us spinlock queue_work_on+0x20
23 85.54 us 10.28 us 3.72 us spinlock worker_thread+0x14a
6 13.92 us 6.51 us 2.32 us mutex kernfs_iop_permission+0x30
3 11.59 us 10.04 us 3.86 us mutex kernfs_dop_revalidate+0x3c
1 7.52 us 7.52 us 7.52 us spinlock kthread+0x115
1 7.24 us 7.24 us 7.24 us rwlock:W sys_epoll_wait+0x148
2 7.08 us 3.99 us 3.54 us spinlock delayed_work_timer_fn+0x1b
1 6.41 us 6.41 us 6.41 us spinlock idle_balance+0xa06
2 2.50 us 1.83 us 1.25 us mutex kernfs_iop_lookup+0x2f
1 1.71 us 1.71 us 1.71 us mutex kernfs_iop_getattr+0x2c
...
- Add new 'perf kwork' tool to trace time properties of kernel work
(such as softirq, and workqueue), uses eBPF skeletons to collect info
in kernel space, aggregating data that then gets processed by the
userspace tool, e.g.:
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
----------------------------------------------------------------------------------------------------
nvme0q5:130 | 004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s |
amdgpu:162 | 002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s |
nvme0q24:149 | 023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s |
nvme0q20:145 | 019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s |
nvme0q31:156 | 030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s |
nvme0q8:133 | 007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s |
nvme0q6:131 | 005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s |
nvme0q19:144 | 018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s |
nvme0q7:132 | 006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s |
nvme0q18:143 | 017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s |
nvme0q17:142 | 016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s |
enp5s0-rx-0:164 | 006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s |
enp5s0-tx-0:166 | 008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s |
--------------------------------------------------------------------------------------------------------
See commit log messages for more examples with extra options to limit
the events time window, etc.
- Add support for new AMD IBS (Instruction Based Sampling) features:
With the DataSrc extensions, the source of data can be decoded among:
- Local L3 or other L1/L2 in CCX.
- A peer cache in a near CCX.
- Data returned from DRAM.
- A peer cache in a far CCX.
- DRAM address map with "long latency" bit set.
- Data returned from MMIO/Config/PCI/APIC.
- Extension Memory (S-Link, GenZ, etc - identified by the CS target
and/or address map at DF's choice).
- Peer Agent Memory.
- Support hardware tracing with Intel PT on guest machines, combining
the traces with the ones in the host machine.
- Add a "-m" option to 'perf buildid-list' to show kernel and modules
build-ids, to display all of the information needed to do external
symbolization of kernel stack traces, such as those collected by
bpf_get_stackid().
- Add arch TSC frequency information to perf.data file headers.
- Handle changes in the binutils disassembler function signatures in
perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).
- Fix building the perf perl binding with the newest gcc in distros
such as fedora rawhide, where some new warnings were breaking the
build as perf uses -Werror.
- Add 'perf test' entry for branch stack sampling.
- Add ARM SPE system wide 'perf test' entry.
- Add user space counter reading tests to 'perf test'.
- Build with python3 by default, if available.
- Add python converter script for the vendor JSON event files.
- Update vendor JSON files for most Intel cores.
- Add vendor JSON File for Intel meteorlake.
- Add Arm Cortex-A78C and X1C JSON vendor event files.
- Add workaround to symbol address reading from ELF files without phdr,
falling back to the previoous equation.
- Convert legacy map definition to BTF-defined in the perf BPF script
test.
- Rework prologue generation code to stop using libbpf deprecated APIs.
- Add default hybrid events for 'perf stat' on x86.
- Add topdown metrics in the default 'perf stat' on the hybrid machines
(big/little cores).
- Prefer sampled CPU when exporting JSON in 'perf data convert'
- Fix ('perf stat CSV output linter') and ("Check branch stack
sampling") 'perf test' entries on s390.
* tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
perf stat: Refactor __run_perf_stat() common code
perf lock: Print the number of lost entries for BPF
perf lock: Add --map-nr-entries option
perf lock: Introduce struct lock_contention
perf scripting python: Do not build fail on deprecation warnings
genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
perf build: Suppress openssl v3 deprecation warnings in libcrypto feature test
perf parse-events: Break out tracepoint and printing
perf parse-events: Don't #define YY_EXTRA_TYPE
tools bpftool: Don't display disassembler-four-args feature test
tools bpftool: Fix compilation error with new binutils
tools bpf_jit_disasm: Don't display disassembler-four-args feature test
tools bpf_jit_disasm: Fix compilation error with new binutils
tools perf: Fix compilation error with new binutils
tools include: add dis-asm-compat.h to handle version differences
tools build: Don't display disassembler-four-args feature test
tools build: Add feature test for init_disassemble_info API changes
perf test: Add ARM SPE system wide test
perf tools: Rework prologue generation code
perf bpf: Convert legacy map definition to BTF-defined
...
Enable/disable tracing infrastructure while packets are in-flight.
This triggers KASAN splat after
e34b9ed96c ("netfilter: nf_tables: avoid skb access on nf_stolen").
While at it, reduce script run time as well.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
- Some kmemleak fixes from Patrick Wang and Waiman Long
- DAMON updates from SeongJae Park
- memcg debug/visibility work from Roman Gushchin
- vmalloc speedup from Uladzislau Rezki
- more folio conversion work from Matthew Wilcox
- enhancements for coherent device memory mapping from Alex Sierra
- addition of shared pages tracking and CoW support for fsdax, from
Shiyang Ruan
- hugetlb optimizations from Mike Kravetz
- Mel Gorman has contributed some pagealloc changes to improve latency
and realtime behaviour.
- mprotect soft-dirty checking has been improved by Peter Xu
- Many other singleton patches all over the place
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYuravgAKCRDdBJ7gKXxA
jpqSAQDrXSdII+ht9kSHlaCVYjqRFQz/rRvURQrWQV74f6aeiAD+NHHeDPwZn11/
SPktqEUrF1pxnGQxqLh1kUFUhsVZQgE=
=w/UH
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Most of the MM queue. A few things are still pending.
Liam's maple tree rework didn't make it. This has resulted in a few
other minor patch series being held over for next time.
Multi-gen LRU still isn't merged as we were waiting for mapletree to
stabilize. The current plan is to merge MGLRU into -mm soon and to
later reintroduce mapletree, with a view to hopefully getting both
into 6.1-rc1.
Summary:
- The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe
Lin, Yang Shi, Anshuman Khandual and Mike Rapoport
- Some kmemleak fixes from Patrick Wang and Waiman Long
- DAMON updates from SeongJae Park
- memcg debug/visibility work from Roman Gushchin
- vmalloc speedup from Uladzislau Rezki
- more folio conversion work from Matthew Wilcox
- enhancements for coherent device memory mapping from Alex Sierra
- addition of shared pages tracking and CoW support for fsdax, from
Shiyang Ruan
- hugetlb optimizations from Mike Kravetz
- Mel Gorman has contributed some pagealloc changes to improve
latency and realtime behaviour.
- mprotect soft-dirty checking has been improved by Peter Xu
- Many other singleton patches all over the place"
[ XFS merge from hell as per Darrick Wong in
https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ]
* tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits)
tools/testing/selftests/vm/hmm-tests.c: fix build
mm: Kconfig: fix typo
mm: memory-failure: convert to pr_fmt()
mm: use is_zone_movable_page() helper
hugetlbfs: fix inaccurate comment in hugetlbfs_statfs()
hugetlbfs: cleanup some comments in inode.c
hugetlbfs: remove unneeded header file
hugetlbfs: remove unneeded hugetlbfs_ops forward declaration
hugetlbfs: use helper macro SZ_1{K,M}
mm: cleanup is_highmem()
mm/hmm: add a test for cross device private faults
selftests: add soft-dirty into run_vmtests.sh
selftests: soft-dirty: add test for mprotect
mm/mprotect: fix soft-dirty check in can_change_pte_writable()
mm: memcontrol: fix potential oom_lock recursion deadlock
mm/gup.c: fix formatting in check_and_migrate_movable_page()
xfs: fail dax mount if reflink is enabled on a partition
mm/memcontrol.c: remove the redundant updating of stats_flush_threshold
userfaultfd: don't fail on unrecognized features
hugetlb_cgroup: fix wrong hugetlb cgroup numa stat
...
dynamic. For instance, enclaves can now change enclave page
permissions on the fly.
- Removal of an unused structure member
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmLq2M8ACgkQaDWVMHDJ
krCbAw/+J4nHXxZNMQQX1c8CYJ7XHIr+YtsqNFYwH58rJJstHO/YwQf+mesVOeeu
08BYn+T5cdAbShKcxdkowPB17S6w/WzACtUfVhaoRQC7Md40cBiyc45UiC2e1u9g
W3Osk5+fTVcSYA9WiizPntIQkjVs9e7hcNKjTyVPnSw8W8mFCLg+ZiPb7YvKERTO
o8Wi2+zzX1BGDNOyBEqvnstz9uXDbCbFUTYX6zToBUk+Y1ZPXHwuHgNTtrAqGYaL
qyi0O2zoWnfOUmplzjJ/1aPmzPJDPgDNImC+gjTpYXGmg05Ryds+VZAc64IIjqYn
K+/5674PZFdsp5/YfctubdsQm0l0xen94sccAacd7KfsVurcHs3E2bdQPDw0htxv
svCX0Sai/qv52tPNzw+n9EJRcQsiwd9Pn0rWwx2i8hQcgMFiwCus6DBKhU7uh2Jp
oTwlspqJy2NHu9bici78tmsOio9CORjrh1WOfWX+yHEux4dtQAl889Gw5qzId6V1
Bh1MgoAu/pQ78feo96f3h5yOultOtpbTGyXEC8t4MTSpIVgZ2NzfUxe4RhOCBnhA
kdftVNfZLGOzwBbgFy0gYTe/ukt1DkP4BNHQilf2I+bUP/kZFlN8wfxBipWzr0bs
Skrz4+brBIaTdGoFgzhgt3g5YH16DSasmy/HCkIeV7eaAHFRLoE=
=Y7YA
-----END PGP SIGNATURE-----
Merge tag 'x86_sgx_for_v6.0-2022-08-03.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Dave Hansen:
"A set of x86/sgx changes focused on implementing the "SGX2" features,
plus a minor cleanup:
- SGX2 ISA support which makes enclave memory management much more
dynamic. For instance, enclaves can now change enclave page
permissions on the fly.
- Removal of an unused structure member"
* tag 'x86_sgx_for_v6.0-2022-08-03.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
x86/sgx: Drop 'page_index' from sgx_backing
selftests/sgx: Page removal stress test
selftests/sgx: Test reclaiming of untouched page
selftests/sgx: Test invalid access to removed enclave page
selftests/sgx: Test faulty enclave behavior
selftests/sgx: Test complete changing of page type flow
selftests/sgx: Introduce TCS initialization enclave operation
selftests/sgx: Introduce dynamic entry point
selftests/sgx: Test two different SGX2 EAUG flows
selftests/sgx: Add test for TCS page permission changes
selftests/sgx: Add test for EPCM permission changes
Documentation/x86: Introduce enclave runtime management section
x86/sgx: Free up EPC pages directly to support large page ranges
x86/sgx: Support complete page removal
x86/sgx: Support modifying SGX page type
x86/sgx: Tighten accessible memory range after enclave initialization
x86/sgx: Support adding of pages to an initialized enclave
x86/sgx: Support restricting of enclave page permissions
x86/sgx: Support VA page allocation without reclaiming
x86/sgx: Export sgx_encl_page_alloc()
...
There are three independent sets of changes:
- Sai Prakash Ranjan adds tracing support to the asm-generic
version of the MMIO accessors, which is intended to help
understand problems with device drivers and has been part
of Qualcomm's vendor kernels for many years.
- A patch from Sebastian Siewior to rework the handling of
IRQ stacks in softirqs across architectures, which is
needed for enabling PREEMPT_RT.
- The last patch to remove the CONFIG_VIRT_TO_BUS option and
some of the code behind that, after the last users of this
old interface made it in through the netdev, scsi, media and
staging trees.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLqPPEACgkQmmx57+YA
GNlUbQ/+NpIsiA0JUrCGtySt8KrLHdA2dH9lJOR5/iuxfphscPFfWtpcPvcXQWmt
a8u7wyI8SHW1ku4U0Y5sO0dBSldDnoIqJ5t4X5d7YNU9yVtEtucqQhZf+GkrPlVD
1HkRu05B7y0k2BMn7BLhSvkpafs3f1lNGXjs8oFBdOF1/zwp/GjcrfCK7KFzqjwU
dYrX0SOFlKFd4BZC75VfK+XcKg4LtwIOmJraRRl7alz2Q5Oop2hgjgZxXDPf//vn
SPOhXJN/97i1FUpY2TkfHVH1NxbPfjCV4pUnjmLG0Y4NSy9UQ/ZcXHcywIdeuhfa
0LySOIsAqBeccpYYYdg2ubiMDZOXkBfANu/sB9o/EhoHfB4svrbPRDhBIQZMFXJr
MJYu+IYce2rvydA/nydo4q++pxR8v1ES1ZIo8bDux+q1CI/zbpQV+f98kPVRA0M7
ajc+5GTIqNIsvHzzadq7eYxcj5Bi8Li2JA9sVkAQ+6iq1TVyeYayMc9eYwONlmqw
MD+PFYc651pKtXZCfkLXPIKSwS0uPqBndAibuVhpZ0hxWaCBBdKvY9mrWcPxt0kA
tMR8lrosbbrV2K48BFdWTOHvCs2FhHQxPGVPZ/iWuxTA0hHZ9tUlaEkSX+VM57IU
KCYQLdWzT8J9vrgqSbgYKlb6pSPz6FIjTfut6NZMmshIbavHV/Q=
=aTR0
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"There are three independent sets of changes:
- Sai Prakash Ranjan adds tracing support to the asm-generic version
of the MMIO accessors, which is intended to help understand
problems with device drivers and has been part of Qualcomm's vendor
kernels for many years
- A patch from Sebastian Siewior to rework the handling of IRQ stacks
in softirqs across architectures, which is needed for enabling
PREEMPT_RT
- The last patch to remove the CONFIG_VIRT_TO_BUS option and some of
the code behind that, after the last users of this old interface
made it in through the netdev, scsi, media and staging trees"
* tag 'asm-generic-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
uapi: asm-generic: fcntl: Fix typo 'the the' in comment
arch/*/: remove CONFIG_VIRT_TO_BUS
soc: qcom: geni: Disable MMIO tracing for GENI SE
serial: qcom_geni_serial: Disable MMIO tracing for geni serial
asm-generic/io: Add logging support for MMIO accessors
KVM: arm64: Add a flag to disable MMIO trace for nVHE KVM
lib: Add register read/write tracing support
drm/meson: Fix overflow implicit truncation warnings
irqchip/tegra: Fix overflow implicit truncation warnings
coresight: etm4x: Use asm-generic IO memory barriers
arm64: io: Use asm-generic high level MMIO accessors
arch/*: Disable softirq stacks on PREEMPT_RT.
- Runtime verification infrastructure
This is the biggest change for this pull request. It introduces the
runtime verification that is necessary for running Linux on safety
critical systems. It allows for deterministic automata models to be
inserted into the kernel that will attach to tracepoints, where the
information on these tracepoints will move the model from state to state.
If a state is encountered that does not belong to the model, it will then
activate a given reactor, that could just inform the user or even panic
the kernel (for which safety critical systems will detect and can recover
from).
- Two monitor models are also added: Wakeup In Preemptive (WIP - not to be
confused with "work in progress"), and Wakeup While Not Running (WWNR).
- Added __vstring() helper to the TRACE_EVENT() macro to replace several
vsnprintf() usages that were all doing it wrong.
- eprobes now can have their event autogenerated when the event name is left
off.
- The rest is various cleanups and fixes.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYu0yzRQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qj4HAP4tQtV55rjj4DQ5XIXmtI3/64PmyRSJ
+y4DEXi1UvEUCQD/QAuQfWoT/7gh35ltkfeS4t3ockzy14rrkP5drZigiQA=
=kEtM
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- Runtime verification infrastructure
This is the biggest change here. It introduces the runtime
verification that is necessary for running Linux on safety critical
systems.
It allows for deterministic automata models to be inserted into the
kernel that will attach to tracepoints, where the information on
these tracepoints will move the model from state to state.
If a state is encountered that does not belong to the model, it will
then activate a given reactor, that could just inform the user or
even panic the kernel (for which safety critical systems will detect
and can recover from).
- Two monitor models are also added: Wakeup In Preemptive (WIP - not to
be confused with "work in progress"), and Wakeup While Not Running
(WWNR).
- Added __vstring() helper to the TRACE_EVENT() macro to replace
several vsnprintf() usages that were all doing it wrong.
- eprobes now can have their event autogenerated when the event name is
left off.
- The rest is various cleanups and fixes.
* tag 'trace-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (50 commits)
rv: Unlock on error path in rv_unregister_reactor()
tracing: Use alignof__(struct {type b;}) instead of offsetof()
tracing/eprobe: Show syntax error logs in error_log file
scripts/tracing: Fix typo 'the the' in comment
tracepoints: It is CONFIG_TRACEPOINTS not CONFIG_TRACEPOINT
tracing: Use free_trace_buffer() in allocate_trace_buffers()
tracing: Use a struct alignof to determine trace event field alignment
rv/reactor: Add the panic reactor
rv/reactor: Add the printk reactor
rv/monitor: Add the wwnr monitor
rv/monitor: Add the wip monitor
rv/monitor: Add the wip monitor skeleton created by dot2k
Documentation/rv: Add deterministic automata instrumentation documentation
Documentation/rv: Add deterministic automata monitor synthesis documentation
tools/rv: Add dot2k
Documentation/rv: Add deterministic automaton documentation
tools/rv: Add dot2c
Documentation/rv: Add a basic documentation
rv/include: Add instrumentation helper functions
rv/include: Add deterministic automata monitor definition via C macros
...
Changes to RTLA:
- Fix a double free
- Define syscall numbers for RISCV
- Fix Makefile when called from -C tools
- Use calloc() to check for memory allocation failures
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYuqJ+RQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qiicAQDXrtf1iZrtDMscgqNEUB5pbRmbL387
R8aWnoexYJ1+KAD/Q7cFFUIDLjsKNZiLT1AgfCPY6c0llKL7kaYQYBWbBwY=
=3bf4
-----END PGP SIGNATURE-----
Merge tag 'trace-rtla-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull real time analysis tool (rtla) updates from Steven Rostedt:
- Fix a double free
- Define syscall numbers for RISCV
- Fix Makefile when called from -C tools
- Use calloc() to check for memory allocation failures
* tag 'trace-rtla-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
rtla: Define syscall numbers for riscv
rtla: Fix double free
rtla: Fix Makefile when called from -C tools/
rtla/utils: Use calloc and check the potential memory allocation failure
Few test cases related to the fix for 924a9bc362a5:
"net: check if protocol extracted by virtio_net_hdr_set_proto is correct"
Need test for the case when a non-standard packet (GSO without NEEDS_CSUM)
sent to the tap device causes a BUG check in the tap driver.
Signed-off-by: Cezar Bulinaru <cbulinaru@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the selftest got added, sendfile() on mptcp sockets returned
-EOPNOTSUPP, so running 'mptcp_connect.sh -m sendfile' failed
immediately.
This is no longer the case, but the script fails anyway due to timeout.
Let the receiver know once the sender has sent all data, just like
with '-m mmap' mode.
v2: need to respect cfg_wait too, as pm_userspace.sh relied
on -m sendfile to keep the connection open (Mat Martineau)
Fixes: 048d19d444 ("mptcp: add basic kselftest for mptcp")
Reported-by: Xiumei Mu <xmu@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Highlights:
- Microsoft Surface:
- SSAM hot unplug support
- Surface Pro 8 keyboard cover support
- Tablet mode switch support for Surface Pro 8 and Surface Laptop Studio
- thinkpad_acpi: AMD Automatice Mode Transitions (AMT) support
- Mellanox:
- Vulcan chassis COMe NVSwitch management support
- XH3000 support
- New generic/shared Intel P2SB (Primary to Sideband) support
- Lots of small cleanups
- Various small bugfixes
- Various new hardware ids / quirks additions
The following is an automated git shortlog grouped by driver:
ACPI:
- video: Fix acpi_video_handles_brightness_key_presses()
- video: Change how we determine if brightness key-presses are handled
Documentation/ABI:
- Add new attributes for mlxreg-io sysfs interfaces
- mlxreg-io: Fix contact info
Drop the PMC_ATOM Kconfig option:
- Drop the PMC_ATOM Kconfig option
EDAC, pnd2:
- convert to use common P2SB accessor
- Use proper I/O accessors and address space annotation
HID:
- surface-hid: Add support for hot-removal
ISST:
- PUNIT device mapping with Sub-NUMA clustering
Kconfig:
- Remove unnecessary "if X86"
MAINTAINERS:
- repair file entry in MICROSOFT SURFACE AGGREGATOR TABLET-MODE SWITCH
Merge tag 'ib-mfd-edac-i2c-leds-pinctrl-platform-watchdog-v5.20' into review-hans:
- Merge tag 'ib-mfd-edac-i2c-leds-pinctrl-platform-watchdog-v5.20' into review-hans
Move AMD platform drivers to separate directory:
- Move AMD platform drivers to separate directory
acer-wmi:
- Use backlight helper
acer_wmi:
- Cleanup Kconfig selects
apple-gmux:
- Use backlight helper
asus-wmi:
- Add mic-mute LED classdev support
- Add key mappings
compal-laptop:
- Use backlight helper
efi:
- Fix efi_power_off() not being run before acpi_power_off() when necessary
gigabyte-wmi:
- add support for B660I AORUS PRO DDR4
hp-wmi:
- Ignore Sanitization Mode event
i2c:
- i801: convert to use common P2SB accessor
ideapad-laptop:
- Add Ideapad 5 15ITL05 to ideapad_dytc_v4_allow_table[]
- Add allow_v4_dytc module parameter
intel/pmc:
- Add Alder Lake N support to PMC core driver
intel_atomisp2_led:
- Also turn off the always-on camera LED on the Asus T100TAF
leds:
- simatic-ipc-leds-gpio: Add GPIO version of Siemens driver
- simatic-ipc-leds: Convert to use P2SB accessor
mfd:
- lpc_ich: Add support for pinctrl in non-ACPI system
- lpc_ich: Switch to generic p2sb_bar()
- lpc_ich: Factor out lpc_ich_enable_spi_write()
mlx-platform:
- Add COME board revision register
- Add support for new system XH3000
- Introduce support for COMe NVSwitch management module for Vulcan chassis
- Add support for systems equipped with two ASICs
- Add cosmetic changes for alignment
- Make activation of some drivers conditional
p2sb:
- Move out of X86_PLATFORM_DEVICES dependency
panasonic-laptop:
- Use acpi_video_get_backlight_type()
- filter out duplicate volume up/down/mute keypresses
- don't report duplicate brightness key-presses
- revert "Resolve hotkey double trigger bug"
- sort includes alphabetically
- de-obfuscate button codes
pinctrl:
- intel: Check against matching data instead of ACPI companion
platform/mellanox:
- mlxreg-lc: Fix error flow and extend verbosity
- mlxreg-io: Add locking for io operations
- nvsw-sn2201: fix error code in nvsw_sn2201_create_static_devices()
platform/olpc:
- Fix uninitialized data in debugfs write
platform/surface:
- gpe: Add support for 13" Intel version of Surface Laptop 4
- tabletsw: Fix __le32 integer access
- Update copyright year of various drivers
- aggregator: Move subsystem hub drivers to their own module
- aggregator: Move device registry helper functions to core module
- aggregator_registry: Add support for tablet mode switch on Surface Laptop Studio
- aggregator_registry: Add support for tablet mode switch on Surface Pro 8
- Add KIP/POS tablet-mode switch driver
- aggregator: Add helper macros for requests with argument and return value
- aggregator: Reserve more event- and target-categories
- avoid flush_scheduled_work() usage
- aggregator_registry: Add support for keyboard cover on Surface Pro 8
- aggregator_registry: Add KIP device hub
- aggregator_registry: Change device ID for base hub
- aggregator_registry: Generify subsystem hub functionality
- aggregator: Add comment for KIP subsystem category
- aggregator_registry: Use client device wrappers for notifier registration
- aggregator: Allow notifiers to avoid communication on unregistering
- aggregator: Allow devices to be marked as hot-removed
- aggregator: Allow is_ssam_device() to be used when CONFIG_SURFACE_AGGREGATOR_BUS is disabled
platform/x86/amd/pmc:
- Add new platform support
- Add new acpi id for PMC controller
platform/x86/dell:
- Kconfig: Remove unnecessary "depends on X86_PLATFORM_DEVICES"
platform/x86/intel:
- Add Primary to Sideband (P2SB) bridge support
platform/x86/intel/ifs:
- Mark as BROKEN
platform/x86/intel/pmt:
- telemetry: Fix fixed region handling
platform/x86/intel/vsec:
- Fix wrong type for local status variables
- Add PCI error recovery support to Intel PMT
- Add support for Raptor Lake
- Rework early hardware code
pmc_atom:
- Fix comment typo
- Match all Lex BayTrail boards with critclk_systems DMI table
power/supply:
- surface_battery: Use client device wrappers for notifier registration
- surface_charger: Use client device wrappers for notifier registration
serial-multi-instantiate:
- Sort ACPI IDs by HID
- Get rid of redundant 'else'
- Use while (i--) pattern to clean up
- Improve dev_err_probe() messaging
- Drop duplicate check
- Improve autodetection
simatic-ipc:
- drop custom P2SB bar code
sony-laptop:
- Remove useless comparisons in sony_pic_read_possible_resource()
system76_acpi:
- Use dev_get_drvdata
thinkpad_acpi:
- Enable AMT by default on supported systems
- Add support for hotkey 0x131a
- Add support for automatic mode transitions
- profile capabilities as integer
- do not use PSC mode on Intel platforms
- Fix a memory leak of EFCH MMIO resource
- Replace custom str_on_off() etc
- Sort headers for better maintenance
- Use backlight helper
tools/power/x86/intel-speed-select:
- Remove unneeded semicolon
- Fix off by one check
watchdog:
- simatic-ipc-wdt: convert to use P2SB accessor
x86-android-tablets:
- Fix Lenovo Yoga Tablet 2 830/1050 poweroff again
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmLrndEUHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9zeMgf7BjSCz6ZA8SSY1i8QHDTvdjySihHJ
j07Gn3j1T/5G00R/r6viMDE4PxcYvMAPXjq3azepKQd8H5kGfE323SA6fgWFPAvi
P2OvEfvWfI5S8FYGYPBkNP2MjQ5MFe7qzLEh3+wQH0ocJ7WRCi457B4Xvtd2gWI3
dHj5gMSWC3O5xNa2S4Mg3dnD9uJlwhX+FNjWIuRy8eh5+DikgByyC4B+uW6WtO5e
t0rmIm6q5wUzB7dIetJLoAQwrcpYAOkK7L33G9h/7knWAfiJfklaKTbftnoxbDSv
iGWODkLDyob4C48DmVusS6WMEhPUzl/R33+tk6LjVt/YOYOP030EMtCECQ==
=Krhc
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Hans de Goede:
- Microsoft Surface:
- SSAM hot unplug support
- Surface Pro 8 keyboard cover support
- Tablet mode switch support for Surface Pro 8 and Surface Laptop
Studio
- thinkpad_acpi:
- AMD Automatice Mode Transitions (AMT) support
- Mellanox:
- Vulcan chassis COMe NVSwitch management support
- XH3000 support
- New generic/shared Intel P2SB (Primary to Sideband) support
- Lots of small cleanups
- Various small bugfixes
- Various new hardware ids / quirks additions
* tag 'platform-drivers-x86-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (105 commits)
platform/x86/intel/vsec: Fix wrong type for local status variables
platform/x86: p2sb: Move out of X86_PLATFORM_DEVICES dependency
platform/x86: pmc_atom: Fix comment typo
platform/surface: gpe: Add support for 13" Intel version of Surface Laptop 4
platform/olpc: Fix uninitialized data in debugfs write
platform/mellanox: mlxreg-lc: Fix error flow and extend verbosity
platform/x86: pmc_atom: Match all Lex BayTrail boards with critclk_systems DMI table
platform/x86: sony-laptop: Remove useless comparisons in sony_pic_read_possible_resource()
tools/power/x86/intel-speed-select: Remove unneeded semicolon
tools/power/x86/intel-speed-select: Fix off by one check
platform/surface: tabletsw: Fix __le32 integer access
Documentation/ABI: Add new attributes for mlxreg-io sysfs interfaces
Documentation/ABI: mlxreg-io: Fix contact info
platform/mellanox: mlxreg-io: Add locking for io operations
platform/x86: mlx-platform: Add COME board revision register
platform/x86: mlx-platform: Add support for new system XH3000
platform/x86: mlx-platform: Introduce support for COMe NVSwitch management module for Vulcan chassis
platform/x86: mlx-platform: Add support for systems equipped with two ASICs
platform/x86: mlx-platform: Add cosmetic changes for alignment
platform/x86: mlx-platform: Make activation of some drivers conditional
...
* Unwinder implementations for both nVHE modes (classic and
protected), complete with an overflow stack
* Rework of the sysreg access from userspace, with a complete
rewrite of the vgic-v3 view to allign with the rest of the
infrastructure
* Disagregation of the vcpu flags in separate sets to better track
their use model.
* A fix for the GICv2-on-v3 selftest
* A small set of cosmetic fixes
RISC-V:
* Track ISA extensions used by Guest using bitmap
* Added system instruction emulation framework
* Added CSR emulation framework
* Added gfp_custom flag in struct kvm_mmu_memory_cache
* Added G-stage ioremap() and iounmap() functions
* Added support for Svpbmt inside Guest
s390:
* add an interface to provide a hypervisor dump for secure guests
* improve selftests to use TAP interface
* enable interpretive execution of zPCI instructions (for PCI passthrough)
* First part of deferred teardown
* CPU Topology
* PV attestation
* Minor fixes
x86:
* Permit guests to ignore single-bit ECC errors
* Intel IPI virtualization
* Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS
* PEBS virtualization
* Simplify PMU emulation by just using PERF_TYPE_RAW events
* More accurate event reinjection on SVM (avoid retrying instructions)
* Allow getting/setting the state of the speaker port data bit
* Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent
* "Notify" VM exit (detect microarchitectural hangs) for Intel
* Use try_cmpxchg64 instead of cmpxchg64
* Ignore benign host accesses to PMU MSRs when PMU is disabled
* Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
* Allow NX huge page mitigation to be disabled on a per-vm basis
* Port eager page splitting to shadow MMU as well
* Enable CMCI capability by default and handle injected UCNA errors
* Expose pid of vcpu threads in debugfs
* x2AVIC support for AMD
* cleanup PIO emulation
* Fixes for LLDT/LTR emulation
* Don't require refcounted "struct page" to create huge SPTEs
* Miscellaneous cleanups:
** MCE MSR emulation
** Use separate namespaces for guest PTEs and shadow PTEs bitmasks
** PIO emulation
** Reorganize rmap API, mostly around rmap destruction
** Do not workaround very old KVM bugs for L0 that runs with nesting enabled
** new selftests API for CPUID
Generic:
* Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache
* new selftests API using struct kvm_vcpu instead of a (vm, id) tuple
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmLnyo4UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMtQQf/XjVWiRcWLPR9dqzRM/vvRXpiG+UL
jU93R7m6ma99aqTtrxV/AE+kHgamBlma3Cwo+AcWk9uCVNbIhFjv2YKg6HptKU0e
oJT3zRYp+XIjEo7Kfw+TwroZbTlG6gN83l1oBLFMqiFmHsMLnXSI2mm8MXyi3dNB
vR2uIcTAl58KIprqNNsYJ2dNn74ogOMiXYx9XzoA9/5Xb6c0h4rreHJa5t+0s9RO
Gz7Io3PxumgsbJngjyL1Ve5oxhlIAcZA8DU0PQmjxo3eS+k6BcmavGFd45gNL5zg
iLpCh4k86spmzh8CWkAAwWPQE4dZknK6jTctJc0OFVad3Z7+X7n0E8TFrA==
=PM8o
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"Quite a large pull request due to a selftest API overhaul and some
patches that had come in too late for 5.19.
ARM:
- Unwinder implementations for both nVHE modes (classic and
protected), complete with an overflow stack
- Rework of the sysreg access from userspace, with a complete rewrite
of the vgic-v3 view to allign with the rest of the infrastructure
- Disagregation of the vcpu flags in separate sets to better track
their use model.
- A fix for the GICv2-on-v3 selftest
- A small set of cosmetic fixes
RISC-V:
- Track ISA extensions used by Guest using bitmap
- Added system instruction emulation framework
- Added CSR emulation framework
- Added gfp_custom flag in struct kvm_mmu_memory_cache
- Added G-stage ioremap() and iounmap() functions
- Added support for Svpbmt inside Guest
s390:
- add an interface to provide a hypervisor dump for secure guests
- improve selftests to use TAP interface
- enable interpretive execution of zPCI instructions (for PCI
passthrough)
- First part of deferred teardown
- CPU Topology
- PV attestation
- Minor fixes
x86:
- Permit guests to ignore single-bit ECC errors
- Intel IPI virtualization
- Allow getting/setting pending triple fault with
KVM_GET/SET_VCPU_EVENTS
- PEBS virtualization
- Simplify PMU emulation by just using PERF_TYPE_RAW events
- More accurate event reinjection on SVM (avoid retrying
instructions)
- Allow getting/setting the state of the speaker port data bit
- Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls
are inconsistent
- "Notify" VM exit (detect microarchitectural hangs) for Intel
- Use try_cmpxchg64 instead of cmpxchg64
- Ignore benign host accesses to PMU MSRs when PMU is disabled
- Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
- Allow NX huge page mitigation to be disabled on a per-vm basis
- Port eager page splitting to shadow MMU as well
- Enable CMCI capability by default and handle injected UCNA errors
- Expose pid of vcpu threads in debugfs
- x2AVIC support for AMD
- cleanup PIO emulation
- Fixes for LLDT/LTR emulation
- Don't require refcounted "struct page" to create huge SPTEs
- Miscellaneous cleanups:
- MCE MSR emulation
- Use separate namespaces for guest PTEs and shadow PTEs bitmasks
- PIO emulation
- Reorganize rmap API, mostly around rmap destruction
- Do not workaround very old KVM bugs for L0 that runs with nesting enabled
- new selftests API for CPUID
Generic:
- Fix races in gfn->pfn cache refresh; do not pin pages tracked by
the cache
- new selftests API using struct kvm_vcpu instead of a (vm, id)
tuple"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (606 commits)
selftests: kvm: set rax before vmcall
selftests: KVM: Add exponent check for boolean stats
selftests: KVM: Provide descriptive assertions in kvm_binary_stats_test
selftests: KVM: Check stat name before other fields
KVM: x86/mmu: remove unused variable
RISC-V: KVM: Add support for Svpbmt inside Guest/VM
RISC-V: KVM: Use PAGE_KERNEL_IO in kvm_riscv_gstage_ioremap()
RISC-V: KVM: Add G-stage ioremap() and iounmap() functions
KVM: Add gfp_custom flag in struct kvm_mmu_memory_cache
RISC-V: KVM: Add extensible CSR emulation framework
RISC-V: KVM: Add extensible system instruction emulation framework
RISC-V: KVM: Factor-out instruction emulation into separate sources
RISC-V: KVM: move preempt_disable() call in kvm_arch_vcpu_ioctl_run
RISC-V: KVM: Make kvm_riscv_guest_timer_init a void function
RISC-V: KVM: Fix variable spelling mistake
RISC-V: KVM: Improve ISA extension by using a bitmap
KVM, x86/mmu: Fix the comment around kvm_tdp_mmu_zap_leafs()
KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog
KVM: x86/mmu: Treat NX as a valid SPTE bit for NPT
KVM: x86: Do not block APIC write for non ICR registers
...
Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text. Also included in here are a few other minor updates,
2 USB files, and one Documentation file update to get the SPDX lines
correct.
All of these have been in the linux-next tree for a very long time.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYupz3g8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynPUgCgslaf2ssCgW5IeuXbhla+ZBRAzisAnjVgOvLN
4AKdqbiBNlFbCroQwmeQ
=v1sg
-----END PGP SIGNATURE-----
Merge tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text.
Also included in here are a few other minor updates, two USB files,
and one Documentation file update to get the SPDX lines correct.
All of these have been in the linux-next tree for a very long time"
* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
Documentation: samsung-s3c24xx: Add blank line after SPDX directive
x86/crypto: Remove stray comment terminator
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
...
Here is the big set of Thunderbolt and USB changes for 6.0-rc1.
Lots of little things here, nothing major, just constant development on
some new hardware support and cleanups of older drivers. Highlights of
this pull request are:
- lots of typec changes and improvements for new hardware
- new gadget controller driver
- thunderbolt support for new hardware
- the normal set of new usb-serial device ids and cleanups
- loads of dwc3 controller fixes and improvements
- mtu3 driver updates
- testusb fixes for longtime issues (not many people use this
tool it seems.)
- minor driver fixes and improvements over the USB tree
- chromeos platform driver changes were added and then reverted
as they depened on some typec changes, but the cross-tree
merges caused problems so they will come back later through
the platform tree.
All of these have been in linux-next for a while now with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYup5Rg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yko/ACfYD9mdlr4WleUpVw5/uNywN6sL9EAn1tv0V8W
cUTAoWxAf5orClAC22ZU
=Vcqd
-----END PGP SIGNATURE-----
Merge tag 'usb-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of Thunderbolt and USB changes for 6.0-rc1.
Lots of little things here, nothing major, just constant development
on some new hardware support and cleanups of older drivers. Highlights
are:
- lots of typec changes and improvements for new hardware
- new gadget controller driver
- thunderbolt support for new hardware
- the normal set of new usb-serial device ids and cleanups
- loads of dwc3 controller fixes and improvements
- mtu3 driver updates
- testusb fixes for longtime issues (not many people use this tool it
seems.)
- minor driver fixes and improvements over the USB tree
- chromeos platform driver changes were added and then reverted as
they depened on some typec changes, but the cross-tree merges
caused problems so they will come back later through the platform
tree.
All of these have been in linux-next for a while now with no reported
issues"
* tag 'usb-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (193 commits)
usb: misc: onboard_usb_hub: Remove duplicated power_on delay
usb: misc: onboard_usb_hub: Add TI USB8041 hub support
usb: misc: onboard_usb_hub: Add reset-gpio support
USB: usbsevseg: convert sysfs snprintf to sysfs_emit
dt-bindings: usb: Add binding for TI USB8041 hub controller
ARM: multi_v7_defconfig: enable USB onboard HUB driver
ARM: dts: stm32: add support for USB2514B onboard hub on stm32mp15xx-dkx
usb: misc: onboard-hub: add support for Microchip USB2514B USB 2.0 hub
dt-bindings: usb: generic-ehci: allow usb-hcd schema properties
usb: typec: ucsi: stm32g0: add bootloader support
usb: typec: ucsi: stm32g0: add support for stm32g0 controller
dt-bindings: usb: typec: add bindings for stm32g0 controller
usb: typec: ucsi: Acknowledge the GET_ERROR_STATUS command completion
usb: cdns3: change place of 'priv_ep' assignment in cdns3_gadget_ep_dequeue(), cdns3_gadget_ep_enable()
usb/chipidea: fix repeated words in comments
usb: renesas-xhci: Do not print any log while fw verif success
usb: typec: retimer: Add missing id check in match callback
USB: xhci: Fix comment typo
usb/typec/tcpm: fix repeated words in comments
usb/musb: fix repeated words in comments
...
Here is the large set of char and misc and other driver subsystem
changes for 6.0-rc1.
Highlights include:
- large set of IIO driver updates, additions, and cleanups
- new habanalabs device support added (loads of register maps
much like GPUs have)
- soundwire driver updates
- phy driver updates
- slimbus driver updates
- tiny virt driver fixes and updates
- misc driver fixes and updates
- interconnect driver updates
- hwtracing driver updates
- fpga driver updates
- extcon driver updates
- firmware driver updates
- counter driver update
- mhi driver fixes and updates
- binder driver fixes and updates
- speakup driver fixes
Full details are in the long shortlog contents.
All of these have been in linux-next for a while without any reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYup9QQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylBKQCfaSuzl9ZP9dTvAw2FPp14oRqXnpoAnicvWAoq
1vU9Vtq2c73uBVLdZm4m
=AwP3
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc driver updates from Greg KH:
"Here is the large set of char and misc and other driver subsystem
changes for 6.0-rc1.
Highlights include:
- large set of IIO driver updates, additions, and cleanups
- new habanalabs device support added (loads of register maps much
like GPUs have)
- soundwire driver updates
- phy driver updates
- slimbus driver updates
- tiny virt driver fixes and updates
- misc driver fixes and updates
- interconnect driver updates
- hwtracing driver updates
- fpga driver updates
- extcon driver updates
- firmware driver updates
- counter driver update
- mhi driver fixes and updates
- binder driver fixes and updates
- speakup driver fixes
All of these have been in linux-next for a while without any reported
problems"
* tag 'char-misc-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (634 commits)
drivers: lkdtm: fix clang -Wformat warning
char: remove VR41XX related char driver
misc: Mark MICROCODE_MINOR unused
spmi: trace: fix stack-out-of-bound access in SPMI tracing functions
dt-bindings: iio: adc: Add compatible for MT8188
iio: light: isl29028: Fix the warning in isl29028_remove()
iio: accel: sca3300: Extend the trigger buffer from 16 to 32 bytes
iio: fix iio_format_avail_range() printing for none IIO_VAL_INT
iio: adc: max1027: unlock on error path in max1027_read_single_value()
iio: proximity: sx9324: add empty line in front of bullet list
iio: magnetometer: hmc5843: Remove duplicate 'the'
iio: magn: yas530: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
iio: magnetometer: ak8974: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
iio: light: veml6030: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
iio: light: vcnl4035: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
iio: light: vcnl4000: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr() macros
iio: light: tsl2591: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr()
iio: light: tsl2583: Use DEFINE_RUNTIME_DEV_PM_OPS and pm_ptr()
iio: light: isl29028: Use DEFINE_RUNTIME_DEV_PM_OPS() and pm_ptr()
iio: light: gp2ap002: Switch to DEFINE_RUNTIME_DEV_PM_OPS and pm_ptr()
...
Core
----
- Refactor the forward memory allocation to better cope with memory
pressure with many open sockets, moving from a per socket cache to
a per-CPU one
- Replace rwlocks with RCU for better fairness in ping, raw sockets
and IP multicast router.
- Network-side support for IO uring zero-copy send.
- A few skb drop reason improvements, including codegen the source file
with string mapping instead of using macro magic.
- Rename reference tracking helpers to a more consistent
netdev_* schema.
- Adapt u64_stats_t type to address load/store tearing issues.
- Refine debug helper usage to reduce the log noise caused by bots.
BPF
---
- Improve socket map performance, avoiding skb cloning on read
operation.
- Add support for 64 bits enum, to match types exposed by kernel.
- Introduce support for sleepable uprobes program.
- Introduce support for enum textual representation in libbpf.
- New helpers to implement synproxy with eBPF/XDP.
- Improve loop performances, inlining indirect calls when
possible.
- Removed all the deprecated libbpf APIs.
- Implement new eBPF-based LSM flavor.
- Add type match support, which allow accurate queries to the
eBPF used types.
- A few TCP congetsion control framework usability improvements.
- Add new infrastructure to manipulate CT entries via eBPF programs.
- Allow for livepatch (KLP) and BPF trampolines to attach to the same
kernel function.
Protocols
---------
- Introduce per network namespace lookup tables for unix sockets,
increasing scalability and reducing contention.
- Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.
- Add support to forciby close TIME_WAIT TCP sockets via user-space
tools.
- Significant performance improvement for the TLS 1.3 receive path,
both for zero-copy and not-zero-copy.
- Support for changing the initial MTPCP subflow priority/backup
status
- Introduce virtually contingus buffers for sockets over RDMA,
to cope better with memory pressure.
- Extend CAN ethtool support with timestamping capabilities
- Refactor CAN build infrastructure to allow building only the needed
features.
Driver API
----------
- Remove devlink mutex to allow parallel commands on multiple links.
- Add support for pause stats in distributed switch.
- Implement devlink helpers to query and flash line cards.
- New helper for phy mode to register conversion.
New hardware / drivers
----------------------
- Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.
- Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.
- Ethernet DSA driver for the Microchip LAN937x switch.
- Ethernet PHY driver for the Aquantia AQR113C EPHY.
- CAN driver for the OBD-II ELM327 interface.
- CAN driver for RZ/N1 SJA1000 CAN controller.
- Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.
Drivers
-------
- Intel Ethernet NICs:
- i40e: add support for vlan pruning
- i40e: add support for XDP framented packets
- ice: improved vlan offload support
- ice: add support for PPPoE offload
- Mellanox Ethernet (mlx5)
- refactor packet steering offload for performance and scalability
- extend support for TC offload
- refactor devlink code to clean-up the locking schema
- support stacked vlans for bridge offloads
- use TLS objects pool to improve connection rate
- Netronome Ethernet NICs (nfp):
- extend support for IPv6 fields mangling offload
- add support for vepa mode in HW bridge
- better support for virtio data path acceleration (VDPA)
- enable TSO by default
- Microsoft vNIC driver (mana)
- add support for XDP redirect
- Others Ethernet drivers:
- bonding: add per-port priority support
- microchip lan743x: extend phy support
- Fungible funeth: support UDP segmentation offload and XDP xmit
- Solarflare EF100: add support for virtual function representors
- MediaTek SoC: add XDP support
- Mellanox Ethernet/IB switch (mlxsw):
- dropped support for unreleased H/W (XM router).
- improved stats accuracy
- unified bridge model coversion improving scalability
(parts 1-6)
- support for PTP in Spectrum-2 asics
- Broadcom PHYs
- add PTP support for BCM54210E
- add support for the BCM53128 internal PHY
- Marvell Ethernet switches (prestera):
- implement support for multicast forwarding offload
- Embedded Ethernet switches:
- refactor OcteonTx MAC filter for better scalability
- improve TC H/W offload for the Felix driver
- refactor the Microchip ksz8 and ksz9477 drivers to share
the probe code (parts 1, 2), add support for phylink
mac configuration
- Other WiFi:
- Microchip wilc1000: diable WEP support and enable WPA3
- Atheros ath10k: encapsulation offload support
Old code removal:
- Neterion vxge ethernet driver: this is untouched since more than
10 years.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmLqN+oSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkB9kQAI9VqW0c3SfiTJnkVBEIovZ6Tnh5stD2
UYFkh1BdchLsYxi7W4XMpVPSzRztiTP87mIx5c/KvIzj+QNeWL1XWRJSPdI9HhTD
pTAA/tM2OG7bqrbyQiKDNfpQdNl7+kk1RwnYd+f9RFl1QVuIJaYhmjVwrsN5xF/+
jUsotpROarM2dGFWiFwJbKhP2zMDT+6qEEahM8pEPggKhv8wRLYjany2cZVEe4e0
WGUpbINAS8gEKm0Ob922WaDfDrcK/N1Z0jNz/kMaENkK18Vvc7F6bCO0DzAawKX9
QZMMwm6mHp3EThflJAMAzCGIYiIcwLhykgdyj8rrjPhFrWbMD2Sdsbo21HOXU/8j
u4aAhVl+d+h7emmbgBoJ8sycVJ7BQlXz7lX20sTgADv9xI4/dPhQ17CMRuwX6fXX
JSrn6P6e1LTV5CEg6vrlSPnKPY6uhFn/cPw47FxCjRwJ9phVnp+8uZWQmf9Pz3yf
Ok/tcj+juFbsmuOshHy2cbRkuNZNS0oRWlSTBo5795ZwOLSakMonR3L+ev2aOvzz
DVrFp2Y/iIVwMSFdCbouYdYnhArPRhOAtCmZc2afY8aBN7aaMgrdTy3+mzUoHy3I
FG3K+VuKpfi0vY4zn6ZoLZDIpyXIoJJ93RcSGltD32t3Dp1RaQMVEI4s45k05PVm
1nYpXKHA8qML
=hxEG
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking changes from Paolo Abeni:
"Core:
- Refactor the forward memory allocation to better cope with memory
pressure with many open sockets, moving from a per socket cache to
a per-CPU one
- Replace rwlocks with RCU for better fairness in ping, raw sockets
and IP multicast router.
- Network-side support for IO uring zero-copy send.
- A few skb drop reason improvements, including codegen the source
file with string mapping instead of using macro magic.
- Rename reference tracking helpers to a more consistent netdev_*
schema.
- Adapt u64_stats_t type to address load/store tearing issues.
- Refine debug helper usage to reduce the log noise caused by bots.
BPF:
- Improve socket map performance, avoiding skb cloning on read
operation.
- Add support for 64 bits enum, to match types exposed by kernel.
- Introduce support for sleepable uprobes program.
- Introduce support for enum textual representation in libbpf.
- New helpers to implement synproxy with eBPF/XDP.
- Improve loop performances, inlining indirect calls when possible.
- Removed all the deprecated libbpf APIs.
- Implement new eBPF-based LSM flavor.
- Add type match support, which allow accurate queries to the eBPF
used types.
- A few TCP congetsion control framework usability improvements.
- Add new infrastructure to manipulate CT entries via eBPF programs.
- Allow for livepatch (KLP) and BPF trampolines to attach to the same
kernel function.
Protocols:
- Introduce per network namespace lookup tables for unix sockets,
increasing scalability and reducing contention.
- Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.
- Add support to forciby close TIME_WAIT TCP sockets via user-space
tools.
- Significant performance improvement for the TLS 1.3 receive path,
both for zero-copy and not-zero-copy.
- Support for changing the initial MTPCP subflow priority/backup
status
- Introduce virtually contingus buffers for sockets over RDMA, to
cope better with memory pressure.
- Extend CAN ethtool support with timestamping capabilities
- Refactor CAN build infrastructure to allow building only the needed
features.
Driver API:
- Remove devlink mutex to allow parallel commands on multiple links.
- Add support for pause stats in distributed switch.
- Implement devlink helpers to query and flash line cards.
- New helper for phy mode to register conversion.
New hardware / drivers:
- Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.
- Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.
- Ethernet DSA driver for the Microchip LAN937x switch.
- Ethernet PHY driver for the Aquantia AQR113C EPHY.
- CAN driver for the OBD-II ELM327 interface.
- CAN driver for RZ/N1 SJA1000 CAN controller.
- Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.
Drivers:
- Intel Ethernet NICs:
- i40e: add support for vlan pruning
- i40e: add support for XDP framented packets
- ice: improved vlan offload support
- ice: add support for PPPoE offload
- Mellanox Ethernet (mlx5)
- refactor packet steering offload for performance and scalability
- extend support for TC offload
- refactor devlink code to clean-up the locking schema
- support stacked vlans for bridge offloads
- use TLS objects pool to improve connection rate
- Netronome Ethernet NICs (nfp):
- extend support for IPv6 fields mangling offload
- add support for vepa mode in HW bridge
- better support for virtio data path acceleration (VDPA)
- enable TSO by default
- Microsoft vNIC driver (mana)
- add support for XDP redirect
- Others Ethernet drivers:
- bonding: add per-port priority support
- microchip lan743x: extend phy support
- Fungible funeth: support UDP segmentation offload and XDP xmit
- Solarflare EF100: add support for virtual function representors
- MediaTek SoC: add XDP support
- Mellanox Ethernet/IB switch (mlxsw):
- dropped support for unreleased H/W (XM router).
- improved stats accuracy
- unified bridge model coversion improving scalability (parts 1-6)
- support for PTP in Spectrum-2 asics
- Broadcom PHYs
- add PTP support for BCM54210E
- add support for the BCM53128 internal PHY
- Marvell Ethernet switches (prestera):
- implement support for multicast forwarding offload
- Embedded Ethernet switches:
- refactor OcteonTx MAC filter for better scalability
- improve TC H/W offload for the Felix driver
- refactor the Microchip ksz8 and ksz9477 drivers to share the
probe code (parts 1, 2), add support for phylink mac
configuration
- Other WiFi:
- Microchip wilc1000: diable WEP support and enable WPA3
- Atheros ath10k: encapsulation offload support
Old code removal:
- Neterion vxge ethernet driver: this is untouched since more than 10 years"
* tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits)
doc: sfp-phylink: Fix a broken reference
wireguard: selftests: support UML
wireguard: allowedips: don't corrupt stack when detecting overflow
wireguard: selftests: update config fragments
wireguard: ratelimiter: use hrtimer in selftest
net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
net: usb: ax88179_178a: Bind only to vendor-specific interface
selftests: net: fix IOAM test skip return code
net: usb: make USB_RTL8153_ECM non user configurable
net: marvell: prestera: remove reduntant code
octeontx2-pf: Reduce minimum mtu size to 60
net: devlink: Fix missing mutex_unlock() call
net/tls: Remove redundant workqueue flush before destroy
net: txgbe: Fix an error handling path in txgbe_probe()
net: dsa: Fix spelling mistakes and cleanup code
Documentation: devlink: add add devlink-selftests to the table of contents
dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
net: ionic: fix error check for vlan flags in ionic_set_nic_features()
net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
nfp: flower: add support for tunnel offload without key ID
...
- Enable mirrored memory for arm64
- Fix up several abuses of the efivar API
- Refactor the efivar API in preparation for moving the 'business logic'
part of it into efivarfs
- Enable ACPI PRM on arm64
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmLhuDIACgkQw08iOZLZ
jyS9IQv/Wc2nhjN50S3gfrL+68/el/hGdP/J0FK5BOOjNosG2t1ZNYZtSthXqpPH
hRrTU2m6PpQUalRpFDyLiHkJvdBFQe4VmvrzBa3TIBIzyflLQPJzkWrqThPchV+B
qi4lmCtTDNIEJmayewqx1wWA+QmUiyI5zJ8wrZp84LTctBPL75seVv0SB20nqai0
3/I73omB2RLVGpCpeWvb++vePXL8euFW3FEwCTM8hRboICjORTyIZPy8Y5os+3xT
UgrIgVDOtn1Xwd4tK0qVwjOVA51east4Fcn3yGOrL40t+3SFm2jdpAJOO3UvyNPl
vkbtjvXsIjt3/oxreKxXHLbamKyueWIfZRyCLsrg6wrr96oypPk6ID4iDCQoen/X
Zf0VjM2vmvSd4YgnEIblOfSBxVg48cHJA4iVHVxFodNTrVnzGGFYPTmNKmJqo+Xn
JeUILM7jlR4h/t0+cTTK3Busu24annTuuz5L5rjf4bUm6pPf4crb1yJaFWtGhlpa
er233D6O
=zI0R
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
- Enable mirrored memory for arm64
- Fix up several abuses of the efivar API
- Refactor the efivar API in preparation for moving the 'business
logic' part of it into efivarfs
- Enable ACPI PRM on arm64
* tag 'efi-next-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
ACPI: Move PRM config option under the main ACPI config
ACPI: Enable Platform Runtime Mechanism(PRM) support on ARM64
ACPI: PRM: Change handler_addr type to void pointer
efi: Simplify arch_efi_call_virt() macro
drivers: fix typo in firmware/efi/memmap.c
efi: vars: Drop __efivar_entry_iter() helper which is no longer used
efi: vars: Use locking version to iterate over efivars linked lists
efi: pstore: Omit efivars caching EFI varstore access layer
efi: vars: Add thin wrapper around EFI get/set variable interface
efi: vars: Don't drop lock in the middle of efivar_init()
pstore: Add priv field to pstore_record for backend specific use
Input: applespi - avoid efivars API and invoke EFI services directly
selftests/kexec: remove broken EFI_VARS secure boot fallback check
brcmfmac: Switch to appropriate helper to load EFI variable contents
iwlwifi: Switch to proper EFI variable store interface
media: atomisp_gmin_platform: stop abusing efivar API
efi: efibc: avoid efivar API for setting variables
efi: avoid efivars layer when loading SSDTs from variables
efi: Correct comment on efi_memmap_alloc
memblock: Disable mirror feature if kernelcore is not specified
...
hmm-tests.c:1607:42: error: 'HMM_DMIRROR_MIGRATE' undeclared (first use in this function); did you mean 'HMM_DMIRROR_WRITE'?
Fixes: f6c3e1ae01 ("mm/hmm: add a test for cross device private faults")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A build with -D_FORTIFY_SOURCE=2 enabled will produce the following warnings:
sysfs.c:63:30: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size between 0 and 255 [-Wformat-truncation=]
snprintf(filepath, 256, "%s/%s", path, filename);
^~
Bump up the buffer to PATH_MAX which is the limit and account for all of
the possible NUL and separators that could lead to exceeding the
allocated buffer sizes.
Fixes: 94f69966fa ("tools/thermal: Introduce tmon, a tool for thermal subsystem")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Add appropriate might_alloc() annotations to the XArray APIs
- Document that the IDR is deprecated
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEejHryeLBw/spnjHrDpNsjXcpgj4FAmLpWggACgkQDpNsjXcp
gj7OiAf+Ie0kxztC96srZXoaUUXM/OhNAUdHCyRMiH8DyRScrBpucj4QazPceAO0
fOQ+Nupx0XtCeVJl4E3cmHIaG2utP3VYnI6cKhZhQJARCDS4Lynddd6Q4RDNyDQu
/ibq2+/8XF5+RLZytir8MyqMI2DpdMikKHFNlLcFXLkIESsub3PUWeU7/YHajp1G
gliXkDLScIUU1XHuVDB6Ol02rJ/mmMclvko2GHgDTeuQjEMqivR0NHTxZl2lRAeM
zMqSkkywHhrYiEo/N+gEqaHNhr5O8IwG0qUVnI848AG+QxyqajRJ87fKDxP4UvxQ
Ga7SiSwhnvxCwdvs8JaPtqSj2s5S0w==
=IwpY
-----END PGP SIGNATURE-----
Merge tag 'xarray-6.0' of git://git.infradead.org/users/willy/xarray
Pull XArray/IDR updates from Matthew Wilcox:
- Add appropriate might_alloc() annotations to the XArray APIs
- Document that the IDR is deprecated
* tag 'xarray-6.0' of git://git.infradead.org/users/willy/xarray:
IDR: Note that the IDR API is deprecated
XArray: Add calls to might_alloc()
This extracts common code from the branches of the forks if-then-else.
enable_counters(), which was at the beginning of both branches of the
conditional, is now unconditional; evlist__start_workload() is extracted
to a different if, which enables making the common clocking code
unconditional.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Adrián Herrera Arcila <adrian.herrera@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20220729161244.10522-1-adrian.herrera@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as
documented for RET instructions after VM exits. Mitigate it with a new
one-entry RSB stuffing mechanism and a new LFENCE.
== Background ==
Indirect Branch Restricted Speculation (IBRS) was designed to help
mitigate Branch Target Injection and Speculative Store Bypass, i.e.
Spectre, attacks. IBRS prevents software run in less privileged modes
from affecting branch prediction in more privileged modes. IBRS requires
the MSR to be written on every privilege level change.
To overcome some of the performance issues of IBRS, Enhanced IBRS was
introduced. eIBRS is an "always on" IBRS, in other words, just turn
it on once instead of writing the MSR on every privilege level change.
When eIBRS is enabled, more privileged modes should be protected from
less privileged modes, including protecting VMMs from guests.
== Problem ==
Here's a simplification of how guests are run on Linux' KVM:
void run_kvm_guest(void)
{
// Prepare to run guest
VMRESUME();
// Clean up after guest runs
}
The execution flow for that would look something like this to the
processor:
1. Host-side: call run_kvm_guest()
2. Host-side: VMRESUME
3. Guest runs, does "CALL guest_function"
4. VM exit, host runs again
5. Host might make some "cleanup" function calls
6. Host-side: RET from run_kvm_guest()
Now, when back on the host, there are a couple of possible scenarios of
post-guest activity the host needs to do before executing host code:
* on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not
touched and Linux has to do a 32-entry stuffing.
* on eIBRS hardware, VM exit with IBRS enabled, or restoring the host
IBRS=1 shortly after VM exit, has a documented side effect of flushing
the RSB except in this PBRSB situation where the software needs to stuff
the last RSB entry "by hand".
IOW, with eIBRS supported, host RET instructions should no longer be
influenced by guest behavior after the host retires a single CALL
instruction.
However, if the RET instructions are "unbalanced" with CALLs after a VM
exit as is the RET in #6, it might speculatively use the address for the
instruction after the CALL in #3 as an RSB prediction. This is a problem
since the (untrusted) guest controls this address.
Balanced CALL/RET instruction pairs such as in step #5 are not affected.
== Solution ==
The PBRSB issue affects a wide variety of Intel processors which
support eIBRS. But not all of them need mitigation. Today,
X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates
PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e.,
eIBRS systems which enable legacy IBRS explicitly.
However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT
and most of them need a new mitigation.
Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE
which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT.
The lighter-weight mitigation performs a CALL instruction which is
immediately followed by a speculative execution barrier (INT3). This
steers speculative execution to the barrier -- just like a retpoline
-- which ensures that speculation can never reach an unbalanced RET.
Then, ensure this CALL is retired before continuing execution with an
LFENCE.
In other words, the window of exposure is opened at VM exit where RET
behavior is troublesome. While the window is open, force RSB predictions
sampling for RET targets to a dead end at the INT3. Close the window
with the LFENCE.
There is a subset of eIBRS systems which are not vulnerable to PBRSB.
Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB.
Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO.
[ bp: Massage, incorporate review comments from Andy Cooper. ]
Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Hi Linus,
Please, pull the following treewide patch that replaces zero-length arrays
with flexible-array members in UAPI. This patch has been baking in
linux-next for 5 weeks now.
-fstrict-flex-arrays=3 is coming and we need to land these changes
to prevent issues like these in the short future:
../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0,
but the source string has length 2 (including NUL byte) [-Wfortify-source]
strcpy(de3->name, ".");
^
Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If
this breaks anything, we can use a union with a new member name.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
Thanks
--
Gustavo
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAmLoNdcACgkQRwW0y0cG
2zEVeg//QYJ3j2pbKt9zB6muO3SkrNoMPc5wpY/SITUeiDscukLvGzJG88eIZskl
NaEjbmacHmdlQrBkUdr10i1+hkb2zRd6/j42GIDXEhhKTMoT2UxJCBp47KSvd7VY
dKNLGsgQs3kwmmxLEGu6w6vywWpI5wxXTKWL1Q/RpUXoOnLmsMEbzKTjf12a1Edl
9gPNY+tMHIHyB0pGIRXDY/ZF5c+FcRFn6kKeMVzJL0bnX7FI4UmYe83k9ajEiLWA
MD3JAw/mNv2X0nizHHuQHIjtky8Pr+E8hKs5ni88vMYmFqeABsTw4R1LJykv/mYa
NakU1j9tHYTKcs2Ju+gIvSKvmatKGNmOpti/8RAjEX1YY4cHlHWNsigVbVRLqfo7
SKImlSUxOPGFS3HAJQCC9P/oZgICkUdD6sdLO1PVBnE1G3Fvxg5z6fGcdEuEZkVR
PQwlYDm1nlTuScbkgVSBzyU/AkntVMJTuPWgbpNo+VgSXWZ8T/U8II0eGrFVf9rH
+y5dAS52/bi6OP0la7fNZlq7tcPfNG9HJlPwPb1kQtuPT4m6CBhth/rRrDJwx8za
0cpJT75Q3CI0wLZ7GN4yEjtNQrlAeeiYiS4LMQ/SFFtg1KzvmYYVmWDhOf0+mMDA
f7bq4cxEg2LHwrhRgQQWowFVBu7yeiwKbcj9sybfA27bMqCtfto=
=8yMq
-----END PGP SIGNATURE-----
Merge tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull uapi flexible array update from Gustavo Silva:
"A treewide patch that replaces zero-length arrays with flexible-array
members in UAPI. This has been baking in linux-next for 5 weeks now.
'-fstrict-flex-arrays=3' is coming and we need to land these changes
to prevent issues like these in the short future:
fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source]
strcpy(de3->name, ".");
^
Since these are all [0] to [] changes, the risk to UAPI is nearly
zero. If this breaks anything, we can use a union with a new member
name"
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836
* tag 'flexible-array-transformations-UAPI-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
treewide: uapi: Replace zero-length arrays with flexible-array members
This Kselftest update for Linux 5.20-rc1 consists of:
- timers test build fixes and cleanups for new tool chains
- removing khdr from kselftest framework and main Makefile
- changes to test output messages to improve reports
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmLoSY4ACgkQCwJExA0N
QxyukA//WMYoc9QfVNlBX+XHWqUy+XIP8GLiq08Bq4Ir7a+yZI7AIksnogb9rLZV
Nhm+MGjk0DoIRrz8RFK3lpXparUZEb/H/aye329Sn+v/ICh9i2AWfv01S3vu5NB8
5eYxF0LGHZnVWi9ttesUoNsFmJ2jeD3IAbD1KOZTzq5KALvfbgckl4bRbbE65YFS
kxDfSgPmYV+2qXkZKi6B3kB0+oihi/jQdfdr3rdxRLRAxTlOH4RqIHS80m9itjWr
bDA+RYMS/BuumAMCGokPhRFkt82EQPY2SsbPBb6d1v5mF3j+3Fboyj/UwwNM1sJZ
sfZD3/xZZMpNI3eFKTjYmd4TYbcPUpGudE4YDme64ITvRYlfErWsztCheeZzRPte
0HUaO5JEYKtD/a6bFnTJVksXy6w8wPjvy2JwqK5IMkBMl5wNPZFNdCSiM4rOpiA2
LXyrkeCJ59Tilf7VU/FbhEKXogI0FZK8T7WEPBe3+kARlwJQbQijmGVdutM1S5QT
A1UKyfUNMTm12cgLEX4Lfcb0JJQn6IXMfp9XKqbrAjEH0tNmCqolcL9Cci9tdU+6
XkFdAUO/xRBS+7/HSsdbMWaKCWx0FG4VLW9E2o2+Js26c+pUwJRThN1yox+kAWm2
BcVHxskmjjAgYvL+0eB+tNCe8vImSmgyQhnEcxYlv3CTWDtLoQ4=
=sFMu
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-next-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan:
- timers test build fixes and cleanups for new tool chains
- removing khdr from kselftest framework and main Makefile
- changes to test output messages to improve reports
* tag 'linux-kselftest-next-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (24 commits)
Makefile: replace headers_install with headers for kselftest
selftests/landlock: drop deprecated headers dependency
selftests: timers: clocksource-switch: adapt to kselftest framework
selftests: timers: clocksource-switch: add 'runtime' command line parameter
selftests: timers: clocksource-switch: add command line switch to skip sanity check
selftests: timers: clocksource-switch: sort includes
selftests: timers: clocksource-switch: fix passing errors from child
selftests: timers: inconsistency-check: adapt to kselftest framework
selftests: timers: nanosleep: adapt to kselftest framework
selftests: timers: fix declarations of main()
selftests: timers: valid-adjtimex: build fix for newer toolchains
Makefile: add headers_install to kselftest targets
selftests: drop KSFT_KHDR_INSTALL make target
selftests: stop using KSFT_KHDR_INSTALL
selftests: drop khdr make target
selftests: drivers/dma-buf: Improve message in selftest summary
selftests/kcmp: Make the test output consistent and clear
selftests:timers: globals don't need initialization to 0
selftests/drivers/gpu: Add error messages to drm_mm.sh
selftests/tpm2: increase timeout for kselftests
...
This KUnit update for Linux 5.20-rc1 consists of several fixes and an
important feature to discourage running KUnit tests on production
systems. Running tests on a production system could leave the system
in a bad state. This new feature adds:
- adds a new taint type, TAINT_TEST to signal that a test has been run.
This should discourage people from running these tests on production
systems, and to make it easier to tell if tests have been run
accidentally (by loading the wrong configuration, etc.)
- several documentation and tool enhancements and fixes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmLoOXcACgkQCwJExA0N
Qxy5HQ//QehcBsN0rvNM5enP0HyJjDFxoF9HI7RxhHbwAE3LEkMQTNnFJOViJ7cY
XZgvPipySkekPkvbm9uAnJw160hUSTCM3Oikf7JaxSTKS9Zvfaq9k78miQNrU2rT
C9ljhLBF9y2eXxj9348jwlIHmjBwV5iMn6ncSvUkdUpDAkll2qIvtmmdiSgl33Et
CRhdc07XBwhlz/hBDwj8oK2ZYGPsqjxf2CyrhRMJAOEJtY0wt971COzPj8cDGtmi
nmQXiUhGejXPlzL/7hPYNr83YmYa/xGjecgDPKR3hOf5dVEVRUE2lKQ00F4GrwdZ
KC6CWyXCzhhbtH7tfpWBU4ZoBdmyxhVOMDPFNJdHzuAHVAI3WbHmGjnptgV9jT7o
KqgPVDW2n0fggMMUjmxR4fV2VrKoVy8EvLfhsanx961KhnPmQ6MXxL1cWoMT5BwA
JtwPlNomwaee2lH9534Qgt1brybYZRGx1RDbWn2CW3kJabODptL80sZ62X5XxxRi
I/keCbSjDO1mL3eEeGg/n7AsAhWrZFsxCThxSXH6u6d6jrrvCF3X2Ki5m27D1eGD
Yh40Fy+FhwHSXNyVOav6XHYKhyRzJvPxM/mTGe5DtQ6YnP7G7SnfPchX4irZQOkv
T2soJdtAcshnpG6z38Yd3uWM/8ARtSMaBU891ZAkFD9foniIYWE=
=WzBX
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-kunit-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull KUnit updates from Shuah Khan:
"This consists of several fixes and an important feature to discourage
running KUnit tests on production systems. Running tests on a
production system could leave the system in a bad state.
Summary:
- Add a new taint type, TAINT_TEST to signal that a test has been
run.
This should discourage people from running these tests on
production systems, and to make it easier to tell if tests have
been run accidentally (by loading the wrong configuration, etc)
- Several documentation and tool enhancements and fixes"
* tag 'linux-kselftest-kunit-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (29 commits)
Documentation: KUnit: Fix example with compilation error
Documentation: kunit: Add CLI args for kunit_tool
kcsan: test: Add a .kunitconfig to run KCSAN tests
kunit: executor: Fix a memory leak on failure in kunit_filter_tests
clk: explicitly disable CONFIG_UML_PCI_OVER_VIRTIO in .kunitconfig
mmc: sdhci-of-aspeed: test: Use kunit_test_suite() macro
nitro_enclaves: test: Use kunit_test_suite() macro
thunderbolt: test: Use kunit_test_suite() macro
kunit: flatten kunit_suite*** to kunit_suite** in .kunit_test_suites
kunit: unify module and builtin suite definitions
selftest: Taint kernel when test module loaded
module: panic: Taint the kernel when selftest modules load
Documentation: kunit: fix example run_kunit func to allow spaces in args
Documentation: kunit: Cleanup run_wrapper, fix x-ref
kunit: test.h: fix a kernel-doc markup
kunit: tool: Enable virtio/PCI by default on UML
kunit: tool: make --kunitconfig repeatable, blindly concat
kunit: add coverage_uml.config to enable GCOV on UML
kunit: tool: refactor internal kconfig handling, allow overriding
kunit: tool: introduce --qemu_args
...
earth-shaking:
- More Chinese translations, and an update to the Italian translations.
The Japanese, Korean, and traditional Chinese translations are
more-or-less unmaintained at this point, instead.
- Some build-system performance improvements.
- The removal of the archaic submitting-drivers.rst document, with the
movement of what useful material that remained into other docs.
- Improvements to sphinx-pre-install to, hopefully, give more useful
suggestions.
- A number of build-warning fixes
Plus the usual collection of typo fixes, updates, and more.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmLn9OwPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5YtrwIAJNZoDYJJIRuVHnFkAn5EJ4b/chnR1dSTBtn
WdE/1zdAlMBWVlEGO48VZybph9Sk0v+cUGf+yviDgASQrfOhRRTkg/0u6XaBAYO0
+C2D1QDd9DggGgajxsfJfTdD3IuB78mGmCQvP17XIJW+NK1CK9rXZBnj6WC5/HJw
PCHzeeVreBxOS3W9GelMYa6vjVl7dv81x4DPllnsgU2AMk0/Ce0MVjeIZ695sOeP
Ki6jZgC2GsgFSK5kBC35OiDe5q+fDzlLfek34EUCn4SIbMALSUYWO1db122w5Pme
Ej0+UTBhD19WH1uB/rcVKnVWugi7UEUJexZsao+nC7UrdIVtYq0=
=83BG
-----END PGP SIGNATURE-----
Merge tag 'docs-6.0' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"This was a moderately busy cycle for documentation, but nothing
all that earth-shaking:
- More Chinese translations, and an update to the Italian
translations.
The Japanese, Korean, and traditional Chinese translations
are more-or-less unmaintained at this point, instead.
- Some build-system performance improvements.
- The removal of the archaic submitting-drivers.rst document,
with the movement of what useful material that remained into
other docs.
- Improvements to sphinx-pre-install to, hopefully, give more
useful suggestions.
- A number of build-warning fixes
Plus the usual collection of typo fixes, updates, and more"
* tag 'docs-6.0' of git://git.lwn.net/linux: (92 commits)
docs: efi-stub: Fix paths for x86 / arm stubs
Docs/zh_CN: Update the translation of sched-stats to 5.19-rc8
Docs/zh_CN: Update the translation of pci to 5.19-rc8
Docs/zh_CN: Update the translation of pci-iov-howto to 5.19-rc8
Docs/zh_CN: Update the translation of usage to 5.19-rc8
Docs/zh_CN: Update the translation of testing-overview to 5.19-rc8
Docs/zh_CN: Update the translation of sparse to 5.19-rc8
Docs/zh_CN: Update the translation of kasan to 5.19-rc8
Docs/zh_CN: Update the translation of iio_configfs to 5.19-rc8
doc:it_IT: align Italian documentation
docs: Remove spurious tag from admin-guide/mm/overcommit-accounting.rst
Documentation: process: Update email client instructions for Thunderbird
docs: ABI: correct QEMU fw_cfg spec path
doc/zh_CN: remove submitting-driver reference from docs
docs: zh_TW: align to submitting-drivers removal
docs: zh_CN: align to submitting-drivers removal
docs: ko_KR: howto: remove reference to removed submitting-drivers
docs: ja_JP: howto: remove reference to removed submitting-drivers
docs: it_IT: align to submitting-drivers removal
docs: process: remove outdated submitting-drivers.rst
...
This branch provides nolibc updates, perhaps most notably improved testing
via the "cd tools/include/nolibc; make headers" command. This should
be considered a smoke test. More thorough testing is in the works.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmLgM6UTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jOrgD/sFQzRgKVS2v3/rb+9IyIqwuCrniSBe
e7SLfteLNg3e02jMX/eVwx/X6D3C6Weg1ucLG/v0liyPV3ODiX/cSJK48q7wBnOR
/TNtBEvtBsoY7LNORf53qAm3x//fCTTdw2qlkWM6RXcmeR0NH/PWfcas2hRJYhl8
hwkQaC9j2CcOLxgN75RcbOsmnV6x+CtLJkH/k2DlfHpnwoO953uyeG0wCAWzNzQn
yFr3PXjOaJd2qNCPNsdMGpjush9tp+fm2E9gXiDj+vk49MWNgoM/nWQe/p7GB5V9
YWKEMudpVbkxXvT6EFj+ctGS7RVvYYnhjicZvVEkZEtszx3muQLcuSOE+p9KQ7v+
mRGyzvhNu7ZyhMSZWA1Qf5pqiu6L3XTp672RMsYcP4keqN5kC/A5hbNt8qGybEPg
PXaQVZQpV15z7gluEM8FhUPbUL3J9PZsTgwPEzjgt3lyDdmE+KszHQCSD/UyN//8
WZJ/gwweQeTtLz/U+i2dGiT5tWjQ8sWg32AobzbylkyX7D1qDqUu0GYai8dPfbWO
y7vbn9IB+uL1c2MFiO3jJkYCY7UoGESNd2j0SFdNPWnZee5FeXht3SIHaqMOhnqF
Oguigt5SUd3/jh5dzMgQPahAH0tKm1n/+W7jBi2ieV1Lz4dKpS8xjMhmmT+D/SaT
M7C8+63JeQwY8A==
=sLEE
-----END PGP SIGNATURE-----
Merge tag 'nolibc.2022.07.27a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull nolibc updates from Paul McKenney:
"This provides nolibc updates, perhaps most notably improved testing
via the 'cd tools/include/nolibc; make headers' command. This should
be considered a smoke test. More thorough testing is in the works"
* tag 'nolibc.2022.07.27a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
tools/nolibc: add a help target to list supported targets
tools/nolibc: make the default target build the headers
tools/nolibc: fix the makefile to also work as "make -C tools ..."
tools/nolibc/stdio: Add format attribute to enable printf warnings
tools/nolibc/stdlib: Support overflow checking for older compiler versions
This pull request contains the following branches:
doc.2022.06.21a: Documentation updates.
fixes.2022.07.19a: Miscellaneous fixes.
nocb.2022.07.19a: Callback-offload updates, perhaps most notably a new
RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that causes all CPUs to
be offloaded at boot time, regardless of kernel boot parameters.
This is useful to battery-powered systems such as ChromeOS
and Android. In addition, a new RCU_NOCB_CPU_CB_BOOST kernel
boot parameter prevents offloaded callbacks from interfering
with real-time workloads and with energy-efficiency mechanisms.
poll.2022.07.21a: Polled grace-period updates, perhaps most notably
making these APIs account for both normal and expedited grace
periods.
rcu-tasks.2022.06.21a: Tasks RCU updates, perhaps most notably reducing
the CPU overhead of RCU tasks trace grace periods by more than
a factor of two on a system with 15,000 tasks. The reduction
is expected to increase with the number of tasks, so it seems
reasonable to hypothesize that a system with 150,000 tasks might
see a 20-fold reduction in CPU overhead.
torture.2022.06.21a: Torture-test updates.
ctxt.2022.07.05a: Updates that merge RCU's dyntick-idle tracking into
context tracking, thus reducing the overhead of transitioning to
kernel mode from either idle or nohz_full userspace execution
for kernels that track context independently of RCU. This is
expected to be helpful primarily for kernels built with
CONFIG_NO_HZ_FULL=y.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmLgMcgTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jArXD/0fjbCwqpRjHVTzjMY8jN4zDkqZZD6m
g8Fx27hZ4ToNFwRptyHwNezrNj14skjAJEXfdjaVw32W62ivXvf0HINvSzsTLCSq
k2kWyBdXLc9CwY5p5W4smnpn5VoAScjg5PoPL59INoZ/Zziji323C7Zepl/1DYJt
0T6bPCQjo1ZQoDUCyVpSjDmAqxnderWG0MeJVt74GkLqmnYLANg0GH8c7mH4+9LL
kVGlLp5nlPgNJ4FEoFdMwNU8T/ETmaVld/m2dkiawjkXjJzB2XKtBigU91DDmXz5
7DIdV4ABrxiy4kGNqtIe/jFgnKyVD7xiDpyfjd6KTeDr/rDS8u2ZH7+1iHsyz3g0
Np/tS3vcd0KR+gI/d0eXxPbgm5sKlCmKw/nU2eArpW/+4LmVXBUfHTG9Jg+LJmBc
JrUh6aEdIZJZHgv/nOQBNig7GJW43IG50rjuJxAuzcxiZNEG5lUSS23ysaA9CPCL
PxRWKSxIEfK3kdmvVO5IIbKTQmIBGWlcWMTcYictFSVfBgcCXpPAksGvqA5JiUkc
egW+xLFo/7K+E158vSKsVqlWZcEeUbsNJ88QOlpqnRgH++I2Yv/LhK41XfJfpH+Y
ALxVaDd+mAq6v+qSHNVq9wT3ozXIPy/zK1hDlMIqx40h2YvaEsH4je+521oSoN9r
vX60+QNxvUBLwA==
=vUNm
-----END PGP SIGNATURE-----
Merge tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Documentation updates
- Miscellaneous fixes
- Callback-offload updates, perhaps most notably a new
RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that causes all CPUs to be
offloaded at boot time, regardless of kernel boot parameters.
This is useful to battery-powered systems such as ChromeOS and
Android. In addition, a new RCU_NOCB_CPU_CB_BOOST kernel boot
parameter prevents offloaded callbacks from interfering with
real-time workloads and with energy-efficiency mechanisms
- Polled grace-period updates, perhaps most notably making these APIs
account for both normal and expedited grace periods
- Tasks RCU updates, perhaps most notably reducing the CPU overhead of
RCU tasks trace grace periods by more than a factor of two on a
system with 15,000 tasks.
The reduction is expected to increase with the number of tasks, so it
seems reasonable to hypothesize that a system with 150,000 tasks
might see a 20-fold reduction in CPU overhead
- Torture-test updates
- Updates that merge RCU's dyntick-idle tracking into context tracking,
thus reducing the overhead of transitioning to kernel mode from
either idle or nohz_full userspace execution for kernels that track
context independently of RCU.
This is expected to be helpful primarily for kernels built with
CONFIG_NO_HZ_FULL=y
* tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (98 commits)
rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings
rcu: Diagnose extended sync_rcu_do_polled_gp() loops
rcu: Put panic_on_rcu_stall() after expedited RCU CPU stall warnings
rcutorture: Test polled expedited grace-period primitives
rcu: Add polled expedited grace-period primitives
rcutorture: Verify that polled GP API sees synchronous grace periods
rcu: Make Tiny RCU grace periods visible to polled APIs
rcu: Make polled grace-period API account for expedited grace periods
rcu: Switch polled grace-period APIs to ->gp_seq_polled
rcu/nocb: Avoid polling when my_rdp->nocb_head_rdp list is empty
rcu/nocb: Add option to opt rcuo kthreads out of RT priority
rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread()
rcu/nocb: Add an option to offload all CPUs on boot
rcu/nocb: Fix NOCB kthreads spawn failure with rcu_nocb_rdp_deoffload() direct call
rcu/nocb: Invert rcu_state.barrier_mutex VS hotplug lock locking order
rcu/nocb: Add/del rdp to iterate from rcuog itself
rcu/tree: Add comment to describe GP-done condition in fqs loop
rcu: Initialize first_gp_fqs at declaration in rcu_gp_fqs()
rcu/kvfree: Remove useless monitor_todo flag
rcu: Cleanup RCU urgency state for offline CPU
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmLnDOwACgkQSfxwEqXe
A65Fiw//Z0YaPejSslQIGitQ1b0XzdWBhyJArYDieaaiQRXMqlaSKlIUqHz38xb7
+FykUY51/SJLjHV2riPxq1OK3/MPmk6VlTd0HHihcHVmg77oZcFcv2tPnDpZoqND
TsBOujLbXKwxP8tNFedRY/4+K7w+ue9BTfDjuH7aCtz7uWd+4cNJmPg3x9FCfkMA
+hbcRluwE9W3Pg4OCKwv+qxL0JF3qQtNKEOp1wpnjGAZZW/I9gFNgFBEkykvcAsj
TkIRDc3agPFj6QgDeRIgLdnf9KCsLubKAg5oJneeCvQztJJUCSkn8nQXxpx+4sLo
GsRgvCdfL/GyJqfSAzQJVYDHKtKMkJiCiWCC/oOALR8dzHJfSlULDAjbY1m/DAr9
at+vi4678Or7TNx2ZSaUlCXXKZ+UT7yWMlQWax9JuxGk1hGYP5/eT1AH5SGjqUwF
w1q8oyzxt1vUcnOzEddFXPFirnqqhAk4dQFtu83+xKM4ZssMVyeB4NZdEhAdW0ng
MX+RjrVj4l5gWWuoS0Cx3LUxDCgV6WT0dN+Vl9axAZkoJJbcXLEmXwQ6NbzTLPWg
1/MT7qFTxNcTCeAArMdZvvFbeh7pOBXO42pafrK/7vDRnTMUIw9tqXNLQUfvdFQp
F5flPgiVRHDU2vSzKIFtnPTyXU0RBBGvNb4n0ss2ehH2DSsCxYE=
=Zy3d
-----END PGP SIGNATURE-----
Merge tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"Though there's been a decent amount of RNG-related development during
this last cycle, not all of it is coming through this tree, as this
cycle saw a shift toward tackling early boot time seeding issues,
which took place in other trees as well.
Here's a summary of the various patches:
- The CONFIG_ARCH_RANDOM .config option and the "nordrand" boot
option have been removed, as they overlapped with the more widely
supported and more sensible options, CONFIG_RANDOM_TRUST_CPU and
"random.trust_cpu". This change allowed simplifying a bit of arch
code.
- x86's RDRAND boot time test has been made a bit more robust, with
RDRAND disabled if it's clearly producing bogus results. This would
be a tip.git commit, technically, but I took it through random.git
to avoid a large merge conflict.
- The RNG has long since mixed in a timestamp very early in boot, on
the premise that a computer that does the same things, but does so
starting at different points in wall time, could be made to still
produce a different RNG state. Unfortunately, the clock isn't set
early in boot on all systems, so now we mix in that timestamp when
the time is actually set.
- User Mode Linux now uses the host OS's getrandom() syscall to
generate a bootloader RNG seed and later on treats getrandom() as
the platform's RDRAND-like faculty.
- The arch_get_random_{seed_,}_long() family of functions is now
arch_get_random_{seed_,}_longs(), which enables certain platforms,
such as s390, to exploit considerable performance advantages from
requesting multiple CPU random numbers at once, while at the same
time compiling down to the same code as before on platforms like
x86.
- A small cleanup changing a cmpxchg() into a try_cmpxchg(), from
Uros.
- A comment spelling fix"
More info about other random number changes that come in through various
architecture trees in the full commentary in the pull request:
https://lore.kernel.org/all/20220731232428.2219258-1-Jason@zx2c4.com/
* tag 'random-6.0-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: correct spelling of "overwrites"
random: handle archrandom with multiple longs
um: seed rng using host OS rng
random: use try_cmpxchg in _credit_init_bits
timekeeping: contribute wall clock to rng on time change
x86/rdrand: Remove "nordrand" flag in favor of "random.trust_cpu"
random: remove CONFIG_ARCH_RANDOM
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEgvWslnM+qUy+sgVg5n2WYw6TPBAFAmLofpEACgkQ5n2WYw6T
PBDnXg/9E1ZZ6c/RkGG224qc1f9K+Epl4ZjFWAzDeQ84GQpa2BdBEs++JDCH9M1c
YBWBjPMzry1D980VRpxtP6Of6M2SsheMuKQCBBLlO6/uJp1EgMFxFJq/kq6FIybH
cZx4VZqEsw7Yt4U05I5FDfKpkdOIncGBykMmjDgPZYbGR8S03kpc80Ou9luAlEde
31SMhXpTy17yT5WMgBeGtY5OYqO+Plf5FXmS1KEA2BUDk3L3XfYurPpM5mD+Oc3a
HosxT29CeqEPDl+nr96dOliSspC+81IKbHH03Ah7UiKd/12dSjxXQuqLnpksB+vr
H5LjjwuS8CphnFETPx5pb+Ceia4wxJT/FOfcQlzWGh1jI1gFDTipbO04nVyRPDPa
88oQPkqDp7Sh7hCaHsUFmPBkOTwgmG9jHvgBl0656YU14BzHXr4jNMFCL/2x+LPt
jAF/gws87lyyVJ/7c0VaH+V8QWB4a/B1/Gr85yT2Qge1W1T+/lRIhgGtukX+0uBw
AJhPNBVjA2SFopOiBF+WuGEfmyXoUwIpMF/9UDhsvZn5Q+fa/QuuvwuER0QoorVE
FbTbE60eGSPfFdxdyLBrELrDapslZLyn89SG4C3Ec/xljhp7RR8xz2c0EPvJ4HWz
pDjoLG3LbJXSsst86bFJc3B45MvOcxgqIrht9PyY12l+oUKs9mY=
=ESR7
-----END PGP SIGNATURE-----
Merge tag 'safesetid-6.0' of https://github.com/micah-morton/linux
Pull SafeSetID updates from Micah Morton:
"This contains one commit that touches common kernel code, one that
adds functionality internal to the SafeSetID LSM code, and a few other
commits that only modify the SafeSetID LSM selftest.
The commit that touches common kernel code simply adds an LSM hook in
the setgroups() syscall that mirrors what is done for the existing LSM
hooks in the setuid() and setgid() syscalls. This commit combined with
the SafeSetID-specific one allow the LSM to filter setgroups() calls
according to configured rule sets in the same way that is already done
for setuid() and setgid()"
* tag 'safesetid-6.0' of https://github.com/micah-morton/linux:
LSM: SafeSetID: add setgroups() testing to selftest
LSM: SafeSetID: Add setgroups() security policy handling
security: Add LSM hook to setgroups() syscall
LSM: SafeSetID: add GID testing to selftest
LSM: SafeSetID: selftest cleanup and prepare for GIDs
LSM: SafeSetID: fix userns bug in selftest
- Allow unsharing time namespace on vfork+exec (Andrei Vagin)
- Replace usage of deprecated kmap APIs (Fabio M. De Francesco)
- Fix spelling mistake (Zhang Jiaming)
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmLoDyAWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJh0mEACL07hj3eT3rWg6ohZx9sCTcAjY
/tG+zxLQ7xu717nM1a4j7CI5kdNNpYbsCqG71ikDDRrOCeEutu7M8zE1emctjtHv
oh853D6BKhV2Hvsiuk1oM2ZHR1bmgiW1eFNAJcCLz6rE6wYu564R0wYJV0h418fH
Rjk+Y989A7Srs9t/9GQSktjX3Q039/PG28avhA5q144/ZNycr5FnLFOf4RlmzEUz
7E8TfGsftX8eRAfxW/dPiWuIKMuYPLqspca9pT3aFj3ze2qKnldjNV3c9M5ajL5Q
q7KKWeWzunKyYHMaRzIxkHyhs396ZGKFN2PbcNYyml+NBItyc3fCHishMF7bW0Vb
nyZbmYJslBloYmrSJYgqCfxyjUuhe0cMMk9iMzDVp6ROwtLgFFLwfwunM6RwRmnr
dAmM8QGwSE3qYLhVnLEcRqpgdXzVd+S0TGhB5k5AyI3628/mLxhE66/eWq0X8QF5
los5zku1GagMkylt6SOGb3TME4JZe6ZdZpU4fe/ilM22qw852xgbF3+6Zap6IBbD
AdzXVCHyU/obORfIxx5KTF213m4KpkWBBi3N1/vVlxIAFAUy1WdXDM1o2RPMD7hw
DeHe8sgfTZxLmSqfWLuX+3qC94IvrbDPFaRCIMj1QNK0ltM8I9oHRPcUFyZMaV0O
xHN/5QtmgVDfKA3mTw==
=82SS
-----END PGP SIGNATURE-----
Merge tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Allow unsharing time namespace on vfork+exec (Andrei Vagin)
- Replace usage of deprecated kmap APIs (Fabio M. De Francesco)
- Fix spelling mistake (Zhang Jiaming)
* tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
exec: Call kmap_local_page() in copy_string_kernel()
exec: Fix a spelling mistake
selftests/timens: add a test for vfork+exit
fs/exec: allow to unshare a time namespace on vfork+exec
Like the normal 'perf lock contention' output, it'd print the number of
lost entries for BPF if exists or -v option is passed.
Currently it uses BROKEN_CONTENDED stat for the lost count (due to full
stack maps).
$ sudo perf lock con -a -b --map-nr-entries 128 sleep 5
...
=== output for debug===
bad: 43, total: 14903
bad rate: 0.29 %
histogram of events caused bad sequence
acquire: 0
acquired: 0
contended: 43
release: 0
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --map-nr-entries option is to control number of max entries in the
perf lock contention BPF maps.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The lock_contention struct is to carry related fields together and to
minimize the change when we add new config options.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This shoud open up various possibilities like time travel execution, and
is also just another platform to help shake out bugs.
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The kernel.config and debug.config fragments in wireguard selftests mention
some config symbols that have been reworked:
Commit c566586818 ("mm: kmemleak: use the memory pool for early
allocations") removes the config DEBUG_KMEMLEAK_EARLY_LOG_SIZE and since
then, the config's feature is available without further configuration.
Commit 4675ff05de ("kmemcheck: rip it out") removes kmemcheck and the
corresponding arch config HAVE_ARCH_KMEMCHECK. There is no need for this
config.
Commit 3bf195ae60 ("netfilter: nat: merge nf_nat_ipv4,6 into nat core")
removes the config NF_NAT_IPV4 and since then, the config's feature is
available without further configuration.
Commit 41a2901e7d ("rcu: Remove SPARSE_RCU_POINTER Kconfig option")
removes the config SPARSE_RCU_POINTER and since then, the config's feature
is enabled by default.
Commit dfb4357da6 ("time: Remove CONFIG_TIMER_STATS") removes the feature
and config CONFIG_TIMER_STATS without any replacement.
Commit 3ca17b1f36 ("lib/ubsan: remove null-pointer checks") removes the
check and config UBSAN_NULL without any replacement.
Adjust the config fragments to those changes in configs.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmLkm/MQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpoaXD/9Nevo4KQmlG83ZcZfu2d51VlGtt6/Dl7LL
pr07RfnRFJcjeCPCwXCXmu6rrlY+inpfEWv9iCR/ImoeESOJCzm0dN/nlffO/zT1
E0h5AlEoDv2bYrCnVkbfvxL722TZqGeLiDE4YY1jVbuUfs3TDmLQzfGbORK+Zw4y
wPEMDZP1yWHoyeHUGWFasu6dpWiAwsZ4sTX0J631YwIBDNWKZqtienIiY15rK4dz
GioBea6voe8Fos0VEhCBOKXMmV9mG4yVOPeaDbTWTRfuzGNF8b7t2vg7mz+PrbBY
M8h1oEt+/+FnsCIZqfaEUzqHX6quv46OVtq/F5L3yNz/5QEsnqfv08ZFwD3sXdgZ
/RFxXamfcn/LoxzZ9eLu3MeyzpXp6frxBcgTNGc3q2TlIwXr1WsIx2N4PxZh00GM
ssW/ulaOZvZmOmDlbdeSC7sp3R1JmHO4qVlHowr58ce8pkishNTwlZZGr0sHyeNq
/Wkd9NQEQEFD6AIzZ/Mz9CsmzHeHYpy6GhicFrcLuU4YF/fnQ6T4hTjlIlucGv/S
IeqoAHrurCB0/p1ml6VfJ58xUWXNCCCkKC5+xu8Vm6/RgMlIw5KkzvVEBfflnomB
wVJLYsLw41gnlqqpwISR39I7cDV+s6xC5P8YAA/NLz692HDIUrRX14dlbZuXIgbc
ROeHB2N5+g==
=vSwm
-----END PGP SIGNATURE-----
Merge tag 'for-5.20/io_uring-zerocopy-send-2022-07-29' of git://git.kernel.dk/linux-block
Pull io_uring zerocopy support from Jens Axboe:
"This adds support for efficient support for zerocopy sends through
io_uring. Both ipv4 and ipv6 is supported, as well as both TCP and
UDP.
The core network changes to support this is in a stable branch from
Jakub that both io_uring and net-next has pulled in, and the io_uring
changes are layered on top of that.
All of the work has been done by Pavel"
* tag 'for-5.20/io_uring-zerocopy-send-2022-07-29' of git://git.kernel.dk/linux-block: (34 commits)
io_uring: notification completion optimisation
io_uring: export req alloc from core
io_uring/net: use unsigned for flags
io_uring/net: make page accounting more consistent
io_uring/net: checks errors of zc mem accounting
io_uring/net: improve io_get_notif_slot types
selftests/io_uring: test zerocopy send
io_uring: enable managed frags with register buffers
io_uring: add zc notification flush requests
io_uring: rename IORING_OP_FILES_UPDATE
io_uring: flush notifiers after sendzc
io_uring: sendzc with fixed buffers
io_uring: allow to pass addr into sendzc
io_uring: account locked pages for non-fixed zc
io_uring: wire send zc request type
io_uring: add notification slot registration
io_uring: add rsrc referencing for notifiers
io_uring: complete notifiers in tw
io_uring: cache struct io_notif
io_uring: add zc notification infrastructure
...
Pull turbostat updates from Len Brown:
"Only updating the turbostat tool here, no kernel changes"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2022.07.28
tools/power turbostat: do not decode ACC for ICX and SPR
tools/power turbostat: fix SPR PC6 limits
tools/power turbostat: cleanup 'automatic_cstate_conversion_probe()'
tools/power turbostat: separate SPR from ICX
tools/power turbosstat: fix comment
tools/power turbostat: Support RAPTORLAKE P
tools/power turbostat: add support for ALDERLAKE_N
tools/power turbostat: dump secondary Turbo-Ratio-Limit
tools/power turbostat: simplify dump_turbo_ratio_limits()
tools/power turbostat: dump CPUID.7.EDX.Hybrid
tools/power turbostat: update turbostat.8
tools/power turbostat: Show uncore frequency
tools/power turbostat: Fix file pointer leak
tools/power turbostat: replace strncmp with single character compare
tools/power turbostat: print the kernel boot commandline
tools/power turbostat: Introduce support for RaptorLake
First noticed with fedora:rawhide:
48 11.10 fedora:rawhide : FAIL gcc version 12.1.1 20220628 (Red Hat 12.1.1-3) (GCC)
util/scripting-engines/trace-event-python.c: In function 'python_start_script':
util/scripting-engines/trace-event-python.c:1899:9: error: 'PySys_SetArgv' is deprecated [-Werror=deprecated-declarations]
1899 | PySys_SetArgv(argc + 1, command_line);
No time now to address this warning, so don't make it an error, in time
we should either add yet more ifdefs to continue supporting older
systems or just convert to whatever new infra python put in place for
argv processing, sigh.
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When genelf was introduced it tested for HAVE_LIBCRYPTO not
HAVE_LIBCRYPTO_SUPPORT, which is the define the feature test for openssl
defines, fix it.
This also adds disables the deprecation warning, someone has to fix this
to build with openssl 3.0 before the warning becomes a hard error.
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Reported-by: 谭梓煊 <tanzixuan.me@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/YulpPqXSOG0Q4J1o@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With OpenSSL v3 installed, the libcrypto feature check fails as it use the
deprecated MD5_* API (and is compiled with -Werror). The error message is
as follows.
$ make tools/perf
```
Makefile.config:778: No libcrypto.h found, disables jitted code injection,
please install openssl-devel or libssl-dev
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... libbfd: [ on ]
... libbfd-buildid: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ OFF ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
```
This is very confusing because the suggested library (on my Ubuntu 20.04
it is libssl-dev) is already installed. As the test only checks for the
presence of libcrypto, this commit suppresses the deprecation warning to
allow the test to pass.
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220625153439.513559-1-tanzixuan.me@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move print_*_events functions out of parse-events.c into a new
print-events.c. Move tracepoint code into tracepoint.c or
trace-event-info.c (sole user). This reduces the dependencies of
parse-events.c and makes it more amenable to being a library in the
future.
Remove some unnecessary definitions from parse-events.h. Fix a
checkpatch.pl warning on using unsigned rather than unsigned int. Fix
some line length warnings too.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com
[ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to be
used by the different sensors when registering a thermal zone
(Daniel Lezcano).
- Make per cpufreq / devfreq cooling device ops instead of using a
global variable, fix comments and rework the trace information
(Lukasz Luba).
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn).
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
Sang).
- Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
Kozlowski).
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano).
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof).
- Fix array underflow in prep_lookup_table (Dan Carpenter).
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun).
- Fix typos in comments detected on sun8i by Coccinelle (Julia
Lawall).
- Fix typos in comments on rzg2l (Biju Das).
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li).
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov).
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer).
- Fix typo in the comments for the 'tmon' tool (Slark Xiao).
- Make the thermal core use ida_alloc()/free() directly instead of
ida_simple_get()/ida_simple_remove() that have been deprecated
(keliu).
- Drop ACPI_FADT_LOW_POWER_S0 check from the Intel PCH thermal control
driver (Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoK5ASHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxa0cQAJsl3wDxkDbvfEENZ1VSdfeH3qXbUSSE
EEo0j4X85JE1F1NwT8R2tb4D/YMJDT3p6I55twrVLvxNUdTnx7ybRfXem24uXkK5
xOybfsuYsSWXxaEfI4260GBzY6ijTR7uWYyDLPN3vvbW3FdMj+nni0D9uTySw7UL
ecIe1ISn3nxbbp0FxYh+n88+718HWKo07BaTE4TyKeUgQHw+v7HHtCZq7Rdoogm8
cp6tTkJ8ymrHoEvAWBIcO58zCx7LkSFeU69oMm4CUzVjxWdFfREb079F5cZ92GXr
ex70r/gKfFAd5GAAdL0WjeS4RwHKta49WKqAMA7w41nIgDj0IA2gJRowfJvKDkF+
JgcQ7OrJ5eo5jCr4pbycgQ9Lh23zBQe/3LH+yV71KlKiLf6/Tl5rhELfBNbZmraZ
HOvD5dAxBLySmANN2VX7DJgtbTcinneL9BDVo6dBTdYaWC4jQxXYm73n66nkZdS7
BDJ0N2P0uZ7NGLawXwrrsMi8xbIApMw4W/o8SN9R4FF1LqIroDg60kLJ9zO+6IhI
xF8ZtcMdyPVa71fSZNwD0+mz2sF6XnTucf88CjxzVdAxbvNVPQEvKufThWTreyuU
pjBPtf1YFOFz9CusBYAplOIu96RqUgL1t1aqqwsCqXoUu4Lgh/pyksIDeam1l0EP
Q5WBUB9bK8q8
=wj9M
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These start a rework of the handling of trip points in the thermal
core, improve the cpufreq/devfreq cooling device handling, update some
thermal control drivers and the tmon utility and clean up code.
Specifics:
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to
be used by the different sensors when registering a thermal zone
(Daniel Lezcano).
- Make per cpufreq / devfreq cooling device ops instead of using a
global variable, fix comments and rework the trace information
(Lukasz Luba).
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn).
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3
(Wolfram Sang).
- Convert the QCom tsens DT binding to the dtsformat format
(Krzysztof Kozlowski).
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano).
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof).
- Fix array underflow in prep_lookup_table (Dan Carpenter).
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun).
- Fix typos in comments detected on sun8i by Coccinelle (Julia
Lawall).
- Fix typos in comments on rzg2l (Biju Das).
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li).
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov).
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer).
- Fix typo in the comments for the 'tmon' tool (Slark Xiao).
- Make the thermal core use ida_alloc()/free() directly instead of
ida_simple_get()/ida_simple_remove() that have been deprecated
(keliu).
- Drop ACPI_FADT_LOW_POWER_S0 check from the Intel PCH thermal
control driver (Rafael Wysocki)"
* tag 'thermal-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits)
thermal/of: Initialize trip points separately
thermal/of: Use thermal trips stored in the thermal zone
thermal/core: Add thermal_trip in thermal_zone
thermal/core: Rename 'trips' to 'num_trips'
thermal/core: Move thermal_set_delay_jiffies to static
thermal/core: Remove unneeded EXPORT_SYMBOLS
thermal/of: Move thermal_trip structure to thermal.h
thermal/of: Remove the device node pointer for thermal_trip
thermal/of: Replace device node match with device node search
thermal/core: Remove duplicate information when an error occurs
thermal/core: Avoid calling ->get_trip_temp() unnecessarily
thermal/tools/tmon: Fix typo 'the the' in comment
thermal/tools/tmon: Include pthread and time headers in tmon.h
thermal/ti-soc-thermal: Fix comment typo
thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
thermal/drivers/u8500: Remove unnecessary print function dev_err()
thermal/drivers/rzg2l: Fix comments
thermal/drivers/sun8i: Fix typo in comment
thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
...
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq policy
which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
fucntion parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoKy8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx3+oQAJNVU+W14EaRPWXQRMuwBC5zk3hb6T9q
JqmMd8coEd+9/4ABAeRAWso1B26rUzB6JyBvw3lGH9OXInpYmvnJEhEPrTpK2h0D
U9HxEARuGJolrDm0X9NAkn7tKKMC9GnvPS9z2s7s+N97VFFWC/QiU+PHB0SypGNb
JxRfbVJZQCuxmNG9UeK+xeHFQ9lM2Z9ZdTxR71G0n7nQPPR+sUvnFufFby3Aogf3
XnBYfia+YNqkUlefxxwB5a0cFwPXOUGsQkIf4d64gZnq1TgZ+71kht1GEF08PDFl
wV8v1rOWuXEae8dozuf5xszp/eVyAqzgB+IShT9APREOO3Wg6I16XdBm8R1TGwCK
JTdZqnm6HVKBNqchEwYViJILX69rrNUT+AwHBWhtKKDNh3qeTuwi/JGTeDVN++en
xf3TNKx3LV31Nq6nWJFzDGLehfZMnAPkhfYohUBI7FNyblpk4mJRVcZ0bYI7UNnS
als77uoipvb5KdFCtdhxYBHd/y867NvXKa1qsAuDxusAsfJHf4SnlMdbgOepBH2y
jJg06CGrMDU3TZ8BL+WpqUYk4irQnAMs/159Txh7A6/dOnTjE7S9NHrENCwmt2og
FrHSLH1eLX6Sa4RSibiGHPC7mNULP2/TOtryf3zFdlIVcjm3NEU3bnfzx7nlJn05
8t6ObMxgMhWT
=XeLV
-----END PGP SIGNATURE-----
Merge tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are mostly minor improvements all over including new CPU IDs for
the Intel RAPL driver, an Energy Model rework to use micro-Watt as the
power unit, cpufreq fixes and cleanus, cpuidle updates, devfreq
updates, documentation cleanups and a new version of the pm-graph
suite of utilities.
Specifics:
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq
policy which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
function parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello)"
* tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
PM: QoS: Add check to make sure CPU freq is non-negative
PM: hibernate: defer device probing when resuming from hibernation
intel_idle: make SPR C1 and C1E be independent
cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask
cpufreq: loongson2: fix Kconfig "its" grammar
pm-graph v5.9
cpufreq: Warn users while freeing active policy
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
PM / devfreq: shut up kernel-doc warnings
dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
dt-bindings: interconnect: Add MediaTek CCI dt-bindings
PM: domains: Ensure genpd_debugfs_dir exists before remove
PM: runtime: Extend support for wakeirq for force_suspend|resume
...
Adding a #define to side-effect a local include isn't clean, for
example, it inhibits header precompilation. YY_EXTRA_TYPE is
defined to be void* by default, so just remove.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220729204217.250166-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The big update this time around is some excellent work from David Jander
who went through the fast path and really eliminated overheads, meaning
that we are seeing a huge reduction in the time spent between transfers
for single threaded clients. Benchmarking has been coming out at about a
halving of overhead which is clearly visible in system level usage that
stresses SPI like some CAN and IIO applications, especially with small
transfers. Thanks to David for taking the time to drill down into this
and push the work upstream.
Otherwise there's been a bunch of new device support and the usual
- Optimisation of the fast path, particularly around the number and
types of locking operations, from David Jander.
- Support for Arbel NPCM845, HP GXP, Intel Meteor Lake and Thunder Bay,
MediaTek MT8188 and MT8365, Microchip FPGAs, nVidia Tegra 241 and
Samsung Exynos Auto v9 and 4210.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmLnyFYACgkQJNaLcl1U
h9AVpAf7BJI8NBQ659fyvfZkJDTlH8F3IjH4P3WpxMPCmTqvCZ5wBZyxwMIXGySE
fe7iQw3PGXBcoEHxhYPR4ePp7LO5jHePybUzGCJBD0EYhlo9QVBpD5+P4t65c9z8
Hjpul428My4L7eUGl/29iv0Qzkyd3wnVPSsZqBCB6BOPTQ+hribs93Uj6rB4wmzF
9Vu4p+dqdGvdrIj3G2KpFRtKxhpnjUeD5l8Eq3rOPlEPjSKoHADHP2ZSpxoz5jfR
8L6C+RyADs7ro7X4hiIq1TGURVJ+6EkGDdc6O+Rj0S+PL7MCVOGR0ucPZMOVmNbJ
114wnOQNmVnGKHX0IBm7VIOMkfc7Dg==
=5frj
-----END PGP SIGNATURE-----
Merge tag 'spi-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"The big update this time around is some excellent work from David
Jander who went through the fast path and really eliminated overheads,
meaning that we are seeing a huge reduction in the time spent between
transfers for single threaded clients.
Benchmarking has been coming out at about a halving of overhead which
is clearly visible in system level usage that stresses SPI like some
CAN and IIO applications, especially with small transfers. Thanks to
David for taking the time to drill down into this and push the work
upstream.
Otherwise there's been a bunch of new device support and the usual
updates.
- Optimisation of the fast path, particularly around the number and
types of locking operations, from David Jander.
- Support for Arbel NPCM845, HP GXP, Intel Meteor Lake and Thunder
Bay, MediaTek MT8188 and MT8365, Microchip FPGAs, nVidia Tegra 241
and Samsung Exynos Auto v9 and 4210"
* tag 'spi-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (97 commits)
MAINTAINERS: add spi support to GXP
spi: dt-bindings: add documentation for hpe,gxp-spifi
spi: spi-gxp: Add support for HPE GXP SoCs
spi: a3700: support BE for AC5 SPI driver
spi/panel: dt-bindings: drop CPHA and CPOL from common properties
spi: bcm2835: enable shared interrupt support
spi: dt-bindings: spi-controller: correct example indentation
spi: dt-bindings: qcom,spi-geni-qcom: allow three interconnects
spi: npcm-fiu: Add NPCM8XX support
dt-binding: spi: Add npcm845 compatible to npcm-fiu document
spi: npcm-fiu: Modify direct read dummy configuration
spi: atmel: remove #ifdef CONFIG_{PM, SLEEP}
spi: dt-bindings: Add compatible for MediaTek MT8188
spi: dt-bindings: mediatek,spi-mtk-nor: Update bindings for nor flash
spi: dt-bindings: atmel,at91rm9200-spi: convert to json-schema
spi: tegra20-slink: fix UAF in tegra_slink_remove()
spi: Fix simplification of devm_spi_register_controller
spi: microchip-core: switch to use dev_err_probe()
spi: microchip-core: switch to use devm_spi_alloc_master()
spi: microchip-core: fix UAF in mchp_corespi_remove()
...
The ioam6.sh test script exits with an error code (1) when tests are
skipped due to lack of support from userspace/kernel or not enough
permissions. It should return the kselftests SKIP code instead.
Reviewed-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Link: https://lore.kernel.org/r/20220801124615.256416-1-kleber.souza@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Load-balancing improvements:
============================
- Improve NUMA balancing on AMD Zen systems for affine workloads.
- Improve the handling of reduced-capacity CPUs in load-balancing.
- Energy Model improvements: fix & refine all the energy fairness metrics (PELT),
and remove the conservative threshold requiring 6% energy savings to
migrate a task. Doing this improves power efficiency for most workloads,
and also increases the reliability of energy-efficiency scheduling.
- Optimize/tweak select_idle_cpu() to spend (much) less time searching
for an idle CPU on overloaded systems. There's reports of several
milliseconds spent there on large systems with large workloads ...
[ Since the search logic changed, there might be behavioral side effects. ]
- Improve NUMA imbalance behavior. On certain systems
with spare capacity, initial placement of tasks is non-deterministic,
and such an artificial placement imbalance can persist for a long time,
hurting (and sometimes helping) performance.
The fix is to make fork-time task placement consistent with runtime
NUMA balancing placement.
Note that some performance regressions were reported against this,
caused by workloads that are not memory bandwith limited, which benefit
from the artificial locality of the placement bug(s). Mel Gorman's
conclusion, with which we concur, was that consistency is better than
random workload benefits from non-deterministic bugs:
"Given there is no crystal ball and it's a tradeoff, I think it's
better to be consistent and use similar logic at both fork time
and runtime even if it doesn't have universal benefit."
- Improve core scheduling by fixing a bug in sched_core_update_cookie() that
caused unnecessary forced idling.
- Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs for newly
woken tasks.
- Fix a newidle balancing bug that introduced unnecessary wakeup latencies.
ABI improvements/fixes:
=======================
- Do not check capabilities and do not issue capability check denial messages
when a scheduler syscall doesn't require privileges. (Such as increasing niceness.)
- Add forced-idle accounting to cgroups too.
- Fix/improve the RSEQ ABI to not just silently accept unknown flags.
(No existing tooling is known to have learned to rely on the previous behavior.)
- Depreciate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
Optimizations:
==============
- Optimize & simplify leaf_cfs_rq_list()
- Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
Misc fixes & cleanups:
======================
- Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.
- Fix a full-NOHZ bug that can in some cases result in the tick not being
re-enabled when the last SCHED_RT task is gone from a runqueue but there's
still SCHED_OTHER tasks around.
- Various PREEMPT_RT related fixes.
- Misc cleanups & smaller fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmLn2ywRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iNfxAAhPJMwM4tYCpIM6PhmxKiHl6kkiT2tt42
HhEmiJVLjczLybWaWwmGA2dSFkv1f4+hG7nqdZTm9QYn0Pqat2UTSRcwoKQc+gpB
x85Hwt2IUmnUman52fRl5r1miH9LTdCI6agWaFLQae5ds1XmOugFo52t2ahax+Gn
dB8LxS2fa/GrKj229EhkJSPWAK4Y94asoTProwpKLuKEeXhDkqUNrOWbKhz+wEnA
pVZySpA9uEOdNLVSr1s0VB6mZoh5/z6yQefj5YSNntsG71XWo9jxKCIm5buVdk2U
wjdn6UzoTThOy/5Ygm64eYRexMHG71UamF1JYUdmvDeUJZ5fhG6RD0FECUQNVcJB
Msu2fce6u1AV0giZGYtiooLGSawB/+e6MoDkjTl8guFHi/peve9CezKX1ZgDWPfE
eGn+EbYkUS9RMafXCKuEUBAC1UUqAavGN9sGGN1ufyR4za6ogZplOqAFKtTRTGnT
/Ne3fHTtvv73DLGW9ohO5vSS2Rp7zhAhB6FunhibhxCWlt7W6hA4Ze2vU9hf78Yn
SJDLAJjOEilLaKUkRG/d9uM3FjKJM1tqxuT76+sUbM0MNxdyiKcviQlP1b8oq5Um
xE1KNZUevnr/WXqOTGDKHH/HNPFgwxbwavMiP7dNFn8h/hEk4t9dkf5siDmVHtn4
nzDVOob1LgE=
=xr2b
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"Load-balancing improvements:
- Improve NUMA balancing on AMD Zen systems for affine workloads.
- Improve the handling of reduced-capacity CPUs in load-balancing.
- Energy Model improvements: fix & refine all the energy fairness
metrics (PELT), and remove the conservative threshold requiring 6%
energy savings to migrate a task. Doing this improves power
efficiency for most workloads, and also increases the reliability
of energy-efficiency scheduling.
- Optimize/tweak select_idle_cpu() to spend (much) less time
searching for an idle CPU on overloaded systems. There's reports of
several milliseconds spent there on large systems with large
workloads ...
[ Since the search logic changed, there might be behavioral side
effects. ]
- Improve NUMA imbalance behavior. On certain systems with spare
capacity, initial placement of tasks is non-deterministic, and such
an artificial placement imbalance can persist for a long time,
hurting (and sometimes helping) performance.
The fix is to make fork-time task placement consistent with runtime
NUMA balancing placement.
Note that some performance regressions were reported against this,
caused by workloads that are not memory bandwith limited, which
benefit from the artificial locality of the placement bug(s). Mel
Gorman's conclusion, with which we concur, was that consistency is
better than random workload benefits from non-deterministic bugs:
"Given there is no crystal ball and it's a tradeoff, I think
it's better to be consistent and use similar logic at both fork
time and runtime even if it doesn't have universal benefit."
- Improve core scheduling by fixing a bug in
sched_core_update_cookie() that caused unnecessary forced idling.
- Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs
for newly woken tasks.
- Fix a newidle balancing bug that introduced unnecessary wakeup
latencies.
ABI improvements/fixes:
- Do not check capabilities and do not issue capability check denial
messages when a scheduler syscall doesn't require privileges. (Such
as increasing niceness.)
- Add forced-idle accounting to cgroups too.
- Fix/improve the RSEQ ABI to not just silently accept unknown flags.
(No existing tooling is known to have learned to rely on the
previous behavior.)
- Depreciate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
Optimizations:
- Optimize & simplify leaf_cfs_rq_list()
- Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
Misc fixes & cleanups:
- Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.
- Fix a full-NOHZ bug that can in some cases result in the tick not
being re-enabled when the last SCHED_RT task is gone from a
runqueue but there's still SCHED_OTHER tasks around.
- Various PREEMPT_RT related fixes.
- Misc cleanups & smaller fixes"
* tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
rseq: Kill process when unknown flags are encountered in ABI structures
rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags
sched/core: Fix the bug that task won't enqueue into core tree when update cookie
nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt()
sched/core: Always flush pending blk_plug
sched/fair: fix case with reduced capacity CPU
sched/core: Use try_cmpxchg in set_nr_{and_not,if}_polling
sched/core: add forced idle accounting for cgroups
sched/fair: Remove the energy margin in feec()
sched/fair: Remove task_util from effective utilization in feec()
sched/fair: Use the same cpumask per-PD throughout find_energy_efficient_cpu()
sched/fair: Rename select_idle_mask to select_rq_mask
sched, drivers: Remove max param from effective_cpu_util()/sched_cpu_util()
sched/fair: Decay task PELT values during wakeup migration
sched/fair: Provide u64 read for 32-bits arch helper
sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg
sched: only perform capability check on privileged operation
sched: Remove unused function group_first_cpu()
sched/fair: Remove redundant word " *"
selftests/rseq: check if libc rseq support is registered
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEjUuTAak14xi+SF7M4CHKc/GJqRAFAmLnqTQACgkQ4CHKc/GJ
qRBnBwgAohP0MXszRnhGEKKTmLBtsEyPrV0OBEIlz3MnDYBYfnDLd5JdSMMA+1jp
sT80QWYKPMr10WKWX5vPjhIYRIfgWchEYND/93DnJYC6Fdap/D0hDd6tIQEKnxpN
YeGZHck6orj9L2HfazJo7qpt//Th5mM8WRTN9OIiFdKPYOvlm7DT51wukVLnK9fA
WoWrx3CsyIh6unvAC6AMOVFt7ZJOfD6muMQsGmkcpp1sJLeM1Ofoe8l+h5oSrFZQ
CrdV4XXrprVi7JhqvSX4alRnF5vmOAVKVXhBLZ3A/3uTou2Bhic6n68chyb/x2RE
FhwmsXS+v7jsOI0PV4gNzwT+sp+01w==
=y2kQ
-----END PGP SIGNATURE-----
Merge tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- An addition of 'accounted' flag to slab allocation tracepoints to
indicate memcg_kmem accounting, by Vasily
- An optimization of memcg handling in freeing paths, by Muchun
- Various smaller fixes and cleanups
* tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab_common: move generic bulk alloc/free functions to SLOB
mm/sl[au]b: use own bulk free function when bulk alloc failed
mm: slab: optimize memcg_slab_free_hook()
mm/tracing: add 'accounted' entry into output of allocation tracepoints
tools/vm/slabinfo: Handle files in debugfs
mm/slub: Simplify __kmem_cache_alias()
mm, slab: fix bad alignments
binutils changed the signature of init_disassemble_info(), which now causes
compilation to fail for tools/bpf/bpftool/jit_disasm.c, e.g. on debian
unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07
Wire up the feature test and switch to init_disassemble_info_compat(),
which were introduced in prior commits, fixing the compilation failure.
I verified that bpftool can still disassemble bpf programs, both with an
old and new dis-asm.h API. There are no output changes for plain and json
formats. When comparing the output from old binutils (2.35)
to new bintuils with the patch (upstream snapshot) there are a few output
differences, but they are unrelated to this patch. An example hunk is:
2f: pop %r14
31: pop %r13
33: pop %rbx
- 34: leaveq
- 35: retq
+ 34: leave
+ 35: ret
Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-8-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
binutils changed the signature of init_disassemble_info(), which now causes
compilation to fail for tools/bpf/bpf_jit_disasm.c, e.g. on debian
unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07
Wire up the feature test and switch to init_disassemble_info_compat(),
which were introduced in prior commits, fixing the compilation failure.
I verified that bpf_jit_disasm can still disassemble bpf programs, both
with the old and new dis-asm.h API. With old binutils there's no change in
output before/after this patch. When comparing the output from old
binutils (2.35) to new bintuils with the patch (upstream snapshot) there
are a few output differences, but they are unrelated to this patch. An
example hunk is:
f4: mov %r14,%rsi
f7: mov %r15,%rdx
fa: mov $0x2a,%ecx
- ff: callq 0xffffffffea8c4988
+ ff: call 0xffffffffea8c4988
104: test %rax,%rax
107: jge 0x0000000000000110
109: xor %eax,%eax
- 10b: jmpq 0x0000000000000073
+ 10b: jmp 0x0000000000000073
110: cmp $0x16,%rax
However, I had to use an older kernel to generate the bpf_jit_enabled =
2 output, as that has been broken since 5.18 / 1022a5498f ("bpf,
x86_64: Use bpf_jit_binary_pack_alloc").
https://lore.kernel.org/20220703030210.pmjft7qc2eajzi6c@alap3.anarazel.de
Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-6-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
binutils changed the signature of init_disassemble_info(), which now causes
compilation failures for tools/perf/util/annotate.c, e.g. on debian
unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07
Wire up the feature test and switch to init_disassemble_info_compat(),
which were introduced in prior commits, fixing the compilation failure.
I verified that perf can still disassemble bpf programs by using bpftrace
under load, recording a perf trace, and then annotating the bpf "function"
with and without the changes. With old binutils there's no change in output
before/after this patch. When comparing the output from old binutils (2.35)
to new bintuils with the patch (upstream snapshot) there are a few output
differences, but they are unrelated to this patch. An example hunk is:
1.15 : 55:mov %rbp,%rdx
0.00 : 58:add $0xfffffffffffffff8,%rdx
0.00 : 5c:xor %ecx,%ecx
- 1.03 : 5e:callq 0xffffffffe12aca3c
+ 1.03 : 5e:call 0xffffffffe12aca3c
0.00 : 63:xor %eax,%eax
- 2.18 : 65:leaveq
- 2.82 : 66:retq
+ 2.18 : 65:leave
+ 2.82 : 66:ret
Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-5-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
binutils changed the signature of init_disassemble_info(), which now causes
compilation failures for tools/{perf,bpf}, e.g. on debian unstable.
Relevant binutils commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=60a3da00bd5407f07
This commit introduces a wrapper for init_disassemble_info(), to avoid
spreading #ifdef DISASM_INIT_STYLED to a bunch of places. Subsequent
commits will use it to fix the build failures.
It likely is worth adding a wrapper for disassember(), to avoid the already
existing DISASM_FOUR_ARGS_SIGNATURE ifdefery.
Signed-off-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ben Hutchings <benh@debian.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Ben Hutchings <benh@debian.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220622181918.ykrs5rsnmx3og4sv@alap3.anarazel.de
Link: https://lore.kernel.org/r/20220801013834.156015-4-andres@anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the past it had a problem not setting the pid/tid on the sample
correctly when system-wide mode is used. Although it's fixed now it'd
be nice if we have a test case for it.
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220701230932.1000495-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some functions we use for bpf prologue generation are going to be
deprecated. This change reworks current code not to use them.
We need to replace following functions/struct:
bpf_program__set_prep
bpf_program__nth_fd
struct bpf_prog_prep_result
Currently we use bpf_program__set_prep to hook perf callback before
program is loaded and provide new instructions with the prologue.
We replace this function/ality by taking instructions for specific
program, attaching prologue to them and load such new ebpf programs
with prologue using separate bpf_prog_load calls (outside libbpf
load machinery).
Before we can take and use program instructions, we need libbpf to
actually load it. This way we get the final shape of its instructions
with all relocations and verifier adjustments).
There's one glitch though.. perf kprobe program already assumes
generated prologue code with proper values in argument registers,
so loading such program directly will fail in the verifier.
That's where the fallback pre-load handler fits in and prepends
the initialization code to the program. Once such program is loaded
we take its instructions, cut off the initialization code and prepend
the prologue.
I know.. sorry ;-)
To have access to the program when loading this patch adds support to
register 'fallback' section handler to take care of perf kprobe programs.
The fallback means that it handles any section definition besides the
ones that libbpf handles.
The handler serves two purposes:
- allows perf programs to have special arguments in section name
- allows perf to use pre-load callback where we can attach init
code (zeroing all argument registers) to each perf program
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220616202214.70359-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The libbpf is switching off support for legacy map definitions [1],
which will break the perf llvm tests.
Moving the base source map definition to BTF-defined, so we need
to use -g compile option for to add debug/BTF info.
[1] https://lore.kernel.org/bpf/20220627211527.2245459-1-andrii@kernel.org/
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220704152721.352046-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kvm_hypercall has to place the hypercall number in rax.
Trace events show that kvm_pv_test doesn't work properly:
kvm_pv_test-53132: kvm_hypercall: nr 0x0 a0 0x0 a1 0x0 a2 0x0 a3 0x0
kvm_pv_test-53132: kvm_hypercall: nr 0x0 a0 0x0 a1 0x0 a2 0x0 a3 0x0
kvm_pv_test-53132: kvm_hypercall: nr 0x0 a0 0x0 a1 0x0 a2 0x0 a3 0x0
With this change, it starts working as expected:
kvm_pv_test-54285: kvm_hypercall: nr 0x5 a0 0x0 a1 0x0 a2 0x0 a3 0x0
kvm_pv_test-54285: kvm_hypercall: nr 0xa a0 0x0 a1 0x0 a2 0x0 a3 0x0
kvm_pv_test-54285: kvm_hypercall: nr 0xb a0 0x0 a1 0x0 a2 0x0 a3 0x0
Signed-off-by: Andrei Vagin <avagin@google.com>
Message-Id: <20220722230241.1944655-5-avagin@google.com>
Fixes: ac4a4d6de2 ("selftests: kvm: test enforcement of paravirtual cpuid features")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The perf jvmti agent doesn't create program headers, in this case
fallback on section headers as happened previously.
Committer notes:
To test this, from a public post by Ian:
1) download a Java workload dacapo-9.12-MR1-bach.jar from
https://sourceforge.net/projects/dacapobench/
2) build perf such as "make -C tools/perf O=/tmp/perf NO_LIBBFD=1" it
should detect Java and create /tmp/perf/libperf-jvmti.so
3) run perf with the jvmti agent:
perf record -k 1 java -agentpath:/tmp/perf/libperf-jvmti.so -jar dacapo-9.12-MR1-bach.jar -n 10 fop
4) run perf inject:
perf inject -i perf.data -o perf-injected.data -j
5) run perf report
perf report -i perf-injected.data | grep org.apache.fop
With this patch reverted I see lots of symbols like:
0.00% java jitted-388040-4656.so [.] org.apache.fop.fo.FObj.bind(org.apache.fop.fo.PropertyList)
With the patch (2d86612aac ("perf symbol: Correct address for bss
symbols")) I see lots of:
dso__load_sym_internal: failed to find program header for symbol:
Lorg/apache/fop/fo/FObj;bind(Lorg/apache/fop/fo/PropertyList;)V
st_value: 0x40
Fixes: 2d86612aac ("perf symbol: Correct address for bss symbols")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220731164923.691193-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add -a/--all-cpus and -C/--cpu options for cpu filtering. Also -p/--pid
and --tid options are added for task filtering. The short -t option is
taken for --threads already. Tracking the command line workload is
possible as well.
$ sudo perf lock contention -a -b sleep 1
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add -b/--use-bpf option to use BPF to collect lock contention stats.
For simplicity it now runs system-wide and requires C-c to stop.
Upcoming changes will add the usual filtering.
$ sudo perf lock con -b
^C
contended total wait max wait avg wait type caller
42 192.67 us 13.64 us 4.59 us spinlock queue_work_on+0x20
23 85.54 us 10.28 us 3.72 us spinlock worker_thread+0x14a
6 13.92 us 6.51 us 2.32 us mutex kernfs_iop_permission+0x30
3 11.59 us 10.04 us 3.86 us mutex kernfs_dop_revalidate+0x3c
1 7.52 us 7.52 us 7.52 us spinlock kthread+0x115
1 7.24 us 7.24 us 7.24 us rwlock:W sys_epoll_wait+0x148
2 7.08 us 3.99 us 3.54 us spinlock delayed_work_timer_fn+0x1b
1 6.41 us 6.41 us 6.41 us spinlock idle_balance+0xa06
2 2.50 us 1.83 us 1.25 us mutex kernfs_iop_lookup+0x2f
1 1.71 us 1.71 us 1.71 us mutex kernfs_iop_getattr+0x2c
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparation for later change to expose the function externally
so that it can be used without the implicit session data.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
GCC 12 thinks that `actual` might be used uninitialised. It's not, the
use is guarded by `bad_mmcr2` which is only set to true at the same
point where `actual` is initialised.
cycles_with_mmcr2_test.c: In function ‘cycles_with_mmcr2’:
cycles_with_mmcr2_test.c:81:17: error: ‘actual’ may be used uninitialized [-Werror=maybe-uninitialized]
81 | printf("Bad MMCR2 value seen is 0x%lx\n", actual);
Silence the warning by initialising `actual` to zero.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220801113746.802046-1-mpe@ellerman.id.au
These tests are based on test_stat_user_read in
tools/lib/perf/tests/test-evsel.c.
The tests are modified to skip if perf_event_open fails or rdpmc isn't
supported.
Committer testing:
⬢[acme@toolbox perf]$ perf test "mmap interface"
4: mmap interface tests :
4.1: Read samples using the mmap interface : Skip (permissions)
⬢[acme@toolbox perf]$
[root@five ~]# perf test "mmap interface"
4: mmap interface tests :
4.1: Read samples using the mmap interface : Ok
[root@five ~]#
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220719223946.176299-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This test has been superseded by test_stat_user_read in:
tools/lib/perf/tests/test-evsel.c
The updated test doesn't divide-by-0 when running time of a counter is
0. It also supports ARM64.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220719223946.176299-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The only sensible exponent for a boolean stat is 0. Add a test assertion
requiring all boolean statistics to have an exponent of 0.
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Message-Id: <20220719143134.3246798-4-oliver.upton@linux.dev>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As it turns out, tests sometimes fail. When that is the case, packing
the test assertion with as much relevant information helps track down
the problem more quickly.
Sharpen up the stat descriptor assertions in kvm_binary_stats_test to
more precisely describe the reason for the test assertion and which
stat is to blame.
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Message-Id: <20220719143134.3246798-3-oliver.upton@linux.dev>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to provide more useful test assertions that describe the broken
stats descriptor, perform sanity check on the stat name before any other
descriptor field. While at it, avoid dereferencing the name field if the
sanity check fails as it is more likely to contain garbage.
Signed-off-by: Oliver Upton <oupton@google.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Message-Id: <20220719143134.3246798-2-oliver.upton@linux.dev>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM/arm64 updates for 5.20:
- Unwinder implementations for both nVHE modes (classic and
protected), complete with an overflow stack
- Rework of the sysreg access from userspace, with a complete
rewrite of the vgic-v3 view to allign with the rest of the
infrastructure
- Disagregation of the vcpu flags in separate sets to better track
their use model.
- A fix for the GICv2-on-v3 selftest
- A small set of cosmetic fixes
KVM/s390, KVM/x86 and common infrastructure changes for 5.20
x86:
* Permit guests to ignore single-bit ECC errors
* Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache
* Intel IPI virtualization
* Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS
* PEBS virtualization
* Simplify PMU emulation by just using PERF_TYPE_RAW events
* More accurate event reinjection on SVM (avoid retrying instructions)
* Allow getting/setting the state of the speaker port data bit
* Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent
* "Notify" VM exit (detect microarchitectural hangs) for Intel
* Cleanups for MCE MSR emulation
s390:
* add an interface to provide a hypervisor dump for secure guests
* improve selftests to use TAP interface
* enable interpretive execution of zPCI instructions (for PCI passthrough)
* First part of deferred teardown
* CPU Topology
* PV attestation
* Minor fixes
Generic:
* new selftests API using struct kvm_vcpu instead of a (vm, id) tuple
x86:
* Use try_cmpxchg64 instead of cmpxchg64
* Bugfixes
* Ignore benign host accesses to PMU MSRs when PMU is disabled
* Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior
* x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
* Port eager page splitting to shadow MMU as well
* Enable CMCI capability by default and handle injected UCNA errors
* Expose pid of vcpu threads in debugfs
* x2AVIC support for AMD
* cleanup PIO emulation
* Fixes for LLDT/LTR emulation
* Don't require refcounted "struct page" to create huge SPTEs
x86 cleanups:
* Use separate namespaces for guest PTEs and shadow PTEs bitmasks
* PIO emulation
* Reorganize rmap API, mostly around rmap destruction
* Do not workaround very old KVM bugs for L0 that runs with nesting enabled
* new selftests API for CPUID
RISC-V uses the same (generic) syscall numbers as ARM64.
Link: https://lkml.kernel.org/r/mvma68wl2ul.fsf@suse.de
Signed-off-by: Andreas Schwab <schwab@suse.de>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Avoid double free by making trace_instance_destroy indempotent. When
trace_instance_init fails, it calls trace_instance_destroy, but its only
caller osnoise_destroy_tool calls it again.
Link: https://lkml.kernel.org/r/mvmilnlkyzx.fsf_-_@suse.de
Fixes: 0605bf009f ("rtla: Add osnoise tool")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Sedat Dilek reported an error on rtla Makefile when running:
$ make -C tools/ clean
[...]
make[2]: Entering directory
'/home/dileks/src/linux-kernel/git/tools/tracing/rtla'
[...]
'/home/dileks/src/linux-kernel/git/Documentation/tools/rtla'
/bin/sh: 1: test: rtla-make[2]:: unexpected operator <------ The problem
rm: cannot remove '/home/dileks/src/linux-kernel/git': Is a directory
make[2]: *** [Makefile:120: clean] Error 1
make[2]: Leaving directory
This occurred because the rtla calls kernel's Makefile to get the
version in silence mode, e.g.,
$ make -sC ../../.. kernelversion
5.19.0-rc4
But the -s is being ignored when rtla's makefile is called indirectly,
so the output looks like this:
$ make -C ../../.. kernelversion
make: Entering directory '/root/linux'
5.19.0-rc4
make: Leaving directory '/root/linux'
Using 'grep -v make' avoids this problem, e.g.,
$ make -C ../../.. kernelversion | grep -v make
5.19.0-rc4
Thus, add | grep -v make.
Link: https://lkml.kernel.org/r/870c02d4d97a921f02a31fa3b229fc549af61a20.1657747763.git.bristot@kernel.org
Fixes: 8619e32825 ("rtla: Follow kernel version")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Per task wakeup while not running (wwnr) monitor.
This model is broken, the reason is that a task can be running in the
processor without being set as RUNNABLE. Think about a task about to
sleep:
1: set_current_state(TASK_UNINTERRUPTIBLE);
2: schedule();
And then imagine an IRQ happening in between the lines one and two,
waking the task up. BOOM, the wakeup will happen while the task is
running.
Q: Why do we need this model, so?
A: To test the reactors.
Link: https://lkml.kernel.org/r/473c0fc39967250fdebcff8b620311c11dccad30.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The wakeup in preemptive (wip) monitor verifies if the
wakeup events always take place with preemption disabled:
|
|
v
#==================#
H preemptive H <+
#==================# |
| |
| preempt_disable | preempt_enable
v |
sched_waking +------------------+ |
+--------------- | | |
| | non_preemptive | |
+--------------> | | -+
+------------------+
The wakeup event always takes place with preemption disabled because
of the scheduler synchronization. However, because the preempt_count
and its trace event are not atomic with regard to interrupts, some
inconsistencies might happen.
The documentation illustrates one of these cases.
Link: https://lkml.kernel.org/r/c98ca678df81115fddc04921b3c79720c836b18f.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
dot2c is a tool that transforms an automata in the graphiviz .dot file
into an C representation of the automata.
usage: dot2c [-h] dot_file
dot2c: converts a .dot file into a C structure
positional arguments:
dot_file The dot file to be converted
optional arguments:
-h, --help show this help message and exit
Link: https://lkml.kernel.org/r/b26204ba9509c80bcda31b76cdea31ddb188cd24.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The VERBOSE option in Makefile has been moved, but there still have the
description left in README. For now, we use `-v` options when running
memblock test to print information, so using the new to replace the
obsolete items.
Signed-off-by: Shaoqin Huang <shaoqin.huang@intel.com>
Acked-by: Rebecca Mckeever <remckee0@gmail.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/20220729161125.190845-1-shaoqin.huang@intel.com
Andrii Nakryiko says:
====================
bpf-next 2022-07-29
We've added 22 non-merge commits during the last 4 day(s) which contain
a total of 27 files changed, 763 insertions(+), 120 deletions(-).
The main changes are:
1) Fixes to allow setting any source IP with bpf_skb_set_tunnel_key() helper,
from Paul Chaignon.
2) Fix for bpf_xdp_pointer() helper when doing sanity checking, from Joanne Koong.
3) Fix for XDP frame length calculation, from Lorenzo Bianconi.
4) Libbpf BPF_KSYSCALL docs improvements and fixes to selftests to accommodate
s390x quirks with socketcall(), from Ilya Leoshkevich.
5) Allow/denylist and CI configs additions to selftests/bpf to improve BPF CI,
from Daniel Müller.
6) BPF trampoline + ftrace follow up fixes, from Song Liu and Xu Kuohai.
7) Fix allocation warnings in netdevsim, from Jakub Kicinski.
8) bpf_obj_get_opts() libbpf API allowing to provide file flags, from Joe Burton.
9) vsnprintf usage fix in bpf_snprintf_btf(), from Fedor Tokarev.
10) Various small fixes and clean ups, from Daniel Müller, Rongguang Wei,
Jörn-Thorben Hinz, Yang Li.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (22 commits)
bpf: Remove unneeded semicolon
libbpf: Add bpf_obj_get_opts()
netdevsim: Avoid allocation warnings triggered from user space
bpf: Fix NULL pointer dereference when registering bpf trampoline
bpf: Fix test_progs -j error with fentry/fexit tests
selftests/bpf: Bump internal send_signal/send_signal_tracepoint timeout
bpftool: Don't try to return value from void function in skeleton
bpftool: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE macro
bpf: btf: Fix vsnprintf return value check
libbpf: Support PPC in arch_specific_syscall_pfx
selftests/bpf: Adjust vmtest.sh to use local kernel configuration
selftests/bpf: Copy over libbpf configs
selftests/bpf: Sort configuration
selftests/bpf: Attach to socketcall() in test_probe_user
libbpf: Extend BPF_KSYSCALL documentation
bpf, devmap: Compute proper xdp_frame len redirecting frames
bpf: Fix bpf_xdp_pointer return pointer
selftests/bpf: Don't assign outer source IP to host
bpf: Set flow flag to allow any source IP in bpf_tunnel_key
geneve: Use ip_tunnel_key flow flags in route lookups
...
====================
Link: https://lore.kernel.org/r/20220729230948.1313527-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a simple test case for when hmm_range_fault() is called with the
HMM_PFN_REQ_FAULT flag and a device private PTE is found for a device
other than the hmm_range::dev_private_owner. This should cause the page
to be faulted back to system memory from the other device and the PFN
returned in the output array.
Also, remove a piece of code that unnecessarily unmaps part of the buffer.
Link: https://lkml.kernel.org/r/20220727000837.4128709-3-rcampbell@nvidia.com
Link: https://lkml.kernel.org/r/20220725183615.4118795-3-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20220725142048.30450-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add two soft-dirty test cases for mprotect() on both anon or file.
Link: https://lkml.kernel.org/r/20220725142048.30450-3-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Initialize "length" to zero by default.
Link: https://lkml.kernel.org/r/YtZzjvHXVXMXxpXO@kili
Fixes: ff712a627f ("selftests/vm: cleanup hugetlb file after mremap test")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This code just reads from memory without caring about the data itself.
However static checkers complain that "tmp" is never properly initialized.
Initialize it to zero and change the name to "dummy" to show that we
don't care about the value stored in it.
Link: https://lkml.kernel.org/r/YtZ8mKJmktA2GaHB@kili
Fixes: c4b6cb8840 ("selftests/vm: add hugetlb madvise MADV_DONTNEED MADV_REMOVE test")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Souptick Joarder (HPE) <jrdr.linux@gmail.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The test va_128TBswitch.c exercises a feature only supported on PPC and
x86_64, but it's run on other 64-bit archs as well. Before this patch,
the test did nothing and returned 0 for KSFT_PASS. This patch makes it
return the KSFT codes from kselftest.h, including KSFT_SKIP when
appropriate.
Verified on arm64 and x86_64.
Link: https://lkml.kernel.org/r/20220704123813.427625-1-adam@wowsignal.io
Signed-off-by: Adam Sindelar <adam@wowsignal.io>
Cc: David Vernet <void@manifault.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mrelease_test should return KSFT_SKIP when process_mrelease is not
defined, but due to a perror call consuming the errno, it returns
KSFT_FAIL.
This patch decides the exit code before calling perror.
[adam@wowsignal.io: fix remaining instances of errno mishandling]
Link: https://lkml.kernel.org/r/20220706141602.10159-1-adam@wowsignal.io
Link: https://lkml.kernel.org/r/20220704173351.19595-1-adam@wowsignal.io
Fixes: 33776141b8 ("selftests: vm: add process_mrelease tests")
Signed-off-by: Adam Sindelar <adam@wowsignal.io>
Reviewed-by: David Vernet <void@manifault.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add an extensible variant of bpf_obj_get() capable of setting the
`file_flags` parameter.
This parameter is needed to enable unprivileged access to BPF maps.
Without a method like this, users must manually make the syscall.
Signed-off-by: Joe Burton <jevburton@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220729202727.3311806-1-jevburton.kernel@gmail.com
- Fix addresses for bss symbols, describing variables used in resolving data
access in tools such as 'perf c2c' and 'perf mem'.
- Skip symbols if SHF_ALLOC flag is not set, a technique used for
listing deprecated symbols, its addresses are zeros, so not useful.
- Remove undefined behavior from bpf_perf_object__next() when
dealing with an empty bpf_objects_list list.
- Make a ARM CoreSight disasm script work with both python2 and python3.
- Sync x86's cpufeatures header with with the kernel sources.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYuQG9QAKCRCyPKLppCJ+
JxtPAP9KlHo6mrPNtjly6jLJ0VvbS2NoJAg8gY1oIJBx68jE2QD+KRAZ7g6XaUuo
4c0BGm41QFyCIrUCDHMJhGJGI6g7NwI=
=ktD4
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v5.19-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix addresses for bss symbols, describing variables used in resolving
data access in tools such as 'perf c2c' and 'perf mem'.
- Skip symbols if SHF_ALLOC flag is not set, a technique used for
listing deprecated symbols, its addresses are zeros, so not useful.
- Remove undefined behavior from bpf_perf_object__next() when dealing
with an empty bpf_objects_list list.
- Make a ARM CoreSight disasm script work with both python2 and
python3.
- Sync x86's cpufeatures header with with the kernel sources.
* tag 'perf-tools-fixes-for-v5.19-2022-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf bpf: Remove undefined behavior from bpf_perf_object__next()
perf symbol: Skip symbols if SHF_ALLOC flag is not set
perf symbol: Correct address for bss symbols
perf scripts python: Let script to be python2 compliant
tools headers cpufeatures: Sync with the kernel sources
The send_signal/send_signal_tracepoint is pretty flaky, with at least
one failure in every ten runs on a few attempts I've tried it:
> test_send_signal_common:PASS:pipe_c2p 0 nsec
> test_send_signal_common:PASS:pipe_p2c 0 nsec
> test_send_signal_common:PASS:fork 0 nsec
> test_send_signal_common:PASS:skel_open_and_load 0 nsec
> test_send_signal_common:PASS:skel_attach 0 nsec
> test_send_signal_common:PASS:pipe_read 0 nsec
> test_send_signal_common:PASS:pipe_write 0 nsec
> test_send_signal_common:PASS:reading pipe 0 nsec
> test_send_signal_common:PASS:reading pipe error: size 0 0 nsec
> test_send_signal_common:FAIL:incorrect result unexpected incorrect result: actual 48 != expected 50
> test_send_signal_common:PASS:pipe_write 0 nsec
> #139/1 send_signal/send_signal_tracepoint:FAIL
The reason does not appear to be a correctness issue in the strict
sense. Rather, we merely do not receive the signal we are waiting for
within the provided timeout.
Let's bump the timeout by a factor of ten. With that change I have not
been able to reproduce the failure in 150+ iterations. I am also sneaking
in a small simplification to the test_progs test selection logic.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220727182955.4044988-1-deso@posteo.net
Merge devfreq changes, PM QoS change, and power management tools and
documentation changes for v5.20-rc1:
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
fucntion parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello).
* pm-devfreq:
PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
PM / devfreq: shut up kernel-doc warnings
dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
dt-bindings: interconnect: Add MediaTek CCI dt-bindings
* pm-qos:
PM: QoS: Add check to make sure CPU freq is non-negative
* pm-tools:
pm-graph v5.9
* pm-docs:
Documentation: PM: Drop pme_interrupt reference
A skeleton generated by bpftool previously contained a return followed
by an expression in OBJ_NAME__detach(), which has return type void. This
did not hurt, the bpf_object__detach_skeleton() called there returns
void itself anyway, but led to a warning when compiling with e.g.
-pedantic.
Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220726133203.514087-1-jthinz@mailbox.tu-berlin.de
Use the ARRAY_SIZE macro and make the code more compact.
Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220726093045.3374026-1-clementwei90@163.com
global variable, fix comments and rework the trace information
(Lukasz Luba)
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn)
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
Sang)
- Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
Kozlowski)
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano)
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof)
- Fix array underflow in prep_lookup_table (Dan Carpenter)
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun)
- Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall)
- Fix typos in comments on rzg2l (Biju Das)
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li)
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov)
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer)
- Fix typo in the comments for the 'tmon' tool (Slark Xiao)
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to be
used by the different sensors when registering a thermal zone
(Daniel Lezcano)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmLkDEgACgkQqDIjiipP
6E9PPAf/fZRYgzqgv68lYy2hnJBZEha7z76KyKKxbPATy65VQHzHBqWyPgOnZWx8
xm26tlDJMFEGql/Sy5QetvnFdDqvY33Q0FBhDbmCdCp7vxxirDNKxXhGnxUggCIt
PrloMzC9zjgdNaFTclf/ceCFNwHPnY8l5kxGHhVDn/l5vvGFB869HKMT+13FMCQM
cKVNZY0F3BgmY0ouAMbXT2jwNm/FIYfXC9CFaQo9XhiTAvqU1h4BI08S8JdXsve0
VVBi8MB0sBolWIQ/GVlC1IWj1FhxgMfvcfZAOlyia7I4kQz7K5wAHxiHnhA+GHsZ
NdxVeGTIdmjIInvRxnsT7yR2HcitkA==
=sDfh
-----END PGP SIGNATURE-----
Merge tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal control changes for 5.20-rc1 from Daniel Lezcano:
"- Make per cpufreq / devfreq cooling device ops instead of using a
global variable, fix comments and rework the trace information
(Lukasz Luba)
- Add the include/dt-bindings/thermal.h under the area covered by the
thermal maintainer in the MAINTAINERS file (Lukas Bulwahn)
- Improve the error output by giving the sensor identification when a
thermal zone failed to initialize, the DT bindings by changing the
positive logic and adding the r8a779f0 support on the rcar3 (Wolfram
Sang)
- Convert the QCom tsens DT binding to the dtsformat format (Krzysztof
Kozlowski)
- Remove the pointless get_trend() function in the QCom, Ux500 and
tegra thermal drivers, along with the unused DROP_FULL and
RAISE_FULL trends definitions. Simplify the code by using clamp()
macros (Daniel Lezcano)
- Fix ref_table memory leak at probe time on the k3_j72xx bandgap
(Bryan Brattlof)
- Fix array underflow in prep_lookup_table (Dan Carpenter)
- Add static annotation to the k3_j72xx_bandgap_j7* data structure
(Jin Xiaoyun)
- Fix typos in comments detected on sun8i by Coccinelle (Julia Lawall)
- Fix typos in comments on rzg2l (Biju Das)
- Remove as unnecessary call to dev_err() as the error is already
printed by the failing function on u8500 (Yang Li)
- Register the thermal zones as hwmon sensors for the Qcom thermal
sensors (Dmitry Baryshkov)
- Fix 'tmon' tool compilation issue by adding phtread.h include
(Markus Mayer)
- Fix typo in the comments for the 'tmon' tool (Slark Xiao)
- Consolidate the thermal core code by beginning to move the thermal
trip structure from the thermal OF code as a generic structure to be
used by the different sensors when registering a thermal zone
(Daniel Lezcano)"
* tag 'thermal-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (36 commits)
thermal/of: Initialize trip points separately
thermal/of: Use thermal trips stored in the thermal zone
thermal/core: Add thermal_trip in thermal_zone
thermal/core: Rename 'trips' to 'num_trips'
thermal/core: Move thermal_set_delay_jiffies to static
thermal/core: Remove unneeded EXPORT_SYMBOLS
thermal/of: Move thermal_trip structure to thermal.h
thermal/of: Remove the device node pointer for thermal_trip
thermal/of: Replace device node match with device node search
thermal/core: Remove duplicate information when an error occurs
thermal/core: Avoid calling ->get_trip_temp() unnecessarily
thermal/tools/tmon: Fix typo 'the the' in comment
thermal/tools/tmon: Include pthread and time headers in tmon.h
thermal/ti-soc-thermal: Fix comment typo
thermal/drivers/qcom/spmi-adc-tm5: Register thermal zones as hwmon sensors
thermal/drivers/qcom/temp-alarm: Register thermal zones as hwmon sensors
thermal/drivers/u8500: Remove unnecessary print function dev_err()
thermal/drivers/rzg2l: Fix comments
thermal/drivers/sun8i: Fix typo in comment
thermal/drivers/k3_j72xx_bandgap: Make k3_j72xx_bandgap_j721e_data and k3_j72xx_bandgap_j7200_data static
...
Current perf stat uses the evlist__add_default_attrs() to add the
generic default attrs, and uses arch_evlist__add_default_attrs() to add
the Arch specific default attrs, e.g., Topdown for x86.
It works well for the non-hybrid platforms. However, for a hybrid
platform, the hard code generic default attrs don't work.
Uses arch_evlist__add_default_attrs() to replace the
evlist__add_default_attrs(). The arch_evlist__add_default_attrs() is
modified to invoke the same __evlist__add_default_attrs() for the
generic default attrs. No functional change.
Add default_null_attrs[] to indicate the arch specific attrs.
No functional change for the arch specific default attrs either.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-4-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 55bcf6ef31 ("perf: Extend PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE") extends the two types to become PMU aware types for
a hybrid system. However, current evsel__hw_name doesn't take the PMU
type into account. It mistakenly returns the "unknown-hardware" for the
hardware event with a specific PMU type.
Add an arch specific arch_evsel__hw_name() to specially handle the PMU
aware hardware event.
Currently, the extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE is only
supported by X86. Only implement the specific arch_evsel__hw_name() for
X86 in the patch.
Nothing is changed for the other archs.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-3-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This reverts commit Fixes: ac2dc29edd ("perf stat: Add default
hybrid events")
Between this patch and the reverted patch, the commit 6c1912898e
("perf parse-events: Rename parse_events_error functions") and the
commit 07eafd4e05 ("perf parse-event: Add init and exit to
parse_event_error") clean up the parse_events_error_*() codes. The
related change is also reverted.
The reverted patch is hard to be extended to support new default events,
e.g., Topdown events, and the existing "--detailed" option on a hybrid
platform.
A new solution will be proposed in the following patch to enable the
perf stat default on a hybrid platform.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-2-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On linux-next tree 'perf test 95' ("Check branch stack sampling") was
added recently.
s390 does not support branch sampling at all and the test case fails
despite for checking branch support before hand.
The check for support of branching uses the software event named "dummy",
as seen in the line:
perf record -b -o- -e dummy -B true > /dev/null 2>&1 || exit 2
However when the branch recording is actually done, a different event is
used, as seen in the line:
perf record -o $TMPDIR/... --branch-filter any,save_type,u -- ...
The event is omitted and for "perf record" the default event is cycles,
which is not supported by s390 and this fails when executed on s390:
# perf record --branch-filter any,save_type,u -- /tmp/__perf_test.program.iDSmQ/a.out
Error:
cycles: PMU Hardware or event type doesn't support branch stack sampling.
#
Therefore fix this and use the same event cycles for testing support
and actually running the test.
Output before:
# ./perf test -Fv 95
95: Check branch stack sampling :
--- start ---
Testing user branch stack sampling
---- end ----
Check branch stack sampling: FAILED!
#
Output after:
# ./perf test -Fv 95
95: Check branch stack sampling :
--- start ---
---- end ----
Check branch stack sampling: Skip
#
Fixes: b55878c90a ("perf test: Add test for branch stack sampling")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: German Gomez <german.gomez@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220727141439.712582-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add IPv4 and IPv6 test cases that ensure that we are not leaking a
reference on the nexthop device when we are unable to delete its
associated route.
Without the fix in a previous patch ("netdevsim: fib: Fix reference
count leak on route deletion failure") both test cases get stuck,
waiting for the reference to be released from the dummy device [1][2].
[1]
unregister_netdevice: waiting for dummy1 to become free. Usage count = 5
leaked reference.
fib_check_nh+0x275/0x620
fib_create_info+0x237c/0x4d30
fib_table_insert+0x1dd/0x1d20
inet_rtm_newroute+0x11b/0x200
rtnetlink_rcv_msg+0x43b/0xd20
netlink_rcv_skb+0x15e/0x430
netlink_unicast+0x53b/0x800
netlink_sendmsg+0x945/0xe40
____sys_sendmsg+0x747/0x960
___sys_sendmsg+0x11d/0x190
__sys_sendmsg+0x118/0x1e0
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
[2]
unregister_netdevice: waiting for dummy1 to become free. Usage count = 5
leaked reference.
fib6_nh_init+0xc46/0x1ca0
ip6_route_info_create+0x1167/0x19a0
ip6_route_add+0x27/0x150
inet6_rtm_newroute+0x161/0x170
rtnetlink_rcv_msg+0x43b/0xd20
netlink_rcv_skb+0x15e/0x430
netlink_unicast+0x53b/0x800
netlink_sendmsg+0x945/0xe40
____sys_sendmsg+0x747/0x960
___sys_sendmsg+0x11d/0x190
__sys_sendmsg+0x118/0x1e0
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This selftest is designed for testing the H.L2Encaps.Red behavior. It
instantiates a virtual network composed of several nodes: hosts and SRv6
routers. Each node is realized using a network namespace that is
properly interconnected to others through veth pairs.
The test considers SRv6 routers implementing a L2 VPN leveraged by hosts
for communicating with each other. Such routers make use of the SRv6
H.L2Encaps.Red behavior for applying SRv6 policies to L2 traffic coming
from hosts.
The correct execution of the behavior is verified through reachability
tests carried out between hosts belonging to the same VPN.
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
This selftest is designed for testing the H.Encaps.Red behavior. It
instantiates a virtual network composed of several nodes: hosts and SRv6
routers. Each node is realized using a network namespace that is
properly interconnected to others through veth pairs.
The test considers SRv6 routers implementing L3 VPNs leveraged by hosts
for communicating with each other. Such routers make use of the SRv6
H.Encaps.Red behavior for applying SRv6 policies to L3 traffic coming
from hosts.
The correct execution of the behavior is verified through reachability
tests carried out between hosts belonging to the same VPN.
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do 'make -C tools/testing/memblock', get the following errors:
memblock.o: In function `memblock_find_in_range.constprop.9':
memblock.c:(.text+0x4651): undefined reference to `pr_warn_ratelimited'
memblock.o: In function `memblock_mark_mirror':
memblock.c:(.text+0x7171): undefined reference to `mirrored_kernelcore'
Fixes: 902c2d9158 ("memblock: Disable mirror feature if kernelcore is not specified")
Fixes: 14d9a675fd ("mm: Ratelimited mirrored memory related warning messages")
Signed-off-by: Liu Xinpeng <liuxp11@chinatelecom.cn>
Tested-by: Ma Wupeng <mawupeng1@huawei.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/1658916453-26312-1-git-send-email-liuxp11@chinatelecom.cn
Add a Makefile which takes care of installing the selftests in
tools/testing/selftests/drivers/net/dsa. This can be used to install all
DSA specific selftests and forwarding.config using the same approach as
for the selftests in tools/testing/selftests/net/forwarding.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220727191642.480279-1-martin.blumenstingl@googlemail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a handful of memory randomizations and precise length checks.
Nothing is really broken here, I did this to increase confidence
when debugging. It does fix a GCC warning, tho. Apparently GCC
recognizes that memory needs to be initialized for send() but
does not recognize that for write().
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 708ac5bea0 ("libbpf: add ksyscall/kretsyscall sections support
for syscall kprobes") added the arch_specific_syscall_pfx() function,
which returns a string representing the architecture in use. As it turns
out this function is currently not aware of Power PC, where NULL is
returned. That's being flagged by the libbpf CI system, which builds for
ppc64le and the compiler sees a NULL pointer being passed in to a %s
format string.
With this change we add representations for two more architectures, for
Power PC and Power PC 64, and also adjust the string format logic to
handle NULL pointers gracefully, in an attempt to prevent similar issues
with other architectures in the future.
Fixes: 708ac5bea0 ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes")
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220728222345.3125975-1-deso@posteo.net
Add PMU events for the Arm Cortex-A78C and Arm Cortex-X1C CPUs.
Events for Arm Cortex-A78C match those for Arm Cortex-A78.
Events for Arm Cortex-X1C match those for Arm Cortex- X1.
As such, this is just a mapfile change.
Main ID Register (MIDR) and event data is sourced from the corresponding
Arm Technical Reference Manuals:
Arm Cortex-A78C:
https://developer.arm.com/documentation/102226/
Arm Cortex-X1C:
https://developer.arm.com/documentation/101968/
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220610174459.615995-1-nick.forrington@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.20, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the snowridgex files into perf and update mapfile.csv.
Tested on a non-snowridgex with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
This change just updates the version number.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-31-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v3, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually
copy the westmereex files into perf and update mapfile.csv. This
change just aligns whitespace and updates the version number.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-30-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v3, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually
copy the westmereep-sp files into perf and update
mapfile.csv. This change just aligns whitespace and updates the
version number.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-29-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v2, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually
copy the westmereep-dp files into perf and update
mapfile.csv. This change just aligns whitespace.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-28-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.07, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the tigerlake files into perf and update mapfile.csv.
Tested on a non-tigerlake with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-27-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.28, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the skylakex files into perf and update mapfile.csv.
Tested with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
90: perf all metricgroups test : Ok
91: perf all metrics test : Skip
93: perf all PMU test : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-26-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v53, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the skylake files into perf and update mapfile.csv.
Tested on a non-skylake with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-25-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v14, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually
copy the silvermont files into perf and update mapfile.csv. Other
than aligning whitespace this change just folds the mapfile.csv
entries for silvertmont onto one line.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-24-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.04, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the sapphirerapids files into perf and update mapfile.csv.
Tested on a non-sapphirerapids with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-23-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v17, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the sandybridge files into perf and update mapfile.csv.
Tested on a non-sandybridge with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-22-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v3, there are no TMA metrics for nehalemex.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the nehalemex files into perf and update mapfile.csv.
Tested on a non-nehalemex with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Note: most of this change is just sorting the keys in the json dictionaries.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-21-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v3, the are no TMA metrics for nehalemep.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the nehalemep files into perf and update mapfile.csv.
Tested on a non-nehalemep with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events are v1.00, there are no metrics yet.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the events and metrics. Manually copy
the meteorlake files into perf and update mapfile.csv.
Tested on a non-meteorlake with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-19-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v9, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the knightslanding files into perf and update mapfile.csv.
Tested on a non-knightslanding with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Note: uncore-memory has become uncore-other as the topic was
determined this way in the conversion scripts. For simplicity the
scripts naming is maintained.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v21, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the jaketown files into perf and update mapfile.csv.
Tested on a non-jaketown with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v21, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the ivytown files into perf and update mapfile.csv.
Tested on a non-ivytown with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v22, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the ivybridge files into perf and update mapfile.csv.
Tested on a non-ivybridge with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.15, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the icelakex files into perf and update mapfile.csv.
Tested with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
90: perf all metricgroups test : Ok
91: perf all metrics test : Skip
93: perf all PMU test : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.14, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the icelake files into perf and update mapfile.csv.
Tested on a non-icelake with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v25, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the haswellx files into perf and update mapfile.csv.
Tested with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
90: perf all metricgroups test : Ok
91: perf all metrics test : Failed
93: perf all PMU test : Ok
The test 91 failure is a pre-existing failure on the test system
with the metric Load_Miss_Real_Latency which is fixed by
prefixing it with --metric-no-group.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v31, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the haswell files into perf and update mapfile.csv.
Tested on a non-haswell with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Align end of file whitespace with what is generated by:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
Correct the version in mapfile.csv.
Event json remains at v1.01, there are no goldmontplus metrics.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Align end of file whitespace with what is generated by:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
Modify mapfile.csv to have a missing goldmont cpuid.
Event json remains at v13, there are no goldmont metrics.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.03. Elkhartlake metrics aren't in TMA but basic metrics are
left unchanged.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the elkhartlake files into perf and update mapfile.csv.
Tested on a non-elkhartlake with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.16, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the cascadelakex files into perf and update mapfile.csv.
Tested with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
90: perf all metricgroups test : Ok
91: perf all metrics test : Skip
93: perf all PMU test : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Align end of file whitespace with what is generated by:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
Fold the mapfile.csv entries together with a more complex regular
expression. This will reduce the pmu-events.c table size.
The files following this change are still at v4.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v1.13, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the alderlake files into perf and update mapfile.csv.
Tested on a non-alderlake with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v7, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the broadwellde files into perf and update mapfile.csv.
Tested on a non-broadwellde with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v26, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the broadwell files into perf and update mapfile.csv.
Tested on a non-broadwell with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update to v19, the metrics are based on TMA 4.4 full.
Use script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
to download and generate the latest events and metrics. Manually copy
the broadwellx files into perf and update mapfile.csv.
Tested with 'perf test':
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
90: perf all metricgroups test : Ok
91: perf all metrics test : Skip
93: perf all PMU test : Ok
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220727220832.2865794-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ACC (automatic C-state conversion) feature was available on Sky Lake and
Cascade Lake Xeons (SKX and CLX), but it is not available on Ice Lake and
Sapphire Rapids Xeons (ICX and SPR). Therefore, stop decoding it for ICX and
SPR.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Sapphire Rapids Xeon (SPR) supports 2 flavors of PC6 - PC6N (non-retention) and
PC6R (retention). Before this patch we used ICX package C-state limits, which
was wrong, because ICX has only one PC6 flavor. With this patch, we use SKX PC6
limits for SPR, because they are the same.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The 'automatic_cstate_conversion_probe()' function has a too long 'if'
statement, convert it to a 'switch' statement in order to improve code
readability a bit.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Before this patch, SPR platform was considered identical to ICX platform. This
patch separates SPR support from ICX.
This patch is a preparation for adding SPR-specific package C-state limits
support.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Intel Performance Hybrid processors have a 2nd MSR
describing the turbo limits enforced on the Ecores.
Note, TRL and Secondary-TRL are usually R/O information,
but on overclock-capable parts, they can be written.
Signed-off-by: Len Brown <len.brown@intel.com>
When CONFIG_INTEL_UNCORE_FREQ_CONTROL is effective,
(Linux 5.9 and later), print the current (and default)
min and max uncore frequency limits.
When that driver provides the current uncore frequency
(Linux 5.18 and later), print a UncMHz column
reflecting the current uncore frequency.
Note that UncMHz is an instantaneous sample, not an average.
eg.
$ sudo ./turbostat -S --show frequency
...
Uncore Frequency pkg0 die0: 800 - 3900 MHz (800 - 3900 MHz)
...
Avg_MHz Busy% Bzy_MHz TSC_MHz UncMHz
28 0.70 4049 3095 3900
Signed-off-by: Len Brown <len.brown@intel.com>
Currently if a fscanf fails then an early return leaks an open
file pointer. Fix this by fclosing the file before the return.
Detected using static analysis with cppcheck:
tools/power/x86/turbostat/turbostat.c:2039:3: error: Resource leak: fp [resourceLeak]
Fixes: eae97e053f ("tools/power turbostat: Support thermal throttle count print")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Using strncmp for a single character comparison is overly complicated,
just use a simpler single character comparison instead. Also stops
static analyzers (such as cppcheck) from complaining about strncmp on
non-null terminated strings.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
It would be handy to have cmdline in turbostat output. For example,
according to the turbostat output, there are no C-states requested.
In this case the user is very curious if something like
intel_idle.max_cstate=0 was used, or may be idle=none too. It is
also curious whether things like intel_pstate=nohwp were used.
Print the boot command line accordingly:
turbostat version 21.05.04 - Len Brown <lenb@kernel.org>
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.16.0+ root=UUID=
b42359ed-1e05-42eb-8757-6bf2a1c19070 ro quiet splash vt.handoff=7
Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Remove an unneeded semicolon.
Signed-off-by: Xin Gao <gaoxin@cdjrlc.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Change > MAX_DIE_PER_PACKAGE to >= MAX_DIE_PER_PACKAGE to prevent
accessing one element beyond the end of the array.
Fixes: 7fd786dfbd ("tools/power/x86/intel-speed-select: OOB daemon mode")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Verify that KVM allows toggling VMX MSR bits to be "more" restrictive,
and also allows restoring each MSR to KVM's original, less restrictive
value.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220607213604.3346000-16-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a command line option to dirty_log_perf_test to run vCPUs for the
entire duration of disabling dirty logging. By default, the test stops
running runs vCPUs before disabling dirty logging, which is faster but
less interesting as it doesn't stress KVM's handling of contention
between page faults and the zapping of collapsible SPTEs. Enabling the
flag also lets the user verify that KVM is indeed rebuilding zapped SPTEs
as huge pages by checking KVM's pages_{1g,2m,4k} stats. Without vCPUs to
fault in the zapped SPTEs, the stats will show that KVM is zapping pages,
but they never show whether or not KVM actually allows huge pages to be
recreated.
Note! Enabling the flag can _significantly_ increase runtime, especially
if the thread that's disabling dirty logging doesn't have a dedicated
pCPU, e.g. if all pCPUs are used to run vCPUs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220715232107.3775620-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Include sys/time.h and pthread.h in tmon.h, so that types
"pthread_mutex_t" and "struct timeval tv" are known when tmon.h
references them.
Without these headers, compiling tmon against musl-libc will fail with
these errors:
In file included from sysfs.c:31:0:
tmon.h:47:8: error: unknown type name 'pthread_mutex_t'
extern pthread_mutex_t input_lock;
^~~~~~~~~~~~~~~
make[3]: *** [<builtin>: sysfs.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from tui.c:31:0:
tmon.h:54:17: error: field 'tv' has incomplete type
struct timeval tv;
^~
make[3]: *** [<builtin>: tui.o] Error 1
make[2]: *** [Makefile:83: tmon] Error 2
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: Alejandro González <alejandro.gonzalez.correo@gmail.com>
Tested-by: Alejandro González <alejandro.gonzalez.correo@gmail.com>
Fixes: 94f69966fa ("tools/thermal: Introduce tmon, a tool for thermal subsystem")
Link: https://lore.kernel.org/r/20220718031040.44714-1-f.fainelli@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The ISA states: "when ACC[i] contains defined data, the contents of VSRs
4×i to 4×i+3 are undefined until either a VSX Move From ACC instruction
is used to copy the contents of ACC[i] to VSRs 4×i to 4×i+3 or some other
instruction directly writes to one of these VSRs." We aren't doing this.
This test only works on Power10 because the hardware implementation
happens to map ACC0 to VSRs 0-3, but will fail on any other implementation
that doesn't do this. So add xxmfacc between writing to the accumulator
and accessing the VSRs.
Fixes: 3527e1ab9a ("selftests/powerpc: Add matrix multiply assist (MMA) test")
Signed-off-by: Rashmica Gupta <rashmica@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220617043935.428083-1-rashmica@linux.ibm.com
clang has -Wconstant-conversion by default, and the constant 0xAAAAAAAAA
(9 As) being converted to an int, which is generally 32 bits, results
in the compile warning:
clang -Wl,-no-as-needed -Wall -isystem ../../../../usr/include/ -lpthread seccomp_bpf.c -lcap -o seccomp_bpf
seccomp_bpf.c:812:67: warning: implicit conversion from 'long' to 'int' changes value from 45812984490 to -1431655766 [-Wconstant-conversion]
int kill = kill_how == KILL_PROCESS ? SECCOMP_RET_KILL_PROCESS : 0xAAAAAAAAA;
~~~~ ^~~~~~~~~~~
1 warning generated.
-1431655766 is the expected truncation, 0xAAAAAAAA (8 As), so use
this directly in the code to avoid the warning.
Fixes: 3932fcecd9 ("selftests/seccomp: Add test for unknown SECCOMP_RET kill behavior")
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220526223407.1686936-1-zhuyifei@google.com
Two more bug fixes for asm-generic, one addressing an incorrect
Kconfig symbol reference and another one fixing a build failure
for the perf tool on mips and possibly others.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmLhU4gACgkQmmx57+YA
GNnW9A/+NCnHmZPGBhde00BNfcFUsoQCTSsqDy12iahKLaeqxbswjcM6B0xJhf4v
M3iMZ5CpXJEWpjg1qETQVDkc2WUcEPGih+B58Et7Yc54szesW77IQUWQruPiuerE
ELkVpJ6MDdWVDOw4FJvhXHeGoXTVNg/smAHagkIzezOvJPUVzWaJ+AcQSSJAS9Fc
1vOM3QSMd/aWxOFA4uq3Sr9d7xtVCPc0njiOrfDV4HBJg8mL8HWxhFt5QYHXDgjw
eWDZ0lo38qH23BXtV4gLILwukWCRPP0Zk+VlUmqO5NPK+2OAGm9AMym74XGI0SZn
HNesso1KERfMuz8MKJyGmCUg7c2gfIOP/peRRNeTX3NdZJ7V1Mjdh2QpqT1mQ2BV
CY14YgpJmzrJsAJpOwA2F4PL1tJLByPHPIPBNEad9QY/xXgqBciMPJQCZEm2XfDO
Uj2WUQj2i9jueFceVusRedamoZHg1PyyD0Ig57nHEEsnZhqquoJLOK0QWm25jltJ
g06SSBGGvVH1iP2MmLxcC/x9B73SMqMHEKUePM3Yinf0YoyqTJKC0Kf0vfqFpOzt
bfzXiHU49tS7g1AZevWfVPTBpMyqSJgGY+Vimq/3baAD6pDsYbACT+yxBDNar0rq
oHOhhXi0bLyM7D+wkZRTJtjipKHHsfi7cgRzVRHHPfG9mp0H0Cw=
=2fva
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fixes from Arnd Bergmann:
"Two more bug fixes for asm-generic, one addressing an incorrect
Kconfig symbol reference and another one fixing a build failure for
the perf tool on mips and possibly others"
* tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: remove a broken and needless ifdef conditional
tools: Fixed MIPS builds due to struct flock re-definition
So far the vmtest.sh script, which can be used as a convenient way to
run bpf selftests, has obtained the kernel config safe to use for
testing from the libbpf/libbpf GitHub repository [0].
Given that we now have included this configuration into this very
repository, we can just consume it from here as well, eliminating the
necessity of remote accesses.
With this change we adjust the logic in the script to use the
configuration from below tools/testing/selftests/bpf/configs/ instead
of pulling it over the network.
[0] https://github.com/libbpf/libbpf
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-4-deso@posteo.net
This change integrates libbpf maintained configurations and black/white
lists [0] into the repository, co-located with the BPF selftests themselves.
We minimize the kernel configurations to keep future updates as small as
possible [1].
Furthermore, we make both kernel configurations build on top of the existing
configuration tools/testing/selftests/bpf/config (to be concatenated before
build). Lastly, we replaced the terms blacklist & whitelist with denylist and
allowlist, respectively.
[0] 20f0330235/travis-ci/vmtest/configs
[1] https://lore.kernel.org/bpf/20220712212124.3180314-1-deso@posteo.net/T/#m30a53648352ed494e556ac003042a9ad0a8f98c6
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-3-deso@posteo.net
This change makes sure to sort the existing minimal kernel configuration
containing options required for running BPF selftests alphabetically.
Doing so will make it easier to diff it against other configurations,
which in turn helps with maintaining disjunct config files that build on
top of each other. It also helped identify the CONFIG_IPV6_GRE being set
twice and removes one of the occurrences.
Lastly, we change NET_CLS_BPF from 'm' to 'y'. Having this option as 'm'
will cause failures of the btf_skc_cls_ingress selftest.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-2-deso@posteo.net
bpf_perf_object__next() folded the last element in the list test with the
empty list test. However, this meant that offsets were computed against
null and that a struct list_head was compared against a 'struct
bpf_perf_object'.
Working around this with clang's undefined behavior sanitizer required
-fno-sanitize=null and -fno-sanitize=object-size.
Remove the undefined behavior by using the regular Linux list APIs and
handling the starting case separately from the end testing case.
Looking at uses like bpf_perf_object__for_each(), as the constant NULL
or non-NULL argument can be constant propagated, the code is no less
efficient.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some symbols are observed with the 'st_value' field zeroed. E.g.
libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which
resides in the '.gnu.warning.getwd' section.
Unlike normal sections, such kind of sections are used for linker
warning when a file calls deprecated functions, but they are not part of
memory images, the symbols in these sections should be dropped.
This patch checks the section attribute SHF_ALLOC bit, if the bit is not
set, it skips symbols to avoid spurious ones.
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chang Rui <changruinj@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols. This is a common
issue on both x86 and Arm64 platforms.
Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:
...
Disassembly of section .bss:
0000000000004040 <completed.0>:
...
0000000000004080 <buf1>:
...
00000000000040c0 <buf2>:
...
0000000000004100 <thread>:
...
First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.
# ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf2 0x30a8-0x30e8
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf1 0x3068-0x30a8
...
The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section. The perf tool uses below formula to convert
a symbol's memory address to a file address:
file_address = st_value - sh_addr + sh_offset
^
` Memory address
We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.
The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.
Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info. A better choice for converting
memory address to file address is using the formula:
file_address = st_value - p_vaddr + p_offset
This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.
After applying the change:
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf2 0x30c0-0x3100
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf1 0x3080-0x30c0
...
Fixes: f17e04afaf ("perf report: Fix ELF symbol parsing")
Reported-by: Chang Rui <changruinj@gmail.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.
To fix the building failure, this patch changes from the python f-string
format to traditional string format.
Fixes: 12fdd6c009 ("perf scripts python: Support Arm CoreSight trace data disassembly")
Reported-by: Akemi Yagi <toracat@elrepo.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: ElRepo <contact@elrepo.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
28a99e95f5 ("x86/amd: Use IBPB for firmware calls")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Restore the +x bit to va_128TBswitch.sh, which got dropped from the
previous patch, somehow.
Link: https://lkml.kernel.org/r/20220708090646.34927-1-adam@wowsignal.io
Fixes: 1afd01d43efc3 ("selftests/vm: Only run 128TBswitch with 5-level paging")
Signed-off-by: Adam Sindelar <adam@wowsignal.io>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Once line card is activated, check the FW version and PSID are exposed.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Once line card is provisioned, check if HW revision and INI version
are exposed on associated nested auxiliary device.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Icelake has a slots event, on my Skylakex I have CPU events in sysfs of
topdown-slots-issued and topdown-total-slots.
Legacy event parsing would try to use '-' to separate parts of an event
and so perf_pmu__parse_init sets 'slots' to be a
PMU_EVENT_SYMBOL_SUFFIX2.
As such parsing the slots event for a fake PMU fails as a
PMU_EVENT_SYMBOL_SUFFIX2 isn't made into the PE_PMU_EVENT_FAKE token.
Resolve this issue by test initializing the PMU parsing state before
every parse. This must be done every parse as the state is removes after
each parse_events.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220725223633.2301737-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update JSON core/uncore events for haswellx to perf.
Based on HSX JSON list v24:
https://download.01.org/perfmon/HSX
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update JSON core/uncore events for broadwellx to perf.
Based on BDX JSON list v19:
https://download.01.org/perfmon/BDX
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
More uncore events are added in the converter tool:
https://github.com/intel/event-converter-for-linux-perf
Keep both alias and the original name for the events, in case someone
already used the alias in their script.
Generate the perf events based on Snowridgex(SNR) event list v1.20:
https://download.01.org/perfmon/SNR/
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tremontx was an old name for Snowridgex, so rename Tremontx to Snowridgex.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update JSON event list for Sapphirerapids to perf.
Based on JSON list v1.02:
https://download.01.org/perfmon/SPR/
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update JSON event list for Alderlake to perf.
It is a hybrid event list for both Atom and Core.
Based on JSON list v1.11:
https://download.01.org/perfmon/ADL/
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a spelling mistake in a pr_err message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20220721124528.20997-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements workqueue trace bpf function.
Test cases:
# perf kwork -k workqueue lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(w)addrconf_verify_work | 0002 | 5.856 ms | 1 | 5.856 ms | 111994.634313 s | 111994.640169 s |
(w)vmstat_update | 0001 | 1.247 ms | 1 | 1.247 ms | 111996.462651 s | 111996.463899 s |
(w)neigh_periodic_work | 0001 | 1.183 ms | 1 | 1.183 ms | 111996.462789 s | 111996.463973 s |
(w)neigh_managed_work | 0001 | 0.989 ms | 2 | 1.635 ms | 111996.462820 s | 111996.464455 s |
(w)wb_workfn | 0000 | 0.667 ms | 1 | 0.667 ms | 111996.384273 s | 111996.384940 s |
(w)bpf_prog_free_deferred | 0001 | 0.495 ms | 1 | 0.495 ms | 111986.314201 s | 111986.314696 s |
(w)mix_interrupt_randomness | 0002 | 0.421 ms | 6 | 0.749 ms | 111995.927750 s | 111995.928499 s |
(w)vmstat_shepherd | 0000 | 0.374 ms | 2 | 0.385 ms | 111991.265242 s | 111991.265627 s |
(w)e1000_watchdog | 0002 | 0.356 ms | 5 | 0.390 ms | 111994.528380 s | 111994.528770 s |
(w)vmstat_update | 0000 | 0.231 ms | 2 | 0.365 ms | 111996.384407 s | 111996.384772 s |
(w)flush_to_ldisc | 0006 | 0.165 ms | 1 | 0.165 ms | 111995.930606 s | 111995.930771 s |
(w)flush_to_ldisc | 0000 | 0.094 ms | 2 | 0.095 ms | 111996.460453 s | 111996.460548 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k workqueue rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)e1000_watchdog | 0002 | 0.627 ms | 2 | 0.324 ms | 112002.720665 s | 112002.720989 s |
(w)flush_to_ldisc | 0007 | 0.598 ms | 2 | 0.534 ms | 112000.875226 s | 112000.875761 s |
(w)wq_barrier_func | 0007 | 0.492 ms | 1 | 0.492 ms | 112000.876981 s | 112000.877473 s |
(w)flush_to_ldisc | 0007 | 0.281 ms | 1 | 0.281 ms | 112005.826882 s | 112005.827163 s |
(w)mix_interrupt_randomness | 0002 | 0.229 ms | 3 | 0.102 ms | 112005.825671 s | 112005.825774 s |
(w)vmstat_shepherd | 0000 | 0.202 ms | 1 | 0.202 ms | 112001.504511 s | 112001.504713 s |
(w)bpf_prog_free_deferred | 0001 | 0.181 ms | 1 | 0.181 ms | 112000.883251 s | 112000.883432 s |
(w)wb_workfn | 0007 | 0.130 ms | 1 | 0.130 ms | 112001.505195 s | 112001.505325 s |
(w)vmstat_update | 0000 | 0.053 ms | 1 | 0.053 ms | 112001.504763 s | 112001.504815 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-18-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements softirq trace bpf function.
Test cases:
Trace softirq latency without filter:
# perf kwork -k softirq lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0005 | 0.281 ms | 3 | 0.338 ms | 111295.752222 s | 111295.752560 s |
(s)RCU:9 | 0002 | 0.262 ms | 24 | 1.400 ms | 111301.335986 s | 111301.337386 s |
(s)SCHED:7 | 0005 | 0.177 ms | 14 | 0.212 ms | 111295.752270 s | 111295.752481 s |
(s)RCU:9 | 0007 | 0.161 ms | 47 | 2.022 ms | 111295.402159 s | 111295.404181 s |
(s)NET_RX:3 | 0003 | 0.149 ms | 12 | 1.261 ms | 111301.192964 s | 111301.194225 s |
(s)TIMER:1 | 0001 | 0.105 ms | 9 | 0.198 ms | 111301.180191 s | 111301.180389 s |
... <SNIP> ...
(s)NET_RX:3 | 0002 | 0.098 ms | 6 | 0.124 ms | 111295.403760 s | 111295.403884 s |
(s)SCHED:7 | 0001 | 0.093 ms | 19 | 0.242 ms | 111301.180256 s | 111301.180498 s |
(s)SCHED:7 | 0007 | 0.078 ms | 15 | 0.188 ms | 111300.064226 s | 111300.064415 s |
(s)SCHED:7 | 0004 | 0.077 ms | 11 | 0.213 ms | 111301.361759 s | 111301.361973 s |
(s)SCHED:7 | 0000 | 0.063 ms | 33 | 0.805 ms | 111295.401811 s | 111295.402616 s |
(s)SCHED:7 | 0003 | 0.063 ms | 14 | 0.085 ms | 111301.192255 s | 111301.192340 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace softirq latency with cpu filter:
# perf kwork -k softirq lat -b -C 1
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0001 | 0.178 ms | 5 | 0.572 ms | 111435.534135 s | 111435.534707 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace softirq latency with name filter:
# perf kwork -k softirq lat -b -n SCHED
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)SCHED:7 | 0001 | 0.295 ms | 15 | 2.183 ms | 111452.534950 s | 111452.537133 s |
(s)SCHED:7 | 0002 | 0.215 ms | 10 | 0.315 ms | 111460.000238 s | 111460.000553 s |
(s)SCHED:7 | 0005 | 0.190 ms | 29 | 0.338 ms | 111457.032538 s | 111457.032876 s |
(s)SCHED:7 | 0003 | 0.097 ms | 10 | 0.319 ms | 111452.434351 s | 111452.434670 s |
(s)SCHED:7 | 0006 | 0.089 ms | 1 | 0.089 ms | 111450.737450 s | 111450.737539 s |
(s)SCHED:7 | 0007 | 0.085 ms | 17 | 0.169 ms | 111452.471333 s | 111452.471502 s |
(s)SCHED:7 | 0004 | 0.071 ms | 15 | 0.221 ms | 111452.535252 s | 111452.535473 s |
(s)SCHED:7 | 0000 | 0.044 ms | 32 | 0.130 ms | 111460.001982 s | 111460.002112 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-17-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements irq trace bpf function.
Test cases:
Trace irq without filter:
# perf kwork -k irq rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 31.026 ms | 285 | 1.493 ms | 110326.049963 s | 110326.051456 s |
eth0:10 | 0002 | 7.875 ms | 96 | 1.429 ms | 110313.916835 s | 110313.918264 s |
ata_piix:14 | 0002 | 2.510 ms | 28 | 0.396 ms | 110331.367987 s | 110331.368383 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with cpu filter:
# perf kwork -k irq rep -b -C 0
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 34.288 ms | 282 | 2.061 ms | 110358.078968 s | 110358.081029 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with name filter:
# perf kwork -k irq rep -b -n eth0
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
eth0:10 | 0002 | 2.184 ms | 21 | 0.572 ms | 110386.541699 s | 110386.542271 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with summary:
# perf kwork -k irq rep -b -S
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 42.923 ms | 285 | 1.181 ms | 110418.128867 s | 110418.130049 s |
eth0:10 | 0002 | 2.085 ms | 20 | 0.668 ms | 110416.002935 s | 110416.003603 s |
ata_piix:14 | 0002 | 0.970 ms | 4 | 0.656 ms | 110424.034482 s | 110424.035138 s |
--------------------------------------------------------------------------------------------------------------------------------
Total count : 309
Total runtime (msec) : 45.977 (0.003% load average)
Total time span (msec) : 17017.655
--------------------------------------------------------------------------------------------------------------------------------
Committer testing:
# perf kwork -k irq rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
nvme0q20:145 | 0019 | 0.570 ms | 28 | 0.064 ms | 26966.635102 s | 26966.635167 s |
amdgpu:162 | 0002 | 0.568 ms | 29 | 0.068 ms | 26966.644346 s | 26966.644414 s |
nvme0q4:129 | 0003 | 0.565 ms | 31 | 0.037 ms | 26966.614830 s | 26966.614866 s |
nvme0q16:141 | 0015 | 0.205 ms | 66 | 0.012 ms | 26967.145161 s | 26967.145174 s |
nvme0q29:154 | 0028 | 0.154 ms | 44 | 0.014 ms | 26967.078970 s | 26967.078984 s |
nvme0q10:135 | 0009 | 0.134 ms | 43 | 0.011 ms | 26967.132093 s | 26967.132104 s |
nvme0q2:127 | 0001 | 0.132 ms | 26 | 0.011 ms | 26966.883584 s | 26966.883595 s |
nvme0q25:150 | 0024 | 0.127 ms | 32 | 0.014 ms | 26966.631419 s | 26966.631433 s |
nvme0q14:139 | 0013 | 0.110 ms | 21 | 0.017 ms | 26966.760843 s | 26966.760861 s |
nvme0q30:155 | 0029 | 0.102 ms | 30 | 0.022 ms | 26966.677171 s | 26966.677193 s |
nvme0q13:138 | 0012 | 0.088 ms | 20 | 0.015 ms | 26966.738733 s | 26966.738748 s |
nvme0q6:131 | 0005 | 0.087 ms | 13 | 0.020 ms | 26966.648445 s | 26966.648465 s |
nvme0q28:153 | 0027 | 0.066 ms | 12 | 0.015 ms | 26966.771431 s | 26966.771447 s |
nvme0q26:151 | 0025 | 0.060 ms | 13 | 0.012 ms | 26966.704266 s | 26966.704278 s |
nvme0q21:146 | 0020 | 0.054 ms | 20 | 0.011 ms | 26967.322082 s | 26967.322094 s |
nvme0q1:126 | 0000 | 0.046 ms | 11 | 0.013 ms | 26966.859754 s | 26966.859767 s |
nvme0q17:142 | 0016 | 0.046 ms | 10 | 0.011 ms | 26967.114513 s | 26967.114524 s |
xhci_hcd:74 | 0015 | 0.041 ms | 3 | 0.016 ms | 26967.086004 s | 26967.086020 s |
nvme0q8:133 | 0007 | 0.039 ms | 12 | 0.008 ms | 26966.712056 s | 26966.712063 s |
nvme0q32:157 | 0031 | 0.036 ms | 10 | 0.014 ms | 26966.627054 s | 26966.627068 s |
nvme0q9:134 | 0008 | 0.036 ms | 11 | 0.011 ms | 26967.258452 s | 26967.258462 s |
nvme0q7:132 | 0006 | 0.024 ms | 3 | 0.014 ms | 26966.767404 s | 26966.767418 s |
nvme0q11:136 | 0010 | 0.023 ms | 5 | 0.006 ms | 26966.935455 s | 26966.935461 s |
nvme0q31:156 | 0030 | 0.018 ms | 5 | 0.006 ms | 26966.627517 s | 26966.627524 s |
nvme0q12:137 | 0011 | 0.015 ms | 2 | 0.014 ms | 26966.799588 s | 26966.799602 s |
enp5s0-rx-0:164 | 0006 | 0.009 ms | 2 | 0.005 ms | 26966.742024 s | 26966.742028 s |
enp5s0-rx-1:165 | 0007 | 0.006 ms | 2 | 0.004 ms | 26966.939486 s | 26966.939490 s |
enp5s0-tx-0:166 | 0008 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s |
enp5s0-tx-1:167 | 0009 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s |
--------------------------------------------------------------------------------------------------------------------------------
#t
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-16-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' generates perf.data, which generates extra interrupts
for hard disk, amount of data to be collected increases with time.
Using eBPF trace can process the data in kernel, which solves the
preceding two problems.
Add -b/--use-bpf option for latency and report to support
tracing kwork events using eBPF:
1. Create bpf prog and attach to tracepoints,
2. Start tracing after command is entered,
3. After user hit "ctrl+c", stop tracing and report,
4. Support CPU and name filtering.
This commit implements the framework code and
does not add specific event support.
Test cases:
# perf kwork rep -h
Usage: perf kwork report [<options>]
-b, --use-bpf Use BPF to measure kwork runtime
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-b, --use-bpf Use BPF to measure kwork latency
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -b
Unsupported bpf trace class irq
# perf kwork rep -b
Unsupported bpf trace class irq
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
[ Simplify work_findnew() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements workqueue latency function.
Test cases:
# perf kwork -k workqueue lat
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(w)vmstat_update | 0001 | 5.004 ms | 1 | 5.004 ms | 44001.745646 s | 44001.750650 s |
(w)vmstat_update | 0006 | 1.773 ms | 1 | 1.773 ms | 44000.830840 s | 44000.832613 s |
(w)vmstat_shepherd | 0000 | 0.992 ms | 8 | 2.474 ms | 44007.717845 s | 44007.720318 s |
(w)vmstat_update | 0000 | 0.974 ms | 5 | 2.624 ms | 44004.785970 s | 44004.788594 s |
(w)e1000_watchdog | 0002 | 0.687 ms | 5 | 2.632 ms | 44005.009334 s | 44005.011966 s |
(w)vmstat_update | 0002 | 0.307 ms | 1 | 0.307 ms | 44004.817395 s | 44004.817702 s |
(w)vmstat_update | 0004 | 0.296 ms | 1 | 0.296 ms | 43997.913677 s | 43997.913973 s |
(w)mix_interrupt_randomness | 0000 | 0.283 ms | 285 | 3.724 ms | 44006.790889 s | 44006.794613 s |
(w)neigh_managed_work | 0001 | 0.271 ms | 1 | 0.271 ms | 43997.665542 s | 43997.665813 s |
(w)vmstat_update | 0005 | 0.261 ms | 1 | 0.261 ms | 44007.820542 s | 44007.820803 s |
(w)neigh_managed_work | 0004 | 0.220 ms | 1 | 0.220 ms | 44002.953287 s | 44002.953507 s |
(w)neigh_periodic_work | 0004 | 0.217 ms | 1 | 0.217 ms | 43999.929718 s | 43999.929935 s |
(w)mix_interrupt_randomness | 0002 | 0.199 ms | 5 | 0.310 ms | 44005.012316 s | 44005.012625 s |
(w)vmstat_update | 0003 | 0.199 ms | 4 | 0.307 ms | 44005.714391 s | 44005.714699 s |
(w)gc_worker | 0001 | 0.071 ms | 173 | 1.128 ms | 44002.062579 s | 44002.063707 s |
--------------------------------------------------------------------------------------------------------------------------------
INFO: 0.020% skipped events (17 including 10 raise, 7 entry, 0 exit)
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-13-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements softirq latency function.
Test cases:
# perf kwork -k softirq lat
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)TIMER:1 | 0006 | 1.048 ms | 1 | 1.048 ms | 44000.829759 s | 44000.830807 s |
(s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s |
(s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s |
(s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s |
(s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s |
(s)SCHED:7 | 0000 | 0.158 ms | 677 | 2.639 ms | 44004.785716 s | 44004.788355 s |
... <SNIP> ...
(s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s |
(s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s |
(s)TIMER:1 | 0005 | 0.128 ms | 1 | 0.128 ms | 44007.820346 s | 44007.820474 s |
(s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k softirq lat -C 1,2
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s |
(s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s |
(s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s |
(s)NET_RX:3 | 0002 | 0.106 ms | 5 | 0.163 ms | 44005.012255 s | 44005.012418 s |
(s)TIMER:1 | 0002 | 0.084 ms | 9 | 0.114 ms | 44005.009168 s | 44005.009282 s |
(s)SCHED:7 | 0001 | 0.049 ms | 655 | 0.837 ms | 44005.707998 s | 44005.708835 s |
(s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k softirq lat -n RCU
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s |
(s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s |
(s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s |
(s)RCU:9 | 0004 | 0.237 ms | 26 | 0.792 ms | 43997.683018 s | 43997.683810 s |
(s)RCU:9 | 0007 | 0.217 ms | 140 | 1.335 ms | 43997.671080 s | 43997.672415 s |
(s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s |
(s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s |
(s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k softirq lat -s count,avg -n RCU
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0003 | 0.129 ms | 2193 | 1.507 ms | 44006.010208 s | 44006.011715 s |
(s)RCU:9 | 0001 | 0.216 ms | 1619 | 3.659 ms | 43997.662069 s | 43997.665727 s |
(s)RCU:9 | 0002 | 0.141 ms | 932 | 1.662 ms | 44005.010206 s | 44005.011868 s |
(s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s |
(s)RCU:9 | 0007 | 0.217 ms | 140 | 1.335 ms | 43997.671080 s | 43997.672415 s |
(s)RCU:9 | 0005 | 0.245 ms | 88 | 1.866 ms | 43997.683105 s | 43997.684971 s |
(s)RCU:9 | 0004 | 0.237 ms | 26 | 0.792 ms | 43997.683018 s | 43997.683810 s |
(s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k softirq lat --time 43997,
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)TIMER:1 | 0006 | 1.048 ms | 1 | 1.048 ms | 44000.829759 s | 44000.830807 s |
(s)TIMER:1 | 0001 | 1.008 ms | 4 | 3.434 ms | 43997.662069 s | 43997.665503 s |
(s)RCU:9 | 0006 | 0.675 ms | 7 | 1.328 ms | 43997.670304 s | 43997.671632 s |
(s)RCU:9 | 0000 | 0.414 ms | 701 | 3.996 ms | 43997.661170 s | 43997.665167 s |
(s)TIMER:1 | 0004 | 0.083 ms | 21 | 0.127 ms | 44004.969171 s | 44004.969298 s |
... <SNIP> ...
(s)SCHED:7 | 0005 | 0.050 ms | 4 | 0.086 ms | 43997.684852 s | 43997.684938 s |
(s)SCHED:7 | 0001 | 0.049 ms | 655 | 0.837 ms | 44005.707998 s | 44005.708835 s |
(s)SCHED:7 | 0007 | 0.044 ms | 171 | 0.077 ms | 43997.943265 s | 43997.943342 s |
(s)SCHED:7 | 0002 | 0.040 ms | 1731 | 0.211 ms | 44005.009237 s | 44005.009447 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-12-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements framework of perf kwork latency, which is used to report time
properties such as delay time and frequency.
Test cases:
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -C 199
Requested CPU 199 too large. Consider raising MAX_NR_CPUS
Invalid cpu bitmap
# perf kwork lat -i perf_no_exist.data
failed to open perf_no_exist.data: No such file or directory
# perf kwork lat -s avg1
Error: Unknown --sort key: `avg1'
Usage: perf kwork latency [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat --time FFFF,
Invalid time span
# perf kwork lat
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
INFO: 36.570% skipped events (31537 including 0 raise, 31537 entry, 0 exit)
Since there are no latency-enabled events, the output is empty.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-11-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements workqueue report function.
Test cases:
# perf kwork -k workqueue rep
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)gc_worker | 0001 | 1912.389 ms | 173 | 12.896 ms | 44002.050787 s | 44002.063683 s |
(w)mix_interrupt_randomness | 0000 | 24.308 ms | 285 | 3.349 ms | 44004.784908 s | 44004.788257 s |
(w)e1000_watchdog | 0002 | 5.332 ms | 5 | 2.059 ms | 44000.914366 s | 44000.916424 s |
(w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s |
(w)vmstat_shepherd | 0000 | 0.964 ms | 8 | 0.195 ms | 43997.986453 s | 43997.986648 s |
(w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s |
(w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s |
(w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s |
(w)mix_interrupt_randomness | 0002 | 0.114 ms | 5 | 0.037 ms | 44005.012625 s | 44005.012662 s |
(w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s |
(w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s |
(w)neigh_periodic_work | 0004 | 0.039 ms | 1 | 0.039 ms | 43999.929935 s | 43999.929974 s |
(w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s |
(w)neigh_managed_work | 0001 | 0.036 ms | 1 | 0.036 ms | 43997.665813 s | 43997.665849 s |
(w)neigh_managed_work | 0004 | 0.036 ms | 1 | 0.036 ms | 44002.953507 s | 44002.953543 s |
(w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k workqueue rep -S
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)gc_worker | 0001 | 1912.389 ms | 173 | 12.896 ms | 44002.050787 s | 44002.063683 s |
(w)mix_interrupt_randomness | 0000 | 24.308 ms | 285 | 3.349 ms | 44004.784908 s | 44004.788257 s |
(w)e1000_watchdog | 0002 | 5.332 ms | 5 | 2.059 ms | 44000.914366 s | 44000.916424 s |
(w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s |
(w)vmstat_shepherd | 0000 | 0.964 ms | 8 | 0.195 ms | 43997.986453 s | 43997.986648 s |
(w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s |
(w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s |
(w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s |
(w)mix_interrupt_randomness | 0002 | 0.114 ms | 5 | 0.037 ms | 44005.012625 s | 44005.012662 s |
(w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s |
(w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s |
(w)neigh_periodic_work | 0004 | 0.039 ms | 1 | 0.039 ms | 43999.929935 s | 43999.929974 s |
(w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s |
(w)neigh_managed_work | 0001 | 0.036 ms | 1 | 0.036 ms | 43997.665813 s | 43997.665849 s |
(w)neigh_managed_work | 0004 | 0.036 ms | 1 | 0.036 ms | 44002.953507 s | 44002.953543 s |
(w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s |
--------------------------------------------------------------------------------------------------------------------------------
Total count : 500
Total runtime (msec) : 1945.085 (0.192% load average)
Total time span (msec) : 10155.026
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k workqueue rep -n vmstat_update
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)vmstat_update | 0005 | 0.989 ms | 2 | 0.953 ms | 43997.986991 s | 43997.987944 s |
(w)vmstat_update | 0003 | 0.306 ms | 6 | 0.077 ms | 44004.689543 s | 44004.689620 s |
(w)vmstat_update | 0000 | 0.196 ms | 5 | 0.049 ms | 44005.713732 s | 44005.713781 s |
(w)vmstat_update | 0001 | 0.162 ms | 2 | 0.130 ms | 44000.192034 s | 44000.192164 s |
(w)vmstat_update | 0002 | 0.084 ms | 2 | 0.043 ms | 44004.817702 s | 44004.817745 s |
(w)vmstat_update | 0006 | 0.067 ms | 2 | 0.041 ms | 43997.987214 s | 43997.987254 s |
(w)vmstat_update | 0007 | 0.037 ms | 1 | 0.037 ms | 43997.988969 s | 43997.989006 s |
(w)vmstat_update | 0004 | 0.027 ms | 1 | 0.027 ms | 43997.913973 s | 43997.914000 s |
--------------------------------------------------------------------------------------------------------------------------------
Committer testing:
# perf kwork -k workqueue rep -C 1 | head -20
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)commit_work | 0001 | 25.896 ms | 2 | 13.200 ms | 26522.906700 s | 26522.919900 s |
(w)commit_work | 0001 | 13.316 ms | 1 | 13.316 ms | 26522.573246 s | 26522.586562 s |
(w)commit_work | 0001 | 13.177 ms | 1 | 13.177 ms | 26522.673406 s | 26522.686583 s |
(w)commit_work | 0001 | 12.630 ms | 1 | 12.630 ms | 26522.123921 s | 26522.136551 s |
(w)btrfs_work_helper | 0001 | 3.544 ms | 1 | 3.544 ms | 26529.131296 s | 26529.134840 s |
(w)btrfs_work_helper | 0001 | 3.330 ms | 1 | 3.330 ms | 26529.137698 s | 26529.141028 s |
(w)btrfs_work_helper | 0001 | 2.855 ms | 1 | 2.855 ms | 26529.134842 s | 26529.137697 s |
(w)btrfs_work_helper | 0001 | 2.757 ms | 1 | 2.757 ms | 26529.124086 s | 26529.126843 s |
(w)btrfs_work_helper | 0001 | 2.182 ms | 1 | 2.182 ms | 26529.141030 s | 26529.143212 s |
(w)btrfs_work_helper | 0001 | 1.743 ms | 1 | 1.743 ms | 26520.415335 s | 26520.417078 s |
(w)btrfs_work_helper | 0001 | 1.499 ms | 1 | 1.499 ms | 26529.127774 s | 26529.129272 s |
(w)btrfs_work_helper | 0001 | 1.446 ms | 1 | 1.446 ms | 26529.129848 s | 26529.131294 s |
(w)btrfs_work_helper | 0001 | 1.373 ms | 1 | 1.373 ms | 26523.808270 s | 26523.809643 s |
(w)wb_workfn | 0001 | 1.165 ms | 2 | 0.763 ms | 26527.071056 s | 26527.071819 s |
(w)btrfs_work_helper | 0001 | 0.926 ms | 1 | 0.926 ms | 26529.126846 s | 26529.127771 s |
(w)btrfs_work_helper | 0001 | 0.571 ms | 1 | 0.571 ms | 26529.129275 s | 26529.129846 s |
(w)wb_workfn | 0001 | 0.525 ms | 1 | 0.525 ms | 26522.975151 s | 26522.975676 s |
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-10-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements softirq kwork report function.
Test cases:
# perf kwork -k softirq rep
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s |
(s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s |
(s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s |
(s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s |
... <SNIP> ...
(s)RCU:9 | 0004 | 0.830 ms | 26 | 0.058 ms | 43997.666418 s | 43997.666476 s |
(s)TIMER:1 | 0001 | 0.471 ms | 4 | 0.158 ms | 44007.834694 s | 44007.834852 s |
(s)RCU:9 | 0006 | 0.220 ms | 7 | 0.048 ms | 44004.833764 s | 44004.833812 s |
(s)NET_RX:3 | 0002 | 0.164 ms | 5 | 0.049 ms | 44005.012418 s | 44005.012466 s |
(s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s |
(s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s |
(s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s |
--------------------------------------------------------------------------------------------------------------------------------
#
# perf kwork -k softirq rep -S
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s |
(s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s |
(s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s |
(s)SCHED:7 | 0000 | 63.631 ms | 680 | 2.690 ms | 44006.721976 s | 44006.724666 s |
... <SNIP> ...
(s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s |
(s)RCU:9 | 0006 | 0.220 ms | 7 | 0.048 ms | 44004.833764 s | 44004.833812 s |
(s)NET_RX:3 | 0002 | 0.164 ms | 5 | 0.049 ms | 44005.012418 s | 44005.012466 s |
(s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s |
(s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s |
(s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s |
--------------------------------------------------------------------------------------------------------------------------------
Total count : 12748
Total runtime (msec) : 661.433 (0.065% load average)
Total time span (msec) : 10176.441
--------------------------------------------------------------------------------------------------------------------------------
#
# perf kwork -k softirq rep -s count,max
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(s)TIMER:1 | 0003 | 181.387 ms | 2476 | 1.240 ms | 44004.787960 s | 44004.789201 s |
(s)RCU:9 | 0003 | 91.573 ms | 2193 | 0.650 ms | 44004.790258 s | 44004.790908 s |
(s)SCHED:7 | 0002 | 50.039 ms | 1731 | 0.074 ms | 44005.009447 s | 44005.009521 s |
(s)RCU:9 | 0001 | 78.960 ms | 1619 | 1.195 ms | 44001.496553 s | 44001.497749 s |
(s)SCHED:7 | 0003 | 55.962 ms | 1255 | 0.954 ms | 44004.812008 s | 44004.812962 s |
... <SNIP> ...
(s)RCU:9 | 0002 | 35.241 ms | 932 | 0.407 ms | 44005.009541 s | 44005.009949 s |
(s)RCU:9 | 0000 | 45.710 ms | 702 | 1.144 ms | 44004.787023 s | 44004.788167 s |
(s)SCHED:7 | 0006 | 0.080 ms | 2 | 0.044 ms | 43997.826145 s | 43997.826189 s |
(s)TIMER:1 | 0005 | 0.164 ms | 1 | 0.164 ms | 44007.820474 s | 44007.820638 s |
(s)TIMER:1 | 0006 | 0.087 ms | 1 | 0.087 ms | 44000.830807 s | 44000.830894 s |
--------------------------------------------------------------------------------------------------------------------------------
Committer testing:
# perf kwork -k softirq report -C 2 -s count,max
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(s)SCHED:7 | 0002 | 0.980 ms | 159 | 0.024 ms | 26035.571037 s | 26035.571061 s |
(s)RCU:9 | 0002 | 0.124 ms | 88 | 0.021 ms | 26035.177050 s | 26035.177071 s |
(s)TIMER:1 | 0002 | 0.122 ms | 56 | 0.007 ms | 26035.468045 s | 26035.468052 s |
--------------------------------------------------------------------------------------------------------------------------------
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-9-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements irq kwork report function.
Test cases:
# perf kwork record -- sleep 10
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 6.134 MB perf.data ]
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s |
eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -C 2
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -C 3
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -i perf.data
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s |
eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -s max,freq
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s |
eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -S
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 1167.501 ms | 18284 | 1.096 ms | 44004.464905 s | 44004.466001 s |
eth0:10 | 0002 | 0.185 ms | 5 | 0.058 ms | 44005.012222 s | 44005.012280 s |
--------------------------------------------------------------------------------------------------------------------------------
Total count : 18289
Total runtime (msec) : 1167.686 (0.115% load average)
Total time span (msec) : 10159.155
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report --time 44005,
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 402.173 ms | 4695 | 0.981 ms | 44007.831992 s | 44007.832973 s |
eth0:10 | 0002 | 0.089 ms | 2 | 0.058 ms | 44005.012222 s | 44005.012280 s |
--------------------------------------------------------------------------------------------------------------------------------
Committer testing:
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
nvme0q5:130 | 0004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s |
amdgpu:162 | 0002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s |
nvme0q24:149 | 0023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s |
nvme0q20:145 | 0019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s |
nvme0q31:156 | 0030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s |
nvme0q8:133 | 0007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s |
nvme0q6:131 | 0005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s |
nvme0q19:144 | 0018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s |
nvme0q7:132 | 0006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s |
nvme0q18:143 | 0017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s |
nvme0q17:142 | 0016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s |
enp5s0-rx-0:164 | 0006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s |
enp5s0-tx-0:166 | 0008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s |
--------------------------------------------------------------------------------------------------------------------------------
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-8-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements framework of 'perf kwork report', which is used to report
time properties such as run time and frequency:
Test cases:
# perf kwork
Usage: perf kwork [<options>] {record|report}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, etc)
-v, --verbose be more verbose (show symbol address, etc)
# perf kwork report -h
Usage: perf kwork report [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -S
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
Total count : 0
Total runtime (msec) : 0.000 (0.000% load average)
Total time span (msec) : 0.000
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork report -C 0,100
Requested CPU 100 too large. Consider raising MAX_NR_CPUS
Invalid cpu bitmap
# perf kwork report -s runtime1
Error: Unknown --sort key: `runtime1'
Usage: perf kwork report [<options>]
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork report -i perf_no_exist.data
failed to open perf_no_exist.data: No such file or directory
# perf kwork report --time 00FFF,
Invalid time span
Since there are no report supported events, the output is empty.
Briefly describe the data structure:
1. "class" indicates event type. For example, irq and softiq correspond
to different types.
2. "cluster" refers to a specific event corresponding to a type. For
example, RCU and TIMER in softirq correspond to different clusters,
which contains three types of events: raise, entry, and exit.
3. "atom" includes time of each sample and sample of the previous phase.
(For example, exit corresponds to entry, which is used for timehist.)
Committer notes:
- Add {} for multiline if blocks.
- report_print_work() should either return that ret variable that
accounts how many bytes were printed or stop accounting and be void.
Do the former for now to avoid this:
builtin-kwork.c:534:6: error: variable 'ret' set but not used [-Werror,-Wunused-but-set-variable]
int ret = 0;
^
1 error generated.
When building with:
⬢[acme@toolbox perf]$ clang --version
clang version 13.0.0 (https://github.com/llvm/llvm-project e8991caea8690ec2d17b0b7e1c29bf0da6609076)
Also:
- if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {
+ if (dst_type < KWORK_TRACE_MAX) {
Several versions of clang and at least this gcc:
3 51.40 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0)
builtin-kwork.c:411:16: error: comparison of unsigned enum expression >= 0 is
always true [-Werror,-Wtautological-compare]
if ((dst_type >= 0) && (dst_type < KWORK_TRACE_MAX)) {
As the first entry in a enum is zero.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-7-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add list_last_entry_or_null() to get the last element from a list,
returns NULL if the list is empty.
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-6-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Record interrupt events irq:irq_handler_entry & irq_handler_exit
Test cases:
# perf kwork record -o perf_kwork.date -- sleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 0.556 MB perf_kwork.date ]
#
# perf evlist -i perf_kwork.date
irq:irq_handler_entry
irq:irq_handler_exit
dummy:HG
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-3-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf kwork' tool is used to trace time properties of kernel work
(such as irq, softirq, and workqueue), including runtime, latency, and
timehist, using the infrastructure in the perf tools to allow tracing
extra targets.
This is the first commit to reuse the 'perf record' framework code to
implement a simple record function, kwork is not supported currently.
Test cases:
# perf
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
The most commonly used perf commands are:
<SNIP>
iostat Show I/O performance metrics
kallsyms Searches running kernel for symbols
kmem Tool to trace/measure kernel memory properties
kvm Tool to trace/measure kvm guest os
kwork Tool to trace/measure kernel work properties (latencies)
list List all symbolic event types
lock Analyze lock events
mem Profile memory accesses
record Run a command and record its profile into perf.data
<SNIP>
See 'perf help COMMAND' for more information on a specific command.
# perf kwork
Usage: perf kwork [<options>] {record}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-k, --kwork <kwork> list of kwork to profile
-v, --verbose be more verbose (show symbol address, etc)
# perf kwork record -- sleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1.787 MB perf.data ]
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-2-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
test_probe_user fails on architectures where libc uses
socketcall(SYS_CONNECT) instead of connect(). Fix by attaching
to socketcall as well.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220726134008.256968-3-iii@linux.ibm.com
Explicitly list known quirks. Mention that socket-related syscalls can be
invoked via socketcall().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220726134008.256968-2-iii@linux.ibm.com
The previous commit fixed a bug in the bpf_skb_set_tunnel_key helper to
avoid dropping packets whose outer source IP address isn't assigned to a
host interface. This commit changes the corresponding selftest to not
assign the outer source IP address to an interface.
Not assigning the source IP to an interface causes two issues in the
existing test:
1. The ARP requests will fail for that IP address so we need to add the
ARP entry manually.
2. The encapsulated ICMP echo reply traffic will not reach the VXLAN
device. It will be dropped by the stack before, because the
outer destination IP is unknown.
To solve 2., we have two choices. Either we perform decapsulation
ourselves in a BPF program attached at veth1 (the base device for the
VXLAN device), or we switch the outer destination address when we
receive the packet at veth1, such that the stack properly demultiplexes
it to the VXLAN device afterward.
This commit implements the second approach, where we switch the outer
destination address from the unassigned IP address to the assigned one,
only for VXLAN traffic ingressing veth1.
Then, at the vxlan device, the BPF program that checks the output of
bpf_skb_get_tunnel_key needs to be updated as the expected local IP
address is now the unassigned one.
Signed-off-by: Paul Chaignon <paul@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/4addde76eaf3477a58975bef15ed2788c44e5f55.1658759380.git.paul@isovalent.com
Noticed when processing 'perf kwork' that includes util/data.h without,
by luck, having included unistd.h indirectly to get the pid_t typedef.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like perf lock report, it can report lock contention stat of each task.
$ perf lock contention -t
contended total wait max wait avg wait pid comm
5 945.20 us 902.08 us 189.04 us 316167 EventManager_De
33 98.17 us 6.78 us 2.97 us 766063 kworker/0:1-get
7 92.47 us 61.26 us 13.21 us 316170 EventManager_De
14 76.31 us 12.87 us 5.45 us 12949 timedcall
24 76.15 us 12.27 us 3.17 us 767992 sched-pipe
15 75.62 us 11.93 us 5.04 us 15127 switchto-defaul
24 71.84 us 5.59 us 2.99 us 629168 kworker/u513:2-
17 67.41 us 7.94 us 3.96 us 13504 coroner-
1 59.56 us 59.56 us 59.56 us 316165 EventManager_De
14 56.21 us 6.89 us 4.01 us 0 swapper
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like perf lock report, add -k/--key and -F/--field options to control
output formatting and sorting. Note that it has slightly different
default options as some fields are not available and to optimize the
screen space.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf lock contention' processes the lock contention events and
displays the result like perf lock report. Right now, there's not
much difference between the two but the lock contention specific
features will come soon.
$ perf lock contention
contended total wait max wait avg wait type caller
238 1.41 ms 29.20 us 5.94 us spinlock update_blocked_averages+0x4c
1 902.08 us 902.08 us 902.08 us rwsem:R do_user_addr_fault+0x1dd
81 330.30 us 17.24 us 4.08 us spinlock _nohz_idle_balance+0x172
2 89.54 us 61.26 us 44.77 us spinlock do_anonymous_page+0x16d
24 78.36 us 12.27 us 3.27 us mutex pipe_read+0x56
2 71.58 us 59.56 us 35.79 us spinlock __handle_mm_fault+0x6aa
6 25.68 us 6.89 us 4.28 us spinlock do_idle+0x28d
1 18.46 us 18.46 us 18.46 us rtmutex exec_fw_cmd+0x21b
3 15.25 us 6.26 us 5.08 us spinlock tick_do_update_jiffies64+0x2c
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce the aggr_mode variable to prepare a later code change.
The default is LOCK_AGGR_ADDR which aggregates the result for the lock
instances.
When -t/--threads option is given, it'd be set to LOCK_AGGR_TASK. The
LOCK_AGGR_CALLER is for the contention analysis and it'd aggregate the
stat by comparing the callstacks.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For lock contention tracepoint analysis, it needs to keep the flags.
As nr_readlock and nr_trylock fields are not used for it, let's make
it a union.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220725183124.368304-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After all the soft validation of the region has completed, convey the
region configuration to hardware while being careful to commit decoders
in specification mandated order. In addition to programming the endpoint
decoder base-address, interleave ways and granularity, the switch
decoder target lists are also established.
While the kernel can enforce spec-mandated commit order, it can not
enforce spec-mandated reset order. For example, the kernel can't stop
someone from removing an endpoint device that is occupying decoderN in a
switch decoder where decoderN+1 is also committed. To reset decoderN,
decoderN+1 must be torn down first. That "tear down the world"
implementation is saved for a follow-on patch.
Callback operations are provided for the 'commit' and 'reset'
operations. While those callbacks may prove useful for CXL accelerators
(Type-2 devices with memory) the primary motivation is to enable a
simple way for cxl_test to intercept those operations.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784338418.1758207.14659830845389904356.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The value should be non-zero on Intel while zero on everything else.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The CPUID method of arch_get_tsc_freq fails for older Intel processors,
such as Skylake. Compute using /proc/cpuinfo.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The TSC frequency information is required for the event metrics with the
literal, system_tsc_freq. For the newer Intel platform, the TSC
frequency information can be retrieved from the CPUID leaf 0x15. If the
TSC frequency information isn't present the /proc/cpuinfo approach is
used.
Refactor cpuid() for this use. Note, the previous stack pushing/popping
approach was broken on x86-64 that has stack red zones that would be
clobbered.
Committer testing:
Before:
$ perf record sleep 0.0001
[ perf record: Woken up 1 times to write data ]
$ perf report --header-only |& grep cpuid
# cpuid : AuthenticAMD,25,33,0
$
After the patch:
$ perf record sleep 0.0001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (8 samples) ]
$ perf report --header-only |& grep cpuid
# cpuid : AuthenticAMD,25,33,0
$
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the ptrace-gpr test only tests the GET/SET(FP)REGS ptrace
APIs. But there's an alternate (older) API, called PEEK/POKEUSR.
Add some minimal testing of PEEK/POKEUSR of the FPRs. This is sufficient
to detect the bug that was fixed recently in the 32-bit ptrace FPR
handling.
Depends-on: 8e12784444 ("powerpc/32: Fix overread/overwrite of thread_struct via ptrace")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220627140239.2464900-13-mpe@ellerman.id.au
The ptrace-gpr test uses fixed values to test that registers can be
read/written via ptrace. In particular it sets all GPRs to 1, which
means the test could miss some types of bugs - eg. if the kernel was
only returning the low word.
So generate some random values at startup and use those instead.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220627140239.2464900-12-mpe@ellerman.id.au
The ptrace-gpr test includes some inline asm to load GPR and FPR
registers. It then goes back to C to wait for the parent to trace it and
then checks register contents.
The split between inline asm and C is fragile, it relies on the compiler
not using any non-volatile GPRs after the inline asm block. It also
requires a very large and unwieldy inline asm block.
So convert the logic to set registers, wait, and store registers to a
single asm function, meaning there's no window for the compiler to
intervene.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220627140239.2464900-10-mpe@ellerman.id.au
Some of the ptrace tests check the contents of floating pointer
registers. Currently these use float, which is always 4 bytes, but the
ptrace API supports saving/restoring 8 bytes per register, so switch to
using doubles to exercise the code more fully.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220627140239.2464900-8-mpe@ellerman.id.au