WSL2-Linux-Kernel/tools/perf
Waiman Long f9ceffb605 perf symbols: Fix vdso list searching
When "perf record" was used on a large machine with a lot of CPUs, the
perf post-processing time (the time after the workload was done until
the perf command itself exited) could take a lot of minutes and even
hours depending on how large the resulting perf.data file was.

While running AIM7 1500-user high_systime workload on a 80-core x86-64
system with a 3.9 kernel (with only the -s -a options used), the
workload itself took about 2 minutes to run and the perf.data file had a
size of 1108.746 MB. However, the post-processing step took more than 10
minutes.

With a gprof-profiled perf binary, the time spent by perf was as
follows:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 96.90    822.10   822.10   192156     0.00     0.00  dsos__find
  0.81    828.96     6.86 172089958     0.00     0.00  rb_next
  0.41    832.44     3.48 48539289     0.00     0.00  rb_erase

So 97% (822 seconds) of the time was spent in a single dsos_find()
function. After analyzing the call-graph data below:

 -----------------------------------------------
                 0.00  822.12  192156/192156      map__new [6]
 [7]     96.9    0.00  822.12  192156         vdso__dso_findnew [7]
               822.10    0.00  192156/192156      dsos__find [8]
                 0.01    0.00  192156/192156      dsos__add [62]
                 0.01    0.00  192156/192366      dso__new [61]
                 0.00    0.00       1/45282525     memdup [31]
                 0.00    0.00  192156/192230      dso__set_long_name [91]
 -----------------------------------------------
               822.10    0.00  192156/192156      vdso__dso_findnew [7]
 [8]     96.9  822.10    0.00  192156         dsos__find [8]
 -----------------------------------------------

It was found that the vdso__dso_findnew() function failed to locate
VDSO__MAP_NAME ("[vdso]") in the dso list and have to insert a new
entry at the end for 192156 times. This problem is due to the fact that
there are 2 types of name in the dso entry - short name and long name.
The initial dso__new() adds "[vdso]" to both the short and long names.
After that, vdso__dso_findnew() modifies the long name to something
like /tmp/perf-vdso.so-NoXkDj. The dsos__find() function only compares
the long name. As a result, the same vdso entry is duplicated many
time in the dso list. This bug increases memory consumption as well
as slows the symbol processing time to a crawl.

To resolve this problem, the dsos__find() function interface was
modified to enable searching either the long name or the short
name. The vdso__dso_findnew() will now search only the short name
while the other call sites search for the long name as before.

With this change, the cpu time of perf was reduced from 848.38s to
15.77s and dsos__find() only accounted for 0.06% of the total time.

  0.06     15.73     0.01   192151     0.00     0.00  dsos__find

Signed-off-by: Waiman Long <Waiman.Long@hp.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: "Chandramouleeswaran, Aswin" <aswin@hp.com>
Cc: "Norton, Scott J" <scott.norton@hp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1368110568-64714-1-git-send-email-Waiman.Long@hp.com
[ replaced TRUE/FALSE with stdbool.h equivalents, fixing builds where
  those macros are not present (NO_LIBPYTHON=1 NO_LIBPERL=1), fix from Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-07-08 17:59:07 -03:00
..
Documentation perf record: Remove -f/--force option 2013-07-08 17:37:25 -03:00
arch perf tools: Fix build on non-glibc systems due to libio.h absence 2013-03-15 13:05:13 -03:00
bench perf bench: Fix memory allocation fail check in mem{set,cpy} workloads 2013-07-08 17:35:40 -03:00
config perf tools: Fix build errors with O and DESTDIR make vars set 2013-07-08 17:34:00 -03:00
python
scripts perf: net_dropmonitor: Remove progress indicator 2013-05-22 15:10:11 -07:00
tests perf tests: Fix exclude_guest|exclude_host checking for attr tests 2013-05-29 15:15:19 +03:00
ui perf top: Add --percent-limit option 2013-05-28 16:24:01 +03:00
util perf symbols: Fix vdso list searching 2013-07-08 17:59:07 -03:00
.gitignore perf tools: Ignore compiled python binaries 2012-09-07 12:10:58 -03:00
CREDITS
MANIFEST perf tools: Introduce tools/lib/lk library 2013-03-15 13:06:00 -03:00
Makefile perf tools: Fix build errors with O and DESTDIR make vars set 2013-07-08 17:34:00 -03:00
bash_completion perf tools: Complete tracepoint event names 2012-10-04 12:44:52 -03:00
builtin-annotate.c perf tools: Add support for weight v7 (modified) 2013-04-01 12:19:43 -03:00
builtin-bench.c perf tools: Make numa benchmark optional 2013-01-30 10:36:21 -03:00
builtin-buildid-cache.c perf buildid-cache: Add --update option 2013-02-14 14:59:27 -03:00
builtin-buildid-list.c perf symbols: Generalize filter in __fprintf_buildid methods 2012-12-09 08:46:07 -03:00
builtin-diff.c perf tools: Fix -x/--exclude-other option for report command 2013-07-08 17:38:53 -03:00
builtin-evlist.c perf evlist: Pass the event_group info via perf_attr_details 2013-02-06 18:09:28 -03:00
builtin-help.c perf help: Fix --help for builtins 2012-10-22 12:35:49 -02:00
builtin-inject.c perf inject: Mark a dso if it's used 2012-10-26 11:22:25 -02:00
builtin-kmem.c perf record: Remove -f/--force option 2013-07-08 17:37:25 -03:00
builtin-kvm.c perf kvm: Handle realloc failures 2013-05-28 16:24:04 +03:00
builtin-list.c perf tools: Use __maybe_used for unused variables 2012-09-11 12:19:15 -03:00
builtin-lock.c perf record: Remove -f/--force option 2013-07-08 17:37:25 -03:00
builtin-mem.c perf tools: Add new mem command for memory access profiling 2013-04-01 12:21:44 -03:00
builtin-probe.c perf tools: Introduce tools/lib/lk library 2013-03-15 13:06:00 -03:00
builtin-record.c perf record: Remove -f/--force option 2013-07-08 17:37:25 -03:00
builtin-report.c perf tools: Fix -x/--exclude-other option for report command 2013-07-08 17:38:53 -03:00
builtin-sched.c perf record: Remove -f/--force option 2013-07-08 17:37:25 -03:00
builtin-script.c perf script: Don't display trace info when invoking scripts 2013-01-24 16:40:52 -03:00
builtin-stat.c perf stat: Avoid sending SIGTERM to random processes 2013-07-08 17:36:33 -03:00
builtin-timechart.c perf record: Remove -f/--force option 2013-07-08 17:37:25 -03:00
builtin-top.c perf tools: Fix -x/--exclude-other option for report command 2013-07-08 17:38:53 -03:00
builtin-trace.c perf trace: Free evlist resources properly on return path 2013-03-15 13:06:11 -03:00
builtin.h perf tools: Add new mem command for memory access profiling 2013-04-01 12:21:44 -03:00
command-list.txt perf tools: Add new mem command for memory access profiling 2013-04-01 12:21:44 -03:00
design.txt perf tools: Update ioctl documentation for PERF_IOC_FLAG_GROUP 2012-05-31 11:38:42 -03:00
perf-archive.sh perf archive: Make 'f' the last parameter for tar 2012-09-17 13:10:42 -03:00
perf.c perf tools: Convert needless static variable to local 2013-04-01 12:22:48 -03:00
perf.h perf tools: Add support for weight v7 (modified) 2013-04-01 12:19:43 -03:00