Merge tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt.

Tracing updates for 5.14:

 - Added option for per CPU threads to the hwlat tracer

 - Have hwlat tracer handle hotplug CPUs

 - New tracer: osnoise, that detects latency caused by interrupts,
   softirqs and scheduling of other tasks.

 - Added timerlat tracer that creates a thread and measures in detail
   what sources of latency it has for wake ups.

 - Removed the "success" field of the sched_wakeup trace event. This has
   been hardcoded as "1" since 2015, no tooling should be looking at it
   now. If one exists, we can revert this commit, fix that tool and try
   to remove it again in the future.

 - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.

 - New boot command line option "tp_printk_stop", as tp_printk causes
   trace events to write to console. When user space starts, this can
   easily live lock the system. Having a boot option to stop just after
   boot up is useful to prevent that from happening.

 - Have ftrace_dump_on_oops boot command line option take numbers that
   match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.

 - Bootconfig clean ups, fixes and enhancements.

 - New ktest script that tests bootconfig options.

 - Add tracepoint_probe_register_may_exist() to register a tracepoint
   without triggering a WARN*() if it already exists. BPF has a path
   from user space that can do this. All other paths are considered a
   bug.

 - Small clean ups and fixes

* tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits)
  tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
  tracing: Simplify & fix saved_tgids logic
  treewide: Add missing semicolons to __assign_str uses
  tracing: Change variable type as bool for clean-up
  trace/timerlat: Fix indentation on timerlat_main()
  trace/osnoise: Make 'noise' variable s64 in run_osnoise()
  tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  tracing: Fix spelling in osnoise tracer "interferences" -> "interference"
  Documentation: Fix a typo on trace/osnoise-tracer
  trace/osnoise: Fix return value on osnoise_init_hotplug_support
  trace/osnoise: Make interval u64 on osnoise_main
  trace/osnoise: Fix 'no previous prototype' warnings
  tracing: Have osnoise_main() add a quiescent state for task rcu
  seq_buf: Make trace_seq_putmem_hex() support data longer than 8
  seq_buf: Fix overflow in seq_buf_putmem_hex()
  trace/osnoise: Support hotplug operations
  trace/hwlat: Support hotplug operations
  trace/hwlat: Protect kdata->kthread with get/put_online_cpus
  trace: Add timerlat tracer
  trace: Add osnoise tracer
  ...
Commit 757fa80f4e
@@ -89,13 +89,35 @@ you can use ``+=`` operator. For example::

 In this case, the key ``foo`` has ``bar``, ``baz`` and ``qux``.

-However, a sub-key and a value can not co-exist under a parent key.
-For example, following config is NOT allowed.::
+Moreover, sub-keys and a value can coexist under a parent key.
+For example, following config is allowed.::

  foo = value1
- foo.bar = value2 # !ERROR! subkey "bar" and value "value1" can NOT co-exist
- foo.bar := value2 # !ERROR! even with the override operator, this is NOT allowed.
+ foo.bar = value2
+ foo := value3 # This will update foo's value.
+
+Note, since there is no syntax to put a raw value directly under a
+structured key, you have to define it outside of the brace. For example::
+
+ foo {
+     bar = value1
+     bar {
+         baz = value2
+         qux = value3
+     }
+ }
+
+Also, the order of the value node under a key is fixed. If there
+are a value and subkeys, the value is always the first child node
+of the key. Thus if user specifies subkeys first, e.g.::
+
+ foo.bar = value1
+ foo = value2
+
+In the program (and /proc/bootconfig), it will be shown as below::
+
+ foo = value2
+ foo.bar = value1

 Comments
 --------

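The ordering rule in the hunk above (a key's value always precedes its subkeys, whatever order the user wrote them in) can be sketched in a few lines of Python. This is a toy model, not the kernel's bootconfig parser: `parse_bootconfig` and `dump` are hypothetical names, and only flat dotted keys with the `=` and `:=` operators are handled (no braces, arrays, or `+=`). Treating a second `=` on the same key as an error is an assumption of this sketch.

```python
def _new_node():
    # Each key holds an optional value plus its subkeys in insertion order.
    return {"value": None, "children": {}}


def parse_bootconfig(text):
    """Parse 'a.b = v' and 'a.b := v' lines into a nested key tree."""
    root = _new_node()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        lhs, _, value = line.partition("=")
        lhs = lhs.rstrip()
        override = lhs.endswith(":")  # ':=' replaces an existing value
        key = lhs.rstrip(":").strip()
        node = root
        for part in key.split("."):
            node = node["children"].setdefault(part, _new_node())
        if node["value"] is not None and not override:
            raise ValueError(f"{key}: value already set, use ':=' to override")
        node["value"] = value.strip()
    return root


def dump(node, prefix=""):
    """List 'key = value' lines with a key's value ahead of its subkeys."""
    out = []
    for name, child in node["children"].items():
        key = prefix + name
        if child["value"] is not None:
            out.append(f"{key} = {child['value']}")
        out.extend(dump(child, key + "."))
    return out
```

Feeding it `foo.bar = value1` followed by `foo = value2` yields the value line first, which matches the /proc/bootconfig ordering described in the documentation text.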
@@ -5672,12 +5672,25 @@
 			Note, echoing 1 into this file without the
 			tracepoint_printk kernel cmdline option has no effect.

+			The tp_printk_stop_on_boot (see below) can also be used
+			to stop the printing of events to console at
+			late_initcall_sync.
+
 			** CAUTION **

 			Having tracepoints sent to printk() and activating high
 			frequency tracepoints such as irq or sched, can cause
 			the system to live lock.

+	tp_printk_stop_on_boot[FTRACE]
+			When tp_printk (above) is set, it can cause a lot of noise
+			on the console. It may be useful to only include the
+			printing of events during boot up, as user space may
+			make the system inoperable.
+
+			This command line option will stop the printing of events
+			to console at the late_initcall_sync() time frame.
+
 	traceoff_on_warning
 			[FTRACE] enable this option to disable tracing when a
 			warning is hit. This turns off "tracing_on". Tracing can
@@ -99,6 +99,12 @@ These options are setting per-event options.
 ftrace.[instance.INSTANCE.]event.GROUP.EVENT.enable
    Enable GROUP:EVENT tracing.

+ftrace.[instance.INSTANCE.]event.GROUP.enable
+   Enable all event tracing within GROUP.
+
+ftrace.[instance.INSTANCE.]event.enable
+   Enable all event tracing.
+
 ftrace.[instance.INSTANCE.]event.GROUP.EVENT.filter = FILTER
    Set FILTER rule to the GROUP:EVENT.

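The three `enable` keys in this hunk differ only in how much of the `GROUP.EVENT` path is spelled out. A small sketch of that key layout follows; `event_key` is a hypothetical helper written for illustration, and the instance name `foo` used in the note below is made up.

```python
def event_key(suffix, group=None, event=None, instance=None):
    """Build a per-event bootconfig key following the documented pattern
    ftrace.[instance.INSTANCE.]event[.GROUP[.EVENT]].SUFFIX."""
    if event is not None and group is None:
        raise ValueError("an EVENT can only be named together with its GROUP")
    parts = ["ftrace"]
    if instance is not None:
        parts += ["instance", instance]
    parts.append("event")
    if group is not None:
        parts.append(group)
    if event is not None:
        parts.append(event)
    parts.append(suffix)
    return ".".join(parts)
```

For instance, `event_key("enable", group="sched")` gives `ftrace.event.sched.enable`, the group-wide form added by this change, while passing `instance="foo"` inserts `instance.foo` after `ftrace`.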
@@ -76,8 +76,13 @@ in /sys/kernel/tracing:
  - tracing_cpumask - the CPUs to move the hwlat thread across
  - hwlat_detector/width - specified amount of time to spin within window (usecs)
  - hwlat_detector/window - amount of time between (width) runs (usecs)
+ - hwlat_detector/mode - the thread mode

-The hwlat detector's kernel thread will migrate across each CPU specified in
-tracing_cpumask between each window. To limit the migration, either modify
-tracing_cpumask, or modify the hwlat kernel thread (named [hwlatd]) CPU
-affinity directly, and the migration will stop.
+By default, one hwlat detector's kernel thread will migrate across each CPU
+specified in cpumask at the beginning of a new window, in a round-robin
+fashion. This behavior can be changed by changing the thread mode,
+the available options are:
+
+ - none: do not force migration
+ - round-robin: migrate across each CPU specified in cpumask [default]
+ - per-cpu: create one thread for each cpu in tracing_cpumask
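To make the default round-robin mode concrete, here is a minimal sketch of choosing the CPU for the next window from a set of allowed CPUs. It only illustrates the idea of cycling through a cpumask; `next_cpu` is a hypothetical name and this is not how the kernel code is written.

```python
def next_cpu(current, allowed):
    """Return the next CPU after `current` in `allowed`, wrapping around."""
    cpus = sorted(allowed)
    if not cpus:
        raise ValueError("the cpumask is empty")
    for cpu in cpus:
        if cpu > current:
            return cpu
    return cpus[0]  # past the highest allowed CPU: start over
```

With CPUs 0, 2 and 5 allowed, successive windows would run on 0, 2, 5, 0, and so on. In the `none` mode no such step is taken, and in `per-cpu` mode each allowed CPU gets its own thread instead.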
@@ -23,6 +23,8 @@ Linux Tracing Technologies
    histogram-design
    boottime-trace
    hwlat_detector
+   osnoise-tracer
+   timerlat-tracer
    intel_th
    ring-buffer-design
    stm
@@ -0,0 +1,152 @@
+==============
+OSNOISE Tracer
+==============
+
+In the context of high-performance computing (HPC), the Operating System
+Noise (*osnoise*) refers to the interference experienced by an application
+due to activities inside the operating system. In the context of Linux,
+NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the
+system. Moreover, hardware-related jobs can also cause noise, for example,
+via SMIs.
+
+hwlat_detector is one of the tools used to identify the most complex
+source of noise: *hardware noise*.
+
+In a nutshell, the hwlat_detector creates a thread that runs
+periodically for a given period. At the beginning of a period, the thread
+disables interrupts and starts sampling. While running, the hwlatd
+thread reads the time in a loop. As interrupts are disabled, threads,
+IRQs, and SoftIRQs cannot interfere with the hwlatd thread. Hence, the
+cause of any gap between two different reads of the time is rooted either
+in an NMI or in the hardware itself. At the end of the period, hwlatd
+enables interrupts and reports the max observed gap between the reads. It
+also prints an NMI occurrence counter. If the output does not report NMI
+executions, the user can conclude that the hardware is the culprit for
+the latency. The hwlat detects the NMI execution by observing
+the entry and exit of an NMI.
+
+The osnoise tracer leverages the hwlat_detector by running a
+similar loop with preemption, SoftIRQs and IRQs enabled, thus allowing
+all the sources of *osnoise* during its execution. Using the same approach
+of hwlat, osnoise takes note of the entry and exit point of any
+source of interference, increasing a per-cpu interference counter. The
+osnoise tracer also saves an interference counter for each source of
+interference. The interference counter for NMI, IRQs, SoftIRQs, and
+threads is increased anytime the tool observes these interferences' entry
+events. When a noise happens without any interference from the operating
+system level, the hardware noise counter increases, pointing to a
+hardware-related noise. In this way, osnoise can account for any
+source of interference. At the end of the period, the osnoise tracer
+prints the sum of all noise, the max single noise, the percentage of CPU
+available for the thread, and the counters for the noise sources.
+
+Usage
+-----
+
+Write the ASCII text "osnoise" into the current_tracer file of the
+tracing system (generally mounted at /sys/kernel/tracing).
+
+For example::
+
+  [root@f32 ~]# cd /sys/kernel/tracing/
+  [root@f32 tracing]# echo osnoise > current_tracer
+
+It is possible to follow the trace by reading the trace file::
+
+  [root@f32 tracing]# cat trace
+  # tracer: osnoise
+  #
+  #                                _-----=> irqs-off
+  #                               / _----=> need-resched
+  #                              | / _---=> hardirq/softirq
+  #                              || / _--=> preempt-depth                            MAX
+  #                              || /                                             SINGLE    Interference counters:
+  #                              ||||               RUNTIME      NOISE  % OF CPU   NOISE    +------------------------------+
+  #           TASK-PID     CPU#  ||||    TIMESTAMP    IN US      IN US AVAILABLE   IN US     HW    NMI    IRQ   SIRQ THREAD
+  #              | |         |   ||||        |            |          |         |       |      |      |      |      |      |
+             <...>-859     [000] ....    81.637220: 1000000        190  99.98100       9     18      0   1007     18      1
+             <...>-860     [001] ....    81.638154: 1000000        656  99.93440      74     23      0   1006     16      3
+             <...>-861     [002] ....    81.638193: 1000000       5675  99.43250     202      6      0   1013     25     21
+             <...>-862     [003] ....    81.638242: 1000000        125  99.98750      45      1      0   1011     23      0
+             <...>-863     [004] ....    81.638260: 1000000       1721  99.82790     168      7      0   1002     49     41
+             <...>-864     [005] ....    81.638286: 1000000        263  99.97370      57      6      0   1006     26      2
+             <...>-865     [006] ....    81.638302: 1000000        109  99.98910      21      3      0   1006     18      1
+             <...>-866     [007] ....    81.638326: 1000000       7816  99.21840     107      8      0   1016     39     19
+
+In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the
+tracer prints a message at the end of each period for each CPU that is
+running an osnoise/ thread. The osnoise specific fields report:
+
+ - The RUNTIME IN US reports the amount of time in microseconds that
+   the osnoise thread kept looping reading the time.
+ - The NOISE IN US reports the sum of noise in microseconds observed
+   by the osnoise tracer during the associated runtime.
+ - The % OF CPU AVAILABLE reports the percentage of CPU available for
+   the osnoise thread during the runtime window.
+ - The MAX SINGLE NOISE IN US reports the maximum single noise observed
+   during the runtime window.
+ - The Interference counters display how many times each of the
+   respective interferences happened during the runtime window.
+
+Note that the example above shows a high number of HW noise samples.
+The reason is that this sample was taken on a virtual machine,
+and the host interference is detected as a hardware interference.
+
+Tracer options
+---------------------
+
+The tracer has a set of options inside the osnoise directory, they are:
+
+ - osnoise/cpus: CPUs at which an osnoise thread will execute.
+ - osnoise/period_us: the period of the osnoise thread.
+ - osnoise/runtime_us: how long an osnoise thread will look for noise.
+ - osnoise/stop_tracing_us: stop the system tracing if a single noise
+   higher than the configured value happens. Writing 0 disables this
+   option.
+ - osnoise/stop_tracing_total_us: stop the system tracing if total noise
+   higher than the configured value happens. Writing 0 disables this
+   option.
+ - tracing_threshold: the minimum delta between two time() reads to be
+   considered as noise, in us. When set to 0, the default value will
+   be used, which is currently 5 us.
+
+Additional Tracing
+------------------
+
+In addition to the tracer, a set of tracepoints were added to
+facilitate the identification of the osnoise source.
+
+ - osnoise:sample_threshold: printed anytime a noise is higher than
+   the configurable tolerance_ns.
+ - osnoise:nmi_noise: noise from NMI, including the duration.
+ - osnoise:irq_noise: noise from an IRQ, including the duration.
+ - osnoise:softirq_noise: noise from a SoftIRQ, including the
+   duration.
+ - osnoise:thread_noise: noise from a thread, including the duration.
+
+Note that all the values are *net values*. For example, if while osnoise
+is running, another thread preempts the osnoise thread, it will start a
+thread_noise duration at the start. Then, an IRQ takes place, preempting
+the thread_noise, starting an irq_noise. When the IRQ ends its execution,
+it will compute its duration, and this duration will be subtracted from
+the thread_noise, in such a way as to avoid the double accounting of the
+IRQ execution. This logic is valid for all sources of noise.
+
+Here is one example of the usage of these tracepoints::
+
+       osnoise/8-961     [008] d.h.  5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
+       osnoise/8-961     [008] dNh.  5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
+     migration/8-54      [008] d...  5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
+       osnoise/8-961     [008] ....  5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns interferences 2
+
+In this example, a noise sample of 8 microseconds was reported in the last
+line, pointing to two interferences. Looking backward in the trace, the
+two previous entries were about the migration thread running after a
+timer IRQ execution. The first event is not part of the noise because
+it took place one millisecond before.
+
+It is worth noticing that the sum of the duration reported in the
+tracepoints is smaller than eight us reported in the sample_threshold.
+The reason lies in the overhead of the entry and exit code that happens
+before and after any interference execution. This justifies the dual
+approach: measuring thread and tracing.
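Two pieces of arithmetic in the osnoise documentation above can be checked directly against its sample output. The sketch below is an illustration only: the function names are hypothetical, and the formula for `% OF CPU AVAILABLE` is inferred from the sample rows (it reproduces them), not taken from the kernel source.

```python
FIELDS = ("runtime_us", "noise_us", "available", "max_single_us",
          "hw", "nmi", "irq", "sirq", "thread")


def parse_osnoise_sample(line):
    """Split the osnoise-specific columns that follow the timestamp."""
    values = line.rsplit(":", 1)[1].split()
    sample = dict(zip(FIELDS, values))
    # All columns are integers except the percentage.
    return {k: float(v) if k == "available" else int(v)
            for k, v in sample.items()}


def cpu_available(runtime_us, noise_us):
    """Percentage of the runtime window not lost to noise (inferred formula)."""
    return 100.0 * (runtime_us - noise_us) / runtime_us


def unaccounted_ns(sample_ns, interference_ns):
    """Part of a sample_threshold duration not covered by the net
    durations of the individual *_noise events (entry/exit overhead)."""
    return sample_ns - sum(interference_ns)


row = "<...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1"
sample = parse_osnoise_sample(row)
recomputed = cpu_available(sample["runtime_us"], sample["noise_us"])
```

For the tracepoint example, `unaccounted_ns(8812, [2848, 3068])` leaves 2896 ns, which is the entry and exit overhead the text refers to when it says the tracepoint durations add up to less than the reported sample.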
@@ -0,0 +1,181 @@
+###############
+Timerlat tracer
+###############
+
+The timerlat tracer aims to help preemptive kernel developers find
+sources of wakeup latencies of real-time threads. Like cyclictest,
+the tracer sets a periodic timer that wakes up a thread. The thread then
+computes a *wakeup latency* value as the difference between the *current
+time* and the *absolute time* that the timer was set to expire. The main
+goal of timerlat is tracing in such a way to help kernel developers.
+
+Usage
+-----
+
+Write the ASCII text "timerlat" into the current_tracer file of the
+tracing system (generally mounted at /sys/kernel/tracing).
+
+For example::
+
+  [root@f32 ~]# cd /sys/kernel/tracing/
+  [root@f32 tracing]# echo timerlat > current_tracer
+
+It is possible to follow the trace by reading the trace file::
+
+  [root@f32 tracing]# cat trace
+  # tracer: timerlat
+  #
+  #                              _-----=> irqs-off
+  #                             / _----=> need-resched
+  #                            | / _---=> hardirq/softirq
+  #                            || / _--=> preempt-depth
+  #                            || /
+  #                            ||||           ACTIVATION
+  #       TASK-PID       CPU#  ||||    TIMESTAMP  ID        CONTEXT                     LATENCY
+  #          | |           |   ||||        |      |            |                              |
+          <idle>-0       [000] d.h1    54.029328: #1     context    irq timer_latency       932 ns
+           <...>-867     [000] ....    54.029339: #1     context thread timer_latency     11700 ns
+          <idle>-0       [001] dNh1    54.029346: #1     context    irq timer_latency      2833 ns
+           <...>-868     [001] ....    54.029353: #1     context thread timer_latency      9820 ns
+          <idle>-0       [000] d.h1    54.030328: #2     context    irq timer_latency       769 ns
+           <...>-867     [000] ....    54.030330: #2     context thread timer_latency      3070 ns
+          <idle>-0       [001] d.h1    54.030344: #2     context    irq timer_latency       935 ns
+           <...>-868     [001] ....    54.030347: #2     context thread timer_latency      4351 ns
+
+
+The tracer creates a per-cpu kernel thread with real-time priority that
+prints two lines at every activation. The first is the *timer latency*
+observed at the *hardirq* context before the activation of the thread.
+The second is the *timer latency* observed by the thread. The ACTIVATION
+ID field serves to relate the *irq* execution to its respective *thread*
+execution.
+
+The *irq*/*thread* splitting is important to clarify which context
+the unexpectedly high value is coming from. The *irq* context can be
+delayed by hardware related actions, such as SMIs, NMIs, IRQs
+or by a thread masking interrupts. Once the timer happens, the delay
+can also be influenced by blocking caused by threads. For example, by
+postponing the scheduler execution via preempt_disable(), by the
+scheduler execution, or by masking interrupts. Threads can
+also be delayed by the interference from other threads and IRQs.
+
+Tracer options
+---------------------
+
+The timerlat tracer is built on top of the osnoise tracer.
+So its configuration is also done in the osnoise/ config
+directory. The timerlat configs are:
+
+ - cpus: CPUs at which a timerlat thread will execute.
+ - timerlat_period_us: the period of the timerlat thread.
+ - stop_tracing_us: stop the system tracing if a
+   timer latency at the *irq* context higher than the configured
+   value happens. Writing 0 disables this option.
+ - stop_tracing_total_us: stop the system tracing if a
+   timer latency at the *thread* context higher than the configured
+   value happens. Writing 0 disables this option.
+ - print_stack: save the stack of the IRQ occurrence, and print
+   it after the *thread context* event.
+
+timerlat and osnoise
+----------------------------
+
+The timerlat tracer can also take advantage of the osnoise: trace events.
+For example::
+
+  [root@f32 ~]# cd /sys/kernel/tracing/
+  [root@f32 tracing]# echo timerlat > current_tracer
+  [root@f32 tracing]# echo 1 > events/osnoise/enable
+  [root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us
+  [root@f32 tracing]# tail -10 trace
+             cc1-87882   [005] d..h...   548.771078: #402268 context    irq timer_latency     13585 ns
+             cc1-87882   [005] dNLh1..   548.771082: irq_noise: local_timer:236 start 548.771077442 duration 7597 ns
+             cc1-87882   [005] dNLh2..   548.771099: irq_noise: qxl:21 start 548.771085017 duration 7139 ns
+             cc1-87882   [005] d...3..   548.771102: thread_noise: cc1:87882 start 548.771078243 duration 9909 ns
+      timerlat/5-1035    [005] .......   548.771104: #402268 context thread timer_latency     39960 ns
+
+In this case, the root cause of the timer latency does not point to a
+single cause, but to multiple ones. Firstly, the timer IRQ was delayed
+for 13 us, which may point to a long IRQ disabled section (see IRQ
+stacktrace section). Then the timer interrupt that wakes up the timerlat
+thread took 7597 ns, and the qxl:21 device IRQ took 7139 ns. Finally,
+the cc1 thread noise took 9909 ns of time before the context switch.
+Such pieces of evidence are useful for the developer to use other
+tracing methods to figure out how to debug and optimize the system.
+
+It is worth mentioning that the *duration* values reported
+by the osnoise: events are *net* values. For example, the
+thread_noise does not include the duration of the overhead caused
+by the IRQ execution (which indeed accounted for 14736 ns). But
+the values reported by the timerlat tracer (timer_latency)
+are *gross* values.
+
+The art below illustrates a CPU timeline and how the timerlat tracer
+observes it at the top and the osnoise: events at the bottom. Each "-"
+in the timelines means circa 1 us, and the time moves ==>::
+
+   External     timer irq                    thread
+    clock        latency                    latency
+    event       13585 ns                   39960 ns
+      |             ^                          ^
+      v             |                          |
+      |-------------|                          |
+      |-------------+--------------------------|
+                    ^                          ^
+  ========================================================================
+                    [tmr irq]  [dev irq]
+  [another thread...^       v..^       v.......][timerlat/ thread]  <-- CPU timeline
+  ========================================================================
+                    |-------|  |-------|
+                            |--^       v-------|
+                            |          |       |
+                            |          |       + thread_noise: 9909 ns
+                            |          +-> irq_noise: 7139 ns
+                            +-> irq_noise: 7597 ns
+
+IRQ stacktrace
+---------------------------
+
+The osnoise/print_stack option is helpful for the cases in which a thread
+noise is the major factor for the timer latency, because of preempt or
+irq disabled. For example::
+
+  [root@f32 tracing]# echo 500 > osnoise/stop_tracing_total_us
+  [root@f32 tracing]# echo 500 > osnoise/print_stack
+  [root@f32 tracing]# echo timerlat > current_tracer
+  [root@f32 tracing]# tail -21 per_cpu/cpu7/trace
+            insmod-1026    [007] dN.h1..   200.201948: irq_noise: local_timer:236 start 200.201939376 duration 7872 ns
+            insmod-1026    [007] d..h1..   200.202587: #29800 context    irq timer_latency      1616 ns
+            insmod-1026    [007] dN.h2..   200.202598: irq_noise: local_timer:236 start 200.202586162 duration 11855 ns
+            insmod-1026    [007] dN.h3..   200.202947: irq_noise: local_timer:236 start 200.202939174 duration 7318 ns
+            insmod-1026    [007] d...3..   200.203444: thread_noise: insmod:1026 start 200.202586933 duration 838681 ns
+        timerlat/7-1001    [007] .......   200.203445: #29800 context thread timer_latency    859978 ns
+        timerlat/7-1001    [007] ....1..   200.203446: <stack trace>
+    => timerlat_irq
+    => __hrtimer_run_queues
+    => hrtimer_interrupt
+    => __sysvec_apic_timer_interrupt
+    => asm_call_irq_on_stack
+    => sysvec_apic_timer_interrupt
+    => asm_sysvec_apic_timer_interrupt
+    => delay_tsc
+    => dummy_load_1ms_pd_init
+    => do_one_initcall
+    => do_init_module
+    => __do_sys_finit_module
+    => do_syscall_64
+    => entry_SYSCALL_64_after_hwframe
+
+In this case, it is possible to see that the thread added the highest
+contribution to the *timer latency* and the stack trace, saved during
+the timerlat IRQ handler, points to a function named
+dummy_load_1ms_pd_init, which had the following code (on purpose)::
+
+  static int __init dummy_load_1ms_pd_init(void)
+  {
+          preempt_disable();
+          mdelay(1);
+          preempt_enable();
+          return 0;
+  }
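Since the ACTIVATION ID ties each *irq* line to its *thread* line, timerlat output lends itself to simple post-processing. The following is a minimal sketch, assuming the line layout shown in the documentation above; the regular expression and the `pair_activations` name are this example's own and are not part of any kernel tool.

```python
import re

# Matches lines such as:
#   <idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns
TIMERLAT_RE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+\S+\s+[\d.]+:\s+#(?P<id>\d+)\s+context\s+"
    r"(?P<ctx>irq|thread)\s+timer_latency\s+(?P<ns>\d+)\s+ns"
)


def pair_activations(lines):
    """Group irq/thread latencies (in ns) by (cpu, activation id).

    Lines that are not timerlat samples, e.g. osnoise: events or
    stack traces, are skipped.
    """
    pairs = {}
    for line in lines:
        match = TIMERLAT_RE.search(line)
        if match is None:
            continue
        key = (int(match["cpu"]), int(match["id"]))
        pairs.setdefault(key, {})[match["ctx"]] = int(match["ns"])
    return pairs


trace = """\
<idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns
<...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns
<idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns
<...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns
cc1-87882 [005] dNLh1.. 548.771082: irq_noise: local_timer:236 start 548.771077442 duration 7597 ns
"""
pairs = pair_activations(trace.splitlines())
```

Keying on the CPU as well as the ID matters because every per-cpu thread numbers its own activations, as the `#1` lines on CPUs 000 and 001 show. The difference between the two values of a pair is the part of the latency added after the timer IRQ fired.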
@@ -102,6 +102,7 @@ obj-$(CONFIG_FUNCTION_TRACER) += ftrace_$(BITS).o
 obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
 obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
 obj-$(CONFIG_X86_TSC) += trace_clock.o
+obj-$(CONFIG_TRACING) += trace.o
 obj-$(CONFIG_CRASH_CORE) += crash_core_$(BITS).o
 obj-$(CONFIG_KEXEC_CORE) += machine_kexec_$(BITS).o
 obj-$(CONFIG_KEXEC_CORE) += relocate_kernel_$(BITS).o crash.o
@ -0,0 +1,234 @@
|
|||
#include <asm/trace/irq_vectors.h>
|
||||
#include <linux/trace.h>
|
||||
|
||||
#if defined(CONFIG_OSNOISE_TRACER) && defined(CONFIG_X86_LOCAL_APIC)
|
||||
/*
|
||||
* trace_intel_irq_entry - record intel specific IRQ entry
|
||||
*/
|
||||
static void trace_intel_irq_entry(void *data, int vector)
|
||||
{
|
||||
osnoise_trace_irq_entry(vector);
|
||||
}
|
||||
|
||||
/*
|
||||
* trace_intel_irq_exit - record intel specific IRQ exit
|
||||
*/
|
||||
static void trace_intel_irq_exit(void *data, int vector)
|
||||
{
|
||||
char *vector_desc = (char *) data;
|
||||
|
||||
osnoise_trace_irq_exit(vector, vector_desc);
|
||||
}
|
||||
|
||||
/*
|
||||
* register_intel_irq_tp - Register intel specific IRQ entry tracepoints
|
||||
*/
|
||||
int osnoise_arch_register(void)
|
||||
{
|
||||
int ret;
|
||||
|
||||
ret = register_trace_local_timer_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_err;
|
||||
|
||||
ret = register_trace_local_timer_exit(trace_intel_irq_exit, "local_timer");
|
||||
if (ret)
|
||||
goto out_timer_entry;
|
||||
|
||||
#ifdef CONFIG_X86_THERMAL_VECTOR
|
||||
ret = register_trace_thermal_apic_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_timer_exit;
|
||||
|
||||
ret = register_trace_thermal_apic_exit(trace_intel_irq_exit, "thermal_apic");
|
||||
if (ret)
|
||||
goto out_thermal_entry;
|
||||
#endif /* CONFIG_X86_THERMAL_VECTOR */
|
||||
|
||||
#ifdef CONFIG_X86_MCE_AMD
|
||||
ret = register_trace_deferred_error_apic_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_thermal_exit;
|
||||
|
||||
ret = register_trace_deferred_error_apic_exit(trace_intel_irq_exit, "deferred_error");
|
||||
if (ret)
|
||||
goto out_deferred_entry;
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_MCE_THRESHOLD
|
||||
ret = register_trace_threshold_apic_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_deferred_exit;
|
||||
|
||||
ret = register_trace_threshold_apic_exit(trace_intel_irq_exit, "threshold_apic");
|
||||
if (ret)
|
||||
goto out_threshold_entry;
|
||||
#endif /* CONFIG_X86_MCE_THRESHOLD */
|
||||
|
||||
#ifdef CONFIG_SMP
|
||||
ret = register_trace_call_function_single_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_threshold_exit;
|
||||
|
||||
ret = register_trace_call_function_single_exit(trace_intel_irq_exit,
|
||||
"call_function_single");
|
||||
if (ret)
|
||||
goto out_call_function_single_entry;
|
||||
|
||||
ret = register_trace_call_function_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_call_function_single_exit;
|
||||
|
||||
ret = register_trace_call_function_exit(trace_intel_irq_exit, "call_function");
|
||||
if (ret)
|
||||
goto out_call_function_entry;
|
||||
|
||||
ret = register_trace_reschedule_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_call_function_exit;
|
||||
|
||||
ret = register_trace_reschedule_exit(trace_intel_irq_exit, "reschedule");
|
||||
if (ret)
|
||||
goto out_reschedule_entry;
|
||||
#endif /* CONFIG_SMP */
|
||||
|
||||
#ifdef CONFIG_IRQ_WORK
|
||||
ret = register_trace_irq_work_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_reschedule_exit;
|
||||
|
||||
ret = register_trace_irq_work_exit(trace_intel_irq_exit, "irq_work");
|
||||
if (ret)
|
||||
goto out_irq_work_entry;
|
||||
#endif
|
||||
|
||||
ret = register_trace_x86_platform_ipi_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_irq_work_exit;
|
||||
|
||||
ret = register_trace_x86_platform_ipi_exit(trace_intel_irq_exit, "x86_platform_ipi");
|
||||
if (ret)
|
||||
goto out_x86_ipi_entry;
|
||||
|
||||
ret = register_trace_error_apic_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_x86_ipi_exit;
|
||||
|
||||
ret = register_trace_error_apic_exit(trace_intel_irq_exit, "error_apic");
|
||||
if (ret)
|
||||
goto out_error_apic_entry;
|
||||
|
||||
ret = register_trace_spurious_apic_entry(trace_intel_irq_entry, NULL);
|
||||
if (ret)
|
||||
goto out_error_apic_exit;
|
||||
|
||||
ret = register_trace_spurious_apic_exit(trace_intel_irq_exit, "spurious_apic");
|
||||
if (ret)
|
||||
goto out_spurious_apic_entry;
|
||||
|
||||
return 0;
|
||||
|
||||
out_spurious_apic_entry:
|
||||
unregister_trace_spurious_apic_entry(trace_intel_irq_entry, NULL);
|
||||
out_error_apic_exit:
|
||||
unregister_trace_error_apic_exit(trace_intel_irq_exit, "error_apic");
|
||||
out_error_apic_entry:
|
||||
unregister_trace_error_apic_entry(trace_intel_irq_entry, NULL);
|
||||
out_x86_ipi_exit:
|
||||
unregister_trace_x86_platform_ipi_exit(trace_intel_irq_exit, "x86_platform_ipi");
|
||||
out_x86_ipi_entry:
|
||||
unregister_trace_x86_platform_ipi_entry(trace_intel_irq_entry, NULL);
|
||||
out_irq_work_exit:
|
||||
|
||||
#ifdef CONFIG_IRQ_WORK
|
||||
unregister_trace_irq_work_exit(trace_intel_irq_exit, "irq_work");
|
||||
out_irq_work_entry:
|
||||
unregister_trace_irq_work_entry(trace_intel_irq_entry, NULL);
|
||||
out_reschedule_exit:
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_SMP
|
||||
unregister_trace_reschedule_exit(trace_intel_irq_exit, "reschedule");
|
||||
out_reschedule_entry:
|
||||
unregister_trace_reschedule_entry(trace_intel_irq_entry, NULL);
|
||||
out_call_function_exit:
|
||||
unregister_trace_call_function_exit(trace_intel_irq_exit, "call_function");
|
||||
out_call_function_entry:
|
||||
unregister_trace_call_function_entry(trace_intel_irq_entry, NULL);
|
||||
out_call_function_single_exit:
|
||||
unregister_trace_call_function_single_exit(trace_intel_irq_exit, "call_function_single");
|
||||
out_call_function_single_entry:
|
||||
unregister_trace_call_function_single_entry(trace_intel_irq_entry, NULL);
|
||||
out_threshold_exit:
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_MCE_THRESHOLD
|
||||
unregister_trace_threshold_apic_exit(trace_intel_irq_exit, "threshold_apic");
|
||||
out_threshold_entry:
|
||||
unregister_trace_threshold_apic_entry(trace_intel_irq_entry, NULL);
|
||||
out_deferred_exit:
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_MCE_AMD
|
||||
unregister_trace_deferred_error_apic_exit(trace_intel_irq_exit, "deferred_error");
|
||||
out_deferred_entry:
|
||||
unregister_trace_deferred_error_apic_entry(trace_intel_irq_entry, NULL);
|
||||
out_thermal_exit:
|
||||
#endif /* CONFIG_X86_MCE_AMD */
|
||||
|
||||
#ifdef CONFIG_X86_THERMAL_VECTOR
|
||||
unregister_trace_thermal_apic_exit(trace_intel_irq_exit, "thermal_apic");
|
||||
out_thermal_entry:
|
||||
unregister_trace_thermal_apic_entry(trace_intel_irq_entry, NULL);
|
||||
out_timer_exit:
|
||||
#endif /* CONFIG_X86_THERMAL_VECTOR */
|
||||
|
||||
unregister_trace_local_timer_exit(trace_intel_irq_exit, "local_timer");
|
||||
out_timer_entry:
|
||||
unregister_trace_local_timer_entry(trace_intel_irq_entry, NULL);
|
||||
out_err:
|
||||
return -EINVAL;
|
||||
}
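The labels above form a reverse-order unwind ladder: each failed registration jumps to the label that undoes everything registered before it. The following is a minimal userspace sketch of that pattern, not the kernel code. `hook_register()`, `hook_unregister()`, `fail_at` and `register_all()` are hypothetical stand-ins for the `register_trace_*()` / `unregister_trace_*()` pairs, and the sketch returns -1 where the function above returns -EINVAL.

```c
#include <assert.h>

#define NR_HOOKS 3

/* Hypothetical registry standing in for the tracepoint hooks. */
static int hook_registered[NR_HOOKS];
static int fail_at = -1;	/* id whose registration fails, -1 for none */

static int hook_register(int id)
{
	assert(id >= 0 && id < NR_HOOKS);
	if (id == fail_at)
		return -1;
	hook_registered[id] = 1;
	return 0;
}

static void hook_unregister(int id)
{
	assert(id >= 0 && id < NR_HOOKS);
	hook_registered[id] = 0;
}

/* Register three hooks; on failure undo only what was already registered. */
static int register_all(void)
{
	int ret;

	ret = hook_register(0);
	if (ret)
		goto out_err;

	ret = hook_register(1);
	if (ret)
		goto out_hook0;

	ret = hook_register(2);
	if (ret)
		goto out_hook1;

	return 0;

out_hook1:
	hook_unregister(1);
out_hook0:
	hook_unregister(0);
out_err:
	return -1;
}

/* Helper for checking the state after a failed registration. */
static int count_registered(void)
{
	int i, n = 0;

	for (i = 0; i < NR_HOOKS; i++)
		n += hook_registered[i];
	return n;
}
```

Because each label falls through to the next, a failure at step N releases steps N-1 down to 0 and nothing else, which is why the label order mirrors the registration order in reverse.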
|
||||
|
||||
void osnoise_arch_unregister(void)
|
||||
{
|
||||
unregister_trace_spurious_apic_exit(trace_intel_irq_exit, "spurious_apic");
|
||||
unregister_trace_spurious_apic_entry(trace_intel_irq_entry, NULL);
|
||||
unregister_trace_error_apic_exit(trace_intel_irq_exit, "error_apic");
|
||||
unregister_trace_error_apic_entry(trace_intel_irq_entry, NULL);
|
||||
unregister_trace_x86_platform_ipi_exit(trace_intel_irq_exit, "x86_platform_ipi");
|
||||
unregister_trace_x86_platform_ipi_entry(trace_intel_irq_entry, NULL);
|
||||
|
||||
#ifdef CONFIG_IRQ_WORK
|
||||
unregister_trace_irq_work_exit(trace_intel_irq_exit, "irq_work");
|
||||
unregister_trace_irq_work_entry(trace_intel_irq_entry, NULL);
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_SMP
|
||||
unregister_trace_reschedule_exit(trace_intel_irq_exit, "reschedule");
|
||||
unregister_trace_reschedule_entry(trace_intel_irq_entry, NULL);
|
||||
unregister_trace_call_function_exit(trace_intel_irq_exit, "call_function");
|
||||
unregister_trace_call_function_entry(trace_intel_irq_entry, NULL);
|
||||
unregister_trace_call_function_single_exit(trace_intel_irq_exit, "call_function_single");
|
||||
unregister_trace_call_function_single_entry(trace_intel_irq_entry, NULL);
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_MCE_THRESHOLD
|
||||
unregister_trace_threshold_apic_exit(trace_intel_irq_exit, "threshold_apic");
|
||||
unregister_trace_threshold_apic_entry(trace_intel_irq_entry, NULL);
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_MCE_AMD
|
||||
unregister_trace_deferred_error_apic_exit(trace_intel_irq_exit, "deferred_error");
|
||||
unregister_trace_deferred_error_apic_entry(trace_intel_irq_entry, NULL);
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_X86_THERMAL_VECTOR
|
||||
unregister_trace_thermal_apic_exit(trace_intel_irq_exit, "thermal_apic");
|
||||
unregister_trace_thermal_apic_entry(trace_intel_irq_entry, NULL);
|
||||
#endif /* CONFIG_X86_THERMAL_VECTOR */
|
||||
|
||||
unregister_trace_local_timer_exit(trace_intel_irq_exit, "local_timer");
|
||||
unregister_trace_local_timer_entry(trace_intel_irq_entry, NULL);
|
||||
}
|
||||
#endif /* CONFIG_OSNOISE_TRACER && CONFIG_X86_LOCAL_APIC */
|
|
@ -176,10 +176,10 @@ TRACE_EVENT(amdgpu_cs_ioctl,
|
|||
|
||||
TP_fast_assign(
|
||||
__entry->sched_job_id = job->base.id;
|
||||
__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
|
||||
__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job));
|
||||
__entry->context = job->base.s_fence->finished.context;
|
||||
__entry->seqno = job->base.s_fence->finished.seqno;
|
||||
__assign_str(ring, to_amdgpu_ring(job->base.sched)->name)
|
||||
__assign_str(ring, to_amdgpu_ring(job->base.sched)->name);
|
||||
__entry->num_ibs = job->num_ibs;
|
||||
),
|
||||
TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
|
||||
|
@ -201,10 +201,10 @@ TRACE_EVENT(amdgpu_sched_run_job,
|
|||
|
||||
TP_fast_assign(
|
||||
__entry->sched_job_id = job->base.id;
|
||||
__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
|
||||
__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job));
|
||||
__entry->context = job->base.s_fence->finished.context;
|
||||
__entry->seqno = job->base.s_fence->finished.seqno;
|
||||
__assign_str(ring, to_amdgpu_ring(job->base.sched)->name)
|
||||
__assign_str(ring, to_amdgpu_ring(job->base.sched)->name);
|
||||
__entry->num_ibs = job->num_ibs;
|
||||
),
|
||||
TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
|
||||
|
@ -229,7 +229,7 @@ TRACE_EVENT(amdgpu_vm_grab_id,
|
|||
|
||||
TP_fast_assign(
|
||||
__entry->pasid = vm->pasid;
|
||||
__assign_str(ring, ring->name)
|
||||
__assign_str(ring, ring->name);
|
||||
__entry->vmid = job->vmid;
|
||||
__entry->vm_hub = ring->funcs->vmhub,
|
||||
__entry->pd_addr = job->vm_pd_addr;
|
||||
|
@ -424,7 +424,7 @@ TRACE_EVENT(amdgpu_vm_flush,
|
|||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__assign_str(ring, ring->name)
|
||||
__assign_str(ring, ring->name);
|
||||
__entry->vmid = vmid;
|
||||
__entry->vm_hub = ring->funcs->vmhub;
|
||||
__entry->pd_addr = pd_addr;
|
||||
|
@ -525,7 +525,7 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
|
|||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__assign_str(ring, sched_job->base.sched->name)
|
||||
__assign_str(ring, sched_job->base.sched->name);
|
||||
__entry->id = sched_job->base.id;
|
||||
__entry->fence = fence;
|
||||
__entry->ctx = fence->context;
|
||||
|
|
|
@ -24,7 +24,7 @@ DECLARE_EVENT_CLASS(lima_task,
|
|||
__entry->task_id = task->base.id;
|
||||
__entry->context = task->base.s_fence->finished.context;
|
||||
__entry->seqno = task->base.s_fence->finished.seqno;
|
||||
__assign_str(pipe, task->base.sched->name)
|
||||
__assign_str(pipe, task->base.sched->name);
|
||||
),
|
||||
|
||||
TP_printk("task=%llu, context=%u seqno=%u pipe=%s",
|
||||
|
|
|
@ -63,7 +63,7 @@ TRACE_EVENT(hfi1_interrupt,
|
|||
__array(char, buf, 64)
|
||||
__field(int, src)
|
||||
),
|
||||
TP_fast_assign(DD_DEV_ASSIGN(dd)
|
||||
TP_fast_assign(DD_DEV_ASSIGN(dd);
|
||||
is_entry->is_name(__entry->buf, 64,
|
||||
src - is_entry->start);
|
||||
__entry->src = src;
|
||||
|
@ -100,7 +100,7 @@ TRACE_EVENT(hfi1_fault_opcode,
|
|||
__field(u32, qpn)
|
||||
__field(u8, opcode)
|
||||
),
|
||||
TP_fast_assign(DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
TP_fast_assign(DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->opcode = opcode;
|
||||
),
|
||||
|
|
|
@ -70,7 +70,7 @@ DECLARE_EVENT_CLASS(hfi1_rc_template,
|
|||
__field(u32, r_psn)
|
||||
),
|
||||
TP_fast_assign(
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->s_flags = qp->s_flags;
|
||||
__entry->psn = psn;
|
||||
|
@ -130,7 +130,7 @@ DECLARE_EVENT_CLASS(/* rc_ack */
|
|||
__field(u32, lpsn)
|
||||
),
|
||||
TP_fast_assign(/* assign */
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->aeth = aeth;
|
||||
__entry->psn = psn;
|
||||
|
|
|
@ -886,7 +886,7 @@ DECLARE_EVENT_CLASS(/* sender_info */
|
|||
__field(u8, s_retry)
|
||||
),
|
||||
TP_fast_assign(/* assign */
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->state = qp->state;
|
||||
__entry->s_cur = qp->s_cur;
|
||||
|
@ -1285,7 +1285,7 @@ DECLARE_EVENT_CLASS(/* rc_rcv_err */
|
|||
__field(int, diff)
|
||||
),
|
||||
TP_fast_assign(/* assign */
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->s_flags = qp->s_flags;
|
||||
__entry->state = qp->state;
|
||||
|
@ -1574,7 +1574,7 @@ DECLARE_EVENT_CLASS(/* tid_ack */
|
|||
__field(u32, resync_psn)
|
||||
),
|
||||
TP_fast_assign(/* assign */
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->aeth = aeth;
|
||||
__entry->psn = psn;
|
||||
|
|
|
@ -120,7 +120,7 @@ DECLARE_EVENT_CLASS(hfi1_qpsleepwakeup_template,
|
|||
__field(unsigned long, iow_flags)
|
||||
),
|
||||
TP_fast_assign(
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->flags = flags;
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->s_flags = qp->s_flags;
|
||||
|
@ -868,7 +868,7 @@ TRACE_EVENT(
|
|||
__field(int, send_flags)
|
||||
),
|
||||
TP_fast_assign(
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->wqe = wqe;
|
||||
__entry->wr_id = wqe->wr.wr_id;
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
|
@ -904,7 +904,7 @@ DECLARE_EVENT_CLASS(
|
|||
__field(bool, flag)
|
||||
),
|
||||
TP_fast_assign(
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
|
||||
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->flag = flag;
|
||||
),
|
||||
|
@ -952,7 +952,7 @@ DECLARE_EVENT_CLASS(/* AIP */
|
|||
__field(u8, stopped)
|
||||
),
|
||||
TP_fast_assign(/* assign */
|
||||
DD_DEV_ASSIGN(txq->priv->dd)
|
||||
DD_DEV_ASSIGN(txq->priv->dd);
|
||||
__entry->txq = txq;
|
||||
__entry->sde = txq->sde;
|
||||
__entry->head = txq->tx_ring.head;
|
||||
|
|
|
@ -85,7 +85,7 @@ DECLARE_EVENT_CLASS(rvt_cq_template,
|
|||
__field(int, comp_vector_cpu)
|
||||
__field(u32, flags)
|
||||
),
|
||||
TP_fast_assign(RDI_DEV_ASSIGN(cq->rdi)
|
||||
TP_fast_assign(RDI_DEV_ASSIGN(cq->rdi);
|
||||
__entry->ip = cq->ip;
|
||||
__entry->cqe = attr->cqe;
|
||||
__entry->comp_vector = attr->comp_vector;
|
||||
|
@ -123,7 +123,7 @@ DECLARE_EVENT_CLASS(
|
|||
__field(u32, imm)
|
||||
),
|
||||
TP_fast_assign(
|
||||
RDI_DEV_ASSIGN(cq->rdi)
|
||||
RDI_DEV_ASSIGN(cq->rdi);
|
||||
__entry->wr_id = wc->wr_id;
|
||||
__entry->status = wc->status;
|
||||
__entry->opcode = wc->opcode;
|
||||
|
|
|
@ -195,7 +195,7 @@ TRACE_EVENT(
|
|||
__field(uint, sg_offset)
|
||||
),
|
||||
TP_fast_assign(
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(to_imr(ibmr)->mr.pd->device))
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(to_imr(ibmr)->mr.pd->device));
|
||||
__entry->ibmr_iova = ibmr->iova;
|
||||
__entry->iova = to_imr(ibmr)->mr.iova;
|
||||
__entry->user_base = to_imr(ibmr)->mr.user_base;
|
||||
|
|
|
@ -65,7 +65,7 @@ DECLARE_EVENT_CLASS(rvt_qphash_template,
|
|||
__field(u32, bucket)
|
||||
),
|
||||
TP_fast_assign(
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->bucket = bucket;
|
||||
),
|
||||
|
@ -97,7 +97,7 @@ DECLARE_EVENT_CLASS(
|
|||
__field(u32, to)
|
||||
),
|
||||
TP_fast_assign(
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->hrtimer = &qp->s_rnr_timer;
|
||||
__entry->s_flags = qp->s_flags;
|
||||
|
|
|
@ -71,7 +71,7 @@ DECLARE_EVENT_CLASS(rvt_rc_template,
|
|||
__field(u32, r_psn)
|
||||
),
|
||||
TP_fast_assign(
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
__entry->s_flags = qp->s_flags;
|
||||
__entry->psn = psn;
|
||||
|
|
|
@ -111,7 +111,7 @@ TRACE_EVENT(
|
|||
__field(int, wr_num_sge)
|
||||
),
|
||||
TP_fast_assign(
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
|
||||
__entry->wqe = wqe;
|
||||
__entry->wr_id = wqe->wr.wr_id;
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
|
@ -170,7 +170,7 @@ TRACE_EVENT(
|
|||
__field(int, send_flags)
|
||||
),
|
||||
TP_fast_assign(
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
|
||||
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
|
||||
__entry->wqe = wqe;
|
||||
__entry->wr_id = wqe->wr.wr_id;
|
||||
__entry->qpn = qp->ibqp.qp_num;
|
||||
|
|
|
@ -26,7 +26,7 @@ TRACE_EVENT(mei_reg_read,
|
|||
__field(u32, val)
|
||||
),
|
||||
TP_fast_assign(
|
||||
__assign_str(dev, dev_name(dev))
|
||||
__assign_str(dev, dev_name(dev));
|
||||
__entry->reg = reg;
|
||||
__entry->offs = offs;
|
||||
__entry->val = val;
|
||||
|
@ -45,7 +45,7 @@ TRACE_EVENT(mei_reg_write,
|
|||
__field(u32, val)
|
||||
),
|
||||
TP_fast_assign(
|
||||
__assign_str(dev, dev_name(dev))
|
||||
__assign_str(dev, dev_name(dev));
|
||||
__entry->reg = reg;
|
||||
__entry->offs = offs;
|
||||
__entry->val = val;
|
||||
|
@ -64,7 +64,7 @@ TRACE_EVENT(mei_pci_cfg_read,
|
|||
__field(u32, val)
|
||||
),
|
||||
TP_fast_assign(
|
||||
__assign_str(dev, dev_name(dev))
|
||||
__assign_str(dev, dev_name(dev));
|
||||
__entry->reg = reg;
|
||||
__entry->offs = offs;
|
||||
__entry->val = val;
|
||||
|
|
|
@ -21,7 +21,7 @@ TRACE_EVENT(otx2_msg_alloc,
|
|||
__field(u16, id)
|
||||
__field(u64, size)
|
||||
),
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev))
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev));
|
||||
__entry->id = id;
|
||||
__entry->size = size;
|
||||
),
|
||||
|
@ -36,7 +36,7 @@ TRACE_EVENT(otx2_msg_send,
|
|||
__field(u16, num_msgs)
|
||||
__field(u64, msg_size)
|
||||
),
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev))
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev));
|
||||
__entry->num_msgs = num_msgs;
|
||||
__entry->msg_size = msg_size;
|
||||
),
|
||||
|
@ -52,7 +52,7 @@ TRACE_EVENT(otx2_msg_check,
|
|||
__field(u16, rspid)
|
||||
__field(int, rc)
|
||||
),
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev))
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev));
|
||||
__entry->reqid = reqid;
|
||||
__entry->rspid = rspid;
|
||||
__entry->rc = rc;
|
||||
|
@ -69,8 +69,8 @@ TRACE_EVENT(otx2_msg_interrupt,
|
|||
__string(str, msg)
|
||||
__field(u64, intr)
|
||||
),
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev))
|
||||
__assign_str(str, msg)
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev));
|
||||
__assign_str(str, msg);
|
||||
__entry->intr = intr;
|
||||
),
|
||||
TP_printk("[%s] mbox interrupt %s (0x%llx)\n", __get_str(dev),
|
||||
|
@ -84,7 +84,7 @@ TRACE_EVENT(otx2_msg_process,
|
|||
__field(u16, id)
|
||||
__field(int, err)
|
||||
),
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev))
|
||||
TP_fast_assign(__assign_str(dev, pci_name(pdev));
|
||||
__entry->id = id;
|
||||
__entry->err = err;
|
||||
),
|
||||
|
|
|
@ -232,7 +232,7 @@ TRACE_EVENT(fjes_hw_start_debug_err,
|
|||
__string(err, err)
|
||||
),
|
||||
TP_fast_assign(
|
||||
__assign_str(err, err)
|
||||
__assign_str(err, err);
|
||||
),
|
||||
TP_printk("%s", __get_str(err))
|
||||
);
|
||||
|
@ -258,7 +258,7 @@ TRACE_EVENT(fjes_hw_stop_debug_err,
|
|||
__string(err, err)
|
||||
),
|
||||
TP_fast_assign(
|
||||
__assign_str(err, err)
|
||||
__assign_str(err, err);
|
||||
),
|
||||
TP_printk("%s", __get_str(err))
|
||||
);
|
||||
|
|
|
@ -138,7 +138,7 @@ DECLARE_EVENT_CLASS(cdnsp_log_simple,
|
|||
__string(text, msg)
|
||||
),
|
||||
TP_fast_assign(
|
||||
__assign_str(text, msg)
|
||||
__assign_str(text, msg);
|
||||
),
|
||||
TP_printk("%s", __get_str(text))
|
||||
);
|
||||
|
|
|
@ -625,7 +625,7 @@ TRACE_EVENT(nfs4_state_mgr,
|
|||
|
||||
TP_fast_assign(
|
||||
__entry->state = clp->cl_state;
|
||||
__assign_str(hostname, clp->cl_hostname)
|
||||
__assign_str(hostname, clp->cl_hostname);
|
||||
),
|
||||
|
||||
TP_printk(
|
||||
|
@ -1637,7 +1637,7 @@ DECLARE_EVENT_CLASS(nfs4_inode_callback_event,
|
|||
__entry->fileid = 0;
|
||||
__entry->dev = 0;
|
||||
}
|
||||
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown")
|
||||
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown");
|
||||
),
|
||||
|
||||
TP_printk(
|
||||
|
@ -1694,7 +1694,7 @@ DECLARE_EVENT_CLASS(nfs4_inode_stateid_callback_event,
|
|||
__entry->fileid = 0;
|
||||
__entry->dev = 0;
|
||||
}
|
||||
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown")
|
||||
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown");
|
||||
__entry->stateid_seq =
|
||||
be32_to_cpu(stateid->seqid);
|
||||
__entry->stateid_hash =
|
||||
|
|
|
@ -1427,8 +1427,8 @@ DECLARE_EVENT_CLASS(nfs_xdr_event,
|
|||
__entry->version = task->tk_client->cl_vers;
|
||||
__entry->error = error;
|
||||
__assign_str(program,
|
||||
task->tk_client->cl_program->name)
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
|
||||
task->tk_client->cl_program->name);
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
|
||||
),
|
||||
|
||||
TP_printk(
|
||||
|
|
|
@@ -49,7 +49,7 @@ static int __init copy_xbc_key_value_list(char *dst, size_t size)
 			else
 				q = '"';
 			ret = snprintf(dst, rest(dst, end), "%c%s%c%s",
-				q, val, q, vnode->next ? ", " : "\n");
+				q, val, q, xbc_node_is_array(vnode) ? ", " : "\n");
 			if (ret < 0)
 				goto out;
 			dst += ret;
|
|
|
@ -16,6 +16,26 @@
|
|||
#define BOOTCONFIG_ALIGN (1 << BOOTCONFIG_ALIGN_SHIFT)
|
||||
#define BOOTCONFIG_ALIGN_MASK (BOOTCONFIG_ALIGN - 1)
|
||||
|
||||
/**
|
||||
* xbc_calc_checksum() - Calculate checksum of bootconfig
|
||||
* @data: Bootconfig data.
|
||||
* @size: The size of the bootconfig data.
|
||||
*
|
||||
* Calculate the checksum value of the bootconfig data.
|
||||
* The checksum will be used with the BOOTCONFIG_MAGIC and the size for
|
||||
* embedding the bootconfig in the initrd image.
|
||||
*/
|
||||
static inline __init u32 xbc_calc_checksum(void *data, u32 size)
|
||||
{
|
||||
unsigned char *p = data;
|
||||
u32 ret = 0;
|
||||
|
||||
while (size--)
|
||||
ret += *p++;
|
||||
|
||||
return ret;
|
||||
}
|
||||
|
||||
/* XBC tree node */
|
||||
struct xbc_node {
|
||||
u16 next;
|
||||
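The checksum added in this hunk is a plain 32-bit sum of the data bytes. A minimal userspace sketch of the same loop follows, assuming `<stdint.h>` types in place of the kernel's `u32`. The name `sketch_xbc_checksum()` is hypothetical and is not part of the bootconfig API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch of the byte-sum checksum shown above.
 * The sum wraps modulo 2^32, as unsigned arithmetic does in C.
 */
static uint32_t sketch_xbc_checksum(const void *data, uint32_t size)
{
	const unsigned char *p = data;
	uint32_t ret = 0;

	assert(data != NULL || size == 0);

	while (size--)
		ret += *p++;

	return ret;
}
```

For example, the three bytes of "abc" sum to 97 + 98 + 99 = 294. A byte sum is order-insensitive, so it catches truncation and most corruption but not reordered bytes.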
|
@@ -71,7 +91,7 @@ static inline __init bool xbc_node_is_key(struct xbc_node *node)
  */
 static inline __init bool xbc_node_is_array(struct xbc_node *node)
 {
-	return xbc_node_is_value(node) && node->next != 0;
+	return xbc_node_is_value(node) && node->child != 0;
 }
 
 /**
|
@ -80,6 +100,8 @@ static inline __init bool xbc_node_is_array(struct xbc_node *node)
|
|||
*
|
||||
* Test the @node is a leaf key node which is a key node and has a value node
|
||||
* or no child. Returns true if it is a leaf node, or false if not.
|
||||
* Note that the leaf node can have subkey nodes in addition to the
|
||||
* value node.
|
||||
*/
|
||||
static inline __init bool xbc_node_is_leaf(struct xbc_node *node)
|
||||
{
|
||||
|
@ -129,6 +151,23 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
|
|||
return xbc_node_find_child(NULL, key);
|
||||
}
|
||||
|
||||
/**
|
||||
* xbc_node_get_subkey() - Return the first subkey node if exists
|
||||
* @node: Parent node
|
||||
*
|
||||
* Return the first subkey node of the @node. If the @node has no child
|
||||
* or only value node, this will return NULL.
|
||||
*/
|
||||
static inline struct xbc_node * __init xbc_node_get_subkey(struct xbc_node *node)
|
||||
{
|
||||
struct xbc_node *child = xbc_node_get_child(node);
|
||||
|
||||
if (child && xbc_node_is_value(child))
|
||||
return xbc_node_get_next(child);
|
||||
else
|
||||
return child;
|
||||
}
|
||||
|
||||
/**
|
||||
* xbc_array_for_each_value() - Iterate value nodes on an array
|
||||
* @anode: An XBC arrayed value node
|
||||
|
@@ -140,7 +179,7 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
  */
 #define xbc_array_for_each_value(anode, value) \
 	for (value = xbc_node_get_data(anode); anode != NULL ; \
-	     anode = xbc_node_get_next(anode), \
+	     anode = xbc_node_get_child(anode), \
 	     value = anode ? xbc_node_get_data(anode) : NULL)
 
 /**
|
@ -149,11 +188,24 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
|
|||
* @child: Iterated XBC node.
|
||||
*
|
||||
* Iterate child nodes of @parent. Each child nodes are stored to @child.
|
||||
* The @child can be mixture of a value node and subkey nodes.
|
||||
*/
|
||||
#define xbc_node_for_each_child(parent, child) \
|
||||
for (child = xbc_node_get_child(parent); child != NULL ; \
|
||||
child = xbc_node_get_next(child))
|
||||
|
||||
/**
|
||||
* xbc_node_for_each_subkey() - Iterate child subkey nodes
|
||||
* @parent: An XBC node.
|
||||
* @child: Iterated XBC node.
|
||||
*
|
||||
* Iterate subkey nodes of @parent. Each child nodes are stored to @child.
|
||||
* The @child is only the subkey node.
|
||||
*/
|
||||
#define xbc_node_for_each_subkey(parent, child) \
|
||||
for (child = xbc_node_get_subkey(parent); child != NULL ; \
|
||||
child = xbc_node_get_next(child))
|
||||
|
||||
/**
|
||||
* xbc_node_for_each_array_value() - Iterate array entries of given key
|
||||
* @node: An XBC node.
|
||||
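The subkey helpers in this hunk rely on the layout the diff introduces: a key's first child may be a value node, the subkeys follow that value through its `next` link, and array entries chain through `child`. The sketch below models this with a toy structure. `struct toy_node` and the `toy_*` names are hypothetical, and the toy uses plain pointers where the real `struct xbc_node` links nodes differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a bootconfig node (assumption: pointer links). */
struct toy_node {
	const char *data;
	bool is_value;
	struct toy_node *child;	/* first child; for a value, next array entry */
	struct toy_node *next;	/* next sibling; for a value, first subkey */
};

/* Mirrors xbc_node_get_subkey(): skip a leading value node, if any. */
static struct toy_node *toy_get_subkey(struct toy_node *node)
{
	struct toy_node *child = node ? node->child : NULL;

	if (child && child->is_value)
		return child->next;
	return child;
}

/* Mirrors xbc_node_for_each_subkey(): visit subkeys only. */
#define toy_for_each_subkey(parent, pos) \
	for (pos = toy_get_subkey(parent); pos != NULL; pos = pos->next)

static int toy_count_subkeys(struct toy_node *parent)
{
	struct toy_node *pos;
	int n = 0;

	assert(parent != NULL && !parent->is_value);

	toy_for_each_subkey(parent, pos)
		n++;
	return n;
}
```

With a parent whose children are one value followed by two keys, the count is 2: the value is skipped and only the keys are visited. This matches the doc comment's statement that a key with only a value node yields NULL.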
|
@@ -171,7 +223,7 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
  */
 #define xbc_node_for_each_array_value(node, key, anode, value) \
 	for (value = xbc_node_find_value(node, key, &anode); value != NULL; \
-	     anode = xbc_node_get_next(anode), \
+	     anode = xbc_node_get_child(anode), \
 	     value = anode ? xbc_node_get_data(anode) : NULL)
 
 /**
|
|
|
@ -7,12 +7,21 @@ extern bool trace_hwlat_callback_enabled;
|
|||
extern void trace_hwlat_callback(bool enter);
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_OSNOISE_TRACER
|
||||
extern bool trace_osnoise_callback_enabled;
|
||||
extern void trace_osnoise_callback(bool enter);
|
||||
#endif
|
||||
|
||||
static inline void ftrace_nmi_enter(void)
|
||||
{
|
||||
#ifdef CONFIG_HWLAT_TRACER
|
||||
if (trace_hwlat_callback_enabled)
|
||||
trace_hwlat_callback(true);
|
||||
#endif
|
||||
#ifdef CONFIG_OSNOISE_TRACER
|
||||
if (trace_osnoise_callback_enabled)
|
||||
trace_osnoise_callback(true);
|
||||
#endif
|
||||
}
|
||||
|
||||
static inline void ftrace_nmi_exit(void)
|
||||
|
@ -21,6 +30,10 @@ static inline void ftrace_nmi_exit(void)
|
|||
if (trace_hwlat_callback_enabled)
|
||||
trace_hwlat_callback(false);
|
||||
#endif
|
||||
#ifdef CONFIG_OSNOISE_TRACER
|
||||
if (trace_osnoise_callback_enabled)
|
||||
trace_osnoise_callback(false);
|
||||
#endif
|
||||
}
|
||||
|
||||
#endif /* _LINUX_FTRACE_IRQ_H */
|
||||
|
|
|
@ -41,6 +41,13 @@ int trace_array_init_printk(struct trace_array *tr);
|
|||
void trace_array_put(struct trace_array *tr);
|
||||
struct trace_array *trace_array_get_by_name(const char *name);
|
||||
int trace_array_destroy(struct trace_array *tr);
|
||||
|
||||
/* For osnoise tracer */
|
||||
int osnoise_arch_register(void);
|
||||
void osnoise_arch_unregister(void);
|
||||
void osnoise_trace_irq_entry(int id);
|
||||
void osnoise_trace_irq_exit(int id, const char *desc);
|
||||
|
||||
#endif /* CONFIG_TRACING */
|
||||
|
||||
#endif /* _LINUX_TRACE_H */
|
||||
|
|
|
@@ -41,7 +41,17 @@ extern int
 tracepoint_probe_register_prio(struct tracepoint *tp, void *probe, void *data,
 			       int prio);
 extern int
+tracepoint_probe_register_prio_may_exist(struct tracepoint *tp, void *probe, void *data,
+					 int prio);
+extern int
 tracepoint_probe_unregister(struct tracepoint *tp, void *probe, void *data);
+static inline int
+tracepoint_probe_register_may_exist(struct tracepoint *tp, void *probe,
+				    void *data)
+{
+	return tracepoint_probe_register_prio_may_exist(tp, probe, data,
+							TRACEPOINT_DEFAULT_PRIO);
+}
 extern void
 for_each_kernel_tracepoint(void (*fct)(struct tracepoint *tp, void *priv),
 		void *priv);
|
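The merge message describes `tracepoint_probe_register_may_exist()` as registering a probe without triggering a WARN*() when the probe is already attached. The toy registry below sketches that distinction only. Everything in it is hypothetical: `struct toy_tracepoint`, the `toy_*` functions, the fixed-size probe table, and the `warnings` counter that stands in for WARN*(). It also assumes a duplicate is reported as `-EEXIST`, which this diff does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PROBES 4

struct toy_tracepoint {
	void *probes[TOY_MAX_PROBES];
	int warnings;		/* counts would-be WARN() hits */
};

/* Add a probe; a duplicate only counts as a warning when warn is set. */
static int toy_add_probe(struct toy_tracepoint *tp, void *probe, bool warn)
{
	int i, free_slot = -1;

	assert(tp != NULL && probe != NULL);

	for (i = 0; i < TOY_MAX_PROBES; i++) {
		if (tp->probes[i] == probe) {
			if (warn)
				tp->warnings++;
			return -EEXIST;
		}
		if (tp->probes[i] == NULL && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		return -ENOMEM;

	tp->probes[free_slot] = probe;
	return 0;
}

/* Strict variant: a duplicate is treated as a caller bug. */
static int toy_register(struct toy_tracepoint *tp, void *probe)
{
	return toy_add_probe(tp, probe, true);
}

/* Lenient variant: a duplicate is an expected, quiet error. */
static int toy_register_may_exist(struct toy_tracepoint *tp, void *probe)
{
	return toy_add_probe(tp, probe, false);
}
```

Both variants return the same error for a duplicate. The lenient one differs only in staying quiet, which suits a path such as the BPF one named in the merge message, where user space can legitimately request the same attachment twice.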
|
|
@ -1092,7 +1092,7 @@ TRACE_EVENT(btrfs_trigger_flush,
|
|||
__entry->flags = flags;
|
||||
__entry->bytes = bytes;
|
||||
__entry->flush = flush;
|
||||
__assign_str(reason, reason)
|
||||
__assign_str(reason, reason);
|
||||
),
|
||||
|
||||
TP_printk_btrfs("%s: flush=%d(%s) flags=%llu(%s) bytes=%llu",
|
||||
|
|
|
@ -23,8 +23,8 @@ DECLARE_EVENT_CLASS(dma_fence,
|
|||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__assign_str(driver, fence->ops->get_driver_name(fence))
|
||||
__assign_str(timeline, fence->ops->get_timeline_name(fence))
|
||||
__assign_str(driver, fence->ops->get_driver_name(fence));
|
||||
__assign_str(timeline, fence->ops->get_timeline_name(fence));
|
||||
__entry->context = fence->context;
|
||||
__entry->seqno = fence->seqno;
|
||||
),
|
||||
|
|
|
@ -0,0 +1,142 @@
|
|||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#undef TRACE_SYSTEM
|
||||
#define TRACE_SYSTEM osnoise
|
||||
|
||||
#if !defined(_OSNOISE_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
|
||||
#define _OSNOISE_TRACE_H
|
||||
|
||||
#include <linux/tracepoint.h>
|
||||
TRACE_EVENT(thread_noise,
|
||||
|
||||
TP_PROTO(struct task_struct *t, u64 start, u64 duration),
|
||||
|
||||
TP_ARGS(t, start, duration),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__array( char, comm, TASK_COMM_LEN)
|
||||
__field( u64, start )
|
||||
__field( u64, duration)
|
||||
__field( pid_t, pid )
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
memcpy(__entry->comm, t->comm, TASK_COMM_LEN);
|
||||
__entry->pid = t->pid;
|
||||
__entry->start = start;
|
||||
__entry->duration = duration;
|
||||
),
|
||||
|
||||
TP_printk("%8s:%d start %llu.%09u duration %llu ns",
|
||||
__entry->comm,
|
||||
__entry->pid,
|
||||
__print_ns_to_secs(__entry->start),
|
||||
__print_ns_without_secs(__entry->start),
|
||||
__entry->duration)
|
||||
);
|
||||
|
||||
TRACE_EVENT(softirq_noise,
|
||||
|
||||
TP_PROTO(int vector, u64 start, u64 duration),
|
||||
|
||||
TP_ARGS(vector, start, duration),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field( u64, start )
|
||||
__field( u64, duration)
|
||||
__field( int, vector )
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__entry->vector = vector;
|
||||
__entry->start = start;
|
||||
__entry->duration = duration;
|
||||
),
|
||||
|
||||
TP_printk("%8s:%d start %llu.%09u duration %llu ns",
|
||||
show_softirq_name(__entry->vector),
|
||||
__entry->vector,
|
||||
__print_ns_to_secs(__entry->start),
|
||||
__print_ns_without_secs(__entry->start),
|
||||
__entry->duration)
|
||||
);
|
||||
|
||||
TRACE_EVENT(irq_noise,
|
||||
|
||||
TP_PROTO(int vector, const char *desc, u64 start, u64 duration),
|
||||
|
||||
TP_ARGS(vector, desc, start, duration),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field( u64, start )
|
||||
__field( u64, duration)
|
||||
__string( desc, desc )
|
||||
__field( int, vector )
|
||||
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__assign_str(desc, desc);
|
||||
__entry->vector = vector;
|
||||
__entry->start = start;
|
||||
__entry->duration = duration;
|
||||
),
|
||||
|
||||
TP_printk("%s:%d start %llu.%09u duration %llu ns",
|
||||
__get_str(desc),
|
||||
__entry->vector,
|
||||
__print_ns_to_secs(__entry->start),
|
||||
__print_ns_without_secs(__entry->start),
|
||||
__entry->duration)
|
||||
);
|
||||
|
||||
TRACE_EVENT(nmi_noise,
|
||||
|
||||
TP_PROTO(u64 start, u64 duration),
|
||||
|
||||
TP_ARGS(start, duration),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field( u64, start )
|
||||
__field( u64, duration)
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__entry->start = start;
|
||||
__entry->duration = duration;
|
||||
),
|
||||
|
||||
TP_printk("start %llu.%09u duration %llu ns",
|
||||
__print_ns_to_secs(__entry->start),
|
||||
__print_ns_without_secs(__entry->start),
|
||||
__entry->duration)
|
||||
);
|
||||
|
||||
TRACE_EVENT(sample_threshold,
|
||||
|
||||
TP_PROTO(u64 start, u64 duration, u64 interference),
|
||||
|
||||
TP_ARGS(start, duration, interference),
|
||||
|
||||
TP_STRUCT__entry(
|
||||
__field( u64, start )
|
||||
__field( u64, duration)
|
||||
__field( u64, interference)
|
||||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__entry->start = start;
|
||||
__entry->duration = duration;
|
||||
__entry->interference = interference;
|
||||
),
|
||||
|
||||
TP_printk("start %llu.%09u duration %llu ns interference %llu",
|
||||
__print_ns_to_secs(__entry->start),
|
||||
__print_ns_without_secs(__entry->start),
|
||||
__entry->duration,
|
||||
__entry->interference)
|
||||
);
|
||||
|
||||
#endif /* _OSNOISE_TRACE_H */
|
||||
|
||||
/* This part must be outside protection */
|
||||
#include <trace/define_trace.h>
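Every osnoise event above prints its `start` timestamp as `%llu.%09u`, that is, whole seconds followed by a zero-padded nanosecond remainder. The sketch below shows that split in userspace C. It assumes `__print_ns_to_secs()` and `__print_ns_without_secs()` amount to a division and a remainder by one billion. The `sketch_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/* Whole seconds contained in a nanosecond timestamp. */
static uint64_t sketch_ns_to_secs(uint64_t ns)
{
	return ns / SKETCH_NSEC_PER_SEC;
}

/* Nanoseconds left over after removing the whole seconds. */
static uint32_t sketch_ns_without_secs(uint64_t ns)
{
	return (uint32_t)(ns % SKETCH_NSEC_PER_SEC);
}

/* Format like the nmi_noise event: "start S.NNNNNNNNN duration D ns". */
static int sketch_format_noise(char *buf, size_t len,
			       uint64_t start, uint64_t duration)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "start %llu.%09u duration %llu ns",
			(unsigned long long)sketch_ns_to_secs(start),
			(unsigned int)sketch_ns_without_secs(start),
			(unsigned long long)duration);
}
```

The `%09u` padding matters: a remainder of 42 ns must print as `.000000042`, otherwise the output would read as 0.42 s.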
|
|
@ -152,7 +152,7 @@ DECLARE_EVENT_CLASS(rpcgss_ctx_class,
|
|||
TP_fast_assign(
|
||||
__entry->cred = gc;
|
||||
__entry->service = gc->gc_service;
|
||||
__assign_str(principal, gc->gc_principal)
|
||||
__assign_str(principal, gc->gc_principal);
|
||||
),
|
||||
|
||||
TP_printk("cred=%p service=%s principal='%s'",
|
||||
|
@ -535,7 +535,7 @@ TRACE_EVENT(rpcgss_upcall_msg,
|
|||
),
|
||||
|
||||
TP_fast_assign(
|
||||
__assign_str(msg, buf)
|
||||
__assign_str(msg, buf);
|
||||
),
|
||||
|
||||
TP_printk("msg='%s'", __get_str(msg))
|
||||
|
|
|
@@ -148,7 +148,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
 		__array(	char,	comm,	TASK_COMM_LEN	)
 		__field(	pid_t,	pid			)
 		__field(	int,	prio			)
-		__field(	int,	success			)
 		__field(	int,	target_cpu		)
 	),
 
@@ -156,7 +155,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
 		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
 		__entry->pid		= p->pid;
 		__entry->prio		= p->prio; /* XXX SCHED_DEADLINE */
-		__entry->success	= 1; /* rudiment, kill when possible */
 		__entry->target_cpu	= task_cpu(p);
 	),
 
|
|
|
@ -154,8 +154,8 @@ TRACE_EVENT(rpc_clnt_new,
|
|||
__entry->client_id = clnt->cl_clid;
|
||||
__assign_str(addr, xprt->address_strings[RPC_DISPLAY_ADDR]);
|
||||
__assign_str(port, xprt->address_strings[RPC_DISPLAY_PORT]);
|
||||
__assign_str(program, program)
|
||||
__assign_str(server, server)
|
||||
__assign_str(program, program);
|
||||
__assign_str(server, server);
|
||||
),
|
||||
|
||||
TP_printk("client=%u peer=[%s]:%s program=%s server=%s",
|
||||
|
@ -180,8 +180,8 @@ TRACE_EVENT(rpc_clnt_new_err,
|
|||
|
||||
TP_fast_assign(
|
||||
__entry->error = error;
|
||||
__assign_str(program, program)
|
||||
__assign_str(server, server)
|
||||
__assign_str(program, program);
|
||||
__assign_str(server, server);
|
||||
),
|
||||
|
||||
TP_printk("program=%s server=%s error=%d",
|
||||
|
@ -284,8 +284,8 @@ TRACE_EVENT(rpc_request,
|
|||
__entry->client_id = task->tk_client->cl_clid;
|
||||
__entry->version = task->tk_client->cl_vers;
|
||||
__entry->async = RPC_IS_ASYNC(task);
|
||||
__assign_str(progname, task->tk_client->cl_program->name)
|
||||
__assign_str(procname, rpc_proc_name(task))
|
||||
__assign_str(progname, task->tk_client->cl_program->name);
|
||||
__assign_str(procname, rpc_proc_name(task));
|
||||
),
|
||||
|
||||
TP_printk("task:%u@%u %sv%d %s (%ssync)",
|
||||
|
@ -494,10 +494,10 @@ DECLARE_EVENT_CLASS(rpc_reply_event,
|
|||
__entry->task_id = task->tk_pid;
|
||||
__entry->client_id = task->tk_client->cl_clid;
|
||||
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
|
||||
__assign_str(progname, task->tk_client->cl_program->name)
|
||||
__assign_str(progname, task->tk_client->cl_program->name);
|
||||
__entry->version = task->tk_client->cl_vers;
|
||||
__assign_str(procname, rpc_proc_name(task))
|
||||
__assign_str(servername, task->tk_xprt->servername)
|
||||
__assign_str(procname, rpc_proc_name(task));
|
||||
__assign_str(servername, task->tk_xprt->servername);
|
||||
),
|
||||
|
||||
TP_printk("task:%u@%d server=%s xid=0x%08x %sv%d %s",
|
||||
|
@ -622,8 +622,8 @@ TRACE_EVENT(rpc_stats_latency,
|
|||
__entry->task_id = task->tk_pid;
|
||||
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
|
||||
__entry->version = task->tk_client->cl_vers;
|
||||
__assign_str(progname, task->tk_client->cl_program->name)
|
||||
__assign_str(procname, rpc_proc_name(task))
|
||||
__assign_str(progname, task->tk_client->cl_program->name);
|
||||
__assign_str(procname, rpc_proc_name(task));
|
||||
__entry->backlog = ktime_to_us(backlog);
|
||||
__entry->rtt = ktime_to_us(rtt);
|
||||
__entry->execute = ktime_to_us(execute);
|
||||
|
@ -669,15 +669,15 @@ TRACE_EVENT(rpc_xdr_overflow,
|
|||
__entry->task_id = task->tk_pid;
|
||||
__entry->client_id = task->tk_client->cl_clid;
|
||||
__assign_str(progname,
|
||||
task->tk_client->cl_program->name)
|
||||
task->tk_client->cl_program->name);
|
||||
__entry->version = task->tk_client->cl_vers;
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
|
||||
} else {
|
||||
__entry->task_id = 0;
|
||||
__entry->client_id = 0;
|
||||
__assign_str(progname, "unknown")
|
||||
__assign_str(progname, "unknown");
|
||||
__entry->version = 0;
|
||||
__assign_str(procedure, "unknown")
|
||||
__assign_str(procedure, "unknown");
|
||||
}
|
||||
__entry->requested = requested;
|
||||
__entry->end = xdr->end;
|
||||
|
@ -735,9 +735,9 @@ TRACE_EVENT(rpc_xdr_alignment,
|
|||
__entry->task_id = task->tk_pid;
|
||||
__entry->client_id = task->tk_client->cl_clid;
|
||||
__assign_str(progname,
|
||||
task->tk_client->cl_program->name)
|
||||
task->tk_client->cl_program->name);
|
||||
__entry->version = task->tk_client->cl_vers;
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
|
||||
|
||||
__entry->offset = offset;
|
||||
__entry->copied = copied;
|
||||
|
@ -1107,9 +1107,9 @@ TRACE_EVENT(xprt_retransmit,
|
|||
__entry->xid = be32_to_cpu(rqst->rq_xid);
|
||||
__entry->ntrans = rqst->rq_ntrans;
|
||||
__assign_str(progname,
|
||||
task->tk_client->cl_program->name)
|
||||
task->tk_client->cl_program->name);
|
||||
__entry->version = task->tk_client->cl_vers;
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
|
||||
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
|
||||
),
|
||||
|
||||
TP_printk(
|
||||
|
@ -1842,7 +1842,7 @@ TRACE_EVENT(svc_xprt_accept,
|
|||
|
||||
TP_fast_assign(
|
||||
__assign_str(addr, xprt->xpt_remotebuf);
|
||||
__assign_str(protocol, xprt->xpt_class->xcl_name)
|
||||
__assign_str(protocol, xprt->xpt_class->xcl_name);
|
||||
__assign_str(service, service);
|
||||
),
|
||||
|
||||
|
|
|
@@ -36,7 +36,8 @@
 	EM( WB_REASON_PERIODIC, "periodic") \
 	EM( WB_REASON_LAPTOP_TIMER, "laptop_timer") \
 	EM( WB_REASON_FS_FREE_SPACE, "fs_free_space") \
-	EMe(WB_REASON_FORKER_THREAD, "forker_thread")
+	EM( WB_REASON_FORKER_THREAD, "forker_thread") \
+	EMe(WB_REASON_FOREIGN_FLUSH, "foreign_flush")

 WB_WORK_REASON

|
|
|
@ -358,6 +358,21 @@ TRACE_MAKE_SYSTEM_STR();
|
|||
trace_print_hex_dump_seq(p, prefix_str, prefix_type, \
|
||||
rowsize, groupsize, buf, len, ascii)
|
||||
|
||||
#undef __print_ns_to_secs
|
||||
#define __print_ns_to_secs(value) \
|
||||
({ \
|
||||
u64 ____val = (u64)(value); \
|
||||
do_div(____val, NSEC_PER_SEC); \
|
||||
____val; \
|
||||
})
|
||||
|
||||
#undef __print_ns_without_secs
|
||||
#define __print_ns_without_secs(value) \
|
||||
({ \
|
||||
u64 ____val = (u64)(value); \
|
||||
(u32) do_div(____val, NSEC_PER_SEC); \
|
||||
})
|
||||
|
||||
#undef DECLARE_EVENT_CLASS
|
||||
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
|
||||
static notrace enum print_line_t \
|
||||
|
@ -736,6 +751,16 @@ static inline void ftrace_test_probe_##call(void) \
|
|||
#undef __print_array
|
||||
#undef __print_hex_dump
|
||||
|
||||
/*
|
||||
* The below is not executed in the kernel. It is only what is
|
||||
* displayed in the print format for userspace to parse.
|
||||
*/
|
||||
#undef __print_ns_to_secs
|
||||
#define __print_ns_to_secs(val) (val) / 1000000000UL
|
||||
|
||||
#undef __print_ns_without_secs
|
||||
#define __print_ns_without_secs(val) (val) % 1000000000UL
|
||||
|
||||
#undef TP_printk
|
||||
#define TP_printk(fmt, args...) "\"" fmt "\", " __stringify(args)
|
||||
|
||||
|
|
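Note on the two __print_ns_* helpers above: the kernel's do_div() divides its first argument in place and returns the remainder, which is why __print_ns_to_secs evaluates to ____val after the call while __print_ns_without_secs evaluates to the call's return value. A minimal user-space sketch of that split follows. The names div_in_place(), ns_to_secs() and ns_without_secs() are illustrative stand-ins, not kernel API.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000U

/* Stand-in for do_div(): quotient is stored in *n, remainder is returned. */
static uint32_t div_in_place(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/* Whole seconds contained in a nanosecond value. */
static uint64_t ns_to_secs(uint64_t ns)
{
	div_in_place(&ns, NSEC_PER_SEC);
	return ns;
}

/* Nanoseconds left over once the whole seconds are removed. */
static uint32_t ns_without_secs(uint64_t ns)
{
	return div_in_place(&ns, NSEC_PER_SEC);
}
```

A TP_printk() format such as "%llu.%09u" could then print the two parts side by side; that format string is an assumption for illustration and does not appear in this diff.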
12
init/main.c
|
@@ -386,16 +386,6 @@ static char * __init xbc_make_cmdline(const char *key)
 	return new_cmdline;
 }

-static u32 boot_config_checksum(unsigned char *p, u32 size)
-{
-	u32 ret = 0;
-
-	while (size--)
-		ret += *p++;
-
-	return ret;
-}
-
 static int __init bootconfig_params(char *param, char *val,
 				    const char *unused, void *arg)
 {
@@ -439,7 +429,7 @@ static void __init setup_boot_config(void)
 		return;
 	}

-	if (boot_config_checksum((unsigned char *)data, size) != csum) {
+	if (xbc_calc_checksum(data, size) != csum) {
 		pr_err("bootconfig checksum failed\n");
 		return;
 	}
|
|
|
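Note on the checksum change above: the removed boot_config_checksum() is a plain byte sum over the bootconfig data. The definition of xbc_calc_checksum() is not part of this hunk, so the sketch below only mirrors the removed helper and assumes the replacement computes the same value. byte_sum_checksum() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum every byte of the buffer into a 32-bit accumulator (wraps on overflow). */
static uint32_t byte_sum_checksum(const void *data, size_t size)
{
	const unsigned char *p = data;
	uint32_t sum = 0;

	while (size--)
		sum += *p++;

	return sum;
}
```

A loader would compare this value with the checksum stored next to the bootconfig data and reject the data on mismatch, as setup_boot_config() does with pr_err().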
@ -356,6 +356,68 @@ config HWLAT_TRACER
|
|||
file. Every time a latency is greater than tracing_thresh, it will
|
||||
be recorded into the ring buffer.
|
||||
|
||||
config OSNOISE_TRACER
|
||||
bool "OS Noise tracer"
|
||||
select GENERIC_TRACER
|
||||
help
|
||||
In the context of high-performance computing (HPC), the Operating
|
||||
System Noise (osnoise) refers to the interference experienced by an
|
||||
application due to activities inside the operating system. In the
|
||||
context of Linux, NMIs, IRQs, SoftIRQs, and any other system thread
|
||||
can cause noise to the system. Moreover, hardware-related jobs can
|
||||
also cause noise, for example, via SMIs.
|
||||
|
||||
The osnoise tracer leverages the hwlat_detector by running a similar
|
||||
loop with preemption, SoftIRQs and IRQs enabled, thus allowing all
|
||||
the sources of osnoise during its execution. The osnoise tracer takes
|
||||
note of the entry and exit point of any source of interferences,
|
||||
increasing a per-cpu interference counter. It saves an interference
|
||||
counter for each source of interference. The interference counter for
|
||||
NMI, IRQs, SoftIRQs, and threads is increased anytime the tool
|
||||
observes these interferences' entry events. When a noise happens
|
||||
without any interference from the operating system level, the
|
||||
hardware noise counter increases, pointing to a hardware-related
|
||||
noise. In this way, osnoise can account for any source of
|
||||
interference. At the end of the period, the osnoise tracer prints
|
||||
the sum of all noise, the max single noise, the percentage of CPU
|
||||
available for the thread, and the counters for the noise sources.
|
||||
|
||||
In addition to the tracer, a set of tracepoints were added to
|
||||
facilitate the identification of the osnoise source.
|
||||
|
||||
The output will appear in the trace and trace_pipe files.
|
||||
|
||||
To enable this tracer, echo in "osnoise" into the current_tracer
|
||||
file.
|
||||
|
||||
config TIMERLAT_TRACER
|
||||
bool "Timerlat tracer"
|
||||
select OSNOISE_TRACER
|
||||
select GENERIC_TRACER
|
||||
help
|
||||
The timerlat tracer aims to help the preemptive kernel developers
|
||||
to find sources of wakeup latencies of real-time threads.
|
||||
|
||||
The tracer creates a per-cpu kernel thread with real-time priority.
|
||||
The tracer thread sets a periodic timer to wakeup itself, and goes
|
||||
to sleep waiting for the timer to fire. At the wakeup, the thread
|
||||
then computes a wakeup latency value as the difference between
|
||||
the current time and the absolute time that the timer was set
|
||||
to expire.
|
||||
|
||||
The tracer prints two lines at every activation. The first is the
|
||||
timer latency observed at the hardirq context before the
|
||||
activation of the thread. The second is the timer latency observed
|
||||
by the thread, which is the same level that cyclictest reports. The
|
||||
ACTIVATION ID field serves to relate the irq execution to its
|
||||
respective thread execution.
|
||||
|
||||
The tracer is build on top of osnoise tracer, and the osnoise:
|
||||
events can be used to trace the source of interference from NMI,
|
||||
IRQs and other threads. It also enables the capture of the
|
||||
stacktrace at the IRQ context, which helps to identify the code
|
||||
path that can cause thread delay.
|
||||
|
||||
config MMIOTRACE
|
||||
bool "Memory mapped IO tracing"
|
||||
depends on HAVE_MMIOTRACE_SUPPORT && PCI
|
||||
|
|
|
@ -58,6 +58,7 @@ obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o
|
|||
obj-$(CONFIG_PREEMPT_TRACER) += trace_irqsoff.o
|
||||
obj-$(CONFIG_SCHED_TRACER) += trace_sched_wakeup.o
|
||||
obj-$(CONFIG_HWLAT_TRACER) += trace_hwlat.o
|
||||
obj-$(CONFIG_OSNOISE_TRACER) += trace_osnoise.o
|
||||
obj-$(CONFIG_NOP_TRACER) += trace_nop.o
|
||||
obj-$(CONFIG_STACK_TRACER) += trace_stack.o
|
||||
obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o
|
||||
|
|
|
@@ -1842,7 +1842,8 @@ static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *
 	if (prog->aux->max_tp_access > btp->writable_size)
 		return -EINVAL;

-	return tracepoint_probe_register(tp, (void *)btp->bpf_func, prog);
+	return tracepoint_probe_register_may_exist(tp, (void *)btp->bpf_func,
+						   prog);
 }

 int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog)
|
|
|
@@ -3391,7 +3391,7 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer,
 		case RINGBUF_TYPE_PADDING:
 			if (event->time_delta == 1)
 				break;
-			/* fall through */
+			fallthrough;
 		case RINGBUF_TYPE_DATA:
 			ts += event->time_delta;
 			break;
|
|
|
@ -87,6 +87,7 @@ void __init disable_tracing_selftest(const char *reason)
|
|||
/* Pipe tracepoints to printk */
|
||||
struct trace_iterator *tracepoint_print_iter;
|
||||
int tracepoint_printk;
|
||||
static bool tracepoint_printk_stop_on_boot __initdata;
|
||||
static DEFINE_STATIC_KEY_FALSE(tracepoint_printk_key);
|
||||
|
||||
/* For tracers that don't implement custom flags */
|
||||
|
@@ -197,12 +198,12 @@ __setup("ftrace=", set_cmdline_ftrace);

 static int __init set_ftrace_dump_on_oops(char *str)
 {
-	if (*str++ != '=' || !*str) {
+	if (*str++ != '=' || !*str || !strcmp("1", str)) {
 		ftrace_dump_on_oops = DUMP_ALL;
 		return 1;
 	}

-	if (!strcmp("orig_cpu", str)) {
+	if (!strcmp("orig_cpu", str) || !strcmp("2", str)) {
 		ftrace_dump_on_oops = DUMP_ORIG;
 		return 1;
 	}
|
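Note on the ftrace_dump_on_oops hunk above: after this change the boot option accepts "1" as a synonym for the bare option and "2" as a synonym for "orig_cpu". A user-space sketch of the same matching logic follows. parse_dump_on_oops() is a hypothetical name, the enum values are assumed to follow the numbers mentioned in the changelog, and returning DUMP_NONE for unrecognized input is an assumption because the rest of the kernel function is not shown in the hunk.

```c
#include <string.h>

enum dump_mode { DUMP_NONE = 0, DUMP_ALL = 1, DUMP_ORIG = 2 };

/*
 * str is what follows the option name on the command line:
 * "" for the bare option, or "=value".
 */
static enum dump_mode parse_dump_on_oops(const char *str)
{
	/* Bare option, empty value, or "1": dump every CPU's buffer. */
	if (*str++ != '=' || !*str || !strcmp("1", str))
		return DUMP_ALL;

	/* "orig_cpu" or "2": dump only the CPU that triggered the oops. */
	if (!strcmp("orig_cpu", str) || !strcmp("2", str))
		return DUMP_ORIG;

	return DUMP_NONE;	/* assumption: anything else is not accepted */
}
```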
@ -257,6 +258,13 @@ static int __init set_tracepoint_printk(char *str)
|
|||
}
|
||||
__setup("tp_printk", set_tracepoint_printk);
|
||||
|
||||
static int __init set_tracepoint_printk_stop(char *str)
|
||||
{
|
||||
tracepoint_printk_stop_on_boot = true;
|
||||
return 1;
|
||||
}
|
||||
__setup("tp_printk_stop_on_boot", set_tracepoint_printk_stop);
|
||||
|
||||
unsigned long long ns2usecs(u64 nsec)
|
||||
{
|
||||
nsec += 500;
|
||||
|
@ -1683,8 +1691,7 @@ static ssize_t trace_seq_to_buffer(struct trace_seq *s, void *buf, size_t cnt)
|
|||
unsigned long __read_mostly tracing_thresh;
|
||||
static const struct file_operations tracing_max_lat_fops;
|
||||
|
||||
#if (defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)) && \
|
||||
defined(CONFIG_FSNOTIFY)
|
||||
#ifdef LATENCY_FS_NOTIFY
|
||||
|
||||
static struct workqueue_struct *fsnotify_wq;
|
||||
|
||||
|
@@ -2185,8 +2192,15 @@ void tracing_reset_all_online_cpus(void)
 	}
 }

+/*
+ * The tgid_map array maps from pid to tgid; i.e. the value stored at index i
+ * is the tgid last observed corresponding to pid=i.
+ */
 static int *tgid_map;

+/* The maximum valid index into tgid_map. */
+static size_t tgid_map_max;
+
 #define SAVED_CMDLINES_DEFAULT 128
 #define NO_CMDLINE_MAP UINT_MAX
 static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED;
@@ -2459,24 +2473,41 @@ void trace_find_cmdline(int pid, char comm[])
 	preempt_enable();
 }

+static int *trace_find_tgid_ptr(int pid)
+{
+	/*
+	 * Pairs with the smp_store_release in set_tracer_flag() to ensure that
+	 * if we observe a non-NULL tgid_map then we also observe the correct
+	 * tgid_map_max.
+	 */
+	int *map = smp_load_acquire(&tgid_map);
+
+	if (unlikely(!map || pid > tgid_map_max))
+		return NULL;
+
+	return &map[pid];
+}
+
 int trace_find_tgid(int pid)
 {
-	if (unlikely(!tgid_map || !pid || pid > PID_MAX_DEFAULT))
-		return 0;
+	int *ptr = trace_find_tgid_ptr(pid);

-	return tgid_map[pid];
+	return ptr ? *ptr : 0;
 }

 static int trace_save_tgid(struct task_struct *tsk)
 {
+	int *ptr;
+
 	/* treat recording of idle task as a success */
 	if (!tsk->pid)
 		return 1;

-	if (unlikely(!tgid_map || tsk->pid > PID_MAX_DEFAULT))
+	ptr = trace_find_tgid_ptr(tsk->pid);
+	if (!ptr)
 		return 0;

-	tgid_map[tsk->pid] = tsk->tgid;
+	*ptr = tsk->tgid;
 	return 1;
 }

|
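Note on the tgid_map hunks above: the reader in trace_find_tgid_ptr() relies on a release/acquire pairing so that a non-NULL tgid_map is never seen together with a stale tgid_map_max. The same publication pattern can be sketched in user space with C11 atomics. This is an illustration only: the function names are hypothetical, calloc() stands in for kvcalloc(), and a single writer is assumed (the kernel asserts that event_mutex is held on the write side).

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Published pointer plus the bound that must become visible with it. */
static _Atomic(int *) tgid_map;
static size_t tgid_map_max;

/* Writer: set the bound first, then publish the array with release order. */
static int tgid_map_alloc(size_t max_pid)
{
	int *map;

	if (atomic_load_explicit(&tgid_map, memory_order_relaxed))
		return 0;	/* already allocated */

	tgid_map_max = max_pid;
	map = calloc(max_pid + 1, sizeof(*map));
	if (!map)
		return -1;

	atomic_store_explicit(&tgid_map, map, memory_order_release);
	return 0;
}

/* Reader: the acquire load ensures tgid_map_max matches the observed map. */
static int *find_tgid_ptr(int pid)
{
	int *map = atomic_load_explicit(&tgid_map, memory_order_acquire);

	if (!map || pid < 0 || (size_t)pid > tgid_map_max)
		return NULL;

	return &map[pid];
}

/* Look up the recorded tgid, or 0 when the map or the slot is unavailable. */
static int find_tgid(int pid)
{
	int *ptr = find_tgid_ptr(pid);

	return ptr ? *ptr : 0;
}

/* Record pid -> tgid; returns 1 on success, 0 when pid is out of range. */
static int save_tgid(int pid, int tgid)
{
	int *ptr = find_tgid_ptr(pid);

	if (!ptr)
		return 0;

	*ptr = tgid;
	return 1;
}
```

Routing every access through one bounds-checked helper is what lets the patch size the array from pid_max instead of PID_MAX_DEFAULT without touching each caller's range check.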
@ -2730,9 +2761,45 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb,
|
|||
if (!tr->no_filter_buffering_ref &&
|
||||
(trace_file->flags & (EVENT_FILE_FL_SOFT_DISABLED | EVENT_FILE_FL_FILTERED)) &&
|
||||
(entry = this_cpu_read(trace_buffered_event))) {
|
||||
/* Try to use the per cpu buffer first */
|
||||
/*
|
||||
* Filtering is on, so try to use the per cpu buffer first.
|
||||
* This buffer will simulate a ring_buffer_event,
|
||||
* where the type_len is zero and the array[0] will
|
||||
* hold the full length.
|
||||
* (see include/linux/ring-buffer.h for details on
|
||||
* how the ring_buffer_event is structured).
|
||||
*
|
||||
* Using a temp buffer during filtering and copying it
|
||||
* on a matched filter is quicker than writing directly
|
||||
* into the ring buffer and then discarding it when
|
||||
* it doesn't match. That is because the discard
|
||||
* requires several atomic operations to get right.
|
||||
* Copying on match and doing nothing on a failed match
|
||||
* is still quicker than no copy on match, but having
|
||||
* to discard out of the ring buffer on a failed match.
|
||||
*/
|
||||
int max_len = PAGE_SIZE - struct_size(entry, array, 1);
|
||||
|
||||
val = this_cpu_inc_return(trace_buffered_event_cnt);
|
||||
if ((len < (PAGE_SIZE - sizeof(*entry) - sizeof(entry->array[0]))) && val == 1) {
|
||||
|
||||
/*
|
||||
* Preemption is disabled, but interrupts and NMIs
|
||||
* can still come in now. If that happens after
|
||||
* the above increment, then it will have to go
|
||||
* back to the old method of allocating the event
|
||||
* on the ring buffer, and if the filter fails, it
|
||||
* will have to call ring_buffer_discard_commit()
|
||||
* to remove it.
|
||||
*
|
||||
* Need to also check the unlikely case that the
|
||||
* length is bigger than the temp buffer size.
|
||||
* If that happens, then the reserve is pretty much
|
||||
* guaranteed to fail, as the ring buffer currently
|
||||
* only allows events less than a page. But that may
|
||||
* change in the future, so let the ring buffer reserve
|
||||
* handle the failure in that case.
|
||||
*/
|
||||
if (val == 1 && likely(len <= max_len)) {
|
||||
trace_event_setup(entry, type, trace_ctx);
|
||||
entry->array[0] = len;
|
||||
return entry;
|
||||
|
@@ -5172,6 +5239,8 @@ int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set)

 int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
 {
+	int *map;
+
 	if ((mask == TRACE_ITER_RECORD_TGID) ||
 	    (mask == TRACE_ITER_RECORD_CMD))
 		lockdep_assert_held(&event_mutex);
@@ -5194,10 +5263,19 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
 		trace_event_enable_cmd_record(enabled);

 	if (mask == TRACE_ITER_RECORD_TGID) {
-		if (!tgid_map)
-			tgid_map = kvcalloc(PID_MAX_DEFAULT + 1,
-					   sizeof(*tgid_map),
+		if (!tgid_map) {
+			tgid_map_max = pid_max;
+			map = kvcalloc(tgid_map_max + 1, sizeof(*tgid_map),
 					   GFP_KERNEL);
+
+			/*
+			 * Pairs with smp_load_acquire() in
+			 * trace_find_tgid_ptr() to ensure that if it observes
+			 * the tgid_map we just allocated then it also observes
+			 * the corresponding tgid_map_max value.
+			 */
+			smp_store_release(&tgid_map, map);
+		}
 		if (!tgid_map) {
 			tr->trace_flags &= ~TRACE_ITER_RECORD_TGID;
 			return -ENOMEM;
|
@@ -5609,37 +5687,16 @@ static const struct file_operations tracing_readme_fops = {

 static void *saved_tgids_next(struct seq_file *m, void *v, loff_t *pos)
 {
-	int *ptr = v;
+	int pid = ++(*pos);

-	if (*pos || m->count)
-		ptr++;
-
-	(*pos)++;
-
-	for (; ptr <= &tgid_map[PID_MAX_DEFAULT]; ptr++) {
-		if (trace_find_tgid(*ptr))
-			return ptr;
-	}
-
-	return NULL;
+	return trace_find_tgid_ptr(pid);
 }

 static void *saved_tgids_start(struct seq_file *m, loff_t *pos)
 {
-	void *v;
-	loff_t l = 0;
+	int pid = *pos;

-	if (!tgid_map)
-		return NULL;
-
-	v = &tgid_map[0];
-	while (l <= *pos) {
-		v = saved_tgids_next(m, v, &l);
-		if (!v)
-			return NULL;
-	}
-
-	return v;
+	return trace_find_tgid_ptr(pid);
 }

 static void saved_tgids_stop(struct seq_file *m, void *v)
@@ -5648,9 +5705,14 @@ static void saved_tgids_stop(struct seq_file *m, void *v)

 static int saved_tgids_show(struct seq_file *m, void *v)
 {
-	int pid = (int *)v - tgid_map;
+	int *entry = (int *)v;
+	int pid = entry - tgid_map;
+	int tgid = *entry;

-	seq_printf(m, "%d %d\n", pid, trace_find_tgid(pid));
+	if (tgid == 0)
+		return SEQ_SKIP;
+
+	seq_printf(m, "%d %d\n", pid, tgid);
 	return 0;
 }

|
@ -6135,7 +6197,7 @@ static int __tracing_resize_ring_buffer(struct trace_array *tr,
|
|||
ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
|
||||
unsigned long size, int cpu_id)
|
||||
{
|
||||
int ret = size;
|
||||
int ret;
|
||||
|
||||
mutex_lock(&trace_types_lock);
|
||||
|
||||
|
@ -7529,6 +7591,91 @@ static const struct file_operations snapshot_raw_fops = {
|
|||
|
||||
#endif /* CONFIG_TRACER_SNAPSHOT */
|
||||
|
||||
/*
|
||||
* trace_min_max_write - Write a u64 value to a trace_min_max_param struct
|
||||
* @filp: The active open file structure
|
||||
* @ubuf: The userspace provided buffer to read value into
|
||||
* @cnt: The maximum number of bytes to read
|
||||
* @ppos: The current "file" position
|
||||
*
|
||||
* This function implements the write interface for a struct trace_min_max_param.
|
||||
* The filp->private_data must point to a trace_min_max_param structure that
|
||||
* defines where to write the value, the min and the max acceptable values,
|
||||
* and a lock to protect the write.
|
||||
*/
|
||||
static ssize_t
|
||||
trace_min_max_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos)
|
||||
{
|
||||
struct trace_min_max_param *param = filp->private_data;
|
||||
u64 val;
|
||||
int err;
|
||||
|
||||
if (!param)
|
||||
return -EFAULT;
|
||||
|
||||
err = kstrtoull_from_user(ubuf, cnt, 10, &val);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
if (param->lock)
|
||||
mutex_lock(param->lock);
|
||||
|
||||
if (param->min && val < *param->min)
|
||||
err = -EINVAL;
|
||||
|
||||
if (param->max && val > *param->max)
|
||||
err = -EINVAL;
|
||||
|
||||
if (!err)
|
||||
*param->val = val;
|
||||
|
||||
if (param->lock)
|
||||
mutex_unlock(param->lock);
|
||||
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
return cnt;
|
||||
}
|
||||
|
||||
/*
|
||||
* trace_min_max_read - Read a u64 value from a trace_min_max_param struct
|
||||
* @filp: The active open file structure
|
||||
* @ubuf: The userspace provided buffer to read value into
|
||||
* @cnt: The maximum number of bytes to read
|
||||
* @ppos: The current "file" position
|
||||
*
|
||||
* This function implements the read interface for a struct trace_min_max_param.
|
||||
* The filp->private_data must point to a trace_min_max_param struct with valid
|
||||
* data.
|
||||
*/
|
||||
static ssize_t
|
||||
trace_min_max_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
|
||||
{
|
||||
struct trace_min_max_param *param = filp->private_data;
|
||||
char buf[U64_STR_SIZE];
|
||||
int len;
|
||||
u64 val;
|
||||
|
||||
if (!param)
|
||||
return -EFAULT;
|
||||
|
||||
val = *param->val;
|
||||
|
||||
if (cnt > sizeof(buf))
|
||||
cnt = sizeof(buf);
|
||||
|
||||
len = snprintf(buf, sizeof(buf), "%llu\n", val);
|
||||
|
||||
return simple_read_from_buffer(ubuf, cnt, ppos, buf, len);
|
||||
}
|
||||
|
||||
const struct file_operations trace_min_max_fops = {
|
||||
.open = tracing_open_generic,
|
||||
.read = trace_min_max_read,
|
||||
.write = trace_min_max_write,
|
||||
};
|
||||
|
||||
#define TRACING_LOG_ERRS_MAX 8
|
||||
#define TRACING_LOG_LOC_MAX 128
|
||||
|
||||
|
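Note on trace_min_max_write() above: the write path parses a decimal u64, rejects it when it falls outside the optional min/max bounds, and only then stores it. A user-space sketch of that validation follows. It is an approximation under stated assumptions: strtoull() stands in for kstrtoull_from_user(), the optional mutex is left out, the function returns 0 on success instead of the byte count, and the names min_max_param and min_max_store() are hypothetical.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct min_max_param {
	uint64_t *val;	/* where an accepted value is stored */
	uint64_t *min;	/* optional lower bound, may be NULL */
	uint64_t *max;	/* optional upper bound, may be NULL */
};

/* Parse text as a decimal u64 and store it if it lies within the bounds. */
static int min_max_store(struct min_max_param *param, const char *text)
{
	unsigned long long val;
	char *end;

	if (!param)
		return -EFAULT;

	/* strtoull() would accept signs and leading blanks; require a digit. */
	if (!isdigit((unsigned char)text[0]))
		return -EINVAL;

	errno = 0;
	val = strtoull(text, &end, 10);
	if (errno || (*end != '\0' && *end != '\n'))
		return -EINVAL;

	if (param->min && val < *param->min)
		return -EINVAL;

	if (param->max && val > *param->max)
		return -EINVAL;

	*param->val = val;
	return 0;
}
```

As in the kernel function, a rejected value leaves the stored value untouched, so a reader never sees an out-of-range setting.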
@@ -9532,6 +9679,8 @@ static __init int tracer_init_tracefs(void)
 	return 0;
 }

+fs_initcall(tracer_init_tracefs);
+
 static int trace_panic_handler(struct notifier_block *this,
 			       unsigned long event, void *unused)
 {
@@ -9952,7 +10101,7 @@ void __init trace_init(void)
 	trace_event_init();
 }

-__init static int clear_boot_tracer(void)
+__init static void clear_boot_tracer(void)
 {
 	/*
 	 * The default tracer at boot buffer is an init section.
@@ -9962,26 +10111,21 @@ __init static int clear_boot_tracer(void)
 	 * about to be freed.
 	 */
 	if (!default_bootup_tracer)
-		return 0;
+		return;

 	printk(KERN_INFO "ftrace bootup tracer '%s' not registered.\n",
 	       default_bootup_tracer);
 	default_bootup_tracer = NULL;
-
-	return 0;
 }

-fs_initcall(tracer_init_tracefs);
-late_initcall_sync(clear_boot_tracer);
-
 #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
-__init static int tracing_set_default_clock(void)
+__init static void tracing_set_default_clock(void)
 {
 	/* sched_clock_stable() is determined in late_initcall */
 	if (!trace_boot_clock && !sched_clock_stable()) {
 		if (security_locked_down(LOCKDOWN_TRACEFS)) {
 			pr_warn("Can not set tracing clock due to lockdown\n");
-			return -EPERM;
+			return;
 		}

 		printk(KERN_WARNING
@@ -9991,8 +10135,21 @@ __init static int tracing_set_default_clock(void)
 		       "on the kernel command line\n");
 		tracing_set_clock(&global_trace, "global");
 	}
+}
+#else
+static inline void tracing_set_default_clock(void) { }
+#endif
+
+__init static int late_trace_init(void)
+{
+	if (tracepoint_printk && tracepoint_printk_stop_on_boot) {
+		static_key_disable(&tracepoint_printk_key.key);
+		tracepoint_printk = 0;
+	}

+	tracing_set_default_clock();
+	clear_boot_tracer();
 	return 0;
 }
-late_initcall_sync(tracing_set_default_clock);
-#endif
+
+late_initcall_sync(late_trace_init);
|
|
|
@ -45,6 +45,8 @@ enum trace_type {
|
|||
TRACE_BLK,
|
||||
TRACE_BPUTS,
|
||||
TRACE_HWLAT,
|
||||
TRACE_OSNOISE,
|
||||
TRACE_TIMERLAT,
|
||||
TRACE_RAW_DATA,
|
||||
TRACE_FUNC_REPEATS,
|
||||
|
||||
|
@ -290,7 +292,8 @@ struct trace_array {
|
|||
struct array_buffer max_buffer;
|
||||
bool allocated_snapshot;
|
||||
#endif
|
||||
#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)
|
||||
#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER) \
|
||||
|| defined(CONFIG_OSNOISE_TRACER)
|
||||
unsigned long max_latency;
|
||||
#ifdef CONFIG_FSNOTIFY
|
||||
struct dentry *d_max_latency;
|
||||
|
@ -438,6 +441,8 @@ extern void __ftrace_bad_type(void);
|
|||
IF_ASSIGN(var, ent, struct bprint_entry, TRACE_BPRINT); \
|
||||
IF_ASSIGN(var, ent, struct bputs_entry, TRACE_BPUTS); \
|
||||
IF_ASSIGN(var, ent, struct hwlat_entry, TRACE_HWLAT); \
|
||||
IF_ASSIGN(var, ent, struct osnoise_entry, TRACE_OSNOISE);\
|
||||
IF_ASSIGN(var, ent, struct timerlat_entry, TRACE_TIMERLAT);\
|
||||
IF_ASSIGN(var, ent, struct raw_data_entry, TRACE_RAW_DATA);\
|
||||
IF_ASSIGN(var, ent, struct trace_mmiotrace_rw, \
|
||||
TRACE_MMIO_RW); \
|
||||
|
@ -668,15 +673,15 @@ void update_max_tr_single(struct trace_array *tr,
|
|||
struct task_struct *tsk, int cpu);
|
||||
#endif /* CONFIG_TRACER_MAX_TRACE */
|
||||
|
||||
#if (defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)) && \
|
||||
defined(CONFIG_FSNOTIFY)
|
||||
#if (defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER) \
|
||||
|| defined(CONFIG_OSNOISE_TRACER)) && defined(CONFIG_FSNOTIFY)
|
||||
#define LATENCY_FS_NOTIFY
|
||||
#endif
|
||||
|
||||
#ifdef LATENCY_FS_NOTIFY
|
||||
void latency_fsnotify(struct trace_array *tr);
|
||||
|
||||
#else
|
||||
|
||||
static inline void latency_fsnotify(struct trace_array *tr) { }
|
||||
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_STACKTRACE
|
||||
|
@ -1945,4 +1950,22 @@ static inline bool is_good_name(const char *name)
|
|||
return true;
|
||||
}
|
||||
|
||||
/*
|
||||
* This is a generic way to read and write a u64 value from a file in tracefs.
|
||||
*
|
||||
* The value is stored on the variable pointed by *val. The value needs
|
||||
* to be at least *min and at most *max. The write is protected by an
|
||||
* existing *lock.
|
||||
*/
|
||||
struct trace_min_max_param {
|
||||
struct mutex *lock;
|
||||
u64 *val;
|
||||
u64 *min;
|
||||
u64 *max;
|
||||
};
|
||||
|
||||
#define U64_STR_SIZE 24 /* 20 digits max */
|
||||
|
||||
extern const struct file_operations trace_min_max_fops;
|
||||
|
||||
#endif /* _LINUX_KERNEL_TRACE_H */
|
||||
|
|
|
@@ -225,14 +225,37 @@ static void __init
 trace_boot_init_events(struct trace_array *tr, struct xbc_node *node)
 {
 	struct xbc_node *gnode, *enode;
+	bool enable, enable_all = false;
+	const char *data;

 	node = xbc_node_find_child(node, "event");
 	if (!node)
 		return;
 	/* per-event key starts with "event.GROUP.EVENT" */
-	xbc_node_for_each_child(node, gnode)
-		xbc_node_for_each_child(gnode, enode)
+	xbc_node_for_each_child(node, gnode) {
+		data = xbc_node_get_data(gnode);
+		if (!strcmp(data, "enable")) {
+			enable_all = true;
+			continue;
+		}
+		enable = false;
+		xbc_node_for_each_child(gnode, enode) {
+			data = xbc_node_get_data(enode);
+			if (!strcmp(data, "enable")) {
+				enable = true;
+				continue;
+			}
 			trace_boot_init_one_event(tr, gnode, enode);
+		}
+		/* Event enablement must be done after event settings */
+		if (enable) {
+			data = xbc_node_get_data(gnode);
+			trace_array_set_clr_event(tr, data, NULL, true);
+		}
+	}
+	/* Ditto */
+	if (enable_all)
+		trace_array_set_clr_event(tr, NULL, NULL, true);
 }
 #else
 #define trace_boot_enable_events(tr, node) do {} while (0)
|
|
|
@ -360,3 +360,44 @@ FTRACE_ENTRY(func_repeats, func_repeats_entry,
|
|||
__entry->count,
|
||||
FUNC_REPEATS_GET_DELTA_TS(__entry))
|
||||
);
|
||||
|
||||
FTRACE_ENTRY(osnoise, osnoise_entry,
|
||||
|
||||
TRACE_OSNOISE,
|
||||
|
||||
F_STRUCT(
|
||||
__field( u64, noise )
|
||||
__field( u64, runtime )
|
||||
__field( u64, max_sample )
|
||||
__field( unsigned int, hw_count )
|
||||
__field( unsigned int, nmi_count )
|
||||
__field( unsigned int, irq_count )
|
||||
__field( unsigned int, softirq_count )
|
||||
__field( unsigned int, thread_count )
|
||||
),
|
||||
|
||||
F_printk("noise:%llu\tmax_sample:%llu\thw:%u\tnmi:%u\tirq:%u\tsoftirq:%u\tthread:%u\n",
|
||||
__entry->noise,
|
||||
__entry->max_sample,
|
||||
__entry->hw_count,
|
||||
__entry->nmi_count,
|
||||
__entry->irq_count,
|
||||
__entry->softirq_count,
|
||||
__entry->thread_count)
|
||||
);
|
||||
|
||||
FTRACE_ENTRY(timerlat, timerlat_entry,
|
||||
|
||||
TRACE_TIMERLAT,
|
||||
|
||||
F_STRUCT(
|
||||
__field( unsigned int, seqnum )
|
||||
__field( int, context )
|
||||
__field( u64, timer_latency )
|
||||
),
|
||||
|
||||
F_printk("seq:%u\tcontext:%d\ttimer_latency:%llu\n",
|
||||
__entry->seqnum,
|
||||
__entry->context,
|
||||
__entry->timer_latency)
|
||||
);
|
||||
|
|
|
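Note on the osnoise entry above: F_STRUCT records a runtime field, but the F_printk format prints only the noise, the maximum single sample and the five per-source counters. A user-space sketch that renders one record with the same format string follows. The struct osnoise_sample and format_osnoise() names are hypothetical, and the field values in any example call are invented.

```c
#include <stdint.h>
#include <stdio.h>

/* Mirrors the fields listed in F_STRUCT for the osnoise entry. */
struct osnoise_sample {
	uint64_t noise;
	uint64_t runtime;	/* recorded, but not part of the print format */
	uint64_t max_sample;
	unsigned int hw_count;
	unsigned int nmi_count;
	unsigned int irq_count;
	unsigned int softirq_count;
	unsigned int thread_count;
};

/* Render one record using the format string from the F_printk() above. */
static int format_osnoise(char *buf, size_t len, const struct osnoise_sample *s)
{
	return snprintf(buf, len,
			"noise:%llu\tmax_sample:%llu\thw:%u\tnmi:%u\tirq:%u\tsoftirq:%u\tthread:%u\n",
			(unsigned long long)s->noise,
			(unsigned long long)s->max_sample,
			s->hw_count, s->nmi_count, s->irq_count,
			s->softirq_count, s->thread_count);
}
```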
@ -2434,12 +2434,12 @@ create_field_var_hist(struct hist_trigger_data *target_hist_data,
|
|||
char *subsys_name, char *event_name, char *field_name)
|
||||
{
|
||||
struct trace_array *tr = target_hist_data->event_file->tr;
|
||||
struct hist_field *event_var = ERR_PTR(-EINVAL);
|
||||
struct hist_trigger_data *hist_data;
|
||||
unsigned int i, n, first = true;
|
||||
struct field_var_hist *var_hist;
|
||||
struct trace_event_file *file;
|
||||
struct hist_field *key_field;
|
||||
struct hist_field *event_var;
|
||||
char *saved_filter;
|
||||
char *cmd;
|
||||
int ret;
|
||||
|
@ -5232,6 +5232,7 @@ static void unregister_field_var_hists(struct hist_trigger_data *hist_data)
|
|||
cmd = hist_data->field_var_hists[i]->cmd;
|
||||
ret = event_hist_trigger_func(&trigger_hist_cmd, file,
|
||||
"!hist", "hist", cmd);
|
||||
WARN_ON_ONCE(ret < 0);
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
@ -916,7 +916,8 @@ void unpause_named_trigger(struct event_trigger_data *data)
|
|||
|
||||
/**
|
||||
* set_named_trigger_data - Associate common named trigger data
|
||||
* @data: The trigger data of a named trigger to unpause
|
||||
* @data: The trigger data to associate
|
||||
* @named_data: The common named trigger to be associated
|
||||
*
|
||||
* Named triggers are sets of triggers that share a common set of
|
||||
* trigger data. The first named trigger registered with a given name
|
||||
|
|
|
@ -34,7 +34,7 @@
|
|||
* Copyright (C) 2008-2009 Jon Masters, Red Hat, Inc. <jcm@redhat.com>
|
||||
* Copyright (C) 2013-2016 Steven Rostedt, Red Hat, Inc. <srostedt@redhat.com>
|
||||
*
|
||||
* Includes useful feedback from Clark Williams <clark@redhat.com>
|
||||
* Includes useful feedback from Clark Williams <williams@redhat.com>
|
||||
*
|
||||
*/
|
||||
#include <linux/kthread.h>
|
||||
|
@ -54,20 +54,33 @@ static struct trace_array *hwlat_trace;
|
|||
#define DEFAULT_SAMPLE_WIDTH 500000 /* 0.5s */
|
||||
#define DEFAULT_LAT_THRESHOLD 10 /* 10us */
|
||||
|
||||
/* sampling thread*/
|
||||
static struct task_struct *hwlat_kthread;
|
||||
|
||||
static struct dentry *hwlat_sample_width; /* sample width us */
|
||||
static struct dentry *hwlat_sample_window; /* sample window us */
|
||||
static struct dentry *hwlat_thread_mode; /* hwlat thread mode */
|
||||
|
||||
enum {
|
||||
MODE_NONE = 0,
|
||||
MODE_ROUND_ROBIN,
|
||||
MODE_PER_CPU,
|
||||
MODE_MAX
|
||||
};
|
||||
static char *thread_mode_str[] = { "none", "round-robin", "per-cpu" };
|
||||
|
||||
/* Save the previous tracing_thresh value */
|
||||
static unsigned long save_tracing_thresh;
|
||||
|
||||
/* NMI timestamp counters */
|
||||
static u64 nmi_ts_start;
|
||||
static u64 nmi_total_ts;
|
||||
static int nmi_count;
|
||||
static int nmi_cpu;
|
||||
/* runtime kthread data */
|
||||
struct hwlat_kthread_data {
|
||||
struct task_struct *kthread;
|
||||
/* NMI timestamp counters */
|
||||
u64 nmi_ts_start;
|
||||
u64 nmi_total_ts;
|
||||
int nmi_count;
|
||||
int nmi_cpu;
|
||||
};
|
||||
|
||||
struct hwlat_kthread_data hwlat_single_cpu_data;
|
||||
DEFINE_PER_CPU(struct hwlat_kthread_data, hwlat_per_cpu_data);
|
||||
|
||||
/* Tells NMIs to call back to the hwlat tracer to record timestamps */
|
||||
bool trace_hwlat_callback_enabled;
|
||||
|
@ -96,11 +109,24 @@ static struct hwlat_data {
|
|||
u64 sample_window; /* total sampling window (on+off) */
|
||||
u64 sample_width; /* active sampling portion of window */
|
||||
|
||||
int thread_mode; /* thread mode */
|
||||
|
||||
} hwlat_data = {
|
||||
.sample_window = DEFAULT_SAMPLE_WINDOW,
|
||||
.sample_width = DEFAULT_SAMPLE_WIDTH,
|
||||
.thread_mode = MODE_ROUND_ROBIN
|
||||
};
|
||||
|
||||
static struct hwlat_kthread_data *get_cpu_data(void)
{
	if (hwlat_data.thread_mode == MODE_PER_CPU)
		return this_cpu_ptr(&hwlat_per_cpu_data);
	else
		return &hwlat_single_cpu_data;
}
|
||||
|
||||
static bool hwlat_busy;
|
||||
|
||||
static void trace_hwlat_sample(struct hwlat_sample *sample)
|
||||
{
|
||||
struct trace_array *tr = hwlat_trace;
|
||||
|
@ -136,7 +162,9 @@ static void trace_hwlat_sample(struct hwlat_sample *sample)
|
|||
|
||||
void trace_hwlat_callback(bool enter)
|
||||
{
|
||||
if (smp_processor_id() != nmi_cpu)
|
||||
struct hwlat_kthread_data *kdata = get_cpu_data();
|
||||
|
||||
if (!kdata->kthread)
|
||||
return;
|
||||
|
||||
/*
|
||||
|
@ -145,15 +173,24 @@ void trace_hwlat_callback(bool enter)
|
|||
*/
|
||||
if (!IS_ENABLED(CONFIG_GENERIC_SCHED_CLOCK)) {
|
||||
if (enter)
|
||||
nmi_ts_start = time_get();
|
||||
kdata->nmi_ts_start = time_get();
|
||||
else
|
||||
nmi_total_ts += time_get() - nmi_ts_start;
|
||||
kdata->nmi_total_ts += time_get() - kdata->nmi_ts_start;
|
||||
}
|
||||
|
||||
if (enter)
|
||||
nmi_count++;
|
||||
kdata->nmi_count++;
|
||||
}
|
||||
|
||||
/*
|
||||
* hwlat_err - report a hwlat error.
|
||||
*/
|
||||
#define hwlat_err(msg) ({ \
|
||||
struct trace_array *tr = hwlat_trace; \
|
||||
\
|
||||
trace_array_printk_buf(tr->array_buffer.buffer, _THIS_IP_, msg); \
|
||||
})
|
||||
|
||||
/**
|
||||
* get_sample - sample the CPU TSC and look for likely hardware latencies
|
||||
*
|
||||
|
@ -163,6 +200,7 @@ void trace_hwlat_callback(bool enter)
|
|||
*/
|
||||
static int get_sample(void)
|
||||
{
|
||||
struct hwlat_kthread_data *kdata = get_cpu_data();
|
||||
struct trace_array *tr = hwlat_trace;
|
||||
struct hwlat_sample s;
|
||||
time_type start, t1, t2, last_t2;
|
||||
|
@ -175,9 +213,8 @@ static int get_sample(void)
|
|||
|
||||
do_div(thresh, NSEC_PER_USEC); /* modifies interval value */
|
||||
|
||||
nmi_cpu = smp_processor_id();
|
||||
nmi_total_ts = 0;
|
||||
nmi_count = 0;
|
||||
kdata->nmi_total_ts = 0;
|
||||
kdata->nmi_count = 0;
|
||||
/* Make sure NMIs see this first */
|
||||
barrier();
|
||||
|
||||
|
@ -197,7 +234,7 @@ static int get_sample(void)
|
|||
outer_diff = time_to_us(time_sub(t1, last_t2));
|
||||
/* This shouldn't happen */
|
||||
if (outer_diff < 0) {
|
||||
pr_err(BANNER "time running backwards\n");
|
||||
hwlat_err(BANNER "time running backwards\n");
|
||||
goto out;
|
||||
}
|
||||
if (outer_diff > outer_sample)
|
||||
|
@ -209,7 +246,7 @@ static int get_sample(void)
|
|||
|
||||
/* Check for possible overflows */
|
||||
if (total < last_total) {
|
||||
pr_err("Time total overflowed\n");
|
||||
hwlat_err("Time total overflowed\n");
|
||||
break;
|
||||
}
|
||||
last_total = total;
|
||||
|
@ -225,7 +262,7 @@ static int get_sample(void)
|
|||
|
||||
/* This shouldn't happen */
|
||||
if (diff < 0) {
|
||||
pr_err(BANNER "time running backwards\n");
|
||||
hwlat_err(BANNER "time running backwards\n");
|
||||
goto out;
|
||||
}
|
||||
|
||||
|
@ -247,15 +284,15 @@ static int get_sample(void)
|
|||
ret = 1;
|
||||
|
||||
/* We read in microseconds */
|
||||
if (nmi_total_ts)
|
||||
do_div(nmi_total_ts, NSEC_PER_USEC);
|
||||
if (kdata->nmi_total_ts)
|
||||
do_div(kdata->nmi_total_ts, NSEC_PER_USEC);
|
||||
|
||||
hwlat_data.count++;
|
||||
s.seqnum = hwlat_data.count;
|
||||
s.duration = sample;
|
||||
s.outer_duration = outer_sample;
|
||||
s.nmi_total_ts = nmi_total_ts;
|
||||
s.nmi_count = nmi_count;
|
||||
s.nmi_total_ts = kdata->nmi_total_ts;
|
||||
s.nmi_count = kdata->nmi_count;
|
||||
s.count = count;
|
||||
trace_hwlat_sample(&s);
|
||||
|
||||
|
@ -273,7 +310,6 @@ out:
|
|||
}
|
||||
|
||||
static struct cpumask save_cpumask;
|
||||
static bool disable_migrate;
|
||||
|
||||
static void move_to_next_cpu(void)
|
||||
{
|
||||
|
@ -281,15 +317,13 @@ static void move_to_next_cpu(void)
|
|||
struct trace_array *tr = hwlat_trace;
|
||||
int next_cpu;
|
||||
|
||||
if (disable_migrate)
|
||||
return;
|
||||
/*
|
||||
* If for some reason the user modifies the CPU affinity
|
||||
* of this thread, then stop migrating for the duration
|
||||
* of the current test.
|
||||
*/
|
||||
if (!cpumask_equal(current_mask, current->cpus_ptr))
|
||||
goto disable;
|
||||
goto change_mode;
|
||||
|
||||
get_online_cpus();
|
||||
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
|
||||
|
@ -300,7 +334,7 @@ static void move_to_next_cpu(void)
|
|||
next_cpu = cpumask_first(current_mask);
|
||||
|
||||
if (next_cpu >= nr_cpu_ids) /* Shouldn't happen! */
|
||||
goto disable;
|
||||
goto change_mode;
|
||||
|
||||
cpumask_clear(current_mask);
|
||||
cpumask_set_cpu(next_cpu, current_mask);
|
||||
|
@ -308,8 +342,9 @@ static void move_to_next_cpu(void)
|
|||
sched_setaffinity(0, current_mask);
|
||||
return;
|
||||
|
||||
disable:
|
||||
disable_migrate = true;
|
||||
change_mode:
|
||||
hwlat_data.thread_mode = MODE_NONE;
|
||||
pr_info(BANNER "cpumask changed while in round-robin mode, switching to mode none\n");
|
||||
}
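The round-robin step in move_to_next_cpu() can be illustrated outside the kernel. The sketch below is a user-space approximation, assuming the allowed CPUs fit in a 64-bit mask. `next_cpu_round_robin` is a hypothetical helper, not a kernel function. It returns the next allowed CPU after the current one, wrapping around to the lowest allowed CPU, and returns -1 for an empty mask (the situation in which the code above gives up and switches to mode none).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): return the next CPU set in
 * mask after cur, wrapping around to the lowest set bit.
 * cur is expected to be in the range 0..63.
 * Returns -1 if no CPU is set in mask.
 */
int next_cpu_round_robin(uint64_t mask, int cur)
{
	int i;

	/* Probe the 64 positions following cur, ending on cur itself. */
	for (i = 1; i <= 64; i++) {
		int cpu = (cur + i) % 64;

		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	}

	return -1;
}
```

With mask 0xb (CPUs 0, 1 and 3 allowed), a thread on CPU 1 moves to CPU 3, and a thread on CPU 3 wraps back to CPU 0.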
|
||||
|
||||
/*
|
||||
|
@ -328,6 +363,7 @@ static int kthread_fn(void *data)
|
|||
|
||||
while (!kthread_should_stop()) {
|
||||
|
||||
if (hwlat_data.thread_mode == MODE_ROUND_ROBIN)
|
||||
move_to_next_cpu();
|
||||
|
||||
local_irq_disable();
|
||||
|
@ -351,178 +387,380 @@ static int kthread_fn(void *data)
|
|||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* start_kthread - Kick off the hardware latency sampling/detector kthread
|
||||
/*
|
||||
* stop_single_kthread - Inform the hardware latency sampling/detector kthread to stop
|
||||
*
|
||||
* This kicks the running hardware latency sampling/detector kernel thread and
|
||||
* tells it to stop sampling now. Use this on unload and at system shutdown.
|
||||
*/
|
||||
static void stop_single_kthread(void)
|
||||
{
|
||||
struct hwlat_kthread_data *kdata = get_cpu_data();
|
||||
struct task_struct *kthread;
|
||||
|
||||
get_online_cpus();
|
||||
kthread = kdata->kthread;
|
||||
|
||||
if (!kthread)
|
||||
goto out_put_cpus;
|
||||
|
||||
kthread_stop(kthread);
|
||||
kdata->kthread = NULL;
|
||||
|
||||
out_put_cpus:
|
||||
put_online_cpus();
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* start_single_kthread - Kick off the hardware latency sampling/detector kthread
|
||||
*
|
||||
* This starts the kernel thread that will sit and sample the CPU timestamp
|
||||
* counter (TSC or similar) and look for potential hardware latencies.
|
||||
*/
|
||||
static int start_kthread(struct trace_array *tr)
|
||||
static int start_single_kthread(struct trace_array *tr)
|
||||
{
|
||||
struct hwlat_kthread_data *kdata = get_cpu_data();
|
||||
struct cpumask *current_mask = &save_cpumask;
|
||||
struct task_struct *kthread;
|
||||
int next_cpu;
|
||||
|
||||
if (hwlat_kthread)
|
||||
return 0;
|
||||
|
||||
/* Just pick the first CPU on first iteration */
|
||||
get_online_cpus();
|
||||
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
|
||||
put_online_cpus();
|
||||
next_cpu = cpumask_first(current_mask);
|
||||
if (kdata->kthread)
|
||||
goto out_put_cpus;
|
||||
|
||||
kthread = kthread_create(kthread_fn, NULL, "hwlatd");
|
||||
if (IS_ERR(kthread)) {
|
||||
pr_err(BANNER "could not start sampling thread\n");
|
||||
put_online_cpus();
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
/* Just pick the first CPU on first iteration */
|
||||
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
|
||||
|
||||
if (hwlat_data.thread_mode == MODE_ROUND_ROBIN) {
|
||||
next_cpu = cpumask_first(current_mask);
|
||||
cpumask_clear(current_mask);
|
||||
cpumask_set_cpu(next_cpu, current_mask);
|
||||
|
||||
}
|
||||
|
||||
sched_setaffinity(kthread->pid, current_mask);
|
||||
|
||||
kdata->kthread = kthread;
|
||||
wake_up_process(kthread);
|
||||
|
||||
out_put_cpus:
|
||||
put_online_cpus();
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* stop_cpu_kthread - Stop a hwlat cpu kthread
|
||||
*/
|
||||
static void stop_cpu_kthread(unsigned int cpu)
|
||||
{
|
||||
struct task_struct *kthread;
|
||||
|
||||
kthread = per_cpu(hwlat_per_cpu_data, cpu).kthread;
|
||||
if (kthread)
|
||||
kthread_stop(kthread);
|
||||
per_cpu(hwlat_per_cpu_data, cpu).kthread = NULL;
|
||||
}
|
||||
|
||||
/*
|
||||
* stop_per_cpu_kthreads - Inform the hardware latency sampling/detector kthreads to stop
|
||||
*
|
||||
* This kicks the running hardware latency sampling/detector kernel threads and
|
||||
* tells them to stop sampling now. Use this on unload and at system shutdown.
|
||||
*/
|
||||
static void stop_per_cpu_kthreads(void)
|
||||
{
|
||||
unsigned int cpu;
|
||||
|
||||
get_online_cpus();
|
||||
for_each_online_cpu(cpu)
|
||||
stop_cpu_kthread(cpu);
|
||||
put_online_cpus();
|
||||
}
|
||||
|
||||
/*
|
||||
* start_cpu_kthread - Start a hwlat cpu kthread
|
||||
*/
|
||||
static int start_cpu_kthread(unsigned int cpu)
|
||||
{
|
||||
struct task_struct *kthread;
|
||||
char comm[24];
|
||||
|
||||
snprintf(comm, 24, "hwlatd/%d", cpu);
|
||||
|
||||
kthread = kthread_create_on_cpu(kthread_fn, NULL, cpu, comm);
|
||||
if (IS_ERR(kthread)) {
|
||||
pr_err(BANNER "could not start sampling thread\n");
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
cpumask_clear(current_mask);
|
||||
cpumask_set_cpu(next_cpu, current_mask);
|
||||
sched_setaffinity(kthread->pid, current_mask);
|
||||
|
||||
hwlat_kthread = kthread;
|
||||
per_cpu(hwlat_per_cpu_data, cpu).kthread = kthread;
|
||||
wake_up_process(kthread);
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* stop_kthread - Inform the hardware latency sampling/detector kthread to stop
|
||||
*
|
||||
* This kicks the running hardware latency sampling/detector kernel thread and
|
||||
* tells it to stop sampling now. Use this on unload and at system shutdown.
|
||||
*/
|
||||
static void stop_kthread(void)
|
||||
#ifdef CONFIG_HOTPLUG_CPU
|
||||
static void hwlat_hotplug_workfn(struct work_struct *dummy)
|
||||
{
|
||||
if (!hwlat_kthread)
|
||||
return;
|
||||
kthread_stop(hwlat_kthread);
|
||||
hwlat_kthread = NULL;
|
||||
struct trace_array *tr = hwlat_trace;
|
||||
unsigned int cpu = smp_processor_id();
|
||||
|
||||
mutex_lock(&trace_types_lock);
|
||||
mutex_lock(&hwlat_data.lock);
|
||||
get_online_cpus();
|
||||
|
||||
if (!hwlat_busy || hwlat_data.thread_mode != MODE_PER_CPU)
|
||||
goto out_unlock;
|
||||
|
||||
if (!cpumask_test_cpu(cpu, tr->tracing_cpumask))
|
||||
goto out_unlock;
|
||||
|
||||
start_cpu_kthread(cpu);
|
||||
|
||||
out_unlock:
|
||||
put_online_cpus();
|
||||
mutex_unlock(&hwlat_data.lock);
|
||||
mutex_unlock(&trace_types_lock);
|
||||
}
|
||||
|
||||
static DECLARE_WORK(hwlat_hotplug_work, hwlat_hotplug_workfn);
|
||||
|
||||
/*
|
||||
* hwlat_cpu_init - CPU hotplug online callback function
|
||||
*/
|
||||
static int hwlat_cpu_init(unsigned int cpu)
|
||||
{
|
||||
schedule_work_on(cpu, &hwlat_hotplug_work);
|
||||
return 0;
|
||||
}
|
||||
|
||||
/*
|
||||
* hwlat_read - Wrapper read function for reading both window and width
|
||||
* @filp: The active open file structure
|
||||
* @ubuf: The userspace provided buffer to read value into
|
||||
* @cnt: The maximum number of bytes to read
|
||||
* @ppos: The current "file" position
|
||||
*
|
||||
* This function provides a generic read implementation for the global state
|
||||
* "hwlat_data" structure filesystem entries.
|
||||
* hwlat_cpu_die - CPU hotplug offline callback function
|
||||
*/
|
||||
static ssize_t hwlat_read(struct file *filp, char __user *ubuf,
|
||||
static int hwlat_cpu_die(unsigned int cpu)
|
||||
{
|
||||
stop_cpu_kthread(cpu);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void hwlat_init_hotplug_support(void)
|
||||
{
|
||||
int ret;
|
||||
|
||||
ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "trace/hwlat:online",
|
||||
hwlat_cpu_init, hwlat_cpu_die);
|
||||
if (ret < 0)
|
||||
pr_warn(BANNER "Error to init cpu hotplug support\n");
|
||||
|
||||
return;
|
||||
}
|
||||
#else /* CONFIG_HOTPLUG_CPU */
|
||||
static void hwlat_init_hotplug_support(void)
|
||||
{
|
||||
return;
|
||||
}
|
||||
#endif /* CONFIG_HOTPLUG_CPU */
|
||||
|
||||
/*
|
||||
* start_per_cpu_kthreads - Kick off the hardware latency sampling/detector kthreads
|
||||
*
|
||||
* This starts the kernel threads that will sit on potentially all cpus and
|
||||
* sample the CPU timestamp counter (TSC or similar) and look for potential
|
||||
* hardware latencies.
|
||||
*/
|
||||
static int start_per_cpu_kthreads(struct trace_array *tr)
|
||||
{
|
||||
struct cpumask *current_mask = &save_cpumask;
|
||||
unsigned int cpu;
|
||||
int retval;
|
||||
|
||||
get_online_cpus();
|
||||
/*
|
||||
* Run only on CPUs in which hwlat is allowed to run.
|
||||
*/
|
||||
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
|
||||
|
||||
for_each_online_cpu(cpu)
|
||||
per_cpu(hwlat_per_cpu_data, cpu).kthread = NULL;
|
||||
|
||||
for_each_cpu(cpu, current_mask) {
|
||||
retval = start_cpu_kthread(cpu);
|
||||
if (retval)
|
||||
goto out_error;
|
||||
}
|
||||
put_online_cpus();
|
||||
|
||||
return 0;
|
||||
|
||||
out_error:
|
||||
put_online_cpus();
|
||||
stop_per_cpu_kthreads();
|
||||
return retval;
|
||||
}
|
||||
|
||||
static void *s_mode_start(struct seq_file *s, loff_t *pos)
|
||||
{
|
||||
int mode = *pos;
|
||||
|
||||
mutex_lock(&hwlat_data.lock);
|
||||
|
||||
if (mode >= MODE_MAX)
|
||||
return NULL;
|
||||
|
||||
return pos;
|
||||
}
|
||||
|
||||
static void *s_mode_next(struct seq_file *s, void *v, loff_t *pos)
|
||||
{
|
||||
int mode = ++(*pos);
|
||||
|
||||
if (mode >= MODE_MAX)
|
||||
return NULL;
|
||||
|
||||
return pos;
|
||||
}
|
||||
|
||||
static int s_mode_show(struct seq_file *s, void *v)
|
||||
{
|
||||
loff_t *pos = v;
|
||||
int mode = *pos;
|
||||
|
||||
if (mode == hwlat_data.thread_mode)
|
||||
seq_printf(s, "[%s]", thread_mode_str[mode]);
|
||||
else
|
||||
seq_printf(s, "%s", thread_mode_str[mode]);
|
||||
|
||||
if (mode != MODE_MAX)
|
||||
seq_puts(s, " ");
|
||||
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void s_mode_stop(struct seq_file *s, void *v)
|
||||
{
|
||||
seq_puts(s, "\n");
|
||||
mutex_unlock(&hwlat_data.lock);
|
||||
}
|
||||
|
||||
static const struct seq_operations thread_mode_seq_ops = {
|
||||
.start = s_mode_start,
|
||||
.next = s_mode_next,
|
||||
.show = s_mode_show,
|
||||
.stop = s_mode_stop
|
||||
};
|
||||
|
||||
static int hwlat_mode_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
return seq_open(file, &thread_mode_seq_ops);
|
||||
};
|
||||
|
||||
static void hwlat_tracer_start(struct trace_array *tr);
|
||||
static void hwlat_tracer_stop(struct trace_array *tr);
|
||||
|
||||
/**
|
||||
* hwlat_mode_write - Write function for "mode" entry
|
||||
* @filp: The active open file structure
|
||||
* @ubuf: The user buffer that contains the value to write
|
||||
* @cnt: The maximum number of bytes to write to "file"
|
||||
* @ppos: The current position in @file
|
||||
*
|
||||
* This function provides a write implementation for the "mode" interface
|
||||
* to the hardware latency detector. hwlatd has different operation modes.
|
||||
* The "none" sets the allowed cpumask for a single hwlatd thread at the
|
||||
* startup and lets the scheduler handle the migration. The default mode is
|
||||
* the "round-robin" one, in which a single hwlatd thread runs, migrating
|
||||
* among the allowed CPUs in a round-robin fashion. The "per-cpu" mode
|
||||
* creates one hwlatd thread per allowed CPU.
|
||||
*/
|
||||
static ssize_t hwlat_mode_write(struct file *filp, const char __user *ubuf,
|
||||
size_t cnt, loff_t *ppos)
|
||||
{
|
||||
char buf[U64STR_SIZE];
|
||||
u64 *entry = filp->private_data;
|
||||
u64 val;
|
||||
int len;
|
||||
struct trace_array *tr = hwlat_trace;
|
||||
const char *mode;
|
||||
char buf[64];
|
||||
int ret, i;
|
||||
|
||||
if (!entry)
|
||||
if (cnt >= sizeof(buf))
|
||||
return -EINVAL;
|
||||
|
||||
if (copy_from_user(buf, ubuf, cnt))
|
||||
return -EFAULT;
|
||||
|
||||
if (cnt > sizeof(buf))
|
||||
cnt = sizeof(buf);
|
||||
buf[cnt] = 0;
|
||||
|
||||
val = *entry;
|
||||
mode = strstrip(buf);
|
||||
|
||||
len = snprintf(buf, sizeof(buf), "%llu\n", val);
|
||||
ret = -EINVAL;
|
||||
|
||||
return simple_read_from_buffer(ubuf, cnt, ppos, buf, len);
|
||||
}
|
||||
|
||||
/**
|
||||
* hwlat_width_write - Write function for "width" entry
|
||||
* @filp: The active open file structure
|
||||
* @ubuf: The user buffer that contains the value to write
|
||||
* @cnt: The maximum number of bytes to write to "file"
|
||||
* @ppos: The current position in @file
|
||||
*
|
||||
* This function provides a write implementation for the "width" interface
|
||||
* to the hardware latency detector. It can be used to configure
|
||||
* for how many us of the total window us we will actively sample for any
|
||||
* hardware-induced latency periods. Obviously, it is not possible to
|
||||
* sample constantly and have the system respond to a sample reader, or,
|
||||
* worse, without having the system appear to have gone out to lunch. It
|
||||
* is enforced that width is less than the total window size.
|
||||
/*
|
||||
* trace_types_lock is taken to avoid concurrency on start/stop
|
||||
* and hwlat_busy.
|
||||
*/
|
||||
static ssize_t
|
||||
hwlat_width_write(struct file *filp, const char __user *ubuf,
|
||||
size_t cnt, loff_t *ppos)
|
||||
{
|
||||
u64 val;
|
||||
int err;
|
||||
|
||||
err = kstrtoull_from_user(ubuf, cnt, 10, &val);
|
||||
if (err)
|
||||
return err;
|
||||
mutex_lock(&trace_types_lock);
|
||||
if (hwlat_busy)
|
||||
hwlat_tracer_stop(tr);
|
||||
|
||||
mutex_lock(&hwlat_data.lock);
|
||||
if (val < hwlat_data.sample_window)
|
||||
hwlat_data.sample_width = val;
|
||||
else
|
||||
err = -EINVAL;
|
||||
|
||||
for (i = 0; i < MODE_MAX; i++) {
|
||||
if (strcmp(mode, thread_mode_str[i]) == 0) {
|
||||
hwlat_data.thread_mode = i;
|
||||
ret = cnt;
|
||||
}
|
||||
}
|
||||
|
||||
mutex_unlock(&hwlat_data.lock);
|
||||
|
||||
if (err)
|
||||
return err;
|
||||
if (hwlat_busy)
|
||||
hwlat_tracer_start(tr);
|
||||
mutex_unlock(&trace_types_lock);
|
||||
|
||||
return cnt;
|
||||
*ppos += cnt;
|
||||
|
||||
|
||||
|
||||
return ret;
|
||||
}
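The string matching done by hwlat_mode_write() can be tried in user space. The following is a minimal sketch, assuming the same three mode names as the diff. `parse_thread_mode` is a hypothetical helper, and the whitespace trimming is written by hand as a stand-in for the kernel's strstrip(). It returns the mode index, or -1 where the kernel function would return -EINVAL.

```c
#include <ctype.h>
#include <string.h>

enum { MODE_NONE = 0, MODE_ROUND_ROBIN, MODE_PER_CPU, MODE_MAX };

static const char *const thread_mode_str[] = { "none", "round-robin", "per-cpu" };

/*
 * Hypothetical helper (not kernel code): map user input such as
 * "per-cpu\n" to a mode index, or -1 if the name is unknown or too long.
 */
int parse_thread_mode(const char *input)
{
	char buf[64];
	size_t len = strlen(input);
	char *mode;
	int i;

	/* Same size check as the kernel code: leave room for the NUL. */
	if (len >= sizeof(buf))
		return -1;
	memcpy(buf, input, len + 1);

	/* Trim leading and trailing whitespace. */
	mode = buf;
	while (*mode && isspace((unsigned char)*mode))
		mode++;
	len = strlen(mode);
	while (len > 0 && isspace((unsigned char)mode[len - 1]))
		mode[--len] = '\0';

	for (i = 0; i < MODE_MAX; i++) {
		if (strcmp(mode, thread_mode_str[i]) == 0)
			return i;
	}

	return -1;
}
```

The trailing newline that `echo per-cpu > mode` appends is the reason the input is stripped before comparison.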
|
||||
|
||||
/**
|
||||
* hwlat_window_write - Write function for "window" entry
|
||||
* @filp: The active open file structure
|
||||
* @ubuf: The user buffer that contains the value to write
|
||||
* @cnt: The maximum number of bytes to write to "file"
|
||||
* @ppos: The current position in @file
|
||||
*
|
||||
* This function provides a write implementation for the "window" interface
|
||||
* to the hardware latency detector. The window is the total time
|
||||
* in us that will be considered one sample period. Conceptually, windows
|
||||
* occur back-to-back and contain a sample width period during which
|
||||
* actual sampling occurs. Can be used to write a new total window size. It
|
||||
* is enforced that any value written must be greater than the sample width
|
||||
* size, or an error results.
|
||||
/*
|
||||
* The width parameter is read/write using the generic trace_min_max_param
|
||||
* method. The *val is protected by the hwlat_data lock and is upper
|
||||
* bounded by the window parameter.
|
||||
*/
|
||||
static ssize_t
|
||||
hwlat_window_write(struct file *filp, const char __user *ubuf,
|
||||
size_t cnt, loff_t *ppos)
|
||||
{
|
||||
u64 val;
|
||||
int err;
|
||||
|
||||
err = kstrtoull_from_user(ubuf, cnt, 10, &val);
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
mutex_lock(&hwlat_data.lock);
|
||||
if (hwlat_data.sample_width < val)
|
||||
hwlat_data.sample_window = val;
|
||||
else
|
||||
err = -EINVAL;
|
||||
mutex_unlock(&hwlat_data.lock);
|
||||
|
||||
if (err)
|
||||
return err;
|
||||
|
||||
return cnt;
|
||||
}
|
||||
|
||||
static const struct file_operations width_fops = {
|
||||
.open = tracing_open_generic,
|
||||
.read = hwlat_read,
|
||||
.write = hwlat_width_write,
|
||||
static struct trace_min_max_param hwlat_width = {
|
||||
.lock = &hwlat_data.lock,
|
||||
.val = &hwlat_data.sample_width,
|
||||
.max = &hwlat_data.sample_window,
|
||||
.min = NULL,
|
||||
};
|
||||
|
||||
static const struct file_operations window_fops = {
|
||||
.open = tracing_open_generic,
|
||||
.read = hwlat_read,
|
||||
.write = hwlat_window_write,
|
||||
/*
 * The window parameter is read/write using the generic trace_min_max_param
 * method. The *val is protected by the hwlat_data lock and is lower
 * bounded by the width parameter.
 */
static struct trace_min_max_param hwlat_window = {
	.lock = &hwlat_data.lock,
	.val = &hwlat_data.sample_window,
	.max = NULL,
	.min = &hwlat_data.sample_width,
};
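The width and window parameters bound each other: width may not exceed window, and window may not drop below width. The sketch below shows that cross-bounding in user space. It is an illustration only: `struct min_max_param` and `min_max_set` are hypothetical stand-ins, locking is omitted, and the inclusive bounds are an assumption about how the generic trace_min_max_param write handler compares values.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for a bounded parameter (no lock in this sketch). */
struct min_max_param {
	uint64_t *val;	/* value being set */
	uint64_t *min;	/* optional lower bound, NULL if none */
	uint64_t *max;	/* optional upper bound, NULL if none */
};

/*
 * Hypothetical helper: store v if it respects the bounds that are present.
 * Returns 0 on success, -EINVAL if v is out of range.
 */
int min_max_set(struct min_max_param *p, uint64_t v)
{
	if (p->min && v < *p->min)
		return -EINVAL;
	if (p->max && v > *p->max)
		return -EINVAL;

	*p->val = v;
	return 0;
}
```

Because each parameter points at the other's live value as its bound, enlarging the width past the current window requires growing the window first.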
|
||||
|
||||
static const struct file_operations thread_mode_fops = {
|
||||
.open = hwlat_mode_open,
|
||||
.read = seq_read,
|
||||
.llseek = seq_lseek,
|
||||
.release = seq_release,
|
||||
.write = hwlat_mode_write
|
||||
};
|
||||
/**
|
||||
* init_tracefs - A function to initialize the tracefs interface files
|
||||
*
|
||||
|
@ -546,18 +784,25 @@ static int init_tracefs(void)
|
|||
|
||||
hwlat_sample_window = tracefs_create_file("window", 0640,
|
||||
top_dir,
|
||||
&hwlat_data.sample_window,
|
||||
&window_fops);
|
||||
&hwlat_window,
|
||||
&trace_min_max_fops);
|
||||
if (!hwlat_sample_window)
|
||||
goto err;
|
||||
|
||||
hwlat_sample_width = tracefs_create_file("width", 0644,
|
||||
top_dir,
|
||||
&hwlat_data.sample_width,
|
||||
&width_fops);
|
||||
&hwlat_width,
|
||||
&trace_min_max_fops);
|
||||
if (!hwlat_sample_width)
|
||||
goto err;
|
||||
|
||||
hwlat_thread_mode = trace_create_file("mode", 0644,
|
||||
top_dir,
|
||||
NULL,
|
||||
&thread_mode_fops);
|
||||
if (!hwlat_thread_mode)
|
||||
goto err;
|
||||
|
||||
return 0;
|
||||
|
||||
err:
|
||||
|
@ -569,18 +814,22 @@ static void hwlat_tracer_start(struct trace_array *tr)
|
|||
{
|
||||
int err;
|
||||
|
||||
err = start_kthread(tr);
|
||||
if (hwlat_data.thread_mode == MODE_PER_CPU)
|
||||
err = start_per_cpu_kthreads(tr);
|
||||
else
|
||||
err = start_single_kthread(tr);
|
||||
if (err)
|
||||
pr_err(BANNER "Cannot start hwlat kthread\n");
|
||||
}
|
||||
|
||||
static void hwlat_tracer_stop(struct trace_array *tr)
|
||||
{
|
||||
stop_kthread();
|
||||
if (hwlat_data.thread_mode == MODE_PER_CPU)
|
||||
stop_per_cpu_kthreads();
|
||||
else
|
||||
stop_single_kthread();
|
||||
}
|
||||
|
||||
static bool hwlat_busy;
|
||||
|
||||
static int hwlat_tracer_init(struct trace_array *tr)
|
||||
{
|
||||
/* Only allow one instance to enable this */
|
||||
|
@ -589,7 +838,6 @@ static int hwlat_tracer_init(struct trace_array *tr)
|
|||
|
||||
hwlat_trace = tr;
|
||||
|
||||
disable_migrate = false;
|
||||
hwlat_data.count = 0;
|
||||
tr->max_latency = 0;
|
||||
save_tracing_thresh = tracing_thresh;
|
||||
|
@ -608,7 +856,7 @@ static int hwlat_tracer_init(struct trace_array *tr)
|
|||
|
||||
static void hwlat_tracer_reset(struct trace_array *tr)
|
||||
{
|
||||
stop_kthread();
|
||||
hwlat_tracer_stop(tr);
|
||||
|
||||
/* the tracing threshold is static between runs */
|
||||
last_tracing_thresh = tracing_thresh;
|
||||
|
@ -637,6 +885,8 @@ __init static int init_hwlat_tracer(void)
|
|||
if (ret)
|
||||
return ret;
|
||||
|
||||
hwlat_init_hotplug_support();
|
||||
|
||||
init_tracefs();
|
||||
|
||||
return 0;
|
||||
|
|
File diff suppressed because it is too large
|
@ -1202,7 +1202,6 @@ trace_hwlat_print(struct trace_iterator *iter, int flags,
|
|||
return trace_handle_return(s);
|
||||
}
|
||||
|
||||
|
||||
static enum print_line_t
|
||||
trace_hwlat_raw(struct trace_iterator *iter, int flags,
|
||||
struct trace_event *event)
|
||||
|
@ -1232,6 +1231,122 @@ static struct trace_event trace_hwlat_event = {
|
|||
.funcs = &trace_hwlat_funcs,
|
||||
};
|
||||
|
||||
/* TRACE_OSNOISE */
static enum print_line_t
trace_osnoise_print(struct trace_iterator *iter, int flags,
		    struct trace_event *event)
{
	struct trace_entry *entry = iter->ent;
	struct trace_seq *s = &iter->seq;
	struct osnoise_entry *field;
	u64 ratio, ratio_dec;
	u64 net_runtime;

	trace_assign_type(field, entry);

	/*
	 * compute the available % of cpu time.
	 */
	net_runtime = field->runtime - field->noise;
	ratio = net_runtime * 10000000;
	do_div(ratio, field->runtime);
	ratio_dec = do_div(ratio, 100000);

	trace_seq_printf(s, "%llu %10llu %3llu.%05llu %7llu",
			 field->runtime,
			 field->noise,
			 ratio, ratio_dec,
			 field->max_sample);

	trace_seq_printf(s, " %6u", field->hw_count);
	trace_seq_printf(s, " %6u", field->nmi_count);
	trace_seq_printf(s, " %6u", field->irq_count);
	trace_seq_printf(s, " %6u", field->softirq_count);
	trace_seq_printf(s, " %6u", field->thread_count);

	trace_seq_putc(s, '\n');

	return trace_handle_return(s);
}
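The percentage printed by trace_osnoise_print() is computed with integer arithmetic only. The sketch below reproduces that calculation in user space, replacing do_div() with plain `/` and `%`. `osnoise_available_pct` is a hypothetical name, and the sketch assumes runtime is non-zero and noise does not exceed runtime.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): split the share of CPU time
 * left to the workload into an integer percent and five decimal digits,
 * mirroring the "%3llu.%05llu" field of the osnoise output.
 */
void osnoise_available_pct(uint64_t runtime, uint64_t noise,
			   uint64_t *pct, uint64_t *pct_dec)
{
	/* Scale first so the division keeps five decimal places. */
	uint64_t ratio = (runtime - noise) * 10000000ULL;

	ratio /= runtime;		/* percent scaled by 100000 */
	*pct = ratio / 100000;		/* integer part */
	*pct_dec = ratio % 100000;	/* fractional part, five digits */
}
```

For a runtime of 1000000 with a noise of 1234, the result is 99 and 87660, which the trace output renders as 99.87660.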
|
||||
|
||||
static enum print_line_t
|
||||
trace_osnoise_raw(struct trace_iterator *iter, int flags,
|
||||
struct trace_event *event)
|
||||
{
|
||||
struct osnoise_entry *field;
|
||||
struct trace_seq *s = &iter->seq;
|
||||
|
||||
trace_assign_type(field, iter->ent);
|
||||
|
||||
trace_seq_printf(s, "%lld %llu %llu %u %u %u %u %u\n",
|
||||
field->runtime,
|
||||
field->noise,
|
||||
field->max_sample,
|
||||
field->hw_count,
|
||||
field->nmi_count,
|
||||
field->irq_count,
|
||||
field->softirq_count,
|
||||
field->thread_count);
|
||||
|
||||
return trace_handle_return(s);
|
||||
}
|
||||
|
||||
static struct trace_event_functions trace_osnoise_funcs = {
|
||||
.trace = trace_osnoise_print,
|
||||
.raw = trace_osnoise_raw,
|
||||
};
|
||||
|
||||
static struct trace_event trace_osnoise_event = {
|
||||
.type = TRACE_OSNOISE,
|
||||
.funcs = &trace_osnoise_funcs,
|
||||
};
|
||||
|
||||
/* TRACE_TIMERLAT */
|
||||
static enum print_line_t
|
||||
trace_timerlat_print(struct trace_iterator *iter, int flags,
|
||||
struct trace_event *event)
|
||||
{
|
||||
struct trace_entry *entry = iter->ent;
|
||||
struct trace_seq *s = &iter->seq;
|
||||
struct timerlat_entry *field;
|
||||
|
||||
trace_assign_type(field, entry);
|
||||
|
||||
trace_seq_printf(s, "#%-5u context %6s timer_latency %9llu ns\n",
|
||||
field->seqnum,
|
||||
field->context ? "thread" : "irq",
|
||||
field->timer_latency);
|
||||
|
||||
return trace_handle_return(s);
|
||||
}
|
||||
|
||||
static enum print_line_t
|
||||
trace_timerlat_raw(struct trace_iterator *iter, int flags,
|
||||
struct trace_event *event)
|
||||
{
|
||||
struct timerlat_entry *field;
|
||||
struct trace_seq *s = &iter->seq;
|
||||
|
||||
trace_assign_type(field, iter->ent);
|
||||
|
||||
trace_seq_printf(s, "%u %d %llu\n",
|
||||
field->seqnum,
|
||||
field->context,
|
||||
field->timer_latency);
|
||||
|
||||
return trace_handle_return(s);
|
||||
}
|
||||
|
||||
static struct trace_event_functions trace_timerlat_funcs = {
|
||||
.trace = trace_timerlat_print,
|
||||
.raw = trace_timerlat_raw,
|
||||
};
|
||||
|
||||
static struct trace_event trace_timerlat_event = {
|
||||
.type = TRACE_TIMERLAT,
|
||||
.funcs = &trace_timerlat_funcs,
|
||||
};
|
||||
|
||||
/* TRACE_BPUTS */
|
||||
static enum print_line_t
|
||||
trace_bputs_print(struct trace_iterator *iter, int flags,
|
||||
|
@ -1442,6 +1557,8 @@ static struct trace_event *events[] __initdata = {
|
|||
&trace_bprint_event,
|
||||
&trace_print_event,
|
||||
&trace_hwlat_event,
|
||||
&trace_osnoise_event,
|
||||
&trace_timerlat_event,
|
||||
&trace_raw_data_event,
|
||||
&trace_func_repeats_event,
|
||||
NULL
|
||||
|
|
|
@ -26,9 +26,9 @@ static struct task_struct *wakeup_task;
|
|||
static int wakeup_cpu;
|
||||
static int wakeup_current_cpu;
|
||||
static unsigned wakeup_prio = -1;
|
||||
static int wakeup_rt;
|
||||
static int wakeup_dl;
|
||||
static int tracing_dl = 0;
|
||||
static bool wakeup_rt;
|
||||
static bool wakeup_dl;
|
||||
static bool tracing_dl;
|
||||
|
||||
static arch_spinlock_t wakeup_lock =
|
||||
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
|
||||
|
@ -498,7 +498,7 @@ static void __wakeup_reset(struct trace_array *tr)
|
|||
{
|
||||
wakeup_cpu = -1;
|
||||
wakeup_prio = -1;
|
||||
tracing_dl = 0;
|
||||
tracing_dl = false;
|
||||
|
||||
if (wakeup_task)
|
||||
put_task_struct(wakeup_task);
|
||||
|
@ -572,9 +572,9 @@ probe_wakeup(void *ignore, struct task_struct *p)
|
|||
* another task until the first one wakes up.
|
||||
*/
|
||||
if (dl_task(p))
|
||||
tracing_dl = 1;
|
||||
tracing_dl = true;
|
||||
else
|
||||
tracing_dl = 0;
|
||||
tracing_dl = false;
|
||||
|
||||
wakeup_task = get_task_struct(p);
|
||||
|
||||
|
@ -685,8 +685,8 @@ static int wakeup_tracer_init(struct trace_array *tr)
|
|||
if (wakeup_busy)
|
||||
return -EBUSY;
|
||||
|
||||
wakeup_dl = 0;
|
||||
wakeup_rt = 0;
|
||||
wakeup_dl = false;
|
||||
wakeup_rt = false;
|
||||
return __wakeup_tracer_init(tr);
|
||||
}
|
||||
|
||||
|
@ -695,8 +695,8 @@ static int wakeup_rt_tracer_init(struct trace_array *tr)
|
|||
if (wakeup_busy)
|
||||
return -EBUSY;
|
||||
|
||||
wakeup_dl = 0;
|
||||
wakeup_rt = 1;
|
||||
wakeup_dl = false;
|
||||
wakeup_rt = true;
|
||||
return __wakeup_tracer_init(tr);
|
||||
}
|
||||
|
||||
|
@ -705,8 +705,8 @@ static int wakeup_dl_tracer_init(struct trace_array *tr)
|
|||
if (wakeup_busy)
|
||||
return -EBUSY;
|
||||
|
||||
wakeup_dl = 1;
|
||||
wakeup_rt = 0;
|
||||
wakeup_dl = true;
|
||||
wakeup_rt = false;
|
||||
return __wakeup_tracer_init(tr);
|
||||
}
|
||||
|
||||
|
|
|
@ -273,7 +273,8 @@ static void tracepoint_update_call(struct tracepoint *tp, struct tracepoint_func
|
|||
* Add the probe function to a tracepoint.
|
||||
*/
|
||||
static int tracepoint_add_func(struct tracepoint *tp,
|
||||
struct tracepoint_func *func, int prio)
|
||||
struct tracepoint_func *func, int prio,
|
||||
bool warn)
|
||||
{
|
||||
struct tracepoint_func *old, *tp_funcs;
|
||||
int ret;
|
||||
|
@ -288,7 +289,7 @@ static int tracepoint_add_func(struct tracepoint *tp,
|
|||
lockdep_is_held(&tracepoints_mutex));
|
||||
old = func_add(&tp_funcs, func, prio);
|
||||
if (IS_ERR(old)) {
|
||||
WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
|
||||
WARN_ON_ONCE(warn && PTR_ERR(old) != -ENOMEM);
|
||||
return PTR_ERR(old);
|
||||
}
|
||||
|
||||
|
@ -343,6 +344,32 @@ static int tracepoint_remove_func(struct tracepoint *tp,
|
|||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* tracepoint_probe_register_prio_may_exist - Connect a probe to a tracepoint with priority
|
||||
* @tp: tracepoint
|
||||
* @probe: probe handler
|
||||
* @data: tracepoint data
|
||||
* @prio: priority of this function over other registered functions
|
||||
*
|
||||
* Same as tracepoint_probe_register_prio() except that it will not warn
|
||||
* if the tracepoint is already registered.
|
||||
*/
|
||||
int tracepoint_probe_register_prio_may_exist(struct tracepoint *tp, void *probe,
|
||||
void *data, int prio)
|
||||
{
|
||||
struct tracepoint_func tp_func;
|
||||
int ret;
|
||||
|
||||
mutex_lock(&tracepoints_mutex);
|
||||
tp_func.func = probe;
|
||||
tp_func.data = data;
|
||||
tp_func.prio = prio;
|
||||
ret = tracepoint_add_func(tp, &tp_func, prio, false);
|
||||
mutex_unlock(&tracepoints_mutex);
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(tracepoint_probe_register_prio_may_exist);
|
||||
|
||||
/**
|
||||
* tracepoint_probe_register_prio - Connect a probe to a tracepoint with priority
|
||||
* @tp: tracepoint
|
||||
|
@ -366,7 +393,7 @@ int tracepoint_probe_register_prio(struct tracepoint *tp, void *probe,
|
|||
tp_func.func = probe;
|
||||
tp_func.data = data;
|
||||
tp_func.prio = prio;
|
||||
ret = tracepoint_add_func(tp, &tp_func, prio);
|
||||
ret = tracepoint_add_func(tp, &tp_func, prio, true);
|
||||
mutex_unlock(&tracepoints_mutex);
|
||||
return ret;
|
||||
}
|
||||
|
|
|
@@ -156,7 +156,7 @@ xbc_node_find_child(struct xbc_node *parent, const char *key)
 	struct xbc_node *node;
 
 	if (parent)
-		node = xbc_node_get_child(parent);
+		node = xbc_node_get_subkey(parent);
 	else
 		node = xbc_root_node();
 
@@ -164,7 +164,7 @@ xbc_node_find_child(struct xbc_node *parent, const char *key)
 		if (!xbc_node_match_prefix(node, &key))
 			node = xbc_node_get_next(node);
 		else if (*key != '\0')
-			node = xbc_node_get_child(node);
+			node = xbc_node_get_subkey(node);
 		else
 			break;
 	}
@@ -274,6 +274,8 @@ int __init xbc_node_compose_key_after(struct xbc_node *root,
 struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
 						 struct xbc_node *node)
 {
+	struct xbc_node *next;
+
 	if (unlikely(!xbc_data))
 		return NULL;
 
@@ -282,6 +284,13 @@ struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
 		if (!node)
 			node = xbc_nodes;
 	} else {
+		/* Leaf node may have a subkey */
+		next = xbc_node_get_subkey(node);
+		if (next) {
+			node = next;
+			goto found;
+		}
+
 		if (node == root)	/* @root was a leaf, no child node. */
 			return NULL;
 
@@ -296,6 +305,7 @@ struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
 		node = xbc_node_get_next(node);
 	}
 
+found:
 	while (node && !xbc_node_is_leaf(node))
 		node = xbc_node_get_child(node);
 
@@ -367,18 +377,28 @@ static inline __init struct xbc_node *xbc_last_sibling(struct xbc_node *node)
 	return node;
 }
 
-static struct xbc_node * __init xbc_add_sibling(char *data, u32 flag)
+static inline __init struct xbc_node *xbc_last_child(struct xbc_node *node)
 {
+	while (node->child)
+		node = xbc_node_get_child(node);
+
+	return node;
+}
+
+static struct xbc_node * __init __xbc_add_sibling(char *data, u32 flag, bool head)
+{
 	struct xbc_node *sib, *node = xbc_add_node(data, flag);
 
 	if (node) {
 		if (!last_parent) {
+			/* Ignore @head in this case */
 			node->parent = XBC_NODE_MAX;
 			sib = xbc_last_sibling(xbc_nodes);
 			sib->next = xbc_node_index(node);
 		} else {
 			node->parent = xbc_node_index(last_parent);
-			if (!last_parent->child) {
+			if (!last_parent->child || head) {
+				node->next = last_parent->child;
 				last_parent->child = xbc_node_index(node);
 			} else {
 				sib = xbc_node_get_child(last_parent);
@@ -392,6 +412,16 @@ static struct xbc_node * __init xbc_add_sibling(char *data, u32 flag)
 	return node;
 }
 
+static inline struct xbc_node * __init xbc_add_sibling(char *data, u32 flag)
+{
+	return __xbc_add_sibling(data, flag, false);
+}
+
+static inline struct xbc_node * __init xbc_add_head_sibling(char *data, u32 flag)
+{
+	return __xbc_add_sibling(data, flag, true);
+}
+
 static inline __init struct xbc_node *xbc_add_child(char *data, u32 flag)
 {
 	struct xbc_node *node = xbc_add_sibling(data, flag);
@@ -517,17 +547,20 @@ static int __init xbc_parse_array(char **__v)
 	char *next;
 	int c = 0;
 
+	if (last_parent->child)
+		last_parent = xbc_node_get_child(last_parent);
+
 	do {
 		c = __xbc_parse_value(__v, &next);
 		if (c < 0)
 			return c;
 
-		node = xbc_add_sibling(*__v, XBC_VALUE);
+		node = xbc_add_child(*__v, XBC_VALUE);
 		if (!node)
 			return -ENOMEM;
 		*__v = next;
 	} while (c == ',');
-	node->next = 0;
+	node->child = 0;
 
 	return c;
 }
@@ -557,8 +590,9 @@ static int __init __xbc_add_key(char *k)
 		node = find_match_node(xbc_nodes, k);
 	else {
 		child = xbc_node_get_child(last_parent);
+		/* Since the value node is the first child, skip it. */
 		if (child && xbc_node_is_value(child))
-			return xbc_parse_error("Subkey is mixed with value", k);
+			child = xbc_node_get_next(child);
 		node = find_match_node(child, k);
 	}
 
@@ -601,23 +635,29 @@ static int __init xbc_parse_kv(char **k, char *v, int op)
 	if (ret)
 		return ret;
 
-	child = xbc_node_get_child(last_parent);
-	if (child) {
-		if (xbc_node_is_key(child))
-			return xbc_parse_error("Value is mixed with subkey", v);
-		else if (op == '=')
-			return xbc_parse_error("Value is redefined", v);
-	}
-
 	c = __xbc_parse_value(&v, &next);
 	if (c < 0)
 		return c;
 
-	if (op == ':' && child) {
+	child = xbc_node_get_child(last_parent);
+	if (child && xbc_node_is_value(child)) {
+		if (op == '=')
+			return xbc_parse_error("Value is redefined", v);
+		if (op == ':') {
+			unsigned short nidx = child->next;
+
 			xbc_init_node(child, v, XBC_VALUE);
-	} else if (!xbc_add_sibling(v, XBC_VALUE))
+			child->next = nidx;	/* keep subkeys */
+			goto array;
+		}
+		/* op must be '+' */
+		last_parent = xbc_last_child(child);
+	}
+	/* The value node should always be the first child */
+	if (!xbc_add_head_sibling(v, XBC_VALUE))
 		return -ENOMEM;
 
+array:
 	if (c == ',') {	/* Array */
 		c = xbc_parse_array(&next);
 		if (c < 0)

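The parser changes above rely on one invariant: when a key has both a value and subkeys, the value node is always the first child and the subkeys follow it. The sketch below illustrates head insertion versus tail append on a plain pointer-linked list. It is a simplified model rather than the bootconfig implementation, which links nodes by index; `struct toy_node` and `toy_add_child()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_node {
	const char *data;
	struct toy_node *next;	/* next sibling */
	struct toy_node *child;	/* first child */
};

/*
 * Link @node under @parent: at the head of the child list if @head is
 * set (or the list is empty), otherwise after the last existing child.
 */
static void toy_add_child(struct toy_node *parent, struct toy_node *node,
			  bool head)
{
	struct toy_node *sib;

	assert(parent != NULL && node != NULL);

	if (!parent->child || head) {
		node->next = parent->child;
		parent->child = node;
		return;
	}
	/* Walk to the last sibling and append there. */
	for (sib = parent->child; sib->next; sib = sib->next)
		;
	node->next = NULL;
	sib->next = node;
}
```

With this layout a reader can find the value by looking only at the first child, and can reach the subkeys by skipping that first child when it is a value.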
@@ -229,8 +229,10 @@ int seq_buf_putmem_hex(struct seq_buf *s, const void *mem,
 
 	WARN_ON(s->size == 0);
 
+	BUILD_BUG_ON(MAX_MEMHEX_BYTES * 2 >= HEX_CHARS);
+
 	while (len) {
-		start_len = min(len, HEX_CHARS - 1);
+		start_len = min(len, MAX_MEMHEX_BYTES);
 #ifdef __BIG_ENDIAN
 		for (i = 0, j = 0; i < start_len; i++) {
 #else
@@ -243,12 +245,14 @@ int seq_buf_putmem_hex(struct seq_buf *s, const void *mem,
 			break;
 
 		/* j increments twice per loop */
-		len -= j / 2;
 		hex[j++] = ' ';
 
 		seq_buf_putmem(s, hex, j);
 		if (seq_buf_has_overflowed(s))
 			return -1;
+
+		len -= start_len;
+		data += start_len;
 	}
 	return 0;
 }

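The two seq_buf hunks above cap each chunk at a byte count instead of a hex-character count, and advance both the remaining length and the data pointer by the bytes actually consumed. A minimal user-space sketch of that chunking loop follows. It is an illustration under assumptions, not the kernel routine: `put_mem_hex()` and `CHUNK_BYTES` are hypothetical names, the output goes to a caller-supplied string, and bytes are emitted in memory order without the endianness handling of the original.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_BYTES 8	/* hypothetical stand-in for MAX_MEMHEX_BYTES */

/*
 * Hex-encode @len bytes of @mem into @out (capacity @out_size), emitting
 * at most CHUNK_BYTES per group and one space after each group.
 * Returns 0 on success, -1 if @out is too small.
 */
static int put_mem_hex(char *out, size_t out_size, const void *mem, size_t len)
{
	static const char digits[] = "0123456789abcdef";
	const unsigned char *data = mem;
	size_t pos = 0;

	assert(out != NULL && out_size > 0);

	while (len) {
		size_t chunk = len < CHUNK_BYTES ? len : CHUNK_BYTES;
		size_t i;

		/* Two digits per byte, one space, and room for the final NUL. */
		if (pos + chunk * 2 + 1 >= out_size)
			return -1;

		for (i = 0; i < chunk; i++) {
			out[pos++] = digits[data[i] >> 4];
			out[pos++] = digits[data[i] & 0x0f];
		}
		out[pos++] = ' ';

		/* Advance by the bytes consumed, not by the digits written. */
		len -= chunk;
		data += chunk;
	}
	out[pos] = '\0';
	return 0;
}
```

Without the `data += chunk` step every group would re-encode the first bytes of the buffer, which is the kind of error the second hunk removes.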
@@ -27,7 +27,7 @@ static int xbc_show_value(struct xbc_node *node, bool semicolon)
 			q = '\'';
 		else
 			q = '"';
-		printf("%c%s%c%s", q, val, q, node->next ? ", " : eol);
+		printf("%c%s%c%s", q, val, q, xbc_node_is_array(node) ? ", " : eol);
 		i++;
 	}
 	return i;
@@ -35,30 +35,55 @@ static int xbc_show_value(struct xbc_node *node, bool semicolon)
 
 static void xbc_show_compact_tree(void)
 {
-	struct xbc_node *node, *cnode;
+	struct xbc_node *node, *cnode = NULL, *vnode;
 	int depth = 0, i;
 
 	node = xbc_root_node();
 	while (node && xbc_node_is_key(node)) {
 		for (i = 0; i < depth; i++)
 			printf("\t");
+		if (!cnode)
 			cnode = xbc_node_get_child(node);
 		while (cnode && xbc_node_is_key(cnode) && !cnode->next) {
+			vnode = xbc_node_get_child(cnode);
+			/*
+			 * If @cnode has value and subkeys, this
+			 * should show it as below.
+			 *
+			 * key(@node) {
+			 *      key(@cnode) = value;
+			 *      key(@cnode) {
+			 *          subkeys;
+			 *      }
+			 * }
+			 */
+			if (vnode && xbc_node_is_value(vnode) && vnode->next)
+				break;
 			printf("%s.", xbc_node_get_data(node));
 			node = cnode;
-			cnode = xbc_node_get_child(node);
+			cnode = vnode;
 		}
 		if (cnode && xbc_node_is_key(cnode)) {
 			printf("%s {\n", xbc_node_get_data(node));
 			depth++;
 			node = cnode;
+			cnode = NULL;
 			continue;
 		} else if (cnode && xbc_node_is_value(cnode)) {
 			printf("%s = ", xbc_node_get_data(node));
 			xbc_show_value(cnode, true);
+			/*
+			 * If @node has value and subkeys, continue
+			 * looping on subkeys with same node.
+			 */
+			if (cnode->next) {
+				cnode = xbc_node_get_next(cnode);
+				continue;
+			}
 		} else {
 			printf("%s;\n", xbc_node_get_data(node));
 		}
+		cnode = NULL;
 
 		if (node->next) {
 			node = xbc_node_get_next(node);
@@ -70,11 +95,13 @@ static void xbc_show_compact_tree(void)
 				return;
 			if (!xbc_node_get_child(node)->next)
 				continue;
+			if (depth) {
 				depth--;
 				for (i = 0; i < depth; i++)
 					printf("\t");
 				printf("}\n");
+			}
 		}
 		node = xbc_node_get_next(node);
 	}
 }
@@ -84,12 +111,12 @@ static void xbc_show_list(void)
 	char key[XBC_KEYLEN_MAX];
 	struct xbc_node *leaf;
 	const char *val;
-	int ret = 0;
 
 	xbc_for_each_key_value(leaf, val) {
-		ret = xbc_node_compose_key(leaf, key, XBC_KEYLEN_MAX);
-		if (ret < 0)
+		if (xbc_node_compose_key(leaf, key, XBC_KEYLEN_MAX) < 0) {
+			fprintf(stderr, "Failed to compose key %d\n", ret);
 			break;
+		}
 		printf("%s = ", key);
 		if (!val || val[0] == '\0') {
 			printf("\"\"\n");
@@ -99,17 +126,6 @@ static void xbc_show_list(void)
 	}
 }
 
-/* Simple real checksum */
-static int checksum(unsigned char *buf, int len)
-{
-	int i, sum = 0;
-
-	for (i = 0; i < len; i++)
-		sum += buf[i];
-
-	return sum;
-}
-
 #define PAGE_SIZE	4096
 
 static int load_xbc_fd(int fd, char **buf, int size)
@@ -205,7 +221,7 @@ static int load_xbc_from_initrd(int fd, char **buf)
 		return ret;
 
 	/* Wrong Checksum */
-	rcsum = checksum((unsigned char *)*buf, size);
+	rcsum = xbc_calc_checksum(*buf, size);
 	if (csum != rcsum) {
 		pr_err("checksum error: %d != %d\n", csum, rcsum);
 		return -EINVAL;
@@ -354,7 +370,7 @@ static int apply_xbc(const char *path, const char *xbc_path)
 		return ret;
 	}
 	size = strlen(buf) + 1;
-	csum = checksum((unsigned char *)buf, size);
+	csum = xbc_calc_checksum(buf, size);
 
 	/* Backup the bootconfig data */
 	data = calloc(size + BOOTCONFIG_ALIGN +

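The tool-local checksum() removed above is a plain byte sum, and its two call sites now use xbc_calc_checksum() instead. For reference, a self-contained sketch of a byte-sum checksum of that kind is shown below. The name `byte_sum_checksum()` is hypothetical, and the unsigned 32-bit accumulator is an assumption for the example (the removed helper used `int`). This diff does not show the body of xbc_calc_checksum(), so the sketch should not be read as its definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum every byte of @buf as an unsigned value; wraps modulo 2^32. */
static uint32_t byte_sum_checksum(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t sum = 0;

	assert(buf != NULL || len == 0);

	while (len--)
		sum += *p++;

	return sum;
}
```

A sum like this catches truncation and most accidental corruption of the appended bootconfig data, but it is order-insensitive, so it is a sanity check rather than an integrity guarantee.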
@@ -1,3 +0,0 @@
-key.subkey = value
-# We can not override pre-defined subkeys with value
-key := value

@@ -1,3 +0,0 @@
-key = value
-# We can not override pre-defined value with subkey
-key.subkey := value

@@ -0,0 +1,4 @@
+key = foo
+keyx.subkey = value
+key += bar
+

@@ -0,0 +1,6 @@
+# mixed key and subkeys with braces
+key = value
+key {
+	subkey1
+	subkey2 = foo
+}

@@ -0,0 +1,4 @@
+key.foo = bar
+key = value
+# mixed key value can be overridden
+key := value2

@@ -0,0 +1,49 @@
+ftrace.event {
+	task.task_newtask {
+		filter = "pid < 128"
+		enable
+	}
+	kprobes.vfs_read {
+		probes = "vfs_read $arg1 $arg2"
+		filter = "common_pid < 200"
+		enable
+	}
+	synthetic.initcall_latency {
+		fields = "unsigned long func", "u64 lat"
+		actions = "hist:keys=func.sym,lat:vals=lat:sort=lat"
+	}
+	initcall.initcall_start {
+		actions = "hist:keys=func:ts0=common_timestamp.usecs"
+	}
+	initcall.initcall_finish {
+		actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)"
+	}
+}
+
+ftrace.instance {
+	foo {
+		tracer = "function"
+		ftrace.filters = "user_*"
+		cpumask = 1
+		options = nosym-addr
+		buffer_size = 512KB
+		trace_clock = mono
+		event.signal.signal_deliver.actions=snapshot
+	}
+	bar {
+		tracer = "function"
+		ftrace.filters = "kernel_*"
+		cpumask = 2
+		trace_clock = x86-tsc
+	}
+}
+
+ftrace.alloc_snapshot
+
+kernel {
+	trace_options = sym-addr
+	trace_event = "initcall:*"
+	trace_buf_size = 1M
+	ftrace = function
+	ftrace_filter = "vfs*"
+}

@@ -0,0 +1 @@
+CONFIG_CMDLINE="bootconfig"

@@ -0,0 +1,15 @@
+ftrace {
+	tracing_on = 0  # off by default
+	tracer = function_graph
+	event.kprobes {
+		start_event {
+			probes = "pci_proc_init"
+			actions = "traceon"
+		}
+		end_event {
+			probes = "pci_proc_init%return"
+			actions = "traceoff"
+		}
+	}
+}
+

@@ -0,0 +1,33 @@
+ftrace {
+	tracer = function_graph;
+	options = event-fork, sym-addr, stacktrace;
+	buffer_size = 1M;
+	alloc_snapshot;
+	trace_clock = global;
+	events = "task:task_newtask", "initcall:*";
+	event.sched.sched_process_exec {
+		filter = "pid < 128";
+	}
+	instance.bar {
+		event.kprobes {
+			myevent {
+				probes = "vfs_read $arg2 $arg3";
+			}
+			myevent2 {
+				probes = "vfs_write $arg2 +0($arg2):ustring $arg3";
+			}
+			myevent3 {
+				probes = "initrd_load";
+			}
+			enable
+		}
+	}
+	instance.foo {
+		tracer = function;
+		tracing_on = false;
+	};
+}
+kernel {
+	ftrace_dump_on_oops = "orig_cpu"
+	traceoff_on_warning
+}

@@ -0,0 +1,84 @@
+#!/bin/sh
+
+cd /sys/kernel/tracing
+
+compare_file() {
+	file="$1"
+	val="$2"
+	content=`cat $file`
+	if [ "$content" != "$val" ]; then
+		echo "FAILED: $file has '$content', expected '$val'"
+		exit 1
+	fi
+}
+
+compare_file_partial() {
+	file="$1"
+	val="$2"
+	content=`cat $file | sed -ne "/^$val/p"`
+	if [ -z "$content" ]; then
+		echo "FAILED: $file does not contain '$val'"
+		cat $file
+		exit 1
+	fi
+}
+
+file_contains() {
+	file=$1
+	val="$2"
+
+	if ! grep -q "$val" $file ; then
+		echo "FAILED: $file does not contain $val"
+		cat $file
+		exit 1
+	fi
+}
+
+compare_mask() {
+	file=$1
+	val="$2"
+
+	content=`cat $file | sed -ne "/^[0 ]*$val/p"`
+	if [ -z "$content" ]; then
+		echo "FAILED: $file does not have mask '$val'"
+		cat $file
+		exit 1
+	fi
+}
+
+compare_file "events/task/task_newtask/filter" "pid < 128"
+compare_file "events/task/task_newtask/enable" "1"
+
+compare_file "events/kprobes/vfs_read/filter" "common_pid < 200"
+compare_file "events/kprobes/vfs_read/enable" "1"
+
+compare_file_partial "events/synthetic/initcall_latency/trigger" "hist:keys=func.sym,lat:vals=hitcount,lat:sort=lat"
+compare_file_partial "events/synthetic/initcall_latency/enable" "0"
+
+compare_file_partial "events/initcall/initcall_start/trigger" "hist:keys=func:vals=hitcount:ts0=common_timestamp.usecs"
+compare_file_partial "events/initcall/initcall_start/enable" "1"
+
+compare_file_partial "events/initcall/initcall_finish/trigger" 'hist:keys=func:vals=hitcount:lat=common_timestamp.usecs-\$ts0:sort=hitcount:size=2048:clock=global:onmatch(initcall.initcall_start).initcall_latency(func,\$lat)'
+compare_file_partial "events/initcall/initcall_finish/enable" "1"
+
+compare_file "instances/foo/current_tracer" "function"
+file_contains "instances/foo/set_ftrace_filter" "^user"
+compare_file "instances/foo/buffer_size_kb" "512"
+compare_mask "instances/foo/tracing_cpumask" "1"
+compare_file "instances/foo/options/sym-addr" "0"
+file_contains "instances/foo/trace_clock" '\[mono\]'
+compare_file_partial "instances/foo/events/signal/signal_deliver/trigger" "snapshot"
+
+compare_file "instances/bar/current_tracer" "function"
+file_contains "instances/bar/set_ftrace_filter" "^kernel"
+compare_mask "instances/bar/tracing_cpumask" "2"
+file_contains "instances/bar/trace_clock" '\[x86-tsc\]'
+
+file_contains "snapshot" "Snapshot is allocated"
+compare_file "options/sym-addr" "1"
+compare_file "events/initcall/enable" "1"
+compare_file "buffer_size_kb" "1027"
+compare_file "current_tracer" "function"
+file_contains "set_ftrace_filter" '^vfs'
+
+exit 0

@@ -0,0 +1,61 @@
+#!/bin/sh
+
+cd /sys/kernel/tracing
+
+compare_file() {
+	file="$1"
+	val="$2"
+	content=`cat $file`
+	if [ "$content" != "$val" ]; then
+		echo "FAILED: $file has '$content', expected '$val'"
+		exit 1
+	fi
+}
+
+compare_file_partial() {
+	file="$1"
+	val="$2"
+	content=`cat $file | sed -ne "/^$val/p"`
+	if [ -z "$content" ]; then
+		echo "FAILED: $file does not contain '$val'"
+		cat $file
+		exit 1
+	fi
+}
+
+file_contains() {
+	file=$1
+	val="$2"
+
+	if ! grep -q "$val" $file ; then
+		echo "FAILED: $file does not contain $val"
+		cat $file
+		exit 1
+	fi
+}
+
+compare_mask() {
+	file=$1
+	val="$2"
+
+	content=`cat $file | sed -ne "/^[0 ]*$val/p"`
+	if [ -z "$content" ]; then
+		echo "FAILED: $file does not have mask '$val'"
+		cat $file
+		exit 1
+	fi
+}
+
+
+compare_file "tracing_on" "0"
+compare_file "current_tracer" "function_graph"
+
+compare_file_partial "events/kprobes/start_event/enable" "1"
+compare_file_partial "events/kprobes/start_event/trigger" "traceon"
+file_contains "kprobe_events" 'start_event.*pci_proc_init'
+
+compare_file_partial "events/kprobes/end_event/enable" "1"
+compare_file_partial "events/kprobes/end_event/trigger" "traceoff"
+file_contains "kprobe_events" '^r.*end_event.*pci_proc_init'
+
+exit 0

@@ -0,0 +1,72 @@
+#!/bin/sh
+
+cd /sys/kernel/tracing
+
+compare_file() {
+	file="$1"
+	val="$2"
+	content=`cat $file`
+	if [ "$content" != "$val" ]; then
+		echo "FAILED: $file has '$content', expected '$val'"
+		exit 1
+	fi
+}
+
+compare_file_partial() {
+	file="$1"
+	val="$2"
+	content=`cat $file | sed -ne "/^$val/p"`
+	if [ -z "$content" ]; then
+		echo "FAILED: $file does not contain '$val'"
+		cat $file
+		exit 1
+	fi
+}
+
+file_contains() {
+	file=$1
+	val="$2"
+
+	if ! grep -q "$val" $file ; then
+		echo "FAILED: $file does not contain $val"
+		cat $file
+		exit 1
+	fi
+}
+
+compare_mask() {
+	file=$1
+	val="$2"
+
+	content=`cat $file | sed -ne "/^[0 ]*$val/p"`
+	if [ -z "$content" ]; then
+		echo "FAILED: $file does not have mask '$val'"
+		cat $file
+		exit 1
+	fi
+}
+
+compare_file "current_tracer" "function_graph"
+compare_file "options/event-fork" "1"
+compare_file "options/sym-addr" "1"
+compare_file "options/stacktrace" "1"
+compare_file "buffer_size_kb" "1024"
+file_contains "snapshot" "Snapshot is allocated"
+file_contains "trace_clock" '\[global\]'
+
+compare_file "events/initcall/enable" "1"
+compare_file "events/task/task_newtask/enable" "1"
+compare_file "events/sched/sched_process_exec/filter" "pid < 128"
+compare_file "events/kprobes/enable" "1"
+
+compare_file "instances/bar/events/kprobes/myevent/enable" "1"
+compare_file "instances/bar/events/kprobes/myevent2/enable" "1"
+compare_file "instances/bar/events/kprobes/myevent3/enable" "1"
+
+compare_file "instances/foo/current_tracer" "function"
+compare_file "instances/foo/tracing_on" "0"
+
+compare_file "/proc/sys/kernel/ftrace_dump_on_oops" "2"
+compare_file "/proc/sys/kernel/traceoff_on_warning" "1"
+
+exit 0

@@ -0,0 +1,69 @@
+# bootconfig.conf
+#
+# Tests to test some bootconfig scripts
+
+# List where on the target machine the initrd is used
+INITRD := /boot/initramfs-test.img
+
+# Install bootconfig on the target machine and define the path here.
+BOOTCONFIG := /usr/bin/bootconfig
+
+# Currently we just build the .config in the BUILD_DIR
+BUILD_TYPE := oldconfig
+
+# Helper macro to run bootconfig on the target
+# SSH is defined in include/defaults.conf
+ADD_BOOTCONFIG := ${SSH} "${BOOTCONFIG} -d ${INITRD} && ${BOOTCONFIG} -a /tmp/${BOOTCONFIG_FILE} ${INITRD}"
+
+# This copies a bootconfig script to the target and then will
+# add it to the initrd. SSH_USER is defined in include/defaults.conf
+# and MACHINE is defined in the example configs.
+BOOTCONFIG_TEST_PREP = scp ${BOOTCONFIG_PATH}${BOOTCONFIG_FILE} ${SSH_USER}@${MACHINE}:/tmp && ${ADD_BOOTCONFIG}
+
+# When a test is complete, remove the bootconfig from the initrd.
+CLEAR_BOOTCONFIG := ${SSH} "${BOOTCONFIG} -d ${INITRD}"
+
+# Run a verifier on the target after it had booted, to make sure that the
+# bootconfig script did what it was expected to do
+DO_TEST = scp ${BOOTCONFIG_PATH}${BOOTCONFIG_VERIFY} ${SSH_USER}@${MACHINE}:/tmp && ${SSH} /tmp/${BOOTCONFIG_VERIFY}
+
+# Comment this out to not run the boot configs
+RUN_BOOTCONFIG := 1
+
+TEST_START IF DEFINED RUN_BOOTCONFIG
+TEST_TYPE = test
+TEST_NAME = bootconfig boottrace
+# Just testing the bootconfig on initrd, no need to build the kernel
+BUILD_TYPE = nobuild
+BOOTCONFIG_FILE = boottrace.bconf
+BOOTCONFIG_VERIFY = verify-boottrace.sh
+ADD_CONFIG = ${ADD_CONFIG} ${BOOTCONFIG_PATH}/config-bootconfig
+PRE_TEST = ${BOOTCONFIG_TEST_PREP}
+PRE_TEST_DIE = 1
+TEST = ${DO_TEST}
+POST_TEST = ${CLEAR_BOOTCONFIG}
+
+TEST_START IF DEFINED RUN_BOOTCONFIG
+TEST_TYPE = test
+TEST_NAME = bootconfig function graph
+BUILD_TYPE = nobuild
+BOOTCONFIG_FILE = functiongraph.bconf
+BOOTCONFIG_VERIFY = verify-functiongraph.sh
+ADD_CONFIG = ${ADD_CONFIG} ${BOOTCONFIG_PATH}/config-bootconfig
+PRE_TEST = ${BOOTCONFIG_TEST_PREP}
+PRE_TEST_DIE = 1
+TEST = ${DO_TEST}
+POST_TEST = ${CLEAR_BOOTCONFIG}
+
+TEST_START IF DEFINED RUN_BOOTCONFIG
+TEST_TYPE = test
+TEST_NAME = bootconfig tracing
+BUILD_TYPE = nobuild
+BOOTCONFIG_FILE = tracing.bconf
+BOOTCONFIG_VERIFY = verify-tracing.sh
+ADD_CONFIG = ${ADD_CONFIG} ${BOOTCONFIG_PATH}/config-bootconfig
+PRE_TEST = ${BOOTCONFIG_TEST_PREP}
+PRE_TEST_DIE = 1
+TEST = ${DO_TEST}
+POST_TEST = ${CLEAR_BOOTCONFIG}
+

@@ -90,3 +90,4 @@ INCLUDE include/patchcheck.conf
 INCLUDE include/tests.conf
 INCLUDE include/bisect.conf
 INCLUDE include/min-config.conf
+INCLUDE include/bootconfig.conf