ftrace: update txt document
Impact: Documentation update only A lot of changes have gone into ftrace. This patch updates the ftrace.txt document. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
This commit is contained in:
Родитель
457d2ee225
Коммит
9b803c0fc3
|
@ -50,26 +50,26 @@ of ftrace. Here is a list of some of the key files:
|
||||||
|
|
||||||
Note: all time values are in microseconds.
|
Note: all time values are in microseconds.
|
||||||
|
|
||||||
current_tracer : This is used to set or display the current tracer
|
current_tracer: This is used to set or display the current tracer
|
||||||
that is configured.
|
that is configured.
|
||||||
|
|
||||||
available_tracers : This holds the different types of tracers that
|
available_tracers: This holds the different types of tracers that
|
||||||
have been compiled into the kernel. The tracers
|
have been compiled into the kernel. The tracers
|
||||||
listed here can be configured by echoing their name
|
listed here can be configured by echoing their name
|
||||||
into current_tracer.
|
into current_tracer.
|
||||||
|
|
||||||
tracing_enabled : This sets or displays whether the current_tracer
|
tracing_enabled: This sets or displays whether the current_tracer
|
||||||
is activated and tracing or not. Echo 0 into this
|
is activated and tracing or not. Echo 0 into this
|
||||||
file to disable the tracer or 1 to enable it.
|
file to disable the tracer or 1 to enable it.
|
||||||
|
|
||||||
trace : This file holds the output of the trace in a human readable
|
trace: This file holds the output of the trace in a human readable
|
||||||
format (described below).
|
format (described below).
|
||||||
|
|
||||||
latency_trace : This file shows the same trace but the information
|
latency_trace: This file shows the same trace but the information
|
||||||
is organized more to display possible latencies
|
is organized more to display possible latencies
|
||||||
in the system (described below).
|
in the system (described below).
|
||||||
|
|
||||||
trace_pipe : The output is the same as the "trace" file but this
|
trace_pipe: The output is the same as the "trace" file but this
|
||||||
file is meant to be streamed with live tracing.
|
file is meant to be streamed with live tracing.
|
||||||
Reads from this file will block until new data
|
Reads from this file will block until new data
|
||||||
is retrieved. Unlike the "trace" and "latency_trace"
|
is retrieved. Unlike the "trace" and "latency_trace"
|
||||||
|
@ -82,11 +82,11 @@ of ftrace. Here is a list of some of the key files:
|
||||||
tracer is not adding more data, they will display
|
tracer is not adding more data, they will display
|
||||||
the same information every time they are read.
|
the same information every time they are read.
|
||||||
|
|
||||||
iter_ctrl : This file lets the user control the amount of data
|
iter_ctrl: This file lets the user control the amount of data
|
||||||
that is displayed in one of the above output
|
that is displayed in one of the above output
|
||||||
files.
|
files.
|
||||||
|
|
||||||
trace_max_latency : Some of the tracers record the max latency.
|
trace_max_latency: Some of the tracers record the max latency.
|
||||||
For example, the time interrupts are disabled.
|
For example, the time interrupts are disabled.
|
||||||
This time is saved in this file. The max trace
|
This time is saved in this file. The max trace
|
||||||
will also be stored, and displayed by either
|
will also be stored, and displayed by either
|
||||||
|
@ -94,29 +94,26 @@ of ftrace. Here is a list of some of the key files:
|
||||||
only be recorded if the latency is greater than
|
only be recorded if the latency is greater than
|
||||||
the value in this file. (in microseconds)
|
the value in this file. (in microseconds)
|
||||||
|
|
||||||
trace_entries : This sets or displays the number of trace
|
trace_entries: This sets or displays the number of bytes each CPU
|
||||||
entries each CPU buffer can hold. The tracer buffers
|
buffer can hold. The tracer buffers are the same size
|
||||||
are the same size for each CPU. The displayed number
|
for each CPU. The displayed number is the size of the
|
||||||
is the size of the CPU buffer and not total size. The
|
CPU buffer and not total size of all buffers. The
|
||||||
trace buffers are allocated in pages (blocks of memory
|
trace buffers are allocated in pages (blocks of memory
|
||||||
that the kernel uses for allocation, usually 4 KB in size).
|
that the kernel uses for allocation, usually 4 KB in size).
|
||||||
Since each entry is smaller than a page, if the last
|
If the last page allocated has room for more bytes
|
||||||
allocated page has room for more entries than were
|
than requested, the rest of the page will be used,
|
||||||
requested, the rest of the page is used to allocate
|
making the actual allocation bigger than requested.
|
||||||
entries.
|
(Note, the size may not be a multiple of the page size due
|
||||||
|
to buffer managment overhead.)
|
||||||
|
|
||||||
This can only be updated when the current_tracer
|
This can only be updated when the current_tracer
|
||||||
is set to "none".
|
is set to "nop".
|
||||||
|
|
||||||
NOTE: It is planned on changing the allocated buffers
|
tracing_cpumask: This is a mask that lets the user only trace
|
||||||
from being the number of possible CPUS to
|
|
||||||
the number of online CPUS.
|
|
||||||
|
|
||||||
tracing_cpumask : This is a mask that lets the user only trace
|
|
||||||
on specified CPUS. The format is a hex string
|
on specified CPUS. The format is a hex string
|
||||||
representing the CPUS.
|
representing the CPUS.
|
||||||
|
|
||||||
set_ftrace_filter : When dynamic ftrace is configured in (see the
|
set_ftrace_filter: When dynamic ftrace is configured in (see the
|
||||||
section below "dynamic ftrace"), the code is dynamically
|
section below "dynamic ftrace"), the code is dynamically
|
||||||
modified (code text rewrite) to disable calling of the
|
modified (code text rewrite) to disable calling of the
|
||||||
function profiler (mcount). This lets tracing be configured
|
function profiler (mcount). This lets tracing be configured
|
||||||
|
@ -130,14 +127,11 @@ of ftrace. Here is a list of some of the key files:
|
||||||
be traced. If a function exists in both set_ftrace_filter
|
be traced. If a function exists in both set_ftrace_filter
|
||||||
and set_ftrace_notrace, the function will _not_ be traced.
|
and set_ftrace_notrace, the function will _not_ be traced.
|
||||||
|
|
||||||
available_filter_functions : When a function is encountered the first
|
available_filter_functions: This lists the functions that ftrace
|
||||||
time by the dynamic tracer, it is recorded and
|
has processed and can trace. These are the function
|
||||||
later the call is converted into a nop. This file
|
names that you can pass to "set_ftrace_filter" or
|
||||||
lists the functions that have been recorded
|
"set_ftrace_notrace". (See the section "dynamic ftrace"
|
||||||
by the dynamic tracer and these functions can
|
below for more details.)
|
||||||
be used to set the ftrace filter by the above
|
|
||||||
"set_ftrace_filter" file. (See the section "dynamic ftrace"
|
|
||||||
below for more details).
|
|
||||||
|
|
||||||
|
|
||||||
The Tracers
|
The Tracers
|
||||||
|
@ -145,7 +139,7 @@ The Tracers
|
||||||
|
|
||||||
Here is the list of current tracers that may be configured.
|
Here is the list of current tracers that may be configured.
|
||||||
|
|
||||||
ftrace - function tracer that uses mcount to trace all functions.
|
function - function tracer that uses mcount to trace all functions.
|
||||||
|
|
||||||
sched_switch - traces the context switches between tasks.
|
sched_switch - traces the context switches between tasks.
|
||||||
|
|
||||||
|
@ -166,8 +160,8 @@ Here is the list of current tracers that may be configured.
|
||||||
the highest priority task to get scheduled after
|
the highest priority task to get scheduled after
|
||||||
it has been woken up.
|
it has been woken up.
|
||||||
|
|
||||||
none - This is not a tracer. To remove all tracers from tracing
|
nop - This is not a tracer. To remove all tracers from tracing
|
||||||
simply echo "none" into current_tracer.
|
simply echo "nop" into current_tracer.
|
||||||
|
|
||||||
|
|
||||||
Examples of using the tracer
|
Examples of using the tracer
|
||||||
|
@ -182,7 +176,7 @@ Output format:
|
||||||
Here is an example of the output format of the file "trace"
|
Here is an example of the output format of the file "trace"
|
||||||
|
|
||||||
--------
|
--------
|
||||||
# tracer: ftrace
|
# tracer: function
|
||||||
#
|
#
|
||||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||||
# | | | | |
|
# | | | | |
|
||||||
|
@ -192,7 +186,7 @@ Here is an example of the output format of the file "trace"
|
||||||
--------
|
--------
|
||||||
|
|
||||||
A header is printed with the tracer name that is represented by the trace.
|
A header is printed with the tracer name that is represented by the trace.
|
||||||
In this case the tracer is "ftrace". Then a header showing the format. Task
|
In this case the tracer is "function". Then a header showing the format. Task
|
||||||
name "bash", the task PID "4251", the CPU that it was running on
|
name "bash", the task PID "4251", the CPU that it was running on
|
||||||
"01", the timestamp in <secs>.<usecs> format, the function name that was
|
"01", the timestamp in <secs>.<usecs> format, the function name that was
|
||||||
traced "path_put" and the parent function that called this function
|
traced "path_put" and the parent function that called this function
|
||||||
|
@ -1003,22 +997,20 @@ is the stack for the hard interrupt. This hides the fact that NEED_RESCHED
|
||||||
has been set. We do not see the 'N' until we switch back to the task's
|
has been set. We do not see the 'N' until we switch back to the task's
|
||||||
assigned stack.
|
assigned stack.
|
||||||
|
|
||||||
ftrace
|
function
|
||||||
------
|
--------
|
||||||
|
|
||||||
ftrace is not only the name of the tracing infrastructure, but it
|
This tracer is the function tracer. Enabling the function tracer
|
||||||
is also a name of one of the tracers. The tracer is the function
|
can be done from the debug file system. Make sure the ftrace_enabled is
|
||||||
tracer. Enabling the function tracer can be done from the
|
set; otherwise this tracer is a nop.
|
||||||
debug file system. Make sure the ftrace_enabled is set otherwise
|
|
||||||
this tracer is a nop.
|
|
||||||
|
|
||||||
# sysctl kernel.ftrace_enabled=1
|
# sysctl kernel.ftrace_enabled=1
|
||||||
# echo ftrace > /debug/tracing/current_tracer
|
# echo function > /debug/tracing/current_tracer
|
||||||
# echo 1 > /debug/tracing/tracing_enabled
|
# echo 1 > /debug/tracing/tracing_enabled
|
||||||
# usleep 1
|
# usleep 1
|
||||||
# echo 0 > /debug/tracing/tracing_enabled
|
# echo 0 > /debug/tracing/tracing_enabled
|
||||||
# cat /debug/tracing/trace
|
# cat /debug/tracing/trace
|
||||||
# tracer: ftrace
|
# tracer: function
|
||||||
#
|
#
|
||||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||||
# | | | | |
|
# | | | | |
|
||||||
|
@ -1040,10 +1032,10 @@ this tracer is a nop.
|
||||||
[...]
|
[...]
|
||||||
|
|
||||||
|
|
||||||
Note: ftrace uses ring buffers to store the above entries. The newest data
|
Note: function tracer uses ring buffers to store the above entries.
|
||||||
may overwrite the oldest data. Sometimes using echo to stop the trace
|
The newest data may overwrite the oldest data. Sometimes using echo to
|
||||||
is not sufficient because the tracing could have overwritten the data
|
stop the trace is not sufficient because the tracing could have overwritten
|
||||||
that you wanted to record. For this reason, it is sometimes better to
|
the data that you wanted to record. For this reason, it is sometimes better to
|
||||||
disable tracing directly from a program. This allows you to stop the
|
disable tracing directly from a program. This allows you to stop the
|
||||||
tracing at the point that you hit the part that you are interested in.
|
tracing at the point that you hit the part that you are interested in.
|
||||||
To disable the tracing directly from a C program, something like following
|
To disable the tracing directly from a C program, something like following
|
||||||
|
@ -1077,18 +1069,31 @@ every kernel function, produced by the -pg switch in gcc), starts
|
||||||
of pointing to a simple return. (Enabling FTRACE will include the
|
of pointing to a simple return. (Enabling FTRACE will include the
|
||||||
-pg switch in the compiling of the kernel.)
|
-pg switch in the compiling of the kernel.)
|
||||||
|
|
||||||
When dynamic ftrace is initialized, it calls kstop_machine to make
|
At compile time every C file object is run through the
|
||||||
the machine act like a uniprocessor so that it can freely modify code
|
recordmcount.pl script (located in the scripts directory). This
|
||||||
without worrying about other processors executing that same code. At
|
script will process the C object using objdump to find all the
|
||||||
initialization, the mcount calls are changed to call a "record_ip"
|
locations in the .text section that call mcount. (Note, only
|
||||||
function. After this, the first time a kernel function is called,
|
the .text section is processed, since processing other sections
|
||||||
it has the calling address saved in a hash table.
|
like .init.text may cause races due to those sections being freed).
|
||||||
|
|
||||||
Later on the ftraced kernel thread is awoken and will again call
|
A new section called "__mcount_loc" is created that holds references
|
||||||
kstop_machine if new functions have been recorded. The ftraced thread
|
to all the mcount call sites in the .text section. This section is
|
||||||
will change all calls to mcount to "nop". Just calling mcount
|
compiled back into the original object. The final linker will add
|
||||||
and having mcount return has shown a 10% overhead. By converting
|
all these references into a single table.
|
||||||
it to a nop, there is no measurable overhead to the system.
|
|
||||||
|
On boot up, before SMP is initialized, the dynamic ftrace code
|
||||||
|
scans this table and updates all the locations into nops. It also
|
||||||
|
records the locations, which are added to the available_filter_functions
|
||||||
|
list. Modules are processed as they are loaded and before they are
|
||||||
|
executed. When a module is unloaded, it also removes its functions from
|
||||||
|
the ftrace function list. This is automatic in the module unload
|
||||||
|
code, and the module author does not need to worry about it.
|
||||||
|
|
||||||
|
When tracing is enabled, kstop_machine is called to prevent races
|
||||||
|
with the CPUS executing code being modified (which can cause the
|
||||||
|
CPU to do undesireable things), and the nops are patched back
|
||||||
|
to calls. But this time, they do not call mcount (which is just
|
||||||
|
a function stub). They now call into the ftrace infrastructure.
|
||||||
|
|
||||||
One special side-effect to the recording of the functions being
|
One special side-effect to the recording of the functions being
|
||||||
traced is that we can now selectively choose which functions we
|
traced is that we can now selectively choose which functions we
|
||||||
|
@ -1251,36 +1256,6 @@ Produces:
|
||||||
|
|
||||||
We can see that there's no more lock or preempt tracing.
|
We can see that there's no more lock or preempt tracing.
|
||||||
|
|
||||||
ftraced
|
|
||||||
-------
|
|
||||||
|
|
||||||
As mentioned above, when dynamic ftrace is configured in, a kernel
|
|
||||||
thread wakes up once a second and checks to see if there are mcount
|
|
||||||
calls that need to be converted into nops. If there are not any, then
|
|
||||||
it simply goes back to sleep. But if there are some, it will call
|
|
||||||
kstop_machine to convert the calls to nops.
|
|
||||||
|
|
||||||
There may be a case in which you do not want this added latency.
|
|
||||||
Perhaps you are doing some audio recording and this activity might
|
|
||||||
cause skips in the playback. There is an interface to disable
|
|
||||||
and enable the "ftraced" kernel thread.
|
|
||||||
|
|
||||||
# echo 0 > /debug/tracing/ftraced_enabled
|
|
||||||
|
|
||||||
This will disable the calling of kstop_machine to update the
|
|
||||||
mcount calls to nops. Remember that there is a large overhead
|
|
||||||
to calling mcount. Without this kernel thread, that overhead will
|
|
||||||
exist.
|
|
||||||
|
|
||||||
If there are recorded calls to mcount, any write to the ftraced_enabled
|
|
||||||
file will cause the kstop_machine to run. This means that a
|
|
||||||
user can manually perform the updates when they want to by simply
|
|
||||||
echoing a '0' into the ftraced_enabled file.
|
|
||||||
|
|
||||||
The updates are also done at the beginning of enabling a tracer
|
|
||||||
that uses ftrace function recording.
|
|
||||||
|
|
||||||
|
|
||||||
trace_pipe
|
trace_pipe
|
||||||
----------
|
----------
|
||||||
|
|
||||||
|
@ -1289,14 +1264,14 @@ on the tracing is different. Every read from trace_pipe is consumed.
|
||||||
This means that subsequent reads will be different. The trace
|
This means that subsequent reads will be different. The trace
|
||||||
is live.
|
is live.
|
||||||
|
|
||||||
# echo ftrace > /debug/tracing/current_tracer
|
# echo function > /debug/tracing/current_tracer
|
||||||
# cat /debug/tracing/trace_pipe > /tmp/trace.out &
|
# cat /debug/tracing/trace_pipe > /tmp/trace.out &
|
||||||
[1] 4153
|
[1] 4153
|
||||||
# echo 1 > /debug/tracing/tracing_enabled
|
# echo 1 > /debug/tracing/tracing_enabled
|
||||||
# usleep 1
|
# usleep 1
|
||||||
# echo 0 > /debug/tracing/tracing_enabled
|
# echo 0 > /debug/tracing/tracing_enabled
|
||||||
# cat /debug/tracing/trace
|
# cat /debug/tracing/trace
|
||||||
# tracer: ftrace
|
# tracer: function
|
||||||
#
|
#
|
||||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||||
# | | | | |
|
# | | | | |
|
||||||
|
@ -1317,7 +1292,7 @@ is live.
|
||||||
|
|
||||||
Note, reading the trace_pipe file will block until more input is added.
|
Note, reading the trace_pipe file will block until more input is added.
|
||||||
By changing the tracer, trace_pipe will issue an EOF. We needed
|
By changing the tracer, trace_pipe will issue an EOF. We needed
|
||||||
to set the ftrace tracer _before_ cating the trace_pipe file.
|
to set the function tracer _before_ we "cat" the trace_pipe file.
|
||||||
|
|
||||||
|
|
||||||
trace entries
|
trace entries
|
||||||
|
@ -1334,10 +1309,10 @@ number of entries.
|
||||||
65620
|
65620
|
||||||
|
|
||||||
Note, to modify this, you must have tracing completely disabled. To do that,
|
Note, to modify this, you must have tracing completely disabled. To do that,
|
||||||
echo "none" into the current_tracer. If the current_tracer is not set
|
echo "nop" into the current_tracer. If the current_tracer is not set
|
||||||
to "none", an EINVAL error will be returned.
|
to "nop", an EINVAL error will be returned.
|
||||||
|
|
||||||
# echo none > /debug/tracing/current_tracer
|
# echo nop > /debug/tracing/current_tracer
|
||||||
# echo 100000 > /debug/tracing/trace_entries
|
# echo 100000 > /debug/tracing/trace_entries
|
||||||
# cat /debug/tracing/trace_entries
|
# cat /debug/tracing/trace_entries
|
||||||
100045
|
100045
|
||||||
|
|
Загрузка…
Ссылка в новой задаче