Граф коммитов

2626 Коммитов

Автор SHA1 Сообщение Дата
Linus Torvalds 214b931320 Lots of tweaks, small fixes, optimizations, and some helper functions
to help out the rest of the kernel to ease their use of trace events.
 
 The big change for this release is the allowing of other tracers,
 such as the latency tracers, to be used in the trace instances and allow
 for function or function graph tracing to be in the top level
 simultaneously.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTlbUqAAoJEKQekfcNnQGuP+8H+wTBG06beHsqe6XcaeXcKNkt
 Mimm0O04oQdw89CBWeJvXyOwRTtiN4M/4hxHXBTDtChxM9oUyWw1o0IpSMMuQ16O
 w9r3DfC8e1air+ufEuYWM0QNtyzHi8EfDSNia55ON5jvtkCZTXOEKZD+n8M9w28p
 I7PVgr0PDztsCpethCpg0M8beK9zuQPWMzsHAQCsKI06Xl5z33kPIJR15Exh+Kr1
 uVVTZW7JFVAPuSnteLSIx9pN6OjsVGzOZCljg+O+9/v/02u5nkMiS2nURxae86kg
 RTSiRYT6Hvl/MCBhdss/w5kgSk6BYiZ0hXbLtwetvre+vQrOR5CnDw2DxZ7e+gU=
 =oudH
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "Lots of tweaks, small fixes, optimizations, and some helper functions
  to help out the rest of the kernel to ease their use of trace events.

  The big change for this release is the allowing of other tracers, such
  as the latency tracers, to be used in the trace instances and allow
  for function or function graph tracing to be in the top level
  simultaneously"

* tag 'trace-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
  tracing: Fix memory leak on instance deletion
  tracing: Fix leak of ring buffer data when new instances creation fails
  tracing/kprobes: Avoid self tests if tracing is disabled on boot up
  tracing: Return error if ftrace_trace_arrays list is empty
  tracing: Only calculate stats of tracepoint benchmarks for 2^32 times
  tracing: Convert stddev into u64 in tracepoint benchmark
  tracing: Introduce saved_cmdlines_size file
  tracing: Add __get_dynamic_array_len() macro for trace events
  tracing: Remove unused variable in trace_benchmark
  tracing: Eliminate double free on failure of allocation on boot up
  ftrace/x86: Call text_ip_addr() instead of the duplicated code
  tracing: Print max callstack on stacktrace bug
  tracing: Move locking of trace_cmdline_lock into start/stop seq calls
  tracing: Try again for saved cmdline if failed due to locking
  tracing: Have saved_cmdlines use the seq_read infrastructure
  tracing: Add tracepoint benchmark tracepoint
  tracing: Print nasty banner when trace_printk() is in use
  tracing: Add funcgraph_tail option to print function name after closing braces
  tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
  tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks
  ...
2014-06-09 16:39:15 -07:00
Steven Rostedt (Red Hat) a9fcaaac37 tracing: Fix memory leak on instance deletion
When an instance is created, it also gets a snapshot ring buffer
allocated (with minimum of pages). But when it is deleted the snapshot
buffer is not. There was a helper function added to match the allocation
of these ring buffers to a way to free them, but it wasn't used by
the deletion of an instance. Using that helper function solves this
memory leak.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06 23:17:28 -04:00
Steven Rostedt (Red Hat) 23aaa3c18e tracing: Fix leak of ring buffer data when new instances creation fails
Yoshihiro Yunomae reported that the ring buffer data for a trace
instance does not get properly cleaned up when it fails. He proposed
a patch that manually cleaned the data up and addad a bunch of labels.
The labels are not needed because all trace array is allocated with
a kzalloc which initializes it to 0 and all kfree()s can take a NULL
pointer and will ignore it.

Adding a new helper function free_trace_buffers() that can also take
null buffers to free the buffers that were allocated by
allocate_trace_buffers().

Link: http://lkml.kernel.org/r/20140605223522.32311.31664.stgit@yunodevel

Reported-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06 04:53:40 -04:00
Yoshihiro YUNOMAE 748ec3a20e tracing/kprobes: Avoid self tests if tracing is disabled on boot up
If tracing is disabled on boot up, the kernel should not execute tracing
self tests. The kernel should check whether tracing is disabled or not
before executing any of the tracing self tests.

Link: http://lkml.kernel.org/p/20140605223520.32311.56097.stgit@yunodevel

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06 04:53:39 -04:00
Yoshihiro YUNOMAE dc81e5e3ab tracing: Return error if ftrace_trace_arrays list is empty
ftrace_trace_arrays links global_trace.list. However, global_trace
is not added to ftrace_trace_arrays if trace_alloc_buffers() failed.
As the result, ftrace_trace_arrays becomes an empty list. If
ftrace_trace_arrays is an empty list, current top_trace_array() returns
an invalid pointer. As the result, the kernel can induce memory corruption
or panic.

Current implementation does not check whether ftrace_trace_arrays is empty
list or not. So, in this patch, if ftrace_trace_arrays is empty list,
top_trace_array() returns NULL. Moreover, this patch makes all functions
calling top_trace_array() handle it appropriately.

Link: http://lkml.kernel.org/p/20140605223517.32311.99233.stgit@yunodevel

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06 04:47:46 -04:00
Steven Rostedt (Red Hat) 34839f5a69 tracing: Only calculate stats of tracepoint benchmarks for 2^32 times
When calculating the average and standard deviation, it is required that
the count be less than UINT_MAX, otherwise the do_div() will get
undefined results. After 2^32 counts of data, the average and standard
deviation should pretty much be set anyway.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-06 00:41:38 -04:00
Steven Rostedt (Red Hat) 72e2fe38ea tracing: Convert stddev into u64 in tracepoint benchmark
I've been told that do_div() expects an unsigned 64 bit number, and
is undefined if a signed is used. This gave a warning on the MIPS
build. I'm not sure if a signed 64 bit dividend is really an issue
or not, but the calculation this is used for is standard deviation,
and that isn't going to be negative. We can just convert it to
unsigned and be safe.

Reported-by: David Daney <ddaney.cavm@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-05 20:35:30 -04:00
Yoshihiro YUNOMAE 939c7a4f04 tracing: Introduce saved_cmdlines_size file
Introduce saved_cmdlines_size file for changing the number of saved pid-comms.
saved_cmdlines currently stores 128 command names using SAVED_CMDLINES, but
'no-existing processes' names are often lost in saved_cmdlines when we
read the trace data. So, by introducing saved_cmdlines_size file, we can
now change the 128 command names saved to something much larger if needed.

When we write a value to saved_cmdlines_size, the number of the value will
be stored in pid-comm list:

	# echo 1024 > /sys/kernel/debug/tracing/saved_cmdlines_size

Here, 1024 command names can be stored. The default number is 128 and the maximum
number is PID_MAX_DEFAULT (=32768 if CONFIG_BASE_SMALL is not set). So, if we
want to avoid losing any command names, we need to set 32768 to
saved_cmdlines_size.

We can read the maximum number of the list:

	# cat /sys/kernel/debug/tracing/saved_cmdlines_size
	128

Link: http://lkml.kernel.org/p/20140605012427.22115.16173.stgit@yunodevel

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-05 12:35:49 -04:00
Steven Rostedt (Red Hat) 195d280201 tracing: Remove unused variable in trace_benchmark
Somehow this unused variable warning sneaked past my warnings check
(probably due to it depending on a new config).

kernel/trace/trace_benchmark.c: In function 'trace_do_benchmark':
kernel/trace/trace_benchmark.c:38:6: warning: unused variable 'seedsq' [-Wunused-variable]
  u64 seedsq;
      ^

Link: http://lkml.kernel.org/r/20140604160921.4f4e69c4@canb.auug.org.au

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-04 13:36:29 -04:00
Yoshihiro YUNOMAE 198376cd88 tracing: Eliminate double free on failure of allocation on boot up
If allocation of the max_buffer fails on boot up, the error path will
free both per_cpu data structures from the buffers. With the new redesign
of the code, those structures are freed if allocations failed. That is,
the helper function that allocates the buffers will free the per cpu data
on failure. No need to do it again. In fact, the second free will cause
a bug as the code can not handle a double free.

Link: http://lkml.kernel.org/p/20140603042803.27308.30956.stgit@yunodevel

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-03 19:58:31 -04:00
Minchan Kim e317218194 tracing: Print max callstack on stacktrace bug
While I played with my own feature(ex, something on the way to reclaim),
the kernel would easily oops. I guessed that the reason had to do with
stack overflow and wanted to prove it.

I discovered the stack tracer which proved to be very useful for me but
the kernel would oops before my user program gather the information via
"watch cat /sys/kernel/debug/tracing/stack_trace" so I couldn't get any
message from that. What I needed was to have the stack tracer emit the
kernel stack usage before it does the oops so I could find what was
hogging the stack.

This patch shows the callstack of max stack usage right before an oops so
we can find a culprit.

So, the result is as follows.

[ 1116.522206] init: lightdm main process (1246) terminated with status 1
[ 1119.922916] init: failsafe-x main process (1272) terminated with status 1
[ 3887.728131] kworker/u24:1 (6637) used greatest stack depth: 256 bytes left
[ 6397.629227] cc1 (9554) used greatest stack depth: 128 bytes left
[ 7174.467392]         Depth    Size   Location    (47 entries)
[ 7174.467392]         -----    ----   --------
[ 7174.467785]   0)     7248     256   get_page_from_freelist+0xa7/0x920
[ 7174.468506]   1)     6992     352   __alloc_pages_nodemask+0x1cd/0xb20
[ 7174.469224]   2)     6640       8   alloc_pages_current+0x10f/0x1f0
[ 7174.469413]   3)     6632     168   new_slab+0x2c5/0x370
[ 7174.469413]   4)     6464       8   __slab_alloc+0x3a9/0x501
[ 7174.469413]   5)     6456      80   __kmalloc+0x1cb/0x200
[ 7174.469413]   6)     6376     376   vring_add_indirect+0x36/0x200
[ 7174.469413]   7)     6000     144   virtqueue_add_sgs+0x2e2/0x320
[ 7174.469413]   8)     5856     288   __virtblk_add_req+0xda/0x1b0
[ 7174.469413]   9)     5568      96   virtio_queue_rq+0xd3/0x1d0
[ 7174.469413]  10)     5472     128   __blk_mq_run_hw_queue+0x1ef/0x440
[ 7174.469413]  11)     5344      16   blk_mq_run_hw_queue+0x35/0x40
[ 7174.469413]  12)     5328      96   blk_mq_insert_requests+0xdb/0x160
[ 7174.469413]  13)     5232     112   blk_mq_flush_plug_list+0x12b/0x140
[ 7174.469413]  14)     5120     112   blk_flush_plug_list+0xc7/0x220
[ 7174.469413]  15)     5008      64   io_schedule_timeout+0x88/0x100
[ 7174.469413]  16)     4944     128   mempool_alloc+0x145/0x170
[ 7174.469413]  17)     4816      96   bio_alloc_bioset+0x10b/0x1d0
[ 7174.469413]  18)     4720      48   get_swap_bio+0x30/0x90
[ 7174.469413]  19)     4672     160   __swap_writepage+0x150/0x230
[ 7174.469413]  20)     4512      32   swap_writepage+0x42/0x90
[ 7174.469413]  21)     4480     320   shrink_page_list+0x676/0xa80
[ 7174.469413]  22)     4160     208   shrink_inactive_list+0x262/0x4e0
[ 7174.469413]  23)     3952     304   shrink_lruvec+0x3e1/0x6a0
[ 7174.469413]  24)     3648      80   shrink_zone+0x3f/0x110
[ 7174.469413]  25)     3568     128   do_try_to_free_pages+0x156/0x4c0
[ 7174.469413]  26)     3440     208   try_to_free_pages+0xf7/0x1e0
[ 7174.469413]  27)     3232     352   __alloc_pages_nodemask+0x783/0xb20
[ 7174.469413]  28)     2880       8   alloc_pages_current+0x10f/0x1f0
[ 7174.469413]  29)     2872     200   __page_cache_alloc+0x13f/0x160
[ 7174.469413]  30)     2672      80   find_or_create_page+0x4c/0xb0
[ 7174.469413]  31)     2592      80   ext4_mb_load_buddy+0x1e9/0x370
[ 7174.469413]  32)     2512     176   ext4_mb_regular_allocator+0x1b7/0x460
[ 7174.469413]  33)     2336     128   ext4_mb_new_blocks+0x458/0x5f0
[ 7174.469413]  34)     2208     256   ext4_ext_map_blocks+0x70b/0x1010
[ 7174.469413]  35)     1952     160   ext4_map_blocks+0x325/0x530
[ 7174.469413]  36)     1792     384   ext4_writepages+0x6d1/0xce0
[ 7174.469413]  37)     1408      16   do_writepages+0x23/0x40
[ 7174.469413]  38)     1392      96   __writeback_single_inode+0x45/0x2e0
[ 7174.469413]  39)     1296     176   writeback_sb_inodes+0x2ad/0x500
[ 7174.469413]  40)     1120      80   __writeback_inodes_wb+0x9e/0xd0
[ 7174.469413]  41)     1040     160   wb_writeback+0x29b/0x350
[ 7174.469413]  42)      880     208   bdi_writeback_workfn+0x11c/0x480
[ 7174.469413]  43)      672     144   process_one_work+0x1d2/0x570
[ 7174.469413]  44)      528     112   worker_thread+0x116/0x370
[ 7174.469413]  45)      416     240   kthread+0xf3/0x110
[ 7174.469413]  46)      176     176   ret_from_fork+0x7c/0xb0
[ 7174.469413] ------------[ cut here ]------------
[ 7174.469413] kernel BUG at kernel/trace/trace_stack.c:174!
[ 7174.469413] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 7174.469413] Dumping ftrace buffer:
[ 7174.469413]    (ftrace buffer empty)
[ 7174.469413] Modules linked in:
[ 7174.469413] CPU: 0 PID: 440 Comm: kworker/u24:0 Not tainted 3.14.0+ #212
[ 7174.469413] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 7174.469413] Workqueue: writeback bdi_writeback_workfn (flush-253:0)
[ 7174.469413] task: ffff880034170000 ti: ffff880029518000 task.ti: ffff880029518000
[ 7174.469413] RIP: 0010:[<ffffffff8112336e>]  [<ffffffff8112336e>] stack_trace_call+0x2de/0x340
[ 7174.469413] RSP: 0000:ffff880029518290  EFLAGS: 00010046
[ 7174.469413] RAX: 0000000000000030 RBX: 000000000000002f RCX: 0000000000000000
[ 7174.469413] RDX: 0000000000000000 RSI: 000000000000002f RDI: ffffffff810b7159
[ 7174.469413] RBP: ffff8800295182f0 R08: ffffffffffffffff R09: 0000000000000000
[ 7174.469413] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff82768dfc
[ 7174.469413] R13: 000000000000f2e8 R14: ffff8800295182b8 R15: 00000000000000f8
[ 7174.469413] FS:  0000000000000000(0000) GS:ffff880037c00000(0000) knlGS:0000000000000000
[ 7174.469413] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 7174.469413] CR2: 00002acd0b994000 CR3: 0000000001c0b000 CR4: 00000000000006f0
[ 7174.469413] Stack:
[ 7174.469413]  0000000000000000 ffffffff8114fdb7 0000000000000087 0000000000001c50
[ 7174.469413]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 7174.469413]  0000000000000002 ffff880034170000 ffff880034171028 0000000000000000
[ 7174.469413] Call Trace:
[ 7174.469413]  [<ffffffff8114fdb7>] ? get_page_from_freelist+0xa7/0x920
[ 7174.469413]  [<ffffffff816eee3f>] ftrace_call+0x5/0x2f
[ 7174.469413]  [<ffffffff81165065>] ? next_zones_zonelist+0x5/0x70
[ 7174.469413]  [<ffffffff810a23fa>] ? __bfs+0x11a/0x270
[ 7174.469413]  [<ffffffff81165065>] ? next_zones_zonelist+0x5/0x70
[ 7174.469413]  [<ffffffff8114fdb7>] ? get_page_from_freelist+0xa7/0x920
[ 7174.469413]  [<ffffffff8119092f>] ? alloc_pages_current+0x10f/0x1f0
[ 7174.469413]  [<ffffffff811507fd>] __alloc_pages_nodemask+0x1cd/0xb20
[ 7174.469413]  [<ffffffff810a4de6>] ? check_irq_usage+0x96/0xe0
[ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413]  [<ffffffff8119092f>] alloc_pages_current+0x10f/0x1f0
[ 7174.469413]  [<ffffffff81199cd5>] ? new_slab+0x2c5/0x370
[ 7174.469413]  [<ffffffff81199cd5>] new_slab+0x2c5/0x370
[ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413]  [<ffffffff816db002>] __slab_alloc+0x3a9/0x501
[ 7174.469413]  [<ffffffff8119af8b>] ? __kmalloc+0x1cb/0x200
[ 7174.469413]  [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
[ 7174.469413]  [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
[ 7174.469413]  [<ffffffff8141dc46>] ? vring_add_indirect+0x36/0x200
[ 7174.469413]  [<ffffffff8119af8b>] __kmalloc+0x1cb/0x200
[ 7174.469413]  [<ffffffff8141de10>] ? vring_add_indirect+0x200/0x200
[ 7174.469413]  [<ffffffff8141dc46>] vring_add_indirect+0x36/0x200
[ 7174.469413]  [<ffffffff8141e402>] virtqueue_add_sgs+0x2e2/0x320
[ 7174.469413]  [<ffffffff8148e35a>] __virtblk_add_req+0xda/0x1b0
[ 7174.469413]  [<ffffffff8148e503>] virtio_queue_rq+0xd3/0x1d0
[ 7174.469413]  [<ffffffff8134aa0f>] __blk_mq_run_hw_queue+0x1ef/0x440
[ 7174.469413]  [<ffffffff8134b0d5>] blk_mq_run_hw_queue+0x35/0x40
[ 7174.469413]  [<ffffffff8134b7bb>] blk_mq_insert_requests+0xdb/0x160
[ 7174.469413]  [<ffffffff8134be5b>] blk_mq_flush_plug_list+0x12b/0x140
[ 7174.469413]  [<ffffffff81342237>] blk_flush_plug_list+0xc7/0x220
[ 7174.469413]  [<ffffffff816e60ef>] ? _raw_spin_unlock_irqrestore+0x3f/0x70
[ 7174.469413]  [<ffffffff816e16e8>] io_schedule_timeout+0x88/0x100
[ 7174.469413]  [<ffffffff816e1665>] ? io_schedule_timeout+0x5/0x100
[ 7174.469413]  [<ffffffff81149415>] mempool_alloc+0x145/0x170
[ 7174.469413]  [<ffffffff8109baf0>] ? __init_waitqueue_head+0x60/0x60
[ 7174.469413]  [<ffffffff811e246b>] bio_alloc_bioset+0x10b/0x1d0
[ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413]  [<ffffffff81184110>] get_swap_bio+0x30/0x90
[ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413]  [<ffffffff81184660>] __swap_writepage+0x150/0x230
[ 7174.469413]  [<ffffffff810ab405>] ? do_raw_spin_unlock+0x5/0xa0
[ 7174.469413]  [<ffffffff81184230>] ? end_swap_bio_read+0xc0/0xc0
[ 7174.469413]  [<ffffffff81184515>] ? __swap_writepage+0x5/0x230
[ 7174.469413]  [<ffffffff81184782>] swap_writepage+0x42/0x90
[ 7174.469413]  [<ffffffff8115ae96>] shrink_page_list+0x676/0xa80
[ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413]  [<ffffffff8115b872>] shrink_inactive_list+0x262/0x4e0
[ 7174.469413]  [<ffffffff8115c1c1>] shrink_lruvec+0x3e1/0x6a0
[ 7174.469413]  [<ffffffff8115c4bf>] shrink_zone+0x3f/0x110
[ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413]  [<ffffffff8115c9e6>] do_try_to_free_pages+0x156/0x4c0
[ 7174.469413]  [<ffffffff8115cf47>] try_to_free_pages+0xf7/0x1e0
[ 7174.469413]  [<ffffffff81150db3>] __alloc_pages_nodemask+0x783/0xb20
[ 7174.469413]  [<ffffffff8119092f>] alloc_pages_current+0x10f/0x1f0
[ 7174.469413]  [<ffffffff81145c0f>] ? __page_cache_alloc+0x13f/0x160
[ 7174.469413]  [<ffffffff81145c0f>] __page_cache_alloc+0x13f/0x160
[ 7174.469413]  [<ffffffff81146c6c>] find_or_create_page+0x4c/0xb0
[ 7174.469413]  [<ffffffff811463e5>] ? find_get_page+0x5/0x130
[ 7174.469413]  [<ffffffff812837b9>] ext4_mb_load_buddy+0x1e9/0x370
[ 7174.469413]  [<ffffffff81284c07>] ext4_mb_regular_allocator+0x1b7/0x460
[ 7174.469413]  [<ffffffff81281070>] ? ext4_mb_use_preallocated+0x40/0x360
[ 7174.469413]  [<ffffffff816eee3f>] ? ftrace_call+0x5/0x2f
[ 7174.469413]  [<ffffffff81287eb8>] ext4_mb_new_blocks+0x458/0x5f0
[ 7174.469413]  [<ffffffff8127d83b>] ext4_ext_map_blocks+0x70b/0x1010
[ 7174.469413]  [<ffffffff8124e6d5>] ext4_map_blocks+0x325/0x530
[ 7174.469413]  [<ffffffff81253871>] ext4_writepages+0x6d1/0xce0
[ 7174.469413]  [<ffffffff812531a0>] ? ext4_journalled_write_end+0x330/0x330
[ 7174.469413]  [<ffffffff811539b3>] do_writepages+0x23/0x40
[ 7174.469413]  [<ffffffff811d2365>] __writeback_single_inode+0x45/0x2e0
[ 7174.469413]  [<ffffffff811d36ed>] writeback_sb_inodes+0x2ad/0x500
[ 7174.469413]  [<ffffffff811d39de>] __writeback_inodes_wb+0x9e/0xd0
[ 7174.469413]  [<ffffffff811d40bb>] wb_writeback+0x29b/0x350
[ 7174.469413]  [<ffffffff81057c3d>] ? __local_bh_enable_ip+0x6d/0xd0
[ 7174.469413]  [<ffffffff811d6e9c>] bdi_writeback_workfn+0x11c/0x480
[ 7174.469413]  [<ffffffff81070610>] ? process_one_work+0x170/0x570
[ 7174.469413]  [<ffffffff81070672>] process_one_work+0x1d2/0x570
[ 7174.469413]  [<ffffffff81070610>] ? process_one_work+0x170/0x570
[ 7174.469413]  [<ffffffff81071bb6>] worker_thread+0x116/0x370
[ 7174.469413]  [<ffffffff81071aa0>] ? manage_workers.isra.19+0x2e0/0x2e0
[ 7174.469413]  [<ffffffff81078e53>] kthread+0xf3/0x110
[ 7174.469413]  [<ffffffff81078d60>] ? flush_kthread_worker+0x150/0x150
[ 7174.469413]  [<ffffffff816ef0ec>] ret_from_fork+0x7c/0xb0
[ 7174.469413]  [<ffffffff81078d60>] ? flush_kthread_worker+0x150/0x150
[ 7174.469413] Code: c0 49 bc fc 8d 76 82 ff ff ff ff e8 44 5a 5b 00 31 f6 8b 05 95 2b b3 00 48 39 c6 7d 0e 4c 8b 04 f5 20 5f c5 81 49 83 f8 ff 75 11 <0f> 0b 48 63 05 71 5a 64 01 48 29 c3 e9 d0 fd ff ff 48 8d 5e 01
[ 7174.469413] RIP  [<ffffffff8112336e>] stack_trace_call+0x2de/0x340
[ 7174.469413]  RSP <ffff880029518290>
[ 7174.469413] ---[ end trace c97d325b36b718f3 ]---

Link: http://lkml.kernel.org/p/1401683592-1651-1-git-send-email-minchan@kernel.org

Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-06-02 16:43:49 -04:00
Steven Rostedt (Red Hat) 4c27e756bc tracing: Move locking of trace_cmdline_lock into start/stop seq calls
With the conversion of the saved_cmdlines output to use seq_read, there
is now a race between accessing the values of the saved_cmdlines and
the writing to them. The trace_cmdline_lock needs to be taken at
the start and stop of the seq calls.

A new __trace_find_cmdline() call is created to allow for the look up
to happen without taking the lock.

Fixes: 42584c81c5 tracing: Have saved_cmdlines use the seq_read infrastructure
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-30 13:03:40 -04:00
Steven Rostedt (Red Hat) 379cfdac37 tracing: Try again for saved cmdline if failed due to locking
In order to prevent the saved cmdline cache from being filled when
tracing is not active, the comms are only recorded after a trace event
is recorded.

The problem is, a comm can fail to be recorded if the trace_cmdline_lock
is held. That lock is taken via a trylock to allow it to happen from
any context (including NMI). If the lock fails to be taken, the comm
is skipped. No big deal, as we will try again later.

But! Because of the code that was added to only record after an event,
we may not try again later as the recording is made as a oneshot per
event per CPU.

Only disable the recording of the comm if the comm is actually recorded.

Fixes: 7ffbd48d5c "tracing: Cache comms only after an event occurred"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-30 09:42:39 -04:00
Yoshihiro YUNOMAE 42584c81c5 tracing: Have saved_cmdlines use the seq_read infrastructure
Current tracing_saved_cmdlines_read() implementation is naive; It allocates
a large buffer, constructs output data to that buffer for each read
operation, and then copies a portion of the buffer to the user space
buffer. This has several issues such as slow memory allocation, high
CPU usage, and even corruption of the output data.

The seq_read infrastructure is made to handle this type of work.
By converting it to use seq_read() the code becomes smaller, simplified,
as well as correct.

Link: http://lkml.kernel.org/p/20140220084431.3839.51793.stgit@yunodevel

Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-29 23:08:07 -04:00
Steven Rostedt (Red Hat) 81dc9f0ef2 tracing: Add tracepoint benchmark tracepoint
In order to help benchmark the time tracepoints take, a new config
option is added called CONFIG_TRACEPOINT_BENCHMARK. When this option
is set a tracepoint is created called "benchmark:benchmark_event".
When the tracepoint is enabled, it kicks off a kernel thread that
goes into an infinite loop (calling cond_sched() to let other tasks
run), and calls the tracepoint. Each iteration will record the time
it took to write to the tracepoint and the next iteration that
data will be passed to the tracepoint itself. That is, the tracepoint
will report the time it took to do the previous tracepoint.
The string written to the tracepoint is a static string of 128 bytes
to keep the time the same. The initial string is simply a write of
"START". The second string records the cold cache time of the first
write which is not added to the rest of the calculations.

As it is a tight loop, it benchmarks as hot cache. That's fine because
we care most about hot paths that are probably in cache already.

An example of the output:

     START
     first=3672 [COLD CACHED]
     last=632 first=3672 max=632 min=632 avg=316 std=446 std^2=199712
     last=278 first=3672 max=632 min=278 avg=303 std=316 std^2=100337
     last=277 first=3672 max=632 min=277 avg=296 std=258 std^2=67064
     last=273 first=3672 max=632 min=273 avg=292 std=224 std^2=50411
     last=273 first=3672 max=632 min=273 avg=288 std=200 std^2=40389
     last=281 first=3672 max=632 min=273 avg=287 std=183 std^2=33666

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-29 22:49:54 -04:00
Steven Rostedt 2184db46e4 tracing: Print nasty banner when trace_printk() is in use
trace_printk() is used to debug fast paths within the kernel. Places
that gets called in any context (interrupt or NMI) or thousands of
times a second. Something you do not want to do with a printk().

In order to make it completely lockless as it needs a temporary buffer
to handle some of the string formatting, a page is created per cpu for
every context (four per cpu; normal, softirq, irq, NMI).

Since trace_printk() should only be used for debugging purposes,
there's no reason to waste memory on these buffers on a production
system. That means, trace_printk() should never be used unless a
developer is debugging their kernel. There's macro magic to allocate
the buffers if trace_printk() is used anywhere in the kernel.

To help enforce that trace_printk() isn't used outside of development,
when it is used, a nasty banner is displayed on bootup (or when a module
is loaded that uses trace_printk() and the kernel core does not).

Here's the banner:

 **********************************************************
 **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
 **                                                      **
 ** trace_printk() being used. Allocating extra memory.  **
 **                                                      **
 ** This means that this is a DEBUG kernel and it is     **
 ** unsafe for produciton use.                           **
 **                                                      **
 ** If you see this message and you are not debugging    **
 ** the kernel, report this immediately to your vendor!  **
 **                                                      **
 **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
 **********************************************************

That should hopefully keep developers from trying to sneak in a
trace_printk() or two.

Link: http://lkml.kernel.org/p/20140528131440.2283213c@gandalf.local.home

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-29 20:13:59 -04:00
Robert Elliott 607e3a2920 tracing: Add funcgraph_tail option to print function name after closing braces
In the function-graph tracer, add a funcgraph_tail option
to print the function name on all } lines, not just
functions whose first line is no longer in the trace
buffer.

If a function calls other traced functions, its total
time appears on its } line.  This change allows grep
to be used to determine the function for which the
line corresponds.

Update Documentation/trace/ftrace.txt to describe
this new option.

Link: http://lkml.kernel.org/p/20140520221041.8359.6782.stgit@beardog.cce.hp.com

Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-20 23:29:32 -04:00
Robert Elliott ccdb594653 tracing: Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
Eliminate duplicate TRACE_GRAPH_PRINT_xx defines
in trace_functions_graph.c that are already in
trace.h.

Add TRACE_GRAPH_PRINT_IRQS to trace.h, which is
the only one that is missing.

Link: http://lkml.kernel.org/p/20140520221031.8359.24733.stgit@beardog.cce.hp.com

Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-20 23:28:34 -04:00
Steven Rostedt (Red Hat) 4449bf927b tracing: Add __bitmask() macro to trace events to cpumasks and other bitmasks
Being able to show a cpumask of events can be useful as some events
may affect only some CPUs. There is no standard way to record the
cpumask and converting it to a string is rather expensive during
the trace as traces happen in hotpaths. It would be better to record
the raw event mask and be able to parse it at print time.

The following macros were added for use with the TRACE_EVENT() macro:

  __bitmask()
  __assign_bitmask()
  __get_bitmask()

To test this, I added this to the sched_migrate_task event, which
looked like this:

TRACE_EVENT(sched_migrate_task,

	TP_PROTO(struct task_struct *p, int dest_cpu, const struct cpumask *cpus),

	TP_ARGS(p, dest_cpu, cpus),

	TP_STRUCT__entry(
		__array(	char,	comm,	TASK_COMM_LEN	)
		__field(	pid_t,	pid			)
		__field(	int,	prio			)
		__field(	int,	orig_cpu		)
		__field(	int,	dest_cpu		)
		__bitmask(	cpumask, num_possible_cpus()	)
	),

	TP_fast_assign(
		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
		__entry->pid		= p->pid;
		__entry->prio		= p->prio;
		__entry->orig_cpu	= task_cpu(p);
		__entry->dest_cpu	= dest_cpu;
		__assign_bitmask(cpumask, cpumask_bits(cpus), num_possible_cpus());
	),

	TP_printk("comm=%s pid=%d prio=%d orig_cpu=%d dest_cpu=%d cpumask=%s",
		  __entry->comm, __entry->pid, __entry->prio,
		  __entry->orig_cpu, __entry->dest_cpu,
		  __get_bitmask(cpumask))
);

With the output of:

        ksmtuned-3613  [003] d..2   485.220508: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=3 dest_cpu=2 cpumask=00000000,0000000f
     migration/1-13    [001] d..5   485.221202: sched_migrate_task: comm=ksmtuned pid=3614 prio=120 orig_cpu=1 dest_cpu=0 cpumask=00000000,0000000f
             awk-3615  [002] d.H5   485.221747: sched_migrate_task: comm=rcu_preempt pid=7 prio=120 orig_cpu=0 dest_cpu=1 cpumask=00000000,000000ff
     migration/2-18    [002] d..5   485.222062: sched_migrate_task: comm=ksmtuned pid=3615 prio=120 orig_cpu=2 dest_cpu=3 cpumask=00000000,0000000f

Link: http://lkml.kernel.org/r/1399377998-14870-6-git-send-email-javi.merino@arm.com
Link: http://lkml.kernel.org/r/20140506132238.22e136d1@gandalf.local.home

Suggested-by: Javi Merino <javi.merino@arm.com>
Tested-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-15 11:29:37 -04:00
Steven Rostedt (Red Hat) f1b2f2bd58 ftrace: Remove FTRACE_UPDATE_MODIFY_CALL_REGS flag
As the decision to what needs to be done (converting a call to the
ftrace_caller to ftrace_caller_regs or to convert from ftrace_caller_regs
to ftrace_caller) can easily be determined from the rec->flags of
FTRACE_FL_REGS and FTRACE_FL_REGS_EN, there's no need to have the
ftrace_check_record() return either a UPDATE_MODIFY_CALL_REGS or a
UPDATE_MODIFY_CALL. Just he latter is enough. This added flag causes
more complexity than is required. Remove it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-14 11:37:30 -04:00
Steven Rostedt (Red Hat) 7c0868e03b ftrace: Use the ftrace_addr helper functions to find the ftrace_addr
With the moving of the functions that determine what the mcount call site
should be replaced with into the generic code, there is a few places
in the generic code that can use them instead of hard coding it as it
does.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-14 11:37:29 -04:00
Steven Rostedt (Red Hat) 7413af1fb7 ftrace: Make get_ftrace_addr() and get_ftrace_addr_old() global
Move and rename get_ftrace_addr() and get_ftrace_addr_old() to
ftrace_get_addr_new() and ftrace_get_addr_curr() respectively.

This moves these two helper functions in the generic code out from
the arch specific code, and renames them to have a better generic
name. This will allow other archs to use them as well as makes it
a bit easier to work on getting separate trampolines for different
functions.

ftrace_get_addr_new() returns the trampoline address that the mcount
call address will be converted to.

ftrace_get_addr_curr() returns the trampoline address of what the
mcount call address currently jumps to.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-14 11:37:29 -04:00
Steven Rostedt (Red Hat) 68f40969f0 ftrace: Always inline ftrace_hash_empty() helper function
The ftrace_hash_empty() function is a simple test:

	return !hash || !hash->count;

But gcc seems to want to make it a call. As this is in an extreme
hot path of the function tracer, there's no reason it needs to be
a call. I only wrote it to be a helper function anyway, otherwise
it would have been inlined manually.

Force gcc to inline it, as it could have also been a macro.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-14 11:37:28 -04:00
Steven Rostedt (Red Hat) 19eab4a472 ftrace: Write in missing comment from a very old commit
Back in 2011 Commit ed926f9b35 "ftrace: Use counters to enable
functions to trace" changed the way ftrace accounts for enabled
and disabled traced functions. There was a comment started as:

	/*
	 *
	 */

But never finished. Well, that's rather useless. I probably forgot
to save the file before committing it. And it passed review from all
this time.

Anyway, better late than never. I updated the comment to express what
is happening in that somewhat complex code.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-14 11:37:27 -04:00
Steven Rostedt (Red Hat) 66209a5bd4 ftrace: Remove boolean of hash_enable and hash_disable
Commit 4104d326b6 "ftrace: Remove global function list and call
function directly" cleaned up the global_ops filtering and made
the code simpler, but it left a variable "hash_enable" that was used
to know if the hash functions should be updated or not. It was
updated if the global_ops did not override them. As the global_ops
are now no different than any other ftrace_ops, the hash always
gets updated and there's no reason to use the hash_enable boolean.

The same goes for hash_disable used in ftrace_shutdown().

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-14 11:37:25 -04:00
Christoph Lameter bdffd893a0 tracing: Replace __get_cpu_var uses with this_cpu_ptr
Replace uses of &__get_cpu_var for address calculation with this_cpu_ptr.

Link: http://lkml.kernel.org/p/alpine.DEB.2.10.1404291415560.18364@gentwo.org

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-05 22:40:53 -04:00
Steven Rostedt (Red Hat) 561a4fe851 tracing: Use rcu_dereference_sched() for trace event triggers
As trace event triggers are now part of the mainline kernel, I added
my trace event trigger tests to my test suite I run on all my kernels.
Now these tests get run under different config options, and one of
those options is CONFIG_PROVE_RCU, which checks under lockdep that
the rcu locking primitives are being used correctly. This triggered
the following splat:

===============================
[ INFO: suspicious RCU usage. ]
3.15.0-rc2-test+ #11 Not tainted
-------------------------------
kernel/trace/trace_events_trigger.c:80 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
4 locks held by swapper/1/0:
 #0:  ((&(&j_cdbs->work)->timer)){..-...}, at: [<ffffffff8104d2cc>] call_timer_fn+0x5/0x1be
 #1:  (&(&pool->lock)->rlock){-.-...}, at: [<ffffffff81059856>] __queue_work+0x140/0x283
 #2:  (&p->pi_lock){-.-.-.}, at: [<ffffffff8106e961>] try_to_wake_up+0x2e/0x1e8
 #3:  (&rq->lock){-.-.-.}, at: [<ffffffff8106ead3>] try_to_wake_up+0x1a0/0x1e8

stack backtrace:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.15.0-rc2-test+ #11
Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
 0000000000000001 ffff88007e083b98 ffffffff819f53a5 0000000000000006
 ffff88007b0942c0 ffff88007e083bc8 ffffffff81081307 ffff88007ad96d20
 0000000000000000 ffff88007af2d840 ffff88007b2e701c ffff88007e083c18
Call Trace:
 <IRQ>  [<ffffffff819f53a5>] dump_stack+0x4f/0x7c
 [<ffffffff81081307>] lockdep_rcu_suspicious+0x107/0x110
 [<ffffffff810ee51c>] event_triggers_call+0x99/0x108
 [<ffffffff810e8174>] ftrace_event_buffer_commit+0x42/0xa4
 [<ffffffff8106aadc>] ftrace_raw_event_sched_wakeup_template+0x71/0x7c
 [<ffffffff8106bcbf>] ttwu_do_wakeup+0x7f/0xff
 [<ffffffff8106bd9b>] ttwu_do_activate.constprop.126+0x5c/0x61
 [<ffffffff8106eadf>] try_to_wake_up+0x1ac/0x1e8
 [<ffffffff8106eb77>] wake_up_process+0x36/0x3b
 [<ffffffff810575cc>] wake_up_worker+0x24/0x26
 [<ffffffff810578bc>] insert_work+0x5c/0x65
 [<ffffffff81059982>] __queue_work+0x26c/0x283
 [<ffffffff81059999>] ? __queue_work+0x283/0x283
 [<ffffffff810599b7>] delayed_work_timer_fn+0x1e/0x20
 [<ffffffff8104d3a6>] call_timer_fn+0xdf/0x1be^M
 [<ffffffff8104d2cc>] ? call_timer_fn+0x5/0x1be
 [<ffffffff81059999>] ? __queue_work+0x283/0x283
 [<ffffffff8104d823>] run_timer_softirq+0x1a4/0x22f^M
 [<ffffffff8104696d>] __do_softirq+0x17b/0x31b^M
 [<ffffffff81046d03>] irq_exit+0x42/0x97
 [<ffffffff81a08db6>] smp_apic_timer_interrupt+0x37/0x44
 [<ffffffff81a07a2f>] apic_timer_interrupt+0x6f/0x80
 <EOI>  [<ffffffff8100a5d8>] ? default_idle+0x21/0x32
 [<ffffffff8100a5d6>] ? default_idle+0x1f/0x32
 [<ffffffff8100ac10>] arch_cpu_idle+0xf/0x11
 [<ffffffff8107b3a4>] cpu_startup_entry+0x1a3/0x213
 [<ffffffff8102a23c>] start_secondary+0x212/0x219

The cause is that the triggers are protected by rcu_read_lock_sched() but
the data is dereferenced with rcu_dereference() which expects it to
be protected with rcu_read_lock(). The proper reference should be
rcu_dereference_sched().

Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-02 23:12:42 -04:00
Steven Rostedt (Red Hat) fd06a54990 ftrace: Have function graph tracer use global_ops for filtering
Commit 4104d326b6 "ftrace: Remove global function list and call
function directly" cleaned up the global_ops filtering and made
the code simpler. But it left out function graph filtering which
also depended on that code. The function graph filtering still
needs to use global_ops as the filter otherwise it wont filter
at all.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-05-01 23:21:16 -04:00
Steven Rostedt (Red Hat) b1169cc69b tracing: Remove mock up poll wait function
Now that the ring buffer has a built in way to wake up readers
when there's data, using irq_work such that it is safe to do it
in any context. But it was still using the old "poor man's"
wait polling that checks every 1/10 of a second to see if it
should wake up a waiter. This makes the latency for a wake up
excruciatingly long. No need to do that anymore.

Completely remove the different wait_poll types from the tracers
and have them all use the default one now.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-30 08:40:05 -04:00
Steven Rostedt (Red Hat) f487426104 tracing: Break out of tracing_wait_pipe() before wait_pipe() is called
When reading from trace_pipe, if tracing is off but nothing was read
it should block. If something is read and tracing is off, then EOF
is returned. If tracing is on and there's nothing to read, it will block.

But because the check of whether tracing is off and something was read
is done after the block on the pipe, it is hit or miss if the EOF is
returned or not leading to inconsistent behavior.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-29 16:07:28 -04:00
Steven Rostedt (Red Hat) a949ae560a ftrace/module: Hardcode ftrace_module_init() call into load_module()
A race exists between module loading and enabling of function tracer.

	CPU 1				CPU 2
	-----				-----
  load_module()
   module->state = MODULE_STATE_COMING

				register_ftrace_function()
				 mutex_lock(&ftrace_lock);
				 ftrace_startup()
				  update_ftrace_function();
				   ftrace_arch_code_modify_prepare()
				    set_all_module_text_rw();
				   <enables-ftrace>
				    ftrace_arch_code_modify_post_process()
				     set_all_module_text_ro();

				[ here all module text is set to RO,
				  including the module that is
				  loading!! ]

   blocking_notifier_call_chain(MODULE_STATE_COMING);
    ftrace_init_module()

     [ tries to modify code, but it's RO, and fails!
       ftrace_bug() is called]

When this race happens, ftrace_bug() will produces a nasty warning and
all of the function tracing features will be disabled until reboot.

The simple solution is to treate module load the same way the core
kernel is treated at boot. To hardcode the ftrace function modification
of converting calls to mcount into nops. This is done in init/main.c
there's no reason it could not be done in load_module(). This gives
a better control of the changes and doesn't tie the state of the
module to its notifiers as much. Ftrace is special, it needs to be
treated as such.

The reason this would work, is that the ftrace_module_init() would be
called while the module is in MODULE_STATE_UNFORMED, which is ignored
by the set_all_module_text_ro() call.

Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com

Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@vger.kernel.org # 2.6.38+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-28 10:37:21 -04:00
Jiaxing Wang 8d1b065d47 tracing: Fix documentation of ftrace_set_global_{filter,notrace}()
The functions ftrace_set_global_filter() and ftrace_set_global_notrace()
still have their old names in the kernel doc (ftrace_set_filter and
ftrace_set_notrace respectively). Replace these with the real names.

Link: http://lkml.kernel.org/p/1398006644-5935-3-git-send-email-wangjiaxing@insigma.com.cn

Signed-off-by: Jiaxing Wang <wangjiaxing@insigma.com.cn>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-24 13:38:01 -04:00
Jiaxing Wang 7eea4fce02 tracing/stack_trace: Skip 4 instead of 3 when using ftrace_ops_list_func
When using ftrace_ops_list_func, we should skip 4 instead of 3,
to avoid ftrace_call+0x5/0xb appearing in the stack trace:

        Depth    Size   Location    (110 entries)
        -----    ----   --------
  0)     2956       0   update_curr+0xe/0x1e0
  1)     2956      68   ftrace_call+0x5/0xb
  2)     2888      92   enqueue_entity+0x53/0xe80
  3)     2796      80   enqueue_task_fair+0x47/0x7e0
  4)     2716      28   enqueue_task+0x45/0x70
  5)     2688      12   activate_task+0x22/0x30

Add a function using_ftrace_ops_list_func() to test for this while keeping
ftrace_ops_list_func to remain static.

Link: http://lkml.kernel.org/p/1398006644-5935-2-git-send-email-wangjiaxing@insigma.com.cn

Signed-off-by: Jiaxing Wang <wangjiaxing@insigma.com.cn>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-24 13:36:03 -04:00
Fabian Frederick ad1438a076 tracing: Add static to local functions
This patch adds static to the following functions:
-cycle_t buffer_ftrace_now
-void free_snapshot
-int trace_selftest_startup_dynamic_tracing

Link: http://lkml.kernel.org/p/20140417214442.d7abc7c0b0e4b90e7fedecc9@skynet.be

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 14:00:46 -04:00
Mathias Krause 8275f69f07 ftrace: Statically initialize pm notifier block
Instead of initializing the pm notifier block in register_ftrace_graph(),
initialize it statically. This safes us some code.

Found in the PaX patch, written by the PaX Team.

Link: http://lkml.kernel.org/p/1396186310-3156-1-git-send-email-minipli@googlemail.com

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: PaX Team <pageexec@freemail.hu>
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 14:00:46 -04:00
Steven Rostedt (Red Hat) 02f2f7646f tracing: Allow irq/preempt tracers to be used by instances
The irqsoff, preemptoff and preemptirqsoff tracers can now be used by
instances. But they may only be used by one instance at a time (including
the top level directory). This allows multiple tracers to run while the
irqsoff (and friends) tracer is running simultaneously.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:29 -04:00
Steven Rostedt (Red Hat) 65daaca7c6 tracing: Allow wakeup tracers to be used by instances
The wakeup and wakeup_rt tracers can now be used by instances.
But they may only be used by one instance at a time (including the
top level directory). This allows multiple tracers to run while
the wakeup tracer is running simultaneously.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:28 -04:00
Steven Rostedt (Red Hat) 0b9b12c1b8 tracing: Move ftrace_max_lock into trace_array
In preparation for having tracers enabled in instances, the max_lock
should be unique as updating the max for one tracer is a separate
operation than updating it for another tracer using a different max.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:27 -04:00
Steven Rostedt (Red Hat) 6d9b3fa5e7 tracing: Move tracing_max_latency into trace_array
In preparation for letting the latency tracers be used by instances,
remove the global tracing_max_latency variable and add a max_latency
field to the trace_array that the latency tracers will now use.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:26 -04:00
Steven Rostedt (Red Hat) 4104d326b6 ftrace: Remove global function list and call function directly
Instead of having a list of global functions that are called,
as only one global function is allow to be enabled at a time, there's
no reason to have a list.

Instead, simply have all the users of the global ops, use the global ops
directly, instead of registering their own ftrace_ops. Just switch what
function is used before enabling the function tracer.

This removes a lot of code as well as the complexity involved with it.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-21 13:59:25 -04:00
Linus Torvalds 7d77879bfd This contains two fixes.
The first is to remove a duplication of creating debugfs files that
 already exist and causes an error report to be printed due to the
 failure of the second creation.
 
 The second is a memory leak fix that was introduced in 3.14.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTUGZwAAoJEKQekfcNnQGu7W8IAIAMBVfrWdP6cmGle4tGfhVE
 sHcwqTH+07oANQJ3eFwFs5wBMb08s3hXwUHUxXcpjyq2Bs+AHr0vSL/nqCG4k8Ap
 2T4ntL7esC1BWKw2lVVVYD12FiL7grUXVlx/q0WE2NuhCzWzNRTyb8sKrPoCRUEB
 3o5rAt9+45PKUb2k/eqGBGhK8b4XDz2Wtk5Gj6YB3xttse/yjjcuw0gWMHN1JWfm
 eRuQUUBDDGUGkfF98k1aLrjPZooT3LIAV8L8md5C3ebEcXSC/h86hTYCGXv3oBDO
 8sxcT0zoQcLuFhjkYLL1J1lBW6gxaVh052jYmQwMppQMos+WID2un2E92Ccg49E=
 =BwLF
 -----END PGP SIGNATURE-----

Merge tag 'trace-fixes-v3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "This contains two fixes.

  The first is to remove a duplication of creating debugfs files that
  already exist and causes an error report to be printed due to the
  failure of the second creation.

  The second is a memory leak fix that was introduced in 3.14"

* tag 'trace-fixes-v3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/uprobes: Fix uprobe_cpu_buffer memory leak
  tracing: Do not try to recreated toplevel set_ftrace_* files
2014-04-18 10:16:43 -07:00
zhangwei(Jovi) 6ea6215fe3 tracing/uprobes: Fix uprobe_cpu_buffer memory leak
Forgot to free uprobe_cpu_buffer percpu page in uprobe_buffer_disable().

Link: http://lkml.kernel.org/p/534F8B3F.1090407@huawei.com

Cc: stable@vger.kernel.org # v3.14+
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-17 10:44:42 -04:00
Steven Rostedt (Red Hat) 5d6c97c559 tracing: Do not try to recreated toplevel set_ftrace_* files
With the restructing of the function tracer working with instances, the
"top level" buffer is a bit special, as the function tracing is mapped
to the same set of filters. This is done by using a "global_ops" descriptor
and having the "set_ftrace_filter" and "set_ftrace_notrace" map to it.

When an instance is created, it creates the same files but its for the
local instance and not the global_ops.

The issues is that the local instance creation shares some code with
the global instance one and we end up trying to create th top level
"set_ftrace_*" files twice, and on boot up, we get an error like this:

 Could not create debugfs 'set_ftrace_filter' entry
 Could not create debugfs 'set_ftrace_notrace' entry

The reason they failed to be created was because they were created
twice, and the second time gives this error as you can not create the
same file twice.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-16 19:21:53 -04:00
Linus Torvalds 5166701b36 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
 "The first vfs pile, with deep apologies for being very late in this
  window.

  Assorted cleanups and fixes, plus a large preparatory part of iov_iter
  work.  There's a lot more of that, but it'll probably go into the next
  merge window - it *does* shape up nicely, removes a lot of
  boilerplate, gets rid of locking inconsistencie between aio_write and
  splice_write and I hope to get Kent's direct-io rewrite merged into
  the same queue, but some of the stuff after this point is having
  (mostly trivial) conflicts with the things already merged into
  mainline and with some I want more testing.

  This one passes LTP and xfstests without regressions, in addition to
  usual beating.  BTW, readahead02 in ltp syscalls testsuite has started
  giving failures since "mm/readahead.c: fix readahead failure for
  memoryless NUMA nodes and limit readahead pages" - might be a false
  positive, might be a real regression..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  missing bits of "splice: fix racy pipe->buffers uses"
  cifs: fix the race in cifs_writev()
  ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
  kill generic_file_buffered_write()
  ocfs2_file_aio_write(): switch to generic_perform_write()
  ceph_aio_write(): switch to generic_perform_write()
  xfs_file_buffered_aio_write(): switch to generic_perform_write()
  export generic_perform_write(), start getting rid of generic_file_buffer_write()
  generic_file_direct_write(): get rid of ppos argument
  btrfs_file_aio_write(): get rid of ppos
  kill the 5th argument of generic_file_buffered_write()
  kill the 4th argument of __generic_file_aio_write()
  lustre: don't open-code kernel_recvmsg()
  ocfs2: don't open-code kernel_recvmsg()
  drbd: don't open-code kernel_recvmsg()
  constify blk_rq_map_user_iov() and friends
  lustre: switch to kernel_sendmsg()
  ocfs2: don't open-code kernel_sendmsg()
  take iov_iter stuff to mm/iov_iter.c
  process_vm_access: tidy up a bit
  ...
2014-04-12 14:49:50 -07:00
Linus Torvalds 0a7418f5f5 This includes the final patch to clean up and fix the issue with the
design of tracepoints and how a user could register a tracepoint
 and have that tracepoint not be activated but no error was shown.
 
 The design was for an out of tree module but broke in tree users.
 The clean up was to remove the saving of the hash table of tracepoint
 names such that they can be enabled before they exist (enabling
 a module tracepoint before that module is loaded). This added more
 complexity than needed. The clean up was to remove that code and
 just enable tracepoints that exist or fail if they do not.
 
 This removed a lot of code as well as the complexity that it brought.
 As a side effect, instead of registering a tracepoint by its name,
 the tracepoint needs to be registered with the tracepoint descriptor.
 This removes having to duplicate the tracepoint names that are
 enabled.
 
 The second patch was added that simplified the way modules were
 searched for.
 
 This cleanup required changes that were in the 3.15 queue as well as
 some changes that were added late in the 3.14-rc cycle. This final
 change waited till the two were merged in upstream and then the
 change was added and full tests were run. Unfortunately, the
 test found some errors, but after it was already submitted to the
 for-next branch and not to be rebased. Sparse errors were detected
 by Fengguang Wu's bot tests, and my internal tests discovered that
 the anonymous union initialization triggered a bug in older gcc compilers.
 Luckily, there was a bugzilla for the gcc bug which gave a work around
 to the problem. The third and fourth patch handled the sparse error
 and the gcc bug respectively.
 
 A final patch was tagged along to fix a missing documentation for
 the README file.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJTR+pwAAoJEKQekfcNnQGuvfoH/A4XZu4/1h2ZuKhzGi6lrrWr
 +zHUQ+JmGiAYRziQFwr2t/gqJ2vmDfHJnbDjKi6Emx8JcxesHas6CQOWps4zEic0
 dwYSQjvuGNGFIFt+7I0K1OxfVVdt2PQ2lVrB5WgYdbash5J4Bi+09QBv0RbUKheo
 37dKSeN3pbsuQsR70OTVP8laG3dA9IbHW7PsKnxIEB5zeIUHUBME/QdPPj/CuJwk
 wxZjXC2dbc3rdRlQjTVtWV3ZkGgZJB0k+JxjvZTA0N6u8Hj8LiFPuNawzf7ceBHx
 gc++57+WuMW0f0X/ar5/+3UPGFQKMSvKmdxIQCnWXQz5seTYYKDEx7mTH22fxgg=
 =OgeQ
 -----END PGP SIGNATURE-----

Merge tag 'trace-3.15-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull more tracing updates from Steven Rostedt:
 "This includes the final patch to clean up and fix the issue with the
  design of tracepoints and how a user could register a tracepoint and
  have that tracepoint not be activated but no error was shown.

  The design was for an out of tree module but broke in tree users.  The
  clean up was to remove the saving of the hash table of tracepoint
  names such that they can be enabled before they exist (enabling a
  module tracepoint before that module is loaded).  This added more
  complexity than needed.  The clean up was to remove that code and just
  enable tracepoints that exist or fail if they do not.

  This removed a lot of code as well as the complexity that it brought.
  As a side effect, instead of registering a tracepoint by its name, the
  tracepoint needs to be registered with the tracepoint descriptor.
  This removes having to duplicate the tracepoint names that are
  enabled.

  The second patch was added that simplified the way modules were
  searched for.

  This cleanup required changes that were in the 3.15 queue as well as
  some changes that were added late in the 3.14-rc cycle.  This final
  change waited till the two were merged in upstream and then the change
  was added and full tests were run.  Unfortunately, the test found some
  errors, but after it was already submitted to the for-next branch and
  not to be rebased.  Sparse errors were detected by Fengguang Wu's bot
  tests, and my internal tests discovered that the anonymous union
  initialization triggered a bug in older gcc compilers.  Luckily, there
  was a bugzilla for the gcc bug which gave a work around to the
  problem.  The third and fourth patch handled the sparse error and the
  gcc bug respectively.

  A final patch was tagged along to fix a missing documentation for the
  README file"

* tag 'trace-3.15-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Add missing function triggers dump and cpudump to README
  tracing: Fix anonymous unions in struct ftrace_event_call
  tracepoint: Fix sparse warnings in tracepoint.c
  tracepoint: Simplify tracepoint module search
  tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints
2014-04-12 13:06:10 -07:00
Al Viro a786c06d9f missing bits of "splice: fix racy pipe->buffers uses"
that commit has fixed only the parts of that mess in fs/splice.c itself;
there had been more in several other ->splice_read() instances...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-04-12 07:04:19 -04:00
Steven Rostedt (Red Hat) 17a280ea81 tracing: Add missing function triggers dump and cpudump to README
The debugfs tracing README file lists all the function triggers except for
dump and cpudump. These should be added too.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-10 22:43:37 -04:00
Mathieu Desnoyers abb43f6998 tracing: Fix anonymous unions in struct ftrace_event_call
gcc <= 4.5.x has significant limitations with respect to initialization
of anonymous unions within structures. They need to be surrounded by
brackets, _and_ they need to be initialized in the same order in which
they appear in the structure declaration.

Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676
Link: http://lkml.kernel.org/r/1397077568-3156-1-git-send-email-mathieu.desnoyers@efficios.com

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-09 20:02:55 -04:00
Mathieu Desnoyers de7b297390 tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints
Register/unregister tracepoint probes with struct tracepoint pointer
rather than tracepoint name.

This change, which vastly simplifies tracepoint.c, has been proposed by
Steven Rostedt. It also removes 8.8kB (mostly of text) to the vmlinux
size.

From this point on, the tracers need to pass a struct tracepoint pointer
to probe register/unregister. A probe can now only be connected to a
tracepoint that exists. Moreover, tracers are responsible for
unregistering the probe before the module containing its associated
tracepoint is unloaded.

   text    data     bss     dec     hex filename
10443444        4282528 10391552        25117524        17f4354 vmlinux.orig
10434930        4282848 10391552        25109330        17f2352 vmlinux

Link: http://lkml.kernel.org/r/1396992381-23785-2-git-send-email-mathieu.desnoyers@efficios.com

CC: Ingo Molnar <mingo@kernel.org>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Frank Ch. Eigler <fche@redhat.com>
CC: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
[ SDR - fixed return val in void func in tracepoint_module_going() ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-04-08 20:43:28 -04:00
Linus Torvalds 26c12d9334 Merge branch 'akpm' (incoming from Andrew)
Merge second patch-bomb from Andrew Morton:
 - the rest of MM
 - zram updates
 - zswap updates
 - exit
 - procfs
 - exec
 - wait
 - crash dump
 - lib/idr
 - rapidio
 - adfs, affs, bfs, ufs
 - cris
 - Kconfig things
 - initramfs
 - small amount of IPC material
 - percpu enhancements
 - early ioremap support
 - various other misc things

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (156 commits)
  MAINTAINERS: update Intel C600 SAS driver maintainers
  fs/ufs: remove unused ufs_super_block_third pointer
  fs/ufs: remove unused ufs_super_block_second pointer
  fs/ufs: remove unused ufs_super_block_first pointer
  fs/ufs/super.c: add __init to init_inodecache()
  doc/kernel-parameters.txt: add early_ioremap_debug
  arm64: add early_ioremap support
  arm64: initialize pgprot info earlier in boot
  x86: use generic early_ioremap
  mm: create generic early_ioremap() support
  x86/mm: sparse warning fix for early_memremap
  lglock: map to spinlock when !CONFIG_SMP
  percpu: add preemption checks to __this_cpu ops
  vmstat: use raw_cpu_ops to avoid false positives on preemption checks
  slub: use raw_cpu_inc for incrementing statistics
  net: replace __this_cpu_inc in route.c with raw_cpu_inc
  modules: use raw_cpu_write for initialization of per cpu refcount.
  mm: use raw_cpu ops for determining current NUMA node
  percpu: add raw_cpu_ops
  slub: fix leak of 'name' in sysfs_slab_add
  ...
2014-04-07 16:38:06 -07:00