linux-stable-rt/tools/perf/util
Arnaldo Carvalho de Melo aece948f5d perf evlist: Fix per thread mmap setup
The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
--pid to monitor multithreaded apps, because when not doing per cpu we
can only share a ring buffer among events on the same thread.

Fix it by using per thread ring buffers.

Tested with:

[root@felicio ~]# tuna -t 26131 -CP | nl
  1                      thread       ctxt_switches
  2    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
  3 26131   OTHER     0      0,1  10814276      2397830 chromium-browse
  4  642    OTHER     0      0,1     14688            0 chromium-browse
  5  26148  OTHER     0      0,1    713602       115479 chromium-browse
  6  26149  OTHER     0      0,1    801958         2262 chromium-browse
  7  26150  OTHER     0      0,1   1271128          248 chromium-browse
  8  26151  OTHER     0      0,1         3            0 chromium-browse
  9  27049  OTHER     0      0,1     36796            9 chromium-browse
 10  618    OTHER     0      0,1     14711            0 chromium-browse
 11  661    OTHER     0      0,1     14593            0 chromium-browse
 12  29048  OTHER     0      0,1     28125            0 chromium-browse
 13  26143  OTHER     0      0,1   2202789          781 chromium-browse
[root@felicio ~]#

So 11 threads under pid 26131, then:

[root@felicio ~]# perf record -F 50000 --pid 26131

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
  1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
 10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
 11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
[root@felicio ~]#

11 mmaps: since we didn't specify any CPU list, we need one mmap per
thread. And:

[root@felicio ~]# perf record -F 50000 --pid 26131
^C[ perf record: Woken up 79 times to write data ]
[ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	 371310 26131
     2	  96516 26148
     3	  95694 26149
     4	  95203 26150
     5	   7291 26143
     6	     87 27049
     7	     76 661
     8	     60 29048
     9	     47 618
    10	     43 642
[root@felicio ~]#

OK, one of the threads, 26151, was quiescent, so there are no samples for
it, but all the others are there.
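
As a rough illustration of what the `grep | cut | sort | uniq -c` pipeline above computes, here is a minimal Python sketch. It assumes only what the `cut` delimiters imply: that each PERF_RECORD_SAMPLE line carries the thread id between the first `/` and the following `:`. The function name `samples_per_tid` is hypothetical and not part of perf.

```python
from collections import Counter


def samples_per_tid(lines):
    """Count PERF_RECORD_SAMPLE lines per thread id, most frequent first.

    Assumption: the tid sits between the first '/' and the next ':',
    which is what `cut -d/ -f2 | cut -d: -f1` extracts.
    """
    counts = Counter()
    for line in lines:
        if "PERF_RECORD_SAMPLE" not in line:
            continue  # same filter as `grep PERF_RECORD_SAMPLE`
        fields = line.split("/")
        if len(fields) < 2:
            continue  # no '/' on this line, so no tid to extract
        tid = fields[1].split(":")[0]
        counts[tid] += 1
    # Equivalent in spirit to `sort -n | uniq -c | sort -nr`
    return counts.most_common()
```

Feeding it the lines printed by `perf report -D` (for example via `sys.stdin`) should give roughly the same per-thread totals as the shell pipeline, although the ordering of threads with equal counts may differ.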

Then, if I specify one CPU:

[root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	   8444 26131
     2	   2584 26149
     3	   2518 26148
     4	   2324 26150
     5	    123 26143
     6	      9 661
     7	      9 29048
[root@felicio ~]#

This machine has two cores, so fewer threads appeared on the radar, and:

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
 1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
[root@felicio ~]#

Just one mmap, as now a single per-cpu buffer is enough, instead of the
per-thread buffers needed in the previous case.

For global profiling:

[root@felicio ~]# perf record -F 50000 -a
^C[ perf record: Woken up 26 times to write data ]
[ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
     1	7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
     2	7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
[root@felicio ~]#

It uses per-cpu buffers, one for each of the two CPUs.

For just one thread:

[root@felicio ~]# perf record -F 50000 --tid 26148
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	   9969 26148
[root@felicio ~]#

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
     1	7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
[root@felicio ~]#
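
The four cases tested above follow one simple rule, which can be sketched in Python. This is only a model of the observed mmap counts, not the code in evlist.c, and the function name `mmaps_needed` and its parameters are hypothetical.

```python
from typing import List, Optional


def mmaps_needed(nr_threads: int, cpus: Optional[List[int]]) -> int:
    """Model of how many perf_event ring buffers get mapped.

    cpus is the CPU list in effect (for -a, all online CPUs);
    None means no CPU list was given.
    """
    if cpus is not None:
        # Per-cpu mode: one ring buffer per CPU, shared by all threads.
        return len(cpus)
    # Per-thread mode: a ring buffer can only be shared by events on the
    # same thread, so each monitored thread needs its own.
    return nr_threads
```

With the numbers from this test run: `mmaps_needed(11, None)` gives 11 for `--pid 26131`, `mmaps_needed(11, [1])` gives 1 for `--cpu 1`, `mmaps_needed(1, [0, 1])` gives 2 for `-a` on this two-core machine (the thread count is irrelevant there), and `mmaps_needed(1, None)` gives 1 for `--tid 26148`.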

Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Lin Ming <ming.m.lin@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-15 10:02:14 -03:00
include
scripting-engines perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
ui perf hists browser: Fix seg fault when annotate null symbol 2011-04-15 12:51:49 -03:00
PERF-VERSION-GEN perf tools: Version incorrect with some versions of grep 2011-03-16 08:59:50 -03:00
abspath.c
alias.c
annotate.c perf symbols: Rename dso->origin to dso->symtab_type 2011-03-11 13:28:45 -03:00
annotate.h perf top: Live TUI Annotation 2011-02-22 12:02:07 -03:00
bitmap.c
build-id.c perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
build-id.h
cache.h
callchain.c
callchain.h
cgroup.c perf: Fix a build error with some GCC versions 2011-04-08 17:40:21 +02:00
cgroup.h
color.c
color.h
config.c
cpumap.c
cpumap.h
ctype.c
debug.c perf tools: Fixup exit path when not able to open events 2011-03-29 13:40:27 -03:00
debug.h perf tools: Fixup exit path when not able to open events 2011-03-29 13:40:27 -03:00
debugfs.c
debugfs.h
environment.c
event.c perf symbols: Fix vsyscall symbol lookup 2011-03-28 14:44:15 -03:00
event.h
evlist.c perf evlist: Fix per thread mmap setup 2011-05-15 10:02:14 -03:00
evlist.h perf evlist: Fix per thread mmap setup 2011-05-15 10:02:14 -03:00
evsel.c perf evsel: Fix use of inherit 2011-04-15 12:52:28 -03:00
evsel.h perf evsel: Fix use of inherit 2011-04-15 12:52:28 -03:00
exec_cmd.c perf tools: Makefile: Remove various and sundry cruft 2011-02-18 07:43:06 -02:00
exec_cmd.h
generate-cmdlist.sh
header.c perf build-id: Add quirk to deal with perf.data file format breakage 2011-03-23 19:29:40 -03:00
header.h perf header: Stop using 'self' 2011-03-10 11:16:28 -03:00
help.c
help.h
hist.c perf tools: Improve support for sessions with multiple events 2011-03-06 13:13:40 -03:00
hist.h perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
hweight.c
levenshtein.c
levenshtein.h
map.c
map.h
pager.c
parse-events.c perf script: Add support for H/W and S/W events 2011-03-14 17:07:20 -03:00
parse-events.h perf script: Add support for H/W and S/W events 2011-03-14 17:07:20 -03:00
parse-options.c
parse-options.h
path.c
probe-event.c perf probe: Fix multiple --vars options behavior 2011-04-05 15:36:04 -03:00
probe-event.h
probe-finder.c perf probe: Fix listing incorrect line number with inline function 2011-04-05 15:38:12 -03:00
probe-finder.h
pstack.c
pstack.h
python.c perf evlist: Fix per thread mmap setup 2011-05-15 10:02:14 -03:00
quote.c
quote.h
run-command.c
run-command.h
session.c perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
session.h perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
setup.py perf tools: Fix NO_NEWT=1 python build error 2011-03-29 16:46:57 -03:00
sigchain.c
sigchain.h
sort.c
sort.h
strbuf.c
strbuf.h
strfilter.c perf: Fix missing strndup declaration 2011-03-04 01:17:18 +01:00
strfilter.h
string.c Fix common misspellings 2011-03-31 11:26:23 -03:00
strlist.c
strlist.h
svghelper.c perf timechart: Fix black idle boxes in the title 2011-02-28 08:56:14 +01:00
svghelper.h
symbol.c perf symbols: Properly align symbol_conf.priv_size 2011-03-29 14:18:39 -03:00
symbol.h perf symbol: Move sym_entry->skip to symbol->ignore 2011-03-11 13:36:01 -03:00
thread.c
thread.h
thread_map.c
thread_map.h
top.c perf top: Remove redundant syme->origin field 2011-03-11 13:28:45 -03:00
top.h perf symbol: Move sym_entry->skip to symbol->ignore 2011-03-11 13:36:01 -03:00
trace-event-info.c
trace-event-parse.c perf script: Move printing of 'common' data from print_event and rename 2011-03-14 17:05:55 -03:00
trace-event-read.c
trace-event-scripting.c perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
trace-event.h perf session: Pass evsel in event_ops->sample() 2011-03-23 19:28:58 -03:00
types.h
usage.c
util.c
util.h perf tools: Makefile: Remove platform-specific cruft 2011-02-18 07:42:07 -02:00
values.c
values.h
wrapper.c
xyarray.c
xyarray.h