[Xorp-hackers] Runtime execution profiling
Bruce Simpson
bms at incunabulum.net
Sun Nov 29 02:39:23 PST 2009
Bruce Simpson wrote:
> I am just getting to grips with FreeBSD's native hardware-based
> profiler, pmc, and I'll post more about that shortly -- I am getting
> some excellent samples out of it. It's similar to oprofile.
>
This worked for me:
%%%
pmcstat -P instructions -P unhalted-cycles -P branches -O txs.pmc -q \
    ./test_xrl_sender
pmcstat -R txs.pmc -F - -q | c++filt > txs.cg
kcachegrind txs.cg
%%%
To use it, I had to check out Fabien Thomas' patches for pmcstat from
gitorious, which are less than a week old:
http://gitorious.org/~fabient/freebsd/fabient-sandbox/commits/work/hwpmc_plugin
One nice feature is that FreeBSD's pmc will go straight into the kernel
and show where we're blocked on socket buffers. In many ways, this is a
little too much information for its own good.
Context switches aren't profiled in the hwpmc call graph at the moment.
There is experimental support for this in hwpmc, but it didn't work for me.
Context switch count is another useful metric, along with the actual
system call count. I was surprised to find that AMD K8s don't have an
MSR for counting SYSCALL/SYSRET instructions; such a counter would have
made syscall counting easy. I suppose, though, that they assume the
kernel will do this.
In the meantime, context switch counts can be found in BSD-compatible
getrusage(), in the ru_nvcsw (voluntary) and ru_nivcsw (involuntary)
fields. Voluntary switches are usually a result of us hitting
select() with no work to do. They can generally be added to
involuntary switches for the purposes of profiling; a switch is a
switch. Linux has this call too, with the same fields; GNU's time(1)
wants -f '%c %w' to show the involuntary and voluntary counts
respectively.
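For what it's worth, here is a minimal sketch of reading those counters
around a piece of work. It is in Python rather than XORP's C++ purely
for brevity, and it assumes a platform where getrusage() fills in
ru_nvcsw and ru_nivcsw (the BSDs and Linux do). The helper names
context_switches and switches_during are made up for illustration.

```python
import resource
import time


def context_switches(who=resource.RUSAGE_SELF):
    """Return (voluntary, involuntary) context switch counts from getrusage()."""
    usage = resource.getrusage(who)
    return usage.ru_nvcsw, usage.ru_nivcsw


def switches_during(work):
    """Run work() and report the context switches this process incurred meanwhile."""
    vol_before, invol_before = context_switches()
    work()
    vol_after, invol_after = context_switches()
    voluntary = vol_after - vol_before
    involuntary = invol_after - invol_before
    # For profiling purposes a switch is a switch, so also report the sum.
    return {
        "voluntary": voluntary,
        "involuntary": involuntary,
        "total": voluntary + involuntary,
    }


if __name__ == "__main__":
    # Sleeping blocks the process, which normally shows up as a voluntary switch.
    print(switches_during(lambda: time.sleep(0.01)))
```

The counters are per-process and cumulative, so taking a before/after
difference is the simplest way to attribute switches to one test run.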
memprof can allegedly be coaxed to provide heap operation stats in
callgrind style; there's a conversion script for this. I can't for the
life of me get memprof to build on a BSD box.
mpatrol looks more portable there. It's had years to mature, and it
looks like one of the better heap profiling tools (XORP used to use
LeakTrace, but it's badly bitrotted). Pity valgrind's memcheck/massif
don't do callgrind output.
I found mpatrol, when used as an LD_PRELOAD, can have some problems with
symbol retrieval. I'll see if adding it to the link line (like Google's
CPU profiler) can overcome this issue.
cheers,
BMS