[Xorp-hackers] Runtime execution profiling

Bruce Simpson bms at incunabulum.net
Sun Nov 29 02:39:23 PST 2009


Bruce Simpson wrote:
> I am just getting to grips with FreeBSD's native hardware-based 
> profiler, pmc, and I'll post more about that shortly -- I am getting 
> some excellent samples out of it. It's similar to oprofile.
>   

This worked for me:
%%%
pmcstat -P instructions -P unhalted-cycles -P branches -O txs.pmc -q ./test_xrl_sender
pmcstat -R txs.pmc -F - -q | c++filt > txs.cg
kcachegrind txs.cg
%%%

To use it, I had to check out Fabien Thomas' patches for pmcstat, which 
are less than a week old, from gitorious:
    
http://gitorious.org/~fabient/freebsd/fabient-sandbox/commits/work/hwpmc_plugin

One nice feature is that FreeBSD's pmc will go straight into the kernel 
and show where we're blocked on socket buffers. In many ways, this is a 
little too much information for its own good.

Context switches aren't profiled in the hwpmc call graph at the moment. 
There is experimental support for this in hwpmc, but it didn't work for me.
The context switch count is another useful metric, along with the actual 
system call count. I was surprised to find that the AMD K8s don't have an 
MSR for counting SYSCALL/SYSRET instructions, which would have made 
counting system calls easy. I suppose, though, that they assume the 
kernel will do this.

In the meantime, context switch counts can be found via the BSD-compatible 
getrusage(), in the ru_nvcsw (voluntary) and ru_nivcsw (involuntary) 
fields. Voluntary switches are usually a result of us hitting select() 
with no work to do. They can generally be added to involuntary switches 
for the purposes of profiling; a switch is a switch. Linux has this call 
too, with the same fields; GNU's time(1) wants -f '%c %w' to show these 
(involuntary and voluntary, respectively).
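
As a rough illustration of the getrusage() approach, here is a minimal 
sketch in Python, whose resource module wraps the same call. The helper 
names switch_counts() and switches_during() are made up for this example 
and are not part of XORP or any library. It assumes a platform that 
actually fills in ru_nvcsw and ru_nivcsw.

```python
import resource
import time


def switch_counts(who=resource.RUSAGE_SELF):
    """Return (voluntary, involuntary) context switch counts so far.

    Hypothetical helper: a thin wrapper around getrusage().
    """
    usage = resource.getrusage(who)
    return usage.ru_nvcsw, usage.ru_nivcsw


def switches_during(workload):
    """Run workload() and return the total switches counted meanwhile.

    Voluntary and involuntary switches are summed, on the view that
    for profiling purposes a switch is a switch.
    """
    vol_before, invol_before = switch_counts()
    workload()
    vol_after, invol_after = switch_counts()
    return (vol_after - vol_before) + (invol_after - invol_before)


if __name__ == "__main__":
    # Sleeping blocks the process, which normally counts as a
    # voluntary switch; the exact number depends on the scheduler.
    print(switches_during(lambda: time.sleep(0.01)))
```

The counters are per-process totals, so taking a before/after delta 
around the code of interest is what makes them usable as a profile 
metric.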

memprof can allegedly be coaxed into providing heap operation stats in 
callgrind style; there's a conversion script for this. However, I can't 
for the life of me get memprof to build on a BSD box. mpatrol looks more 
portable there. It's had years to mature, and it looks like one of the 
better heap profiling tools (XORP used to use LeakTrace, but it's badly 
bitrotted). Pity valgrind's memcheck/massif don't do callgrind output.

I found that mpatrol, when used as an LD_PRELOAD, can have some problems 
with symbol retrieval. I'll see if adding it to the link line (like 
Google's CPU profiler) can overcome this issue.

cheers,
BMS


