[Xorp-hackers] Crash due to stale cached xrl sender pointer.

Mon Nov 9 09:13:17 PST 2009

Bruce Simpson wrote:
> It would be good to know if changing the Xrl allocation in this way 
> helps the situation with the race you saw...
>   

A bit more on the 'static Xrl*' mechanism.

This is a lazy allocation mechanism, which amortizes the cost of 
instantiating class Xrl. The change I posted yesterday just moves the 
pointer to this allocation into the XrlFooClient itself, where we can 
manage its lifecycle; the semantics of its use don't change.

Some data points:
 * There is only one instance of cached 'class Xrl*' for each 
XrlFooClient method.
 * Each method's wrapper lazy-allocates it once per runtime, when the 
method wrapper is called.
    * Without my change, this lazy allocation is non-reentrant, and 
happens once per runtime.
    * With my change, it happens once per method call, for each 
XrlFooClient instance, which is reentrant (but still not thread-safe).
 * It sets up the Xrl::args() field. That's it. No additional 
amortization of up-front costs of sending an XRL.
 * If the consumer of XrlFooClient::send_foo(target, arg) changes its 
'target' argument, the resolved XrlSender is very likely to change, 
forcing a FinderDBEntry lookup in XrlRouter::send() anyway.
 * No upfront resolution is performed when the target changes.

In practice, this optimization probably doesn't offer much, but it's 
better than nothing. class Xrl is still an intermediate representation, 
and until the Xrl is actually sent on the wire, it doesn't get 'packed' 
into its on-wire representation.

There is a lost optimization opportunity here. XRL clients which use the 
C++ client stubs never need to tunnel XRLs through the Finder, so they 
can always use binary representation. However, there is nowhere to put 
it, until we actually hit the output buffers for the destination 
transport.  [1]

I haven't measured cycle-for-cycle yet the costs of sending an XRL. I 
probably will do this when profiling the Thrift run; I'm still in 
analysis, trying to come up with the least invasive change to avoid 
rewriting code I don't/won't need to.

Thread safety hasn't really been a consideration to date. I'm making 
notes, but I'm not going to substantially change the behaviour towards 
making existing code thread-safe.

cheers,
BMS

[1] We can probably leverage this opportunity using Thrift, as we can 
then render the binary representation upfront. We can also use a mixin 
with TTransport to do scatter/gather I/O with multiple TMemoryBuffers, 
so we only hit writev() or send() to keep syscall count minimal. XRL 
will currently try to do this, but it's tied to stream semantics.