[Xorp-hackers] Crash due to stale cached xrl sender pointer.

Bruce Simpson bms at incunabulum.net
Sun Nov 8 06:02:47 PST 2009


The attached patch changes the allocation semantics of the cached Xrl 
pointer to be per-client-instance rather than per-library-instance.

It does this by modifying the Xrl client stub generator to include a 
named auto_ptr<Xrl> for each method in the Xrl client stub.
Sizes are for FreeBSD 7.2/amd64 with gcc 4.2.1. Hopefully it should give 
others a feel for working with the code generator.

    This probably doesn't fix the original race, but it may make it 
easier to mitigate some other way, by giving us a handle on the cached 
Xrls which are causing the problem, rather than letting them dangle in 
the BSS segment. That, and it should fix the glaringly obvious memory 
leakage, and non-reentrancy, caused by polluting a library with 'static 
Xrl*' instances.

    I've tested this briefly with the current SVN tree, just by running 
a xorp_finder, call_xrl, and a xorp_rib. This exercises the reentrancy 
of the XRL client stubs, because the Finder Client is instantiated once 
for every XrlRouter. The RIB uses libfeaclient, which instantiates its 
*own* XrlRouter instance (yes, you can have more than one per 
process...) for getting interface updates from the FEA.

    This patch does increase the code size of the stubs very slightly; 
that's the trade-off. As we are no longer holding the cached 'static 
Xrl*' in the BSS segment, it shrinks versus the code segment; 
auto_ptr<T> doesn't use any additional data storage above the pointer 
type, but we are now checking allocations and deallocations of Xrl for 
the stubs.

       When building with shared libraries, the template expansions for 
auto_ptr are a lost opportunity for coalescing with the linker, although 
this only wastes ~128 bytes per library (for auto_ptr<Xrl>::get() and 
auto_ptr<Xrl>::reset()). Most of this seems to be NOP padding for cache 
line alignment.

    I'm still not 100% happy with how the XrlPFSender cache mechanism 
works. Because we're holding a pointer somewhere, to something whose 
lifecycle is managed somewhere else, there really isn't any other answer 
than the one you've already suggested.
    I'm not that happy about using ref_ptr<T> to do it, for much the 
same reasons as I've already described earlier in this thread -- there 
is no clean syntax for observing a ref_ptr vs Boost's shared_ptr/weak_ptr.
    I'm not at all happy about using ref_ptr<T>&, because it is all too 
easy to ignore its specific meaning and introduce problems. Certainly, 
the only reason I did it in Spt, was because that class was already 
using ref_ptr<T> internally to track nodes in a container.
    Granted, this is code fairly deep down in the core of the tree, 
which many developers would never need to touch directly, but that makes 
it even more important to be careful.

    In Thrift, the binary blobs themselves can be decoupled from where 
they go. There is a potential chicken-and-egg problem if we support 
multiple TProtocol types, where we would need to know the sender before 
the stubs create the blob; if we just speak TBinaryProtocol to 
everything, we don't have this problem, but it does mean we can't just 
tell the XORP RPC endpoints 'speak JSON to this guy', 'speak XML-RPC to 
this guy', 'speak AMQP to this guy' etc.

    So the idea of caching the transport we'd prefer to transmit from, 
is still one that bears further scrutiny, even in a re-spin. The 
difference is, for maximum flexibility, Thrifted XRL stubs would 
actually want to see XrlPFSender's equivalent upfront, before 
XrlSender::send() is even called.

    This is after scrutinizing libxipc even further this week, and 
realizing most of what's there is to support asynchrony.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sizes-xif-autoptr.txt
Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20091108/fccc60fe/attachment.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sizes-xif-noautoptr.txt
Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20091108/fccc60fe/attachment-0001.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: xrl-auto-ptr.diff
Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20091108/fccc60fe/attachment.ksh 


More information about the Xorp-hackers mailing list