[Xorp-users] rip on freebsd

Bruce Simpson bms at incunabulum.net
Sat May 9 11:42:51 PDT 2009


John Hay wrote:
> The wild guess was a good one. I stopped xorp, added a default route
> and then started xorp again and rip was working. I did it a few times
> and it started every time.
>
>   

OK. I am trying to think what the problem could be. It has been many
months since I touched anything outside of XORP which could have
affected this.


> I also tried with adding a static default route in the xorp config, it
> does add the route to the kernel, but rip does not work and according
> to sockstat it does not listen on port 520. Maybe it happens too late?
>
> So what now?
>   

Can you confirm that the sockets actually exist at this point?
Use 'sockstat -4 | grep 520' to confirm.

Can you confirm that the box has a multicast membership for RIP on the 
interface(s) where you expect to see them?
Use 'netstat -g' in FreeBSD versions 5.x-7.x to confirm, or better 
still, 'ifmcstat'.  This is still stabbing in the dark; I do not know 
why the FEA is saying the RIP sockets don't exist at this point.

This is the interesting message:

[ 2009/05/07 12:10:42  ERROR xorp_fea:865 LIBXORP +714 asyncio.cc complete_transfer ] Write error 51

51 is ENETUNREACH (grep 51 /usr/include/errno.h). Presumably the FEA
closes the socket when it hits this error. The message should be made
more verbose in asyncio.cc by calling strerror() or similar on POSIX
platforms. Patches welcome.
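
As an illustration of what a more verbose message could look like, here
is a minimal sketch. It is written in Python rather than the C++ of
asyncio.cc, and the function name describe_write_error is hypothetical
(it is not part of XORP). It only shows the idea of pairing the raw
errno value with its symbolic name and strerror() text.

```python
import errno
import os


def describe_write_error(err: int) -> str:
    """Return a log-friendly description of an errno value.

    Hypothetical helper for illustration only; not XORP code.
    """
    # errno.errorcode maps the number to its symbolic name on this
    # platform; fall back to a placeholder for undefined values.
    name = errno.errorcode.get(err, "UNKNOWN")
    # os.strerror() wraps the C library's strerror().
    return f"Write error {err} ({name}): {os.strerror(err)}"


# Numeric errno values differ between platforms (ENETUNREACH is 51 on
# FreeBSD), so look the constant up by name instead of hard-coding it.
print(describe_write_error(errno.ENETUNREACH))
```

The exact strerror() text is platform dependent, so the printed line
will vary between systems.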

As I mentioned in my previous reply, 'ktrace' is pretty much needed to
find out exactly what the FEA is doing when this error is hit -- and
where the sendto() is going. Because XORP processes are children of the
Router Manager, you will need to intercept the FEA being started. You
can do this by hand if you have good reaction times, or just write a
script to do it. I believe I had a script somewhere to jump in and trap
XORP process creation with gdb, but I'd have to hunt for it.
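
A script of that kind could look roughly like the sketch below. This is
not the gdb script mentioned above. It is a hedged Python example that
polls for the FEA process and attaches ktrace to it. It assumes the
process is named xorp_fea (as in the log line), that pgrep and ktrace
are in the PATH, and the helper names and trace file name are
hypothetical. Because it polls, the first few system calls of the FEA
may be missed.

```python
import subprocess
import time
from typing import Callable, Optional


def find_pid_by_name(name: str) -> Optional[int]:
    """Return the newest PID whose process name matches exactly."""
    # pgrep -n prints only the most recently started match;
    # -x requires the process name to match exactly.
    result = subprocess.run(["pgrep", "-n", "-x", name],
                            capture_output=True, text=True)
    out = result.stdout.strip()
    return int(out) if result.returncode == 0 and out else None


def wait_for_pid(find: Callable[[], Optional[int]],
                 timeout: float = 30.0,
                 interval: float = 0.05) -> Optional[int]:
    """Poll find() until it returns a PID or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        pid = find()
        if pid is not None:
            return pid
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


def attach_ktrace(pid: int, trace_file: str = "fea.ktrace") -> None:
    """Enable kernel tracing on an already-running process."""
    # -i: children inherit tracing; -f: trace file; -p: target PID.
    subprocess.run(["ktrace", "-i", "-f", trace_file, "-p", str(pid)],
                   check=True)


def main() -> None:
    # Assumed process name; adjust if your build names it differently.
    pid = wait_for_pid(lambda: find_pid_by_name("xorp_fea"))
    if pid is None:
        raise SystemExit("xorp_fea did not appear before the timeout")
    attach_ktrace(pid)
    print(f"tracing pid {pid}; inspect with: kdump -f fea.ktrace")
```

Call main() as root just before starting the Router Manager, then look
for the failing sendto() in the kdump output.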
 

... Are your interfaces configured when you run XORP, or do you rely 
wholly on XORP to configure your interfaces?
... What is the output of 'ifconfig -a' before, after, and during XORP 
run time?
... Are your interfaces UP, RUNNING and MULTICAST?
I usually test with msk(4) myself, and haven't seen issues like this, 
although I haven't done in-depth testing since the XORP 1.5 release cycle.
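
If it helps to compare the 'ifconfig -a' output from before, during and
after the run, the flag check can be automated. The following is a
minimal sketch only: it assumes the usual "name: flags=NNNN<A,B,C>"
line format, and the function names are hypothetical.

```python
import re
from typing import Dict, Set

REQUIRED_FLAGS = {"UP", "RUNNING", "MULTICAST"}

# Matches interface header lines such as:
#   msk0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0
_IF_LINE = re.compile(r"^(\S+): flags=[0-9a-fA-F]+<([^>]*)>",
                      re.MULTILINE)


def parse_interface_flags(ifconfig_output: str) -> Dict[str, Set[str]]:
    """Map each interface name to the set of flags shown for it."""
    return {name: set(filter(None, flags.split(",")))
            for name, flags in _IF_LINE.findall(ifconfig_output)}


def missing_flags(ifconfig_output: str) -> Dict[str, Set[str]]:
    """Return, per interface, the required flags that are absent.

    Interfaces that have all of UP, RUNNING and MULTICAST are omitted.
    """
    result = {}
    for name, flags in parse_interface_flags(ifconfig_output).items():
        absent = REQUIRED_FLAGS - flags
        if absent:
            result[name] = absent
    return result
```

Feed it the captured text of 'ifconfig -a'; an empty result means every
listed interface has all three flags set.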

Based on the information to date, and in the absence of reproducing the 
issue, my best guess is that this is a possible initialization race 
between the FEA and RIP modules.

I know that XORP is still using the old Steve Deering-era 
IP_ADD_MEMBERSHIP socket options for multicast, which, whilst reasonably 
portable, have dire problems if you are lacking IPv4 addresses 
on the link(s) you want to use. There seems to be some misunderstanding 
out there about what multicast is and how it is expected to work, and it 
just plain breaks (in any stack) if certain steps aren't followed.
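
To show why the lack of an IPv4 address matters, here is a minimal
sketch of a join using IP_ADD_MEMBERSHIP. It is Python rather than
XORP's own code, the helper names are hypothetical, and 224.0.0.9 is
the RIPv2 multicast group. The point to notice is that struct ip_mreq
identifies the interface by its IPv4 address, so there is nothing to
put in the second field if the link has no address.

```python
import socket
import struct

RIP_GROUP = "224.0.0.9"  # RIPv2 multicast group address


def build_ip_mreq(group: str, interface_addr: str) -> bytes:
    """Pack a struct ip_mreq.

    Layout: the multicast group address followed by the IPv4 address
    of the local interface to join on.
    """
    return struct.pack("4s4s",
                       socket.inet_aton(group),
                       socket.inet_aton(interface_addr))


def join_rip_group(sock: socket.socket, interface_addr: str) -> None:
    """Join the RIP group on the interface that owns interface_addr."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_ip_mreq(RIP_GROUP, interface_addr))
```

join_rip_group() expects an AF_INET datagram socket and an address that
is actually configured on the interface; otherwise the setsockopt()
call will normally fail.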

This hasn't been an issue with XORP, because XORP does not currently 
support the equivalent of 'ip unnumbered' -- and there are still a number of 
places in the code which assume each interface has an IPv4 network layer 
address of some kind, be that private, link-scope or whatever.

With IPv6, issues of the kind seen with the IPv4 basic multicasting API, 
and IGMP issues, just don't exist, as link-local addresses are normally 
available, except during DAD.

thanks,
BMS


