[Xorp-users] rip on freebsd
Bruce Simpson
bms at incunabulum.net
Sat May 9 11:42:51 PDT 2009
John Hay wrote:
> The wild guess was a good one. I stopped xorp, added a default route
> and then started xorp again and rip was working. I did it a few times
> and it started every time.
>
>
OK. I am trying to think what the problem could be. It has been many
months since I touched anything outside of XORP which could have
affected this.
> I also tried with adding a static default route in the xorp config, it
> does add the route to the kernel, but rip does not work and according
> to sockstat it does not listen on port 520. Maybe it happens too late?
>
> So what now?
>
Can you confirm that the sockets actually exist at this point?
Use 'sockstat -4 | grep 520' to confirm.
Can you confirm that the box has a multicast membership for RIP on the
interface(s) where you expect to see them?
Use 'netstat -g' in FreeBSD versions 5.x-7.x to confirm, or better
still, 'ifmcstat'. This is still stabbing in the dark; I do not know
why the FEA is saying the RIP sockets don't exist at this point.
This is the interesting message:
[ 2009/05/07 12:10:42 ERROR xorp_fea:865 LIBXORP +714 asyncio.cc complete_transfer ] Write error 51
51 is ENETUNREACH (grep 51 /usr/include/errno.h). Presumably the FEA
closes the socket when it hits this error. asyncio.cc should make this
message verbose by calling strerror() or similar on POSIX platforms;
patches welcome.
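
To illustrate the kind of message meant here, the following is a minimal
Python sketch, not a patch against asyncio.cc. The function name
describe_errno is made up for the example; os.strerror() is Python's
wrapper around the C library strerror().

```python
import errno
import os


def describe_errno(errnum: int) -> str:
    """Return 'NAME (number): message' for a raw errno value."""
    # errno.errorcode maps numbers to symbolic names on the running platform.
    name = errno.errorcode.get(errnum, "UNKNOWN")
    # os.strerror() returns the C library's text for the error number.
    return f"{name} ({errnum}): {os.strerror(errnum)}"


if __name__ == "__main__":
    # The numeric value is platform specific (51 on FreeBSD), so the
    # symbolic constant is used here instead of a hard-coded number.
    print("Write error", describe_errno(errno.ENETUNREACH))
```

A log line carrying the name and text as well as the number would save the
reader a trip to errno.h.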
As I mentioned in my previous reply, 'ktrace' is pretty much needed to
find out exactly what the FEA is doing when this error is hit -- and
where the sendto() is going. Because XORP processes are children of the
Router Manager, you will need to intercept the FEA being started. You
can do this by hand if you have good reaction times, or just write a
script to do it. I believe I had a script somewhere to jump in and trap
XORP process creation with gdb, but I'd have to hunt for it.
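
A rough sketch of such a script follows, in Python. It is not the gdb
script mentioned above. It assumes the process is named xorp_fea (as in
the log line), that pgrep and ktrace are on the PATH, and that the trace
file name fea.ktrace is acceptable; the helper names are invented for the
example. Because it polls, it can miss the FEA's very first system calls.

```python
import subprocess
import time


def find_pid(name: str):
    """Return the PID of the first process named exactly `name`, or None."""
    # pgrep -x requires an exact match of the process name.
    result = subprocess.run(["pgrep", "-x", name], capture_output=True, text=True)
    pids = result.stdout.split()
    return int(pids[0]) if pids else None


def wait_for_pid(finder, interval=0.05, timeout=30.0):
    """Poll `finder` until it returns a PID or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        pid = finder()
        if pid is not None:
            return pid
        time.sleep(interval)
    return None


def ktrace_command(pid: int, trace_file: str):
    """Build the ktrace invocation that attaches to a running process."""
    # -i passes tracing on to future children, -f names the trace file,
    # -p attaches to an existing process by PID.
    return ["ktrace", "-i", "-f", trace_file, "-p", str(pid)]


if __name__ == "__main__":
    pid = wait_for_pid(lambda: find_pid("xorp_fea"))
    if pid is None:
        raise SystemExit("xorp_fea did not start within the timeout")
    subprocess.run(ktrace_command(pid, "fea.ktrace"), check=True)
```

Start the script first, then start XORP; afterwards 'kdump -f fea.ktrace'
shows the recorded system calls.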
... Are your interfaces configured when you run XORP, or do you rely
wholly on XORP to configure your interfaces?
... What is the output of 'ifconfig -a' before, after, and during XORP
run time?
... Are your interfaces UP, RUNNING and MULTICAST?
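
The flag check can be automated against saved 'ifconfig -a' output. The
Python sketch below assumes the usual "name: flags=NNNN<A,B,C>" line
format; the function name and the sample interface lines are made up for
illustration.

```python
import re

REQUIRED_FLAGS = {"UP", "RUNNING", "MULTICAST"}

# Matches e.g. "msk0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> ..."
FLAGS_LINE = re.compile(r"^(\S+): flags=[0-9a-fA-F]+<([^>]*)>", re.MULTILINE)


def missing_flags(ifconfig_output: str) -> dict:
    """Map each interface name to the set of required flags it lacks."""
    report = {}
    for name, flags in FLAGS_LINE.findall(ifconfig_output):
        present = set(flags.split(",")) if flags else set()
        report[name] = REQUIRED_FLAGS - present
    return report


if __name__ == "__main__":
    # Illustrative sample only, not output from the machine in this thread.
    sample = (
        "msk0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
        "lo0: flags=8008<LOOPBACK,MULTICAST> metric 0 mtu 16384\n"
    )
    for ifname, missing in missing_flags(sample).items():
        print(ifname, "missing:", sorted(missing) or "nothing")
```

Running it on the output captured before, during, and after a XORP run
makes any change in flags easy to spot.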
I usually test with msk(4) myself, and haven't seen issues like this,
although I haven't done in-depth testing since the XORP 1.5 release cycle.
Based on the information to date, and in the absence of reproducing the
issue, my best guess is that this is a possible initialization race
between the FEA and RIP modules.
I know that XORP is still using the old Steve Deering era
IP_ADD_MEMBERSHIP socket option for multicast, which, whilst reasonably
portable, has dire problems if you are lacking IPv4 addresses on the
link(s) you want to use. There seems to be some misunderstanding out
there about what multicast is and how it is expected to work, and it
just plain breaks (in any stack) if certain steps aren't followed.
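
To show why the address matters, here is a small Python sketch of an
IP_ADD_MEMBERSHIP join. It is not XORP's code. The request is a struct
ip_mreq, and the interface is named by its IPv4 address, so a link with
no address leaves nothing to put in that field. 224.0.0.9 is the RIPv2
group; 192.0.2.1 is a placeholder address and the helper names are
invented.

```python
import socket

RIP_GROUP = "224.0.0.9"  # RIPv2 multicast group


def build_ip_mreq(group: str, interface_addr: str) -> bytes:
    """Pack a struct ip_mreq: group address, then local interface address."""
    # Both fields are 4-byte IPv4 addresses in network byte order.
    return socket.inet_aton(group) + socket.inet_aton(interface_addr)


def join_group(sock: socket.socket, group: str, interface_addr: str) -> None:
    """Join `group` on the interface that owns `interface_addr`."""
    mreq = build_ip_mreq(group, interface_addr)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Placeholder address; substitute one configured on the RIP link.
        join_group(sock, RIP_GROUP, "192.0.2.1")
        print("joined", RIP_GROUP)
    except OSError as exc:
        # Typically fails when no local interface owns the address.
        print("join failed:", exc)
    finally:
        sock.close()
```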
It hasn't been an issue with XORP, because it does not currently support
the equivalent of 'ip unnumbered' -- and there are still a number of
places in the code which assume each interface has an IPv4 network layer
address of some kind, be that private, link-scope or whatever.
With IPv6, issues of the kind seen with the basic IPv4 multicast API,
and IGMP issues, just don't exist, as link-local addresses are normally
available, except during DAD.
thanks,
BMS