[Xorp-users] OSPF Hello messages not exchanged after some period, and link status becomes dead

Ben Greear greearb at candelatech.com
Wed Mar 24 17:00:36 PDT 2010


On 03/24/2010 04:41 PM, Dejan Petkovic wrote:

> Hello Ben,
>
> I am using 2.6.18-164.11.1.el5.028stab068.3 CentOS 5 kernel, modified
> with OpenVZ version 028stab068.3.
>
> This is the latest version of OpenVZ as far as I know, and I need to
> use OpenVZ for my project so I am rather stuck there.
>
> Yes, the Linux command line hangs when issuing the suggested commands
> while xorp is in the "blocked" state.

Well, it certainly looks like a kernel bug, maybe exacerbated by
something xorp is doing, but the kernel should never hang trying
to read a netlink message (as your strace on 'ip ...' shows).

> Perhaps the fact that I am using veth interfaces instead of venet
> interfaces is the reason for the issue? However, since I have to use
> bridging in order to simulate the phy links between the routers I am
> not sure if venet would work for me.

No idea. I use veth, but I'm on a much more recent kernel (2.6.31).

> When I run strace against the xorp-fea process, I do not see it hung
> in any system call. The following message keeps repeating at the
> bottom of the screen on both servers, but the process is not hung, as
> other messages appear in between:
>
> select(41, [9 10 11 12 13 14 15 16 17 18 19 22 23 24 25 26 27 28 29 30
> 31 32 33 34 35 36 37 38 40], [], [], {0, 0}) = 0 (Timeout)
>
> So, here is the strace output of the commands. Please note that I had
> to break out of every command, as the command line hung after
> displaying the output. Any clues...?
>
> I will read the documentation of xorp.ct, and try to integrate it
> into OpenVZ containers to test with it.

Well, you're welcome to try xorp.ct, but to be honest, I won't
be surprised if you see similar problems.

If you can't upgrade OpenVZ, then perhaps you need to start
digging in the OpenVZ kernel code.  Enable lockdep if you can, use sysrq
to print out locks held, etc.
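
As a rough sketch of the sysrq part: writing a single key to
/proc/sysrq-trigger as root (on the host, not inside a container) should
have the same effect as the keyboard sysrq combination, with the output
going to the kernel log. The helper name and its path parameter below are
made up for illustration, and which keys exist depends on kernel version
and config; I have not checked which of them a 2.6.18-based OpenVZ kernel
supports.

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"

# Keys as described in the kernel's sysrq documentation. Availability
# depends on kernel version and build options (assumption: untested on
# the 2.6.18 OpenVZ kernel discussed in this thread).
SYSRQ_KEYS = {
    "d": "show all locks held (needs lockdep)",
    "w": "show blocked (uninterruptible) tasks",
    "t": "show all tasks",
}


def trigger_sysrq(key, path=SYSRQ_TRIGGER):
    """Write one sysrq key to the trigger file and return its description.

    'path' is overridable only so the helper can be tried against an
    ordinary file; the real trigger file requires root.
    """
    if key not in SYSRQ_KEYS:
        raise ValueError("unsupported sysrq key: %r" % (key,))
    with open(path, "w") as trigger:
        trigger.write(key)
    return SYSRQ_KEYS[key]
```

Calling trigger_sysrq("d") while an 'ip' command is stuck, then reading
dmesg, is one way to see which locks are held at that moment.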

Are you at liberty to discuss the larger goal of your project?  Maybe
there is an alternative to OpenVZ?

> sendto(3, "\24\0\0\0\"\0\1\3b\240\252K\0\0\0\0\2\0\0\0", 20, 0,
> {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
> recvmsg(3,

I think this should never happen: a recvmsg() on a netlink socket should
not block indefinitely, regardless of any other user-space activity.
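
For reference, the 20 bytes in that sendto() can be decoded as a netlink
message header. The following is a minimal sketch in Python, assuming a
little-endian host (x86) and the struct nlmsghdr layout from
<linux/netlink.h>; the function name and the small name table are only
for illustration, and the type numbers are taken from
<linux/rtnetlink.h>.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid.
# "<" assumes the strace came from a little-endian machine.
NLMSGHDR = struct.Struct("<IHHII")

NLM_F_REQUEST = 0x001
NLM_F_ROOT = 0x100
NLM_F_MATCH = 0x200
NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH

# A few rtnetlink request types (values from <linux/rtnetlink.h>).
RTM_NAMES = {
    18: "RTM_GETLINK",
    22: "RTM_GETADDR",
    26: "RTM_GETROUTE",
    34: "RTM_GETRULE",
}


def decode_nlmsghdr(data):
    """Decode the fixed 16-byte netlink header at the start of 'data'."""
    length, msg_type, flags, seq, pid = NLMSGHDR.unpack_from(data)
    return {
        "len": length,
        "type": msg_type,
        "type_name": RTM_NAMES.get(msg_type, "unknown"),
        "flags": flags,
        "is_request": bool(flags & NLM_F_REQUEST),
        "is_dump": (flags & NLM_F_DUMP) == NLM_F_DUMP,
        "seq": seq,
        "pid": pid,
        # First payload byte of an rtnetlink dump request is the address family.
        "family": data[NLMSGHDR.size] if len(data) > NLMSGHDR.size else None,
    }


# The buffer exactly as strace printed it (octal escapes).
RAW = b"\24\0\0\0\"\0\1\3b\240\252K\0\0\0\0\2\0\0\0"
hdr = decode_nlmsghdr(RAW)
```

Under those assumptions this is a plain dump request (NLM_F_REQUEST plus
NLM_F_DUMP) of type 34 for address family 2 (AF_INET), so nothing about
the request itself looks unusual.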

Thanks,
Ben

-- 
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


