[Xorp-users] Xorp 1.8.4 release almost ready.
Eric S. Johnson
esj at cs.fiu.edu
Thu Sep 15 08:15:36 PDT 2011
Curiouser and curiouser.
So, it turns out 1.8.3 is also *now* having problems starting on
the same machine. Same sort of errors as with 1.8.4.
Most odd since 1.8.3 had been running on this host for a while
with no problems. A 1.7-svn dated 2010-02-17 does still run fine.
Odder still, the exact same 1.8.3 with the exact same config file
(well, IP address diffs, but that is it) runs on another router
(and still shuts down and starts up fine, multiple times). I
cannot explain it, but it seems to be host-specific.
(Oh, same OS/kernels too.)
The problem seems to be in some XRL keepalive. On the failing host
all seems fine (i.e. it starts to form adjacencies, and some of them
get into DB transfer) until the logs start saying:
[ 2011/09/15 10:17:35.245634 ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:36.107436 ERROR xorp_rtrmgr:3845 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:37.24165 ERROR xorp_ospfv2:3849 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:37.24503 ERROR xorp_ospfv2:3849 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:37.24733 ERROR xorp_fea:3846 LIBXORP libxorp/buffered_asyncio.cc:226 io_event ] read error 104
[ 2011/09/15 10:17:37.24525 ERROR xorp_rib:3847 LIBXORP libxorp/buffered_asyncio.cc:226 io_event ] read error 104
[ 2011/09/15 10:17:37.24828 ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:197 read_event ] Read failed (error = 104)
[ 2011/09/15 10:17:37.24890 ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:407 die ] STCPRequestHandler died: read error
[ 2011/09/15 10:17:37.24908 ERROR xorp_rib:3847 XRL libxipc/xrl_pf_stcp.cc:197 read_event ] Read failed (error = 104)
[ 2011/09/15 10:17:37.24992 ERROR xorp_rib:3847 XRL libxipc/xrl_pf_stcp.cc:407 die ] STCPRequestHandler died: read error
[ 2011/09/15 10:17:44.972044 WARNING xorp_ospfv2:3849 LIBXORP libxorp/timer.cc:439 expire_one ] Timer Expiry *much* later than scheduled: behind by 18.910527 seconds
[ 2011/09/15 10:17:44.972329 WARNING xorp_ospfv2:3849 LIBXORP libxorp/timer.cc:439 expire_one ] Timer Expiry *much* later than scheduled: behind by 18.889423 seconds
[ 2011/09/15 10:17:51.287571 INFO xorp_rtrmgr:3845 RTRMGR rtrmgr/task.cc:1033 shutdown ] Shutting down module: ospf4
[ 2011/09/15 10:17:51.287821 INFO xorp_rtrmgr:3845 RTRMGR rtrmgr/task.cc:1073 shutdown ] Shutdown with XRL: >finder://ospfv2/common/0.1/shutdown<
[ 2011/09/15 10:17:51.289067 INFO xorp_rtrmgr:3845 XRL libxipc/xrl_router.cc:459 lookup_sender ] Sender died (protocol = "unix", address = ":var:tmp:xrl.seHY26")
The xorp_ospfv2 process doesn't die, though, but it doesn't do anything else
either. xorpsh show ospf neighbor shows all the links as down (though, as I said,
before the error it shows the adjacencies starting: EXSTART etc.).
Takes about 30 seconds for the whole thing to die.
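To see which process complains first, the log lines above could be pulled apart with something like the following. This is only a rough sketch: the regular expression is inferred from the excerpt in this mail, not from any documented XORP log format, and the helper names (parse_line, first_error_per_process, describe_errno) are made up for illustration. It also assumes a Linux host, where errno 104 is ECONNRESET, i.e. the peer closed the connection.

```python
import errno
import re

# Pattern inferred from the pasted excerpt (an assumption, not a documented format):
# [ DATE TIME LEVEL PROC:PID MODULE FILE:LINE FUNC ] MESSAGE
LOG_RE = re.compile(
    r"^\[ (?P<date>\S+) (?P<time>\S+) (?P<level>\S+) "
    r"(?P<proc>[^:\s]+):(?P<pid>\d+) (?P<module>\S+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+) (?P<func>\S+) \] (?P<msg>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    return match.groupdict() if match else None


def first_error_per_process(lines):
    """Map each process name to the (time, message) of its first ERROR line."""
    first = {}
    for line in lines:
        rec = parse_line(line)
        if rec and rec["level"] == "ERROR" and rec["proc"] not in first:
            first[rec["proc"]] = (rec["time"], rec["msg"])
    return first


def describe_errno(code):
    """Translate a numeric errno into its symbolic name on the current platform."""
    return errno.errorcode.get(code, "unknown")


# Three lines copied from the excerpt above.
SAMPLE = [
    "[ 2011/09/15 10:17:35.245634 ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout",
    "[ 2011/09/15 10:17:36.107436 ERROR xorp_rtrmgr:3845 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout",
    "[ 2011/09/15 10:17:37.24733 ERROR xorp_fea:3846 LIBXORP libxorp/buffered_asyncio.cc:226 io_event ] read error 104",
]

if __name__ == "__main__":
    for proc, (when, msg) in first_error_per_process(SAMPLE).items():
        print(proc, when, msg)
    print("errno 104 on this platform:", describe_errno(104))
```

Run against the full log, this shows that the keepalive timeouts come first and the read errors follow, which would fit the resets being a consequence of the timeouts rather than the cause.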
The mentioned socket /var/tmp/xrl.seHY26 doesn't exist.
I've tried rebooting the box. No change.
Am I running into some socket limit? (I thought a reboot might help with that,
but no.)
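One way to test the socket-limit theory would be to compare each XORP process's open descriptors against its limit while it is running. The sketch below assumes a Linux host with /proc mounted and enough privilege to read another process's /proc entries; the function names are hypothetical, and both return None when the information is not available.

```python
import os


def count_open_fds(pid="self"):
    """Count entries in /proc/<pid>/fd; None if the directory cannot be read."""
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None


def open_file_limit(pid="self"):
    """Return the (soft, hard) 'Max open files' values from /proc/<pid>/limits.

    Values are returned as strings because the kernel may report 'unlimited'.
    Returns None if the file is missing or unreadable.
    """
    try:
        with open(f"/proc/{pid}/limits") as fh:
            for line in fh:
                if line.startswith("Max open files"):
                    parts = line.split()
                    return parts[3], parts[4]
    except OSError:
        pass
    return None


if __name__ == "__main__":
    # Defaults to the current process; pass a pid such as 3846 (xorp_fea in the log).
    print("open fds:", count_open_fds())
    print("limit (soft, hard):", open_file_limit())
```

If the open count for xorp_fea or xorp_rtrmgr sits well below the soft limit when the keepalive timeouts start, a descriptor limit is probably not the cause.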
Any ideas? Anything I could do, debug-wise, to shed some light?
E