[Xorp-users] Xorp 1.8.4 release almost ready.

Eric S. Johnson esj at cs.fiu.edu
Thu Sep 15 08:15:36 PDT 2011


Curiouser and curiouser. 

So, it turns out 1.8.3 is *now* also having problems starting on
the same machine, with the same sort of errors as 1.8.4.

Most odd since 1.8.3 had been running on this host for a while
with no problems. A 1.7-svn dated 2010-02-17 does still run fine.

Odder still, the exact same 1.8.3 with the exact same config file
(well, apart from IP address differences) runs on another router
(and still shuts down and starts up fine, multiple times). I cannot
explain it, but it seems to be host-specific.

(Oh, and both hosts run the same OS and kernel.)

The problem seems to be in some XRL keepalive. On the failing host
all seems fine (i.e. it starts to form adjacencies, and some of them
get into database transfer) until the logs start saying:

[ 2011/09/15 10:17:35.245634  ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:36.107436  ERROR xorp_rtrmgr:3845 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:37.24165  ERROR xorp_ospfv2:3849 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:37.24503  ERROR xorp_ospfv2:3849 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout
[ 2011/09/15 10:17:37.24733  ERROR xorp_fea:3846 LIBXORP libxorp/buffered_asyncio.cc:226 io_event ] read error 104
[ 2011/09/15 10:17:37.24525  ERROR xorp_rib:3847 LIBXORP libxorp/buffered_asyncio.cc:226 io_event ] read error 104
[ 2011/09/15 10:17:37.24828  ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:197 read_event ] Read failed (error = 104)
[ 2011/09/15 10:17:37.24890  ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:407 die ] STCPRequestHandler died: read error
[ 2011/09/15 10:17:37.24908  ERROR xorp_rib:3847 XRL libxipc/xrl_pf_stcp.cc:197 read_event ] Read failed (error = 104)
[ 2011/09/15 10:17:37.24992  ERROR xorp_rib:3847 XRL libxipc/xrl_pf_stcp.cc:407 die ] STCPRequestHandler died: read error
[ 2011/09/15 10:17:44.972044  WARNING xorp_ospfv2:3849 LIBXORP libxorp/timer.cc:439 expire_one ] Timer Expiry *much* later than scheduled: behind by 18.910527 seconds
[ 2011/09/15 10:17:44.972329  WARNING xorp_ospfv2:3849 LIBXORP libxorp/timer.cc:439 expire_one ] Timer Expiry *much* later than scheduled: behind by 18.889423 seconds
[ 2011/09/15 10:17:51.287571  INFO xorp_rtrmgr:3845 RTRMGR rtrmgr/task.cc:1033 shutdown ] Shutting down module: ospf4
[ 2011/09/15 10:17:51.287821  INFO xorp_rtrmgr:3845 RTRMGR rtrmgr/task.cc:1073 shutdown ] Shutdown with XRL: >finder://ospfv2/common/0.1/shutdown<
[ 2011/09/15 10:17:51.289067  INFO xorp_rtrmgr:3845 XRL libxipc/xrl_router.cc:459 lookup_sender ] Sender died (protocol = "unix", address = ":var:tmp:xrl.seHY26")
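
To see at a glance which process complains first and how often, the log can be broken into fields. This is only a rough sketch: the regular expression is inferred from the lines quoted above, not taken from the XORP source, and the names parse_line, summarize and first_match are made up for illustration. Timestamps are kept as strings on purpose, since the fractional part does not always have the same number of digits.

```python
import re
from collections import Counter

# Pattern inferred from the quoted log lines (an assumption, not XORP's own spec):
# [ date time  LEVEL process:pid MODULE file:line function ] message
LOG_RE = re.compile(
    r"^\[\s*(?P<date>\S+)\s+(?P<time>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"(?P<proc>[^\s:]+):(?P<pid>\d+)\s+(?P<module>\S+)\s+"
    r"(?P<file>\S+):(?P<line>\d+)\s+(?P<func>\S+)\s+\]\s*(?P<msg>.*)$"
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None

def summarize(lines):
    """Count ERROR and WARNING lines per (process, level), in first-seen order."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec and rec["level"] in ("ERROR", "WARNING"):
            counts[(rec["proc"], rec["level"])] += 1
    return counts

def first_match(lines, needle):
    """Return the first parsed record whose message contains needle, or None."""
    for line in lines:
        rec = parse_line(line)
        if rec and needle in rec["msg"]:
            return rec
    return None

# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "[ 2011/09/15 10:17:35.245634  ERROR xorp_fea:3846 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout",
    "[ 2011/09/15 10:17:37.24165  ERROR xorp_ospfv2:3849 XRL libxipc/xrl_pf_stcp.cc:783 die ] XrlPFSTCPSender died: Keepalive timeout",
    "[ 2011/09/15 10:17:44.972044  WARNING xorp_ospfv2:3849 LIBXORP libxorp/timer.cc:439 expire_one ] Timer Expiry *much* later than scheduled: behind by 18.910527 seconds",
    "[ 2011/09/15 10:17:51.287571  INFO xorp_rtrmgr:3845 RTRMGR rtrmgr/task.cc:1033 shutdown ] Shutting down module: ospf4",
]

for (proc, level), n in summarize(SAMPLE).items():
    print(proc, level, n)

first = first_match(SAMPLE, "Keepalive timeout")
print("first keepalive timeout:", first["proc"], "at", first["time"])
```

Feeding it the full log file instead of SAMPLE (for example, a list of lines read from the file) should show whether the keepalive timeouts always start in the same process.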

The xorp_ospfv2 process doesn't die, though, but it doesn't do anything else
either. In xorpsh, "show ospf neighbor" shows all the links as down (though, as
I said, before the errors it shows the adjacencies starting: EXSTART etc.).

Takes about 30 seconds for the whole thing to die.



The socket mentioned in the last log line, /var/tmp/xrl.seHY26, doesn't exist.
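
For anyone wanting to repeat that check, here is a minimal sketch. It assumes that the address in the log (":var:tmp:xrl.seHY26") is simply a filesystem path with "/" written as ":", which is a guess based on the log line rather than on the XORP source. The function names and the "xrl." prefix filter are made up for illustration.

```python
import glob
import os
import stat

def xrl_address_to_path(address):
    """Turn ':var:tmp:xrl.seHY26' into '/var/tmp/xrl.seHY26'.

    Assumption: ':' in the logged address stands in for '/'.
    """
    return address.replace(":", "/")

def is_unix_socket(path):
    """True only if path exists and is a UNIX-domain socket file."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        return False

def list_xrl_sockets(directory="/var/tmp", prefix="xrl."):
    """List socket files in directory whose names start with prefix."""
    pattern = os.path.join(directory, prefix + "*")
    return sorted(p for p in glob.glob(pattern) if is_unix_socket(p))

path = xrl_address_to_path(":var:tmp:xrl.seHY26")
print(path, "exists as socket:", is_unix_socket(path))
print("XRL-looking sockets present:", list_xrl_sockets())
```

Running it while the processes are still up and again after the errors appear would show whether the socket files vanish before or after the keepalive timeouts.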

I've tried rebooting the box. No change.

Am I running into some socket limit? (I thought a reboot might help with that,
but no.)
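
One way to test the limit theory is to compare the per-process descriptor limit with what a process actually has open. The sketch below assumes a Linux-style /proc filesystem (the OS is not named in this mail) and uses Python's standard resource module; fd_limits and open_fd_count are illustrative names. Sockets count against the open-file limit, so RLIMIT_NOFILE is the one checked here.

```python
import os
import resource

def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def open_fd_count(pid):
    """Count descriptors a process has open, via /proc/<pid>/fd.

    Returns None if /proc is unavailable or the process cannot be inspected.
    """
    try:
        return len(os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return None

soft, hard = fd_limits()
print("descriptor limit: soft", soft, "hard", hard)
# Demo on this script's own pid; substitute the pid of xorp_fea,
# xorp_ospfv2, etc. from the log lines to inspect those processes.
print("open descriptors in this process:", open_fd_count(os.getpid()))
```

If the counts for the XORP processes sit far below the soft limit right up to the timeouts, a descriptor limit is probably not the cause.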

Any ideas? Anything I could do debug-wise to shed some light?

E


