Hello.<br><br>Sorry for the heavily formatted email; I couldn't find a better way to keep things clear.<br><br>Kernel: Ubuntu 8.04 server (2.6.24-23-server)<br>XORP: We have tested versions 1.4, 1.5 and 1.6 for both issues described in this email.<br>
<br>Scenario: <a href="http://www.mxd.nu/info/lan-man-route-net-0901.jpg" target="_blank">http://www.mxd.nu/info/lan-man-route-net-0901.jpg</a><br>i.e. two BGP peers and OSPF towards the internal network.<br><br><font size="4">Problem 1:</font><br>
Around 2009-01-15 18:50, outbound communication from our BGP router suddenly stopped.<br>
Attaching logs.<br><br>Logs:<br><a href="http://www.mxd.nu/router.log" target="_blank">http://www.mxd.nu/router.log</a><br><a href="http://www.mxd.nu/router.err.log" target="_blank">http://www.mxd.nu/router.err.log</a><br>
<br>Mirror:<br><a href="https://denzel.cmd.nu/%7Ebluecommand/xorp/2009-01-15" target="_blank">https://denzel.cmd.nu/~bluecommand/xorp/2009-01-15</a><br>
<br>This apparently happened again recently:<font size="1"><span style="font-family: courier new,monospace;"><br><br>[ 2009/01/25 06:15:26 INFO xorp_bgp BGP ] Sending: Notification Packet: Hold Timer Expired(4)</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:19 ERROR xorp_bgp:4924 XRL +635 xrl_pf_stcp.cc die ] XrlPFSTCPSender died: Keepalive timeout</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:19 ERROR xorp_fea:4920 XRL +635 xrl_pf_stcp.cc die ] XrlPFSTCPSender died: Keepalive timeout</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:22 ERROR xorp_rib:4921 XRL +635 xrl_pf_stcp.cc die ] XrlPFSTCPSender died: Keepalive timeout</span><br style="font-family: courier new,monospace;"><span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:23 ERROR xorp_policy:4922 XRL +635 xrl_pf_stcp.cc die ] XrlPFSTCPSender died: Keepalive timeout</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:26 ERROR xorp_bgp:4924 XRL +635 xrl_pf_stcp.cc die ] XrlPFSTCPSender died: Keepalive timeout</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:26 WARNING xorp_bgp:4924 LIBXORP +468 timer.cc expire_one ] Timer Expiry *much* later than scheduled: behind by 17.104458 seconds</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:26 WARNING xorp_bgp:4924 LIBXORP +468 timer.cc expire_one ] Timer Expiry *much* later than scheduled: behind by 17.104535 seconds</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:26 WARNING xorp_bgp:4924 LIBXORP +468 timer.cc expire_one ] Timer Expiry *much* later than scheduled: behind by 17.104547 seconds</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:18:26 INFO xorp_bgp XRL ] Sender died (protocol = "stcp", address = "<a href="http://127.0.0.1:40354/" target="_blank">127.0.0.1:40354</a>")</span><br style="font-family: courier new,monospace;">
<span style="font-family: courier new,monospace;">[ 2009/01/25 06:21:26 ERROR xorp_bgp:4924 XRL +338 xrl_pf_stcp.cc die ] STCPRequestHandler died: life timer expired</span></font><br clear="all"><br><font size="4">Problem 2:</font><br>
We are also trying to apply a custom localpref value to some routes to balance our peering:<br><br><b>set protocols bgp peer "212.112.175.81" import "PREFERED_BGP_STHLM"<br>
</b>The command returns <b>"210 Transport failed"</b> and xorp_policy dies.<br><br>Log and configuration:<br><a href="http://www.mxd.nu/policy-bgp-xorp.txt">http://www.mxd.nu/policy-bgp-xorp.txt</a><br><br>
After XORP has died (no processes left), <span style="font-family: courier new,monospace;">ip ro</span> still shows all the routes. When XORP is restarted, the router does not route any traffic, even though <span style="font-family: courier new,monospace;">ip ro | wc -l</span> returns 270 000 (i.e. all the routes are in there).<br>
Stopping XORP, flushing the addresses and routes on the router interfaces, and then starting XORP again "solves" this.<br><br>-- <br>Christian Svensson<br>Command Systems<br>