[Xorp-hackers] Problems with last versions of XORP

Pavlin Radoslavov pavlin@icir.org
Tue, 18 Jan 2005 13:26:54 -0800


> I am currently testing a routing protocol that I have just migrated to XORP (as
> you may have noticed through other messages). I am using a version of XORP that
> I have downloaded from the CVS server around December, 13th. Well, first of all
> I am testing it with linux as the forwarding engine and I am having some
> problems. 
> 
> Everything seems to work fine until I leave my protocol working for a while. In
> the test scenario I have two computers running XORP and my protocol and they
> exchange periodic information about the routes they know, so that each one of
> them may update their routing table according to the received information. The
> problem is that if I leave it working for some time, the system becomes stable
> and no more route updates are done. So, when I shutdown my routing protocol,
> everything gets messed up. I get a lot of error messages from the router
> manager on the computer where I have shutdown my protocol, and also in the
> other computer, when it tries to delete some routes (due to the fact that it no
> longer receives the periodic messages from the first computer). If I shutdown my
> protocol after just a while after it becomes stable (let's say 1 minute or
> something in this magnitude), everything runs well. I have transcripted below
> the error messages I have got from both computers.

Majority of the error messages during shutdown on the first computer
are kind of normal. Well, ideally you shouldn't see them, but they
don't hurt; nevertheless, they should be fixed in the future.

Few of the errors you get are unusual and problematic. See inlined
comments.

> COMPUTER 1 (Where the protocol was shutdown)
> ==========
> 
> [ 2005/01/18 13:18:18  INFO xorp_rtrmgr:2969 RTRMGR +166 master_conf_tree.cc
> execute ] Changed modules: interfaces, fea, rib, static_routes
> [ 2005/01/18 13:18:18  INFO xorp_rtrmgr:2969 RTRMGR +403 module_manager.cc run ]
> Running module: interfaces (/root/XORP/xorp/fea/xorp_fea)
> [ 2005/01/18 13:18:20  INFO xorp_rtrmgr:2969 RTRMGR +403 module_manager.cc run ]
> Running module: fea (/root/XORP/xorp/fea/xorp_fea)
> [ 2005/01/18 13:18:26  INFO xorp_rtrmgr:2969 RTRMGR +403 module_manager.cc run ]
> Running module: rib (/root/XORP/xorp/rib/xorp_rib)
> [ 2005/01/18 13:18:28  INFO xorp_rtrmgr:2969 RTRMGR +403 module_manager.cc run ]
> Running module: static_routes (/root/XORP/xorp/static_routes/xorp_static_routes)
> [ 2005/01/18 13:18:30  INFO xorp_rtrmgr:2969 RTRMGR +1372 task.cc run_task ] No
> more tasks to run
> [ 2005/01/18 13:18:32  ERROR xorp_rib:2971 RIB +133 rib.cc admin_distance ]
> Administrative distance of "test" unknown.


To get rid of this error, you may want to add an admin distance for
your protocol "test". E.g., add a like like to rib/rib.cc
_admin_distances["fib2mrib"] = 220;
along other hard-coded admin distances inside the RIB<A>::RIB()
constructor.

> [ 2005/01/18 13:21:29  INFO xorp_rtrmgr:2969 RTRMGR +1372 task.cc run_task ] No
> more tasks to run
> [ 2005/01/18 13:23:35  ERROR xorp_rib:2971 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:35  ERROR xorp_fea:2970 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:35  ERROR xorp_rtrmgr:2969 LIBXORP +271 asyncio.cc
> complete_transfer ] Write error 104 - Connection reset by peer
> [ 2005/01/18 15:49:35 INFO xorp_rib RIB ] Received death event for protocol test
> shutting down -------
> OriginTable: test
> IGP
> next table = Redist:test

The above errors can be ignored.

> [ 2005/01/18 15:49:35  FATAL xorp_fea:2970 FEA +466
> fticonfig_entry_set_netlink.cc delete_entry ] Assertion (ii != it.ifs().end())
> failed

This assert is not normal.
Somehow, the FEA is told to delete a forwarding entry, and the
corresponding interface name doesn't exist inside the FEA.
To debug the problem, please add some printf right before the
XLOG_ASSERT() on line 466 (file fea/fticonfig_entry_set_netlink.cc),
and print the fte.ifname(), and the ifnames of all interfaces inside
the "IfTree& it". This may give you a clue about the mismatch.
E.g., it could be that the XLOG_ASSERT() there is not appropriate.

> [ 2005/01/18 15:49:35  ERROR xorp_static_routes:2972 XRL +628 xrl_pf_stcp.cc die
> ] XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:35  INFO xorp_rtrmgr:2969 RTRMGR +549 module_manager.cc
> killed ] Module abnormally killed: interfaces
> [ 2005/01/18 15:49:35  INFO xorp_rtrmgr:2969 RTRMGR +549 module_manager.cc
> killed ] Module abnormally killed: fea
> [ 2005/01/18 15:49:35  ERROR xorp_rtrmgr:2969 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:35  ERROR xorp_rib:2971 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:35  ERROR xorp_rib:2971 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:35  ERROR xorp_rib:2971 XifRedistTransaction6 +902
> redist_xrl.cc dispatch_complete ] Fatal error during commit transaction "210
> Transport failed"
> [ 2005/01/18 15:49:43  INFO xorp_rtrmgr:2969 RTRMGR +570 task.cc shutdown ]
> Shutting down module: static_routes
> [ 2005/01/18 15:49:43  WARNING xorp_rtrmgr:2969 XrlFinderTarget +406
> ./xrl/targets/finder_base.cc handle_finder_0_2_resolve_xrl ] Handling method
> for finder/0.2/resolve_xrl failed: XrlCmdError 102 Command failed Target "fea"
> does not exist or is not enabled.
> [ 2005/01/18 15:49:43  ERROR xorp_rib:2971 RIB +1306 rib.cc delete_origin_table
> ] Got delete_origin_table for wrong target name
> [ 2005/01/18 15:49:43 WARNING xorp_rib XrlRibTarget ] Handling method for
> rib/0.1/delete_igp_table4 failed: XrlCmdError 102 Command failed Could not
> delete unicast IPv4 igp table "static"
> [ 2005/01/18 15:49:43  ERROR xorp_static_routes:2972 STATIC_ROUTES +308
> xrl_static_routes_node.cc rib_client_send_delete_igp_table4_cb ] Failed to
> deregister IPv4 IGP table with the RIB: 102 Command failed Could not delete
> unicast IPv4 igp table "static". Will give up.
> [ 2005/01/18 15:49:43  INFO xorp_rtrmgr:2969 RTRMGR +523 module_manager.cc
> normal_exit ] Module normal exit: static_routes
> [ 2005/01/18 15:49:43  ERROR xorp_rtrmgr:2969 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:44  WARNING xorp_rtrmgr:2969 XrlFinderTarget +406
> ./xrl/targets/finder_base.cc handle_finder_0_2_resolve_xrl ] Handling method
> for finder/0.2/resolve_xrl failed: XrlCmdError 102 Command failed Target
> "static_routes" does not exist or is not enabled.
> [ 2005/01/18 15:49:45  INFO xorp_rtrmgr:2969 RTRMGR +570 task.cc shutdown ]
> Shutting down module: rib
> [ 2005/01/18 15:49:45  INFO xorp_rtrmgr:2969 RTRMGR +523 module_manager.cc
> normal_exit ] Module normal exit: rib
> [ 2005/01/18 15:49:45  ERROR xorp_rtrmgr:2969 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:49:45  INFO xorp_rtrmgr:2969 XRL +331 xrl_router.cc
> send_resolved ] Sender died (protocol = "stcp", address = "127.0.0.1:33022")
> [ 2005/01/18 15:49:45  ERROR xorp_rtrmgr:2969 IPC +271 sockutil.cc
> create_connected_ip_socket ] failed to connect to 127.0.0.1 port 33022:
> Connection refused
> [ 2005/01/18 15:49:45  ERROR xorp_rtrmgr:2969 XRL +55 xrl_pf_factory.cc
> create_sender ] XrlPFSenderFactory::create failed: XrlPFConstructorError from
> line 578 of xrl_pf_stcp.cc: Could not connect to 127.0.0.1:33022
> 
> [ 2005/01/18 15:49:45  ERROR xorp_rtrmgr:2969 XRL +342 xrl_router.cc
> send_resolved ] Could not create XrlPFSender for protocol = "stcp" address =
> "127.0.0.1:33022"
> [ 2005/01/18 15:49:45  WARNING xorp_rtrmgr:2969 XrlFinderTarget +406
> ./xrl/targets/finder_base.cc handle_finder_0_2_resolve_xrl ] Handling method
> for finder/0.2/resolve_xrl failed: XrlCmdError 102 Command failed Target "rib"
> does not exist or is not enabled.
> [ 2005/01/18 15:49:49  INFO xorp_rtrmgr:2969 RTRMGR +570 task.cc shutdown ]
> Shutting down module: interfaces
> [ 2005/01/18 15:49:49  WARNING xorp_rtrmgr:2969 XrlFinderTarget +406
> ./xrl/targets/finder_base.cc handle_finder_0_2_resolve_xrl ] Handling method
> for finder/0.2/resolve_xrl failed: XrlCmdError 102 Command failed Target "fea"
> does not exist or is not enabled.
> [ 2005/01/18 15:49:49  WARNING xorp_rtrmgr:2969 XrlFinderTarget +406
> ./xrl/targets/finder_base.cc handle_finder_0_2_resolve_xrl ] Handling method
> for finder/0.2/resolve_xrl failed: XrlCmdError 102 Command failed Target "fea"
> does not exist or is not enabled.
> [ 2005/01/18 15:49:50  INFO xorp_rtrmgr:2969 RTRMGR +1372 task.cc run_task ] No
> more tasks to run

The above errors are "kind of normal".

> COMPUTER 2 (tries to delete some routes after computer 1 shutdowns)
> ==========
> 
> [ 2005/01/18 13:24:32  INFO xorp_rtrmgr:2888 RTRMGR +166 master_conf_tree.cc
> execute ] Changed modules: interfaces, fea, rib, static_routes
> [ 2005/01/18 13:24:32  INFO xorp_rtrmgr:2888 RTRMGR +403 module_manager.cc run ]
> Running module: interfaces (/root/XORP/xorp/fea/xorp_fea)
> [ 2005/01/18 13:24:34  INFO xorp_rtrmgr:2888 RTRMGR +403 module_manager.cc run ]
> Running module: fea (/root/XORP/xorp/fea/xorp_fea)
> [ 2005/01/18 13:24:40  INFO xorp_rtrmgr:2888 RTRMGR +403 module_manager.cc run ]
> Running module: rib (/root/XORP/xorp/rib/xorp_rib)
> [ 2005/01/18 13:24:42  INFO xorp_rtrmgr:2888 RTRMGR +403 module_manager.cc run ]
> Running module: static_routes (/root/XORP/xorp/static_routes/xorp_static_routes)
> [ 2005/01/18 13:24:44  INFO xorp_rtrmgr:2888 RTRMGR +1372 task.cc run_task ] No
> more tasks to run
> [ 2005/01/18 13:24:47  ERROR xorp_rib:2891 RIB +133 rib.cc admin_distance ]
> Administrative distance of "test" unknown.
> [ 2005/01/18 15:55:53  FATAL xorp_fea:2889 FEA +466
> fticonfig_entry_set_netlink.cc delete_entry ] Assertion (ii != it.ifs().end())
> failed

This assert is not normal. See my comments above about the same
assert in the first computer.

The rest of the errors are "kind of normal".

Regards,
Pavlin

> [ 2005/01/18 15:55:53  ERROR xorp_rib:2891 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:55:53  ERROR xorp_static_routes:2892 XRL +628 xrl_pf_stcp.cc
> die] XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:55:53  INFO xorp_rtrmgr:2888 RTRMGR +549 module_manager.cc
> killed ] Module abnormally killed: interfaces
> [ 2005/01/18 15:55:53  INFO xorp_rtrmgr:2888 RTRMGR +549 module_manager.cc
> killed ] Module abnormally killed: fea
> [ 2005/01/18 15:55:53  ERROR xorp_rtrmgr:2888 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:55:53  ERROR xorp_rib:2891 XRL +628 xrl_pf_stcp.cc die ]
> XrlPFSTCPSender died: end of file
> [ 2005/01/18 15:55:53  ERROR xorp_rib:2891 XifRedistTransaction6 +902
> redist_xrl.cc dispatch_complete ] Fatal error during commit transaction "210
> Transport failed"