[Xorp-hackers] Problems migrating a routing protocol...

Pavlin Radoslavov pavlin@icir.org
Wed, 15 Dec 2004 15:30:25 -0800


Rafael,

> I am currently migrating a routing process to XORP and I am facing some 
> problems that I do not know how to solve, so I thought that maybe 
> somebody could help me... What I did was to change as little as possible 
> the original implementation of the routing process. So, what I did is to 
> create a thread that is responsible to interfacing between my routing 
> process and XORP. It registers the protocol within XORP and keeps the 
> eventloop. The only change in the original routing process (besides 
> creating this new thread) is in the add/del routes functions. Now, these 
> functions send the route to be added/deleted to the new thread through a 
> pipe. The thread processes this route and adds or deletes the route in 
> XORP by calling the appropriate XRL. In fact, this thread is almost 
> identical to static_routes. The only difference is that it receives routes 
> to add/delete from the pipe.
> 
> Well, this is what I have done so far and, I think it should work fine, 
> but for some reason, it doesn't.
> 
> Below you can see the report from the router manager. The first 
> occurrence is during the registration of the new protocol (do I have to 
> change anything in the RIB so that it knows about this new protocol?).
> 
> [ 2004/12/15 16:47:09  ERROR xorp_rib:2140 RIB +132 rib.cc 
> admin_distance ] Administrative distance of "test" unknown.

You can get rid of the above error by adding the appropriate line
to the constructor RIB<A>::RIB<A>() inside rib.cc:

_admin_distances["your_protocol_name"] = <your_protocol_admin_distance>;

Unfortunately, all protocol admin distances are currently
hard-coded inside rib.cc. This should be fixed in the future.
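
To illustrate why the lookup fails until that line is added, here is a
minimal, self-contained sketch of a hard-coded distance table. It is not
the actual rib.cc code: the class name AdminDistanceTable, the lookup()
method, and all numeric values (including 210 for "test") are
assumptions made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the RIB's table of admin distances.
// This is NOT XORP code; names and values are illustrative only.
class AdminDistanceTable {
public:
    AdminDistanceTable() {
        // Entries are filled in once, in the constructor, which is the
        // pattern described above for rib.cc.
        _admin_distances["connected"] = 0;   // illustrative value
        _admin_distances["static"] = 1;      // illustrative value
        // The extra line a new protocol needs; 210 is an arbitrary example.
        _admin_distances["test"] = 210;
    }

    // Stores the distance in `distance` and returns true if the protocol
    // is known; returns false (leaving `distance` untouched) otherwise.
    bool lookup(const std::string& protocol, uint32_t& distance) const {
        std::map<std::string, uint32_t>::const_iterator it =
            _admin_distances.find(protocol);
        if (it == _admin_distances.end())
            return false;   // the "Administrative distance ... unknown" case
        distance = it->second;
        return true;
    }

private:
    std::map<std::string, uint32_t> _admin_distances;
};
```

With the "test" entry present, lookup("test", d) succeeds; without it,
the lookup would fail in the same way as the error reported above.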

However, I believe the above error message is not related to the next
error...

> 
> Then, when I add a new route (unicast), it seems to send the XRL 
> correctly, but the callback is never called...
> 
>                  success = _xrl_rib_client.send_add_interface_route4(
>                      _rib_target.c_str(),
>                      TestNode::protocol_name(),
>                      true /* unicast */,
>                      false /* multicast */,
>                      test_route.network().get_ipv4net(),
>                      test_route.nexthop().get_ipv4(),
>                      test_route.ifname(),
>                      test_route.vifname(),
>                      test_route.metric(),
>                      callback(this, &XrlTestNode::send_rib_route_change_cb));
> 
> Some seconds after this, I get the following report in the router manager:
> 
> [ 2004/12/15 16:47:46  ERROR xorp_rtrmgr:2116 FINDER +85 
> finder_xrl_queue.hh dispatch_cb ] Sent xrl got response 211 Reply timed out
> [ 2004/12/15 16:47:46  ERROR xorp_rtrmgr:2116 FINDER +85 
> finder_xrl_queue.hh dispatch_cb ] Sent xrl got response 211 Reply timed out
> [ 2004/12/15 16:47:46 INFO xorp_rib RIB ] Received death event for 
> protocol test shutting down -------
> OriginTable: test
> IGP
> next table = Redist:test
> 
> It seems to me that the RIB tries to call the callback function I have 
> sent in the send_add_interface_route4 call, but for some reason it 
> times out. And finally, the router manager thinks that my routing 
> protocol died, but in fact it keeps running as if nothing had happened. 
> Does anybody have any idea of what could be happening? Why can the RIB 
> not call the callback function?

It looks to me as if the XORP eventloop in your process is not
running properly, so your process doesn't respond to the FINDER's
periodic probes, and it fails to process the RIB's response to your
XRL.
When you integrate the XORP eventloop in your process, make sure
that this eventloop is run promptly.
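
One way to keep the loop responsive is to never block indefinitely on
the pipe. The sketch below uses plain POSIX poll() with a short timeout
and hands control back to an event loop step on every pass. It does not
use any XORP API: wait_for_route_data(), service_passes(), and the
eventloop_step callable are hypothetical names, and eventloop_step only
marks the place where the real XORP eventloop would be run. Registering
the pipe's descriptor with the XORP eventloop itself may well be the
cleaner option.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: waits at most timeout_ms for data on pipe_fd and
// appends whatever is available to `out`. Returns the number of bytes
// read, 0 if the wait timed out, or -1 on error or end-of-file.
inline int wait_for_route_data(int pipe_fd, int timeout_ms, std::string& out) {
    struct pollfd pfd;
    pfd.fd = pipe_fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;          // poll() failed
    if (ready == 0)
        return 0;           // nothing arrived within the timeout

    char buf[256];
    ssize_t n = read(pipe_fd, buf, sizeof(buf));
    if (n <= 0)
        return -1;          // read error, or the writer closed the pipe
    out.append(buf, static_cast<size_t>(n));
    return static_cast<int>(n);
}

// Hypothetical loop body: each pass spends at most 10 ms waiting on the
// pipe, then gives the event loop a turn so that timers, probes, and XRL
// replies are not starved.
template <typename EventLoopStep>
void service_passes(int pipe_fd, int passes, EventLoopStep eventloop_step,
                    std::string& received) {
    for (int i = 0; i < passes; ++i) {
        if (wait_for_route_data(pipe_fd, 10, received) < 0)
            break;
        eventloop_step();   // placeholder for running the real eventloop
    }
}
```

The short timeout bounds how long the event loop can be delayed by the
pipe; a blocking read() in the same thread would instead stall it until
the next route arrives.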

Regards,
Pavlin