[Xorp-users] BGP crash

Arsi Antila bbb999 at zerodistance.fi
Thu Dec 6 03:22:26 PST 2007


The BGP crash was reproduced on both Linux/CentOS with XORP 1.4 and
Linux/Debian with the Debian XORP package 1.5~cvs.20070824-1. The key to
getting BGP to crash seems to be a combination of a policy rule (even a simple
one), a network like the one described below (or similar), and flapping BGP
peers.

The test network has the DUT (device under test, running XORP) with eth1
connected to both Router1 (AS 1) and Router2 (AS 2). The DUT's eth2 port is
connected to Router3 (AS 1) and Router4 (AS 2).

The test sequence to get XORP BGP to crash is: start routers 1-4, start XORP,
take Router2 down, then bring Router2 back up. After this, XORP crashes.

Here is the XORP output for the crash:


[ 2007/12/05 13:31:18 INFO xorp_bgp BGP ] Peer-{10.10.20.20(179) 10.10.20.40(179)} in state ESTABLISHED(6) received Notification Packet: Cease(6)
[ 2007/12/05 13:31:18 INFO xorp_bgp BGP ] Peer-{10.10.10.20(179) 10.10.10.30(179)} in state ESTABLISHED(6) received Notification Packet: Cease(6)
[ 2007/12/05 13:31:19 INFO xorp_bgp BGP ] Peer-{10.10.10.20(179) 10.10.10.40(179)} in state ESTABLISHED(6) received Notification Packet: Cease(6)
[ 2007/12/05 13:31:19 INFO xorp_bgp BGP ] Peer-{10.10.20.20(179) 10.10.20.30(179)} in state ESTABLISHED(6) received Notification Packet: Cease(6)
[ 2007/12/05 13:31:34  FATAL xorp_bgp:10028 BGP +83 route_table_cache.cc add_route ] Internal fatal error: unreachable code reached
[ 2007/12/05 13:31:34  ERROR xorp_rtrmgr:10024 RTRMGR +747 module_manager.cc done_cb ] Command "/usr/local/xorp/bgp/xorp_bgp": terminated with signal 6.
[ 2007/12/05 13:31:34  INFO xorp_rtrmgr:10024 RTRMGR +294 module_manager.cc module_exited ] Module abnormally killed: bgp
[ 2007/12/05 13:31:34 INFO xorp_rib RIB ] Received death event for protocol bgp shutting down -------
OriginTable: ebgp
EGP
next table = Merged:(ebgp)+(ibgp)
[ 2007/12/05 13:31:34 INFO xorp_rib RIB ] Received death event for protocol bgp shutting down -------
OriginTable: ebgp
EGP
next table = Merged:(ebgp)+(ibgp)
[ 2007/12/05 13:31:34 INFO xorp_rib RIB ] Received death event for protocol bgp shutting down -------
OriginTable: ebgp
EGP
next table = Merged:(ebgp)+(ibgp)
[ 2007/12/05 13:31:34 INFO xorp_rib RIB ] Received death event for protocol bgp shutting down -------
OriginTable: ebgp
EGP
next table = Merged:(ebgp)+(ibgp)
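
For anyone trying to reproduce this, a minimal Python sketch (not part of
XORP) that pulls the FATAL entries out of a captured log like the one above
may help pinpoint the failing source location. The regular expression is an
assumption based only on the line layout seen in this output, and the
function names are made up for illustration. It also copes with log lines
that a mail client has wrapped.

```python
import re

# Assumed layout, inferred from the log in this mail:
#   [ <date> <time> <LEVEL> <process> <module> [+<line> <file> <function>] ] <message>
LOG_RE = re.compile(
    r"\[ (?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+) (?P<proc>\S+) (?P<module>\S+) "
    r"(?:\+(?P<line>\d+) (?P<file>\S+) (?P<func>\S+) )?\] "
    r"(?P<msg>.*)"
)


def unwrap(text):
    """Join wrapped continuation lines onto the log entry they belong to."""
    entries = []
    for raw in text.splitlines():
        if raw.startswith("[ "):
            entries.append(raw.strip())
        elif raw.strip() and entries:
            entries[-1] += " " + raw.strip()
    return entries


def fatal_entries(text):
    """Return the parsed fields of every FATAL entry in the log text."""
    found = []
    for entry in unwrap(text):
        match = LOG_RE.match(entry)
        if match and match.group("level") == "FATAL":
            found.append(match.groupdict())
    return found


# Two entries copied from the output above, in their mail-wrapped form.
SAMPLE = """\
[ 2007/12/05 13:31:19 INFO xorp_bgp BGP ] Peer-{10.10.20.20(179)
10.10.20.30(179)} in state ESTABLISHED(6) received Notification Packet:
Cease(6)
[ 2007/12/05 13:31:34  FATAL xorp_bgp:10028 BGP +83 route_table_cache.cc
add_route ] Internal fatal error: unreachable code reached
"""

for e in fatal_entries(SAMPLE):
    print(e["ts"], e["file"], e["line"], e["func"], e["msg"])
```

For the sample, the only FATAL entry points at add_route in
route_table_cache.cc, line 83.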



Here is the configuration:

interfaces {
    restore-original-config-on-shutdown: false

    interface eth1 {
        description: "router interface"
        disable: false
        default-system-config
    }

    interface eth2 {
        description: "router interface"
        disable: false
        default-system-config
    }

}

fea {
    unicast-forwarding4 {
        disable: false
    }
}

policy {
    policy-statement block {
        term bgp_65400 {
            from {
                protocol: "bgp"
            }
            then {
                accept
            }
        }
    }
}

protocols {
    bgp {
        bgp-id: 10.100.100.2
        local-as: 65000
        peer 10.10.10.30 {
            local-ip: 10.10.10.20
            as: 65300
            next-hop: 10.10.10.20
        }
        peer 10.10.10.40 {
            local-ip: 10.10.10.20
            as: 65400
            next-hop: 10.10.10.20
        }
        peer 10.10.20.30 {
            local-ip: 10.10.20.20
            as: 65300
            next-hop: 10.10.20.20
        }
        peer 10.10.20.40 {
            local-ip: 10.10.20.20
            as: 65400
            next-hop: 10.10.20.20
        }
        export: "block"
    }
}
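
To make the peering layout easier to see, here is a small Python sketch that
groups the four configured sessions by remote AS. The PEERS table is copied
by hand from the configuration above; the function name is made up for
illustration and nothing here reads the XORP config file itself.

```python
from collections import defaultdict

# Peer address -> (local-ip, remote AS), copied from the configuration above.
PEERS = {
    "10.10.10.30": ("10.10.10.20", 65300),
    "10.10.10.40": ("10.10.10.20", 65400),
    "10.10.20.30": ("10.10.20.20", 65300),
    "10.10.20.40": ("10.10.20.20", 65400),
}


def sessions_by_as(peers):
    """Group (local-ip, peer) session pairs by the peer's AS number."""
    grouped = defaultdict(list)
    for peer, (local_ip, asn) in peers.items():
        grouped[asn].append((local_ip, peer))
    return {asn: sorted(sessions) for asn, sessions in grouped.items()}


for asn, sessions in sorted(sessions_by_as(PEERS).items()):
    print(asn, sessions)
```

Each remote AS ends up with two sessions, one per local address, which is
the condition under which the same prefixes can arrive from two peers.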



Here is an example of 'show bgp peers' and 'show bgp routes' output from
another test, taken just before a crash. All routes are marked as best.

root@ipca> show bgp peers
Peer 1: local 10.10.10.20/179 remote 10.10.10.30/179
Peer 2: local 10.10.10.20/179 remote 10.10.10.40/179
Peer 3: local 10.10.20.20/179 remote 10.10.20.30/179
Peer 4: local 10.10.20.20/179 remote 10.10.20.40/179
root@ipca> show bgp routes
Status Codes: * valid route, > best route
Origin Codes: i IGP, e EGP, ? incomplete

   Prefix                Nexthop                    Peer            AS Path
   ------                -------                    ----            -------
*> 2.0.0.0/24            10.10.20.40                10.100.100.24  65400 i
*> 1.0.0.0/24            10.10.20.30                10.100.100.23  65300 i
*> 2.0.0.0/24            10.10.10.40                10.100.100.14  65400 i
*> 1.0.0.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.1.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.2.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.3.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.4.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.5.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.6.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.7.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.8.0/24            10.10.10.30                10.100.100.13  65300 i
*> 1.0.9.0/24            10.10.10.30                10.100.100.13  65300 i
*> 2.0.1.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.2.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.3.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.4.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.5.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.6.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.7.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.8.0/24            10.10.20.40                10.100.100.24  65400 i
*> 2.0.9.0/24            10.10.20.40                10.100.100.24  65400 i
*> 1.0.1.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.2.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.3.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.4.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.5.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.6.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.7.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.8.0/24            10.10.20.30                10.100.100.23  65300 i
*> 1.0.9.0/24            10.10.20.30                10.100.100.23  65300 i
*> 2.0.1.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.2.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.3.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.4.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.5.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.6.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.7.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.8.0/24            10.10.10.40                10.100.100.14  65400 i
*> 2.0.9.0/24            10.10.10.40                10.100.100.14  65400 i
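
The "all routes are marked as best" observation can be checked mechanically.
Below is a minimal Python sketch, assuming each route has first been put on
a single line (status, prefix, nexthop, peer, AS path); the function name and
the four sample rows, copied from the output above, are for illustration
only. It lists every prefix that carries more than one route flagged '>',
where one would normally expect a single best route per prefix.

```python
from collections import defaultdict


def best_routes_per_prefix(rows):
    """Map each prefix to the nexthops of all routes flagged '>' (best)."""
    best = defaultdict(list)
    for row in rows:
        fields = row.split()
        if len(fields) < 4:
            continue  # header, separator, or blank line
        status, prefix, nexthop = fields[0], fields[1], fields[2]
        if ">" in status:
            best[prefix].append(nexthop)
    return dict(best)


# First four routes from the 'show bgp routes' output above.
ROWS = [
    "*> 2.0.0.0/24 10.10.20.40 10.100.100.24 65400 i",
    "*> 1.0.0.0/24 10.10.20.30 10.100.100.23 65300 i",
    "*> 2.0.0.0/24 10.10.10.40 10.100.100.14 65400 i",
    "*> 1.0.0.0/24 10.10.10.30 10.100.100.13 65300 i",
]

# Prefixes with more than one route marked as best.
dupes = {p: n for p, n in best_routes_per_prefix(ROWS).items() if len(n) > 1}
for prefix, nexthops in dupes.items():
    print(prefix, nexthops)
```

With these rows, both 1.0.0.0/24 and 2.0.0.0/24 show two best routes, one
via each local interface address.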



Regards,
Arsi

On Mon, Dec 03, 2007 at 05:43:49PM +0000, Mark Handley wrote:
> Xorp should not crash; I don't think this is a known issue.  Can you
> clarify - which Xorp process crashes?  The subject implies BGP, but I
> just want to be sure.
> 
> Also I'm not clear on the scenario - BGP doesn't advertise ASs to
> interfaces - it advertises them via BGP connections which are only
> loosely connected to interfaces (if you choose an interface IP address
> for the connection endpoint).  Do you mean the BGP has peering
> configured using the local IP addresses of the three ethernets in your
> scenario?
> 
> Which AS is the router that crashes in?
> 
> Your text says 5 routers, but I'm not sure where the 5th one is - the
> minimum needed to implement something like you describe is 4 (One each
> for AS1, AS2, AS3 and the router that crashes).  Where's the 5th one?
> 
> Also could you send the policy config you used to prevent route redistribution?
> 
> If we understood the scenario, we can build a test suite to tickle
> this problem, but right now I don't really know how to do this.
> 
>  - Mark
> 
> On Dec 3, 2007 10:21 AM, Arsi Antila <bbb999 at zerodistance.fi> wrote:
> > Is the following a known problem in XORP?
> >
> > Note: this was shown to me by someone else. I didn't test this myself,
> > so some of the details may be incorrect.
> >
> > XORP crashes when the same set of BGP routes is advertised from two
> > different routers connected to the same interface and the winning route
> > changes. Tested with VLANs, if-aliases and plain interfaces. Results do
> > not vary.
> >
> >
> > For example, configuration of the network is as follows:
> >
> > - device under test (Linux/Debian, XORP) and five simulated routers
> >
> > - AS 1 is advertised to ports eth1 and eth2
> >
> > - AS 2 is advertised to ports eth1 and eth2
> >
> > - AS 3 is advertised to port eth3
> >
> > - Policy rules so that AS 2 routes are not advertised to AS 1
> >
> > BGP process dies when one of the routers in AS 2 goes down and then up
> > so that the primary route in AS 2 changes.
> >
> >
> > Regards,
> > A.A.
> >


