[Xorp-users] Question on supporting multiple routing tables [PATCH]

Pavlin Radoslavov pavlin at icir.org
Thu Aug 30 11:09:26 PDT 2007


Ben Greear <greearb at candelatech.com> wrote:

> Here's a CVS diff that appears to add BINDTODEVICE support.  I still
> don't have OSPF working properly, but I am not sure if that is related
> to my configuration or socket binding.
> 
> Comments welcome.

I just committed the small delta for the XLOG_WARNING() message
itself (with slight variation of the printed output):

Revision  Changes                               Path
1.11      +3 -3;  commitid: 115a646d7000f7ea6; xorp/fea/data_plane/io/io_ip_socket.cc

About the setsockopt(SO_BINDTODEVICE) code itself, I believe it
should be moved to the IoIpSocket::proto_socket_transmit() method,
because that method is called from several places, and it is the one
that actually transmits the packet.

Based on earlier analyses, I believe that for transmission purposes
setsockopt(SO_BINDTODEVICE) is not needed for multicast packets,
because we already set the outgoing interface/vif for such packets.

Hence, IoIpSocket::proto_socket_transmit() should be modified like:

    if (dst_address.is_multicast()) {
        ....
    } else {
        // Unicast-related setting

        // Your setsockopt(SO_BINDTODEVICE) setting goes here
    }

The SO_BINDTODEVICE code can be isolated with
#ifdef SO_BINDTODEVICE
..
#endif

because it is available only on Linux.

Also, you could use "if_name.empty()" to check whether the interface
name is empty.
Finally, would it work if you replace "nl" with NULL when you try to
un-bind?
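
To make the suggestions above concrete, here is a minimal sketch of a
helper that the unicast branch could call. It is not the committed XORP
code: the function name set_unicast_out_device() and its return
convention are hypothetical. It uses if_name.empty()-style handling by
copying the name into a zero-filled IFNAMSIZ buffer, so an empty name
becomes an empty string, which socket(7) documents as removing the
binding. Whether a NULL pointer also works for un-binding is not
verified here, so the sketch avoids it.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

#include <sys/socket.h>
#include <net/if.h>

// Hypothetical helper (not part of XORP): bind fd to the interface named
// if_name, or remove any previous binding when if_name is empty.
// Returns 0 on success, -1 on failure with error_msg set.
int set_unicast_out_device(int fd, const std::string& if_name,
                           std::string& error_msg)
{
    // The name plus its terminating NUL must fit in IFNAMSIZ bytes.
    if (if_name.size() >= IFNAMSIZ) {
        error_msg = "interface name too long: " + if_name;
        return -1;
    }

#ifdef SO_BINDTODEVICE
    // A zero-filled buffer holds either the name or, for an empty
    // if_name, an empty string (which requests un-binding).
    char name[IFNAMSIZ];
    std::memset(name, 0, sizeof(name));
    std::memcpy(name, if_name.data(), if_name.size());

    if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, name, sizeof(name)) < 0) {
        error_msg = std::string("setsockopt(SO_BINDTODEVICE, \"") + if_name
            + "\") failed: " + std::strerror(errno);
        return -1;
    }
#else
    // No SO_BINDTODEVICE on this platform: nothing to do.
    (void)fd;
#endif

    return 0;
}
```

Inside the "else" (unicast) branch this could be called as
set_unicast_out_device(_proto_socket_out, if_name, error_msg). Passing a
fixed-size buffer also avoids reading IFNAMSIZ bytes from a shorter
c_str() result, as the posted diff does.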

Regards,
Pavlin

> Thanks,
> Ben
> 
> 
> Index: fea/data_plane/io/io_ip_socket.cc
> ===================================================================
> RCS file: /cvs/xorp/fea/data_plane/io/io_ip_socket.cc,v
> retrieving revision 1.10
> diff -u -r1.10 io_ip_socket.cc
> --- fea/data_plane/io/io_ip_socket.cc   26 Jul 2007 01:18:40 -0000      1.10
> +++ fea/data_plane/io/io_ip_socket.cc   30 Aug 2007 17:04:40 -0000
> @@ -1744,8 +1744,8 @@
>      if ((ifp == NULL) || (vifp == NULL)) {
>         // No vif found. Ignore this packet.
>         XLOG_WARNING("proto_socket_read() failed: "
> -                    "RX packet from %s to %s: no vif found",
> -                    cstring(src_address), cstring(dst_address));
> +                    "RX packet from %s to %s: no vif found, pif_index: %u",
> +                    cstring(src_address), cstring(dst_address), pif_index);
>         return;                 // Error
>      }
>      if (! (ifp->enabled() || vifp->enabled())) {
> @@ -2024,6 +2024,25 @@
>                 return (XORP_ERROR);
>             }
> 
> +#ifndef HOST_OS_WINDOWS
> +           if (if_name.c_str() && if_name.c_str()[0]) {
> +               if (setsockopt(_proto_socket_out, SOL_SOCKET, SO_BINDTODEVICE,
> +                              if_name.c_str(), IFNAMSIZ)) {
> +                   error_msg += c_format("setsockopt(SO_BINDTODEVICE, %s) failed: %s",
> +                                         if_name.c_str(), strerror(errno));
> +               }
> +           }
> +           else {
> +               // Un-bind just in case we were previously bound...
> +               char nl[1];
> +               nl[0] = 0;
> +               if (setsockopt(_proto_socket_out, SOL_SOCKET, SO_BINDTODEVICE, nl, 0)) {
> +                   error_msg += c_format("setsockopt(SO_BINDTODEVICE, NULL) failed: %s",
> +                                         strerror(errno));
> +               }
> +           }
> +#endif
> +
>             //
>             // Now hook the data
>             //
> 
> 
> Pavlin Radoslavov wrote:
> > Ben Greear <greearb at candelatech.com> wrote:
> >
> >   
> >>>> I'm sure you'll want to be able to bind the socket to a local IP, but
> >>>> if you want to leave out the SO_BINDTODEVICE I can test it and see
> >>>> if it works.  I can add the SO_BINDTODEVICE if needed and send you a patch.
> >>>>     
> >>>>         
> >>> Currently, we don't bind to the local IP (we do but in certain cases
> >>> only). I believe even if you bind to a local IP you cannot really
> >>> force the unicast IP packet to exit the system on the particular
> >>> interface. Anyway, I might be wrong here, so please let me know if
> >>> you find that bind()-ing only gives us the desired behavior.
> >>>   
> >>>       
> >> If you set up the routing tables and rules correctly, then binding to a
> >> local IP is probably sufficient.  If you are certain that you want the
> >> pkt to leave by a certain interface, then I don't think it can ever hurt
> >> to bind to that local IP, but just in case, it could also be a config
> >> option...
> >>     
> >>> BTW, what protocols are you planning to run? Without SO_BINDTODEVICE
> >>> we might have to use different solution for each type of
> >>> sockets/packets: raw IP packets, TCP, UDP.
> >>> FYI, the I/O system-specific stuff is inside fea/data_plane/io,
> >>> though io_tcpudp_socket.cc itself uses the xorp/libcomm wrapper
> >>> library.
> >>> Only BGP doesn't use the FEA (yet); it manages its own TCP connections
> >>> (inside bgp/socket.{hh,cc}).
> >>>
> >>> Also, there could be some gotchas with RIP's UDP socket, but let's
> >>> first address the protocols you are actually going to use.
> >>>   
> >>>       
> >> At a minimum, I want to support OSPF.  However, I'd like to have options to
> >> do other protocols as well.  In my own experience, binding UDP is very
> >> similar to binding TCP, but if you want some sample code I can post it.
> >> I'm not sure about raw IP packets.
> >>
> >> Also, for my application, it will always be running on Linux, so I can
> >> depend on SO_BINDTODEVICE being available...
> >>     
> >
> > I played a bit with SO_BINDTODEVICE, and here is what I found.
> > For the record, I am using Gentoo 2006.1 with kernel 2.6.20.1.
> >
> > > In my test I opened a UDP socket, then used
> > setsockopt(SO_BINDTODEVICE) to bind the socket to a specific
> > interface (eth1), and then used sendto() to transmit a single UDP packet.
> > At the end I am attaching the test program in case someone wants to play
> > with it.
> >
> > * If the destination address belongs to the same network interface
> >   that is used with SO_BINDTODEVICE (eth1), then the transmitted packets
> >   are sent over the loopback interface (lo) instead of the external
> >   (physical) interface (eth1).
> >
> > * If the destination address belongs to the same subnet as the
> >   network interface that is used with SO_BINDTODEVICE (eth1), then
> >   an ARP request is sent first by the kernel (as we would expect).
> >   If the ARP is resolved, then the UDP packet should follow.
> >
> > * For all other (i.e., remote) destination addresses or IP
> >   addresses that belong to some other interfaces of that host, an
> >   ARP request is sent first by the kernel for the destination
> >   address. If the ARP request is not answered (as it would be the
> >   case for a remote destination unless somebody else is acting as a
> >   proxy), then the UDP transmission will fail.
> >
> > > From the above observations, the interesting behavior (at least for
> > me) is that SO_BINDTODEVICE can be used to force a packet with
> > destination address that belongs to some other interface of that
> > host (e.g., eth0) to be transmitted over the specified interface
> > (eth1). Without SO_BINDTODEVICE such packets are transmitted over
> > the loopback interface (lo).
> >
> > I continued the testing by connecting eth0 directly with eth1, and
> > then used SO_BINDTODEVICE to see whether the UDP packet will be
> > actually sent out of interface eth1 to eth0's IP address.
> > It turned out that the preceding ARP request out of eth1 is never
> > answered by the eth0 interface (probably the kernel recognizes that
> > the origin of the ARP is that host itself so the ARP reply is
> > suppressed).
> > However, after playing a bit I was able to add the ARP entry by
> > hand, but it was a bit tricky:
> >
> > 1. Added the ARP entry, where xx:xx:xx:xx:xx:xx is the Ethernet
> > address of eth0. Note that it must be done before configuring
> > the IP address on eth0:
> > arp -s 10.0.0.3 xx:xx:xx:xx:xx:xx
> >
> > 2. Configure the IP address on eth0:
> > ip addr add 10.0.0.3/8 dev eth0
> >
> > Finally after that I was able to see the UDP packet going out on
> > eth1 with destination address the IP address on eth0.
> >
> > > In conclusion, if you are running 2+ virtual routing instances, each
> > > of them with separate forwarding tables in the kernel and a
> > > corresponding subset of the physical interfaces allocated to it, then
> > > it is possible to use SO_BINDTODEVICE to make sure that unicast packets
> > > transmitted by one virtual router are sent over the physical link
> > > to another virtual router, but it is tricky (see the above ARP
> > > hack).
> >
> > However, if we don't care that the unicast packets between the
> > virtual instances are transmitted over the loopback interface, then
> > we don't need SO_BINDTODEVICE: just bind(2)-ing to the outgoing
> > interface's IP address will be sufficient to guarantee the IP source
> > address is set to the desired value.
> >
> > Regards,
> > Pavlin
> >
> >   
> > ------------------------------------------------------------------------
> >
> > /* Test program to play with setsockopt(SO_BINDTODEVICE) on Linux */
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <string.h>
> >
> > #include <sys/socket.h>
> > #include <net/if.h>
> > > #include <netinet/in.h>
> > > #include <arpa/inet.h>	/* inet_addr() */
> >
> > #define MY_INTERFACE_NAME "eth1"
> > #define MY_LOCAL_ADDR "10.0.0.1"
> > #define MY_DST_ADDR "10.0.0.3"
> >
> > int
> > main()
> > {
> >     char ifname[IFNAMSIZ];
> >     int s;
> >     char data[50];
> >     struct sockaddr_in sin;
> >     socklen_t sin_len = sizeof(sin);
> >
> >     /* Init */
> >     strncpy(ifname, MY_INTERFACE_NAME, IFNAMSIZ);
> >     memset(data, 1, sizeof(data));
> >     s = socket(AF_INET, SOCK_DGRAM, 0);
> >
> >     /* Optionally: bind to a local IP address */
> > #if 0
> >     memset(&sin, 0, sizeof(sin));
> >     sin.sin_family = AF_INET;
> >     sin.sin_port = htons(0);
> >     sin.sin_addr.s_addr = inet_addr(MY_LOCAL_ADDR);
> >     if (bind(s, (struct sockaddr*)&sin, sin_len) != 0) {
> > 	perror("bind()");
> > 	exit(1);
> >     }
> > #endif
> >
> >     /* Bind to the interface */
> > #if 1
> >     if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, ifname, sizeof(ifname))
> > 	< 0) {
> > 	perror("setsockopt(SO_BINDTODEVICE)");
> > 	exit(1);
> >     }
> > #endif
> >
> >     /* Set the dest. address and port */
> >     memset(&sin, 0, sizeof(sin));
> >     sin.sin_family = AF_INET;
> >     sin.sin_port = htons(5000);
> >     sin.sin_addr.s_addr = inet_addr(MY_DST_ADDR);
> >
> >     /* Send the packet */
> > >     /* sendto() returns -1 on error, so compare as a signed value */
> > >     if (sendto(s, data, sizeof(data), 0, (struct sockaddr*)&sin, sin_len)
> > > 	!= (ssize_t)sizeof(data)) {
> > 	perror("sendto()");
> > 	exit(1);
> >     }
> >
> >     exit(0);
> > }
> >   
> 
> 
> -- 
> Ben Greear <greearb at candelatech.com> 
> Candela Technologies Inc  http://www.candelatech.com
> 
> 
> _______________________________________________
> Xorp-users mailing list
> Xorp-users at xorp.org
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/xorp-users


