[Xorp-users] Question on supporting multiple routing tables [PATCH]
Pavlin Radoslavov
pavlin at icir.org
Thu Aug 30 11:09:26 PDT 2007
Ben Greear <greearb at candelatech.com> wrote:
> Here's a CVS diff that appears to add BINDTODEVICE support. I still
> don't have OSPF working properly..but I am not sure if that is related
> to my configuration or socket binding.
>
> Comments welcome.
I just committed the small delta for the XLOG_WARNING() message
itself (with slight variation of the printed output):
Revision Changes Path
1.11 +3 -3; commitid: 115a646d7000f7ea6; xorp/fea/data_plane/io/io_ip_socket.cc
About the setsockopt(SO_BINDTODEVICE) code itself, I believe it
should be moved to the IoIpSocket::proto_socket_transmit() method,
because that method is called from several places, and it is the one
that actually transmits the packet.
Based on earlier analyses, I believe that for transmission purposes
setsockopt(SO_BINDTODEVICE) is not needed for multicast packets,
because we already set the outgoing interface/vif for such packets.
Hence, IoIpSocket::proto_socket_transmit() should be modified like:
    if (dst_address.is_multicast()) {
        ....
    } else {
        // Unicast-related setting
        // Your setsockopt(SO_BINDTODEVICE) setting goes here
    }
The SO_BINDTODEVICE code can be isolated with
    #ifdef SO_BINDTODEVICE
    ..
    #endif
because it is available only for Linux.
Also, you could use "if_name.empty()" to check whether the interface
name is empty.
Finally, would it work if you replaced "nl" with NULL when you try to
un-bind?
Regards,
Pavlin
> Thanks,
> Ben
>
>
> Index: fea/data_plane/io/io_ip_socket.cc
> ===================================================================
> RCS file: /cvs/xorp/fea/data_plane/io/io_ip_socket.cc,v
> retrieving revision 1.10
> diff -u -r1.10 io_ip_socket.cc
> --- fea/data_plane/io/io_ip_socket.cc 26 Jul 2007 01:18:40 -0000 1.10
> +++ fea/data_plane/io/io_ip_socket.cc 30 Aug 2007 17:04:40 -0000
> @@ -1744,8 +1744,8 @@
> if ((ifp == NULL) || (vifp == NULL)) {
> // No vif found. Ignore this packet.
> XLOG_WARNING("proto_socket_read() failed: "
> - "RX packet from %s to %s: no vif found",
> - cstring(src_address), cstring(dst_address));
> + "RX packet from %s to %s: no vif found, pif_index: %u",
> + cstring(src_address), cstring(dst_address), pif_index);
> return; // Error
> }
> if (! (ifp->enabled() || vifp->enabled())) {
> @@ -2024,6 +2024,25 @@
> return (XORP_ERROR);
> }
>
> +#ifndef HOST_OS_WINDOWS
> + if (if_name.c_str() && if_name.c_str()[0]) {
> + if (setsockopt(_proto_socket_out, SOL_SOCKET, SO_BINDTODEVICE,
> + if_name.c_str(), IFNAMSIZ)) {
> + error_msg += c_format("setsockopt(SO_BINDTODEVICE, %s) failed: %s",
> + if_name.c_str(), strerror(errno));
> + }
> + }
> + else {
> + // Un-bind just in case we were previously bound...
> + char nl[1];
> + nl[0] = 0;
> + if (setsockopt(_proto_socket_out, SOL_SOCKET, SO_BINDTODEVICE, nl, 0)) {
> + error_msg += c_format("setsockopt(SO_BINDTODEVICE, NULL) failed: %s",
> + strerror(errno));
> + }
> + }
> +#endif
> +
> //
> // Now hook the data
> //
>
>
> Pavlin Radoslavov wrote:
> > Ben Greear <greearb at candelatech.com> wrote:
> >
> >
> >>>> I'm sure you'll want to be able to bind the socket to a local IP, but
> >>>> if you want to leave out the SO_BINDTODEVICE I can test it and see
> >>>> if it works. I can add the SO_BINDTODEVICE if needed and send you a patch.
> >>>>
> >>>>
> >>> Currently, we don't bind to the local IP (we do but in certain cases
> >>> only). I believe even if you bind to a local IP you cannot really
> >>> force the unicast IP packet to exit the system on the particular
> >>> interface. Anyway, I might be wrong here, so please let me know if
> >>> you find that bind()-ing only gives us the desired behavior.
> >>>
> >>>
> >> If you set up the routing tables and rules correctly, then binding to a
> >> local IP is probably sufficient. If you are certain that you want the pkt
> >> to leave by a certain interface, then I don't think it can ever hurt to
> >> bind to that local IP, but just in case, it could also be a config option...
> >>
> >>> BTW, what protocols are you planning to run? Without SO_BINDTODEVICE
> >>> we might have to use different solution for each type of
> >>> sockets/packets: raw IP packets, TCP, UDP.
> >>> FYI, the I/O system-specific stuff is inside fea/data_plane/io,
> >>> though io_tcpudp_socket.cc itself uses the xorp/libcomm wrapper
> >>> library.
> >>> Only BGP doesn't use the FEA (yet) and does its own TCP connection
> >>> (inside bgp/socket.{hh,cc}).
> >>>
> >>> Also, there could be some gotchas with RIP's UDP socket, but let's
> >>> address first the protocols you are actually going to use.
> >>>
> >>>
> >> At a minimum, I want to support OSPF. However, I'd like to have options to
> >> do other protocols as well. In my own experience, binding UDP is very
> >> similar to binding TCP, but if you want some sample code I can post it.
> >> I'm not sure about raw IP packets.
> >>
> >> Also, for my application, it will always be running on Linux, so I can
> >> depend on SO_BINDTODEVICE being available...
> >>
> >
> > I played a bit with SO_BINDTODEVICE, and here is what I found.
> > For the record, I am using Gentoo 2006.1 with kernel 2.6.20.1.
> >
> > In my test I opened a UDP socket, then used
> > setsockopt(SO_BINDTODEVICE) to bind the socket to a specific
> > interface (eth1), and then used sendto() to transmit a single UDP packet.
> > At the end I am attaching the test program in case someone wants to play
> > with it.
> >
> > * If the destination address belongs to the same network interface
> > that is used with SO_BINDTODEVICE (eth1), then the transmitted packets
> > are sent over the loopback interface (lo) instead of the external
> > (physical) interface (eth1).
> >
> > * If the destination address belongs to the same subnet as the
> > network interface that is used with SO_BINDTODEVICE (eth1), then
> > an ARP request is sent first by the kernel (as we would expect).
> > If the ARP is resolved, then the UDP packet should follow.
> >
> > * For all other (i.e., remote) destination addresses or IP
> > addresses that belong to some other interfaces of that host, an
> > ARP request is sent first by the kernel for the destination
> > address. If the ARP request is not answered (as it would be the
> > case for a remote destination unless somebody else is acting as a
> > proxy), then the UDP transmission will fail.
> >
> > From the above observations, the interesting behavior (at least for
> > me) is that SO_BINDTODEVICE can be used to force a packet with
> > destination address that belongs to some other interface of that
> > host (e.g., eth0) to be transmitted over the specified interface
> > (eth1). Without SO_BINDTODEVICE such packets are transmitted over
> > the loopback interface (lo).
> >
> > I continued the testing by connecting eth0 directly with eth1, and
> > then used SO_BINDTODEVICE to see whether the UDP packet will be
> > actually sent out of interface eth1 to eth0's IP address.
> > It turned out that the preceding ARP request out of eth1 is never
> > answered by the eth0 interface (probably the kernel recognizes that
> > the origin of the ARP is that host itself so the ARP reply is
> > suppressed).
> > However, after playing a bit I was able to add the ARP entry by
> > hand, but it was a bit tricky:
> >
> > 1. Added the ARP entry, where xx:xx:xx:xx:xx:xx is the Ethernet
> > address of eth0. Note that it must be done before configuring
> > the IP address on eth0:
> > arp -s 10.0.0.3 xx:xx:xx:xx:xx:xx
> >
> > 2. Configure the IP address on eth0:
> > ip addr add 10.0.0.3/8 dev eth0
> >
> > Finally after that I was able to see the UDP packet going out on
> > eth1 with destination address the IP address on eth0.
> >
> > In conclusion, if you are running 2+ virtual routing instances, each
> > of them with a separate forwarding table in the kernel and a
> > corresponding allocated subset of the physical interfaces, then it
> > is possible to use SO_BINDTODEVICE to make sure that unicast packets
> > transmitted by one virtual router are sent over the physical link
> > to another virtual router, but it is tricky (see the above ARP
> > hack).
> >
> > However, if we don't care that the unicast packets between the
> > virtual instances are transmitted over the loopback interface, then
> > we don't need SO_BINDTODEVICE: just bind(2)-ing to the outgoing
> > interface's IP address will be sufficient to guarantee the IP source
> > address is set to the desired value.
> >
> > Regards,
> > Pavlin
> >
> >
> > ------------------------------------------------------------------------
> >
> > /* Test program to play with setsockopt(SO_BINDTODEVICE) on Linux */
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <string.h>
> >
> > #include <sys/socket.h>
> > #include <net/if.h>
> > #include <netinet/in.h>
> > #include <arpa/inet.h> /* inet_addr() */
> >
> > #define MY_INTERFACE_NAME "eth1"
> > #define MY_LOCAL_ADDR "10.0.0.1"
> > #define MY_DST_ADDR "10.0.0.3"
> >
> > int
> > main()
> > {
> > char ifname[IFNAMSIZ];
> > int s;
> > char data[50];
> > struct sockaddr_in sin;
> > socklen_t sin_len = sizeof(sin);
> >
> > /* Init */
> > strncpy(ifname, MY_INTERFACE_NAME, IFNAMSIZ);
> > memset(data, 1, sizeof(data));
> > s = socket(AF_INET, SOCK_DGRAM, 0);
> >
> > /* Optionally: bind to a local IP address */
> > #if 0
> > memset(&sin, 0, sizeof(sin));
> > sin.sin_family = AF_INET;
> > sin.sin_port = htons(0);
> > sin.sin_addr.s_addr = inet_addr(MY_LOCAL_ADDR);
> > if (bind(s, (struct sockaddr*)&sin, sin_len) != 0) {
> > perror("bind()");
> > exit(1);
> > }
> > #endif
> >
> > /* Bind to the interface */
> > #if 1
> > if (setsockopt(s, SOL_SOCKET, SO_BINDTODEVICE, ifname, sizeof(ifname))
> > < 0) {
> > perror("setsockopt(SO_BINDTODEVICE)");
> > exit(1);
> > }
> > #endif
> >
> > /* Set the dest. address and port */
> > memset(&sin, 0, sizeof(sin));
> > sin.sin_family = AF_INET;
> > sin.sin_port = htons(5000);
> > sin.sin_addr.s_addr = inet_addr(MY_DST_ADDR);
> >
> > /* Send the packet */
> > /* Compare as signed: a -1 error return must not be promoted to size_t */
> > if (sendto(s, data, sizeof(data), 0, (struct sockaddr*)&sin, sin_len)
> > != (ssize_t)sizeof(data)) {
> > perror("sendto()");
> > exit(1);
> > }
> >
> > exit(0);
> > }
> >
>
>
> --
> Ben Greear <greearb at candelatech.com>
> Candela Technologies Inc http://www.candelatech.com
>
>
> _______________________________________________
> Xorp-users mailing list
> Xorp-users at xorp.org
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/xorp-users