[Xorp-users] Connecting XORP with 1000 Peers

Daniel Seidenstücker d.seidenstuecker at googlemail.com
Thu Feb 18 04:06:18 PST 2016


I think I have found the cause of the problem (see below), so I can probably
skip recompiling XORP with debugging symbols enabled.

Yes, XORP always crashes when 1000 peers start to connect. (As I wrote, 750
peers work.)

While searching the web for similar buffer overflows I found
http://www.serverphorums.com/read.php?10,680364. There, a program called
haproxy hit similar buffer overflows under high load with a large number of
open file descriptors, which matches our problem: our buffer overflow
appears only after raising the user limit for open files. The page also says
that select() causes the buffer overflow once the number of open files grows
larger than FD_SETSIZE (the Ubuntu default for both is 1024). That fits as
well: before I raised the user limit, it was equal to FD_SETSIZE and XORP
caused no buffer overflows. So all indications point to select(). If I
search the XORP code for select() (grep -rn "select(" xorp.ct-lf-5.3.2/) I
get, among other results, "Matches in binary file
xorp.ct-lf-5.3.2/xorp/obj/x86_64-unknown-linux-gnu/libxorp/libxorp_core.so."
(translated from the German grep output), and libxorp_core.so is the first
file named in the backtrace of the buffer overflow after glibc, which
provides select().
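
To illustrate the FD_SETSIZE limit described above, here is a minimal sketch
in Python (not XORP's C++ code). It assumes FD_SETSIZE is 1024, as on my
Ubuntu system; the constant ASSUMED_FD_SETSIZE and the helper select_accepts
are names made up for this example. Python's select module refuses
descriptors that do not fit into an fd_set and raises ValueError, whereas a C
program that passes such a descriptor to FD_SET would write past the end of
the set.

```python
import select

# Assumption: FD_SETSIZE is 1024 (the Ubuntu default mentioned above).
# Python's select module does not expose this constant.
ASSUMED_FD_SETSIZE = 1024


def select_accepts(fd):
    """Return True if select() is willing to watch this descriptor number."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        # The descriptor number does not fit into an fd_set.
        return False
    except OSError:
        # The number is in range, but the descriptor is not open.
        return True
    return True
```

With this helper, select_accepts(ASSUMED_FD_SETSIZE) should be False while
small descriptor numbers are accepted, so any process whose descriptor
numbers reach 1024 can no longer rely on select().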

I also work with Quagga and BIRD, which show the same problem (crashes with
1250 peers; raising the user limits or FD_SETSIZE doesn't help), and they
also use select(). That XORP already fails at 1000 peers can probably be
explained by XORP needing more file descriptors for things other than peers,
though this is speculation on my part.
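
One way to check this speculation would be to look at how many descriptors a
process really holds. The following is only a sketch under the assumption of
a Linux /proc filesystem; the function name fd_usage is made up for this
example. It reports the number of open descriptors and the highest
descriptor number, since the highest number is what matters for select().

```python
import os


def fd_usage(pid="self"):
    """Return (open descriptor count, highest descriptor number) for a process.

    Assumes Linux: /proc/<pid>/fd holds one entry per open descriptor.
    For pid="self" the listing includes the descriptor that os.listdir
    opens temporarily to read the directory, so the count is one too high.
    """
    numbers = [int(name) for name in os.listdir(f"/proc/{pid}/fd")]
    return len(numbers), max(numbers)
```

Calling fd_usage with the pid of a XORP process (and sufficient permissions)
while peers connect would show how quickly the descriptor numbers approach
1024.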

The commonly suggested solution for this select() problem is to switch to an
alternative event mechanism such as poll, epoll, libev or libevent.
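
As a rough sketch of what the poll alternative looks like, here is a small
Python example (again not XORP code; the helper name wait_readable is made
up for this example). poll() takes a list of descriptors instead of a
fixed-size bit set, so it is not bound by FD_SETSIZE. The sketch assumes a
Unix-like system where select.poll is available.

```python
import select
import socket


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets from socks that have data ready to read."""
    poller = select.poll()
    by_fd = {}
    for sock in socks:
        # Register each descriptor for "data available" events.
        poller.register(sock.fileno(), select.POLLIN)
        by_fd[sock.fileno()] = sock
    # poll() returns (fd, event mask) pairs for descriptors with activity.
    return [by_fd[fd] for fd, events in poller.poll(timeout_ms)
            if events & select.POLLIN]
```

For example, after creating a pair with socket.socketpair() and sending a
few bytes on one end, wait_readable should return the other end. A real
event loop would keep the poll object alive between calls instead of
rebuilding it each time.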

I hope this mail helps, and maybe you can fix XORP so that future users can
run it with a large number of peers. Since my deadline is the end of March, I
don't think a fix would arrive in time for my experiments.

Thanks,
Daniel

-----Original Message-----
From: Ben Greear [mailto:greearb at candelatech.com]
Sent: Tuesday, 16 February 2016 17:18
To: Daniel Seidenstücker; xorp-users at xorp.org
Subject: Re: [Xorp-users] Connecting XORP with 1000 Peers

Please recompile with debugging symbols enabled, that should give a better
backtrace and allow me a better chance to figure out where it is crashing.

Does it always crash in the same location in this test?

Thanks,
Ben


On 02/16/2016 03:59 AM, Daniel Seidenstücker wrote:
> Dear XORP community,
>
> Connecting XORP with 750 peers works well, but if I connect 1000, XORP
> becomes unstable, causes session drops, and never reaches 1000 established
> peers.
>
> Based on experience with similar problems in other implementations, I tried
> raising the open file limits in /etc/security/limits.conf. “cat
> /proc/<pid>/limits | grep files” confirms that XORP’s processes pick up the
> new limits. If I then try to connect 1000 peers, XORP crashes due to a
> buffer overflow (terminal output attached).
>
> Another thing that helped with other implementations was raising
> FD_SETSIZE to 65535 in /usr/include/linux/posix_types.h and
> /usr/include/x86_64-linux-gnu/bits/typesizes.h and recompiling the
> application. This apparently has no effect on XORP, since its behavior
> did not change at all.
>
> It would be nice if you could help me solve the following two problems:
>
> How can I fix the buffer overflow, and how can I get XORP to run with 1000
> peers?
>
> Thanks,
>
> Daniel Seidenstuecker

--
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



