From eshe168 at gmail.com Tue Sep 1 01:50:39 2009
From: eshe168 at gmail.com (Xiaoshuai Yang)
Date: Tue, 1 Sep 2009 16:50:39 +0800
Subject: [Xorp-hackers] Bug report.
In-Reply-To: <4A9BC77F.2070504@incunabulum.net>
References: <56f9e0990810282312q35807459r1310ae268a6b14ca@mail.gmail.com>
 <4A9BC77F.2070504@incunabulum.net>
Message-ID: <56f9e0990909010150t18e4e166lf0da59dc5ee88989@mail.gmail.com>

Sorry about this. Because of work commitments, I have not studied XORP
for a long time, and I cannot remember the bug.

B.R.
Xiaoshuai Yang

2009/8/31 Bruce Simpson :
> Xiaoshuai Yang wrote:
>>
>> When you enter a "xxx xxxx" value into a txt node, xorpsh will
>> accept it. But when you use show -all in the CLI, the information on
>> your screen is wrong. Apparently, text should not be "\"abc cba\"".
>>
>
> Sorry for the late reply. Can you please raise a bug report about this
> issue?
>
> We are moving to a new bug tracking system on SourceForge:
> https://sourceforge.net/apps/trac/xorp/
>
> ...the bugzilla.xorp.org alias now points to this page.
> > You will need to create a SourceForge account to submit bugs using Trac: > https://sourceforge.net/account/registration/ > > thanks, > BMS > -- Best Regard Xiaoshuai Yang From bms at incunabulum.net Tue Sep 1 06:04:03 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Tue, 01 Sep 2009 14:04:03 +0100 Subject: [Xorp-hackers] BSR Restart In-Reply-To: <200810301655.m9UGtmwk007829@fruitcake.ICSI.Berkeley.EDU> References: <48FFA211.4070406@datacom.ind.br> <200810230057.m9N0vRsv019500@fruitcake.ICSI.Berkeley.EDU> <200810230741.m9N7fwFZ002600@fruitcake.ICSI.Berkeley.EDU> <4900B0FC.4050801@datacom.ind.br> <200810240016.m9O0GfwA009647@fruitcake.ICSI.Berkeley.EDU> <490207B7.9050900@datacom.ind.br> <200810250017.m9P0GVsF027687@fruitcake.ICSI.Berkeley.EDU> <49063A15.4030202@datacom.ind.br> <200810280058.m9S0wrOd010897@fruitcake.ICSI.Berkeley.EDU> <49086A1E.8080107@datacom.ind.br> <200810300001.m9U01NRQ003865@fruitcake.ICSI.Berkeley.EDU> <4909B3BA.1050208@datacom.ind.br> <200810301655.m9UGtmwk007829@fruitcake.ICSI.Berkeley.EDU> Message-ID: <4A9D1BC3.9040302@incunabulum.net> Hi all, I've committed Pavlin's reworked change (with fixups) for PIM-SM runtime bootstrap router configuration, to XORP public SVN, as revision 11529 The change to use the apply_bsr_changes RPC method is currently commented out in the Router Manager template files in revision 11530. The code compiles correctly on FreeBSD 7.2-STABLE with SCons and g++ 4.2.1. I've reviewed the code and it appears to be correct, however, we will be relying on users of XORP's PIM-SM to test this change before we activate it in the main code base. thanks, BMS From bms at incunabulum.net Tue Sep 1 06:36:39 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Tue, 01 Sep 2009 14:36:39 +0100 Subject: [Xorp-hackers] Patch to FEA to support multiple xorp instances on Linux. 
In-Reply-To: <4A981EEF.2090807@candelatech.com>
References: <4A981EEF.2090807@candelatech.com>
Message-ID: <4A9D2367.10104@incunabulum.net>

Hi Ben,

Thanks for your FEA patch. Unfortunately, I don't believe that we can
take this patch as-is just at the moment. A good deal of reworking will
be required.

Ben Greear wrote:
> Attached is a patch for just my FEA changes (and a few related bits to
> make that
> function properly).

A few technical comments:

 * Setting SO_REUSEADDR unconditionally on UDP socket binds should not
   be needed across all operating systems. BSD-derived networking
   stacks, for example, should support SO_REUSEPORT; and binding the
   socket on such systems is a no-no (the laddr and faddr part of the
   tuple should not be bound, this breaks the kernel's lookup and
   traffic will not be received).

 * It would be helpful if the Linux specific behaviour, which apparently
   requires SO_REUSEADDR, could be contained to the FEA. You should be
   able to call comm_set_reuseaddr() conditionally from within the
   relevant FEA code, instead of performing it in libcomm.

 * For an example, look at the IoTcpUdpSocket::udp_open_bind_broadcast()
   method in the FEA. This also contains an example of how to deal with
   SO_BINDTODEVICE in a portable way.

 * I'm concerned that as we plan to move to Boost.ASIO in the future,
   that having platform hacks in libcomm is going to pollute the code,
   and make it difficult to transition. I noticed that Boost.ASIO's
   socket support is lacking in a few places, and we will have to push
   changes back to them incrementally; preserving the existing code flow
   in the FEA will make this much easier.

 * Using a socket per interface is a good idea in principle; not all
   systems support the BSD behaviour of requesting IP_RECVIF information
   as an out-of-band control message with recvmsg().

 * Also, as libcomm is a C API, code which invokes it shouldn't be
   relying on C++ booleans being casted to int values; please use
   explicit integer values for libcomm calls.

 * As discussed elsewhere, we can't take changes which specify the
   device name explicitly in the APIs just yet, as this is likely to
   break portability. However, some refinements could be made here;
   please read on.

 * I notice that the patch makes changes to multicast socket behaviour.
   What would be ideal is a change which conditionally uses the RFC 3678
   (Socket Interface Extensions for Multicast) APIs, as these are
   supported by up-to-date operating systems, and do allow multicast
   groups to be joined by specifying the link explicitly (which is the
   correct behaviour), not the protocol address of the link. We also
   need to support RFC 3678 for dealing with unnumbered interfaces, and
   IPv6, in a forwards compatible way.

 * We do need to parse RTA attributes in the Netlink socket support, to
   be able to deal with features such as ECMP. A good starting point
   would be to rework this change, and keep it simple, but more general.

 * Have you seen any alignment errors with the reinterpret_cast<> used
   herein? A test run of compiling for an embedded target would be
   really useful. I want to avoid introducing alignment constraint
   violations into the code; some of the offending code has been cleaned
   up, however, this remains in a few places.

 * The method to leave all multicast groups on a link, in the FEA, is a
   useful addition.

 * Can you elaborate further on the nature of the race conditions you
   seem to be experiencing with the FEA on Linux (based on your code
   comments) ?

Style comments:

 * Try to preserve whitespace and indentation style wherever possible.

 * Please don't use camelCase, this is a style bug.

 * Block comments should be /* ... */ wherever possible.

 * Try to avoid // at the end of a block, unless the nesting is
   non-obvious from indentation.

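To make two of the technical comments above more concrete (choosing the
reuse option per platform rather than setting SO_REUSEADDR everywhere,
and passing explicit integers to C socket calls), here is a minimal
sketch in C. It is only an illustration under stated assumptions: the
example_udp_* names and the EXAMPLE_REUSE_OPT macro are hypothetical and
are not part of libcomm or the FEA, and the platform test is a guess at
what a conditional call from FEA code might look like.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helpers for illustration only; these are not libcomm or
 * FEA functions.
 */

/* Pick the reuse option per platform rather than assuming SO_REUSEADDR. */
#if !defined(__linux__) && defined(SO_REUSEPORT)
#define EXAMPLE_REUSE_OPT SO_REUSEPORT /* BSD-derived stacks */
#else
#define EXAMPLE_REUSE_OPT SO_REUSEADDR /* Linux, or no SO_REUSEPORT */
#endif

/* Enable the option on fd. Returns 0 on success, -1 on error. */
int
example_udp_set_reuse(int fd)
{
    int on = 1; /* explicit integer value, not a converted C++ bool */

    return (setsockopt(fd, SOL_SOCKET, EXAMPLE_REUSE_OPT, &on, sizeof(on)));
}

/* Read the option back. Returns 1 if set, 0 if clear, -1 on error. */
int
example_udp_get_reuse(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, EXAMPLE_REUSE_OPT, &val, &len) < 0)
        return (-1);
    return (val != 0);
}
```

A caller would create the UDP socket, call example_udp_set_reuse(fd)
before bind(), and treat a -1 return as an error. Keeping the platform
choice in one macro is one way to keep such special cases out of shared
code.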
I appreciate this may be frustrating, given you have clearly put a lot of hard work into your changes, but I need to ensure that any patch which makes such wide-ranging changes to the system, preserves code style, portability, and correctness. When writing code like this, I find it very helpful to have a copy of UNIX Network Programming handy: http://www.kohala.com/start/unpv12e.html ...it points out the differences between the main implementations, and has a number of useful illustrations which show how the APIs fit together. It is no substitute for experience, however, it is an excellent desk reference and tutorial book. Thanks for the SCons RTA_TABLE change, this has already been checked into SVN, and I hope it helps minimize diffs between your own fork, and XORP SVN. best regards, BMS From bms at incunabulum.net Tue Sep 1 06:43:23 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Tue, 01 Sep 2009 14:43:23 +0100 Subject: [Xorp-hackers] Fix for PIM task list hang In-Reply-To: <4A4A8AD9.2020107@candelatech.com> References: <4A4A8AD9.2020107@candelatech.com> Message-ID: <4A9D24FB.7090807@incunabulum.net> Ben, Thanks for this change. As of today, I've applied a very small portion of it, by introducing debug_msg() calls into the path(s) where you've added XLOG warnings. Ben Greear wrote: > On some error conditions related to interface removal, the PIM > callbacks would > not handle the next task, and so nothing would ever look at the task > queue > again, effectively hanging the multicast routing daemon. I think we need to look very carefully at changes which affect the flow of RPC calls in and out of PIM, as we are gearing up for significant refactoring in that area. Were you able to pin the task list hang down to a specific PIM RPC call or set of events? It could be argued that failure of the Finder, still shouldn't be regarded as a purely transient failure. 
This is especially the case, if we're in a situation where we're using in-flight shared memory and user-space synchronization mechanisms (e.g. futex, umtx) to control access to that shared memory, so I'd err on the side of the conservative, and not commit this change in full for now. thanks, BMS From bms at incunabulum.net Tue Sep 1 06:45:58 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Tue, 01 Sep 2009 14:45:58 +0100 Subject: [Xorp-hackers] vif indeces In-Reply-To: <4A091C10.6020602@incunabulum.net> References: <626e40340905110117t5512b26an25672366b7b22114@mail.gmail.com> <626e40340905110124n5f8e56dap40b0e24b3023db49@mail.gmail.com> <4A0907CD.2060103@incunabulum.net> <626e40340905112319q3d269b5fi32743e80bc3d37db@mail.gmail.com> <4A091C10.6020602@incunabulum.net> Message-ID: <4A9D2596.1090708@incunabulum.net> Hi there, Given that we hope to refactor XRL in the future, can you provide any suggestions, based on your experience, of how we can expose MFEA state to client processes? This would be very useful, as this can then be implemented in the refactored code. Bruce Simpson wrote: >> Which should be >> authority for vif indeces, or furthermore between vif names and >> indeces? Are there specific situations in which this should be a >> consideration? >> > ... > Unfortunately, in extending IGMP, you've found a weak spot in the API > where the MFEA VIF index has to be exposed for the XRLs to be useful. Up > until now, this hasn't been an issue, because normally only PIM and IGMP > interact with the MFEA directly. > thanks, BMS From bms at incunabulum.net Tue Sep 1 07:23:26 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Tue, 01 Sep 2009 15:23:26 +0100 Subject: [Xorp-hackers] fix assert in selector.cc In-Reply-To: <49ED644C.10904@candelatech.com> References: <49ED644C.10904@candelatech.com> Message-ID: <4A9D2E5E.3000708@incunabulum.net> Ben, Thanks for your patch. Can you clarify how/when this condition is triggered, with a full backtrace please? 
It would be very helpful to get a root cause analysis on this condition you are observing, before a 1.7-RC. Ben Greear wrote: > It's possible this is caused by some of my own patches, but I think it's > a logic > bug: If the only fd ISSET has infinite priority, it will not be chosen and > will hit the assert at the bottom of the method. XorpTask::PRIORITY_INFINITY seems to mean 'this task has infinitely low priority', and therefore should not be run. Using KScope, I do not see any place in the code where a XorpTask is being explicitly instantiated with PRIORITY_INFINITY. Perhaps the bug lies elsewhere, and the assertion is triggered for some other reason? It is likely that the EventLoop/SelectorList code would be completely replaced by the boost::asio::io_service reactor class when Boost is introduced. > For me, simply calling: xorpsh > when no rtr-mgrs are running causes the crash. > I was unable to reproduce this condition in FreeBSD 7.2-STABLE, g++ 4.2.1, with the latest SVN trunk. The xorpsh simply exits after the timeout as expected, with the usual error messages: %%% anglepoise# /usr/local/xorp/bin/xorpsh Waiting for xorp_rtrmgr... [ 2009/09/01 15:21:50 ERROR xorpsh:26879 RTRMGR +96 rtrmgr/xorpsh_main.cc wait_for_xrl_router_ready ] XrlRouter failed. No Finder? [ 2009/09/01 15:21:50 ERROR xorpsh:26879 RTRMGR +905 rtrmgr/xorpsh_main.cc main ] xorpsh exiting due to an init error: Failed to connect to the router manager %%% This is using a shared library build, although that should not make any difference. thanks, BMS From bms at incunabulum.net Tue Sep 1 07:29:37 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Tue, 01 Sep 2009 15:29:37 +0100 Subject: [Xorp-hackers] Patches integrated; call for testers Message-ID: <4A9D2FD1.2040203@incunabulum.net> Hi all, I've integrated, as far as possible, any user-contributed patches submitted to the xorp-hackers@ mailing list since XORP 1.6 was released. 
I will try to make a pass over the company Bugzilla for any low-hanging fruit. If you have patches, now is a good time to air them out for review and incorporation. We would like to roll a 1.7-RC, however, we will be more dependent on user contribution than before, as the XORP, Inc. company resources are focused on delivering product for the moment. It would be my hope in future that we can see more participation from the company developers in this forum, however, this will only be possible if they are able to deliver and launch commercial product. In order to do this, we will probably have to call a code freeze some time this month. One of the things I want to get started on is integrating Boost and Thrift into the main code base. It's difficult with only one hand on deck (myself) on the community branch to do this, whilst we have changes in-flight, and whilst the code is a moving target. To minimize churn, I tend to fork feature branches and re-incorporate them further on; both the Windows port, and OLSR code, were developed in this way. I hope you can appreciate that the time I can spend on email support is limited, and that it is on a best-effort basis. thanks, BMS From marco.canini at epfl.ch Tue Sep 1 09:01:23 2009 From: marco.canini at epfl.ch (Canini Marco) Date: Tue, 1 Sep 2009 18:01:23 +0200 Subject: [Xorp-hackers] RIB serialization In-Reply-To: References: <4A9C1076.4030604@incunabulum.net> Message-ID: In the end, I had to remove the constness of the Payload type in the Trie but that didn't require many changes overall. Now I'm successful in serializing instances of RIB along with all the RouteTable derivate classes. Unfortunately, for certain RouteTables at deserialization time I need to have references to objects of type EventLoop and RibManager. Are these two classes singleton? Thanks Marco Canini, Ph.D. 
EPFL, Networked Systems Laboratory > -----Original Message----- > From: xorp-hackers-bounces at icir.org [mailto:xorp-hackers- > bounces at icir.org] On Behalf Of Canini Marco > Sent: Monday, August 31, 2009 8:22 PM > To: Bruce Simpson > Cc: xorp-hackers at icir.org > Subject: Re: [Xorp-hackers] RIB serialization > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi Bruce, > > at the moment I'm hacking on xorp 1.6. To enable boost support I used > the m4 macros from the autoconf archive > (http://www.nongnu.org/autoconf-archive/). > If I read you correctly now xorp uses scons as the build system though. > > I'll try to avoid changing the constness then. I hope I hit a bug in > boost 1.34 and with a newer version I can get around this problem. > > Thanks > > Marco Canini, Ph.D. > Networked Systems Laboratory, EPFL > > > > -----Original Message----- > > From: Bruce Simpson [mailto:bms at incunabulum.net] > > Sent: 31 August 2009 20:04 > > To: Canini Marco > > Cc: xorp-hackers at icir.org > > Subject: Re: [Xorp-hackers] RIB serialization > > > > Hi Marco, > > > > Thanks for looking into this. We do plan to introduce Boost > > incrementally to the source tree, however this effort is still at a > > very > > early stage. > > > > I am curious if you're using the SVN code, as I have not yet > committed > > my changes to the SCons build glue for XORP to detect Boost's > libraries > > and headers. > > > > You might have better luck with your immediate issue by raising the > > question on the Boost-Users list. Yes, changing the constness of the > > Trie members is likely to cause a lot of code churn. 
> > > > thanks, > > BMS > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.9 (MingW32) > > iEYEARECAAYFAkqcFMcACgkQK52QDm/mFCmglQCfZa8tcEd9oHWzIkhy7UPamAmq > EIsAn1lQ+eNohkJfd6sqXzS4Et9megtq > =NfC0 > -----END PGP SIGNATURE----- > > _______________________________________________ > Xorp-hackers mailing list > Xorp-hackers at icir.org > http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/xorp-hackers From bms at incunabulum.net Tue Sep 1 09:43:25 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Tue, 01 Sep 2009 17:43:25 +0100 Subject: [Xorp-hackers] RIB serialization In-Reply-To: References: <4A9C1076.4030604@incunabulum.net> Message-ID: <4A9D4F2D.1060902@incunabulum.net> Canini Marco wrote: > In the end, I had to remove the constness of the Payload type in the Trie but that didn't require many changes overall. > > Now I'm successful in serializing instances of RIB along with all the RouteTable derivate classes. > Excellent stuff... We will need to review carefully. I simply haven't had time to look at Boost.Serialization yet, as you can probably tell from all the list traffic. Which platform(s) are you targeting? > Unfortunately, for certain RouteTables at deserialization time I need to have references to objects of type EventLoop and RibManager. > > Are these two classes singleton? > Off the top of my head, Eventloop yes, RibManager probably. If you spot any obvious Boost refactorings, e.g. using shared_ptr/weak_ptr, tuple etc, feel free to submit, such contributions are very, very welcome... From greearb at candelatech.com Tue Sep 1 10:54:14 2009 From: greearb at candelatech.com (Ben Greear) Date: Tue, 01 Sep 2009 10:54:14 -0700 Subject: [Xorp-hackers] Patch to FEA to support multiple xorp instances on Linux. 
In-Reply-To: <4A9D2367.10104@incunabulum.net>
References: <4A981EEF.2090807@candelatech.com>
 <4A9D2367.10104@incunabulum.net>
Message-ID: <4A9D5FC6.8070709@candelatech.com>

Bruce Simpson wrote:
>
> * The method to leave all multicast groups on a link, in the FEA, is a
> useful addition.
>
> * Can you elaborate further on the nature of the race conditions you
> seem to be experiencing with the FEA on Linux (based on your code
> comments) ?

I posted this years ago in far more detail. Basically, interfaces can
disappear at any time, so no code should panic if an interface suddenly
disappears. I posted specific scenarios back then, but the main point is
just that you can't depend on interfaces existing at any particular
moment, so any assert or logic based on that false assumption is a bug.

> I appreciate this may be frustrating, given you have clearly put a lot
> of hard work into your changes, but I need to ensure that any patch
> which makes such wide-ranging changes to the system, preserves code
> style, portability, and correctness.

Maybe you can take what parts you consider an improvement. I'll merge
with the results and try to re-work specific portions to meet your
suggestions if that seems possible, send a new patch, etc.

The upstream code branch has been virtually frozen for years. I think if
my changes are to be merged, we should merge them all (or at least
most), and then start hacking on them to fix style issues, etc. Let
other people merge their changes too. Then we will have a potentially
unstable build, but everyone will be testing & developing on the same
code base.

I *can* promise to work very hard to fix any issues that result from my
patches, and I can do some heavy regression testing on Linux using
virtual networks (including dynamic ones) with up to around 50 xorp
instances communicating with each other. I cannot do testing or
development on other platforms at this time.

If you do not want to merge thus, I understand, but in that case I
believe I'll just continue to use my existing code fork. I just don't
have the time or interest to do all the heavy lifting to break up these
patches.

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From greearb at candelatech.com Tue Sep 1 11:09:33 2009
From: greearb at candelatech.com (Ben Greear)
Date: Tue, 01 Sep 2009 11:09:33 -0700
Subject: [Xorp-hackers] Patches integrated; call for testers
In-Reply-To: <4A9D2FD1.2040203@incunabulum.net>
References: <4A9D2FD1.2040203@incunabulum.net>
Message-ID: <4A9D635D.3010301@candelatech.com>

Bruce Simpson wrote:
> Hi all,
>
> It's difficult with only one hand on deck (myself) on the community
> branch to do this, whilst we have changes in-flight, and whilst the code
> is a moving target. To minimize churn, I tend to fork feature branches
> and re-incorporate them further on; both the Windows port, and OLSR
> code, were developed in this way.
>

Considering the small developer and testing base, I think forking would
be a bad idea. I think we should merge all that we possibly can and then
have as many people as possible test it.

I won't harp on this more, however :)

Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From greearb at candelatech.com Tue Sep 1 11:16:17 2009
From: greearb at candelatech.com (Ben Greear)
Date: Tue, 01 Sep 2009 11:16:17 -0700
Subject: [Xorp-hackers] fix assert in selector.cc
In-Reply-To: <4A9D2E5E.3000708@incunabulum.net>
References: <49ED644C.10904@candelatech.com> <4A9D2E5E.3000708@incunabulum.net>
Message-ID: <4A9D64F1.5060900@candelatech.com>

Bruce Simpson wrote:
> Ben,
>
> Thanks for your patch. Can you clarify how/when this condition is
> triggered, with a full backtrace please?
>
> It would be very helpful to get a root cause analysis on this
> condition you are observing, before a 1.7-RC.
> > Ben Greear wrote: >> It's possible this is caused by some of my own patches, but I think >> it's a logic >> bug: If the only fd ISSET has infinite priority, it will not be >> chosen and >> will hit the assert at the bottom of the method. > > XorpTask::PRIORITY_INFINITY seems to mean 'this task has infinitely > low priority', and therefore should not be run. Using KScope, I do not > see any place in the code where a XorpTask is being explicitly > instantiated with PRIORITY_INFINITY. Perhaps the bug lies elsewhere, > and the assertion is triggered for some other reason? Maybe so. The backtrace from this was near useless because it doesn't show what poked the events into the main loop. > > It is likely that the EventLoop/SelectorList code would be completely > replaced by the boost::asio::io_service reactor class when Boost is > introduced. I'll see if I can reproduce the bug with the svn tree on Linux. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From greearb at candelatech.com Tue Sep 1 11:24:00 2009 From: greearb at candelatech.com (Ben Greear) Date: Tue, 01 Sep 2009 11:24:00 -0700 Subject: [Xorp-hackers] Who is doing what with Xorp? Message-ID: <4A9D66C0.9000104@candelatech.com> I thought it might be nice to see a show of hands as to who is actively using the Xorp SVN tree (or who might start doing so shortly). Please describe what you are doing and to what extent. I'll go first: We use xorp, with our virtualization patches, to provide virtual router emulation in Linux. We have a tool that configures xorp instances automatically and runs multiple instances of xorp on a single Linux OS. Each xorp has it's own routing table and own set of interfaces it pays attention to. We use OSPF (IPv4, v6), multicast (IPv4 only currently), BGP, RIP, and OLSR. We can easily do regression tests with dynamic networks (links come and go, link connections degrade (pkt loss, latency, etc). 
We use the 'veth' device in Linux to connect xorp instances, and some of our own proprietary network emulation logic for the impairments. We do not use Xorp on any platform besides Linux. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From greearb at candelatech.com Tue Sep 1 17:10:55 2009 From: greearb at candelatech.com (Ben Greear) Date: Tue, 01 Sep 2009 17:10:55 -0700 Subject: [Xorp-hackers] fix assert in selector.cc In-Reply-To: <4A9D64F1.5060900@candelatech.com> References: <49ED644C.10904@candelatech.com> <4A9D2E5E.3000708@incunabulum.net> <4A9D64F1.5060900@candelatech.com> Message-ID: <4A9DB80F.70401@candelatech.com> On 09/01/2009 11:16 AM, Ben Greear wrote: > I'll see if I can reproduce the bug with the svn tree on Linux. I can't reproduce this with the latest SVN tree (64-bit Fedora 11 Linux)..so please just ignore that patch. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From jtc at acorntoolworks.com Wed Sep 2 10:48:18 2009 From: jtc at acorntoolworks.com (J.T. Conklin) Date: Wed, 02 Sep 2009 10:48:18 -0700 Subject: [Xorp-hackers] Patches integrated; call for testers In-Reply-To: <4A9D635D.3010301@candelatech.com> (Ben Greear's message of "Tue, 01 Sep 2009 11:09:33 -0700") References: <4A9D2FD1.2040203@incunabulum.net> <4A9D635D.3010301@candelatech.com> Message-ID: <871vmps0f1.fsf@orac.acorntoolworks.com> Ben Greear writes: >> It's difficult with only one hand on deck (myself) on the community >> branch to do this, whilst we have changes in-flight, and whilst the code >> is a moving target. To minimize churn, I tend to fork feature branches >> and re-incorporate them further on; both the Windows port, and OLSR >> code, were developed in this way. >> > Considering the small developer and testing base, I think forking > would be a bad idea. Hi Ben, Bruce, I think the term "forking" should be avoided. 
It's an emotionally loaded term that should be reserved for talking
about when a project splits into multiples. I hope that, now that the
XORP open source project is hosted at SourceForge and has an independent
community, there will be no one who feels they need to create/maintain a
fork to accomplish their goals.

Now feature branches on the other hand...

My understanding is that in the past, there was very little use of
branches. Even the release process essentially locked the repo while the
release was being cut (in my experience, this discourages developer
activity during the weeks it takes to make a solid release). It also
made it difficult to do bold and interesting experiments that may or may
not pan out, or long term infrastructure improvements that might take
several release cycles to complete, since there was no place to
collaborate on them (Bruce's plan to replace XRLs with Thrift might fit
both of those categories).

Now that the project has moved from CVS to SVN, which has better branch
and merge support, I agree with Bruce that we should be using branches
for release and feature development.

And Ben, I don't think of feature branches as some sort of limbo, but
rather a staging area where substantial pieces of work can be integrated
so that when they are merged into the trunk, they will have minimal
impact. From what I've seen of the FEA virtualization patch, I think we
can consider it substantial.

> I think we should merge all that we possibly can and then have as
> many people as possible test it. I won't harp on this more, however
> :)

For 1.7, I think the priority should be to shake out the bugs in the new
build system. I'm afraid that if we start integrating new code, we won't
be able to make a release in a timely manner. After the release, I'm all
in favor of looking at the best way to integrate changes like Ben's.

--jtc

--
J.T. Conklin

From bms at incunabulum.net Fri Sep 4 07:05:28 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Fri, 04 Sep 2009 15:05:28 +0100
Subject: [Xorp-hackers] RIB serialization
In-Reply-To: <4A9D4F2D.1060902@incunabulum.net>
References: <4A9C1076.4030604@incunabulum.net> <4A9D4F2D.1060902@incunabulum.net>
Message-ID: <4AA11EA8.50905@incunabulum.net>

Hi Marco,

I've checked in the Boost detection logic for SCons to public SVN now, I
hope this is useful to you.

I've spent some hours reviewing the code base for low-hanging
refactoring fruit. My initial conclusion is that we're best off doing
this on a subsystem-by-subsystem basis, due to the nature of the changes
involved.

Introducing Thrift is likely to obsolete a lot of the code in libxorp as
it stands; so perhaps that needs to happen before further introduction
of Boost, at least in the main line of development. Also, libcomm will
probably have to be spliced into Boost.ASIO; their socket API has many
gaps.

In an experimental Hg branch, Boost.Regex is a drop-in replacement for
POSIX regex using its C API. Normally we require pcre/pcreposix for
policy and the xorpsh pipe feature on Linux platforms; Boost.Regex ships
as a shared library, so we can cut over to it easily. This is perhaps
the easiest change we can make for introducing Boost into XORP, in the
very beginning.

If you come up with anything else, we are very happy to review patches.

thanks,
BMS

From bms at incunabulum.net Fri Sep 4 07:15:59 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Fri, 04 Sep 2009 15:15:59 +0100
Subject: [Xorp-hackers] BGP native ASN32 peerings support
Message-ID: <4AA1211F.100@incunabulum.net>

Hi all,

I took a brief look at BGP the other night whilst examining a bug report
about ASN32 support. It seems XORP should do the right thing with
tunneled 4-byte ASNs in AS path attributes, however, it looks as though
native ASN32 peerings will not yet work (tested against OpenBGPD).

If anyone is looking for something to do, this would be an ideal contribution which would be gratefully received. The code probably just needs to be taught to look for AS_TRAN (23456) in the BGP OPEN negotiation, and only perform the valid peer AS check after the optional parameters have been parsed. BGPPeer::check_open_packet() is the place to start. In BGPPeer::event_openmess(), the call to open_negotiation() actually parses the 4-byte ASN tuple. thanks, BMS From bms at incunabulum.net Sun Sep 6 10:25:37 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Sun, 06 Sep 2009 18:25:37 +0100 Subject: [Xorp-hackers] FEA link speed/duplex config support Message-ID: <4AA3F091.8060204@incunabulum.net> Hi all, Whilst reviewing bug reports, I saw one about the FEA not allowing the link-layer parameters (e.g. Ethernet autonegotiation, chosen speed and duplex) to be configured. If anyone is looking for something to do, this would be another fairly simple coding task to get involved with the project and the code base, and which is something we might all generally find useful. In BSD, the kernel interface used to access and configure this state is known as ifmedia. A notification will be sent by the kernel when the link state changes by means of the RTM_IFINFO message. In Linux, the ethtool utility is responsible for this; it has since begun to use a common driver interface in the past few years. I believe the FEA code which listens for network stack layer events already maintains the link state, so most of the work involved in this task would be to add to the existing configuration syntax, and teach the FEA how to save and restore the link-layer state. thanks, BMS From wancheng82 at gmail.com Tue Sep 15 18:15:49 2009 From: wancheng82 at gmail.com (wancheng82) Date: Wed, 16 Sep 2009 09:15:49 +0800 Subject: [Xorp-hackers] Hi,all Message-ID: <200909160915469189352@gmail.com> This is Robin. I am a new comer that want to learn the XORP project. 
Usually I compare the XORP CLI with the Juniper CLI. Will XORP support
Rollback like Juniper soon?

Thank you.

2009-09-16
wancheng82

From bms at incunabulum.net Wed Sep 16 02:12:52 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Wed, 16 Sep 2009 10:12:52 +0100
Subject: [Xorp-hackers] Hi,all
In-Reply-To: <200909160915469189352@gmail.com>
References: <200909160915469189352@gmail.com>
Message-ID: <4AB0AC14.4010303@incunabulum.net>

wancheng82 wrote:
> Will XORP support Rollback like Juniper soon?

No, but it's been discussed in the past:
http://mailman.icsi.berkeley.edu/pipermail/xorp-hackers/2005-May/000377.html

Unfortunately, Kristian's patches for this seem to have disappeared.

From wancheng82 at gmail.com Thu Sep 17 20:26:05 2009
From: wancheng82 at gmail.com (wancheng82)
Date: Fri, 18 Sep 2009 11:26:05 +0800
Subject: [Xorp-hackers] Hi,all, how to do this in Xorp?
Message-ID: <200909181126027907706@gmail.com>

Take Layer 2 vlan configuration for example.
This means that when interface eth0 is added to vlan sales, the
interfaces configuration changes at the same time (adding "vlan member
sales" automatically).
Can the Xorp CLI support this case?

interfaces {
    eth0 {
        vlan member sales;
    }
}
vlans {
    sales {
        interface eth0;
    }
}

Thank you all.

2009-09-18
wancheng82

From bms at incunabulum.net Mon Sep 21 07:49:22 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Mon, 21 Sep 2009 15:49:22 +0100
Subject: [Xorp-hackers] Hi,all, how to do this in Xorp?
In-Reply-To: <200909181126027907706@gmail.com> References: <200909181126027907706@gmail.com> Message-ID: <4AB79272.607@incunabulum.net> wancheng82 wrote: > Take Layer 2 vlan configuration for example. > This means that when add interface eth0 to vlan sales, the interfaces > configuration will change at the same time(add "vlan memeber sales" > automatically). > Can Xorp CLI support this case? The CLI does not support this; however, it would be an interesting contribution which we would welcome, should you be happy to work on it as a feature. Whilst this would be convenient for users, it would be tricky to implement, because it would require adding some steps to the configuration process to determine if the interface already exists, and if not, add the appropriate configuration block automatically. This would involve some communication between the Router Manager and the FEA to learn about the interface. thanks, BMS From marco.canini at epfl.ch Fri Sep 25 05:51:43 2009 From: marco.canini at epfl.ch (Canini Marco) Date: Fri, 25 Sep 2009 14:51:43 +0200 Subject: [Xorp-hackers] bgp peering with quagga Message-ID: Hello, I'm having difficulties with what I thought would be a trivial thing: peering with a quagga bgpd. I've collected a packet trace and when I analyze it with wireshark I see that the bgp protocol is correctly dissected for messages from quagga but there is no decoding for messages from xorp. From what I see, quagga sends a valid open message, xorp returns with its open message (that is marked as an unknown message by wireshark) and quagga sends back an invalid header notification message and resets the session. Is there a specific option that I need to enable in the xorp config file to enable peering in this case? Thanks Marco Canini, Ph.D.
EPFL, Networked Systems Laboratory From marco.canini at epfl.ch Fri Sep 25 06:18:06 2009 From: marco.canini at epfl.ch (Canini Marco) Date: Fri, 25 Sep 2009 15:18:06 +0200 Subject: [Xorp-hackers] bgp peering with quagga In-Reply-To: References: Message-ID: Problem found! I forgot to disable a local modification which made the open message non-bgp compliant. Marco Canini, Ph.D. EPFL, Networked Systems Laboratory > -----Original Message----- > From: xorp-hackers-bounces at icir.org [mailto:xorp-hackers-bounces at icir.org] On Behalf Of Canini Marco > Sent: Friday, 25 September, 2009 2:52 PM > To: xorp-hackers at icir.org > Subject: [Xorp-hackers] bgp peering with quagga > > Hello, > > I'm have difficulties with what I thought would be a trivial thing: > peering with a quagga bgpd. > I've collected a packet trace and when I analyze it with wireshark I > see that the bgp protocol is correctly dissecting for messages from > quagga but there is no decoding for messages from xorp. > From what I see, quagga sends a valid open message, xorp returns with > its open message (that is marked as unknown message by wireshark) and > quagga sends back an invalid header notification message and resets the > session. > Is there a specific option that I need to enable in xorp config file to > enable peering in this case? > Thanks > > Marco Canini, Ph.D.
> EPFL, Networked Systems Laboratory > > > > _______________________________________________ > Xorp-hackers mailing list > Xorp-hackers at icir.org > http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/xorp-hackers From greearb at candelatech.com Fri Sep 25 12:58:05 2009 From: greearb at candelatech.com (Ben Greear) Date: Fri, 25 Sep 2009 12:58:05 -0700 Subject: [Xorp-hackers] Fix for PIM task list hang In-Reply-To: <4A9D24FB.7090807@incunabulum.net> References: <4A4A8AD9.2020107@candelatech.com> <4A9D24FB.7090807@incunabulum.net> Message-ID: <4ABD20CD.9060300@candelatech.com> On 09/01/2009 06:43 AM, Bruce Simpson wrote: > Ben, > > Thanks for this change. As of today, I've applied a very small portion > of it, by introducing debug_msg() calls into the path(s) where you've > added XLOG warnings. I'm merging with upstream.... Why did you remove the part where I also updated the error_msg? That gives the caller some idea why it failed. I'm fine with getting rid of the XLOG warnings, as that was mostly for my own debugging needs. The pop_xrl changes fix real bugs with the state machine (it could get hung on certain error conditions, at least). I'm attaching a patch of all my changes for the pim/ directory in case you want to apply them, it includes: * Fix xrl task state machine dead-lock due to un-balanced pop/send_xrl calls. * Improve error messages * Don't panic on network device removal. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: xorp-svn.patch Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090925/f130e314/attachment.ksh From greearb at candelatech.com Fri Sep 25 13:02:44 2009 From: greearb at candelatech.com (Ben Greear) Date: Fri, 25 Sep 2009 13:02:44 -0700 Subject: [Xorp-hackers] OLSR Message-ID: <4ABD21E4.7000409@candelatech.com> Any reason we can't move OLSR out of contrib and into the main directory so that we can build it with scons? In my testing OLSR was as stable as any other protocol, and if someone doesn't want to use it, they simply don't add it to their xorp config file, eh? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From greearb at candelatech.com Fri Sep 25 15:32:36 2009 From: greearb at candelatech.com (Ben Greear) Date: Fri, 25 Sep 2009 15:32:36 -0700 Subject: [Xorp-hackers] PATCH: Enable olsr build (without moving directories) Message-ID: <4ABD4504.9020906@candelatech.com> I have OLSR building now. Haven't actually tested it out in the SVN tree, but this is a good first step at least. I can't get the tools to compile either...but hopefully they aren't critical. I also created an SConscript file for olsr/tools, but it doesn't actually compile. Might be worth adding as a place-holder though. To build: scons enable_olsr=yes Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: olsr_scons.patch Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090925/a2675ecf/attachment.ksh -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: SConscript Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090925/a2675ecf/attachment-0001.ksh From jtc at acorntoolworks.com Sat Sep 26 16:13:44 2009 From: jtc at acorntoolworks.com (J.T. 
Conklin) Date: Sat, 26 Sep 2009 16:13:44 -0700 Subject: [Xorp-hackers] OLSR In-Reply-To: <4ABD21E4.7000409@candelatech.com> (Ben Greear's message of "Fri, 25 Sep 2009 13:02:44 -0700") References: <4ABD21E4.7000409@candelatech.com> Message-ID: <87hbup47vb.fsf@orac.acorntoolworks.com> Ben Greear writes: > Any reason we can't move OLSR out of contrib and into the main > directory so that we can build it with scons? In my testing OLSR > was as stable as any other protocol, and if someone doesn't want to > use it, they simply don't add it to their xorp config file, eh? Good point. Now that all of XORP is supported by the community, I'm not sure there is a reason to have a "contrib" ghetto for other protocols. I'm not sure what guidelines the community should adopt for importing new protocol implementations into XORP. Right now I'm thinking along the lines of contribution should be of sufficent importance and general usefulness; should be of reasonable implementation quality; should be OS/network stack independent; and should be have someone who is willing to spend some effort to maintain it going forward. This is mostly intended to prevent some research protocol, of questionable quality, with other interoperable implementations, being dropped in our lap where we have to deal with all the maintenance headaches going forwards. However, this isn't my call to make. Anyone who has an opinion please contribute to this thread. --jtc -- J.T. Conklin From bms at incunabulum.net Mon Sep 28 02:09:31 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Mon, 28 Sep 2009 10:09:31 +0100 Subject: [Xorp-hackers] OLSR In-Reply-To: <4ABD21E4.7000409@candelatech.com> References: <4ABD21E4.7000409@candelatech.com> Message-ID: <4AC07D4B.5050703@incunabulum.net> Ben Greear wrote: > Any reason we can't move OLSR out of contrib and into the main directory > so that we can build it with scons? 
In my testing OLSR was as stable as > any other protocol, and if someone doesn't want to use it, they > simply don't add it to their xorp config file, eh? > I'd rather we didn't move it out of contrib/ until well after 1.7. The OLSR implementation in XORP is based on a reasonably strict interpretation of the RFC, and doesn't have support for IPv6 or the ETX extensions, which are pretty much essential now for folk deploying OLSR in the field. I would consider it an unfinished work in progress. It became clear, at that point in time, that there were just too many other issues in the existing architecture to deliver what the original client wanted on time and within budget. I understand people are certainly using it and trying to base work off it. That's great, and I'm pleased folk have found what's been produced to date, useful in some way. But I'd rather we didn't create the impression that it's mainline or supported code, until the story with extensibility is dealt with, and that's a risk in moving it into the top of the tree. There are other problems to be solved first, and de-contrib'ing it at this point in time seems like a distraction from the primary goals. So +1 vote for 'come back to this later on'. P.S. I'm not 100% happy with OLSR, as you can probably tell. There are a few places where it could borrow from Joe Macker's code for more efficient MPR set computation, and Boost might well make that easier. There's a list of stuff in contrib/olsr README and NOTES. Note well the comment about BufferedAsyncReader eating 256KB for *every* STCP session in XRL!! From bms at incunabulum.net Mon Sep 28 02:22:37 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Mon, 28 Sep 2009 10:22:37 +0100 Subject: [Xorp-hackers] PATCH: Enable olsr build (without moving directories) In-Reply-To: <4ABD4504.9020906@candelatech.com> References: <4ABD4504.9020906@candelatech.com> Message-ID: <4AC0805D.2040306@incunabulum.net> Hi Ben, Thanks for the patch. 
Can't take it just yet, though. Looks like you've run into some of the issues with dropping in 3rd party code in general. It might be an idea to conditionalize the relative paths (../..) in the SConscripts, so OLSR can be moved elsewhere in the tree later on. This is sort of what we're getting at -- the XRL stub build needs to be told that stubs must be generated for the 3rd party protocol, and right now, that's centralized. We should have a better story for this later on, but for now, I need to stay focused on the task at hand. Thanks for looking into this. cheers, BMS From bms at incunabulum.net Mon Sep 28 02:24:49 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Mon, 28 Sep 2009 10:24:49 +0100 Subject: [Xorp-hackers] Fix for PIM task list hang In-Reply-To: <4ABD20CD.9060300@candelatech.com> References: <4A4A8AD9.2020107@candelatech.com> <4A9D24FB.7090807@incunabulum.net> <4ABD20CD.9060300@candelatech.com> Message-ID: <4AC080E1.5030700@incunabulum.net> Thanks for the PIM patch. Unfortunately I don't have free time to look at this in detail at the moment. It would be great if Pavlin were available to give it a review. PIM is quite a complex piece of machinery. Glancing at it, I didn't commit this previously, because I wasn't 100% sure about what was going on. There are a few practice issues in there (return from inside a switch block) which made me a little itchy. Of course, I'm burning up on a task right now, so I can't really give the patch fair consideration. +1 for 'come back to this later'. From marco.canini at epfl.ch Mon Sep 28 05:12:55 2009 From: marco.canini at epfl.ch (Canini Marco) Date: Mon, 28 Sep 2009 14:12:55 +0200 Subject: [Xorp-hackers] Peering problem with mixed 2- and 4-byte ASN Message-ID: Hello, I believe I've hit a bug in the BGP implementation of Xorp but before reporting it through bugzilla I want to double check it here.
In essence, the problem lies in route advertisements propagation from a 2-byte ASN Xorp instance (old-bgp speaker) to a 4-byte ASN Xorp instance (new-bgp speaker). The simplest way to see it is to configure 3 bgp speakers in the following way: New-bgp (AS 1.0) +----+ Old-bgp (AS 2) +----+ New-bgp (AS 3.0) The way Old-bgp needs to be configured involves two peerings with AS_TRANS (23456) because both AS 1.0 and 3.0 use the AS_TRANS in their open message. Then, suppose that AS 1.0 advertises the prefix 10.100.0.0/16. The route advertisement reaches AS 2 which installs it correctly in its routing table with AS_PATH = 23456. However, AS 3.0 never receives this advertisement from AS 2. In theory, AS 3.0 should get an update message from AS 2 with AS_PATH = 2 23456 and AS4_PATH = 1.0 but this is simply not happening. I've tried to use Quagga as the old-bgp speaker and in this case AS 3.0 receives the update. Unless I've missed some important configuration detail, I believe there is a bug in the way Xorp (doesn't) forward the advertisements for AS_TRANS peers. Both AS 1.0 and 3.0 have enable-4byte-as-numbers: true while AS 2 doesn't. I can provide more details if necessary. Does it appear as a bug to you? Thanks Marco Canini, Ph.D. EPFL, Networked Systems Laboratory From bms at incunabulum.net Mon Sep 28 06:03:53 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Mon, 28 Sep 2009 14:03:53 +0100 Subject: [Xorp-hackers] Peering problem with mixed 2- and 4-byte ASN In-Reply-To: References: Message-ID: <4AC0B439.60709@incunabulum.net> Hi Marco, Thanks for your detailed message regarding issues in XORP's 32-bit AS implementation. My only real exposure to this has been in attempting to configure a native 32-bit AS peering with OpenBGPD, which was unsuccessful. The 32-bit AS support in XORP is a relatively recent change, by current history; it was implemented towards the end of 2008, and there has not been significant developer activity on BGP since then. 
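For reference, the AS_PATH handling described in Marco's report can be sketched as below. This is a minimal illustration of the RFC 4893 rules, not XORP's BGP code: the names asdot, to_two_byte_path and merge_as4_path are hypothetical, and paths are assumed to be plain AS_SEQUENCEs with no sets or confederation segments.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reserved 2-byte placeholder for AS numbers that do not fit in 16 bits.
const std::uint32_t AS_TRANS = 23456;

// Convert "high.low" (asdot) notation to a 32-bit AS number, e.g. 1.0 -> 65536.
// Hypothetical helper, for illustration only.
std::uint32_t asdot(std::uint16_t high, std::uint16_t low) {
    return (static_cast<std::uint32_t>(high) << 16) | low;
}

// What a new speaker places in the 2-byte AS_PATH sent to an old speaker:
// every AS number above 65535 is replaced by AS_TRANS.
std::vector<std::uint32_t> to_two_byte_path(const std::vector<std::uint32_t>& path) {
    std::vector<std::uint32_t> out;
    for (std::uint32_t as : path)
        out.push_back(as > 0xFFFF ? AS_TRANS : as);
    return out;
}

// What a new speaker does on receipt from an old speaker: keep the leading
// entries of AS_PATH that AS4_PATH does not cover, then append AS4_PATH.
// If AS_PATH is shorter than AS4_PATH, AS4_PATH is ignored.
std::vector<std::uint32_t> merge_as4_path(const std::vector<std::uint32_t>& as_path,
                                          const std::vector<std::uint32_t>& as4_path) {
    if (as_path.size() < as4_path.size())
        return as_path;
    std::size_t lead = as_path.size() - as4_path.size();
    std::vector<std::uint32_t> out(as_path.begin(), as_path.begin() + lead);
    out.insert(out.end(), as4_path.begin(), as4_path.end());
    return out;
}
```

With AS_PATH = 2 23456 and AS4_PATH = 1.0, merge_as4_path yields 2 1.0 (that is, 2 65536), which is what AS 3.0 would be expected to learn in the three-speaker topology above.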
Unfortunately I don't have free time at the moment to investigate the issue further. If you could raise a Trac ticket about this issue on SourceForge, that would be great. I'll forward your original message about this onto the author(s) of BGP and the 32-bit AS patch. best regards, BMS From marco.canini at epfl.ch Mon Sep 28 08:15:05 2009 From: marco.canini at epfl.ch (Canini Marco) Date: Mon, 28 Sep 2009 17:15:05 +0200 Subject: [Xorp-hackers] Peering problem with mixed 2- and 4-byte ASN In-Reply-To: <4AC0B439.60709@incunabulum.net> References: <4AC0B439.60709@incunabulum.net> Message-ID: I've opened a Trac ticket about this. Cheers Marco Canini, Ph.D. EPFL, Networked Systems Laboratory > -----Original Message----- > From: Bruce Simpson [mailto:bms at incunabulum.net] > Sent: Monday, 28 September, 2009 3:04 PM > To: Canini Marco > Cc: xorp-hackers at icir.org > Subject: Re: [Xorp-hackers] Peering problem with mixed 2- and 4-byte > ASN > > Hi Marco, > > Thanks for your detailed message regarding issues in XORP's 32-bit AS > implementation. > > My only real exposure to this has been in attempting to configure a > native 32-bit AS peering with OpenBGPD, which was unsuccessful. The > 32-bit AS support in XORP is a relatively recent change, by current > history; it was implemented towards the end of 2008, and there has not > been significant developer activity on BGP since then. > > Unfortunately I don't have free time at the moment to investigate the > issue further. If you could raise a Trac ticket about this issue on > SourceForge, that would be great. I'll forward your original message > about this onto the author(s) of BGP and the 32-bit AS patch. > > best regards, > BMS > From greearb at candelatech.com Mon Sep 28 12:27:43 2009 From: greearb at candelatech.com (Ben Greear) Date: Mon, 28 Sep 2009 12:27:43 -0700 Subject: [Xorp-hackers] Unofficial xorp development tree available. 
Message-ID: <4AC10E2F.5080703@candelatech.com> In order to aid sharing our patches with upstream xorp developers and other users, I've made our xorp source tree available on our server. I'm using 'git' instead of 'svn', but they have a similar feature set. I plan to sync my tree with the upstream xorp svn code tree often, and will post significant patches to the xorp-users mailing list in case the upstream developers want to incorporate the patches. If Bruce or some other official Xorp person prefers it, I can automatically post the changes to my tree to xorp-cvs or similar mailing list. I'm not going to spam those lists unless specifically requested, however. Please understand that the official Xorp project bears no responsibility for my xorp.ct tree. Any bug reports against it should be directed to me. Right now, the tree is read-only for outside users. I am certainly willing to consider applying patches, even for new experimental routing protocols and such, as long as they do not overly risk de-stabilizing the existing protocols, and as long as the changes do not make it too difficult to keep synchronized with the upstream xorp tree. http://www.candelatech.com/oss/xorp-ct.html Suggestions & comments welcome. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From greearb at candelatech.com Mon Sep 28 12:45:49 2009 From: greearb at candelatech.com (Ben Greear) Date: Mon, 28 Sep 2009 12:45:49 -0700 Subject: [Xorp-hackers] Patch to update build notes slightly Message-ID: <4AC1126D.4030907@candelatech.com> We're no longer using (g)make and ./configure. Update notes to provide some brief details on how to build with scons. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: patch0.patch Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090928/97f26d97/attachment.ksh From greearb at candelatech.com Mon Sep 28 16:10:11 2009 From: greearb at candelatech.com (Ben Greear) Date: Mon, 28 Sep 2009 16:10:11 -0700 Subject: [Xorp-hackers] PATCH: Enable olsr build (without moving directories) In-Reply-To: <4AC0805D.2040306@incunabulum.net> References: <4ABD4504.9020906@candelatech.com> <4AC0805D.2040306@incunabulum.net> Message-ID: <4AC14253.4030502@candelatech.com> On 09/28/2009 02:22 AM, Bruce Simpson wrote: > Hi Ben, > > Thanks for the patch. Can't take it just yet, though. > > Looks like you've run into some of the issues with dropping 3rd party > code in in general. It might be an idea to conditionalize the relative > paths (../..) in the SConscripts, so OLSR can be moved elsewhere in the > tree later on. > > This is sort of what we're getting at -- the XRL stub build needs to be > told that stubs must be generated for the 3rd party protocol, and right > now, that's centralized. > > We should have a better story for this later on, but for now, I need to > stay focused on the task at hand. Thanks for looking into this. Just in case it helps, here is a follow-on patch that fixes the install target and templates for OLSR. With these applied, I compiled, installed and (lightly) tested OLSR. Seems to be working fine. Pushed to my xorp.ct git repository as well. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: olsr.patch Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090928/635d7554/attachment.ksh From greearb at candelatech.com Mon Sep 28 17:09:36 2009 From: greearb at candelatech.com (Ben Greear) Date: Mon, 28 Sep 2009 17:09:36 -0700 Subject: [Xorp-hackers] Adding support for older scons releases? 
Message-ID: <4AC15040.5000006@candelatech.com> I tried to compile xorp on Fedora 8. It has scons 0.98 by default (yum install scons). This fails to compile with: scons: Reading SConscript files ... SCons 1.2 or greater required, but you have SCons 0.98.4 Anyone know what part of xorp has the requirements for newer scons? In order to make xorp easier for users to build, I'd like to attempt to fix up xorp to build with older scons, as opposed to making people manually find, download, configure and install a newer scons... Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From greearb at candelatech.com Mon Sep 28 17:37:10 2009 From: greearb at candelatech.com (Ben Greear) Date: Mon, 28 Sep 2009 17:37:10 -0700 Subject: [Xorp-hackers] Adding support for older scons releases? In-Reply-To: <4AC15040.5000006@candelatech.com> References: <4AC15040.5000006@candelatech.com> Message-ID: <4AC156B6.3060007@candelatech.com> On 09/28/2009 05:09 PM, Ben Greear wrote: > I tried to compile xorp on Fedora 8. It has scons 0.98 by default (yum install scons). > > This fails to compile with: > > scons: Reading SConscript files ... > SCons 1.2 or greater required, but you have SCons 0.98.4 > > > Anyone know what part of xorp has the requirements for newer scons? In order to make > xorp easier for users to build, I'd like to attempt to fix up xorp to build with older > scons, as opposed to making people manually find, download, configure and install a > newer scons... This patch allows it to compile, but maybe there are subtle issues somewhere? The -*-python-*- thing makes xemacs properly recognize the file and do syntax highlighting, by the way. That token just needs to be somewhere in the first two lines of the file. diff --git a/SConstruct b/SConstruct index 5011ad4..8cbb93a 100644 --- a/SConstruct +++ b/SConstruct @@ -1,4 +1,4 @@ -#Copyright (c) 2009 XORP, Inc. +#Copyright (c) 2009 XORP, Inc. 
-*-python-*- # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License, Version 2, June @@ -31,7 +31,6 @@ # TODO conditionalize new directory layout here EnsurePythonVersion(2, 3) -EnsureSConsVersion(1, 2) Help(""" cross=true if you are doing a cross build. Default is false. @@ -53,6 +52,13 @@ from SCons.Script.SConscript import SConsEnvironment import SCons.Action import SCons.Builder +try: + EnsureSConsVersion(1, 2) +except SystemExit: + print "WARNING: Actually, SCONS version 1.2 or later is _preferred_." + print "Attempting to continue with version: " + SCons.__version__ + " but it may not work properly.\n" + + vars = Variables() vars.AddVariables( -- Ben Greear Candela Technologies Inc http://www.candelatech.com From jtc at acorntoolworks.com Mon Sep 28 20:02:06 2009 From: jtc at acorntoolworks.com (J.T. Conklin) Date: Mon, 28 Sep 2009 20:02:06 -0700 Subject: [Xorp-hackers] Adding support for older scons releases? In-Reply-To: <4AC15040.5000006@candelatech.com> (Ben Greear's message of "Mon, 28 Sep 2009 17:09:36 -0700") References: <4AC15040.5000006@candelatech.com> Message-ID: <87ljjyh2s1.fsf@orac.acorntoolworks.com> Ben Greear writes: > I tried to compile xorp on Fedora 8. It has scons 0.98 by default (yum install scons). > > This fails to compile with: > > scons: Reading SConscript files ... > SCons 1.2 or greater required, but you have SCons 0.98.4 > > > Anyone know what part of xorp has the requirements for newer scons? In order to make > xorp easier for users to build, I'd like to attempt to fix up xorp to build with older > scons, as opposed to making people manually find, download, configure and install a > newer scons... Hi Ben, I added the EnsureSConsVersion(1, 2) when someone tried to an earlier version without support for Variables(). 1.2 was the earliest version I had handy. If we can verify it works with an earlier version, we can change the version number accordingly. 
If you can confirm it works with 0.98.4, I think we can change it to EnsureSConsVersion(0, 98, 4). --jtc -- J.T. Conklin From greearb at candelatech.com Mon Sep 28 20:05:33 2009 From: greearb at candelatech.com (Ben Greear) Date: Mon, 28 Sep 2009 20:05:33 -0700 Subject: [Xorp-hackers] Adding support for older scons releases? In-Reply-To: <87ljjyh2s1.fsf@orac.acorntoolworks.com> References: <4AC15040.5000006@candelatech.com> <87ljjyh2s1.fsf@orac.acorntoolworks.com> Message-ID: <4AC1797D.7040602@candelatech.com> J.T. Conklin wrote: > Ben Greear writes: > >> I tried to compile xorp on Fedora 8. It has scons 0.98 by default (yum install scons). >> >> This fails to compile with: >> >> scons: Reading SConscript files ... >> SCons 1.2 or greater required, but you have SCons 0.98.4 >> >> >> Anyone know what part of xorp has the requirements for newer scons? In order to make >> xorp easier for users to build, I'd like to attempt to fix up xorp to build with older >> scons, as opposed to making people manually find, download, configure and install a >> newer scons... >> > > Hi Ben, > > I added the EnsureSConsVersion(1, 2) when someone tried to an earlier > version without support for Variables(). > > 1.2 was the earliest version I had handy. If we can verify it works > with an earlier version, we can change the version number accordingly. > If you can confirm it works with 0.98.4, I think we can change it to > EnsureSConsVersion(0, 98, 4). > It certainly compiles. Unless some random conditional test acted differently, then it would appear to work fine. There should be a comment in the code near that call or at least in the commit message that explains the limitation of earlier scons packages in case someone wants to write work-around logic to compile on yet older systems. Thanks, Ben > --jtc > > -- Ben Greear Candela Technologies Inc http://www.candelatech.com From jtc at acorntoolworks.com Mon Sep 28 20:07:39 2009 From: jtc at acorntoolworks.com (J.T. 
Conklin) Date: Mon, 28 Sep 2009 20:07:39 -0700 Subject: [Xorp-hackers] Adding support for older scons releases? In-Reply-To: <4AC156B6.3060007@candelatech.com> (Ben Greear's message of "Mon, 28 Sep 2009 17:37:10 -0700") References: <4AC15040.5000006@candelatech.com> <4AC156B6.3060007@candelatech.com> Message-ID: <87hbumh2is.fsf@orac.acorntoolworks.com> Hi Ben, Ben Greear writes: > On 09/28/2009 05:09 PM, Ben Greear wrote: >> I tried to compile xorp on Fedora 8. It has scons 0.98 by default (yum install scons). >> >> This fails to compile with: >> >> scons: Reading SConscript files ... >> SCons 1.2 or greater required, but you have SCons 0.98.4 >> >> >> Anyone know what part of xorp has the requirements for newer scons? In order to make >> xorp easier for users to build, I'd like to attempt to fix up xorp to build with older >> scons, as opposed to making people manually find, download, configure and install a >> newer scons... > > This patch allows it to compile, but maybe there are subtle issues somewhere? I think it's best to change the version number to the earliest version we know we support. > The -*-python-*- thing makes xemacs properly recognize the file and > do syntax highlighting, by the way. That token just needs to be > somewhere in the first two lines of the file. I have: (autoload 'python-mode "python-mode" "Python editing mode." t) (add-to-list 'auto-mode-alist '("SConstruct" . python-mode)) (add-to-list 'auto-mode-alist '("SConscript" . python-mode)) in my ~/.xemacs/init.el. If we do the -*-python-*- trick, I think it needs to be added to not just the SConstruct, but also all the SConscript files. It also should be part of a separate commit. --jtc -- J.T. Conklin From jtc at acorntoolworks.com Mon Sep 28 20:26:58 2009 From: jtc at acorntoolworks.com (J.T. Conklin) Date: Mon, 28 Sep 2009 20:26:58 -0700 Subject: [Xorp-hackers] Adding support for older scons releases? 
In-Reply-To: <4AC1797D.7040602@candelatech.com> (Ben Greear's message of "Mon, 28 Sep 2009 20:05:33 -0700") References: <4AC15040.5000006@candelatech.com> <87ljjyh2s1.fsf@orac.acorntoolworks.com> <4AC1797D.7040602@candelatech.com> Message-ID: <878wfyh1ml.fsf@orac.acorntoolworks.com> Ben Greear writes: >> I added the EnsureSConsVersion(1, 2) when someone tried to an earlier >> version without support for Variables(). >> >> 1.2 was the earliest version I had handy. If we can verify it works >> with an earlier version, we can change the version number accordingly. >> If you can confirm it works with 0.98.4, I think we can change it to >> EnsureSConsVersion(0, 98, 4). >> > It certainly compiles. Unless some random conditional test acted > differently, then it would appear to work fine. > > There should be a comment in the code near that call or at least in > the commit message that explains the limitation of earlier scons > packages in case someone wants to write work-around logic to compile > on yet older systems. I'll commit a patch along those lines later tonight. --jtc -- J.T. Conklin From mkaddoura at atcorp.com Tue Sep 29 09:44:30 2009 From: mkaddoura at atcorp.com (Maher Kaddoura) Date: Tue, 29 Sep 2009 11:44:30 -0500 Subject: [Xorp-hackers] BGP routes distribution Message-ID: <0FEA67D1BDFF4244AAA48D6DECA90856@Coho> Hi, I have a gateway that is configured with two interfaces 192.168.21.1 and 192.168.20.1. I want the gateway to distribute both routes into 192.168.20.0 domain. The problem that I am facing is that when RIP and BGP are configured to distribute routes using {protocol: "connected"}, the BGP would only distribute 192.168.21.0/24 but not 192.168.20.0/24. If I set RIP to distribute static route (as define below while BGP distribute {protocol: "connected"}, then BGP would distribute both 192.168.20.0/24 and 192.168.21.0/24. However RIP will not distribute 192.168.20.0 /24 and 192.168.21.0/24. 
Can someone please let me know what configuration I should use so that BGP and RIP distribute both routes into domain 192.168.20.0? Thank you, Maher protocols { static { route 192.168.20.0/24 { next-hop: 192.168.20.1 metric: 1 } route 192.168.21.0/24 { next-hop: 192.168.20.1 metric: 1 } }} From greearb at candelatech.com Tue Sep 29 17:29:44 2009 From: greearb at candelatech.com (Ben Greear) Date: Tue, 29 Sep 2009 17:29:44 -0700 Subject: [Xorp-hackers] Building without strip enabled? Message-ID: <4AC2A678.3010702@candelatech.com> It seems the upstream code won't build without stripping. The patch below allows me to have this work as I expected: scons strip=no install diff --git a/SConstruct b/SConstruct index ffec87f..aca5d96 100644 --- a/SConstruct +++ b/SConstruct @@ -133,9 +133,8 @@ print 'CXX: ', env['CXX'] env['STRIP'] = ARGUMENTS.get('STRIP', 'strip') print 'STRIP: ', env['STRIP'] -if env['strip']: - env['strip'] = True -print 'Strip binaries: ', env.has_key('strip') +env['strip'] = ARGUMENTS.get('strip', 'yes') +print 'Strip binaries: ', env['strip'] if env['shared']: env['SHAREDLIBS'] = "defined" @@ -165,7 +164,7 @@ def InstallProgram(env, dest, files, perm = 0755): obj = env.Install(dest, files) for i in obj: env.AddPostAction(i, env.Chmod(str(i), perm)) - if env.has_key('strip') and env.has_key('STRIP'): + if (env['strip'] == 'yes') and env.has_key('STRIP'): env.AddPostAction(i, Action("$STRIP $TARGET")) return obj SConsEnvironment.InstallProgram = InstallProgram @@ -187,7 +186,7 @@ def InstallLibrary(env, dest, files, perm = 0644): obj = env.Install(dest, files) for i in obj: env.AddPostAction(i, env.Chmod(str(i), perm)) - if env.has_key('strip') and env.has_key('STRIP'): + if (env['strip'] == 'yes') and env.has_key('STRIP'): env.AddPostAction(i, Action("$STRIP --strip-unneeded $TARGET")) return obj SConsEnvironment.InstallLibrary = InstallLibrary -- Ben Greear Candela Technologies Inc http://www.candelatech.com From greearb at candelatech.com Tue Sep 29
17:45:19 2009 From: greearb at candelatech.com (Ben Greear) Date: Tue, 29 Sep 2009 17:45:19 -0700 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory Message-ID: <4AC2AA1F.1080308@candelatech.com> Looks like we are running stale node objects that have since been deleted by the resizing of the _selector entries. This is from my code tree, so it's possible it's something I added, but I don't think it is... I'm going to work on fixing this, but if someone has any quick ideas, feel free to let me know! ==19682== Invalid read of size 4 ==19682== at 0x5420AB: SelectorList::Node::run_hooks(SelectorMask, XorpFd) (selector.cc:169) Bug is that 'this' is deleted, as far as I can tell. selector.cc line 169: SelectorMask match = SelectorMask(_mask[i] & m & ~already_matched); ==19682== by 0x5416DF: SelectorList::wait_and_dispatch(TimeVal&) (selector.cc:486) ==19682== by 0x52EA2C: EventLoop::do_work(bool) (eventloop.cc:147) ==19682== by 0x52E8C1: EventLoop::run() (eventloop.cc:100) ==19682== by 0x4070D2: Rtrmgr::run() (main_rtrmgr.cc:346) ==19682== by 0x407DB6: main (main_rtrmgr.cc:653) ==19682== Address 0x4e9bfb4 is 3,524 bytes inside a block of size 3,608 free'd ==19682== at 0x4A05E3F: operator delete(void*) (vg_replace_malloc.c:342) ==19682== by 0x542F31: __gnu_cxx::new_allocator::deallocate(SelectorList::Node*, unsigned long) (new_allocator.h:95) ==19682== by 0x542821: std::_Vector_base >::_M_deallocate(SelectorList::Node*, unsigned long) (stl_vector.h:146) ==19682== by 0x542DC0: std::vector >::_M_fill_insert(__gnu_cxx::__normal_iterator > >, unsigned long, SelectorList::Node const&) (vector.tcc:451) ==19682== by 0x54278F: std::vector >::insert(__gnu_cxx::__normal_iterator > >, unsigned long, SelectorList::Node const&) (stl_vector.h:851) ==19682== by 0x542563: std::vector >::resize(unsigned long, SelectorList::Node) (stl_vector.h:557) ==19682== by 0x5408F9: SelectorList::add_ioevent_cb(XorpFd, IoEventType, ref_ptr > const&, int) (selector.cc:239) // Bug is 
that this deletes old memory and allocates new..we must have saved a pointer to the old // memory somewhere. selector.cc: 239 _selector_entries.resize(fd + 32); ==19682== by 0x52EA73: EventLoop::add_ioevent_cb(XorpFd, IoEventType, ref_ptr > const&, int) (eventloop.cc:240) ==19682== by 0x5285C5: AsyncFileReader::start() (asyncio.cc:307) ==19682== by 0x53BD2D: RunCommandBase::execute() (run_command.cc:358) ==19682== by 0x44B82A: ModuleManager::Process::startup(std::string&) (module_manager.cc:708) ==19682== by 0x44AF9A: ModuleManager::execute_process(std::string const&, std::string&) (module_manager.cc:608) -- Ben Greear Candela Technologies Inc http://www.candelatech.com From greearb at candelatech.com Tue Sep 29 19:18:20 2009 From: greearb at candelatech.com (Ben Greear) Date: Tue, 29 Sep 2009 19:18:20 -0700 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC2AA1F.1080308@candelatech.com> References: <4AC2AA1F.1080308@candelatech.com> Message-ID: <4AC2BFEC.6010802@candelatech.com> On 09/29/2009 05:45 PM, Ben Greear wrote: > Looks like we are running stale node objects that have since > been deleted by the resizing of the _selector entries. > > This is from my code tree, so it's possible it's something I added, > but I don't think it is... > > I'm going to work on fixing this, but if someone has any quick ideas, feel free > to let me know! This one is nasty. Here's the work-around fix: Most of the attached patch is debugging logic, and you can skip all of that if you want. The thing that actually 'fixes' it is this: selector.cc: @@ -199,9 +228,18 @@ SelectorList::Node::is_empty() // ---------------------------------------------------------------------------- // SelectorList implementation + +// NOTE: It is possible for callbacks to add an event, that that event can +// cause the selector_entries to be resized. See: add_ioevent_cb +// This in turn deletes the old memory in the vector. 
This this causes +// Node::run_hooks to be accessing deleted memory (ie, 'this' was deleted during +// the call to dispatch(). +// Seems like a lot of pain to fix this right, so in the meantime, will pre-allocate +// logs of space in the selector_entries vector in hopes we do not have to resize. SelectorList::SelectorList(ClockBase *clock) : _clock(clock), _observer(NULL), _testfds_n(0), _last_served_fd(-1), - _last_served_sel(-1), _maxfd(0), _descriptor_count(0), _is_debug(false) + _last_served_sel(-1), _selector_entries(1024), + _maxfd(0), _descriptor_count(0), _is_debug(false) { static_assert(SEL_RD == (1 << SEL_RD_IDX) && SEL_WR == (1 << SEL_WR_IDX) && SEL_EX == (1 << SEL_EX_IDX) && SEL_MAX_IDX == 3); Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: selector_xorp_crash.patch Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090929/8f524e85/attachment.ksh From bms at incunabulum.net Wed Sep 30 02:24:07 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Wed, 30 Sep 2009 10:24:07 +0100 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC2AA1F.1080308@candelatech.com> References: <4AC2AA1F.1080308@candelatech.com> Message-ID: <4AC323B7.5010603@incunabulum.net> Ben Greear wrote: > Looks like we are running stale node objects that have since > been deleted by the resizing of the _selector entries. > Thanks for the feedback. Can you please raise a Trac ticket about this issue?. As far as I know, the commercial product is still using the same libxorp code for the EventLoop and SelectorList components, so engineering needs to see this one. There have been some instances of use-after-free with std::vector elsewhere in the code base. It is an easy mistake to leave pointers into a vector's storage which are later resized. 
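
A minimal sketch of that failure mode in isolation, for anyone who wants to poke at it outside of XORP. The Node type and the resize_relocates() helper are made up for illustration and are not libxorp code; the only behaviour relied on is that std::vector moves its elements to a new buffer once it has to grow past its current capacity.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SelectorList::Node; not the real XORP class.
struct Node {
    int mask[3] = {0, 0, 0};
};

// Grow or shrink `entries` to `new_size` and report whether the vector had
// to reallocate. A changed capacity means the elements were moved to a new
// buffer, so any Node* or Node& obtained before the call now dangles.
bool resize_relocates(std::vector<Node>& entries, std::size_t new_size)
{
    const std::size_t capacity_before = entries.capacity();
    entries.resize(new_size);
    return entries.capacity() != capacity_before;
}
```

Starting from a 4-element vector v, resize_relocates(v, v.capacity() + 32) returns true, so a Node* taken beforehand would be stale. After v.reserve(1024), growing to 36 entries returns false, which is the property a pre-allocation workaround leans on for as long as the vector never outgrows its capacity.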
Early last year, I caught some instances of this in libxorp/libxipc
after valgrind runs. I noted some more general issues like this, and
suggested to Atanu, at that time, that a co-ordinated QA sweep was
needed.

In the case of SelectorList, this is a class whose semantics are already
implemented inside Boost.ASIO's io_service. One advantage is that ASIO
has had a lot more eyes on it, so issues quickly get stamped out.

However, cutting over to ASIO is not a simple drop-in change -- it
requires a lot of refactoring, and what's in XORP now, is there largely
because ASIO, and other useful tools, just didn't exist when the project
started :-)

thanks,
BMS

From bms at incunabulum.net Wed Sep 30 02:29:34 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Wed, 30 Sep 2009 10:29:34 +0100
Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory
In-Reply-To: <4AC2BFEC.6010802@candelatech.com>
References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com>
Message-ID: <4AC324FE.7010700@incunabulum.net>

Hi Ben,

Thanks for the patch. Yes, this would work around the issue by
pre-allocating all the storage for Selectors upfront.

Ben Greear wrote:
>
> This one is nasty. Here's the work-around fix:

I would far rather the problem is fixed at root however. If we're
pushed for time then we may check this in as an interim fix, it does
waste some memory, but that's better than risking heap corruption in
some situations. Ironically, the ref_ptr template is being used to try
to avoid premature deletion.

If you could attach this workaround patch to your Trac ticket that would
be great.

thanks,
BMS

From bms at incunabulum.net Wed Sep 30 02:31:53 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Wed, 30 Sep 2009 10:31:53 +0100
Subject: [Xorp-hackers] Building without strip enabled?
In-Reply-To: <4AC2A678.3010702@candelatech.com>
References: <4AC2A678.3010702@candelatech.com>
Message-ID: <4AC32589.2050709@incunabulum.net>

Ben Greear wrote:
> It seems the upstream code won't build without stripping. The
> patch below allows me to have this work as I expected:
>

Thanks for pointing this out.

Re your patch: Can't the SCons 'Variables' construct be used, to ensure
that env['strip'] is always defined to a sane Boolean value, rather than
using string comparisons in the conditionals?

thanks,
BMS

From jtc at acorntoolworks.com Wed Sep 30 06:11:27 2009
From: jtc at acorntoolworks.com (J.T. Conklin)
Date: Wed, 30 Sep 2009 06:11:27 -0700
Subject: [Xorp-hackers] Building without strip enabled?
In-Reply-To: <4AC2A678.3010702@candelatech.com> (Ben Greear's message of "Tue, 29 Sep 2009 17:29:44 -0700")
References: <4AC2A678.3010702@candelatech.com>
Message-ID: <877hvgd1c0.fsf@orac.acorntoolworks.com>

Hi Ben,

Ben Greear writes:
> It seems the upstream code won't build without stripping. The
> patch below allows me to have this work as I expected:
>
> scons strip=no install

Can you try updating to the most recent SConstruct? There used to be
problems with boolean command line arguments like strip, but I fixed
that on 9/15 in checkin 11546.

env['strip'] is now set by a Variable, so should always have the
boolean value True or False.

--jtc

--
J.T. Conklin

From jtc at acorntoolworks.com Wed Sep 30 07:21:33 2009
From: jtc at acorntoolworks.com (J.T. Conklin)
Date: Wed, 30 Sep 2009 07:21:33 -0700
Subject: [Xorp-hackers] Building without strip enabled?
In-Reply-To: <877hvgd1c0.fsf@orac.acorntoolworks.com> (J. T. Conklin's message of "Wed, 30 Sep 2009 06:11:27 -0700")
References: <4AC2A678.3010702@candelatech.com> <877hvgd1c0.fsf@orac.acorntoolworks.com>
Message-ID: <87pr98tswi.fsf@orac.acorntoolworks.com>

jtc at acorntoolworks.com (J.T. Conklin) writes:
> Ben Greear writes:
>> It seems the upstream code won't build without stripping. The
>> patch below allows me to have this work as I expected:
>>
>> scons strip=no install
>
> Can you try updating to the most recent SConstruct? There used to be
> problems with boolean command line arguments like strip, but I fixed
> that on 9/15 in checkin 11546.
>
> env['strip'] is now set by a Variable, so should always have the
> boolean value True or False.

However, upon further review, I discovered that the conditionals in
InstallProgram and InstallLibrary were checking whether 'strip' was
in the SCons environment (which it always was), rather than its
value. I've just checked in a fix (11559).

--jtc

--
J.T. Conklin

From greearb at candelatech.com Wed Sep 30 08:20:38 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 08:20:38 -0700
Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory
In-Reply-To: <4AC324FE.7010700@incunabulum.net>
References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net>
Message-ID: <4AC37746.2080004@candelatech.com>

Bruce Simpson wrote:
> Hi Ben,
>
> Thanks for the patch. Yes, this would work around the issue by
> pre-allocating all the storage for Selectors upfront.
>
> Ben Greear wrote:
>>
>> This one is nasty. Here's the work-around fix:
>
> I would far rather the problem is fixed at root however. If we're
> pushed for time then we may check this in as an interim fix, it does
> waste some memory, but that's better than risking heap corruption in
> some situations. Ironically, the ref_ptr template is being used to try
> to avoid premature deletion.

It's not deletion of the refptr or what it points to that is the
problem..it's deletion of the Node that holds the refptr.

Moving to a linked list storage or something like that for nodes would
probably fix the problem, but considering non-const access
to elements, it may not be worth the effort.
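
For comparison, a rough sketch of what node-based storage buys. This is not a proposed patch: Node and list_growth_keeps_address() are hypothetical names, and it only demonstrates the standard guarantee that growing a std::list leaves existing elements where they are. The cost is that the per-fd lookup by index, which a vector gives for free, would need extra bookkeeping.

```cpp
#include <cstddef>
#include <list>

// Hypothetical stand-in for SelectorList::Node; not the real XORP class.
struct Node {
    int mask[3] = {0, 0, 0};
};

// Build a list of `initial` nodes, remember where the first one lives,
// append `extra` more, and check that the first node has not moved.
bool list_growth_keeps_address(std::size_t initial, std::size_t extra)
{
    std::list<Node> entries(initial);
    if (entries.empty())
        return true;                      // nothing to keep track of
    const Node* first = &entries.front();
    entries.resize(initial + extra);      // appends nodes; never relocates
    return first == &entries.front();
}
```

list_growth_keeps_address(4, 1024) returns true, so a callback that grows the container while a Node method is still running would not pull the storage out from under 'this'.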

If we are worried about memory usage, we should fix the lex parsing
code...it leaks memory all over the place according to valgrind. But, I
couldn't see any easy way to fix that....

I'll add a Trac ticket later today.

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From bms at incunabulum.net Wed Sep 30 08:51:05 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Wed, 30 Sep 2009 16:51:05 +0100
Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory
In-Reply-To: <4AC37746.2080004@candelatech.com>
References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com>
Message-ID: <4AC37E69.4040407@incunabulum.net>

Hi Ben,

This is a good catch. Thanks.

Ben Greear wrote:
> ...It's not deletion of the refptr or what it points to that is the
> problem..it's deletion of the Node that holds the refptr.

This sounds very, very similar to the problems I found in Spt with the
use of ref_ptr:
http://cvsweb.xorp.org/cgi-bin/cvsweb.cgi/xorp/libproto/spt.hh.diff?r1=1.19;r2=1.20

...that might be a good starting point for the root fix.

One issue with ref_ptr is that merely holding one bumps the reference
count. Boost has a weak_ptr which allows the magic of a shared_ptr to be
preserved even when 'just passing through'.

>
> Moving to a linked list storage or something like that for nodes would
> probably fix the problem, but considering non-const access
> to elements, it may not be worth the effort.

Agree. Considering that Selector is fast-path stuff anyway, the
additional bookkeeping involved is probably not worth the effort.

>
> If we are worried about memory usage, we should fix the lex parsing
> code...it leaks memory all over the place
> according to valgrind. But, I couldn't see any easy way to fix that....

If you could raise a ticket on Trac for this also, with the valgrind
log, that would be great.

I can't promise that I could get around to fixing it any time soon,
though. I've been up to my eyeballs in Thrift and libxipc, and have 'hit
the wall' in more ways than one.

The Router Manager is a large and complex beast. In corporate, it's
largely been replaced with something else. Some degree of complexity
also exists there due to the existence of textual XRLs.

cheers,
BMS

From greearb at candelatech.com Wed Sep 30 09:59:15 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 09:59:15 -0700
Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory
In-Reply-To: <4AC37E69.4040407@incunabulum.net>
References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com> <4AC37E69.4040407@incunabulum.net>
Message-ID: <4AC38E63.6030308@candelatech.com>

On 09/30/2009 08:51 AM, Bruce Simpson wrote:
> Hi Ben,
>
> This is a good catch. Thanks.
>
> Ben Greear wrote:
>> ...It's not deletion of the refptr or what it points to that is the
>> problem..it's deletion of the Node that holds the refptr.
>
> This sounds very, very similar to the problems I found in Spt with the
> use of ref_ptr:
> http://cvsweb.xorp.org/cgi-bin/cvsweb.cgi/xorp/libproto/spt.hh.diff?r1=1.19;r2=1.20

The problem I found has nothing at all to do with pointers, weak or
otherwise.

The problem is that a method called by an object can cause that object
to be deleted, and when that method continues, it is accessing deleted
memory.

>> If we are worried about memory usage, we should fix the lex parsing
>> code...it leaks memory all over the place
>> according to valgrind. But, I couldn't see any easy way to fix that....
>
> If you could raise a ticket on Trac for this also, with the valgrind
> log, that would be great.
>
> I can't promise that I could get around to fixing it any time soon,
> though. I've been up to my eyeballs in Thrift and libxipc, and have 'hit
> the wall' in more ways than one.

It's a one-time action, so the leaked memory isn't too big of a deal to
me.

I have some more regression tests to run and I'm finding bugs everywhere
I look (in xorp and my own code too for that matter). My plan is to run
under valgrind some more, fix easy valgrind bugs and all possible
crashes.

Then, run under oprofile and try to optimize things. OLSR burns 100% CPU
during re-configure, and xorpsh seems fairly expensive to launch as well.

> The Router Manager is a large and complex beast. In corporate, it's
> largely been replaced with something else. Some degree of complexity
> also exists there due to the existence of textual XRLs.

That reminds me: What is the plan when 'corporate' releases their
code to customers? Since it's GPL, we will then have access to the
source.

Do you plan to merge their tree with the public SVN tree at that point?

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From greearb at candelatech.com Wed Sep 30 11:02:00 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 11:02:00 -0700
Subject: [Xorp-hackers] Building without strip enabled?
In-Reply-To: <87pr98tswi.fsf@orac.acorntoolworks.com>
References: <4AC2A678.3010702@candelatech.com> <877hvgd1c0.fsf@orac.acorntoolworks.com> <87pr98tswi.fsf@orac.acorntoolworks.com>
Message-ID: <4AC39D18.1000206@candelatech.com>

On 09/30/2009 07:21 AM, J.T. Conklin wrote:
> jtc at acorntoolworks.com (J.T. Conklin) writes:
>> Ben Greear writes:
>>> It seems the upstream code won't build without stripping. The
>>> patch below allows me to have this work as I expected:
>>>
>>> scons strip=no install
>>
>> Can you try updating to the most recent SConstruct? There used to be
>> problems with boolean command line arguments like strip, but I fixed
>> that on 9/15 in checkin 11546.
>>
>> env['strip'] is now set by a Variable, so should always have the
>> boolean value True or False.
>
> However, upon further review, I discovered that the conditionals in
> InstallProgram and InstallLibrary were checking whether 'strip' was
> in the SCons environment (which it always was), rather than its
> value. I've just checked in a fix (11559).

That fix works...

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From greearb at candelatech.com Wed Sep 30 12:24:21 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 12:24:21 -0700
Subject: [Xorp-hackers] OLSR assert
Message-ID: <4AC3B065.3070300@candelatech.com>

I have a complex many-to-many network and I'm running OLSR. I'm getting
an assert repeatedly. I instrumented the code with debugging logic, and
I have an idea of what might be the problem.

The reset_twohop_mpr_state counts neighbors that are strict and reachable.
But, the consider_poorly_covered method checks for reachability == 1.
In the log below, neighbor 10.7.7.7 is not counted in poorly_covered.
Should we maybe check for reachability() > 0 instead of == 1?
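
To make the mismatch concrete, here is a small sketch that replays the two-hop entries from the log below through both predicates. It is not the neighborhood.cc code: the TwoHop struct and both counting functions are made up, the strict flags come from the log, and the reachability values for 10.9.9.9, 10.8.8.8 and 10.6.6.6 are assumed to be 1 because the log only prints reachability for the entries it skips. It reproduces the 4 versus 3 from the failed assertion and says nothing about which predicate is the right one.

```cpp
#include <cstddef>
#include <vector>

// One two-hop neighbour as it shows up in the debug log. Field names are
// illustrative only; they are not the real XORP OLSR types.
struct TwoHop {
    const char* addr;
    bool strict;
    unsigned reachability;
};

// Snapshot of the eight n2 entries in the log (ids 1 through 8).
std::vector<TwoHop> log_snapshot()
{
    return {
        {"10.9.9.9", true, 1u},   // reachability assumed, not printed in the log
        {"10.8.8.8", true, 1u},   // reachability assumed, not printed in the log
        {"10.4.4.4", false, 0u},
        {"10.2.2.2", false, 1u},
        {"10.7.7.7", true, 2u},   // the entry that is never counted as covered
        {"10.6.6.6", true, 1u},   // reachability assumed, not printed in the log
        {"10.5.5.5", false, 1u},
        {"10.3.3.3", false, 0u},
    };
}

// What the reachable count appears to use: strict and reachable at all.
std::size_t count_strict_reachable(const std::vector<TwoHop>& n2s)
{
    std::size_t n = 0;
    for (const TwoHop& n2 : n2s)
        if (n2.strict && n2.reachability > 0)
            ++n;
    return n;
}

// What the poorly-covered pass appears to use: strict and exactly one path.
std::size_t count_strict_single_path(const std::vector<TwoHop>& n2s)
{
    std::size_t n = 0;
    for (const TwoHop& n2 : n2s)
        if (n2.strict && n2.reachability == 1)
            ++n;
    return n;
}
```

With this snapshot count_strict_reachable() gives 4 and count_strict_single_path() gives 3, matching reachable_n2_count and covered_n2_count in the FATAL line; the difference is 10.7.7.7 with reachability 2.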

Thanks,
Ben

[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1723 reset_twohop_mpr_state ] Counting 2-hop neighbor, is strict and reachable, n2: 1-(10.9.9.9)
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1723 reset_twohop_mpr_state ] Counting 2-hop neighbor, is strict and reachable, n2: 2-(10.8.8.8)
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1723 reset_twohop_mpr_state ] Counting 2-hop neighbor, is strict and reachable, n2: 5-(10.7.7.7)
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1723 reset_twohop_mpr_state ] Counting 2-hop neighbor, is strict and reachable, n2: 6-(10.6.6.6)
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 3-(10.4.4.4) in consider_persistent, strict: 0 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 1-(10.9.9.9) in consider_persistent, strict: 1 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 2-(10.8.8.8) in consider_persistent, strict: 1 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 4-(10.2.2.2) in consider_persistent, strict: 0 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 5-(10.7.7.7) in consider_persistent, strict: 1 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 6-(10.6.6.6) in consider_persistent, strict: 1 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 7-(10.5.5.5) in consider_persistent, strict: 0 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 4-(10.2.2.2) in consider_persistent, strict: 0 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 8-(10.3.3.3) in consider_persistent, strict: 0 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 7-(10.5.5.5) in consider_persistent, strict: 0 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1836 consider_persistent_cand_mprs ] NOT covering n2: 5-(10.7.7.7) in consider_persistent, strict: 1 willingness: 3
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1486 recount_mpr_set ] covered_n2_count after consider_persistent: 0 reachable_n2_count: 4
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1878 consider_poorly_covered_twohops ] Counting poorly_covered n2: 1-(10.9.9.9) n is set as mpr: 2-(10.3.3.3)
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1878 consider_poorly_covered_twohops ] Counting poorly_covered n2: 2-(10.8.8.8) n is set as mpr: 2-(10.3.3.3)
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1884 consider_poorly_covered_twohops ] NOT Counting poorly_covered n2: 3-(10.4.4.4) strict: 0 reachability: 0 n2-covered: 0
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1884 consider_poorly_covered_twohops ] NOT Counting poorly_covered n2: 4-(10.2.2.2) strict: 0 reachability: 1 n2-covered: 0
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1884 consider_poorly_covered_twohops ] NOT Counting poorly_covered n2: 5-(10.7.7.7) strict: 1 reachability: 2 n2-covered: 0
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1878 consider_poorly_covered_twohops ] Counting poorly_covered n2: 6-(10.6.6.6) n is set as mpr: 3-(10.4.4.4)
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1884 consider_poorly_covered_twohops ] NOT Counting poorly_covered n2: 7-(10.5.5.5) strict: 0 reachability: 1 n2-covered: 0
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1884 consider_poorly_covered_twohops ] NOT Counting poorly_covered n2: 8-(10.3.3.3) strict: 0 reachability: 0 n2-covered: 0
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1494 recount_mpr_set ] covered_n2_count after consider_poorly_covered: 3 reachable_n2_count: 4
[ 2009/09/30 12:10:50 WARNING xorp_olsr4:2771 OLSR contrib/olsr/neighborhood.cc:1503 recount_mpr_set ] covered_n2_count after consider_remaining: 3 reachable_n2_count: 4
[ 2009/09/30 12:10:50 FATAL xorp_olsr4:2771 OLSR +1507 contrib/olsr/neighborhood.cc recount_mpr_set ] Assertion (covered_n2_count >= reachable_n2_count) failed

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From mkaddoura at atcorp.com Wed Sep 30 13:01:08 2009
From: mkaddoura at atcorp.com (Maher Kaddoura)
Date: Wed, 30 Sep 2009 15:01:08 -0500
Subject: [Xorp-hackers] BGP routes distribution
In-Reply-To: <4AC382F7.5080403@icir.org>
Message-ID:

Hi,

Below is the configuration that I have used. This configuration causes
RIP to export connected and static routes, while BGP does not export
connected and static routes.

I did some testing using different variations of the configuration
below. Based on my observations, in XORP connected and static routes can
only be exported by either RIP or BGP, but not by both at the same time.
And RIP always has priority over BGP.

Maher

/* $XORP: xorp/rtrmgr/config/interfaces.boot,v 1.1 2007/08/30 06:32:17 pavlin Exp $ */

interfaces {
    interface eth2 {
        default-system-config
    }
}

interfaces {
    interface eth3 {
        /* Use the default setup as configured in the system */
        default-system-config
    }
}

fea {
    unicast-forwarding4 {
        disable: false
    }
}

protocols {
    static {
        route 192.168.22.0/24 {
            next-hop: 192.168.20.1
            metric: 1
        }
        route 192.168.21.0/24 {
            next-hop: 192.168.20.1
            metric: 1
        }
        route 192.168.20.0/24 {
            next-hop: 192.168.20.1
            metric: 1
        }
    }
}

policy {
    policy-statement export-connected {
        term 200 {
            from {
                protocol: "connected"
            }
        }
    }
}

policy {
    policy-statement export-bgp {
        term 600 {
            from {
                protocol: "connected"
            }
            then {
                accept{}
            }
        }
        term 800 {
            from {
                protocol: "rip"
            }
            then {
                accept{}
            }
        }
        term 700 {
            from {
                protocol: "static"
            }
            then {
                accept{}
            }
        }
    }
}

protocols {
    rip {
        export: "export-connected"
        interface eth3{
            vif eth3 {
                address 192.168.20.1 {
                    disable: false
                }
            }
        }
        interface eth2{
            vif eth2 {
                address 192.168.21.1 {
                    disable: false
                }
            }
        }
    }
}

protocols {
    bgp {
        bgp-id: 192.168.20.1
        local-as: 200
        export: "export-bgp"
        peer 192.168.1.1 {
            local-ip: 192.168.20.1
            as: 200
            next-hop: 192.168.20.1
        }
        peer 192.168.2.1 {
            local-ip: 192.168.20.1
            as: 200
            next-hop: 192.168.20.1
        }
    }
}

-----Original Message-----
From: Bruce Simpson [mailto:bms at icir.org]
Sent: Wednesday, September 30, 2009 11:11 AM
To: Maher Kaddoura
Cc: xorp-hackers at icir.org
Subject: Re: [Xorp-hackers] BGP routes distribution

Hi,

Could you please provide your full configuration, so that someone can
try to help you better?

Section 8.3.1 of the user manual has an example of how to configure RIP
to export connected routes. Are you using separate policies to do this
for both BGP and RIP?

Maher Kaddoura wrote:
> Can someone please let me know what configuration I should use so that
> BGP and RIP distribute both routes into domain 192.168.20.0?
>

Redistribution in BGP and RIP using export policies should be
functionally separate; one should not affect the other.
Off the top of my head, I can't think of situations where this would
happen -- however I'm busy on another task, and swapping how policy
behaves back in is difficult. :-)

thanks,
BMS

From bms at incunabulum.net Wed Sep 30 13:31:17 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Wed, 30 Sep 2009 21:31:17 +0100
Subject: [Xorp-hackers] OLSR assert
In-Reply-To: <4AC3B065.3070300@candelatech.com>
References: <4AC3B065.3070300@candelatech.com>
Message-ID: <4AC3C015.5070703@incunabulum.net>

Ben Greear wrote:
> The reset_twohop_mpr_state counts neighbors that are strict and reachable.
> But, the consider_poorly_covered method checks for reachability == 1.
> In the log below, neighbor 10.7.7.7 is not counted in poorly_covered.
> Should we maybe check for reachability() > 0 instead of == 1?
>

Off the top of my head, for classical OLSR, as specified in the RFC, it
needs to be covered by a minimum of 1 neighbour, in terms of links.

I don't have the code in front of me, obviously a test of reachability
== 1 would be naive. If the fix is that simple, that's great.

The "poorly covered" predicate's behaviour changes if ETX metrics (or
other compound metrics) are implemented; it then becomes possible for
the link to be considered too poor to cover the neighbouring node in the
graph, even though the link might exist.

For the non-ETX case, the code is probably an inlining candidate, but
that's up to the compiler.

thanks,
BMS

From greearb at candelatech.com Wed Sep 30 13:46:47 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 13:46:47 -0700
Subject: [Xorp-hackers] OLSR assert
In-Reply-To: <4AC3C015.5070703@incunabulum.net>
References: <4AC3B065.3070300@candelatech.com> <4AC3C015.5070703@incunabulum.net>
Message-ID: <4AC3C3B7.40204@candelatech.com>

On 09/30/2009 01:31 PM, Bruce Simpson wrote:
> Ben Greear wrote:
>> The reset_twohop_mpr_state counts neighbors that are strict and
>> reachable.
>> But, the consider_poorly_covered method checks for reachability == 1.
>> In the log below, neighbor 10.7.7.7 is not counted in poorly_covered.
>> Should we maybe check for reachability() > 0 instead of == 1?
> > Off the top of my head, for classical OLSR, as specified in the RFC, it > needs to be covered by a minimum of 1 neighbour, in terms of links. > > I don't have the code in front of me, obviously a test of reachability > == 1 would be naive. If the fix is that simple, that's great. > > The "poorly covered" predicate's behaviour changes if ETX metrics (or > other compound metrics) are implemented; it then becomes possible for > the link to be considered too poor to cover the neighbouring node in the > graph, even though the link might exist. > > For the non-ETX case, the code is probably an inlining candidate, but > that's up to the compiler. The more I look, the weirder it seems..but I may be mis-interpreting things. The code looks quite tricky..and reading the pertinent subsection of the RFC is not helping too much. I'm going to comment out the assert for now so that I can pull some live data out of the router. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From bms at incunabulum.net Wed Sep 30 13:51:17 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Wed, 30 Sep 2009 21:51:17 +0100 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC38E63.6030308@candelatech.com> References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com> <4AC37E69.4040407@incunabulum.net> <4AC38E63.6030308@candelatech.com> Message-ID: <4AC3C4C5.7040307@incunabulum.net> Ben Greear wrote: > The problem is that a method called by an object can cause that object > to be deleted, and when that method continues, it is accessing deleted > memory. I defer to the wisdom of folk like Scott Meyers in this sort of situation... however I'm glad the problem is read access, which limits the scope of error to logic, rather than heap corruption. 
Caveat: I didn't write it, I just have to work with it :-) The situation in win_dispatcher.cc, which I did write, is totally different. Whilst a std::map may not be as efficient in memory space as a vector, the scope for error is a bit more limited, even if the container isn't intrusive. > > I have some more regression tests to run and I'm finding bugs everywhere > I look (in xorp and my own code too for that matter). My plan is to run > under valgrind some more, fix easy valgrind bugs and all possible > crashes. Sounds good. We are grateful for any and all bug hunting work which you might engage in. There are a number of valgrind hits to be found, I'm sure. Static analyzers like cppcheck can only go so far. It might be interesting to fire something like Coverity up, although that involves signing a license agreement. > > Then, run under oprofile and try to optimize things. OLSR burns 100% CPU > during re-configure, and xorpsh seems fairly expensive to launch as well. If OLSR is racing during a reconfigure, profiling data would be interesting to see. I can say off the top of my head that most OLSR reconfiguration ops will cause the link state to be lost or recalculated. If there is low hanging race fruit to be fixed, bring it on. One of the things Pavlin identified as a possible TODO item is to refactor the routing processes to use a transaction model at the RPC (XRL or Thrift) layer -- this would greatly simplify the Router Manager, as it can then use commit/rollback directly, rather than trying to emulate it in the config tree. However it does push some of the complexity of holding configuration state to the processes themselves. > > That reminds me: What is the plan when 'corporate' releases their > code to customers? Since it's GPL, we will then have access to the > source. As far as I know, not all of the code in the corporate branch is under the GPL, some of it is subject to NDA -- so no, not all of that source would be publicly visible. 
Obviously the parts which are under GPL, are already available in the public tree, however it's up to XORP, Inc. to make changes to GPLed code available publicly. I'm not responsible for compliance, so I can't speak for whether or not that really is the case. It seems reasonable that release would be on a best-effort basis. > Do you plan to merge their tree with the public SVN tree at that point? I had sketched out a schema for this with JT, but as he is no longer responsible for SVN inside the company, this would be subject to discussion again in the future. The plan has been to try to preserve similar source tree layouts to make this easier. As corporate is obviously using a private SVN server, the revision numbers change, and a merge process is then required. This can be automated up to a point, but would be mostly manual. It happens anyway whenever we cross a versioning system boundary, so the change involved is largely procedural -- it's down to how the tools are deployed to facilitate code sharing. SVN mergeinfo is properly supported since Subversion 1.5, so it shouldn't be a big deal tool-wise. cheers, BMS From bms at incunabulum.net Wed Sep 30 13:56:42 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Wed, 30 Sep 2009 21:56:42 +0100 Subject: [Xorp-hackers] OLSR assert In-Reply-To: <4AC3C3B7.40204@candelatech.com> References: <4AC3B065.3070300@candelatech.com> <4AC3C015.5070703@incunabulum.net> <4AC3C3B7.40204@candelatech.com> Message-ID: <4AC3C60A.7060708@incunabulum.net> Ben Greear wrote: > > The more I look, the weirder it seems..but I may be mis-interpreting > things. It really takes some visualization to get through, I sat up with a good book on graph theory to work it all out. "Introduction to Graph Theory" by Trudeau is really good to have around. Some of the naming in the RFC is counterintuitive. E.g. an MPR set is the set of relays chosen by the local node, an MPR selector set is the set of neighbours which choose the local node as a relay. 
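The naming distinction above can be made concrete with a small sketch. It is illustrative only and does not reflect XORP's data structures: NodeId, MprChoices, mpr_set and mpr_selector_set are hypothetical names, and the sketch assumes every node's chosen relays are known in one table, which a real OLSR node only learns piecemeal from HELLO messages.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using NodeId = std::string;

// Hypothetical table: for each node, the relays (MPRs) that node has chosen.
using MprChoices = std::map<NodeId, std::set<NodeId> >;

// MPR set of `local`: the relays chosen BY the local node.
std::set<NodeId> mpr_set(const MprChoices& choices, const NodeId& local) {
    MprChoices::const_iterator it = choices.find(local);
    if (it == choices.end())
        return std::set<NodeId>();
    return it->second;
}

// MPR selector set of `local`: the nodes that chose the local node AS a relay.
std::set<NodeId> mpr_selector_set(const MprChoices& choices, const NodeId& local) {
    std::set<NodeId> selectors;
    for (MprChoices::const_iterator it = choices.begin(); it != choices.end(); ++it) {
        if (it->first != local && it->second.count(local) > 0)
            selectors.insert(it->first);
    }
    return selectors;
}
```

If A chooses {B}, B chooses {C} and C chooses {B}, then B's MPR set is {C} while B's MPR selector set is {A, C}: the two sets answer opposite questions and need not overlap.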
All of this is happening in near-real-time, or at least as close to real time as you can get with the link state update quantization. So the regression tests can sometimes fail on slow machines due to timer aliasing. > > The code looks quite tricky..and reading the pertinent subsection of > the RFC > is not helping too much. No worries. I had to read it about 3 times incrementally before it sank in, and even then I had what you might call a minor nervous breakdown. cheers, BMS From greearb at candelatech.com Wed Sep 30 14:09:00 2009 From: greearb at candelatech.com (Ben Greear) Date: Wed, 30 Sep 2009 14:09:00 -0700 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC3C4C5.7040307@incunabulum.net> References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com> <4AC37E69.4040407@incunabulum.net> <4AC38E63.6030308@candelatech.com> <4AC3C4C5.7040307@incunabulum.net> Message-ID: <4AC3C8EC.7090001@candelatech.com> On 09/30/2009 01:51 PM, Bruce Simpson wrote: > Ben Greear wrote: >> The problem is that a method called by an object can cause that object >> to be deleted, and when that method continues, it is accessing deleted >> memory. > > I defer to the wisdom of folk like Scott Meyers in this sort of > situation... however I'm glad the problem is read access, which limits > the scope of error to logic, rather than heap corruption. Caveat: I > didn't write it, I just have to work with it :-) It causes core dumps because some other process (or this process) can allocate and scribble on the deleted memory, so when the method accesses that memory it can wander off and do horrible things. >> That reminds me: What is the plan when 'corporate' releases their >> code to customers? Since it's GPL, we will then have access to the >> source. 
>
> As far as I know, not all of the code in the corporate branch is under
> the GPL, some of it is subject to NDA -- so no, not all of that source
> would be publicly visible.

Well, anything that links with any of the (external SVN) code in Xorp becomes GPL. They may have a private copy of some XRL logic that allows them to link proprietary protocols, I suppose...

They would NOT be allowed to pull changes from the external SVN tree into their internal tree and not treat that code as GPL.

That said, the GPL only takes effect when you sell/distribute the source outside your domain..so until they ship something, they are not in any violation regardless of other issues.

> Obviously the parts which are under GPL, are already available in the
> public tree, however it's up to XORP, Inc. to make changes to GPLed code
> available publicly.
>
> I'm not responsible for compliance, so I can't speak for whether or not
> that really is the case. It seems reasonable that release would be on a
> best-effort basis.

Well, I guess we'll see how that goes. If corporate goes off and makes big structural changes, and we do similar, seems like we'll probably never merge a stable product in either direction, regardless of licenses involved...

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From greearb at candelatech.com Wed Sep 30 14:33:26 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 14:33:26 -0700
Subject: [Xorp-hackers] OLSR assert
In-Reply-To: <4AC3C60A.7060708@incunabulum.net>
References: <4AC3B065.3070300@candelatech.com> <4AC3C015.5070703@incunabulum.net> <4AC3C3B7.40204@candelatech.com> <4AC3C60A.7060708@incunabulum.net>
Message-ID: <4AC3CEA6.5040207@candelatech.com>

On 09/30/2009 01:56 PM, Bruce Simpson wrote:
> Ben Greear wrote:
>>
>> The more I look, the weirder it seems..but I may be mis-interpreting
>> things.
> > It really takes some visualization to get through, I sat up with a good > book on graph theory to work it all out. "Introduction to Graph Theory" > by Trudeau is really good to have around. > > Some of the naming in the RFC is counterintuitive. E.g. an MPR set is > the set of relays chosen by the local node, an MPR selector set is the > set of neighbours which choose the local node as a relay. > > All of this is happening in near-real-time, or at least as close to real > time as you can get with the link state update quantization. So the > regression tests can sometimes fail on slow machines due to timer aliasing. My regression tests are more horrible yet...I'm using 'real' routers :) But, I'm not exhaustively checking that route propagating works properly in all directions at this point. >> The code looks quite tricky..and reading the pertinent subsection of >> the RFC >> is not helping too much. > > No worries. I had to read it about 3 times incrementally before it sank > in, and even then I had what you might call a minor nervous breakdown. Here's an attached patch that seems to fix things. I believe the main error was checking for (!is_mpr()) in consider_remaining_cand_mprs I can't see why that check helps anything, and it was excluding from consideration the mpr that was needed to find the 2-hop neighbor in my setup. With the attached patch, no more crashes and routes seem to be propagating properly, though I haven't done a full verification of the (large amount of) routes. Most of the patch is debugging, but since I fear I'll be visiting this again someday, I'm leaving that in my tree... Considering OLSR seems to be at least mostly working, maybe you could commit the two patches I posted a few days ago that enables other ppl to build OLSR? Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: olsr_neigh2_assert.patch
Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090930/cc97d071/attachment.ksh

From greearb at candelatech.com Wed Sep 30 14:53:30 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 14:53:30 -0700
Subject: [Xorp-hackers] PATCH: Fix endless loop in OLSR
Message-ID: <4AC3D35A.3030100@candelatech.com>

Seems a simple but deadly bug:

diff --git a/contrib/olsr/external.cc b/contrib/olsr/external.cc
index a475dd2..79124c2 100644
--- a/contrib/olsr/external.cc
+++ b/contrib/olsr/external.cc
@@ -242,7 +242,7 @@ ExternalRoutes::delete_hna_route_in(OlsrTypes::ExternalID erid)
         ExternalDestInMap::iterator> rd =
         _routes_in_by_dest.equal_range(er->dest());
     ExternalDestInMap::iterator jj;
-    for (jj = rd.first; jj != rd.second; ) {
+    for (jj = rd.first; jj != rd.second; jj++) {
         if ((*jj).second == erid) {
             _routes_in_by_dest.erase(jj);
             break;

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

From greearb at candelatech.com Wed Sep 30 15:37:26 2009
From: greearb at candelatech.com (Ben Greear)
Date: Wed, 30 Sep 2009 15:37:26 -0700
Subject: [Xorp-hackers] Another OLSR patch.
Message-ID: <4AC3DDA6.6030009@candelatech.com>

This is on top of my previous OLSR patches.

* Stop spt.hh from spewing warnings (related to OLSR)

* Add debugging logic to help figure out why the assert related to
  covered_n2_count >= reachable_n2_count is happening. I saw it
  once...but it's much harder to reproduce now. The 'dbg' message will
  be visible in core files, and it's also visible in the xorp logs,
  but it doesn't spam unless the error occurs.

Thanks,
Ben

--
Ben Greear
Candela Technologies Inc http://www.candelatech.com

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: olsr_dbg.patch
Url: http://mailman.ICSI.Berkeley.EDU/pipermail/xorp-hackers/attachments/20090930/1b344a5c/attachment-0001.ksh

From bms at incunabulum.net Wed Sep 30 15:50:30 2009
From: bms at incunabulum.net (Bruce Simpson)
Date: Wed, 30 Sep 2009 23:50:30 +0100
Subject: [Xorp-hackers] OLSR assert
In-Reply-To: <4AC3C015.5070703@incunabulum.net>
References: <4AC3B065.3070300@candelatech.com> <4AC3C015.5070703@incunabulum.net>
Message-ID: <4AC3E0B6.4050206@incunabulum.net>

Bruce Simpson wrote:
> Ben Greear wrote:
>
>> The reset_twohop_mpr_state counts neighbors that are strict and reachable.
>> But, the consider_poorly_covered method checks for reachability == 1.
>> In the log below, neighbor 10.7.7.7 is not counted in poorly_covered.
>> Should we maybe check for reachability() > 0 instead of == 1?
>>
>>
>
> Off the top of my head, for classical OLSR, as specified in the RFC, it
> needs to be covered by a minimum of 1 neighbour, in terms of links.
> I don't have the code in front of me, obviously a test of reachability
> == 1 would be naive. If the fix is that simple, that's great.
>

I just skimmed this code again. Based on how I interpreted the RFC, a 'poorly covered two-hop neighbour' is indeed one which is reachable only by a single one-hop neighbour, so the reachability == 1 test is fine.

This is the case for classical OLSR, but not the case for ETX where the reachability has fractional dimension, and it's something the 'reachability' metric has to take into account.

We could use floating point to compute this, but integer operations are generally better for performance; that's just an implementation constraint. There are places where you can't get away from using floating point. It's not that bad if the target supports IEEE 754 efficiently.

An uncovered two-hop neighbour would not be considered in the MPR computation, and the existence of such would normally only be an artefact of pending updates.
Jitter is usually used to avoid such situations, and isn't fully implemented in the XORP OLSR process (I did say it was unfinished work :-)). There are a number of areas where the theory doesn't meet the practice. The OLSR.org crew did do a lot of modeling work in the end, and there's places where some of the theory in the RFC falls down. cheers, BMS From bms at incunabulum.net Wed Sep 30 16:40:59 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Thu, 01 Oct 2009 00:40:59 +0100 Subject: [Xorp-hackers] PATCH: Fix endless loop in OLSR In-Reply-To: <4AC3D35A.3030100@candelatech.com> References: <4AC3D35A.3030100@candelatech.com> Message-ID: <4AC3EC8B.10709@incunabulum.net> Committed, thanks! From bms at incunabulum.net Wed Sep 30 16:55:38 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Thu, 01 Oct 2009 00:55:38 +0100 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC3C8EC.7090001@candelatech.com> References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com> <4AC37E69.4040407@incunabulum.net> <4AC38E63.6030308@candelatech.com> <4AC3C4C5.7040307@incunabulum.net> <4AC3C8EC.7090001@candelatech.com> Message-ID: <4AC3EFFA.8040705@incunabulum.net> Ben Greear wrote: > > Well, anything that links with any of the (external SVN) code in Xorp > becomes GPL. > They may have a private copy of some XRL logic that allows them to > link proprietary > protocols, I suppose... You raise a valid point, and it's one that's worth addressing further. The XORP libraries are in fact LGPLv2; the protocols, in the community branch, just happen to be GPLv2. So the virality of the GPL doesn't apply, just because a process happens to speak XRL. The scope of the GPL was purely limited to individual routing processes, not the core libraries, which are LGPL. 
The XRL RPC stubs don't actually have an explicit license, and should probably be updated to reflect either LGPL or public domain status. I haven't looked at the corporate code in detail (other than seeing a directory manifest and working with the SCons* files), so I can't speak for the rest of the stack. My understanding is that ongoing development there is subject to NDA. In this case, the modules are probably being developed from scratch, so the GPL probably doesn't apply in any form, in that instance. > Well, I guess we'll see how that goes. If corporate goes off and > makes big > structural changes, and we do similar, seems like we'll probably never > merge > a stable product in either direction, regardless of licenses involved... Again, as I understand it, the routing processes are in similar locations in the source tree, so two-way merges should be possible. Given that the product is not using the Router Manager, this scenario has in fact already happened. thanks, BMS From bms at incunabulum.net Wed Sep 30 17:10:05 2009 From: bms at incunabulum.net (Bruce Simpson) Date: Thu, 01 Oct 2009 01:10:05 +0100 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC3EFFA.8040705@incunabulum.net> References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com> <4AC37E69.4040407@incunabulum.net> <4AC38E63.6030308@candelatech.com> <4AC3C4C5.7040307@incunabulum.net> <4AC3C8EC.7090001@candelatech.com> <4AC3EFFA.8040705@incunabulum.net> Message-ID: <4AC3F35D.9000601@incunabulum.net> Bruce Simpson wrote: > ... > The XORP libraries are in fact LGPLv2; the protocols, in the community > branch, just happen to be GPLv2. So the virality of the GPL doesn't > apply, just because a process happens to speak XRL. P.S. Thrift has the Apache ASF license, which is mostly BSD / MIT like, so has no virality. 
Assuming the Thrift XRL refactoring work is successfully completed (which looks likely at this point in time), the libxipc shims would be sufficiently different from the original implementation to be candidates for relicensing; only the shell of API needed for linkage to existing XORP processes would remain. Whilst ABI (binary) compatibility is likely it's not something I'm ruling in at this stage of the work. It's likely the original LGPL license would be preserved at the point of merge, as it could constitute a derived work, although the libxipc change hasn't significantly changed since it was last released under a BSD license, which is not viral. In any event, external contributors would be free to make whatever changes they like, without any obligations, providing they don't touch code which has been GPLed (i.e. the RIB, FEA, or existing protocols). From greearb at candelatech.com Wed Sep 30 17:24:36 2009 From: greearb at candelatech.com (Ben Greear) Date: Wed, 30 Sep 2009 17:24:36 -0700 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC3F35D.9000601@incunabulum.net> References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com> <4AC37E69.4040407@incunabulum.net> <4AC38E63.6030308@candelatech.com> <4AC3C4C5.7040307@incunabulum.net> <4AC3C8EC.7090001@candelatech.com> <4AC3EFFA.8040705@incunabulum.net> <4AC3F35D.9000601@incunabulum.net> Message-ID: <4AC3F6C4.6050107@candelatech.com> On 09/30/2009 05:10 PM, Bruce Simpson wrote: > Bruce Simpson wrote: >> ... >> The XORP libraries are in fact LGPLv2; the protocols, in the community >> branch, just happen to be GPLv2. So the virality of the GPL doesn't >> apply, just because a process happens to speak XRL. > > P.S. Thrift has the Apache ASF license, which is mostly BSD / MIT like, > so has no virality. 
> > Assuming the Thrift XRL refactoring work is successfully completed > (which looks likely at this point in time), the libxipc shims would be > sufficiently different from the original implementation to be candidates > for relicensing; only the shell of API needed for linkage to existing > XORP processes would remain. Whilst ABI (binary) compatibility is likely > it's not something I'm ruling in at this stage of the work. > > It's likely the original LGPL license would be preserved at the point of > merge, as it could constitute a derived work, although the libxipc > change hasn't significantly changed since it was last released under a > BSD license, which is not viral. > > In any event, external contributors would be free to make whatever > changes they like, without any obligations, providing they don't touch > code which has been GPLed (i.e. the RIB, FEA, or existing protocols). Yes, I agree. I hadn't realized that the libraries were licensed LGPL. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From jtc at acorntoolworks.com Wed Sep 30 18:17:48 2009 From: jtc at acorntoolworks.com (J.T. Conklin) Date: Wed, 30 Sep 2009 18:17:48 -0700 Subject: [Xorp-hackers] valgrind: selector.cc: Reading free'd memory In-Reply-To: <4AC3C8EC.7090001@candelatech.com> (Ben Greear's message of "Wed, 30 Sep 2009 14:09:00 -0700") References: <4AC2AA1F.1080308@candelatech.com> <4AC2BFEC.6010802@candelatech.com> <4AC324FE.7010700@incunabulum.net> <4AC37746.2080004@candelatech.com> <4AC37E69.4040407@incunabulum.net> <4AC38E63.6030308@candelatech.com> <4AC3C4C5.7040307@incunabulum.net> <4AC3C8EC.7090001@candelatech.com> Message-ID: <878wfvud37.fsf@orac.acorntoolworks.com> Hi Ben, Bruce, Ben Greear writes: > On 09/30/2009 01:51 PM, Bruce Simpson wrote: >> As far as I know, not all of the code in the corporate branch is under >> the GPL, some of it is subject to NDA -- so no, not all of that source >> would be publicly visible. 
>
> Well, anything that links with any of the (external SVN) code in
> Xorp becomes GPL. They may have a private copy of some XRL logic
> that allows them to link proprietary protocols, I suppose...
>
> They would NOT be allowed to pull changes from the external SVN tree
> into their internal tree and not treat that code as GPL.
>
> That said, the GPL only takes effect when you sell/distribute the
> source outside your domain..so until they ship something, they are
> not in any violation regardless of other issues.

My understanding is that XORP Inc. holds the copyright of (most of) the code in their corporate repo, which enabled the change of license terms from BSD to GPL in the first place. In theory, this allows them the flexibility to release their commercial product under a different, non-GPL, license.

However, no copyright assignment or copyright disclaimer is required for community enhancements or bug fixes (as there is for FSF/GNU projects). IANAL, but I believe copyright of those would be considered retained by the author/contributor, with implied distribution rights granted under the GPL.

So I would tend to agree that changes taken from the community SVN repo going forward would remove any flexibility to distribute the corporate product under non-GPL terms.

--jtc

--
J.T. Conklin