[Xorp-hackers] rtrmgr crash on SIGABRT because of pop_front in task_done

Li Zhao lizhaous2000 at yahoo.com
Thu Oct 29 07:54:05 PDT 2009


I added a new protocol and can start it from the CLI with the command "create protocol XXX", but rtrmgr crashes after the command "delete protocol XXX".
I can also easily reproduce exactly the same crash with the following steps:

0. I am running the xorp processes on an embedded system.
1. Start rtrmgr from the Linux shell on the system.
2. Manually start xorp_static_routes from the Linux shell. This static process will hijack the XRL channels to rtrmgr.
3. Use the CLI command "create protocol static" to start a second xorp_static_routes.
4. Use the CLI command "delete protocol static" to stop static. Both xorp_static_routes processes are terminated. Dependent processes such as fea, rib and policy are also terminated, and rtrmgr crashes.

I am attaching two stack traces. The first is for my new protocol XXX case and the second is for the case triggered by static.

Does anybody have any clue? Thanks.

Li

Case 1:

(gdb) tar rem 10.65.1.117:6666
Remote debugging using 10.65.1.117:6666
0x0059a850 in _start () from /lib/ld-linux.so.2
Current language:  auto; currently c
(gdb) dis b
(gdb) c
Continuing.
[New Thread 0]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0]
0xb80cd424 in ?? ()
(gdb) bt
#0  0xb80cd424 in ?? ()
#1  0xbffc2624 in ?? ()
#2  0x00000006 in ?? ()
#3  0x000017fe in ?? ()
#4  0x00a71450 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#5  0x00a72e18 in abort () at abort.c:88
#6  0x00aaefdd in __libc_message (do_abort=2, 
    fmt=0xb89bc8 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#7  0x00ab5394 in malloc_printerr (action=2, 
    str=0xb86a88 "free(): invalid pointer", ptr=0x8d55238) at malloc.c:5994
#8  0x00ab7346 in __libc_free (mem=0x8d55238) at malloc.c:3625
#9  0x05438591 in operator delete (ptr=0x0)
    at ../../../../libstdc++-v3/libsupc++/del_op.cc:49
#10 0x080a2f5f in __gnu_cxx::new_allocator<std::_List_node<Task*> >::deallocate
    (this=0x8d55238, __p=0x8d55238)
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/ext/new_allocator.h:98
#11 0x080a2f84 in std::_List_base<Task*, std::allocator<Task*> >::_M_put_node (
    this=0x8d55238, __p=0x8d55238)
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_list.h:318
#12 0x080a6f39 in std::list<Task*, std::allocator<Task*> >::_M_erase (
    this=0x8d55238, __position={_M_node = 0x8d55238})
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_list.h:1361
#13 0x080a6f6b in std::list<Task*, std::allocator<Task*> >::pop_front (
    this=0x8d55238)
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_list.h:861
#14 0x08098c23 in TaskManager::task_done (this=0x8d55210, success=true, errmsg=
        {static npos = 4294967295, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x546ccd4 ""}}) at task.cc:2251
#15 0x080a5911 in XorpMemberCallback2B0<void, TaskManager, bool, std::string>::dispatch (this=0x8d60228, a1=true, a2=
        {static npos = 4294967295, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x546ccd4 ""}}) at ../libxorp/callback_nodebug.hh:4636
#16 0x08095bd1 in Task::step8_report (this=0x8d60460) at task.cc:1993
#17 0x080a22e7 in XorpMemberCallback0B0<void, Task>::dispatch (this=0x8d5fd90)
    at ../libxorp/callback_nodebug.hh:306
#18 0x0808b2c1 in Module::terminate_with_prejudice (this=0x8d58450, cb=
      {_M_ptr = 0x8d5fd90, _M_index = 110}) at module_manager.cc:218
#19 0x0808f36e in XorpMemberCallback0B1<void, Module, ref_ptr<XorpCallback0<void> > >::dispatch (this=0x8d60938) at ../libxorp/callback_nodebug.hh:598
#20 0x081af7da in OneoffTimerNode2::expire (this=0x8d5ff28) at timer.cc:167
#21 0x081ae8ed in TimerList::expire_one (this=0xbffcce4c, worst_priority=4)
    at timer.cc:441
#22 0x081aea48 in TimerList::run (this=0xbffcce4c) at timer.cc:389
#23 0x08198564 in EventLoop::do_work (this=0xbffcce48, can_block=true)
    at eventloop.cc:153
#24 0x08198828 in EventLoop::run (this=0xbffcce48) at eventloop.cc:99
#25 0x080682df in Rtrmgr::run (this=0xbffcd4b4) at main_rtrmgr.cc:418
#26 0x08069432 in main (argc=6, argv=0xbffcd5c4) at main_rtrmgr.cc:725
(gdb) 


Case 2:

Program received signal SIGABRT, Aborted.
[Switching to Thread 0]
0xb80db424 in ?? ()
(gdb) bt
#0  0xb80db424 in ?? ()
#1  0xbffceeb4 in ?? ()
#2  0x00000006 in ?? ()
#3  0x00001802 in ?? ()
#4  0x00a71450 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#5  0x00a72e18 in abort () at abort.c:88
#6  0x00aaefdd in __libc_message (do_abort=2, 
    fmt=0xb89bc8 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#7  0x00ab5394 in malloc_printerr (action=2, 
    str=0xb89bf4 "munmap_chunk(): invalid pointer", ptr=0x93ed238)
    at malloc.c:5994
#8  0x05438591 in operator delete (ptr=0x0)
    at ../../../../libstdc++-v3/libsupc++/del_op.cc:49
#9  0x080a2f5f in __gnu_cxx::new_allocator<std::_List_node<Task*> >::deallocate
    (this=0x93ed238, __p=0x93ed238)
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/ext/new_allocator.h:98
#10 0x080a2f84 in std::_List_base<Task*, std::allocator<Task*> >::_M_put_node (
    this=0x93ed238, __p=0x93ed238)
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_list.h:318
#11 0x080a6f39 in std::list<Task*, std::allocator<Task*> >::_M_erase (
    this=0x93ed238, __position={_M_node = 0x93ed238})
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_list.h:1361
#12 0x080a6f6b in std::list<Task*, std::allocator<Task*> >::pop_front (
    this=0x93ed238)
    at /usr/lib/gcc/i386-redhat-linux/4.3.2/../../../../include/c++/4.3.2/bits/stl_list.h:861
#13 0x08098c23 in TaskManager::task_done (this=0x93ed210, success=true, errmsg=
        {static npos = 4294967295, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x546ccd4 ""}}) at task.cc:2251
#14 0x080a5911 in XorpMemberCallback2B0<void, TaskManager, bool, std::string>::dispatch (this=0x93f4e80, a1=true, a2=
        {static npos = 4294967295, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x546ccd4 ""}}) at ../libxorp/callback_nodebug.hh:4636
#15 0x08095bd1 in Task::step8_report (this=0x93f3c78) at task.cc:1993
#16 0x080a22e7 in XorpMemberCallback0B0<void, Task>::dispatch (this=0x93f4ba0)
    at ../libxorp/callback_nodebug.hh:306
#17 0x0808b64b in Module::terminate (this=0x93f39a0, cb=
      {_M_ptr = 0x93f4ba0, _M_index = 284}) at module_manager.cc:166
#18 0x0808c0a5 in ModuleManager::kill_module (this=0xbffdbb68, 
    module_name=@0x93f3c80, cb={_M_ptr = 0x93f4ba0, _M_index = 284})
    at module_manager.cc:472
#19 0x08093e38 in Task::step7_kill (this=0x93f3c78) at task.cc:1983
#20 0x080a22e7 in XorpMemberCallback0B0<void, Task>::dispatch (this=0x93f3910)
    at ../libxorp/callback_nodebug.hh:306
#21 0x081af7da in OneoffTimerNode2::expire (this=0x942f198) at timer.cc:167
#22 0x081ae8ed in TimerList::expire_one (this=0xbffdb65c, worst_priority=4)
    at timer.cc:441
#23 0x081aea48 in TimerList::run (this=0xbffdb65c) at timer.cc:389
#24 0x08198564 in EventLoop::do_work (this=0xbffdb658, can_block=true)
    at eventloop.cc:153
#25 0x08198828 in EventLoop::run (this=0xbffdb658) at eventloop.cc:99
#26 0x080682df in Rtrmgr::run (this=0xbffdbcc4) at main_rtrmgr.cc:418
#27 0x08069432 in main (argc=6, argv=0xbffdbdd4) at main_rtrmgr.cc:725
(gdb) c
Continuing.

Program terminated with signal SIGABRT, Aborted.
The program no longer exists.
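
One detail is common to both traces: in the pop_front frame, the list's "this" pointer and the _M_node of the iterator passed to _M_erase are the same address (0x8d55238 in case 1, 0x93ed238 in case 2). In libstdc++ the sentinel node is stored inside the std::list object itself, and begin() on an empty list points at that sentinel, so this pattern is consistent with pop_front() being called on an empty list. That is undefined behavior, and here it would hand free() a pointer that was never returned by malloc, which matches the "invalid pointer" messages. The following is only a minimal sketch of guarding such a call; Task is a stand-in struct and pop_front_checked is a hypothetical helper, not the actual code in task.cc:

```cpp
#include <list>

// Hypothetical stand-in for rtrmgr's Task class; only the list handling matters here.
struct Task {
    int id;
};

// Remove and return the front task, or return nullptr if the list is empty.
// Calling std::list::pop_front() on an empty list is undefined behavior,
// so the emptiness check must come first.
Task* pop_front_checked(std::list<Task*>& tasks) {
    if (tasks.empty()) {
        return nullptr;  // nothing queued: report it instead of corrupting the heap
    }
    Task* front = tasks.front();
    tasks.pop_front();
    return front;
}

// Small usage example: the second call finds the list empty and removes nothing.
// Returns true when the helper behaved as described.
bool demo() {
    Task only = {1};
    std::list<Task*> tasks;
    tasks.push_back(&only);

    Task* first = pop_front_checked(tasks);   // removes &only
    Task* second = pop_front_checked(tasks);  // list is already empty

    return first == &only && second == nullptr && tasks.empty();
}
```

A guard like this would only hide the symptom. If the list really is empty at that point, the open question is why task_done runs with no task queued; one possibility, not confirmed by the traces, is that it is reached more than once for the same task.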



      


