Summary: ASTERISK-11917: Asterisk crash randomly while doing transfer
Reporter: tcap (tcap)
Labels:
Date Opened: 2008-04-24 17:05:32
Date Closed: 2011-06-07 14:01:08
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/PBX
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) gdb_bt.txt
(1) gdb_thread_apply_bt_full.txt
Description: I am running version 1.4.10.1, and Asterisk crashes randomly during transfer tests. I can reproduce the crash after hundreds of transfers. I updated to 1.4.19, but that did not help; the same crash occurs. Does anyone know about this?

The attached GDB output is from version 1.4.10.1.

****** ADDITIONAL INFORMATION ******

(gdb) bt
#0  0x00000008 in ?? ()
#1  0x0049a018 in pbx_builtin_setvar_helper (chan=0x623624, name=0x2b503968 "ANSWEREDTIME", value=0x7e7f0758 "6")
   at pbx.c:5822
#2  0x2b4fc0fc in dial_exec_full (chan=0x6232e0, data=0x2b508000, peerflags=0x7e7f07f8, continue_exec=0x0)
   at app_dial.c:1684
#3  0x2b4ffbc0 in dial_exec (chan=0x38360020, data=0x0) at app_dial.c:1734
#4  0x0048b3b8 in pbx_exec (c=0x6232e0, app=0x7e7f8c5b, data=0x7e7f0a90) at pbx.c:532
#5  0x0049a6c0 in pbx_extension_helper (c=0x6232e0, con=0x0, context=0x623460 "macro-dialExten", exten=0x6234b0 "s",
   priority=6, label=0x6234b0 "s", callerid=0x5ac7f0 "+O?\200+P3$+P\021<", action=5949444) at pbx.c:1833
#6  0x0049acec in ast_spawn_extension (c=0x38360020, context=0x0, exten=0x5f <Address 0x5f out of bounds>,
   priority=726677865, callerid=0x5f <Address 0x5f out of bounds>) at pbx.c:2288
#7  0x2ba62fec in _macro_exec (chan=0x6232e0, data=0x5b16a0, exclusive=0) at app_macro.c:308
#8  0x2ba640fc in macro_exec (chan=0x38360020, data=0x0) at app_macro.c:486
#9  0x0048b3b8 in pbx_exec (c=0x6232e0, app=0x7e7f9740, data=0x7e7f8c5b) at pbx.c:532
#10 0x2b2a98dc in handle_exec (chan=0x6232e0, agi=0x7e7f8028, argc=721722992, argv=0x7e7f8150) at res_agi.c:1108
#11 0x2b2abde0 in agi_exec_full (chan=0x6232e0, data=0x2b2f2744, enhanced=0, dead=34) at res_agi.c:1805
#12 0x2b2aca70 in agi_exec (chan=0x6232e0, data=0x7e7f9740) at res_agi.c:2062
#13 0x0048b3b8 in pbx_exec (c=0x6232e0, app=0x0, data=0x7e7f9740) at pbx.c:532
#14 0x0049a6c0 in pbx_extension_helper (c=0x6232e0, con=0x0, context=0x623460 "macro-dialExten", exten=0x6234b0 "s",
   priority=200, label=0x6234b0 "s", callerid=0x5af7e8 "+*é?+*?\\+*??", action=5961724) at pbx.c:1833
#15 0x0049acec in ast_spawn_extension (c=0x38360020, context=0x0, exten=0x5f <Address 0x5f out of bounds>,
   priority=726677865, callerid=0x5f <Address 0x5f out of bounds>) at pbx.c:2288
#16 0x0049bc7c in __ast_pbx_run (c=0x6232e0) at pbx.c:2388
#17 0x0049d744 in pbx_thread (data=0x38360020) at pbx.c:2603
#18 0x004d2034 in dummy_start (data=0x5f) at utils.c:775
#19 0x2ab45de4 in pthread_start_thread () from /lib/libpthread.so.0
#20 0x2afa104c in ?? () from /lib/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Comments:

By: Mark Michelson (mmichelson) 2008-04-25 09:32:24

It would be much more helpful if we could have a backtrace from the latest 1.4 instead of 1.4.10.1. One reason is that it is entirely possible that the bug you are experiencing is actually happening under the same circumstances, but in a completely different section of code in 1.4.19 (although I find it doubtful).

So, please upload a backtrace from 1.4.19 (or even better, the latest svn revision of 1.4) so we can be certain it actually is the same crash. If it is, then be prepared to use valgrind to help out in solving the issue (instructions are in doc/valgrind.txt, but I'm not sure if that document was in 1.4.10.1) since this appears to be an issue of memory corruption.

Thanks very much!

By: Steve Graham (stgraham2000) 2008-05-08 12:29:08

I'm working on this bug as well. The bug is much harder to reproduce now that we have upgraded to 1.4.19, but it has still happened. Unfortunately, we have only crashed once on 1.4.19, and at that time we were not dumping cores. However, I've been reviewing the code and may have found a few issues, but I need an expert in the area to confirm my findings.

In "apps/app_dial.c", "apps/app_queue.c", and "apps/app_speech_utils.c" there are calls to "ast_channel_datastore_remove" that are not protected with locks.  It looks like this linked list needs to be protected before modification.

A couple of other functions I'm concerned about are "ast_channel_datastore_inherit" and "ast_channel_inherit_variables". Do these require locks as well? It looks like they do. If so, then I think there are a few more areas in the code that may not be protected. For some of the calls it is tricky to tell whether they are protected, because a caller higher in the call stack may already be locking the channel.
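
To make the locking concern above concrete, here is a minimal, self-contained sketch in C. It does not use the Asterisk API: "struct channel", "struct datastore", and the "channel_datastore_*" functions are hypothetical stand-ins, and a plain pthread mutex plays the role of the channel lock. The only point it illustrates is that unlinking a node from a per-channel linked list should happen while the channel's lock is held, so that another thread walking or modifying the same list never sees a half-updated pointer.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are NOT the
 * Asterisk structures or functions. */
struct datastore {
    int id;
    struct datastore *next;
};

struct channel {
    pthread_mutex_t lock;          /* plays the role of the channel lock */
    struct datastore *datastores;  /* head of a singly linked list */
};

void channel_init(struct channel *chan)
{
    pthread_mutex_init(&chan->lock, NULL);
    chan->datastores = NULL;
}

/* Unlink ds from the list. The caller must already hold chan->lock.
 * Returns 0 if ds was found and removed, -1 otherwise. */
static int datastore_remove_nolock(struct channel *chan, struct datastore *ds)
{
    struct datastore **pp;

    for (pp = &chan->datastores; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == ds) {
            *pp = ds->next;
            ds->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Push ds onto the front of the list while holding the lock. */
void channel_datastore_add(struct channel *chan, struct datastore *ds)
{
    pthread_mutex_lock(&chan->lock);
    ds->next = chan->datastores;
    chan->datastores = ds;
    pthread_mutex_unlock(&chan->lock);
}

/* Locked wrapper: every list modification goes through the lock. */
int channel_datastore_remove(struct channel *chan, struct datastore *ds)
{
    int res;

    pthread_mutex_lock(&chan->lock);
    res = datastore_remove_nolock(chan, ds);
    pthread_mutex_unlock(&chan->lock);
    return res;
}
```

In Asterisk itself, the analogous change would presumably be to hold the channel lock around each "ast_channel_datastore_remove" call at the sites listed above. Whether a given call site already runs with the lock held by a caller is exactly the open question here, so this sketch should be read as an illustration of the pattern rather than a proposed patch.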

By: Michiel van Baak (mvanbaak) 2008-06-25 11:53:51

Do you still see this issue on 1.4.21 and/or the latest 1.4 svn?

By: tcap (tcap) 2008-07-07 11:55:41

No. The issue went away after we upgraded to the latest SVN. Thanks for your help.

By: Mark Michelson (mmichelson) 2008-07-07 12:03:56

Closing at request of the reporter.