Summary: | ASTERISK-11774: Asterisk crashes after timeout / redirect / hangup when directly parking a call via AMI interface | ||
Reporter: | pguido (pguido) | Labels: | |
Date Opened: | 2008-04-03 02:51:11 | Date Closed: | 2008-04-14 11:20:59 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | Resources/res_features |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) cleanup_datastore.patch.1.4.18.1.patch ( 1) park_crash.tgz | |
Description: | A (external) calls B (internal). B directly parks the call via the AMI PARK command.
In these situations Asterisk crashes:
- Parking times out
- A hangs up
- B sends a hangup via the AMI interface
- B sends a redirect via the AMI interface
The Park command uses
Channel: <channel of A>
Channel2: <channel of B>
It seems the datastores within the channel get corrupted.
Redirect case:
The corrupted datastore:
(gdb) p *ast_channel_datastore_find::chan->datastores->first
$9 = {uid = 0x0, data = 0x17e2a527, info = 0x0, inheritance = 772014104, entry = {next = 0x3030002e}}
(gdb) p *ast_channel_datastore_find::chan->datastores->first->entry->next
Cannot access memory at address 0x3030002e
Backtrace:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1215976560 (LWP 1522)]
0x080865f7 in ast_channel_datastore_find (chan=0x969a0b0, info=0x81544e0, uid=0x0) at channel.c:1356
1356 AST_LIST_TRAVERSE_SAFE_BEGIN(&chan->datastores, datastore, entry) {
(gdb) backtrace
#0 0x080865f7 in ast_channel_datastore_find (chan=0x969a0b0, info=0x81544e0, uid=0x0) at channel.c:1356
#1 0x003b39c6 in dial_exec_full (chan=0x969a0b0, data=0xb7852e48, peerflags=0xb7850d14, continue_exec=0x0) at app_dial.c:1133
#2 0x003b74d2 in dial_exec (chan=0x969a0b0, data=0xb7852e48) at app_dial.c:1760
#3 0x080cd4da in pbx_exec (c=0x969a0b0, app=0x9639f30, data=0xb7852e48) at pbx.c:532
#4 0x080d11f6 in pbx_extension_helper (c=0x969a0b0, con=0x0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=3, label=0x0, callerid=0x9688788 "0061313279755", action=E_SPAWN) at pbx.c:1851
#5 0x080d253b in ast_spawn_extension (c=0x969a0b0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=3, callerid=0x9688788 "0061313279755") at pbx.c:2306
#6 0x00215f83 in _macro_exec (chan=0x969a0b0, data=0xb7857f38, exclusive=0) at app_macro.c:308
#7 0x00216ca2 in macro_exec (chan=0x969a0b0, data=0xb7857f38) at app_macro.c:486
#8 0x080cd4da in pbx_exec (c=0x969a0b0, app=0x9631b10, data=0xb7857f38) at pbx.c:532
#9 0x080d11f6 in pbx_extension_helper (c=0x969a0b0, con=0x0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=1, label=0x0, callerid=0x9688788 "0061313279755", action=E_SPAWN) at pbx.c:1851
#10 0x080d253b in ast_spawn_extension (c=0x969a0b0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=1, callerid=0x9688788 "0061313279755") at pbx.c:2306
#11 0x080d2a67 in __ast_pbx_run (c=0x969a0b0) at pbx.c:2408
#12 0x080d3883 in pbx_thread (data=0x969a0b0) at pbx.c:2623
#13 0x08116085 in dummy_start (data=0x9688760) at utils.c:852
#14 0x00a7045b in start_thread () from /lib/libpthread.so.0
#15 0x009c824e in clone () from /lib/libc.so.6
Hangup case:
Backtrace:
#0 0x00392402 in __kernel_vsyscall ()
#1 0x00922ba0 in raise () from /lib/libc.so.6
#2 0x009244b1 in abort () from /lib/libc.so.6
#3 0x00958dfb in __libc_message () from /lib/libc.so.6
#4 0x00960aa6 in _int_free () from /lib/libc.so.6
#5 0x00963fc0 in free () from /lib/libc.so.6
#6 0x08081efb in ast_channel_free (chan=0xa37120) at channel.c:1295
#7 0x0808473b in ast_hangup (chan=0xa052b70) at channel.c:1496
#8 0x00348313 in do_parking_thread (ignore=0x0) at res_features.c:1752
#9 0x080f97fb in dummy_start (data=0x9fc3620) at utils.c:852
#10 0x00a7045b in start_thread () from /lib/libpthread.so.0
#11 0x009c824e in clone () from /lib/libc.so.6 | ||
Comments: | By: Mark Michelson (mmichelson) 2008-04-03 07:55:12
Since this is a memory corruption issue, could you please reproduce the situation while running Asterisk under valgrind? Instructions for doing so are in doc/valgrind.txt
Thanks!

By: pguido (pguido) 2008-04-03 10:38:31
I have attached the file park_crash.tgz, which includes the valgrind output. It seems the hangup is caught by valgrind, but the redirect nevertheless led to a crash.

By: Norbert Reinartz (nreinartz) 2008-04-03 12:58:03
I am able to reproduce this bug. Tested against 1.4.18.1.
- A calls B
- B does not pick up the call
- channel of A is parked via AMI PARK command
- A hangs up
--> program crashes immediately
I did some debugging and found that the program crashes in ast_channel_datastore_free() of channel.c. The following happens:
- A calls B:
app_dial.c / dial_exec_full(): chan->name is SIP/3456-08247bd0, program runs up to "peer = wait_for_answer(chan, ..)" in dial_exec_full()
- B does not pick up the call:
program stays at "peer = wait_for_answer(chan, ..)" in dial_exec_full()
- channel of A is parked via AMI PARK command:
manager_park() / res_features.c ast_masq_park_call(): channel is masqueraded, chan->name is changed to "Parked/SIP/3456-08247bd0<ZOMBIE>"
app_dial.c / dial_exec_full() goes on: call to ast_channel_datastore_free(), chan->name is "Parked/SIP/3456-08247bd0<ZOMBIE>" !!!!!
ast_channel_datastore_free() / channel.c: frees the datastore
- A hangs up:
channel.c, ast_channel_free(): chan->name 'SIP/3456-08247bd0'
ast_channel_datastore_free() / channel.c: tries to free the same datastore content as above !!!!!
--> program crashes
Both the original channel and the masqueraded channel use a datastore with the same content. As both try to clean it up, the program crashes.

By: Mark Michelson (mmichelson) 2008-04-03 15:47:26
pguido and nreinartz: Great feedback! I will look at this further on Monday, since I will be on vacation until then. Thanks to your input, this shouldn't be a problem to solve.

By: Norbert Reinartz (nreinartz) 2008-04-04 07:58:17
Created a patch to fix the software crash described above. This patch has been tested with 1.4.18.1. It works, but I don't know if there are any negative effects. It's more of a dirty workaround, as it doesn't fix the underlying problem, which is more general: how should datastores be handled if they are duplicated by the masquerade of a channel?

By: Mark Michelson (mmichelson) 2008-04-07 18:50:17
nreinartz: I am unable to view your patch because your license is still pending, but I thought I'd comment on your last question regarding how to handle datastores when they are duplicated during a masquerade. The answer is that the datastore actually is not duplicated during a masquerade. During a masquerade, the AST_LIST_APPEND_LIST macro is called, which actually moves the list of datastores from one channel to the other. It does not copy them. The problem is that app_dial still has a reference to the datastore which has been moved. So what is happening is that app_dial calls ast_datastore_free() on the datastore that has actually been moved to a completely different channel than it was on to begin with.
I have in mind a patch to fix this, but I'm curious to see first if you had the same idea with the patch you uploaded. I would expect your license to be approved some time tomorrow, so I'll have a look at it then. Thanks for the help on this and for the good questions. It's always fun to discuss things like this.

By: Mark Michelson (mmichelson) 2008-04-14 10:42:23
nreinartz: I just had a look at your patch, and I see what you were trying to accomplish, but based on what I told you in my previous note, that won't work well since the datastores aren't actually copied during a masquerade. Also, calling AST_LIST_HEAD_INIT_NOLOCK is not necessary in that scenario. I'll be committing a fix for this issue as soon as possible.
By: Digium Subversion (svnbot) 2008-04-14 11:19:31
Repository: asterisk
Revision: 114112
U branches/1.4/apps/app_dial.c
U branches/1.4/apps/app_queue.c
------------------------------------------------------------------------
r114112 | mmichelson | 2008-04-14 11:19:29 -0500 (Mon, 14 Apr 2008) | 9 lines
If the datastore has been moved to another channel due to a masquerade, then freeing the datastore here causes an eventual double free when the new channel hangs up. We should only free the datastore if we were able to successfully remove it from the channel we are referencing (i.e. the datastore was not moved).
(closes issue ASTERISK-11774)
Reported by: pguido
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=114112

By: Digium Subversion (svnbot) 2008-04-14 11:20:06
Repository: asterisk
Revision: 114113
_U trunk/
U trunk/apps/app_dial.c
U trunk/apps/app_queue.c
------------------------------------------------------------------------
r114113 | mmichelson | 2008-04-14 11:20:06 -0500 (Mon, 14 Apr 2008) | 17 lines
Merged revisions 114112 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r114112 | mmichelson | 2008-04-14 11:24:22 -0500 (Mon, 14 Apr 2008) | 9 lines
If the datastore has been moved to another channel due to a masquerade, then freeing the datastore here causes an eventual double free when the new channel hangs up. We should only free the datastore if we were able to successfully remove it from the channel we are referencing (i.e. the datastore was not moved).
(closes issue ASTERISK-11774)
Reported by: pguido
........
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=114113

By: Digium Subversion (svnbot) 2008-04-14 11:20:59
Repository: asterisk
Revision: 114114
_U branches/1.6.0/
U branches/1.6.0/apps/app_dial.c
U branches/1.6.0/apps/app_queue.c
------------------------------------------------------------------------
r114114 | mmichelson | 2008-04-14 11:20:58 -0500 (Mon, 14 Apr 2008) | 25 lines
Merged revisions 114113 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk
................
r114113 | mmichelson | 2008-04-14 11:25:09 -0500 (Mon, 14 Apr 2008) | 17 lines
Merged revisions 114112 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r114112 | mmichelson | 2008-04-14 11:24:22 -0500 (Mon, 14 Apr 2008) | 9 lines
If the datastore has been moved to another channel due to a masquerade, then freeing the datastore here causes an eventual double free when the new channel hangs up. We should only free the datastore if we were able to successfully remove it from the channel we are referencing (i.e. the datastore was not moved).
(closes issue ASTERISK-11774)
Reported by: pguido
........
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=114114 |
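
The idea behind the committed fix (free the datastore only if it could still be removed from the channel being referenced) can be illustrated with a small standalone C sketch. This is not the Asterisk code: struct channel, struct datastore, and every function name below (channel_datastore_add, channel_datastore_remove, masquerade_move_datastores, dial_cleanup, channel_hangup, run_demo) are hypothetical simplifications invented for illustration, and the masquerade is modeled only as "move the whole list to the other channel", as described in the comments above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a channel and its datastores. */
struct datastore {
    const char *uid;
    struct datastore *next;
};

struct channel {
    const char *name;
    struct datastore *datastores; /* head of a singly linked list */
};

static int free_count; /* how many datastores have been freed so far */

static void datastore_free(struct datastore *ds)
{
    free_count++;
    free(ds);
}

/* Push a datastore onto the front of the channel's list. */
static void channel_datastore_add(struct channel *chan, struct datastore *ds)
{
    ds->next = chan->datastores;
    chan->datastores = ds;
}

/* Unlink ds from chan. Returns 0 if it was found and removed, -1 otherwise. */
static int channel_datastore_remove(struct channel *chan, struct datastore *ds)
{
    struct datastore **link;

    for (link = &chan->datastores; *link; link = &(*link)->next) {
        if (*link == ds) {
            *link = ds->next;
            ds->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Model of the masquerade: the list is moved to dst, not copied. */
static void masquerade_move_datastores(struct channel *dst, struct channel *src)
{
    struct datastore **tail = &dst->datastores;

    while (*tail)
        tail = &(*tail)->next;
    *tail = src->datastores;
    src->datastores = NULL;
}

/* Guarded cleanup: free only if the datastore was still on this channel.
 * Returns 1 if the datastore was freed here, 0 if it had been moved away. */
static int dial_cleanup(struct channel *chan, struct datastore *ds)
{
    if (channel_datastore_remove(chan, ds) == 0) {
        datastore_free(ds);
        return 1;
    }
    return 0;
}

/* Hangup frees every datastore still attached to the channel. */
static void channel_hangup(struct channel *chan)
{
    while (chan->datastores) {
        struct datastore *ds = chan->datastores;
        chan->datastores = ds->next;
        datastore_free(ds);
    }
}

/* Replays the sequence from the ticket and returns the number of frees. */
static int run_demo(void)
{
    struct channel a = { "SIP/A", NULL };
    struct channel parked = { "Parked/SIP/A", NULL };
    struct datastore *ds = calloc(1, sizeof(*ds));

    if (!ds)
        return -1;
    free_count = 0;
    ds->uid = "dial-features";

    channel_datastore_add(&a, ds);           /* the dial code attaches its datastore */
    masquerade_move_datastores(&parked, &a); /* the park masquerade moves the list */

    /* The dial code still holds 'ds', but it no longer lives on 'a'. */
    assert(dial_cleanup(&a, ds) == 0);

    channel_hangup(&parked); /* the only place the datastore is freed */
    channel_hangup(&a);      /* nothing left to free here */
    return free_count;
}
```

Calling run_demo() from a main() should report a single free. Without the removal check in dial_cleanup(), the same sequence would free the datastore once in the dial cleanup and again at hangup of the parked channel, which is the double free reported in this issue.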