Summary: ASTERISK-11774: Asterisk crashes after timeout / redirect / hangup when directly parking a call via AMI interface
Reporter: pguido (pguido)
Labels:
Date Opened: 2008-04-03 02:51:11
Date Closed: 2008-04-14 11:20:59
Priority: Major
Regression? No
Status: Closed/Complete
Components: Resources/res_features
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) cleanup_datastore.patch.1.4.18.1.patch
( 1) park_crash.tgz

Description:
A (external) calls B (internal)
B directly parks the call via AMI PARK Command
Asterisk crashes in any of these situations:
- Parking times out
- A hangs up
- B sends hangup via AMI Interface
- B sends redirect via AMI Interface

The Park command uses
Channel: <channel of A>
Channel2: <channel of B>
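
For reference, the Park request described above could be assembled roughly as follows. This is a minimal sketch, not taken from the reporter's setup: `build_park_action` is a hypothetical helper, and it assumes the usual AMI framing of `Key: Value` lines terminated by CRLF, with a blank line ending the action.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an AMI "Park" action into buf (hypothetical helper, not Asterisk code).
 * channel  : channel of A, the call to be parked
 * channel2 : channel of B, the party doing the parking
 * Returns the number of bytes written, or -1 if buf is too small.
 */
int build_park_action(char *buf, size_t len,
                      const char *channel, const char *channel2)
{
    int n;

    assert(buf != NULL && channel != NULL && channel2 != NULL);

    n = snprintf(buf, len,
                 "Action: Park\r\n"
                 "Channel: %s\r\n"
                 "Channel2: %s\r\n"
                 "\r\n",
                 channel, channel2);

    /* snprintf reports the length it wanted; treat truncation as an error. */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

The resulting buffer would be written to an already authenticated manager connection; the login exchange is left out here.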

It seems the datastores within the channel get corrupted.

(gdb) p *ast_channel_datastore_find::chan->datastores->first
$9 = {uid = 0x0, data = 0x17e2a527, info = 0x0, inheritance = 772014104, entry = {next = 0x3030002e}}

Redirect Case:

The corrupted datastore:

(gdb) p *ast_channel_datastore_find::chan->datastores->first
$9 = {uid = 0x0, data = 0x17e2a527, info = 0x0, inheritance = 772014104, entry = {next = 0x3030002e}}

(gdb) p *ast_channel_datastore_find::chan->datastores->first->entry->next
Cannot access memory at address 0x3030002e

Backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1215976560 (LWP 1522)]
0x080865f7 in ast_channel_datastore_find (chan=0x969a0b0, info=0x81544e0, uid=0x0) at channel.c:1356
1356            AST_LIST_TRAVERSE_SAFE_BEGIN(&chan->datastores, datastore, entry) {
(gdb) backtrace
#0  0x080865f7 in ast_channel_datastore_find (chan=0x969a0b0, info=0x81544e0, uid=0x0) at channel.c:1356
#1  0x003b39c6 in dial_exec_full (chan=0x969a0b0, data=0xb7852e48, peerflags=0xb7850d14, continue_exec=0x0) at app_dial.c:1133
#2  0x003b74d2 in dial_exec (chan=0x969a0b0, data=0xb7852e48) at app_dial.c:1760
#3  0x080cd4da in pbx_exec (c=0x969a0b0, app=0x9639f30, data=0xb7852e48) at pbx.c:532
#4  0x080d11f6 in pbx_extension_helper (c=0x969a0b0, con=0x0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=3, label=0x0,
   callerid=0x9688788 "0061313279755", action=E_SPAWN) at pbx.c:1851
#5  0x080d253b in ast_spawn_extension (c=0x969a0b0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=3,
   callerid=0x9688788 "0061313279755") at pbx.c:2306
#6  0x00215f83 in _macro_exec (chan=0x969a0b0, data=0xb7857f38, exclusive=0) at app_macro.c:308
#7  0x00216ca2 in macro_exec (chan=0x969a0b0, data=0xb7857f38) at app_macro.c:486
#8  0x080cd4da in pbx_exec (c=0x969a0b0, app=0x9631b10, data=0xb7857f38) at pbx.c:532
#9  0x080d11f6 in pbx_extension_helper (c=0x969a0b0, con=0x0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=1, label=0x0,
   callerid=0x9688788 "0061313279755", action=E_SPAWN) at pbx.c:1851
#10 0x080d253b in ast_spawn_extension (c=0x969a0b0, context=0x969a2f0 "macro-intern_dial_m", exten=0x969a340 "s", priority=1,
   callerid=0x9688788 "0061313279755") at pbx.c:2306
#11 0x080d2a67 in __ast_pbx_run (c=0x969a0b0) at pbx.c:2408
#12 0x080d3883 in pbx_thread (data=0x969a0b0) at pbx.c:2623
#13 0x08116085 in dummy_start (data=0x9688760) at utils.c:852
#14 0x00a7045b in start_thread () from /lib/libpthread.so.0
#15 0x009c824e in clone () from /lib/libc.so.6

Hangup case Backtrace:

#0  0x00392402 in __kernel_vsyscall ()
#1  0x00922ba0 in raise () from /lib/libc.so.6
#2  0x009244b1 in abort () from /lib/libc.so.6
#3  0x00958dfb in __libc_message () from /lib/libc.so.6
#4  0x00960aa6 in _int_free () from /lib/libc.so.6
#5  0x00963fc0 in free () from /lib/libc.so.6
#6  0x08081efb in ast_channel_free (chan=0xa37120) at channel.c:1295
#7  0x0808473b in ast_hangup (chan=0xa052b70) at channel.c:1496
#8  0x00348313 in do_parking_thread (ignore=0x0) at res_features.c:1752
#9  0x080f97fb in dummy_start (data=0x9fc3620) at utils.c:852
#10 0x00a7045b in start_thread () from /lib/libpthread.so.0
#11 0x009c824e in clone () from /lib/libc.so.6


Comments:By: Mark Michelson (mmichelson) 2008-04-03 07:55:12

Since this is a memory corruption issue, could you please reproduce the situation while running Asterisk under valgrind? Instructions for doing so are in doc/valgrind.txt

Thanks!

By: pguido (pguido) 2008-04-03 10:38:31

I have attached the file park_crash.tgz, which includes the valgrind output.
It seems the hangup case is caught by valgrind, but the redirect case nevertheless led to a crash.

By: Norbert Reinartz (nreinartz) 2008-04-03 12:58:03

I am able to reproduce this bug. Tested against 1.4.18.1.
- A calls B
- B does not answer the call
- channel of A is parked via AMI PARK Command
- A hangs up
  --> program crashes immediately

I did some debugging and found that the program crashes in ast_channel_datastore_free() of channel.c.
The following happens:
- A calls B:
  app_dial.c / dial_exec_full(): chan->name is SIP/3456-08247bd0,
  program runs up to "peer = wait_for_answer(chan, ..)" in dial_exec_full()

- B does not answer the call:
  program stays at "peer = wait_for_answer(chan, ..)" in dial_exec_full()

- channel of A is parked via AMI PARK Command:
  manager_park() / res_features.c
  ast_masq_park_call(): channel is masqueraded, chan->name is changed to "Parked/SIP/3456-08247bd0<ZOMBIE>"
  app_dial.c / dial_exec_full() goes on:
  call to ast_channel_datastore_free(), chan->name is "Parked/SIP/3456-08247bd0<ZOMBIE>"   !!!!!
  ast_channel_datastore_free() / channel.c: free the data store

- A hangs up
  channel.c, ast_channel_free(): chan->name 'SIP/3456-08247bd0'
  ast_channel_datastore_free() / channel.c: tries to free the same datastore content as above  !!!!!
  --> program crashes

Both the original channel and the masqueraded channel use a datastore with the same content.
Because both try to clean it up, the program crashes.

By: Mark Michelson (mmichelson) 2008-04-03 15:47:26

pguido and nreinartz: Great feedback! I will look at this further on Monday, since I will be on vacation until then. Thanks to your input, this shouldn't be a problem to solve.

By: Norbert Reinartz (nreinartz) 2008-04-04 07:58:17

Created a patch to fix the software crash described above.
This patch has been tested with 1.4.18.1. It works, but I don't know whether it has any negative side effects.
It's more of a dirty workaround, as it doesn't fix the underlying problem, which is more general.
How should data stores be handled if they are duplicated by masquerade of a channel?

By: Mark Michelson (mmichelson) 2008-04-07 18:50:17

nreinartz: I am unable to view your patch because your license is still pending, but I thought I'd comment on your last question regarding how to handle datastores when they are duplicated during a masquerade.

The answer is that the datastore actually is not duplicated during a masquerade. During a masquerade, the AST_LIST_APPEND_LIST macro is called, which actually moves the list of datastores from one channel to the other. It does not copy them. The problem is that app_dial still has a reference to the datastore which has been moved. So what is happening is that app_dial calls ast_datastore_free() on the datastore that has actually been moved to a completely different channel than it was on to begin with. I have in mind a patch to fix this, but I'm curious to see first if you had the same idea with the patch you uploaded. I would expect your license to be approved some time tomorrow, so I'll have a look at it then.
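
The move-versus-copy distinction can be illustrated with a toy model. This is only a sketch: `toy_datastore`, `toy_channel`, `toy_append_list`, and `toy_on_channel` are made-up names, not Asterisk types or functions, and `toy_append_list` merely imitates the behaviour described above (the source list is emptied, nothing is duplicated).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a datastore and a channel's datastore list. */
struct toy_datastore {
    const char *label;
    struct toy_datastore *next;
};

struct toy_channel {
    struct toy_datastore *first;
    struct toy_datastore *last;
};

/* Append src's entire list to dst and leave src empty: a move, not a copy. */
void toy_append_list(struct toy_channel *dst, struct toy_channel *src)
{
    if (!src->first)
        return;
    if (dst->last)
        dst->last->next = src->first;
    else
        dst->first = src->first;
    dst->last = src->last;
    src->first = src->last = NULL;
}

/* Return 1 if ds is linked on chan's list, 0 otherwise. */
int toy_on_channel(const struct toy_channel *chan, const struct toy_datastore *ds)
{
    const struct toy_datastore *cur;

    for (cur = chan->first; cur; cur = cur->next) {
        if (cur == ds)
            return 1;
    }
    return 0;
}

/* Returns 1 when the pointer held by the "dialing" code now refers to a
 * datastore that lives on the other channel only. */
int toy_masquerade_demo(void)
{
    struct toy_datastore ds = { "example", NULL };
    struct toy_channel dialing = { &ds, &ds };
    struct toy_channel target = { NULL, NULL };
    struct toy_datastore *held = &ds;   /* reference kept by the dialing code */

    assert(toy_on_channel(&dialing, held));
    toy_append_list(&target, &dialing); /* what the masquerade does to the list */

    return toy_on_channel(&target, held) && !toy_on_channel(&dialing, held);
}
```

After the move there is still exactly one datastore, but two parties believe they are responsible for freeing it: the code holding `held` and whoever eventually tears down `target`.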

Thanks for the help on this and for the good questions. It's always fun to discuss things like this.

By: Mark Michelson (mmichelson) 2008-04-14 10:42:23

nreinartz: I just had a look at your patch, and I see what you were trying to accomplish, but based on what I told you in my previous note, that won't work well since the datastores aren't actually copied during a masquerade. Also, calling AST_LIST_HEAD_INIT_NOLOCK is not necessary in that scenario.

I'll be committing a fix for this issue as soon as possible.

By: Digium Subversion (svnbot) 2008-04-14 11:19:31

Repository: asterisk
Revision: 114112

U   branches/1.4/apps/app_dial.c
U   branches/1.4/apps/app_queue.c

------------------------------------------------------------------------
r114112 | mmichelson | 2008-04-14 11:19:29 -0500 (Mon, 14 Apr 2008) | 9 lines

If the datastore has been moved to another channel due to a masquerade, then
freeing the datastore here causes an eventual double free when the new channel
hangs up. We should only free the datastore if we were able to successfully remove
it from the channel we are referencing (i.e. the datastore was not moved).

(closes issue ASTERISK-11774)
Reported by: pguido


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114112
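
The rule stated in the commit message (free only after a successful remove) can be sketched with the same kind of toy model. This is an illustration under stated assumptions, not the committed code: all `toy_*` names are hypothetical, and `toy_datastore_remove` is assumed to return 0 on success and -1 when the datastore is not on the given channel.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_datastore { struct toy_datastore *next; };
struct toy_channel   { struct toy_datastore *first; };

static int toy_free_count;  /* counts frees so the demo can check for a double free */

static void toy_datastore_free(struct toy_datastore *ds)
{
    free(ds);
    toy_free_count++;
}

/* Unlink ds from chan. Returns 0 on success, -1 if ds is not on this channel. */
static int toy_datastore_remove(struct toy_channel *chan, struct toy_datastore *ds)
{
    struct toy_datastore **link;

    for (link = &chan->first; *link; link = &(*link)->next) {
        if (*link == ds) {
            *link = ds->next;
            ds->next = NULL;
            return 0;
        }
    }
    return -1;
}

/* Cleanup in the dialing code: free only if we actually removed it. */
static void toy_dial_cleanup(struct toy_channel *chan, struct toy_datastore *ds)
{
    if (toy_datastore_remove(chan, ds) == 0)
        toy_datastore_free(ds);
}

/* Channel teardown frees whatever is still on the list. */
static void toy_channel_free(struct toy_channel *chan)
{
    while (chan->first) {
        struct toy_datastore *ds = chan->first;
        chan->first = ds->next;
        toy_datastore_free(ds);
    }
}

/* Returns the number of frees performed, or -1 on allocation failure. */
int toy_fix_demo(void)
{
    struct toy_channel dialing = { NULL };
    struct toy_channel parked = { NULL };
    struct toy_datastore *ds = calloc(1, sizeof(*ds));

    if (!ds)
        return -1;
    toy_free_count = 0;
    dialing.first = ds;

    /* Masquerade: the list moves to the other channel. */
    parked.first = dialing.first;
    dialing.first = NULL;

    toy_dial_cleanup(&dialing, ds); /* remove fails, so nothing is freed here */
    toy_channel_free(&parked);      /* the single legitimate free */
    return toy_free_count;
}
```

With an unconditional free in `toy_dial_cleanup`, the same sequence would free `ds` twice, which matches the abort inside `free()` seen in the hangup backtrace.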

By: Digium Subversion (svnbot) 2008-04-14 11:20:06

Repository: asterisk
Revision: 114113

_U  trunk/
U   trunk/apps/app_dial.c
U   trunk/apps/app_queue.c

------------------------------------------------------------------------
r114113 | mmichelson | 2008-04-14 11:20:06 -0500 (Mon, 14 Apr 2008) | 17 lines

Merged revisions 114112 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r114112 | mmichelson | 2008-04-14 11:24:22 -0500 (Mon, 14 Apr 2008) | 9 lines

If the datastore has been moved to another channel due to a masquerade, then
freeing the datastore here causes an eventual double free when the new channel
hangs up. We should only free the datastore if we were able to successfully remove
it from the channel we are referencing (i.e. the datastore was not moved).

(closes issue ASTERISK-11774)
Reported by: pguido


........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114113

By: Digium Subversion (svnbot) 2008-04-14 11:20:59

Repository: asterisk
Revision: 114114

_U  branches/1.6.0/
U   branches/1.6.0/apps/app_dial.c
U   branches/1.6.0/apps/app_queue.c

------------------------------------------------------------------------
r114114 | mmichelson | 2008-04-14 11:20:58 -0500 (Mon, 14 Apr 2008) | 25 lines

Merged revisions 114113 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r114113 | mmichelson | 2008-04-14 11:25:09 -0500 (Mon, 14 Apr 2008) | 17 lines

Merged revisions 114112 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r114112 | mmichelson | 2008-04-14 11:24:22 -0500 (Mon, 14 Apr 2008) | 9 lines

If the datastore has been moved to another channel due to a masquerade, then
freeing the datastore here causes an eventual double free when the new channel
hangs up. We should only free the datastore if we were able to successfully remove
it from the channel we are referencing (i.e. the datastore was not moved).

(closes issue ASTERISK-11774)
Reported by: pguido


........

................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114114