Summary: ASTERISK-13662: Asterisk segfaults when parking call
Reporter: Richard Begg (meric)
Labels:
Date Opened: 2009-02-26 20:53:24.000-0600
Date Closed: 2009-03-03 12:30:48.000-0600
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) console.out
( 1) malloc_debug.txt
( 2) valgrind.txt
Description: Since our upgrade to 1.4.23.1, Asterisk is segfaulting regularly when parking calls.

****** ADDITIONAL INFORMATION ******

Backtrace:
Core was generated by `/usr/sbin/asterisk -f -U asterisk -G asterisk -vvvg -c'.
Program terminated with signal 11, Segmentation fault.
#0  0x001a41ca in free () from /lib/libc.so.6
(gdb) where
#0  0x001a41ca in free () from /lib/libc.so.6
#1  0x0807adac in ast_cdr_discard (cdr=0x7069733a) at cdr.c:460
#2  0x08083292 in ast_channel_free (chan=0x8658e10) at channel.c:1297
#3  0x080869c3 in ast_hangup (chan=0x8658e10) at channel.c:1549
#4  0x00cf4fce in park_exec (chan=0x85a6fb0, data=0xb5f6ef08) at res_features.c:2234
#5  0x080cbfeb in pbx_extension_helper (c=0x85a6fb0, con=0x0, context=0x85a7130 "from-internal", exten=0x85a7180 "901", priority=1, label=0x0, callerid=0x85f1b88 "301", action=E_SPAWN) at pbx.c:537
#6  0x080ce9a1 in __ast_pbx_run (c=0x85a6fb0) at pbx.c:2318
#7  0x080cf9ee in pbx_thread (data=0x85a6fb0) at pbx.c:2622
#8  0x080fee4b in dummy_start (data=0x8659d10) at utils.c:856
#9  0x002b145b in start_thread () from /lib/libpthread.so.0
#10 0x00208e5e in clone () from /lib/libc.so.6

We have collected a number of core files since the upgrade. They are split 50/50 between segfaults and aborts, but all of them involve call parking operations and, most notably, an ast_cdr_discard() call.
Comments:
By: Tilghman Lesher (tilghman) 2009-02-27 09:50:17.000-0600

Given that this is crashing within memory management, I'd like to see some valgrind output, as specified in doc/valgrind.txt.

By: Richard Begg (meric) 2009-03-01 19:11:32.000-0600

I cannot reproduce the crash with DONT_OPTIMIZE and MALLOC_DEBUG enabled.
Logs are attached regardless.

Before this, the issue could be reproduced simply by parking an incoming call.

By: Richard Begg (meric) 2009-03-01 21:09:34.000-0600

For reference, the reproducible backtrace is a little different from the one listed above.  This is the one matching the exact scenario used to produce the valgrind output:

Program terminated with signal 11, Segmentation fault.
#0  0x00565635 in ast_bridge_call (chan=0x99f02a0, peer=0x9a596b0, config=0xb62d5cd0) at res_features.c:1815
1815    res_features.c: No such file or directory.
       in res_features.c
(gdb) where
#0  0x00565635 in ast_bridge_call (chan=0x99f02a0, peer=0x9a596b0, config=0xb62d5cd0) at res_features.c:1815
#1  0x07eaf3e8 in dial_exec_full (chan=0x99f02a0, data=<value optimized out>, peerflags=0xb62d5de4, continue_exec=0x0) at app_dial.c:1843
#2  0x07eb1b82 in dial_exec (chan=0x99f02a0, data=0xb62d7e58) at app_dial.c:1882
#3  0x080cbfeb in pbx_extension_helper (c=0x99f02a0, con=0x0, context=0x99f0420 "macro-dial", exten=0x99f0470 "s", priority=7, label=0x0, callerid=0x99e89a8 "0390298690", action=E_SPAWN) at pbx.c:537
#4  0x051c0359 in _macro_exec (chan=0x99f02a0, data=0xb62dcf08, exclusive=0) at app_macro.c:346
#5  0x080cbfeb in pbx_extension_helper (c=0x99f02a0, con=0x0, context=0x99f0420 "macro-dial", exten=0x99f0470 "s", priority=13, label=0x0, callerid=0x99e89a8 "0390298690", action=E_SPAWN) at pbx.c:537
#6  0x080ce9a1 in __ast_pbx_run (c=0x99f02a0) at pbx.c:2318
#7  0x080cf9ee in pbx_thread (data=0x99f02a0) at pbx.c:2622
#8  0x080fee4b in dummy_start (data=0x99e9180) at utils.c:856
#9  0x0077b45b in start_thread () from /lib/libpthread.so.0
#10 0x006d2e5e in clone () from /lib/libc.so.6

By: Joshua C. Colp (jcolp) 2009-03-02 15:04:58.000-0600

Can you also attach the console output of this happening? The actual initial backtrace is from when a parked call is picked up, not when it is parked. This should help narrow things down.

By: Richard Begg (meric) 2009-03-02 15:20:51.000-0600

Console output uploaded (from the scenario matching the most recent backtrace and valgrind output)

By: Joshua C. Colp (jcolp) 2009-03-02 16:06:09.000-0600

I haven't yet been able to reproduce the issue but I think I know what might be up... based on the console output provided it looks like the dialed channel is actually hanging up shortly after they go on hold. Is this possible?

By: Richard Begg (meric) 2009-03-02 16:28:40.000-0600

Quite likely... the scenario is this:
- Incoming call, rings extension
- Extension answers, then does a blind transfer to ext 900 (parking lot)
- As it's a blind transfer, the called extension probably does hang up at this stage.
- Asterisk crashes.

By: Digium Subversion (svnbot) 2009-03-03 12:27:30.000-0600

Repository: asterisk
Revision: 179840

U   branches/1.4/res/res_features.c

------------------------------------------------------------------------
r179840 | file | 2009-03-03 12:27:29 -0600 (Tue, 03 Mar 2009) | 9 lines

Do not assume that the bridge_cdr is still attached to the channel when the 'h' exten is finished executing.

It is possible for a masquerade operation to occur when the 'h' exten is operating. This operation moves
the CDR records around causing the bridge_cdr to no longer exist on the channel where it is expected to.
We can not safely modify it afterwards because of this, so don't even try.

(closes issue ASTERISK-13662)
Reported by: meric

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=179840

By: Digium Subversion (svnbot) 2009-03-03 12:28:47.000-0600

Repository: asterisk
Revision: 179841

_U  trunk/
U   trunk/main/features.c

------------------------------------------------------------------------
r179841 | file | 2009-03-03 12:28:46 -0600 (Tue, 03 Mar 2009) | 16 lines

Merged revisions 179840 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r179840 | file | 2009-03-03 14:27:09 -0400 (Tue, 03 Mar 2009) | 9 lines
 
 Do not assume that the bridge_cdr is still attached to the channel when the 'h' exten is finished executing.
 
 It is possible for a masquerade operation to occur when the 'h' exten is operating. This operation moves
 the CDR records around causing the bridge_cdr to no longer exist on the channel where it is expected to.
 We can not safely modify it afterwards because of this, so don't even try.
 
 (closes issue ASTERISK-13662)
 Reported by: meric
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=179841

By: Digium Subversion (svnbot) 2009-03-03 12:29:35.000-0600

Repository: asterisk
Revision: 179842

_U  branches/1.6.0/
U   branches/1.6.0/main/features.c

------------------------------------------------------------------------
r179842 | file | 2009-03-03 12:29:35 -0600 (Tue, 03 Mar 2009) | 23 lines

Merged revisions 179841 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r179841 | file | 2009-03-03 14:28:46 -0400 (Tue, 03 Mar 2009) | 16 lines
 
 Merged revisions 179840 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r179840 | file | 2009-03-03 14:27:09 -0400 (Tue, 03 Mar 2009) | 9 lines
   
   Do not assume that the bridge_cdr is still attached to the channel when the 'h' exten is finished executing.
   
   It is possible for a masquerade operation to occur when the 'h' exten is operating. This operation moves
   the CDR records around causing the bridge_cdr to no longer exist on the channel where it is expected to.
   We can not safely modify it afterwards because of this, so don't even try.
   
   (closes issue ASTERISK-13662)
   Reported by: meric
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=179842

By: Digium Subversion (svnbot) 2009-03-03 12:30:48.000-0600

Repository: asterisk
Revision: 179843

_U  branches/1.6.1/
U   branches/1.6.1/main/features.c

------------------------------------------------------------------------
r179843 | file | 2009-03-03 12:30:47 -0600 (Tue, 03 Mar 2009) | 23 lines

Merged revisions 179841 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r179841 | file | 2009-03-03 14:28:46 -0400 (Tue, 03 Mar 2009) | 16 lines
 
 Merged revisions 179840 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r179840 | file | 2009-03-03 14:27:09 -0400 (Tue, 03 Mar 2009) | 9 lines
   
   Do not assume that the bridge_cdr is still attached to the channel when the 'h' exten is finished executing.
   
   It is possible for a masquerade operation to occur when the 'h' exten is operating. This operation moves
   the CDR records around causing the bridge_cdr to no longer exist on the channel where it is expected to.
   We can not safely modify it afterwards because of this, so don't even try.
   
   (closes issue ASTERISK-13662)
   Reported by: meric
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=179843