Summary: ASTERISK-06904: Crash or block on Park with applicationmap
Reporter: Alistair Cunningham (acunningham)
Labels:
Date Opened: 2006-05-05 12:16:49
Date Closed: 2006-05-22 15:22:42
Versions:
Frequency of
Environment:
Attachments: ( 0) 7090-bt
( 1) builtin-parkcalls.diff
Description: Asterisk 1.2.6 and consistently either crash or block new calls if you have the following in features.conf:

parkext => *7
parkpos => 701-799
context => parkedcalls

park_caller => **,callee,Park
park_called => **,caller,Park

Then in a Perl FastAGI script do $agi->set_variable( '__DYNAMIC_FEATURES', 'park_caller#park_called' ); then have the caller press ** during the call. They hear the 701, then hang up, and Asterisk crashes within a second of them hanging up. The last few lines of /var/log/asterisk/full are:

May  5 18:02:19 DEBUG[18103] chan_sip.c: update_call_counter() - decrement call limit counter
May  5 18:02:19 WARNING[18103] channel.c: PBX may not have been terminated properly on 'SIP/'
May  5 18:02:19 VERBOSE[18125] logger.c:   == Spawn extension (nta, s, 1) exited non-zero on 'xÃ<90>^H05<9e>^H31.0.2-089d4eb0'
May  5 18:02:19 VERBOSE[18125] logger.c:     -- Executing DeadAGI("X<8a>F", "agi://") in new stack
May  5 18:02:19 VERBOSE[18125] logger.c:     -- AGI Script agi:// completed, returning 0
May  5 18:02:19 DEBUG[18125] chan_sip.c: Asked to hangup channel not connected
May  5 18:02:19 WARNING[18125] channel.c: Unable to find channel in list

The full log is available on request.
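
For reference, a minimal sketch of what the set_variable call above sends to Asterisk. It is written in Python rather than the reporter's Perl and assumes the standard AGI `SET VARIABLE` command syntax. The helper names `dynamic_features_value` and `agi_set_variable` are hypothetical and are not part of Asterisk or of any AGI library; the feature names are the two applicationmap entries from the features.conf excerpt above.

```python
def dynamic_features_value(feature_names):
    """Join applicationmap feature names with '#', as in the report."""
    for name in feature_names:
        # Assumption: reject characters that would break the list or the quoting.
        if "#" in name or '"' in name:
            raise ValueError(f"unsupported character in feature name: {name!r}")
    return "#".join(feature_names)


def agi_set_variable(name, value):
    """Format one AGI SET VARIABLE command line (newline-terminated)."""
    return f'SET VARIABLE {name} "{value}"\n'


# Build the command that enables both park features for the call.
command = agi_set_variable(
    "__DYNAMIC_FEATURES",
    dynamic_features_value(["park_caller", "park_called"]),
)
print(command, end="")
```

A real AGI or FastAGI script would write this line to Asterisk and then read the response line before continuing. That exchange is left out here.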
Comments:

By: BJ Weschke (bweschke) 2006-05-05 12:22:55

alistair: can we get a backtrace of any of these crashes on an Asterisk that is compiled 'dont-optimize' ? thanks.

By: Leif Madsen (lmadsen) 2006-05-06 11:00:04

The AGI script you used might also be useful here

By: Leif Madsen (lmadsen) 2006-05-06 11:44:18

Reproduced -- bt/bt full uploaded with 'make dont-optimize'

By: Leif Madsen (lmadsen) 2006-05-06 11:46:35

Of note: this was reproduced in trunk rev 25256 -- so it's still a problem beyond

By: Andrey S Pankov (casper) 2006-05-06 12:24:49

Backtrace like this is of very little use... What's at channel.c:1923?

if (ast_seekstream(chan->monitor->read_stream, jump - f->samples, SEEK_FORCECUR) == -1)

this line? Can you run 'print var' in gdb for the appropriate variables after 'frame 2'?
If (f == NULL), can you constify the args to queue_frame_to_spies and try again?

What distro/gcc/etc are you using?

By: Alistair Cunningham (acunningham) 2006-05-09 15:05:50

Here's another backtrace, but it doesn't really show anything more. I do have the core file if more information is needed.

#0  0x003b0670 in malloc_consolidate () from /lib/tls/libc.so.6
No symbol table info available.
#1  0x003afe6a in _int_malloc () from /lib/tls/libc.so.6
No symbol table info available.
#2  0x003af92e in calloc () from /lib/tls/libc.so.6
No symbol table info available.
#3  0x0033ed54 in _dl_new_object () from /lib/ld-linux.so.2
No symbol table info available.
#4  0x0033c17c in _dl_map_object_from_fd () from /lib/ld-linux.so.2
No symbol table info available.
#5  0x0033a78d in _dl_map_object () from /lib/ld-linux.so.2
No symbol table info available.
#6  0x0043b814 in dl_open_worker () from /lib/tls/libc.so.6
No symbol table info available.
#7  0x00341b66 in _dl_catch_error () from /lib/ld-linux.so.2
No symbol table info available.
#8  0x0043b637 in _dl_open () from /lib/tls/libc.so.6
No symbol table info available.
#9  0x0043d693 in do_dlopen () from /lib/tls/libc.so.6
No symbol table info available.
#10 0x00341b66 in _dl_catch_error () from /lib/ld-linux.so.2
No symbol table info available.
#11 0x0043d502 in __libc_dlopen_mode () from /lib/tls/libc.so.6
No symbol table info available.
#12 0x00672f08 in _Unwind_ForcedUnwind () from /lib/tls/libpthread.so.0
No symbol table info available.
#13 0x00670bd3 in __pthread_unwind () from /lib/tls/libpthread.so.0
No symbol table info available.
#14 0x0066d021 in pthread_exit () from /lib/tls/libpthread.so.0
No symbol table info available.
#15 0x08089694 in pbx_thread (data=0x8aa6058) at pbx.c:2517
       c = (struct ast_channel *) 0x8aa6058
#16 0x0066b98c in start_thread () from /lib/tls/libpthread.so.0
No symbol table info available.
#17 0x0040a7da in clone () from /lib/tls/libc.so.6
No symbol table info available.

By: Andrey S Pankov (casper) 2006-05-09 15:10:47

Please do not paste backtraces directly into the note; upload them as an attachment.
Your 2nd crash doesn't seem to be related in any way to asterisk.

By: Andrey S Pankov (casper) 2006-05-09 15:13:58

Could you please provide information I asked you about?
If not, please find a developer on IRC and give him access to the affected box
where asterisk sources and the 1st core reside.

Can you confirm that your backtrace is from a non-optimized build?
#make clean dont-optimize

What distro/gcc/etc are you using?

By: Alistair Cunningham (acunningham) 2006-05-09 15:28:49

The system is Fedora Core release 2 (Tettnang), gcc 3.3.3-7. Asterisk was compiled using dont-optimize. I'm not sure what you mean by "first core"; I've only uploaded one. The other was by someone else, and I don't have access to their machine. Blitzrage, can you help?

By: BJ Weschke (bweschke) 2006-05-09 16:06:49

Alistair - Your bug was THE topic of discussion today on the dev conference call. :) There are two issues going on here and both have to do with the current implementation of applicationmap.

The first, which is the cause of your segfaults, is that when the application executes, we're not presently making a determination that the bridged channels should stop bridging audio once the application is done executing. After some discussions today, I'm going to try and put a patch together to address this one.

The second issue is that, when an application needs to be specific about who does what to which channel (and there's more than one channel involved with the app), we don't presently have a way to determine which channel should go where. (confusing enough? :) We've talked about a possible solution for this as well today and the devcon europe team was going to take a look at that part of it later on this week.

So that's where we are with this. My apologies for going "radio silent" for a bit on this one. It's not like there hasn't been any work done to find "root cause", it's just that you've reported a "good one". :)

By: Alistair Cunningham (acunningham) 2006-05-09 16:24:49

Thanks for the update. There's no need to apologise; I appreciate that this is a tricky one. Thanks also to Blitzrage and Casper for following up. Your support is most gratefully received, especially considering the amount we're not paying you!

By: Leif Madsen (lmadsen) 2006-05-10 14:33:02

Seems bweschke has enough information now, so if someone else really needs the additional information from the core dump, let me know what commands you want me to run to start off with. I was working with bweschke on another crash/core today, and think I'm starting to get the idea of how that gdb app works.

By: BJ Weschke (bweschke) 2006-05-22 07:11:46

The crash was fixed over the weekend with a patch to both 1.2 and /trunk. However, in order for one-step parking to work as anticipated, it must be a built-in function accessible from the featuremap section of features.conf. A patch is attached that does just this, but I haven't yet committed it to /trunk as there's another outstanding deadlock bug in /trunk (and maybe 1.2.X - haven't tested) with regard to timeouts on parking.

By: Serge Vecher (serge-v) 2006-05-22 10:31:15

BJ: since the crash is resolved now, would it be better to close 7090 and finish up 6340 with your patch?

By: BJ Weschke (bweschke) 2006-05-22 12:24:41

Closing. Crash fixed and one-step-park corrected in /trunk. Thanks!

By: Serge Vecher (serge-v) 2006-05-22 15:22:42

For the record, the fix is in 1.2 r29196 and in trunk r29197. Thanks, BJ!