Summary: ASTERISK-07416: Asterisk crashes when attended transfer doesn't read data and call ended (autoservice related)
Reporter: Guillermo Winkler (guillecabeza)
Labels:
Date Opened: 2006-07-28 16:35:40
Date Closed: 2006-07-31 12:12:49
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Resources/res_features
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
Description: When the function "builtin_atxfer" follows this path:

ast_log(LOG_WARNING, "Did not read data.\n");
res = ast_streamfile(transferer, "beeperr", transferer->language);
if (ast_waitstream(transferer, "") < 0) {
    return -1;
}

it returns without removing the transferee from autoservice. This call is missing:
ast_autoservice_stop(transferee);

This leads to a crash in the autoservice thread a few moments later.

Evidence suggests that app_dial hangs up the peer after the bridge completes, while the transferee has never left the autoservice thread.

This happens in version 1.2.4, and the code is the same in 1.2.10.

That line needs to be added, doesn't it?
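
The proposed fix can be illustrated with a minimal, self-contained sketch. This is not the res_features code: `struct channel`, `autoservice_start`, `autoservice_stop`, `atxfer_no_data_buggy` and `atxfer_no_data_fixed` are hypothetical stand-ins that only model whether a channel is still in autoservice when the function returns, and `waitstream_result` stands in for the return value of ast_waitstream().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ast_channel; not the real Asterisk type. */
struct channel {
    int in_autoservice; /* 1 while the autoservice thread may still read it */
};

/* Stubs that model ast_autoservice_start() / ast_autoservice_stop(). */
void autoservice_start(struct channel *chan)
{
    assert(chan != NULL);
    chan->in_autoservice = 1;
}

void autoservice_stop(struct channel *chan)
{
    assert(chan != NULL);
    chan->in_autoservice = 0;
}

/* Shape of the reported path: the early return skips autoservice_stop(). */
int atxfer_no_data_buggy(struct channel *transferee, int waitstream_result)
{
    autoservice_start(transferee);
    if (waitstream_result < 0) {
        return -1; /* transferee is still in autoservice here */
    }
    autoservice_stop(transferee);
    return 0;
}

/* Shape of the proposed fix: stop autoservice before the early return. */
int atxfer_no_data_fixed(struct channel *transferee, int waitstream_result)
{
    autoservice_start(transferee);
    if (waitstream_result < 0) {
        autoservice_stop(transferee); /* the line suggested in this report */
        return -1;
    }
    autoservice_stop(transferee);
    return 0;
}
```

In this model, the buggy variant leaves `in_autoservice` set after returning -1. That corresponds to the state in which the channel can be hung up while the autoservice thread is still reading from it.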

****** ADDITIONAL INFORMATION ******

BACKTRACE
(gdb) bt
#0  0x00a7ac94 in pthread_mutex_lock () from /lib/tls/libpthread.so.0
#1  0x08063a0a in ast_do_masquerade (original=0xa1c1710) at lock.h:592
#2  0x080653f0 in ast_read (chan=0xa1c1710) at channel.c:1800
#3  0x080c3b81 in autoservice_run (ign=0x0) at autoservice.c:94
#4  0x00a79341 in start_thread () from /lib/tls/libpthread.so.0
#5  0x0097a6fe in clone () from /lib/tls/libc.so.6

In all of these cases we see this log message:
Jul 28 13:38:48 WARNING[23150] channel.c: Hard hangup called by thread -1317028944 on Zap/161-1, while fd is blocked by thread -1286919248 in procedure ast_waitfor_nandfds!  Expect a failure

When compiled with the DO_CRASH flag to find the offending hangup, we get:

DO CRASH BACKTRACE
#0  0x080678ba in ast_hangup (chan=0x9544ed0) at channel.c:1326
1326                    CRASH;
(gdb) bt
#0  0x080678ba in ast_hangup (chan=0x9544ed0) at channel.c:1326
#1  0x00fa100e in dial_exec_full (chan=0xb7e0df18, data=Variable "data" is not available.
) at app_dial.c:1574
#2  0x00fa33ed in dial_exec (chan=0x29, data=0x34d7b4) at app_dial.c:1601
#3  0x080907dd in pbx_extension_helper (c=0xb7e0df18, con=Variable "con" is not available.
) at pbx.c:544
#4  0x08091aa6 in __ast_pbx_run (c=0xb7e0df18) at pbx.c:2218
#5  0x0809350c in pbx_thread (data=0x29) at pbx.c:2505
#6  0x00401341 in start_thread () from /lib/tls/libpthread.so.0
#7  0x002ed6fe in clone () from /lib/tls/libc.so.6



Comments:

By: Serge Vecher (serge-v) 2006-07-31 09:09:26

Was this Asterisk built with 'make dont-optimize'? If not, we need a backtrace from a non-optimized build. Thanks.

By: Guillermo Winkler (guillecabeza) 2006-07-31 11:32:02

vechers: The problem is easily reproduced. It goes like this:

- Make a call.
- Press *2 to start an attended transfer.
- Wait 3 seconds (or whatever your timeout for digits is).
- Hang up at the timeout (it has to be at the moment "Did not read data" appears).
- Crash (or the "Expect a failure" log, depending on DO_CRASH).

I think dead-listing analysis shows the problem clearly.

All of our test facilities are currently tied up with another issue, so I can't get the backtrace right now. (The crash was originally on a production site that is already patched with the line I suggested.)

If you can't see the problem with the info I've sent you, please let me know and I'll try to make arrangements to get you a bt.

Regards

By: Joshua C. Colp (jcolp) 2006-07-31 12:12:49

Fixed in 1.2 as of revision 38585. Already fixed in trunk so not needed there. Thanks!