Summary: ASTERISK-09871: Deadlock in chan_local causing a channel to be 'dummy' and impossible to soft hangup it.
Reporter: Eliel Sardanons (eliel)
Labels:
Date Opened: 2007-07-12 17:15:10
Date Closed: 2007-08-27 10:26:57
Versions:
Frequency of
Environment:
Attachments: ( 0) thread_apply_all_bt.txt
Description: A call that was automatically generated by Asterisk and redirected to a queue went into a deadlock. I couldn't hang up the channel or run 'restart now'; the only way to make the channel disappear was a 'kill -9' of Asterisk.

If you want, I can send you the full log, but this occurs in a production environment and the Asterisk full log is very big. What I saw related to this bug was:

[Jul 12 18:57:32] DEBUG[28406] channel.c: Avoiding deadlock for channel '0xa088488'
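
Because the full log is large, one way to pull out only the relevant lines is to filter on this message. The following is a minimal Python sketch, assuming the log lines look exactly like the single line quoted above (other Asterisk versions or logger settings may format them differently). The function name `count_avoided_deadlocks` is hypothetical and is not part of Asterisk.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in this report (an assumption:
# other versions or logger settings may produce a different layout).
AVOID_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] DEBUG\[(?P<tid>\d+)\] channel\.c: "
    r"Avoiding deadlock for channel '(?P<chan>0x[0-9a-fA-F]+)'"
)


def count_avoided_deadlocks(lines):
    """Count 'Avoiding deadlock' messages per channel pointer."""
    counts = Counter()
    for line in lines:
        match = AVOID_RE.match(line.strip())
        if match:
            counts[match.group("chan")] += 1
    return counts


# The sample is the line from this report; a real run would iterate over
# the open log file instead of a list.
sample = [
    "[Jul 12 18:57:32] DEBUG[28406] channel.c: "
    "Avoiding deadlock for channel '0xa088488'",
]
print(count_avoided_deadlocks(sample))
```

A channel pointer that appears many times in the result would be a reasonable place to start reading the surrounding log lines.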


----------------- ASTERISK CLI --------------
pbx*CLI> show channels
Channel              Location             State   Application(Data)

Local/4003@from-inte 4003@from-internal:1 Down    AppQueue((Outgoing Line))

SIP/GWSIP3A-b6cc34c8 (None)               Up      Bridged Call(Local/9XXXXXXXXXX

2 active channels

2 active calls

globantpbx*CLI> queue show 1002

1002         has 1 calls (max unlimited) in 'rrmemory' strategy (24s holdtime), W:0, C:154, A:43, SL:0.0% within 0s

  No Members


     1. Local/9XXXXXXXXXXX@from-internal-a101,1 (wait: 50:37, prio: 0)
------------- EO ASTERISK CLI ----------

When this CLI command was executed all the agents were logged out, but while the agents were logged in, this call didn't reach any member. Also, the member who initially answered the call was shown as (inuse) by the queue CLI command, but the SIP command "sip show inuse" didn't report this state, and the member continued receiving and answering calls.

The call was generated by an Originate manager action (like the other 300 calls, which behaved correctly) and the other leg was sent to the queue (1002).

[I think this bug is a crash because it forced me to restart Asterisk to hang up the call.]

Comments:

By: Joshua C. Colp (jcolp) 2007-07-13 08:34:07

What exactly was each Local channel connected to? I assume one was to a SIP channel, and the other was to app_queue? It looks as though each of them was put on hold, which is slightly odd.

By: Eliel Sardanons (eliel) 2007-07-13 10:51:00

A predictive dialer generates a call like (Action: Originate):
Channel: Local/s@ccs-Leaddialer
Context: ccs-predictive
Priority: 1
Exten: s

ccs-Leaddialer runs an AGI that tries to contact a lead and generates many calls (calling Local/9<leadnumber>, which ends up on a SIP/GWS... channel). When one of them is answered, the call is passed to ccs-predictive, an AGI that sends the call to Queue(1002).
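
For anyone trying to reproduce this, the Originate action quoted above can be serialized for the manager interface roughly as follows. This is only a sketch in Python: `build_ami_action` is a hypothetical helper, not the dialer's actual code, and it only builds the text of the action. It assumes the usual manager framing of `Key: Value` lines separated by CRLF and ended by a blank line, and it leaves out the Login action and the socket handling that a real client needs.

```python
def build_ami_action(action, **headers):
    """Serialize one manager action as 'Key: Value' lines plus a blank line."""
    lines = [f"Action: {action}"]
    lines += [f"{key}: {value}" for key, value in headers.items()]
    # Each line ends with CRLF; an empty line terminates the action.
    return "\r\n".join(lines) + "\r\n\r\n"


# Field values are copied from the action quoted in this comment.
originate = build_ami_action(
    "Originate",
    Channel="Local/s@ccs-Leaddialer",
    Context="ccs-predictive",
    Exten="s",
    Priority=1,
)
print(originate)
```

Sending around 300 of these in quick succession, as the dialer does, is probably the load pattern needed to hit the deadlock.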

By: Eliel Sardanons (eliel) 2007-07-13 10:54:39

Local/4003@from-internal is the agent that answered the call.

By: Joshua C. Colp (jcolp) 2007-07-16 07:43:36

Okay, I need to see the console output for this. Also: are you running Dial and Queue inside of the AGIs?

By: Eliel Sardanons (eliel) 2007-07-24 15:49:52

I'm running Dial() inside of the AGI, but not Queue().

By: Eliel Sardanons (eliel) 2007-07-24 15:50:43

Sorry, but I'm cleaning up the full log because it is full of private client information. I will need two more days to give you more feedback.

By: Russell Bryant (russell) 2007-08-23 13:31:49

If you are still having a problem here, please build with DEBUG_THREADS and get the output of "core show locks".  That will show where the deadlock is.
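
A small sketch of how that output could be saved to a file for attaching here, in Python. It assumes the `asterisk` binary is on the PATH, that the remote console still responds while the channel is stuck, and that the build has DEBUG_THREADS enabled (otherwise "core show locks" is not available). The function names and the output file name are hypothetical.

```python
import subprocess


def core_show_locks_command(binary="asterisk"):
    """Argument list that runs one CLI command through the remote console."""
    return [binary, "-rx", "core show locks"]


def capture_core_show_locks(outfile, binary="asterisk"):
    """Run the command, save its output to outfile, return the exit status."""
    result = subprocess.run(
        core_show_locks_command(binary),
        capture_output=True,
        text=True,
        check=False,
    )
    with open(outfile, "w") as handle:
        handle.write(result.stdout)
    return result.returncode


# Example call (file name is an assumption):
# capture_core_show_locks("core_show_locks.txt")
```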

By: Eliel Sardanons (eliel) 2007-08-27 10:01:33

If you want, you can close this issue; we rolled back to 1.2 and I can't give you any more feedback on it.

Thanks and sorry for the wasted time.