Summary: ASTERISK-12877: channel get stuck on ast_queue_frame when hanging up
Reporter: Octavio Ruiz (tacvbo)
Labels:
Date Opened: 2008-10-13 04:34:46
Date Closed: 2009-05-04 11:25:19
Priority: Minor
Regression? No
Status: Closed/Complete
Components: Channels/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) 13676.patch
( 1) ast_show_channels
( 2) ast_show_locks
( 3) ast_show_threads
( 4) bt_full_thread_174
( 5) bt_thread_apply_all_bt
Description: As I understand it, a channel gets stuck in ast_queue_frame while hanging up and never releases its &p->lock. As a result, a number of AgentCallBackLogin() threads get stuck waiting on the agents list.

There are plenty of warning messages in the logger output:

* channel.c: Avoiding deadlock for channel
* channel.c: Dropping voice to exceptionally long queue on IAX/...
* chan_iax2.c: Max retries exceeded to host ... on IAX2/.. (type = ., subclass = ., ts=1., seqno=.)

The scenario is a production system in a call center environment; calls are made and received through an IAX2 facility. This usually happens after 40 to 60 minutes of uptime, with about 20 agents logged in through AgentCallBackLogin(), mainly making outbound calls.

****** ADDITIONAL INFORMATION ******

===
=== Thread ID: 3058826128 (pbx_thread           started at [ 2645] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 2613 ast_write &chan->lock 0x90e8420 (1)
=== ---> Lock #1 (chan_agent.c): MUTEX 630 agent_write &p->lock 0x8fa8f10 (1)
=== ---> Lock #2 (channel.c): MUTEX 2613 ast_write &chan->lock 0x8fe63f8 (1)
=== ---> Lock #3 (channel.c): MUTEX 3416 ast_do_masquerade &clone->lock 0x91901c0 (1)
=== ---> Lock #4 (chan_local.c): MUTEX 515 local_hangup &p->lock 0x9037428 (1)
=== ---> Tried and failed to get Lock #5 (channel.c): MUTEX 962 ast_queue_hangup &chan->lock 0x90e7b78 (1)
=== -------------------------------------------------------------------
===
=== Thread ID: 3041868688 (pbx_thread           started at [ 2645] pbx.c ast_pbx_start())
=== ---> Lock #0 (channel.c): MUTEX 2613 ast_write &chan->lock 0x90e7b78 (1)
=== ---> Waiting for Lock #1 (chan_local.c): MUTEX 309 local_write &p->lock 0x9037428 (1)
=== --- ---> Locked Here: chan_local.c line 515 (local_hangup)
=== -------------------------------------------------------------------
===
=== Thread ID: 3009919888 (pbx_thread           started at [ 2645] pbx.c ast_pbx_start())
=== ---> Lock #0 (chan_agent.c): MUTEX 2023 __login_exec &(&agents)->lock 0x585a28 (1)
=== ---> Lock #1 (chan_agent.c): MUTEX 2026 __login_exec &chan->lock 0xb7993a68 (1)
=== ---> Waiting for Lock #2 (chan_agent.c): MUTEX 2027 __login_exec &p->lock 0x8fa8f10 (1)
=== --- ---> Locked Here: chan_agent.c line 630 (agent_write)
=== -------------------------------------------------------------------
===
=== Thread ID: 3032038288 (pbx_thread           started at [ 2645] pbx.c ast_pbx_start())
=== ---> Waiting for Lock #0 (chan_agent.c): MUTEX 2023 __login_exec &(&agents)->lock 0x585a28 (1)
=== --- ---> Locked Here: chan_agent.c line 2023 (__login_exec)
=== -------------------------------------------------------------------
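
Reading the dump above as a wait-for graph makes the cycle explicit: thread 3058826128 holds 0x9037428 and wants 0x90e7b78, while thread 3041868688 holds 0x90e7b78 and wants 0x9037428. The C sketch below is illustrative only and is not part of Asterisk. The names struct thread_locks, holder_of, in_deadlock_cycle and example_dump are hypothetical, the table is transcribed by hand from the dump, and the "Tried and failed" entry is treated as a plain wait.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HELD 8

/* One thread's entry from the lock dump (hypothetical representation). */
struct thread_locks {
    unsigned long tid;          /* thread ID as printed in the dump */
    uintptr_t held[MAX_HELD];   /* addresses of the locks it holds */
    size_t nheld;               /* number of valid entries in held[] */
    uintptr_t waiting;          /* lock it is blocked on, 0 if none */
};

/* Return the index of the thread holding `lock`, or -1 if nobody holds it. */
static int holder_of(const struct thread_locks *t, size_t n, uintptr_t lock)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < t[i].nheld; j++)
            if (t[i].held[j] == lock)
                return (int)i;
    return -1;
}

/*
 * Follow "waits for the holder of" edges starting at thread `start`.
 * Return 1 if the chain leads back to `start` (the thread is part of a
 * deadlock cycle), 0 otherwise. At most n edges are followed, so the
 * walk always terminates.
 */
int in_deadlock_cycle(const struct thread_locks *t, size_t n, size_t start)
{
    assert(start < n);
    size_t cur = start;
    for (size_t steps = 0; steps < n; steps++) {
        if (t[cur].waiting == 0)
            return 0;                       /* not blocked: chain ends */
        int next = holder_of(t, n, t[cur].waiting);
        if (next < 0)
            return 0;                       /* holder not in the dump */
        if ((size_t)next == start)
            return 1;                       /* came back: cycle found */
        cur = (size_t)next;
    }
    return 0;                               /* blocked behind someone else's cycle */
}

/* The four threads from the dump above, transcribed by hand. */
const struct thread_locks example_dump[4] = {
    { 3058826128UL, { 0x90e8420, 0x8fa8f10, 0x8fe63f8, 0x91901c0, 0x9037428 }, 5, 0x90e7b78 },
    { 3041868688UL, { 0x90e7b78 }, 1, 0x9037428 },
    { 3009919888UL, { 0x585a28, 0xb7993a68 }, 2, 0x8fa8f10 },
    { 3032038288UL, { 0 }, 0, 0x585a28 },
};
```

With this table, in_deadlock_cycle() returns 1 for indexes 0 and 1 and 0 for indexes 2 and 3: the two __login_exec threads are blocked behind the cycle but are not part of it.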
Comments:

By: Octavio Ruiz (tacvbo) 2008-10-13 11:03:49

Same symptoms with SVN-branch-1.4-r148257.

By: Phoebe Anderson (phoebe) 2008-10-13 14:12:32

This may be related to issue 0013645. As of 1.4.22, IAX2 doesn't seem to hang up properly.

By: Mark Michelson (mmichelson) 2008-10-13 14:47:19

This appears to be an old-fashioned deadlock. The two competing threads are the first two listed; they acquire the same two mutexes in opposite order.

I am uploading a patch which should alleviate the issue.
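
For readers unfamiliar with this kind of lock-order inversion, one common way to avoid it is to never block on the second lock while holding the first: try the lock, and on failure release what is held, yield, and retry. The C sketch below shows that generic pattern with POSIX mutexes. It is an assumption-laden illustration, not the contents of 13676.patch; struct lock_pair, chan_lock, pvt_lock and lock_chan_with_backoff are hypothetical names standing in for a channel lock and a chan_local private lock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for &chan->lock and &p->lock. */
struct lock_pair {
    pthread_mutex_t chan_lock;
    pthread_mutex_t pvt_lock;
};

/*
 * Must be called with pvt_lock held. Acquires chan_lock without ever
 * blocking on it while pvt_lock is held: if chan_lock is busy, pvt_lock
 * is dropped so a thread locking in the order chan_lock -> pvt_lock can
 * make progress, then pvt_lock is retaken and the attempt is repeated.
 * Returns the number of back-off rounds; on return both locks are held.
 */
int lock_chan_with_backoff(struct lock_pair *lp)
{
    assert(lp != NULL);
    int rounds = 0;
    while (pthread_mutex_trylock(&lp->chan_lock) != 0) {
        pthread_mutex_unlock(&lp->pvt_lock);   /* let the other thread through */
        sched_yield();                         /* give it a chance to run */
        pthread_mutex_lock(&lp->pvt_lock);     /* retake and try again */
        rounds++;
    }
    return rounds;
}
```

Because pvt_lock is released during back-off, any state it protects may change between rounds, so a caller using this pattern has to revalidate that state after the function returns.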

By: Mark Michelson (mmichelson) 2008-10-13 14:48:01

Please try 13676.patch. Thanks!

By: Octavio Ruiz (tacvbo) 2008-10-13 16:20:58

It looks solved: the system now has more than 60 minutes of uptime. Still, please let me run further tests to make sure the problem is gone.

By: Octavio Ruiz (tacvbo) 2008-10-13 19:36:47

After more than 4 hours of uptime with ~80 agents logged in, I can confirm that it's solved. There is no evidence of further deadlocks. Thank you, Mark.

By: Mark Michelson (mmichelson) 2008-10-14 12:21:35

No problem. I'll get this patch committed. Thanks for testing!

By: Digium Subversion (svnbot) 2008-10-14 12:23:16

Repository: asterisk
Revision: 148912

U   branches/1.4/channels/chan_local.c

------------------------------------------------------------------------
r148912 | mmichelson | 2008-10-14 12:23:15 -0500 (Tue, 14 Oct 2008) | 9 lines

Deadlock prevention in chan_local.

(closes issue ASTERISK-12877)
Reported by: tacvbo
Patches:
     13676.patch uploaded by putnopvut (license 60)
Tested by: tacvbo


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=148912