Summary: ASTERISK-07409: use of chan_agent in combination with transfers causes deadlock
Reporter: Jared Smith (jsmith)
Labels:
Date Opened: 2006-07-28 08:27:46
Date Closed: 2006-09-06 10:06:48
Versions:
Frequency of
Environment:
Attachments: ( 0) deadlock.993.backtrace.txt
( 1) junky_7604.txt
Description: I'm running a queue with around 15 agents (three or four logged in at any time) and up to about 10 callers in the queue.  If I run asterisk -rx "show queues" every 30 seconds, Asterisk eventually stops responding to calls and CLI commands.

This problem is very similar to bug number 7592, but my machine deadlocks instead of crashing Asterisk as was reported in the other bug.

****** STEPS TO REPRODUCE ******

Set up a queue
Run asterisk -rx "show queues" repeatedly
Watch Asterisk deadlock
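
The polling described in the steps above could be scripted roughly as follows. This is only a sketch of the reporter's original test loop; the function name `poll_queues` and the `ASTERISK_BIN` override are hypothetical and not part of Asterisk. Note that a later comment in this issue attributes the deadlock to chan_agent combined with transfers, not to "show queues" itself.

```shell
#!/bin/sh
# Poll "show queues" through the remote console a fixed number of times.
# ASTERISK_BIN is a hypothetical override, used only so the sketch can be
# exercised without a running Asterisk; it defaults to the real binary.
poll_queues() {
    count="$1"      # number of polls to run
    interval="$2"   # seconds to wait between polls
    i=0
    while [ "$i" -lt "$count" ]; do
        "${ASTERISK_BIN:-asterisk}" -rx "show queues"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `poll_queues 120 30` would issue the command every 30 seconds for about an hour.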


I've got debug_threads and dont-optimize turned on (at Russell's request in bug 7592), and I'm seeing lots and lots of the following in the logs:

Jul 28 07:21:46 ERROR[23550]: ../include/asterisk/lock.h:306 __ast_pthread_mutex_lock: chan_agent.c line 2020 (__login_exec): Deadlock? waited 230 sec for mutex '&p->app_lock'?
Jul 28 07:21:46 ERROR[23550]: ../include/asterisk/lock.h:309 __ast_pthread_mutex_lock: chan_agent.c line 950 (agent_new): '&p->app_lock' was locked here.
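
For anyone checking whether their own system shows the same symptom, a small helper along these lines could count the lock warnings quoted above. It is a sketch only: the function name is hypothetical, and the log file location must be supplied by the caller because it depends on the local logger configuration.

```shell
#!/bin/sh
# Count "Deadlock? waited" warnings in the given Asterisk log file.
# -F treats the pattern as a fixed string, so the "?" is matched literally.
count_deadlock_warnings() {
    grep -cF "Deadlock? waited" "$1"
}
```

A count that keeps growing while calls stall suggests the same kind of lock contention reported here.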

Comments:

By: Serge Vecher (serge-v) 2006-07-28 11:12:27

jsmith: do you want to attach a complete log ...

By: finnepinne (finnepinne) 2006-08-04 06:01:11

I just wanted to report that I am seeing the exact same behaviour on my machine. Our machine stops responding after a while even if not running 'show queues'. We are however using Flash Operator Panel.

By: Jared Smith (jsmith) 2006-08-04 17:57:09

I just uploaded a complete backtrace from today's deadlock, just for historical purposes.

By: Russell Bryant (russell) 2006-08-05 00:32:10

Just so it is noted here in this bug, when I logged into jsmith's machine, this problem turned out not to be related to "show queues".

This problem was related to the use of chan_agent in combination with transfers.

If you are experiencing this problem, you are using chan_agent, and your agents are using transfers, I would STRONGLY recommend that you migrate your system to use dynamic queue members instead.  If you are not using chan_agent, but are experiencing deadlocks, then please say so that we can figure it out, as it would be a different issue than what jsmith was having.
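
As a rough illustration of the dynamic queue member approach recommended above, agents' devices can be added to and removed from a queue at runtime instead of logging in through chan_agent. The sketch below assumes the 1.2-era CLI commands "add queue member" and "remove queue member" (later releases renamed them); the function names, the `ASTERISK_BIN` override, the interface `SIP/1001`, and the queue `support` are all hypothetical.

```shell
#!/bin/sh
# Add a device to a queue as a dynamic member.
# $1 = interface (for example SIP/1001), $2 = queue name.
queue_member_add() {
    "${ASTERISK_BIN:-asterisk}" -rx "add queue member $1 to $2"
}

# Remove a dynamic member from a queue.
# $1 = interface, $2 = queue name.
queue_member_remove() {
    "${ASTERISK_BIN:-asterisk}" -rx "remove queue member $1 from $2"
}
```

For example, `queue_member_add SIP/1001 support` when an agent starts a shift and `queue_member_remove SIP/1001 support` when it ends. The same effect is usually achieved from the dialplan with the AddQueueMember and RemoveQueueMember applications.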

By: Serge Vecher (serge-v) 2006-08-21 14:34:59

jsmith: did you migrate over to dynamic queues in the dialplan? Are we ready to close this issue?

By: Jared Smith (jsmith) 2006-08-22 14:24:17

I've moved over to dynamic queues, and while they work, they're not perfect.  First, the queue statistics (if you do a "show queues" from the CLI) are all wrong.  They show agents as not being on a call, even if they are.  Second, issuing a "reload" seems to corrupt something in the queue, as doing a "show queues" after a reload shows all kinds of garbage on the screen.  If you'd like, I can open these as separate issues, as they're not 100% related to this particular crash.

By: Jared Smith (jsmith) 2006-08-22 14:25:25

Yes, I've moved to dynamic queues, and transfers no longer crash the system.

By: Serge Vecher (serge-v) 2006-08-22 14:36:49

jsmith: I'm glad your issue is fixed. Please open new bug reports for the other issues you've mentioned with dynamic queues. Thanks.