|Summary:||ASTERISK-07409: use of chan_agent in combination with transfers causes deadlock|
|Reporter:||Jared Smith (jsmith)||Labels:|
|Date Opened:||2006-07-28 08:27:46||Date Closed:||2006-09-06 10:06:48|
|Environment:||Attachments:||( 0) deadlock.993.backtrace.txt|
( 1) junky_7604.txt
|Description:||I'm running a queue with around 15 agents (three or four logged in at any time) and up to about 10 callers in the queue. If I run asterisk -rx "show queues" every 30 seconds, eventually it causes Asterisk to stop responding to calls or CLI commands.|
This problem is very similar to bug number 7592, but my machine just deadlocks instead of crashing Asterisk, as was reported in the other bug.
****** STEPS TO REPRODUCE ******
Set up a queue
Run asterisk -rx "show queues" repeatedly
Watch Asterisk deadlock
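The polling described in the steps above could be automated roughly as follows. This is only a sketch of the steps as originally reported: the function name, the 60-second hang timeout, and the injectable `run`/`sleep` parameters are assumptions made for illustration, and a later comment in this issue attributes the deadlock to chan_agent with transfers rather than to "show queues" itself.

```python
import subprocess
import time


def poll_show_queues(runs, interval_s=30.0, timeout_s=60.0,
                     cmd=("asterisk", "-rx", "show queues"),
                     run=subprocess.run, sleep=time.sleep):
    """Run the CLI command `runs` times, `interval_s` seconds apart.

    Returns the zero-based index of the first run that did not finish
    within `timeout_s` seconds (treated here as "Asterisk stopped
    responding"), or None if every run completed.
    The timeout value is an assumption, not something from the report.
    """
    for i in range(runs):
        try:
            # check=False: a non-zero exit status is not treated as a hang.
            run(list(cmd), capture_output=True, timeout=timeout_s, check=False)
        except subprocess.TimeoutExpired:
            return i
        if i + 1 < runs:
            sleep(interval_s)
    return None


# Example (needs a running Asterisk on the same host):
#   hung_at = poll_show_queues(runs=1000)
#   print("no hang seen" if hung_at is None else f"hung on run {hung_at}")
```

A command that merely hangs is only a hint that the server is deadlocked; the lock warnings in the log are the more reliable signal.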
****** ADDITIONAL INFORMATION ******
I've got debug_threads and dont-optimize turned on (at Russell's request in bug 7592), and I'm seeing lots and lots of the following in the logs:
Jul 28 07:21:46 ERROR: ../include/asterisk/lock.h:306 __ast_pthread_mutex_lock: chan_agent.c line 2020 (__login_exec): Deadlock? waited 230 sec for mutex '&p->app_lock'?
Jul 28 07:21:46 ERROR: ../include/asterisk/lock.h:309 __ast_pthread_mutex_lock: chan_agent.c line 950 (agent_new): '&p->app_lock' was locked here.
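When there are many of these messages, it can help to pair each "Deadlock? waited" line with the "was locked here" line that follows it, to see which code path holds the mutex and which one is stuck waiting. The sketch below assumes the log lines have exactly the shape shown above; the regular expressions and the function name are illustrative and not part of Asterisk.

```python
import re

# Patterns written against the two example lines above (an assumption:
# other Asterisk versions may format these messages differently).
WAIT_RE = re.compile(
    r"(?P<file>\S+) line (?P<line>\d+) \((?P<func>\w+)\): "
    r"Deadlock\? waited (?P<secs>\d+) sec for mutex '(?P<mutex>[^']+)'"
)
HELD_RE = re.compile(
    r"(?P<file>\S+) line (?P<line>\d+) \((?P<func>\w+)\): "
    r"'(?P<mutex>[^']+)' was locked here"
)


def pair_deadlock_reports(lines):
    """Pair each waiter line with the next holder line for the same mutex."""
    reports = []
    pending = None  # most recent "Deadlock? waited" match not yet paired
    for line in lines:
        wait = WAIT_RE.search(line)
        if wait:
            pending = wait
            continue
        held = HELD_RE.search(line)
        if held and pending and held["mutex"] == pending["mutex"]:
            reports.append({
                "mutex": held["mutex"],
                "waited_sec": int(pending["secs"]),
                "waiter": f'{pending["file"]}:{pending["line"]} ({pending["func"]})',
                "holder": f'{held["file"]}:{held["line"]} ({held["func"]})',
            })
            pending = None
    return reports
```

Applied to the two lines above, this reports `__login_exec` (chan_agent.c line 2020) waiting on `&p->app_lock`, which was taken in `agent_new` (chan_agent.c line 950).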
|Comments:||By: Serge Vecher (serge-v) 2006-07-28 11:12:27|
jsmith: do you want to attach a complete log ...
By: finnepinne (finnepinne) 2006-08-04 06:01:11
I just wanted to report that I am seeing the exact same behaviour on my machine. Our machine stops responding after a while even if not running 'show queues'. We are however using Flash Operator Panel.
By: Jared Smith (jsmith) 2006-08-04 17:57:09
I just uploaded a complete backtrace from today's deadlock, just for historical purposes.
By: Russell Bryant (russell) 2006-08-05 00:32:10
Just so it is noted here in this bug, when I logged into jsmith's machine, this problem turned out not to be related to "show queues".
This problem was related to the use of chan_agent in combination with transfers.
If you are experiencing this problem, you are using chan_agent, and your agents are using transfers, I would STRONGLY recommend that you migrate your system to use dynamic queue members instead. If you are not using chan_agent, but are experiencing deadlocks, then please say so that we can figure it out, as it would be a different issue than what jsmith was having.
By: Serge Vecher (serge-v) 2006-08-21 14:34:59
jsmith: did you migrate over to dynamic queues in the dialplan? Are we ready to close this issue?
By: Jared Smith (jsmith) 2006-08-22 14:24:17
I've moved over to dynamic queues, and while they work, they're not perfect. First, the queue statistics (if you do a "show queues" from the CLI) are all wrong: agents are shown as not being on a call even when they are. Second, issuing a "reload" seems to corrupt something in the queue, as doing a "show queues" after a reload shows all kinds of garbage on the screen. If you'd like, I can open these as separate issues, as they're not 100% related to this particular crash.
By: Jared Smith (jsmith) 2006-08-22 14:25:25
Yes, I've moved to dynamic queues, and transfers no longer crash the system.
By: Serge Vecher (serge-v) 2006-08-22 14:36:49
jsmith: I'm glad your issue is fixed. Please open new bug reports for other issues you've mentioned with dynamic queues. Thanks.