|Summary:||ASTERISK-18205: Deadlock in app_queue when loading real-time queues and handling state change.|
|Reporter:||Steven Wheeler (swheeler)||Labels:|
|Date Opened:||2011-07-28 14:11:34||Date Closed:||2011-07-28 19:04:11|
|Environment:||CentOS Linux 2.6.18-238.9.1.el5PAE #1 SMP Tue Apr 12 18:52:55 EDT 2011 i686 i686 i386 GNU/Linux||Attachments:||( 0) core-show-locks.2011-07-28.txt|
( 1) gdb.txt
|Description:||We are seeing deadlocks when a queue call is loading the queue and its members from the real-time database at the same time that another thread is updating the state of a different agent. This happens a few times a week when the queues are under higher-than-average load. I have gathered the 'core show locks' output taken while the deadlock was occurring. It indicates that the locks are being acquired out of order in one of the threads, but I don't know the Asterisk source well enough to know which order is correct.|
tps_processing_function acquires in this order:
Lock #0 &conlock(0x8216240) in ast_rdlock_contexts(pbx.c:9367)
Lock #1 &(&hints)->lock(0x8217508) in handle_statechange(pbx.c:3861)
Waiting for Lock #2 &p->priv_data.lock(0xb7587f58) in ao2_lock(astobj2.c:164) This lock is already held as #0 in pbx_thread thread.
pbx_thread acquires in this order:
Lock #0 queues(0xb7587f58) in load_realtime_queue(app_queue.c:1956)
Lock #1 q(0x90d2858) in find_queue_by_name_rt(app_queue.c:1803)
Lock #2 &conlock(0x8216240) in ast_rdlock_contexts(pbx.c:9367)
Waiting for Lock #3 &(&hints)->lock(0x8217508) in ast_add_hint(pbx.c:4076) This lock is already held as #1 in tps_processing_function thread.
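The two acquisition lists above can be checked mechanically for a circular wait. What follows is only a minimal sketch, not part of Asterisk: the `threads` dictionary is transcribed by hand from the excerpt above, and the helper names `wait_for_graph` and `find_cycle` are hypothetical. It ignores lock type, so it treats the `&conlock` read lock like any other held lock. That does not affect this case, because neither thread is waiting on it.

```python
# Held/waiting lock addresses copied by hand from the 'core show locks'
# excerpt above. The dictionary layout is an assumption of this sketch.
threads = {
    "tps_processing_function": {
        "held": ["0x8216240", "0x8217508"],   # &conlock, &(&hints)->lock
        "waiting": "0xb7587f58",              # queues container lock
    },
    "pbx_thread": {
        "held": ["0xb7587f58", "0x90d2858", "0x8216240"],  # queues, q, &conlock
        "waiting": "0x8217508",               # &(&hints)->lock
    },
}


def wait_for_graph(threads):
    """Map each thread to the other threads holding the lock it waits for."""
    graph = {}
    for name, info in threads.items():
        graph[name] = sorted(
            other
            for other, other_info in threads.items()
            if other != name and info["waiting"] in other_info["held"]
        )
    return graph


def find_cycle(graph):
    """Return one cycle of thread names (first name repeated at the end), or None."""

    def visit(node, path):
        if node in path:
            # Reached a thread already on the current path: circular wait.
            return path[path.index(node):] + [node]
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(wait_for_graph(threads))
    print(" -> ".join(cycle) if cycle else "no circular wait found")
```

For the data above, the sketch reports that tps_processing_function waits on pbx_thread and pbx_thread waits on tps_processing_function. This is the same circular wait that the "already held" notes in the lock output describe.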
I will upload the full output as well as the core file. Please let me know if there is any more information you need to debug this, and I will try to get it the next time the deadlock occurs.
|Comments:||By: Steven Wheeler (swheeler) 2011-07-28 14:12:35.727-0500|
Output of asterisk -rx 'core show locks' while the system was deadlocked.
By: Steven Wheeler (swheeler) 2011-07-28 14:13:32.750-0500
The core file is 136 MB so I can't upload it here. I would be happy to perform any actions on it and upload the output.
By: Steven Wheeler (swheeler) 2011-07-28 14:16:06.269-0500
Output of gdb 'thread apply all bt'.
By: Richard Mudgett (rmudgett) 2011-07-28 19:04:11.783-0500
Thanks for the report. This is a duplicate of a deadlock that has already been fixed. See ASTERISK-17760.