Summary: | ASTERISK-18205: Deadlock in app_queue when loading real-time queues and handling state change. | ||||
Reporter: | Steven Wheeler (swheeler) | Labels: | |||
Date Opened: | 2011-07-28 14:11:34 | Date Closed: | 2011-07-28 19:04:11 | ||
Priority: | Major | Regression? | |||
Status: | Closed/Complete | Components: | Applications/app_queue PBX/General | ||
Versions: | Frequency of Occurrence | Occasional | |||
Related Issues: |
| ||||
Environment: | CentOS Linux 2.6.18-238.9.1.el5PAE #1 SMP Tue Apr 12 18:52:55 EDT 2011 i686 i686 i386 GNU/Linux | Attachments: | ( 0) core-show-locks.2011-07-28.txt ( 1) gdb.txt | ||
Description: | We are seeing deadlocks when a queue call is loading the queue and member from the real-time database at the same time that another thread is updating the state of a different agent. This happens a few times a week when the queues are under higher-than-average load. I have gathered the 'core show locks' output taken while the deadlock was occurring. It indicates that the locks are being acquired out of order in one of the threads, but I don't know the Asterisk source well enough to know which order is correct.

tps_processing_function acquires in this order:
Lock #0 &conlock(0x8216240) in ast_rdlock_contexts(pbx.c:9367)
Lock #1 &(&hints)->lock(0x8217508) in handle_statechange(pbx.c:3861)
Waiting for Lock #2 &p->priv_data.lock(0xb7587f58) in ao2_lock(astobj2.c:164)
This lock is already held as #0 in pbx_thread thread.

pbx_thread acquires in this order:
Lock #0 queues(0xb7587f58) in load_realtime_queue(app_queue.c:1956)
Lock #1 q(0x90d2858) in find_queue_by_name_rt(app_queue.c:1803)
Lock #2 &conlock(0x8216240) in ast_rdlock_contexts(pbx.c:9367)
Waiting for Lock #3 &(&hints)->lock(0x8217508) in ast_add_hint(pbx.c:4076)
This lock is already held as #1 in tps_processing_function thread.

I will upload the full output as well as the core file. Please let me know if there is any more information you need to debug this, and I will try to get it the next time the deadlock pops up. | ||||
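For readers unfamiliar with this class of bug: in the two traces above, one thread holds the hints lock and waits for the queues container lock, while the other holds the queues container lock and waits for the hints lock. The sketch below illustrates the usual remedy for such an inversion, which is to have every code path acquire the two locks in one agreed order. It is not the fix applied in Asterisk. The names `queues_lock`, `hints_lock`, `lock_both`, and `run_both_workers` are hypothetical stand-ins, and plain pthread mutexes are assumed in place of Asterisk's own lock wrappers.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks in the cycle; not Asterisk's objects. */
static pthread_mutex_t queues_lock = PTHREAD_MUTEX_INITIALIZER; /* rank 1 */
static pthread_mutex_t hints_lock  = PTHREAD_MUTEX_INITIALIZER; /* rank 2 */
static long shared_counter = 0; /* protected by holding both locks */

/* Every path takes queues_lock first, then hints_lock, so no cycle can form. */
static void lock_both(void)
{
    pthread_mutex_lock(&queues_lock);
    pthread_mutex_lock(&hints_lock);
}

/* Release in the reverse order of acquisition. */
static void unlock_both(void)
{
    pthread_mutex_unlock(&hints_lock);
    pthread_mutex_unlock(&queues_lock);
}

/* Shared body for both workers: repeatedly take both locks and update state. */
static void *locked_worker(void *arg)
{
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        lock_both();
        shared_counter++;
        unlock_both();
    }
    return NULL;
}

/*
 * Runs a "state change" thread and a "queue load" thread concurrently.
 * Returns the final counter value, or -1 if a thread could not be created.
 */
long run_both_workers(int iterations)
{
    pthread_t state_change_thread, queue_load_thread;

    assert(iterations >= 0);
    shared_counter = 0;

    if (pthread_create(&state_change_thread, NULL, locked_worker, &iterations) != 0) {
        return -1;
    }
    if (pthread_create(&queue_load_thread, NULL, locked_worker, &iterations) != 0) {
        pthread_join(state_change_thread, NULL);
        return -1;
    }
    pthread_join(state_change_thread, NULL);
    pthread_join(queue_load_thread, NULL);
    return shared_counter;
}
```

If one of the workers instead took `hints_lock` before `queues_lock`, the two threads could each hold one lock and wait forever for the other, which is the pattern the 'core show locks' output shows. Build with `-pthread` when trying the sketch.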
Comments: | By: Steven Wheeler (swheeler) 2011-07-28 14:12:35.727-0500
Output of asterisk -rx 'core show locks' while the system was deadlocked.

By: Steven Wheeler (swheeler) 2011-07-28 14:13:32.750-0500
The core file is 136 MB so I can't upload it here. I would be happy to perform any actions on it and upload the output.

By: Steven Wheeler (swheeler) 2011-07-28 14:16:06.269-0500
Output of gdb thread apply all bt.

By: Richard Mudgett (rmudgett) 2011-07-28 19:04:11.783-0500
Thanks for the report. This is a duplicate of a deadlock already fixed. See ASTERISK-17760. |