#### The two interlocked threads: === Thread ID: 140071508813568 (do_monitor started at [24477] chan_sip.c restart_monitor()) === ---> Lock #0 (chan_sip.c): MUTEX 24449 do_monitor &monlock 0x7f64de1edb60 (1) /usr/sbin/asterisk(ast_bt_get_addresses+0x1d) [0x4e67e3] /usr/sbin/asterisk(__ast_pthread_mutex_lock+0xaf) [0x4df635] /usr/lib/asterisk/modules/chan_sip.so(+0x817b0) [0x7f64ddfa27b0] /usr/sbin/asterisk() [0x566e17] /lib/libpthread.so.0(+0x7971) [0x7f64f33e6971] /lib/libc.so.6(clone+0x6d) [0x7f64f3c3492d] === ---> Lock #1 (chan_sip.c): MUTEX 22286 handle_request_refer p 0x7f64cc0be4c0 (1) /usr/sbin/asterisk(ast_bt_get_addresses+0x1d) [0x4e67e3] /usr/sbin/asterisk(__ast_pthread_mutex_lock+0xaf) [0x4df635] /usr/sbin/asterisk(__ao2_lock+0x5a) [0x442288] /usr/lib/asterisk/modules/chan_sip.so(+0x23976) [0x7f64ddf44976] /usr/sbin/asterisk(ast_indicate_data+0x367) [0x471ff1] /usr/sbin/asterisk(ast_indicate+0x2b) [0x471c38] /usr/sbin/asterisk(ast_channel_bridge+0x16a5) [0x47e50a] /usr/sbin/asterisk(ast_bridge_call+0xdf9) [0x4bb06a] /usr/lib/asterisk/modules/app_dial.so(+0xfa12) [0x7f64d4baba12] /usr/lib/asterisk/modules/app_dial.so(+0x103c2) [0x7f64d4bac3c2] /usr/sbin/asterisk(pbx_exec+0x1fb) [0x4ff7ec] /usr/sbin/asterisk() [0x5096f2] /usr/sbin/asterisk(ast_spawn_extension+0x65) [0x50b134] /usr/sbin/asterisk() [0x50bbbd] /usr/sbin/asterisk() [0x50d7fb] /usr/sbin/asterisk() [0x566e17] /lib/libpthread.so.0(+0x7971) [0x7f64f33e6971] /lib/libc.so.6(clone+0x6d) [0x7f64f3c3492d] === ---> Lock #2 (chan_sip.c): MUTEX 22415 handle_request_bye bridge 0x7f64cc13c680 (1) /usr/sbin/asterisk(ast_bt_get_addresses+0x1d) [0x4e67e3] /usr/sbin/asterisk(__ast_pthread_mutex_trylock+0xaf) [0x4df9b8] /usr/sbin/asterisk(__ao2_trylock+0x5a) [0x44234a] /usr/lib/asterisk/modules/chan_sip.so(+0x79084) [0x7f64ddf9a084] /usr/lib/asterisk/modules/chan_sip.so(+0x7ecb4) [0x7f64ddf9fcb4] /usr/lib/asterisk/modules/chan_sip.so(+0x7f0fb) [0x7f64ddfa00fb] /usr/lib/asterisk/modules/chan_sip.so(+0x7f379) [0x7f64ddfa0379] /usr/sbin/asterisk(ast_sched_runq+0x19a) [0x54a4f2] /usr/lib/asterisk/modules/chan_sip.so(+0x817c2) [0x7f64ddfa27c2] /usr/sbin/asterisk() [0x566e17] /lib/libpthread.so.0(+0x7971) [0x7f64f33e6971] /lib/libc.so.6(clone+0x6d) [0x7f64f3c3492d] === ---> Waiting for Lock #3 (pbx.c): MUTEX 9541 pbx_builtin_setvar_helper chan 0x2710a50 (1) /usr/sbin/asterisk(ast_bt_get_addresses+0x1d) [0x4e67e3] /usr/sbin/asterisk(__ast_pthread_mutex_lock+0xaf) [0x4df635] /usr/sbin/asterisk(__ao2_lock+0x5a) [0x442288] /usr/sbin/asterisk(pbx_builtin_setvar_helper+0xfc) [0x51d4fe] /usr/sbin/asterisk(ast_rtp_instance_set_stats_vars+0x8a) [0x5267cc] /usr/lib/asterisk/modules/chan_sip.so(+0x7928a) [0x7f64ddf9a28a] /usr/lib/asterisk/modules/chan_sip.so(+0x7ecb4) [0x7f64ddf9fcb4] /usr/lib/asterisk/modules/chan_sip.so(+0x7f0fb) [0x7f64ddfa00fb] /usr/lib/asterisk/modules/chan_sip.so(+0x7f379) [0x7f64ddfa0379] /usr/sbin/asterisk(ast_sched_runq+0x19a) [0x54a4f2] /usr/lib/asterisk/modules/chan_sip.so(+0x817c2) [0x7f64ddfa27c2] /usr/sbin/asterisk() [0x566e17] /lib/libpthread.so.0(+0x7971) [0x7f64f33e6971] /lib/libc.so.6(clone+0x6d) [0x7f64f3c3492d] === --- ---> Locked Here: channel.c line 4222 (ast_indicate_data) === ------------------------------------------------------------------- === Thread ID: 140071007495936 (pbx_thread started at [ 5035] pbx.c ast_pbx_start()) === ---> Lock #0 (channel.c): MUTEX 4222 ast_indicate_data chan 0x2710a50 (1) /usr/sbin/asterisk(ast_bt_get_addresses+0x1d) [0x4e67e3] /usr/sbin/asterisk(__ast_pthread_mutex_lock+0xaf) [0x4df635] /usr/sbin/asterisk(__ao2_lock+0x5a) [0x442288] /usr/sbin/asterisk(ast_indicate_data+0x7c) [0x471d06] /usr/sbin/asterisk(ast_indicate+0x2b) [0x471c38] /usr/sbin/asterisk(ast_channel_bridge+0x16a5) [0x47e50a] /usr/sbin/asterisk(ast_bridge_call+0xdf9) [0x4bb06a] /usr/lib/asterisk/modules/app_dial.so(+0xfa12) [0x7f64d4baba12] /usr/lib/asterisk/modules/app_dial.so(+0x103c2) [0x7f64d4bac3c2] /usr/sbin/asterisk(pbx_exec+0x1fb) [0x4ff7ec] /usr/sbin/asterisk() [0x5096f2] /usr/sbin/asterisk(ast_spawn_extension+0x65) [0x50b134] /usr/sbin/asterisk() [0x50bbbd] /usr/sbin/asterisk() [0x50d7fb] /usr/sbin/asterisk() [0x566e17] /lib/libpthread.so.0(+0x7971) [0x7f64f33e6971] /lib/libc.so.6(clone+0x6d) [0x7f64f3c3492d] === ---> Waiting for Lock #1 (chan_sip.c): MUTEX 6356 sip_indicate p 0x7f64cc0be4c0 (1) /usr/sbin/asterisk(ast_bt_get_addresses+0x1d) [0x4e67e3] /usr/sbin/asterisk(__ast_pthread_mutex_lock+0xaf) [0x4df635] /usr/sbin/asterisk(__ao2_lock+0x5a) [0x442288] /usr/lib/asterisk/modules/chan_sip.so(+0x23976) [0x7f64ddf44976] /usr/sbin/asterisk(ast_indicate_data+0x367) [0x471ff1] /usr/sbin/asterisk(ast_indicate+0x2b) [0x471c38] /usr/sbin/asterisk(ast_channel_bridge+0x16a5) [0x47e50a] /usr/sbin/asterisk(ast_bridge_call+0xdf9) [0x4bb06a] /usr/lib/asterisk/modules/app_dial.so(+0xfa12) [0x7f64d4baba12] /usr/lib/asterisk/modules/app_dial.so(+0x103c2) [0x7f64d4bac3c2] /usr/sbin/asterisk(pbx_exec+0x1fb) [0x4ff7ec] /usr/sbin/asterisk() [0x5096f2] /usr/sbin/asterisk(ast_spawn_extension+0x65) [0x50b134] /usr/sbin/asterisk() [0x50bbbd] /usr/sbin/asterisk() [0x50d7fb] /usr/sbin/asterisk() [0x566e17] /lib/libpthread.so.0(+0x7971) [0x7f64f33e6971] /lib/libc.so.6(clone+0x6d) [0x7f64f3c3492d] === --- ---> Locked Here: chan_sip.c line 22286 (handle_request_refer) === ------------------------------------------------------------------- #### Respective backtraces from core dump: Thread 21 (Thread 17987): #0 0x00007f64f33ed6b4 in __lll_lock_wait () from /lib/libpthread.so.0 #1 0x00007f64f33e8864 in _L_lock_1024 () from /lib/libpthread.so.0 #2 0x00007f64f33e86c7 in pthread_mutex_lock () from /lib/libpthread.so.0 #3 0x00000000004df6ab in __ast_pthread_mutex_lock (filename=0x5bf754 "pbx.c", lineno=9541, func=0x5c5810 "pbx_builtin_setvar_helper", mutex_name=0x5c4039 "chan", t=0x2710a50) at lock.c:230 #4 0x0000000000442288 in __ao2_lock (user_data=0x2711638, file=0x5bf754 "pbx.c", func=0x5c5810 "pbx_builtin_setvar_helper", line=9541, var=0x5c4039 "chan") at astobj2.c:157 #5 0x000000000051d4fe in pbx_builtin_setvar_helper (chan=0x2711638, name=0x5c8483 "RTPAUDIOQOS", value=0x7f64f0869630 "ssrc=645964487;themssrc=211109326;lp=0;rxjitter=0.013916;rxcount=5590;txjitter=0.000000;txcount=5576;rlp=0;rtt=0.000000") at pbx.c:9541 #6 0x00000000005267cc in ast_rtp_instance_set_stats_vars (chan=0x2711638, instance=0x7f64cc092fc8) at rtp_engine.c:1598 #7 0x00007f64ddf9a28a in handle_request_bye (p=0x7f64cc0bf0a8, req=0x7f64cc0306e0) at chan_sip.c:22443 #8 0x00007f64ddf9fcb4 in handle_incoming (p=0x7f64cc0bf0a8, req=0x7f64cc0306e0, addr=0x7f64cc0bf730, recount=0x7f64f0869cfc, nounlock=0x7f64f0869cf8) at chan_sip.c:23733 #9 0x00007f64ddfa00fb in process_request_queue (p=0x7f64cc0bf0a8, recount=0x7f64f0869cfc, nounlock=0x7f64f0869cf8) at chan_sip.c:23804 #10 0x00007f64ddfa0379 in scheduler_process_request_queue (data=0x7f64cc0bf0a8) at chan_sip.c:23850 #11 0x000000000054a4f2 in ast_sched_runq (con=0x7f64ec556960) at sched.c:636 #12 0x00007f64ddfa27c2 in do_monitor (data=0x0) at chan_sip.c:24450 #13 0x0000000000566e17 in dummy_start (data=0x7f64ec0192e0) at utils.c:971 #14 0x00007f64f33e6971 in start_thread () from /lib/libpthread.so.0 #15 0x00007f64f3c3492d in clone () from /lib/libc.so.6 #16 0x0000000000000000 in ?? () Thread 33 (Thread 21668): #0 0x00007f64f33ed6b4 in __lll_lock_wait () from /lib/libpthread.so.0 #1 0x00007f64f33e8864 in _L_lock_1024 () from /lib/libpthread.so.0 #2 0x00007f64f33e86c7 in pthread_mutex_lock () from /lib/libpthread.so.0 #3 0x00000000004df6ab in __ast_pthread_mutex_lock (filename=0x7f64ddfc90b4 "chan_sip.c", lineno=6356, func=0x7f64ddfe0599 "sip_indicate", mutex_name=0x7f64ddfca92d "p", t=0x7f64cc0be4c0) at lock.c:230 #4 0x0000000000442288 in __ao2_lock (user_data=0x7f64cc0bf0a8, file=0x7f64ddfc90b4 "chan_sip.c", func=0x7f64ddfe0599 "sip_indicate", line=6356, var=0x7f64ddfca92d "p") at astobj2.c:157 #5 0x00007f64ddf44976 in sip_indicate (ast=0x2711638, condition=20, data=0x0, datalen=0) at chan_sip.c:6356 #6 0x0000000000471ff1 in ast_indicate_data (chan=0x2711638, _condition=20, data=0x0, datalen=0) at channel.c:4294 #7 0x0000000000471c38 in ast_indicate (chan=0x2711638, condition=20) at channel.c:4161 #8 0x000000000047e50a in ast_channel_bridge (c0=0x2711638, c1=0x7f64cc13d268, config=0x7f64d2a4c530, fo=0x7f64d2a4c078, rc=0x7f64d2a4c070) at channel.c:7263 #9 0x00000000004bb06a in ast_bridge_call (chan=0x2711638, peer=0x7f64cc13d268, config=0x7f64d2a4c530) at features.c:3564 #10 0x00007f64d4baba12 in dial_exec_full (chan=0x2711638, data=0x7f64d2a4f9a0 "SIP/trunk-out/14109999999,90,m(silence)gFU(orig-remote-connect-direct^4109999999)", peerflags=0x7f64d2a4d3e0, continue_exec=0x0) at app_dial.c:2651 #11 0x00007f64d4bac3c2 in dial_exec (chan=0x2711638, data=0x7f64d2a4f9a0 "SIP/trunk-out/14109999999,90,m(silence)gFU(orig-remote-connect-direct^4109999999)") at app_dial.c:2751 #12 0x00000000004ff7ec in pbx_exec (c=0x2711638, app=0x16459b0, data=0x7f64d2a4f9a0 "SIP/trunk-out/14109999999,90,m(silence)gFU(orig-remote-connect-direct^4109999999)") at pbx.c:1407 #13 0x00000000005096f2 in pbx_extension_helper (c=0x2711638, con=0x0, context=0x2711b90 "originate-dial-direct", exten=0x2711be0 "sw_131_14106157887", priority=10, label=0x0, callerid=0x7f64cc06d170 "8779999999", action=E_SPAWN, found=0x7f64d2a51c7c, combined_find_spawn=1) at pbx.c:4086 #14 0x000000000050b134 in ast_spawn_extension (c=0x2711638, context=0x2711b90 "originate-dial-direct", exten=0x2711be0 "sw_131_14109999999", priority=10, callerid=0x7f64cc06d170 "8779999999", found=0x7f64d2a51c7c, combined_find_spawn=1) at pbx.c:4605 #15 0x000000000050bbbd in __ast_pbx_run (c=0x2711638, args=0x0) at pbx.c:4703 #16 0x000000000050d7fb in pbx_thread (data=0x2711638) at pbx.c:5014 #17 0x0000000000566e17 in dummy_start (data=0x2651480) at utils.c:971 #18 0x00007f64f33e6971 in start_thread () from /lib/libpthread.so.0 #19 0x00007f64f3c3492d in clone () from /lib/libc.so.6 #20 0x0000000000000000 in ?? () #### Analysis: The do_monitor thread takes locks in a wrong order: the pvt is locked, and it deadlocked on trying to get a lock on the channel. The code in handle_request_bye() is written in an assumption that is is entered into with channel lock held: struct ast_channel *bridge = p->owner ? ast_bridged_channel(p->owner) : NULL; /* We need to get the lock on bridge because ast_rtp_instance_set_stats_vars will attempt * to lock the bridge. This may get hairy... */ while (bridge && ast_channel_trylock(bridge)) { ast_channel_unlock(p->owner); do { /* Can't use DEADLOCK_AVOIDANCE since p is an ao2 object */ sip_pvt_unlock(p); usleep(1); sip_pvt_lock(p); } while (p->owner && ast_channel_trylock(p->owner)); bridge = p->owner ? ast_bridged_channel(p->owner) : NULL; } Apparently, the body of the outer while loop is not entered into (or there would be a warninig from an attempt to release an unowned lock). Later, if (p->owner) { ast_rtp_instance_set_stats_vars(p->owner, p->rtp); } and this is where the deadlock occurs, as p->owner is not locked. On a different day, I received the following event on the same server: ERROR[17987]: lock.c:384 in __ast_pthread_mutex_unlock: chan_sip.c line 23627 (handle_incoming): mutex 'owner' freed more times than we've locked! handle_incoming() is what calls handle_request_bye() and also fully assumes the owner locks is held. Apparently, in rare cases the assumption is not true. I am tracking down why that could happen, but help is indeed appreciated.