Summary: ASTERISK-17387: [patch] Deadlock In chan_sip (conlock / cb_extensionstate
Reporter: Gregory Hinton Nietsky (irroot)
Labels:
Date Opened: 2011-02-11 01:40:42.000-0600
Date Closed: 2011-02-11 12:27:16.000-0600
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 1.6.2.15
Frequency of Occurrence:
Related Issues:
is duplicated by ASTERISK-17287 Random deadlocks in channel.c line 2744 (__ast_read), Asterisk freeze
is duplicated by ASTERISK-17126 [patch] Random Deadlocks in or !?!
Environment:
Attachments: ( 0) chan_sip-1.6.2.patch
Description:
There is a deadlock in processing state changes. It would appear conlock should be locked when calling cb_extensionstate from chan_sip.c ...

I have a patch that I'll be uploading in a follow-up.

This was a problem in 1.4 that I hacked to death to make it go away; this patch seems to make more sense.

NB: we have only now moved from 1.4.36 [with backports] to 1.6.2.16.1. This problem affects sites with higher call volume and reliance on BLF [Snom/Polycom], and may be triggered on a SIP/system reload.

****** ADDITIONAL INFORMATION ******

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: -1266246800 (tps_processing_function started at [  451] taskprocessor.c ast_taskprocessor_get())
=== ---> Lock #0 (pbx.c): MUTEX 9419 ast_rdlock_contexts &conlock 0x81fd2a0 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x81174d1]
       /usr/sbin/asterisk(ast_rdlock_contexts+0x32) [0x8131a2e]
       /usr/sbin/asterisk() [0x81209f6]
       /usr/sbin/asterisk() [0x816c24f]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Lock #1 (pbx.c): MUTEX 3912 handle_statechange hints 0x9fbe3f0 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x8080295]
       /usr/sbin/asterisk(_ao2_lock+0x48) [0x8080e3a]
       /usr/sbin/asterisk() [0x8120a23]
       /usr/sbin/asterisk() [0x816c24f]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Lock #2 (pbx.c): MUTEX 3913 handle_statechange hint 0xafe06658 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x8080295]
       /usr/sbin/asterisk(_ao2_lock+0x48) [0x8080e3a]
       /usr/sbin/asterisk() [0x8120a4e]
       /usr/sbin/asterisk() [0x816c24f]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Waiting for Lock #3 (chan_sip.c): MUTEX 12978 cb_extensionstate p 0xac477790 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x8080295]
       /usr/sbin/asterisk(_ao2_lock+0x48) [0x8080e3a]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x35ed7) [0xb0c96ed7]
       /usr/sbin/asterisk() [0x8120b47]
       /usr/sbin/asterisk() [0x816c24f]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== --- ---> Locked Here: chan_sip.c line 7514 (find_call)
=== -------------------------------------------------------------------
===
=== Thread ID: -1332446352 (do_monitor           started at [22798] chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 22248 handle_request_do &netlock 0xb0ceecc0 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x7d3f) [0xb0c68d3f]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x60eba) [0xb0cc1eba]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x60c82) [0xb0cc1c82]
       /usr/sbin/asterisk(ast_io_wait+0x144) [0x80fb1e0]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x629e6) [0xb0cc39e6]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Lock #1 (chan_sip.c): MUTEX 7514 find_call sip_pvt_ptr 0xac477790 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x8080295]
       /usr/sbin/asterisk(_ao2_lock+0x48) [0x8080e3a]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x1d4ef) [0xb0c7e4ef]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x60ed6) [0xb0cc1ed6]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x60c82) [0xb0cc1c82]
       /usr/sbin/asterisk(ast_io_wait+0x144) [0x80fb1e0]
       /usr/lib/asterisk/modules-1.6/chan_sip.so(+0x629e6) [0xb0cc39e6]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Waiting for Lock #2 (pbx.c): MUTEX 9419 ast_rdlock_contexts &conlock 0x81fd2a0 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x81174d1]
       /usr/sbin/asterisk(ast_rdlock_contexts+0x32) [0x8131a2e]
       /usr/sbin/asterisk() [0x811fdc7]
       /usr/sbin/asterisk(ast_canmatch_extension+0x55) [0x81218e6]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x361f8) [0xb41a01f8]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== --- ---> Locked Here: pbx.c line 9419 (ast_rdlock_contexts)
=== -------------------------------------------------------------------
===
=== Thread ID: -1333789840 (pri_dchannel         started at [14006] chan_dahdi.c start_pri())
=== ---> Lock #0 (chan_dahdi.c): MUTEX 12924 pri_dchannel &pri->lock 0xb41c4924 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x6a0f) [0xb4170a0f]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x344ce) [0xb419e4ce]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Lock #1 (chan_dahdi.c): MUTEX 13171 pri_dchannel &pri->pvts[chanpos]->lock 0xb082e578 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x6a0f) [0xb4170a0f]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x35a41) [0xb419fa41]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Waiting for Lock #2 (pbx.c): MUTEX 9419 ast_rdlock_contexts &conlock 0x81fd2a0 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x81174d1]
       /usr/sbin/asterisk(ast_rdlock_contexts+0x32) [0x8131a2e]
       /usr/sbin/asterisk() [0x811fdc7]
       /usr/sbin/asterisk(ast_canmatch_extension+0x55) [0x81218e6]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x361f8) [0xb41a01f8]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== --- ---> Locked Here: pbx.c line 9419 (ast_rdlock_contexts)
=== -------------------------------------------------------------------
===
=== Thread ID: -1334035600 (pri_dchannel         started at [14006] chan_dahdi.c start_pri())
=== ---> Lock #0 (chan_dahdi.c): MUTEX 12924 pri_dchannel &pri->lock 0xb41c5bbc (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x6a0f) [0xb4170a0f]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x344ce) [0xb419e4ce]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Lock #1 (chan_dahdi.c): MUTEX 13171 pri_dchannel &pri->pvts[chanpos]->lock 0xb085f208 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x6a0f) [0xb4170a0f]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x35a41) [0xb419fa41]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== ---> Waiting for Lock #2 (pbx.c): MUTEX 9419 ast_rdlock_contexts &conlock 0x81fd2a0 (1)
       /usr/sbin/asterisk(ast_bt_get_addresses+0x19) [0x8105607]
       /usr/sbin/asterisk() [0x81174d1]
       /usr/sbin/asterisk(ast_rdlock_contexts+0x32) [0x8131a2e]
       /usr/sbin/asterisk() [0x811fdc7]
       /usr/sbin/asterisk(ast_canmatch_extension+0x55) [0x81218e6]
       /usr/lib/asterisk/modules-1.6/chan_dahdi.so(+0x361f8) [0xb41a01f8]
       /usr/sbin/asterisk() [0x817cdf8]
       /lib/libpthread.so.0(+0x5afe) [0xb7477afe]
       /lib/libc.so.6(clone+0x5e) [0xb76b564e]
=== --- ---> Locked Here: pbx.c line 9419 (ast_rdlock_contexts)
=== -------------------------------------------------------------------
===
=======================================================================
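
Reading the dump above as a wait-for graph makes the cycle explicit. The following is a minimal sketch in C that encodes each thread's held and awaited locks by hand and follows the "waits for the owner of" edges. The lock labels (for example `pri1` and `pri1_pvt`) are shortened names made up for this illustration, not Asterisk identifiers, and the code is not part of Asterisk or of the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per thread in the dump: the locks it holds and the lock it is
 * blocked on. Labels are illustrative, not Asterisk identifiers. */
struct thread_state {
    const char *name;
    const char *held[4];   /* NULL-terminated list of held locks */
    const char *waiting;   /* lock the thread is waiting for */
};

static const struct thread_state dump[] = {
    { "tps_processing_function", { "conlock", "hints", "hint", NULL }, "sip_pvt" },
    { "do_monitor",              { "netlock", "sip_pvt", NULL },       "conlock" },
    { "pri_dchannel_1",          { "pri1", "pri1_pvt", NULL },         "conlock" },
    { "pri_dchannel_2",          { "pri2", "pri2_pvt", NULL },         "conlock" },
};
#define NTHREADS (sizeof(dump) / sizeof(dump[0]))

/* Return the index of the thread holding `lock`, or -1 if nobody holds it. */
static int owner_of(const char *lock)
{
    for (size_t t = 0; t < NTHREADS; t++) {
        for (size_t i = 0; i < 4 && dump[t].held[i]; i++) {
            if (strcmp(dump[t].held[i], lock) == 0) {
                return (int)t;
            }
        }
    }
    return -1;
}

/* Follow "waits for the owner of" edges starting at thread `start`.
 * Returns 1 if the walk comes back to `start`, i.e. the thread is part
 * of a deadlock cycle, and 0 otherwise. */
static int in_deadlock_cycle(size_t start)
{
    assert(start < NTHREADS);
    int cur = (int)start;
    for (size_t steps = 0; steps < NTHREADS; steps++) {
        cur = owner_of(dump[cur].waiting);
        if (cur < 0) {
            return 0;   /* awaited lock is free: no cycle through here */
        }
        if ((size_t)cur == start) {
            return 1;
        }
    }
    return 0;
}
```

Called with index 0 or 1 (tps_processing_function and do_monitor) the walk returns 1. For the two pri_dchannel threads it returns 0: they are blocked on conlock, but they hold nothing the other threads are waiting for, so they are victims of the cycle and not part of it.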
Comments:

By: Gregory Hinton Nietsky (irroot) 2011-02-11 01:45:18.000-0600

Please note the first 2 hunks in the patch are not directly related to this bug, but they do seem to fix a deadlock and could be extraneous code.

By: Stefan Schmidt (schmidts) 2011-02-11 03:31:06.000-0600

There is already a solution for this submitted in the 1.6.2 branch, rev 302265, but I don't think it made it into 1.6.2.16.1.

This patch, written by Jeff Peeler, solves the deadlock in handle_statechange instead of chan_sip.c, which makes more sense because it's called more often than the place you attached your patch.

Please take a look whether this revision is already in 1.6.2.16.1. If yes, and you still have this problem, please reply.

If not, try the patch from rev 302265.

thanks!

best regards
stefan

By: Gregory Hinton Nietsky (irroot) 2011-02-11 04:37:09.000-0600

Indeed this will resolve it, as it no longer locks the conlock in handle_statechange. I removed these locks in 1.4 to work around the problem. The patch I submitted was to ensure conlock was locked before calling the callback.

Regards Greg

By: Stefan Schmidt (schmidts) 2011-02-11 04:47:25.000-0600

But you don't have to take the conlock there, because cb_extensionstate calls transmit_state_notify, which calls ast_get_hint, which calls ast_hint_extension, and there you have the conlock already. IMHO this is the only place where you need the conlock when you access the extension list.

The only thing you change with the conlock in your patch is to lock it again, and for a longer time, which could cause other problems.


best regards

stefan
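
The point discussed in the comments above (the callback path takes conlock itself, so the caller does not need to hold it across the callback) can be sketched with plain pthread mutexes. This is only an illustration of the ordering idea and is not the code from rev 302265 or from the attached patch. All names ending in `_sketch`, and both `notify_*` functions, are hypothetical stand-ins. `pthread_mutex_trylock` is used so the sketch reports contention instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the context lock and a per-dialog lock. */
static pthread_mutex_t conlock_sketch = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t pvt_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Set by the callback: 1 if the context lock was free when it needed it. */
static int conlock_was_free;

/* Stand-in for a state callback: it takes the per-dialog lock, then
 * briefly needs the context lock (as a hint lookup would). */
static void state_cb_sketch(void)
{
    pthread_mutex_lock(&pvt_lock_sketch);
    /* trylock returns EBUSY instead of blocking if the lock is held */
    int rc = pthread_mutex_trylock(&conlock_sketch);
    assert(rc == 0 || rc == EBUSY);
    if (rc == 0) {
        conlock_was_free = 1;
        pthread_mutex_unlock(&conlock_sketch);
    } else {
        conlock_was_free = 0;
    }
    pthread_mutex_unlock(&pvt_lock_sketch);
}

/* Ordering seen in the dump: the callback runs while the caller still
 * holds the context lock. */
static int notify_holding_conlock(void)
{
    pthread_mutex_lock(&conlock_sketch);
    state_cb_sketch();
    pthread_mutex_unlock(&conlock_sketch);
    return conlock_was_free;
}

/* Alternative ordering: finish the work that needs the context lock,
 * release it, and only then run the callback. */
static int notify_after_release(void)
{
    pthread_mutex_lock(&conlock_sketch);
    /* ... work that needs the context lock would go here ... */
    pthread_mutex_unlock(&conlock_sketch);
    state_cb_sketch();
    return conlock_was_free;
}
```

In the first ordering the callback finds the context lock busy, which is the situation where a second thread holding the per-dialog lock and waiting for the context lock completes the cycle. In the second ordering the callback never needs a lock that its caller is holding.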

By: Leif Madsen (lmadsen) 2011-02-11 12:27:15.000-0600

Closing as it was mentioned this has been resolved already. Thanks for the patch!