Summary: | ASTERISK-14162: [patch] Deadlock On One-legged Transfer [SIP / REPLACES] (Call Pickup) | ||
Reporter: | Gregory Hinton Nietsky (irroot) | Labels: | |
Date Opened: | 2009-05-19 08:31:14 | Date Closed: | 2010-02-19 12:38:31.000-0600 |
Priority: | Blocker | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_sip/Transfers |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) 15151.diff ( 1) invite_w_replaces_1.4.diff ( 2) onelegtrf.patch | |
Description: | Hi there, here is a nasty deadlock on a one-legged transfer [SIP / REPLACES]. The solution is trivial, but it took me a few hours to pick up. The patch is against a clean 1.4.24.1. The lock dump below is from a patched version. BTW, this is a bit of a joke as I'm an amputee, so my colleagues had a field day.
****** ADDITIONAL INFORMATION ******
=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 2988112784 (do_monitor started at [16761] chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 9487 get_sip_pvt_byid_locked (channel lock) 0x9876780 (2)
=== ---> Lock #1 (chan_sip.c): MUTEX 9487 get_sip_pvt_byid_locked (channel lock) 0xb1e0f0f0 (4)
=== ---> Lock #2 (chan_sip.c): MUTEX 16707 do_monitor &monlock 0xb2220e20 (1)
=== -------------------------------------------------------------------
===
=== Thread ID: 3005942672 (pbx_thread started at [ 2660] pbx.c ast_pbx_start())
=== ---> Waiting for Lock #0 (channel.c): MUTEX 1699 ast_waitfor_nandfds (channel lock) 0x9876780 (1)
=== --- ---> Locked Here: chan_sip.c line 9487 (get_sip_pvt_byid_locked)
=== --- ---> Locked Here: chan_sip.c line 9487 (get_sip_pvt_byid_locked)
=== -------------------------------------------------------------------
===
=== Thread ID: 2981931920 (pbx_thread started at [ 2660] pbx.c ast_pbx_start())
=== ---> Waiting for Lock #0 (channel.c): MUTEX 1699 ast_waitfor_nandfds (channel lock) 0xb1e0f0f0 (1)
=== --- ---> Locked Here: chan_sip.c line 9487 (get_sip_pvt_byid_locked)
=== --- ---> Locked Here: chan_sip.c line 9487 (get_sip_pvt_byid_locked)
=== --- ---> Locked Here: chan_sip.c line 9487 (get_sip_pvt_byid_locked)
=== --- ---> Locked Here: chan_sip.c line 9487 (get_sip_pvt_byid_locked)
=== -------------------------------------------------------------------
===
=== Thread ID: 2981272464 (netconsole started at [ 1035] asterisk.c listener())
=== ---> Tried and failed to get Lock #0 (channel.c): MUTEX 1100 channel_find_locked (channel lock) 0x9876780 (0)
=== -------------------------------------------------------------------
===
=======================================================================
| ||
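The dump above shows the do_monitor thread holding two channel locks with "times locked" counts of 2 and 4, while two pbx threads wait on those same lock addresses. As a rough illustration of why a hold count that never returns to zero blocks every waiter, here is a minimal Python sketch. It is not Asterisk code (chan_sip.c is C): `threading.RLock` only stands in for a recursive channel lock, and `waiter_can_lock` and its parameters are hypothetical names chosen for this example.

```python
import threading


def waiter_can_lock(acquires: int, releases: int, timeout: float = 0.2) -> bool:
    """Return True if a second thread can take a reentrant lock after a
    first thread acquired it `acquires` times and released it `releases` times."""
    if releases > acquires:
        raise ValueError("this sketch only models missing releases, not extra ones")

    channel_lock = threading.RLock()   # stand-in for a recursive channel lock
    holder_ready = threading.Event()
    holder_exit = threading.Event()

    def holder() -> None:
        for _ in range(acquires):
            channel_lock.acquire()
        for _ in range(releases):
            channel_lock.release()
        holder_ready.set()
        holder_exit.wait()             # stay alive, like a long-running monitor thread
        for _ in range(acquires - releases):
            channel_lock.release()     # clean up so the demo can finish

    thread = threading.Thread(target=holder)
    thread.start()
    holder_ready.wait()

    # The calling thread plays the role of a waiting pbx thread.
    got_it = channel_lock.acquire(timeout=timeout)
    if got_it:
        channel_lock.release()

    holder_exit.set()
    thread.join()
    return got_it
```

Under these assumptions, `waiter_can_lock(2, 2)` succeeds, while `waiter_can_lock(2, 1)` times out because one hold is still outstanding. In the real dump there is no timeout, so the waiting threads block indefinitely.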
Comments: | By: Gregory Hinton Nietsky (irroot) 2009-05-19 13:11:48
This seems strongly related to ASTERISK-14154, although the major versions differ. The case is circumstantial, but I feel quite strongly ...

By: Gregory Hinton Nietsky (irroot) 2009-05-21 09:53:37
OK, since this patch went live there have been no problems to report. This only affected a small percentage of customers; about 6 sites in total were affected. There have been no other issues related to this patch, which has been rolled out to about 80 installs directly managed by us. I would certainly recommend inclusion in 1.4.25. We adopted 1.4.24 late, hence the delay in picking up this problem.

By: Joshua C. Colp (jcolp) 2009-05-28 13:33:30
I've created a different change to fix this. The patch attached would have caused crashes under scenarios where they expected to be able to unlock the channel. I also noticed a double unlock and fixed that. Please give it a go.

By: Gregory Hinton Nietsky (irroot) 2009-05-28 14:13:18
Great, looks like it will do the job nicely, and removing that double unlock is a bonus. I'll be scheduling a rebuild and test, hopefully in the next few days. Thanks for the work.

By: Leif Madsen (lmadsen) 2009-06-08 11:03:42
irroot: anything to report back?

By: Gregory Hinton Nietsky (irroot) 2009-06-09 00:56:45
Sorry, not as of yet. I've been busy with a project I need to finish off, and I have to get to Lusaka, Zambia by the weekend [an Asterisk-based project], but this will hopefully be done before I depart. The original patch I put in is in production with no odd problems so far, but I'm keen to use the new patch as I have seen the "unlocking when unlocked" error. Greg

By: Loris Santamaria (loris) 2009-07-14 13:02:47
We've been having the same problem as irroot. Using 15151.diff seems to cure that problem, but it seems to have problems of its own.
Compiling Asterisk 1.4.25.1 with DONT OPTIMIZE, DEBUG THREADS, DEBUG LOCKS, making a call, and then generating a Call Pickup (INVITE with Replaces) leads to the following messages in the logs:

[Jul 14 13:23:13] ERROR[21679] /home/loris/rpmbuild/BUILD/asterisk-1.4.25.1/include/asterisk/lock.h: chan_sip.c line 7544 (transmit_state_notify): mutex '&chan->lock' freed more times than we've locked!
[Jul 14 13:23:13] ERROR[21679] /home/loris/rpmbuild/BUILD/asterisk-1.4.25.1/include/asterisk/lock.h: chan_sip.c line 7544 (transmit_state_notify): Error releasing mutex: Operation not permitted
[Jul 14 13:23:20] ERROR[21702] /home/loris/rpmbuild/BUILD/asterisk-1.4.25.1/include/asterisk/lock.h: chan_sip.c line 16367 (sipsock_read): mutex '(channel lock)' freed more times than we've locked!
[Jul 14 13:23:20] ERROR[21702] /home/loris/rpmbuild/BUILD/asterisk-1.4.25.1/include/asterisk/lock.h: chan_sip.c line 16367 (sipsock_read): Error releasing mutex: Operation not permitted
[Jul 14 13:23:20] ERROR[21702] /home/loris/rpmbuild/BUILD/asterisk-1.4.25.1/include/asterisk/lock.h: chan_sip.c line 3259 (__sip_destroy): Error destroying mutex &p->lock: Device or resource busy

and then the following lock is generated:

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 1081186624 (do_monitor started at [16636] chan_sip.c restart_monitor())
=== ---> Lock #0 (chan_sip.c): MUTEX 9336 get_sip_pvt_byid_locked &sip_pvt_ptr->lock 0x3f8aad0 (1)
=== -------------------------------------------------------------------
===
=======================================================================

If one removes the second hunk of the patch (the double unlock), the following messages are generated:

ERROR[10199] /usr/src/redhat/BUILD/asterisk-1.4.25.1/include/asterisk/lock.h: chan_sip.c line 16365 (sipsock_read): mutex '&p->owner->lock' freed more times than we've locked!
[Jul 14 11:14:58] ERROR[10199] /usr/src/redhat/BUILD/asterisk-1.4.25.1/include/asterisk/lock.h: chan_sip.c line 16365 (sipsock_read): Error releasing mutex: Operation not permitted

This time no lock builds up immediately, but after two or three hours of medium traffic the lock appears, and then it is enough to execute "show channels" to make Asterisk deadlock.

By: Digium Subversion (svnbot) 2009-09-17 16:31:25
Repository: asterisk
Revision: 219303
U branches/1.4/channels/chan_sip.c
------------------------------------------------------------------------
r219303 | dvossel | 2009-09-17 16:31:24 -0500 (Thu, 17 Sep 2009) | 21 lines
INVITE w/Replaces deadlock fix
This patch cleans up the locking logic in chan_sip.c's handle_invite_replaces() function as well as making use of ast_do_masquerade() rather than forcing the masquerade on an ast_read(). The code had several redundant unlocks that would result in 'freed more times than we've locked!' errors. I cleaned these up as well as moving all the unlock logic to the end of the function. This patch should also resolve the issue people were having with the replacecall channel never being unlocked with one legged calls.
(closes issue ASTERISK-14162)
Reported by: irroot
Patches: invite_w_replaces_1.4.diff uploaded by dvossel (license 671)
Tested by: irroot, dvossel
Review: https://reviewboard.asterisk.org/r/371/
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=219303

By: Digium Subversion (svnbot) 2009-09-17 17:01:09
Repository: asterisk
Revision: 219304
_U trunk/
U trunk/channels/chan_sip.c
------------------------------------------------------------------------
r219304 | dvossel | 2009-09-17 17:01:08 -0500 (Thu, 17 Sep 2009) | 27 lines
Merged revisions 219303 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r219303 | dvossel | 2009-09-17 16:29:37 -0500 (Thu, 17 Sep 2009) | 21 lines
INVITE w/Replaces deadlock fix
........
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=219304

By: Digium Subversion (svnbot) 2009-09-17 17:03:34
Repository: asterisk
Revision: 219305
_U branches/1.6.0/
U branches/1.6.0/channels/chan_sip.c
------------------------------------------------------------------------
r219305 | dvossel | 2009-09-17 17:03:33 -0500 (Thu, 17 Sep 2009) | 34 lines
Merged revisions 219304 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk
................
r219304 | dvossel | 2009-09-17 16:59:21 -0500 (Thu, 17 Sep 2009) | 27 lines
Merged revisions 219303 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r219303 | dvossel | 2009-09-17 16:29:37 -0500 (Thu, 17 Sep 2009) | 21 lines
INVITE w/Replaces deadlock fix
........
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=219305

By: Digium Subversion (svnbot) 2009-09-17 17:06:44
Repository: asterisk
Revision: 219306
_U branches/1.6.1/
U branches/1.6.1/channels/chan_sip.c
------------------------------------------------------------------------
r219306 | dvossel | 2009-09-17 17:06:44 -0500 (Thu, 17 Sep 2009) | 34 lines
Merged revisions 219304 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk
................
r219304 | dvossel | 2009-09-17 16:59:21 -0500 (Thu, 17 Sep 2009) | 27 lines
Merged revisions 219303 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r219303 | dvossel | 2009-09-17 16:29:37 -0500 (Thu, 17 Sep 2009) | 21 lines
INVITE w/Replaces deadlock fix
........
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=219306

By: Digium Subversion (svnbot) 2009-09-17 17:07:56
Repository: asterisk
Revision: 219307
_U branches/1.6.2/
U branches/1.6.2/channels/chan_sip.c
------------------------------------------------------------------------
r219307 | dvossel | 2009-09-17 17:07:56 -0500 (Thu, 17 Sep 2009) | 34 lines
Merged revisions 219304 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk
................
r219304 | dvossel | 2009-09-17 16:59:21 -0500 (Thu, 17 Sep 2009) | 27 lines
Merged revisions 219303 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4
........
r219303 | dvossel | 2009-09-17 16:29:37 -0500 (Thu, 17 Sep 2009) | 21 lines
INVITE w/Replaces deadlock fix
........
................
------------------------------------------------------------------------
http://svn.digium.com/view/asterisk?view=rev&revision=219307 |
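The r219303 commit message describes two changes to the locking: removing redundant unlocks (the source of the 'freed more times than we've locked!' errors) and moving all unlock logic to the end of the function. The Python sketch below illustrates that general pattern only, assuming a reentrant lock. It is not the actual patch, which is C code in handle_invite_replaces(); `handle_replace`, `is_free`, and `over_release_is_rejected` are hypothetical names introduced for this example.

```python
import threading


def handle_replace(target_lock, fail_early: bool) -> str:
    """Take the lock once and release it on a single exit path."""
    locked = False
    outcome = "error"
    try:
        target_lock.acquire()
        locked = True
        if not fail_early:
            outcome = "replaced"
    finally:
        # One cleanup point: release only what this call actually took,
        # so error paths can neither leak a hold nor unlock twice.
        if locked:
            target_lock.release()
    return outcome


def is_free(lock) -> bool:
    """Probe from a separate thread, since a reentrant lock never
    blocks the thread that already owns it."""
    result = []

    def probe() -> None:
        ok = lock.acquire(blocking=False)
        if ok:
            lock.release()
        result.append(ok)

    thread = threading.Thread(target=probe)
    thread.start()
    thread.join()
    return result[0]


def over_release_is_rejected() -> bool:
    """A release without a matching acquire raises RuntimeError,
    loosely analogous to the 'freed more times' log errors."""
    lock = threading.RLock()
    lock.acquire()
    lock.release()
    try:
        lock.release()   # one release too many
    except RuntimeError:
        return True
    return False
```

With every release funnelled through the one `finally` block, both the success path and the early-failure path leave the lock free for other threads, which is the property the waiting pbx threads in the original dump were missing.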