Summary: ASTERISK-18149: Dead lock in parking chan_sip handle_request_refer
Reporter: Michael Cramer (micc)
Labels:
Date Opened: 2011-07-18 13:35:04
Date Closed: 2011-10-03 15:44:10
Versions: Frequency of
Environment: CentOS 5.3 x32
Attachments:
( 0) coreshowlocks.txt
( 1) park_lock_fix.diff
( 2) park_lock_fix2.diff
Description: Had several deadlocks today in the same place it seems.
Comments:
By: Michael Cramer (micc) 2011-07-18 13:37:04.409-0500

Results of core show locks when it happened.

By: Gregory Hinton Nietsky (irroot) 2011-07-18 13:44:06.770-0500

Patch from RB 1322

By: Michael Cramer (micc) 2011-07-18 16:13:27.911-0500

Patch is bad, doesn't seem to solve the problem and causes another one:

lock.c:407 __ast_pthread_mutex_unlock: chan_sip.c line 24584 (handle_request_do): mutex 'p->owner' freed more times than we've locked!
lock.c:438 __ast_pthread_mutex_unlock: chan_sip.c line 24584 (handle_request_do): Error releasing mutex: Operation not permitted

By: Michael Cramer (micc) 2011-07-19 00:14:24.472-0500

I also noticed that the ref counts on the parking lots were not being decremented when a call was picked up from a parking spot. I think I found the problem and a fix: in features.c, around line 4577, there is a comment "XXX Why do we unlock here?". Instead of using ASTOBJ_UNLOCK I added ao2_ref(parkinglot, -1); and this seems to keep the count correct in parkedcalls show. I don't know whether this could cause a deadlock, but I suspect the deadlock still exists; I have not found a repro case yet. It only seems to happen under heavy load on my production server, which I can't test with anymore unless I'm sure the deadlock is fixed.
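
The ref count problem in this comment follows a common pattern: a lookup returns a counted reference and the caller never releases it. The sketch below is a toy model of that pattern, not the astobj2 implementation and not the code in features.c. The names struct lot, lot_find() and lot_unref() are hypothetical stand-ins for the parking lot object and for the ao2_ref(parkinglot, -1) call, and the counter is not thread-safe.

```c
#include <stddef.h>

/* Toy reference-counted object (hypothetical, single-threaded). */
struct lot {
    const char *name;
    int refcount;
};

/* Return a new reference to the lot; the caller must release it. */
struct lot *lot_find(struct lot *l)
{
    l->refcount++;
    return l;
}

/* Release a reference obtained from lot_find(). */
void lot_unref(struct lot *l)
{
    l->refcount--;
}

/* Picks up a call but forgets to release the reference it took. */
int pickup_leaky(struct lot *l)
{
    struct lot *ref = lot_find(l);
    (void) ref;              /* ... use the lot, never unref it ... */
    return l->refcount;      /* count has grown by one */
}

/* Picks up a call and releases the reference before returning. */
int pickup_balanced(struct lot *l)
{
    struct lot *ref = lot_find(l);
    /* ... use the lot ... */
    lot_unref(ref);
    return l->refcount;      /* count is back where it started */
}
```

With the leaky variant the count rises on every pickup, which matches a count that never goes back down.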

By: Gregory Hinton Nietsky (irroot) 2011-07-19 02:54:30.615-0500

That is not going to cause a deadlock, but it is an additional problem. I'm looking at the info you sent and have a better solution in the works.

By: Gregory Hinton Nietsky (irroot) 2011-07-20 09:54:40.025-0500

Keep it simple: a reworked patch that keeps the channel locked in the right order.
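
As a rough illustration of what locking "in the right order" involves, the sketch below assumes the order is channel first, then the private structure. A thread that already holds the private lock must not block on the channel lock; it tries the lock and backs off if the attempt fails. This is not the content of park_lock_fix2.diff or of any Asterisk commit. The names struct chan, struct pvt and lock_owner() are hypothetical, and object lifetime (the channel going away while the private lock is dropped) is ignored.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for a channel and its private structure. */
struct chan {
    pthread_mutex_t lock;
};

struct pvt {
    pthread_mutex_t lock;
    struct chan *owner;
};

/*
 * Call with p->lock held. Returns p->owner with its lock held, or NULL
 * if there is no owner. p->lock is held again on return in both cases.
 *
 * Blocking on the channel lock here would invert the assumed
 * channel-then-pvt order, so the lock is only tried. On failure the pvt
 * lock is dropped so that the thread holding the channel can proceed.
 */
struct chan *lock_owner(struct pvt *p)
{
    for (;;) {
        struct chan *c = p->owner;   /* re-read after every back-off */

        if (c == NULL) {
            return NULL;
        }
        if (pthread_mutex_trylock(&c->lock) == 0) {
            return c;                /* both locks held */
        }
        pthread_mutex_unlock(&p->lock);
        sched_yield();               /* let the other thread run */
        pthread_mutex_lock(&p->lock);
    }
}
```

The caller unlocks the returned channel exactly once when it is done. An extra unlock on a path where the channel was never locked would produce the "freed more times than we've locked" error quoted earlier in the thread.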

By: Gregory Hinton Nietsky (irroot) 2011-07-20 10:16:48.392-0500

Right one .... sorry folks

By: Gregory Hinton Nietsky (irroot) 2011-09-08 12:54:47.581-0500

Please see the following commit; I believe it includes this fix.

r331867 | dvossel | 2011-08-15 17:12:16 +0200 (Mon, 15 Aug 2011) | 6 lines

Fixes locking inversion issues present in the handling of the sip REFER method.

(closes issue ASTERISK-18082)
Reported by: James Van Vleet