Summary:ASTERISK-20708: Deadlock in chan_sip on transfer when trying to update redirecting information
Reporter:Mark Michelson (mmichelson)Labels:
Date Opened:2012-11-20 17:24:50.000-0600Date Closed:2012-12-11 18:04:45.000-0600
Versions:11.0.1 Frequency of
Environment:Attachments:( 0) ASTERISK-20708.patch
( 1) ASTERISK-20708-2.patch
( 2) ASTERISK-20708-3.patch
Description:Issue reported by marv (Tim Ringenbach) in #asterisk-dev

There is a code path that results in an incorrect locking order between a sip_pvt and its owner channel.

The problem is that the owner and pvt get unlocked, then the pvt gets relocked and calls change_redirecting_information(), which calls get_rdnis(), which then calls pbx_builtin_setvar_helper() on the pvt's owner.

A patch is attached that moves where the redirecting updates occur to a place where both the pvt and owner are unlocked so that we can safely lock them in the correct order.
Comments:By: Tim Ringenbach at Asteria Solutions Group (tim_ringenbach) 2012-11-21 11:36:07.790-0600

I found a problem with the patch.

The last sip_pvt_unlock(p) on the original line 25881 needs to be there.

Otherwise, we first lock current.chan1, violating the locking order, but then we go on to lock p again, and return p double locked instead of just locked.

By: Tim Ringenbach at Asteria Solutions Group (tim_ringenbach) 2012-11-21 11:57:58.499-0600

I think this is correct, but I'm still in the process of testing it.

By: Mark Michelson (mmichelson) 2012-11-26 09:05:26.782-0600

Good catch. The sip_pvt_unlock() location you added is not quite correct though. If current.chan2 is NULL, then we'll end up returning with p unlocked instead of locked. I believe the proper place to put the sip_pvt_unlock() is just below where the SIP_DEFER_BYE_ON_TRANSFER flag is set.

By: Mark Michelson (mmichelson) 2012-11-26 09:06:45.195-0600

I've uploaded ASTERISK-20708-3.patch.

That should do the trick.

By: Tim Ringenbach at Asteria Solutions Group (tim_ringenbach) 2012-11-29 15:16:02.414-0600

That patch seems good. I put it on a customer's box who was running into the deadlock, and asterisk has been stable ever since.