[Home]

Summary:ASTERISK-12331: [patch] ast_iax2_new() fails to account for a bad situation, deals with another incorrectly
Reporter:Peter Grayson (jpgrayson)Labels:
Date Opened:2008-07-07 15:05:04Date Closed:2008-07-07 17:34:26
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Channels/chan_iax2
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) chan_iax2_ast_iax2_new.patch
( 1) chan_iax2_ast_iax2_new2.patch
Description:ast_iax2_new() does a tricky dance when allocating an ast_channel. There are two related problems in ast_iax2_new() regarding this channel allocation.

For reference, here is the offending code:

ast_mutex_unlock(&iaxsl[callno]);
tmp = ast_channel_alloc(1, state, ...);
ast_mutex_lock(&iaxsl[callno]);
if (!iaxs[callno]) {
if (tmp) {
ast_channel_free(tmp);
}
ast_mutex_unlock(&iaxsl[callno]);
return NULL;
}

The first and most obvious problem in the current implementation is that there is a case we will unlock callno's mutex prior to returning NULL. This breaks the protocol for ast_iax2_new() -- users of this function expect iaxsl[callno] to _always_ be returned locked. When this failure case is hit, callers of ast_iax2_new() will try to unlock an already unlocked mutex. Whoops.

The second problem is that this test is incomplete. This code only tests to see if the chan_iax2_pvt object for callno has become null. A better test is to check if iaxs[callno] has changed (e.g. i != iaxs[callno]). There are real-world cases where asterisk is being bombarded with new calls that this callno may not only be destroyed, but destroyed and reused while ast_channel_alloc() is being called.
Comments:By: Peter Grayson (jpgrayson) 2008-07-07 15:10:43

This patch addresses to the two issues outlined and adds a warning message when this relatively rare event happens.

By: Peter Grayson (jpgrayson) 2008-07-07 15:12:46

Another thing to note is that when we originally fixed this problem, we would unlock iaxsl[callno] prior to calling ast_channel_free() and relock after. I could not figure out the rationale for doing that, so I left it out of the patch.

By: Russell Bryant (russell) 2008-07-07 16:14:46

Nice catch.  A couple of comments before I commit the patch ...

1) With regards to the check for non-NULL, I don't think it's really possible for the pvt != iaxs[callno] case to occur.  There is code in the channel driver to not reuse a call number for 60 seconds.  I do agree that it's an assumption that probably shouldn't be made all over the place, but that is why it doesn't actually cause a problem.

2) The reason that the pvt is unlocked before calling ast_channel_free() has to do with locking order.  You must lock the ast_channel before the pvt.  That code path would have the channel locked after the pvt, and can cause a deadlock.  So, that change is necessary.

By: Peter Grayson (jpgrayson) 2008-07-07 16:52:36

I've uploaded a second version of the patch where we unlock/relock around the ast_channel_free() call. Russel's explanation makes sense.

As for (1), it makes sense that the (i != iaxs[callno]) part of the test can no longer be true. We originally fixed this problem against 1.4.10. I believe that version did not have the 60 second guarantee.

To be pedantic, I left the test unchanged from the original patch anyway. I wouldn't object to someone changing that piece of the patch though.

By: Digium Subversion (svnbot) 2008-07-07 17:34:23

Repository: asterisk
Revision: 128795

U   branches/1.4/channels/chan_iax2.c

------------------------------------------------------------------------
r128795 | russell | 2008-07-07 17:34:20 -0500 (Mon, 07 Jul 2008) | 8 lines

Fix handling of when a pvt disappears.  Properly return the pvt locked
and don't hold the pvt lock while destroying the ast_channel.

(closes issue ASTERISK-12331)
Reported by: jpgrayson
Patches:
     chan_iax2_ast_iax2_new2.patch uploaded by jpgrayson (license 492)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=128795