[Home]

Summary:ASTERISK-16118: [patch] Redirecting ;1 side of local channel during optimisation causes double free of ;1 side and crash
Reporter:David Woolley (davidw)Labels:
Date Opened:2010-05-19 11:17:44Date Closed:2010-09-20 18:42:57
Priority:CriticalRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) redirectoptimise-config.diff.txt
( 1) redirectoptimise-exacerbate-race.diff.txt
( 2) redirectoptimise-full.log
( 3) redirectoptimise-minimalfix.diff.txt
Description:If the ;1 side of a local channel is redirected between the ast_channel_masquerade call and the ast_do_masquerade call resulting from the channel being answered and optimised, the ;1 side gets double freed and, without MALLOC_DEBUG, free() calls abort(), crashing Asterisk.

Scenario.  With MALLOC_DEBUG enabled, use ChannelRedirect on the ;q side of a local channel marginally after the ;2 side has been answered.

Expect. Redirect fails gracefully and optimisation completes.

Actual.  Sometimes the original ;1 side channel structure is freed twice.  (With MALLOC_DEBUG not enabled, but using 1.6.1.0, free() calls abort() and crashes Asterisk.)


****** ADDITIONAL INFORMATION ******

Firstly, as this involves a race condition, I patched app_channelredirect.c to delay until my presumed pre-condition for the crash had happened.  As this is just a delay in asynchronous processing, and the same timing can (and we believe did, in the wild) happen by chance, I do not believe this invalidates this report.

Then, taking make samples as a baseline, I added the following dialplan:

exten => 6999,1,Answer
exten => 6999,2,Dial(SIP/6105&Local/Redirector@default)
exten => 6998,1,Set(__FIRSTCHAN=${CHANNEL})
exten => 6998,n,Dial(Local/6999@default)
exten => Redirector,1,ChannelRedirect(${IMPORT(${FIRSTCHAN},BRIDGEPEER)},default,600,1)

and called extension 6998.  Asterisk reported an invalid free and probably bad hangup:

WARNING: Freeing unused memory at 0xd12c490, in ast_channel_free of channel.c, line 1435
[May 19 14:45:39] WARNING[21343]: channel.c:1434 ast_channel_free: Channel 'SIP/6105-00000013' may not have been hung up properly.

Currently ast_channel_masquerade checks for ->masqr already being set on the original side of the masquerade.  I believe this problem can be avoided if it also fails the masquerade if ->masq is set.  I would speculate that both should be checked on the clone side as well.  However masquerading is a complex area, and I cannot be sure about collateral damage.  Once this is submitted, I will try the check for ->masq, although as the diagnostic patch ensures that this condition almost certainly applies, I am pretty sure that it will help.

Note that it is still possible for the optimisation masquerade to complete in this model; the model just tries to maximise the chance of failure, not guarantee it.

In the real world case, we were running 1.6.1.0, did not have MALLOC_DEBUG set and were using AMI Redirect, rather than ChannelRedirect.  We got a crash after a few weeks of live running.

More details, particularly of the real world case, can be found on asterisk-dev: http://lists.digium.com/pipermail/asterisk-dev/2010-May/044096.html

The trunk version uses reference counting, so probably wouldn't try a double free, although is still likely to be confused.

Attachments to follow with the diagnostic patch, full log and configurations relative to make samples.

This is the third attempt to get this through Mantis!
Comments:By: David Woolley (davidw) 2010-05-19 11:23:43

Could someone please remove the [patch] from the subject, as the patch uploaded is for diagnostic purposes and deliberately introduces a bug!

By: David Woolley (davidw) 2010-05-20 05:52:53

Even with MALLOC_DEBUG there can be a subsequent crash.  I'm treating that as secondary and not analysing it.

As a measure of reproducibility on my test rig, I got 3 out of the first eight attempts ending in the double free.  Now to try the fix.

By: David Woolley (davidw) 2010-05-20 06:31:56

I think I may have got my original and clone terminology crossed.  However, with the patch that I have just uploaded, I now get this (reasonable) message if the race window is fully met:

[May 20 12:18:02] WARNING[5970]: channel.c:4303 ast_channel_masquerade: SIP/6105-00000007 is already going to masquerade as Local/6999@default-607c;1

Both with and without the patch, one gets this message if the window is over-shot:

[May 20 12:17:57] WARNING[5967]: app_channelredirect.c:92 asyncgoto_exec: No such channel: Local/6999@default-8c5d;1

Whilst I suspect that all four combinations of original/clonechan and masq/masqr should be faulted, the remaining case doesn't apply for my problem, so I cannot really exercise it and have left it out of the patch.

By: Digium Subversion (svnbot) 2010-09-20 17:57:49

Repository: asterisk
Revision: 287682

U   branches/1.4/main/channel.c

------------------------------------------------------------------------
r287682 | alecdavis | 2010-09-20 17:57:49 -0500 (Mon, 20 Sep 2010) | 18 lines

ast_channel_masquerade: Avoid recursive masquerades.

Check all 4 combinations of (original/clonechan) * (masq/masqr).

Initially original->masq and clonechan->masqr were only checked.

It's possible with multiple masq's planned - and not yet executed, that
the 'original' chan could already have another masq'd into it - thus original->masqr
would be set, that masqr would lost.
Likewise for the clonechan->masq.

(closes issue ASTERISK-14975;ASTERISK-16118)
Reported by: amorsen;davidw,alecdavis
Patches:
     bug16057.diff4.txt uploaded by alecdavis (license 585)
Tested by: ramonpeek, davidw, alecdavis


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=287682

By: Digium Subversion (svnbot) 2010-09-20 18:15:29

Repository: asterisk
Revision: 287684

U   branches/1.4/main/channel.c

------------------------------------------------------------------------
r287684 | alecdavis | 2010-09-20 18:15:26 -0500 (Mon, 20 Sep 2010) | 10 lines

ast_channel_masquerade: remove extra else if

(closes issue ASTERISK-16118,ASTERISK-14975)

Reported by: amorsen;davidw,alecdavis
Patches:
     based on bug16057.diff4.txt uploaded by alecdavis (license 585)
Tested by: ramonpeek, davidw, alecdavis


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=287684

By: Digium Subversion (svnbot) 2010-09-20 18:16:46

Repository: asterisk
Revision: 287685

U   branches/1.6.2/main/channel.c

------------------------------------------------------------------------
r287685 | alecdavis | 2010-09-20 18:16:45 -0500 (Mon, 20 Sep 2010) | 18 lines

ast_channel_masquerade: Avoid recursive masquerades.

Check all 4 combinations of (original/clonechan) * (masq/masqr).

Initially original->masq and clonechan->masqr were only checked.

It's possible with multiple masq's planned - and not yet executed, that
the 'original' chan could already have another masq'd into it - thus original->masqr
would be set, that masqr would lost.
Likewise for the clonechan->masq.

(closes issue ASTERISK-14975;ASTERISK-16118)
Reported by: amorsen;davidw,alecdavis
Patches:
     based on bug16057.diff4.txt uploaded by alecdavis (license 585)
Tested by: ramonpeek, davidw, alecdavis


------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=287685

By: Digium Subversion (svnbot) 2010-09-20 18:20:06

Repository: asterisk
Revision: 287701

_U  branches/1.8/
U   branches/1.8/main/channel.c

------------------------------------------------------------------------
r287701 | alecdavis | 2010-09-20 18:20:06 -0500 (Mon, 20 Sep 2010) | 24 lines

Merged revisions 287685 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

........
 r287685 | alecdavis | 2010-09-21 11:16:45 +1200 (Tue, 21 Sep 2010) | 18 lines
 
 ast_channel_masquerade: Avoid recursive masquerades.
 
 Check all 4 combinations of (original/clonechan) * (masq/masqr).
 
 Initially original->masq and clonechan->masqr were only checked.
 
 It's possible with multiple masq's planned - and not yet executed, that
  the 'original' chan could already have another masq'd into it - thus original->masqr
 would be set, that masqr would lost.
 Likewise for the clonechan->masq.
 
 (closes issue ASTERISK-14975;ASTERISK-16118)
 Reported by: amorsen;davidw,alecdavis
 Patches:
       based on bug16057.diff4.txt uploaded by alecdavis (license 585)
 Tested by: ramonpeek, davidw, alecdavis
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=287701

By: Digium Subversion (svnbot) 2010-09-20 18:42:57

Repository: asterisk
Revision: 287756

_U  trunk/
U   trunk/main/channel.c

------------------------------------------------------------------------
r287756 | alecdavis | 2010-09-20 18:42:57 -0500 (Mon, 20 Sep 2010) | 24 lines

Merged revisions 287685 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

........
 r287685 | alecdavis | 2010-09-21 11:16:45 +1200 (Tue, 21 Sep 2010) | 18 lines
 
 ast_channel_masquerade: Avoid recursive masquerades.
 
 Check all 4 combinations of (original/clonechan) * (masq/masqr).
 
 Initially original->masq and clonechan->masqr were only checked.
 
 It's possible with multiple masq's planned - and not yet executed, that
  the 'original' chan could already have another masq'd into it - thus original->masqr
 would be set, that masqr would lost.
 Likewise for the clonechan->masq.
 
 (closes issue ASTERISK-14975;ASTERISK-16118)
 Reported by: amorsen;davidw,alecdavis
 Patches:
       based on bug16057.diff4.txt uploaded by alecdavis (license 585)
 Tested by: ramonpeek, davidw, alecdavis
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=287756