Summary: | ASTERISK-16118: [patch] Redirecting ;1 side of local channel during optimisation causes double free of ;1 side and crash | ||
Reporter: | David Woolley (davidw) | Labels: | |
Date Opened: | 2010-05-19 11:17:44 | Date Closed: | 2010-09-20 18:42:57 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Core/General |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) redirectoptimise-config.diff.txt ( 1) redirectoptimise-exacerbate-race.diff.txt ( 2) redirectoptimise-full.log ( 3) redirectoptimise-minimalfix.diff.txt | |
Description: | If the ;1 side of a local channel is redirected between the ast_channel_masquerade call and the ast_do_masquerade call resulting from the channel being answered and optimised, the ;1 side gets double freed and, without MALLOC_DEBUG, free() calls abort(), crashing Asterisk. Scenario. With MALLOC_DEBUG enabled, use ChannelRedirect on the ;q side of a local channel marginally after the ;2 side has been answered. Expect. Redirect fails gracefully and optimisation completes. Actual. Sometimes the original ;1 side channel structure is freed twice. (With MALLOC_DEBUG not enabled, but using 1.6.1.0, free() calls abort() and crashes Asterisk.) ****** ADDITIONAL INFORMATION ****** Firstly, as this involves a race condition, I patched app_channelredirect.c to delay until my presumed pre-condition for the crash had happened. As this is just a delay in asynchronous processing, and the same timing can (and we believe did, in the wild) happen by chance, I do not believe this invalidates this report. Then, taking make samples as a baseline, I added the following dialplan: exten => 6999,1,Answer exten => 6999,2,Dial(SIP/6105&Local/Redirector@default) exten => 6998,1,Set(__FIRSTCHAN=${CHANNEL}) exten => 6998,n,Dial(Local/6999@default) exten => Redirector,1,ChannelRedirect(${IMPORT(${FIRSTCHAN},BRIDGEPEER)},default,600,1) and called extension 6998. Asterisk reported an invalid free and probably bad hangup: WARNING: Freeing unused memory at 0xd12c490, in ast_channel_free of channel.c, line 1435 [May 19 14:45:39] WARNING[21343]: channel.c:1434 ast_channel_free: Channel 'SIP/6105-00000013' may not have been hung up properly. Currently ast_channel_masquerade checks for ->masqr already being set on the original side of the masquerade. I believe this problem can be avoided if it also fails the masquerade if ->masq is set. I would speculate that both should be checked on the clone side as well. However masquerading is a complex area, and I cannot be sure about collateral damage. Once this is submitted, I will try the check for ->masq, although as the diagnostic patch ensures that this condition almost certainly applies, I am pretty sure that it will help. Note that it is still possible for the optimisation masquerade to complete in this model; the model just tries to maximise the chance of failure, not guarantee it. In the real world case, we were running 1.6.1.0, did not have MALLOC_DEBUG set and were using AMI Redirect, rather than ChannelRedirect. We got a crash after a few weeks of live running. More details, particularly of the real world case, can be found on asterisk-dev: http://lists.digium.com/pipermail/asterisk-dev/2010-May/044096.html The trunk version uses reference counting, so probably wouldn't try a double free, although is still likely to be confused. Attachments to follow with the diagnostic patch, full log and configurations relative to make samples. This is the third attempt to get this through Mantis! | ||
Comments: | By: David Woolley (davidw) 2010-05-19 11:23:43 Could someone please remove the [patch] from the subject, as the patch uploaded is for diagnostic purposes and deliberately introduces a bug! By: David Woolley (davidw) 2010-05-20 05:52:53 Even with MALLOC_DEBUG there can be a subsequent crash. I'm treating that as secondary and not analysing it. As a measure of reproducibility on my test rig, I got 3 out of the first eight attempts ending in the double free. Now to try the fix. By: David Woolley (davidw) 2010-05-20 06:31:56 I think I may have got my original and clone terminology crossed. However, with the patch that I have just uploaded, I now get this (reasonable) message if the race window is fully met: [May 20 12:18:02] WARNING[5970]: channel.c:4303 ast_channel_masquerade: SIP/6105-00000007 is already going to masquerade as Local/6999@default-607c;1 Both with and without the patch, one gets this message if the window is over-shot: [May 20 12:17:57] WARNING[5967]: app_channelredirect.c:92 asyncgoto_exec: No such channel: Local/6999@default-8c5d;1 Whilst I suspect that all four combinations of original/clonechan and masq/masqr should be faulted, the remaining case doesn't apply for my problem, so I cannot really exercise it and have left it out of the patch. By: Digium Subversion (svnbot) 2010-09-20 17:57:49 Repository: asterisk Revision: 287682 U branches/1.4/main/channel.c ------------------------------------------------------------------------ r287682 | alecdavis | 2010-09-20 17:57:49 -0500 (Mon, 20 Sep 2010) | 18 lines ast_channel_masquerade: Avoid recursive masquerades. Check all 4 combinations of (original/clonechan) * (masq/masqr). Initially original->masq and clonechan->masqr were only checked. It's possible with multiple masq's planned - and not yet executed, that the 'original' chan could already have another masq'd into it - thus original->masqr would be set, that masqr would lost. Likewise for the clonechan->masq. (closes issue ASTERISK-14975;ASTERISK-16118) Reported by: amorsen;davidw,alecdavis Patches: bug16057.diff4.txt uploaded by alecdavis (license 585) Tested by: ramonpeek, davidw, alecdavis ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=287682 By: Digium Subversion (svnbot) 2010-09-20 18:15:29 Repository: asterisk Revision: 287684 U branches/1.4/main/channel.c ------------------------------------------------------------------------ r287684 | alecdavis | 2010-09-20 18:15:26 -0500 (Mon, 20 Sep 2010) | 10 lines ast_channel_masquerade: remove extra else if (closes issue ASTERISK-16118,ASTERISK-14975) Reported by: amorsen;davidw,alecdavis Patches: based on bug16057.diff4.txt uploaded by alecdavis (license 585) Tested by: ramonpeek, davidw, alecdavis ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=287684 By: Digium Subversion (svnbot) 2010-09-20 18:16:46 Repository: asterisk Revision: 287685 U branches/1.6.2/main/channel.c ------------------------------------------------------------------------ r287685 | alecdavis | 2010-09-20 18:16:45 -0500 (Mon, 20 Sep 2010) | 18 lines ast_channel_masquerade: Avoid recursive masquerades. Check all 4 combinations of (original/clonechan) * (masq/masqr). Initially original->masq and clonechan->masqr were only checked. It's possible with multiple masq's planned - and not yet executed, that the 'original' chan could already have another masq'd into it - thus original->masqr would be set, that masqr would lost. Likewise for the clonechan->masq. (closes issue ASTERISK-14975;ASTERISK-16118) Reported by: amorsen;davidw,alecdavis Patches: based on bug16057.diff4.txt uploaded by alecdavis (license 585) Tested by: ramonpeek, davidw, alecdavis ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=287685 By: Digium Subversion (svnbot) 2010-09-20 18:20:06 Repository: asterisk Revision: 287701 _U branches/1.8/ U branches/1.8/main/channel.c ------------------------------------------------------------------------ r287701 | alecdavis | 2010-09-20 18:20:06 -0500 (Mon, 20 Sep 2010) | 24 lines Merged revisions 287685 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ........ r287685 | alecdavis | 2010-09-21 11:16:45 +1200 (Tue, 21 Sep 2010) | 18 lines ast_channel_masquerade: Avoid recursive masquerades. Check all 4 combinations of (original/clonechan) * (masq/masqr). Initially original->masq and clonechan->masqr were only checked. It's possible with multiple masq's planned - and not yet executed, that the 'original' chan could already have another masq'd into it - thus original->masqr would be set, that masqr would lost. Likewise for the clonechan->masq. (closes issue ASTERISK-14975;ASTERISK-16118) Reported by: amorsen;davidw,alecdavis Patches: based on bug16057.diff4.txt uploaded by alecdavis (license 585) Tested by: ramonpeek, davidw, alecdavis ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=287701 By: Digium Subversion (svnbot) 2010-09-20 18:42:57 Repository: asterisk Revision: 287756 _U trunk/ U trunk/main/channel.c ------------------------------------------------------------------------ r287756 | alecdavis | 2010-09-20 18:42:57 -0500 (Mon, 20 Sep 2010) | 24 lines Merged revisions 287685 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ........ r287685 | alecdavis | 2010-09-21 11:16:45 +1200 (Tue, 21 Sep 2010) | 18 lines ast_channel_masquerade: Avoid recursive masquerades. Check all 4 combinations of (original/clonechan) * (masq/masqr). Initially original->masq and clonechan->masqr were only checked. It's possible with multiple masq's planned - and not yet executed, that the 'original' chan could already have another masq'd into it - thus original->masqr would be set, that masqr would lost. Likewise for the clonechan->masq. (closes issue ASTERISK-14975;ASTERISK-16118) Reported by: amorsen;davidw,alecdavis Patches: based on bug16057.diff4.txt uploaded by alecdavis (license 585) Tested by: ramonpeek, davidw, alecdavis ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=287756 |