[Home]

Summary:ASTERISK-12632: [patch] Device Side transfer of a call between 2 extensions leads to failure because MACRO_DEPTH is not reset
Reporter:Philippe Lindheimer (p_lindheimer)Labels:
Date Opened:2008-08-22 13:24:08Date Closed:2009-08-23 09:38:53
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Applications/app_macro
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 20080825__bug13363.diff.txt
( 1) 20081209__bug13363.diff.txt
( 2) 20081216__bug13363.diff.txt
( 3) app_macro.2.c.patch
( 4) app_macro.c.patch
Description:Here is the scenario. You have 2 levels of macro depth to make a call such that upon successfully creating the call, the MACRO_DEPTH in the channel is set to 2. You now use the SIP phone's 'transfer' key to transfer the call to another extension and then begin repeating the transfers between the 2 extensions. The MACRO_DEPTH increments by 2 each transfer until Asterisk eventually terminates the call with:

[Aug 22 09:57:51] ERROR[9446] app_macro.c: Macro():  possible infinite loop detected.  Returning early.
[Aug 22 09:57:51] DEBUG[9446] app_macro.c: Executed application: Macro

The issue seems to be seen in the following trace. The following is a snapshot of the log where the call being transfered exits back out of MoH and seeks the new extension it was transfered to (which results to the macro call to exten-vm,700,700):

[Aug 22 09:55:53] VERBOSE[9446] logger.c:     -- Stopped music on hold on SIP/224-09850350
[Aug 22 09:55:53] VERBOSE[9446] logger.c:   == Spawn extension (from-internal-xfer, 700, 0) exited non-zero on 'SIP/224-09850350' in macro 'dial'
[Aug 22 09:55:53] VERBOSE[9446] logger.c:   == Spawn extension (from-internal-xfer, 700, 0) exited non-zero on 'SIP/224-09850350' in macro 'dial-confirm'
[Aug 22 09:55:53] VERBOSE[9446] logger.c:   == Spawn extension (from-internal-xfer, 700, 0) exited non-zero on 'SIP/224-09850350'
[Aug 22 09:55:53] VERBOSE[9446] logger.c:     -- Executing [700@from-internal-xfer:1] Macro("SIP/224-09850350", "exten-vm|700|700") in new stack

So as can be seen in that example, the macros are exited. The bug seems to be that the channel is considered dead, so the MACRO_DEPTH is never decremented. This is because of the following code segment in app_maro.c:

 if (!dead) {
   pbx_builtin_setvar_helper(chan, "MACRO_DEPTH", depthc);
   ast_set2_flag(chan, autoloopflag, AST_FLAG_IN_AUTOLOOP);
 }

Since the channel is marked as dead in the code above this, the MACRO_DEPTH does not get decremented even though the trace shows that the macros are exiting.

The patch below moves the call to pbx-builtin_settvar_helper above the if so that it always gets called. Since I am not intimately familiar with this code, I  don't know if this is safe or the right way to do it. I can say that testing this code removes the issue and I can transfer back and forth continually.

ANOTHER IMPORTANT COMMENT: If I run the same test, but instead of using the device side transfer I use the Asterisk '#' transfer the issue goes away.






****** STEPS TO REPRODUCE ******

Either use the FreePBX dialplan, or otherwise create a dialplan that uses a couple of levels of macros to do the call. In the case of FreePBX:

macro-exten-vm is called with the extension to dial
macro-dial does the dial

In the following example, phone 222 is called by dialing extension 222, and so forth.

phone 222 calls extension 224 which answers the call
phone 222 does a blind transfer, using the phone's transfer key, to extension 700
phone 700 answers the call, and then does a blind transfer to extension 222 using the phone's transfer key
phone 222 answers the call, and then does a blind transfer to extension 700 using the phone's transfer key

Eventually, (after about 3 transfers) the error occurs.

Note - in the specific test, 222 is an Aastra 55i, 700 is a Polycom 601 and 224 is a Mitel Navigator SIP phone. However, the issue has been repeated with various softphones by others and does not appear phone related other than the fact that it happens when using the device side transfer.

****** ADDITIONAL INFORMATION ******

*** Code Removed ***
Comments:By: Sean Bright (seanbright) 2008-08-22 13:30:15

Please post all code as attachments

By: Philippe Lindheimer (p_lindheimer) 2008-08-22 13:39:03

it was mentioned in an IRC chat that the following debug message was not very valuable:

[Aug 22 09:55:53] VERBOSE[9446] logger.c: == Spawn extension (from-internal-xfer, 700, 0) exited non-zero on 'SIP/224-09850350' in macro 'dial'

I had modified it during my own testing to print out the value of res formated as %d in the same verbose and debug statements. The result in all cases was -1

By: Philippe Lindheimer (p_lindheimer) 2008-08-23 02:10:36

another issue related to this. Should we also be clearing all the old ARGn macro arguments upon returning. If we don't, they end up getting left in the channel for the transfered call which could create an issue in some cases. Example:

Last macro to run had ARG1-ARG3 passed to it. The call is transfered and because of this issue, ARG1 - ARG3 are left with stale values.

The new call hits a new macro to process it which takes ARG1, ARG2 and an OPTIONAL ARG3. The specific extension has the optional ARG3 blank. But the macro that is called has a construct that looks like:

if ("${ARG3}" != "") {do something}

If this channel were properly initiated, the ARG3 would be blank, instead, it has a 'random' stale value in it and the macro does something that it shouldn't...

This same issue may or may not be a problem with some of the other leftover macro channel variables as well that I'm not thinking of. In any event, a modified patch is being attached that also makes sure the ARGn variables are processed normally whether the channel is marked as 'dead' or not.

By: Tilghman Lesher (tilghman) 2008-08-25 11:33:07

Well, the issue with setting variables on a dead channel is that a channel may be in some state of deallocation when that logic is called, and so we check for a dead channel before decrementing the MACRO_DEPTH, to avoid a possible memory leak and/or possibility of a crash.

One way to solve this might be to set the macro depth value in a channel datastore, and set the channel datastore to prohibit inheritance.  However, I can't see a good way to get around the ARGn situation, without changing the semantics of ARGn syntax.  Basically, I'd look into making macro arguments accessible via a dialplan function instead of directly as variables.  Another possibility would be to always check for ARG{n+1} and pushvar a blank, if the value exists.

By: Philippe Lindheimer (p_lindheimer) 2008-08-25 11:45:01

Is there any other data to determine if it is specifically in this unique dead state, vs. any dead state, and is it safe in this unique dead state (since I realized my patch doesn't differentiate and that was my concern with my patch, that it may not always be safe). The value of 'ret' is -1 so that doesn't seem to differentiate it.

By: Tilghman Lesher (tilghman) 2008-08-25 16:38:53

Something along these lines.  This still needs some testing, however.

By: Philippe Lindheimer (p_lindheimer) 2008-08-25 23:14:36

applied the patch you supplied, ran a test, system hung up after the second transfer:

[Aug 25 21:18:51] ERROR[32642] app_macro.c: Macro():  possible infinite loop detected.  Returning early.

so for now put my unsafe patch back.

By: Philippe Lindheimer (p_lindheimer) 2008-09-30 00:20:40

ping - in case it was missed, I tested the supplied patch and the problem still persisted. If there is any additional data that I can provide or a new patch I would be happy to help. It would be nice to see this one closed - macros are in widespread use and this type of failure is the type that can give Asterisk a bad name.

By: Jimbo (digmaster) 2008-11-28 05:26:49.000-0600

I'm also having this problem, my users reported that the call is dropped when they transfer it.

Apparently, this happens quite frequently here because the secretary does a first transfer and the desk does a second transfer. Eventually, any additional transfers are killed by asterisk, which means no employee can transfer an incoming call.

The log shows loads of these:
app_macro.c: Macro():  possible infinite loop detected.  Returning early.

By: Alex Loh (alex728) 2008-12-09 05:36:21.000-0600

I've been able to reproduce this, its after 3 transfers the next transfer fails due to the hardcoded maxdepth of 7 in macro.c - as MACRO_DEPTH is being incremented by 2 each time...

That said I suspect few businesses would transfer a caller more than 5 or so times (I hope!)

As a workaround, would increasing this maxdepth figure to something like 15 lead to any risks? (I appreciate the warning earlier in the code and that this whole bit of code was obviously put there for a good reason..)

Unfortunately I won't be able to test this myself for a few weeks or I would have done so already but it came to mind as a "quick and dirty" workaround (not for a production machine of course!)

By: Tilghman Lesher (tilghman) 2008-12-09 15:22:00.000-0600

Let's try a slightly different approach, killing MACRO_DEPTH on masquerade.

By: Tilghman Lesher (tilghman) 2008-12-13 18:36:57.000-0600

In case that last note wasn't clear, there is now a new patch up on this issue, for testing.

By: Digium Subversion (svnbot) 2008-12-17 15:14:56.000-0600

Repository: asterisk
Revision: 165317

U   branches/1.4/apps/app_macro.c

------------------------------------------------------------------------
r165317 | tilghman | 2008-12-17 15:14:53 -0600 (Wed, 17 Dec 2008) | 4 lines

Reverse the fix from issue ASTERISK-6019 and add proper handling for that issue.
(Closes issue ASTERISK-13115, closes issue ASTERISK-12632)
Fixed by myself (license 14)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=165317

By: Digium Subversion (svnbot) 2008-12-17 15:18:54.000-0600

Repository: asterisk
Revision: 165319

_U  trunk/
U   trunk/apps/app_macro.c

------------------------------------------------------------------------
r165319 | tilghman | 2008-12-17 15:18:52 -0600 (Wed, 17 Dec 2008) | 11 lines

Merged revisions 165317 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r165317 | tilghman | 2008-12-17 15:14:37 -0600 (Wed, 17 Dec 2008) | 4 lines
 
 Reverse the fix from issue ASTERISK-6019 and add proper handling for that issue.
 (Closes issue ASTERISK-13115, closes issue ASTERISK-12632)
 Fixed by myself (license 14)
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=165319

By: Digium Subversion (svnbot) 2008-12-17 15:21:16.000-0600

Repository: asterisk
Revision: 165321

_U  branches/1.6.0/
U   branches/1.6.0/apps/app_macro.c

------------------------------------------------------------------------
r165321 | tilghman | 2008-12-17 15:21:14 -0600 (Wed, 17 Dec 2008) | 18 lines

Merged revisions 165319 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r165319 | tilghman | 2008-12-17 15:18:57 -0600 (Wed, 17 Dec 2008) | 11 lines
 
 Merged revisions 165317 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r165317 | tilghman | 2008-12-17 15:14:37 -0600 (Wed, 17 Dec 2008) | 4 lines
   
   Reverse the fix from issue ASTERISK-6019 and add proper handling for that issue.
   (Closes issue ASTERISK-13115, closes issue ASTERISK-12632)
   Fixed by myself (license 14)
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=165321

By: Digium Subversion (svnbot) 2008-12-17 15:22:35.000-0600

Repository: asterisk
Revision: 165323

_U  branches/1.6.1/
U   branches/1.6.1/apps/app_macro.c

------------------------------------------------------------------------
r165323 | tilghman | 2008-12-17 15:22:33 -0600 (Wed, 17 Dec 2008) | 18 lines

Merged revisions 165319 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
 r165319 | tilghman | 2008-12-17 15:18:57 -0600 (Wed, 17 Dec 2008) | 11 lines
 
 Merged revisions 165317 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r165317 | tilghman | 2008-12-17 15:14:37 -0600 (Wed, 17 Dec 2008) | 4 lines
   
   Reverse the fix from issue ASTERISK-6019 and add proper handling for that issue.
   (Closes issue ASTERISK-13115, closes issue ASTERISK-12632)
   Fixed by myself (license 14)
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=165323

By: Digium Subversion (svnbot) 2009-02-11 14:54:20.000-0600

Repository: asterisk
Revision: 174885

U   branches/1.4/apps/app_macro.c
U   branches/1.4/main/pbx.c

------------------------------------------------------------------------
r174885 | tilghman | 2009-02-11 14:54:19 -0600 (Wed, 11 Feb 2009) | 13 lines

Restore a behavior that was recently changed, when we fixed issue ASTERISK-13115 and
issue ASTERISK-12632 (related to issue ASTERISK-6019).  When a hangup occurs during a Macro
execution in earlier 1.4, the h extension would execute within the Macro
context, whereas it was always supposed to execute only within the main context
(where Macro was called).  So this fix checks for an "h" extension in the
deepest macro context where a hangup occurred; if it exists, that "h" extension
executes, otherwise the main context "h" is executed.
(closes issue ASTERISK-13256)
Reported by: wetwired
Patches:
      20090210__bug14122.diff.txt uploaded by Corydon76 (license 14)
Tested by: andrew

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=174885

By: Digium Subversion (svnbot) 2009-02-11 14:55:48.000-0600

Repository: asterisk
Revision: 174886

_U  trunk/

------------------------------------------------------------------------
r174886 | tilghman | 2009-02-11 14:55:47 -0600 (Wed, 11 Feb 2009) | 19 lines

Blocked revisions 174885 via svnmerge

........
 r174885 | tilghman | 2009-02-11 14:54:18 -0600 (Wed, 11 Feb 2009) | 13 lines
 
 Restore a behavior that was recently changed, when we fixed issue ASTERISK-13115 and
 issue ASTERISK-12632 (related to issue ASTERISK-6019).  When a hangup occurs during a Macro
 execution in earlier 1.4, the h extension would execute within the Macro
 context, whereas it was always supposed to execute only within the main context
 (where Macro was called).  So this fix checks for an "h" extension in the
 deepest macro context where a hangup occurred; if it exists, that "h" extension
 executes, otherwise the main context "h" is executed.
 (closes issue ASTERISK-13256)
  Reported by: wetwired
  Patches:
        20090210__bug14122.diff.txt uploaded by Corydon76 (license 14)
  Tested by: andrew
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=174886