Summary:ASTERISK-18976: pbx_lua and confbridge menu dialplan_exec() do not work together
Reporter:Timo Teräs (fabled)
Labels:
Date Opened:2011-12-07 01:06:51.000-0600
Date Closed:2012-01-06 15:30:01.000-0600
Status:Closed/Complete
Components:Applications/app_confbridge PBX/pbx_lua
Versions:10.0.0-rc2
Frequency of
Environment:
Attachments:( 0) fix-pbxlua-recursion.patch
Description:I have a pure Lua dialplan that uses ConfBridge with in-conference DTMF keys bound to dialplan execution.

Everything works until the conference ends (the marked user leaves). If no DTMF key that calls the dialplan was pressed during the conference, everything is fine: ConfBridge returns to my Lua code at the point where it was called. However, if my custom Lua code was called during ConfBridge, ConfBridge never returns: when the conference ends, the channel is immediately hung up.
Comments:
By: Timo Teräs (fabled) 2011-12-07 07:43:23.339-0600

It seems to me that the problem is in pbx_lua. When a Lua extension is executed via the PBX, it uses lua_get_state() to allocate the Lua state machine, which is specific to the ast_channel.

However, confbridge seems to save and zero out the channel's pbx, context, extension and priority fields, and then simply re-enters the dialplan. At this point pbx_lua reuses the old state machine, since the channel is still the same. When control returns to confbridge, it restores the old pbx, context, extension and priority, but because the old Lua state machine was used, it is left in a bad state.
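A minimal sketch of the save, clear, re-enter, restore sequence described above, written as a toy model in C. All types and functions here (toy_channel, toy_run_extension, toy_app_reenter, the "conf_menu" context) are hypothetical stand-ins invented for illustration; they are not the Asterisk or pbx_lua API. The sketch only shows that state keyed by the channel ends up shared between the outer dialplan run and the nested one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; these are NOT the real Asterisk types. */
struct toy_pbx { int placeholder; };

struct toy_channel {
    struct toy_pbx *pbx;
    char context[32];
    char exten[32];
    int priority;
    int state_users;      /* dialplan runs currently sharing the per-channel state */
    int max_state_users;  /* highest value state_users has reached */
};

typedef void (*toy_body_fn)(struct toy_channel *chan);

/* Set the channel's dialplan location (context, extension, priority). */
void toy_set_location(struct toy_channel *chan, const char *context,
                      const char *exten, int priority)
{
    snprintf(chan->context, sizeof(chan->context), "%s", context);
    snprintf(chan->exten, sizeof(chan->exten), "%s", exten);
    chan->priority = priority;
}

/* Run one extension. The state is keyed by channel, so a nested run on the
 * same channel shares it with the outer run that is still in progress. */
void toy_run_extension(struct toy_channel *chan, toy_body_fn body)
{
    assert(chan != NULL);
    chan->state_users++;
    if (chan->state_users > chan->max_state_users)
        chan->max_state_users = chan->state_users;
    if (body)
        body(chan);
    chan->state_users--;
}

/* An application that re-enters the dialplan: save, clear, run, restore. */
void toy_app_reenter(struct toy_channel *chan)
{
    struct toy_channel saved = *chan;

    /* Clear the fields the comment mentions. */
    chan->pbx = NULL;
    toy_set_location(chan, "", "", 0);

    /* Re-enter the dialplan at a hypothetical menu target. */
    toy_set_location(chan, "conf_menu", "1", 1);
    toy_run_extension(chan, NULL);

    /* Restore the saved location; the shared state is not restored. */
    chan->pbx = saved.pbx;
    toy_set_location(chan, saved.context, saved.exten, saved.priority);
}
```

Calling toy_run_extension(&chan, toy_app_reenter) on a channel leaves max_state_users at 2: the nested run used the same per-channel state as the outer run, even though the location fields were saved and restored around it.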

pbx_lua should probably tie the state machine to chan->pbx instead of chan.

By: Leif Madsen (lmadsen) 2011-12-07 11:50:18.091-0600

Thank you for taking the time to report this issue. Please note that pbx_lua is a community supported module and thus the support level may reflect that. More information about the various module support levels can be found at https://wiki.asterisk.org/wiki/display/AST/Asterisk+Module+Support+States

By: Timo Teräs (fabled) 2011-12-07 12:40:23.864-0600

I'm willing to provide a patch if someone could verify that my plan to fix it is correct. pbx_lua lists Matthew Nicholson as its author. Perhaps he could comment on this issue?

My current idea, as said in an earlier comment, is to bind the Lua state machine to chan->pbx instead of chan. I'm just curious whether that would cause any unexpected problems.

By: Timo Teräs (fabled) 2011-12-07 13:42:20.724-0600

Actually, the solution might be easier than expected. I think we just need to make pbx_lua's exec() re-entrant, that is, callable recursively for a single channel. Currently the problem is that exec() unconditionally releases the Lua state machine when it is done. The idea is to make lua_get_state() also report whether it created the state machine; that way exec() can skip the release when called recursively.

I'll try to test this tomorrow and will post a patch if it works.
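The idea in this comment (a getter that reports whether it created the state, so that only the creating call releases it) can be sketched roughly as follows. This is an illustration of the proposal only, under stated assumptions: sketch_get_state(), sketch_exec() and both structs are hypothetical names, not the signatures of lua_get_state() or exec() in pbx_lua.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT the pbx_lua data structures. */
struct sketch_state { int runs; };

struct sketch_chan {
    struct sketch_state *state;
    int created_count;   /* how many times a state was allocated */
    int released_count;  /* how many times a state was freed */
};

/* Return the channel's state, allocating it on first use.
 * *created is set to 1 only when this call performed the allocation. */
struct sketch_state *sketch_get_state(struct sketch_chan *chan, int *created)
{
    assert(chan != NULL && created != NULL);
    *created = 0;
    if (chan->state == NULL) {
        chan->state = calloc(1, sizeof(*chan->state));
        if (chan->state == NULL)
            return NULL;
        chan->created_count++;
        *created = 1;
    }
    return chan->state;
}

/* Run an extension. nested_levels simulates an application re-entering the
 * dialplan that many times. Returns 0 on success, -1 on allocation failure. */
int sketch_exec(struct sketch_chan *chan, int nested_levels)
{
    int created;
    int res = 0;
    struct sketch_state *state = sketch_get_state(chan, &created);

    if (state == NULL)
        return -1;
    state->runs++;

    if (nested_levels > 0)
        res = sketch_exec(chan, nested_levels - 1);

    /* Only the call that created the state releases it; recursive calls
     * leave it alone so the outer call can keep using it. */
    if (created) {
        free(chan->state);
        chan->state = NULL;
        chan->released_count++;
    }
    return res;
}
```

With this shape, sketch_exec(&chan, 2) allocates and frees the state exactly once, no matter how deep the recursion goes.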

By: Timo Teräs (fabled) 2011-12-07 14:16:28.129-0600

Duh. exec() is already re-entrant as far as the state machine is concerned. It's the goto detection that breaks it. I think I know how to fix it properly now.

By: Timo Teräs (fabled) 2011-12-08 01:01:35.481-0600

Fixes goto detection when pbx application re-enters dialplan (e.g. app_confbridge).

By: Timo Teräs (fabled) 2011-12-08 01:03:43.352-0600

The attached patch fixes my bug. I also verified that it properly detects gotos even inside recursion.

By: Timo Teräs (fabled) 2011-12-09 03:23:04.805-0600

Marked as regression. The change r317721 broke this.

By: Matthew Nicholson (mnicholson) 2011-12-09 09:22:09.792-0600

At a brief glance the patch looks good.