Summary: ASTERISK-09016: Crash on dereferencing null pointer in chan_zap.c - zt_get_index()
Reporter: Juha Vehvilainen (jusu)
Labels:
Date Opened: 2007-03-15 04:13:48
Date Closed: 2007-07-09 21:20:52
Versions:
Frequency of
Environment:
Attachments:
( 0) extensions.conf
( 1) zapata.conf
( 2) zaptel.conf
Description:
When answering a call with TDM400P, I get this:

   -- Starting simple switch on 'AsyncGoto/Zap/4-1<ZOMBIE>'

And Asterisk segfaults.
Running under a debugger, I can see ss_thread() in chan_zap.c being called. Its first few lines include:

index = zt_get_index(chan, p, 1);

At this point p is NULL, and dereferencing it inside zt_get_index() crashes Asterisk.

This can easily be fixed by checking whether p is NULL and exiting ss_thread() if it is. With that change the call gets answered and everything seems to work.
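The proposed guard can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual chan_zap.c code: the struct names `ast_channel_sketch` and `zt_pvt_sketch` and the helper `ss_thread_guard()` are hypothetical stand-ins, and only the `tech_pvt` field name is taken from the report.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver's private per-channel data. */
struct zt_pvt_sketch {
    int channel;
};

/* Hypothetical stand-in for the channel structure; only tech_pvt
 * mirrors a field named in the report. */
struct ast_channel_sketch {
    const char *name;
    void *tech_pvt;   /* NULL when the channel has no private data left */
};

/* Returns 0 if the caller may go on to use tech_pvt, -1 if it should
 * log an error and return instead of dereferencing a NULL pointer. */
static int ss_thread_guard(const struct ast_channel_sketch *chan)
{
    if (chan == NULL) {
        fprintf(stderr, "ss_thread: no channel given\n");
        return -1;
    }
    if (chan->tech_pvt == NULL) {
        fprintf(stderr, "ss_thread: channel '%s' has no tech_pvt, exiting\n",
                chan->name);
        return -1;
    }
    return 0;
}
```

In this sketch the thread function would call the guard before anything like zt_get_index() and return early on -1, so the worst case is a logged error rather than a segfault.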

Sorry, I do not know why p (assigned from chan->tech_pvt) is NULL to begin with. I assume something in my setup is causing it, since a bug this severe and this easy to fix would not go unnoticed.

This crash is always reproducible when running under the gdb debugger. Otherwise it happens every now and then.

I would like to either

- Have ss_thread fixed so that the case of (p == NULL) exits ss_thread instead of crashing Asterisk, or
- Get some insight into what could be wrong with my setup that is causing this

Let me know how I can help to fix this bug.
Comments:

By: Serge Vecher (serge-v) 2007-03-15 08:39:13

Hmm, that AsyncGoto shouldn't be there, I think. Can you post your extensions.conf and zaptel.conf?

By: Juha Vehvilainen (jusu) 2007-03-15 08:51:48

Thanks, posted. My dialplan is the last few lines of extensions.conf. A new call goes to s of [default], and my manager connection redirects it to extension voice to answer the call (is there another way of doing this with manager commands?).

The second command 'voice' in extension voice is my own plugin, app_voice.

Hope this helps, I'm baffled by the AsyncGoto too.

By: Serge Vecher (serge-v) 2007-03-15 09:04:23

what about zapata.conf?

By: Juha Vehvilainen (jusu) 2007-03-15 09:30:22


By: Serge Vecher (serge-v) 2007-03-15 10:17:07

you need to answer the call in [jusu] first. Looking at your zapata.conf, some of your settings are improper. Configuration issues are out of the bug-tracker scope. Please contact technical support at support@digium.com to set up your TDM400P card properly.

By: Juha Vehvilainen (jusu) 2007-04-21 02:10:51

From Digium Support: "Your problem is not in the configuration.  Please reopen the bug and we will let the developers know not to close it."

By: Kevin P. Fleming (kpfleming) 2007-04-24 14:54:52

First, a note to serge-v: this is not a 'configuration issue' that should be sent to Digium support. What do you see in his zaptel.conf or zapata.conf that makes you think it is?

Second to jusu: If I understand what you are doing here, you are trying to go off-hook on an FXS port, wait in a Wait() command, then redirect the call via an AMI action. This will not work as you expect, because while the channel is in ss_thread() it is not an active channel and cannot be treated as one. It is essentially a 'provisional' channel that is waiting for the user to supply digits to match to an extension in the dialplan.

To do what you want, I believe you can add 'immediate=yes' to the zapata.conf segment for this channel. This will cause the channel to be delivered immediately to the 's' extension as a real channel, rather than sitting in ss_thread() waiting for digits to be dialed.

By: Juha Vehvilainen (jusu) 2007-04-25 00:38:30

Actually what I'm doing is:
1. Let the channel ring unanswered in default, s,1,Wait(), until I...
2. Redirect the call to extension 'voice' via AMI
3. First command in extension 'voice' is Answer, which answers the call.
Reason: My application (connected via AMI) needs to decide whether to answer the call or not.
Is this a valid sequence of events or is there a better way of doing this?
I will test the 'immediate' setting.
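For reference, step 2 of the sequence above amounts to sending a Redirect action over the manager connection. The following is a minimal sketch of how such a message could be formatted in C. The helper name `build_redirect_action()` is hypothetical, and the channel name `Zap/4-1` and context `default` used in the usage note are assumptions taken from the log line and the dialplan description, not from the attached configuration files.

```c
#include <stdio.h>

/* Formats an AMI Redirect action into buf.
 * Returns the message length, or -1 if buf is too small.
 * Each header line ends in CRLF and an empty line terminates the action. */
static int build_redirect_action(char *buf, size_t len, const char *channel,
                                 const char *context, const char *exten,
                                 int priority)
{
    int n = snprintf(buf, len,
                     "Action: Redirect\r\n"
                     "Channel: %s\r\n"
                     "Context: %s\r\n"
                     "Exten: %s\r\n"
                     "Priority: %d\r\n"
                     "\r\n",
                     channel, context, exten, priority);
    if (n < 0 || (size_t)n >= len) {
        return -1;   /* formatting error or output truncated */
    }
    return n;
}
```

A call such as build_redirect_action(buf, sizeof buf, "Zap/4-1", "default", "voice", 1) would produce the text to write to the already-open manager socket; logging in and reading the response are left out of the sketch.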

By: Kevin P. Fleming (kpfleming) 2007-04-25 08:35:04

Sorry, I misread your configuration and thought you were using FXS ports.

So you have a call arriving (ringing) at an FXO port and you want to do something to that call via the manager interface before it has been answered? It would seem to make a lot more sense to use AGI for this, so you can run your external logic and then decide what to do with the call from within the call thread, rather than via a disconnected interface like AMI.

However, I suspect that what is happening here is that ss_thread() is being called when it should not be, and this is actually a bug.

By: Juha Vehvilainen (jusu) 2007-04-25 13:40:10


I simply need to decide if the machine I'm running on has enough resources to handle the call, and only then answer it. If I let it ring my operator will transfer it to another number.

My reasoning for AMI instead of AGI is that having a single, always-on connection between my app and asterisk would be more efficient than running a script/program, or connecting with FastAGI for each call. Also I'm trying to keep things simple as I would need the AMI connection anyway. Do you think this is dangerous, would it be safer in the long run to go with AGI? Once the call is answered, I might need to redirect/transfer it, which seems possible with AMI too. Are there strong reasons in the internal structure to use AGI?

Even if the bug is fixed elsewhere, it would seem like a good idea to add a NULL check to ss_thread as I proposed in my initial description. It could log an error instead of crashing Asterisk.


By: Kevin P. Fleming (kpfleming) 2007-04-25 17:34:39

I can't speak to whether checking for NULL there is a good idea or not; you have not given us a complete verbose/debug console trace to know exactly what happened to the call from the instant it began ringing until the crash occurred.

It does appear that you have identified a very bizarre race condition, though, so I have made the change in these branches:

branches/1.2: rev 61913
branches/1.4: rev 61914
trunk: rev 61915