Summary:ASTERISK-02726: IAX2 trunked channel cannot recover from loss of packet containing 'first full voice frame'?
Reporter:kb1_kanobe2 (kb1_kanobe2)Labels:
Date Opened:2004-11-02 11:07:52.000-0600Date Closed:2011-06-07 14:10:02
Versions:Frequency of
Environment:Attachments:( 0) trunked-frame-problem
( 1) vnak-calledserver-snippet
Description:I have two Asterisk hosts connected via an IAX2 trunk over the public Internet. This is a fairly busy connection with approximatly 450 to 500 starts per day. Intermittantly I have been receiving complaints of blank voicemails and/or one way audio on calls. Reviewing the logs this seems to correllate to bursts of 'Received trunked frame before first full voice frame' warnings. Looking at socket_read() code in chan_iax2.c it seems that the incoming audio (relative to server logging the warnings) is being intentionally discarded because a specific packet has been dropped containing the elusive 'first full voice frame'.

If that is indeed the case, should not chan_iax2 make an attempt to restart the offending trunked channel or (less desireably) tear it down after a reasonable wait period of a second or two? The current behaviour seems to be to keep the channel up until either party disconnects - giving the illusion of a complete two-way talk path.

I suspect this condition is only occuring when the trunk is already up and servicing other channels - if the trunk itself is not brought up completely Asterisk seems to be able to recover quite gracefully.


Looking in recordings produced by Voicemail() it seems that the messages resulting from this condition have 0 byte audio files but a 'duration' > 0 in the associated .txt file for the message. Similarly, callers arriving in voicemail who encounter this problem and speak longer than the maxsilence timeout are being forcably disconnected, after which the recordings are erased etc.

Looking through my logs for yesterday, it seems I experienced about net 5 minutes of discarded audio, assuming one warning for every 20ms frame.

I regret that resolving this is beyond my abilities to code. I have asked around on #asterisk for a few weeks now but been unable to speak with anyone intimatly familar with IAX trunking. Similarly this is not consistently reproduceable for debugging as it seems to stem from host-network issues.

I'd be happy to provide any other information that may be of use.
Comments:By: Mark Spencer (markster) 2004-11-02 14:51:25.000-0600

In your logs, do you see a "VNAK" message?  If the first packet is lost, a "VNAK" is sent for any miniframe which is received on that.

By: kb1_kanobe2 (kb1_kanobe2) 2004-11-02 15:08:55.000-0600

No VNAK around the time of the problem. I do seem to be getting a debug message from the same thread at the same time, thus:

Nov  1 08:54:24 WARNING[21546]: Received trunked frame before first full voice frame
Nov  1 08:54:24 DEBUG[21546]: Received iseqno 9 not within window 1->1
Nov  1 08:54:24 WARNING[21546]: Received trunked frame before first full voice frame
Nov  1 08:54:24 DEBUG[21546]: Received iseqno 9 not within window 1->1
Nov  1 08:54:24 WARNING[21546]: Received trunked frame before first full voice frame
Nov  1 08:54:24 DEBUG[21546]: Received iseqno 9 not within window 1->1

By: kb1_kanobe2 (kb1_kanobe2) 2004-11-02 15:10:21.000-0600

I have extracted a full debug log from the server that's not getting the audio for the two minutes surrounding a problem call. I'd upload it but I'd rather it wasn't public - is there somewhere I can email it or will the upload honor the private flag on the bugnote?

By: kb1_kanobe2 (kb1_kanobe2) 2004-11-09 14:40:04.000-0600

'trunked-frame-problem' contains an excerpt of the full log file from the server during an event and as the call is hung up by the calling party. Unfortunatly I didn't catch it starting. Some entries seem to have been overrun by syslog...

This host is now running 'CVS-v1-0-11/03/04'. The far end is still 'CVS-v1-0-10/02/04'.

By: Mark Spencer (markster) 2004-11-09 16:55:51.000-0600

Is one end of the other being restarted?

By: kb1_kanobe2 (kb1_kanobe2) 2004-11-09 17:57:43.000-0600

If you mean is the IAX channel being restarted, no. Generally it's happening when the call lands in Voicemail, so the caller is hung up on off after the silence timeout expires.

The calling server seems to just sit there pumping out frames which are discarded equally quickly by the called server... but only for that one channel - the others in the trunk are fine.

After the offending call is hung up subsequent calls along the same path are fine. I don't know about concurrent calls along the same path as, again, I can't work out how to make it break intentionally, but I'm guessing they're ok otherwise I'd be getting a higher volume of dud-voicemail complaints from users.

Is there more useful information I could be gathering?

By: kb1_kanobe2 (kb1_kanobe2) 2004-11-15 12:09:04.000-0600

As another data point, this seems to be related somehow to calls that have gone through an IAX native transfer. Ie. they're originating Server A -[iax trunked]-> Server B, then going through a Server B, Server C handoff and the call is ending up Server A -[iax trunked]-> Server C. It seems to be immediately around the trunk handoff that the WARNINGs start churning out (but not every time - most calls are fine...)

I'm attaching a second of log entries leading up to and around the apparent time things go wrong.

Again, any suggestions welcomed.

By: kb1_kanobe2 (kb1_kanobe2) 2004-11-15 12:11:10.000-0600

Also, there is an additional nonstandard WARNING in these logs of the general form:

WARNING[32472]: Transmitting VNAK for callno 16389 isqno 1

I built it into iax2_vnak() to see if the system was getting hung up on one particular frame, which seems to be mostly the case...

By: kb1_kanobe2 (kb1_kanobe2) 2004-11-16 14:40:47.000-0600

Just another data point - this issue has not reoccured since setting 'notransfer=yes' in iax.conf on all servers involved in the network. I'll continue to run this way for now as a workaround.

By: twisted (twisted) 2004-12-01 01:02:10.000-0600

Just bringing this back up again.  Still an issue, I assume?

By: kb1_kanobe2 (kb1_kanobe2) 2004-12-02 14:17:24.000-0600

Well, yes and no...

I've disabled transfers in my configuration to work around the problem which  prevents the calls from breaking but on the other hand means a call may end up traversing the same IAX trunk three or four times as the caller is handed off between staff.

I haven't looked into the underlying problem any more as I'm not too hot with IAX and been a little busy of late. If no one else comes up with anything then I'll probably be able to dig into it in another three weeks or so...

Perhaps just let this linger?

By: twisted (twisted) 2004-12-15 20:49:45.000-0600

I'm going to suspend this for now.  Please re-open once you have time to deal with it.