Summary:ASTERISK-03873: [patch] fix to timestamps in jitterbuf and chan_iax2
Reporter:grolloj (grolloj)Labels:
Date Opened:2005-04-05 16:12:24Date Closed:2011-06-07 14:10:37
Versions:Frequency of
Environment:Attachments:( 0) sample_cnt.2.patch.txt
Description:The included patch addresses issues within the new jitterbuffer and modifies chan_iax2. stevek's been guiding me through these changes.

To address a problem with reported statistics, I've changed the way the jitterbuffer tracks its silent state. When a frame indicating silence is received, the jitterbuffer will enter its silent state. The next voice frame received will end the silent state. The problem is that a voice frame placed in the jitterbuffer late or out of order will spuriously end the silent state. This will cause jb_next to return a predicted time for the next available frame and
as a result, will cause statistics reporting dropped packets and packet loss percentage to go awry. In order to fix that, I've stored the timestamp of the last frame received that indicated silence and I've prevented the jitterbuffer from leaving the silent state because of a late voice frame.

In the jitterbuffer, the last_voice_ms member of jb_info was overused. Since last_voice_ms may change from frame to frame (iLBC delivers 30ms frames that, when transcoded to speex, may end up as a series of 20ms and 40ms frames), the value shouldn't be used for static adjustments. I've replaced two instances where it did with defines: JB_TARGET_EXTRA and JB_ADJUST_DELAY. JB_TARGET_EXTRA adds slack to the jitterbuffer's target size. It's defined to add an extra 40ms to the target jitterbuffer size, reducing the amount of growing and interpolation necessary in edge cases. JB_ADJUST_DELAY defines the delay in ms that the jitterbuffer should honor between interpolation frames to grow the jitterbuffer. If the jitterbuffer will run out of space without growing, this define won't be honored.

I've added a max_contig_interp member to jb_info. It can be set via jb_setinfo(). It will cap the number of interpolation frames that the jitterbuffer will return in a row. If the jitterbuffer returns max_contig_interp interpolation frames consecutively, it will enter the silent state. This can be used with clients that don't support silent control frames to prevent unnecessary interpolation being done at *.

The last change to the jitterbuffer fixes a problem with how last_voice_ts was being calculated. An assumption was made that last_voice_ms would be the same as frame->ms, which is not always the case (the same iLBC situation described above). jb_get has been changed to update last_voice_ts to the correct value even in cases where the frame length may be changing from one frame to the next. To keep accurate track of last_voice_ts, the jitterbuffer needs to know how much time will be taken by an interpolated frame. A new parameter to jb_get, interpl, provides this information.

For chan_iax2.c, I changed the call to jb_get() to add the interpl parameter. I've hardcoded this as 20ms until more accurate information about interpolation frame length can be provided.

Also in chan_iax2.c, I've added a manager command to retrieve more readily parsable netstats. It prevents truncation of the iaxs[]->owner in the format string and returns the stats in a space-separated list.

I've also addressed a problem in how the last timestamp was being tracked in chan_iax2.c. Previously, last was not being updated by AST_FRAME_IAX frames. However, because schedule_delivery() was being called for these frames, the frames would have unwrap_timestamp() called on their timestamp. If voice is being sent continuously, this isn't a problem. However, if no voice is being sent, this can cause a wrapping issue and ruin the timestamps for control frames. I believe this is what caused bug 3961 and that this patch should fix that issue. schedule_delivery() will no longer be called for AST_FRAME_IAX frames, and the last timestamp will be updated when the frame contains a client timestamp (not for PONG, ACK, or LAGRP).

Before committing, I'd like to make sure that this does fix the problem described in bug 3961. There is a disclaimer on file for stevek which I'm told applies to me.
Comments:By: stevekstevek (stevekstevek) 2005-04-05 17:21:34

Re: disclaimer.  JeffG works for my company, therefore I can disclaim the code he writes.  I can put this into a fax if necessary.  But, the karma goes to him.

For the interp_len parameter, we can make a new function (perhaps just make it inline, and put it in translate.h or something, with other ast_codec stuff), like this:

ast_codec_interp_len(int format) {
    return (format == AST_FORMAT_ILBC) ? 30 : 20;

At the same time, we should move the various implementations of get_samples in chan_iax2 (and the inline version of this in rtp.c) into an ast_codec_get_samples(ast_frame *f) function in the same place..

By: grolloj (grolloj) 2005-04-08 11:47:37

I've moved the code to get a frame's sample count from rtp.c and chan_iax2.c to frame.c. chan_iax2 will now get the correct number of G723 samples. Additionally, to get the correct number of samples from a speex frame, I've added a function that's independent of the speex codec (so that linking to speex won't be required).

The new patch provides the correct interpolation frame length to calls to jb_get.

I've removed the changes fixing a timestamp issue in chan_iax2.c, since I posted that patch separately in bug ASTERISK-386961

edited on: 04-08-05 18:31

By: grolloj (grolloj) 2005-04-12 16:30:16

The latest patch, jb_chan_iax2.3.patch.txt, fixes the return from jb_next() to account for changes made to tracking last_voice_ts.

By: grolloj (grolloj) 2005-04-22 12:23:08

Maybe at this point, it'd be better for me to break this patch up to make review easier, but for now, the newest patch behaves as previously described, with the following changes:

- jb_reset() initializes the jitterbuffer to the size of JB_TARGET_EXTRA
- During silence, the jitterbuffer was shrinking to the target size in one jump. Since this could cause a silent period to be completely collapsed, I've modified it to shrink by interpl every 10ms.
- last_voice_ts was being erroneously rolled back in the case of a late voice frame during silence. This remnant of the previous method of doing voice ts prediction has been removed.
- Because of changes to voice ts prediction, an issue arose causing small variances in timestamps to completely screw up the jitterbuffer. The jb would report a frame lost then late. It would keep doing this for each frame that was a bit off, causing a complete loss of voice. An accomodation has been made for frames arriving too early, but in order, and for frames that arrived late, causing the jitterbuffer to interpolate past the late frame's timestamp. When compensating for this change, the jitterbuffer may return a frame more rapidly than expected by the last frame's ts and ms, but this will handle voice ts sequences like 520, 540, 563, 583 . . . and 520, 540, 559, 579 . . ..

By: Mark Spencer (markster) 2005-04-26 23:28:01

Do you mind breaking this into two patches, one which is just the refactoring, and then doing another one with the actual changes?  That will make it easier to review and merge.  Thanks!

By: grolloj (grolloj) 2005-04-27 14:32:51

sample_cnt.patch.txt (sorry, forgot the .txt extesion the first time; wasn't sure if that mattered) does the refactoring with just a few relevant additions.

It moves sample counting code out of rtp.c and chan_iax2.c into frame.c. It also adds a speex_samples() function to count the number of samples contained in a speex frame. Additionally, I've added a ast_codec_interp_len() to frame.h to return the expected interpolation length for a given format. That will get used with some jitterbuffer changes, but I thought it made sense to include it with the rest of these changes.

By: kb1_kanobe2 (kb1_kanobe2) 2005-04-28 01:03:02

I still seem to be having newjb problems when newjb, trunktimestamps and trunking are all enabled. I'm running cvs-head from today (27apr) with the jb_chan_iax2.4.patch.txt from this bug and chan_iax2.iax_fr_ts.patch.txt.gz
from ASTERISK-3869 applied on both ends.

What appears to be happening is that the 'last' timestamp simply isn't being updated with the timestamp of the miniframe from the trunk. It does get updated periodically, but only in response to a non-trunked frame, such as a LAGRQ. Also, after the miniframe timestamp has passed approximatly 70000, wrapped around and 'last' has been reset to 0 it is never updated again - even in response to non-trunked frames.

If the jitterbuffer is on then the incoming audio stops entirely after the first wraparound. If I continue to leave the channel up for a few minutes more I very infrequently hear bursts of a few hundred milliseconds of audio and the jb stats start to go berzerk. Ie.:

astpbx-masonrd*CLI> iax2 show netstats
                                -------- LOCAL ---------------------  -------- REMOTE --------------------
Channel                    RTT  Jit  Del  Lost   %  Drop  OOO  Kpkts  Jit  Del  Lost   %  Drop  OOO  Kpkts
IAX2/astpbx-wharfrd-16385   40   48  101     0   0     0   23      0    0  501 16776063   0     1 4482      7

Now, this is going to sound wierd, but for the life of me I cannot find where in chan_iax2.c the 'last' ts is supposed to be being updated from the miniframe ts so I'm at a loss to debug this further myself.

I've attached two chunks of debug log to this bug, one with the newjb disabled, the other with it enabled. I'd appreciate any comments, observations or fixes.


By: grolloj (grolloj) 2005-04-29 18:54:03

Although it may have sounded weird, the 'last' ts for a call wasn't being updated from trunk media frames. Similar to the patch for bug ASTERISK-386961, the chan_iax2.tr_fr_ts.patch.txt patch I just added should update 'last' where it hadn't been before.

socket_read() updates 'last' for a call before returning at the bottom of the function. However, there's a number of returns before then when the timestamp won't be updated. As I understand, the new jb is more dependent on sane timestamps being passed in, so missed updates to 'last' could have a very adverse affect.

I'm  not sure why 'last' would get reset to 0 and never updated.

By: kb1_kanobe2 (kb1_kanobe2) 2005-04-29 20:42:56

I've applied and tested the chan_iax2.tr_fr_ts.patch.txt in addition to the other two patches applied previously and, while the 'last' timestamps are now updating correctly on both ends and all of the other related jitterbuffer/timestamp related debugs have stopped (as expected) I still seem to be loosing audio after the ts passes 70000 and wraps back around to 0.

If I leave trunking and trunktimestamps turned on on both boxes but only enable the newjb on one side then that side looses its incoming audio, while the outgoing audio remains solid.

I'll try to have a deeper look over the weekend, but I thought I'd throw this out there and see if anyone else can make the connection.

By: grolloj (grolloj) 2005-04-30 18:23:26

The patch, chan_iax2.tr_fr_tr.2.patch.txt, fixes one-way audio problems with iax trunking. When voice frames (minis, not super minis) were pulled from the trunk frame, they were being passed into schedule_delivery as 16 bit timestamps. This patch properly adjusts them to be full 32 bit timestamps and updates the last timestamp for a call receiving trunked audio.

kb1_kanobe tested the patch and continued to hear audio past the ~65 second mark with trunking and the newjb enabled (and was key to tracking the problem down).

By: kb1_kanobe2 (kb1_kanobe2) 2005-05-03 22:08:53

In support of grolloj's most recent update: I've been successfully running the updated timestamps patch with newjb, trunktimestamps and trunking enabled for a few days now without experiencing the previous permanent loss of audio problems.

I have had some reports of wierd 'metalic' sounding calls from users, but I suspect this is an internal jitterbuffer resizing issue and will look into it seperately.

By: Andrew Kohlsmith (akohlsmith) 2005-05-04 10:47:06

This patch doesn't appear to fix the DTMF causing horribly bad newjb performance as I explained in 3007.

Haven't tested the 10s one-way dead audio yet :-)

By: Michael Jerris (mikej) 2005-05-09 11:39:40

trunk timestamp patches moved to 4221.

By: grolloj (grolloj) 2005-05-13 16:32:02

sample_cnt.2.patch.txt is just a replacement for the previous patch. The ast_codec_interp_len() function made it into HEAD with the patch from bug ASTERISK-4214249, so I'm just keeping this one up to date.

By: Mark Spencer (markster) 2005-05-14 19:35:05

Is this ready for merge?

By: Mark Spencer (markster) 2005-05-14 19:45:10

Added to CVS head after discussion with mikej.