ASTERISK-01623: [patch] minor timestamp cleaning changes; jitter buffer knobs.

[Home]

Summary: ASTERISK-01623: [patch] minor timestamp cleaning changes; jitter buffer knobs.

Reporter: stevekstevek (stevekstevek) Labels:

Date Opened: 2004-05-14 15:39:45 Date Closed: 2011-06-07 14:05:18

Priority: Major Regression? No

Status: Closed/Complete Components: Core/General

Versions: Frequency of
Occurrence

Related
Issues:

Environment: Attachments: ( 0) iax2.patch

Description: This patch [disclaimer on file] makes the following minor changes to chan_iax2:

(1) [this is the only thing you might _not_ want]: Changes the "fuzz" factor for cleaning voice timestamps to +-240 from +-640ms. I'm not sure where the 640ms value came from, but a consequence of changing from "cleaned" timestamp bases is that it will introduce jitter on the remote end; my feeling is that 240ms should be enough, and that's what I did in iaxclient [libiax2 patch will come soon].

(2) update nextpred based on the timestamp of every frame that is sent, not just previously predicted frames; This stops a timestamp discontinuity when the channel switches from a context in which timestamps are dictated upstream [e.g. by an application or bridge to a channel which sets "delivery" in the frame] into a context where timestamps are predicted.

(3) adds two configurable parameters to control the jitterbuffer,

(a) jitterwindow: controls the size of the "history" buffer used to look at historical jitter.

(b) jittershrinkrate: controls the rate (in ms/frame) at which the jitter buffer will shrink.

The jitter buffer really needs to be rewritten, but these knobs have helped me tune things a bit. In particular, although there's a small performance hit from this, increasing the jitterwindow has helped a lot, because many networks have jitter which looks like 5 to 10s of low delay, followed by spikes of delay lasting for less than a frame, etc. With the default 100 frame window, the jitterbuffer was constantly shrinking/growing for these kinds of people.

Comments: By: zoa (zoa) 2004-05-15 04:50:09

stevekstevek, can you please contact me ? (joachim@securax.be)
I'm looking for someone to implement some jitter buffer changes/improvements or maybe to start from scratch.
By: Mark Spencer (markster) 2004-05-15 12:29:54

Having a static jitter window of 4096 samples hugely impacts the size of an iax structure.

Also, when adding parameters, you must be mindful that things happen correctly on a reload e.g:

1) I have this in my iax.conf:

jittershrinkrate=5

2) I change it to:

;jittershrinkrate=5

and then reload.... The actual jitter shrink rate should fall back to 2.

Other than that looks pretty reasonable to me.
By: zoa (zoa) 2004-05-17 05:53:50

- Is it possible to have an extra option to set a static jitter buffer of e.g. 100ms, that will not increase or decrease over time ?

That way, the iax structure doesnt grow big because of a big jitter window.

- When watching iax2 show channels (without using your patch), after a while i notice 0ms delay calls, and during the first second of the gsm call, i often notice 23452345234ms jitter calls, i suppose these values could have a very big negative influence on the jitter buffer size, any chance you see something suspicious in the code for those parts ?
By: zoa (zoa) 2004-05-17 08:32:46

I found another problem with the jitter buffer (not with your patch, but maybe you are willing to help with it.)

Enabling a 1 way 100ms jitter buffer (on the receiving end) , may cause problems with dtmf detection on that receiving end. (especially when there actually is some jitter)

Symptoms are: sometimes its working, sometimes its not. (also depends on the device sending... tones generated by a cisco 7960 sometimes came through, those sent by openphone were not.)

Both the calls sent through the cisco as through openphone were first put on pstn before arriving at a TE410p and wired over iax2 to the ending asterisk server.
By: stevekstevek (stevekstevek) 2004-05-17 09:42:31

OK:

First, Mark's comments:

1) The static jitter window member should add less than 32kB to each iax2 call; it's just an array of int. I'm currently working with 750 sample windows, so we could make the static array 1024 samples (8kB), and it would still give more flexibility than we have now.

2) I'll check on the reload semantics; I think the jitterwindow stuff should be OK, because the structure is zeroed initially, so growing it will be OK; shrinking it will also be OK because you'll just ignore the extra values in there. If you grow it, reload, shrink it, reload, then grow it again, (all while a call is in session), then you'll end up using some older historical values for a bit, but it shouldn't be a big deal. I'll have to see how the shrink rate parameter works in this situation.

Zoa's comments: I think you're misunderstanding the jitterwindow: It's not the size of the jitter buffer at all, it's just the "window" of historical lateness used to calculate what the optimum jitter buffer size should be.

Also, it's important to understand that the current jitter buffer implementation deals not only with "jitter" in the sense of the difference in lateness between a group of packets, but also with a "timebase offset"; i.e. a static difference in clocks between sender and receiver, which may gradually change over time. You really need to run asterisk -d (with a debug log) configured to see how it works in practice).

Yes, the jitter buffer needs some work; Maybe I will open a new "enhancement" but where we can discuss the requirements for a new implementation; The basic requirements as I would like to see it are:

a) A single jitter buffer implementation that can be shared between chan_iax2 (trunking and non-trunking), libiax2, and other VoIP channels in asterisk (sip, h323, etc).

b) Some systemic changes in asterisk to better be able to determine the difference silence and "missing" voice frames, such that applications (and translators) can perform interpolation to reconstruct missing voice frames. ("missing" frames may be due to either network loss, or jitter buffer growth).

c) Hopefully we can do this with code which is more easily understood. Having variables like maxjitterbuffer and max_jitter_buffer which mean two entirely different things is pretty confusing :)
By: zoa (zoa) 2004-05-17 11:17:04

i understand the kitter window is not the jitter buffer, but when the size of the jitter buffer is static, there is no need for a jitter window, thus not taking up any space.

Thanx for the -d tip, didnt know it could be used for the jitter buffer debug too.
By: stevekstevek (stevekstevek) 2004-05-21 14:18:10

Mark:

I looked over this again re: your comments.

1) you can change the MAX_JITTERWINDOW define from 4096 to 1024, this would reduce the extra space used by this patch from a little less than 32k to less than 4K per session.

2) The reload semantics are the same as for other iax2.conf parameters: They will be re-set properly by a reload, but defaults are not restored as per your example. None of the other parameters work that way. I suppose you'd make that work by setting all the defaults in set_config, before reading the parsed values, instead of (or in addition to) the static initializers.

I also have another [unrelated] patch to chan_iax2 coming later today.
By: Mark Spencer (markster) 2004-05-22 18:19:48

Defaults *should* be restored :) Whatever is that isn't defaulted back properly is technically a "minor" bug but i'd rather not put new things that duplicate that behavior. If you want to fix up ones that are broken, that'd make me really happy too :)
By: twisted (twisted) 2004-06-16 23:49:00

where is this headed? stevekstevek, did you make the requested changes for this to be considered?
By: clive18 (clive18) 2004-06-19 13:30:21

as a suggestion for an additional jitter buffer knob, how abt an option for a fixed time jitter-buffer (or delay added) , with a knob like "fixedjitter=300ms"
assumming the timestamping is correct.

Thanks.
By: zoa (zoa) 2004-06-20 03:00:23

clive: that is the same as my request :)
By: clive18 (clive18) 2004-06-20 04:27:42

zoa, sorry, I got confused with all the terminology of "jitterwindow" etc. gets confusing.
A much improved jitter buffer would do asterisk a lot of good, especially for countries like South Africa where the internet connectivity quality is not up to the standard of the internet in the US. There have been times that I could not use asterisk for voip because of bad jitter, even though packet loss was not too bad (4%).
Thanks for all your efforts , steve, mark and zoa
By: Steve Davies . (stevedavies) 2004-06-21 04:54:31

There are issues/bugs with the way timestamps are generated - see my mail of Jun21st to asterisk-dev. I'd suggest we fix those before changing the jitterbuffer logic.

I'be been poring over traces all weekend - from that, my comments on steve's patch are:

1) the change of "fuzz" from 640 to 240msec.

I'd say that it should probably be made even tighter. But it'll probably be subjective - do you prefer more smaller audio glitches, or less frequent but much bigger gaps?

2) change so nextpred always gets recalculated for every frame that is sent

I understand what you are trying to do, but don't agree, because:

First: there is a bug in the timestamping of non-voice frames. The code to stamp them as lastsent+3 doesn't get used - because they are all "genuine" (see the code, and confirmed by me by adding debugging). Once this bug is fixed, then the control frames will get +3, +6, +9 etc. If we send the next voice frame with this extra offset we'll introduce a glitch. So we should rather leave nextpred tied to the voice frames only. I do agree that if the +3 process isn't used because of too much skew, than in that case nextpred should be moved.

Second: calc_timestamp is sometimes called with a passed in timestamp. This seems to be for LAGRP, PONG, ACKs where the peer's timestamp is returned to him. This is not one of our timestamps, and definitely should not change either nextpred or even lastsent for that matter.

3) More knobs

Go for it - fully agree that they are useful.

Regards,
Steve Davies
By: stevekstevek (stevekstevek) 2004-06-21 09:30:01

Yes, SteveD, there are a bunch of things.

(1) I've actually reverted the timestamp prediction altogether in the code I'm using here. It certainly makes debugging easier, but it does cause 240ms glitches on Windows all the time. About 10% of the Windows boxes I've seen out there have clocks that are so bad that they seem to drift 240ms in less than 5 minutes. [This is actually system clock vs soundcard clock drift, which causes the most glitches].

With the code in this patch, if they system clock is _slower_ than the soundcard clock, then you get 240ms of gibberish whenever this happens, because you get old and new audio interleaved with the same timestamps when we rollback nextpred.

I actually though that the soundcards would be the big problem, but if I look at the server, the soundcard clocks have been better than the system clock.

We've done a lot of work to try to find better time on Windows, and we just can't. There's dozens of ways to get Time on Windows, which all resolve to about 3 internal mechanisms, and they are all wrong.

GetTickCount() drifts, SetSystemTime[AsFileTime], drifts even more, and the "hi resolution" stuff like QueryPerformanceCounter also has problems.

With ts prediction, all we really need/want here would be an _accurate_ time interval +- 100ms or so. We just need it to be accurate over the long term, short-term resolution isn't so important.
By: Steve Davies . (stevedavies) 2004-06-21 10:08:16

Having tried some changes, I think I now understand better what the "genuine" flag means in calc_timestamp. This is supposed to identify frames like LAGRQ etc. As opposed to CONTROL frames.

For LAGRQ/LAGRP to get the right lag, its currently necessary for the sent LAGRQ to have a real clock-based timestamp. And for the calc_timestamp used for comparison against the returned LAGRP to similarly have a real clock-based timestamp. Same for PING/PONG.

If they have predicted timestamps, the measured lag ends up just being 20msec times the number of voice frames sent between sending the RQ and getting back the RP.

That still doesn't mean its a good idea to move nextpred when these control frames are sent. In fact its even more a bad idea, seeing that the timestamps on these frame are deliberately out of step with the voice frames - they should be treated as "alien" by the predictor, and left out of the jitter processing on the other side.
By: Steve Davies . (stevedavies) 2004-06-21 11:18:44

SteveK:

I've just been plotting generated timestamps for audio sourced from Milliwatt.

I'm still allowing 640msec skew between clock and predicted timestamps.

In the course of a 10 minute call there were three points at which the timestamp jumped ahead because it got more than 640msec behind the clock.

An idea does occur: is there a way to actual warp the samples themselves to keep the timing in step in more gradual (and inaudible) steps.

Without bringing the cisco bug back, that is.

Steve
By: Olle Johansson (oej) 2004-07-16 14:03:08

Is this patch/bug report still an open case? Does it apply to current CVS?

/Housekeeping
By: Steve Davies . (stevedavies) 2004-07-16 15:43:09

Hi,

I've been doing some jitter buffer changes - some in discussion with stevek. So the majority of things in stevek's patch are now in the CVS.

I agree there's much more to do - especially so that we can take advantage of lost packet concealment that's in ilbc and can be added for g711.

I'd also like to do the longer history tracking - find a way that doesn't need the big history structure.

I'd also like to do zoa's fixed size buffer idea,

Perhaps in the meantime this bug can be closed though - and steve and I should discuss "offline" to come up with the next patch.

But it's stevek's bug entry so I guess he gets the call.

Steve
By: stevekstevek (stevekstevek) 2004-07-16 17:53:22

It doesn't matter to me, I guess, as long as we all eventually fix this.

I have the changes similar to this in my local builds that I'm using, but there's still a lot of work to be done to make the jitter buffers work correctly all the time.
By: Mark Spencer (markster) 2004-07-16 22:11:05

Is this ready to be closed out?
By: twisted (twisted) 2004-07-23 21:07:32

Since this is assigned to Markster, I'm not going to close it, but add an additional request for update.

Please respond asap to avoid a pre-mature close!
By: Steve Davies . (stevedavies) 2004-07-26 09:03:43

Like I say - equivalents of these changes are in the CVS.

There is still more that should be done on the jitter buffer though; especially something to permit lost packet concealment to work with iLBC and any other codecs that have it or could have it.

Steve
By: twisted (twisted) 2004-08-19 23:52:03

so then we're ready to close?
By: Steve Davies . (stevedavies) 2004-08-20 01:23:29

I think so.

I'll open a new entry and write up my proposed new jitter buffer design that will properly do lost packet concealment.

Steve
By: twisted (twisted) 2004-08-20 01:31:53

closed per stevedavies