[Home]

Summary:ASTERISK-04199: [patch] ztdummy accuracy improvements on kernel 2.6
Reporter:Tony Mountifield (softins)Labels:
Date Opened:2005-05-17 03:21:11Date Closed:2011-06-07 14:10:16
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) ztdummy.diff.txt
( 1) ztdummy.h.diff.txt
( 2) ztdummy.ifdef.diff.txt
Description:The kernel-based timing in ztdummy under Linux 2.6 is not accurate anough for good MeetMe operation. It is based on the kernel jiffy counter and the add_timer() call, which is a one-shot and needs to be called again in every handler. The effect is that ticks can be missed. This was causing increasing delay on IAX channels into MeetMe, and audio breaking up after a while on OH323 channels into MeetMe.

The rtc driver in 2.6 includes a hook to enable a handler to be added into the RTC hardware interrupt processing, enabling a solution similar to zaprtc, but without the need to recompile the kernel or replace the rtc module. The attached patch uses this hook, and has been demonstrated to provide greatly improved accuracy in ztdummy processing. Tests so far suggest that this completely eliminates the problems mentioned above with IAX and OH323 channels in MeetMe.

Please see these postings on asterisk-dev for more detailed discussion:
http://lists.digium.com/pipermail/asterisk-dev/2005-May/012509.html
http://lists.digium.com/pipermail/asterisk-dev/2005-May/012555.html

****** ADDITIONAL INFORMATION ******

Disclaimer on file.

ztdummy.c and ztdummy.h are currently identical in -STABLE and -HEAD, and the patch therefore applies to either.

Note that this is a completely different issue from either ASTERISK-3519 or ASTERISK-4152, even if some of the effects appear similar.
Comments:By: Tony Mountifield (softins) 2005-05-17 03:23:09

Note: It looks like Mantis mangled the URLs in the description :-(

By: Mark Spencer (markster) 2005-05-18 22:05:25

This would appear to make the patch x86 specific, am I correct?

By: Tony Mountifield (softins) 2005-05-19 02:07:48

Yes, I think  you are correct, in that it requires rtc rather than genrtc (which I didn't know about until recently).

Should this patch be made into a separate ztrtc module instead then, that could be used as a replacement for ztdummy if desired?



By: Michael Jerris (mikej) 2005-05-19 08:13:56

Could it be ifdef'ed in for better performance on x86 machines in the same module.  From a user standpoint not having to pick the right one is much easier.

By: Mark Spencer (markster) 2005-05-19 16:40:07

My interest in ztdummy is highly limited for obvious reasons ;)

By: Jean-Hugues Robert (jeanhuguesrobert) 2005-05-20 03:12:39

Hi,

I am running (for testing) Asterisk HEAD on a 2.6 CoLinux.
Unfortunately, there is not RTC support at this point in CoLinux.
Hence, rtc_register() is unresolved and ztdummy fails to load somehow.
The "older" method was somehow working better:
The module would load, a conference could be setup, but no sound at
all would be rendered correctly. Looked like slow motion sounds.
rtp.c keept complaining about delays.
Could ztdummy check RTC support at load time or compile time ?

By: Tony Mountifield (softins) 2005-05-20 03:21:43

Mark - isn't ztdummy still commercially relevant for ABE on a system without cards?

Regarding JHR's problem on CoLinux, I think the determination would have to be done at compile time, because to leave it till runtime would still require the symbol rtc_register to be resolved when loading.

If we assumed that all i386 Linuxes had an RTC, we could compile with #ifdef __i386__, but that evidently would not work on CoLinux. But the existing ztdummy doesn't work either :-)

Concerning the existing ztdummy on CoLinux, what is the HZ value in that kernel? The existing jiffy-based ztdummy assumes that HZ == 1000. With ztdummy loaded, try running the zttest program to see what it says.



By: Brian West (bkw918) 2005-05-20 10:51:36

Can you post your proof its better?  I have it installed and I can't tell any difference ... I ran zttest on ztdummy prior to this.. it was more consistent.  With your patch its all over the place.. granted still 99.93 - 99.97 but it goes up and down between that so rapidly.  Is this how it should work?

/b

By: Tony Mountifield (softins) 2005-05-20 14:08:55

The two links in the original bug description give my initial reports on the difference. Since then I have run zttest on two identical boxes (dual 3GHz Xeon with SCSI), one with the patched ztdummy and one without, and found the following:

The one with the patch gave:

--- Results after 71 passes ---
Best: 99.963379 -- Worst: 99.938965

The value oscillated between those values, similar to what you noticed, which I found unsurprising, given that the RTC driver is working on a 1024Hz interrupt and skipping one out of every 42 or 43, to give exactly 1000 per second.

The one without the patch gave:

--- Results after 71 passes ---
Best: 99.987793 -- Worst: 96.411133

On this one, although most of the results were closer to 100%, every so often I got one that was really low at 96 or 97 (perhaps one reading in every ten). I think this is the missed tick problem due to a missed jiffy in the kernel timer, and is what seems to upset MeetMe channels (e.g. the gradually increasing IAX delay). It is exactly this problem that my patch addresses, by relying on a hardware interrupt instead of just the kernel.

It's quite likely that the average value would differ between the two versions, because in the patched case it is comparing two different time sources: the RTC interrupt against the kernel clock (which drives gettimeofday()). In the unpatched case, both ztdummy and gettimeofday() are based on the same kernel clock, so are likely to be closer to 100% when a jiffy doesn't get missed.



By: Michael Jerris (mikej) 2005-05-21 08:56:44

Can anyone else confirm that this behaves better on 2.6 x86 machines?  If so can somone provide and ifdefed patch to ztdummy?

By: Brian West (bkw918) 2005-05-21 10:10:59

Well considering the machine I was testing this on died (literally DIED).  It was no more accurate before or after running zttest when I tested.

/b

By: Tony Mountifield (softins) 2005-05-21 13:41:17

I can certainly provide a patch with #ifdef __i386__ this weekend, but I wasn't sure what to do about the CoLinux issue.

By: Jean-Hugues Robert (jeanhuguesrobert) 2005-05-21 15:00:56

Regarding CoLinux: 1) There must be a way to determine that RTC support is not available. It will probably be available someday. I don't know how to check what features were enabled when kernel was built. 2) When I ran zttest (without the new RTC based patch) I got weird result, like 9800% or similar and sound was "slow motion", so maybe there is an issue with the HZ mentionned earlier. I do have a CoLinux running and I can do testing if desired, I am not a Linux expert however.

By: Tony Mountifield (softins) 2005-05-21 15:24:01

Heh, I've just googled for CoLinux, and found that it's a way of running Linux co-operatively *within Windows*!!! I would think in that scenario all bets are completely off for anything that requires accurate timing! I certainly wouldn't do any more with Asterisk than a bit of personal experimentation in that environment.

So I don't think we need to worry about CoLinux support - I'll go ahead and produce a patch with ifdef as described.

By: Tony Mountifield (softins) 2005-05-21 16:55:36

Revised patch uploaded with conditional compilation.

As before, the same patch applies to both HEAD and STABLE.



By: Michael Jerris (mikej) 2005-05-21 19:51:12

Ok, considering BKW got a crash from this, we really need some serious testing on this to resolve any issues.

By: Matt Riddell (zx81) 2005-05-21 23:29:55

Before:

[root@minerva zaptel]# ./zttest
Opened pseudo zap interface, measuring accuracy...
99.975586% 99.975586% 99.975586% 99.975586% 99.975586% 99.975586% 99.975586%
99.975586% 98.364258% 99.975586% 99.975586% 99.975586% 99.975586% 99.975586% 99.975586%
99.975586% 99.975586% 99.975586% 99.975586% 99.975586% 99.975586% 99.975586% 98.461914%
99.975586% 99.975586%
--- Results after 25 passes ---
Best: 99.975586 -- Worst: 98.364258

After:

[root@minerva zaptel]# ./zttest
Opened pseudo zap interface, measuring accuracy...
99.951172% 99.951172% 99.963379% 99.951172% 99.951172% 99.951172% 99.951172%
99.963379% 99.951172% 99.951172% 99.951172% 99.951172% 99.963379% 99.951172% 99.951172%
99.951172% 99.951172% 99.963379% 99.951172% 99.951172%
--- Results after 20 passes ---
Best: 99.963379 -- Worst: 99.951172

It does appear to fix the problem of increasing delay.  The MeetMe conf was being run with Q option.

This is on a 1.0.7 server.

Before the delay was like 5-7 seconds after a few minutes.  I have been in the conf now for about half an hour and the delay is like 400ms (I am using a server in Italy for the test - it's the only one I have without hardware).



By: Tony Mountifield (softins) 2005-05-22 02:06:01

bkw didn't say this was the cause of his crash. Sounded more like a hardware failure. I'd be utterly amazed if ztdummy was anything to do with it.

ZX81: thanks for the info



By: Mark Spencer (markster) 2005-05-25 13:54:42

Fixed in CVS head!

By: Brian West (bkw918) 2005-05-25 14:17:47

Come on guys.. lets test a bit more... this broke ztdummy on a few of my 2.6 boxes.  Do not make USE_RTC default.  Its broken on some machines.

/b

By: Brian West (bkw918) 2005-05-25 14:21:11

softins  was correct it was hardware failure.. but compile failure is more serious than that.

 CC [M]  /usr/src/zaptel/ztdummy.o
/usr/src/zaptel/ztdummy.c: In function `rtc_interrupt':
/usr/src/zaptel/ztdummy.c:128: error: structure has no member named `ticks'
/usr/src/zaptel/ztdummy.c:129: error: structure has no member named `ticks'
/usr/src/zaptel/ztdummy.c:134: error: structure has no member named `ticks'
/usr/src/zaptel/ztdummy.c: In function `init_module':
/usr/src/zaptel/ztdummy.c:226: error: structure has no member named `ticks'
/usr/src/zaptel/ztdummy.c:227: error: structure has no member named `rtc_task'
/usr/src/zaptel/ztdummy.c:228: error: structure has no member named `rtc_task'
/usr/src/zaptel/ztdummy.c:229: error: structure has no member named `rtc_task'
/usr/src/zaptel/ztdummy.c:236: error: structure has no member named `rtc_task'
/usr/src/zaptel/ztdummy.c:237: error: structure has no member named `rtc_task'
/usr/src/zaptel/ztdummy.c: In function `cleanup_module':
/usr/src/zaptel/ztdummy.c:275: error: structure has no member named `rtc_task'
/usr/src/zaptel/ztdummy.c:276: error: structure has no member named `rtc_task'
make[2]: *** [/usr/src/zaptel/ztdummy.o] Error 1
make[1]: *** [_module_/usr/src/zaptel] Error 2
make[1]: Leaving directory `/usr/src/linux-2.6.9-r1'
make: *** [linux26] Error 2

/b

By: Michael Jerris (mikej) 2005-05-25 14:23:54

is there somthing happy to ifdef this with?

By: Tony Mountifield (softins) 2005-05-25 17:05:10

The patch includes an update for ztdummy.h too - lack of that update in CVS was what caused bkw's unresolved symbols, that's all.

Once ztdummy.h is updated in CVS, USE_RTC should be perfectly safe.



By: Tony Mountifield (softins) 2005-05-27 01:34:24

The patch ztdummy.ifdef.diff.txt contained updates to both ztdummy.c and ztdummy.h.

The new ztdummy.c appeared in CVS two days ago, but the new ztdummy.h has still not appeared. Without it, ztdummy.c will not compile properly when USE_RTC is defined (which is the whole point of the patch).

By: Michael Jerris (mikej) 2005-05-27 08:26:11

Can you create a new patch to go to what it should be from head right now so we can do some additional testing on this one?

By: Tony Mountifield (softins) 2005-05-28 14:24:57

It was just the second part of the existing patch. I'm not sure how the ztdummy.h part of it escaped being applied to CVS.

Anyway, I've now uploaded ztdummy.h.diff.txt containing just the outstanding patch for ztdummy.h

When testing, it will also be necessary to remove the #if 0 from around the #define of USE_RTC in ztdummy.c

By: Michael Jerris (mikej) 2005-05-29 22:19:31

need some more testing on multiple diff 2.6 boxes...

By: Mark Spencer (markster) 2005-05-30 09:34:18

okay

By: Olle Johansson (oej) 2005-07-18 06:31:37

Any new test results? This report has been sleeping for a while...
Still a problem?

/Housekeeping

By: Michael Jerris (mikej) 2005-07-18 08:00:56

okay, mark has already commited the header stuff, I saw somone comit a change to turn this on for newer 2.6 kernels all the time.  I am going to close this one out as it seems it is perfect how it is (as in no one is clamoring to change it)

By: Michael Jerris (mikej) 2005-07-18 08:03:21

oh look, it already is closed... how strange.

By: Digium Subversion (svnbot) 2008-06-07 10:52:19

Repository: dahdi
Revision: 655

U   trunk/ztdummy.c

------------------------------------------------------------------------
r655 | markster | 2008-06-07 10:52:16 -0500 (Sat, 07 Jun 2008) | 2 lines

ztdummy fixes (bug ASTERISK-4199)

------------------------------------------------------------------------

http://svn.digium.com/view/dahdi?view=rev&revision=655

By: Digium Subversion (svnbot) 2008-06-07 10:52:44

Repository: dahdi
Revision: 656

U   trunk/ztdummy.c

------------------------------------------------------------------------
r656 | markster | 2008-06-07 10:52:40 -0500 (Sat, 07 Jun 2008) | 2 lines

Turn off USE_RTC by default (bug ASTERISK-4199)

------------------------------------------------------------------------

http://svn.digium.com/view/dahdi?view=rev&revision=656

By: Digium Subversion (svnbot) 2008-06-07 10:53:05

Repository: dahdi
Revision: 657

U   trunk/ztdummy.h

------------------------------------------------------------------------
r657 | markster | 2008-06-07 10:53:03 -0500 (Sat, 07 Jun 2008) | 2 lines

more ztdummy fixes (bug ASTERISK-4199)

------------------------------------------------------------------------

http://svn.digium.com/view/dahdi?view=rev&revision=657