Summary:ASTERISK-05372: [patch] fix for the kb1 echo canceller
Reporter:Michael Gernoth (mgernoth)Labels:
Date Opened:2005-10-26 11:56:27Date Closed:2008-06-07 11:23:09
Versions:Frequency of
Environment:Attachments:( 0) mg2ec-experimental-coeff-limit.diff.txt
( 1) mg2ec-experimental-coeff-limit-trunk.diff.txt
( 2) mg2ec-experimental-coeff-limit-trunk2.diff.txt
Description:These are the enhancements I made to mec3 in Bug 0005108 ported over to kb1.


It will stop kb1 from under- and overflowing the simulated echo and reset the coefficients in conditions when it detects an under-/overflow. It will also disable echo-cancellation on a continuous signal from the far end (like when being on hold) and reenable it when the signal changes again.
Comments:By: Michael Gernoth (mgernoth) 2005-10-26 12:10:16

mg2ec.diff.txt now adds it as ECHO_CAN_MG2.
As I've just seen I still include kb1ec_const.h. This will be changed in mg2ec.diff-2.txt uploaded shortly

By: xrobau (xrobau) 2005-10-26 15:15:34

I'd strongly suggest the [post 1.2] bit be redacted. This is a valid fix, it doesn't change any functionality (except for the executive decision to move it to a different ec name) and it works towards fixing a problem (ec sucks in *).

I realise that, from Digium's perspective, the problem is solved with the echo cancelling hardware cards, but I have a TE110 with active 10 channels. It's a bit silly to buy a $3,000 card for 10 lines.

[edit: changed 'bug' to 'ec name']

By: Matthew Fredrickson (mattf) 2005-10-26 15:59:45

FWIW, there is not really any "financial motivation" involved in any Post 1.2 selection.  I think maybe somebody just figured that this would be something considered as post 1.2 material.  The only reason that it takes a while to get these echo canceler modifications committed is that it's such a sensitive part of the code.  I usually like to see quite a bit of testing before committing any changes to existing echo cancelers.  However, from what I have seen of user feedback so far, I'd like to see this put in ASAP.  

Mgernoth, Would you say that your patch is at a point that you feel it is committable?

By: kb1_kanobe2 (kb1_kanobe2) 2005-10-26 16:44:11

Kudos for this! It covers exactly the same thing I was trying to wrap my head around last night.

I'm not in a position to test this immediately, however there is a reference 'problem number' that's been floating around and is reasonably bad for most people: 888-737-4787. In the evening (North American time) it's answered by an IVR that does all sorts of ugly, ugly things. Perhaps you (and other testers) could evalute the performance of this patch against this number and report back?


By: xrobau (xrobau) 2005-10-27 02:20:38

I've been having complaints every day of echo until today. Today the complaints stopped. I even rang the most verbose complainers and said 'How's the phone system been today?' 'Uh. Pretty good, I guess'. That's wild praise from them 8)

Strongly suggest adding to CVS, possibly as you did KB1 - put it in commented for a week or so, and get people to try it.

Also, mattf, I wasn't saying it was a financial motivation not to include the EC, I was just saying that Digium have a 'fix', that does work, already. Just that, for me, it's cost prohibitive! 8)

By: Michael Gernoth (mgernoth) 2005-10-27 04:00:48

Great to hear that it's working good for someone besides me :-)

mattf, I think it should be committable as I'm running this code for a few months and had no problems with it.

By: Matthew Fredrickson (mattf) 2005-10-27 11:50:57

Ok.  I switched over our system here to it and haven't heard any problems yet with it.  I don't see anything wrong with committing it right now as an option, and as more people use it perhaps changing it to the default if it proves itself over time.

By: Matthew Fredrickson (mattf) 2005-10-27 12:20:55

Ok, it's committed.  I'll leave the this bug note open for a few days in case we get anymore feedback on this.

By: Philip Walls (malverian) 2005-10-27 12:44:21

Whoever committed this left out mg2ec.h, this is causing compile errors when compiling with MG2 echo canceller

By: Mark Spencer (markster) 2005-10-28 06:24:01

Somebody added it :)

By: radamson (radamson) 2005-10-28 09:50:05

Matt, tried this with a TDM04b card; echo was pretty bad. Recompiled with KB1 with no other changes, and almost no echo (had to listen very close to hear any). cvs-head from last night, one call in system to a C7960. Need to do more testing here.

By: radamson (radamson) 2005-10-28 10:39:41

Additional followup... Using MG2 with a analog TDM04b, echo _always_ occurs. Reducing rxgain and txgain from 5 and 0, to 3 and -2 cleared a small amount of echo but still very very noticable. Switching back to KB1 with gains of 5 and 0 (I'm 7db from the central office) resulted in no noticable echo. Just guessing... I'd suggest there is a real difference in pci/chipset delays between the TDM card and the various T1/E1 cards.
Using the MG2 with the TDM, pstn calls are noticably louder in both directions (compared to KB1 with the exact same gain settings).
Since I'm stuck using a TDM analog card, I'll have to stick with KB1.
(Note: all four pstn lines have been extensively tested, in use for over ten years, and are in excellent condition.)
Another noticable side effect of MG2... when placing a pstn outbound call and right after the called number answers, there is no echo. However, after the _first_ spoken word "from" the called number, lots of echo as heard on a C7960. This was consistent on all pstn test calls (regardless of calls to cell phones or pstn numbers).

By: xrobau (xrobau) 2005-10-28 16:50:23

I haven't tried it on a TDM400 yet - Today I'm doing 'baby stuff' with Jade (newborn/4week old) but I'll have a play with it tomorrow on my TDM422 and see if I can get it to fail catastrophicly like you said. FWIW, I'm using echocancel=256, echocancelwhenbridged=yes and echotraining=800 - this works flawlessly on an E1 with a TE110

By: xrobau (xrobau) 2005-10-29 23:17:25

I couldn't tell the difference between KB1 and MG2 on the TDM - admittedly, I've never had an echo problem once I had got the 'opermode=AUSTRALIA' thing out of the way (Eg, impedeance mismatch).
Do you possibly have the same problem? You loaded the module and forgot your localisations?

By: radamson (radamson) 2005-10-30 06:56:08.000-0600

I'm in the US and the default install values are in use. Based on experience gained from 20+ years as an actual transmission engineer for a large telephone company, many of the issues posted relative to * s/w echo cans relate to the _distance_ asterisk is from the central office. In other words, the greater the transmission loss, the larger the echo problem. (Verified via helping many other * users over the last two years. I'm a measured 7db from the CO with four pstn lines.) With commercial transmission products, one should be setting gains at about 5db (leaving -2db of loss on pstn calls), however that cannot be approached with * and any echo can. For those users closer to the CO (roughly less then about 5db of pstn loss), most any echo can functions reasonably well (KB1 better then others in cvs-head). That _is_ the primary reason why I believe all of the existing echo cans have very narrow operating ranges (since the majority of developers with analog interfaces, including digium, are much closer to their CO's). The higher loss analog loops have simply never been tested by anyone with the knowledge necessary to eval technical results. Is the issue really higher loss or is it related to longer loop delays; don't know.

If you'd like to run tests using this system, I can provide anyone with the access necessary to eval anything they'd like (email radamson@routers.com) to help resolve echo can issues and operating ranges. It would certainly be of high value to other * users. :)

By: Michael Gernoth (mgernoth) 2005-10-30 08:56:26.000-0600

I think the problem is that my patch to kb1 resets the coefficients when it detects that the canceller has trained wrong. This works fine on lines where it retrains instantly (at least on PRIs with gains=0) but may lead to the problems radamson9 mentioned when the canceller doesn't train so quickly. As mec3 solved that by backing-up the coefficients I have implemented the coefficient backup and restore for mg2 now and uploaded mg2ec-backup.diff.txt, which should solve the problem on analog lines and still work fine on PRIs.

By: radamson (radamson) 2005-10-30 09:12:26.000-0600

mgernoth, that sounds logical as I've always had to use echotraining=800 on this system with Alltel pstn lines. It also _seems_ like those that have had problems that I've worked with also had to use 800. Maybe there is a corelation between long ptsn loops, need for gain, and echotraining=800. I'll see if I can apply the patch manually in a little bit. Just updated from cvs-head a few minutes ago.

By: radamson (radamson) 2005-10-30 11:48:07.000-0600

mgernoth, the mg2ec.backup.diff.txt patch did in fact correct the issue with the TDM04b. I can't tell any significant difference between KB1 and MG2 now, except I can increase the gains by another db. Anything more then 1 db and echo does return in the first few seconds, but disappears after that. Would you guess that maybe doing Agressive might impact this, or, possibly playing with the echotraining timing (now =800)?

By: Michael Gernoth (mgernoth) 2005-10-31 15:12:36.000-0600

I have uploaded mg2ec-backup2.diff.txt which now backups the coefficients directly after the initial training, so they are not lost when the canceller restores the backup right after its start. The backup-interval has been lowered too, as it was possible to get bad behaviour with the old interval on PRIs. I was unable to get the canceller to misbehave with the new interval.

xrobau, it would be great if you could test if this patch still works fine on your system.

radamson9, I would not activate AGGRESSIVE, as this really degrades sound-quality in my opinion. Playing with the training time could help to get the canceller to work right from the beginning of the call, but I have never played around with it.

By: Michael Gernoth (mgernoth) 2005-10-31 16:09:49.000-0600

I was just able to get the backup2 version to misbehave, so the currently best working version for me is still the unpatched one in cvs. I'm thinking about other solutions to this problem now, like removing coefficients below a certain level (like 10% of the average level) or doing error-statistics.

By: xrobau (xrobau) 2005-10-31 17:07:06.000-0600

Well, I applied backup2.diff this morning to the E1 site. I'll see how they go.


By: xrobau (xrobau) 2005-11-03 16:43:32.000-0600

I haven't had any complaints, but I'm about to wander off for three weeks of travelling around Australia (work related). They seem to be happy enough, but there is definately still 'problems', and not just complex echo issues.

However, they're not complaining, and it's a definate, distinct, improvement on KB1.

By: Matthew Fredrickson (mattf) 2005-12-27 07:32:43.000-0600

To sort of re-open this discussion, mgernoth, what do you think of applying the latest backup2 version to -HEAD?  From what radamson said, it sounds like without the patch to backup and changes there, lines that use extended echo training would still have issues.

By: Michael Gernoth (mgernoth) 2005-12-27 08:13:56.000-0600

I think backup3 (which I just uploaded) should be applied to head, as it is essentially backup2 with the additional feature of zeroing the coefficients if too many errors are detected (which works really good on my BRI). The percentage can be set in mg2ec_const.h and it can be disabled completely by setting it to 100.

By: Michael Gernoth (mgernoth) 2005-12-27 08:30:36.000-0600

I just found another thing which should greatly improve the canceller on long tails. I'll update this bug and add a new patch if it works out well. I'm now trying to filter out 'irrelevant' coefficients,

By: Matthew Fredrickson (mattf) 2005-12-27 08:38:01.000-0600

Are you referring to cases where the echo exists at some offset from z, so that 0 valued coefficients would be not used?

By: Michael Gernoth (mgernoth) 2005-12-27 08:52:45.000-0600

When looking at the coefficients many have a small value, even when the echo is certainly not at this point.  My current code sets all coefficients to zero which have a value less than 1 percent of the second biggest coefficient. My first (real life test as someone with a problematic echo-path just called me) was really good. No need to reset the canceller as in my backup3 patch. backup4 will follow shortly, the filtering can be disabled by not defining FILTER_PERCENT in mg2ec_const.h

By: Michael Gernoth (mgernoth) 2005-12-27 09:05:44.000-0600

patch uploaded, the filtering works fine for me and prevents rs from overflowing so easily with an echotail of 256. It also works fine with shorter echotails.

By: Michael Gernoth (mgernoth) 2005-12-27 09:09:47.000-0600

I shouldn't use ABS() here. Fixed in backup5 patch. Sorry for the noise...

By: radamson (radamson) 2005-12-27 16:01:10.000-0600

I just applied the backup5 patch to svn trunk (checked out today). The limited number of calls completed via a TDM04b card (analog fxo's) seem to be functioning just fine. Very short amount of echo in the first few seconds but quickly disappears. Currently using echotraining=400. Will be making more test calls in the next couple of days.
Any suggestions on tweeks, quantitative monitoring, etc? (direct email is fine to radamson@routers.com if you want)

By: Michael Gernoth (mgernoth) 2005-12-28 06:02:12.000-0600

I will upload mg2ec-experimental-coeff-limit.diff.txt now, which makes it possible to specify the number of coefficients used by the echo-canceller and so using the canceller on a 256 tap tail and only using the 64 biggest taps. This should not be included in HEAD right now (instead backup5 should).

I am interested in which results people who need long tails (128 and 256) get with the experimental patch and if adjusting USED_COEFFS in mg2ec_const.h helps. I have no idea what will happen on analog lines as I have none to test... If you test this code, make sure that you have a bigger echocancel setting in zapata.conf as the value of USED_COEFFS (64 by default) as the code will have no effect otherwise.

If anyone can come up with a faster way to get the n.th biggest element in an unsorted list as my current insertion-sort based approach, I would be happy to hear it.

By: Matthew Fredrickson (mattf) 2005-12-28 13:26:02.000-0600

ok, I'm going to update MG2 in trunk to backup5.  You should be able to check it out by the time this message goes up.

By: Michael Gernoth (mgernoth) 2005-12-28 14:00:38.000-0600

mattf, thanks for committing backup5.

I'll upload the experimental coeff-limit patch rediffed against the new trunk right now. It currently looks good after handling my calls today, but it should be tested by more people (and with analog lines, too).

By: Andrew Kohlsmith (akohlsmith) 2005-12-28 14:10:33.000-0600

checking out the new coefficients.  Anyone here know much about the ectoolkit?

By: Michael Gernoth (mgernoth) 2005-12-29 09:00:54.000-0600

radamson9: Do you get a faster ec-adaption if you lower MIN_UPDATE_THRESH_I in mg2ec_const.h?
Try to set it to the half of the current value and see if you still get echo in the beginning of the call. It would be interesting if this helps you.

By: Michael Gernoth (mgernoth) 2005-12-31 06:56:33.000-0600

Just uploaded mg2ec-experimental-coeff-limit-trunk2.diff.txt which does not change the algorithm but might be a bit faster than the previous version.

By: Andrew Kohlsmith (akohlsmith) 2006-01-06 09:21:11.000-0600

Whichever MG2 changes that are in current svn trunk (r878) have been *amazing*.  I am VERY happy with that, but as per matt's request I am going to try the trunk2 changes to see how they work.

By: Martin Vit (festr) 2006-01-06 09:34:30.000-0600

is it possible to port this *AMAZING* :) changes into 1.2 branch? Or it is not in plan? (i think that not :)

If i'm looking into the patch, this should work with 1.2 am I right? Which patch should i try on 1.2.1?

By: Matthew Fredrickson (mattf) 2006-01-06 10:46:06.000-0600

The patches are not there.  I just took them down since they have already been applied.

By: Michael Gernoth (mgernoth) 2006-01-06 11:41:42.000-0600

festr: you can still get the patch which is now in trunk from:

By: Michael Gernoth (mgernoth) 2006-01-08 06:39:54.000-0600

akohlsmith: Great that ihe trunk changes are working for you so good :-)
Do you use an analog card or a {P,B}RI card?

By: Andrew Kohlsmith (akohlsmith) 2006-01-08 06:54:43.000-0600

I have both PRI (at work and at the colo) and TDM420 (at home)

All calls go though the colo, whether they go out the colo's PRI or a VOIP provider depends on the destination and who's reachable.

By: Michael Gernoth (mgernoth) 2006-01-24 18:00:40.000-0600

akohlsmith: so the current code in svn works both on analog and digital lines? that is great :-)
did you have a chance of testing the coeff-limit-trunk2 on any of your lines?

If you need this as a patch against zaptel 1.2, it can be found here:

By: kb1_kanobe2 (kb1_kanobe2) 2006-02-01 22:03:56.000-0600

mgernoth deserves _serious_ accolades for keeping up with this project.

All efforts need to be made to see that it gets into 1.2.5 - I have no problems sweeping aside the kb1 code and seeing it replaced with this work on the basis of it being a 'bugfix'.

just my 2c.

By: Matthew Fredrickson (mattf) 2006-02-02 09:52:24.000-0600

Ok, let me get back up to speed with this.  Sorry, I've been working on some other things so I'm a little out of the loop.  Is the latest one (mg2ec-experimental-coeff-limit-trunk2.diff.txt) the one you want me to put in?  Also, has this been tested by anybody, (including yourself mgernoth) with good results?  Thanks, sorry for being so pedantic, but, as you know, this is a sensitive part of the code :-)

By: Michael Gernoth (mgernoth) 2006-02-02 10:50:35.000-0600

Yes, mg2ec-experimental-coeff-limit-trunk2.diff.txt should be good to go in. I'm using this patch since I have uploaded it in december and it works fine for me.

By: Matthew Fredrickson (mattf) 2006-02-02 13:01:30.000-0600

Ok, I'm putting it in.  I'm going to close this bug, so next time there is an update, we can open a new bug for it.  We're trying to make the front for developer discussion on the -dev list, rather than on the bug tracker since it gets more exposure.

By: Matthew Fredrickson (mattf) 2006-02-02 13:03:01.000-0600

Put in head

By: Digium Subversion (svnbot) 2008-06-07 11:05:50

Repository: dahdi
Revision: 799

A   trunk/mg2ec.h
A   trunk/mg2ec_const.h

r799 | mattf | 2008-06-07 11:05:49 -0500 (Sat, 07 Jun 2008) | 2 lines

silly me, forgot to cvs add the new echo can :-)  (Bug ASTERISK-5372)



By: Digium Subversion (svnbot) 2008-06-07 11:16:57

Repository: dahdi
Revision: 878

U   trunk/mg2ec.h
U   trunk/mg2ec_const.h

r878 | mattf | 2008-06-07 11:16:56 -0500 (Sat, 07 Jun 2008) | 2 lines

Updating MG2 to latest set of tweaks and optimizatons.  See bug ASTERISK-5372 for more details on changes.



By: Digium Subversion (svnbot) 2008-06-07 11:23:09

Repository: dahdi
Revision: 933

U   trunk/mg2ec.h
U   trunk/mg2ec_const.h

r933 | mattf | 2008-06-07 11:23:07 -0500 (Sat, 07 Jun 2008) | 2 lines

More echo canceller updates for MG2.  Thanks mgernoth, you rock! (ASTERISK-5372)