[Home]

Summary:ASTERISK-01236: chan_sip.c : using g729 with * and snom 200 results in dropped call 6 secs after it is answered : error is max_retries_exceeded
Reporter:chrisorme (chrisorme)Labels:
Date Opened:2004-03-17 21:48:31.000-0600Date Closed:2004-09-25 02:51:26
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) g729snombug-300304.txt
( 1) g729snombug-300304.txt
( 2) possjitter4.mp3
( 3) snom_g729_bug
( 4) snom_notregistered
Description:When the snom 200 tries to connect to * the call only lasts 6 seconds after answer call and lots of errors 'max retries exceeded' are reported on the console and then the call is dropped.
this doesn't happen with the sipura unit, or the draytek unit it only happens with the snom and is 100% repeatable.
Switching to gsm codec works fine in this respect.
A sip debug output file is attached.  I have another output file if this one is no good.

****** ADDITIONAL INFORMATION ******

I don't know but maybe * hasn't recognised the way the snom defines that it's using g729.
I know you guys hate g729, and I agree we could use gsm with this phone but hopefully it's not something that's a nightmare to fix but just miscommunication or something ?
Comments:By: Brian West (bkw918) 2004-03-18 00:46:59.000-0600

Got any NAT in the mix?

By: zoa (zoa) 2004-03-18 03:06:52.000-0600

oh, shit, looks like i accidentely closed your previous bug report, i was a little too quick on that one, sorry.

Try disallow=all, allow=g729 in sip.conf, somehow you are using ALAW.
Also please include the version of asterisk you are running + the firmware version of snom.

By: chrisorme (chrisorme) 2004-03-18 06:21:20.000-0600

Yup, NAT is in the mix (with nat=1 in sip.conf) but works fine when not using g729 codec both Upnp NAT and NAT via a linux box masquerade are ok with all other codecs they have in common (except g723, which I don't think is in * yet and I don't want anyway)
The separate bug re: LAN traffic dropping registration when secret= auth=md5 is used which might possibly have a NAT element to it but is in a separate report.

The snom connects fine with Alaw, GSM and G726.  This error is limited to the snom using g729 when forcing as described - that's what creates the error and is what I did.

Ie. at the time of the sip debug report sip.conf did have disallow=all allow=g729 to force it to g729 that's the only way I can get this error.

The reference to ALAW is between the * box (raven) and the upstream provider of lines whom the * box talks to using IAX.  That connection is made in ALAW.

The set up is

ie.  
snom --(SIP - g729/NAT)--> * (raven-212.18.255.13) --(IAX/ALAW)--> dialtone provider (212.something).

Snom Firmware is the latest shipped - ie. snom200-2.04e.
I believe there is one newer version.  I will try this and report back once I find the firmware flash utility.

* CVS is

:Connected to Asterisk CVS-03/12/04-20:33:56 currently running on raven (pid = 8834)
raven*CLI>
actually that's the compile date (won't be far off), is the actual CVS date in a Makefile somewhere ?  

I asked snom and they say they're not aware of any problems with their g729 implementation (ok, they may say that, but I think it was worth asking)

Thanks for looking at this.

Chris

By: chrisorme (chrisorme) 2004-03-18 13:53:45.000-0600

zoa : the newer SNOM Firmware.

http://snom.com/download/share/snom200-2.04g-SIP.bin

exhibits the same bug... - Chris

By: Olle Johansson (oej) 2004-03-20 08:38:31.000-0600

Please add SIP debug files as attached .txt files.

Could you please add output from ASterisk - what happens in asterisk, why does Asterisk issue a BYE

...and why does the SNOM send two INVITE's so fast without waiting for answers?
I might be missing something, but the debug output indicates that the SNOM sends an INVITE, that later gets an OK from asterisk, which the SNOM ACK which means we have a call. After that, Asterisk issues a BYE and the call is going down. The question is why Asterisk sends a BYE - which you need to track by running Asterisk with a lot of -vvvvvv as well as a -d for debug. Capture the output and add it to this bug report, so we can see what happens.
I want to see those failure reports.

This might be the same problem as the other issues with in-dialogue SIP messages tearing calls down, like INFO and NOTIFY. I don't see them in this debug output, but it's not from the Asterisk side. I need to see the Asterisk side of things.

If you have any questions, I'm available on IRC now and then with the "oej" nickname.

By: chrisorme (chrisorme) 2004-03-20 12:59:51.000-0600

Many thanks indeed for taking a look at this.

Currently I've got a friend testing the snom to find any bugs I've missed I get it back in a week.

I've attached two files of logs I took whilst I had it.  These may not be of any help as they weren't taken with -vvvvvv or -d and also I may have had the snom fixed to GSM codec only and * to g729 only but the behaviour is almost identical (except the call connects with 'no common codecs' and 'max retries exceeded' in this case.)  when g729 only is set on * and snom then there is no error 'no common codecs' but the call drops after 6 seconds after the screen fills with 'max retries exceeded'.  A call is definitely set up and starts, but only lasts 6 seconds exactly at which point it seems *'s chan_sip.c is sick of all the 'max retries exceeded' and bails.

I realise this is bugnote may not be a lot of help.  I'll send better debug for all these 3 bugs when I get the equipment back in a week and 2 days, unless someone with an snom200 wants to replicate this and post their debug here, or in the case of 'lan traffic takes down authentication' (when secret is defined) that can be replicated with either sipura or snom and your favourite download site.

Sorry by the way for posting the snom debug output in plain text. Sorry.

I'll post more decent debug for these bugs soon which hopefully should pinpoint the problem more accurately.   - Chris

By: Olle Johansson (oej) 2004-03-20 16:58:48.000-0600

Ok. There is a call with call id 3c267021dcaa-ojydtir1ohvw@213-137-8-59
* SNOM sends an INVITE
* Asterisk sends session progress
* Asterisk sends OK several times
* the SNOM fails to send an ACK to Asterisk (or sends it, but it doesn't reach asterisk
* Asterisk sends a BYE to cancel call. I'm uncertain if this is correct. Maybe it should be a CANCEL since the call isn't correctly setup yet.
* Even so, the SNOM phone doesn't ACK the call, so there's really no call ever. The six seconds is the time that passes while Asterisk waits for an ACK. Guess the SNOM believes it ACKed the call and starts the CALL as if nothing happened.

Either the SNOM is buggy or Asterisk doesn't receive all packets from it. Are you using SNOM beta firmware or stable? SNOM stable is seldom buggy in my point of view.

Except for the BYE/CANCEL discussion, I can't see that Asterisk fails in this call. Please, please, correct me if I'm wrong.

By: chrisorme (chrisorme) 2004-03-21 11:03:37.000-0600

As this debug output may (I don't know) be complicated by me perhaps picking GSM only in the snom and * being on G729 I'll post really good specific debug as soon as I can.   As this problem is isolated to G729 and the snom I wouldn't be surprised if it's a snom problem but whether anything can be done in chan_sip.c to fudge it to work, or even if that is desirable I don't know.  
The snom isn't running with any beta firmware. (it's running with the latest stable firmware and which was shipped with the snom 200 phone)

There is good 2 way audio throughout the 6 second life of the call and it seems to work fine during this 6 seconds (don't know what codec it's using at this point) but then the call dies/ hangs up 6 seconds after answering with all the 'max_retries_exceeded' errors from *.
- Chris

By: chrisorme (chrisorme) 2004-03-21 11:19:13.000-0600

As this debug output may (I don't know) be complicated by me perhaps picking GSM only in the snom and * being on G729 I'll post really good specific debug as soon as I can.   As this problem is isolated to G729 and the snom I wouldn't be surprised if it's a snom problem but whether anything can be done in chan_sip.c to fudge it to work, or even if that is desirable I don't know.  
The snom isn't running with any beta firmware. (it's running with the latest stable firmware and which was shipped with the snom 200 phone)

There is good 2 way audio throughout the 6 second life of the call and it seems to work fine during this 6 seconds (don't know what codec it's using at this point) but then the call dies/ hangs up 6 seconds after answering with all the 'max_retries_exceeded' errors from *.
- Chris

By: Olle Johansson (oej) 2004-03-26 06:20:49.000-0600

Can you please check if the problem still exists with the latest CVS, dev branch.

By: chrisorme (chrisorme) 2004-03-30 09:31:39.000-0600

I tried the latest CVS with the 'make samples' however it will not get past 'parsing sip.conf' and using asterisk -r results in just the ='s under Mark's name.

I will try and find a CVS that will boot to a prompt to test with.
In the meantime I've added 300304.txt with the latest stable branch.

Chris

By: chrisorme (chrisorme) 2004-03-30 12:52:36.000-0600

Ok.. I got the latest CVS to boot thanks to Mark telling me about 2GB logs on our dev box causing it to hang on boot.

Now I get just one 'Maximum retries exceeded complaint' 'non critical' when using the g729

ie..
-----------------
Mar 30 18:28:53 WARNING[1125329600]: chan_sip.c:500 retrans_pkt: Maximum retrie
exceeded on call 3c26aaa0a3c6-3m6g51p5oufr@192-168-2-69 for seqno 103 (Non-cri
ical Request)
Urgent handler
Urgent handler
Urgent handler
Urgent handler
----------------------
And the call stays up...
----------------------
> sip show channels
Peer             User/ANR    Call ID      Seq (Tx/Rx)  Lag      Jitter  Format
192.168.2.69     devphone1   3c26aaffcfe  00101/00003  00000ms  0000ms  G729A
1 active SIP channel(s)
*CLI> Urgent handler
----------------------
There are pages of urgent handlers flying around... are they anything to worry about??  Hopefully not.

So basically I think whatever you did here looks like an effective patch to make g729 work with the snom 200 - firmware is 2.04f (the 2.04g firmware caused lots of problems I found, delayed audio only on incoming calls, dropped calls etc).
Maybe this patch could be considered for the stable version ??
I heard that the using the g729 codec puts a lot of pressure/load on the snom200 CPU but it superficially seems to work ok for me now (apart from the call timer is erratic on the snom due to the cpu load perhaps!).

Thanks a lot Olle and everyone !  - Chris

By: chrisorme (chrisorme) 2004-04-01 07:21:52.000-0600

Using g729 with the snom causes the call progress counter to update erratically.  it does not count evenly in seconds but only updates every 30-60 seconds +.  I've heard this may be due to the load put on the snom when running this codec.  Anyway that's a hardware limitation in the phone and nothing to do with * but I thought I would mention it for completeness. - C

By: Mark Spencer (markster) 2004-04-09 22:28:35

This wasn't exactly a bug in asterisk but at least we're tolerant of things not responding to messages of certain types.

By: chrisorme (chrisorme) 2004-04-11 10:42:51

For Christian at snom to add a note relating to the requested response that asterisk was looking for when speaking g729.

By: Mark Spencer (markster) 2004-04-11 11:21:54

It doesn't still hangup does it?

By: chrisorme (chrisorme) 2004-04-11 11:29:06

No sorry, I just had to hit 'reopen' as Christian (who writes the snom firmware) wanted to post something to this bug for completeness.

I think he wants to know what request asterisk is sending when doing the g729 that the snom doesn't answer so he can maybe consider fixing it in the next firmware.

Snom have been resolving compatibility with other devices that weren't happy with how the snom did g729 (specifically the snom sent at >30ms) so it's something he was doing anyway that perhaps can get totally sorted with asterisk at the same time?
 
Not that it's really needed now you fixed it :-)  
He just couldn't post so I hit reopen as he's tried to add a note several times and hasn't been able to.  Sorry to trouble you.

By: stredicke (stredicke) 2004-04-11 11:37:08

Some versions (until 2.04k) sent 30 ms G729 which caused some trouble. Now the value in ptime is handled correctly.

The slow snom phone is caused by a old bootloader (running at a too low frequency. Our hardware designer changed the clock speed to save money on the snom 200.-)...

You need to have the latest bootloader (try http://www.snom.com/download/snom200-boot1.9.bin). It gives a little bit more juice on the CPU... But please check what bootloader you have first (updating a bootloader is dangerous cause it&ASTERISK-7996;s the only way to tilt the flash). Also see http://www.snom.com/faq/FAQ-04-03-24-sf.pdf on how to update a snom phone.

By: chrisorme (chrisorme) 2004-04-12 04:34:12

Confirmed snom200 works fine with G729, latest CVS, 2.04k and 1.9 bootloader but I seemed to need to modify rtp.c per the suggestion on asterisk-users to remove echo (apx 500ms when calling POTS line) by commenting out a nested if in rtp.c as follows and changing the nested /* */ to //.

That's all folks!!  this can now be laid to rest! Thanks! - C

     }
               /* Re-calculate last TS */
               rtp->lastts = rtp->lastts + ms * 8;
/*              if (!f->delivery.tv_sec && !f->delivery.tv_usec) {
                       // If this isn't an absolute delivery time, Check
if it is close to our prediction,
                          and if so, go with our prediction
                       if (abs(rtp->lastts - pred) < 640)
                               rtp->lastts = pred;
                       else
                               ast_log(LOG_DEBUG, "Difference is %d, ms
is %d\n
", abs(rtp->lastts - pred), ms);
               } */
       } else {

edited on: 04-12-04 03:52

By: Mark Spencer (markster) 2004-04-12 10:54:57

The last change should not be necessary.  Can you please describe what application you are calling, and what symptom you have without it?  I will also need to see an ethereal dump of the timestamps (just attach the raw file to the bug).

By: chrisorme (chrisorme) 2004-04-12 11:23:48

Ok.. The setup is :

Snom200 -GSM/G729/SIP-> chatterbox (*) -G711A/IAX-> provider -T410P-ISDN30/E1/TDM - >PSTN -> analogue line user

I think that's what you mean with 'what application you are calling' ?
If not, sorry..

When speaking to the person remotely (without the bit commented out) I hear my own voice back in the ear piece about 500ms->1000ms later, sort of an echo I guess, or delay, quite a bit quieter but it's very off putting when talking.
I don't think they hear it... but I don't know, it was a shock to hear it and it wasn't someone I could ask.

When I did what I said above (ie the /* */) the echo went away for subsequent calls.
I'm recompiling with the code back in and downloading ethereal (15M!) and will post the dump here.

commenting this code out worked for at least two other people on asterisk-users too who had this when calling POTS lines I think so I was very grateful for their suggestion !  Hope you don't mind me doing this too much to get it to work.

Bit pushed for time today as I've got a marine exam but I'll do my best to get you this dump within an hour, hopefully showing what happened/ is going on.

By: chrisorme (chrisorme) 2004-04-13 04:12:42

I've emailed markster@digium.com with a link to the tcpdumps. - hope that's ok

By: Mark Spencer (markster) 2004-04-15 00:43:13

There are recent IAX changes which might be significantly helpful in this regard.

By: chrisorme (chrisorme) 2004-04-15 07:34:16

Hope you got my tcpdump ok.

Many thanks for last nights patch.  Yup, the echo/delay has gone completely on both ends when using the snom.  Haven't tried any other devices yet.
But thank you very much and congratulations!!
I no longer seem to need to touch rtp.c (phew) !!

Setup is:

snom200-SIP/G729/64k ISDN -> chatterbox(*)/latest CVS -IAX2/Alaw/100Mbps IP-> provider(Dec5-03 CVS I think) -E1/T410/T405-> PSTN

The connection is perfect except the only thing left is some weird 'clicks'  see/listen to the file possjitter4.mp3 attached which may not be related to this bug but instead to ASTERISK-1291331.  I'll annotate that.

As you may have heard this noise a million times before when doing your own tests maybe you could tell me what this is.
This is the noise we also got when forcing ulaw->alaw conversion on IAX2 (say with an iaxy).  
and I guess as I'm doing g729->alaw here so another codec conversion.

Reported jitter on the IAX2 is only reported as 2-3ms.  I know SIP jitter isn't reported.

However I have heard these same clicks before on high bandwidth connections and this noise only seems to come when we force IAX2 to change codec even when we stick with IAX2 and don't use SIP at all.

Perhaps you disagree and think perhaps it is from the network connection or because SIP has no buffering but I think it's something to do with IAX2 and codecs still?

By: chrisorme (chrisorme) 2004-04-15 17:28:17

The clicks seem to be symptomatic of the provider (though I can't track down why (apart from suspect their possible use of 12/5/03 CVS) and are mentioned separately in ASTERISK-1323.  

I think the substance of this bug report namely the g729 errors causing the call to drop after 6 seconds being fixed in CVS plus Christians fixes in the snom firmware (2.04k) plus the timestamp issues and the echos that required an rtp patch until you did something in the CVS with IAX2 have all been resolved now.  Thanks!

A lot done eh! wow! thanks everything for straightening it all out!!

So this is probably 'fixed in CVS' if you think, just leaving possibly something with IAX2 in ASTERISK-1323 ?

By: Mark Spencer (markster) 2004-04-15 17:41:23

Fixed in CVS