Summary: ASTERISK-02195: [patch] Asterisk sends corrupt data when peer dynamically switches from GSM to ULAW
Reporter: ajz (ajz)
Labels:
Date Opened: 2004-08-06 04:10:42
Date Closed: 2008-01-15 15:06:43.000-0600
Priority: Minor
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) asterisk-corrupts-codec-following-reINVITE
( 1) patch-2004-08-06-asterisk-dynamic-codec-change-fix
Description: I have a SIP VoIP CPE which I am using to call the asterisk EchoTest demo (sip:600@asterisk-ipaddr).  In its 200 OK response, asterisk offers ULAW but says it prefers GSM, and so the session is initially GSM.

Using ethereal I can confirm that the GSM payloads being sent by asterisk match those being sent to asterisk.

When the CPE dynamically changes to ULAW, asterisk copies the change and starts sending the CPE ULAW frames.  However, now the frames sent by asterisk no longer match those being sent to asterisk - not even with an offset.

As a debugging exercise I made the ULAW samples sent by the CPE all 0xAAs.  Asterisk responded by sending a stream of ULAW samples which started at 0xAA and gradually incremented until it got stuck at 0xFD.
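
To put those byte values in amplitude terms, here is a minimal sketch of standard G.711 mu-law decoding (the widely used bit-manipulation form, not code taken from Asterisk). The function name `ulaw_to_linear` is chosen for this illustration only. Under that assumption, 0xAA decodes to a fairly large positive sample while 0xFD decodes to a value close to silence, so the returned stream is decaying toward zero instead of holding the constant level that was sent.

```python
def ulaw_to_linear(u: int) -> int:
    """Decode one G.711 mu-law byte to a signed linear sample."""
    u = ~u & 0xFF                          # mu-law bytes are transmitted bit-inverted
    magnitude = ((u & 0x0F) << 3) + 0x84   # 4-bit mantissa plus the 0x84 bias
    magnitude <<= (u & 0x70) >> 4          # scale by the 3-bit exponent
    # Bit 7 (after inversion) is the sign bit.
    return 0x84 - magnitude if u & 0x80 else magnitude - 0x84


for byte in (0xAA, 0xFD, 0xFF):
    print(f"0x{byte:02X} -> {ulaw_to_linear(byte)}")
# 0xAA -> 5372
# 0xFD -> 16
# 0xFF -> 0
```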


****** ADDITIONAL INFORMATION ******

I tried also issuing asterisk a reINVITE (offering only ULAW) instead of just dynamically switching codecs.  I had exactly the same problem when I did this.

I had a look through the code and I eventually found that sip_rtp_read() was failing to correctly change the write/read formats when it received a frame containing a different codec.  The attached patch seems to fix this.
Comments:

By: Mark Spencer (markster) 2004-08-06 10:18:19

The patch is certainly inappropriate.  The logic is that once we receive a packet with a different codec, we start sending using the codec we received.  Remember that "readformat" and "writeformat" are those that the higher level application is expecting.  By changing the "nativeformats" field and then setting the read/write format, we retain the same format for the user application while changing what low layer codec is being sent and received.
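
The separation described in this comment can be illustrated with a toy model. This is a sketch only: `ToyChannel`, `peer_switched_codec`, and the tuple-based "translation paths" are hypothetical names for illustration and do not reflect Asterisk's actual data structures. It shows the application-facing read/write format staying fixed while the native (wire) format changes underneath it.

```python
class ToyChannel:
    """Toy stand-in for a channel; NOT Asterisk's real channel structure."""

    def __init__(self, nativeformat: str, app_format: str):
        self.nativeformat = nativeformat   # codec actually on the wire
        self.readformat = app_format       # what the application reads
        self.writeformat = app_format      # what the application writes
        self._rebuild_paths()

    def _rebuild_paths(self):
        # A path is a (source, destination) pair, or None for pass-through.
        self.read_path = (
            None if self.nativeformat == self.readformat
            else (self.nativeformat, self.readformat)
        )
        self.write_path = (
            None if self.nativeformat == self.writeformat
            else (self.writeformat, self.nativeformat)
        )

    def peer_switched_codec(self, new_native: str):
        # Only the wire codec changes; the application formats are untouched.
        self.nativeformat = new_native
        self._rebuild_paths()


chan = ToyChannel(nativeformat="gsm", app_format="slin")
chan.peer_switched_codec("ulaw")
print(chan.readformat, chan.read_path, chan.write_path)
```

In this model the application keeps reading and writing "slin" before and after the switch; only the translation endpoints change.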

I am having some trouble understanding your AA/FD experiment.

By: Brian West (bkw918) 2004-08-06 19:53:44

I suspect he has the evil evil allow=all

bkw

By: Mark Spencer (markster) 2004-08-07 18:29:27

Is there an actual audio quality issue here?

By: ajz (ajz) 2004-08-09 04:11:47

Hi,

I don't actually have allow=all.  This is from the [general] section on sip.conf

disallow=all
;allow=g729                    ; Pass-thru only unless g729 license obtained
allow=gsm
allow=ulaw
allow=alaw

Also, there is definitely an audio quality issue.  I am using two VoIP CPEs connected to asterisk via SIP and I am trying to send a fax call between them.  When each one detects a fax call it sends a reINVITE to change the codec from GSM to ULAW in order to allow fax pass-through.  Unfortunately, the corruption that occurs when the codec is changed means that the fax cannot pass through.

The idea behind the 0xAA experiment was to prove that Asterisk was corrupting the ULAW data following a reINVITE (or dynamic change of codec).  What I did was to use a single VoIP CPE to call the EchoTest demo, and then change codec dynamically from GSM to ULAW.  I then got it to send all 0xAAs in the ULAW samples.  I would have expected to see 0xAAs coming back, but instead I saw a stream of samples that started at 0xAA, but then slowly incremented until it got to 0xFD, at which point it became stuck and continued sending 0xFDs indefinitely.

I am not surprised that my patch is inappropriate as I am new to this code and I do not fully understand it.  However, when I apply the patch the problem goes away (and I can successfully make fax calls :-).  I attached the patch to the bug report in the hope it would provide clues as to what was really going wrong.

Alex

By: Brian West (bkw918) 2004-08-09 08:59:44

Remove the GSM allow=gsm and see what happens.

By: ajz (ajz) 2004-08-09 09:08:55

When I remove allow=gsm I don't get the problem.  This isn't surprising because GSM is not negotiated in the first place and the codec is not dynamically switched.

Clearly if I start off with ULAW in the first place I do not get the problem.  However, I need to start off with a high compression codec (GSM) and only switch to a low compression codec (ULAW) *after* I have detected I am handling a fax call.

By: Mark Spencer (markster) 2004-08-09 15:44:45

I'm still not clear on one thing.  There is no audible problem with the audio, right?  It just doesn't pass your fax?

By: ajz (ajz) 2004-08-17 03:45:40

Sorry for the delay.  You are correct.  The problem is not audible, but the corruption is enough to ruin fax.  I suspected that my problems with fax were caused by corruption of the audio data when being bridged by Asterisk.  This is why I generated a test ULAW stream (all 0xAAs) to confirm this.

Alex

By: ajz (ajz) 2004-08-27 09:12:57

Hi,

Has any progress been made with this bug?  My patch was rejected but it does appear to fix the problem (for me - at least).  Can anyone suggest a modification that would make the patch acceptable?

Regards,

Alex

By: Mark Spencer (markster) 2004-08-27 10:02:59

What should happen is that if, during a bridge, the "nativeformats" changes on either side, make_channels_compatible should be called a second time.
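
A rough sketch of that idea, as a toy model rather than Asterisk's bridge code: `Leg`, `make_compatible`, and `bridge` below are hypothetical names, and the real make_channels_compatible is only being imitated in spirit. The loop recomputes the translation steps whenever either side's native format differs from what was last seen.

```python
class Leg:
    """One side of a bridged call (toy model)."""

    def __init__(self, name: str, nativeformat: str):
        self.name = name
        self.nativeformat = nativeformat


def make_compatible(a: Leg, b: Leg):
    """Return (a_to_b, b_to_a) translation steps; None means pass-through."""
    if a.nativeformat == b.nativeformat:
        return (None, None)
    return ((a.nativeformat, b.nativeformat), (b.nativeformat, a.nativeformat))


def bridge(a: Leg, b: Leg, events):
    """Replay codec-change events, recomputing when a native format changes."""
    paths = make_compatible(a, b)
    seen = (a.nativeformat, b.nativeformat)
    history = [paths]
    for leg, new_format in events:
        leg.nativeformat = new_format
        current = (a.nativeformat, b.nativeformat)
        if current != seen:            # native format changed mid-bridge
            paths = make_compatible(a, b)
            seen = current
        history.append(paths)
    return history


sip = Leg("sip-leg", "gsm")      # leg names are illustrative only
iax = Leg("iax-leg", "ulaw")
print(bridge(sip, iax, [(sip, "ulaw")]))
```

With one leg starting on GSM and the other on ULAW, the first entry requires translation in both directions; after the first leg switches to ULAW the recomputed entry is pass-through.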

By: Mark Spencer (markster) 2004-09-03 21:04:39

Upgrade to latest CVS and let me know if that fixes your corruption problem in a bridged configuration.  thanks.

By: Mark Spencer (markster) 2004-09-07 13:00:41

Presumed fixed, if not then the bug owner can reopen it.

By: ajz (ajz) 2004-09-15 10:05:03

Hi,

Sorry for the delay - I've been on holiday.

I've just checked out the latest source and retested and the bug is still there.  I will attach an ethereal dump showing the problem shortly.

Alex

By: ajz (ajz) 2004-09-15 10:15:10

I've just attached a file called asterisk-corrupts-codec-following-reINVITE which can be opened using ethereal.

Prior to starting the capture I called the EchoTest demo (sip:600@10.0.0.229) from my VoIP CPE (sip:8020@10.0.0.159).  I started the capture after the demo started.  From the capture you can see that the RTP traffic (ports 4000 and 4002) is GSM and that asterisk is correctly echoing back exactly the same data it receives.

The VoIP CPE then sends a reINVITE and both it and asterisk change codecs to G.711.  The G.711 samples sent by the VoIP CPE are all 0xAA (this is deliberate).  Now asterisk is no longer echoing back exactly what it receives.  Instead it sends back samples which slowly increment from 0xAA (the value of the received samples) to 0xFD.
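
The comparison described here can be automated once the RTP payloads have been extracted from a capture. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the `echoed` bytes are synthetic data shaped like the reported behaviour (a ramp from 0xAA that sticks at 0xFD), not bytes taken from the attached capture.

```python
def first_divergence(sent: bytes, echoed: bytes):
    """Index of the first byte where the echo differs, or None if none does."""
    for i, (s, e) in enumerate(zip(sent, echoed)):
        if s != e:
            return i
    return None


def stuck_value(payload: bytes, min_run: int = 8):
    """Return the final byte if the payload ends in min_run identical bytes."""
    if len(payload) < min_run:
        return None
    tail = payload[-min_run:]
    return tail[0] if len(set(tail)) == 1 else None


sent = bytes([0xAA]) * 160                                   # constant test tone
echoed = bytes(min(0xAA + i, 0xFD) for i in range(160))      # synthetic ramp

print(first_divergence(sent, echoed))   # index where the echo stops matching
print(stuck_value(echoed))              # value the echo settles on, if any
```

A faithful echo would give `None` from `first_divergence`; any integer result marks the point at which the returned stream stopped matching.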

If we can get this to work then we will have a way of performing fax upspeed.  The SIP CPE simply needs to send a reINVITE which only offers G.711 when it detects that the call is a fax call.
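
As an illustration of what "only offers G.711" means at the SDP level, here is a minimal sketch that builds an audio offer from a codec list. The payload type numbers are the static assignments from RFC 3551; the function name, the origin line values, and the choice of port 4000 for the CPE are assumptions made for the example, not details confirmed by the capture.

```python
# Static RTP payload types from RFC 3551 (all 8000 Hz clock rate).
STATIC_PAYLOAD_TYPES = {"PCMU": 0, "GSM": 3, "PCMA": 8}


def build_audio_sdp(ip: str, port: int, codecs, session_version: int = 1) -> str:
    """Build a minimal SDP body offering the given codecs in preference order."""
    payload_types = [str(STATIC_PAYLOAD_TYPES[c]) for c in codecs]
    lines = [
        "v=0",
        f"o=- 0 {session_version} IN IP4 {ip}",   # placeholder origin values
        "s=-",
        f"c=IN IP4 {ip}",
        "t=0 0",
        f"m=audio {port} RTP/AVP " + " ".join(payload_types),
    ]
    lines += [f"a=rtpmap:{STATIC_PAYLOAD_TYPES[c]} {c}/8000" for c in codecs]
    return "\r\n".join(lines) + "\r\n"


# Initial offer prefers GSM; the fax re-offer lists PCMU (ULAW) alone.
initial_offer = build_audio_sdp("10.0.0.159", 4000, ["GSM", "PCMU"])
fax_reoffer = build_audio_sdp("10.0.0.159", 4000, ["PCMU"], session_version=2)
print(fax_reoffer)
```

Because the re-offer lists a single payload type, the answerer has no option but to continue in ULAW or reject the change.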

TIA,

Alex

By: Mark Spencer (markster) 2004-09-15 10:19:18

Don't use echo app.  It's only fixed for the case of bridging.

By: Mark Spencer (markster) 2004-09-15 10:19:59

(To clarify, I mean bridging when Asterisk is carrying the media.)

By: ajz (ajz) 2004-09-15 10:32:46

Brilliant!

I've just retested it using two SIP CPEs with asterisk doing the bridging between them.  They both start off using GSM and both renegotiate to use G711.  And in this scenario the problem is FIXED :-)  I don't actually need it to work for the EchoTest demo - I was only using this because I thought it would more clearly demonstrate the problem.

Thank you very much for all your help.

By: Mark Spencer (markster) 2004-09-16 23:47:59

did you try with dial?

By: ajz (ajz) 2004-09-17 04:10:18

I've just tried using

asterisk*CLI> dial <number>

But I get

By: ajz (ajz) 2004-09-17 04:12:27

cont...

but I get:
  chan_oss.c:273 sound_thread: Failed to write sound

when the callee picks up the phone.  I'm not sure how I would know whether or not the corruption is happening in this scenario even if dial did work on my system.  Surely asterisk should only echo back exact copies of what it receives if it is either bridging or running EchoTest.

Maybe I have misunderstood you....

By: Mark Spencer (markster) 2004-09-17 08:33:37

I mean can you try the same test going between a client and another channel (e.g. Zap/foo) rather than the echo application!

By: ajz (ajz) 2004-09-17 08:45:44

Yes, I've repeated the test.  SIP/8020 calls Zap/2, and then renegotiates from GSM to G711.  This appears to work.  However, since the RTP data received by asterisk is converted into analogue I cannot determine whether the corruption has occurred.  (Also, there's no point attempting a fax call since the echo cancellation on the Zap/2 channel will screw it up.)

By: Mark Spencer (markster) 2004-09-17 09:16:23

Okay, you can use SIP to IAX to another box with echo test, so long as the echo test runs on the box that is talking IAX and g.711 only.  That should give you a good answer.

Also, you can disable echo cancellation on Zap/2 but that still will not guarantee that there is no extraneous conversion going on.

By: ajz (ajz) 2004-09-17 10:35:01

Okay, I've tested this now and it seemed to work.

   softphone --SIP--> Asterisk1 --IAX2--> Asterisk2

In my test Asterisk1 bridged SIP/GSM to IAX2/ULAW.  The softphone then sent a reINVITE and started sending ULAW with samples all 0xAA.  I confirmed that following the reINVITE Asterisk1 started sending 0xAAs to Asterisk2 - i.e. Asterisk1 was not corrupting the data.

I guess this means it works!  Thanks.

By: Digium Subversion (svnbot) 2008-01-15 15:06:43.000-0600

Repository: asterisk
Revision: 3722

U   trunk/channel.c

------------------------------------------------------------------------
r3722 | markster | 2008-01-15 15:06:43 -0600 (Tue, 15 Jan 2008) | 2 lines

If nativeformats changes, recalculate formats (bug ASTERISK-2195)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=3722