ASTERISK-09058: Garbled audio using speex

Summary: ASTERISK-09058: Garbled audio using speex

Reporter: Joel (jbebel) Labels:

Date Opened: 2007-03-20 17:15:55 Date Closed: 2007-06-21 17:10:27

Priority: Major Regression? No

Status: Closed/Complete Components: Core/CodecInterface

Versions: Frequency of Occurrence:

Related Issues:

Environment: Attachments: ( 0) asterisk.conf.new
( 1) rtp_debug_ekiga
( 2) rtp_debug.txt
( 3) verbosedebug.txt

Description: I see this referenced in bug 0009027, but it has been closed without resolution, and I didn't see a way to reopen it. I'm happy to provide whatever you want to help resolve this, but I'm not satisfied with the notion that asterisk has nothing to do with this problem. Audio is garbled using asterisk 1.4 with either ekiga or x-lite, where asterisk 1.2 seems to work ok with both ekiga and x-lite. The variable causing brokenness seems to be the asterisk version. Are there any clients that DO work with speex and asterisk 1.4?

Comments: By: Clod Patry (junky) 2007-03-20 21:17:35

Please add debug files, like requested.
By: Joshua C. Colp (jcolp) 2007-03-20 21:47:24

What version of speex are you using? Once I have that I'll install and lab this up.
By: Joel (jbebel) 2007-03-21 02:48:05

I did have speex 1.0.5, but I upgraded to 1.2beta1 to see if it fixed the problem and didn't see any changes. I currently have 1.2beta1 installed.

I'd be happy to provide debug files however you like. What specifically do you need? Just debug, or the full => notice,warning,error,debug,verbose. That won't get sip or rtp debug info will it? How would you like that?

For what it's worth, this seems fairly easy to reproduce. I'm using ekiga 2.0.3, X-Lite release 1105d build stamp 99999, and asterisk 1.4.1. If you find a speex softphone that works, I'm very curious. I've noticed a few peculiarities. Under ekiga, if I select 16 kHz speex, I don't hear anything at all. If I select 8 kHz speex, I get garble. Using x-lite, I also get garble, but it sounds very different than the ekiga garble, so perhaps it's not behaving exactly the same. In any case, I feel relatively confident that if you try, you can reproduce this problem.

Thanks for looking into this,
Joel
By: Jure Petrovic (jure) 2007-04-05 03:14:28

I am also experiencing the same problem with
asterisk 1.4.2. I am using ekiga 2.1.0, and
this used to work with asterisk 1.2.10.

My calling partner tells me he understands
me normally, while I get only gibberish.

It seems like some timing problem?

I am also willing to help here. Please
tell me, what kind of debug report would you
like to have?

Oh, speex 1.2beta1 as recommended by www.speex.org
Codec works ok on file samples with speexenc, speexdec

Regards,
Jure

By: Serge Vecher (serge-v) 2007-04-05 08:45:01

Please produce the log as per following:
1) Prepare test environment (reduce the amount of unrelated traffic on the server);
2) Make sure your logger.conf has the following line:
console => notice,warning,error,debug
3) restart Asterisk with the following command:
'asterisk -Tvvvvvdddddngc | tee /tmp/verbosedebug.txt'
4) Enable SIP transaction logging with the following CLI commands (1.4/trunk commands in parenthesis):
set debug 4 (core set debug 4)
set verbose 4 (core set verbose 4)
sip debug (sip set debug)
5) Reproduce the problem
6) Trim startup information and attach verbosedebug.txt to the issue.
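
A minimal sketch of step 6, assuming the trimmed log should start at the first line containing the "core set debug 4" command. The function name `trim_startup` and the choice of marker are illustrative assumptions, not part of Asterisk:

```python
from typing import Iterable, List


def trim_startup(lines: Iterable[str], marker: str = "core set debug 4") -> List[str]:
    """Return the lines from the first one containing `marker` onward."""
    all_lines = list(lines)  # materialize so the input can be scanned safely
    for index, line in enumerate(all_lines):
        if marker in line:
            return all_lines[index:]
    # Marker never seen: keep the whole log rather than return nothing.
    return all_lines
```

To use it, read /tmp/verbosedebug.txt with `readlines()`, pass the result to `trim_startup`, and write the returned lines back out before attaching the file.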
By: Jure Petrovic (jure) 2007-04-05 11:29:13

As requested, I have uploaded verbosedebug.txt
I trimmed the file, so it starts from "core set debug 4" command. Problem was reproduced when dialing to "55". In my system this is a test extension to play music on hold. Speex is set to vbr=>false, vad=>false and abr=8000.

If you need any additional information, please let me know.
By: dea (dea) 2007-04-05 13:24:19

Do either of these options in asterisk.conf make a difference?

[options]
internal_timing = yes
transcode_via_sln = no

I just installed 1.2beta1-1 and tested calls with acceptable results,
and both of these options are new to 1.4 so they are not likely set.
It would be interesting to see what impact if any they have on the
problem.
By: Jure Petrovic (jure) 2007-04-05 16:37:16

Just added two mentioned lines to asterisk.conf
I have also uploaded my whole asterisk.conf file

There is no difference. Sound is still garbled.

Oh, Ekiga reports this in console:

warning: Invalid mode encountered: corrupted stream?
warning: Invalid mode encountered: corrupted stream?
warning: Invalid wideband mode encountered. Corrupted stream?
warning: Invalid wideband mode encountered. Corrupted stream?

By: Jure Petrovic (jure) 2007-04-06 10:47:41

I did some tests with 2 more clients and discovered something
that might be important:

When using IAX protocol, speex codec works fine.
When using SIP protocol, speex audio is garbled.

Maybe that might help. Does this have to do with some
settings in sip? NAT possibly?
By: dea (dea) 2007-04-06 12:24:47

No, the issue is going to be located in the packetization
support for RTP. To establish smoothers, asterisk needs
to know how the codec is encoded:

Frame length
Bytes per frame
Samples per frame

VBR codecs prove to be tricky, and I thought we had picked
sane defaults that would work for SpeeX, but apparently not.

Current values for SpeeX:
Min framing: 10 (I think this should be 20)
Max framing: 60
Default framing: 20
Increment: 10 (I also now think this should be 20)
Bytes per frame: 10 ( Wrong! but I am not sure what is right)

My only SpeeX capable endpoint is an old Firefly and it seems to
send 39-byte frames, which does not jibe with any of the SpeeX
documents I can find.

If one of the developers with better knowledge of SpeeX can chime
in, we can either pick better defaults, or add the framework to
detect the bytes per frame and setup the smoother properly.
By: Jure Petrovic (jure) 2007-04-06 13:11:23

As far as I know, there are only two versions of speex used in
telephony. The most common is narrowband, as it is intended for
low-bandwidth connections.

narrowband version of speex has 8000 Hz sampling rate and
8000 bits per second. This one has following specifics:

Frame length = 20 millisecs
Bytes per frame = 20 bytes
Samples per frame = 160
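
The figures above follow directly from the sampling rate, the bit rate and the frame length. A small worked check, assuming constant bit rate; the helper names are illustrative only:

```python
def samples_per_frame(sample_rate_hz: int, frame_ms: int) -> int:
    """Number of audio samples covered by one frame."""
    return sample_rate_hz * frame_ms // 1000


def bytes_per_frame(bitrate_bps: int, frame_ms: int) -> int:
    """Encoded size of one frame, rounded up to whole bytes."""
    bits = bitrate_bps * frame_ms // 1000
    return (bits + 7) // 8


print(samples_per_frame(8000, 20))  # 160 samples
print(bytes_per_frame(8000, 20))    # 20 bytes
```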

It is weird, though, that it used to work with asterisk
1.2.10 without problems. Has the rtp packetization method
changed since then?

In this case it has constant bitrate.

Is there a possibility to change these values? So it would
work with this particular configuration of speex?

By: Jason Parker (jparker) 2007-04-06 13:26:06

Well, 1.2 didn't support rtp packetization, so that has definitely changed. Do the suggested changes to the defaults make any difference?
By: dea (dea) 2007-04-06 13:51:10

So far no. Looking over the 1.2 code, smoothers were used,
albeit with fixed sizes, EXCEPT for SpeeX.

Bypassing the creation of a smoother in rtp.c for SpeeX
results in a 21 byte RTP packet, but still no audio in
Firefly.

So I would guess/suggest that the packetization feature
is not the sole culprit here. Commit 42477 looks
moderately fishy and does not exist in 1.2.

A small section of an 'rtp debug' from a SIP endpoint using
SpeeX would also be handy. FireFly is sending 39 byte packets
for SpeeX, but I am not sure I trust that data point.

The RFCs for SpeeX indicate 160 samples encoded into 15 bytes,
but that doesn't add up to what I am seeing, and also produces
munged garbage.

I'm not actually getting garbled audio, but one-way audio with
the SpeeX endpoint sending, but not receiving.

By: Jure Petrovic (jure) 2007-04-06 14:11:04

Yes, the party I called also told me, that he could hear me. So, I guess this
is one way audio too...

I am attaching a rtp debug from ekiga client 2.1.0 using speex. Ekiga is set
to use 8kbps, 8kHz speex codec. Rtp debug is obtained by packet sniffing
with Wireshark. It is saved in wireshark native format. Wireshark nicely
displays payload size -- 20 bytes. Oh, ekiga client is on machine with LAN ip.
The other RTP party is asterisk server.

If you prefer some other formats, please let me know. I thought that this is the best way to analyze payload, as asterisk doesn't print actual payload data.

Attaching: rtp_debug_ekiga

By: dea (dea) 2007-04-06 14:15:13

Could you get a small section of 'rtp debug' from the Asterisk
command line? I do not have wireshark, and the level of detail
I need is available right from the CLI
By: Jure Petrovic (jure) 2007-04-06 14:56:34

Oh, why didn't you say so :-)

Attached a short text file. All settings remained the same.
Is this OK?

By: dea (dea) 2007-04-06 16:13:20

Yes thanks.

So my Firefly is doing something odd, but it doesn't
change the results. I backed out 42477 by hand and
it did not change the results either.

I am about out of ideas. I was (and still am) willing
to accept that the packetization feature introduced the
problem, but I am currently bypassing the smoother with
the same results.
By: Jure Petrovic (jure) 2007-04-06 16:40:02

Hmm...I am not into asterisk devel, but it's never too late, right?

If I understand you correctly there was rtp packetization introduced from version 1.4.* on. As I can imagine, this is some mechanism that fragments rtp payload to network packets?

I checked that sound again. In my case asterisk was doing translation from speex to g729. One way audio - when ekiga is encoding speex - works great. Asterisk translates it without problems.

The problem appears only when encoding, so this packetization stuff seems logical. Maybe frames are split in the wrong places? Just guessing....

Smoother? Is this what you call packetization, or is this some sort of digital signal processing (DSP)?
By: dea (dea) 2007-04-06 17:41:45

The idea behind a smoother is to set up buffers to be able to
take audio at one rate (30ms) and 'smoothly' feed it to a client
that expects a different rate (20ms).

Fixed bit rate codecs make this easy. VBR, not so much.
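
A toy model of that idea (not Asterisk's actual smoother code), assuming a buffer that re-chunks bytes into fixed-size output frames. The class and function names are hypothetical. With a fixed-rate codec at one byte per sample and 8 kHz, 30ms is 240 bytes and 20ms is 160 bytes, so re-chunking is harmless. With variable-size frames, the same byte-count slicing cuts across frame boundaries:

```python
from typing import Iterable, List, Optional


class ByteSmoother:
    """Buffer incoming bytes and hand them out in fixed-size frames."""

    def __init__(self, out_size: int) -> None:
        self.out_size = out_size
        self.buf = bytearray()

    def feed(self, data: bytes) -> None:
        self.buf.extend(data)

    def read(self) -> Optional[bytes]:
        """Return one output frame, or None if too little is buffered."""
        if len(self.buf) < self.out_size:
            return None
        frame = bytes(self.buf[:self.out_size])
        del self.buf[:self.out_size]
        return frame


def rechunk(frames: Iterable[bytes], out_size: int) -> List[bytes]:
    """Push frames through a ByteSmoother and collect the output frames."""
    smoother = ByteSmoother(out_size)
    out: List[bytes] = []
    for frame in frames:
        smoother.feed(frame)
        chunk = smoother.read()
        while chunk is not None:
            out.append(chunk)
            chunk = smoother.read()
    return out


# Fixed rate: two 30ms frames (240 bytes) become three 20ms frames (160 bytes).
cbr = rechunk([bytes(240), bytes(240)], 160)
print([len(c) for c in cbr])

# Variable size: a 20-byte slice now mixes the end of frame B with frame C.
vbr = rechunk([b"A" * 20, b"B" * 10, b"C" * 15], 20)
print(vbr)
```

The second case shows why a byte-count smoother cannot be applied blindly to a codec whose frames vary in size: the output frames no longer begin where encoded frames begin.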

In this case it appears that the rtp packets are not padded to
end on an octet boundary (SpeeX RFC lists that as a must).

Not sure why, digging now.
By: dea (dea) 2007-04-06 21:37:58

OK, I still think there are a couple issues here, but I have
found that an older version of FireFly is happy with the
smoothers bypassed.
In main/rtp.c at line 2677 replace
if (!rtp->smoother) {
with
if (!rtp->smoother && (rtp->lasttxformat != AST_FORMAT_SPEEX)) {

I also tested setting the format framer to use 21 bytes for
20ms of audio and it appeared to work OK as well.

I think the VBR features of SpeeX are going to require some
thought on how the smoothers are created and leveraged.

Edit* More testing has shown that if vbr mode is enabled in
/etc/asterisk/codecs.conf, the configurable framer/smoother option
fails, while the smoother bypass works.

(Disclaimer on file, but this is more for brainstorming than commit)

Discussion topics:
Eliminate packetization option for SpeeX?
Recommend vbr be turned off in codecs.conf?
How to make the smoother setup more intelligent for
SpeeX, with a possible need to reconfigure it on the
fly for vbr.

By: Jure Petrovic (jure) 2007-04-07 04:04:04

Padded on octet boundary? You mean byte aligned?
But a byte is the smallest unit that can be sent...
Can you have a network (rtp) packet that is not byte aligned?
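
For illustration: the packet itself is always a whole number of bytes, but the encoder's output is a bit stream whose length need not be a multiple of 8, so the last byte has to be filled. A minimal sketch, assuming the padding rule as I understand it from the Speex RTP payload format (a 0 bit followed by 1 bits up to the octet boundary) and an assumed 119-bit frame chosen only because it rounds up to the 15 bytes mentioned earlier in this thread. The function name is hypothetical:

```python
def pad_to_octet(bits: str) -> str:
    """Pad a string of '0'/'1' characters to a multiple of 8 bits.

    Padding is a single '0' followed by '1's (assumed rule, see lead-in).
    """
    remainder = len(bits) % 8
    if remainder == 0:
        return bits  # already ends on an octet boundary
    pad_len = 8 - remainder
    return bits + "0" + "1" * (pad_len - 1)


frame_bits = "1" * 119           # illustrative frame length, an assumption
padded = pad_to_octet(frame_bits)
print(len(padded) // 8)          # 15 bytes
```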

I just recompiled asterisk with your extra
condition. It works perfectly :-)

I was wondering about that too. Do you think it is reasonable
to have a variable bitrate codec in telephony? IMHO, there
should be only a few "telephony" configurations of speex
supported by asterisk.

By: dea (dea) 2007-04-07 11:37:53

The octet alignment was a false trail.

VBR support would be a great concept to support, allowing
for the best combination of quality and bandwidth usage.
Trying to mesh that with support for configurable framing
makes things interesting.

I am still concerned that the SpeeX codec in Asterisk has
an issue, but at least we have a work-around until it can
be tracked down and the logic can be added to allow the
smoothers to interact with VBR codecs.

SpeeX provides an interesting opportunity. It can
provide decent quality at low bandwidths and supports
wideband and ultra-wideband sampling. A lack of patents
would make it ideal for the 'preferred' codec in OS IPTel.

A lack of endpoints supporting it is the major strike against it.
By: Jure Petrovic (jure) 2007-04-07 12:14:56

I agree with you on that. Speex sure is an interesting opportunity and a good audio codec. However, as you mentioned, there is not much interest in it.

I see that most of the softphones (opensource ones) support it and therefore I am trying to use it. But, like I said, those clients mostly use just one or two versions: narrowband (8kbps, 8kHz) or wideband. And this is done in CBR. I have never seen a hardphone (a real IP phone) that supports this codec.

Is there somebody actively working on audio codecs? especially speex?
By: Steve Davies . (stevedavies) 2007-05-14 13:52:05

Hi,

I just spent a day assisting a client who needed Asterisk 1.4 to accept and generate RTP packets containing 3 speex frames.

They wanted to use VBR and all the rest to get the least possible bandwidth use. (Which we also really love about Speex!)

They got the "garbled audio" mentioned here, plus all the errors about wideband corruption.

We found a couple of issues which we roughly worked around and eventually got Asterisk working properly. Some of them were already coming out on this thread.

I'll try to construct a proper "prime time" patch but here are the issues:

1) the RTP smoother _really_ mangles Speex. DEA's main/rtp.c change sorts that out by disabling the smoother for Speex. Don't see how the smoother idea can be made to work for bit-orientated Speex without a lot of pain involved in trying to find the frame boundaries and inserting the "01111" end markers etc. Easier to do what I did in (3) below.

2) In main/frame.c, functions speex_samples and speex_get_wb_sz_at don't work right when there is more than 1 frame in the packet. A check that there actually _is_ a wideband frame in there seems to be missing. For now I just hacked speex_get_wb_sz_at to always return 0 as a workaround. I'll look into it in more detail soon.

3) With the smoother removed, the original requirement to generate 3 frames per RTP packet still remains. I met that requirement by just hacking codecs/codec_speex.c in lintospeex_frameout to only make an output packet when there are 3 * tmp->framesize samples waiting for processing. That's not a production quality solution though.
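
The approach in (3) can be sketched as follows. This is a rough model of the idea only, not the actual change to lintospeex_frameout: the class name is hypothetical, and `encode_packet` stands in for whatever encodes a list of frames into one payload.

```python
from typing import Callable, List, Sequence


class MultiFrameAccumulator:
    """Hold samples until a whole multi-frame packet can be produced."""

    def __init__(
        self,
        frame_size: int,
        frames_per_packet: int,
        encode_packet: Callable[[List[List[int]]], bytes],
    ) -> None:
        self.frame_size = frame_size
        self.frames_per_packet = frames_per_packet
        self.encode_packet = encode_packet  # hypothetical encoder hook
        self.pending: List[int] = []

    def feed(self, samples: Sequence[int]) -> List[bytes]:
        """Add samples; return any packets that are now complete."""
        self.pending.extend(samples)
        need = self.frame_size * self.frames_per_packet
        packets: List[bytes] = []
        while len(self.pending) >= need:
            block = self.pending[:need]
            del self.pending[:need]
            # Split the block into per-frame sample lists for the encoder.
            frames = [
                block[i:i + self.frame_size]
                for i in range(0, need, self.frame_size)
            ]
            packets.append(self.encode_packet(frames))
        return packets
```

With frame_size=160 and frames_per_packet=3, nothing is emitted until 480 samples are waiting, and any leftover samples stay buffered for the next packet.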

That was enough to successfully generate 3-frame Speex RTP packets which the client accepted fine.
Asterisk was already able to digest multi-frame packets coming in once item (2) above was "fixed".

Our interoperability test was against the customer's inhouse developed SIP phone for mobile devices.

Regards,
Steve
By: Russell Bryant (russell) 2007-06-21 17:10:24

File just fixed problems with speex and RTP just a couple days ago, so this should be fixed now. However, please reopen this bug if you still have issues. Thanks!