|Summary:||ASTERISK-03888: H.323 codec negotiation fails|
|Date Opened:||2005-04-07 12:36:34||Date Closed:||2005-04-29 00:08:01|
|Environment:||Attachments:||( 0) ast-h323-codec.diff|
( 1) ast-h323-dead-codec.diff
( 2) h323_codec_negotiate.txt
( 3) h323_codec_negotiate-log.txt
( 4) h323postpatch.txt
( 5) openphone-119.txt
|Description:||With the most recent set of CVS updates, H.323 signalling using a gatekeeper is now fully functional. However, RTP codec negotiation fails, causing the call to terminate immediately after initiation: the phone rings, but the channel is dropped upon connection. Note that this only happens on calls ORIGINATED from Asterisk, not calls TERMINATED on Asterisk.|
My test case is to Cisco Callmanager. If I call to an Asterisk extension from CCM, everything works perfectly. If I call from an IAX softphone through Asterisk to CCM, the phone rings, then the channel is destroyed upon first pickup. I've got both IAX.CONF and H323.CONF set to only allow ulaw (711), and CCM is set to only accept ulaw as well, to simplify the negotiation process.
I've attached a full trace. If you search for '723', you'll find that the call is negotiated (faststarts are passed) with a capability table that only includes ulaw and the DTMF/signalling entries. The channel is initiated, then the calls to set_format are made, and Asterisk immediately complains that there's no path from 723 to slin, indicating that it was attempting to transcode. There may be a clue in the fact that 723 is codec 1 in the codecs table and ulaw was negotiated as position 1 in the capabilities table. I've not been able to fully prove the link, but it seems likely.
|Comments:||By: Paul Cadach (pcadach) 2005-04-07 13:15:57|
Could you configure your Asterisk to record all messages (including verbose) to a log file and post your full log? Console grabs are just hard to read. :(
edited on: 04-07-05 13:17
By: Paul Cadach (pcadach) 2005-04-07 14:14:12
It looks like the payload information you got from OpenH323 isn't valid. Did you build chan_h323 correctly? Verify that you have compatible thread-debugging settings (-DDO_CRASH, -DDEBUG_THREADS) in channels/h323/Makefile and the main Makefile.
By: pbd (pbd) 2005-04-07 14:15:25
Done- see the -log version.
By: Paul Cadach (pcadach) 2005-04-07 14:19:30
Look at this:
... User Input RFC2833 payload type set to [pt=101]
Apr 7 14:08:44 DEBUG chan_h323.c: Setting DTMF payload to 0 on ip$localhost/22421
Payload for DTMF should be 101 not 0... Check your build.
By: Paul Cadach (pcadach) 2005-04-07 14:22:16
Try ASTERISK-3889. It may help a little by saving you from verifying all the compilation parameters by hand.
By: Paul Cadach (pcadach) 2005-04-07 14:24:56
BTW, your traces indicate that faststart isn't used.
By: pbd (pbd) 2005-04-07 15:10:49
I've verified the build and am using a valid version of the openh323 libs (following JerJer's instructions and the makefile). This process works flawlessly on inbound calls to Asterisk via the same CCM, which indicates that my build is valid.
I'm digging in to see if I can throw some debugs to determine when and where the codec selection went bad- my first inclination is that oh323 hands it off correctly, but either the pvt-> read/writeformat structures are incorrect, or the tables are built wrong. It's possible it's not being passed back from oh323 correctly as well- good call there.
I take it that it *does* work for some people?
By: pbd (pbd) 2005-04-07 19:32:35
Ok, I've just gone pretty deep into debugs here- I'm not ready to patch yet, I'm not sure I'll ever be- but to whomever delves into this bug, I've got some findings to report before I do some other work tonight.
The problem appears to be in chan_h323.c's setup_rtp_connection function, specifically with this line:
pvt->nativeformats = rtptype.code;
Following the code, rtptype.code is set by a call to ast_rtp_lookup_pt and, the way setup_rtp_connection uses it, should hold the code of the codec to be used. In my case, I've got everything locked to ulaw, so it should be code 4. I've verified that the problem can be at least partially resolved by hardcoding pvt->nativeformats = 4;. The value should follow from the payload type being 0, which matches ulaw in the static_RTP_PT table.
For whatever reason, however, the stream's current_RTP_PT is returning code 1. Judging by the state of the static_RTP_PT table, this is the value of isAstFormat for all payload types in static_RTP_PT, and presumably it is what gets set in the current_RTP_PT structure (although I've not debugged its loading yet). The value of code in static_RTP_PT is set correctly (in my case, 1<<2, or 4 in decimal). Codec 1 is g723, resulting in the error message about an inability to bridge between g723 and slin.
At surface level, it looks like current_RTP_PT is being loaded incorrectly (it's set to isAstFormat, not code), but I've still got a way to go.
Beware of red herrings.
edited on: 04-07-05 19:36
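For readers following the analysis above, the lookup can be sketched roughly as below. This is a simplified model, not the actual rtp.c code: the struct layout, the FMT_* macros, lookup_pt and both native_format_* helpers are hypothetical names made up for illustration. Only the bit values (g723 = 1<<0, ulaw = 1<<2) and the mapping of payload type 0 to ulaw come from the discussion. It shows how reading the flag field instead of the code field for payload type 0 would yield 1, which Asterisk then treats as g723.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Asterisk format bits discussed above. */
#define FMT_G723 (1 << 0)   /* "codec 1" */
#define FMT_GSM  (1 << 1)
#define FMT_ULAW (1 << 2)   /* 4 in decimal */
#define FMT_ALAW (1 << 3)
#define MAX_PT   128

/* Simplified payload-type entry: a flag plus a format bitmask. */
struct pt_entry {
    int is_ast_format;
    int code;
};

/* Simplified reference table indexed by RTP payload type number. */
const struct pt_entry static_table[MAX_PT] = {
    [0] = {1, FMT_ULAW},
    [3] = {1, FMT_GSM},
    [4] = {1, FMT_G723},
    [8] = {1, FMT_ALAW},
};

/* Return the entry for a payload type, or an empty entry if out of range. */
struct pt_entry lookup_pt(const struct pt_entry *table, int pt)
{
    struct pt_entry none = {0, 0};

    assert(table != NULL);
    if (pt < 0 || pt >= MAX_PT)
        return none;
    return table[pt];
}

/* What the caller wants: the format bitmask for the negotiated payload. */
int native_format_expected(const struct pt_entry *table, int pt)
{
    return lookup_pt(table, pt).code;
}

/* The suspected failure mode: the flag ends up where the code belongs. */
int native_format_suspected(const struct pt_entry *table, int pt)
{
    return lookup_pt(table, pt).is_ast_format;
}
```

With this model, native_format_expected(static_table, 0) gives the ulaw bit, while native_format_suspected(static_table, 0) gives 1, the same value as the g723 bit. Whether the real code takes this path is exactly what is still unproven at this point in the thread.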
By: pbd (pbd) 2005-04-07 20:31:05
Ok, I've done my last bit of tracing and debugging tonight (really).
No question, current_RTP_PT in the channel structure we're working with to setup the RTP stream is mashed. I'm not sure of the cause yet- it doesn't *look* like it's based in anything the openh323 libraries do, since that structure is initialized in Asterisk, and is Asterisk specific. rtp.c has the static_RTP_PT structure, and when the asterisk channel is initialized, it's supposed to be copied into current_RTP_PT- verbatim.
By the time we get to setup_rtp_connection and load the pvt structure with the channel, the table looks like this:
PT : 0, isAstFmt : 0, Code : 1
PT : 3, isAstFmt : 1, Code : 2
PT : 4, isAstFmt : 1, Code : 1
PT : 5, isAstFmt : 1, Code : 32
PT : 6, isAstFmt : 1, Code : 32
PT : 7, isAstFmt : 1, Code : 128
PT : 8, isAstFmt : 1, Code : 8
PT : 10, isAstFmt : 1, Code : 64
PT : 11, isAstFmt : 1, Code : 64
PT : 13, isAstFmt : 0, Code : 2
PT : 16, isAstFmt : 1, Code : 32
PT : 17, isAstFmt : 1, Code : 32
PT : 18, isAstFmt : 1, Code : 256
PT : 19, isAstFmt : 0, Code : 2
I've only checked through the first few, and most seem OK, with the exception of PT 0 (the one I conveniently want to use), which is incorrectly set to 1. This is beginning to smell a little of a buffer overflow, or of something else writing to the oh323_pvt structure. Looking at the way the structures come together, it's a definite possibility, but it's not going to be fun to track down.
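Checking a dump like this by eye is error-prone. One way to narrow it down, sketched here under assumptions, is to compare the per-channel table against the static reference and report the first payload type that differs. The names pt_entry and first_mismatch are hypothetical; the real tables live in rtp.c and may be laid out differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified entry: flag plus format/event code. */
struct pt_entry {
    int is_ast_format;
    int code;
};

/*
 * Compare two payload-type tables entry by entry.
 * Returns the index of the first entry that differs, or -1 if the
 * first `count` entries are identical.
 */
int first_mismatch(const struct pt_entry *reference,
                   const struct pt_entry *current,
                   size_t count)
{
    size_t i;

    assert(reference != NULL && current != NULL);
    for (i = 0; i < count; i++) {
        if (reference[i].is_ast_format != current[i].is_ast_format ||
            reference[i].code != current[i].code)
            return (int)i;
    }
    return -1;
}
```

Called right after the channel's table is initialized and again inside setup_rtp_connection, a helper like this would show whether the table is already wrong after the copy or is modified later.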
By: Paul Cadach (pcadach) 2005-04-08 13:26:53
Try attached patch (ast-h323-mk3.diff) if you had not tried ASTERISK-3889.
By: Paul Cadach (pcadach) 2005-04-08 15:20:19
Patch is updated - mistype found.
By: wgfreewill (wgfreewill) 2005-04-15 03:32:49
I am having the same problem with the mk1, mk2, and mk3 patches. It looks like a codec mismatch; I am all ulaw too.
I can send packet captures showing that it works with g729 enabled, while with G.711u it gives the error that 723 to ulaw is not available.
edited on: 04-15-05 04:22
By: Chih-Wei Huang (cwhuang) 2005-04-18 22:46:32
Did you try my patch to channel.c? See
By: Paul Cadach (pcadach) 2005-04-18 22:55:15
Reminder sent to cwhuang
Hello! Please, update your patch to not modify channel.c...
Also, do you know any version of pwlib/openh323 which works successfully on outgoing calls when H.245 tunnelling is disabled? Could you find me on the IRC?
By: pbd (pbd) 2005-04-18 23:30:30
Yes, I tried that one first, as it was the only thing remotely matching the problem in the bugtracker. No dice. It would appear that the parameters are all correct- but the codec table is being smashed somewhere along the line- specifically having to do with ulaw, as it's entry number one in the table.
By: wgfreewill (wgfreewill) 2005-04-19 00:07:11
Yes I have tried all the patches as well. No love. Hopefully the packet traces I sent you give some clue.
By: Chih-Wei Huang (cwhuang) 2005-04-19 02:57:56
The codec negotiation of Asterisk is not good, as is now being discussed on the -dev list. My patch just provides a workaround. It may help others who suffer from the same problems I encountered. I hope the core developers can provide a better solution soon.
For H.245 tunnelling, it does work. Doesn't it?
I have tested Asterisk-H.323 with my softphone based on Openh323 with the four situations:
faststart on, tunnelling on
faststart on, tunnelling off
faststart off, tunnelling on
faststart off, tunnelling off
All work. I'm using the latest Openh323 (Atlas devel 1).
I seldom use IRC. Sorry!
By: Paul Cadach (pcadach) 2005-04-19 03:41:45
As far as I can see, at least JerJer's recommended versions of pwlib/openh323 deadlock during H.245 negotiation under high load (I get one on a single box running one outgoing callgen323 to an incoming callgen323 with 50 simultaneous calls). Endpoint creation modified to use the following options:
By: Paul Cadach (pcadach) 2005-04-21 10:53:30
Reminder sent to pbd
Do you have the dtmfcodec option in h323.conf? It should be set to 101 or omitted from h323.conf entirely. If you have dtmfcodec=0 you will see this type of problem. Could you confirm?
By: pbd (pbd) 2005-04-21 13:57:18
Absolutely set to 101. Here's my entire h323.conf
; Configuration file of OpenH323 channel driver
; Port to listen to
; Specify alias(es) of this host.
; It may be used multiple times.
; Set the context of H323 calls
By: Paul Cadach (pcadach) 2005-04-21 23:08:03
Try the attached patch. The problem was that call options (DTMF codec, etc.) aren't set up when a gatekeeper is used.
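To illustrate why an unset DTMF payload type could produce the symptoms reported earlier, here is a minimal sketch. Everything in it is an assumption made for illustration: the rtp_session struct, the session_* functions and the NONAUDIO_DTMF value are hypothetical and are not taken from rtp.c or chan_h323.c. The idea is that if the DTMF payload type defaults to 0 instead of 101, registering it overwrites the ulaw entry at payload type 0 with a non-audio entry whose code is 1, which matches the "PT : 0, isAstFmt : 0, Code : 1" line in the dump above.

```c
#include <assert.h>
#include <string.h>

#define FMT_ULAW      (1 << 2)  /* audio format bit for ulaw */
#define NONAUDIO_DTMF (1 << 0)  /* assumed non-audio code for DTMF events */
#define MAX_PT        128

/* Hypothetical simplified entry and per-call table. */
struct pt_entry {
    int is_ast_format;
    int code;
};

struct rtp_session {
    struct pt_entry current[MAX_PT];
};

/* Start from a clean table in which payload type 0 is ulaw. */
void session_init(struct rtp_session *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof(*s));
    s->current[0].is_ast_format = 1;
    s->current[0].code = FMT_ULAW;
}

/* Register the DTMF event payload type; returns 0 on success, -1 if invalid. */
int session_set_dtmf_pt(struct rtp_session *s, int pt)
{
    assert(s != NULL);
    if (pt < 0 || pt >= MAX_PT)
        return -1;
    s->current[pt].is_ast_format = 0;
    s->current[pt].code = NONAUDIO_DTMF;
    return 0;
}

/* Return the raw code for a payload type, ignoring the flag. */
int session_code_for(const struct rtp_session *s, int pt)
{
    assert(s != NULL);
    if (pt < 0 || pt >= MAX_PT)
        return 0;
    return s->current[pt].code;
}
```

With the DTMF payload type at 101, session_code_for(&s, 0) still returns the ulaw bit. With it left at 0, the same call returns 1, and a caller that copies the code into nativeformats without checking the flag would end up with the g723 bit.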
By: Paul Cadach (pcadach) 2005-04-25 10:42:35
Reminder sent to pbd
Could you give an update on the attached patch? Does it help?
By: pbd (pbd) 2005-04-25 13:32:02
Unable to try as yet.
We upgraded our Cisco Callmanager to 4.01, and now the connection will not ring through- Callmanager rejects the call as if the end number is not defined. We're working on the problem- I hope to try your patch by the end of the week.
By: Paul Cadach (pcadach) 2005-04-27 14:37:27
Common "hot fixes" added (the patch is equal to one at ASTERISK-3928 and ASTERISK-3951).
By: pbd (pbd) 2005-04-28 21:27:02
Still playing with getting the channel driver to connect to Callmanager 4.0.2sr(a), using OpenGK. I reported this bug when we were at CCM 3.3.3, we've since upgraded- since the upgrade, Callmanager no longer can find the dialed number from Asterisk. Probably a different bug, but since I've got this thread open, and the patches attached have some relevance to signal control, I'll throw it here as well.
Two new files attached- h323postpatch is a logger trace of the call from Asterisk to CCM4, showing that CCM reports that the number is not registered. openphone-119 however shows a same level trace, same gatekeeper, same number, that works. Slightly different pdus going back and forth, but the critical ones appear identical- net effect is, I can no longer test for codec mismatch, and we're back to square one.
Someone else might have to be the tester on this one for a while- all suggestions accepted.
By: jerjer (jerjer) 2005-04-28 23:35:06
fixed in cvs -head
By: jerjer (jerjer) 2005-04-28 23:35:41
not in stable
By: Paul Cadach (pcadach) 2005-04-29 00:08:00
Reminder sent to pbd
Please open a new ticket for the problem you found with CCM-4.0.2. We hope current CVS-HEAD fixes the issues you initially reported in this ticket.