Summary:ASTERISK-12761: [patch] Hold logic broken?
Reporter:Stefan Gofferje (sgofferj)Labels:
Date Opened:2008-09-22 04:53:43Date Closed:2008-11-20 17:16:08.000-0600
Versions:Frequency of
Environment:Attachments:( 0) 13531.patch
( 1) nokia_calls_xlite.txt
( 2) xlite_calls_nokia.txt
Description:I have the following symptoms:

Call X-lite / Nokia E51
X-lite press hold: Nokia DOES hear MOH
Nokia press hold: X-lite does NOT hear MOH

Call X-lite / SCCP phone
MOH works as supposed

Call SCCP phone / Nokia E51
SCCP press hold: Nokia DOES hear MOH
Nokia press hold: X-lite does NOT hear MOH

In addition, the BLF on the SCCP phones does NOT show the hinted SIP
extension on hold.

With 1.4 latest, everything worked as supposed.
As this problem appears also between SIP clients, it is NOT a
chan_sccp-related issue.
Comments:By: Stefan Gofferje (sgofferj) 2008-09-22 04:54:19

I did further testing. The music problem only occurs if the E51
originates the call. If it is being called, the other side (sccp or
x-lite) DOES hear MOH.

I included two logs (debug 10, verbose 10, SIP debug on).

- X-Lite client calls Nokia E51
- Nokia picks ups
- Nokia presses hold
- MOH played to X-Lite
- Nokia presses Unhold
- Hangup

- Nokia E51 calls X-Lite client
- X-Lite picks ups
- Nokia presses hold
- NO MOH played to X-Lite
- Nokia presses Unhold
- Hangup

However, the hold is still not displayed in the BLF which worked with 1.4.

By: Mark Michelson (mmichelson) 2008-10-09 16:56:16

I took a look at the traces, and I must say that the nokia_calls_xlite.txt has some strange things in it, and they don't appear to be Asterisk's fault from what I can tell.

As a point of reference, let me first point to the re-INVITE used by the Nokia in the xlite_calls_nokia.txt file. It is about half-way down the page, and has a Cseq of "1 INVITE" . This is a fairly normal re-INVITE to place the far end on hold, in that there is an a=sendonly line in the SDP.

Now, for some reason, if you look at nokia_calls_xlite.txt, you'll see that at timestamp 13:36:43, there are *3* re-INVITE transactions that take place. The first re-INVITE Asterisk receives (Cseq 786 INVITE) is once again a normal re-INVITE to place the far end on hold. Asterisk responds with a 200 OK and reads the ACK sent by the Nokia. Then, for some reason, the Nokia decides to initiate two more INVITE transactions. In each case, the INVITE is properly responded to with a 200 OK. In the second re-INVITE that the Nokia sends (Cseq 787 INVITE), it has a bizarre SDP with an a=sendonly and an a=sendrecv line. Then, the third re-INVITE that the Nokia sends (Cseq 788 INVITE) has an SDP with just an a=sendrecv line. It's at this point that the Nokia stops initiating any new re-INVITE transactions.

While I think it is bizarre that this is happening, I have noticed something a bit odd that Asterisk is doing which may be contributing. If you look at xlite_calls_nokia.txt, in response to the re-INVITE that the Nokia sends to Asterisk to indicate hold, Asterisk responds with an SDP with an a=recvonly line. However, if you look at nokia_calls_xlite.txt, in response to the re-INVITE that the Nokia sends to Asterisk to indicate hold, Asterisk responds with an SDP with an a=sendrecv line. This may actually be what is causing the Nokia to get "confused" and send new re-INVITEs to Asterisk. I'll have a closer look at this.

By: Mark Michelson (mmichelson) 2008-10-09 17:28:55

Okay, I think I see what the problem is. Asterisk checks the version string in the o line of the SDP to see if it has increased when a new SDP arrives. Asterisk converts this into an integer and compares this to the previous version number to determine if there has been a change.

Now, if you look at the version number used in the reINVITE sent to Asterisk in the nokia_calls_xlite.txt file, you'll see that the new version number is 63390173789561876 (the old version number is one less). This is much larger than what a 32-bit integer can hold. I ran a test program where I used sscanf to convert these strings to integers and found that even though the strings are different, sscanf converted both to -151378924. They're both the same! Asterisk therefore doesn't recognize that the version has changed in the SDP and does ignores the contents.

Considering that RFC 2327 does not give any sort of limit on the value of version numbers and even recommends using NTP timestamps for the version (which are 64-bit fixed-point numbers), it appears that Asterisk should either use a 64-bit integer datatype for comparing versions or should use string comparisons when comparing the versions. I will write a patch which accomplishes one of these methods.

By: Mark Michelson (mmichelson) 2008-10-09 17:39:22

Actually, it appears as though I made a mistake in my test program and so even though those numbers are way larger than a 32-bit int can hold, they actually do evaluate to different numbers. It appears as though I am incorrect on this one as to why Asterisk sent a=sendrecv in response to an SDP with a=sendonly. Still, I suppose that behavior is system-dependent and so I'll still write a patch like I said I would.

By: Mark Michelson (mmichelson) 2008-10-09 17:44:50

I have uploaded 13531.patch. Please test to see if this fixes the broken hold logic. Thanks!

By: Stefan Gofferje (sgofferj) 2008-10-17 04:21:21

I will test this weekend.

By: Leif Madsen (lmadsen) 2008-10-27 11:48:49

sgofferj: any luck on getting that patch tested?

By: Stefan Gofferje (sgofferj) 2008-10-27 11:55:19

Sry, I had no time. Have family emergency. Test finally possible next weekend. Put it to my calendar already.

By: Stefan Gofferje (sgofferj) 2008-11-03 02:34:32.000-0600

I did some testing and it seems to work now. I will test during the whole week and report if further problems occur.

By: Mark Michelson (mmichelson) 2008-11-03 10:40:37.000-0600

Great. Thanks very much!

By: Mark Michelson (mmichelson) 2008-11-20 16:32:20.000-0600

Based on a lack of complaint from the reporter, and based on the relative triviality of this change,I am inclined to go ahead and commit it...so I will :)

By: Mark Michelson (mmichelson) 2008-11-20 17:16:08.000-0600

This was committed in revision 158230 of trunk, 158231 of 1.6.0, and 158232 of 1.6.1.