Summary:ASTERISK-03278: Polycom Phones Unable To Complete Calls
Reporter:damin (damin)Labels:
Date Opened:2005-01-15 00:42:51.000-0600Date Closed:2005-01-16 18:59:59.000-0600
Versions:Frequency of
Environment:Attachments:( 0) chan_sip_diff.txt
( 1) sip-debug-mwi-subscribe-1-01.txt
( 2) sip-debug-mwi-subscribe-1-15.txt
( 3) sip-debug-non-working.txt
( 4) sip-debug-working.txt
( 5) sip-debug-working-no-mwi-subscribe
( 6) tcpdump-chan_sip-1-1-05
( 7) tcpdump-chan_sip-1-2-05
Description:Polycom Soundpoint IP 300 and 500 phones running both the 1.3.1 and 1.3.4 SIP firmware are unable to complete calls with any cvs-stable chan_sip after 1/1/2005. Calls appear to complete, and the first frame of the audio is heard, but the Polycom then gives a fasy busy.


I have spent most of this evening tracking this down. I first noticed the problem with an update to a client's server on 1/8/2005. All their Polycom phones exhibited the same behaviour and were unable to make any outbound calls. I then rolled chan_sip back to the 12/26/2004 version and it worked fine.

I called Digium and spoke with Matt and asked him if he had any reports of this, and he said no. I promised that I would try to track it down, update my firmware, check all configuration settings and open a bug report if neccessary.

Tonight, I spent a good portion of time actually testing various chan_sip code and found that the problems are not present in the chan_sip code before 1/2/2005. I am able to reproduce this faithfully, and have attempted to discuss this on #asterisk, but have not been able to catch Russell or Mark.

I have attached a diff of the changes between 1/1/2005 and 1/2/2005, and the changes are minor, mostly related to mailbox checks. I haven't dug into the code to find more detailed information, but I will attach tcpdumps from both a successful 1/1/2005 call and an unsuccessful 1/2/2005 call. Hopefully, we can figure out where this issue is before 1.0.4 is dropped with this problem, and avoid a lot of complaints on the list about Polycom phones not working with it.
Comments:By: Mark Spencer (markster) 2005-01-15 00:54:24.000-0600

How about sip debug (from asterisk) since you're reporting a sip error, in accordance with the bug guidelines highlighted in yellow when you went to place your bug report.  Please include both working and non-working examples.

By: damin (damin) 2005-01-15 01:20:41.000-0600

Also, I just noticed something.. The call DOES appear to go through, and after 27 seconds, I hear the voice message on the other side. I've uploaded sip_debug output from both working and not-working calls.

By: damin (damin) 2005-01-15 01:48:05.000-0600

From IRC:

<Corydon76-home> Damin:  remove the mailbox subscription from the Polycom config file. Rely only on sip.conf to tell the Polycom which mailboxes have voicemail. That's how we fixed our Polycom 500s. The problem is that the mailbox subscription turns the line into "Shared" and makes calls impossible, as the line is already in use.

<kram> damin: can you confirm?

<Damin> Corydon-w: Let me take a look at that and see..

<Damin> Well, that seems to work..
<Damin> But, as expected, it breaks One Touch Voicemail..

<kram> so what's the root cause then?

I don't have an answer to this. All I can say is that it worked up until 1/1/2005.

By: Mark Spencer (markster) 2005-01-15 02:04:58.000-0600

How about comparing the sip debug of the SUBSCRIBE activity?

By: damin (damin) 2005-01-15 02:11:15.000-0600

<kram> damin: still here?

<Damin> kram: Yep...
<Damin> kram: Just reading the Polycom manual and waiting for my phone to reboot. I THINK I have a solution...
<Damin> kram: Yep.. it works. Solved the problem...
<Damin> kram: You CANNOT have the msg.mwi.1.subscribe tag set.
<Damin> kram: I had my Voicemailbox number in there..
<flewid> Damin: ah, yeah, cause tyhe phone will 'always be busy'
<Damin> kram: So, from Corydon's suggestion, + the manual + the
page, I've got it all working as it should..

By: damin (damin) 2005-01-15 02:26:36.000-0600

So the upshot of this little fact-finding mission is that Polycom SoundPoint IP phones cannot use MWI subscribe, and must have the msg.mwi.1.subscribe attribute blank. AKA, you cannot have msg.mwi.1.subscribe="100" (or have your voicemailbox) and have it work, if you are using chan_sip code newer than 1/1/2005.

In my original NON WORKING configuration, I had the following:

<mwi msg.mwi.1.subscribe="100" msg.mwi.1.callBackMode="contact" msg.mwi.1.callBack="8500"/>

By dropping the "100" from the subscribe attribute, everything works as expected.

By: damin (damin) 2005-01-15 02:33:47.000-0600

It is rather late and I have to be up in 4 hours, so I'm going to defer this until tommorow. I do think it is a good idea to get a trace of the Subscribe activity and I will see what I can do tommorow. Meanwhile, if someone wants to take a look at the sip-debugs I have already posted and see if they can get some understanding from them, please do.

By: Mark Spencer (markster) 2005-01-15 13:16:31.000-0600

Downgrading this to "minor" since there is a very simple work around and especially since it would appear to be a bug in the polycom (why is it that everyone wants bugs in other people's junk to be fixed within Asterisk??).  I think my request for information has been misinterpreted.  My request is to see the debug of the SUBSCRIBE request before and after the date that things "broke" with respect to the polycom.  Not the INVITE but the "SUBSCRIBE" where it tries to subscribe to MWI.

By: damin (damin) 2005-01-15 14:26:59.000-0600

Just a comment here. I'm not expecting Asterisk to fix bugs in Polycom's code. In fact, I have no idea if/what/where the bug is, or if should even be considered a bug. From what I can understand based on what I've learned over the past 24 hours trying to get to the bottom of the issue, it is just a misconfiguration that causes the behavior that I experienced. So I am happy to take the blame for the problem and raising the issue. However, for 2 weeks prior, I had attempted to diagnose the problem. In order, I did the following:

1. Backed off chan_sip to a previous version.
2. Looked at Mantis to see if there was anything similar.
3. Asked multiple times on IRC if anyone had the same problem, and was told by more than one person that I was nuts.
4. Asked on the developer's audio conferences if anyone might be able to replicate or help me track it down.
5. Called Digium and asked if they had any other complaints about similar behaviour, and was informed that they hadn't and that I should open a bug on the bug tracker.
6. Decided that before I wasted anyone else's time I would upgrade my Polycom's firmware and totally rebuild my Polycom configuration files from the ground up to see if I could replicate the problem.
7. After all of that, decided to take several hours out of my evening to download, rebuild and pinpoint the exact date / code changes where the problem occurred.
8. After narrowing it down to a specific date range, I then asked again on IRC, to which I was again told I was nuts.
9. Finally decided to open a bug and post a note on asterisk-dev mailing list to see if anyone could help.

Only after I went through many hours of diagnostics, changes, updates and testing did I decide to open a Mantis report. Why? Because I value everyone's time, and I do not want to be a pain in the ass. I appreciate the frustration that developers feel when someone cries wolf over a problem that isn't related to the code, and I hate to be considered "one of those users". What more could I have done, and how could I have approached it differently? I'd really like to know, because although I'm relieved that there is a solution to my issue, the fact of the matter is that I'm less inclined to spend as much time as I did trying to be painstakingly thorough before filing another bug report in the future. Quite frankly, I'd rather spend the time in other places, but I feel obligated to contribute back to the community in some meaningful way.

As to your comment about your request being mis-interpreted, I would say that I understood what you were after very clearly, but I had fallen asleep at the keyboard several times while adding my last bug note, so it may not be entirely coherent.

If I have a couple of spare moments tonight, I will attempt to get you the subscribe traces for both the working and non-working code. If this 10 minutes of work will help avoid hundreds of messages on the Asterisk-Users mailing list, it is well worth the time, especially since I see increased activity that appears to be a lead-up to a 1.0.4 release.

By: Brian West (bkw918) 2005-01-15 14:56:41.000-0600

ok lets just have a group hug? how's that sound?!?!?  Then lets post the solution to -users so it can get slurped up by google.



By: damin (damin) 2005-01-15 17:17:48.000-0600

I uploaded SIP debugs that contain the SUBSCRIBE activity;

The file sip-debug-mwi-subscribe-1-01.txt contains an example of a working transaction using chan_sip from 1/1/2005. My reading of this activity is that the following is taking place:

Polycom: SUBSCRIBE sip:100@ SIP/2.0
Asterisk: SIP/2.0 407 Proxy Authentication Required
Polycom: SUBSCRIBE sip:100@ SIP/2.0
Asterisk: SIP/2.0 403 Forbidden

The file sip-debug-mwi-subscribe-1-15.txt contains an example of a NON working transaction using chan_sip from 1/15/2005 which looks like:

Polycom: SUBSCRIBE sip:100@ SIP/2.0
Asterisk: SIP/2.0 407 Proxy Authentication Required
Polycom: SUBSCRIBE sip:100@ SIP/2.0
Asterisk: SIP/2.0 200 OK
Asterisk: SIP/2.0 200 OK (Why the second one?)
Asterisk: NOTIFY sip:100@ SIP/2.0
Polycom: SIP/2.0 200 OK

Hopefully this helps...

By: Mark Spencer (markster) 2005-01-16 01:20:05.000-0600

Latest CVS should avoid the double 200 OK but I doubt this is the problem.  In any case, given that the functionality of the phone breaks when we send "200 OK" instead of "403 Unauthorized" I think it's pretty clear the problem is in the phone.