Summary: ASTERISK-07035: Auth Fails on Failover
Reporter: Douglas Garstang (dgarstang)
Labels:
Date Opened: 2006-05-25 11:40:16
Date Closed: 2006-05-25 14:28:46
Versions:
Frequency of
Environment:
Attachments:
( 0) bad-debug.txt
( 1) good-debug.txt
( 2) messages.txt
( 3) sip_trace.txt
Description: A Polycom phone with number 2944093 is registered with host pbx1. Pbx1 is shut down. The phone is configured to use host pbx2 as secondary. The phone is not registered with pbx2, but pbx1 and pbx2 share the same sip.conf.

Place a call from the phone. It sends an INVITE to pbx2. Pbx2 sends back a 407 Proxy Auth message. The phone sends the invite to pbx2 with credentials. Here's where it gets weird. Asterisk sends the 407 Proxy Auth again, and the phone of course sends the invite with credentials again. This occurs several times before the phone appears to give up. It looks like the phone is doing the right thing and Asterisk is behaving badly.

Here's the relevant entry in sip.conf for 2944093 (on both pbx1 and pbx2):

type = friend
context = pbx_one_start
username = 2944093
accountcode = 2944093
qualify = no
canreinvite = no
host = dynamic
callgroup = 1
pickupgroup = 1
dtmfmode = rfc2833
nat = no
mailbox = 2944093@voicemail
callerid = Douglas Garstang <2944093>
secret = foo

When the phone sends the INVITE to pbx2, it has not re-registered with pbx2 yet. Someone tell me if that matters. Asterisk should still process INVITE messages for phones it can authenticate with, even if not registered, correct?

If I set insecure=very and comment the secret= line, which tells asterisk not to authenticate, this problem does not occur at all.

After some period of time, around 5 minutes, it all starts to work again. The phone may or may not have registered with pbx2 by then. I.e., it suddenly starts to work even if the phone is not registered with pbx2, but I have also seen it work when the phone has re-registered with pbx2.

This is a serious problem for Asterisk failover.


Attached files:
sip_trace: SIP debug on pbx2.
Comments:
By: Douglas Garstang (dgarstang) 2006-05-25 11:49:20

Ok, I just realised that what is happening is actually...

1. Phone sends INVITE to pbx2.
2. Asterisk on pbx2 sends back 407 proxy auth required.
3. Phone sends ACK
4. Phone sends second INVITE with credentials.
5. Asterisk on pbx2 logs 'Ignoring this INVITE request' and doesn't do anything else.
6. Phone keeps sending the INVITE with credentials.

So.. maybe it's not a bug. Well, maybe it IS a bug and whoever has that printf needs to say WHY it's ignoring the INVITE at the very least.

By: Joshua C. Colp (jcolp) 2006-05-25 11:50:20

You need to turn on debug output in Asterisk, not just a sip debug. It'll tell you the reason why chan_sip thinks the second INVITE should be ignored. Take a look at logger.conf to enable it on the console.

By: Serge Vecher (serge-v) 2006-05-25 13:47:26

dgarstang: are you using or stable branch? If the latter, please update to the most recent revision (30329 as of right now). Thanks.

By: Douglas Garstang (dgarstang) 2006-05-25 14:03:13

vechers: We're using ... I thought that WAS the latest stable version?

I have attached two more files.
bad-debug: Dump of what occurs when the calls fail.
good-debug: Dump of what occurs when the calls magically start to work.

As stated above, the calls will fail for some period of time after the phone originally tries to use pbx2, and then at some point, say 5-10min later, pbx2 will stop ignoring the INVITE and process calls, with no human intervention.

By: Douglas Garstang (dgarstang) 2006-05-25 14:07:17

I'm not a SIP expert, but if I look at sip_trace.txt, I can see that the Polycom phone is incrementing the value of CSeq from 2 for the first INVITE to 3 for the second INVITE. If you look at bad-debug.txt, you will see:

Ignoring SIP message because of retransmit (INVITE Seqno 3, ours 3)

It seems that Asterisk thinks our counter is 3, when it really should be 2...?
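
The log line above is consistent with a receiver that remembers the highest CSeq number seen on a dialog and treats any non-ACK request carrying that same number as a retransmission. The Python sketch below is a simplified model of that idea, not chan_sip's actual code. The names `Dialog`, `should_ignore` and `replay` are hypothetical, and the ACK value of 3 in the failing sequence is an assumption drawn from the "ours 3" in the log, not something the trace excerpt states directly.

```python
class Dialog:
    """Toy model of per-dialog CSeq tracking (hypothetical, not chan_sip)."""

    def __init__(self):
        self.last_cseq = 0  # highest incoming CSeq number seen so far

    def should_ignore(self, method, cseq):
        """Return True if the request looks stale or like a retransmit."""
        if self.last_cseq and cseq < self.last_cseq:
            return True  # lower than what was already seen: out of order
        if self.last_cseq and cseq == self.last_cseq and method != "ACK":
            return True  # same number on a non-ACK: looks like a retransmit
        self.last_cseq = cseq  # accept and remember the number
        return False


def replay(messages):
    """Feed (method, cseq) pairs through one dialog and record the verdicts."""
    dialog = Dialog()
    return [(method, cseq, dialog.should_ignore(method, cseq))
            for method, cseq in messages]


# Assumed failing sequence: the ACK carries 3 instead of 2.
bad = replay([("INVITE", 2), ("ACK", 3), ("INVITE", 3)])

# Sequence where the ACK reuses the INVITE's number.
good = replay([("INVITE", 2), ("ACK", 2), ("INVITE", 3)])

print(bad[-1])   # the authenticated INVITE is flagged as a retransmit
print(good[-1])  # the authenticated INVITE is accepted
```

In this model the incremented ACK moves the stored counter to 3, so the INVITE with credentials, which also carries 3, is dropped as a duplicate.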

By: Leif Madsen (lmadsen) 2006-05-25 14:08:13

What version of the Polycom firmware are you running?

By: Joshua C. Colp (jcolp) 2006-05-25 14:11:25

This seems to be a bug in the Polycom SIP stack. The ACK MUST contain the same CSeq value as the original request (the INVITE) - and it does not. Upgrade to the latest Polycom firmware, 1.6.6 and see if it is still an issue. If so - we'll need to take it to Polycom.

By: Douglas Garstang (dgarstang) 2006-05-25 14:23:30

joshnet: Are you sure that it's not supposed to increment the CSeq for an ACK? Is that in the RFCs somewhere? I can use this as evidence when I call Polycom.

This must only happen when the phone fails over an INVITE request from one outgoing proxy to another, or every single call would fail!

I just made a call too, after it had magically started to work, and I can see that it did not increment the CSeq for the ACK, so it definitely looks like Polycom incrementing the CSeq for the ACK is breaking stuff.

I'm going to take it to Polycom right now (through our crappy VAR). If Polycom can't make their firmware generally available, what else am I supposed to do?

By: Joshua C. Colp (jcolp) 2006-05-25 14:24:43

RFC 3261, section 17.1.1.3, Construction of the ACK Request:

The CSeq header field in the ACK MUST contain the same
  value for the sequence number as was present in the original request,
  but the method parameter MUST be equal to "ACK".
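
For anyone checking a trace by hand, the rule quoted above can be expressed as a small test. This is a minimal Python sketch that only compares two CSeq header values; the function names `parse_cseq` and `ack_matches_invite` are hypothetical, and the sample values are illustrative rather than copied from the attached trace.

```python
def parse_cseq(value):
    """Split a CSeq header value such as '2 INVITE' into (number, method)."""
    number, method = value.split()
    return int(number), method


def ack_matches_invite(invite_cseq, ack_cseq):
    """Check that the ACK reuses the INVITE's sequence number with method ACK."""
    invite_number, invite_method = parse_cseq(invite_cseq)
    ack_number, ack_method = parse_cseq(ack_cseq)
    return (invite_method == "INVITE"
            and ack_method == "ACK"
            and ack_number == invite_number)


print(ack_matches_invite("2 INVITE", "2 ACK"))  # compliant ACK
print(ack_matches_invite("2 INVITE", "3 ACK"))  # incremented ACK
```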

By: Joshua C. Colp (jcolp) 2006-05-25 14:28:45

Since this has been determined to be a Polycom SIP stack bug, I'm closing this bug report out.