Summary: ASTERISK-14800: [patch][regression] upgrade from 1.4.18 broke NOTIFY keep-alive response and stale nonce handling to Linksys SPA962
Reporter: Jeff LaCoursiere (lacoursj)
Labels:
Date Opened: 2009-09-08 23:29:18
Date Closed: 2015-03-13 21:07:26
Versions:
Frequency of
Environment:
Attachments:
( 0) chan_sip.c.patch
( 1) full.post-patch.snip.gz
( 2) full.pre-patch.snip.gz
Description: Site has ~300 remote Linksys SPA962 phones registering from behind many different NAT routers.  Common config and firmware (6.1.3a) includes sending keep-alives as SIP NOTIFY messages.  After performing a source upgrade from 1.4.18, the normal 489 INVALID response was no longer being received by the phones, which would try six or seven times, then assume the registration was lost and resend it.  This triggered another (probably Linksys) bug where the re-registration was sent with a stale nonce, and some subset of the phones refused to honor the 401 response and retry with a new nonce, and thus would fall offline completely.  Several hours of debugging uncovered that the 489 INVALID was indeed being sent, but to the internal address of the phone rather than its public NAT address, which is a change in behavior from 1.4.18.
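
To illustrate the addressing problem described above, here is a minimal sketch in C of how a reply destination might be chosen for a peer behind NAT. It is a simplified model, not the code from chan_sip.c: the `endpoint` and `dialog` structures, their field names, and `reply_destination()` are all hypothetical and exist only for this example. The idea is that when the peer is treated as being behind NAT, the reply should go to the address the packet was actually received from, not the private address the phone advertised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a SIP dialog's addressing state.
 * These names are illustrative and are NOT the chan_sip.c structures. */
struct endpoint {
    uint32_t ip;    /* IPv4 address, host byte order */
    uint16_t port;
};

struct dialog {
    bool nat_mode;              /* peer is configured as being behind NAT */
    struct endpoint signalled;  /* address the phone advertised in its SIP headers */
    struct endpoint received;   /* source address the packet actually arrived from */
};

/* Pick where a response should be sent: the observed source address
 * when NAT handling is on, otherwise the address the phone advertised. */
static const struct endpoint *reply_destination(const struct dialog *d)
{
    assert(d != NULL);
    return d->nat_mode ? &d->received : &d->signalled;
}
```

In this model, the symptom in the report corresponds to the `signalled` address being chosen for a peer that should have had `nat_mode` set, so the 489 went to an address that is unreachable from outside the phone's network.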


The attached patch is not intended as a submission for inclusion - in fact I am certain that I am doing things incorrectly!  The two changes solve the problems for us for now, until someone can help us determine the correct solution.  The first change allows a stale nonce with correct auth info to pass authentication immediately, rather than returning the 401.  When this patch was applied, the phones that had previously fallen offline every five to twenty minutes stopped doing so.  We realize that this patch makes us vulnerable to "bad things", and hope to find a better method.  The second change forces __sip_xmit to use p->recv as dst ALWAYS instead of calling sip_real_dst().  I *think* this means that we won't be able to support peers with NAT turned off.  In our case this doesn't matter.  Obviously this patch won't work for everyone.  We hope to better understand why this version of Asterisk set dst to the internal IP of the phone and fix THAT.
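
For comparison, the two stale-nonce behaviors can be sketched as a small decision function in C. This is an illustration under stated assumptions, not the attached patch or the chan_sip.c logic: `auth_action`, `check_digest()`, and the `accept_stale` switch are hypothetical names. It contrasts the workaround described above (accept a stale nonce when the credentials are otherwise correct) with the approach in RFC 2617, where the server re-challenges with a fresh nonce and marks the challenge `stale=true` so the client can retry without treating its credentials as rejected. Whether the affected phones would honor such a challenge is not known from this report.

```c
#include <stdbool.h>

/* Hypothetical outcome of checking a digest-authenticated request. */
enum auth_action {
    AUTH_ACCEPT,           /* process the request */
    AUTH_CHALLENGE,        /* reply 401 with a fresh nonce */
    AUTH_CHALLENGE_STALE   /* reply 401 with a fresh nonce and stale=true */
};

/* digest_ok:     the response hash is correct for the nonce the client sent
 * nonce_current: that nonce is the one the server currently expects
 * accept_stale:  workaround switch, lets a correct digest over an old nonce pass */
static enum auth_action check_digest(bool digest_ok, bool nonce_current,
                                     bool accept_stale)
{
    if (!digest_ok)
        return AUTH_CHALLENGE;        /* wrong credentials: plain challenge */
    if (nonce_current)
        return AUTH_ACCEPT;           /* normal success path */
    /* Correct credentials, old nonce: accept (workaround) or re-challenge. */
    return accept_stale ? AUTH_ACCEPT : AUTH_CHALLENGE_STALE;
}
```

Setting `accept_stale` trades replay protection for compatibility, since a captured request with a valid digest could be resent after its nonce has expired; this is one of the "bad things" the workaround exposes.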
Comments:By: Jeff LaCoursiere (lacoursj) 2009-09-08 23:46:40

I have attached parts of the "full" log with debug set to 5, verbose set to 5, and "sip debug" turned on.  If you pick a peer you should be able to see the behavior both before and after the patch.

By: Jeff LaCoursiere (lacoursj) 2009-09-08 23:48:30

This is related to issue number 15084, and if there is interest I would like to supply a patch to have Asterisk return "200 OK" in response to the NOTIFY keep-alive.

By: Leif Madsen (lmadsen) 2009-09-10 07:49:03

Assigned to dvossel for review as I believe he would be the best candidate currently :)

By: Jeff LaCoursiere (lacoursj) 2010-06-18 18:43:33

A recent upgrade to 1.4.32 shows that this problem still exists.

By: Joshua C. Colp (jcolp) 2015-03-13 21:07:16.416-0500

Per the Asterisk versions page [1], the maintenance (bug fix) support for the Asterisk branch you are using has ended. For continued maintenance support please move to a supported branch of Asterisk. After testing with a supported branch, if you find this problem has not been resolved, please open a new issue against the latest version of that Asterisk branch.


[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions