Summary: ASTERISK-05889: [patch] 'Reload' clears SIP subscriptions
Reporter: Douglas Garstang (dgarstang)
Labels:
Date Opened: 2005-12-22 09:59:35.000-0600
Date Closed: 2008-01-15 16:10:09.000-0600
Versions:
Frequency of
Environment:
Attachments: ( 0) astcrash-btfull
( 1) bigpatch
( 2) compile_errors_asterisk_v1.2.5.txt
( 3) latest-6047-chan_sip.c.diff
( 4) sipdebug-deactivated.trace
( 5) sip-subs-deactivated
Description: Well, the guidelines said if this is a feature request to put 'request' in the title. I can't seem to find a 'title' field here, so it's going in the Summary. Someone might want to update the guidelines.

Have a feeling this may already be requested or fixed, but a search for 'subscriptions' and/or 'reload' got me no results.

On to the problem. When you issue a 'reload' command on the console, all SIP subscriptions are removed. It would be nice if, upon issuing a 'reload' command on the console, SIP subscriptions were not cleared. This makes it hard to implement BLF in a production environment.


Comments:
By: BJ Weschke (bweschke) 2005-12-30 23:56:49.000-0600

Initial implementation has now been put up on http://svn.digium.com/svn/asterisk/team/bweschke/bug_6047/

There are really a couple of things causing this.

During a 'reload' the hint extension gets "deactivated". chan_sip is catching this message and is doing what it's been coded to do, which is to immediately expire the subscription and tell the phone that the subscription has gone away temporarily and try to resubscribe in 60 seconds. I know with the Polycom SoundPoints, at least, testing has shown this request for it to retry in 60 seconds time isn't working on the phone.

So, what we do instead here is wait 5 seconds after we receive the deactivation message, and then attempt to reconnect that subscription with the core hint extension and send a state resync message. If that all works, then the subscription successfully survived the reload. If we are no longer able to get state on the hint, then we assume that the hint went away with the reload, and we immediately expire the subscription and tell the phone that the subscription has gone away indefinitely.
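The decision made when the 5 second timer fires can be sketched roughly as follows. This is only an illustration of the logic described above, not code from the patch: every name in it (`struct hint`, `struct subscription`, `find_hint`, `recheck_subscription`) is hypothetical, and a plain array stands in for the reloaded dialplan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; none are chan_sip.c identifiers. */

enum sub_outcome { SUB_RESYNCED, SUB_TERMINATED };

struct hint {
    const char *exten;   /* extension that carries the hint */
    int state;           /* current state reported for the hint */
};

struct subscription {
    const char *exten;   /* extension the phone subscribed to */
    int last_state;      /* last state that was sent to the phone */
    bool active;
};

/* Look the hint up again after the reload; NULL means it no longer exists. */
static const struct hint *find_hint(const struct hint *hints, size_t n,
                                    const char *exten)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(hints[i].exten, exten) == 0)
            return &hints[i];
    }
    return NULL;
}

/* Meant to run from a timer a few seconds after the "deactivated" event. */
static enum sub_outcome recheck_subscription(struct subscription *sub,
                                             const struct hint *hints, size_t n)
{
    assert(sub != NULL);

    const struct hint *h = find_hint(hints, n, sub->exten);
    if (h == NULL) {
        /* Hint went away with the reload: expire the subscription for good. */
        sub->active = false;
        return SUB_TERMINATED;
    }

    /* Hint is back: re-attach and resync the state to send to the phone. */
    sub->last_state = h->state;
    sub->active = true;
    return SUB_RESYNCED;
}
```

A caller would send a normal state NOTIFY on `SUB_RESYNCED` and a terminating NOTIFY on `SUB_TERMINATED`.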

This needs testing! I will test with the Polycom SoundPoints and Eyebeam. Would appreciate it if others test with other equipment as well. Thanks.

By: Olle Johansson (oej) 2005-12-31 01:43:59.000-0600

bweschke: Wouldn't it be better to solve this in the hint system instead of fixing it within the actual channel? By not fixing it in the core, we have to implement a fix in everything that will subscribe to statuses... Or?

By: Olle Johansson (oej) 2005-12-31 01:45:23.000-0600

Can we see a SIP debug of this? As I remember, the message we send to the phone is that the subscription is cancelled and the phone is supposed to re-subscribe.

By: BJ Weschke (bweschke) 2005-12-31 07:48:14.000-0600

oej: trace uploaded. As you can see, we are indeed sending exactly the message we're supposed to be sending asking the phone to retry in 60 seconds, but this isn't working with the Polycoms and probably other phones too. I'm going to open a case with Polycom support about this next week (the fact that it doesn't respond to the request correctly and what we should send, if anything, to get it to respond correctly). That also really doesn't address the other phones that may not be working correctly either. I know you've demo'd this with Eyebeam before. Does Eyebeam work correctly in response to this message?
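For readers without the trace: RFC 3265 lets a notifier end a subscription with a `Subscription-State: terminated` header that carries a `reason` and, optionally, a `retry-after` value in seconds. The sketch below formats such a header. The helper name `format_terminated` is hypothetical, and the assumption that the message discussed here uses `reason=probation` with `retry-after=60` is not confirmed by this thread; check the attached trace for the exact header.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a chan_sip.c function.
 * Writes a Subscription-State header for a terminating NOTIFY into buf.
 * Pass retry_after < 0 to leave the retry-after parameter out.
 * Returns what snprintf returns (length that was or would be written).
 */
static int format_terminated(char *buf, size_t len,
                             const char *reason, int retry_after)
{
    assert(buf != NULL && reason != NULL);

    if (retry_after >= 0) {
        return snprintf(buf, len,
                        "Subscription-State: terminated;reason=%s;retry-after=%d",
                        reason, retry_after);
    }
    return snprintf(buf, len, "Subscription-State: terminated;reason=%s", reason);
}
```

Under those assumptions, `format_terminated(buf, sizeof(buf), "probation", 60)` produces the "gone away temporarily, retry in 60 seconds" form, while a reason such as `noresource` without `retry-after` would correspond to "gone away indefinitely".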
I also somewhat agree in theory to your statement of "let's fix the core instead of a reaction to what the core is doing", but I started to look into doing that and it was certainly not as trivial as this patch.

By: BJ Weschke (bweschke) 2005-12-31 10:33:48.000-0600

Was able to test this patch successfully this morning with a SoundPoint 501. Subscriptions are now surviving a "reload".

By: Russell Bryant (russell) 2006-01-18 17:11:55.000-0600

The only issue I see with that patch is that if you reload the dialplan, and then unload chan_sip less than 5 seconds later, you will leak 16 bytes for every hint.
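One way to picture this kind of leak: each pending 5 second recheck owns a small heap allocation that is only freed when its callback runs, so unloading before the timer fires orphans it. A minimal sketch of an unload-time cleanup follows, assuming the pending entries are tracked in a list. The names (`pending_recheck`, `schedule_recheck`, `cancel_all_rechecks`) are hypothetical and this is not the Asterisk scheduler API.

```c
#include <stdlib.h>

/* Hypothetical bookkeeping for rechecks that have been scheduled
 * but have not fired yet. */
struct pending_recheck {
    struct pending_recheck *next;
    void *payload;   /* heap data the callback would free when it fires */
};

static struct pending_recheck *pending_head;

/* Record a pending recheck that owns payload_size bytes. 0 on success. */
static int schedule_recheck(size_t payload_size)
{
    struct pending_recheck *p = calloc(1, sizeof(*p));
    if (p == NULL)
        return -1;

    p->payload = calloc(1, payload_size);
    if (p->payload == NULL) {
        free(p);
        return -1;
    }

    p->next = pending_head;
    pending_head = p;
    return 0;
}

/* Call from the module's unload path: release every entry whose timer
 * never fired. Returns the number of entries that were released. */
static int cancel_all_rechecks(void)
{
    int released = 0;

    while (pending_head != NULL) {
        struct pending_recheck *p = pending_head;
        pending_head = p->next;
        free(p->payload);
        free(p);
        released++;
    }
    return released;
}
```

In a real module the timer itself would also have to be cancelled so the callback cannot run against freed data.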

By: raarts (raarts) 2006-02-17 10:49:27.000-0600

I experienced a crash within this code. bt full is attached, if someone wants the coredump please say so. This was on asterisk 1.2.4, patched with:
- this patch
- some small patches of my own in app_queue
- and the one in ASTERISK-5230.

It crashed right after a reload in cb_hintmanager while sending the current notification, and the SIP session structure seems to be invalid.
Maybe related:

n010297*CLI> sip show subscriptions
Peer             User        Call ID      Extension        Last state     Type
                 xxxxxxx182  cdce2c69-e4  4441             Ringing        xpidf+xml
1 active SIP subscription
Feb 17 18:21:42 ERROR[15791]: cdr_pgsql.c:154 pgsql_log: cdr_pgsql: Failed to insert call detail record into database!
Feb 17 18:21:42 ERROR[15791]: cdr_pgsql.c:155 pgsql_log: cdr_pgsql: Reason: connection not open

Feb 17 18:21:42 ERROR[15791]: cdr_pgsql.c:156 pgsql_log: cdr_pgsql: Connection may have been lost... attempting to reconnect.
 == Extension state: Watcher for hint  deactivated. Notify User xxxxxxx182

By: ianplain (ianplain) 2006-02-17 11:17:19.000-0600

Has this been, or will it be, added to a "tag"? I'm running "SVN-tag-1.2.4-r8963M" and the hints aren't surviving a reload or restart. The phones are Aastra 480i.


By: mustardman (mustardman) 2006-03-04 20:49:09.000-0600

I am having problems getting the patch to compile in Asterisk v1.2.5.  Here are the compile errors.

By: Olle Johansson (oej) 2006-03-05 02:33:23.000-0600

Should I look at the branch or the patch file?

By: mustardman (mustardman) 2006-03-05 12:05:22.000-0600

Compile errors in Asterisk v1.2.5

chan_sip.c: In function `cb_hintmanager':
chan_sip.c:6335: error: too many arguments to function `append_history'
chan_sip.c: In function `cb_extensionstate':
chan_sip.c:6360: warning: implicit declaration of function `ast_calloc'
chan_sip.c:6360: warning: assignment makes pointer from integer without a cast
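The two warnings at line 6360 go together: when a function has no visible declaration, the compiler assumes it returns `int`, so assigning the result to a pointer triggers the second warning. That suggests the patch uses an allocation wrapper the 1.2.5 headers do not provide. A possible local fallback is sketched below, assuming `ast_calloc` is meant to behave like `calloc` and is defined as a macro where it does exist. `struct hint_ctx` and `hint_ctx_new` are hypothetical names used only to exercise the macro. The `append_history` error is a different problem (a signature mismatch) and would need the call itself adapted.

```c
#include <stdlib.h>

/* Assumption: the tree being built does not provide ast_calloc.
 * Fall back to plain calloc only when nothing else has defined it. */
#ifndef ast_calloc
#define ast_calloc(num, len) calloc((num), (len))
#endif

/* Hypothetical structure, used only to demonstrate the fallback. */
struct hint_ctx {
    int id;
    char exten[12];
};

/* Allocate a zero-filled context; returns NULL if allocation fails. */
static struct hint_ctx *hint_ctx_new(int id)
{
    struct hint_ctx *ctx = ast_calloc(1, sizeof(*ctx));

    if (ctx != NULL)
        ctx->id = id;
    return ctx;
}
```

Building against the svn branch the patch was written for avoids the need for any such shim.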

By: mustardman (mustardman) 2006-03-06 20:19:32.000-0600

I tried compiling Asterisk from the svn version posted to get the patch working. BLF seems to be working now for a reload, BUT when I reboot the server it's the same story: no BLF until I reboot the phones.

Oh well, at least half the problem is solved.

By: Olle Johansson (oej) 2006-03-08 05:54:02.000-0600

This branch is not kept up to date with trunk. Please update to svn trunk and enable automerging, so I can look at it. Thanks!

By: mustardman (mustardman) 2006-03-13 10:13:03.000-0600

So is this even on the radar screen?  Seems like a pretty important issue to me.

By: Olle Johansson (oej) 2006-03-13 14:06:01.000-0600

My message only means that bweschke needs to update the branch when he has time to look into it. Nothing else.

By: BJ Weschke (bweschke) 2006-03-23 19:25:29.000-0600

The branch is up to date with /trunk and now also has some code added to the scheduler infrastructure to address the situation described in russell's note from 1-16-06.

By: BJ Weschke (bweschke) 2006-03-27 13:20:15.000-0600

oej: I know you've been in transit, but this branch is now current with /trunk and ready for your review/feedback when you're ready and have a few moments.

By: Olle Johansson (oej) 2006-03-27 13:57:39.000-0600

...and the diff shows me it's not up to date with current trunk... Sorry, BJ. Still trying to look into it, but you might need to kick automerge back into action.

By: Olle Johansson (oej) 2006-03-28 18:34:19.000-0600

We are working on fixing this in Asterisk core. It's better to fix the problem than to work around it.

By: Olle Johansson (oej) 2006-03-28 18:37:39.000-0600

Fix committed to svn trunk and svn 1.2. Please test and confirm. Thanks.

And a special thank you to bweschke for your hard work to try to fix this!

By: Brian Degenhardt (bmdhacks) 2006-03-28 19:04:02.000-0600

Do you have an svn revision number? I don't see anything in the logs wrt a fix for this.

By: Olle Johansson (oej) 2006-03-29 16:15:45.000-0600

Fix committed to svn trunk and 1.2. Thanks for all the work.

By: BJ Weschke (bweschke) 2006-03-29 19:29:14.000-0600

Reopened. The prior commit didn't fix the problem. Troubleshooting directly with oej and others.

By: BJ Weschke (bweschke) 2006-03-31 19:15:03.000-0600

Attached sip-subs-deactivated, which is a trace of SVN-trunk-r16829 showing that the deactivated msg is still being sent from core to chan_sip, causing the subscriptions to go bye-bye on a reload. oej: keeping this assigned to you for now.

By: Kevin P. Fleming (kpfleming) 2006-04-11 16:16:00

This has been corrected in 1.2 and trunk. The entire dialplan was being dumped during 'reload', which was unnecessary.

By: Digium Subversion (svnbot) 2008-01-15 16:10:09.000-0600

Repository: asterisk
Revision: 7704

U   team/bweschke/polycom_acd_functions/channels/chan_sip.c

r7704 | bweschke | 2008-01-15 16:10:08 -0600 (Tue, 15 Jan 2008) | 3 lines

Bringing in changes from bug ASTERISK-5889. (Keeping SIP subscriptions around after a 'reload')