Summary: | ASTERISK-14953: [patch] Autocreated peers not deleted when unregister explicitly, become zombies | ||||
Reporter: | Kirill Katsnelson (kkm) | Labels: | |||
Date Opened: | 2009-10-07 04:50:12 | Date Closed: | 2011-04-27 15:54:31 | ||
Priority: | Major | Regression? | No | ||
Status: | Closed/Complete | Components: | Channels/chan_sip/Registration | ||
Versions: | Frequency of Occurrence | ||||
Related Issues: |
| ||||
Environment: | Attachments: | ( 0) 016033-chansip-autocreateunreg-1.6.1.12.diff ( 1) 016033-chansip-autocreateunreg-1.6.1.16.diff ( 2) 016033-chansip-autocreateunreg-1.6.2.4.diff ( 3) 016033-chansip-autocreateunreg-trunk.240319.diff ( 4) 016033-tilgman-fixed-refcount.diff ( 5) 20100425__issue16033.diff.txt ( 6) chansip-autocreate-unregfix.diff | |||
Description: | When a SIP peer is autocreated (per `autocreatepeer=yes' in sip.conf), and later explicitly un-registers, the peer record in Asterisk is never deleted. The `sip show peer' command output indicated `Transport : UNKNOWN'. This changes the moment the peer unregisters; before that, the transport is indicated as UDP. The peer stays in the list forever, but registration with the same name becomes impossible. Each REGISTER message is rejected by Asterisk, and the following error is printed: [Oct 7 02:21:31] ERROR[26490]: chan_sip.c:11650 register_verify: 'UDP' is not a valid transport for 'dyn-Q-N5'. we only use 'UNKNOWN'! ending call. [Oct 7 02:21:31] NOTICE[26490]: chan_sip.c:19958 handle_request_register: Registration from '<sip:dyn-Q-N5@192.168.0.91>' failed for '192.168.0.103' - Device not configured to use this transport type If, however, the peer fails to register within its registration expiration time, the record is deleted as expected. ****** STEPS TO REPRODUCE ****** 1. Add a line `autocreatepeer=yes' into the General section of sip.conf and issue `sip reload' CLI command. 2. Register a SIP device with a name not among static peers/users defined in sip.conf. The device used UDP transport. See that `sip show peers' command lists the peer, and `sip show peer <name>' shows Transport: UDP 3. Have the device unregister with the PBX. Expected behavior: - Device disappears from the `sip show peers' command output Observed: 1. Device never goes away from that list. 2. An attempt to register with the same name fails. ****** ADDITIONAL INFORMATION ****** A device in unregistered in three places in chan/chan_sip.c: 1. Upon expiration, in the function expire_register(), near line 10665. There, if the condition `peer->selfdestruct || ast_test_flag(&peer->flags[1], SIP_PAGE2_RTAUTOCLEAR)' is true, the device is unlinked from 2 internal device lists. This, as I understand, is the correct behavior. 2. From CLI command `sip unregister <peer-name>'. This calls the above function. 3. From explicit unregistration by the device, in the function parse_register_contact(), near line 10903. There most of the logic of expire_register() is repeated, except the device is not unlinked. This behavior appears incorrect to me. I am going to work on a fix and sumbit a patch in a day or two, because correct working autoregistration is critical to me. | ||||
Comments: | By: Kirill Katsnelson (kkm) 2009-10-08 00:17:55 I am attaching a patch that fixes the issue. The patch is against vanilla 1.6.1.6 release. Has been abused for a day by application developers, but within a narrowly limited usage pattern. Not tested with “realitme” configuration, for one. By: Kirill Katsnelson (kkm) 2010-01-14 23:53:09.000-0600 Adding a patch against the latest trunk (rev 240319) By: Olle Johansson (oej) 2010-02-17 04:20:39.000-0600 "THanks for contributing. Please stand by." And apologies for no one responding here. By: Olle Johansson (oej) 2010-02-17 04:23:47.000-0600 We need to separate the addition of the "Cause" to the manager event from the bug fixes. And we propably need a bug fix for 1.4 too. By: Kirill Katsnelson (kkm) 2010-02-17 18:47:12.000-0600 The 1.6.1.6 patch does not apply cleanly to neither 1.6.1.12 nor the 1.6.2 branch. I have a patch for 1.6.1.12. Will that help? The patch against 1.6.1 is not readable. I rearranged functions in the 1.8 trunk patch with the only purpose that the diff could be understood. I do not understand the comment about the "Cause" in the manager. I never attempted to fix the problem in 1.4. By: Olle Johansson (oej) 2010-02-18 01:46:50.000-0600 If I understand it right, your patch adds a new header called "Cause" to an existing manager event. We can't do that in release code, only in trunk. By: Kirill Katsnelson (kkm) 2010-02-18 09:06:30.000-0600 No, this is not the case. There are 2 slightly mutated clones of the same code currently; I rearranged them into one function unregister_peer that is called from 2 places. One place (expire_peer, if I remember correctly) has the Cause field: - manager_event(EVENT_FLAG_SYSTEM, "PeerStatus", "ChannelType: SIP\r\nPeer: SIP/%s\r\nPeerStatus: Unregistered\r\nCause: Expired\r\n", peer->name); and the second does not - manager_event(EVENT_FLAG_SYSTEM, "PeerStatus", "ChannelType: SIP\r\nPeer: SIP/%s\r\nPeerStatus: Unregistered\r\n", peer->name); The behavior is preserved in the new function through its second argument `cause': + if (mgr_cause) + manager_event(EVENT_FLAG_SYSTEM, "PeerStatus", "ChannelType: SIP\r\nPeer: SIP/%s\r\nPeerStatus: Unregistered\r\nCause: %s\r\n", peer->name, mgr_cause); + else + manager_event(EVENT_FLAG_SYSTEM, "PeerStatus", "ChannelType: SIP\r\nPeer: SIP/%s\r\nPeerStatus: Unregistered\r\n", peer->name); By: Kirill Katsnelson (kkm) 2010-02-18 09:16:03.000-0600 Unrelated question: is there any way to keep code formatted in tickets? Can't find any information even on mantisbt.org By: Kirill Katsnelson (kkm) 2010-02-19 21:00:10.000-0600 Uploaded patches against 1.6.1.12 (working version here), 1.6.1.16 (latest 1.6.1 - compiles, not tested), 1.6.2.4 (same). By: Kirill Katsnelson (kkm) 2010-03-03 13:31:31.000-0600 We have upgraded to 1.6.2.4, so the patch 016033-chansip-autocreateunreg-1.6.2.4.diff above is confirmed working (in case anybody uses it already). By: Tilghman Lesher (tilghman) 2010-04-25 12:52:59 The patch seems to be a little overcomplicated, based upon your report. I've uploaded a patch based upon your description (for trunk), which is a lot simpler. Does this not work for your needs? By: Kirill Katsnelson (kkm) 2010-04-26 01:08:06 I think it should work (I'll test tomorrow to be completely sure). I just tried to collect the 2 almost identical copies of code into one function with a clear semantics. This is just my habit of eliminating such things early from my codebase, but of course I am not insisting on imposing the same policy onto that of yours. :-) By: Kirill Katsnelson (kkm) 2010-04-30 20:41:42 I am sure you noticed that, but your patch changes event semantics, albeit a little. Hope it wont break regression or make somebody's life harder. Namely, the PeerStatus event is now sent with 'Reason: Expired' even though the registration ended on UAC's request. I tried to preserve the behavior completely in my patch -- but really I do not know how important it is too keep it fully backward-compatible. Anyhow, the patch has been on the production machine for quite a few hours, and I do not see any ill effects. Agents coming and going, and there are no zombified registrations. By: Kirill Katsnelson (kkm) 2010-07-31 05:06:15 *** STOP THE SHOW *** tilghman's patch has a reference count problem (and mine likely had, too). Please do not merge. I have a reliable real-life crash scenario. I am working on a fix (there are other refcounting problems with autopeers). By: Kirill Katsnelson (kkm) 2010-07-31 22:40:58 Refcounting problem fixed in the latest attached patch. Show must go on. NB: This fix has not been tested separately from ASTERISK-1752774. Both are required together to fix refcounting properly. Please add a xref, I do not have the privilege. By: ibercom (ibercom) 2011-04-22 12:58:24 This bug is related to 0018699, although it is about asterisk 1.4 and realtime. The bug description is similar. Additional information -3- is my problem. I will try patch 20100425__issue16033.diff.txt. I don't know about patch 016033-tilgman-fixed-refcount.diff and asterisk 1.4 By: Terry Wilson (twilson) 2011-04-26 16:08:04 This looks good to me. I've tested and it appears to work. I'll commit it. By: Digium Subversion (svnbot) 2011-04-26 17:47:58 Repository: asterisk Revision: 315671 U branches/1.4/channels/chan_sip.c ------------------------------------------------------------------------ r315671 | twilson | 2011-04-26 17:47:58 -0500 (Tue, 26 Apr 2011) | 11 lines Make sure unregistering a peer unlinks it from the peer container Instead of mostly copying the code from expire_register, just use the function that "does the right thing". (closes issue ASTERISK-14953) Reported by: kkm Patches: 016033-tilgman-fixed-refcount.diff uploaded by kkm (license 888) Tested by: kkm, tilghman, twilson ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=315671 By: Digium Subversion (svnbot) 2011-04-26 17:52:26 Repository: asterisk Revision: 315672 _U branches/1.6.2/ U branches/1.6.2/channels/chan_sip.c ------------------------------------------------------------------------ r315672 | twilson | 2011-04-26 17:52:26 -0500 (Tue, 26 Apr 2011) | 18 lines Merged revisions 315671 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r315671 | twilson | 2011-04-26 15:47:56 -0700 (Tue, 26 Apr 2011) | 11 lines Make sure unregistering a peer unlinks it from the peer container Instead of mostly copying the code from expire_register, just use the function that "does the right thing". (closes issue ASTERISK-14953) Reported by: kkm Patches: 016033-tilgman-fixed-refcount.diff uploaded by kkm (license 888) Tested by: kkm, tilghman, twilson ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=315672 By: Digium Subversion (svnbot) 2011-04-26 17:56:20 Repository: asterisk Revision: 315673 _U branches/1.8/ U branches/1.8/channels/chan_sip.c ------------------------------------------------------------------------ r315673 | twilson | 2011-04-26 17:56:20 -0500 (Tue, 26 Apr 2011) | 25 lines Merged revisions 315672 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ................ r315672 | twilson | 2011-04-26 15:52:25 -0700 (Tue, 26 Apr 2011) | 18 lines Merged revisions 315671 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r315671 | twilson | 2011-04-26 15:47:56 -0700 (Tue, 26 Apr 2011) | 11 lines Make sure unregistering a peer unlinks it from the peer container Instead of mostly copying the code from expire_register, just use the function that "does the right thing". (closes issue ASTERISK-14953) Reported by: kkm Patches: 016033-tilgman-fixed-refcount.diff uploaded by kkm (license 888) Tested by: kkm, tilghman, twilson ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=315673 By: Digium Subversion (svnbot) 2011-04-26 18:10:59 Repository: asterisk Revision: 315675 _U trunk/ U trunk/channels/chan_sip.c ------------------------------------------------------------------------ r315675 | twilson | 2011-04-26 18:10:59 -0500 (Tue, 26 Apr 2011) | 32 lines Merged revisions 315673 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.8 ................ r315673 | twilson | 2011-04-26 15:56:19 -0700 (Tue, 26 Apr 2011) | 25 lines Merged revisions 315672 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.6.2 ................ r315672 | twilson | 2011-04-26 15:52:25 -0700 (Tue, 26 Apr 2011) | 18 lines Merged revisions 315671 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r315671 | twilson | 2011-04-26 15:47:56 -0700 (Tue, 26 Apr 2011) | 11 lines Make sure unregistering a peer unlinks it from the peer container Instead of mostly copying the code from expire_register, just use the function that "does the right thing". (closes issue ASTERISK-14953) Reported by: kkm Patches: 016033-tilgman-fixed-refcount.diff uploaded by kkm (license 888) Tested by: kkm, tilghman, twilson ........ ................ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=315675 By: Digium Subversion (svnbot) 2011-04-27 15:54:31 Repository: asterisk Revision: 315989 U branches/1.4/channels/chan_sip.c ------------------------------------------------------------------------ r315989 | seanbright | 2011-04-27 15:54:31 -0500 (Wed, 27 Apr 2011) | 6 lines Partial revert of r315671 which removed a logging statement and not a manager event. Reported by ibercom in #asterisk-bugs. (issue ASTERISK-14953) ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=315989 |