Summary: ASTERISK-08577: No more registry after a timeout
Reporter: tootai (tootai)
Labels:
Date Opened: 2007-01-14 10:30:40.000-0600
Date Closed: 2007-01-30 10:28:01.000-0600
Priority: Major
Regression? No
Status: Closed/Complete
Components: Channels/chan_sip/Registration
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) verbosedebug.tx
( 1) verbosedebug.txt
Description: When Asterisk registers with several peers and one of the registrations times out, all the entries defined before it in sip.conf disappear from the "sip show registry" list and are no longer registered with their peers. Example:

In sip.conf:

register => user1:pass1@peer1/exten1
register => user2:pass2@peer2/exten2
register => user3:pass3@peer3/exten3
register => user4:pass4@peer4/exten4

If the registration to peer3 times out, peer4 is still shown in "sip show registry", but peer1 and peer2 disappear and are no longer registered with their peers.

Running "module reload chan_sip" either makes it work again or, sometimes, crashes Asterisk.
Comments:
By: tootai (tootai) 2007-01-15 02:50:52.000-0600

What I had this morning:

keewi*CLI> sip show registry
Host                            Username       Refresh State                Reg.Time
(null):5060                     (null)               0 Unregistered

Then I ran "module reload chan_sip" and all my peers were back.

By: tootai (tootai) 2007-01-15 08:08:19.000-0600

I ran "sip show registry", which showed only the last 3 peers from the sip.conf register list (the one before them had timed out), and got a core file. I ran gdb and got this (Asterisk is compiled with DONT_OPTIMIZE):

keewi:/tmp# gdb core.5009
GNU gdb 6.3-debian
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-linux"..."/tmp/core.5009": not in executable format: File format not recognized

(gdb) thread apply all bt full
(gdb)

Not really helpful :-(

By: Serge Vecher (serge-v) 2007-01-15 11:03:53.000-0600

As per bug guidelines, you need to attach a SIP debug trace illustrating the problem. Please do the following:
1) Prepare test environment (reduce the amount of unrelated traffic on the server);
2) Make sure your logger.conf has the following line:
  console => notice,warning,error,debug
3) Restart Asterisk with the following command:
  'asterisk -Tvvvvvdddddngc | tee /tmp/verbosedebug.txt'
4) Enable SIP transaction logging with the following CLI commands:
  set debug 4
  set verbose 4
  sip debug
5) Trim startup information and attach verbosedebug.txt to the issue.

By: tootai (tootai) 2007-01-16 11:41:11.000-0600

The uploaded file is a debug trace. Asterisk stopped after I ran "module reload chan_sip" because one peer had disappeared from the list. See the (null)5060 line. After restarting Asterisk, the peer (sipgate.de) was back. There was no core file.

By: tootai (tootai) 2007-01-18 03:45:10.000-0600

In the last uploaded file you can clearly see the problem. The peer TNET-ipac makes register attempt 3 at line 5737 and then disappears. At line 7534 there is a "sip show registry" command that shows only 3 peers, even though two lines above it Asterisk has registered with sipgate.de and sipphone.com, which are not listed. That gedameurope is not shown makes sense, as its registration failed at the first attempt. The register section of sip.conf, in the order defined, is:

-sipgate
-gedameurope
-sipphone
-tnet-ipac
-freephonie
-tootaiaudio
-wengo

By: tootai (tootai) 2007-01-20 12:46:09.000-0600

I updated to revision 51351: same problem. I also noticed that once the problem has happened and I have run "sip show registry", the CLI no longer reacts: I can enter whatever command I want (e.g. help) and I am returned to the CLI immediately without any output. The command "stop now" does nothing. Restarting with the rc script works and gets things going again!

By: Olle Johansson (oej) 2007-01-23 03:51:58.000-0600

I agree that something is fishy here. I got a registry entry disappearing today. Hunting...

By: Olle Johansson (oej) 2007-01-23 04:30:53.000-0600

This is definitely confirmed. Will test in 1.4 too.

By: Olle Johansson (oej) 2007-01-23 06:53:28.000-0600

Can't repeat in 1.4. Good!

By: Olle Johansson (oej) 2007-01-23 07:25:20.000-0600

There's something really fishy here. If we have to retransmit a REGISTER request and get no reply, we corrupt the registry by deleting not only the dialog, but also the registry entry.
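The failure mode described in this comment can be sketched in C as follows. This is only an illustration: the types and function names (`struct dialog`, `struct reg_entry`, `dialog_destroy_buggy`, `dialog_destroy_fixed`, `registry_count`) are hypothetical and are not the chan_sip structures or the code changed in the fix. The sketch also models only the loss of the timed-out entry itself, not the loss of the entries defined before it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; these are NOT the chan_sip structures. */
struct reg_entry;

struct dialog {
    struct reg_entry *owner;  /* registry entry this REGISTER dialog serves */
    int in_use;               /* 1 while the dialog is alive */
};

struct reg_entry {
    const char *host;
    struct dialog *call;      /* in-flight REGISTER dialog, or NULL */
    int listed;               /* 1 while the entry is in the registry list */
};

/* Buggy teardown: when the REGISTER gets no reply, the dialog takes its
 * registry entry down with it, so the entry can never be retried. */
void dialog_destroy_buggy(struct dialog *d)
{
    assert(d != NULL);
    if (d->owner != NULL) {
        d->owner->listed = 0;   /* the entry vanishes from the registry */
        d->owner->call = NULL;
    }
    d->owner = NULL;
    d->in_use = 0;
}

/* Intended teardown: drop only the dialog and detach it from the entry,
 * so a later retry can open a fresh dialog for the same entry. */
void dialog_destroy_fixed(struct dialog *d)
{
    assert(d != NULL);
    if (d->owner != NULL) {
        d->owner->call = NULL;
    }
    d->owner = NULL;
    d->in_use = 0;
}

/* Number of entries a "show registry"-style listing would still print. */
size_t registry_count(const struct reg_entry *regs, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (regs[i].listed) {
            count++;
        }
    }
    return count;
}
```

With two listed entries, tearing down one dialog through the buggy path leaves one entry in the listing, while the fixed path leaves both.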

By: Olle Johansson (oej) 2007-01-23 09:36:41.000-0600

Solution committed to rev #51659, please test. Works for me.

By: tootai (tootai) 2007-01-23 11:42:45.000-0600

I just made a fresh SVN install; Asterisk is showing:

keewi*CLI> core show version
Asterisk SVN-trunk-r51353 built by root @ keewi on a i686 running Linux on 2007-01-23 16:46:44 UTC

Is rev #51659 included in this one? Anyway, I will follow it. Thanks.

By: tootai (tootai) 2007-01-23 12:04:49.000-0600

Got the problem again, which means either #51659 is _not_ yet committed to trunk or the solution is not working.

By: Olle Johansson (oej) 2007-01-23 12:28:39.000-0600

51353 is lower than 51659... Please test with the latest revision.
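The check above can be done mechanically against the "core show version" output. A minimal sketch in C, assuming the revision always appears as `-r<number>` in the version string; the helper names `svn_revision` and `includes_fix` are made up for illustration. A revision number at or above the fix only suggests the fix is present, since a checkout from another branch may not contain that commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the SVN revision embedded in a version string such as
 * "Asterisk SVN-trunk-r51353 built by ...", or -1 if none is found.
 * Assumption: the revision is written as "-r" followed by digits. */
long svn_revision(const char *version)
{
    const char *p = version;

    assert(version != NULL);
    while ((p = strstr(p, "-r")) != NULL) {
        char *end;
        long rev;

        p += 2;                       /* skip past "-r" */
        rev = strtol(p, &end, 10);
        if (end != p && rev >= 0) {   /* at least one digit was parsed */
            return rev;
        }
    }
    return -1;
}

/* Non-zero if the running build is at or past the revision of the fix. */
int includes_fix(const char *version, long fix_rev)
{
    long rev = svn_revision(version);

    return rev >= 0 && rev >= fix_rev;
}
```

For the version string reported in this thread, `includes_fix(version, 51659)` is false because 51353 is lower than 51659.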

By: tootai (tootai) 2007-01-23 12:45:47.000-0600

Which means (and that's what I wanted to point out) that SVN has a problem. It's a *fresh* install made _after_ you asked me to test. I just tried a make update:

keewi:/usr/src/asterisk# make update
Updating from Subversion...

Fetching external item into 'menuselect'

Fetching external item into 'menuselect/mxml'
External at revision 19.

At revision 105.
At revision 51363.
keewi:/usr/src/asterisk#

By: Serge Vecher (serge-v) 2007-01-29 13:15:40.000-0600

tootai: the svn server's had some problems earlier, which is why you were seeing the problem. This should be resolved, so please update to the latest svn and try again. Thanks.

By: tootai (tootai) 2007-01-30 03:47:33.000-0600

I updated SVN to 52820 and the problem has disappeared. Thanks.

Daniel

By: Joshua C. Colp (jcolp) 2007-01-30 10:28:01.000-0600

Per comment - issue has been fixed as of revision 51659. Way to go oej!