Summary: ASTERISK-19093: sip reload not loading all users
Reporter: Stephen Velluto (svelluto)
Labels:
Date Opened: 2011-12-21 16:02:32.000-0600
Date Closed: 2012-02-09 16:57:42.000-0600
Versions:SVN Frequency of
Environment:
Attachments: (0) sysinfo.jpg
Description: We are in the process of implementing Asterisk and have a couple of scripts that handle reloads based on which files have changed.
We are having an issue with reloading users: when the users are reloaded, about 5% of the time Asterisk does not load all of them and loads only a couple. The only way to get them back is to revert the users file to its previous state.

I have tried determining if there is a property in the user record that causes this issue by eliminating them one by one, but the bug still doesn't happen all the time.

This system is about to go into production, so any help would be appreciated!

If you need more detail, just ask.
Comments:By: Stephen Velluto (svelluto) 2011-12-21 16:28:52.144-0600

I did not spot any errors when this was happening; it just fails to load the entire users file.

By: Richard Mudgett (rmudgett) 2011-12-21 17:49:50.678-0600

Thank you for taking the time to report this bug and helping to make Asterisk better. Unfortunately, we cannot work on this bug because your description did not include enough information. You may find it helpful to read the Asterisk Issue Guidelines http://www.asterisk.org/developers/bug-guidelines. We would be grateful if you would then provide a more complete description of the problem. At a minimum, we need:

1. the specific steps or actions you took that caused you to encounter the problem,
2. the behavior you expected, and
3. the behavior you actually encountered (in as much detail as possible).

This likely includes output from the console with debug level logging, a SIP trace (if this is SIP related), and configuration information such as dialplan (e.g. extensions.conf) and channel configuration (e.g. sip.conf). Thanks!

By: Stephen Velluto (svelluto) 2011-12-22 12:20:44.401-0600

I am not able to replicate this issue all the time, but when I do it is very annoying.
This system is a couple days away from going into production.

Here is what I do when it does happen:
In our users.conf we have roughly 60 users (and growing). When I add a user and do a sip reload, everything looks all right at first; then phones begin registering and getting "peer does not exist" errors. When I look at sip show peers, on some occasions it has loaded no peers and on others it has loaded some (not all).
When I look through the users.conf file, there is nothing out of the ordinary; I use the same properties for all the users (shown below). To fix it, I have to touch the file and then do another sip reload, which causes users.conf to be read correctly. If I just do a sip reload without touching the file, the users list does not get refreshed, and it only loads the same users (if any) it had already loaded. This happens only about 5% of the time; I cannot replicate it reliably.
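The touch-then-reload workaround described above could be scripted roughly as follows. This is only a sketch: the helper name `touch_then_reload` is hypothetical, and the reload command is passed in by the caller rather than hard-coded, since the right invocation depends on the installation.

```python
import os
import subprocess


def touch_then_reload(conf_path, reload_cmd):
    """Bump the file's modification time, then run the reload command.

    conf_path  -- path to the configuration file (caller-supplied assumption)
    reload_cmd -- command as a list of arguments, run without a shell
    Returns the exit status of the reload command.
    """
    # Passing None sets atime and mtime to "now", like `touch` on an existing file.
    os.utime(conf_path, None)
    # check=False: report the exit status instead of raising on failure.
    return subprocess.run(reload_cmd, check=False).returncode
```

On a typical install the call might look like `touch_then_reload("/etc/asterisk/users.conf", ["asterisk", "-rx", "sip reload"])`; the path here is an assumption, not something stated in this report.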

I expected sip reload to reload the users properly with the changes I had made to the users.conf file. Instead, it will only load some users or none at all.

Here is a sample users record:

fullname=User Name

By: Richard Mudgett (rmudgett) 2011-12-22 12:57:20.641-0600

This may have to do with the timing of your reloads.  chan_sip will not reload a configuration if it does not think that the file has changed.  (This is true for most any Asterisk module that has a configuration file.)  I think the timestamp granularity is one second.

Are you loading the configuration, modifying it again, and reloading within that timeframe?
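To illustrate the timing effect described in this comment, here is a minimal sketch of modification-time-based change detection with one-second granularity. It is not chan_sip's actual code; the class `ConfigCache` and the function `demo` are hypothetical names used only for illustration.

```python
import os
import tempfile


class ConfigCache:
    """Hypothetical cache that skips a reload when the mtime looks unchanged."""

    def __init__(self):
        self._last_mtime = {}  # path -> whole-second mtime seen at last load

    def needs_reload(self, path):
        # Truncate to whole seconds to mimic a one-second timestamp granularity.
        mtime = int(os.stat(path).st_mtime)
        if self._last_mtime.get(path) == mtime:
            return False  # looks unchanged, so the reload would be skipped
        self._last_mtime[path] = mtime
        return True


def demo():
    base = 1_700_000_000  # arbitrary fixed timestamp so the demo is repeatable
    cache = ConfigCache()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "users.conf")
        with open(path, "w") as f:
            f.write("[1001]\n")
        os.utime(path, (base, base + 0.1))
        first = cache.needs_reload(path)        # first load always reads the file

        with open(path, "a") as f:
            f.write("[1002]\n")
        os.utime(path, (base, base + 0.9))      # second edit, same whole second
        same_second = cache.needs_reload(path)  # change goes unnoticed

        os.utime(path, (base + 2, base + 2))    # edit in a later second
        later = cache.needs_reload(path)
    return first, same_second, later
```

In this sketch the second edit is missed because its timestamp falls in the same whole second as the first load, which is the scenario the question above is asking about.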

By: Stephen Velluto (svelluto) 2011-12-22 13:03:20.421-0600

Ok - that makes sense why it wouldn't reload.

As for the timing of the reloads: I had originally thought the problem was caused by saving after each change, but then it started happening on the first change. I started making all my changes at once and then doing a sip reload at the end, and even that caused it not to load the entire file.

By: Richard Mudgett (rmudgett) 2011-12-22 15:14:14.660-0600

Is it loading the first so many users in the file or is it skipping some users?
Could you be running out of memory?
Do you really want to use trunk code?  Why not Asterisk 10 or the 10 branch?

By: Stephen Velluto (svelluto) 2011-12-22 15:30:59.116-0600

It always loads from the beginning.

I've attached a screenshot of our sysinfo; it doesn't look like we're running out of RAM.

I didn't realize Asterisk 10 had been released.
What would be required to upgrade to 10? If I go ahead and update, I need to make sure I don't lose any of the settings, and I would need to be able to do it quickly.

By: Richard Mudgett (rmudgett) 2011-12-22 16:26:56.258-0600

Asterisk 10.0.0 was released Dec 15.  It was branched from the trunk on 2011-07-21.
Read the CHANGES and UPGRADE.txt files.

For this issue is there any consistency in which users are not loaded?

By: Stephen Velluto (svelluto) 2011-12-23 07:19:44.379-0600

Originally it was crashing on loading a specific user.
This user was one connected to our analog card (only 2 devices are connected to that card), and when this issue happened it would stop at that record. When I moved that record to the bottom of the file, it reloaded the entire list, which I thought had fixed it. Since then, I have not been able to find a pattern as to where it stops loading the file.
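One way to look for a pattern in where loading stops is to compare the sections in users.conf, in file order, against the peer names that actually loaded. The sketch below assumes the loaded names have already been collected by some other means (for example, copied from sip show peers); the helper names `sections_in_order` and `first_missing` are hypothetical, and the parsing is deliberately simplified.

```python
import re

# Matches a section header such as "[1001]" or "[1001](template)".
SECTION_RE = re.compile(r"^\s*\[([^\]]+)\]")


def sections_in_order(conf_text):
    """Return section names from config text in file order, skipping [general]."""
    names = []
    for line in conf_text.splitlines():
        line = line.split(";", 1)[0]  # drop ';' comments (naive, ignores escapes)
        match = SECTION_RE.match(line)
        if match and match.group(1).strip().lower() != "general":
            names.append(match.group(1).strip())
    return names


def first_missing(conf_text, loaded_names):
    """Return (index, name) of the first configured section that is not loaded.

    Returns None when every configured section appears in loaded_names.
    """
    loaded = set(loaded_names)
    for index, name in enumerate(sections_in_order(conf_text)):
        if name not in loaded:
            return index, name
    return None
```

Recording the result after each failed reload would show whether loading stops at the same record, at the same position in the file, or somewhere different each time.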

By: Richard Mudgett (rmudgett) 2011-12-23 14:56:40.298-0600

This issue might have some bearing on your problem: ASTERISK-16508.
Never mind, that issue was a new feature and is only available in trunk.

By: Matt Jordan (mjordan) 2011-12-27 07:27:33.126-0600

Can you attach a debug log from an instance where the reload only loads some subset of the peers?  

By: Stephen Velluto (svelluto) 2011-12-28 07:21:44.263-0600

This system went into production the other day, and I have only had the issue once in that time.

But because it's the holidays I will have a lighter than normal load, so I can try and force the issue to get a debug log.

By: Matt Jordan (mjordan) 2012-02-09 16:57:36.453-0600

Suspended due to lack of activity. Please request a bug marshal in #asterisk-bugs on the IRC network irc.freenode.net to reopen the issue should you have the additional information requested.  Further information can be found at http://www.asterisk.org/developers/bug-guidelines