|Summary:||ASTERISK-10731: SIP reload for large config halts SIP Processing|
|Date Opened:||2007-11-09 16:06:39.000-0600||Date Closed:||2011-06-07 14:02:44|
|Description:||It seems that Asterisk stops processing SIP for some amount of time during reloads. This causes odd dialing behavior for our end users trying to make or receive calls during the reloads, and some of our SIP endpoints start re-registering to other, backup servers when the reloads occur. Do you have any suggestions or thoughts on this? I was going to start looking at realtime as a possible solution, but given the size and complexity of our dialplan and its integration with our existing backend systems, this is probably not going to be a quick fix.|
****** ADDITIONAL INFORMATION ******
With 5000 SIP peers on a single Asterisk server, SIP call processing stops for approximately 45 seconds when chan_sip reloads on a dual-CPU Dell 1950. We are in the process of migrating our SIP peers to realtime to try to avoid this issue. You may want to check this out since you have our dialplan. Extension / pbx_config reloads are much less of an issue because calls do not stop processing during those reloads. We have had to cut our SIP reloads back to a bare minimum to avoid causing service issues, but this limits our ability to respond to necessary config changes.
|Comments:||By: Terry Wilson (twilson) 2007-11-12 11:58:32.000-0600|
I know I ran into this issue about 2 years ago and switched to realtime to avoid it (which works great, btw). When you do a sip reload, Asterisk sends qualify packets to all peers (if you have qualify set to something), which I think are now spaced 100 ms apart to avoid this issue, and I think it will also cause MWI to be sent to all peers. That is a lot of messages to send (I've taken down a SER box with them before).
I would suggest, for testing purposes, turning off qualify and trying a reload to see if there is any interruption after the reload. If the behavior is the same, temporarily take out the mailbox setting for the peers, do another reload, and see whether that is what is causing the problem; then we can go from there. That said, when I had to fix the problem I had to use realtime without caching and handle MWI externally (we had many Asterisk boxes sharing a config, so *each* would blast MWI messages when it did a reload).
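To put rough numbers on the message burst described in this comment, here is a back-of-the-envelope sketch in Python. It assumes one qualify (SIP OPTIONS) and one MWI NOTIFY per peer, and takes the 100 ms qualify spacing from the comment at face value; `reload_burst` and its parameters are hypothetical names used only for illustration, not anything in Asterisk.

```python
# Rough estimate of the traffic a "sip reload" could trigger.
# Assumptions (not confirmed by the thread): one qualify and one MWI
# message per peer, qualifies paced at a fixed spacing.

def reload_burst(peers, qualify_spacing_ms=100, mwi_per_peer=1):
    """Return (total_messages, seconds_needed_to_pace_the_qualifies)."""
    qualify_msgs = peers                 # one OPTIONS qualify per peer
    mwi_msgs = peers * mwi_per_peer      # one MWI NOTIFY per peer with a mailbox
    pacing_s = peers * qualify_spacing_ms / 1000
    return qualify_msgs + mwi_msgs, pacing_s

total, pacing_s = reload_burst(5000)
print(f"{total} messages; qualifies spread over {pacing_s:.0f} s")
```

Under these assumptions, 5000 peers means 10000 outbound messages, with the qualifies alone taking 500 seconds to pace out at 100 ms each. This only sizes the outbound traffic; it does not explain the 45-second processing stall by itself.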
By: dtyoo (dtyoo) 2007-11-12 12:29:16.000-0600
Appreciate the feedback. All the SIP peers in question are remote Polycom handsets, and the peers have qualify and MWI turned on. I think you are correct that the qualifies in particular are causing Asterisk to send a whole bunch of messages on reload. We need the qualifies turned on for NAT traversal reasons, so we can't get rid of them. We tried setting them to some large value (e.g. 10000), but this didn't result in any improvement in behavior.
We are about 75% done with our migration of sip peers from files to realtime, and have tested that this definitely avoids the issue, much as you found 2 years ago. We are using realtime with caching, and it does seem to handle MWI for us without any associated performance issues. In our implementation, we are only pruning changed peers rather than all the peers, and this is probably where most of the improvement is coming from.
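The "prune only changed peers" approach mentioned above could be sketched roughly as follows. This is an illustration, not the poster's actual tooling: it assumes the old and new peer definitions are available as plain dictionaries (for example, two snapshots of the realtime table), and `changed_peers` and `prune_commands` are hypothetical helper names. The emitted strings use chan_sip's `sip prune realtime peer <name>` CLI command.

```python
# Sketch: work out which cached realtime peers need pruning after a
# config change, instead of pruning (or reloading) everything.

def changed_peers(old, new):
    """Names of previously known peers that were modified or removed."""
    return sorted(name for name in old if old[name] != new.get(name))

def prune_commands(old, new):
    """One Asterisk CLI command per peer whose cached copy is stale."""
    return [f"sip prune realtime peer {name}" for name in changed_peers(old, new)]

# Hypothetical snapshots of the peer table before and after an update.
old = {
    "1001": {"secret": "a", "mailbox": "1001"},
    "1002": {"secret": "b"},
    "1003": {"secret": "c"},
}
new = {
    "1001": {"secret": "a", "mailbox": "1001"},  # unchanged: left cached
    "1002": {"secret": "changed"},               # modified: prune
    "1004": {"secret": "d"},                     # new: nothing cached yet
}

for command in prune_commands(old, new):
    print(command)
```

Newly added peers are skipped on purpose, since nothing is cached for them yet; each command could then be issued with `asterisk -rx "<command>"`.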
By: Terry Wilson (twilson) 2007-11-12 13:08:03.000-0600
We handled NAT traversal by setting REGISTER timeouts very low, which lets you turn off qualify. As long as qualify is on at all, Asterisk will send qualifies to everyone on reload.
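The trade-off with short REGISTER timeouts is a higher steady registration load. A quick Python estimate, assuming each peer re-registers once per expiry interval (many phones refresh earlier, which would raise the rate); the expiry values below are illustrative, not taken from the thread, and `register_rate` is a hypothetical name:

```python
# Sketch: average REGISTER requests per second for a given expiry,
# assuming every peer re-registers exactly once per expiry interval.

def register_rate(peers, expiry_s):
    """Average REGISTER requests per second across all peers."""
    return peers / expiry_s

for expiry_s in (3600, 120, 60):
    rate = register_rate(5000, expiry_s)
    print(f"expiry {expiry_s:>4} s -> {rate:.1f} REGISTER/s")
```

Unlike the qualify burst after a reload, this load is spread evenly over time, which is presumably why it was the more tolerable option.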
By: Olle Johansson (oej) 2007-11-15 04:58:24.000-0600
Ok, this is an architecture issue, not really a bug. Let's move the discussion to asterisk-dev and close this bug report. Thanks. /O