|Summary:||ASTERISK-11936: [patch] Congestion feature request|
|Reporter:||Kamil Czajko (kactus)||Labels:|
|Date Opened:||2008-04-28 21:45:18||Date Closed:||2008-07-02 08:20:43|
|Environment:||Attachments:||( 0) 20080616__bug12544.diff.txt|
( 1) sipconf.txt
( 2) siptable.txt
I've been looking at a few different ways to provide failover for clients and was hoping that there was a way to set auto_congest timeout values for non registered peers.
This would allow for dial plans such as
Autocongestion currently works if the peer returns an error, but if the remote host is down it waits the full 32 seconds before progressing to the next priority. I'd like to set this so that if there is no response after 5 seconds it goes to the next priority (that way the cheaper but less reliable trunks can be used first).
It appears Asterisk 1.6 no longer uses the qualify column to determine the maximum time before failover, and timerb in sip.conf, for example, only works on registered peers. Passing a timeout to Dial only seems to take effect once the other side has received the call, and setting timeout absolute = 5 redials the same priority every 5 seconds.
If this configuration has already been moved somewhere else, can you please let me know? I was unable to find it in the documentation, online, or by grepping the source code.
****** ADDITIONAL INFORMATION ******
Using Asterisk 1.6 beta8, as I had issues with svn (still testing).
|Comments:||By: pj (pj) 2008-04-29 04:27:06|
I'm using 'qualify' to monitor all peers. If a peer is qualified as 'unreachable' and I try to dial it, the dialplan logic steps immediately to the next priority.
By: Joshua C. Colp (jcolp) 2008-04-29 06:59:20
Can you be more specific about what you mean by the qualify column? Are you using realtime peers? Can you provide the configuration minus passwords?
By: Tilghman Lesher (tilghman) 2008-04-29 17:00:27
I'm also similarly confused. The qualify column only specified the time (in ms) by which a peer must respond to an OPTIONS request before being considered to be either too lagged (in the case of a response) or unreachable (in the case of no response). Qualify has NEVER worked as an active Dial timeout.
By: Kamil Czajko (kactus) 2008-04-29 21:10:45
Hi, sorry for not being clear.
I've been piecing together information from forums, voip-info and bug fixes from previous versions.
From what I have been reading (and the information might be completely out of date and/or wrong), auto congest used to be calculated as qualify*2, which was later changed to qualify*4 (see http://bugs.digium.com/view.php?id=765 )
As I understand it, this was later changed again to use timerb in sip.conf, which defaults to 64*timert1 = 32 seconds. Changing these values does not seem to affect peers that do not register, however, and on voip-info we read: 'For instance, a 1.6 version of Asterisk would not use the <snip> 'qualify' <snip> columns, ...' (see http://www.voip-info.org/wiki/view/Asterisk+RealTime+Sip )
I've attached the provider's SIP information as siptable.txt. There is no registration; they just accept calls across a VPN link to an IP address, and if they are down we want to send the call out to a different provider. I have tried setting the qualify value in ms as well as just setting it to yes, with no difference.
I have tested it both by sending numbers that the provider will not recognise and by modifying the IP so that it is incorrect. In the first scenario it works like a charm: we get SIP error 500 and the call flows through to the next priority. In the latter, however, it takes the full 32 seconds before it fails over (which is too long for an end user), and we would like to set this to something more reasonable, like 3 or 5 seconds.
The settings I have set in sip.conf have been attached as sipconf.txt
Let me know if you need anything else.
All the best - Kactus
By: Tilghman Lesher (tilghman) 2008-04-29 21:19:22
No, I suspect the actual problem is that you don't have rtcachefriends=yes in sip.conf. If you are not caching the hosts in memory, then the qualify value won't actually do anything (it affects only peers that are held in memory). Without Realtime caching, hosts are disposed of from memory when they are no longer needed (and no, "qualify" does not count as "needed").
By: Kamil Czajko (kactus) 2008-04-30 01:36:58
Unfortunately, some of the things we are doing do not allow us to run rtcachefriends=yes, as we need to be able to programmatically manage end users' SIP connections (such as password and context) on the fly without requiring a reload.
Ideally it would be nice for timerb to work as described in sip.conf for non registered peers or to have a variable that could be set to specify the maximum response time before auto congestion kicks in. The qualify information was mainly there because of the historical behavior of auto congest.
By: Olle Johansson (oej) 2008-04-30 10:24:22
Why don't you use the SIPPEER dialplan function BEFORE you place the call to check the status of the peer? That way, you don't even have to place the call.
By: Tilghman Lesher (tilghman) 2008-04-30 10:44:44
oej: I don't think he can, since the peer isn't cached in memory, so his system isn't keeping track of whether a peer is reachable. The "status" argument will always return "UNKNOWN".
By: Olle Johansson (oej) 2008-04-30 10:54:01
Oh, you're right. The realtime system hits us in the back again. Well, the T1 setting will help to speed up the congestion time for SIP. And if Asterisk really gets a UDP transmit error, we will cancel out quicker too.
By: Kamil Czajko (kactus) 2008-05-07 15:43:49
Well, I've played around with some of the constants in chan_sip.c, and basically the timeout is directly dependent on global_timer_b, which is hard-coded to 64*SIP_TIMER_T1, which in turn is hard-defined as 500 ms.
I've set global_timer_b to 10*SIP_TIMER_T1 and that works well for me, and doesn't seem to break anything yet.
oej, the T1 timer in sip.conf does not speed up the process though. I imagine it wouldn't be too hard to initialise global_timer_b to timerb as long as it's valid, but I imagine getting Asterisk to RC would be a higher priority for you :)
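The arithmetic in this comment can be sketched in C as follows. This is only an illustration, not the actual chan_sip.c code: SIP_TIMER_T1 and the 64x multiplier come from the comment above, while effective_timer_b() and its parameter are hypothetical names introduced here, and "valid" is assumed to mean "positive".

```c
/* Illustrative sketch only; not the actual chan_sip.c implementation. */

#define SIP_TIMER_T1 500 /* ms, as described in the comment above */

/*
 * Hypothetical helper: choose the INVITE transaction timeout (Timer B) in ms.
 * A positive configured value is used as-is; anything else falls back to the
 * default of 64 * T1.
 */
static int effective_timer_b(int configured_timerb_ms)
{
    int default_timer_b = 64 * SIP_TIMER_T1; /* 64 * 500 ms = 32000 ms */

    if (configured_timerb_ms > 0) {
        return configured_timerb_ms;
    }
    return default_timer_b;
}
```

With these assumptions, an unset (0) or negative value yields the 32-second default, and 10*SIP_TIMER_T1 yields the 5-second failover described above.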
By: Tilghman Lesher (tilghman) 2008-06-16 16:00:55
You don't actually need to do that in 1.6. Simply set "timerb" in the peer definition to whatever number of ms you like. However, I do agree with you that the timer should be settable globally.
By: Kamil Czajko (kactus) 2008-06-18 20:53:36
I'll be setting up another box and will be able to test this again for you on Monday.
By: Kamil Czajko (kactus) 2008-06-25 02:24:29
Hi, just an update: unfortunately I've had to push back building the new box until next Monday.
By: Olle Johansson (oej) 2008-07-01 11:35:23
By: Kamil Czajko (kactus) 2008-07-01 21:02:51
Sorry, I was working on issues around 12508 once we got the new server up and running.
I ran the patch and, as Corydon mentioned, added a timerb column to the sip table. I played with a few settings and it did seem to work as desired, with null values defaulting to 32 seconds, and it caught negative values, which is always a plus.
Only issue I can see is that the warning that is generated:
WARNING: chan_sip.c:19968 build_peer: Timer B has been set lower than recommended. (RFC 3261, 17.1.1.2)
appears more often than it probably should.
It appears 4 times before calling the testtrunk and 3 times after (during the congestion). I imagine that a suppress-warning flag can be added at a future date, but that really is not important since the autocongestion works well now. It now also allows tuning different trunks with different timeouts, which is brilliant.
I also tested it without the patch (just 1.6 beta 9) and found that it worked as Corydon had mentioned, so the patch only appears to add the warnings.
All in all looks good - Thank you very much ^^ we'll be taking advantage of this as we move forward.
All the best - Kactus
By: Tilghman Lesher (tilghman) 2008-07-01 21:33:22
The main difference now is that the global timer B is dependent upon a parameter that can be reset in the [general] section of sip.conf, rather than being hardcoded to 32 seconds.
The message printed every time was a typo. It will now only be printed every 20 times that a peer is built with a timer B that is lower than recommended.
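A rate limit like the one described here could look roughly like the sketch below. It is not the committed patch: check_timer_b(), WARN_EVERY and the counter are hypothetical names, and printing on the first occurrence (rather than the twentieth) is an assumption.

```c
#include <stdio.h>

#define WARN_EVERY 20 /* assumed interval, taken from the comment above */

/* Hypothetical counter of peers built with a low Timer B. */
static int low_timer_b_hits;

/*
 * Hypothetical check: returns 1 if a warning was printed, 0 otherwise.
 * The recommended minimum is 64 * T1; the warning is emitted only on every
 * WARN_EVERY-th low value, starting with the first.
 */
static int check_timer_b(int timer_b_ms, int timer_t1_ms)
{
    int recommended = 64 * timer_t1_ms;

    if (timer_b_ms >= recommended) {
        return 0; /* value is fine, nothing to report */
    }
    if (low_timer_b_hits++ % WARN_EVERY != 0) {
        return 0; /* low value, but this warning is suppressed */
    }
    fprintf(stderr, "WARNING: Timer B has been set lower than recommended (%d < %d)\n",
            timer_b_ms, recommended);
    return 1;
}
```

Under these assumptions, building a peer with timerb=5000 forty times in a row would log the warning twice instead of forty times.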
By: Olle Johansson (oej) 2008-07-02 00:46:39
Thanks for a quick reply. We will look into the warnings.