Summary: ASTERISK-03290: Dial returns CONGESTION instead of CHANUNAVAIL if host not reachable
Reporter: Andrew Kohlsmith (akohlsmith)
Labels:
Date Opened: 2005-01-17 12:32:39.000-0600
Date Closed: 2008-01-15 15:55:05.000-0600
Versions:
Frequency of
Environment:
Attachments: ( 0) 20051104__func_hangupcause.diff.txt
Description: Basically, if IAX2 can't reach the far end, it should return CHANUNAVAIL, not CONGESTION -- there is no sane way to do VOIP provider failover if IAX2 insists on telling me that the link is congested when in reality it just can't get there from here.

Now if the far end returned "I can't complete your call" then fine, it's CONGESTION -- that, however, isn't the case when I unplug the network cable.  :-)

-- Executing Dial("IAX2/user1@host1/16386", "IAX2/user2@host2/1234567890||g") in new stack
Jan 17 12:44:24 NOTICE[22410]: app_dial.c:803 dial_exec: Unable to create channel of type 'IAX2' (cause 3)
== Everyone is busy/congested at this time (1:0/1/0)
-- Executing NoOp("IAX2/user1@host1/16386", "HOST2: HANGUPCAUSE is 0 and DIALSTATUS is CONGESTION") in new stack


This is just a placeholder -- I intend to submit a patch for this in the next couple of days.
Comments:

By: Tilghman Lesher (tilghman) 2005-01-17 12:51:06.000-0600

But the problem doesn't appear to be IAX2 -- the message coming back from IAX2 is cause 3 -- AST_CAUSE_NO_ROUTE_DESTINATION.  The problem appears to be that app_dial is not properly handling that condition, not that IAX2 isn't sending it.

By: Tilghman Lesher (tilghman) 2005-01-17 12:59:17.000-0600

And furthermore, telling the difference between not being able to get through to a remote host because the network is down and not being able to get through because the network is heavily congested is difficult -- both conditions result in exactly the same symptoms -- unless you mean that you want to reserve the Congestion condition for PSTN conditions, not VOIP conditions.

By: Andrew Kohlsmith (akohlsmith) 2005-01-17 13:45:17.000-0600

bkw: thanks for changing that to minor, I mis-clicked again.

corydon76: in my mind if you can't get a message back from the far end saying "piss off, I've got too much on my plate at the moment" then it shouldn't be treated as congestion...  in other words, congestion signifies a PSTN condition -- if you can't get to the other side it's not necessarily congestion so much as it is a connectivity problem.  It's specifically the ability to differentiate between connectivity issues vs congestion issues that makes it possible to do failover with the dialplan.
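To make the failover case concrete, here is a minimal dialplan sketch, assuming Dial() reported CHANUNAVAIL for an unreachable peer. The context name and the peer names provider1 and provider2 are hypothetical, and the sketch only covers the unanswered-call path:

```
[outbound]
; provider1 and provider2 are hypothetical IAX2 peer names.
exten => _X.,1,Dial(IAX2/provider1/${EXTEN})
exten => _X.,n,NoOp(provider1: DIALSTATUS is ${DIALSTATUS})
; Fail over only when the first peer could not be reached at all.
exten => _X.,n,GotoIf($["${DIALSTATUS}" = "CHANUNAVAIL"]?failover)
; Any other failure is treated as a far-end problem; do not retry elsewhere.
exten => _X.,n,Congestion()
exten => _X.,n(failover),Dial(IAX2/provider2/${EXTEN})
exten => _X.,n,Congestion()
```

If both conditions come back as CONGESTION, the GotoIf test has nothing to distinguish them, and the only options are to always fail over or never fail over.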

By: Tilghman Lesher (tilghman) 2005-01-17 14:16:38.000-0600

I disagree.  Network congestion, whether that network is IP based or TDM based, still signifies a condition whereby the caller is best informed that the call cannot be made now but should retry later.

To use your analogy, if a remote system wants to say "piss off, I'm too busy right now" it should send a cause code 42 (switching equipment congestion) or possibly cause code 29 (facility rejected), not a cause code 34 (no circuit available).  Cause code 34 is for specifying that a pathway does not exist, and that's entirely appropriate when an IP route is down.

By: twisted (twisted) 2005-01-17 15:00:09.000-0600

Update to the latest CVS; 12/23 is pretty old.  Also, are you using qualify lines in your IAX2 peers?

By: Mark Spencer (markster) 2005-01-17 16:45:22.000-0600

I concur that a failure within the network *is* CONGESTION and that this behavior is correct.  CHANUNAVAIL would be used for, for example, an unregistered user.

By: Andrew Kohlsmith (akohlsmith) 2005-11-04 08:00:58.000-0600

I am re-opening this.  

Mark, can you give me a solution then on how to tell whether I cannot reach a peer for call termination or whether the peer cannot complete the call due to TDM congestion?

This is a very real issue.  If Dial() returns CONGESTION for both I have no alternative than to attempt to fail the call over through all providers before finally arriving at the conclusion that there was a far-end call completion problem.  This is highly, highly sub-optimal and it increases the delay between the caller placing the call and finally getting a congestion indication.

So please... what is the point of qualify if I can't use it to intelligently send calls to peers?

By: Tilghman Lesher (tilghman) 2005-11-04 11:07:04.000-0600

How about a HANGUPCAUSE() function, that allows you to send whatever cause code you like from the remote system?  If it doesn't go through to the remote system, you get cause code 34 from chan_iax2, and if it does go through but the TDM is clogged, you can send back whatever other error you like, e.g.

exten => _X.,n,Set(HANGUPCAUSE()=127)
exten => _X.,n,Hangup

By: Andrew Kohlsmith (akohlsmith) 2005-11-04 11:12:12.000-0600

That works great if all the IAX2 providers on the planet support it...  I believe that chan_iax2 will pass back the Zap ${HANGUPCAUSE} variable too.  I'm looking for a uniform solution that works with all channel technologies, not just IAX2... IAX2's just where it bit me.  :-)

Realistically, though, I still maintain that while a qualify failure CAN indicate network (internet) congestion, it is far more likely that there is a hard cut in service between you and the peer -- a network DOWN case.  It is for this reason that I maintain that a qualify failure should send back CHANUNAVAIL and not CONGESTION.

CONGESTION implies that the far side has responded "I can't help you right now" -- CHANUNAVAIL implies "I can't get to that computer."

By: Tilghman Lesher (tilghman) 2005-11-04 11:21:42.000-0600

It's the same hook into the channel as the builtin variable HANGUPCAUSE -- we're just permitting you to set it in the dialplan.

As far as distinguishing the two, that's a matter you need to take up with your service provider.  You can't very well expect a change in code today (especially CVS) to affect your provider tomorrow.

By: Andrew Kohlsmith (akohlsmith) 2005-11-04 11:44:28.000-0600

Several examples to (hopefully) strengthen my argument.

Mark says that CHANUNAVAIL is not for network problems.  However when I unplug my T1 and it goes into RA (a network problem) I get CHANUNAVAIL when I try to dial out through it.

On normal telephone switches you do not get a Congestion tone unless the far-side switch SPECIFICALLY says "I'm too busy to service your request" or "I have no available channels to service your request".  If the call cannot be completed for another reason (can't physically reach the far side switch) you get SIT and the PRI return code is not "Congestion".

I'm just looking for consistency.  Congestion means the far side says it's congested.  You can't assume that just because you can't reach them they're overloaded.

By: Tilghman Lesher (tilghman) 2005-11-04 11:47:31.000-0600

If you don't get a PRI cause code 34 in that case, what is the numeric cause code you get?

By: Andrew Kohlsmith (akohlsmith) 2005-11-04 11:55:05.000-0600

I *think* it's #2 (No route to transit network) but it has been a LONG time since that happened... it's very very rare to find that in circuit-switched networks.  (This was on a Bell Canada PRI)

By: Kevin P. Fleming (kpfleming) 2005-11-08 20:00:10.000-0600

Fixed in CVS HEAD. chan_iax2 now causes CHANUNAVAIL when the peer is known to be UNREACHABLE. For good measure, I also changed chan_zap to cause CONGESTION when a group-channel dial is requested and all the channels in the group are busy (unlike requesting a single channel which is busy, which causes BUSY).
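For the peer to be "known to be UNREACHABLE", it presumably has to be monitored with qualify, as asked about earlier in the thread. A minimal iax.conf sketch under that assumption follows; the peer name provider1 and the host are hypothetical placeholders:

```
; iax.conf -- hypothetical peer definition
[provider1]
type=peer
host=host1.example.com
; Periodically check whether the peer responds, so its reachability is tracked.
qualify=yes
```

With a peer set up this way, a Dial() to IAX2/provider1/... while the peer is marked UNREACHABLE should, per the fix described above, set DIALSTATUS to CHANUNAVAIL rather than CONGESTION.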

By: Digium Subversion (svnbot) 2008-01-15 15:55:05.000-0600

Repository: asterisk
Revision: 7037

U   trunk/ChangeLog
U   trunk/channels/chan_iax2.c
U   trunk/channels/chan_zap.c

r7037 | kpfleming | 2008-01-15 15:55:05 -0600 (Tue, 15 Jan 2008) | 2 lines

issue ASTERISK-3290 plus related fix