[Home]

Summary:ASTERISK-08063: [patch] iax channel goes into an error loop on network read error
Reporter:John Riordan (john)Labels:
Date Opened:2006-11-03 13:47:53.000-0600Date Closed:2007-07-09 21:20:42
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Core/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) patch-iax2-errorloop.diff
Description:Issue:
When a network read error occurs, in certain cases, iax2_read() is called in a loop (calling iax2_read and getting the same error again, and again, and again). This leads to general instability (not to mention it can fill the message log with thousands of errors a second).

Patch:
Cause the channel to hangup on an error by returning NULL instead of a null_frame.
Comments:By: Russell Bryant (russell) 2006-11-09 21:29:44.000-0600

Exactly what case could actually cause this to happen?  As the log message in iax2_read indicates, that function should never be called due to the way chan_iax2 operates.  Even if a network read were to have an error for some reason in chan_iax2, that would not lead to iax2_read() getting called ...

By: Joshua C. Colp (jcolp) 2006-11-09 21:38:24.000-0600

I would have to agree with Russell on this. There are some cases where a call may exhibit this notice but recover and operate fine. Hanging up the call would be an... unfortunate thing to do.

By: Russell Bryant (russell) 2006-11-09 21:39:15.000-0600

What cases would *ever* cause iax2_read to be called?

By: Joshua C. Colp (jcolp) 2006-11-09 21:41:02.000-0600

There are no "cases" exactly. Nobody (including myself) has been able to pinpoint exactly why it would occur. It's not something that could be recreated but it still occasionally shows up for people.

By: Russell Bryant (russell) 2006-11-09 21:46:11.000-0600

Well I think it is safe to say that under any normal operation, iax2_read() should never be called.  If anyone can recreate a situation where it is called, then we need to fix that.  We should never have to work around that situation by modifying iax2_read() itself.

By: John Riordan (john) 2006-11-09 22:23:34.000-0600

We were seeing it randomly once every 100,000 calls or so
and more often than not it would drag down the Asterisk instance.
We've been patching Asterisk internally with this for about a year.
We've never seen it get called when things are working fine.
But it does get called. I will dig into the code again and
try to recall what the circumstances are. What I kinda remember
is it had something to do with a pointer to this function is
actually available to the channel and while it doesn't normally
get called under some error condition the channel did call it
to do the next read. If I remember correctly, the instances
where we saw it happen, the call (or call on the bridge channel)
were always gone, so hanging it up was the thing to do.

By: John Riordan (john) 2006-11-09 22:35:48.000-0600

This is from a log off a single Asterisk instance over Oct and Nov.
Again, this is about once every 100,000 calls.
We are currently hanging up.

Oct  4 04:51:16 WARNING[1610] chan_iax2.c: I should never be called!
Oct  4 04:51:16 WARNING[1611] chan_iax2.c: I should never be called!
Oct  5 10:20:18 WARNING[11775] chan_iax2.c: I should never be called!
Oct  5 10:20:18 WARNING[11776] chan_iax2.c: I should never be called!
Oct 10 13:04:28 WARNING[18503] chan_iax2.c: I should never be called!
Oct 11 11:27:32 WARNING[28569] chan_iax2.c: I should never be called!
Oct 13 18:34:59 WARNING[25277] chan_iax2.c: I should never be called!
Oct 23 04:53:50 WARNING[23987] chan_iax2.c: I should never be called!
Oct 23 04:53:50 WARNING[23990] chan_iax2.c: I should never be called!
Oct 23 10:54:21 WARNING[25871] chan_iax2.c: I should never be called!
Oct 23 10:54:21 WARNING[25872] chan_iax2.c: I should never be called!
Oct 25 11:37:02 WARNING[15568] chan_iax2.c: I should never be called!
Oct 25 14:22:58 WARNING[17942] chan_iax2.c: I should never be called!
Oct 25 14:22:58 WARNING[17943] chan_iax2.c: I should never be called!
Oct 29 11:21:47 WARNING[21522] chan_iax2.c: I should never be called!
Oct 29 12:16:13 WARNING[28797] chan_iax2.c: I should never be called!
Oct 29 12:28:22 WARNING[30576] chan_iax2.c: I should never be called!
Oct 29 12:46:05 WARNING[1921] chan_iax2.c: I should never be called!
Oct 29 14:00:41 WARNING[9297] chan_iax2.c: I should never be called!
Oct 29 14:24:19 WARNING[11865] chan_iax2.c: I should never be called!
Oct 30 12:23:04 WARNING[18503] chan_iax2.c: I should never be called!
Nov  2 04:55:17 WARNING[19594] chan_iax2.c: I should never be called!
Nov  2 04:55:17 WARNING[19598] chan_iax2.c: I should never be called!
Nov  7 17:41:56 WARNING[10306] chan_iax2.c: I should never be called!
Nov  7 17:41:56 WARNING[10307] chan_iax2.c: I should never be called!
Nov  8 10:29:05 WARNING[15632] chan_iax2.c: I should never be called!

By: John Riordan (john) 2006-11-09 22:38:38.000-0600

Here is a single case:

Nov  7 17:41:56 WARNING[10306] chan_iax2.c: I should never be called!
Nov  7 17:41:56 WARNING[10306] chan_iax2.c: John thinks we should hangup now.
Nov  7 17:41:56 WARNING[10306] channel.c: Unable to write to alert pipe on IAX2/66.227.100.30:4569-44, frametype/subclass 4/1 (qlen = 0): Bad file descriptor!
Nov  7 17:41:56 WARNING[10307] chan_iax2.c: I should never be called!
Nov  7 17:41:56 WARNING[10307] chan_iax2.c: John thinks we should hangup now.
Nov  7 17:41:56 WARNING[10307] channel.c: Unable to write to alert pipe on IAX2/66.227.100.30:4569-58, frametype/subclass 4/1 (qlen = 0): Broken pipe!
Nov  7 18:11:03 WARNING[29695] chan_iax2.c: Max retries exceeded to host 72.25.103.165 on IAX2/66.227.100.30:4569-42 (type = 2, subclass = 4, ts=65540, seqno=1)

By: John Riordan (john) 2006-11-09 23:46:00.000-0600

Ok, I went back into the code to refresh my memory.
Here is what we are seeing to be best of my recollection.

During a bridged call being handled by ast_generic_bridge() [channel.c]
ast_read() is getting called in the main bridge loop. [line 3657]

For reasons beyond what I've dug into, it turns out the read queue
is empty when ast_read() looks for the next frame [line 2024]
and so it calls iax2_read() [line 2043] which currently returns
a null frame and we end up back in ast_generic_bridge(). And repeat.

While I know this doesn't completely outline the exact case where
this is happening, but I hope it helps.

By: Russell Bryant (russell) 2006-11-10 15:59:26.000-0600

Thanks for the update.  I'll try to look at this in more detail when I have some more time ...

By: Serge Vecher (serge-v) 2007-01-05 08:11:27.000-0600

john: what are the messages that come up on the console?

By: John Riordan (john) 2007-01-05 11:11:05.000-0600

The WARNING messages in the earlier note are from the our log.
At the time, I'm sure they were on the console too.

By: Russell Bryant (russell) 2007-03-29 11:57:26

This patch has been merged into the 1.4 branch and trunk in revisions 59341 and 59343.  Thanks!