
Summary: ASTERISK-06232: After an unknown period of time asterisk starts refusing IAX inbound calls
Reporter: Scott Caudell (scaudell)
Labels:
Date Opened: 2006-02-01 14:28:44.000-0600
Date Closed: 2011-06-07 14:02:42
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_iax2
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
Description: Teliax is the provider. After a period of time (today it was less than 12 hours, but previously it has taken 2-3 days), all my inbound IAX calls fail with the error message below. A full restart of Asterisk does not resolve this issue. What I have found that resolves it is recompiling from the same source code I originally compiled from, after which it starts working again. I have tried this on 1.2.0 through 1.2.3 so far and am seeing the same issue.
I'm running this on Debian Linux kernel 2.6.8-2-686.

****** ADDITIONAL INFORMATION ******

IAX2 Debugging Enabled
Rx-Frame Retry[ No] -- OSeqno: 000 ISeqno: 000 Type: IAX     Subclass: NEW
  Timestamp: 00004ms  SCall: 00215  DCall: 00000 [208.139.204.245:4569]
  VERSION         : 2
  CALLED NUMBER   : 312[omitted]
  CODEC_PREFS     : (ulaw|g726)
  CALLING NUMBER  : 312[omitted]
  CALLING PRESNTN : 0
  CALLING TYPEOFN : 0
  CALLING TRANSIT : 0
  CALLING NAME    : Cell Phone   IL
  LANGUAGE        : en
  USERNAME        : teliax
  FORMAT          : 4
  CAPABILITY      : 63508
  ADSICPE         : 2
  DATE TIME       : 2006-02-01  14:09:44

Tx-Frame Retry[000] -- OSeqno: 000 ISeqno: 001 Type: IAX     Subclass: AUTHREQ
  Timestamp: 00004ms  SCall: 00002  DCall: 00215 [208.139.204.245:4569]
  AUTHMETHODS     : 2
  CHALLENGE       : [challenge omitted]
  USERNAME        : teliax

Rx-Frame Retry[ No] -- OSeqno: 001 ISeqno: 001 Type: IAX     Subclass: AUTHREP
  Timestamp: 00030ms  SCall: 00215  DCall: 00002 [208.139.204.245:4569]
  MD5 RESULT      : f85d4ec5dfca66f87b68c7daaf53791c

Feb  1 15:09:06 NOTICE[14685]: chan_iax2.c:7183 socket_read: Host 208.139.204.245 failed to authenticate as teliax
Comments:

By: Mark Spencer (markster) 2006-02-19 23:32:17.000-0600

Are you still able to duplicate this?  It makes no sense.

By: Olle Johansson (oej) 2006-03-29 18:36:03.000-0600

No response from reporter.

By: Scott Caudell (scaudell) 2006-03-29 22:40:26.000-0600

I replied to the update e-mail; I didn't realize I needed to post it here.
We are working around the issue by getting a PRI & a Zaptel T1 card. It continues to cause a problem; however, recently the recompile doesn't even seem to resolve the issue. I still get the message:
Mar 29 08:57:12 NOTICE[10256]: chan_iax2.c:7201 socket_read: Host 207.174.202.3 failed to authenticate as teliax
in the console.

By: Scott Caudell (scaudell) 2006-03-29 22:44:45.000-0600

Also, I have completely changed out the hardware for this server, and the Linux distro. I am currently running Gentoo Linux 2.6.12-gentoo-r9. Simply running the recompile resolved the issue a number of times (around 10 at least) but has ceased resolving the issue. I have had to forward the calls from our provider to our PRI DID because our number port hasn't gone through yet. This is a completely new install of Asterisk running 1.2.5 with the same configs we used prior.

By: Olle Johansson (oej) 2006-03-29 22:53:31.000-0600

Have you checked the authentication that fails?

By: Scott Caudell (scaudell) 2006-03-29 23:04:00.000-0600

Yes - the username & password are correct for this account.

By: Mark Seamans (n5yzv) 2006-04-01 11:30:34.000-0600

I am having the same issue on 1.2.4 on Gentoo; however, I can temporarily resolve it by stopping Asterisk and starting it.  Happens about every 24 - 48 hours, with no regard to load.  Other peers off the main trunking server remain working fine.  I have upgraded to 1.2.6.  I have the CLI launched with iax2 debug logging to a file via screen.  I will post more if it dies again.



By: Mark Seamans (n5yzv) 2006-04-07 09:14:46

Ok...I -believe- I found my issue: chan_agent.
It seems my queue agents (login/logout), along with queue functionality, would fail prior to the trunk dying.
So I went to ring groups, and am now into day 3 of no troubles (I never made it past 2).
Also, after researching, I was not having an auth issue; I was getting a "no route to host" on the gateway box.



By: Joshua C. Colp (jcolp) 2006-04-15 19:18:20

Any update from you scaudell? Tried 1.2.7.1?

By: Mark Seamans (n5yzv) 2006-04-15 21:19:05

My issue was actually that chan_agent was blowing up the system...and the IAX trunk took it in the teeth.  I would like to see if scaudell was using chan_agent.  The window between the issue showing up in chan_agent and the trunk failing was about 30 seconds.  Not much time to catch the real issue.

By: Scott Caudell (scaudell) 2006-04-17 16:29:10

I'm installing 1.2.7.1 tonight. Will update with a status soon.

By: Scott Caudell (scaudell) 2006-04-20 00:34:44

Upgraded to 1.2.7.1 tonight - Will post another update in a few days.

By: Serge Vecher (serge-v) 2006-05-02 16:41:14

scaudell: looks like your upgrade to 1.2.7.1 went well since there are no more reports from you. Please confirm with a short note. Thanks!

By: Scott Caudell (scaudell) 2006-05-08 17:24:11

Hey - 1.2.7.1 seems to be successful as far as light testing of this issue goes. I have removed the forward to our PRI line so all of our inbound is coming through IAX. If it's going to break, it will break soon. Will post an update later this week.

By: Serge Vecher (serge-v) 2006-05-08 17:34:19

I'm going to close the issue at this time. If the original problem reoccurs, please reopen the issue with a backtrace of a deadlocked asterisk. Thanks.

By: Scott Caudell (scaudell) 2006-05-09 16:24:54

Since I removed the forward from our carrier, it has been a little over 24 hours and we are now seeing this issue once again. Asterisk is not deadlocking, and I'm not sure how to provide you with a backtrace. Please let me know what the steps are to move forward with this.

By: Serge Vecher (serge-v) 2006-05-09 16:29:45

well, what exactly are the symptoms?

set the console to high verbosity, turn iax debug on, and post here as an attachment when the problem occurs.

By: Scott Caudell (scaudell) 2006-05-09 16:39:25

The symptoms are that after a certain period of time (which seems to be related to how many calls we get via IAX) it starts refusing calls, saying that Host 207.174.202.3 failed to authenticate as teliax. Teliax is the provider and I've had lots of trouble working with them on this, so I'm unsure what they are doing. I do know that restarting the box and/or Asterisk makes no difference. The only thing I have found that fixes this is a recompile. The following is the output you asked for. Let me know what else I can do.

Rx-Frame Retry[ No] -- OSeqno: 000 ISeqno: 000 Type: IAX     Subclass: NEW
  Timestamp: 00009ms  SCall: 00205  DCall: 00000 [207.174.202.3:4569]
  VERSION         : 2
  CALLED NUMBER   : 3123240408
  CODEC_PREFS     : (g729|ulaw|g726|gsm)
  CALLING NUMBER  : 3124469704
  CALLING PRESNTN : 0
  CALLING TYPEOFN : 0
  CALLING TRANSIT : 0
  CALLING NAME    : Cell Phone   IL
  LANGUAGE        : en
  USERNAME        : teliax
  FORMAT          : 4
  CAPABILITY      : 63766
  ADSICPE         : 2
  DATE TIME       : 2006-05-09  15:37:46

Tx-Frame Retry[000] -- OSeqno: 000 ISeqno: 001 Type: IAX     Subclass: AUTHREQ
  Timestamp: 00010ms  SCall: 00001  DCall: 00205 [207.174.202.3:4569]
  AUTHMETHODS     : 2
  CHALLENGE       : 189129517
  USERNAME        : teliax

Rx-Frame Retry[ No] -- OSeqno: 001 ISeqno: 001 Type: IAX     Subclass: AUTHREP
  Timestamp: 00054ms  SCall: 00205  DCall: 00001 [207.174.202.3:4569]
  MD5 RESULT      : 2ac64b401502c57b8b9f94bfef5fbc67

May  9 11:37:10 NOTICE[15825]: chan_iax2.c:7203 socket_read: Host 207.174.202.3 failed to authenticate as teliax
Tx-Frame Retry[000] -- OSeqno: 001 ISeqno: 002 Type: IAX     Subclass: REJECT
  Timestamp: 00036ms  SCall: 00001  DCall: 00205 [207.174.202.3:4569]
  CAUSE           : No authority found
  CAUSE CODE      : 50

Rx-Frame Retry[ No] -- OSeqno: 002 ISeqno: 002 Type: IAX     Subclass: ACK
  Timestamp: 00036ms  SCall: 00205  DCall: 00001 [207.174.202.3:4569]
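
For anyone collecting similar logs, here is a minimal Python sketch for tallying the "failed to authenticate" notices per host, which may help show when the failures begin. The regular expression is modelled only on the three NOTICE lines quoted in this report; the function name `count_auth_failures` is made up for illustration, and other Asterisk versions may format the line differently.

```python
import re
from collections import Counter

# Pattern modelled on the NOTICE lines quoted in this report. The source line
# number after "chan_iax2.c:" differs between Asterisk builds, so it is matched
# loosely rather than hard-coded.
AUTH_FAIL = re.compile(
    r"NOTICE\[\d+\]: chan_iax2\.c:\d+ socket_read: "
    r"Host (?P<host>\S+) failed to authenticate as (?P<user>\S+)"
)


def count_auth_failures(lines):
    """Count failed-authentication notices per (host, username) pair."""
    counts = Counter()
    for line in lines:
        match = AUTH_FAIL.search(line)
        if match:
            counts[(match.group("host"), match.group("user"))] += 1
    return counts


if __name__ == "__main__":
    # The three NOTICE lines that appear in this report.
    sample = [
        "Feb  1 15:09:06 NOTICE[14685]: chan_iax2.c:7183 socket_read: "
        "Host 208.139.204.245 failed to authenticate as teliax",
        "Mar 29 08:57:12 NOTICE[10256]: chan_iax2.c:7201 socket_read: "
        "Host 207.174.202.3 failed to authenticate as teliax",
        "May  9 11:37:10 NOTICE[15825]: chan_iax2.c:7203 socket_read: "
        "Host 207.174.202.3 failed to authenticate as teliax",
    ]
    for (host, user), n in count_auth_failures(sample).items():
        print(f"{host} as {user}: {n}")
```

In practice the same function could be fed an open log file object instead of the `sample` list.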

By: Serge Vecher (serge-v) 2006-05-09 18:20:23

> it starts refusing calls saying that Host 207.174.202.3 failed to authenticate as teliax
what do you mean by this statement:
1) Asterisk refuses to accept any iax call (chan_iax2 unresponsive); or
2) Outbound IAX calls to Teliax do not go through, while internally, calls between IAX clients or IAX->SIP work OK.

If 2) then obviously something screwy is going on with Teliax -> test with another provider, like VoipJet.

By: Andrew Kohlsmith (akohlsmith) 2006-05-11 12:04:25

This sounds identical to the issue I have with SVN trunk.

IAX2 works great but for some reason the peer entry gets lost after some point.  "iax2 show peers" shows that the peer that is trying to authenticate just does not exist anymore.

reload chan_iax2.so solves it.

It's maddeningly difficult to reproduce, but it seems to be related to a peer going LAGGED or UNREACHABLE (qualify=yes and qualifysmoothing=yes).  file on IRC wonders if it might have something to do with the rtcache being cleared, but I have no realtime set up on ANY of my Asterisk boxes.

My setup:  A---[dedicated link]---B---[internet]---C
(B has the PRI)

All calls from A must go through B, and similarly, all calls from C must go through B.

A, B and C all run roughly (within a few svn trunk revs) the same code.  I only need to reload chan_iax2.so on B to make it work again.

When A "drops off" from B's perspective, C may or may not continue to work.  Similarly, when C drops off from B's perspective, A may or may not continue to work.  The problem GENERALLY happens on B; I have seen it happen on A as well, but never on C yet.  Whenever a peer can no longer place/take calls, their entry in 'iax2 show peers' is ALWAYS gone.  No exceptions.  I see C trying to place a call to B and I see "failed auth" on B, and "can't auth" on C, which seems to tell me this disappearing peer entry is real.  :-)

I've added some debugging to the cache clearing code on B to see if I can see it trying to delete A or C.  Nothing yet.
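
A rough Python sketch of how the "peer entry is gone" symptom could be watched for. It only parses text that has already been captured (for example from `asterisk -rx "iax2 show peers"`), and it assumes that each peer row starts with the peer name, optionally followed by `/username`, that the header row starts with `Name/`, and that the listing ends with a count line containing "iax2 peers". Those layout assumptions and the function names are illustrative and should be checked against the actual output of the version in use.

```python
def listed_peers(show_peers_output: str) -> set:
    """Return the peer names found in captured 'iax2 show peers' text.

    Assumes (unverified for every version) that each peer row begins with
    'name' or 'name/username' and that the header row begins with 'Name/'.
    """
    names = set()
    for line in show_peers_output.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("Name/"):
            continue  # blank line or header row
        if fields[0].isdigit() and "iax2 peers" in line:
            continue  # assumed trailing summary line, e.g. "2 iax2 peers [...]"
        names.add(fields[0].split("/")[0])
    return names


def missing_peers(show_peers_output: str, expected) -> list:
    """Return the expected peer names that are absent from the listing."""
    return sorted(set(expected) - listed_peers(show_peers_output))


if __name__ == "__main__":
    # Hypothetical listing for illustration only, not real Asterisk output.
    sample = (
        "Name/Username    Host         Port    Status\n"
        "A/A              10.0.0.1     4569    OK\n"
        "1 iax2 peers [1 online, 0 offline, 0 unmonitored]\n"
    )
    print(missing_peers(sample, ["A", "C"]))
```

Run periodically, a check like this would at least timestamp the moment a peer vanishes, which could then be lined up against LAGGED or UNREACHABLE events in the log.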

By: Joshua C. Colp (jcolp) 2006-05-16 19:01:26

The only way this should happen is if the MD5 result generated on one side does not match the one calculated on the other side. I can create a patch for you to see what your side says it should be, and how it's generated, so we can try to narrow down exactly where the issue is.
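
To illustrate the comparison being described, here is a minimal Python sketch. It assumes, based on the IAX2 specification (RFC 5456) as I understand it, that the MD5 RESULT is the hex MD5 digest of the CHALLENGE text followed directly by the shared secret. The function names and the secret `example-secret` are hypothetical; this is not the patch offered above, and it does not reproduce chan_iax2's actual code.

```python
import hashlib
import hmac


def iax2_md5_result(challenge: str, secret: str) -> str:
    """Hex MD5 digest of the challenge text followed by the shared secret.

    Assumption: this mirrors how the MD5 RESULT value is derived in IAX2.
    """
    return hashlib.md5((challenge + secret).encode("utf-8")).hexdigest()


def response_matches(challenge: str, secret: str, received_hex: str) -> bool:
    """Compare a received MD5 RESULT against the locally computed value."""
    expected = iax2_md5_result(challenge, secret)
    # Constant-time comparison; hex digits are ASCII, as compare_digest needs.
    return hmac.compare_digest(expected, received_hex.lower())


if __name__ == "__main__":
    challenge = "189129517"    # CHALLENGE value from the trace in this report
    secret = "example-secret"  # hypothetical; the real secret is not in the report
    print("locally computed:", iax2_md5_result(challenge, secret))
    print(
        "matches received:",
        response_matches(challenge, secret, "2ac64b401502c57b8b9f94bfef5fbc67"),
    )
```

If the locally computed value differs from the MD5 RESULT in the AUTHREP frame while the secret is known to be right, the mismatch lies in what each side thinks the challenge or secret is, which is the kind of thing the proposed debugging patch would expose.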

By: Andrew Kohlsmith (akohlsmith) 2006-05-24 13:19:56

I'm not sure if you're talking to me or the original reporter, but I'd welcome the patch.  I am curious though -- why on earth would you *delete* the peer if the MD5 didn't match?

By: Joshua C. Colp (jcolp) 2006-05-24 13:45:10

This problem is not your problem, tzanger; they're two different ones. As for this bug, I'm closing it since the original poster has not responded to my offer.