Summary: | ASTERISK-06232: After an unknown period of time Asterisk starts refusing inbound IAX calls | ||
Reporter: | Scott Caudell (scaudell) | Labels: | |
Date Opened: | 2006-02-01 14:28:44.000-0600 | Date Closed: | 2011-06-07 14:02:42 |
Priority: | Major | Regression? | No |
Status: | Closed/Complete | Components: | Channels/chan_iax2 |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ||
Description: | Teliax is the provider. After a period of time (today it was less than 12 hours, but previously it has taken 2-3 days), all my inbound IAX calls fail with the error message below. A full restart of Asterisk does not resolve this issue. What I have found that resolves it is to recompile from the same source code I originally compiled from, after which it starts working again. I have tried this on 1.2.0 through 1.2.3 so far and am seeing the same issue. I'm running this on Debian Linux, kernel 2.6.8-2-686.

****** ADDITIONAL INFORMATION ******

IAX2 Debugging Enabled

Rx-Frame Retry[ No] -- OSeqno: 000 ISeqno: 000 Type: IAX Subclass: NEW
   Timestamp: 00004ms SCall: 00215 DCall: 00000 [208.139.204.245:4569]
   VERSION : 2
   CALLED NUMBER : 312[omitted]
   CODEC_PREFS : (ulaw|g726)
   CALLING NUMBER : 312[omitted]
   CALLING PRESNTN : 0
   CALLING TYPEOFN : 0
   CALLING TRANSIT : 0
   CALLING NAME : Cell Phone IL
   LANGUAGE : en
   USERNAME : teliax
   FORMAT : 4
   CAPABILITY : 63508
   ADSICPE : 2
   DATE TIME : 2006-02-01 14:09:44

Tx-Frame Retry[000] -- OSeqno: 000 ISeqno: 001 Type: IAX Subclass: AUTHREQ
   Timestamp: 00004ms SCall: 00002 DCall: 00215 [208.139.204.245:4569]
   AUTHMETHODS : 2
   CHALLENGE : [challenge omitted]
   USERNAME : teliax

Rx-Frame Retry[ No] -- OSeqno: 001 ISeqno: 001 Type: IAX Subclass: AUTHREP
   Timestamp: 00030ms SCall: 00215 DCall: 00002 [208.139.204.245:4569]
   MD5 RESULT : f85d4ec5dfca66f87b68c7daaf53791c

Feb 1 15:09:06 NOTICE[14685]: chan_iax2.c:7183 socket_read: Host 208.139.204.245 failed to authenticate as teliax | ||
Comments: | By: Mark Spencer (markster) 2006-02-19 23:32:17.000-0600
Are you still able to duplicate this? It makes no sense.

By: Olle Johansson (oej) 2006-03-29 18:36:03.000-0600
No response from reporter.

By: Scott Caudell (scaudell) 2006-03-29 22:40:26.000-0600
I replied to the update e-mail I was sent and didn't realize I needed to post it here. We are working around the issue by getting a PRI and a Zaptel T1 card. It continues to cause a problem; however, recently the recompile doesn't even seem to resolve the issue. I still get this message in the console:
Mar 29 08:57:12 NOTICE[10256]: chan_iax2.c:7201 socket_read: Host 207.174.202.3 failed to authenticate as teliax

By: Scott Caudell (scaudell) 2006-03-29 22:44:45.000-0600
Also, I have completely changed out the hardware for this server, and the Linux distro. I am currently running Gentoo Linux 2.6.12-gentoo-r9. Simply running the recompile resolved the issue a number of times (around 10 at least) but has ceased resolving it. I have had to forward the calls from our provider to our PRI DID because our number port hasn't gone through yet. This is a completely new install of Asterisk running 1.2.5 with the same configs we used before.

By: Olle Johansson (oej) 2006-03-29 22:53:31.000-0600
Have you checked the authentication that fails?

By: Scott Caudell (scaudell) 2006-03-29 23:04:00.000-0600
Yes, the username and password are correct for this account.

By: Mark Seamans (n5yzv) 2006-04-01 11:30:34.000-0600
I am having the same issue on 1.2.4 on Gentoo; however, I can temporarily resolve it by stopping Asterisk and starting it again. It happens about every 24 - 48 hours, with no regard to load. Other peers off the main trunking server keep working fine. I have upgraded to 1.2.6. I have the CLI launched with iax2 debug logging to a file via screen. I will post more if it dies again.

By: Mark Seamans (n5yzv) 2006-04-07 09:14:46
Ok...I -believe- I found my issue.
chan_agent. It seems my queue agents (login/logout), along with queue functionality, would fail prior to the trunk dying. So I went to ring groups, and am now into day 3 of no troubles (I never made it past 2). Also, after researching, I was not having an auth issue; I was getting a "no route to host" on the gateway box.

By: Joshua C. Colp (jcolp) 2006-04-15 19:18:20
Any update from you, scaudell? Tried 1.2.7.1?

By: Mark Seamans (n5yzv) 2006-04-15 21:19:05
My issue was actually that chan_agent was blowing up the system...and the IAX trunk took it in the teeth. I would like to see if scaudell was using chan_agent. My window between the issue showing up in chan_agent and the trunk failing was about 30 seconds. Not much time to catch the real issue.

By: Scott Caudell (scaudell) 2006-04-17 16:29:10
I'm installing 1.2.7.1 tonight. Will update with a status soon.

By: Scott Caudell (scaudell) 2006-04-20 00:34:44
Upgraded to 1.2.7.1 tonight - will post another update in a few days.

By: Serge Vecher (serge-v) 2006-05-02 16:41:14
scaudell: looks like your upgrade to 1.2.7.1 went well since there are no more reports from you. Please confirm with a short note. Thanks!

By: Scott Caudell (scaudell) 2006-05-08 17:24:11
Hey - 1.2.7.1 seems to be successful as far as light testing of this issue goes. I have removed the forward to our PRI line, so all of our inbound is coming through IAX. If it's going to break, it will break soon. Will post an update later this week.

By: Serge Vecher (serge-v) 2006-05-08 17:34:19
I'm going to close the issue at this time. If the original problem reoccurs, please reopen the issue with a backtrace of a deadlocked Asterisk. Thanks.

By: Scott Caudell (scaudell) 2006-05-09 16:24:54
Since I removed the forward from our carrier, it has been a little over 24 hours and we are now seeing this issue once again. Asterisk is not deadlocking, and I'm not sure how to provide you with a backtrace. Please let me know what the steps are to move forward with this.
By: Serge Vecher (serge-v) 2006-05-09 16:29:45
Well, what exactly are the symptoms? Set the console to high verbosity, turn IAX debug on, and post the output here as an attachment when the problem occurs.

By: Scott Caudell (scaudell) 2006-05-09 16:39:25
The symptoms are that after a certain period of time (it seems to be related to how many calls we get via IAX) it starts refusing calls, saying that Host 207.174.202.3 failed to authenticate as teliax. Teliax is the provider and I've had lots of trouble working with them on this, so I'm unsure what they are doing. I do know that restarting the box and/or Asterisk makes no difference. The only thing I have found that fixes this is a recompile. The following is the output you asked for. Let me know what else I can do.

Rx-Frame Retry[ No] -- OSeqno: 000 ISeqno: 000 Type: IAX Subclass: NEW
   Timestamp: 00009ms SCall: 00205 DCall: 00000 [207.174.202.3:4569]
   VERSION : 2
   CALLED NUMBER : 3123240408
   CODEC_PREFS : (g729|ulaw|g726|gsm)
   CALLING NUMBER : 3124469704
   CALLING PRESNTN : 0
   CALLING TYPEOFN : 0
   CALLING TRANSIT : 0
   CALLING NAME : Cell Phone IL
   LANGUAGE : en
   USERNAME : teliax
   FORMAT : 4
   CAPABILITY : 63766
   ADSICPE : 2
   DATE TIME : 2006-05-09 15:37:46

Tx-Frame Retry[000] -- OSeqno: 000 ISeqno: 001 Type: IAX Subclass: AUTHREQ
   Timestamp: 00010ms SCall: 00001 DCall: 00205 [207.174.202.3:4569]
   AUTHMETHODS : 2
   CHALLENGE : 189129517
   USERNAME : teliax

Rx-Frame Retry[ No] -- OSeqno: 001 ISeqno: 001 Type: IAX Subclass: AUTHREP
   Timestamp: 00054ms SCall: 00205 DCall: 00001 [207.174.202.3:4569]
   MD5 RESULT : 2ac64b401502c57b8b9f94bfef5fbc67

May 9 11:37:10 NOTICE[15825]: chan_iax2.c:7203 socket_read: Host 207.174.202.3 failed to authenticate as teliax

Tx-Frame Retry[000] -- OSeqno: 001 ISeqno: 002 Type: IAX Subclass: REJECT
   Timestamp: 00036ms SCall: 00001 DCall: 00205 [207.174.202.3:4569]
   CAUSE : No authority found
   CAUSE CODE : 50

Rx-Frame Retry[ No] -- OSeqno: 002 ISeqno: 002 Type: IAX Subclass: ACK
   Timestamp: 00036ms SCall: 00205 DCall: 00001 [207.174.202.3:4569]

By: Serge Vecher (serge-v) 2006-05-09 18:20:23
> it starts refusing calls saying that Host 207.174.202.3 failed to authenticate as teliax

What do you mean by this statement:
1) Asterisk refuses to accept any IAX call (chan_iax2 unresponsive); or
2) Outbound IAX calls to TELIAX do not go through, while internally, calls between IAX clients or IAX->SIP work OK.

If 2), then obviously something screwy is going on with Teliax -> test with another provider, like VoipJet.

By: Andrew Kohlsmith (akohlsmith) 2006-05-11 12:04:25
This sounds identical to the issue I have with SVN trunk. IAX2 works great, but for some reason the peer entry gets lost at some point. "iax2 show peers" shows that the peer that is trying to authenticate just does not exist anymore. Reloading chan_iax2.so solves it. It's maddeningly difficult to reproduce, but it seems to be related to a peer going LAGGED or UNREACHABLE (qualify=yes and qualifysmoothing=yes). file on IRC wonders if it might have something to do with the rtcache being cleared, but I have no realtime set up on ANY of my Asterisk boxes.

My setup:
A---[dedicated link]---B---[internet]---C (B has the PRI)

All calls from A must go through B, and similarly, all calls from C must go through B. A, B and C all run roughly (within a few svn trunk revs) the same code. I only need to reload chan_iax2.so on B to make it work again. When A "drops off" from B's perspective, C may or may not continue to work. Similarly when C drops off from B's perspective, The problem GENERALLY happens on B, but I have seen it happen on A as well, but never on C yet.

Whenever a peer can no longer place/take calls, their entry in 'iax2 show peers' is ALWAYS gone. No exceptions. I see C trying to place a call to B and I see "failed auth" on B, and "can't auth" on C, which seems to tell me this disappearing peer entry is real. :-)

I've added some debugging to the cache clearing code on B to see if I can see it trying to delete A or C. Nothing yet.
By: Joshua C. Colp (jcolp) 2006-05-16 19:01:26
The only way this should happen is if the MD5 result generated on one side does not match the one calculated on the other side. I can create a patch for you to see what your side says it should be, and how it's generated... so we can try to narrow down where exactly the issue is.

By: Andrew Kohlsmith (akohlsmith) 2006-05-24 13:19:56
I'm not sure if you're talking to me or the original reporter, but I'd welcome the patch. I am curious though -- why on earth would you *delete* the peer if the MD5 didn't match?

By: Joshua C. Colp (jcolp) 2006-05-24 13:45:10
This problem is not your problem, tzanger; they're two different ones. As for this bug, I'm closing it since the original poster has not responded to my offer. |
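
For anyone hitting the same "failed to authenticate" notice, the MD5 comparison jcolp refers to can be checked offline from an iax2 debug trace. The following is only a minimal sketch, assuming the IAX2 MD5 scheme as described in RFC 5456 (the MD5 RESULT is the hex MD5 digest of the CHALLENGE string followed by the shared secret). It is not the patch offered above and not code from chan_iax2.c; the function names and the secret are hypothetical.

```python
import hashlib
import hmac


def iax2_md5_result(challenge: str, secret: str) -> str:
    """Return the hex MD5 digest of the challenge followed by the secret.

    Assumption: this mirrors the IAX2 MD5 authentication scheme
    (challenge string concatenated with the shared secret).
    """
    return hashlib.md5((challenge + secret).encode("ascii")).hexdigest()


def auth_matches(challenge: str, secret: str, received: str) -> bool:
    """Compare the locally computed digest with the peer's MD5 RESULT."""
    expected = iax2_md5_result(challenge, secret)
    return hmac.compare_digest(expected, received.strip().lower())


if __name__ == "__main__":
    # CHALLENGE value taken from the AUTHREQ frame in the trace;
    # the secret here is a placeholder, not the real account password.
    challenge = "189129517"
    secret = "hypothetical-secret"
    print(iax2_md5_result(challenge, secret))
```

Feeding in the CHALLENGE from the AUTHREQ frame and the secret configured locally for the user gives the value the local side would expect. If it differs from the MD5 RESULT in the AUTHREP frame, the two sides are most likely not using the same secret for that username.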