Summary: ASTERISK-14767: [patch] dnsmgr: problem handling A and SRV record changes/problem with multiple A/SRV records returned
Reporter: Dennis DeDonatis (dennisd)
Labels:
Date Opened: 2009-09-03 15:54:17
Date Closed: 2010-06-17 10:11:58
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Core/Netsock
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) dnsmgr_15827.patch
Description: This ONLY happens with callcentric and only after Asterisk is up for a few hours:

[2009-09-03 16:28:01.487] DEBUG[6851] devicestate.c: device 'SIP/callcentric' state '6'
[2009-09-03 16:28:01.488] DEBUG[10700] chan_sip.c: ** Our capability: 0x4 (ulaw) Video flag: False Text flag: False
[2009-09-03 16:28:01.488] DEBUG[10700] chan_sip.c: ** Our prefcodec: 0x4 (ulaw)
[2009-09-03 16:28:01.488] DEBUG[10700] chan_sip.c: -- Done with adding codecs to SDP
[2009-09-03 16:28:01.488] DEBUG[10700] chan_sip.c: Done building SDP. Settling with this capability: 0x4 (ulaw)
[2009-09-03 16:28:01.488] DEBUG[10700] chan_sip.c: Initializing initreq for method INVITE - callid 6e3807ab0e1b8f406ae186e8110e9606@dennisd.com
[2009-09-03 16:28:01.488] DEBUG[10700] chan_sip.c: Trying to put 'INVITE sip' onto UDP socket destined for 204.11.192.39:5060
[2009-09-03 16:28:01.488] WARNING[10700] chan_sip.c: sip_xmit of 0x2844f90 (len 943) to 204.11.192.39:5060 returned -1: Address family not supported by protocol
[2009-09-03 16:28:01.488] VERBOSE[10700] app_dial.c:     -- Called 15869091150@callcentric
[2009-09-03 16:28:02.488] DEBUG[8281] chan_sip.c: ** SIP timers: Rescheduling retransmission 2 to 1000 ms (t1 500 ms (Retrans id #19577))
[2009-09-03 16:28:02.488] DEBUG[8281] chan_sip.c: Trying to put 'INVITE sip' onto UDP socket destined for 204.11.192.39:5060
[2009-09-03 16:28:02.488] WARNING[8281] chan_sip.c: sip_xmit of 0x2844f90 (len 943) to 204.11.192.39:5060 returned -1: Address family not supported by protocol
[2009-09-03 16:28:03.488] DEBUG[8281] chan_sip.c: ** SIP timers: Rescheduling retransmission 3 to 2000 ms (t1 500 ms (Retrans id #19577))
[2009-09-03 16:28:03.488] DEBUG[8281] chan_sip.c: Trying to put 'INVITE sip' onto UDP socket destined for 204.11.192.39:5060
[2009-09-03 16:28:03.488] WARNING[8281] chan_sip.c: sip_xmit of 0x2844f90 (len 943) to 204.11.192.39:5060 returned -1: Address family not supported by protocol
[2009-09-03 16:28:05.488] DEBUG[8281] chan_sip.c: ** SIP timers: Rescheduling retransmission 4 to 4000 ms (t1 500 ms (Retrans id #19577))
[2009-09-03 16:28:05.488] DEBUG[8281] chan_sip.c: Trying to put 'INVITE sip' onto UDP socket destined for 204.11.192.39:5060
[2009-09-03 16:28:05.488] WARNING[8281] chan_sip.c: sip_xmit of 0x2844f90 (len 943) to 204.11.192.39:5060 returned -1: Address family not supported by protocol
[2009-09-03 16:28:09.488] DEBUG[8281] chan_sip.c: ** SIP timers: Rescheduling retransmission 5 to 8000 ms (t1 500 ms (Retrans id #19577))
[2009-09-03 16:28:09.488] DEBUG[8281] chan_sip.c: Trying to put 'INVITE sip' onto UDP socket destined for 204.11.192.39:5060
[2009-09-03 16:28:09.488] WARNING[8281] chan_sip.c: sip_xmit of 0x2844f90 (len 943) to 204.11.192.39:5060 returned -1: Address family not supported by protocol
[2009-09-03 16:28:11.129] DEBUG[8281] chan_sip.c: Auto destroying SIP dialog '64357d2a-d82cc7ab@127.0.0.1'
[2009-09-03 16:28:11.129] DEBUG[8281] chan_sip.c: Destroying SIP dialog 64357d2a-d82cc7ab@127.0.0.1
[2009-09-03 16:28:17.094] DEBUG[8281] acl.c: Found IP address for this socket




****** STEPS TO REPRODUCE ******

I've set transport=udp and tried srvlookup with both yes and no (the log above was with no, which is why it uses port 5060 instead of the 5080 in callcentric's SRV records).

If I fully restart Asterisk it works perfectly for at least 3 hours, maybe more.  Then, after a while, this starts happening, but ONLY with callcentric.

****** ADDITIONAL INFORMATION ******

I'm currently running 2.6.27.30-170.2.82.fc10.x86_64, but this has happened in other kernel versions and with other asterisk versions back to 1.6.0.something, I think.

This REALLY seems like it should be an OS issue rather than an Asterisk problem, but I can't get it to happen with any other application, or with any provider other than callcentric.  No traffic actually goes out to callcentric when trying to set up the call.

sip show peers shows:

callcentric/1xxxx    204.11.192.23        N      5060     Unmonitored

IPv6 IS enabled on this machine, but not used.

Comments:

By: Dennis DeDonatis (dennisd) 2009-09-15 11:26:12

I turned qualify=yes for callcentric.

Here is another clue:

[2009-09-15 12:06:54.760] NOTICE[18316] dnsmgr.c: dnssrv: host 'callcentric.com' changed from 204.11.192.36:5060 to 204.11.192.22:5060
[2009-09-15 12:07:41.042] WARNING[19776] chan_sip.c: sip_xmit of 0x7f3888514ab0 (len 497) to 204.11.192.22:5060 returned -1: Address family not supported by p
[2009-09-15 12:07:42.042] WARNING[19776] chan_sip.c: sip_xmit of 0x7f3888514ab0 (len 497) to 204.11.192.22:5060 returned -1: Address family not supported by p
[2009-09-15 12:07:43.043] WARNING[19776] chan_sip.c: sip_xmit of 0x7f3888514ab0 (len 497) to 204.11.192.22:5060 returned -1: Address family not supported by p
[2009-09-15 12:07:44.042] WARNING[19776] chan_sip.c: sip_xmit of 0x7f3888514ab0 (len 497) to 204.11.192.22:5060 returned -1: Address family not supported by p
[2009-09-15 12:07:45.042] WARNING[19776] chan_sip.c: sip_xmit of 0x7f3888514ab0 (len 497) to 204.11.192.22:5060 returned -1: Address family not supported by p
[2009-09-15 12:07:45.043] NOTICE[19776] chan_sip.c: Peer 'callcentric' is now UNREACHABLE!  Last qualify: 46
[2009-09-15 12:07:55.177] WARNING[19776] chan_sip.c: sip_xmit of 0x7f38884e3260 (len 497) to 204.11.192.22:5060 returned -1: Address family not supported by p


It looks like when the dnsmgr changes hosts it causes this.

I'm guessing that I don't have problems with future-nine or voip.ms because they don't change IP addresses that often.  Well, callcentric isn't really changing them either, but they have 9 A records defined.

By: Dennis DeDonatis (dennisd) 2009-09-15 12:04:39

I changed callcentric to only have one A record (I'm running unbound as a DNS resolver) and I don't see the Address family not supported messages any more.  This, of course, won't work for automatic failover.

I bet if future-nine has an outage/failover (rare, but it can happen), Asterisk won't fail over properly (they change the A DNS record).

Maybe the title of this should be changed to something like "dnsmgr: problem handling A and SRV record changes/problem with multiple A/SRV records returned" as it more accurately describes what I'm seeing.

Please let me know if you want me to try anything.  I can override DNS to make it return anything fairly easily.

By: Russell Bryant (russell) 2009-09-15 12:42:44

I have seen this type of thing caused by the sockaddr_in->sin_family field being left uninitialized.  It should be set to AF_INET.  Usually, at least on Linux, if you leave it uninitialized, it will just assume AF_INET (IPv4).

However, if you have IPv6 enabled on the box, it may now require it to be set (it should be set, anyway).

If you completely disable IPv6, I wonder if it will go away.  In any case, we'll have to find where the AF_INET setting is missing and fix it up.
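
A minimal sketch of the initialization described above, for illustration only. The helper name make_ipv4_dest is hypothetical and is not an Asterisk function. It zeroes the whole sockaddr_in before filling it in, so sin_family can never hold stale bytes. When the family field holds garbage, sendto() can fail with EAFNOSUPPORT, which is the "Address family not supported by protocol" text in the logs.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of Asterisk): build a fully initialized
 * IPv4 destination. Returns 0 on success, -1 if ip is not a valid
 * dotted-quad string. */
static int make_ipv4_dest(struct sockaddr_in *out, const char *ip,
                          unsigned short port)
{
    assert(out != NULL && ip != NULL);

    memset(out, 0, sizeof(*out));   /* clear every field, including padding */
    out->sin_family = AF_INET;      /* the field this report is about */
    out->sin_port = htons(port);    /* network byte order */

    /* inet_pton() returns 1 only when the string parsed as an IPv4 address */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

A caller would pass the resulting structure to sendto() with sizeof(struct sockaddr_in) as the address length.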

By: Colin Beckingham (colbec) 2009-10-17 15:39:35

I too am seeing this issue, Asterisk 1.6.2 from SVN, no IPv6 enabled and IPv6 not compiled into the kernel (2.6.31).

For me it is a bit more than minor, asterisk remains registered to CC, all looks good, but calls fail silently.

If I can provide further info let me know.

By: Chris Gentle (gentlec) 2010-01-27 06:24:43.000-0600

I'm seeing the same behavior with 1.6.2.1 with my outgoing SIP line to Vitelity.  As reported above, it seems that the problem starts after dnsmgr updates the IP for the outgoing host.

Calls do fail silently.  There's no indication to the caller that the call is not going through.  You have to be watching the console output to see the problem.

I'm running Asterisk 1.6.2.1 on Ubuntu Server 9.04 with kernel 2.6.28-17.  IPv6 is disabled.

Anything I can do to help debug this?

By: damage (damage) 2010-02-07 14:04:21.000-0600

I confirm this bug on Asterisk 1.6.2.2 on Gentoo with kernel version 2.6.31-gentoo-r6. I run into this problem when my dyndns.org domain switches to another IP.

By: David Chappell (chappell) 2010-02-18 15:20:38.000-0600

Disabling IPv6 does not help.  However, I think russell is on the right track as it appears that Dnsmgr fails to initialize sin_family.  Chan_iax2.c does it itself in at least one place.  I have patched the problem and will be trying it.

By the way, I don't think this bug is minor.  It makes trunks go down and stay down.

Response to Damage's comment below:

It is worse than that.  Any trunk to a SIP provider who uses load balancing will go down, likely on each Dnsmgr refresh.  When the IP address changes, the protocol and port fields get trashed.  This bug should have been marked major.



By: damage (damage) 2010-02-19 11:58:42.000-0600

I acknowledge that this bug makes some people crazy. At least in Germany, dialup connections get dropped after 24 hours by most carriers. After reconnecting you usually have a new IP. Thus your trunk goes down.

By: David Chappell (chappell) 2010-02-19 12:24:16.000-0600

I have attached a patch for this (dnsmgr_15827.patch).

This patch fixes the following problems:

1) sin_family is not set when ast_dnsmgr_lookup() is called
2) The first time the IP address changes, sin_port is reset to whatever it was when ast_dnsmgr_lookup() was called (probably zero)
3) The documentation in dnsmgr.h fails to mention the service parameter
4) Some of the public functions in dnsmgr.c do not have comments.  I copied the first line of the description from dnsmgr.h.

The files patched are:

include/asterisk/dnsmgr.h (documentation only)
main/acl.c (copy sin_family from DNS look result)
main/dnsmgr.c (keep current value of sin_port, set sin_family to AF_INET when parsing IP address)

By: Dennis DeDonatis (dennisd) 2010-02-19 15:13:20.000-0600

I applied the patch to 1.6.2.3-rc2, although patch REALLY didn't want to apply the dnsmgr.c changes for some reason, so I applied them by hand.  I don't see why it didn't want to apply them as the line numbers were right and everything looked the same to me.  The other two files patched just fine.

It's only been running for about 2 hours with the patches, but I haven't seen any problems, yet (and I definitely would have seen problems by now without the patches if I didn't have callcentric.com hard coded in my local DNS - I took out the hardcoded address for this test).

I'll let you know if I see any problems.



By: damage (damage) 2010-02-20 03:37:11.000-0600

I patched 1.6.2.2 successfully without any problems. Maybe the source has changed between 1.6.2.2 and 1.6.2.3.

Still watching Asterisk. Let's see how it behaves.

By: Dennis DeDonatis (dennisd) 2010-02-21 14:26:40.000-0600

So far, Asterisk 1.6.2.3-rc2 with the patch is working perfectly for me.



By: Maciej Krajewski (jamicque) 2010-02-22 04:13:58.000-0600

The same thing happens in 1.6.1.14; however, applying the patch produces this error:
patching file include/asterisk/dnsmgr.h
patching file main/acl.c
Hunk #1 FAILED at 380.
1 out of 1 hunk FAILED -- saving rejects to file main/acl.c.rej
patching file main/dnsmgr.c

By: Birger "WIMPy" Harzenetter (wimpy) 2010-02-26 13:46:24.000-0600

I set up a new provider today and hit the same issue.

The patch cured the issue for me.
No side effects seen.

Thanks a lot for that patch.

By: damage (damage) 2010-02-26 16:09:06.000-0600

Tested on Asterisk 1.6.2.2 (Gentoo)

Seems good to me.

EDIT: Sorry, I played around with my extensions. I just heard my local server.

I could not see the SIP messages anymore. But I get

[Feb 26 23:22:07] NOTICE[7892]: chan_sip.c:20039 handle_request_invite: Call from '' to extension '012011600' rejected because extension not found.

But I think this is another problem. I think the patch is working.



By: Chris Gentle (gentlec) 2010-02-27 12:48:37.000-0600

I applied the patch to 1.6.2.1 and it looks like it has solved the problem for me.  The patch applied cleanly to the 1.6.2.1 sources.  I haven't had a failed outgoing call since I rebuilt with the patch.  Thanks!

By: Dennis DeDonatis (dennisd) 2010-03-04 20:00:57.000-0600

I patched 1.6.2.6-rc1 and it seems to work great.

I patched SVN and it seemed to fix it there, too.

By: krp-kp (krp-kp) 2010-03-16 09:56:18

The same thing happens in 1.6.0.25; however, applying the patch produces these errors:

Patching file include/asterisk/dnsmgr.h
Hunk #1 failed at 42.
No such line 68 in input file, ignoring
Hunk #2 failed at 69.
2 out of 2 hunks failed--saving rejects to include/asterisk/dnsmgr.h.rej

Patching file main/acl.c
Hunk #1 failed at 380.
1 out of 1 hunks failed--saving rejects to main/acl.c.rej

Patching file main/dnsmgr.c
Hunk #1 failed at 52.
Hunk #2 failed at 83.
Hunk #3 failed at 97.
Hunk #4 succeeded at 101 (offset -8 lines).
Hunk #5 failed at 118.
Hunk #6 failed at 131.
Hunk #7 succeeded at 163 (offset 4 lines).
Hunk #8 failed at 176.
6 out of 8 hunks failed--saving rejects to main/dnsmgr.c.rej

By: Leif Madsen (lmadsen) 2010-03-19 09:42:01

From David Chappell on the mailing list:

"May I suggest that the severity of this bug be changed from "minor" to
"major" and "regression" be changed to "yes".  This bug is major because
it makes SIP trunks from numerous providers unusable.  This is a
regression because the bug was not present before chan_sip was changed
to use Dnsmgr."

By: Dennis DeDonatis (dennisd) 2010-04-15 15:01:47

This patch works great for me on 1.6.2.7-rc1 and 1.6.2.7-rc2.

By: Dennis DeDonatis (dennisd) 2010-06-01 21:35:10

This patch works great for me on 1.6.2.8 and 1.6.2.9-rc1.

By: Digium Subversion (svnbot) 2010-06-16 15:34:31

Repository: asterisk
Revision: 270974

U   trunk/main/acl.c
U   trunk/main/dnsmgr.c

------------------------------------------------------------------------
r270974 | mnicholson | 2010-06-16 15:34:30 -0500 (Wed, 16 Jun 2010) | 8 lines

Set sin_family to AF_INET when doing lookups, also reset sin_port the first time the ip address changes.

(closes issue ASTERISK-14767)
Reported by: DennisD
Patches:
     dnsmgr_15827.patch uploaded by chappell (license 8)
Tested by: DennisD, gentlec, damage, wimpy

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=270974

By: Digium Subversion (svnbot) 2010-06-16 15:41:40

Repository: asterisk
Revision: 270975

_U  branches/1.6.2/
U   branches/1.6.2/main/acl.c
U   branches/1.6.2/main/dnsmgr.c

------------------------------------------------------------------------
r270975 | mnicholson | 2010-06-16 15:41:39 -0500 (Wed, 16 Jun 2010) | 15 lines

Merged revisions 270974 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r270974 | mnicholson | 2010-06-16 15:34:31 -0500 (Wed, 16 Jun 2010) | 8 lines
 
 Set sin_family to AF_INET when doing lookups, also reset sin_port the first time the ip address changes.
 
 (closes issue ASTERISK-14767)
 Reported by: DennisD
 Patches:
       dnsmgr_15827.patch uploaded by chappell (license 8)
 Tested by: DennisD, gentlec, damage, wimpy
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=270975

By: Digium Subversion (svnbot) 2010-06-17 09:25:13

Repository: asterisk
Revision: 270974

U   trunk/main/acl.c
U   trunk/main/dnsmgr.c

------------------------------------------------------------------------
r270974 | mnicholson | 2010-06-16 15:34:31 -0500 (Wed, 16 Jun 2010) | 11 lines

Set sin_family to AF_INET when doing lookups, also reset sin_port the first time the ip address changes.

(closes issue ASTERISK-14719)
Reported by: ManChicken

(closes issue ASTERISK-14767)
Reported by: DennisD
Patches:
     dnsmgr_15827.patch uploaded by chappell (license 8)
Tested by: DennisD, gentlec, damage, wimpy

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=270974

By: Digium Subversion (svnbot) 2010-06-17 09:26:25

Repository: asterisk
Revision: 270975

_U  branches/1.6.2/
U   branches/1.6.2/main/acl.c
U   branches/1.6.2/main/dnsmgr.c

------------------------------------------------------------------------
r270975 | mnicholson | 2010-06-16 15:41:40 -0500 (Wed, 16 Jun 2010) | 18 lines

Merged revisions 270974 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

........
 r270974 | mnicholson | 2010-06-16 15:34:31 -0500 (Wed, 16 Jun 2010) | 8 lines
 
 Set sin_family to AF_INET when doing lookups, also reset sin_port the first time the ip address changes.

 (closes issue ASTERISK-14719)
 Reported by: ManChicken
 
 (closes issue ASTERISK-14767)
 Reported by: DennisD
 Patches:
       dnsmgr_15827.patch uploaded by chappell (license 8)
 Tested by: DennisD, gentlec, damage, wimpy
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=270975

By: Digium Subversion (svnbot) 2010-06-17 10:11:27

Repository: asterisk
Revision: 271123

U   branches/1.4/main/acl.c
U   branches/1.4/main/dnsmgr.c

------------------------------------------------------------------------
r271123 | mnicholson | 2010-06-17 10:11:27 -0500 (Thu, 17 Jun 2010) | 7 lines

Set sin_family in ast_get_ip_or_srv() and removed the 'last' member of the ast_dnsmgr_entry struct.

(closes issue ASTERISK-14767)
Reported by: DennisD
Patches:
     (modified) dnsmgr_15827.patch uploaded by chappell (license 8)

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=271123

By: Digium Subversion (svnbot) 2010-06-17 10:11:55

Repository: asterisk
Revision: 271124

_U  trunk/

------------------------------------------------------------------------
r271124 | mnicholson | 2010-06-17 10:11:55 -0500 (Thu, 17 Jun 2010) | 13 lines

Blocked revisions 271123 via svnmerge

........
 r271123 | mnicholson | 2010-06-17 10:11:27 -0500 (Thu, 17 Jun 2010) | 7 lines
 
 Set sin_family in ast_get_ip_or_srv() and removed the 'last' member of the ast_dnsmgr_entry struct.
 
 (closes issue ASTERISK-14767)
 Reported by: DennisD
 Patches:
       (modified) dnsmgr_15827.patch uploaded by chappell (license 8)
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=271124