Summary: ASTERISK-21725: Asterisk 11 attempts IPv6 (with an insane address) when talking to an IPv4-only endpoint
Reporter: Tony Hain (tonyhain)
Labels:
Date Opened: 2013-04-29 12:14:18
Date Closed: 2018-07-11 04:40:20
Versions: 11.3.0 13.18.4
Frequency of
Environment: Asterisk 11.3.0 / Optware (current) / DD-wrt 17201 (2.6 kernel) on Linksys E3000. IPv6 & IPv4 configured and functional.
Attachments: (0) issue_21725_logdate_20130430.txt
Description: Upgraded Asterisk by uninstalling 1.8.? (~Dec '11) and installing 11.3.0.
Migrated jabber to motif for Gvoice.
Calls to/from Gvoice are working again. Calls from the 3102 work. Calls to the 3102 fail.

The most polite thing I can say about the Asterisk 11 IPv6 implementation is that "the developer can't read". The incoming call from X-lite on OSX works fine over IPv6 or IPv4; the dialplan then points to a name for the 3102 that resolves IPv4-only. The IPv4 address for the 3102 is acquired correctly from DNS, because it shows up in the most significant 32 bits of the subsequent IPv6 connect attempt (even when the incoming call from X-lite is IPv4).

At best, the IPv4 address should be stored in IPv6 format as a "mapped" address of the form ::ffff:a.b.c.d (prefix ::ffff:0:0/96). That format should not appear in the outer header on the wire, but it could be passed as data. The entire point of that format was to allow internal storage of 128-bit values for IPv4-only endpoints, so it would make sense if that address format showed up in a log somewhere.

What should never happen is that the IPv4 address goes on the wire as the most significant 32 bits of an IPv6 address. That implies the implementation is 'unaware' that the answer it got from DNS was IPv4, and just jammed the answer plus random crap into the address field. The flip side of that lack of awareness would be an IPv6-only answer coming back from DNS and the 128 bits being jammed into a buffer meant for 32 bits, so that something ends up overwritten. Snippets of the SIP debug output:
<--- Reliably Transmitting (no NAT) to --->
SIP/2.0 200 OK
Via: SIP/2.0/UDP;branch=z9hG4bK-d8754z-f553c96682293764-1---d8754z-;received=;rport=43968
From: "Tony"<sip:X-lite@fqdn>;tag=1bf0cd06
To: <sip:123@fqdn>;tag=as3755e173
Server: HGC PBX
Supported: replaces, timer
Contact: <sip:123@>
Content-Type: application/sdp
Content-Length: 289

o=root 604546809 604546809 IN IP4
s=Asterisk PBX 11.3.0
c=IN IP4
ACK sip:123@ SIP/2.0
Via: SIP/2.0/UDP;branch=z9hG4bK-d8754z-2f015734fce4183e-1---d8754z-;rport
Max-Forwards: 70
Contact: <sip:X-lite@>
To: <sip:123@fqdn>;tag=as3755e173
From: "Tony"<sip:X-lite@fqdn>;tag=1bf0cd06
CSeq: 1 ACK
User-Agent: X-Lite release 4.5 stamp 69608
Content-Length: 0

--- (10 headers 0 lines) ---
Audio is at 15364
Video is at [2001:abc:def:7000:6a7f:74ff:fe9e:8e71]:10006
Adding codec 100003 (ulaw) to SDP
Adding codec 100002 (gsm) to SDP
Adding codec 100004 (alaw) to SDP
Adding codec 100017 (testlaw) to SDP
Adding video codec 200002 (h263) to SDP
Adding non-codec 0x1 (telephone-event) to SDP
Reliably Transmitting (no NAT) to [ac10:903c:989d:bf7a::800:0]:5060:
INVITE sip:3102FXS@3102.fqdn SIP/2.0
Via: SIP/2.0/UDP [2001:abc:def:7000:6a7f:74ff:fe9e:8e71]:5060;branch=z9hG4bK65197241
Max-Forwards: 70
From: "Tony" <sip:X-lite@[2001:abc:def:7000:6a7f:74ff:fe9e:8e71]>;tag=as6bae6195
To: <sip:3102FXS@3102.fqdn>
Contact: <sip:X-lite@[2001:abc:def:7000:6a7f:74ff:fe9e:8e71]:5060>
Call-ID: 6a0a34a2594ff7e57d81a13e14aadd3d@[2001:abc:def:7000:6a7f:74ff:fe9e:8e71]:5060
CSeq: 102 INVITE
Retransmitting #1 (no NAT) to [ac10:903c:989d:bf7a::800:0]:5060:
INVITE sip:3102FXS@3102.fqdn SIP/2.0
Via: SIP/2.0/UDP [2001:abc:def:7000:6a7f:74ff:fe9e:8e71]:5060;branch=z9hG4bK65197241
Max-Forwards: 70
From: "Tony" <sip:X-lite@[2001:abc:def:7000:6a7f:74ff:fe9e:8e71]>;tag=as6bae6195
To: <sip:3102FXS@3102.fqdn>
Contact: <sip:X-lite@[2001:abc:def:7000:6a7f:74ff:fe9e:8e71]:5060>
Call-ID: 6a0a34a2594ff7e57d81a13e14aadd3d@[2001:abc:def:7000:6a7f:74ff:fe9e:8e71]:5060
CSeq: 102 INVITE
Maybe X-lite is doing something stupid, but the only change between working and not was to upgrade Asterisk from 1.8 to 11, so Asterisk would have to be feeding X-lite something different. Even then, the IPv6 address in the INVITE is for an interface on the Asterisk box.
Maybe it is possible that whoever built the Optware image (Asterisk 11.3.0 built by slug @ imitron on a i686 running Linux on 2013-04-24 19:39:03 UTC) had something misconfigured, but even so there are clearly no sanity checks in the core code to make sure the address length matches the protocol being spoken. There is no AAAA record for the 3102, so there should never be an attempt to connect to it using IPv6, and particularly with a malformed address. There never was an attempt to connect over IPv4, just retransmits on IPv6 until the call timed out and closed.

Issue ASTERISK-16545 claims "DNS lookups in chan_sip.c are diligent enough to properly filter which family of address to lookup", but that is clearly not the case. That issue was the flipped scenario (IPv6-only target) on 1.8. In any case, there is no way a 32 bit answer from DNS should get jammed into the upper part of an IPv6 address.
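
To make the distinction in the description concrete, here is a minimal sketch of the IPv4-mapped form versus an IPv4 address misplaced in the top 32 bits. It is not Asterisk code: the helper names `to_mapped` and `wire_family` are hypothetical, and 192.0.2.1 is a documentation address standing in for the 3102.

```python
import ipaddress

def to_mapped(v4: str) -> ipaddress.IPv6Address:
    """Store an IPv4 address in a 128-bit container as ::ffff:a.b.c.d."""
    return ipaddress.IPv6Address((0xFFFF << 32) | int(ipaddress.IPv4Address(v4)))

def wire_family(addr: ipaddress.IPv6Address) -> str:
    """Pick the protocol to speak on the wire for a stored 128-bit address."""
    return "IPv4" if addr.ipv4_mapped is not None else "IPv6"

v4 = "192.0.2.1"  # stand-in for the IPv4-only endpoint (assumption)
mapped = to_mapped(v4)
# The failure pattern reported here: IPv4 bits placed in the top 32 bits.
misplaced = ipaddress.IPv6Address(int(ipaddress.IPv4Address(v4)) << 96)

print(mapped, wire_family(mapped))        # recognised as IPv4
print(misplaced, wire_family(misplaced))  # looks like a native IPv6 address
```

A mapped address can always be recognised and sent over IPv4. Once the bits land at the top of the container, nothing marks the value as IPv4 any more, so it goes out as IPv6.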
Comments:
By: Michael L. Young (elguero) 2013-04-30 08:11:48.517-0500

This looks like it might be related to ASTERISK-21654.

Can you attach a full debug log?



By: Tony Hain (tonyhain) 2013-04-30 16:17:19.099-0500

Not sure how to use this tool to attach the logfile. It is available at:
http://tndh.net/~tony/issue_21725_logdate_20130430.txt   : 2 distinct call attempts
While the lack of address family awareness may be the same issue as ASTERISK-21654, there is no SRV record in this case. The only entry in DNS for the target is an A record. Somehow internally though Asterisk has gotten confused as it thinks there are multiple:
[Apr 30 13:54:39] DEBUG[921][C-0000000d] chan_sip.c: Multiple addresses, using the first one only
[Apr 30 13:54:39] DEBUG[921][C-0000000d] acl.c: For destination 'ac10:903c:989d:bf7a::300:0', our source address is '2001:470:e930:7000:6a7f:74ff:fe9e:8e71'.
the IPv4 address of the target is ==> ac10:903c
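
For readers who want to check the claim, the leading 32 bits of the logged destination can be decoded as a dotted quad. This is a small sketch; `leading_ipv4` is a hypothetical helper name.

```python
import ipaddress

def leading_ipv4(v6: str) -> ipaddress.IPv4Address:
    """Return the most significant 32 bits of an IPv6 address as IPv4."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(v6)) >> 96)

# Destination from the acl.c debug line and from the earlier SIP trace.
for dest in ("ac10:903c:989d:bf7a::300:0", "ac10:903c:989d:bf7a::800:0"):
    print(dest, "->", leading_ipv4(dest))
```

Both destinations decode to the same IPv4 address, while the low-order bits differ between the two call attempts (::300:0 versus ::800:0).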

By: Michael L. Young (elguero) 2013-04-30 23:08:39.384-0500

Thanks for the log... we will take a look at it.

I have attached the log to this issue for you so we can make sure that it is accessible.  Under the menu "More Actions", above, there is an option to "Attach Files".

By: Tony Hain (tonyhain) 2013-05-01 12:54:55.066-0500

When I looked at More Actions, it only had 'canned response'.

FWIW: The concept of locking to the first address in a set is just broken, and always has been. I understand that for multi-stream apps there needs to be an affinity for the duration of a session, but that does not equate to 'lock to one & retry forever'. If the current implementation tried the RFC 6555 approach to find the member of the set to use as the affinity anchor, this bug of blindly inserting the IPv4 address into a 128-bit container might never have been found. That said, in an IPv6 world, multiple addresses per interface will be the order of the day (there is always LL + one or more others), so the sooner apps accept that reality, the more likely they are to live past the transition as IPv4 is weaned out of the system.
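
The idea of picking an affinity anchor by trying candidates, rather than locking to the first one, could look roughly like the sketch below. It is a sequential simplification loosely inspired by RFC 6555 (which staggers parallel attempts), not what Asterisk does. `interleave_families`, `first_responsive`, and the `probe` callback (for example, a SIP OPTIONS ping with a short timeout) are all hypothetical.

```python
import socket
from itertools import zip_longest

def interleave_families(results):
    """Alternate IPv6 and IPv4 candidates, IPv6 first.

    `results` are tuples whose first element is the address family,
    as in socket.getaddrinfo() results.
    """
    v6 = [r for r in results if r[0] == socket.AF_INET6]
    v4 = [r for r in results if r[0] == socket.AF_INET]
    ordered = []
    for a, b in zip_longest(v6, v4):
        ordered.extend(x for x in (a, b) if x is not None)
    return ordered

def first_responsive(results, probe):
    """Return the first candidate for which probe(candidate) is true, else None."""
    for cand in interleave_families(results):
        if probe(cand):
            return cand
    return None
```

The candidate that answers becomes the anchor for the rest of the session. An unreachable first address then costs one short probe instead of a call that retransmits until it times out.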

By: Michael L. Young (elguero) 2013-05-01 14:17:35.698-0500

I am curious about something... are you able to check what is being resolved by the device you have Asterisk running on?  Can you see what it resolves for telegraph.hain-global-consulting.com?

While you are correct that the current design of only using the first address resolved needs to be changed, that is not part of the issue here from what I can see.

So, the debug message says Asterisk is getting "Multiple addresses".  There should only be 1 address retrieved.  Asterisk only uses what comes back from the OS.  If more than one address is being returned to Asterisk, I think you need to start there.  Then Asterisk copies the first result into an appropriately sized structure for handling both IPv4 and IPv6.  It then goes on from there.  It appears to me that Asterisk is displaying what was given to it.

By: Michael L. Young (elguero) 2013-05-01 14:58:09.392-0500

What are you binding to in sip.conf?

By: Tony Hain (tonyhain) 2013-05-01 17:14:47.312-0500

> set ty=any
> telegraph.hain-global-consulting.com

Name:   telegraph.hain-global-consulting.com

That is the correct resolution, as there is only the one A record in DNS. Somebody is synthesizing an IPv6 address that starts with those 32 bits. Given that it worked when running Asterisk 1.8, and failed immediately when switching to Asterisk 11, and nothing else gets confused about the appropriate stack to run when given a 32 bit object, the continuing claims that Asterisk is doing the right thing are questionable.

The sip.conf bind entries are from some examples a couple of years ago that I can't find currently about enabling IPv6.
grep bind sip.conf

The sip stack is clearly binding to IPv4, as inbound calls from the 3102 continue to work. As I recall, the instructions said not to put in multiple bindaddr strings because that would confuse the system, and that IPv4 was always bound to default anyway unless you gave it a specific value. Given that 1.8 worked with this sip.conf, and inbound calls continue to work, the claim that IPv4 is always bound appears to be a true statement.

By: Tony Hain (tonyhain) 2013-05-02 15:29:03.442-0500

replied yesterday, but the Issue Navigator still shows waiting for feedback

By: Matt Jordan (mjordan) 2013-05-02 16:34:47.436-0500

You have to click "Send Back", otherwise it remains in that status.

By: Tony Hain (tonyhain) 2013-05-03 14:13:56.930-0500

Additional input from dig:

dig telegraph.hain-global-consulting.com any
; <<>> DiG 9.6.1-P3 <<>> telegraph.hain-global-consulting.com any
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 47655
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;telegraph.hain-global-consulting.com. IN ANY

telegraph.hain-global-consulting.com. 0 IN A

;; Query time: 6 msec
;; WHEN: Thu May  2 16:55:30 2013
;; MSG SIZE  rcvd: 70

By: Tony Hain (tonyhain) 2013-06-20 17:48:22.928-0500

Update: This is related to DNS resolution. When the extensions.conf file has the explicit IPv4 dotted-quad, the connection to the 3102 works. When it has the DNS name (which only has an IPv4 RR), it fails. Asterisk 11.3.0 built by slug @ imitron on a i686 running Linux on 2013-04-24 19:39:03 UTC

By: Alexander Traud (traud) 2016-09-23 09:33:51.449-0500

Tony, do you still face that issue?

In your case, {{ast_sockaddr_resolve}} in {{main/netsock2.c}} does not behave as expected. Now, the cause of that must be isolated or even better reproducible for others. Are you able to compile Asterisk 11 yourself and try to isolate that issue? That way, you could add some {{ast_log(LOG_NOTICE, …}} into that for-loop at the end of {{ast_sockaddr_resolve}}. If that is not possible,
- Where/how did you get Asterisk – directly via the DD-WRT distribution on your Linksys E3000 router?
- Are you using externaddr, externhost, or localnet in your sip.conf?
- Somebody has to patch {{ast_sockaddr_resolve}} to include an {{ast_debug}} call. That way, this additional debug information reaches you one day.
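
The kind of per-result logging suggested above can be tried outside Asterisk first, to see what the resolver on the box actually returns. The sketch below uses Python's standard `socket.getaddrinfo`, not the Asterisk function. `resolve_and_log` and `EXPECTED_FIELDS` are hypothetical names, and the tuple-length check is only a stand-in for comparing `ai_addrlen` against `ai_family` in C.

```python
import socket

# Python reports an AF_INET sockaddr as (host, port) and an AF_INET6 one as
# (host, port, flowinfo, scope_id); the tuple length stands in for the
# C-level check that the address length matches the family.
EXPECTED_FIELDS = {socket.AF_INET: 2, socket.AF_INET6: 4}

def resolve_and_log(host, port, family=socket.AF_UNSPEC):
    """Resolve host and print every result with its family, like a debug log."""
    results = socket.getaddrinfo(host, port, family, socket.SOCK_DGRAM)
    for i, (fam, _type, _proto, _canon, sockaddr) in enumerate(results):
        ok = len(sockaddr) == EXPECTED_FIELDS.get(fam)
        print(f"result {i}: family={fam!r} sockaddr={sockaddr} consistent={ok}")
    return results

# A numeric address needs no DNS, so this runs anywhere.
resolve_and_log("127.0.0.1", 5060)
```

Running it against the name used in the dialplan would show whether the OS resolver itself returns more than one entry, or an entry whose family does not match its address.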

By: Alexander Traud (traud) 2017-12-19 13:55:03.049-0600

[~jcolp], you tagged this report with 13.18.4. Did you face that issue recently, or are you making sure this issue does not get lost/closed? I have never faced that issue, and I have played a lot with IPv4/IPv6 Dual Stack for more than a year now.

By: Joshua C. Colp (jcolp) 2017-12-19 13:56:31.040-0600

[~traud] Any issues which I didn't think had been specifically fixed were tagged during JIRA cleanup.

By: Joshua C. Colp (jcolp) 2017-12-19 13:57:01.912-0600

As [~traud] has stated he's used IPv4 and IPv6 quite a bit in current chan_sip. Is this still a problem for you?

By: Joshua C. Colp (jcolp) 2018-07-11 04:40:20.890-0500

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened, please create a new issue instead. If the new issue is related to this one, a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidelines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines