ASTERISK-20644: Don't always use the existing TCP connection for in-dialog requests

Summary: ASTERISK-20644: Don't always use the existing TCP connection for in-dialog requests

Reporter: Iñaki Baz Castillo (ibc)
Labels:
Date Opened: 2012-11-02 06:21:11
Date Closed: 2013-08-23 12:11:52
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/Interoperability, Channels/chan_sip/TCP-TLS
Versions: 11.0.0
Frequency of Occurrence: Constant
Related Issues:
Environment:
Attachments:

Description: If Asterisk receives an INVITE via TCP coming from a SIP proxy, answers the call, and later sends a re-INVITE or BYE for that dialog, it sends that in-dialog request over the existing TCP connection previously opened by the SIP proxy. This is incorrect. Asterisk should open a NEW TCP connection to the address given in the top Record-Route header.

So for example, Asterisk receives the following INVITE from TCP 1.2.3.4 port 8888:

{code}
# TCP 1.2.3.4:8888 => ASTERISK:5060

INVITE sip:test@domain.com SIP/2.0
Record-Route: <sip:1.2.3.4:5060;transport=tcp>
{code}

When Asterisk sends a BYE or a re-INVITE in this leg, it MUST respect the address in the top Route of the BYE or re-INVITE, which is: TCP 1.2.3.4 port 5060. So it should send the BYE to:

{code}
# TCP ASTERISK:5060 => 1.2.3.4:5060

BYE sip:abc@9.8.7.6 SIP/2.0
Route: <sip:1.2.3.4:5060;transport=tcp>
{code}

This is something really basic in RFC 3261, and mandatory. Asterisk could use the existing TCP connection only when nat=yes (or some other new values in the latest version that I don't fully know), but by default, if the "nat" parameter is not set for the Proxy peer, please DON'T reuse the existing TCP connection, because that is a violation of RFC 3261.

In fact, it's perfectly legal for a SIP proxy or SIP server to refuse/reject SIP requests coming from a TCP connection that the proxy/server itself opened against a remote proxy/server. These are RFC 3261 rules, really.
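
As a minimal sketch of how the destination in the example above could be derived from the top Route URI (this is not chan_sip code; `struct sip_target` and `parse_route_target` are hypothetical names invented here), assuming no user part, no IPv6 literal, no "sips:" scheme, a lowercase "transport" parameter, and plain defaults of port 5060 and UDP instead of the RFC 3263 lookup procedure:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for this sketch; not a chan_sip structure. */
struct sip_target {
    char host[64];
    int port;
    char transport[8];
};

/*
 * Parse a URI such as "<sip:1.2.3.4:5060;transport=tcp>" into a target.
 * Returns 0 on success, -1 on input this simplified parser cannot handle.
 */
int parse_route_target(const char *route, struct sip_target *out)
{
    const char *p = route;
    const char *end;
    const char *colon;
    const char *tp;
    size_t hostlen;

    if (*p == '<') {
        p++;
    }
    if (strncmp(p, "sip:", 4) != 0) {
        return -1;
    }
    p += 4;

    /* The hostport part ends at the first URI parameter or at '>'. */
    end = p + strcspn(p, ";>");
    colon = memchr(p, ':', (size_t)(end - p));
    hostlen = (size_t)((colon ? colon : end) - p);
    if (hostlen == 0 || hostlen >= sizeof(out->host)) {
        return -1;
    }
    memcpy(out->host, p, hostlen);
    out->host[hostlen] = '\0';

    /* Assumption: fall back to 5060 when the URI carries no port. */
    out->port = colon ? atoi(colon + 1) : 5060;

    /* Assumption: fall back to UDP when no transport parameter is given. */
    strcpy(out->transport, "udp");
    tp = strstr(end, ";transport=");
    if (tp) {
        size_t n;

        tp += strlen(";transport=");
        n = strcspn(tp, ";>");
        if (n == 0 || n >= sizeof(out->transport)) {
            return -1;
        }
        memcpy(out->transport, tp, n);
        out->transport[n] = '\0';
    }
    return 0;
}
```

For the Route header in the BYE above, this yields host 1.2.3.4, port 5060 and transport tcp, which is where the new connection would be opened, independent of the source port (8888) the proxy used for its own connection.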

Comments:

By: Iñaki Baz Castillo (ibc) 2012-11-02 07:47:32.385-0500

IMHO the related "wrong" code is here:

http://lists.digium.com/pipermail/asterisk-commits/2012-October/057327.html

especially here:

{code}
if (p->socket.type != SIP_TRANSPORT_UDP && p->socket.tcptls_session) {
	/* For TCP/TLS sockets that are connected we won't need
	 * to do any hostname/IP lookups */
{code}

This is incorrect. For sending in-dialog requests, Asterisk should not reuse the connection initiated by the remote client/proxy/server but open a new one to the URI in the top Route, or to the URI in the Contact if there is no Route header. Obviously this can be overridden in case of "nat=something..." but it should NOT be the default and only behavior.

By: Walter Doekes (wdoekes) 2012-11-05 06:15:00.145-0600

You're probably right regarding requests with Record-Route headers where the address differs from the connected address. A quick scan through the RFC turns up nothing at all about reusing open connections.

However, the "bad" code that you point to does nothing more than skip DNS lookups.

The socket reuse that you say is wrong is a lot older; see __sip_xmit():
{code}
if (p->socket.type == SIP_TRANSPORT_UDP) {
	res = ast_sendto(p->socket.fd, data->str, ast_str_strlen(data), 0, dst);
} else if (p->socket.tcptls_session) {
	res = sip_tcptls_write(p->socket.tcptls_session, data->str, ast_str_strlen(data));
{code}

By: Iñaki Baz Castillo (ibc) 2012-11-05 06:32:45.217-0600

Reusing an existing connection initiated by the peer for sending requests to it is not allowed as per RFC 3261. There are two specifications that allow it:

* RFC 5626 (Outbound): In which final endpoints (phones) connect to a proxy or server and the proxy/server reuses the same connection for sending requests to the peer. This is what Asterisk does all the time. The problem is that RFC 5626 is just for *endpoints* (phones directly connected to the proxy/registrar), and it should never occur when the peer is a proxy (what I report in this issue).

* RFC 5923 (Connection Reuse): This spec is similar but between servers or proxies. It is really complex and *just* allows reusing a connection when using TLS and the proxy/server initiating the connection presented a valid TLS certificate... So we can forget it.

That said, IMHO the mechanism to improve this behaviour in Asterisk should be:

If the connection is initiated by a proxy (so we get Record-Route) and Asterisk must send an in-dialog request, then:

* Reuse the same connection only if the "nat" parameter has some specific value (always, nat or some new value combination...).
* Otherwise, and by default, DON'T reuse the same connection and instead open a new one to the host:port in the top Route of the in-dialog request to send.

By: Walter Doekes (wdoekes) 2012-11-05 07:53:15.775-0600

If what you say is true, then wouldn't it be more correct to just do the following?
- if nat=someyes then reuse
- else open new connection

For scenarios where there was a proxy, but no Record-Route, it'd be even more wrong to reuse the connection. You want the in-dialog request to go to the Contact.

Nat is already on by default, so that should at least mitigate the change for most people.

By: Matt Jordan (mjordan) 2012-11-05 08:12:36.799-0600

These kinds of changes - particularly in release branches - worry me. Often when we change behavior such as this, we end up dealing with a litany of regression issues for several minor release versions.

Asterisk's behavior may not be "correct" (although like Walter, I haven't found the language in RFC 3261 that explicitly forbids it - but I could certainly be missing it). For sweeping changes like this it'd be nice to know what we're currently breaking. For what devices and/or scenarios is the current behavior detrimental?

By: Olle Johansson (oej) 2012-11-05 08:29:49.806-0600

If you read the comments I added years ago in chan_sip.c this was listed there. It is also clearly explained in sip.conf.sample:

; Note that the TCP and TLS support for chan_sip is currently considered
; experimental. Since it is new, all of the related configuration options are
; subject to change in any release. If they are changed, the changes will
; be reflected in this sample configuration file, as well as in the UPGRADE.txt file.

I would not argue about the RFC with Mr Baz Castillo. He is right here. It will take a lot of code to fix this.

By: Matt Jordan (mjordan) 2012-11-05 08:39:50.028-0600

I'm not arguing the RFC, I'm just looking for a reference point. :-)

By: Iñaki Baz Castillo (ibc) 2012-11-05 08:42:16.277-0600

Hi, replying to all comments:

> For scenarios where there was a proxy, but no Record-Route, it'd be even more wrong to reuse the connection. You want the in-dialog request to go to the Contact.

Sure, but if the proxy didn't add Record-Route then Asterisk may not be able to detect whether it is a proxy or not (it could inspect the number of Via headers...). Anyhow, if there is no Record-Route, IMHO the behavior should be what you suggest:

* if nat=someyes then reuse
* else open new connection

> Asterisk's behavior may not be "correct" (although like Walter, I haven't found the language in RFC 3261 that explicitly forbids it - but I could certainly be missing it)

If there are two RFCs (RFC 5626 and RFC 5923) that allow reusing a remotely initiated connection, IMHO it's clear that this behavior is not allowed by RFC 3261. Please read section 1 "Introduction" in RFC 5923:

{quote}
The SIP protocol includes the notion of a persistent connection
(defined in Section 2), which is a mechanisms to insure that
responses to a request reuse the existing connection that is
typically still available, as well as reusing the existing
connections for other requests sent by the originator of the
connection. However, new requests sent in the backwards direction
are unlikely to reuse the existing connection. This frequently
causes a pair of SIP entities to use one connection for requests
sent in each direction.

Unlike TCP, TLS connections can be reused to send requests in the
backwards direction since each end can be authenticated when the
connection is initially set up.
{quote}

> For what devices and/or scenarios is the current behavior detrimental?

Well, for phones that connect directly to Asterisk there will be no problem or regression, since there won't be a Record-Route and thus Asterisk will behave as it does now (always reuse the connection initiated by the peer). Anyhow, this is not correct and should be settable via the "nat" value(s) for that peer. If, for example, "nat=no/never" and the peer is a phone or server (not a proxy), then Asterisk should not reuse the connection but instead open a new one to the Contact URI hostport.

But there are proxies and servers that open an outgoing TCP connection (i.e. to Asterisk) and will not maintain that connection open all the time (they will close it after N minutes of inactivity, for example). Imagine that a proxy opens a connection to Asterisk, Asterisk receives an INVITE, and the dialog is established and lasts more than 3 hours (with no in-dialog requests in those 3 hours). The proxy will probably close that connection. Now leg B in Asterisk receives a BYE; will Asterisk fail to send it because the connection was closed "by peer A"? That's not correct. Asterisk should open a connection to the top Route URI regardless of whether the connection is still open (if "nat" != "no" / "never").

Phones, by contrast, are typically coded to maintain outgoing TCP connections all the time (they "implemented" something like RFC 5626 "Outbound" before such an RFC existed) because they expect to receive incoming requests over that connection (otherwise there is no way to work in NAT scenarios).

So maybe there could be a new SIP parameter "reuse_connection" (settable for each peer), with "yes" as the default value. If it's set to "no" and the peer establishes a TCP/TLS connection with Asterisk, then when Asterisk must send an in-dialog request to the peer:

- If there is Route header then open a new connection against the URI in the top Route.
- If not, open a new connection against the URI in the Request-URI (the value of the Contact URI in the received INVITE).
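
The two rules above can be sketched roughly as follows. This is only an illustration, not chan_sip code: `struct dialog_info` and `new_connection_target` are hypothetical names, "reuse_connection" is the option proposed in this comment (it does not exist in Asterisk), loose routing is assumed, and falling back to a new connection when the old one has been closed is an assumption taken from the 3-hour scenario described earlier.

```c
#include <stddef.h>

/* Hypothetical dialog state for this sketch; not a chan_sip structure. */
struct dialog_info {
    const char *top_route;      /* first Route URI, or NULL if there is no route set */
    const char *remote_target;  /* Contact URI from the received INVITE */
    int reuse_connection;       /* proposed per-peer option: 1 = "yes" (default), 0 = "no" */
    int session_open;           /* 1 while the peer-initiated connection is still up */
};

/*
 * Returns the URI a new connection should be opened to for an in-dialog
 * request, or NULL when the existing peer-initiated connection is reused.
 */
const char *new_connection_target(const struct dialog_info *d)
{
    /* Reuse only when the peer allows it AND the connection still exists. */
    if (d->reuse_connection && d->session_open) {
        return NULL;
    }
    /* Top Route first; otherwise the Request-URI, i.e. the remote Contact. */
    return d->top_route ? d->top_route : d->remote_target;
}
```

With reuse_connection set to "no" and a Record-Route of sip:1.2.3.4:5060;transport=tcp, the function returns that Route URI. With no route set it returns the Contact URI.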

PS: And hopefully this will also be implemented when Asterisk supports Path (RFC 3327), so a phone can send a REGISTER via a Proxy, the Proxy adds a "Path" header and routes it to Asterisk (registrar server), and when Asterisk sends an initial INVITE to that peer it adds a Route header with the value(s) of the Path header and follows the same behaviour as described above (i.e. based on the "reuse_connection" SIP peer parameter).

By: Matt Jordan (mjordan) 2012-11-05 08:58:12.240-0600

Thanks for describing where this behavior would cause a problem. Given the scope of this change, I think the best way to treat this bug is to attempt to fix it in the next major version branch (Asterisk 12), rather than trying to patch it in the current release branches and potentially break phones using TCP connections. Plus, as you pointed out, this feels like it fits into the same category of problems Asterisk has when it sits behind a SIP Proxy, and adding Path support would be nice to get done for Asterisk 12 as well.

For now I'm going to ack this issue and tag it as something to consider for Asterisk 12.

By: Iñaki Baz Castillo (ibc) 2012-11-05 09:07:30.921-0600

Thanks Matt, I consider your release policy correct. But please don't forget this: some of us are doing real "hacks" to make Asterisk work behind proxies, and even more for "implementing the Path mechanism" when a proxy forwards a REGISTER to Asterisk (see [OverSIP OutboundMangling module|http://www.oversip.net/documentation/1.3.x/api/built_in_modules/outbound_mangling/]) ;)

By: Iñaki Baz Castillo (ibc) 2012-11-05 11:03:30.060-0600

Another comment in defense of the need for this "feature" or fix:

Let's assume that Asterisk XX supports Path.

* Clients send REGISTER to a Proxy (via UDP, TCP, TLS, WS or whatever transport).
* The Proxy adds Path header and routes the REGISTER requests to Asterisk using TCP transport.
* Now imagine that Asterisk does not implement the requested feature in this issue.
* Asterisk is restarted (for any reason).
* Now what, all the registrations are lost? Even worse, clients have no way to detect it, and the Proxy is a proxy, it cannot "resend" the REGISTER requests.

But if the current issue is fixed and the Proxy peer is configured with "reuse_connection=no" (or maybe "nat=comedia,rport,no_reuse_conn") then Asterisk will open a TCP connection to the Proxy when sending an INVITE to any registered user, so things would work even if Asterisk is restarted.
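
To illustrate the Path step in this scenario: per RFC 3327, the registrar stores the Path values received in the REGISTER and later copies them, in the same order, into the Route header of requests sent to that contact. A minimal sketch of that copy, assuming the Path values are already stored as strings (`build_route_from_path` is a hypothetical helper, not an Asterisk function):

```c
#include <stdio.h>

/*
 * Build a "Route: a,b" header line from stored Path values, keeping their order.
 * Writes an empty string when there are no Path values.
 * Returns 0 on success, -1 if buf is too small.
 */
int build_route_from_path(const char *const *path, size_t npath,
                          char *buf, size_t buflen)
{
    size_t used;
    size_t i;
    int n;

    if (buflen == 0) {
        return -1;
    }
    if (npath == 0) {
        buf[0] = '\0';   /* no Path stored: no Route header at all */
        return 0;
    }

    n = snprintf(buf, buflen, "Route: ");
    if (n < 0 || (size_t)n >= buflen) {
        return -1;
    }
    used = (size_t)n;

    for (i = 0; i < npath; i++) {
        /* Comma-separate every value after the first one. */
        n = snprintf(buf + used, buflen - used, "%s%s", i ? "," : "", path[i]);
        if (n < 0 || (size_t)n >= buflen - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return 0;
}
```

The first URI of the resulting Route header is then the address Asterisk would connect to for the INVITE, which is why the registration survives a restart: no previously opened connection from the Proxy is needed.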

Please give this issue the importance it deserves. With the upcoming WebRTC world, INVITE requests may not fit into a UDP datagram (ICE, audio, video in the SDP...) and thus TCP becomes a real need. It must work properly.

By: Matt Jordan (mjordan) 2012-11-05 11:06:50.079-0600

Agreed - one of the major takeaways from AstriDevCon this year was a goal to do some substantial work on the SIP channel driver in Asterisk. Behaving 'friendly' when sitting behind a proxy would certainly fall into that camp.

By: Matt Jordan (mjordan) 2013-08-23 12:11:45.778-0500

After discussing this with Josh a bit, {{res_pjsip}} and its corresponding modules - thanks to the loveliness that is {{pjsip}} - should behave correctly with respect to this issue. That means that channels created with {{chan_pjsip}} should do the right thing when sitting behind a proxy in the scenarios you outlined above.

A few thoughts:
# {{chan_sip}} is obviously still {{chan_sip}}. Since the goal of Asterisk moving forward is to move people from {{chan_sip}} to {{chan_pjsip}}, I'm inclined to not keep this open simply because {{chan_sip}} still can't behave correctly in this fashion. In the long run, we certainly don't want to maintain two SIP channel drivers.
# {{chan_pjsip}} does not, unfortunately, have the Path support that was put into {{chan_sip}} in Asterisk 12. There is an issue (ASTERISK-21084) open, however, to get that done - hopefully relatively soon.

Based on that, I'm going to go ahead and close this out as "Fixed" in version 12. If how {{chan_pjsip}} behaves is not correct per your description in this issue, I'll be more than happy to reopen this.