Summary:ASTERISK-22236: REGISTER reply send to bad port with nat=yes(or force_rport,comedia) in 11.5.0
Reporter:Filip Frank (frenk77)Labels:
Date Opened:2013-07-31 02:44:49Date Closed:2013-10-17 15:38:03
Versions:11.5.0 Frequency of
must be completed before resolvingASTERISK-22562 Open blockers for 11.6.0
is caused byASTERISK-21374 [patch] One-way Audio With auto_* NAT Settings When SIP Calls Initiated By PBX
Environment:Attachments:( 0) asterisk-2236-always-set-rport.diff
( 1) debug_log
( 2) debug.txt
( 3) iptel412_11_5_0.txt
( 4) iptel421_11_5_0.txt
( 5) peers_settings.txt
( 6) sip_settings_11_5_0.txt
Description:I have 2 peers after NAT with same IP, one registers from source port 5060, second registers from port 1114. After upgrade to 11.5.0 only first peer is registered, asterisk send register reply both to 5060 port. I using nat=yes, i try new nat=force_rport,comedia but not helps, I think this is issue in 11.5.0, after downgrade back to 11.4.0 its ok. In 11.4.0 both peers are correctly registered and Asterisk send reply first peer to destination port 5060, second peer to 1114.
Comments:By: Pavel Sidlo (pavel.sidlo@linuxbox.cz) 2013-07-31 03:39:31.839-0500

Same problem for me.

By: Michael L. Young (elguero) 2013-07-31 10:54:03.330-0500

We require a complete debug log to help triage the issue. This document will provide instructions on how to collect debugging logs from an Asterisk machine for the purpose of helping bug marshals troubleshoot an issue: https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information

Please include sanitized (no passwords) version of your sip.conf settings as well.

By: Rusty Newton (rnewton) 2013-08-14 17:39:29.792-0500

I attempted to reproduce this and couldn't reproduce (11.5.0 and SVN 11 branch as of yesterday). With a phone sending from a non-default source port Asterisk responds appropriately regardless of the various NAT settings that I tried.

Can someone Pavel or Filip please post the debug that Michael requested? We need the packet captures of your working and not working scenarios, plus the sip.conf settings that you used for both. Please point out in your debug where the issue occurs (via timestamp or other unique line)

By: Filip Frank (frenk77) 2013-09-01 14:03:46.841-0500

It is a specific situation both IP phones has same IP and different port ( 2 user with same idiotic ISP - which not provide connection to internet, but only  connection to network which is connected to intenernet over NAT). Here is my debug log from 11.4.0 and 11.5.0. You can see asterisk 11.5.0 send packet out to bad port for peer iptel421, with nat=yes it must send it to source port from transport layer as 11.4.0 doing.

[Edited by Rusty Newton - removed inling debug chunk per the guidelines]

By: Filip Frank (frenk77) 2013-09-01 14:13:01.683-0500

Debug 11.4.0 vs 11.5.0

By: Rusty Newton (rnewton) 2013-09-04 18:31:28.447-0500

moved inline debug to debug.txt

By: Walter Doekes (wdoekes) 2013-09-06 05:55:08.676-0500

(iptel412 => iptel-works, iptel421 => iptel-fails)

1. For both 11.4 and 11.5, iptel-fails says "no NAT", while it should do NAT.
2. iptel-works uses ";rport" in the Via => this makes things work.
3. Both devices have: "N" in the Forcerport column, see 1.
4. What happens if you revert this? https://reviewboard.asterisk.org/r/2421/diff/#index_header

Please review the account settings. I believe both accounts have nat=no for some reason. (What does 'sip show peer iptel421' say?)

By: Filip Frank (frenk77) 2013-09-06 06:18:23.192-0500

Added attachment with sip show peer iptel412 on 11.5.0

By: Filip Frank (frenk77) 2013-09-06 06:18:45.104-0500

Added attachment with  sip show peer iptel421 on 11.5.0

By: Filip Frank (frenk77) 2013-09-06 06:22:38.467-0500

Added peer setting attachment

By: Michael L. Young (elguero) 2013-09-06 09:21:40.803-0500


Please attach the output from "sip show settings".


By: Filip Frank (frenk77) 2013-09-06 09:30:44.403-0500

Attached sip show settings on 11.5.0

By: Michael L. Young (elguero) 2013-09-06 09:45:26.936-0500


So you have "nat=no" in the global section?  Can you try commenting that line out so that Asterisk uses the default nat setting?

In 11.5, CHANGES:
From 11.4 to 11.5:
* The default settings for chan_sip are now overriden properly by the general
 settings in sip.conf.  Please look over your settings upon upgrading.

I have a feeling that the "nat=no" was not really having an effect on 11.4 but now it is in 11.5.  I would be curious to see what "sip show settings" said in 11.4 with your setup.

I would recommend using "nat=force_rport,comedia" in 11.5+ for your peers since that does work properly now and "nat=yes" has been deprecated.

Let us know if letting Asterisk use the default nat setting helps or not.

By: Walter Doekes (wdoekes) 2013-09-06 09:49:37.749-0500

@elguero: still doesn't explain the Forcerport=Y in the two individual 'sip show peers'.

By: Michael L. Young (elguero) 2013-09-06 10:46:55.375-0500


You are always spot on... so I went and actually looked into the code instead of trying to troubleshoot based on this info attached.  A full debug log (not just SIP debug) would probably have helped us out as well...

Let me explain my thinking and why I initially recommended what I did to Filip.

I so much dislike these NAT settings since one person's setup can work perfectly and then another combination of the settings doesn't quite work.  Here is the comment from the UPGRADE.txt file:

--- Functionality changes from Asterisk 1.6.2 to Asterisk 1.8 ----------------

SIP Changes
* Due to potential username discovery vulnerabilities, the 'nat' setting in sip.conf
  now defaults to force_rport. It is very important that phones requiring nat=no be
  specifically set as such instead of relying on the default setting. If at all
  possible, all devices should have nat settings configured in the general section as
  opposed to configuring nat per-device.


The recommended way to setting the nat settings, if I understand the above comment correctly, is to use the global settings (general section) and then turn it off per peer when necessary, if I am reading things correctly.  Also, Filip is probably getting messages in his logs about the default nat settings being different from his peer settings and that this combination is not recommended.

The combinations became more complex with the addition of these auto_force_rport and auto_comedia settings that showed up in 11 and at the same time the auto_force_rport became the default setting.  Hence, all the work that was done to try to make the nat settings work as they were intended.

I see this in the code which changed from 11.4 to 11.5, the addition of the conditional:

if (p->natdetected && ast_test_flag(&p->flags[2], SIP_PAGE3_NAT_AUTO_RPORT)) {
                               ast_copy_flags(&p->flags[0], &peer->flags[0], SIP_NAT_FORCE_RPORT);

Therefore, I will admit that the patch you mentioned in an earlier comment which references this change above, was probably not tested by turning on nat per device as opposed to turning it off per device since that does not seem to be the recommended method.  The issue here is probably from the fact that the peer's flags are never being copied to the dialog.  So, in this particular setup, it is causing a problem.

So, I am thinking that the correct solution would be to revert that change and always copy the peer's flags?  That is the change that I see that happened which would affect this particular setup.  We should also be encouraging users to use the recommended way of setting nat.  Just some thoughts... kind of typing out loud here.

By: Rusty Newton (rnewton) 2013-09-26 16:36:56.885-0500

Pinging Walter, as I think Michael was asking Walter about the potential source change.

By: Filip Frank (frenk77) 2013-09-30 02:28:11.142-0500

I try enable nat in global section, and disable it in sip templates, where not need it. And it works ok and solve my problem. But i still think this isnt correct situation if asterisk show me force_rport enabled in sip show peers, but not respect this in communication with phone.

By: Michael L. Young (elguero) 2013-09-30 14:51:02.317-0500

Filip, thanks for the feedback.  That helps to confirm where the issue lies.  I agree that it doesn't seem correct and we will work on fixing that.

By: Kinsey Moore (kmoore) 2013-10-07 08:48:03.121-0500

You appear to have a pretty good handle on the issue here. Have you made any progress on a fix?

By: Michael L. Young (elguero) 2013-10-07 09:22:01.653-0500

Kinsey, I can move forward with a fix, sure.  I was waiting for feedback from Walter but I think it is pretty straight forward.  So, yes I can continue to work on this then.  I did patch a system and I haven't had any side affects from it.

By: Kinsey Moore (kmoore) 2013-10-07 09:28:46.485-0500

Sounds good. If it's a complicated fix, feel free to poke me for a code review.

By: Kinsey Moore (kmoore) 2013-10-14 12:42:13.010-0500

Is there any update on this? I'd like to get it in for 11.6.0 RC2 which will be going out soon.

By: Michael L. Young (elguero) 2013-10-16 14:19:34.806-0500

Sorry for my tardiness... Unfortunately, things have been crazy.

Anyways, please find attached a patch.  It is simple but I wanted to do thorough testing before submitting this.

I have had this patch in use for a bit now without any issues.  A problem with a softphone contributed to the tardiness of attaching this because I thought there was a problem when there really wasn't one.  I have checked everything today and this patch appears to resolve the reported issue.

Kinsey, would you like me to submit this to review board at all?

By: Kinsey Moore (kmoore) 2013-10-16 15:33:16.623-0500

The patch looks good, go ahead and push it in and merge it up.

By: Michael L. Young (elguero) 2013-10-16 16:21:20.076-0500

Kinsey, On IRC Matt asked me to put it up on Review Board.  Should I wait until you review it there?

By: Walter Doekes (wdoekes) 2013-10-17 09:29:10.162-0500

More confusion:

commit cd5283947e908f6e929ea4b2a6b345169caf15a6
Author: Joshua Colp <jcolp@digium.com>
Date:   Fri Jun 26 20:19:49 2009 +0000

   Fix the 'nat' option to actually do RFC3581 as expected and extend the configurable values for finer control.
   (closes issue #8855)
   Reported by: mikma
   Tested by: klaus3000, file
   git-svn-id: http://svn.asterisk.org/svn/asterisk/trunk@203735 f38db490-d61c-443f-a65b-d21fe96a405b
asterisk-git$ git show cd5283947 -p | grep ast_cli.*FORMAT
- ast_cli(a->fd, FORMAT, "Username", "Secret", "Accountcode", "Def.Context", "ACL", "NAT");
+ ast_cli(a->fd, FORMAT, "Username", "Secret", "Accountcode", "Def.Context", "ACL", "ForcerPort");

During that commit. The "N" (NAT) suddenly became "N" (Forcerport), which looks like "no".

Since then, the "A" and "a" options have been added, but "N" still means Forcerport=Yes.

The code now looks like this:
               ast_test_flag(&peer->flags[2], SIP_PAGE3_NAT_AUTO_RPORT) ?
                       ast_test_flag(&peer->flags[0], SIP_NAT_FORCE_RPORT) ? " A " : " a " :
                       ast_test_flag(&peer->flags[0], SIP_NAT_FORCE_RPORT) ? " N " : "   ",    /* NAT=yes? */
but that should be:
               ast_test_flag(&peer->flags[2], SIP_PAGE3_NAT_AUTO_RPORT) ?
                       ast_test_flag(&peer->flags[0], SIP_NAT_FORCE_RPORT) ? " A " : " a " :
                       ast_test_flag(&peer->flags[0], SIP_NAT_FORCE_RPORT) ? " Y " : " N ",    

By: Walter Doekes (wdoekes) 2013-10-17 09:58:36.952-0500

And the check_for_nat check is too strict. Auto-rport basically matches anything.

That could use this change:
@@ -17944,7 +17945,7 @@ static void check_for_nat(const struct ast_sockadd

- if (ast_sockaddr_cmp(addr, &p->recv)) {
+ if (ast_sockaddr_cmp_addr(addr, &p->recv)) {
char *tmp_str = ast_strdupa(ast_sockaddr_stringify(addr));
ast_debug(3, "NAT detected for %s / %s\n", tmp_str, ast_sockaddr_stringify(&p->recv));
p->natdetected = 1;

But other than that, your patch looks good to go.

By: Filip Frank (frenk77) 2013-10-18 02:28:48.190-0500

Thank you very much for fix :-)