[Home]

Summary:ASTERISK-16839: Peer goes unreachable when placing a call from it.
Reporter:Jordan Kirby (jordankirby)Labels:
Date Opened:2010-10-20 06:23:31Date Closed:
Priority:MinorRegression?No
Status:Open/NewComponents:Channels/chan_sip/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 3002_peer_details.txt
( 1) issue_18172_debug.log
( 2) mysql_log.txt
( 3) PhoneGoingOffline.pcap
( 4) sip_show__peer_3001_-_after_unreachable.txt
( 5) sip_show__peer_3001_-_before_call.txt
( 6) sip.conf
( 7) sip_show_objects.txt
Description:We are using 1.8RC5, Asterisk RealTime (MySQL) and realtime caching.
Both the server and the phones are behind separate NAT (phone in office, server in data centre).

In this example, SIP/3001 calls SIP/3002 (via a gosub). The call proceeds fine but asterisk sees SIP/3001 as unavailable for a period of time after the initial invite.

This looks to be because asterisk starts sending OPTIONS and NOTIFIES to the LAN IP address of the phone rather than it's NATed IP address.

In the attached trace:
 10.50.0.47 is the server's LAN IP.
 212.62.4.230 is the server's NATed IP.
 192.168.1.231 is 3001's LAN IP.
 195.59.152.66 is 3001's NATed IP.

Asterisk output:

 == Using SIP RTP CoS mark 5
   -- Executing [3002@DLPN_All:1] Gosub("SIP/3001-000000aa", "internal,3002,1")
   -- Executing [3002@internal:1] Dial("SIP/3001-000000aa", "SIP/3002")
 == Using UDPTL CoS mark 5
 == Using SIP RTP CoS mark 5
   -- Called 3002
   -- SIP/3002-000000ab is ringing
[Oct 20 12:11:05] NOTICE[24099]: chan_sip.c:24694 sip_poke_noanswer: Peer '3001' is now UNREACHABLE!  Last qualify: 37
 == Spawn extension (internal, 3002, 1) exited non-zero on 'SIP/3001-000000aa'
   -- Registered SIP '3001' at 195.59.152.66:17261
[Oct 20 12:11:55] NOTICE[24099]: chan_sip.c:19523 handle_response_peerpoke: Peer '3001' is now Reachable. (34ms / 2000ms)
Comments:By: Stefan Schmidt (schmidts) 2010-10-20 06:43:11

We require a complete debug log to help triage the issue.

This document will provide instructions on how to collect debugging logs from an Asterisk machine for the purpose of helping bug marshals troubleshoot an issue:

http://svn.digium.com/svn/asterisk/trunk/doc/HOWTO_collect_debug_information.txt

could you also please attach the output of sip show peer before and after the invite when the options stop working.

what i see and this is strange in your invite the c= row looks like this:
c=IN IP4 192.168.1.231 normaly the external ip should be used there not the internal. Further the sip response of this call itself is sent back to the right address but somehow the contact ip for this peer is changed.

Thanks!

By: Jordan Kirby (jordankirby) 2010-10-20 08:10:38

Thanks.

I've uploaded the debug log and the result of sip show peer before and after.

By: Stefan Schmidt (schmidts) 2010-10-20 09:20:47

thanks

what i see on the first invite is a wrong via header:
Via: SIP/2.0/UDP 192.168.1.231;branch=z9hG4bKae42d80eB02D4E3
this via should be set to the external adress (by the nat in front) and the port is missed so this could cause this problem. but i am not sure.

By: Jordan Kirby (jordankirby) 2010-10-22 05:25:28

I can confirm this bug still exists in the new release of 1.8.0.

By: Paul Belanger (pabelanger) 2010-10-22 07:38:50

This is a configuration issue, not a bug (see below).

---
Thanks for your comments. This does not appear to be a bug report and we are closing it. We appreciate the difficulties you are facing, but it would make more sense to raise your question in the support tracker, http://www.asterisk.org/support

By: Stefan Schmidt (schmidts) 2010-10-22 08:02:11

after talking to pabelanger we reopen this issue but we need more information from you, like your sip.conf, realtime conf and also if possible a mysql log what happens during this call.

and btw your nat really make some problems here too. i have seen in your pcap file you have a cisco device in your network. If this is a cisco router do you have disabled sip nat service (sip alg)?
if not just set the following parameters in your cisco router:
conf t
no nat service sip udp port 5060
write memory

thanks

schmidts

By: Paul Belanger (pabelanger) 2010-10-22 08:03:04

Reopening after a discussion on #asterisk-dev (see below).  I'd like to see the reporter resolve the nat issue and sip.conf file.

---
<schmidts> pabelanger about issue 18172 do you really think this is a configuration issue?
<MuffinMan> [closed] [Asterisk] Channels/chan_sip/General 0018172: Peer goes unreachable when placing a call from it. reported by jordankirby https://issues.asterisk.org/view.php?id=18172
<pabelanger> schmidts: Until he fixes the nat issue with his phone, no need to waste time on it.
<pabelanger> I suspect it is a routing issue, asterisk is setting the option message to the private IP, which is not routed properly.
<schmidts> But should an invite ever change the the IPaddr field of a peer?
<schmidts> yes, but directly after the invite and before sending a trying back to the user
<pabelanger> schmidts: I'm not sure honestly.
<pabelanger> However, looking at this in the log:
<pabelanger> [Oct 20 13:58:25] DEBUG[24099] chan_sip.c: Trying to put 'OPTIONS sip' onto UDP socket destined for 192.168.1.231:17261
<pabelanger> [Oct 20 13:58:25] NOTICE[24099] chan_sip.c: Peer '3001' is now UNREACHABLE!  Last qualify: 35
<schmidts> yes but, as you can see in the log the dialog itself will go through the right contact adress
<schmidts> 100 trying, 200 ok, and bye or ACK is sent back to the right ip, but the internal one is also saved into the pvt structure of the peer
<pabelanger> Ya, but like you pointed out in the SDP, c=IN IP4 192.168.1.231 is an issue
<schmidts> thats only for rtp and has nothing to do with the sip adress for this pvt
<schmidts> the problem i see with this is if we dont find what cause this problem it could be a possible way to DOS
<schmidts> if its really a asterisk bug
<pabelanger> schmidts: Fair enough, we can reopen it.  Get more feedback.
<schmidts> will do ;)
<pabelanger> But would like to see him resolve is natting issue first

By: Jordan Kirby (jordankirby) 2010-10-22 08:16:31

The Cisco devices on the network are switches and ASA firewalls.
The SIP ALG is disabled on all devices.

We have 15 or so Asterisk 1.6.2 servers running behind the same switch & firewall kit (using static config files rather than realtime) without problem.

By: Stefan Schmidt (schmidts) 2010-10-22 08:35:33

could you please add your sip conf and an example for this peer where this happens.

and also what your database say about this peer before and after this happens. in you log i cant see a realtime update for the new address and i want to know if a upgrade would happens somewhere or not.

thanks!

By: Jordan Kirby (jordankirby) 2010-10-22 08:44:30

I have uploaded sip.conf, peer details from mysql under normal circumstances and the mysql log.

The mysql log shows the database update setting the lastms to -1 but there's no other update:

UPDATE gag1_sip SET lastms = '-1' WHERE name = '3002'

By: Stefan Schmidt (schmidts) 2010-10-22 09:01:30

this update is also in the log file. so the data in the db should be right atleast and thats why the dialog itself works, cause this data is read out of the db.

there is still something strange ;)

By: Jordan Kirby (jordankirby) 2010-10-22 09:02:22

Once peers are registered and online, if I issue a reload some of them start to show in "sip show peers" with their internal IP address:

Name/username              Host                                    Dyn Forcerport ACL Port     Status     Realtime
3001/3001                  192.168.1.231                            D   N      17261    UNREACHABLE Cached RT
3002/3002                  192.168.1.230                            D   N      40452    UNREACHABLE Cached RT


But viewing the database at the same time shows the correct, external IP address:

mysql> select name,ipaddr from gag1_sip;
+--------------+---------------+
| name         | ipaddr        |
+--------------+---------------+
| 3001         | 195.59.152.66 |
| 3002         | 195.59.152.66 |

By: Stefan Schmidt (schmidts) 2010-10-22 09:18:26

if possible could you please turn off the allowguest parameter in your sip conf and try it again.
what i dont understand about this at all is why asterisk sends out a option package directly after the invite is parsed.

please also try "sip show objects" and look if there are more than one object for 3001 or 3002

thanks!

By: Jordan Kirby (jordankirby) 2010-10-22 10:09:58

I see the same problem with allowguest set to no in sip.conf.

The results of sip show objects is attached. I only see 1 entry per peer.

By: Sébastien Couture (sysreq) 2010-10-25 14:59:40

Wouldn't that have to do with the fact that Asterisk, when using Realtime for SIP peers, uses what's in the 'fullcontact' field instead of the 'ipaddr' and 'port' fields to send OPTIONS; and because the 'fullcontact' holds the local IP address and port, even if peer is nat=yes.

This issue had supposedly been fixed in issue 15194, for 1.4 as well as 1.6; but from what I can see, it's not.

http://lists.digium.com/pipermail/asterisk-users/2006-December/174101.html
https://issues.asterisk.org/view.php?id=14756
https://issues.asterisk.org/view.php?id=15194

By: Jordan Kirby (jordankirby) 2010-10-26 08:02:06

This does look to be related to those issues.

If I comment out the line below the problem goes (since we always use NAT it doesn't seem to have a negative impact but I assume it would if you weren't using NAT):

__set_address_from_contact(fullcontact->str, &peer->addr, 0);

(line 26183 in chan_sip.c in 1.8.0)

Unfortunately my knowledge of the asterisk code is far too limited to find where the fault with the code actually is!

By: Mehmet Tolga Avcioglu (mta59066) 2011-07-23 06:41:34.117-0500

Any update on this?

Currently running 1.8.4.2 and experiencing this problem. Just checked 1.8.5.0 and SIP_NAT_RPORT_PRESENT has been changed to SIP_NAT_FORCE_RPORT just above the line mentioned in previous comment. Wondering if that might fix it. Will try shortly.

By: Mehmet Tolga Avcioglu (mta59066) 2011-07-23 09:55:32.845-0500

No still cannot get realtime peers behind nat talking with asterisk behind nat for more than few minutes using 1.8.5.0, rtcachefriends=yes, qualify=yes. I believe my problem is same as the one mentioned on this ticket and which was fixed on 1.6. Any ideas?