|Summary:||ASTERISK-11313: bindaddr problems|
|Date Opened:||2008-01-27 18:45:47.000-0600||Date Closed:||2008-07-21 11:23:51|
|Environment:||Attachments:||( 0) sipdump|
|Description:||I run asterisk on a machine with six IP addresses not counting loopback, four public and two private, the latter connecting VPNs. I want SIP on only one public and one private IP address and absolutely not on the rest. |
As far as I have been able to figure by trial and error (yeay documentation!), there is no way to tell chan_sip to bind to two addresses; bindaddr will only accept one single address and can only occur once in sip.conf. So, since I need SIP on two addresses and must not advertise it on the other four, using bindaddr=0.0.0.0 and firewalling UDP 5060 on the remaining addresses seems to be the only solution.
I have a NAT'ed Linksys PAP2 SIP client that keeps causing problems. It works just fine with bindaddr=[public SIP address] in sip.conf, but keeps failing with bindaddr=0.0.0.0. Debugging SIP with bindaddr=0.0.0.0 shows "Contact: <sip:[account]@[wrong_public_address]>" headers being sent by asterisk in the "Trying" message of the registration process, followed by "401 Unauthorized". The wrong public address that is sent is the "real" public address of the machine, the rest being aliases on the same physical interface.
My (uncertain) conclusion is that, if bindaddr=0.0.0.0, asterisk puts the first address it can find on the machine in the Contact: header of its response to a registration attempt, thereby causing the client to send authentication data to the wrong address and authentication to fail.
If this conclusion is correct, the solution should be easy: don't go guessing addresses and don't send the Contact: header at all, but just let the client continue talking to the IP address that it was already talking to.
|Comments:||By: Joshua C. Colp (jcolp) 2008-01-28 06:27:33.000-0600|
Can't you just set bindaddr to 0.0.0.0 but use externip and localnet so the correct IP goes into the messages?
By: Tilghman Lesher (tilghman) 2008-01-28 11:14:58.000-0600
And firewall the other IPs?
By: () 2008-01-28 19:52:01.000-0600
With bindaddr=[public SIP address], the PAP2 client works (and all VPN clients are lost). With bindaddr=0.0.0.0, externip=[public SIP address] and a localnet declaration, asterisk uses the correct address in the Contact: header, but the PAP2 still can't log in (while the VPN clients can). Thus, the wrong Contact: information wasn't the only problem. Globals says nat=no and the section of each nat'ed client says nat=yes in both cases.
I'm attaching a debug dump of both variants (with public IP addresses changed for privacy). The interesting thing is, I don't see any login data being sent to the server when using bindaddr=0.0.0.0, externip=[public SIP address], so it's no wonder the client can't log in. I tried changing bindaddr back and forth several times and this behaviour is consistent. Then again, with everything else unchanged, this simply doesn't make any sense.
By: () 2008-01-28 20:25:19.000-0600
With
bindaddr = 0.0.0.0
externip = 22.214.171.124
the SIP messages correctly use the address declared in externip, but the UDP packets are sent out on the first available interface that can route to the destination, which is not the one in externip. That is, if the machine has addresses 126.96.36.199, 188.8.131.52 and 184.108.40.206, the SIP messages will be sent out on 220.127.116.11 even if externip is 18.104.22.168.
At the client end, the firewall as well as the client expect UDP 5060 from the server's advertised sip address that they have been talking to, so when the packets arrive from elsewhere, they're simply dropped. This is what's causing the PAP2 to keep trying over and over again to register without credentials: it never gets the "401 Unauthorized" responses that asterisk keeps sending to it.
For a while I feared I might be using a bug report for what could prove to be a support question, but this is certainly a bug: if a SIP dialogue becomes asymmetric on the network level, with messages coming in on one address and responses going out on another, nothing will ever work. This probably breaks asterisk on most multi-homed machines, as well as on some single-homed multi-address machines like mine.
I suspect that all this can be solved much more easily by allowing multiple occurrences of bindaddr than by trying to track where messages come in and send replies out on the same address.
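To illustrate the idea of one socket per bound address, here is a minimal Python sketch. It is not Asterisk code and says nothing about how chan_sip is structured; the function names `open_sockets` and `answer_one` are hypothetical. The point it shows is that a reply sent on the same socket that received the request leaves from the address that socket is bound to, so the exchange stays symmetric.

```python
import select
import socket


def open_sockets(addresses, port):
    """Bind one UDP socket per listed address (never the wildcard 0.0.0.0)."""
    socks = []
    for addr in addresses:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind((addr, port))
        socks.append(s)
    return socks


def answer_one(socks, timeout=2.0):
    """Wait for one datagram on any socket and reply on that same socket.

    Returns the local (address, port) of the socket that handled the
    request, or None if nothing arrived before the timeout.
    """
    readable, _, _ = select.select(socks, [], [], timeout)
    if not readable:
        return None
    s = readable[0]
    data, peer = s.recvfrom(2048)
    # Replying on the receiving socket fixes the source address of the
    # reply to the address this socket is bound to.
    s.sendto(b"reply:" + data, peer)
    return s.getsockname()


if __name__ == "__main__":
    # Loopback-only demonstration; port 0 lets the OS pick a free port.
    servers = open_sockets(["127.0.0.1"], 0)
    local = servers[0].getsockname()
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    client.sendto(b"REGISTER", local)
    answer_one(servers)
    _, source = client.recvfrom(2048)
    print("reply came from the address we sent to:", source == local)
    client.close()
    for s in servers:
        s.close()
```

On a real machine the address list would hold the one public and one private address that should carry SIP, with port 5060 instead of 0.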
By: Olle Johansson (oej) 2008-01-30 07:20:09.000-0600
This is a duplicate of a multi-year-old bug report that wasn't fixed and somehow seems to have been lost.
By: () 2008-01-30 08:50:58.000-0600
And what's the prospect of it being fixed now? I mean, is it easy to just add three lines of code to allow multiple occurrences of bindaddr, or is it a major rewrite that could face the same fate as the previous bug?
By: Olle Johansson (oej) 2008-01-30 08:55:14.000-0600
I think the major rewrite is finally on the way in netsock2, which file is working on.
By: Tilghman Lesher (tilghman) 2008-01-30 09:35:14.000-0600
zenon: if you think it's as easy as adding 3 lines of code, you're certainly welcome to upload that patch.
By: () 2008-01-30 09:56:44.000-0600
Uh, I can hardly read code, let alone write it, that's why I asked whether it's three lines of code OR a major rewrite. Anyway, taking into account oej's reply and the new version policy, would it be correct to assume that the fix might make it into 1.6 but will not make it into 1.4?
The bottom line of these questions is that I'm trying to figure out how to deal with this problem in the meantime and how long that might be. All the workarounds I can think of are extremely messy and the only other alternative, to move asterisk to a machine of its own, is costly, so that's what's on my mind; not an argument.
Talking about workarounds, let me ask: if bindaddr=0.0.0.0 and some other application has already bound to port 5060 on some of many IP addresses on the machine when asterisk is started, will asterisk just bind to the remaining addresses or will it fail to start?
By: Olle Johansson (oej) 2008-01-30 10:04:26.000-0600
I don't think it'll make it into 1.4 unless someone comes up with a simple patch - but that hasn't happened before. Sorry.
By: Tilghman Lesher (tilghman) 2008-01-30 10:05:30.000-0600
Asterisk does not enumerate the addresses and bind to all of them. If you look at your netstat output, you'll see that if you set bindaddr=0.0.0.0, it binds exactly to 0.0.0.0. It is currently set up to bind to only one address, which is why this is not a 3 line change (but prove me wrong!).
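The wildcard behaviour described here can be checked outside Asterisk with a small Python sketch. The helper names are hypothetical. The first function shows that a socket bound to 0.0.0.0 reports 0.0.0.0 as its local address rather than a list of interface addresses. The second probes what happens when another socket already holds the same port on a specific address. On Linux, without SO_REUSEADDR, the wildcard bind is expected to fail with EADDRINUSE rather than quietly take the remaining addresses, but the result depends on the operating system, so the sketch reports it instead of assuming it.

```python
import errno
import socket


def wildcard_local_name():
    """Bind a UDP socket to 0.0.0.0 and return the local address it reports."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.bind(("0.0.0.0", 0))  # port 0: let the OS choose a free port
        return s.getsockname()[0]
    finally:
        s.close()


def wildcard_bind_conflicts(specific_addr="127.0.0.1"):
    """Return True if binding 0.0.0.0:port fails with EADDRINUSE while
    another socket already holds specific_addr:port, False if it succeeds."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind((specific_addr, 0))
        port = first.getsockname()[1]
        try:
            second.bind(("0.0.0.0", port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("wildcard socket is bound to:", wildcard_local_name())
    print("wildcard bind conflicts with a specific bind:",
          wildcard_bind_conflicts())
```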
By: Olle Johansson (oej) 2008-01-30 10:10:46.000-0600
Corydon: That's only part of the problem. Even with bindaddr=0.0.0.0 we send from the wrong IP in some cases.
By: Tilghman Lesher (tilghman) 2008-01-30 10:57:33.000-0600
I think we can agree that we are proceeding on this with a developer branch, and it will be fixed sometime during the 1.6 release cycle.
By: () 2008-01-30 14:57:21.000-0600
Why close the bug with won't fix if the plan is to fix it?
Leaving it open until it is actually fixed (a) makes it easier for others with the same problem to find this bug and not report dupes, (b) ensures that the bug won't be forgotten, (c) allows the assignee to make notes and close the bug when fixed, so that Mantis can keep everybody watching it updated and (d) prevents bugs from disappearing unfixed through bug tracker migrations (I'm guessing that's what happened to the previous one, which now somehow seems lost).
By: () 2008-01-30 18:27:29.000-0600
oej: Is bug ASTERISK-2326 the one you had in mind in comment 81406?
corydon: As I said, I can't code. But I did put out a small incentive for whoever can, see http://www.voip-info.org/wiki/view/Asterisk+multiple+bindaddr+bounty
By: Tilghman Lesher (tilghman) 2008-01-31 01:48:40.000-0600
zenon: The resolution is "won't fix" because we will not make this change for 1.4, which is what you reported it against.