Summary: ASTERISK-18806: Asterisk stops responding to SIP requests after DOS attack
Reporter: Mark Hassman (mhassman)
Labels:
Date Opened: 2011-11-02 12:50:45
Date Closed: 2011-11-21 14:17:17.000-0600
Status: Closed/Complete
Components: Channels/chan_sip/General, Core/General, Core/RTP
Versions:
Frequency of
Environment: CentOS 2.6.18-274.7.1.el5
Attachments:
Description: Over the past several weeks, my Asterisk server has repeatedly hung: the process itself keeps running, but it no longer handles calls.

I can still access Asterisk via the console. 'sip show peers' shows an active list, but it is stale: the phones report as unregistered and cannot connect or make calls. 'sip show registry' also shows what appears to be an active list, but inbound trunks are not working. 'core restart now' successfully restarts Asterisk and everything returns to normal.

Reviewing /var/log/asterisk/full shows that the issue occurs after either a hack or DoS attempt from an unknown external source. The pertinent log entries are in the reference notes below; two independent occurrences are included.

Worth noting #1: there's a regular DNS lookup check always running for telasip.com. This thread stops running after the hack attempt, along with the PBX functionality, so there's definitely an issue here.

Note #2: my rtp.conf is intentionally limiting ports:
This is to reduce the firewall rule set and the security exposure. Perhaps rtp.c or chan_sip.c isn't properly releasing ports, and the shortage is creating a deadlock? Just a thought.
If this is a contributing factor, wouldn't a larger pool of ports just mask the root bug/issue, leaving it subject to the load/frequency of external hacking?
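
To put a rough number on the port-shortage theory: Asterisk, as far as I know, binds each RTP stream to an even port and pairs it with the next odd port for RTCP, so a range holds about half as many streams as it has ports. The sketch below is only an illustration under that assumption. The function name rtp_capacity and the narrow 10000-10019 range are hypothetical, since the reporter's actual rtp.conf values are not included in this report.

```shell
#!/bin/sh
# Rough estimate of how many simultaneous RTP streams fit in a port range.
# Assumption: every stream consumes two ports (even = RTP, next odd = RTCP).
rtp_capacity() {
    # $1 = first port in the range, $2 = last port in the range
    echo $(( ($2 - $1 + 1) / 2 ))
}

# Hypothetical narrow range (not the reporter's real values).
rtp_capacity 10000 10019    # prints 10

# Range used in the sample rtp.conf shipped with Asterisk.
rtp_capacity 10000 20000    # prints 5000
```

With a pool that small, a burst of bogus call attempts could plausibly hold every port at once, which would fit the symptoms described above without proving the cause.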

Comments:By: Leif Madsen (lmadsen) 2011-11-02 14:47:15.364-0500

Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions, maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support, please move to the 1.8 branch, which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions. After testing with Asterisk 1.8, if you find this problem has not been resolved, please open a new issue against Asterisk 1.8.

By: Leif Madsen (lmadsen) 2011-11-02 15:11:57.520-0500

We discussed something like this in theory at AstriDevCon in that if your RTP port range is really small, and you get a lot of traffic, this very thing could potentially happen. I believe it had something to do with sequence numbers getting used multiple times. I only vaguely recall what the exact scenario was.

If you leave Asterisk up and running for a while after, will it eventually recover (especially if you stop accepting any traffic to Asterisk for a period of time)?

I think a backtrace could potentially be useful, along with a 'core show locks' -- I'll post links to instructions after this comment.

By: Leif Madsen (lmadsen) 2011-11-02 15:12:10.149-0500

Debugging deadlocks: Please select DEBUG_THREADS and DONT_OPTIMIZE in the Compiler Flags section of menuselect. Recompile and install Asterisk (i.e. make install). This will then give you the console command "core show locks". When the symptoms of the deadlock present themselves again, please capture the lock state and backtraces via:

# asterisk -rx "core show locks" | tee /tmp/core-show-locks.txt
# gdb -se "asterisk" <pid of asterisk> | tee /tmp/backtrace.txt
gdb> bt
gdb> bt full
gdb> thread apply all bt

Then attach the core-show-locks.txt and backtrace.txt files to this issue. Thanks!
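
The same capture can be done without typing into an interactive gdb prompt, which helps when the hang has to be caught quickly before restarting. This is a sketch only: the function name collect_deadlock_info is made up for illustration, and it assumes a gdb build that supports the -p, -batch and -ex options, an Asterisk built with DEBUG_THREADS, and a user with permission to attach to the process.

```shell
#!/bin/sh
# Sketch: capture lock state and thread backtraces in one pass.
# $1 = pid of the running asterisk process
# $2 = directory to write the two output files into
collect_deadlock_info() {
    pid=$1
    out=$2

    # Lock report; only available when built with DEBUG_THREADS.
    asterisk -rx "core show locks" > "$out/core-show-locks.txt" 2>&1

    # Attach to the process, run the three backtrace commands, then detach.
    gdb -p "$pid" -batch \
        -ex "bt" \
        -ex "bt full" \
        -ex "thread apply all bt" > "$out/backtrace.txt" 2>&1
}
```

While the symptoms are present, it could be run as: collect_deadlock_info <pid of asterisk> /tmp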

By: Leif Madsen (lmadsen) 2011-11-02 15:12:16.481-0500

Thank you for your bug report. In order to move your issue forward, we require a backtrace[1] from the core file produced after the crash. Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then:

make install

After enabling, reproduce the crash, and then execute the backtrace[1] instructions. When complete, attach that file to this issue report.

[1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

By: Mark Hassman (mhassman) 2011-11-02 17:01:13.662-0500

Hi Leif,
Thanks for the speedy reply!

Given the two steps in the queue, upgrading to 1.8 and compiling from source to enable tracing and lock reporting (I'm currently running from RPM), I'll try a first step as a stop-gap: increasing the range of RTP ports. I'm now running with the standard 10k range. If that fails, I'll probably reinstall with 1.8, but that's another project.

I'll let you know if the problem still exists with the larger rtp port pool.
Thanks again!


By: Leif Madsen (lmadsen) 2011-11-03 14:20:10.033-0500

Assigned to reporter while awaiting feedback. Thanks!

By: Leif Madsen (lmadsen) 2011-11-21 14:17:09.789-0600

Suspended due to lack of activity. Please request a bug marshal in #asterisk-bugs on the IRC network irc.freenode.net to reopen the issue should you have the additional information requested.  Further information can be found at http://www.asterisk.org/developers/bug-guidelines

By: Mark Hassman (mhassman) 2011-11-21 16:01:16.337-0600

Sorry for the delay. I wanted to allow sufficient time to test reliability.

Summary: increasing the RTP port pool fixed the issue. Asterisk no longer hangs when receiving this type of hack attempt.