Summary: ASTERISK-17253: sip call fails to hang up - asterisk uses 99% resources
Reporter: John Fawcett (john fawcett)
Labels:
Date Opened: 2011-01-16 11:23:20.000-0600
Date Closed:
Priority: Major
Regression? No
Status: Open/New
Components: Channels/chan_sip/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) backtrace-threads_1.txt
( 1) backtrace-threads.txt
( 2) core-show-locks_1.txt
( 3) core-show-locks.txt
( 4) debuglog
( 5) debuglog_1.txt
( 6) sip_history_1.txt
( 7) sip_show_channel_1.txt
Description: After upgrading to 1.8.2 (and also present in 1.8.1.1), Asterisk is often found to be consuming 99% of resources.
This has been traced to calls which have been hung up by the remote party but are still shown as active (in BYE state) in Asterisk.
It seems to be random, but it happens so frequently that I can easily reproduce this state by making test calls until it occurs. I can do any debugging or tracing that is needed; just point me in the right direction.
Comments:

By: Leif Madsen (lmadsen) 2011-01-17 08:53:46.000-0600

Please save traces of calls (SIP debug) and enable SIP history as well. When you have a hung call, please provide the SIP trace and history for that call.

Additionally, please provide backtrace and deadlock information per here:

https://wiki.asterisk.org/wiki/display/AST/Debugging

By: John Fawcett (john fawcett) 2011-01-19 06:58:39.000-0600

I followed all the instructions above (and compiled with DONT_OPTIMIZE). I left debug, verbose and history active for a couple of days and could no longer reproduce the problem.

I have now also upgraded to 1.8.2.1.

I suggest that this issue should be closed.

I have the debug instructions and will reopen with all info requested if this starts happening again.

Sorry that I could not get to the bottom of it.

By: John Fawcett (john fawcett) 2011-02-01 03:00:24.000-0600

I have managed to reproduce the problem, now with 1.8.2.3.

I am attaching backtrace and locks info as well as debug log output with verbose and debug set to 15 and sip debug and sip history on.

As the log contains a lot of info, note that the call which triggered the problem was a test call from extension SIP/203 to a test queue SIP/252.

At the end of the call asterisk is consuming 90+% resources and is unresponsive (i.e. no calls can be made any more).

By: Andrew Latham (lathama) 2011-02-02 09:48:53.000-0600

What are your rtptimeout and rtpholdtimeout set to?

By: John Fawcett (john fawcett) 2011-02-02 13:40:04.000-0600

I haven't specifically set rtptimeout or rtpholdtimeout.

At the Asterisk CLI, "sip show settings" shows:

..
Global Signalling Settings:
---------------------------
..
RTP Timeout:            0 (Disabled)
RTP Hold Timeout:       0 (Disabled)
..

By: Andrew Latham (lathama) 2011-02-02 13:42:25.000-0600

I am looking at something like this right now.  The debug / backtraces are not helping us on this.  Setting the rtptimeout changed the performance drastically.  If I find our issue to be related I will comment.

By: John Fawcett (john fawcett) 2011-02-02 13:47:45.000-0600

Thanks for the update. Is there anything more I can do to improve the information? I couldn't see the BETTER_BACKTRACES option in make menuselect. For some reason the problem is now happening at least once a day, though prior to this it hadn't happened for a week.

By: Andrew Latham (lathama) 2011-02-02 13:53:58.000-0600

BETTER_BACKTRACES is in the RC for 1.8.3 right now.  I assume Leif will package up 1.8.3 soon for that.

By: John Fawcett (john fawcett) 2011-02-03 01:34:48.000-0600

I am now able to reproduce very easily. I have redone the capture of debugging info shortly after starting asterisk so hopefully the info in the logs is easier to read. I noticed that I had not attached the sip call history last time so have also attached that. This time the call that caused the issue was from extension 215 to queue 252. The problem happens after hanging up the call.

One thing I notice is that there is always a failed lock when this happens:

=== ---> Tried and failed to get Lock #1 (chan_sip.c): MUTEX 3614 __sip_autodestruct p->owner 0xb2814ea0 (0)

Also, the channel is left active even though the call has been closed (see the attached sip_show_channel_1.txt).
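
To check whether the same failed lock shows up in every capture, the "core show locks" output files could be scanned with a small script. The following is only a rough sketch: the pattern is derived from the single line quoted above, the field names (`file`, `line`, `func`, `name`, `addr`) are my own labels for what each column appears to mean, and the function name `find_failed_locks` is made up for illustration. Other Asterisk versions may format this line differently.

```python
import re

# Pattern based on the one "failed lock" line quoted in this report.
# The meaning assigned to each field is an assumption, not taken from
# the Asterisk documentation.
FAILED_LOCK = re.compile(
    r"Tried and failed to get Lock #(?P<index>\d+) "
    r"\((?P<file>[^)]+)\): (?P<kind>\w+) (?P<line>\d+) "
    r"(?P<func>\S+) (?P<name>\S+) (?P<addr>0x[0-9a-fA-F]+)"
)


def find_failed_locks(text):
    """Return one dict per 'Tried and failed to get Lock' line in text."""
    return [m.groupdict() for m in FAILED_LOCK.finditer(text)]


# The line quoted above, used as sample input.
sample = (
    "=== ---> Tried and failed to get Lock #1 (chan_sip.c): "
    "MUTEX 3614 __sip_autodestruct p->owner 0xb2814ea0 (0)"
)

for lock in find_failed_locks(sample):
    print(f"{lock['file']}:{lock['line']} {lock['func']} "
          f"waiting on {lock['name']} ({lock['addr']})")
```

Running the same function over the contents of core-show-locks.txt and core-show-locks_1.txt would show whether `__sip_autodestruct` is the waiting function each time.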