Summary: ASTERISK-05840: Asterisk Crash
Reporter: Johann Hoehn (johann)
Labels:
Date Opened: 2005-12-14 07:23:22.000-0600
Date Closed: 2006-01-11 19:28:01.000-0600
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) backtrace_01-03-2006-02:29PM.txt
( 1) backtrace_01-04-2006-06:23AM.txt
( 2) backtrace_01-04-2006-06:24AM.txt
( 3) backtrace_01-04-2006-06:48AM.txt
( 4) backtrace_01-04-2006-07:12AM.txt
( 5) backtrace_12-21-2005-combined.txt
( 6) backtrace_2005-12-14.txt
Description:
We upgraded to Asterisk 1.2.1 on Monday (2005-12-12).  On the morning of Wednesday (2005-12-14), Asterisk crashed twice.  In the messages log, the only warning is about Asterisk being unable to send to a SIP device.

Unsure if this is related, but one of the phones that uses NAT was moved to a new office and got a new IP.  That IP was not allowed through the machine's firewall, resulting in lots of warning messages.  Both crashes occurred before this IP was allowed through the firewall.  I temporarily re-blocked that IP for about 15 minutes to see whether Asterisk would die again, and there was no noticeable problem.

The backtraces from the two core dumps are included.

****** ADDITIONAL INFORMATION ******

Dec 14 06:19:28 WARNING[1976]: chan_sip.c:1064 __sip_xmit: sip_xmit of 0x81e1420 (len 519) to 195.85.219.114:5060 returned -1: Operation not permitted
Dec 14 06:19:29 WARNING[1976]: chan_sip.c:1064 __sip_xmit: sip_xmit of 0x81e1420 (len 519) to 195.85.219.114:5060 returned -1: Operation not permitted

Those are repeated about every second.
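To gauge how often these warnings occur and which peer they point at, the log can be summarized with a short script. This is only a sketch: the regular expression is written against the two sample lines above, and the function name `count_xmit_failures` is made up for illustration. "Operation not permitted" is the text for EPERM, which on Linux is commonly what a send call reports when a local firewall rule rejects the outgoing packet; that would be consistent with the blocked IP described above, but the report does not confirm it.

```python
import re
from collections import Counter

# Pattern modeled on the __sip_xmit warning lines quoted in this report.
# It is an assumption that other Asterisk versions use the same wording.
XMIT_FAILURE = re.compile(
    r"WARNING\[\d+\]: chan_sip\.c:\d+ __sip_xmit: "
    r"sip_xmit of 0x[0-9a-fA-F]+ \(len \d+\) to "
    r"(?P<host>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+) "
    r"returned -?\d+: (?P<error>.+)$"
)


def count_xmit_failures(lines):
    """Count failed sip_xmit calls per (host, port, error text)."""
    counts = Counter()
    for line in lines:
        match = XMIT_FAILURE.search(line.strip())
        if match:
            key = (match["host"], int(match["port"]), match["error"])
            counts[key] += 1
    return counts


# The two warnings from the report, used as sample input.
SAMPLE = [
    "Dec 14 06:19:28 WARNING[1976]: chan_sip.c:1064 __sip_xmit: sip_xmit of "
    "0x81e1420 (len 519) to 195.85.219.114:5060 returned -1: Operation not permitted",
    "Dec 14 06:19:29 WARNING[1976]: chan_sip.c:1064 __sip_xmit: sip_xmit of "
    "0x81e1420 (len 519) to 195.85.219.114:5060 returned -1: Operation not permitted",
]

for (host, port, error), n in count_xmit_failures(SAMPLE).most_common():
    print(f"{n} failure(s) sending to {host}:{port}: {error}")
```

Passing an open log file instead of `SAMPLE` works the same way, since the function only iterates over lines.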
Comments:

By: BJ Weschke (bweschke) 2005-12-14 20:11:37.000-0600

Are these cores provided by a build that was compiled with "dont-optimize" ?

By: Johann Hoehn (johann) 2005-12-16 06:59:13.000-0600

Unfortunately no :(

This is a production box.  Would using dont-optimize cause any side effects that should be noted for a production machine?

By: Tilghman Lesher (tilghman) 2005-12-17 20:26:06.000-0600

No, the only side effect of using dont-optimize is that the binaries are a bit larger, because they have extensive debugging enabled.  This should not affect performance, however.

By: BJ Weschke (bweschke) 2005-12-19 21:40:58.000-0600

We're going to need a backtrace from a system compiled with "dont-optimize" in order to trace where this crash is coming from. Are you still able to reproduce it, or should we close this out until you can?

By: Johann Hoehn (johann) 2005-12-20 06:19:40.000-0600

Please close this until I can get a backtrace with dont-optimize.  I'll reopen when I can provide the needed data.

By: BJ Weschke (bweschke) 2005-12-20 06:48:40.000-0600

closing per user request - will reopen if condition happens again and a full bt is obtained.

By: Johann Hoehn (johann) 2005-12-21 07:35:09.000-0600

I now have 2 more backtraces from Asterisk and am recompiling with dont-optimize.  Both crashes were segfaults.

By: Johann Hoehn (johann) 2005-12-21 07:37:44.000-0600

Both backtraces are in the attached file.  Asterisk crashed, I restarted it, and within a minute it crashed again.  It is now running again, so far without issue.  There was not much call activity during the crashes either...

By: Tilghman Lesher (tilghman) 2005-12-21 08:00:04.000-0600

Neither of these backtraces is from a dont-optimize build.  Please run the following:

make clean dont-optimize

in your Asterisk source directory and try again with another backtrace.

By: Johann Hoehn (johann) 2006-01-04 06:51:48.000-0600

There are 4 more backtraces.  This time Asterisk was compiled with debugging turned on.  Notice that all of them die in a call into pthreads.  The machine is Red Hat 7.3 and gcc is 2.9.5.  I've noticed that the README advises using gcc 3.0 or higher.

It seems that some pages that use the Asterisk Manager API are causing the crashes.  I have disabled those pages for now.  I have also tested on another machine running Debian testing with gcc 4.0.3, and I am unable to duplicate the crashes that the Asterisk Manager API pages cause in production.

At this point I think we just need to upgrade the software on the machine and recompile Asterisk.  Please check and see if there are other possible solutions.
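For anyone trying to reproduce this on a test box, a small script that exercises the Manager interface repeatedly may help. This is only a sketch under stated assumptions: the report does not say what the pages actually send, so the login/ping/logoff cycle, the function names `build_ami_action` and `hammer_manager`, and the host, port, and credentials passed in are all hypothetical. The wire format itself (`Key: Value` lines ending in CRLF, with a blank line closing each action) is the standard Manager protocol framing.

```python
import socket


def build_ami_action(action, **fields):
    """Serialize one Manager action as CRLF-terminated 'Key: Value' lines."""
    lines = [f"Action: {action}"]
    lines += [f"{key}: {value}" for key, value in fields.items()]
    # A blank line (the second CRLF) marks the end of the action.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def hammer_manager(host, port, username, secret, rounds=100):
    """Connect, log in, ping, and log off `rounds` times.

    Returns the number of completed rounds. A connection error part way
    through may indicate that the server went away. Responses are read
    but not parsed; this is a load generator, not a full client.
    """
    completed = 0
    for _ in range(rounds):
        with socket.create_connection((host, port), timeout=5) as conn:
            conn.recv(1024)  # greeting banner sent on connect
            conn.sendall(build_ami_action("Login", Username=username, Secret=secret))
            conn.recv(4096)
            conn.sendall(build_ami_action("Ping"))
            conn.recv(4096)
            conn.sendall(build_ami_action("Logoff"))
        completed += 1
    return completed
```

Calling `hammer_manager("127.0.0.1", 5038, "testuser", "testpass")` against a non-production machine (placeholder values; they must match an entry in that machine's manager.conf) would run the cycle 100 times. Whether this triggers the same crash is unknown; it only approximates the kind of traffic described above.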



By: BJ Weschke (bweschke) 2006-01-11 19:27:12.000-0600

closing. crash doesn't happen when using >= gcc 3.0. If we can get it to crash with >= gcc 3.0 let's reopen with a new BT. Thanks.