|Summary:||ASTERISK-11368: The asterisk service crashes twice a day|
|Date Opened:||2008-02-04 04:04:39.000-0600||Date Closed:||2011-06-07 14:00:40|
|Environment:||Attachments:||( 0) backtrace.txt|
( 1) bt.txt
( 2) frame.txt
( 3) gdb.txt
|Description:||Our production asterisk service (using safe_asterisk) is crashing twice a day on average. I didn't manage to attach a core dump... Someone?|
****** ADDITIONAL INFORMATION ******
This is crucial for us, since we have to start the asterisk service manually because of another issue (Asterisk is out of IAX threads on startup...).
|Comments:||By: Joshua C. Colp (jcolp) 2008-02-04 08:26:48.000-0600|
Was this built with DONT_OPTIMIZE selected under Compiler Flags in menuselect? Could you please do the following and attach the output:
By: moty (moty) 2008-02-04 08:39:48.000-0600
Attached is the output.
I am not sure about the DONT_OPTIMIZE, I think not. Anyhow, since this is a production environment, it will be very hard for me to re-compile it.
By: Joshua C. Colp (jcolp) 2008-02-04 08:44:31.000-0600
This doesn't look right at all... which might have been caused by lack of DONT_OPTIMIZE
By: moty (moty) 2008-02-04 08:51:25.000-0600
First, thanks for the rapid answers.
Second, what do you suggest?
By: Dmitry Andrianov (dimas) 2008-02-04 10:11:57.000-0600
He suggests that you recompile with DONT_OPTIMIZE :)
This option does not really slow asterisk down a lot...
By: Tilghman Lesher (tilghman) 2008-02-06 12:39:34.000-0600
Also, you should be trying SVN 1.4, as we've fixed a fairly major memory corruption issue after the release of 1.4.17 (fix will be in 1.4.18), and from the looks of your backtrace, you've got memory corruption.
By: moty (moty) 2008-02-07 04:13:15.000-0600
Thanks all for your replies.
I will hopefully compile asterisk with the DONT_OPTIMIZE flag to get a better backtrace.
Anyhow, I will wait for the final .18 release, since again, it's a production environment.
Any other suggestions would be much appreciated.
By: Norman Franke (norman) 2008-02-08 19:48:20.000-0600
This looks identical [Thread 325 (process 6235)] to a crash I keep getting, except I'm running 1.4.18-rc4 which doesn't seem to be different from 1.4.18 release. In my case, I can reverse engineer that the thread that was corrupted was dialing an extension from a client, but failed
By: Chris Miller (scratchspace) 2008-02-09 13:02:40.000-0600
We're experiencing similar behavior with 1.4.17 on RHEL 5 kernel 2.6.18-53.1.4.el5. The system has Sangoma A101d and A200 installed with wanpipe 3.2.1 and Zaptel 1.4.7. The problems we've witnessed seem consistent with or related to 11712, 11818, 11862, and 11915. Originally the system would just hang and we would have to kill -9 the process. We did see failed locks building up to this event. Upon analyzing a couple of core dumps, it appeared this was most likely the memory corruption issue. Nonetheless, we applied patches 11818 and 11862 to 1.4.18 final. Within 12 hours the system crashed while one local SIP extension called another. Attached is a backtrace.
By: Norman Franke (norman) 2008-02-12 18:11:25.000-0600
You may want to try the patch in ASTERISK-1189960, since I'm testing it to help with my similar crashes.
By: moty (moty) 2008-02-14 02:54:59.000-0600
I've attached another backtrace after compiling asterisk with the DONT_OPTIMIZE flag. Please take a look and let me know if it helps to resolve this issue.
By: Norman Franke (norman) 2008-02-14 10:53:41.000-0600
Can you also upload the console log? Did it crash or just hang? Thread 364 seems very similar to the other issues, I think.
To track these down, I often run them under gdb. For example, "gdb /usr/sbin/asterisk", then "run -c -vvvg". When something happens, I can press Ctrl-C to enter the debugger, run "generate-core-file", and then re-run. I can then analyze the core offline with minimal downtime. Unfortunately, when I did that for a similar freeze-up, I couldn't really tell anything. In my case, I suspected a new SIP call was being initiated from a client workstation, and while trying to add the channel to the channel list, it froze (since it wasn't actually in the list yet.)
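The offline analysis step could be scripted roughly as follows. This is only a sketch, not something taken from this thread: the helper name gdb_backtrace_argv and the core file name core.12345 are hypothetical (gdb's generate-core-file writes core.<pid> by default when no file name is given), and the set of backtrace commands is an assumption about what is useful to attach. It only builds the gdb command line; running it is left to the caller.

```python
def gdb_backtrace_argv(binary, core):
    """Build an argv list that makes gdb print backtraces from a core file.

    Hypothetical helper for illustration. gdb runs in batch mode, executes
    each -ex command in order against the core, and then exits.
    """
    commands = [
        "bt",                   # backtrace of the thread selected when the core loads
        "bt full",              # same, including local variables
        "thread apply all bt",  # backtrace of every thread
    ]
    argv = ["gdb", "-batch"]
    for command in commands:
        argv += ["-ex", command]
    # The executable must be the same build that produced the core.
    return argv + [binary, core]


if __name__ == "__main__":
    # core.12345 is a placeholder; substitute the file gdb actually wrote.
    print(gdb_backtrace_argv("/usr/sbin/asterisk", "core.12345"))
```

The returned list can be passed to subprocess.run with capture_output=True, and the captured output saved to a text file for attaching to the issue.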
By: moty (moty) 2008-03-02 02:42:38.000-0600
By: Jason Parker (jparker) 2008-04-02 13:00:54
moty: norman asked several questions and gave some very good advice on 02/14. Please upgrade to Asterisk 1.4.19, answer his questions, follow his instructions, and report back here on whether this is still an issue.
By: Russell Bryant (russell) 2008-04-22 13:43:18
Suspended. Feel free to reopen after an upgrade with up to date information about what is happening.