Summary: ASTERISK-11215: My asterisk crashes randomly with very low volume
Reporter: Private Name (falves11)
Labels:
Date Opened: 2008-01-11 21:49:07.000-0600
Date Closed: 2011-06-07 14:00:51
Versions:
Frequency of
Environment:
Attachments:
( 0) blowup10.txt
( 1) blowup12.txt
( 2) blowup5.txt
( 3) blowup6.txt
( 4) blowup7.txt
( 5) blowup8.txt
( 6) blowup9.txt
( 7) thread_bt1.txt
( 8) valgrind_core1.txt
( 9) valgrind.txt
(10) valgrind2.txt
(11) valgrindnew1.txt
Description: It happens at random
Comments:
By: Private Name (falves11) 2008-01-11 21:51:14.000-0600

This is a 64-bit RHEL 5 environment.

By: Tilghman Lesher (tilghman) 2008-01-11 22:07:24.000-0600

Valgrind, please.

By: Private Name (falves11) 2008-01-11 22:19:44.000-0600

I cannot run Asterisk under valgrind in a production environment. Is there any way to extract the information from the core dumps? I compiled Asterisk with "Don't Optimize"; should I use any other compiler options?

By: Private Name (falves11) 2008-01-12 20:25:07.000-0600

I figured out what was happening. One dialer keeps sending the same number over and over until it overwhelms Asterisk. I always get the same information in the core dump. Question: what can we do to make it more resilient? Based on the information posted, is there any way to make it harder to crash? The machine was not overloaded at all, and MAX_MON in autoservices.c was 15000, 10 times the standard value, and it still crashed every 20 minutes.

By: Private Name (falves11) 2008-01-14 18:41:39.000-0600

I wonder if somebody can upload a safe_asterisk that uses valgrind. I have a copy, but it restarts Asterisk immediately and goes into a loop. Additionally, "make valgrind" does not work.

By: Private Name (falves11) 2008-01-14 23:07:38.000-0600

I uploaded the valgrind file and the backtrace. It seems that the culprit is channel_h323. I wonder who the person is that can take a look at the code, for I use h323 for business 24x7.

By: Private Name (falves11) 2008-01-15 17:19:05.000-0600

If you look at the latest blowup, blowup9.txt, you will see that h323 has nothing to do with this. I configured H323.conf not to listen on the usual port, so that I would not receive h323 calls, and it still blew up. I hope that the backtrace holds enough information to indicate what is happening.

By: Tilghman Lesher (tilghman) 2008-01-15 20:28:11.000-0600

Until you give me actual valgrind output, as specified in doc/valgrind.txt, I cannot help you with this issue.

By: Private Name (falves11) 2008-01-15 21:09:04.000-0600

Could you upload an example of safe_asterisk with valgrind support? I am afraid that when it blows up I will not be watching it and my business will stop. I need Asterisk to come back after a crash.
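
A minimal sketch of the kind of wrapper being asked for here, assuming a plain POSIX shell. It is not the stock safe_asterisk script: `run_with_restart`, `max_restarts`, and `pause_secs` are hypothetical names chosen for illustration. It restarts the supervised command after an abnormal exit, pauses between runs so that a crash at startup cannot turn into a tight restart loop, and stops after a clean shutdown.

```shell
#!/bin/sh
# Sketch of a restart wrapper (not the stock safe_asterisk script).
# run_with_restart, max_restarts and pause_secs are hypothetical names.

run_with_restart() {
    max_restarts=$1   # give up after this many failed runs
    pause_secs=$2     # wait between runs so a startup crash cannot spin
    shift 2           # the remaining arguments are the command to supervise
    attempt=0
    while [ "$attempt" -lt "$max_restarts" ]; do
        "$@"
        status=$?
        if [ "$status" -eq 0 ]; then
            return 0  # clean shutdown: do not restart
        fi
        attempt=$((attempt + 1))
        echo "exited with status $status (run $attempt of $max_restarts)" >&2
        sleep "$pause_secs"
    done
    return 1          # every allowed run ended abnormally
}
```

It could be invoked with the valgrind command line used elsewhere in this thread, for example `run_with_restart 5 10 valgrind --log-file-exactly=valgrind.txt asterisk -vvvvf`. The limits 5 and 10 are arbitrary, the log file option depends on the installed valgrind version, and the same log file name is reused on every run, so the log may need to be moved aside after a crash.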

By: Private Name (falves11) 2008-01-15 21:56:40.000-0600

This is the valgrind file. The malloc file had zero bytes. I hope this can help you help me.

By: Private Name (falves11) 2008-01-15 22:26:52.000-0600

When starting Asterisk with
valgrind --log-file-exactly=valgrind.txt asterisk -vvvvf 2>malloc_debug.txt
on one of my 3 servers, it gets stuck in res_features and dumps core. But it loads fine without valgrind.

By: Private Name (falves11) 2008-01-16 12:18:06.000-0600

The latest file includes both "bt full" and "thread apply all bt"
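
Both backtraces named above can be captured from a core file without an interactive gdb session. A minimal sketch, assuming gdb is installed and the binary was built with debugging symbols; `collect_backtrace` is a hypothetical helper name, and the paths in the usage note are placeholders rather than values from this report.

```shell
#!/bin/sh
# Sketch: write "bt full" and "thread apply all bt" from a core file to a text file.
# collect_backtrace is a hypothetical helper; GDB can be overridden if gdb lives elsewhere.
GDB=${GDB:-gdb}

collect_backtrace() {
    binary=$1   # path to the executable that produced the core
    core=$2     # path to the core file
    out=$3      # where to write the backtrace text
    # -batch exits after running the commands given with -ex
    "$GDB" -batch -ex "bt full" -ex "thread apply all bt" "$binary" "$core" > "$out" 2>&1
}
```

For example, `collect_backtrace /usr/sbin/asterisk core.1234 backtrace.txt`, where both input paths are assumptions that depend on the installation and on how core files are named on the system.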

By: Tilghman Lesher (tilghman) 2008-01-16 12:29:11.000-0600

valgrindnew1.txt shows a crash within codec_g723.so, which is not distributed with Asterisk and is not supported by us.

By: Private Name (falves11) 2008-01-16 12:31:38.000-0600

I removed the g723 codec long ago, because I saw that too. Nevertheless, it keeps crashing. Would you be so kind as to look at the latest files? I also disabled chan_h323 for inbound, and for outbound I only rarely use it.

By: Tilghman Lesher (tilghman) 2008-02-04 11:43:49.000-0600

Please try the latest SVN 1.4, which will become the 1.4.18 release later today.  We fixed a major memory corruption issue recently, which should resolve this issue.

If I don't hear back within 3 days, I will assume this was fixed, and the issue will be closed.

By: Private Name (falves11) 2008-02-04 13:25:08.000-0600

Question: I am using trunk because I depend on res_odbc and func_odbc. Would upgrading to the latest trunk revision give me the same fix, or has it not been applied to trunk?

By: Tilghman Lesher (tilghman) 2008-02-04 13:43:07.000-0600

Yes, the same fix has been applied to trunk.

By: Michiel van Baak (mvanbaak) 2008-02-07 15:26:19.000-0600

"If I don't hear back within 3 days, I will assume this was fixed, and the issue will be closed."
It has been 4 days already, and there has been no feedback.

By: Private Name (falves11) 2008-02-07 15:42:37.000-0600

I am sorry, but I could not test the solution, because I had to go back to revision 99082 and abandon the current trunk. Between revisions 99082 and 99085, the Contact field in the SIP packet gained a :0 suffix where the port number should go, and that problem kills Asterisk in the sense that many endpoints reject the call. There is a case open about it (11916). I wonder why so much time has passed and nobody can post a patch. I did a regression and the problem starts precisely between 99082 and 99085, so somebody could "diff" the SIP channels and analyze what is happening.

By: Tilghman Lesher (tilghman) 2008-02-07 16:00:29.000-0600

That's fine, but if you can't test it, then it's a problem without a reporter and must be closed.  If (or more importantly, WHEN) you can verify that this is still an issue in current trunk, you may reopen at that time.