
Summary: ASTERISK-20829: Crash during normal operation
Reporter: WRP (wrp)
Labels:
Date Opened: 2012-12-18 12:02:53.000-0600
Date Closed: 2013-01-31 19:59:55.000-0600
Priority: Major
Regression?
Status: Closed/Complete
Components: Core/General
Versions: 1.8.11.1
Frequency of Occurrence: One Time
Related Issues:
Environment: CentOS 5.5, Linux 2.6.18-194.el5 #1 SMP Fri Apr 2 14:58:14 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
Attachments: ( 0) core_dump_output.txt
Description: Our production instance of Asterisk crashed. There was no indication in the Asterisk logs of the source of the problem. A core dump was produced; I have attached the complete output generated by following the directions at http://www.voip-info.org/wiki/view/Asterisk+debugging#Backtracingacoredumpfileintmp.

I've extracted the parts of the output that seem to be relevant to the crash, though I may have missed other information.

Notably:

{code}
Thread 1 (Thread 15651):
#0  0x0000003a1d030265 in raise () from /lib64/libc.so.6
#1  0x0000003a1d031d10 in abort () from /lib64/libc.so.6
#2  0x0000003a1d06a84b in __libc_message () from /lib64/libc.so.6
#3  0x0000003a1d072405 in _int_free () from /lib64/libc.so.6
#4  0x0000003a1d07276b in free () from /lib64/libc.so.6
#5  0x00000000004b6b55 in frame_cache_cleanup (data=<value optimized out>) at frame.c:331
#6  0x0000003a1d805ad9 in __nptl_deallocate_tsd () from /lib64/libpthread.so.0
#7  0x0000003a1d80674b in start_thread () from /lib64/libpthread.so.0
#8  0x0000003a1d0d3f6d in clone () from /lib64/libc.so.6

Thread 23646:
Thread 23645:
Thread 31378:
ast_spawn_extension: <Address 0xffffffffffffffff out of bounds>
{code}
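
For readers unfamiliar with the call path in frames #4 to #7: `__nptl_deallocate_tsd` is the glibc routine that runs thread-specific-data destructors when a thread exits, so the abort happened while a per-thread cache was being freed at thread teardown. The following is a minimal sketch of that mechanism using plain POSIX threads. All `demo_*` names are hypothetical and this is not Asterisk's `frame.c`; it only illustrates why heap damage done earlier in a thread's life may surface only when the destructor finally calls `free()`.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread cache; stands in for a cache such as the one
 * released by frame_cache_cleanup() in the backtrace. */
struct demo_cache {
    void *slots[4];
    size_t used;
};

static pthread_key_t demo_key;
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;
static int demo_cleanups; /* written by the destructor, read after join */

/* Destructor: the pthread library calls this at thread exit when the
 * thread's value for demo_key is non-NULL. If the heap was corrupted
 * earlier, free() may detect it and abort here, far from the real bug. */
static void demo_cache_cleanup(void *data)
{
    struct demo_cache *cache = data;
    size_t i;

    for (i = 0; i < cache->used; i++) {
        free(cache->slots[i]);
    }
    free(cache);
    demo_cleanups++;
}

static void demo_key_init(void)
{
    pthread_key_create(&demo_key, demo_cache_cleanup);
}

static void *demo_worker(void *arg)
{
    struct demo_cache *cache;

    (void) arg;
    pthread_once(&demo_once, demo_key_init);

    cache = calloc(1, sizeof(*cache));
    if (!cache) {
        return NULL;
    }
    cache->slots[cache->used] = malloc(64);
    if (cache->slots[cache->used]) {
        cache->used++;
    }
    if (pthread_setspecific(demo_key, cache) != 0) {
        demo_cache_cleanup(cache); /* could not register; free it now */
    }
    return NULL; /* thread exit triggers the destructor */
}

/* Runs one worker thread and returns how many times the cleanup ran,
 * or -1 if the thread could not be created. */
int demo_run_worker(void)
{
    pthread_t tid;

    demo_cleanups = 0;
    if (pthread_create(&tid, NULL, demo_worker, NULL) != 0) {
        return -1;
    }
    pthread_join(tid, NULL);
    return demo_cleanups;
}
```

Calling `demo_run_worker()` from a program linked with `-pthread` runs the cleanup once, on the worker thread's exit path rather than in the code that filled the cache.
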
Comments:

By: WRP (wrp) 2012-12-18 12:03:26.542-0600

GDB output from core dump file.

By: Richard Mudgett (rmudgett) 2012-12-18 15:30:27.766-0600

Your backtrace appears to contain memory corruption and we require valgrind output in order to move this issue forward. Please see https://wiki.asterisk.org/wiki/display/AST/Valgrind for more information about how to produce debugging information. Thanks!



By: Rusty Newton (rnewton) 2013-01-10 18:07:20.220-0600

WRP, can you provide the valgrind output? Also, is it possible to test 1.8.19.1 in your production environment?


By: WRP (wrp) 2013-01-10 19:30:57.918-0600

We are able to upgrade to 1.8.19.1 (and likely will be doing so next week). We aren't able to provide valgrind output, as we are not comfortable enabling the DONT_OPTIMIZE, MALLOC_DEBUG, and DEBUG_THREADS flags on our production instance.

By: Richard Mudgett (rmudgett) 2013-01-10 23:15:34.973-0600

The DEBUG_THREADS option is more of a performance robber than the other two options and isn't likely to help find this issue.

FYI: Asterisk v1.8.20.0 (Now in -rc2 status) has an improved MALLOC_DEBUG that can detect more memory corruption issues.  It can be used as a poor man's substitute for valgrind with much less performance loss.
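
As a rough illustration of how a debugging allocator can catch corruption without valgrind, the sketch below brackets each allocation with known "fence" values and checks them on free. This is a generic technique written under stated assumptions, not the actual MALLOC_DEBUG implementation; every `demo_*` name and the fence value are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_FENCE 0xdeadbeefU /* hypothetical guard value */

/* Bookkeeping stored immediately before the caller's memory. */
struct demo_hdr {
    size_t len;     /* size requested by the caller */
    uint32_t fence; /* guard in front of the caller's memory */
};

/* Allocate len bytes with a guard word before and after the region. */
void *demo_alloc(size_t len)
{
    const uint32_t fence = DEMO_FENCE;
    struct demo_hdr *hdr;
    unsigned char *user;

    if (len > SIZE_MAX - sizeof(*hdr) - sizeof(fence)) {
        return NULL; /* size computation would overflow */
    }
    hdr = malloc(sizeof(*hdr) + len + sizeof(fence));
    if (!hdr) {
        return NULL;
    }
    hdr->len = len;
    hdr->fence = fence;
    user = (unsigned char *) (hdr + 1);
    /* memcpy avoids an unaligned store for the trailing guard. */
    memcpy(user + len, &fence, sizeof(fence));
    return user;
}

/* Free memory from demo_alloc(). Returns 0 if both guards are intact,
 * -1 if either was overwritten (an underrun or overrun occurred). */
int demo_free(void *ptr)
{
    const uint32_t fence = DEMO_FENCE;
    struct demo_hdr *hdr;
    uint32_t tail;
    int ok;

    if (!ptr) {
        return 0;
    }
    hdr = (struct demo_hdr *) ptr - 1;
    ok = (hdr->fence == fence);
    if (ok) {
        /* Only trust hdr->len when the leading guard is intact. */
        memcpy(&tail, (unsigned char *) ptr + hdr->len, sizeof(tail));
        ok = (tail == fence);
    }
    free(hdr);
    return ok ? 0 : -1;
}
```

A real debugging allocator would typically also record the allocating file and line and log the violation; this sketch only reports whether a guard was damaged.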

By: Rusty Newton (rnewton) 2013-01-31 13:59:20.199-0600

Let us know what happens with the upgrade (hopefully to 1.8.20+, so you can use the MALLOC_DEBUG enhancements).

We won't be able to do much without valgrind output or the mmlog from MALLOC_DEBUG; see https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace#GettingaBacktrace-IdentifyingPotentialMemoryCorruption

By: WRP (wrp) 2013-01-31 14:05:25.988-0600

We haven't had an issue since the initial one. We are not planning on tracking it further unless it happens again. In the event that it does, we'll repost to this ticket.

Thank you for your help!

By: Michael L. Young (elguero) 2013-01-31 19:59:55.399-0600

I am going to suspend this for now. If it happens again and you can get the requested info, please attach it and ask a bug marshal to re-open the issue for you.

Thanks