Summary: ASTERISK-11134: Asterisk random crash during load.
Reporter: Larry McConnell (lmcconnell)
Labels:
Date Opened: 2007-12-31 18:07:24.000-0600
Date Closed: 2011-06-07 14:01:05
Versions:
Frequency of
Environment:
Attachments:
( 0) apply_thread.txt
( 1) bt_asterisk.txt
( 2) bt_full_asterisk.txt
( 3) msg0027.txt
Description: At random times our Asterisk server crashes. CentOS, 2 GHz Xeon dual-core processor. Attached are the bt, bt full, and thread apply all output. Not sure what else may be useful. Noticed this problem with 1.4.4; it appears to still be present in 1.16.2. Possibly related to http://bugs.digium.com/view.php?id=8175
(seeing same messages)
Comments:
By: Larry McConnell (lmcconnell) 2007-12-31 18:11:04.000-0600

According to the log, the last thing it was referencing was the voicemail system. Verbose was low and debug was 0.

[Dec 31 14:06:01] VERBOSE[8826] logger.c:   == Parsing '/var/spool/asterisk/voicemail/default/129/INBOX/msg0026.txt': [Dec 31 14:06:01] VERBOSE[8826] logger.c: Found
[Dec 31 14:06:01] VERBOSE[8826] logger.c:     -- <SIP/129-09b68160> Playing 'vm-undeleted' (language 'en')
[Dec 31 14:06:03] VERBOSE[8826] logger.c:     -- <SIP/129-09b68160> Playing 'vm-message' (language 'en')
[Dec 31 14:06:04] VERBOSE[8826] logger.c:     -- <SIP/129-09b68160> Playing 'digits/20' (language 'en')
[Dec 31 14:06:04] VERBOSE[8826] logger.c:     -- <SIP/129-09b68160> Playing 'digits/8' (language 'en')
[Dec 31 14:06:09] VERBOSE[9042] logger.c: Asterisk Event Logger Started /var/log/asterisk/event_log
[Dec 31 14:06:09] VERBOSE[9042] logger.c: Asterisk Dynamic Loader Starting:

By: Larry McConnell (lmcconnell) 2008-01-02 01:15:51.000-0600

Also, all SIP. Usually 35-45 simultaneous calls.

By: Steve Murphy (murf) 2008-01-04 11:39:46.000-0600

Well, I'm curious. Please attach the contents of /var/spool/asterisk/voicemail/default/129/INBOX/msg0027.txt

since that appears to be what the config-reading code is choking on.

By: Larry McConnell (lmcconnell) 2008-01-04 11:58:52.000-0600

Attached the file. I manually removed the voicemails after updating; I was not sure if there may have been formatting changes since 1.4.4.

By: Larry McConnell (lmcconnell) 2008-01-04 18:06:29.000-0600

Well, I think I may have discovered the cause, and it's user related. We had another crash today, and in the log I noticed it was a user emptying their voicemail box. The user had about 50 messages (prompting me to also now place a lower limit on messages) and was going through hitting next, delete, next, delete rapidly (1 - 7, 1 7, 1 7, etc.), which caused the crash.

By: Larry McConnell (lmcconnell) 2008-01-04 18:36:48.000-0600

I would attach the bt, etc. of the last occurrence, but when trying to do a bt on the core file, it says no stack. Spoke to the person at 129 and confirmed that they too were doing that very thing last week when it went down.

By: Mark Michelson (mmichelson) 2008-04-02 14:08:52

I was recently assigned this issue and I have tried several times to reproduce it myself, as well as analyze the code to see if there is any obvious fault. Unfortunately, I have not yet been able to detect what Asterisk is doing wrong, if anything at all.

The problem is that the crash happens deep within the glibc glob function, so it is not obvious at all whether Asterisk is actually causing the crash.

So I have a few questions to ask with regard to this issue.

1) Is this still happening for you? To be more specific, does this still happen when using the latest 1.4 version?

2) If it is, then is this reliably reproducible for you? The backtrace isn't really enough to solve this issue; however, I suspect that if the issue were to be reproduced while running Asterisk under Valgrind, we may get some insight into what sort of memory accesses are causing the segmentation fault to occur. See doc/valgrind.txt for information on how to run Asterisk under Valgrind.

By: Mark Michelson (mmichelson) 2008-04-14 14:41:46

Nearly two weeks with no reply on a bug which cannot be clearly traced back to Asterisk's code. I'm suspending this. I'll reopen if I can get the information requested previously or if someone can tell if we're incorrectly using the glob() functions somehow.
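
For anyone checking whether the glob() functions are being used incorrectly, here is a minimal standalone sketch of the calling pattern POSIX documents: zero the glob_t, check the return code, read gl_pathc only on success, and release the results with globfree(). It is not taken from Asterisk's source; count_matches() is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <glob.h>
#include <string.h>

/*
 * count_matches() is a hypothetical helper, not an Asterisk function.
 * Returns the number of paths matching `pattern`, 0 if nothing matches,
 * or -1 if glob() reports an error (GLOB_NOSPACE or GLOB_ABORTED).
 */
int count_matches(const char *pattern)
{
    glob_t results;
    int rc;
    int count = -1;

    assert(pattern != NULL);

    /* Start from a zeroed struct so globfree() is safe on every path. */
    memset(&results, 0, sizeof(results));

    rc = glob(pattern, 0, NULL, &results);
    if (rc == 0) {
        /* gl_pathv[0 .. gl_pathc-1] are valid only until globfree(). */
        count = (int) results.gl_pathc;
    } else if (rc == GLOB_NOMATCH) {
        count = 0;
    }

    /* Always release whatever glob() allocated, including partial results. */
    globfree(&results);
    return count;
}
```

As a usage example, count_matches("/var/spool/asterisk/voicemail/default/129/INBOX/msg*.txt") would report how many message files are present in that mailbox; the msg*.txt pattern is an assumption based on the file names in the log above.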