Summary:ASTERISK-08014: asterisk crashes with core several times/week during nightly restart script
Reporter:Scott Keagy (skeagy)
Labels:
Date Opened:2006-10-26 04:33:04
Date Closed:2007-01-09 13:26:56.000-0600
Versions:
Frequency of
Environment:
Attachments:( 0) bug8234.backtrace.txt
Description:bt output from core follows...

The issue is not obviously caused by my scripts (at least not obviously to me!). I run asterisk -rx "stop now", followed by service zaptel stop, service zaptel start, and service asterisk start, with appropriate sleep times between steps. Whenever it fails, I can log into the box and manually execute the same steps, and it works fine. I test-ran the cron-job script 15 times in a row with no failures, yet in production it fails several times per week (more than 2 out of 7 nights). Of course, the difference is that production has a day's worth of activity changing the system state before the stop/start cycle.

Asterisk writes a core file when the failures occur. Here is the backtrace info; please let me know if I need to include my start-up scripts (they do nothing fancier than what I summarized above). I have additional core files if this one is inconclusive.
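
The restart sequence described above could be sketched roughly as follows. This is an illustration only, not the reporter's actual script: the function name `restart_sequence`, the injected `run` and `sleep` callables, and the 10-second pause are assumptions, and only the four commands come from the report. Recording each step's exit code makes it easier to see from the cron log which step failed.

```python
import subprocess
import time

# The four steps named in the report, in order.
STEPS = [
    ["asterisk", "-rx", "stop now"],
    ["service", "zaptel", "stop"],
    ["service", "zaptel", "start"],
    ["service", "asterisk", "start"],
]


def restart_sequence(run=subprocess.call, sleep=time.sleep, pause=10):
    """Run each step in order, pausing between steps (pause is an assumed value).

    Returns a list of (command, exit_code) pairs and stops at the first
    non-zero exit code, so a failed step is not silently followed by the rest.
    """
    results = []
    for i, cmd in enumerate(STEPS):
        code = run(cmd)
        results.append((cmd, code))
        if code != 0:
            break
        if i < len(STEPS) - 1:
            sleep(pause)
    return results
```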
Comments:
By: Serge Vecher (serge-v) 2006-10-26 12:44:47

What happens if you use "stop gracefully", and/or run 1.2.13?

By: Tilghman Lesher (tilghman) 2006-11-12 21:38:10.000-0600

The bigger question is: why are you doing a nightly restart? We should look at fixing the problem that makes a daily restart necessary, rather than fixing the workaround for it.

By: Scott Keagy (skeagy) 2006-11-17 09:53:33.000-0600

Asterisk voice quality issues (echo, static, artifacts that sound like jitter) seem to worsen with time, and daily restarts are an ugly hack to get around this. I've got more than enough CPU and memory, so those aren't the cause. Several production environments on different Asterisk versions consistently see improvements when I stop/start Asterisk (I restart zaptel too in case the issues are there; I haven't isolated which).

My original script used "stop gracefully", but it wasn't sophisticated enough to wait until the system really was stopped. After 30 seconds it tried to restart zaptel (which failed because it was still in use) and then tried to start Asterisk (which also failed), and only afterwards did Asterisk finally stop gracefully. That's when I switched to asterisk -rx "stop now" at 3am as another ugly hack.
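
One way to address the "wait until the system really was stopped" gap is to poll for the process instead of sleeping for a fixed 30 seconds. The following is a minimal sketch under stated assumptions: `wait_until_stopped`, its parameters, and the default timeout and interval are hypothetical and do not come from the original script.

```python
import time


def wait_until_stopped(is_running, timeout=300.0, interval=2.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll is_running() until it returns False or `timeout` seconds pass.

    Returns True if the process was seen stopped, False on timeout.
    The caller supplies is_running; how it detects the process is up to
    the caller and is not specified in the report.
    """
    deadline = clock() + timeout
    while True:
        if not is_running():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, `is_running` might be `lambda: subprocess.call(["pidof", "asterisk"]) == 0`, assuming `pidof` is available on the box. If the call returns False, the script could fall back to "stop now" before touching zaptel.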

I haven't tried 1.2.13 yet, but I do have a test box on 1.4beta3, not in production (no consistent traffic, call transfers, etc. to really exercise it).

The problem seemed to have cured itself for a while with no changes, but for the last week it has failed almost every day. I looked at the "bt" output of the core file, and it seems related to the earlier issue:

(gdb) bt
#0  0x0053658c in memcpy () from /lib/tls/libc.so.6
#1  0x080cd841 in plc_rx (s=0x875b9c7c, amp=0x936f0c4, len=180) at plc.c:80
#2  0x002a2bbe in lpc10tolin_framein (tmp=0x936b1d0, f=0x2ab7c0) at codec_lpc10.c:249
#3  0x0806bb47 in calc_cost (t=0x2aa5a0, samples=Variable "samples" is not available.
) at translate.c:256
#4  0x0806c4a5 in ast_register_translator (t=0x2aa5a0) at translate.c:409
#5  0x002a3126 in load_module () at codec_lpc10.c:408
#6  0x0805c62d in __load_resource (resource_name=0x932deb7 "codec_lpc10.so", cfg=Variable "cfg" is not available.
) at loader.c:413
#7  0x0805ce36 in load_modules (preload_only=0) at loader.c:553
#8  0x080c0499 in main (argc=2, argv=0xbff31054) at asterisk.c:2372

By: Matt O'Gorman (mogorman) 2006-12-07 17:18:20.000-0600

Is this still an issue? If so, can you provide "bt full" output or any other information about it?
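
A "bt full" can be pulled from an existing core file non-interactively with gdb's `-batch` and `-ex` options. The sketch below only illustrates that; the helper names and any paths passed to them (for example `/usr/sbin/asterisk`) are assumptions, not details taken from this report.

```python
import subprocess


def bt_full_command(binary, core):
    """Build a gdb command line that prints a full backtrace and exits."""
    return ["gdb", "-batch", "-ex", "bt full", binary, core]


def collect_backtrace(binary, core, out_path, run=subprocess.call):
    """Run gdb on the core file and write its output to out_path.

    Returns gdb's exit code. The binary must be the same build that
    produced the core file, or the symbols will not line up.
    """
    with open(out_path, "w") as out:
        return run(bt_full_command(binary, core),
                   stdout=out, stderr=subprocess.STDOUT)
```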


By: Serge Vecher (serge-v) 2006-12-08 08:27:50.000-0600

With 1.2.13, please.

By: Serge Vecher (serge-v) 2007-01-09 13:26:56.000-0600

Please reopen if you are able to reproduce this and can provide the additional information requested.