Summary: ASTERISK-05220: crash about every 2 hours
Reporter: Matt Florell (mflorell)
Labels:
Date Opened: 2005-10-03 08:56:43
Date Closed: 2005-10-04 18:43:55
Versions:
Frequency of
Description: On our server testing a Digium TE406P card (with CVS_HEAD 2005-09-29), we had 3 crashes in 7 hours, about 2 hours apart each time. Here's the gdb backtrace:

(gdb) bt
#0  0x40145ef1 in kill () from /lib/libc.so.6
#1  0x4002cbb1 in pthread_kill () from /lib/libpthread.so.0
#2  0x4002cf2b in raise () from /lib/libpthread.so.0
#3  0x40145b24 in raise () from /lib/libc.so.6
#4  0x401473fd in abort () from /lib/libc.so.6
#5  0x4017876c in __libc_message () from /lib/libc.so.6
#6  0x40181066 in malloc_printerr () from /lib/libc.so.6
#7  0x4017fbea in _int_malloc () from /lib/libc.so.6
#8  0x4017e7a3 in malloc () from /lib/libc.so.6
#9  0x08058945 in ast_frdup (f=0x40d00010) at frame.c:332
#10 0x0805f322 in ast_queue_frame (chan=0x8176078, fin=0x40621044) at channel.c:577
#11 0x403cbafe in local_queue_frame (p=0x814ebc0, isoutbound=1, f=0x40621044, us=0x819b720) at chan_local.c:158
#12 0x403cb740 in local_write (ast=0x819b720, f=0x40621044) at chan_local.c:231
#13 0x08061eaa in ast_write (chan=0x819b720, fr=0x40621044) at channel.c:2007
#14 0x4077adf2 in wait_for_answer (in=0x819b720, outgoing=0x8183aa0, to=0xb7bf8968, peerflags=0xb7bf957c, sentringing=0xb7bf896c,
   status=0xb7bf8b14 "NOANSWER", statussize=256, busystart=0, nochanstart=0, congestionstart=0, priority_jump=0, result=0x0)
   at app_dial.c:545
#15 0x40777629 in dial_exec_full (chan=0x819b720, data=0x819b720, peerflags=0xb7bf957c) at app_dial.c:1258
#16 0x40775b15 in dial_exec (chan=0x0, data=0x0) at app_dial.c:1687
#17 0x0808ca50 in pbx_extension_helper (c=0x819b720, con=0x0, context=0x819b870 "demo", exten=0x819b964 "913304168223", priority=2,
   label=0x0, callerid=0x0, action=0) at pbx.c:564
#18 0x0808d66f in __ast_pbx_run (c=0x819b720) at pbx.c:2235
#19 0x0808e22f in pbx_thread (data=0x0) at pbx.c:2515
#20 0x4002a54e in pthread_start_thread () from /lib/libpthread.so.0
#21 0x401d6b8a in clone () from /lib/libc.so.6
(gdb) info thread
* 1 process 23776  0x40145ef1 in kill () from /lib/libc.so.6
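
Reading note on the backtrace above: the `__libc_message` and `malloc_printerr` frames between `malloc` and `abort` are the path glibc typically takes when its allocator finds inconsistent heap bookkeeping and aborts the process. The innermost Asterisk frame is therefore where the problem was noticed, not necessarily where it was caused. The following is a minimal sketch, not part of the original report, of how such a trace could be triaged automatically. The helper names (`parse_backtrace`, `first_source_frame`, `aborted_inside_malloc`) are hypothetical, and the sample input is a handful of frames in gdb's standard `#N` numbering.

```python
import re

# Start of a gdb frame line, e.g. "#4  0x401473fd in abort () from /lib/libc.so.6"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\w+) \(")
# Frames built with debug info end in "at file.c:123"
SOURCE_RE = re.compile(r"\bat (\S+):(\d+)\s*$")


def parse_backtrace(text):
    """Return a list of (frame_number, function, "file:line" or None)."""
    # gdb wraps long frames, so fold continuation lines into their frame.
    joined = []
    for line in text.splitlines():
        if line.startswith("#"):
            joined.append(line.strip())
        elif joined and line.strip():
            joined[-1] += " " + line.strip()

    frames = []
    for line in joined:
        m = FRAME_RE.match(line)
        if not m:
            continue
        src = SOURCE_RE.search(line)
        location = f"{src.group(1)}:{src.group(2)}" if src else None
        frames.append((int(m.group(1)), m.group(2), location))
    return frames


def first_source_frame(frames):
    """Innermost frame that carries file:line info (application code)."""
    for frame in frames:
        if frame[2] is not None:
            return frame
    return None


def aborted_inside_malloc(frames):
    """True when the trace shows the allocator reporting an error and aborting."""
    names = [name for _, name, _ in frames]
    return "malloc_printerr" in names and "abort" in names


SAMPLE = """\
#4  0x401473fd in abort () from /lib/libc.so.6
#5  0x4017876c in __libc_message () from /lib/libc.so.6
#6  0x40181066 in malloc_printerr () from /lib/libc.so.6
#7  0x4017fbea in _int_malloc () from /lib/libc.so.6
#8  0x4017e7a3 in malloc () from /lib/libc.so.6
#9  0x08058945 in ast_frdup (f=0x40d00010) at frame.c:332
"""

if __name__ == "__main__":
    frames = parse_backtrace(SAMPLE)
    print(first_source_frame(frames))
    print(aborted_inside_malloc(frames))
```

When a trace matches this pattern, a memory debugger run is usually more informative than the backtrace itself, because the corrupting write may have happened long before the failing `malloc` call.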
Comments:

By: Roy Sigurd Karlsbakk (rkarlsba) 2005-10-03 09:03:44

bt full might help.....

By: Roy Sigurd Karlsbakk (rkarlsba) 2005-10-03 09:11:15

Also, since malloc is the bad guy here, perhaps your memory is filling up.
If you're running Linux, try using mrtg to monitor `free | awk '/^(Mem|\-)/ { print $4 }'`
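
For readers who want the same numbers without `free` and `awk`: the one-liner above prints the "free" column of the `Mem` line and of the `-/+ buffers/cache` line. A rough equivalent, sketched here under the assumption of a Linux `/proc/meminfo` with `MemFree`, `Buffers`, and `Cached` entries, is shown below. The function names and the sample values are made up for illustration and do not come from the reporter's machine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def free_kb(values):
    """Return (free, free plus buffers and cache), both in kB."""
    raw = values["MemFree"]
    reclaimable = raw + values.get("Buffers", 0) + values.get("Cached", 0)
    return raw, reclaimable


# Illustrative sample only; real values come from /proc/meminfo.
SAMPLE = """\
MemTotal:  1035244 kB
MemFree:     20480 kB
Buffers:     51200 kB
Cached:     409600 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on non-Linux systems
    print(*free_kb(parse_meminfo(text)))
```

Logging these two values at a fixed interval would show whether free memory actually trends toward zero before each crash.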


By: Matt Florell (mflorell) 2005-10-03 09:13:35

I'll run a bt full tonight and post it.

The server ran fine on CVS_HEAD 2005-09-06 for 4 weeks. After all of the crashes I downgraded back to it, and it has been running fine since Thursday night, so I don't think it's a general machine memory issue.

By: Roy Sigurd Karlsbakk (rkarlsba) 2005-10-03 09:16:46

It's malloc that seems to be causing the call to kill, and the only way I know of to cause that is to allocate too much memory.

By: Matt Florell (mflorell) 2005-10-04 16:00:22

Talked with Digium support and this was an Asterisk issue. It has been fixed as of CVS_HEAD 2005-10-03, and we've been running it for the last 5 hours with no problems. I'll post again tomorrow to confirm, but it looks like this bug is resolved.

By: Kevin P. Fleming (kpfleming) 2005-10-04 18:43:40

Re-open if you experience the problem again.