Summary: ASTERISK-10964: Asterisk crash in the middle of a "reload" command
Reporter: Edgar Landivar (elandivar)
Labels:
Date Opened: 2007-12-03 22:49:22.000-0600
Date Closed: 2007-12-07 12:55:07.000-0600
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Addons/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) asterisk-reload-bug-backtrace.txt
( 1) gdb_2nd_full_bt.txt
( 2) gdb1.txt
( 3) gdb2.txt
Description: Asterisk crashes about 7 out of 10 times when I run a reload command.

I ran Asterisk under strace to get more information. Here are the last lines of 3 strace runs. As you can see, Asterisk crashed at different stages of the reload process:

RELOAD #1
=========

 == Parsing '/etc/asterisk/users.conf': Found
   -- Reloading module 'codec_alaw.so' (A-law Coder/Decoder)
 == Parsing '/etc/asterisk/codecs.conf': Found
   -- codec_alaw: using generic PLC
   -- Reloading module 'chan_sip.so' (Session Initiation Protocol (SIP))
Reloading SIP
[{fd=29, events=POLLIN}], 1, -1)   = -1 EINTR (Interrupted system call)
+++ killed by SIGSEGV +++

RELOAD #2
=========

 == Setting global variable 'CWINUSEBUSY' to 'true'
 == Setting global variable 'AMPMGRUSER' to 'admin'
 == Setting global variable 'AMPMGRPASS' to 'elastix456'
   -- Registered extension context 'park-dial'
   -- Including context 'park-dial-custom' in context 'park-dial'
   -- Added extension 't' priority 1 to park-dial
   -- Added extension 't' priority 2 to park-dial
[{fd=32, events=POLLIN}], 1, -1)   = -1 EINTR (Interrupted system call)
+++ killed by SIGSEGV +++

RELOAD #3
=========

/usr/lib/asterisk/modules/res_clioriginate.so
008c8000-008e1000 r-xp 00000000 03:02 17826039   /lib/ld-2.5.so
008e1000-008e2000 r-xp 00018000 03:02 17826039   /lib/ld-2.5.so
008e2000-008e3000 rwxp 00019000 03:02 17826039   /lib/ld-2.5.so
008e3000-008e4000 r-xp 00000000 03:02 11506593   /usr/lib/asterisk/modules/func_sha1.so
008e4000-008e5000 rwxp 00000000 03:02 11506593   /usr/lib/asterisk/modules/func_sha1.so
008e5000-00a22000 r-xp 00000000 03:02 17829511   /lib/libc-2.5.so
00a22000-00a24000 r-xp 0013d000 03:02 17829[{fd=29, events=POLLIN}], 1, -1)   = -1 EINTR (Interrupted system call)
trace: ptrace(PTRACE_SYSCALL, ...): No such process

This bug was also happening in 1.4.14.

If you need more information, please let me know.
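
A crash rate like the "7 out of 10" above can be measured with a small loop instead of by hand. This is only a minimal sketch: it assumes the CLI is reachable with "asterisk -rx", that the process is named "asterisk", and that it can be restarted by running /usr/sbin/asterisk. The function name and the RELOAD_CMD, ALIVE_CMD, RESTART_CMD and SETTLE_SECONDS variables are hypothetical hooks, not part of Asterisk.

```shell
#!/bin/sh
# Sketch: issue "reload" repeatedly and count how often the daemon dies.
# All four variables below are hypothetical override hooks for this sketch.
RELOAD_CMD=${RELOAD_CMD:-'asterisk -rx reload'}   # send the reload to the running daemon
ALIVE_CMD=${ALIVE_CMD:-'pgrep -x asterisk'}       # succeeds while the daemon is running
RESTART_CMD=${RESTART_CMD:-'/usr/sbin/asterisk'}  # assumed way to start it again
SETTLE_SECONDS=${SETTLE_SECONDS:-5}               # time to let the reload finish

count_reload_crashes() {
    attempts=$1
    crashes=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        i=$((i + 1))
        # The reload command's own exit status is ignored; the crash may come later.
        $RELOAD_CMD >/dev/null 2>&1 || true
        sleep "$SETTLE_SECONDS"
        # If the process is gone, count a crash and start it again for the next round.
        if ! $ALIVE_CMD >/dev/null 2>&1; then
            crashes=$((crashes + 1))
            $RESTART_CMD >/dev/null 2>&1 || true
            sleep "$SETTLE_SECONDS"
        fi
    done
    echo "crashed $crashes of $attempts reloads"
}
```

Calling "count_reload_crashes 10" after loading the function would run ten reloads and print the tally. The same loop is handy for checking a suspect module: unload it, rerun, and compare the counts.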

Comments:

By: Jason Parker (jparker) 2007-12-04 14:34:32.000-0600

We need a backtrace as described in the bug guidelines.

By: Edgar Landivar (elandivar) 2007-12-04 21:34:07.000-0600

Dear qwell,

I have attached the full backtrace, captured with gdb.

I hope this helps.

Please, tell me if you need more information.

By: Gregory Hinton Nietsky (irroot) 2007-12-05 09:13:05.000-0600

Please recompile with DONT_OPTIMIZE enabled and run "bt full" in gdb.

By: Tilghman Lesher (tilghman) 2007-12-05 09:32:03.000-0600

Actually, I need you to follow the instructions in doc/valgrind.txt. A backtrace won't provide any additional information.

By: Edgar Landivar (elandivar) 2007-12-05 11:20:28.000-0600

Sure, I'll try to reproduce the bug in about 5 hours, when traffic on the server is low.

By: Edgar Landivar (elandivar) 2007-12-06 01:27:29.000-0600

I know it will sound very weird, but I cannot make Asterisk fail when I run it through this command:

 valgrind --log-file-exactly=valgrind.txt /usr/sbin/asterisk -vvv 2>malloc_debug.txt

I have tried for more than one hour :(

So, I returned to gdb and it crashed again.

For this reason I'm attaching just the gdb output files.

gdb1.txt contains a "bt full"
gdb2.txt contains a "thread apply all bt"
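
For anyone wanting to produce the same two files non-interactively, here is a rough sketch using gdb's batch mode. It assumes a core file was saved from the crash (the files attached here may instead have come from a live gdb session). The function name, the output file names and the GDB_BIN variable are hypothetical.

```shell
#!/bin/sh
# Sketch: dump both backtrace styles from a core file into text files.
# GDB_BIN is a hypothetical override hook; the output names are made up here.
GDB_BIN=${GDB_BIN:-gdb}

capture_backtraces() {
    binary=$1        # e.g. /usr/sbin/asterisk
    core=$2          # path to the saved core file (assumed to exist)
    outdir=${3:-.}   # where to write the text files
    # Full backtrace of the crashing thread, including local variables.
    "$GDB_BIN" -batch -ex "bt full" "$binary" "$core" \
        > "$outdir/bt_full.txt" 2>&1
    # Short backtrace of every thread.
    "$GDB_BIN" -batch -ex "thread apply all bt" "$binary" "$core" \
        > "$outdir/thread_apply_all_bt.txt" 2>&1
}
```

A call such as "capture_backtraces /usr/sbin/asterisk ./core ." would then leave both text files in the current directory, ready to attach.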

By: Edgar Landivar (elandivar) 2007-12-06 01:36:58.000-0600

To give you more information, I crashed Asterisk again. Here is another "bt full", in gdb_2nd_full_bt.txt

By: Edgar Landivar (elandivar) 2007-12-06 01:41:33.000-0600

A last one: gdb_3rd_full_bt.txt

By: Edgar Landivar (elandivar) 2007-12-06 01:55:20.000-0600

Hey guys, sorry for taking your time. I think I have found the problem. It seems to be my chan_unicall module. I unloaded this module, and after 30 reloads Asterisk is still running!



By: Gregory Hinton Nietsky (irroot) 2007-12-06 06:52:02.000-0600

Looks like it from the BT as well ... surely that should cost a case of beers ;)

By: Tilghman Lesher (tilghman) 2007-12-06 07:31:59.000-0600

Please still upload your valgrind.txt.  That is what contains the information necessary to debug this issue.  The gdb outputs are useless (other than telling me that I need valgrind output) in this regard.

By: Eliel Sardanons (eliel) 2007-12-06 20:46:28.000-0600

Is this a patched version? I see chan_unicall on your last backtrace.

By: Edgar Landivar (elandivar) 2007-12-07 12:40:50.000-0600

Yes, it was a patched version.

Sorry for taking your time, folks. It was my mistake :(

I fixed my chan_unicall and everything is working fine now.

Thanks a lot for your time and help!