Summary:ASTERISK-18087: asterisk will crash on "module reload"
Reporter:Joao Carvalho (foxfire)
Labels:
Date Opened:2011-07-04 05:07:45
Date Closed:2011-08-03 12:49:21
Versions: Frequency of
Environment:I can reproduce this on completely different hardware, but basically I am running a 32-bit Linux kernel.
Attachments:
Description:When reloading more than once in under 30 seconds, Asterisk will crash most of the time. To replicate, just enter "reload" several times in the console; the most reloads I managed was 4.
I consider this major because Asterisk crashes, but personally I stopped doing full reloads some versions ago. Specific reloads like "sip reload" or "dialplan reload" do not crash Asterisk; I can do as many of those as I like. I still much prefer this problem to the earlier one that locked up Asterisk.
I haven't added the core files because it should be easy to reproduce. Nevertheless, if you want them, just ask.

Comments:
By: Walter Doekes (wdoekes) 2011-07-05 07:59:16.295-0500

Cannot reproduce. Is there a specific configuration that you can reduce it to? Does it lock up if you have no config files at all? Start with none and keep adding until you get it to crash again.

By: Joao Carvalho (foxfire) 2011-07-05 11:56:22.145-0500

I have unloaded all loadable modules.
polyspeak*CLI> module show
Module                         Description                              Use Count
0 modules loaded
If I check what is still in there, I see:
cdr          dnsmgr       dsp          enum         extconfig    features     http         indications  logger       manager      rtp          udptl        plc          

But even so, it still crashes.
What is the best way to proceed?
How do you want me to debug this?

I did some tests, running it with
asterisk -rvvvvvvvvvvvvvvvvvvvvvvvv -g -dddddddd -cn
and the last line is now "Parsing features.conf". I am using the default features.conf file for testing.
Sometimes I get a glibc error like "*** glibc detected *** asterisk: double free or corruption (out): 0xb7a0d0c8"

What I did notice is some weird behaviour with the extensions:

Mostly it is

 == Parsing '/etc/asterisk/features.conf':   == Found
   -- Added extension '700' priority 1 to parkedcalls (0xb7906d90)

then sometimes

 == Parsing '/etc/asterisk/features.conf':   == Found
   -- Remove parkedcalls/700/1, registrar=features; con=parkedcalls(0xb7906d90); con->root=0xb790ed90
[Jul  5 17:49:55] WARNING[775]: pbx.c:4969 ast_context_remove_extension_callerid2: Cannot find extension 700 in root_table in context parkedcalls
   -- Registered extension context '' (0xb790a3a0) in table 0xb7907158; registrar: features
   -- Added extension '700' priority 1 to  (0xb790a3a0)


 == Parsing '/etc/asterisk/features.conf':   == Found
   -- Remove /700/1, registrar=features; con=(0xb790a3a0); con->root=0xb790cb18
   -- Added extension '700' priority 1 to parkedcalls (0xb7906d90)
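
To see how often the odd variant occurs, a saved console log could be filtered for lines where the context name is empty. This is only a sketch: the function name `count_empty_context` and the idea of saving the console output to a file are assumptions for illustration, and the patterns assume the verbose messages look exactly like the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Asterisk): count log lines in which a
# context was registered with an empty name, or an extension was added to
# a context with an empty name. Assumes the message format quoted above.
count_empty_context() {
    # $1: path to a saved console log (assumed to exist)
    # grep -c prints the match count; "|| true" keeps a zero count from
    # being reported as a failure.
    grep -c -E "(Added extension '[^']*' priority [0-9]+ to  \(0x|Registered extension context '' \(0x)" "$1" || true
}
```

A non-zero count would mean the reload went through the empty-context path at least once in that log.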

By: Joao Carvalho (foxfire) 2011-07-05 12:22:40.071-0500

It seems that Asterisk crashes somewhere while it is merging the dialplan.
I loaded the modules and config files again.
Now it hangs while parsing extensions.conf, not always in the same place.

By: Walter Doekes (wdoekes) 2011-07-05 15:25:38.364-0500

Better. But you still haven't distilled it down to the bare minimum.

Please do the configuration adding/removing until you have only a handful of files left in /etc/asterisk/ that are needed to get it crashing. (Best is if you set modules.conf autoload=>no and load only those modules that are needed for it to crash.)

Then post that configuration so we can try to reproduce it.

By: Joao Carvalho (foxfire) 2011-07-05 15:39:03.139-0500

I used that option; the modules left are, I believe, built in, and I cannot unload them. But I will check if I can reduce it further and will get back to you.
Also, if you like, I can give you shell access to a machine.

By: Joao Carvalho (foxfire) 2011-07-06 08:52:38.955-0500

OK, I went to the extreme, and here is what I did:
1- deleted all modules from /usr/lib/asterisk/modules
2- removed all conf files from /etc/asterisk

I ran the command

asterisk -vvvvvvvvvvvvvvvvv -dddddddddddd -cn

in one tty and the following shell script in another terminal:

while true
do
asterisk -rx "module reload"
done

and it crashes with
*** glibc detected *** asterisk: free(): invalid next size ...

libpthread.so ...

or with memory corruption

It is a lot harder to crash this way, so if you are unable to crash it,
try the following:
run the script in a shell without Asterisk running

next, start Asterisk with:

asterisk -vvvvvvvvvvvvvvvvv -dddddddddddd -cn

If it doesn't crash within 2 seconds, use Control-C to stop Asterisk and start it again; you can leave the script running.
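
For anyone who would rather not leave an endless loop running, the reload loop can be sketched with an upper bound. This is a rough sketch, not a tested reproducer: `hammer_reload` and the `ASTERISK_BIN` variable are names made up here, and it assumes that `asterisk -rx` exits non-zero when it cannot reach a running daemon.

```shell
#!/bin/sh
# ASTERISK_BIN is an assumed override so the binary can be swapped out
# (for example with a stub when trying the script without Asterisk).
ASTERISK_BIN="${ASTERISK_BIN:-asterisk}"

# hammer_reload N: send "module reload" N times through the remote console
# and print how many attempts succeeded and how many failed.
# Failures are counted, not fatal, so the loop keeps going while the
# daemon is being stopped and started in the other terminal.
hammer_reload() {
    attempts="$1"
    ok=0
    failed=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$ASTERISK_BIN" -rx "module reload" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=$ok failed=$failed"
}

# Example call (not executed here): hammer_reload 500
```

A run of failures after some successes is a hint that the daemon went away partway through, which is when the other terminal is worth checking for the glibc message.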

Hope this helps; let me know what you need.

By the way, this version improved a lot on the lockups I had before, but I still get one or two per week. I believe that the lockups and this problem might be related. When that happens, SIP hangs and on reload all extensions in the dialplan become "IN USE". This is a very annoying problem because the only way to fix it is a manual restart.

By: Leif Madsen (lmadsen) 2011-07-11 15:12:58.345-0500

Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions, maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support, please move to the 1.8 branch, which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions

By: Paul Belanger (pabelanger) 2011-08-03 12:49:13.901-0500