Summary:ASTERISK-12352: Asterisk crash while unloading pbx_ael.so
Reporter: Eliel Sardanons (eliel)
Labels:
Date Opened: 2008-07-09 15:45:02
Date Closed: 2008-07-11 16:55:36
Versions:
Frequency of
Environment:
Attachments: ( 0) btfull
( 1) btfull2
Description: If we unload the first module that registered a context, Asterisk crashes.
How to reproduce it:
1) Check the first module in the 'contexts' hashtab.
     > dialplan show
       .......... [pbx_ael]       1st
       ............ [pbx_config]   2nd
   So in this example pbx_ael is the first.
2) Unload the first module that registered a context.
     > module unload pbx_ael.so
3.1) Asterisk crashes.
3.2) Asterisk does not crash, but 'dialplan show' still lists pbx_ael contexts that were not removed.

Full backtrace uploaded.


I traced the bug to the ast_context_remove_extension2() call in __ast_context_destroy(). If you look at what happens in the loop that calls ast_context_remove_extension2(), you will notice that we lose the reference to exten_item, and both prio_item->exten and prio_item->registrar become NULL. That is why the crash shows up in a call to strcmp().
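
To illustrate the failure mode, here is a minimal sketch of removing entries from a list while walking it. It is not the pbx.c code: struct item, push(), count_items() and remove_by_registrar() are hypothetical stand-ins. It shows the two precautions the description points at, namely never touching an entry after it has been removed, and checking the registrar pointer for NULL before handing it to strcmp().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an extension entry.
 * This is NOT the real struct ast_exten from pbx.c. */
struct item {
    const char *registrar;   /* may be NULL */
    struct item *next;
};

/* Prepend a new entry; returns the new list head. */
static struct item *push(struct item *head, const char *registrar)
{
    struct item *n = malloc(sizeof(*n));
    assert(n != NULL);       /* sketch only: no real error handling */
    n->registrar = registrar;
    n->next = head;
    return n;
}

/* Count the entries in the list. */
static size_t count_items(const struct item *head)
{
    size_t n = 0;
    for (; head; head = head->next) {
        n++;
    }
    return n;
}

/* Remove every entry owned by `registrar`; returns the new list head. */
static struct item *remove_by_registrar(struct item *head, const char *registrar)
{
    struct item **link = &head;

    assert(registrar != NULL);
    while (*link) {
        struct item *cur = *link;
        /* NULL check first, so strcmp() never sees a NULL pointer. */
        if (cur->registrar && strcmp(cur->registrar, registrar) == 0) {
            *link = cur->next;   /* unlink before freeing */
            free(cur);           /* cur is not used after this point */
        } else {
            link = &cur->next;   /* advance only when nothing was removed */
        }
    }
    return head;
}
```

Walking the list through a pointer to the link field means the loop never reads a field of an entry that has already been freed, which is the kind of lost reference described above.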
Comments:
By: Eliel Sardanons (eliel) 2008-07-09 15:50:46

If we unload the second or any later module that adds a context (such as pbx_config), this does not happen.

By: Digium Subversion (svnbot) 2008-07-11 13:17:19

Repository: asterisk
Revision: 130145

U   trunk/main/pbx.c

r130145 | murf | 2008-07-11 13:17:12 -0500 (Fri, 11 Jul 2008) | 40 lines

(closes issue ASTERISK-12352)
Reported by: eliel
Tested by: murf

(closes issue ASTERISK-12287)
Reported by: mnicholson

In this 'omnibus' fix, I **think** I solved both
the problem in 13041, where unloading pbx_ael.so
caused crashes, or incomplete removal of previous
registrar'ed entries. And I added code to completely
remove all includes, switches, and ignorepats that
had a matching registrar entry, which should
appease 12960.

I also added a lot of seemingly useless brackets
around single statement if's, which helped debug
so much that I'm leaving them there.

I added a routine to check the correlation between
the extension tree lists and the hashtab
tables. It can be amazingly helpful when you have
lots of dialplan stuff, and need to narrow
down where a problem is occurring. It's ifdef'd
out by default.

I cleaned up the code around the new CIDmatch code.
It was leaving hanging extens with bad ptrs, getting confused
over which objects to remove, etc. I tightened
up the code and changed the call to remove_exten
in the merge_and_delete code.

I added more conditions to check for empty context
worthy of deletion. It's not empty if there are
any includes, switches, or ignorepats present.

If I've missed anything, please re-open this bug,
and be prepared to supply example dialplan code.



By: Eliel Sardanons (eliel) 2008-07-11 13:43:10

I am experiencing a crash when unloading pbx_config (the last module in the hashtab) and then running 'dialplan show'.
I think an extension is not being removed, so when 'dialplan show' tries to print it, we get a segfault on accessing the memory address of the registrar string, which has already been unloaded from memory.

By: Eliel Sardanons (eliel) 2008-07-11 14:08:03

The problem appears when extensions.conf contains this:
exten => s,1,Hangup()  ; Something.

So pbx_config is the registrar of this context, but some extensions are added to it by another module (features.so). When we unload pbx_config.so, the context is not destroyed because of the extensions that features.so added.
The registrar strings are static in each module, so once the module is unloaded the registrar's memory address is no longer valid.

By: Digium Subversion (svnbot) 2008-07-11 16:55:35

Repository: asterisk
Revision: 130297

U   trunk/main/pbx.c

r130297 | murf | 2008-07-11 16:55:34 -0500 (Fri, 11 Jul 2008) | 11 lines

(closes issue ASTERISK-12352)
Reported by: eliel

OK, now the context registrar slot is strdup'd. It is freed
on destruction. I don't see the need to do this with all
the structs' registrar fields, but if some wild case proves
they should also be handled this way, then we can
put in the extra work at that time.
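
A rough sketch of the idea behind that change, under stated assumptions: struct context, copy_string(), context_create() and context_destroy() are hypothetical names, not the actual pbx.c API. The context keeps its own heap copy of the registrar string instead of pointing at a static string inside the module, so the string stays readable after the module's memory is gone, and the copy is freed when the context is destroyed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified context.
 * This is NOT the real struct ast_context from pbx.c. */
struct context {
    char *name;
    char *registrar;   /* owned copy; does not point into module memory */
};

/* Portable equivalent of strdup(); returns NULL on allocation failure. */
static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Free the context together with the strings it owns. */
static void context_destroy(struct context *c)
{
    if (!c) {
        return;
    }
    free(c->name);
    free(c->registrar);   /* freed on destruction, as in the commit note */
    free(c);
}

/* Create a context that owns private copies of both strings. */
static struct context *context_create(const char *name, const char *registrar)
{
    struct context *c;

    assert(name != NULL && registrar != NULL);
    c = calloc(1, sizeof(*c));
    if (!c) {
        return NULL;
    }
    c->name = copy_string(name);
    c->registrar = copy_string(registrar);
    if (!c->name || !c->registrar) {
        context_destroy(c);
        return NULL;
    }
    return c;
}
```

The cost is one extra allocation per context. Other structures that still store a borrowed registrar pointer would still be exposed to the same dangling-pointer problem, which is the case the commit message leaves open.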