[Home]

Summary:ASTERISK-15668: setting pollmailboxes=yes will crash asterisk on startup
Reporter:Marcello Ceschia (marcelloceschia)Labels:
Date Opened:2010-02-21 10:41:56.000-0600Date Closed:2011-06-07 14:00:35
Priority:CriticalRegression?No
Status:Closed/CompleteComponents:Applications/app_voicemail/ODBC
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:
Description:Program terminated with signal 11, Segmentation fault.
#0  0xb750d9a3 in strlen () from /lib/libc.so.6
(gdb) bt full
#0  0xb750d9a3 in strlen () from /lib/libc.so.6
No symbol table info available.
#1  0x08081947 in __ast_strdup (s=0x4 <Address 0x4 out of bounds>, file=0xb67276e2 "app_voicemail.c", lineno=10274, func=0xb672c23a "mwi_sub_event_cb") at astmm.c:261
       len = 3076472820
       ptr = <value optimized out>
#2  0xb670a6ab in mwi_sub_event_cb (event=0x909d5f0, userdata=0x0) at app_voicemail.c:10274
       __PRETTY_FUNCTION__ = "mwi_sub_event_cb"
#3  0x080c79cf in handle_event (data=0x909e0b8) at event.c:1137
       ie_val = 0x0
       sub = 0x8f9d4f0
#4  0x0814ed04 in tps_processing_function (data=0x82050a8) at taskprocessor.c:310
       t = 0x90b00b0
       size = 0
       __PRETTY_FUNCTION__ = "tps_processing_function"
ASTERISK-1  0x0815b5c3 in dummy_start (data=0x8209da8) at utils.c:968
       __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {136355392, 0, 4001536, -1222401128, -490538338, -1896930336}, __mask_was_saved = 0}}, __pad = {0xb723a454, 0x0, 0xb7310ff4, 0x0}}
       not_first_call = <value optimized out>
       ret = <value optimized out>
       __PRETTY_FUNCTION__ = "dummy_start"
ASTERISK-2  0xb72ff6e5 in start_thread () from /lib/libpthread.so.0
No symbol table info available.
ASTERISK-3  0xb72ff600 in ?? () from /lib/libpthread.so.0
No symbol table info available.


event->type = 768
event->event_len = 7680
event->payload = ""
Comments:By: Leif Madsen (lmadsen) 2010-02-22 11:22:54.000-0600

This is probably enough information for a developer (sorry, I'm not too good at reading backtraces :)), but I'll ask you to check something.

Can you see if you enabled DONT_OPTIMIZE in the Compiler Flags section of menuselect when you built Asterisk? I see the <value_optimized_out> in your backtrace which typically means DONT_OPTIMIZE wasn't enabled.

More information about backtraces is available in doc/backtrace.txt of your Asterisk source directly.

If you could provide the backtraces per that document, that would be great. Also please attach as a text file to this issue.

By: Marcello Ceschia (marcelloceschia) 2010-02-22 15:58:27.000-0600

no, I dont have - but i will do.

By: Marcello Ceschia (marcelloceschia) 2010-02-22 16:32:21.000-0600

I recompiled with DONT_OPTIMIZE and the problem is away

By: Elazar Broad (ebroad) 2010-02-22 17:08:08.000-0600

That's not good, uninitialized values strike again...

By: Marcello Ceschia (marcelloceschia) 2010-02-23 05:19:48.000-0600

ok, i can reproduce it.
I use https://issues.asterisk.org/view.php?id=15757 to publish device an voicemail states.
res_jabber subscribes for AST_EVENT_MWI without defining a mailbox and context:

mwi_sub = ast_event_subscribe(AST_EVENT_MWI, aji_mwi_cb, client, AST_EVENT_IE_END);

mybe this is a problem?
Full backtrace:

bt full
#0  0xb75169a3 in strlen () from /lib/libc.so.6
No symbol table info available.
#1  0x0808292f in __ast_strdup (s=0x4 <Address 0x4 out of bounds>, file=0xb65acbd4 "app_voicemail.c", lineno=10274, func=0xb65b1ea3 "mwi_sub_event_cb") at astmm.c:261
       len = 136354272
       ptr = 0x0
#2  0xb65a6f2c in mwi_sub_event_cb (event=0x90289d8, userdata=0x0) at app_voicemail.c:10274
       mwist = 0x8fc6090
       __PRETTY_FUNCTION__ = "mwi_sub_event_cb"
#3  0x080c3ba1 in handle_event (data=0x8fc6ce8) at event.c:1137
       ie_val = 0x0
       event_ref = 0x8fc6ce8
       sub = 0x8fa4620
       host_event_type = 3
#4  0x08146a51 in tps_processing_function (data=0x82099a8) at taskprocessor.c:310
       i = 0x82099a8
       t = 0x8fc5b60
       size = 0
       __PRETTY_FUNCTION__ = "tps_processing_function"
ASTERISK-1  0x081541dd in dummy_start (data=0x820e4e8) at utils.c:968
       __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {-1221484556, 0, 4001536, -1222364264, 1489906403, 975901085}, __mask_was_saved = 0}}, __pad = {0xb7243454, 0x0, 0xb7322450, 0xb72af550}}
       __cancel_routine = 0x80783cf <ast_unregister_thread>
       __cancel_arg = 0xb7243b70
       not_first_call = 0
       ret = 0xb74a8dd4
       a = {start_routine = 0x81468e3 <tps_processing_function>, data = 0x82099a8, name = 0x820e580 "tps_processing_function started at [  451] taskprocessor.c ast_taskprocessor_get()"}
       __PRETTY_FUNCTION__ = "dummy_start"
ASTERISK-2  0xb73086e5 in start_thread () from /lib/libpthread.so.0
No symbol table info available.
ASTERISK-3  0xb7308600 in ?? () from /lib/libpthread.so.0
No symbol table info available.

print event
$1 = (struct ast_event *) 0x90289d8

print (*event).payload
$2 = 0x90289dc ""

print (*event).event_len
$2 = 7680

By: Elazar Broad (ebroad) 2010-02-23 14:21:25.000-0600

That explains it. In app_voicemail.c, under mwi_sub_event_cb() we have:

---
mwist->mailbox = ast_strdup(ast_event_get_ie_str(event, AST_EVENT_IE_MAILBOX));
mwist->context = ast_strdup(ast_event_get_ie_str(event, AST_EVENT_IE_CONTEXT));
---

If there is no context/mailbox pair passed, then we are duplicating a null string, which a call to strlen() in __ast_strdup() doesn't like.

Don't you think this would better serve a note posted to 15757 as opposed to a new issue?

With that said, for future reference, when posting a bug, please note any additional patches and/or experimental features that you are using. Thanks!

By: Marcello Ceschia (marcelloceschia) 2010-02-23 16:59:04.000-0600

But checking against a null string should be done by the core, so no component can crash the system.

As I see it is not posible to subscribe mwi events for all mailboxes?
Set pollmailboxes=no will not caus this issue



By: Leif Madsen (lmadsen) 2010-02-24 09:37:19.000-0600

Actually this seems like a separate issue (kind of). I've marked this as a child of 15757 so that it can be closed prior to merge of 15757.

By: Bradley Watkins (marquis) 2010-02-24 11:16:55.000-0600

I'm wondering if this is a bug in the patch from 15757 or if pollmailboxes should be fixed somehow.  The way that the patch subscribes to MWI is for all intents identical to how it's done in res_ais.  So if someone were using pollmailboxes + res_ais it should crash as well.

By: Marcello Ceschia (marcelloceschia) 2010-02-24 12:30:52.000-0600

@Marquis

in your original patch you have:

if (!mwi_sub) {
              mwi_sub = ast_event_subscribe(AST_EVENT_MWI, aji_mwi_cb,                               "aji_mwi_subscription",
                       client, AST_EVENT_IE_END);
    }

i had to change this to:

mwi_sub = ast_event_subscribe(AST_EVENT_MWI, aji_mwi_cb, client, AST_EVENT_IE_END);

to get it work with 1.6.2, I thing this is the right syntax for subscription.


@lmadsen
but why does this only happen with pollmailboxes=yes, setting it to no - all works.



By: Bradley Watkins (marquis) 2010-02-24 13:25:46.000-0600

marcelloceschia:  yes, for 1.6.2 the arguments for ast_event_subscribe are different.  The patch and related branch in SVN are for trunk.

By: Leif Madsen (lmadsen) 2010-03-03 12:16:42.000-0600

marcelloceschia: sorry, I have no idea why it only happens with pollmailboxes=yes

By: Bradley Watkins (marquis) 2010-03-03 19:19:48.000-0600

lmadsen, marcelloceschia:  The answer is pretty straightforward, actually.  The pollmailboxes code is the only place where there is an expectation that all subscriptions to MWI events have a specific context and mailbox.

I'm not sure that's a reasonable assumption, and it certainly affects both my code and res_ais.

By: Leif Madsen (lmadsen) 2010-03-04 17:52:37.000-0600

Any chance you could comment on this issue? I think you might be able to make a comment here. If not, then perhaps this is more up Tilghmans alley?

By: Tilghman Lesher (tilghman) 2010-03-05 16:55:06.000-0600

Is this a crash that only occurs when ODBC storage is used?  I'm trying to get it to reproduce, but no crash occurs for filesystem-based storage and pollmailboxes=yes (with DONT_OPTIMIZE off).

By: Marcello Ceschia (marcelloceschia) 2010-03-06 05:13:44.000-0600

I recompiled with file storage, same problem.

@tilghman
did you add a subscription like this:

mwi_sub = ast_event_subscribe(AST_EVENT_MWI, aji_mwi_cb, client, AST_EVENT_IE_END);

By: Mark Michelson (mmichelson) 2010-03-08 10:56:14.000-0600

Leif asked me to take a look at this, so let me weigh in.

I think the issue here is actually in the event code itself. Check out frame 1 of both backtraces provided. The string being passed in is at address 0x4. There is code in the ast_strdup function that makes sure that if a NULL string is passed in, a NULL string is returned. Since the address of the string is 0x4, we bypass that safeguard and attempt instead to read the data. This address is clearly out of bounds, and thus there is a crash.

The problem, if you ask me, is that for some reason ast_event_get_ie_str(event, AST_EVENT_IE_MAILBOX) is not returning NULL when it clearly should be. If it did, I suspect this crash would not occur. The problem is either with the event itself, or that ast_event_get_ie_str misbehaves when asked for an IE that is not in the event. My gut says it's the former, but it would be a much better idea to test to determine the fault instead of guessing.

By: Leif Madsen (lmadsen) 2010-03-08 11:56:06.000-0600

Sounds like a possible unit test?

Thanks for the feedback!

By: Marcello Ceschia (marcelloceschia) 2010-03-08 11:56:24.000-0600

yes, I think the event system is the problem.
Today I tried this:

if (!(event = ast_event_new(
AST_EVENT_DEVICE_STATE_CHANGE,
   
AST_EVENT_IE_DEVICE, AST_EVENT_IE_PLTYPE_STR, channelName,
AST_EVENT_IE_STATE,  AST_EVENT_IE_PLTYPE_UINT, 0,
// AST_EVENT_IE_CIDNAME, AST_EVENT_IE_PLTYPE_STR, cidname,
// AST_EVENT_IE_CIDNUM, AST_EVENT_IE_PLTYPE_STR, cidnum,
AST_EVENT_IE_END
))) {

return;

}
const char *testCidName = ast_event_get_ie_str(event, AST_EVENT_IE_CIDNAME);
ast_log(LOG_NOTICE, "hint %s has changed, cidname %p\n", channelName, testCidName);

I expected to get a null pointer for testCidName, but i get a pointer to:

ast_event_get_ie_raw(event, AST_EVENT_IE_STATE) + sizeof(struct ast_event_iterator)

By: Mark Michelson (mmichelson) 2010-03-08 12:34:12.000-0600

Leif's comment about this being testable reminded me that not only had Russell already written a unit test for the event API, but he also fixed a bug he found when running the unit test. Here is the text of the commit he made to fix the issue:

<pre>
 Fix return value of get_ie_str() and get_ie_str_hash() for non-existent IE.
 
 I found this bug while developing a unit test for event allocation.  Testing
 is awesome.
</pre>

This sounds like exactly what you are seeing. The revision was made on Feb 8. 1.6.2.2 was released Feb 2. Now here's something very very important. If you try upgrading to 1.6.2.5, you will still see this problem! This is because 1.6.2.4 and 1.6.2.5 were security releases that only contained fixes for the security issues that they addressed. If you want to test a version of 1.6.2 that should have this fix in it, try using 1.6.2.6-rc2. That should not exhibit this behavior. Please try using this version and see if the problem persists.

I am going to change this issue to "ready for testing" to indicate that we are awaiting test results from using a more recent version.

By: Marcello Ceschia (marcelloceschia) 2010-03-08 14:10:58.000-0600

ok, I compiled 1.6.2.6-rc2 and see the same behaveior i described in 0119121.

But asterisk does not crash with pollmailboxes=yes



By: Mark Michelson (mmichelson) 2010-03-08 14:13:22.000-0600

Then in that case, I'm closing this bug since the problem has been fixed in a newer version of Asterisk. Thanks for reporting back so quickly!