Summary: ASTERISK-11158: Crash in app_meetme when invoking the page function
Reporter: callguy (callguy)
Labels:
Date Opened: 2008-01-05 07:52:58.000-0600
Date Closed: 2008-01-16 09:07:30.000-0600
Versions:
Frequency of
Environment:
Attachments:
( 0) 11687.diff
( 1) 11687-bt-btfull-thread-apply-all-bt.txt
( 2) 11687-dialplan.txt
( 3) meetme-crash-10142007.rtf
Description: We are seeing intermittent crashes in 1.4.16 when invoking the page function. I saw issue 11612 was reported and resolved, but the backtrace from that doesn't appear the same as what we are seeing, so I'm posting this one as well.
Comments:
By: Russell Bryant (russell) 2008-01-05 09:11:13.000-0600

Your system appears to be really good at finding problems.  :)

By: callguy (callguy) 2008-01-05 09:51:25.000-0600

no kidding there - i guess we're just gluttons for punishment. :)

By: callguy (callguy) 2008-01-06 16:51:29.000-0600

we confirmed this is still an issue in 1.4.17, so it's definitely something different than what was fixed in 11612.

By: Joshua C. Colp (jcolp) 2008-01-07 10:37:43.000-0600

I suspect two threads are trying to muck with the dialing information at the same time. Can you also provide a thread apply all bt so I can see if this is true? The console output and general information on what exactly is happening from a user's perspective would also be useful.
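
As an illustration of the kind of race suspected here, the following is a minimal Python sketch, not Asterisk code. `DialInfo`, `add_target`, and `page_all` are hypothetical names invented for this example. It only shows the general idea that a check-then-update on shared state needs a single lock held across the whole operation.

```python
import threading


class DialInfo:
    """Hypothetical stand-in for dialing state shared between threads."""

    def __init__(self):
        self.lock = threading.Lock()
        self.targets = []

    def add_target(self, name):
        # Hold the lock across the whole check-then-append, so two threads
        # cannot both see the target as missing and append it twice.
        with self.lock:
            if name not in self.targets:
                self.targets.append(name)


def page_all(info, names, n_threads=4):
    """Start n_threads workers that each try to add every name, then wait."""

    def worker():
        for name in names:
            info.add_target(name)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return info.targets
```

Without the lock, two workers could interleave between the membership check and the append and leave the list in a state neither intended, which is the general shape of the problem being asked about.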

By: dtyoo (dtyoo) 2008-01-07 12:05:40.000-0600

russell, file-

I work with callguy and am going to help get you more info on this.  I have access to a newer crash / core file, but it's the same issue.  First of all, I've uploaded a new bt, bt full, and thread apply all bt for the most recent crash.  While callguy's data was from 1.4.16, this one is from 1.4.17.

I have not been able to reproduce this issue in our dev environment.  It happens when a paging function is invoked that calls the Page application for a whole bunch of remote Polycom 501 phones.  I've attached a dialplan snippet of where the problem is happening.  Notably, there is an intermediate Local channel being used.

The last line on the console before the crash is:

ERROR[9164] /usr/src/asterisk-test/1.4.17/asterisk-1.4.17/include/asterisk/lock.h: channel.c line 1248 (ast_channel_free): Error: attempt to destroy locked mutex '&chan->lock'.

Let me know what other info you need on this and I'll be happy to supply it.
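
The console error quoted above reports that a mutex was being destroyed while something still held it. A minimal Python sketch of that kind of check follows. It models the idea with a non-blocking acquire and is not the code in lock.h or channel.c; `Channel` and `channel_free` are hypothetical names used only for this illustration.

```python
import threading


class Channel:
    """Hypothetical stand-in for a channel that owns a lock."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()


def channel_free(chan):
    """Return True if the channel could be freed, False if its lock is held."""
    # A non-blocking acquire fails if any thread currently holds the lock,
    # which is the condition the console error is complaining about.
    if not chan.lock.acquire(blocking=False):
        print(f"Error: attempt to destroy locked mutex on {chan.name}")
        return False
    chan.lock.release()
    # Real teardown of the object would happen here.
    return True
```

In this model, a refused free means some other code path still considers the channel in use, which is consistent with two threads touching the same channel at once.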

By: Joshua C. Colp (jcolp) 2008-01-07 12:23:12.000-0600

Please try the attached patch.

By: callguy (callguy) 2008-01-07 21:34:46.000-0600

file: we're testing this out now. we've been hitting this issue a few times per week, so we'll update in a few days. thanks!

By: callguy (callguy) 2008-01-09 12:08:47.000-0600

file: I just posted a new bug (11712) that appeared to be unrelated, but did happen on the machine I have running the diff from here. You may want to take a look.

By: callguy (callguy) 2008-01-16 09:03:40.000-0600

file: we've had over a week of uptime without any recurrence of this issue, so I think you can close this one out. Thanks for your help!

By: Joshua C. Colp (jcolp) 2008-01-16 09:07:29.000-0600

Fixed in 1.4 as of revision 98960 and trunk as of revision 98961.