Summary: ASTERISK-15462: Crash In chan_local in local_queue_frame (ast_mutex_trylock)
Reporter: Geoff Mina (geoff2010)
Labels:
Date Opened: 2010-01-18 16:07:35.000-0600
Date Closed: 2011-06-07 14:00:41
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Channels/chan_local
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) bt.txt
( 1) bt-full.txt
Description:

I had a server randomly crash on me today. I am currently running 1.4.26, but have scanned all the release notes up until 1.4.29 and found nothing that would indicate a change to chan_local was made to correct this issue. Unfortunately this is a very high profile platform and I can't just upgrade without knowing for certain the bug has been corrected.

I also searched the bug list and the most similar ticket I found was for the 1.6 branch.  I have attached the bt and bt full.  Please let me know if there is anything else I can provide.

Thanks.
Geoff

Comments:

By: Geoff Mina (geoff2010) 2010-01-18 20:37:10.000-0600

The following code is the source of the crash in chan_local.  It appears that 'other' is most likely an invalid pointer at this point... but I am not sure what else could be done to prevent this particular crash.  


        /* Recalculate outbound channel */
        other = isoutbound ? p->owner : p->chan;

        if (!other) {
                return 0;
        }

        /* do not queue frame if generator is on both local channels */
        if (us && us->generator && other->generator) {
                return 0;
        }

        /* Set glare detection */
        ast_set_flag(p, LOCAL_GLARE_DETECT);

        /* Ensure that we have both channels locked */
        while (other && ast_channel_trylock(other)) {
                ast_mutex_unlock(&p->lock);
                if (us && us_locked) {
                        do {
                                ast_channel_unlock(us);
                                usleep(1);
                                ast_channel_lock(us);
                        } while (ast_mutex_trylock(&p->lock));
                } else {
                        usleep(1);
                        ast_mutex_lock(&p->lock);
                }
                other = isoutbound ? p->owner : p->chan;
        }
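
For readers unfamiliar with the pattern, here is a minimal sketch of the trylock-and-back-off idea the loop above relies on, written against plain pthreads. It is not Asterisk code and not a proposed fix: `struct chan`, `struct pvt`, and `lock_peer` are hypothetical stand-ins, and the sketch assumes that whoever destroys a channel first clears `owner` or `chan` while holding the pvt lock. Under that assumption the peer pointer is only safe to use if it is re-read under the pvt lock after every back-off and not dereferenced until its own lock is held. Note that the excerpt above reads `other->generator` before `other` has been locked; whether that is related to this crash cannot be told from the excerpt alone.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for the channel and the local pvt; not Asterisk's real types. */
struct chan {
    pthread_mutex_t lock;
    int generator;
};

struct pvt {
    pthread_mutex_t lock;
    struct chan *owner;
    struct chan *chan;
};

/*
 * Must be called with p->lock held.
 * Returns the peer channel with its lock held, or NULL if the peer is gone.
 * p->lock is held again on return in both cases.
 */
static struct chan *lock_peer(struct pvt *p, int isoutbound)
{
    struct chan *other;

    assert(p != NULL);

    for (;;) {
        /* Re-read the peer on every pass; it may have changed while p->lock was released. */
        other = isoutbound ? p->owner : p->chan;
        if (other == NULL) {
            return NULL;
        }

        /* Never block on the peer while holding p->lock; that could deadlock. */
        if (pthread_mutex_trylock(&other->lock) == 0) {
            return other;
        }

        /* Back off: release p->lock so the current holder of the peer lock can make progress. */
        pthread_mutex_unlock(&p->lock);
        sched_yield();
        pthread_mutex_lock(&p->lock);
    }
}
```

With this shape, a check such as the generator test would run only after `lock_peer` returns a non-NULL peer, and the caller would unlock the peer when done.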

By: Geoff Mina (geoff2010) 2010-01-19 07:30:31.000-0600

Issue 12012 is about a year old, but appears to be a similar issue... or at least the segfault happened in the code that Russell added in the patch that fixed that earlier problem.

By: Leif Madsen (lmadsen) 2010-01-19 07:48:08.000-0600

Thanks for the triage in this bug report! That should be useful to a developer for sure.

By: Russell Bryant (russell) 2010-03-02 09:39:54.000-0600

There have been quite a few changes since 1.4.26 at this point (339 changes).  Can you try with the latest code in the 1.4 branch to see if you are still having a problem?

By: Geoff Mina (geoff2010) 2010-03-02 18:51:26.000-0600

I am planning on upgrading soon.  Unfortunately, we won't know for a very long time.  I run about 12 million calls through my network in a month... and this problem has only occurred a single time in over a year.

It's obviously not a common scenario which caused this.

thanks.

By: Leif Madsen (lmadsen) 2010-03-17 11:02:39

Due to the nature of this bug, I'm going to close this for now. Because the issue is VERY uncommon, I don't think leaving this open is really necessary. If the reporter continues to have this issue with future versions of Asterisk, then please reopen. Thanks!