Summary: ASTERISK-15398: (local_queue_frame): Error obtaining mutex: Invalid argument (causes crash)
Reporter: Matt King, M.A. Oxon. (kebl0155)
Labels:
Date Opened: 2010-01-08 10:43:36.000-0600
Date Closed: 2010-03-18 13:25:46
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Channels/chan_local
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) gdb.txt.gz
Description: Asterisk crashes (core dumped) with the following error messages:

[Jan  8 15:28:56] ERROR[9587] /usr/src/asterisk-1.6.2.0/include/asterisk/lock.h: chan_local.c line 234 (local_queue_frame): Error obtaining mutex: Invalid argument
[Jan  8 15:28:56] ERROR[9587] /usr/src/asterisk-1.6.2.0/include/asterisk/lock.h: chan_local.c line 659 (local_hangup): mutex '&p->lock' freed more times than we've locked!
[Jan  8 15:28:56] ERROR[9587] /usr/src/asterisk-1.6.2.0/include/asterisk/lock.h: chan_local.c line 659 (local_hangup): Error releasing mutex: Invalid argument


****** ADDITIONAL INFORMATION ******

I did see some similar-looking bugs, so I apologise if this is a dupe - I'm posting because this is the only one I can see with local_queue_frame in the ERROR message.

Asterisk crashed today.  We use Queue() with Local channels and Sip channels.  It looks like the crash was in chan_local.

The crash produced a core, which I have saved.  I compiled with DONT_OPTIMIZE and DEBUG_THREADS, so I can upload gdb output if requested - I thought I'd better wait for confirmation that this isn't a dupe first though.

Thanks for your help!
Comments:

By: Leif Madsen (lmadsen) 2010-01-08 12:16:18.000-0600

Before moving this forward, can you try with the latest 1.6.2 branch just to see if it has already been resolved? If so then I can close this, otherwise, then we can move this forward.

By: Leif Madsen (lmadsen) 2010-01-08 12:16:51.000-0600

Additionally, you'll need to attach a backtrace to this issue, per the instructions in the doc/backtrace.txt file of your Asterisk source. Thanks!

By: Matt King, M.A. Oxon. (kebl0155) 2010-01-08 12:27:06.000-0600

Hello,

We are already running the latest 1.6.2 release (1.6.2.0).  This is a production system serving multiple customers, and I cannot take the risk of running trunk on this machine.  I'm sure you understand.

I have attached the gdb output as requested.

Good luck!

By: Tilghman Lesher (tilghman) 2010-01-10 13:40:06.000-0600

He doesn't mean running trunk.  He means running the latest of the 1.6.2 branch (i.e. http://svn.digium.com/svn/asterisk/branches/1.6.2/, not http://svn.digium.com/svn/asterisk/trunk).

By: Leif Madsen (lmadsen) 2010-01-11 14:07:57.000-0600

Asterisk 1.6.2.1-rc1 was just released today as well, but yes, I meant 1.6.2 branch (as I stated) and not Asterisk trunk (which is entirely different).

By: Matt King, M.A. Oxon. (kebl0155) 2010-01-29 05:54:49.000-0600

Hi, sorry we can't run 1.6.2.1 because of this bug https://issues.asterisk.org/view.php?id=16729

We haven't seen a crash in 1.6.2.0 since the original post though...



By: Matt King, M.A. Oxon. (kebl0155) 2010-02-05 07:10:30.000-0600

We just had another one about half an hour ago.

Exactly the same error messages:

[Feb  5 12:22:39] ERROR[30570] /usr/src/asterisk-1.6.2.0/include/asterisk/lock.h: chan_local.c line 234 (local_queue_frame): Error obtaining mutex: Invalid argument
[Feb  5 12:22:39] ERROR[30570] /usr/src/asterisk-1.6.2.0/include/asterisk/lock.h: chan_local.c line 659 (local_hangup): mutex '&p->lock' freed more times than we've locked!
[Feb  5 12:22:39] ERROR[30570] /usr/src/asterisk-1.6.2.0/include/asterisk/lock.h: chan_local.c line 659 (local_hangup): Error releasing mutex: Invalid argument

Do you need another thread trace?

By: Matt King, M.A. Oxon. (kebl0155) 2010-02-22 10:17:56.000-0600

And again today - twice.

I had a look through the code.  This is happening when local_hangup tries to unlock &p->lock (line 659) after the attempt to lock it fails in local_queue_frame at line 234.

I checked the lock documentation, and (excluding thread priority) the only way Invalid argument can be thrown is if &p->lock hasn't been initialised properly:

The pthread_mutex_lock(), pthread_mutex_trylock() and pthread_mutex_unlock() functions may fail if:

[EINVAL]
   The value specified by mutex does not refer to an initialised mutex object.

However, we know the object was successfully unlocked a microsecond earlier.  Something must be invalidating the object (or freeing it?) during this microsecond.

Can we check to see if the pointer does refer to an initialised mutex object before calling unlock?  If so what would be the code for this?
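
On the question of a pre-check: as far as I know, POSIX offers no portable call for testing whether a pthread_mutex_t is initialised, and using an uninitialised or destroyed mutex is undefined behaviour, so there is nothing reliable to test before calling unlock. What can be done is to check the return value of the lock call and skip the unlock when the lock was never obtained. The following is only a minimal sketch in plain pthreads, not the chan_local code: struct pvt, pvt_init() and queue_frame_checked() are hypothetical names, and the error-checking mutex type is an assumption made here so that misuse comes back as an error code.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the private structure guarded by p->lock. */
struct pvt {
    pthread_mutex_t lock;
    int frames_queued;
};

/* Hypothetical initialiser: sets up p->lock as an error-checking mutex,
 * so that unlocking a mutex we do not hold returns an error code. */
static int pvt_init(struct pvt *p)
{
    pthread_mutexattr_t attr;
    int res = pthread_mutexattr_init(&attr);

    if (res != 0) {
        return res;
    }
    res = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (res == 0) {
        res = pthread_mutex_init(&p->lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    p->frames_queued = 0;
    return res;
}

/* Hypothetical helper: returns 0 on success or the pthread error code.
 * The mutex is unlocked only if the lock call actually succeeded. */
static int queue_frame_checked(struct pvt *p)
{
    int res = pthread_mutex_lock(&p->lock);

    if (res != 0) {
        fprintf(stderr, "lock failed: %s\n", strerror(res));
        return res; /* never unlock a mutex we failed to lock */
    }
    p->frames_queued++;
    return pthread_mutex_unlock(&p->lock);
}
```

A caller would run pvt_init() once, call queue_frame_checked() wherever the frame is queued, and treat a non-zero return as "lock not held". This does not explain why the mutex became invalid in the first place; it only avoids the second error from unlocking a lock that was never taken.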

I am going to upgrade to the 1.6.2 SVN version of chan_local.c - it's not substantially different, so far as I can tell.  I will repost when the error next occurs.



By: Leif Madsen (lmadsen) 2010-02-23 09:56:21.000-0600

Thanks for the information. Hopefully a developer will be able to look at this shortly. Any additional information you come across which may be useful is always welcome. Thanks again!

By: Russell Bryant (russell) 2010-03-01 14:48:36.000-0600

It has been a little while now, and the 1.6.2 branch has a ton of fixes in it.  Can you please try the latest code in the 1.6.2 branch?  There is a good chance this has been fixed.

By: Matt King, M.A. Oxon. (kebl0155) 2010-03-03 10:22:30.000-0600

Hi Russell,

The releases since then are all security releases, so the code that fixes the bug we're stuck on (16729) hasn't made it into the release stream yet.

I have disabled DEBUG_THREADS and DONT_OPTIMIZE and we haven't had the problem since.

By: Leif Madsen (lmadsen) 2010-03-17 10:48:18

1.6.2.6 has now been released with the latest changes in it. Please re-test.

By: Leif Madsen (lmadsen) 2010-03-18 13:25:45

I'm going to close this issue for now. If the reporter still has issues on versions beyond 1.6.2.6, then please reopen the issue. Thanks!