Summary: ASTERISK-10446: Crash in local_queue_frame trying to trylock a corrupted p->owner lock
Reporter: Chase Venters (chaseventers)
Labels:
Date Opened: 2007-10-04 13:02:41
Date Closed: 2007-10-16 17:01:01
Priority: Critical
Regression? No
Status: Closed/Complete
Components: Addons/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) btfull.txt
             (1) btfull-crash2.txt
Description: It appears that local_queue_frame() is trying to lock a mutex that has been destroyed (or is otherwise corrupt). The reentrancy value is 7903722, and the pthread mutex struct contains __m_owner = 0xbad, which is probably a magic cookie for a mutex that has been destroyed.

In frame 1, isoutbound is 1, so other is p->owner, which makes the corrupted lock &p->owner->lock.
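
To make the reasoning above concrete, here is a minimal sketch of how the far-end channel is selected and then trylocked. It is not the chan_local.c source: the sketch_* names and the two structs are hypothetical stand-ins, and the assumption that the non-outbound case picks p->chan is mine, not something stated in this report.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real Asterisk structures. */
struct sketch_chan {
    pthread_mutex_t lock;
};

struct sketch_pvt {
    struct sketch_chan *owner; /* one end of the Local channel pair */
    struct sketch_chan *chan;  /* the other end (assumed field name) */
};

/* Pick the far-end channel: with isoutbound set, "other" is p->owner,
 * as described in the report. The else branch is an assumption. */
static struct sketch_chan *sketch_pick_other(struct sketch_pvt *p, int isoutbound)
{
    assert(p != NULL);
    return isoutbound ? p->owner : p->chan;
}

/* Try to lock the far end without blocking.
 * Returns 0 on success, EBUSY if the lock is held, EINVAL if there is
 * no far end. If other->lock was already destroyed or overwritten,
 * the behavior of pthread_mutex_trylock() is undefined. */
static int sketch_trylock_other(struct sketch_pvt *p, int isoutbound)
{
    struct sketch_chan *other = sketch_pick_other(p, isoutbound);

    if (other == NULL)
        return EINVAL;
    return pthread_mutex_trylock(&other->lock);
}
```

With isoutbound set to 1, the lock passed to pthread_mutex_trylock() is &p->owner->lock, which matches the lock identified as corrupted in frame 1.
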
Comments:

By: Chase Venters (chaseventers) 2007-10-04 13:08:11

Hmm, my searches didn't find this before, but it appears to be related to 0010875.

By: Chase Venters (chaseventers) 2007-10-05 14:56:41

Saw this in my logs:

[Oct  3 08:01:15] ERROR[24122] /root/NewAst/asterisk-1.4.11/include/asterisk/lock.h: chan_local.c line 180 (local_queue_frame): mutex '&us->lock' freed more times than we've locked!
[Oct  3 08:01:15] ERROR[24122] /root/NewAst/asterisk-1.4.11/include/asterisk/lock.h: chan_local.c line 180 (local_queue_frame): Error releasing mutex: Operation not permitted

By: Chase Venters (chaseventers) 2007-10-05 14:58:37

Saw this today:

[Oct  5 11:26:17] ERROR[24320] /root/NewAst/asterisk-1.4.11/include/asterisk/lock.h: cdr.c line 996 (post_cdr): '&(&be_list)->lock' was locked here.
[Oct  5 11:26:17] ERROR[24313] /root/NewAst/asterisk-1.4.11/include/asterisk/lock.h: cdr.c line 996 (post_cdr): Deadlock? waited 5 sec for mutex '&(&be_list)->lock'?

These messages repeated identically many times with different thread IDs, eventually resulting in another core dump (see btfull-crash2.txt).
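
The "Deadlock? waited 5 sec" message suggests that the debug build polls the mutex and complains once a wait has gone on too long. The sketch below shows that general technique with plain pthreads. It is only a guess at the approach, not the lock.h implementation: every sketch_* name, the polling interval, and the give-up limit (added so the example terminates) are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Monotonic clock in milliseconds. */
static long sketch_now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Poll the mutex with trylock. While it stays busy, print a warning
 * every warn_ms and give up after limit_ms.
 * Returns 0 once the lock is acquired, ETIMEDOUT after giving up, or
 * another error number from pthread_mutex_trylock().
 * *warnings receives the number of warnings printed. */
static int sketch_lock_with_warning(pthread_mutex_t *m, const char *name,
                                    long warn_ms, long limit_ms, int *warnings)
{
    const struct timespec nap = { 0, 200 * 1000 }; /* 200 microseconds */
    long start, next_warn;
    int res;

    assert(m != NULL && warnings != NULL);
    start = sketch_now_ms();
    next_warn = start + warn_ms;
    *warnings = 0;

    while ((res = pthread_mutex_trylock(m)) == EBUSY) {
        long now = sketch_now_ms();

        if (now >= next_warn) {
            fprintf(stderr, "Deadlock? waited %ld ms for mutex '%s'?\n",
                    now - start, name);
            (*warnings)++;
            next_warn += warn_ms;
        }
        if (now - start >= limit_ms)
            return ETIMEDOUT;
        nanosleep(&nap, NULL);
    }
    return res;
}
```

A warning of this kind only reports that a wait is long. It cannot tell a real deadlock from a slow lock holder, which is consistent with the message being phrased as a question.
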

By: callguy (callguy) 2007-10-12 15:40:31

ChaseVenters: I took a look at your backtraces. Can you confirm if you are running with DEBUG_THREADS enabled?

By: Chase Venters (chaseventers) 2007-10-12 15:46:20

I was. I turned off all the debugging options the other day after studying lock.h -- it looks like the debugging code is vulnerable to a number of race conditions.

By: callguy (callguy) 2007-10-12 15:55:13

That's what it looked like. This is the same as bug 10571; there's a patch there that should resolve the issue.

By: Digium Subversion (svnbot) 2007-10-16 16:53:56

Repository: asterisk
Revision: 85994

U   branches/1.4/include/asterisk/lock.h

------------------------------------------------------------------------
r85994 | russell | 2007-10-16 16:53:52 -0500 (Tue, 16 Oct 2007) | 16 lines

Some locking errors exposed the fact that the lock debugging code itself was
not thread safe.  How ironic!  Anyway, these changes ensure that the code that
is accessing the lock debugging data is thread-safe.  

Many thanks to Ivan for finding and fixing the core issue here, and also
thanks to those that tested the patch and provided test results.

(closes issue ASTERISK-10177)
(closes issue ASTERISK-10446)
(closes issue ASTERISK-10436)
(might close some others, as well ...)

Patches: (from issue ASTERISK-10177)
     ivan_ast_1_4_12_rel_patch_lock.h.diff uploaded by Ivan (license 229)
      - a few small changes by me

------------------------------------------------------------------------
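
The commit message above describes making access to the lock debugging data thread-safe. As a rough illustration of that idea, and not the actual lock.h patch, the sketch below keeps a reentrancy counter next to a recursive mutex and guards the counter with a second, internal mutex. All sketch_* names and the struct layout are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct sketch_tracked_mutex {
    pthread_mutex_t mutex;       /* the lock callers actually want (recursive) */
    pthread_mutex_t track_mutex; /* guards the bookkeeping below */
    int reentrancy;              /* how many times the lock is currently held */
};

static int sketch_tracked_init(struct sketch_tracked_mutex *t)
{
    pthread_mutexattr_t attr;
    int res;

    assert(t != NULL);
    t->reentrancy = 0;
    if ((res = pthread_mutex_init(&t->track_mutex, NULL)) != 0)
        return res;
    if ((res = pthread_mutexattr_init(&attr)) == 0) {
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        res = pthread_mutex_init(&t->mutex, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    if (res != 0)
        pthread_mutex_destroy(&t->track_mutex);
    return res;
}

static int sketch_tracked_lock(struct sketch_tracked_mutex *t)
{
    int res = pthread_mutex_lock(&t->mutex);

    if (res == 0) {
        /* Update the counter only while holding the internal lock. */
        pthread_mutex_lock(&t->track_mutex);
        t->reentrancy++;
        pthread_mutex_unlock(&t->track_mutex);
    }
    return res;
}

static int sketch_tracked_unlock(struct sketch_tracked_mutex *t)
{
    int held, res;

    pthread_mutex_lock(&t->track_mutex);
    held = t->reentrancy;
    if (held > 0)
        t->reentrancy--;
    pthread_mutex_unlock(&t->track_mutex);

    if (held <= 0)
        return EPERM; /* released more times than it was locked */

    res = pthread_mutex_unlock(&t->mutex);
    if (res != 0) {
        /* The unlock failed, so undo the bookkeeping change. */
        pthread_mutex_lock(&t->track_mutex);
        t->reentrancy++;
        pthread_mutex_unlock(&t->track_mutex);
    }
    return res;
}
```

Without the internal mutex, two threads could read and write the counter at the same time and leave it at a nonsense value, which is one plausible way for bookkeeping to report a release that was never matched by a lock.
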

By: Digium Subversion (svnbot) 2007-10-16 17:01:01

Repository: asterisk
Revision: 85995

_U  trunk/
U   trunk/include/asterisk/lock.h

------------------------------------------------------------------------
r85995 | russell | 2007-10-16 17:01:01 -0500 (Tue, 16 Oct 2007) | 24 lines

Merged revisions 85994 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r85994 | russell | 2007-10-16 17:14:36 -0500 (Tue, 16 Oct 2007) | 16 lines

Some locking errors exposed the fact that the lock debugging code itself was
not thread safe.  How ironic!  Anyway, these changes ensure that the code that
is accessing the lock debugging data is thread-safe.  

Many thanks to Ivan for finding and fixing the core issue here, and also
thanks to those that tested the patch and provided test results.

(closes issue ASTERISK-10177)
(closes issue ASTERISK-10446)
(closes issue ASTERISK-10436)
(might close some others, as well ...)

Patches: (from issue ASTERISK-10177)
     ivan_ast_1_4_12_rel_patch_lock.h.diff uploaded by Ivan (license 229)
      - a few small changes by me

........

------------------------------------------------------------------------