Summary: ASTERISK-17479: Error obtaining mutex in channel.c
Reporter: Absystech Telephony Team (absystech)
Labels:
Date Opened: 2011-02-25 08:49:09.000-0600
Date Closed: 2011-07-27 13:17:41
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/Channels
Versions: 1.8.4
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) backtrace.txt
             (1) full_debug_20110225-1147.log
Description: Core crash after incoming calls on DAHDI, or on a transfer between incoming DAHDI calls and SIP peers.

SIP account 5630 is a receptionist phone, and it intercepts incoming calls that arrive in the queue called "Standard".
It may crash on a transfer between 5630 and 7662, or perhaps on incoming calls...

****** ADDITIONAL INFORMATION ******

Debian Lenny Stable - Kernel 2.6.26-2
Asterisk 1.6.2.16.2
Asterisk Addons 1.6.2.3
DAHDI 2.4 and DAHDI Tools 2.4
LibPRI 1.4.11.5
Digium TE122B
Phones : Aastra 6731i firmware 3.0.2.70 February 2011
Receptionist Phone : Aastra 6757i firmware 3.0.2.70 February 2011
Comments:

By: David Woolley (davidw) 2011-02-25 08:59:28.000-0600

Although I can't view the log without going to a bit of trouble (Unix newlines but viewing on Windows), this sort of symptom (an invalid lock followed by a delayed crash from memory corruption) is often caused by a race involving a hangup on one or both of the conflicting channels.

By: Absystech Telephony Team (absystech) 2011-02-25 09:27:31.000-0600

Thanks, that is indeed what we read in the logs.
But how can we prevent this?
Can this kind of issue be caused by a large number of dialogs?
What do you think about pedantic mode?
Could the phones be the source of the problem?

Thanks in advance

By: David Woolley (davidw) 2011-02-25 11:09:44.000-0600

This sort of problem is always due to a design error in Asterisk.  A lot of complex activity tends to bring it out, because it increases the chances of two events happening with just the right timing relationship.  When we've found this sort of thing, it always has to be fixed in the source code (we are using an old version, so we often back port fixes from later versions).

If you can make a good guess at what is going wrong, you can often add a call to sleep in the source code, which makes the problem repeatable, for debugging and verification of the fix.
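As a rough illustration of the sleep technique, here is a minimal sketch of a check-then-act race between a transfer and a hangup, with an injected delay that widens the window. Everything in it (`fake_channel`, `transfer_thread`, `hangup_thread`, `race_observed`) is hypothetical and not taken from the Asterisk source. To stay safe to run, it only records that the channel was hung up between the check and the use, instead of actually touching freed memory.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a channel; not Asterisk's struct ast_channel. */
struct fake_channel {
    atomic_bool checked;            /* transfer thread has done its check */
    atomic_bool hung_up;            /* set by the hangup thread */
    atomic_bool used_after_hangup;  /* the bug: channel used after hangup */
    long delay_ms;                  /* injected debugging delay, 0 = none */
};

static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

static void *transfer_thread(void *arg)
{
    struct fake_channel *chan = arg;

    /* Check: the channel looks alive at this point. */
    bool alive = !atomic_load(&chan->hung_up);
    atomic_store(&chan->checked, true);

    if (alive) {
        /* Injected sleep: widens the gap between check and use. */
        if (chan->delay_ms > 0)
            sleep_ms(chan->delay_ms);

        /* Act: real code would lock or dereference the channel here. */
        if (atomic_load(&chan->hung_up))
            atomic_store(&chan->used_after_hangup, true);
    }
    return NULL;
}

static void *hangup_thread(void *arg)
{
    struct fake_channel *chan = arg;

    /* Hang up as soon as the other thread has passed its check. */
    while (!atomic_load(&chan->checked))
        sleep_ms(1);
    atomic_store(&chan->hung_up, true);
    return NULL;
}

/* Runs one transfer/hangup pair; returns true if the race was hit. */
bool race_observed(long delay_ms)
{
    struct fake_channel chan;
    pthread_t transfer, hangup;

    atomic_init(&chan.checked, false);
    atomic_init(&chan.hung_up, false);
    atomic_init(&chan.used_after_hangup, false);
    chan.delay_ms = delay_ms;

    pthread_create(&transfer, NULL, transfer_thread, &chan);
    pthread_create(&hangup, NULL, hangup_thread, &chan);
    pthread_join(transfer, NULL);
    pthread_join(hangup, NULL);

    return atomic_load(&chan.used_after_hangup);
}
```

Calling `race_observed(200)` should hit the race on practically every run, whereas `race_observed(0)` hits it only when the scheduler happens to interleave the two threads unluckily. The same injected delay can then be used to check that a fix (for example, holding a lock or a reference across the check and the use) really closes the window.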

Workarounds are not really the job of the issue tracker, but the only workaround I know of, at least without a detailed understanding of the particular failure, is to run on a single processor, single core machine (or to simulate this with processor affinity), as this greatly reduces the opportunities, although it doesn't remove them.
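For reference, the processor-affinity variant of this workaround can be sketched on Linux with `sched_setaffinity`. This is only a sketch under stated assumptions: the helper names `pin_to_single_cpu` and `allowed_cpu_count` are hypothetical, the calls are Linux-specific (glibc with `_GNU_SOURCE`), and nothing here is part of Asterisk itself.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Restrict the calling thread to the first CPU it is currently allowed
 * to run on. Returns that CPU's index, or -1 on failure.
 */
int pin_to_single_cpu(void)
{
    cpu_set_t allowed;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t one;

            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (sched_setaffinity(0, sizeof(one), &one) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;
}

/* Number of CPUs the calling thread may run on, or -1 on failure. */
int allowed_cpu_count(void)
{
    cpu_set_t allowed;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    return CPU_COUNT(&allowed);
}
```

Threads created after the call inherit the mask, so a small wrapper could call `pin_to_single_cpu()` early and then start the real program. Threads then no longer run truly in parallel, but preemption can still interleave them, which is why this reduces the opportunities without removing them.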

By: Alec Davis (alecdavis) 2011-02-28 14:33:35.000-0600

ASTERISK-17378, although filed against 1.8.x or trunk, describes the same scenario: the receptionist transfers a DAHDI or IAX call between SIP extensions and Asterisk segfaults.

By: Russell Bryant (russell) 2011-03-14 10:51:29

Your backtrace appears to contain memory corruption and we require valgrind output in order to move this issue forward.

Please see https://wiki.asterisk.org/wiki/display/AST/Valgrind for more information about how to produce debugging information. Thanks!

By: Absystech Telephony Team (absystech) 2011-03-14 11:30:00

Thanks for your analysis. It will be difficult to get valgrind information because we have not been able to reproduce the crash since last month... I'll do my best.

I kept the core file (~120 MB) and I can run some gdb commands on it in the meantime.

By: Russell Bryant (russell) 2011-07-27 13:17:35.740-0500

Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions, maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support, please move to the 1.8 branch, which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions

If this is still an issue, please open a new issue so it can be re-triaged appropriately. Thanks!