
Summary: ASTERISK-16678: Asterisk crash when bridging and masquerading channels
Reporter: Andre Luis (andrel)
Labels:
Date Opened: 2010-09-14 17:12:28
Date Closed: 2011-07-27 13:11:26
Priority: Critical
Regression?: No
Status: Closed/Complete
Components: Core/Channels
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) backtrace.14.09.txt
( 1) backtrace1.txt
( 2) better-patch-1.6.2.13
( 3) bt-2.txt
( 4) bt-3.txt
( 5) log.core.14.09
( 6) patch-1.6.2.13
Description: Asterisk sometimes seems to crash when trying to bridge and masquerade a channel.
Check the full debug log (the errors are at the end) and the gdb backtrace for details.

****** ADDITIONAL INFORMATION ******

[Sep 14 17:56:18] ERROR[4908] /root/asterisk/asterisk-1.6.2.12-rc1/include/asterisk/lock.h: channel.c line 478 (ast_check_hangup_locked): mutex '&chan->lock_dont_use' freed more times than we've locked!
[Sep 14 17:56:18] ERROR[4908] /root/asterisk/asterisk-1.6.2.12-rc1/include/asterisk/lock.h: channel.c line 478 (ast_check_hangup_locked): Error releasing mutex: Invalid argument
[Sep 14 17:56:18] DEBUG[4908] rtp.c: Changing ssrc from 55367769 to 1552696528 due to a source change
[Sep 14 17:56:18] ERROR[4908] /root/asterisk/asterisk-1.6.2.12-rc1/include/asterisk/lock.h: channel.c line 3239 (ast_indicate_data): Error obtaining mutex: Invalid argument
[Sep 14 17:56:18] ERROR[4908] astobj2.c: user_data is NULL
Comments:

By: Andre Luis (andrel) 2010-09-14 17:14:13

log.core.14.09 is the full log.

Scenario:
Cisco router with E1 cards (two E1s, 60 channels total),
communicating through SIP
with the Asterisk server, version 1.6.2.12-rc1 (the issue was already present in 1.6.2.11).
OS: Debian 5.0.4 Lenny, 32-bit

I don't know if this is related to the other issue:
https://issues.asterisk.org/view.php?id=17984
Hope you guys can tell!

By: Alec Davis (alecdavis) 2010-09-14 17:42:08

Can you give us an example dialplan that will cause this crash?

By: Andre Luis (andrel) 2010-09-14 17:45:56

I don't know what caused the crash;
I just reported it following Asterisk's doc/backtrace.txt.
I wish I could help more, but I don't understand enough to say which dialplan causes it
or how to reproduce it!

By: Andre Luis (andrel) 2010-09-20 15:56:25

I wonder if this issue, https://issues.asterisk.org/view.php?id=16057, is anything like my post.

By: Alec Davis (alecdavis) 2010-09-20 19:11:05

I saw the message "freed more times than we've locked!"

You should be able to apply the patch bug16057.diff4.txt at ASTERISK-14975

By: Andre Luis (andrel) 2010-09-20 19:47:31

I didn't understand your message.
Are you saying it's the same thing and the patch should work for me?

By: Alec Davis (alecdavis) 2010-09-20 20:09:25

Sorry. While testing the patch that's on issue ASTERISK-14975, I regularly saw the message "Fixup failed on channel XXX, strange things may happen", and after that I'd see your message.

What I'm suggesting is that you either try the 1.6.2 SVN branch or apply the patch mentioned in my previous post.

By: Andre Luis (andrel) 2010-09-21 06:01:40

I applied the patch; I'll monitor to see if it still crashes.
I'll post here if anything changes!

By: Andre Luis (andrel) 2010-09-21 15:05:22

Patched, but it did not fix the problem.
I'm now getting these errors:

[Sep 21 16:45:41] ERROR[3849]: channel.c:1786 ast_hangup: Unable to find channel in list to free. Assuming it has already been done.
[Sep 21 16:45:41] WARNING[3849]: channel.c:1488 ast_channel_free: Channel '' may not have been hung up properly



By: Andre Luis (andrel) 2010-09-23 13:26:04

Just a note.

The crash only happens when DUNDi is in use!

By: Andre Luis (andrel) 2010-10-25 07:26:17

Is anyone checking this out?

By: nmower (nmower) 2010-11-08 12:41:00.000-0600

We have this issue in version 1.6.2.13.  (See backtrace in file bt-2.txt.)  This bug is highly sporadic, but we'll try to get another backtrace with optimization turned off.

By: Andre Luis (andrel) 2010-11-08 12:43:19.000-0600

Mine is without optimization.

By: nmower (nmower) 2010-11-08 12:49:40.000-0600

Uploaded file bt-3.txt has a different backtrace, but the crash occurs in identical circumstances.  (During an attended transfer.)

By: nmower (nmower) 2010-11-12 16:01:05.000-0600

Added backtrace1.txt.  Optimization has been turned off, and we used 'bt full' this time.

By: nmower (nmower) 2010-11-16 16:10:52.000-0600

Hmm...looks like the zombie flag is set.

(gdb) fr 0
#0  0x00007f4f11911419 in sip_indicate (ast=0x7f4ef0104280, condition=17, data=0x0, datalen=0)
   at chan_sip.c:6641
6641 ast_rtp_new_source(p->rtp);
(gdb) p ast->flags
$3 = 524816
(gdb) p ast->flags & AST_FLAG_ZOMBIE
$4 = 16

By: nmower (nmower) 2010-11-22 12:36:31.000-0600

I just uploaded a patch that works for us.

By: Andre Luis (andrel) 2010-11-22 12:47:17.000-0600

Do you have a scenario for me to test?

By: nmower (nmower) 2010-11-22 14:57:22.000-0600

Sorry, there's no real test scenario.  The bug is too sporadic for that.  The best I can give you is high call volume with many attended transfers.

By: Andre Luis (andrel) 2010-11-22 15:01:04.000-0600

How do you reproduce the error in order to come up with a patch?

By: nmower (nmower) 2010-11-23 16:21:46.000-0600

The only way I know to reproduce the error is to wait for a day when the call volume is fairly high, and then examine the core file produced.  We had one more segfault today, which we addressed with the patch named 'better-patch-1.6.2.13'.  After the patch, no segfaults all day long.

By: Andre Luis (andrel) 2010-11-24 04:02:07.000-0600

That's nice. So you are saying that if I place calls at high volume, it will crash?

By: nmower (nmower) 2010-11-24 07:28:15.000-0600

One or two crashes will occur on a very long, busy day.  Or no crashes at all.  Or four crashes, on a really bad day.  I'm saying it's highly unpredictable.

By: Andre Luis (andrel) 2010-11-24 07:34:46.000-0600

My crashes only occur when I'm using DUNDi. When using IAX accounts with manual routes, everything works normally!

By: nmower (nmower) 2011-03-30 11:36:54

We have upgraded to Asterisk 1.6.2.17.2, and the issue appears to be resolved for us.  We left the source code unmodified -- none of our former patches were applied to it.

I believe the ATXFER_NULL_TECH change on 20-Jan-2011 was the solution to our particular problem.

By: Alec Davis (alecdavis) 2011-06-04 00:24:00

Is this still a problem?

At the end of log.core.14.09 I see that a call pickup had just been done. A few issues with directed pickup have been cleared up lately, and a later version may help.

If the call pickup was indeed the cause, then for 1.6.2 only 1.6.2svn has these fixes; there isn't a released version yet.

See ASTERISK-17264



By: Andre Luis (andrel) 2011-06-15 09:06:52.615-0500

The problem is gone in newer versions!
1.6.2.17.2 is OK!

By: Andre Luis (andrel) 2011-06-15 09:08:15.539-0500

The issue seems to be gone in newer versions!

By: Russell Bryant (russell) 2011-07-27 13:11:16.896-0500

Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions, maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support, please move to the 1.8 branch, which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions

If this is still an issue, please open a new issue so it can be re-triaged appropriately. Thanks!