Summary: | ASTERISK-16678: Asterisk crash when bridging and masquerading channels | ||
Reporter: | Andre Luis (andrel) | Labels: | |
Date Opened: | 2010-09-14 17:12:28 | Date Closed: | 2011-07-27 13:11:26 |
Priority: | Critical | Regression? | No |
Status: | Closed/Complete | Components: | Core/Channels |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) backtrace.14.09.txt ( 1) backtrace1.txt ( 2) better-patch-1.6.2.13 ( 3) bt-2.txt ( 4) bt-3.txt ( 5) log.core.14.09 ( 6) patch-1.6.2.13 | |
Description: | Asterisk sometimes seems to crash when trying to bridge and masquerade a channel. Check the full debug log for information (the last lines show the errors) and the backtrace taken with gdb.
****** ADDITIONAL INFORMATION ******
[Sep 14 17:56:18] ERROR[4908] /root/asterisk/asterisk-1.6.2.12-rc1/include/asterisk/lock.h: channel.c line 478 (ast_check_hangup_locked): mutex '&chan->lock_dont_use' freed more times than we've locked!
[Sep 14 17:56:18] ERROR[4908] /root/asterisk/asterisk-1.6.2.12-rc1/include/asterisk/lock.h: channel.c line 478 (ast_check_hangup_locked): Error releasing mutex: Invalid argument
[Sep 14 17:56:18] DEBUG[4908] rtp.c: Changing ssrc from 55367769 to 1552696528 due to a source change
[Sep 14 17:56:18] ERROR[4908] /root/asterisk/asterisk-1.6.2.12-rc1/include/asterisk/lock.h: channel.c line 3239 (ast_indicate_data): Error obtaining mutex: Invalid argument
[Sep 14 17:56:18] ERROR[4908] astobj2.c: user_data is NULL | ||
Comments: | By: Andre Luis (andrel) 2010-09-14 17:14:13 log.core.14.09 is the full log. Scenario: Cisco router with E1 cards (two E1s, 60 channels total) communicating through SIP with the Asterisk server. Version: 1.6.2.12-rc1 (the issue was already present in 1.6.2.11). OS: Debian 5.0.4 Lenny, 32-bit. I don't know if this is related to the other issue: https://issues.asterisk.org/view.php?id=17984 Hope you guys can tell!
By: Alec Davis (alecdavis) 2010-09-14 17:42:08 Can you give us an example dialplan that will cause this crash?
By: Andre Luis (andrel) 2010-09-14 17:45:56 I don't know what caused the crash; I just reported it according to the Asterisk doc/backtrace.txt. I wish I could help more, but I don't understand enough to explain which dialplan causes it or how to reproduce it!
By: Andre Luis (andrel) 2010-09-20 15:56:25 I wonder if this https://issues.asterisk.org/view.php?id=16057 is anything like my post.
By: Alec Davis (alecdavis) 2010-09-20 19:11:05 I saw the message "freed more times than we've locked!" You should be able to apply the patch bug16057.diff4.txt at ASTERISK-14975.
By: Andre Luis (andrel) 2010-09-20 19:47:31 I didn't understand your message. Are you saying it's the same thing and the patch should work for me?
By: Alec Davis (alecdavis) 2010-09-20 20:09:25 Sorry, while testing the patch that's on issue ASTERISK-14975, I regularly saw the message "Fixup failed on channel XXX, strange things may happen", and after that I'd see your message. What I'm suggesting is that you either try the 1.6.2 SVN branch or apply the patch mentioned in my previous post.
By: Andre Luis (andrel) 2010-09-21 06:01:40 I applied the patch; I'll monitor to see whether it still crashes. I'll post here if anything changes!
By: Andre Luis (andrel) 2010-09-21 15:05:22 Patched, but it did not fix the problem. I'm getting these errors:
[Sep 21 16:45:41] ERROR[3849]: channel.c:1786 ast_hangup: Unable to find channel in list to free. Assuming it has already been done.
[Sep 21 16:45:41] WARNING[3849]: channel.c:1488 ast_channel_free: Channel '' may not have been hung up properly
By: Andre Luis (andrel) 2010-09-23 13:26:04 Just a note: the crash only happens when DUNDi is in use!
By: Andre Luis (andrel) 2010-10-25 07:26:17 Is anyone checking this out?
By: nmower (nmower) 2010-11-08 12:41:00.000-0600 We have this issue in version 1.6.2.13. (See backtrace in file bt-2.txt.) This bug is highly sporadic, but we'll try to get another backtrace with optimization turned off.
By: Andre Luis (andrel) 2010-11-08 12:43:19.000-0600 Mine is without optimization.
By: nmower (nmower) 2010-11-08 12:49:40.000-0600 Uploaded file bt-3.txt has a different backtrace, but the crash occurs in identical circumstances (during an attended transfer).
By: nmower (nmower) 2010-11-12 16:01:05.000-0600 Added backtrace1.txt. Optimization has been turned off, and we used 'bt full' this time.
By: nmower (nmower) 2010-11-16 16:10:52.000-0600 Hmm...looks like the zombie flag is set.
(gdb) fr 0
#0 0x00007f4f11911419 in sip_indicate (ast=0x7f4ef0104280, condition=17, data=0x0, datalen=0) at chan_sip.c:6641
6641 ast_rtp_new_source(p->rtp);
(gdb) p ast->flags
$3 = 524816
(gdb) p ast->flags & AST_FLAG_ZOMBIE
$4 = 16
By: nmower (nmower) 2010-11-22 12:36:31.000-0600 I just uploaded a patch that works for us.
By: Andre Luis (andrel) 2010-11-22 12:47:17.000-0600 Do you have a scenario for me to test?
By: nmower (nmower) 2010-11-22 14:57:22.000-0600 Sorry, there's no real test scenario. The bug is too sporadic for that. The best I can give you is high call volume with many attended transfers.
By: Andre Luis (andrel) 2010-11-22 15:01:04.000-0600 How do you reproduce the error in order to come up with a patch?
By: nmower (nmower) 2010-11-23 16:21:46.000-0600 The only way I know to reproduce the error is to wait for a day when the call volume is fairly high, and then examine the core file produced.
We had one more segfault today, which we addressed with the patch named 'better-patch-1.6.2.13'. After the patch, no segfaults all day long.
By: Andre Luis (andrel) 2010-11-24 04:02:07.000-0600 That's nice, so you're saying that if I place calls at a high volume it will crash?
By: nmower (nmower) 2010-11-24 07:28:15.000-0600 One or two crashes will occur on a very long, busy day. Or no crashes at all. Or four crashes, on a really bad day. I'm saying it's highly unpredictable.
By: Andre Luis (andrel) 2010-11-24 07:34:46.000-0600 My crashes only occur when I'm using DUNDi. When using IAX accounts with manual routes, everything works normally!
By: nmower (nmower) 2011-03-30 11:36:54 We have upgraded to Asterisk 1.6.2.17.2, and the issue appears to be resolved for us. We left the source code unmodified; none of our former patches were applied to it. I believe the ATXFER_NULL_TECH change on 20-Jan-2011 was the solution to our particular problem.
By: Alec Davis (alecdavis) 2011-06-04 00:24:00 Is this still a problem? At the end of log.core.14.09 I see that a call pickup had just been done. A few issues with directed pickup have been cleared up lately, and a later version may help. If the call pickup was indeed the cause, then for 1.6.2 only 1.6.2svn has these fixes; there isn't a released version yet. See ASTERISK-17264.
By: Andre Luis (andrel) 2011-06-15 09:06:52.615-0500 The problem is gone on newer versions! 1.6.2.17.2 is OK!
By: Andre Luis (andrel) 2011-06-15 09:08:15.539-0500 The issue seems to be gone in newer versions!
By: Russell Bryant (russell) 2011-07-27 13:11:16.896-0500 Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support please move to the 1.8 branch, which is a long term support (LTS) branch.
For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions If this is still an issue, please open a new issue so it can be re-triaged appropriately. Thanks! |
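Note on the gdb session in nmower's 2010-11-16 comment: the check `ast->flags & AST_FLAG_ZOMBIE` is a plain bitmask test, and it can be repeated offline on the printed value 524816. The sketch below is only an illustration in Python, not part of the ticket. It assumes AST_FLAG_ZOMBIE is the single bit with value 16, which is what the `$4 = 16` result suggests; the helper names `set_bits` and `has_flag` are hypothetical, and the other set bits are reported by position only because the ticket does not name them.

```python
# Assumption: AST_FLAG_ZOMBIE is the single bit 16 (1 << 4), inferred from
# the gdb result "$4 = 16" quoted in the ticket.
AST_FLAG_ZOMBIE = 1 << 4


def set_bits(value):
    """Return the positions of all bits set in a non-negative integer."""
    return [pos for pos in range(value.bit_length()) if value & (1 << pos)]


def has_flag(value, mask):
    """Return True if every bit of mask is set in value."""
    return value & mask == mask


flags = 524816  # value gdb printed for ast->flags
print(hex(flags))                        # hexadecimal form of the flags word
print(set_bits(flags))                   # positions of the set bits
print(has_flag(flags, AST_FLAG_ZOMBIE))  # whether the zombie bit is set
```

Working from bit positions rather than names keeps the sketch independent of the exact flag enum in any given Asterisk release.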