Summary:ASTERISK-16640: ConfBridge crashes when leave simultaneously
Reporter:thomas987 (thomas987)
Labels:
Date Opened:2010-09-02 09:18:11
Date Closed:2013-01-14 21:49:03.000-0600
Status:Closed/Complete
Components:Applications/app_confbridge Bridges/bridge_softmix
Versions:
Frequency of
is duplicated by ASTERISK-20261 app_conference cause asterisk coredump
is related to ASTERISK-16390 ConfBridge crashes Asterisk
is related to ASTERISK-16835 Segfault when shutting down and ongoing traffic to ConfBridge application
Environment:
Attachments:( 0) 20110201_trunk_asterisk.log
( 1) 20110201_trunk_gdb.log
( 2) backtrace_pthread.txt
( 3) backtrace_timerfd.txt
Description:ConfBridge crashes when users leave the conference at the same time (very timing sensitive).

I have reproduced the issue every time (3-4 times) when using the res_timing_timerfd timer, and also in the single attempt I made with the res_timing_pthread timer.

I didn't get the crash in my single attempt with the res_timing_dahdi timer, but I might have just been lucky.
Comments:
By: thomas987 (thomas987) 2011-02-01 04:19:12.000-0600

Hi. It has been some time since this crash was reported. Today I tried the latest code from the svn trunk and reproduced the same bug there. I have attached new log output from asterisk and gdb.

This issue should be changed to reflect that the bug is in the trunk and not only in the 1.6.2 version. Most probably this applies to the related issue 0017670 as well.

The bug is easily reproduced on my side by calling simulated phones which hang up at exactly the same time.

By: Gunnar Harms (speedy) 2012-05-31 14:53:58.142-0500

Maybe someone wants to try this patch or comment on it. It's a backport of some lines from Asterisk 10:
--- bridges/bridge_softmix.c.orig       2011-07-14 22:13:06.000000000 +0200
+++ bridges/bridge_softmix.c    2012-04-25 22:42:22.000000000 +0200
@@ -149,11 +149,19 @@
       struct softmix_channel *sc = bridge_channel->bridge_pvt;

+       if (!(bridge_channel->bridge_pvt)) {                          
+           return 0;
+       }
+       bridge_channel->bridge_pvt = NULL;
       /* Drop mutex lock */
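
For readers who want to see the guard in the snippet above in isolation, here is a minimal sketch of the same check-and-clear idea. It is not Asterisk code: `toy_bridge_channel`, `toy_softmix_channel`, `toy_join` and `toy_leave` are hypothetical stand-ins invented for illustration, and the sketch does not model locking, so it only shows that a repeated leave becomes a no-op. Whether that is enough to close the race reported here has not been confirmed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-channel mixing state. */
struct toy_softmix_channel {
    int frames_mixed;
};

/* Hypothetical stand-in for a bridge channel; only the private pointer matters here. */
struct toy_bridge_channel {
    void *bridge_pvt;   /* NULL once the private data has been released */
};

/* Attach private data when a channel joins. Returns 0 on success, -1 on failure. */
int toy_join(struct toy_bridge_channel *bc)
{
    assert(bc != NULL);
    bc->bridge_pvt = calloc(1, sizeof(struct toy_softmix_channel));
    return bc->bridge_pvt ? 0 : -1;
}

/*
 * Release private data when a channel leaves.
 * Returns 1 if the data was freed by this call, 0 if it was already gone.
 * Assumption: callers are serialized (for example by an outer lock);
 * the check and the clear below are not atomic on their own.
 */
int toy_leave(struct toy_bridge_channel *bc)
{
    struct toy_softmix_channel *sc;

    assert(bc != NULL);
    sc = bc->bridge_pvt;
    if (!sc) {
        return 0;           /* second leave: nothing left to tear down */
    }
    bc->bridge_pvt = NULL;  /* clear first so a repeat call takes the early return */
    free(sc);
    return 1;
}
```

Calling `toy_leave` twice on the same channel frees the data on the first call and returns early on the second, instead of dereferencing or freeing a stale pointer.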

By: sun bing (hoowa) 2012-09-20 02:16:04.872-0500

I want to know why this still has not been fixed after 2 years, and why the Asterisk team fixed this problem quietly in v10.

Isn't 1.8 a long-term support release, or have you given up on 1.8?

By: Matt Jordan (mjordan) 2012-09-20 09:04:25.447-0500

First, no one has "given up" on Asterisk 1.8.  Asterisk 1.8 is an LTS version [1] that continues to receive bug fixes, and new releases are made on a monthly basis that contain those bug fixes.  I'm not sure how you would come to the conclusion that Asterisk 1.8 isn't supported simply because this issue happens to be open.

Second, ConfBridge is marked as an extended support module in Asterisk 1.8 [2].  Development fixes for it typically come from the developer community.  While Gunnar provided an in-comment snippet of code that *may* fix the problem in Asterisk 1.8, no one has actually provided a patch for that version, attached it to this issue, tested it, and confirmed that it fixed the problem.  If you need this fixed in Asterisk 1.8, you may want to consider contacting a developer on the asterisk-biz list to see if they'd be willing to fix this for you.

Finally, ConfBridge went through a major overhaul for Asterisk 10.  The intent wasn't just to fix bugs - it and the underlying Bridge API got several major improvements that were too intrusive to be made in a release branch.  The changes made were not candidates for backporting.  As a result of some of those changes, some things that were bugs in Asterisk 1.8 in ConfBridge were naturally fixed.  If someone would like to use some of those changes to fix the bugs in ConfBridge in Asterisk 1.8 they'd be welcome to do so - but they'd need to attach the diffs as patches to the appropriate bug reports and test them.

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions

[2] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Module+Support+States