Summary: ASTERISK-17414: [patch] Crashing when using local channels and realtime on asterisk 1.8.3-rc2
Reporter: Nic Colledge (nic)
Labels:
Date Opened: 2011-02-15 14:13:28.000-0600
Date Closed: 2011-05-25 06:31:17
Priority: Blocker
Regression?: No
Status: Closed/Complete
Components: Channels/chan_local
Versions:
Frequency of Occurrence:
Related Issues: is duplicated by ASTERISK-17967 Asterisk locks on transfer
Environment:
Attachments:
( 0) 18818.v1.txt
( 1) backtrace-threads.txt
( 2) backtrace-threads2.txt
( 3) core-show-locks.txt
( 4) core-show-locks2.txt
( 5) issue18818.patch
Description:
I have been having a problem with asterisk crashing when using local channels and realtime on asterisk 1.8.3-rc2.
The example given here is, I think, the easiest way to reproduce this problem.

In extensions.conf I have:

[internal]
switch => Realtime/extensions/p
exten => 301,1,Answer()
exten => 301,2,Dial(Local/501@internal)
exten => 301,3,Hangup()
exten => 501,1,Answer()
exten => 501,2,Playback(demo-echotest)
exten => 501,3,Echo()
exten => 501,4,Hangup()

With this dialplan, dialling 301 puts you through to 501 and you hear the echo test message fine. However, if I move 501 to the realtime database extensions table and remove it from extensions.conf, asterisk hangs on the local channel dial, then completely dies a few minutes later (console stops responding to commands), with killall -9 asterisk being the only way to stop it.

In both cases I can dial 501 directly with no problem.

The last message on the console (with verbose 10) is:

-- Executing [300@internal:2] Dial("SIP/1014-00000001", "Local/501@internal")

Everything works fine with the exact same setup and asterisk 1.8.1.2 and 1.8.2.3.

Discussed a little on asterisk-users, Jonathan Thurman managed to reproduce the problem with latest SVN of 1.8 branch.
Comments:

By: Nic Colledge (nic) 2011-02-15 14:16:11.000-0600

Uploaded some backtraces / core show locks.
Filenames ending with 2 are just after the problem happens. The others are some time later.

By: Jonathan Thurman (jthurman) 2011-02-15 14:57:41.000-0600

I can reproduce this on the latest branch 1.8 SVN with similar locks/backtraces.

By: Kadir Terzi (kterzi) 2011-02-16 03:06:21.000-0600

I have also installed 1.8.3-rcX, exactly the same problem. I have tried with Debian.

By: Nic Colledge (nic) 2011-03-02 15:05:38.000-0600

Just tried this on 1.8.4-rc2. Still a problem.

By: Kadir Terzi (kterzi) 2011-03-03 08:14:58.000-0600

I have now tried on 1.8.3. Problem is exactly the same as described above. It happens with Local channel/Realtime combination.

By: Nic Colledge (nic) 2011-03-08 15:52:46.000-0600

I have just done a little more digging into this. The problem seems to come from autoservice.c.

pbx_find_extension calls ast_autoservice_stop, which in turn gets stuck in the loop:

/* Wait while autoservice thread rebuilds its list. */
while (chan_list_state == as_chan_list_state) {
    usleep(1000);
}

If I take out the while loop, the problem goes away, but I dare say it's leaving some junk behind that will eventually kill asterisk.
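
For anyone following along, the wait can be sketched in isolation. This is only an illustration, not Asterisk's code: list_generation, worker_blocked, worker, wait_for_rebuild and stop_and_wait are hypothetical names, and the loop here is given a timeout (the loop quoted above has none) so that the stuck case returns instead of hanging.

```c
#define _DEFAULT_SOURCE /* exposes usleep() when building with strict -std=c11 */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical counter, bumped each time the worker rebuilds its list. */
static atomic_int list_generation;
/* Hypothetical switch that simulates a worker unable to make progress. */
static atomic_bool worker_blocked;

/* Stand-in for the servicing thread: performs one rebuild unless blocked. */
static void *worker(void *arg)
{
    (void)arg;
    if (!atomic_load(&worker_blocked)) {
        atomic_fetch_add(&list_generation, 1);
    }
    return NULL;
}

/* Poll roughly once per millisecond until the counter moves away from
 * snapshot, giving up after max_ms polls. Returns true if it moved. */
static bool wait_for_rebuild(int snapshot, int max_ms)
{
    assert(max_ms >= 0);
    for (int waited = 0; waited < max_ms; waited++) {
        if (atomic_load(&list_generation) != snapshot) {
            return true;
        }
        usleep(1000);
    }
    return atomic_load(&list_generation) != snapshot;
}

/* Snapshot the counter, start the worker, then wait for a rebuild. */
static bool stop_and_wait(bool block_worker, int max_ms)
{
    pthread_t tid;
    int snapshot;
    bool rebuilt;

    atomic_store(&worker_blocked, block_worker);
    snapshot = atomic_load(&list_generation);
    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        return false;
    }
    rebuilt = wait_for_rebuild(snapshot, max_ms);
    pthread_join(tid, NULL);
    return rebuilt;
}
```

When the worker is free to run, stop_and_wait(false, 200) sees the counter change and returns true. When the worker is blocked, stop_and_wait(true, 20) gives up and returns false; without the timeout that second call would spin forever, which matches the hang seen here.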

By: steve-howes (steve-howes) 2011-03-17 05:45:22

Can confirm this as well on 1.8.3.

By: Sébastien Couture (sysreq) 2011-03-18 06:14:49

I can confirm experiencing this on 1.8.3.2.

By: Jonathan Thurman (jthurman) 2011-03-23 10:57:58

From looking over the traces, and reproducing it, I think I have a resolution.  The local channel code holds a lock on the channel, and makes a call to ast_exists_extension.  Down that rabbit hole, autoservice_run attempts to lock the channel as well...

I've created a patch that seems to resolve it for me, but I need some additional help with testing.  The patch is against the 1.8 branch SVN 311604, but applies cleanly to 1.8.3.2.
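
The locking pattern described above can be sketched outside Asterisk. This is a rough illustration under stated assumptions, not the chan_local or autoservice code and not the attached patch: struct channel, struct probe, service_probe, lookup_extension, lookup_while_locked and lookup_unlocked are all hypothetical stand-ins. The helper thread uses pthread_mutex_trylock so that the contended case is reported rather than hanging.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a channel; not the real struct ast_channel. */
struct channel {
    pthread_mutex_t lock;
};

/* Carries the channel to the helper thread and the result back. */
struct probe {
    struct channel *chan;
    bool got_lock;
};

/* Stand-in for the servicing thread: it needs the channel lock briefly. */
static void *service_probe(void *arg)
{
    struct probe *p = arg;

    if (pthread_mutex_trylock(&p->chan->lock) == 0) {
        p->got_lock = true;
        pthread_mutex_unlock(&p->chan->lock);
    }
    return NULL;
}

/* Stand-in for a lookup that depends on another thread locking the
 * channel. Returns true if that thread was able to take the lock. */
static bool lookup_extension(struct channel *chan)
{
    struct probe p = { chan, false };
    pthread_t tid;

    assert(chan != NULL);
    if (pthread_create(&tid, NULL, service_probe, &p) != 0) {
        return false;
    }
    pthread_join(tid, NULL);
    return p.got_lock;
}

/* Problem pattern: do the lookup while still holding the channel lock. */
static bool lookup_while_locked(struct channel *chan)
{
    bool ok;

    pthread_mutex_lock(&chan->lock);
    ok = lookup_extension(chan);
    pthread_mutex_unlock(&chan->lock);
    return ok;
}

/* Safer pattern: copy what is needed, drop the lock, then look up. */
static bool lookup_unlocked(struct channel *chan)
{
    pthread_mutex_lock(&chan->lock);
    /* ... copy the context/extension needed for the lookup here ... */
    pthread_mutex_unlock(&chan->lock);

    /* State may change while unlocked; callers must re-validate it. */
    return lookup_extension(chan);
}
```

With a channel whose mutex was set up via pthread_mutex_init(&chan.lock, NULL), lookup_while_locked(&chan) returns false because the helper finds the lock held, while lookup_unlocked(&chan) returns true. In the real deadlock the second thread blocks on the lock instead of giving up, and the caller waits on that thread, so neither side can proceed.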

By: Nic Colledge (nic) 2011-03-23 11:30:43

I can confirm this patch fixes the issue in my test environment when applied to 1.8.3.2.

I will test tomorrow night in production environment.

Thanks!

By: Ishfaq Malik (ishmalik) 2011-04-13 08:38:46

I can also confirm that this patch, when applied to 1.8.3.2, has fixed this issue, which we were also experiencing.

By: Russell Bryant (russell) 2011-04-26 12:36:49

This patch looks good.  It just needed a few more tweaks to make things thread-safe since the channel was unlocked.  I uploaded the final patch here for the record.  Thanks!

By: Digium Subversion (svnbot) 2011-04-26 12:40:24

Repository: asterisk
Revision: 315446

U   branches/1.8/channels/chan_local.c

------------------------------------------------------------------------
r315446 | russell | 2011-04-26 12:40:24 -0500 (Tue, 26 Apr 2011) | 14 lines

chan_local: resolve a deadlock.

This patch resolves a fairly complex deadlock that can occur with the
combination of chan_local and a dialplan switch, such as dynamic realtime
extensions, which pulls autoservice into the picture when doing a dialplan
lookup.

(closes issue ASTERISK-17414)
Reported by: nic
Patches:
     issue18818.patch uploaded by jthurman (license 614)
     18818.v1.txt uploaded by russell (license 2)
Tested by: nic, jthurman, kterzi, steve-howes, sysreq, IshMalik

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=315446

By: Digium Subversion (svnbot) 2011-04-26 12:41:52

Repository: asterisk
Revision: 315447

_U  trunk/
U   trunk/channels/chan_local.c

------------------------------------------------------------------------
r315447 | russell | 2011-04-26 12:41:52 -0500 (Tue, 26 Apr 2011) | 21 lines

Merged revisions 315446 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

........
 r315446 | russell | 2011-04-26 12:40:23 -0500 (Tue, 26 Apr 2011) | 14 lines
 
 chan_local: resolve a deadlock.
 
 This patch resolves a fairly complex deadlock that can occur with the
 combination of chan_local and a dialplan switch, such as dynamic realtime
 extensions, which pulls autoservice into the picture when doing a dialplan
 lookup.
 
 (closes issue ASTERISK-17414)
 Reported by: nic
 Patches:
       issue18818.patch uploaded by jthurman (license 614)
       18818.v1.txt uploaded by russell (license 2)
 Tested by: nic, jthurman, kterzi, steve-howes, sysreq, IshMalik
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=315447