Summary:ASTERISK-17798: pbx_lua hangs on faxdetect
Reporter:Alex Besogonov (cyberax)Labels:
Date Opened:2011-05-04 13:29:18Date Closed:2011-06-13 13:51:49
Versions:1.8.3 Frequency of
Environment:Attachments:( 0) bt1.txt
( 1) bt2.txt
( 2) locks1.txt
( 3) locks2.txt
Description:pbx_lua module deadlocks on faxdetect. The last lines from the log are:
[May  4 21:04:50] DEBUG[18412]: dsp.c:590 tone_detect: 1100 Hz done detected
[May  4 21:04:50] DEBUG[18412]: chan_sip.c:6954 sip_rtp_read: Fax CNG detected on SIP/datagroup-in-00000000
[May  4 21:04:50] DEBUG[18412]: pbx_lua.c:1283 lua_find_extension: Looking up fax@from-kharkov:1

The relevant section of extensions.lua is:
extensions = {
   ["from-kharkov"] = {
       ["fax"] = function()
           app.DumpChan() -- This never happens

       ["s"] = function()


Thread 3 (Thread 0x7fa458965700 (LWP 18412)):
#0  0x00007fa48599939d in nanosleep () from /lib/libc.so.6
#1  0x00007fa4859ce844 in usleep () from /lib/libc.so.6
---Type <return> to continue, or q <return> to quit---
#2  0x000000000044069a in ast_autoservice_stop (chan=0x7fa4805c4cd8) at autoservice.c:274
#3  0x00000000004d7263 in pbx_find_extension (chan=0x7fa4805c4cd8, bypass=<value optimized out>,
   q=<value optimized out>, context=<value optimized out>, exten=<value optimized out>,
   priority=<value optimized out>, label=0x0, callerid=0x7fa48058b100 "0577521688", action=E_MATCH) at pbx.c:2820
#4  0x00000000004d9def in pbx_extension_helper (c=0x7fa4805c4cd8, con=<value optimized out>,
   context=0x7fa4805c5230 "from-kharkov", exten=0x7fa46c353fda "fax", priority=<value optimized out>,
   label=<value optimized out>, callerid=0x7fa48058b100 "0577521688", action=E_MATCH, found=0x0,
   combined_find_spawn=0) at pbx.c:4037
ASTERISK-1  0x00000000004db324 in ast_exists_extension (c=0x7fa45895a050, context=<value optimized out>,
   exten=<value optimized out>, priority=<value optimized out>, callerid=<value optimized out>) at pbx.c:4583
ASTERISK-2  0x00007fa46c300f7f in sip_read (ast=0x7fa4805c4cd8) at chan_sip.c:6989
ASTERISK-3  0x0000000000468672 in __ast_read (chan=0x7fa4805c4cd8, dropaudio=0) at channel.c:3829
ASTERISK-4  0x0000000000469fa9 in ast_generic_bridge (c0=0x7fa4805c4cd8, c1=0x14b1a68, config=0x7fa45895f4e0,
   fo=<value optimized out>, rc=<value optimized out>) at channel.c:6849
ASTERISK-5  0x000000000046b1b5 in ast_channel_bridge (c0=0x7fa4805c4cd8, c1=0x14b1a68, config=0x7fa45895f4e0,
   fo=<value optimized out>, rc=<value optimized out>) at channel.c:7280
ASTERISK-6 0x000000000049786c in ast_bridge_call (chan=0x7fa4805c4cd8, peer=0x14b1a68, config=0x7fa45895f4e0)
   at features.c:3572
ASTERISK-7 0x00007fa4724b214f in dial_exec_full (chan=0x7fa4805c4cd8, data=<value optimized out>,
   peerflags=<value optimized out>, continue_exec=<value optimized out>) at app_dial.c:2653
ASTERISK-8 0x00007fa4724b3846 in dial_exec (chan=0x7fa45895a050, data=0x0) at app_dial.c:2753
ASTERISK-9 0x00000000004d98fb in pbx_exec (c=0x7fa4805c4cd8, app=0x7fa48050d3f0, data=0x7fa458960140 "SIP/1602,30,gkt")
   at pbx.c:1406
ASTERISK-10 0x00007fa4676d5f3d in lua_pbx_exec (L=0x1515790) at pbx_lua.c:229
---Type <return> to continue, or q <return> to quit---
ASTERISK-11 0x00007fa4674b2bc1 in ?? () from /usr/lib/liblua5.1.so.0
ASTERISK-12 0x00007fa4674bd959 in ?? () from /usr/lib/liblua5.1.so.0
ASTERISK-13 0x00007fa4674b308d in ?? () from /usr/lib/liblua5.1.so.0
ASTERISK-14 0x00007fa4674b2737 in ?? () from /usr/lib/liblua5.1.so.0
ASTERISK-15 0x00007fa4674b27b2 in ?? () from /usr/lib/liblua5.1.so.0
ASTERISK-16 0x00007fa4674ae1a1 in lua_pcall () from /usr/lib/liblua5.1.so.0
ASTERISK-17 0x00007fa4676d6a46 in exec (chan=0x7fa4805c4cd8, context=0x7fa4805c5230 "from-kharkov",
   exten=0x7fa4805c5280 "s", priority=<value optimized out>, callerid=<value optimized out>,
   data=<value optimized out>) at pbx_lua.c:1241
ASTERISK-18 0x00000000004da2f1 in pbx_extension_helper (c=0x7fa4805c4cd8, con=<value optimized out>,
   context=0x7fa4805c5230 "from-kharkov", exten=0x7fa4805c5280 "s", priority=<value optimized out>,
   label=<value optimized out>, callerid=0x7fa48058b100 "0577521688", action=E_SPAWN, found=0x7fa458964cbc,
   combined_find_spawn=1) at pbx.c:4098
ASTERISK-19 0x00000000004da590 in ast_spawn_extension (c=0x7fa45895a050, context=<value optimized out>,
   exten=<value optimized out>, priority=<value optimized out>, callerid=<value optimized out>,
   found=<value optimized out>, combined_find_spawn=1) at pbx.c:4608
ASTERISK-20 0x00000000004db5c9 in __ast_pbx_run (c=0x7fa4805c4cd8, args=<value optimized out>) at pbx.c:4706
ASTERISK-21 0x00000000004dc88b in pbx_thread (data=0x7fa45895a050) at pbx.c:5017
ASTERISK-22 0x000000000051525a in dummy_start (data=<value optimized out>) at utils.c:973
ASTERISK-23 0x00007fa484f7f9ca in start_thread () from /lib/libpthread.so.0
ASTERISK-24 0x00007fa4859d570d in clone () from /lib/libc.so.6
ASTERISK-25 0x0000000000000000 in ?? ()
Comments:By: Leif Madsen (lmadsen) 2011-05-05 10:47:53

Unfortunately your backtrace contains a lot of optimized out data. Please provide a backtrace per https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

By: Matthew Nicholson (mnicholson) 2011-05-06 10:20:27

I'll try and reproduce this today.

By: Matthew Nicholson (mnicholson) 2011-05-06 15:46:24

I tried to reproduce this but I had some problems getting fax detection working.  Please clarify if you are getting a crash or if asterisk is just locking up.

By: Alex Besogonov (cyberax) 2011-05-06 15:49:14

Asterisk is simply locking up, it doesn't crash. AEL and old-style extensions.conf both work fine.

By: Leif Madsen (lmadsen) 2011-05-10 16:43:16

OK a 'core show locks' will probably be useful here.

By: Matthew Nicholson (mnicholson) 2011-05-10 18:54:40

Also a non optimized backtrace may help.

By: Alex Besogonov (cyberax) 2011-05-10 18:58:49

Curiously, it doesn't happen on a version without optimizations (it still happens on a version from packages.asterisk.org, I've double-checked). Also, it sometimes doesn't happen if sipdebug is turned off.

So it's a race condition somewhere, I'll try to catch it with gdb.

By: Matthew Nicholson (mnicholson) 2011-06-10 12:22:00.609-0500

I have reproduced this locally and am looking into it now.

By: Matthew Nicholson (mnicholson) 2011-06-10 13:37:06.967-0500

This is not a problem specific to pbx_lua and has the potential to occur when any dialplan switch are enabled along with fax detect on chan_sip.

By: Matthew Nicholson (mnicholson) 2011-06-10 13:42:41.254-0500

I have uploaded both 'core show locks' and backtraces demonstrating this issue. The problem occurs when ast_autoservice_stop() is called with the channel lock held on a channel that has an auto service running on it. This occurs if chan_sip detects a fax in it's read function and any pbx_switches are enabled in the context checked for the fax extension.

The ast_autoservice_stop() function waits for the autoservice_run thread to update the as_chan_list_state variable while at the same time the autoservice_run thread waits to get a lock on the channel that called ast_autoservice_stop(), thus we have a deadlock.

Additionally, because this is not a traditional deadlock, it is not readily obvious in the output of 'core show locks'.

By: Matthew Nicholson (mnicholson) 2011-06-10 14:24:41.095-0500

This should be fixed in 1.6.2 r323039, 1.8 r323040, and trunk r323041.  I don't seem to have permission to close it.