[Home]

Summary:ASTERISK-16059: [patch] endless cycle with ast_channel->lock held leeds to complete system stuck
Reporter:under (under)Labels:
Date Opened:2010-05-06 04:39:27Date Closed:2011-07-26 15:00:52
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Channels/chan_sip/T.38
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) udptl.diff
Description:(gdb) bt
#0  decode_length (buf=0x361a55f0 "\200\022\200\nê\233ïè\22546mø:Ú\001| \233\r\2137Ð\226\227±ER+\002\034/Þ", limit=32, len=0xbedcecec, pvalue=0xbedcebd4) at asterisk/main/udptl.c:212
#1  0x0816565d in decode_open_type (buf=0x361a55f0 "\200\022\200\nê\233ïè\22546mø:Ú\001| \233\r\2137Ð\226\227±ER+\002\034/Þ", limit=32, len=0xbedcecec, p_object=0xbedcece0, p_num_octets=0x361b1590)
   at asterisk/main/udptl.c:246
#2  0x08165e66 in udptl_rx_packet (s=0x361a51a8, buf=0x361a55f0 "\200\022\200\nê\233ïè\22546mø:Ú\001| \233\r\2137Ð\226\227±ER+\002\034/Þ", len=32) at asterisk/main/udptl.c:453
#3  0x0816701c in ast_udptl_read (udptl=0x361a51a8) at asterisk/main/udptl.c:746
#4  0x2b252f28 in sip_rtp_read (ast=0x35c7d1a8, p=0x35ae9268, faxdetect=0xbedceea0) at asterisk/channels/chan_sip.c:6132
ASTERISK-1  0x2b25349c in sip_read (ast=0x35c7d1a8) at asterisk/channels/chan_sip.c:6192
ASTERISK-2  0x080b52f8 in __ast_read (chan=0x35c7d1a8, dropaudio=0) at asterisk/main/channel.c:2714
ASTERISK-3  0x080b69f9 in ast_read (chan=0x35c7d1a8) at asterisk/main/channel.c:3012
ASTERISK-4  0x0813cd13 in bridge_p2p_loop (c0=0x35c7d1a8, c1=0x2d01a9a8, p0=0x356f31a8, p1=0x2d0c11a8, timeoutms=-1, flags=0, fo=0xbedcf7f4, rc=0xbedcf7f0, pvt0=0x35ae9268, pvt1=0x2d039268) at asterisk/main/rtp.c:3782
ASTERISK-5  0x0813e047 in ast_rtp_bridge (c0=0x35c7d1a8, c1=0x2d01a9a8, flags=0, fo=0xbedcf7f4, rc=0xbedcf7f0, timeoutms=-1) at asterisk/main/rtp.c:4058
ASTERISK-6 0x080be076 in ast_channel_bridge (c0=0x35c7d1a8, c1=0x2d01a9a8, config=0xbedd045c, fo=0xbedcf7f4, rc=0xbedcf7f0) at asterisk/main/channel.c:5059
ASTERISK-7 0x0806c5b8 in ast_bridge_call (chan=0x35c7d1a8, peer=0x2d01a9a8, config=0xbedd045c) at asterisk/main/features.c:2292
ASTERISK-8 0x2b5a9ad4 in dial_exec_full (chan=0x35c7d1a8, data=0x2e641f8c, peerflags=0xbedd07bc, continue_exec=0x0) at asterisk/apps/app_dial.c:2042
ASTERISK-9 0x2b5aa1ee in dial_exec (chan=0x35c7d1a8, data=0x2e641f8c) at asterisk/apps/app_dial.c:2108
ASTERISK-10 0x0811474b in pbx_exec (c=0x35c7d1a8, app=0x2b4c5318, data=0x2e641f8c) at asterisk/main/pbx.c:951
ASTERISK-11 0x2b0d0d83 in app_dial_wrapper (chan=0x35c7d1a8, data=0xbedd30d8) at app_dial_wrapper/app_dial_wrapper.cxx:165
ASTERISK-12 0x0811474b in pbx_exec (c=0x35c7d1a8, app=0x291996c8, data=0xbedd30d8) at asterisk/main/pbx.c:951
ASTERISK-13 0x0811c5c8 in pbx_extension_helper (c=0x35c7d1a8, con=0x0, context=0x35c7d3d4 "dial", exten=0x35c7d424 "s", priority=2, label=0x0, callerid=0x2d011e68 "4992705543", action=E_SPAWN, found=0xbedd5848, combined_find_spawn=1)
   at asterisk/main/pbx.c:3141
ASTERISK-14 0x0811e453 in ast_spawn_extension (c=0x35c7d1a8, context=0x35c7d3d4 "dial", exten=0x35c7d424 "s", priority=2, callerid=0x2d011e68 "4992705543", found=0xbedd5848, combined_find_spawn=1) at asterisk/main/pbx.c:3608
ASTERISK-15 0x2b5c1b2c in _macro_exec (chan=0x35c7d1a8, data=0x336fd84c, exclusive=0) at asterisk/apps/app_macro.c:336
ASTERISK-16 0x2b5c3ab0 in macro_exec (chan=0x35c7d1a8, data=0x336fd84c) at asterisk/apps/app_macro.c:499
ASTERISK-17 0x0811474b in pbx_exec (c=0x35c7d1a8, app=0x2b4c6aa8, data=0x336fd84c) at asterisk/main/pbx.c:951
ASTERISK-18 0x296384ea in hunt_cycle (chan=0x35c7d1a8, router=@0xbedd7a7c, caller_id=@0xbedd7aa8, callee_id=@0xbedd7aa4, huntstatus=@0xbedd882c) at res_smartswitch/hunt.cxx:123
ASTERISK-19 0x2963971e in hunt (chan=0x35c7d1a8, router=@0xbedd7a7c, caller_id=@0xbedd7aa8, callee_id=@0xbedd7aa4, huntstatus=@0xbedd882c, nof_cycles=5) at res_smartswitch/hunt.cxx:178
ASTERISK-20 0x2b1c317d in handle_originator_call (chan=0x35c7d1a8, id=19, caller_id=@0xbedd8870, callee_id=@0xbedd886c, huntstatus=@0xbedd882c) at app_softswitch/handle_originator_call.cxx:77
ASTERISK-21 0x2b1b0fe3 in app_softswitch (chan=0x35c7d1a8, data=0xbeddaab8) at app_softswitch/app_softswitch.cxx:84
ASTERISK-22 0x0811474b in pbx_exec (c=0x35c7d1a8, app=0x291bc328, data=0xbeddaab8) at asterisk/main/pbx.c:951
ASTERISK-23 0x0811c5c8 in pbx_extension_helper (c=0x35c7d1a8, con=0x0, context=0x35c7d3d4 "dial", exten=0x35c7d424 "s", priority=1, label=0x0, callerid=0x35c03f28 "907784360897", action=E_SPAWN, found=0xbeddcf0c, combined_find_spawn=1)
   at asterisk/main/pbx.c:3141
ASTERISK-24 0x0811e453 in ast_spawn_extension (c=0x35c7d1a8, context=0x35c7d3d4 "dial", exten=0x35c7d424 "s", priority=1, callerid=0x35c03f28 "907784360897", found=0xbeddcf0c, combined_find_spawn=1) at asterisk/main/pbx.c:3608
ASTERISK-25 0x0811eb33 in __ast_pbx_run (c=0x35c7d1a8, args=0x0) at asterisk/main/pbx.c:3699
ASTERISK-26 0x0811fc9f in pbx_thread (data=0x35c7d1a8) at asterisk/main/pbx.c:3972
ASTERISK-27 0x0816d9f2 in dummy_start (data=0x35c04628) at asterisk/main/utils.c:861
ASTERISK-28 0x2821b73f in pthread_getprio () from /lib/libthr.so.3
ASTERISK-29 0xbdaaefec in ?? ()

=======================================================================
=== Currently Held Locks ==============================================
=======================================================================
===
=== <file> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 688922128 (do_monitor           started at [20526] asterisk/channels/chan_sip.c restart_monitor())
=== ---> Tried and failed to get Lock #0 (asterisk/channels/chan_sip.c): MUTEX 20466 do_monitor dialog 0x35ae91a8 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 690330864 (netconsole           started at [ 1086] asterisk/main/asterisk.c listener())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 690328416 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Lock #0 (asterisk/main/channel.c): MUTEX 2553 __ast_read &chan->lock_dont_use 0x35c7d234 (1)
=== ---> Lock #1 (asterisk/channels/chan_sip.c): MUTEX 6191 sip_read p 0x35ae91a8 (1)
=== -------------------------------------------------------------------
===
=== Thread ID: 690329504 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 688918864 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754349728 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754362240 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 690330048 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754351904 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754361424 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754357616 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754364144 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 759661136 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 690326512 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 759658416 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 690331136 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 759660320 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754358704 (netconsole           started at [ 1086] asterisk/main/asterisk.c listener())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754356256 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754355984 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 690328144 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754363328 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754363872 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754352176 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 754362784 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)
=== -------------------------------------------------------------------
===
=== Thread ID: 690324880 (pbx_thread           started at [ 3993] asterisk/main/pbx.c ast_pbx_start())
=== ---> Tried and failed to get Lock #0 (asterisk/main/channel.c): MUTEX 1131 channel_find_locked &c->lock_dont_use 0x35c7d234 (0)


****** ADDITIONAL INFORMATION ******

It seems this happens when malformed T.38 fax is processed.
"length" variable which is checked in decode_open_type() will never be negative, because it's declared as unsigned!!!!!!!!!!! And cycle loops forever with ast_channel->lock being held.
1.4.21 doesn't has this issue.
Guys, how could you pass this to the mainline?
Comments:By: under (under) 2010-05-06 09:27:15

why did you change severity to 'minor'?
Production system is completely stuck after this issue!

By: Leif Madsen (lmadsen) 2010-05-07 11:50:02

'minor' refers to the number of people affected by the issue, not the severity of the issue in your environment.

By: Leif Madsen (lmadsen) 2010-05-07 11:58:08

Also, one of the developers has stated you're using a custom application and you're running out of stack space. Can you reproduce this on vanilla Asterisk without third-party changes?

By: under (under) 2010-05-11 02:40:30

yes, I'm running custom application.
The statement that I'm going out of stack space is wrong.
I already described the source of the issue, that is a regression in 1.6 branch.
The described issue happened about twice a day (actually when some of the customers sent broken fax).
I already attached a patch that fixes a problem. During last 4 days issue wasn't observed.

By: under (under) 2010-05-11 02:58:44

the issue is pretty simple, I already described it before.
if you have this snippet:

again:
unsigned length;
length = -1;
if(length > 0)
  goto again;

What do you expect to happen? (this is simplified extract from asterisk code).
The answer is : you'll  get endless cycle.

By: Paul Belanger (pabelanger) 2010-05-25 15:20:29

Is this an issue with 1.6.2 (see below).
---
Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions maintenance (bug) support for the 1.6.0 and 1.6.1 branches has ended. For continued maintenance support please move to the 1.6.2 branch.

More information on this change can be found in the release announcement: http://www.asterisk.org/node/49924

By: under (under) 2010-05-26 02:31:50

I don't understand the question

By: Ronald Chan (loloski) 2010-05-26 09:03:29

under: maybe what pabelanger is saying is that if you can reproduce the issue with 1.6.2? since the support for 1.6.0 and 1.6.1 respectively is already ended

By: under (under) 2010-05-26 12:39:04

I can't try to reproduce on 1.6.2, because asterisk runs on behalf of the carrier company. And they won't permit to "test something just to reproduce"
I did a fix for 1.6.0.26 and carrier's production system works on it now w/o this issue.
Buggy fax is not mine, I don't even know whom it belongs to so I can't call him and say "would you send fax now to reproduce?".
Regression in quite obvious. Please look at the patch.



By: under (under) 2011-01-10 05:50:16.000-0600

I saw this issue is fixed in trunk, I guess you can close this bug