[Home]

Summary:ASTERISK-11873: Channel deadlock in ast_autoservice_stop / pbx_find_extension
Reporter:Atis Lezdins (atis)Labels:
Date Opened:2008-04-17 11:02:50Date Closed:2008-04-23 10:20:40
Priority:MajorRegression?No
Status:Closed/CompleteComponents:Channels/General
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) 01_bt.core.26019.txt
( 1) 01_callid_1208355514.1361.log
( 2) 01_callid_1208371309.8726.log
( 3) 01_callid_1208375969.11308.log
( 4) 01_callid_1208382837.14137.log
( 5) 01_core_show_locks.txt
( 6) 01_show_channel.txt
( 7) 01_show_channels_verbose.txt
( 8) 01_sip_show_channel.txt
( 9) 02_callid_1208463015.9332.log
(10) 02_core_show_locks.txt
(11) 02_show_channels.txt
Description:Yesterday I noticed 4 locked channels that didn't went away. One of them is local, 3 SIP. They all seems to be locked in ast_autoservice_stop which is called from pbx_find_extension

In SIP calls deadlock position is obvious - nothing more occurs in that thread and log ends, but in Local channel it's after this line:

[Apr 16 07:19:23] DEBUG[22435] res_config_mysql.c: MySQL RealTime: Retrieve SQL: SELECT * FROM extensions_table WHERE exten = '28901' AND context = 'db_queue_na' AND priority = '7'

Database contents in this row are GoTo(hangup_ok|s|1) - nothing that should have caused lock, and normally it works good.

I tried terminating all those channels with "soft hangup" but that didn't helped. So i waited for night time and grabbed a core dump (and also lot's of different debug - everything i could think of)

It seems that this started after upgrade from 1.4.14 to 1.4.19, previously we didn't had that much locked channels, but now there are some once every few days.


Comments:By: Russell Bryant (russell) 2008-04-17 11:05:01

Can you try to get "core show locks" output, please?

By: Atis Lezdins (atis) 2008-04-17 11:06:52

did that :) You're too fast

By: Kevin McAllister (mcallist) 2008-04-17 13:12:28

I am also experiencing this exact problem.  I am willing and able to provide stack dumps, etc if needed.

Also, this may be obvious, but the hung readlocks on conlock hold up dialplan reload indefinitely, requiring a restart to make dialplan changes.

By: Digium Subversion (svnbot) 2008-04-17 17:10:33

Repository: asterisk
Revision: 114230

U   branches/1.4/main/autoservice.c

------------------------------------------------------------------------
r114230 | russell | 2008-04-17 17:10:32 -0500 (Thu, 17 Apr 2008) | 6 lines

Remove redundant safety net.  The check for the autoservice channel list state
accomplishes the same goal in a better way.

(issue ASTERISK-11873)
Reported By: atis

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114230

By: Digium Subversion (svnbot) 2008-04-17 17:18:47

Repository: asterisk
Revision: 114233

_U  trunk/
U   trunk/main/autoservice.c

------------------------------------------------------------------------
r114233 | russell | 2008-04-17 17:18:47 -0500 (Thu, 17 Apr 2008) | 14 lines

Merged revisions 114230 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r114230 | russell | 2008-04-17 17:15:43 -0500 (Thu, 17 Apr 2008) | 6 lines

Remove redundant safety net.  The check for the autoservice channel list state
accomplishes the same goal in a better way.

(issue ASTERISK-11873)
Reported By: atis

........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114233

By: Digium Subversion (svnbot) 2008-04-17 17:19:06

Repository: asterisk
Revision: 114234

_U  branches/1.6.0/
U   branches/1.6.0/main/autoservice.c

------------------------------------------------------------------------
r114234 | russell | 2008-04-17 17:19:06 -0500 (Thu, 17 Apr 2008) | 22 lines

Merged revisions 114233 via svnmerge from
https://origsvn.digium.com/svn/asterisk/trunk

................
r114233 | russell | 2008-04-17 17:24:00 -0500 (Thu, 17 Apr 2008) | 14 lines

Merged revisions 114230 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r114230 | russell | 2008-04-17 17:15:43 -0500 (Thu, 17 Apr 2008) | 6 lines

Remove redundant safety net.  The check for the autoservice channel list state
accomplishes the same goal in a better way.

(issue ASTERISK-11873)
Reported By: atis

........

................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=114234

By: Russell Bryant (russell) 2008-04-17 17:19:21

Give it a try with that change ...

By: Atis Lezdins (atis) 2008-04-17 17:24:18

Another one locked channel, this time not on realtime context.

[Apr 17 13:11:13] VERBOSE[19817] logger.c:     -- Executing [28206@local_dial:69] System("Local/28206@default_queue-c5d9,2",

Will get backtrace tomorrow, anything else I can grab before dumping to core?

By: Atis Lezdins (atis) 2008-04-17 17:26:58

Sorry, this log I just uploaded has nothing to do with commit, will try that tomorrow.

By: Kevin McAllister (mcallist) 2008-04-17 21:09:28

I have applied.  Will check it out.

Thanks for the quick turn around.

By: Atis Lezdins (atis) 2008-04-22 02:51:31

I have modified patch and applied to production, so I should see when possible lock would occur in log, and see it's further path, but so far I haven't seen anything. Will wait for this week.



By: Russell Bryant (russell) 2008-04-22 11:06:33

Please do not post code of any kind in comments.  Thanks.  :)

By: Kevin McAllister (mcallist) 2008-04-23 09:49:08

We've been running with this patch to autoservice.c on 1.4.19 since the 18th.  And I have not seen the context lock get hung up since then.

Thank you for the quick fix.

By: Atis Lezdins (atis) 2008-04-23 10:02:23

I can also confirm from modified patch, that situation repeated, but call continued successfully without waiting for unblock.