Summary: ASTERISK-19754: Deadlock in chan_sip / pthread_timing
Reporter: Nikola Ciprich (nikola.ciprich)
Labels:
Date Opened: 2012-04-19 04:45:49
Date Closed: 2013-04-19 11:00:27
Status: Closed/Complete
Components: Channels/chan_sip/General, Resources/res_timing_pthread
Versions: Frequency of
Is related to: ASTERISK-21389 res_timing_pthread fails to return from write, causing timer dependent operations to block indefinitely
Environment: Linux (CentOS 5), x86_64, kernel 3.0.x
Attachments:
( 0) ASTERISK-19754-core-show-locks.txt
( 1) backtrace-threads_1310.txt
( 2) core-show-locks_1310.txt
Description: From time to time, Asterisk SIP handling seems to get into an unresponsive state (no replies to INVITE, REGISTER, etc.).
The culprit seems to be a deadlock in chan_sip:

[edit - mcj]

core show locks output removed and put into attachment.
Comments:
By: Filip Frank (frenk77) 2012-04-23 05:45:27.456-0500

I may have hit this problem too, after our DNS failure. All phones became unregistered. They send SIP REGISTER requests, but Asterisk does not send any response to them.

By: Nikola Ciprich (nikola.ciprich) 2012-04-24 03:06:46.404-0500

Anyone? Is there something I could do to help debug this issue? It seems to be quite serious.

By: Matt Jordan (mjordan) 2012-04-24 12:40:57.766-0500

Please attach debug output, be it log statements or the output of 'core show locks', as a file and not in the issue description.

By: Nikola Ciprich (nikola.ciprich) 2012-06-30 01:25:25.432-0500

So the problem seems to be present in as well. We've had multiple such hangs on one of our customer's boxes. I noticed that they had multiple BLF buttons set to nonexistent extensions. I wonder whether this is related?

By: Kien Kennedy (kiennd) 2012-10-18 01:58:40.729-0500

I had the same problem, too. I have tried many versions of Asterisk (1.8.11, 1.8.15 and the newest 1.8.18-rc1), but the problem still occurs. The system worked well for 1-2 months, but recently it has started having these problems.

By: Nikola Ciprich (nikola.ciprich) 2012-10-18 02:07:19.184-0500

Our testing just hung a few minutes ago, so yes, the problem is definitely still present.

By: Kien Kennedy (kiennd) 2012-10-18 04:25:19.112-0500

Has anyone tried older versions? Are 1.4 or 1.6 more stable?

By: Matt Jordan (mjordan) 2013-04-19 10:56:34.145-0500

There is a patch on ASTERISK-21389 that should prevent res_timing_pthread from blocking callers. That should resolve this issue.

If you test with that patch and this issue is still a problem, please let a bug marshal know in #asterisk-bugs and we will be happy to reopen this issue. Thanks!
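
Illustration (not the patch from ASTERISK-21389): the related issue describes a write that never returns and therefore blocks everything waiting on the timer. As a minimal sketch of that general failure mode, assuming a timer that signals ticks by writing single bytes to a pipe, the C code below marks the write end O_NONBLOCK so that a full pipe yields EAGAIN instead of a hung caller. The helper names set_nonblocking, signal_tick and fill_pipe are hypothetical and do not come from the Asterisk source; the actual fix may work quite differently.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch a file descriptor to non-blocking mode.
 * Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: signal one timer tick by writing one byte.
 * Returns 1 if the tick was written, 0 if the pipe is full (the tick
 * is dropped instead of blocking the caller), -1 on any other error. */
static int signal_tick(int wfd)
{
    unsigned char tick = 1;
    ssize_t n;

    do {
        n = write(wfd, &tick, 1);
    } while (n == -1 && errno == EINTR); /* retry if interrupted by a signal */

    if (n == 1) {
        return 1;
    }
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return 0;
    }
    return -1;
}

/* Hypothetical helper: write ticks until the pipe refuses more.
 * Returns the number of ticks written. The descriptor must already be
 * non-blocking, otherwise the final write would hang forever. */
static long fill_pipe(int wfd)
{
    long written = 0;

    assert(fcntl(wfd, F_GETFL, 0) & O_NONBLOCK);
    while (signal_tick(wfd) == 1) {
        written++;
    }
    return written;
}
```

With a blocking descriptor, the write that hits a full pipe would sleep until a reader drains it; if the reader is itself waiting on a lock held by the writer, neither thread makes progress. Returning 0 from signal_tick in that case trades a dropped tick for a caller that always returns.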