
Summary: ASTERISK-17840: [patch] Deadlock on transferring
Reporter: Wulfert Hop (wulfert)
Labels:
Date Opened: 2011-05-11 10:31:42
Date Closed: 2011-07-01 09:54:15
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/chan_sip/Transfers
Versions: 1.8.3
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) 2011-04-21_14_41_cli-capture.txt
( 1) 2011-04-21_14_41_core-show-locks.txt
( 2) 2011-04-21_14_41_gdb.txt
( 3) patch.txt
( 4) Same_patch_aftersigning.txt
Description: I have several backtraces available, and a small patch (see Additional Information) which solves our problem.

What I have found from the backtraces is this (my comments follow the //):
....
#8  0x00007fa2f66b552f in sip_set_rtp_peer (chan=0xd8d998, instance=0xd9c278, vinstance=0x0, tinstance=0x0, codecs=7437, nat_active=0) at chan_sip.c:27693

 line 27649: sip_pvt_lock(p);    // lock is being set

#7  0x00007fa2f665888d in transmit_reinvite_with_sdp (p=0xd85f88, t38version=0, oldsdp=0) at chan_sip.c:11020
#6  0x00007fa2f66424c5 in try_suggested_sip_codec (p=0xd85f88) at chan_sip.c:5996
#5  0x0000000000519fc0 in pbx_builtin_getvar_helper (chan=0xd8d998, name=0x7fa2f66c81c3 "SIP_CODEC_INBOUND") at pbx.c:9467

 line 9467: ast_channel_lock(chan);   // a second lock is requested here and waits while the first lock is still held????

#4  0x0000000000444ac0 in __ao2_lock (user_data=0xd8d998, file=0x5b8a54 "pbx.c", func=0x5beb50 "pbx_builtin_getvar_helper", line=9467, var=0x5bd339 "chan") at astobj2.c:157
....

I assume my patch is more of a workaround, and it possibly introduces new bugs (by temporarily unlocking it).

****** ADDITIONAL INFORMATION ******

<inline patch removed by lmadsen>
Comments:

By: Leif Madsen (lmadsen) 2011-05-11 14:12:53

Please attach patches as a text file to this issue. You can't submit patches inline like that. You must sign the license agreement prior to uploading your patches.

By: Leif Madsen (lmadsen) 2011-05-11 14:15:10

Additionally, your backtrace contains <value optimized out>. Backtraces should be taken from core files created by an unoptimized Asterisk binary:

https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace

By: Wulfert Hop (wulfert) 2011-05-11 15:15:03

I did select DONT_OPTIMIZE and DEBUG_THREADS in make menuselect.

Maybe I forgot a "make clean"?

But the only optimized-out parts are in el_gets().
If you still insist, I will have to put a malfunctioning Asterisk back into production and announce it to our users and customers.

By: Wulfert Hop (wulfert) 2011-05-12 03:47:52

Just signed the license and uploaded again.

By: Wulfert Hop (wulfert) 2011-05-16 15:14:49

Could this be the same issue as ASTERISK-17431?

By: Alec Davis (alecdavis) 2011-05-16 17:02:55

If it is the same, use bug18837-trunk.diff3.txt from ASTERISK-17431.

The actual diff for 1.8svn that was applied is at http://svnview.digium.com/svn/asterisk/branches/1.8/channels/chan_sip.c?r1=308679&r2=308945&pathrev=308945

By: Richard Mudgett (rmudgett) 2011-06-30 18:57:04.497-0500

Please try Asterisk 1.8.5-rc1 since it was just released with many deadlock issues resolved.

By: Wulfert Hop (wulfert) 2011-07-01 03:48:21.864-0500

Hi Richard,

Not sure if this is the right way to answer.

We are now using: 1.8.4.3
The problem is solved there, thanks.

Although our installation still deadlocks, it does so less frequently (around once every two weeks). But with monitoring, and restarting Asterisk with sipsak, this is sort of acceptable.

I will keep updating to the latest versions.

If you want, I can recompile and provide backtraces, etc.

Many thanks,

Wulfert

By: Richard Mudgett (rmudgett) 2011-07-01 09:54:15.142-0500

I am going to close this issue as it should be fixed by the patch Alec Davis committed (ASTERISK-17431). The "core show locks" output attached to this issue looks to be the same deadlock.

Core show locks is the best way to locate deadlocks.  If you have other deadlocks please open a new issue.  Thanks.