Summary:ASTERISK-15987: [patch] Deadlock between ast_hangup and pri_dchannel
Reporter:Laurent Steffan (lmsteffan)Labels:
Date Opened:2010-04-20 19:49:23Date Closed:2010-05-07 10:33:54
Versions:Frequency of
Environment:Attachments:( 0) bug17216.patch
( 1) gdb.txt
( 2) lockApr14.txt
Description:I think what causes the deadlock is the following :

Ast_hangup performs a lock of the channel then, through dahdi_hangup, tries to lock the private (Dahdi) part of the channel.

Meanwhile, pri_dchannel first seizes the private part of the channel, then, in ast_softhangup_nolock, tries to lock the channel that owns this private part.

A deadlock occurs as both threads attempt to seize both deadlocks in reverse order.

I include a full trace and a snapshot of the locks.


The threads involved, if I am not mistaken, are

pri_dchannel : thread id 139827492997456, which corresponds to thread # 11 in the GDB trace ;

ast_hangup : thread id 139827037731152, and that's thread # 38 in the trace.

I believe that pri_dchannel has the locks in the wrong order. I thought about performing a lock on the channel before seizing its private part, but the change would not be very localized and I fear it might break other things. Also, the fact that ast_queue_frame locks the channel in the function called ast_softhangup_nolock puzzles me, as I briefly discussed with corydon76 and kpfleming on IRC).

I have checked that the latest SVN version still has the same code.
Comments:By: Jeff Peeler (jpeeler) 2010-05-03 15:52:51

Turns out queueing a frame is unnecessary which negates the need to lock the channel. The attached patch should fix the problem.

By: Laurent Steffan (lmsteffan) 2010-05-05 09:11:57

Thanks. I'll try that tomorrow.

By: Digium Subversion (svnbot) 2010-05-07 10:33:53

Repository: asterisk
Revision: 261866

U   trunk/channels/sig_pri.c

r261866 | jpeeler | 2010-05-07 10:33:52 -0500 (Fri, 07 May 2010) | 14 lines

Fix deadlock in sig_pri when hanging up.

The pri_dchannel thread currently violates locking order by locking the private
and then attempting to queue a frame, which needs to lock the channel. Queueing
a frame is unneccesary though and is actually a regression since sig_pri.
All the places that currently use ast_softhangup_nolock now will just set the
softhangup value directly as before.

(closes issue ASTERISK-15987)
Reported by: lmsteffan
     bug17216.patch uploaded by jpeeler (license 325)