Summary: ASTERISK-02889: Previous locking improvements in chan_mgcp need to be revised (???)
Reporter: Andrey S Pankov (casper)
Labels:
Date Opened: 2004-11-25 12:56:02.000-0600
Date Closed: 2011-06-07 14:10:45
Versions:
Frequency of
Environment:
Attachments:
( 0) avoided_deadlock.log.txt
( 1) chan_mgcp.c.diff.txt
( 2) chan_mgcp.c.kram.diff.txt
( 3) chan_mgcp.c.v1-0.diff.txt
( 4) gdb.txt
Description: There is a scenario that leads to "channel.c:495 ast_channel_walk_locked: Avoided deadlock for 'MGCP...'".

Asterisk itself remains functional (running), but the MGCP devices do not (no dial tone, no packet exchange, etc.).


I cannot run Asterisk HEAD in production, which is why I backported the chan_mgcp changes from HEAD (bug ASTERISK-2840) to the current v1-0.

I have attached a log file from the Asterisk console (with 'iax2 debug' and 'mgcp debug' turned on).

Here is a reproducible call scenario:

*#1 - (asterisk #1 [IAX2 only] "centrex")
*#2 - (asterisk #2 [MGCP CA] "casper")
epA - (DLink DG104S [MGCP] "office")
epB - (cisco ATA [MGCP] "office2")

epB->MGCP/581343->*#2->IAX2/581343->*#1 (not very important part of the call flow, the only thing we need is to have an incoming call on *#2...)

*#1->IAX2/101->*#2 (here it is - incoming IAX2 call)

*#2->MGCP/101@epB (epB picks up, makes flash/attended transfer to MGCP/103@epA and hangs up...)

...and now the trick! MGCP/103@epA makes #/pound/unattended transfer back to MGCP/101@epB.

After that, all MGCP devices stop functioning and the "show channels" CLI command reports "Avoided deadlock..." for the MGCP channel.
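
For readers unfamiliar with this kind of message: an "Avoided deadlock" warning is typically produced by a try-lock-with-retry pattern, where a thread gives up on a busy lock after a bounded number of attempts instead of blocking forever. The following is only a minimal sketch of that general pattern using plain POSIX threads. It is not the Asterisk code; the function name `lock_with_retries`, the retry limit, the sleep interval, and the exact message format are all assumptions made for illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_RETRIES 10 /* hypothetical retry limit, chosen for illustration */

/*
 * Try to acquire `lock` without ever blocking indefinitely.
 * Returns 0 with the lock held, or -1 if the lock stayed busy
 * for all MAX_RETRIES attempts (the caller must then skip the object).
 */
int lock_with_retries(pthread_mutex_t *lock, const char *name)
{
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
        if (pthread_mutex_trylock(lock) == 0)
            return 0;      /* got the lock */
        usleep(1000);      /* back off briefly before retrying */
    }
    /* Illustrative message only; the real log format may differ. */
    fprintf(stderr, "Avoided deadlock for '%s', %d retries!\n",
            name, MAX_RETRIES);
    return -1;
}
```

If a channel's lock is never released by the thread that holds it, every attempt like this fails, which would be consistent with the warning appearing each time "show channels" walks the channel list.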
Comments:

By: Mark Spencer (markster) 2004-11-25 13:08:51.000-0600

I need to see the results of this running on CVS head, not on your backport to CVS stable.  I don't even know what the code looks like or what snapshot of CVS head mgcp you used for the port.

Also I'll need a "thread apply all bt" from when it's in its bad state.

By: Andrey S Pankov (casper) 2004-11-28 10:13:52.000-0600

The same "always" reproducibility for CVS HEAD 2004-11-28.
Please see gdb.txt attached for "thread apply all bt full".

By: Mark Spencer (markster) 2004-11-28 16:38:12.000-0600

You need to get this in a state where you can duplicate it and find me on IRC.  To make things easier, please be sure you have compiled with "make clean ; make valgrind ; make install" so I can get all the details I need out of gdb.  Thanks.

By: Olle Johansson (oej) 2004-12-13 01:45:06.000-0600

Any updates, any communication over IRC?


By: Andrey S Pankov (casper) 2004-12-16 09:10:18.000-0600

I don't have internet access for now, sorry... There were many attempts to communicate with markster, but no solution was found. As markster said, someone should compare the chan_zap and chan_mgcp sources to find out what's wrong with the locking in chan_mgcp. This task is too hard for me... Any volunteers?

By: twisted (twisted) 2004-12-28 11:04:35.000-0600

Isn't this related to ASTERISK-2840?

By: Andrey S Pankov (casper) 2004-12-29 09:28:58.000-0600

twisted: No. This bug is a real one, 100% reproducible.

ASTERISK-2840 is just a backport of changes made in head to stable.

By: Andrey S Pankov (casper) 2004-12-29 09:36:24.000-0600

The chan_mgcp.c.kram.diff.txt file contains changes made by kram on my system. Unfortunately, they don't fix the bug.

By: Mark Spencer (markster) 2005-01-09 03:20:39.000-0600

I won't be able to fix this without access to a machine which exhibits this problem.

By: Andrey S Pankov (casper) 2005-01-10 07:33:10.000-0600

Patience please... :) Today is the first working day this year!

By: Olle Johansson (oej) 2005-01-30 12:31:41.000-0600

Casper: We have had a few working days now. Gentle reminder to move this bug report forward!

By: Andrey S Pankov (casper) 2005-02-09 12:47:34.000-0600

Still no luck getting MGCP hardware... Sorry for the delay.

By: Mark Spencer (markster) 2005-02-27 18:47:11.000-0600

When you get hardware and are ready to work on it, we can reopen this.