Summary: ASTERISK-15757: Deadlocks with ~2k MGCP users
Reporter: Adrien L. (adrien)
Labels:
Date Opened: 2010-03-05 14:30:52.000-0600
Date Closed: 2011-07-26 15:29:51
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Channels/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) 100312_backtrace.txt
( 1) 100312_locks.txt
( 2) 100803_backtrace.txt
Description: Asterisk froze when I was running 1.2.35, apparently because of 'Avoided deadlocks'.

So I moved to the 1.4.29.1 release, and the issue seems to be the same. One difference: I no longer get "Avoided deadlocks" at Warning level, only a lot of "Avoiding" messages during the day, and ONE of them (i.e. for one channel) repeated many times after Asterisk has frozen.

I can post some debug output, but not a backtrace yet, because I must first recompile with DEBUG_THREADS = #-DDUMP_SCHEDULER #-DDEBUG_SCHEDULER #-DDEBUG_THREADS #-DDETECT_DEADLOCKS


****** ADDITIONAL INFORMATION ******

[Mar  5 18:17:22] DEBUG[29114] channel.c: Avoiding deadlock for channel '0xb7687f88'
[Mar  5 18:17:22] DEBUG[29114] channel.c: Avoiding deadlock for channel '0xb7687f88'
[Mar  5 18:17:22] DEBUG[29114] channel.c: Avoiding deadlock for channel '0xb7687f88'

This line is repeated about 173k times in the full log files until I do "kill -9" and restart Asterisk.
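
To quantify a flood like this, the repeated DEBUG lines can be tallied per thread and channel pointer. The following is only a minimal Python sketch that assumes the exact log format quoted above; the names `AVOIDING_RE` and `count_avoiding` are illustrative and not part of Asterisk.

```python
import re
from collections import Counter

# Pattern for the DEBUG lines quoted above, e.g.
# [Mar  5 18:17:22] DEBUG[29114] channel.c: Avoiding deadlock for channel '0xb7687f88'
AVOIDING_RE = re.compile(
    r"^\[[^\]]+\]\s+DEBUG\[(?P<thread>\d+)\]\s+channel\.c:\s+"
    r"Avoiding deadlock for channel '(?P<channel>0x[0-9a-fA-F]+)'"
)


def count_avoiding(lines):
    """Count 'Avoiding deadlock' lines per (thread id, channel pointer)."""
    counts = Counter()
    for line in lines:
        match = AVOIDING_RE.match(line)
        if match:
            counts[(match.group("thread"), match.group("channel"))] += 1
    return counts


# Sample copied from the excerpt above. For a real run, pass an open file
# object instead; the log location depends on the local logger configuration.
sample = [
    "[Mar  5 18:17:22] DEBUG[29114] channel.c: Avoiding deadlock for channel '0xb7687f88'",
    "[Mar  5 18:17:22] DEBUG[29114] channel.c: Avoiding deadlock for channel '0xb7687f88'",
    "[Mar  5 18:17:22] DEBUG[29114] channel.c: Avoiding deadlock for channel '0xb7687f88'",
]

for (thread, channel), n in count_avoiding(sample).most_common():
    print(f"thread {thread} channel {channel}: {n} lines")
```

A single (thread, channel) pair dominating the count would match the observation that one channel keeps repeating after the freeze.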
Comments:

By: Adrien L. (adrien) 2010-03-08 04:59:19.000-0600

There are also a lot of "Maximum retries exceeded" warnings throughout the day.

Could they be a trigger of the problem?

[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84864 on [029447356583]
[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84865 on [028384207560]
[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84866 on [028384207560]
[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84867 on [028384207560]
[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84868 on [028384207560]
[Mar  6 13:28:29] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84881 on [029913852346]
[Mar  6 13:28:29] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84882 on [029913852346]
[Mar  6 13:28:29] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84889 on [029447356583]
[Mar  6 13:28:29] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84890 on [029447356583]
[Mar  6 13:28:30] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84885 on [028036623746]
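
To see whether the retries are spread over all endpoints or concentrated on a few, the warnings can be grouped by the endpoint name in square brackets. This is a rough Python sketch that assumes the warning format shown above; `RETRY_RE` and `retries_per_endpoint` are made-up names for illustration only.

```python
import re
from collections import Counter

# Pattern for the chan_mgcp.c warnings quoted above, e.g.
# ... WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84864 on [029447356583]
RETRY_RE = re.compile(
    r"WARNING\[\d+\]\s+chan_mgcp\.c:\s+Maximum retries exceeded "
    r"for transaction (?P<transaction>\d+) on \[(?P<endpoint>[^\]]+)\]"
)


def retries_per_endpoint(lines):
    """Count 'Maximum retries exceeded' warnings per endpoint name."""
    counts = Counter()
    for line in lines:
        match = RETRY_RE.search(line)
        if match:
            counts[match.group("endpoint")] += 1
    return counts


# A few lines copied from the excerpt above.
retry_sample = [
    "[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84864 on [029447356583]",
    "[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84865 on [028384207560]",
    "[Mar  6 13:28:28] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84866 on [028384207560]",
    "[Mar  6 13:28:29] WARNING[20863] chan_mgcp.c: Maximum retries exceeded for transaction 84881 on [029913852346]",
]

for endpoint, n in retries_per_endpoint(retry_sample).most_common():
    print(f"{endpoint}: {n} warnings")
```

If only a handful of endpoints account for most warnings, those devices may simply be unreachable rather than being a symptom of the freeze.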

By: Leif Madsen (lmadsen) 2010-03-08 10:10:29.000-0600

Thanks for the submission!  I'm putting this into feedback until you can provide some additional information (such as the backtrace you appear to be working on getting, etc...).

It may or may not also be useful to get some valgrind output, but since that will slow your system down significantly, I think the backtrace and 'core show locks' output will be the best use of time right now.

By: Adrien L. (adrien) 2010-03-08 11:04:07.000-0600

I reproduced the problem, though it looked a bit different from before.

I saw the service degrade slowly. Two of my CPEs that were not in calls lost service. They sent "RSIP" as part of the heartbeat procedure and got no answer from Asterisk. Calls that were already established were not cut off.

I tried to get a backtrace, but I'm not really familiar with GDB or with which information is pertinent. I have uploaded it to the case.

It looks like a freeze, but this time without the "Avoiding deadlock" flood. On the contrary, I have no deadlock messages from the beginning of the freeze up to the restart.

Moreover, I didn't need to do a kill. A "restart now" worked.

lmadsen: core show locks is not implemented in my CLI.

By: Leif Madsen (lmadsen) 2010-03-08 11:53:39.000-0600

adrien: You'll need to enable MALLOC_DEBUG from the Compiler Flags section of menuselect to get the 'core show locks' command.

More information about submitting backtraces is in the doc/backtrace.txt file of your Asterisk sources.

By: Adrien L. (adrien) 2010-03-08 12:00:56.000-0600

I need to rebuild my Asterisk... Is there nothing usable in the backtrace I posted this evening?

I will read backtrace.txt ASAP.

Thanks.

By: Leif Madsen (lmadsen) 2010-03-08 13:57:43.000-0600

No, it doesn't look like there is anything useful in it.

By: Adrien L. (adrien) 2010-03-09 03:01:14.000-0600

I cannot run menuselect because I don't have ncurses.

Can you confirm the line to put into the Makefile?

MALLOC_DEBUG = #-include $(PWD)/include/asterisk/astmm.h

Thanks.

By: Adrien L. (adrien) 2010-03-15 11:33:17

I have produced "core show locks" output and a new backtrace.

Are they useful for identifying the problem?

By: Leif Madsen (lmadsen) 2010-03-15 14:40:34

Not sure if it is useful, but I'll set this to Acknowledge so it can be reviewed by a developer.

By: ddkprog (ddkprog) 2010-06-07 15:01:35

>adrien
have you used 3-way transfer?

By: Nahuel Greco (nahuelgreco) 2010-06-10 14:24:37

adrien:

Have you tried the Domjan Attila version? It fixes many lock issues. Check it out from this SVN repository:

https://observer.router.hu/repos_pub/chan_mgcp/new_lock/

By: Leif Madsen (lmadsen) 2011-07-26 15:29:46.086-0500

Suspended due to lack of activity. Please request a bug marshal in #asterisk-bugs on the IRC network irc.freenode.net to reopen the issue should you have the additional information requested.  Further information can be found at http://www.asterisk.org/developers/bug-guidelines