Summary: ASTERISK-08971: [branch] manager seems to not flush it's eventq like it should
Reporter: BJ Weschke (bweschke)
Labels:
Date Opened: 2007-03-08 12:33:57.000-0600
Date Closed: 2007-09-18 15:40:35
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Core/ManagerInterface
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) ast_1_4_11_patch_ami_17_09_2007.diff
( 1) ast_1_4_11_patch_ami_linux_new.diff
( 2) ast_1_4_11_patch_ami_linux.diff
( 3) ast_1_4_11_patch_ami.diff
Description: I'm opening this bug to track an issue I've been having with the Asterisk 1.4 Manager Interface, and to manage a branch for it. It seems it's not flushing its eventq, which causes it to consume huge amounts of memory in a short period of time.
Comments:

By: jmls (jmls) 2007-03-09 04:44:20.000-0600

Is this a problem with connections that are command-only? (We use astmanproxy to connect, but only to send AMI commands, not to send events to the connected client.)

By: Olle Johansson (oej) 2007-05-15 15:08:33

Is this still an issue?

By: Joshua C. Colp (jcolp) 2007-06-06 19:37:39

I seem to remember you merging something to fix this, BJ... or maybe not... is this still an issue you are pursuing?

By: Konrad Rozycki (krdian) 2007-08-13 02:00:27

I have the same problem. The event list keeps growing and eating memory. Is there any way to flush the event queue without restarting Asterisk?



By: BJ Weschke (bweschke) 2007-08-13 05:53:10

krdian - I haven't seen this happen recently on an updated 1.4 branch version. Are you still seeing this with a recent version of 1.4?

By: Konrad Rozycki (krdian) 2007-08-13 06:14:25

I see this problem with version 1.4.9. I'll compile 1.4.10.1 and see if it still happens.

Update: the same problem occurs with 1.4.10.1. It appears when I have ringinuse enabled in the queue config.



By: BJ Weschke (bweschke) 2007-08-15 09:40:31

Ok. The ringinuse is an interesting point. Can you tell me ... What are all the different processes that are logging in to the manager? Do you have something that's logging in and staying logged in continuously listening for events or something else?

By: Konrad Rozycki (krdian) 2007-08-28 16:20:51

Yes, I have an application connected that uses asterisk-java for monitoring purposes. Do you think this may be blocking the eventq? What I can see is that while all members of the queue are busy and a couple of callers are waiting in the queue, Asterisk keeps hitting agents that are in the 'inuse' state, which is probably causing the issue. When I try ringinuse=no, a device state problem appears instead: some members get the 'inuse' state even though they are not in use.

By: Volnikov Ivan (ivan) 2007-09-04 07:32:14

This looks like a bug in the manager API implementation. It happens with a high degree of probability when one of the authorized AMI clients synchronously sends "Action: Originate". In that case the event queue keeps growing for as long as the "Originate" has not completed. During that time another AMI client can make a short-lived authorized connection and take a passed event without processing it. As a result, elements of the event queue are never released. I can reproduce this situation with a high degree of probability. I am using Asterisk 1.4.11.



By: Volnikov Ivan (ivan) 2007-09-05 03:32:55

I made some changes in manager.c (see ast_1_4_11_patch_ami.diff) to fix the problem. The fix consists of three parts:
- fix a race condition when an AMI client connects
- fix a race condition when an AMI client disconnects
- fix a copy-and-paste error in the comments for the "manager show eventq" command
My synthetic test, which emulates this issue, passes with the patch applied.



By: Volnikov Ivan (ivan) 2007-09-05 07:51:56

The file ast_1_4_11_patch_ami_linux.diff is the same as ast_1_4_11_patch_ami.diff.
I just removed the win32 editor symbols from the manager.c source file. They got in while I was debugging.

By: BJ Weschke (bweschke) 2007-09-06 04:41:38

Ivan - I'm going to put your code into an svn branch so that people who have had the problem recently can test with it. krdian: when this branch is available, please test it!

Ivan - Thank you for the research and patch!! Your conclusion sounds very plausible given the conditions you're describing.

By: Digium Subversion (svnbot) 2007-09-06 04:51:30

Repository: asterisk
Revision: 81645

------------------------------------------------------------------------
r81645 | bweschke | 2007-09-06 04:51:30 -0500 (Thu, 06 Sep 2007) | 3 lines

Creating a new branch for simplified testing for those that want to try the patch supplied by Ivan on issue ASTERISK-8971


------------------------------------------------------------------------

By: Andrey Solovyev (corruptor) 2007-09-06 04:54:26

I've patched my production Asterisk because I also have problems with the event queue (duplicate issue 0010439). I will have results in 1-2 weeks.

By: Digium Subversion (svnbot) 2007-09-06 04:59:09

Repository: asterisk
Revision: 81647

------------------------------------------------------------------------
r81647 | bweschke | 2007-09-06 04:59:07 -0500 (Thu, 06 Sep 2007) | 3 lines

Patch as supplied by Ivan for issue ASTERISK-8971


------------------------------------------------------------------------

By: BJ Weschke (bweschke) 2007-09-06 05:06:03

Ok, so everyone else that's having the problem, please check out the branch, which will stay in sync with the 1.4 branch, at

http://svn.digium.com/svn/asterisk/team/bweschke/patch-M9238/

By: Volnikov Ivan (ivan) 2007-09-06 05:28:55

I think this is the most correct patch for this issue (ast_1_4_11_patch_ami_linux_new.diff). I am really sorry for the inconvenience. This is not a patch on top of the previous patch; apply it to the real manager.c from version 1.4.11. The previous variant (ast_1_4_11_patch_ami_linux.diff) had a logical defect and cannot be built in developer mode. Also, in the diff file please change the paths 11/manager.c and 12/manager.c to main/manager.c.



By: Digium Subversion (svnbot) 2007-09-06 06:03:20

Repository: asterisk
Revision: 81648

------------------------------------------------------------------------
r81648 | bweschke | 2007-09-06 06:03:19 -0500 (Thu, 06 Sep 2007) | 3 lines

Revert, re-do on issue ASTERISK-8971.


------------------------------------------------------------------------

By: Volnikov Ivan (ivan) 2007-09-17 04:11:06

I have made a new update (ast_1_4_11_patch_ami_17_09_2007.diff; the patch must be applied to manager.c from release 1.4.11) which in my opinion should solve the problem completely. The most important change is that I removed the atomic access to the "num_sessions" counter. This counter is a property of the sessions object and must be modified inside AST_LIST_LOCK(&sessions) sections. I think that is the principal cause of the race conditions, which so far have occurred with high probability as the load on the processor increases. Nevertheless, all of this should be tested; I may not have considered everything in this code. The previous variant had a small vulnerability for the reasons given above.
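
To illustrate the locking discipline described in this comment, here is a minimal sketch of a session registry whose counter is only touched while the list lock is held. It is not taken from the attached patch or from manager.c: the names toy_registry, toy_session_add, toy_session_remove and toy_session_count are hypothetical, and a plain pthread mutex stands in for AST_LIST_LOCK(&sessions).

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names; a sketch of the locking discipline, not manager.c code. */
struct toy_session {
    int id;
    struct toy_session *next;
};

struct toy_registry {
    pthread_mutex_t lock;           /* guards BOTH the list and num_sessions */
    struct toy_session *sessions;
    int num_sessions;
};

static void toy_registry_init(struct toy_registry *r)
{
    pthread_mutex_init(&r->lock, NULL);
    r->sessions = NULL;
    r->num_sessions = 0;
}

/* Link a new session and bump the counter in one critical section.
 * Returns 0 on success, -1 on allocation failure. */
static int toy_session_add(struct toy_registry *r, int id)
{
    struct toy_session *s = calloc(1, sizeof(*s));
    if (!s)
        return -1;
    s->id = id;
    pthread_mutex_lock(&r->lock);
    s->next = r->sessions;
    r->sessions = s;
    r->num_sessions++;
    pthread_mutex_unlock(&r->lock);
    return 0;
}

/* Unlink a session and decrement the counter in one critical section.
 * Returns 0 if the session was found, -1 otherwise. */
static int toy_session_remove(struct toy_registry *r, int id)
{
    struct toy_session **pp, *found = NULL;
    int rc;

    pthread_mutex_lock(&r->lock);
    for (pp = &r->sessions; *pp; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            found = *pp;
            *pp = found->next;
            r->num_sessions--;
            break;
        }
    }
    pthread_mutex_unlock(&r->lock);

    rc = found ? 0 : -1;
    free(found);
    return rc;
}

/* Read the counter under the same lock, e.g. to stamp a new event's use count. */
static int toy_session_count(struct toy_registry *r)
{
    int n;
    pthread_mutex_lock(&r->lock);
    n = r->num_sessions;
    pthread_mutex_unlock(&r->lock);
    return n;
}
```

Because the list and the counter change inside the same critical section, a reader that takes the lock can never observe a count that disagrees with the list contents, which is the property the comment above argues was missing.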

By: Digium Subversion (svnbot) 2007-09-18 15:38:00

Repository: asterisk
Revision: 82867

------------------------------------------------------------------------
r82867 | russell | 2007-09-18 15:37:56 -0500 (Tue, 18 Sep 2007) | 10 lines

Fix a memory leak that can occur on systems under higher load.  The issue is
that when events are appended to the master event queue, they use the number
of active sessions as a use count so it will know when all active sessions
at the time the event happened have consumed it.  However, the handling of
the number of sessions was not properly synchronized, so the use count was
not always correct, causing an event to disappear early, or get stuck in
the event queue for forever.

(closes issue ASTERISK-8971, reported by bweschke, patch from Ivan, modified by me)

------------------------------------------------------------------------
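
The use-count scheme described in the commit message above can be sketched with a small single-threaded toy model. This is only an illustration under stated assumptions, not the code from r82867: toy_event, toy_queue, toy_append, toy_consume and toy_purge are hypothetical names, and the purge rule (free from the head, stop at the first event still in use) is an assumption made for the sketch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; a toy model, not code from manager.c. */
struct toy_event {
    int usecount;               /* sessions that still have to consume this event */
    struct toy_event *next;
};

struct toy_queue {
    struct toy_event *head;
    struct toy_event *tail;
    int length;
};

/* Append an event whose use count is the session count observed at append
 * time. Returns the new event, or NULL on allocation failure. */
static struct toy_event *toy_append(struct toy_queue *q, int observed_sessions)
{
    struct toy_event *ev = calloc(1, sizeof(*ev));
    if (!ev)
        return NULL;
    ev->usecount = observed_sessions;
    if (q->tail)
        q->tail->next = ev;
    else
        q->head = ev;
    q->tail = ev;
    q->length++;
    return ev;
}

/* One session reports that it has consumed the event. */
static void toy_consume(struct toy_event *ev)
{
    ev->usecount--;
}

/* Free events from the head while nobody needs them; stop at the first
 * event that is still in use. Returns the number of events freed. */
static int toy_purge(struct toy_queue *q)
{
    int freed = 0;
    while (q->head && q->head->usecount <= 0) {
        struct toy_event *ev = q->head;
        q->head = ev->next;
        if (!q->head)
            q->tail = NULL;
        free(ev);
        q->length--;
        freed++;
    }
    return freed;
}
```

In this model, appending an event with an observed count of 3 while only two sessions ever consume it leaves its use count at 1, so toy_purge() frees nothing and every later event queues up behind it. Appending with an observed count that is too low has the opposite effect: the event is freed before every real session has seen it. Both outcomes match the two symptoms named in the commit message.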

By: Digium Subversion (svnbot) 2007-09-18 15:40:35

Repository: asterisk
Revision: 82868

------------------------------------------------------------------------
r82868 | russell | 2007-09-18 15:40:34 -0500 (Tue, 18 Sep 2007) | 18 lines

Merged revisions 82867 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
r82867 | russell | 2007-09-18 15:56:43 -0500 (Tue, 18 Sep 2007) | 10 lines

Fix a memory leak that can occur on systems under higher load.  The issue is
that when events are appended to the master event queue, they use the number
of active sessions as a use count so it will know when all active sessions
at the time the event happened have consumed it.  However, the handling of
the number of sessions was not properly synchronized, so the use count was
not always correct, causing an event to disappear early, or get stuck in
the event queue for forever.

(closes issue ASTERISK-8971, reported by bweschke, patch from Ivan, modified by me)

........

------------------------------------------------------------------------