Summary: | ASTERISK-05914: [patch] The ast_expr2 facility for $[ ] evaluation leaks memory | ||
Reporter: | Steve Murphy (murf) | Labels: | |
Date Opened: | 2005-12-28 09:19:04.000-0600 | Date Closed: | 2006-02-06 00:43:24.000-0600 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Core/Configuration |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) 20060116__bug6072.diff.txt ( 1) 6072.patch.v2 ( 2) 6072.patch.v4 ( 3) extensions.conf | |
Description: | Sorry, the ast_expr2 facility leaks memory. I've attached the patch. ****** STEPS TO REPRODUCE ****** execute a dialplan with $[ ] expressions in them. Do this a hundred million times. FEEL the bloat! ****** ADDITIONAL INFORMATION ****** The patch fixes all known/possible memory leaks. It also includes two small enhancements: 1. Two functions to register/clear and additional phrase to add to syntax errors. This would allow the app calling the $[] parser to insert some information of its own, like file and line number info, for example. 2. The functions above will also suppress error messages from the operator eval funcs, when there are unevaluated variable references present in the input expression. This new capability will allow the $[] parser to be used by syntax checkers to generate syntax errors in $[] expressions. Unless they are called, they have absolutely no affect on asterisk at runtime. | ||
Comments: | By: Steve Murphy (murf) 2005-12-28 09:23:05.000-0600 Applying this patch to the SVN head is a prerequisite to bug ASTERISK-5966021. By: Steve Murphy (murf) 2006-01-03 23:25:30.000-0600 OK, the changes that were introduced with svn head r 7771 (and previous), looked a lot like someone took heavy editorial license with my previous patch, and loused it up in the doing. But, further analysis seems to indicate that the changes submitted for ast_expr2.fl were merely serendipitous. A few of the memory leaks fixed in my previous patch were thereby plugged, but most were not touched. So, I'm uploading another patch, against the latest (7771) svn head. Please, implement this quick, it's a headache to keep up with you guys! You may as well mangle fixed code as not, so apply this and we should both breathe easier. I didn't mention above that a few small additions were added to make the parser skip over the contents of ${ } expressions that were not evaluated out of the input, as would happen if we use this parser to syntax check expressions at compile time. Until you apply this patch, you've got a leaky $[] expression evaluator! Also, the .c files in the patch were generated from bison 2.0, and the 2.5.31 version of flex. You can roll your own versions if you want to. By: Steve Murphy (murf) 2006-01-07 23:07:57.000-0600 Hi! Via testing, found one more hole to patch. If an unevaluated ${} expression is in the input, and at the end of the input, the whole token could be lost. I added an extra flex rule to take care of this situation. 6072.patch.v2 is now the latest and greatest patch. All preceding patches can be deleted. And, BTW, this is the CHILD of 6021, not the parent... if that makes any difference to anyone! By: Steve Murphy (murf) 2006-01-08 21:54:55.000-0600 I've attached the v3 patch. I always felt that the syntax error messages, which were generating internal token names, like TOK_COLON, etc, would be confusing to users, so I've upgraded the parser to generate syntax error messages with ':' instead of TOK_COLON, which should be much more natural for users to understand. I've upgraded AEL2 to do the same thing, so they are uniform in their behavior and messages. By: Steve Murphy (murf) 2006-01-16 12:09:10.000-0600 Uploaded the v4 patch. Now, the AND operator can be represented both with '&' and '&&'. This will give the C/java/C++/etc. hackers the equivalent of their logical operator. Same done for OR ('|' and '||' both mean the same thing). Same done for equality ('=' and '==' mean the same thing). Adjusted the parser to return a null string if that what it got to parse. This will provide some transparency for calls like DB(). By: Tilghman Lesher (tilghman) 2006-01-16 13:15:41.000-0600 To be clear, are these memory leaks only present in trunk, or are they present in 1.2, as well? By: Leif Madsen (lmadsen) 2006-01-16 13:40:27.000-0600 murf: can all patches prior to .v4 (01-16-2006) be removed (including the ast_expr diffs?) By: Steve Murphy (murf) 2006-01-16 13:49:47.000-0600 These leaks are in everything this patch isn't applied to. By: Steve Murphy (murf) 2006-01-16 13:51:59.000-0600 blitzrage-- Yes, I filed a bug against the mantis config, that prevents me, the bug creator, from removing attachments. If I could have, I would have removed all but the v4 patch myself. By: Tilghman Lesher (tilghman) 2006-01-16 14:04:46.000-0600 Please leave v2, as it seems to be the only patch that contains only the fixes to memory leaks, and no new features. That is the only patch that can be applied to 1.2. By: Leif Madsen (lmadsen) 2006-01-16 14:05:34.000-0600 Deleted all patches but v4 for you (I asked because I'm a bug marshall). Just wanted to do some cleaning up for you. By: Tilghman Lesher (tilghman) 2006-01-16 14:11:35.000-0600 murf: please re-upload v2, the patch that contained only memory leak fixes. By: Steve Murphy (murf) 2006-01-16 14:27:51.000-0600 OK, there it be. If you apply that one, I'll pull all the other changes back into 6021, and no good things will be lost. By: Tilghman Lesher (tilghman) 2006-01-16 15:24:02.000-0600 I can't apply this patch as written (v2) to 1.2. It contains additional features that are unnecessary for the bugfix. By: Tilghman Lesher (tilghman) 2006-01-16 15:37:04.000-0600 Patch uploaded that contains what seems to be just the bugfixes. Please let me know if I've corrupted your patch in any way or if I've missed something. Once again, this should be JUST the bugfixes, no extra features. By: Steve Murphy (murf) 2006-01-16 21:17:30.000-0600 I really hate to do this to you... I downloaded the patch, and compared it to the v2 patch. It LOOKS good. And then I sat down and figured this: a. to be SURE the patch is good, I need to rebuild from scratch a new test case. Since only part of the changes I added are in the test, I have to rip things out of other parts, so there won't be undefined references in the link. b. I then instrument and test via purify on Solaris, the standalone, ... c. I'm pooped. I've got my hands full right now. All this stuff will take hours, which I don't have right now. So, I recommend this: one little error, and the whole patch is worthless. It's really touchy stuff, believe me. It's not worth playing with, no matter how careful you think you've been. Don't add it to 1.2. Let 1.2 leak away. 6021 contains this patch in its perfect entirety. Introduce these changes in whatever version AEL2 ends up in. Close this bug for now. If I can find the time and energy to weed out just the non-additions for 1.2, I'll reopen it and supply the weeded out patch. I'd never release until I test it out, it's far too dangerous, and I just don't have the time right now. Your patch looks perfect, but until you run it thru a bunch of tests with purify keeping track, don't merge it. By: Tilghman Lesher (tilghman) 2006-01-18 16:59:48.000-0600 murf: is the testsuite you ran your patch through available for contribution as a developer tool? By: Steve Murphy (murf) 2006-01-18 17:50:38.000-0600 Corydon76-- My test environment is this: on Solaris, because it's the only machine I have that has a Purify license. A special makefile, an update script to copy over any changed files from my svn dirs. I build the standalone parser under purify, which includes the full AEL2 parser. Testing of the ast_expr2 stuff is done "in situ", as part of testing AEL2. The set of expressions that get thrown at it is not exhaustive, but is a fair taste of what it needs to handle, including some regex stuff. Mostly the parser is exersized, but in that environment, I was able to find and patch a fair assortment of leaks. To go all out, I'd build the standalone version of just the expr2 parser, and run maybe a couple hundred different expressions thru it, all constants, the kind of stuff the run-time parser would expect to see. No, it's probably not a setup that very many others could use... but then, maybe some could, with some customizations for their environment vs. mine... At any rate, If I rip out the enhancements I made for AEL2, I won't be able to test just the bugfix version in the AEL2 config. So, I'll have to sit down and write the standalone test suite, and then redo the makefile to make the standalone ast_expr2 stuff, and instrument it under purify. When I get the time... I'll do it. Unless you want to? You got some energy? Want to write a file where each line is an ast_expr2 constant expression, of varying flavors and sizes? And a main to read in the file, and feed the exprs into the parser? And some sort of script to check against expected results, which could just be a diff between one run's output, and the current run's output? By: Steve Murphy (murf) 2006-01-19 11:47:55.000-0600 Corydon76-- I see this bug has been left hanging. Usually I note a wreckless abandon when it comes to closing a bug for ANY reason! Well, anyway, OK, I created a new SVN work area, asterisk/team/murf/bug_6072, and in there I applied your patch. I built the test environment, replicated it on Solaris, ran the beast under purify, blah, blah, yada-yada. And the memory freeing was OK as per your patch. But there were some other bugs, that are indeed bugs, that needed fixing, as the expressions in the test suite showed some problems. 1. The =~ operator is in the documentation, but has been removed from the expr2.fl file! Not nice! I put it back, and =~ operators are useable again now. My guess is someone removed it for a previous release, and then forget to add it back in. Nasty! 2. The ~ operator was missing, I think. I added it back in to jive with the docs. 3. THe '%' char was present in the TOKEN pattern, which caused it it be absorbed into a neighboring TOKEN if there was no space there. Nasty. Removed this bug. 4. You excluded a pattern in the grammar that would have returned a blank string if that's what comes in. I didn't add it back in, but... it really should go back in even for 1.2, because this naturally occurring event can happen in something common like ${DB(xxx/yyy)}, which will return an empty string if match in the DB is found. This blank string, right now, returns syntax errors in 1.2 as it is, but it merely SHOULD return an empty string as the result, with no syntax error (and therefore, leaked memory). If you want, I can add it back into my svn work area, or you can. Or you can decide to leave it out. Up to you. By: Tilghman Lesher (tilghman) 2006-01-19 12:31:24.000-0600 Okay, the patch which I prepared and you tested has been committed to 1.2 and merged to trunk. So the memory leaks (which were my primary concern) are now fixed. Now, if you want to prepare an additional patch to 1.2, to cover remaining issues, I have the following responses to your suggestions: 1) This may be fine for 1.2, but given that it doesn't appear in any 1.2 up till now, we may wish to simply remove it from the documentation. Drumkilla needs to be consulted for this change. 2) The operator ~ is not documented for 1.2, and we therefore can't allow it. 3) That fix can safely go into 1.2 as a real bugfix. 4) This is a change in behavior. I don't think it's appropriate for 1.2. Once we have all issues in 1.2 identified, committed, and merged to trunk, we can then consider the remaining patches for trunk alone. By: Steve Murphy (murf) 2006-01-19 13:19:05.000-0600 Corydon76-- My mistake! I confused the '~', with the '~' logical complement operator. I fixed the prob in my tree. You were right for #2, the '~' was never documented. My mistake, it was meant to be '!', which IS documented. I don't know how these operators were left out. I think sloppiness. In some previous release they were considered new stuff, and hacked out, but were not completely hacked back in the current 1.2. All I know is that I would never have knowingly removed them. Removing document features that had bugs in the implementation isn't always the best policy! murf By: Tilghman Lesher (tilghman) 2006-01-19 14:01:24.000-0600 '!' isn't documented in 1.0, either. I'm going to have to enforce policy against that for 1.2, as well. Again, if you can persuade either Russell (drumkilla) or Kenny, the 1.2 maintainers, to make an exception, it's possible that it can still go in to 1.2, but I cannot make such an exception to the policy. Just to be clear, this policy is not against these going into trunk -- they can and will -- only that we are forbidden from adding new features into 1.2. By: Steve Murphy (murf) 2006-01-19 14:49:21.000-0600 If you mean, that 1.0 didn't have '!', and you won't be able to fold this into 1.0, fine. My memory is hazy on exactly when I submitted the upgraded expr2 stuff. But I really don't care if it isn't in that release. 1.2 is a different matter, but rules are rules. I'll be happy to see some (future) version of asterisk with the complete features. I've been tucking everything away into AEL2, as these expr2 fixes will be necc. in that new regime. By: Steve Murphy (murf) 2006-01-19 14:51:44.000-0600 Well, wait a second... You say: '!' isn't documented in 1.0, either. I'm going to have to enforce policy against that for 1.2, as well. What exactly do you mean by this, because '!' IS DOCUMENTED in README.variables in 1.2. I don't quite understand the linkage to 1.0, I guess. By: Tilghman Lesher (tilghman) 2006-01-19 16:16:46.000-0600 By that, I meant that because the '!' operator never appeared in any release, even though it appears in the documentation, adding it now would constitute a feature add. Right now, it's at best a documentation error. If it had appeared in a release, then a case could be made that its removal should never have happened and its functionality should be restored. As it is, I think this might have appeared in trunk at one time, but it never made it into a release. Again, this is just about applying policy of no new features in 1.2. By: Steve Murphy (murf) 2006-01-25 09:58:48.000-0600 Hmmm, looks like this bug is still here. Looking over the last post from Corydon76, I see you might be expecting me to say something about making a patch for 1.2 of the remaining issues. It's up to you guys. As far as I'm concerned, 1.2 can remain as it is. If you'd like to have the features I've written added to 1.2, let me know, I'll draft a patch. If you aren't interested, feel free to close the bug. The new stuff is a candidate for 1.3, or whatever, I guess, and hopefully will end up there. If people aren't/having been using the missing features, or even just complaining about their lack, then obviously, they aren't missing them in 1.2. Why stir up trouble and waste time on a patch will be rejected? By: Tilghman Lesher (tilghman) 2006-01-25 10:06:10.000-0600 murf: see ASTERISK-6074. Apparently there is still a memory leak in the expression parsing engine. By: nywiley (nywiley) 2006-01-26 10:03:57.000-0600 Hello. I think the expr2 modules are still leaking memory. Here is a dump of my memory after the system has been running for a couple of hours. The ast_expr modules just keep allocating memory and not releasing it. Let me know what additional information you need. This is with version 1.2.3 tarball release -with memory debugging turned on and dont-optimize Best Regards, Bill 1 bytes in 1 allocations in file 'res_features.c' 680 bytes in 1 allocations in file 'chan_agent.c' 16 bytes in 1 allocations in file 'res_agi.c' 24 bytes in 1 allocations in file 'devicestate.c' 11072 bytes in 1 allocations in file 'localtime.c' 5264 bytes in 7 allocations in file 'dsp.c' 20 bytes in 1 allocations in file 'cli.c' 164 bytes in 9 allocations in file 'app_dial.c' 1176 bytes in 3 allocations in file 'res_crypto.c' 3968 bytes in 16 allocations in file 'file.c' 16456 bytes in 8 allocations in file 'io.c' 80960 bytes in 5 allocations in file 'frame.c' 132488 bytes in 5 allocations in file 'res_musiconhold.c' 1250 bytes in 15 allocations in file 'config.c' 53180 bytes in 10 allocations in file 'rtp.c' 1992 bytes in 3 allocations in file 'format_wav.c' 1040 bytes in 2 allocations in file 'enum.c' 21372 bytes in 55 allocations in file 'app_queue.c' 108515 bytes in 32 allocations in file 'chan_zap.c' 2007 bytes in 28 allocations in file 'chan_local.c' 2768 bytes in 40 allocations in file 'manager.c' 183868 bytes in 99 allocations in file 'chan_sip.c' 10760 bytes in 13 allocations in file 'cdr.c' 25314 bytes in 49 allocations in file 'channel.c' 21966 bytes in 403 allocations in file 'logger.c' 8840 bytes in 151 allocations in file 'sched.c' 474 bytes in 48 allocations in file 'app_macro.c' 8544 bytes in 227 allocations in file 'chanvars.c' 36750 bytes in 34 allocations in file 'app_voicemail.c' 6480 bytes in 167 allocations in file 'asterisk.c' 40320 bytes in 126 allocations in file 'loader.c' 5302 bytes in 261 allocations in file 'res_indications.c' 16361 bytes in 999 allocations in file 'pbx_config.c' 135555 bytes in 1359 allocations in file 'pbx.c' 150656 bytes in 9416 allocations in file 'ast_expr2.y' 472416 bytes in 37624 allocations in file 'ast_expr2.fl' By: Steve Murphy (murf) 2006-01-26 10:38:19.000-0600 What exactly are the numbers saying here? Are they expressing the total amount of allocated memory, never freed, or are they just counting the allocated memory, and not subtracting the freed memory? You see, every time a $[...] expression is evaluated, the parser is called, and a series of memory allocations (calloc, etc) takes place. The memory is freed before the parser exits. So, to bump the allocation is a normal process. The fact that the numbers rise so quickly is testimony that you make heavy use of $[...] expressions! Or that you have one very busy daemon, lots of incoming/outgoing calls. If whatever is counting the memory allocations, also is decrementing the totals with the free() calls, then we need to investigate! I have been thinking about how to attack this, and now I have an idea... send me your dialplan (extensions.conf), I extract all your $[...] expressions, set up a test to run them millions of times, and study the allocation totals. Purify is good at finding the leaked memory when the pointers disappear, but doesn't tell you about leaks where the pointers still live... So, if indeed there are still mem leaks, I need to find where/why the pointers are preserved. One other source of leaks is/are syntax errors from the $[...] parser itself. Are you getting any error messages in your logs? You'd have to be getting a ton of them! Check around, and let me know. My email is murf at e-tools [dot] com. By: nywiley (nywiley) 2006-01-26 12:39:49.000-0600 The numbers show you the CURRENTLY allocated number of bytes per module and the CURRENTLY active number of allocations. As you free things up, the memory allocated would go down. I have a test box ready to go when you are ready to give me something to try. Best Regards, Bill By: Tilghman Lesher (tilghman) 2006-01-26 13:11:32.000-0600 Confirmed. The expression "$[ 1 + 1 ]" causes the following unfreed allocations: 12 bytes allocated in ast_yylex at line 68 of ast_expr2.fl 2 bytes allocated in ast_yylex at line 68 of ast_expr2.fl 12 bytes allocated in ast_yylex at line 84 of ast_expr2.fl 12 bytes allocated in ast_yylex at line 84 of ast_expr2.fl 12 bytes allocated in make_integer at line 235 of ast_expr2.y By: nywiley (nywiley) 2006-01-26 14:28:34.000-0600 /usr/bin/valgrind --leak-check=full --show-reachable=yes ./testexpr2 "$[ 1 + 1 ]" ==27035== Memcheck, a memory error detector. ==27035== Copyright (C) 2002-2005, and GNU GPL'd, by Julian Seward et al. ==27035== Using LibVEX rev 1471, a library for dynamic binary translation. ==27035== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP. ==27035== Using valgrind-3.1.0, a dynamic binary instrumentation framework. ==27035== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al. ==27035== For more details, rerun with: -v ==27035== =====2====== ==27035== ==27035== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 5 from 1) ==27035== malloc/free: in use at exit: 18 bytes in 2 blocks. ==27035== malloc/free: 7 allocs, 5 frees, 269 bytes allocated. ==27035== For counts of detected errors, rerun with: -v ==27035== searching for pointers to 2 not-freed blocks. ==27035== checked 63,400 bytes. ==27035== ==27035== 2 bytes in 1 blocks are indirectly lost in loss record 1 of 2 ==27035== at 0x4A18AFE: malloc (in /usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so) ==27035== by 0x4B8AD11: strdup (in /lib64/libc-2.3.6.so) ==27035== by 0x401EA3: ast_yylex (ast_expr2.fl:84) ==27035== by 0x404190: ast_yyparse (ast_expr2.c:1149) ==27035== by 0x403C1D: ast_expr (ast_expr2.fl:112) ==27035== by 0x40582A: main (ast_expr2.y:398) ==27035== ==27035== ==27035== 18 (16 direct, 2 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 2 ==27035== at 0x4A1A181: calloc (in /usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so) ==27035== by 0x401E7C: ast_yylex (ast_expr2.fl:84) ==27035== by 0x404190: ast_yyparse (ast_expr2.c:1149) ==27035== by 0x403C1D: ast_expr (ast_expr2.fl:112) ==27035== by 0x40582A: main (ast_expr2.y:398) ==27035== ==27035== LEAK SUMMARY: ==27035== definitely lost: 16 bytes in 1 blocks. ==27035== indirectly lost: 2 bytes in 1 blocks. ==27035== possibly lost: 0 bytes in 0 blocks. ==27035== still reachable: 0 bytes in 0 blocks. ==27035== suppressed: 0 bytes in 0 blocks. By: nywiley (nywiley) 2006-01-26 15:08:30.000-0600 FYI ... there is a memory loss bug in the 1.10 release as well ... I just tested it with valgrind and found memory loss. This must have been an issue for some time. By: Steve Murphy (murf) 2006-01-26 22:31:40.000-0600 One little note here! If you run testexpr2, make sure NOT to include the $[..] stuff. Instead of running testexpr2 "$[ 1+1 ]", run testexpr2 "1+1" because the expr2 parser will always be presented the expression, with the $[] stripped from it by asterisk. By: Tilghman Lesher (tilghman) 2006-02-05 11:14:14.000-0600 We seem to be waiting on an updated patch for trunk-only, as the memory leak portions have been committed and merged to trunk. By: Steve Murphy (murf) 2006-02-05 13:03:58.000-0600 Hadn't thought much on trunk. Was hoping that the AEL2 intro and the memory leak would be solved in one mod, rather than try to break them up. Also, since just the ast_expr2 stuff contains several enhancements along with the mem leak plug, the absolute most time consuming thing to do would be to require me to break out the mem leak plug from the enhancements, and file them as two separate issues. So what exactly do you require here? By: Tilghman Lesher (tilghman) 2006-02-05 19:46:44.000-0600 So, we can close this one, since the enhancements are in another patch, in another bug? Please understand that the patch included here will not apply to trunk, given that the part of it that implemented the memory fixes was already merged (from 1.2). By: Steve Murphy (murf) 2006-02-05 23:34:55.000-0600 OK, I understand. I checked trunk, and indeed, the patch is already applied to it. So, it does look like you could close this. Do we need to worry about the 1.10 release, as nywiley pointed out? Or is that ancient enough not to worry about? |