Summary: | ASTERISK-28609: Memory Leak in res_rtp_asterisk.c | ||
Reporter: | Ted G (tgwaste) | Labels: | fax patch |
Date Opened: | 2019-11-05 22:24:08.000-0600 | Date Closed: | 2019-12-18 08:47:54.000-0600 |
Priority: | Minor | Regression? | |
Status: | Closed/Complete | Components: | General |
Versions: | 16.3.0 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) ASTERISK-28609-1.patch ( 1) frame-malloc-debug.patch ( 2) leak.txt ( 3) mem-20191216.txt | |
Description: | Hello,
There appears to be a memory leak in frame.c. Sometimes I received these errors in my logs: [Nov 5 21:14:49] WARNING[14918][C-00028a27]: file.c:258 ast_writestream: Translated frame write failed I believe this happenes because of some issues between my carrier and the asterisk system and most of the time these errors are not frequent. However sometimes these errors are produced thousands of times per day and when that happens the memory usage for asterisk shoots up and never recovers: ... 1073986 bytes in 5455 allocations in file xmldoc.c 1327380 bytes in 1819 allocations in file chan_sip.c 1351770 bytes in 840 allocations in file res_rtp_asterisk.c 1707720 bytes in 105 allocations in file smoother.c 2409853 bytes in 255 allocations in file file.c 2988480 bytes in 163 allocations in file bridge_softmix.c 7310456 bytes in 633 allocations in file translate.c 7434868 bytes in 2773 allocations in file confbridge/conf_config_parser.c 899355372 bytes ( 899355372 cache) in 2046261 allocations in file frame.c <---- 933339512 bytes allocated (899355372 in caches) in 2102261 selected allocations 933339512 bytes in all allocations 934016850 bytes in all allocations and deferred free allocations axis/root# top -b -n 1 | grep asterisk 19727 root 20 0 3353860 1.314g 8256 S 111.8 2.1 16620:19 asterisk 1.3GB :( System uptime: 1 week, 3 days, 23 hours, 34 minutes, 42 seconds 1 week, 3 days worth of frame write failed error counts: mmdd errors 1027 2218 1028 1426 1029 5275 1030 2454 1031 1612 1101 2797 1102 1635 1103 3278 1104 743 1105 1117 | ||
Comments: | By: Asterisk Team (asteriskteam) 2019-11-05 22:24:09.137-0600 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur. By: Joshua C. Colp (jcolp) 2019-11-06 05:03:04.498-0600 What codecs are in use? What type of file is it writing to? Can you provide call flow and console output? By: Ted G (tgwaste) 2019-11-06 11:11:51.717-0600 What codecs are in use? I'm only using ulaw. What type of file is it writing to? I don't quite understand the question. Its writing to a stream afaik: file.c:258 ast_writestream: Translated frame write failed Can you provide call flow and console output? Again, I don't understand the question. The system is enormous. To provide a complete map of call flow would be weeks of effort. The basics are: Voip Phone company -> My Asterisk System -> Do Stuff -> Hang up By: Joshua C. Colp (jcolp) 2019-11-06 11:17:05.330-0600 The message is being output as a result of an audio frame being given to file.c and a format module. This is used to record audio into a file. Knowing what file is being recorded to (the extension) can help to understand precisely what is going on. The message itself is only one part and provides a hint that can then be used to identify the full scenario of what is going on. As it is there isn't enough information to be able to understand that. By: Ted G (tgwaste) 2019-11-06 12:22:46.817-0600 Ah. The file extension is always .wav Input File : 'test-rec.wav' Channels : 1 Sample Rate : 8000 Precision : 16-bit Duration : 00:04:29.55 = 2156392 samples ~ 20216.2 CDDA sectors File Size : 4.31M Bit Rate : 128k Sample Encoding: 16-bit Signed Integer PCM By: Joshua C. Colp (jcolp) 2019-11-07 07:45:51.612-0600 How is the recording being done? MixMonitor? Record? I've been looking through the code and experimenting, and haven't yet been able to reproduce the frame leak given the information. By: Ted G (tgwaste) 2019-11-07 08:18:21.361-0600 I do both types of recording, could be either. The system has a lot of traffic. I am not able to reproduce this at will. It happens over time (~week) By: Corey Farrell (coreyfarrell) 2019-11-07 09:01:52.687-0600 If you can retest with the attached patch it will give better debugging information for frame allocations. Specifically MALLOC_DEBUG will report allocations as coming from the caller to ast_frdup / ast_frisolate instead of reporting frame.c. So for example if MixMonitor were leaking frames you would see allocations for app_mixmonitor.c increasing. Note this disables allocation caching in frame.c so it will likely have a performance impact but since you are already MALLOC_DEBUG I do not expect this to be major. By: Ted G (tgwaste) 2019-11-07 09:55:51.147-0600 Thank you! Will test as soon as I can. I haven't noticed any performance issues with MALLOC_DEBUG so I just keep it on. By: Ted G (tgwaste) 2019-11-08 12:09:04.810-0600 First I apologize as I know this is an amateur hour question but its probably been at least a decade since I applied a patch and there are multiple ways I've come across in my searching for how to do it. Can you tell me the proper way? Im getting the following in my attempt: /usr/src/asterisk-16.3.0# patch include/asterisk/frame.h frame-malloc-debug.patch patching file include/asterisk/frame.h patching file include/asterisk/frame.h Hunk #1 FAILED at 43. Hunk #2 FAILED at 185. Hunk #3 FAILED at 194. Hunk #4 FAILED at 205. Hunk #5 FAILED at 291. Hunk #6 FAILED at 313. Hunk #7 FAILED at 736. 7 out of 7 hunks FAILED -- saving rejects to file include/asterisk/frame.h.rej /usr/src/asterisk-16.3.0# diff include/asterisk/frame.h ../asterisk-16.3.0.bak/include/asterisk/frame.h 563,564c563 < #define ast_frisolate(fr) __ast_frisolate(fr, __FILE__, __LINE__, __PRETTY_FUNCTION__) < struct ast_frame *__ast_frisolate(struct ast_frame *fr, const char *file, int line, const char *func); --- > struct ast_frame *ast_frisolate(struct ast_frame *fr); 571,572c570 < #define ast_frdup(fr) __ast_frdup(fr, __FILE__, __LINE__, __PRETTY_FUNCTION__) < struct ast_frame *__ast_frdup(const struct ast_frame *fr, const char *file, int line, const char *func); --- > struct ast_frame *ast_frdup(const struct ast_frame *fr); Is that patched or not? :) By: Sean Bright (seanbright) 2019-11-08 12:12:03.488-0600 {noformat} cd /usr/src/asterisk-16.3.0 patch -p1 < /path/to/frame-malloc-debug.patch {noformat} By: Ted G (tgwaste) 2019-11-08 12:20:15.293-0600 Thank you Sean. All good. By: Ted G (tgwaste) 2019-11-08 20:57:37.532-0600 So far since the patch I've received 230 errors but I'm not seeing anything different in memory show summary or console output {code} # asterisk -x 'memory show summary' | grep fr 9520 bytes in 170 allocations in file framehook.c 679167 bytes in deferred free large allocations 22601 bytes in deferred free small allocations 701768 bytes in deferred free allocations 43627447 bytes in all allocations and deferred free allocations {code} [Nov 8 19:53:47] WARNING[13303][C-00001291]: file.c:258 ast_writestream: Translated frame write failed Memory hasn't gone crazy yet but I am unsure what to be on the lookout for exactly. By: Corey Farrell (coreyfarrell) 2019-11-10 19:18:18.845-0600 With my patch you will not see allocations under frame.c anymore, they will appear elsewhere. By: Sean Bright (seanbright) 2019-11-12 08:37:10.479-0600 [~tgwaste], in your {{MixMonitor}} usage - are you using the {{r}} and/or {{t}} options along with the newer {{S}} option? By: Ted G (tgwaste) 2019-11-12 17:18:54.395-0600 The only 2 flags I use for MixMonitor are 'a' and 'b' IE: EXEC MIXMONITOR /path/to/file,ab By: Joshua C. Colp (jcolp) 2019-11-13 13:04:14.578-0600 Per the comment from [~coreyfarrell] the allocations will appear elsewhere. Can you please attach the full output? By: Sean Bright (seanbright) 2019-11-20 10:10:35.954-0600 [~tgwaste], can you attach the output from {{memory show summary}} (don't filter it in any way). By: Corey Farrell (coreyfarrell) 2019-11-23 13:37:37.734-0600 {code} 27446120 bytes in 132721 allocations in file res_rtp_asterisk.c {code} How many channels were active when you retrieved this information? Please post the complete output of asterisk CLI {{memory show allocations res_rtp_asterisk.c}} and tell us how many channels were active at the time of retrieval. By: Ted G (tgwaste) 2019-11-23 13:50:23.256-0600 Was really slow pasting this.. I am unable to delete my duplicate messages. Both messages resulted in an error from this site. By: Richard Mudgett (rmudgett) 2019-11-23 14:10:02.803-0600 Please *attach* large blocks of text information as *files* to the issue instead of pasting them into a comment. Your comment can then refer to points in the attached file. By: Corey Farrell (coreyfarrell) 2019-11-23 14:52:26.188-0600 Sorry partly my fault I didn't specify. For some reason I had expected the CLI command to group the output. I've processed it with {{sort|uniq -c}}, attached as leak.txt. It looks like {{ast_rtp_read}} has some paths where the {{frames}} list is non-empty but does not get returned or freed. By: Benjamin Keith Ford (bford) 2019-11-25 09:43:26.096-0600 Looking at {{ast_rtp_read}}, it looks like Corey is right. I'm going to open an issue for an audit of the function. By: George Joseph (gjoseph) 2019-12-05 15:08:04.438-0600 Ted, Can you upgrade to the latest version of 16 (16.6.2)? I have a patch for you to try but I don't want to backport it to 16.3. Apply the patch with {noformat} patch -p1 < ASTERISK-28609-1.patch {noformat} and rebuild. THIS IS A TEST PATCH, not a permanent one. Give it a shot and let me know how it works. By: Ted G (tgwaste) 2019-12-06 00:54:14.253-0600 Thanks George, I will try to get to this as soon as possible and report back. By: George Joseph (gjoseph) 2019-12-09 10:29:27.570-0600 Any news? By: Ted G (tgwaste) 2019-12-09 11:20:06.003-0600 I'm applying the patch now. I'll let it run a week(?) and then provide the same data Corey requested. By: Ted G (tgwaste) 2019-12-10 08:03:19.463-0600 To clarify: Should I have installed the other patch as well? frame-malloc-debug.patch? By: George Joseph (gjoseph) 2019-12-10 08:40:11.510-0600 No. I'd prefer that you only apply the fix patch. By: George Joseph (gjoseph) 2019-12-12 09:16:22.304-0600 Hi [~tgwaste], Any improvement? We'd like to get this closed out. :) By: Ted G (tgwaste) 2019-12-16 23:03:55.634-0600 Included is the output of {code}asterisk -x 'memory show summary'{code} for 1 full week. >> mem-20191216.txt By: George Joseph (gjoseph) 2019-12-17 06:23:34.229-0600 Well, that's certainly better than your original report of {noformat} 899355372 bytes ( 899355372 cache) in 2046261 allocations in file frame.c <---- {noformat} Do you agree that the patch provided solves your issue? By: Ted G (tgwaste) 2019-12-17 16:06:53.297-0600 Well, as I originally stated this leak happens randomly. Currently the memory usage is not crazy. It can happen tomorrow or in a month there is no way of knowing. Having said that, this doesn't quite seem right but it could be my lack of knowledge: {code} # asterisk -x 'core show channels concise' | grep SIP | wc -l 86 # asterisk -x 'memory show allocations res_rtp_asterisk.c' | wc -l 703 {code} By: Sean Bright (seanbright) 2019-12-17 16:19:07.851-0600 [~tgwaste], what about that output doesn't seem right to you? By: Ted G (tgwaste) 2019-12-17 16:29:04.068-0600 I thought it should closely match the number of channels but I guess that's not right? I guess I miss-understood what Corey was getting at when he asked for the same information and asked: "How many channels were active when you retrieved this information? Please post the complete output of asterisk CLI memory show allocations res_rtp_asterisk.c and tell us how many channels were active at the time of retrieval." If everyone thinks the patch is the answer, then I'm OK to close the issue out. If memory goes crazy again I can reopen. Will the patch be in the next version? By: Sean Bright (seanbright) 2019-12-17 16:34:14.811-0600 The number of allocations in {{res_rtp_asterisk}} is some function of the number of channels, but I would not expect them to be the same number. 8 allocations in {{res_rtp_asterisk}} per SIP channel (703 / 86) does not seem unreasonable to me. Assuming that ratio remains constant, then I don't think there is anything to be concerned about. By: George Joseph (gjoseph) 2019-12-17 16:34:55.264-0600 Yeah, there's only an approximate relationship. We just wanted to get an idea of how busy the system was. The patch does plug a few holes in the exact place shown in the debug information so I'm fairly confident that you won't see any increase over time but yeah feel free to re-open the issue if you do see them. The patch was just officially too late for the next release but we're going to have to do another set of release candidates anyway so I'll see if we can get this in. By: Friendly Automation (friendly-automation) 2019-12-18 08:47:56.072-0600 Change 13466 merged by Joshua C. Colp: res_rtp_asterisk: Add frame list cleanups to ast_rtp_read [https://gerrit.asterisk.org/c/asterisk/+/13466|https://gerrit.asterisk.org/c/asterisk/+/13466] By: Friendly Automation (friendly-automation) 2019-12-18 08:57:37.290-0600 Change 13393 merged by George Joseph: res_rtp_asterisk: Add frame list cleanups to ast_rtp_read [https://gerrit.asterisk.org/c/asterisk/+/13393|https://gerrit.asterisk.org/c/asterisk/+/13393] By: Friendly Automation (friendly-automation) 2019-12-18 09:01:33.379-0600 Change 13392 merged by George Joseph: res_rtp_asterisk: Add frame list cleanups to ast_rtp_read [https://gerrit.asterisk.org/c/asterisk/+/13392|https://gerrit.asterisk.org/c/asterisk/+/13392] By: Friendly Automation (friendly-automation) 2019-12-18 09:02:45.561-0600 Change 13483 merged by Friendly Automation: res_rtp_asterisk: Add frame list cleanups to ast_rtp_read [https://gerrit.asterisk.org/c/asterisk/+/13483|https://gerrit.asterisk.org/c/asterisk/+/13483] By: Friendly Automation (friendly-automation) 2019-12-18 09:03:31.215-0600 Change 13484 merged by George Joseph: res_rtp_asterisk: Add frame list cleanups to ast_rtp_read [https://gerrit.asterisk.org/c/asterisk/+/13484|https://gerrit.asterisk.org/c/asterisk/+/13484] By: Friendly Automation (friendly-automation) 2019-12-18 09:04:55.082-0600 Change 13412 merged by George Joseph: res_rtp_asterisk: Add frame list cleanups to ast_rtp_read [https://gerrit.asterisk.org/c/asterisk/+/13412|https://gerrit.asterisk.org/c/asterisk/+/13412] |