Summary: | ASTERISK-17964: Possible memory leak in chan_sip.c | ||
Reporter: | daren ferreira (daren) | Labels: | |
Date Opened: | 2011-06-04 06:30:32 | Date Closed: | 2011-07-26 09:47:45 |
Priority: | Major | Regression? | |
Status: | Closed/Complete | Components: | Channels/chan_sip/General |
Versions: | 1.6.2.18 1.8.4 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) dbg-sip-alloc.txt.gz ( 1) dbg-sip-alloc-1.8.4.2.txt ( 2) dbg-sip-sum.txt ( 3) dbg-sip-sum-1.8.4.2.txt ( 4) memory-2011-06-04_06:00:01 ( 5) memory-2011-06-04_07:00:01 ( 6) memory-2011-06-04_08:00:01 ( 7) memory-2011-06-04_09:00:01 ( 8) memory-2011-06-04_10:00:01 ( 9) memory-2011-06-04_11:00:01 (10) memory-2011-06-04_12:00:01 (11) memory-2011-06-04_13:00:01 | |
Description: | After several crashes I investigated and saw memory usage growing day after day. I updated from 1.6.2.15 to 1.6.2.18 and enabled memory allocation debugging, and I can now say that chan_sip takes around 5 more megabytes every hour. I'll try to attach "memory show summary" to this report... Please let me know if you require any more information. ****** STEPS TO REPRODUCE ****** Just wait, with or without calls. I have some register attempts from unknown peers... | ||
Comments: |

By: Walter Doekes (wdoekes) 2011-06-04 07:01:53
Please attach 'memory show allocations chan_sip.c' and 'memory show summary chan_sip.c'. That could provide valuable clues.

By: David Woolley (davidw) 2011-06-04 08:22:18
You also need to reproduce it on 1.8.4 as the 1.6 series is no longer supported for non-security issues.

By: daren ferreira (daren) 2011-06-06 04:46:11.135-0500
Thank you for your reply. With JIRA I found another request for a similar problem that may have been fixed (ASTERISK-17510), but other people seem to have the same problem, so I'll run tests on 1.8. I will attach 'memory show allocations chan_sip.c' and 'memory show summary chan_sip.c'.

By: daren ferreira (daren) 2011-06-06 04:49:03.178-0500
These files are the results of 'memory show summary chan_sip.c' and 'memory show allocations chan_sip.c'.

By: daren ferreira (daren) 2011-06-07 04:29:13.433-0500
I just upgraded to 1.8.4.2 and the leak is still there. You'll find 'memory show allocations chan_sip.c' and 'memory show summary chan_sip.c' attached to this request.

By: daren ferreira (daren) 2011-06-07 04:29:55.305-0500
Results of 'memory show allocations chan_sip.c' and 'memory show summary chan_sip.c'.

By: Walter Doekes (wdoekes) 2011-06-07 15:35:19.402-0500
Are you using realtime peers? And are they cached? Can you reduce your configuration to a bare minimum and still get the leaks? (And then post the config.) Is it REGISTER requests that cause the leaks? You can hammer your own server with a tool like sipp. https://code.osso.nl/projects/sipp/browser/scenario/register.xml

By: daren ferreira (daren) 2011-06-08 19:28:30.168-0500
I'm using realtime peers but they're not cached. I just ran tests with the scenario you gave me. It seems that memory usage grows continuously while sipp is running, but the memory seems to be freed when sipp is terminated...
with both correct and incorrect login and password. Unfortunately I can't reduce my configuration (production server). About SIP: I'm using realtime ODBC/SQL authentication and my sip.conf parameters are:
----------------------------
[general]
context=default
allowguest=no
bindport=5060
bindaddr=0.0.0.0
srvlookup=no
videosupport=no
disallow=all
allow=g722
allow=alaw
allow=ulaw
musicclass=default
language=fr
dtmfmode=rfc2833
alwaysauthreject=yes
allowexternalinvites=no
autodomain=no
vmexten=550
promiscredir=no
canreinvite=no
t38pt_udptl=yes
ignoreregexpire=yes
tos_sip=cs3
tos_audio=ef
tos_video=af41
rtsavesysname=yes
allowsubscribe=no
counteronpeer=yes
callcounter=yes
register => xxxx:yyyy@sip.voiptraffic.net
--------------------------
I can give you more information about my config; just let me know what you need and I'll tell you.

By: Walter Doekes (wdoekes) 2011-06-09 06:54:44.974-0500
Ok. I believe rtcachefriends=no by default. So uncached friends it is. Memory usage does increase in the first moments, but it should stabilize after a while (32 secs?). Certain objects are garbage collected only after some time because they may be needed for retransmits. But if you're saying the register hammering does not speed up the excess memory usage, then that's probably not it. Can you try removing the register=> line? Is there any qualify=yes going on? What happens when you remove that? Other leaky candidates might be alwaysauthreject, ignoreregexpire, t38pt_udptl or perhaps the callcounter. Does that counter work for your uncached friends?

By: daren ferreira (daren) 2011-06-09 17:51:54.906-0500
Thank you for your reply.
You're right, callcounter is no longer useful, so I disabled it, without any real change. I verified that qualify is set to no (anyway, if I remember correctly, it is not compatible with realtime when caching is disabled). I disabled t38pt_udptl: no change, memory is still growing. Unfortunately ignoreregexpire and alwaysauthreject can't be disabled: ignoreregexpire is mandatory because of users who forget to register regularly (or don't set up their devices properly), and alwaysauthreject is mandatory for security reasons.

I also wonder why "subversion" has closed this issue, because I can't find any mention of a patch for it. Is it an error related to the move from Mantis to JIRA?

One last idea: memory is growing even with no visible activity (neither registers nor calls), but with sip debug enabled I see a lot of activity, which seems to be related to NAT (to work around NAT problems, nat=yes is used in the sip peers). Do you think NAT could be the source of such leaks?

I looked at 'memory show allocations chan_sip.c' and made counts:

occurrences | chan_sip.c line
1477 | 26057
1477 | 26060
1477 | 26065
1 | 13547
12 | 11105
1 | 1057
1 | 29016
1 | 29017
1 | 29018
1 | 29019
1 | 29116
1 | 26795
1 | 26796
1 | 26797
1 | 26798
1 | 26799
66 | 7247
66 | 7250
66 | 7255
1 | 860
1 | 6690
1 | 6700
1 | 6831
1 | 6839
66 | 7228

The lines that keep growing are 26057, 26060 and 26065 (1477 occurrences each).

I had a look at these lines in chan_sip.c; they are all in the "else" part of "if (peer)", where peer is set by
-------
if (!realtime || ast_test_flag(&global_flags[1], SIP_PAGE2_RTCACHEFRIENDS)) {
    /* Note we do NOT use find_peer here, to avoid realtime recursion */
    /* We also use a case-sensitive comparison (unlike find_peer) so that case changes made to the peer name will be properly handled during reload */
    ast_copy_string(tmp_peer.name, name, sizeof(tmp_peer.name));
    peer = ao2_t_find(peers, &tmp_peer, OBJ_POINTER | OBJ_UNLINK, "find and unlink peer from peers table");
}
------
but because realtime is used without caching, this lookup is skipped, peer is never set, and a new peer is created again and again, indefinitely. It seems that peers already
connected are not recognized and so are recreated; I have no idea why... If I understand the code correctly, rtcachefriends would be a way to stop the problem... or a fix in the code to recognize already-created peers... I remember rtcachefriends being a problem (but I don't remember why), so I hesitate to reactivate it... I just ran tests with rtcachefriends enabled and it seems to stop the leak... Can somebody give me reasons to keep rtcachefriends activated? Or better, find a fix for chan_sip.c.

By: David Woolley (davidw) 2011-06-17 05:48:25.850-0500
It looks like this was closed by a bug in the repository automation/JIRA interface. The repository seems to have closed it as the result of committing a completely different issue.

By: daren ferreira (daren) 2011-06-17 06:54:24.528-0500
How can this issue be reopened? It's a real issue. Normally you shouldn't have to enable rtcachefriends to avoid a memory leak...

By: daren ferreira (daren) 2011-07-01 07:17:49.416-0500
Does somebody have an idea how to solve the problem? Is there any developer on this issue? Or maybe this is considered normal behaviour... though I doubt that rtcachefriends is mandatory when using realtime in order to avoid a memory leak...

By: Walter Doekes (wdoekes) 2011-07-01 18:20:01.493-0500
Ok. So you've concluded that setting rtcachefriends=yes solves the leak. That's valuable. I fixed that exact same problem in https://issues.asterisk.org/jira/browse/ASTERISK-17510, which was first rolled out in the 1.8 branch in 1.8.5-rc1 (look for "Don't link non-cached realtime peers" in the ChangeLog). So.. try that version and see if it helps.

By: daren ferreira (daren) 2011-07-28 09:42:08.839-0500
It seems to work with 1.8.5.0, but 1.8.5.0 has a new and worse problem, so I'll have to roll back and open a new issue. Thank you for your help! |
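
Note for anyone repeating the per-line counts made in this thread from 'memory show allocations chan_sip.c': the tally can be scripted rather than done by hand. The following is only a rough sketch, not part of Asterisk. It assumes each allocation line in the dump contains the text "at line <N> of <file>", which may differ between Asterisk versions, and the names `tally_allocations` and `growing_lines` are made up for this example. Comparing two dumps taken some time apart shows which source lines keep accumulating allocations.

```python
import re
import sys
from collections import Counter

# ASSUMPTION: each allocation line contains "... at line <N> of <file>".
# Adjust the pattern if your Asterisk version prints something different.
ALLOC_RE = re.compile(r"at line\s+(\d+)\s+of\s+(\S+)")


def tally_allocations(lines, source_file="chan_sip.c"):
    """Count outstanding allocations per source line of `source_file`."""
    counts = Counter()
    for text in lines:
        match = ALLOC_RE.search(text)
        if match and match.group(2) == source_file:
            counts[int(match.group(1))] += 1
    return counts


def growing_lines(before, after):
    """Return {line: increase} for lines with more allocations in `after`."""
    return {
        line: after[line] - before[line]
        for line in after
        if after[line] > before[line]
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python tally.py older_dump.txt newer_dump.txt
    with open(sys.argv[1]) as older, open(sys.argv[2]) as newer:
        delta = growing_lines(tally_allocations(older), tally_allocations(newer))
    # Largest increase first.
    for line, increase in sorted(delta.items(), key=lambda kv: -kv[1]):
        print(f"{increase:8d} more allocations at chan_sip.c line {line}")
```

Source lines whose count rises between two dumps while the system is idle are the ones worth reading in chan_sip.c, which is how lines 26057, 26060 and 26065 were singled out above.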