|Summary:||ASTERISK-17221: Asterisk SVN 1.8 running at 99% CPU|
|Date Opened:||2011-01-10 21:45:11.000-0600||Date Closed:||2019-04-09 12:32:57|
|Environment:||Attachments:||( 0) asterisk-info.txt|
( 1) asterisk-info--2.txt
( 2) asterisk-logs.txt
( 3) asterisk-logs--2.txt
( 4) capture3.txt
( 5) core-show-locks--2.txt
( 6) gdb-trace.txt
( 7) gdb-trace--2.txt
( 8) snmp-cpu.jpg
( 9) snmp-cpu--2.jpg
|Description:||Asterisk (SVN-branch-1.8-r299449) process is processing calls (10 per minute) but running at 99% CPU utilisation. The asterisk process runs at low CPU utilisation for approximately 2 weeks before pegging at 99% utilisation.|
Have tried several asterisk releases with the same issue.
****** ADDITIONAL INFORMATION ******
config, logs and gdb trace attached.
|Comments:||By: Stefan Schmidt (schmidts) 2011-01-11 04:51:21.000-0600|
I am not sure, but maybe you should try to deactivate the full log at verbose level 5 and debug level 5. That's a large amount of data to write to disk. Do you really need this?
Maybe it's something completely different, but from the gdb trace I don't see any "hanging" thread that looks like a CPU killer.
By: BrettH (zeero) 2011-01-11 19:02:47.000-0600
Hi, thanks for the response.
No, the full log is not required. I initially experienced the high CPU symptoms with only "error" logging enabled at verbose level 3. I then increased the logging levels in an attempt to identify the root cause of the CPU utilisation, but it did not help.
By: Stefan Schmidt (schmidts) 2011-01-12 05:36:12.000-0600
And you didn't see any locks, right?
Normally a high CPU load is only caused by a looping or hanging thread, but even your gdb trace doesn't show anything like this.
Can you reproduce this, or is it just the currently running process that you haven't restarted?
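One way to check for a looping thread without attaching gdb is to compare per-thread CPU time. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/task/<tid>/stat is readable; the function names are hypothetical and not part of Asterisk.

```python
import os

def parse_thread_stat(stat_line):
    """Return (thread_name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # The name sits in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    name = stat_line[stat_line.index("(") + 1 : stat_line.rindex(")")]
    rest = stat_line[stat_line.rindex(")") + 1 :].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(rest[11]), int(rest[12])
    return name, utime + stime

def busiest_threads(pid, top=5):
    """List (ticks, tid, name) for the threads of pid with the most CPU time."""
    rows = []
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as fh:
                name, ticks = parse_thread_stat(fh.read())
        except OSError:
            continue  # thread exited while we were reading
        rows.append((ticks, int(tid), name))
    return sorted(rows, reverse=True)[:top]
```

Calling busiest_threads() twice a few seconds apart and comparing the tick counts should show which thread, if any, is accumulating CPU time. The thread IDs correspond to the LWP numbers that gdb prints, so a busy thread can then be matched against a backtrace.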
By: Leif Madsen (lmadsen) 2011-01-12 08:48:48.000-0600
I don't see a 'core show locks' which would certainly help here.
By: Stefan Schmidt (schmidts) 2011-01-12 08:58:18.000-0600
Leif, it's in the asterisk-info file, but it is empty.
@zeero, could you please retry "core show locks" until you catch some information?
By: BrettH (zeero) 2011-01-12 19:19:36.000-0600
When I detach gdb it seems to kill the asterisk process. I have restarted asterisk with "/usr/sbin/asterisk -f -vvvg &" and the processor utilisation is back to 1%. I expect the high CPU utilisation to recur within the next 3 weeks.
When the issue recurs, I can create a cron job to collect "asterisk -rx 'core show locks'" every minute. Are there any other compile flags or commands we can use to gather additional information?
By: Stefan Schmidt (schmidts) 2011-01-13 03:43:28.000-0600
Your SNMP graph looks like you have caught a deadlock, but not one bad enough to kill the whole system.
Why do you use the do-not-fork option -f? Any special reason for this?
You should try to use safe_asterisk; then, IMHO, gdb will not kill the process.
If this happens again, try to catch the "core show locks" output in the console rather than with a cron job, because once every minute will not be often enough.
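If typing the command by hand is impractical, sampling more often than cron allows can be scripted. This is a rough sketch only: poll_command and the log file name are hypothetical, and the only Asterisk invocation assumed is the "asterisk -rx 'core show locks'" form already mentioned in this ticket.

```python
import subprocess
import time
from datetime import datetime

def poll_command(cmd, out_path, interval=1.0, rounds=300, run=subprocess.run):
    """Run cmd `rounds` times, `interval` seconds apart, appending each
    timestamped output to out_path. Returns the number of samples written."""
    written = 0
    with open(out_path, "a") as log:
        for _ in range(rounds):
            result = run(cmd, capture_output=True, text=True)
            log.write(f"=== {datetime.now().isoformat()} ===\n")
            log.write(result.stdout)
            log.flush()  # keep the file current if the script is interrupted
            written += 1
            time.sleep(interval)
    return written
```

For example, poll_command(["asterisk", "-rx", "core show locks"], "core-show-locks.log") would take one sample per second for about five minutes. The `run` parameter exists only so the loop can be exercised without a running Asterisk.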
By: BrettH (zeero) 2011-01-13 21:12:24.000-0600
The issue has recurred after 3 days. I had no special reason for using the -f switch when starting asterisk; however, I have now restarted it using safe_asterisk and the default switches. I have attached all the new debug and log information; these files are suffixed with "--2.txt". I collected the output of "core show locks" by running the command repeatedly for approximately 5 minutes.
By: BrettH (zeero) 2011-01-22 00:29:15.000-0600
Any other suggestions/recommendations?
By: BrettH (zeero) 2011-02-01 07:30:17.000-0600
I'm at a loss as to what else I can try to resolve this issue, as this is a production server. Is there a production-ready version available (1.6, perhaps?) that is easier to troubleshoot, or do I need to contact a developer to consult for a fee?
By: BrettH (zeero) 2011-02-07 23:36:18.000-0600
I may be grasping at straws but the output of:
"core show threads"
"lsof -p 25127"
^ seems to indicate that some TCP connections are not getting closed down or cleaned up correctly?
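To put a number on that suspicion, the TCP states in the lsof output can be tallied. A minimal sketch, assuming the usual lsof layout in which the state appears in parentheses at the end of each TCP line; tcp_state_counts is a hypothetical helper name.

```python
import re
from collections import Counter

# lsof prints the TCP state in parentheses at the end of the NAME column.
TCP_STATE = re.compile(r"\bTCP\b.*\((\w+)\)\s*$")

def tcp_state_counts(lsof_output):
    """Count TCP sockets per state in the text printed by `lsof -p <pid>`."""
    counts = Counter()
    for line in lsof_output.splitlines():
        match = TCP_STATE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A CLOSE_WAIT count that keeps growing between samples would mean the remote side has closed its connections while the process still holds its end open, which would be consistent with sockets not being cleaned up.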
By: Sean Bright (seanbright) 2019-04-09 12:32:57.283-0500
I'm not able to reproduce this with Asterisk 13 (the oldest version of Asterisk still supported). If you are able to reproduce this in Asterisk 13, please re-open by commenting on this ticket.