Summary:ASTERISK-17196: High CPU Usage / Choppy Sound (res_timing_pthread)
Reporter:Kadir Terzi (kterzi)Labels:
Date Opened:2011-01-03 08:54:08.000-0600Date Closed:2011-09-22 07:38:16
Versions:Frequency of
Environment:Attachments:( 0) 7_backtrace_threads.txt
( 1) 7_core_show_locks.txt
( 2) 7_core_show_threads.txt
( 3) asterisk_CLI_15_02_20111.8.2.3.txt
( 4) backtrace_01_03_20111.8.2.3.txt
( 5) backtrace_15_02_20111.8.2.3.txt
( 6) backtrace-threads.txt
( 7) backtrace-threads1.txt
( 8) core_show_locks_01_03_20111.8.2.3.txt
( 9) core_show_locks_1_8_2_3.txt
(10) core_show_locks_15_02_20111.8.2.3.txt
(11) core_show_threads_01_03_20111.8.2.3.txt
(12) core_show_threads_1_8_2_3.txt
(13) core_show_threads_15_02_20111.8.2.3.txt
(14) core-show-locks.txt
(15) core-show-locks1.txt
(16) core-show-locks2.txt
Description:I face the same symptoms with issue 16158 on debain system(x86_64) with Asterisk The longer Asterisk sits idle the longer the high CPU usage.  There is some correlation between the length of idle time and the length of time the CPU remains spiked at 100%. I'm using 1.4.27 stable version for a very long time without any problem and I just wanted to upgrade to 1.8 for it's new functions. So, this is basically an upgrade from 1.4. I have converted my dialplan successfully. I have even tested SRTP with success.
Comments:By: Kadir Terzi (kterzi) 2011-01-03 09:12:41.000-0600

Some more details are here. I do not have Zap/Dahdi channels and I use realtime _pbx. I've tried with 2 different debian servers both of them are 4-core dell systems.

By: Leif Madsen (lmadsen) 2011-01-05 09:48:30.000-0600

When this is happening I think you should provide a 'core show locks' and a backtrace of the running system with DONT_OPTIMIZE enabled.

See the Debugging section of the Asterisk wiki at http://wiki.asterisk.org/wiki/


By: Kadir Terzi (kterzi) 2011-01-06 04:31:19.000-0600

Backtraces and locks are now uploaded.

By: Kadir Terzi (kterzi) 2011-01-07 02:23:53.000-0600

I have to mention that I have copied the '.conf' files from Asterisk 1.4 and only updated the following:
-> [settings] section of 'extconfig.conf' according to the new format
-> converted the 'extensions.conf'.
-> new field 'uniqueid' is added to 'realtime_queue_members' table.

By: Kadir Terzi (kterzi) 2011-01-10 06:59:08.000-0600

After compiling Asterisk without res_timing_pthread, it seems the problem is disappeared. Is this something to be fixed?

By: Kadir Terzi (kterzi) 2011-01-10 09:33:46.000-0600

After disabling "res_timing_pthread.so", I'm having a different problem. I see "channel.c:1399 __ast_queue_frame: Exceptionally long voice queue length queuing to Local/......" at Asterisk CLI just before playing sound files and CPU usage is again gettig very high. Please see the attached file(core_show_locks2.txt).

By: John Covert (jcovert) 2011-01-11 20:39:36.000-0600

Exceptionally long voice queue length queuing ...

This would appear to be related to ASTERISK-17143 and ASTERISK-16258

By: Kadir Terzi (kterzi) 2011-01-17 02:41:33.000-0600

I've uploaded 3 new trace results from a different machine. The problem is exactly the same. Asterisk starts consuming %100 CPU after an idle period.

By: Kadir Terzi (kterzi) 2011-01-18 02:35:31.000-0600

Problem persists also in Asterisk 1.8.2.

By: Kadir Terzi (kterzi) 2011-01-21 08:18:16.000-0600

Is there any progress about this issue? Please let me know if more details are needed.

By: Leif Madsen (lmadsen) 2011-01-21 08:34:02.000-0600

I initially added relations to the other issues, but those issues aren't using pthread, and that's what this issue is reporting, so I've removed the relationships.

This issue has been prioritized and will be dealt with as soon as time and resources allow.

By: Leif Madsen (lmadsen) 2011-01-21 08:35:39.000-0600

Can you use res_timing_timerfd (if your kernel supports it) or res_timing_dahdi? Typically one of those two timer modules will be more stable than res_timing_pthread if you're having issues with it.

By: Kadir Terzi (kterzi) 2011-01-21 08:59:20.000-0600

I have Linux 2.6.26-2-amd64 installed. Normally timerfd should be supported by this version. But, menuselect does not allow me to select it. It's somehow not implemented. And I have no Dahdi in this server.

By: John Covert (jcovert) 2011-01-21 10:24:27.000-0600

I'm on Mac, and will not be able to use other timing options.

By: Kadir Terzi (kterzi) 2011-01-28 08:44:05.000-0600

I have tested now without success. "core show locks" and "core show threads" are attached.

By: Kadir Terzi (kterzi) 2011-02-15 03:24:24.000-0600

I have installed newer kernel(2.6.32-5-amd64) which supports res_timing_timerfd. Now, I do not have high CPU usage after an idle period. But, there is another problem which causes a very high amount of CPU load. I have pasted backtrace, locks, threads and also asterisk console messages I get. I'm almost about to give up with testing 1.8.I have tested by disabling res__timing_pthread with asterisk 1.8.2 and

By: Kadir Terzi (kterzi) 2011-03-01 08:28:52.000-0600

I have installed dahdi software and tried with res_timing_dahdi only enabled as timer. It did not help my issue. I still get deadlocks, high CPU usage.New traces are attached.

By: Kadir Terzi (kterzi) 2011-03-01 09:33:42.000-0600

Asterisk CLI gives me hundreds of these messages during the problem:
channel.c: Exceptionally long voice queue length queuing to Local/599@xxxxxxxx
And after a while I start getting thousands of:
channel.c: Unable to write to alert pipe on Local/599@xxxxxxxx (qlen = 64): Resource temporarily unavailable!

By: Kadir Terzi (kterzi) 2011-03-01 09:44:39.000-0600

One of the reason was sox. It happens when sox is running to convert wav to mp3 just after a monitored call. I have now removed sox from the dialplan. It worked fine for 1,5 hours with 3-4 average simultaneous # of calls. And then happened again.

By: Rafael Prado Rocchi (prado) 2011-04-19 12:14:00

kterzi, try this patch:

Its related to "Exceptionally long voice queue length queuing to Local/599@xxxxxxxx" and to 400%+ cpu usage.

Solved in my tests

By: Leif Madsen (lmadsen) 2011-09-22 07:38:16.779-0500

Known issue with res_timing_pthread -- please use other timing methods. Closing.