Summary: | ASTERISK-17457: Asterisk 1.6.2.16.1 on embedded linux low perfomance of app_queue | ||
Reporter: | trscod (trscod) | Labels: | |
Date Opened: | 2011-02-22 07:35:02.000-0600 | Date Closed: | 2011-07-26 14:27:14 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Applications/app_queue |
Versions: | 1.6.2.15 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) backtrace-threads.txt ( 1) backtrace-threads1.txt ( 2) core-show-locks.txt ( 3) core-show-locks1.txt | |
Description: | Hi I run asterisk 1.6.2.16.1 on embedded linux based on gentoo linux. LAMP & FreePBX is used. uname -a Linux pbx 2.6.36 #2 SMP Thu Feb 17 14:00:31 i686 Intel(R) Atom(TM) CPU N270 @ 1.60GHz GenuineIntel GNU/Linux ****** ADDITIONAL INFORMATION ****** Asterisk works with well perfomance. Tests of perfomance of chan_sip & playback: sipp -rtp_echo -sn uac 192.168.1.1 -s 777 -d 30000 -l 150 -i 192.168.1.2 with dialplan: [E1_IN] exten => _X.,1,Answer exten => _X.,2,Playback(/var/lib/asterisk/moh/atr) exten => _X.,3,Hangup asterisk -rx "core show channels" .... SIP/0000-00005ef3 777@E1_IN:2 Up Playback(/var/lib/asterisk/moh SIP/0000-00005ef2 777@E1_IN:2 Up Playback(/var/lib/asterisk/moh SIP/0000-00005ef1 777@E1_IN:2 Up Playback(/var/lib/asterisk/moh SIP/0000-00005ef0 777@E1_IN:2 Up Playback(/var/lib/asterisk/moh 150 active channels 150 active calls 24454 calls processed asterisk -rx "core show uptime" System uptime: 1 hour, 22 minutes, 6 seconds load average: 0.62, 1.16, 1.77 Asterisk's PID uses 99% CPU I think this perfomance is good for this mashine. Then I test a queue perfomance. In FreePBX i add 15 peers(extensions) & add queue whith default settings: asterisk -rx "queue show" 0000001 has 0 calls (max unlimited) in 'ringall' strategy (0s holdtime, 0s talktime), W:0, C:0, A:0, SL:0.0% within 60s Members: 109 (Local/109@from-queue/n) (Invalid) has taken no calls yet 108 (Local/108@from-queue/n) (Invalid) has taken no calls yet 107 (Local/107@from-queue/n) (Invalid) has taken no calls yet 106 (Local/106@from-queue/n) (Invalid) has taken no calls yet 105 (Local/105@from-queue/n) (Invalid) has taken no calls yet 104 (Local/104@from-queue/n) (Invalid) has taken no calls yet 103 (Local/103@from-queue/n) (Invalid) has taken no calls yet 102 (Local/102@from-queue/n) (Invalid) has taken no calls yet 101 (Local/101@from-queue/n) (Invalid) has taken no calls yet 100 (Local/100@from-queue/n) (Invalid) has taken no calls yet 114 (Local/114@from-queue/n) (Invalid) has taken no calls yet 113 (Local/113@from-queue/n) (Invalid) has taken no calls yet 112 (Local/112@from-queue/n) (Invalid) has taken no calls yet 111 (Local/111@from-queue/n) (Invalid) has taken no calls yet 110 (Local/110@from-queue/n) (Invalid) has taken no calls yet No Callers The members of this queue are absent. They are not registered on server. Either in FreePBX I add inbound route 777 to queue. So, test: sipp -rtp_echo -sn uac 192.168.1.1 -s 777 -d 30000 -l 5 -i 192.168.1.2 (ONLY 5 CALLS) & register one of 15 members of queue (peer 100) well, some time after the running test - all is OK: asterisk -rx "core show channels" Channel Location State Application(Data) SIP/100-00000005 100@from-internal:1 Ringing AppDial((Outgoing Line)) Local/100@from-queue s@macro-dial-one:37 Ring Dial(SIP/100,"",trM(auto-blkvm Local/100@from-queue 0000001@from-queue:1 Ringing AppQueue((Outgoing Line)) SIP/0000-00000004 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-00000003 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-00000002 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-00000001 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-00000000 0000001@ext-queues:9 Up Queue(0000001,t,,) 8 active channels 6 active calls 20 calls processed asterisk -rx "queue show" 0000001 has 5 calls (max unlimited) in 'ringall' strategy (0s holdtime, 0s talktime), W:0, C:0, A:0, SL:0.0% within 60s Members: 109 (Local/109@from-queue/n) (Unavailable) has taken no calls yet 108 (Local/108@from-queue/n) (Unavailable) has taken no calls yet 107 (Local/107@from-queue/n) (Unavailable) has taken no calls yet 106 (Local/106@from-queue/n) (Unavailable) has taken no calls yet 105 (Local/105@from-queue/n) (Unavailable) has taken no calls yet 104 (Local/104@from-queue/n) (Unavailable) has taken no calls yet 103 (Local/103@from-queue/n) (Unavailable) has taken no calls yet 102 (Local/102@from-queue/n) (Unavailable) has taken no calls yet 101 (Local/101@from-queue/n) (Unavailable) has taken no calls yet 100 (Local/100@from-queue/n) (Ringing) has taken no calls yet 114 (Local/114@from-queue/n) (Unavailable) has taken no calls yet 113 (Local/113@from-queue/n) (Unavailable) has taken no calls yet 112 (Local/112@from-queue/n) (Unavailable) has taken no calls yet 111 (Local/111@from-queue/n) (Unavailable) has taken no calls yet 110 (Local/110@from-queue/n) (Unavailable) has taken no calls yet Callers: 1. SIP/0000-00000000 (wait: 0:24, prio: 0) 2. SIP/0000-00000001 (wait: 0:24, prio: 0) 3. SIP/0000-00000002 (wait: 0:24, prio: 0) 4. SIP/0000-00000003 (wait: 0:24, prio: 0) 5. SIP/0000-00000004 (wait: 0:24, prio: 0) But after only 1minute, 41 seconds queue is ?eases to call member 100. Then I see: asterisk -rx "queue show" 0000001 has 6 calls (max unlimited) in 'ringall' strategy (3s holdtime, 1s talktime), W:0, C:1, A:9, SL:100.0% within 60s Members: 109 (Local/109@from-queue/n) (Unavailable) has taken no calls yet 108 (Local/108@from-queue/n) (Unavailable) has taken no calls yet 107 (Local/107@from-queue/n) (Unavailable) has taken no calls yet 106 (Local/106@from-queue/n) (Unavailable) has taken no calls yet 105 (Local/105@from-queue/n) (Unavailable) has taken no calls yet 104 (Local/104@from-queue/n) (Unavailable) has taken no calls yet 103 (Local/103@from-queue/n) (Unavailable) has taken no calls yet 102 (Local/102@from-queue/n) (Unavailable) has taken no calls yet 101 (Local/101@from-queue/n) (Unavailable) has taken no calls yet 100 (Local/100@from-queue/n) (Not in use) has taken 1 calls (last was 35 secs ago) 114 (Local/114@from-queue/n) (Unavailable) has taken no calls yet 113 (Local/113@from-queue/n) (Unavailable) has taken no calls yet 112 (Local/112@from-queue/n) (Unavailable) has taken no calls yet 111 (Local/111@from-queue/n) (Unavailable) has taken no calls yet 110 (Local/110@from-queue/n) (Unavailable) has taken no calls yet Callers: 1. SIP/0000-00000008 (wait: 0:57, prio: 0) 2. SIP/0000-00000010 (wait: 0:27, prio: 0) 3. SIP/0000-00000011 (wait: 0:27, prio: 0) 4. SIP/0000-00000012 (wait: 0:27, prio: 0) 5. SIP/0000-00000013 (wait: 0:27, prio: 0) 6. SIP/0000-00000014 (wait: 0:04, prio: 0) asterisk -rx "core show uptime" System uptime: 1 minute, 41 seconds asterisk -rx "core show channels" Channel Location State Application(Data) SIP/0000-00000022 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-00000021 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-00000020 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-0000001f 0000001@ext-queues:9 Up Queue(0000001,t,,) SIP/0000-0000001e 0000001@ext-queues:9 Up Queue(0000001,t,,) Local/100@from-queue 0000001@from-queue:1 Ringing AppQueue((Outgoing Line)) 6 active channels 6 active calls 165 calls processed top - 16:06:09 up 5:04, 2 users, load average: 0.00, 0.02, 0.18 Cpu(s): 1.2%us, 2.3%sy, 0.0%ni, 95.0%id, 0.0%wa, 0.3%hi, 1.2%si, 0.0%st PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 16815 asterisk 20 0 42116 19m 8376 S 8 1.9 0:20.26 asterisk It is not all. After the termination of the sipp, asterisk become strange: it execute commands whits small delay & want no to execute command "core restart now". I can restart asterisk only with killing PID. So, i thik it is a bug. I tryed to find similar issue on this site before. | ||
Comments: | By: David Woolley (davidw) 2011-02-22 08:14:42.000-0600 There doesn't seem to be a performance problem (load average is very low); there is some hint of a deadlock. Firstly, if you have to kill asterisk, you always need to supply backtraces from an unoptimised compilation (attach gdb before killing, or use gcore). ^\ may not show all the processes. Also, given this looks like there is some sort of deadlock, you need to have built with thread debugging, and to run core show locks. If a CLI has stalled, you can usually still do this by starting a new one and issuing it as the first command. I think, for 1.6.2, this is still documented in the doc directory in the source tree. For later versions it is in the wiki. PS. Please try and keep the description to about a dozen lines. Some more can be put in additional information, but you may have needed an attachment here. The description is included in every email and, people don't like looking through a lot of text to get to the latest information. By: Leif Madsen (lmadsen) 2011-02-22 10:05:08.000-0600 Marked as Feedback per davidw's excellent overview of requirements. By: trscod (trscod) 2011-02-22 14:38:30.000-0600 Thank you davidw for explanation. So, i recompile asterisk with "DONT_OPTIMIZE"&several "debug" flags. The output of commands: gdb -ex "thread apply all bt" --batch /usr/sbin/asterisk `pidof asterisk` > /tmp/backtrace-threads.txt & asterisk -rx "core show locks" > /tmp/core-show-locks.txt in attach. PS: I think, that the bug is depends on processor capacity. For example, this distro works on Intel Atom D510 (dual core). And queue with idle configs work perfectly with 20 callers. After 20 callers problem was detected. I think, on powerfull CPU callers quantity will be more than 20. And that's why problem may be invisible. By: Ronald Chan (loloski) 2011-02-23 05:56:50.000-0600 trscod: can you reproduce this without local channel involve or with plain vanilla asterisk, i'm just courious! thanks! By: trscod (trscod) 2011-02-24 16:07:26.000-0600 Hello, folks! I update my Asterisk to latest release version 1.6.2.16.2. Either I compile it without optimization. Problem still present. I attach output of commands: gdb -ex "thread apply all bt" --batch /usr/sbin/asterisk `pidof asterisk` > /tmp/backtrace-threads1.txt & asterisk -rx "core show locks" > /tmp/core-show-locks1.txt I did it according to documentation in sources. I am not experienced reporter, please tell me if it is need more information. Thanks! By: Russell Bryant (russell) 2011-04-26 17:07:18 This looks like a problem with res_timing_timerfd. Can you try with res_timing_dahdi instead? By: Russell Bryant (russell) 2011-07-26 14:27:08.298-0500 Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support please move to the 1.8 branch which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions If this is still an issue, please open a new issue so it can be re-triaged appropriately. Thanks! |