Summary: | ASTERISK-19820: wrapuptime is intermittently disregarded for queue calls | ||||||
Reporter: | WRP (wrp) | Labels: | |||||
Date Opened: | 2012-04-30 20:14:01 | Date Closed: | 2016-09-23 14:16:14 | ||||
Priority: | Major | Regression? | |||||
Status: | Closed/Complete | Components: | Applications/app_queue | ||||
Versions: | 1.8.11.1 13.6.0 | Frequency of Occurrence | Frequent | ||||
Related Issues: |
| ||||||
Environment: | Linux 2.6.18-194.el5 x86_64 | Attachments: | ( 0) ignored_wrap_up.tar.gz ( 1) lastcalltime.patch ( 2) nowrapup.log | ||||
Description: | app_queue occasionally sends a call to an agent before their wrap up time has expired from their previous call. Here is our queue configuration: {code:title=queues.conf (partial)} [3008] announce-frequency=300 announce-holdtime=no announce-position=yes autofill=no eventmemberstatus=yes eventwhencalled=yes joinempty=yes leavewhenempty=no maxlen=150 monitor-type=mixmonitor monitor-format=wav49 music=main03282011 periodic-announce-frequency=0 queue-callswaiting=queue-callswaiting queue-thankyou=queue-thankyou queue-thereare=queue-thereare queue-youarenext=queue-youarenext reportholdtime=no retry=5 ringinuse=no servicelevel=60 strategy=rrmemory timeout=30 timeoutrestart=yes weight=0 wrapuptime=240 {code} Here is an example of a call that rang an agent 9 seconds after their previous call (wrapuptime=240): {code:title=queue_log} 1334934705|1334934313.167719|3008|SIP/2201|CONNECT|341|1334934701.167743|3 1334934850|1334934821.167753|3000|NONE|ENTERQUEUE||2092201908|1 1334935785|1334934313.167719|3008|SIP/2201|COMPLETECALLER|341|1080|7 1334935794|1334934845.167754|3008|SIP/2201|RINGNOANSWER|4000 {code} {code:title=full log} [2012-04-20 08:29:40] VERBOSE[24896] app_queue.c: -- Called SIP/2261 [2012-04-20 08:29:40] VERBOSE[24896] app_queue.c: -- SIP/2261-00022a6c is ringing [2012-04-20 08:29:50] VERBOSE[24896] app_queue.c: -- SIP/2261-00022a6c is busy [2012-04-20 08:29:50] VERBOSE[24896] app_queue.c: -- Nobody picked up in 10000 ms [2012-04-20 08:29:50] VERBOSE[24896] netsock2.c: == Using SIP RTP TOS bits 184 [2012-04-20 08:29:50] VERBOSE[24896] netsock2.c: == Using SIP RTP CoS mark 5 [2012-04-20 08:29:50] VERBOSE[24896] app_queue.c: -- Called SIP/2201 [2012-04-20 08:29:50] VERBOSE[24896] app_queue.c: -- SIP/2201-00022a6d is ringing [2012-04-20 08:29:54] VERBOSE[24896] app_queue.c: -- SIP/2201-00022a6d is busy [2012-04-20 08:29:54] VERBOSE[24896] app_queue.c: -- Nobody picked up in 4000 ms [2012-04-20 08:29:59] VERBOSE[24896] netsock2.c: == Using SIP RTP TOS bits 184 [2012-04-20 08:29:59] VERBOSE[24896] netsock2.c: == Using SIP RTP CoS mark 5 [2012-04-20 08:29:59] VERBOSE[24896] app_queue.c: -- Called SIP/2281 [2012-04-20 08:30:00] VERBOSE[24896] app_queue.c: -- SIP/2281-00022a6e is ringing [2012-04-20 08:30:03] VERBOSE[24896] app_queue.c: -- SIP/2281-00022a6e answered SIP/coloCUBE-00022a4e {code} This is happening a few times a week. | ||||||
Comments: | By: Matt Jordan (mjordan) 2012-05-01 08:13:18.861-0500 We require a complete debug log to help triage the issue. This document will provide instructions on how to collect debugging logs from an Asterisk machine for the purpose of helping bug marshals troubleshoot an issue: https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information By: WRP (wrp) 2012-05-01 13:54:12.128-0500 When I set the debug to 1 or higher, our console is plagued with the following output. Literally hundreds of lines a second. {noformat} [2012-05-01 11:50:38] DEBUG[12041]: audiohook.c:239 audiohook_read_frame_both: Read factory 0x2aaacc00a7d0 was pretty quick last time, waiting for them. [2012-05-01 11:50:38] DEBUG[16861]: dsp.c:568 tone_detect: tone 1100, Ew=3.92E+04, Et=2.62E+06, s/n= 0.02 [2012-05-01 11:50:38] DEBUG[24872]: dsp.c:568 tone_detect: tone 1100, Ew=1.64E+04, Et=3.00E+06, s/n= 0.01 [2012-05-01 11:50:38] DEBUG[9371]: dsp.c:568 tone_detect: tone 1100, Ew=1.30E+05, Et=2.31E+06, s/n= 0.06 [2012-05-01 11:50:38] DEBUG[24035]: audiohook.c:233 audiohook_read_frame_both: Write factory 0x2aaadc042b60 was pretty quick last time, waiting for them. [2012-05-01 11:50:38] DEBUG[24044]: dsp.c:568 tone_detect: tone 1100, Ew=5.45E+04, Et=2.63E+06, s/n= 0.02 [2012-05-01 11:50:38] DEBUG[12041]: audiohook.c:239 audiohook_read_frame_both: Read factory 0x2aaacc00a7d0 was pretty quick last time, waiting for them. {noformat} Since this is an intermittent issue, we need to leave these settings enabled (ber the document you provided) for a day or more. Can you provide any suggesion as to what we can do in order to capture the required debug without flooding our console output (and log files for that matter)? By: Matt Jordan (mjordan) 2012-05-01 13:58:13.579-0500 Don't send the DEBUG to a console, send it to a log file instead: {noformat} console => notice,warning,error full => notice,warning,error,debug,verbose,dtmf,fax {noformat} Note that the instructions linked on the wiki page do not expect you to redirect DEBUG output to the CLI console. By: Matt Jordan (mjordan) 2012-05-01 13:59:14.256-0500 As far as log files are concerned, we're perfectly fine with large log files. That's their purpose. What we will need is the approximate time when the issue happened. By: WRP (wrp) 2012-05-01 14:07:43.185-0500 OK, that certainly helped with the console output. Our log is growing by half a megabyte a second. Are you able to handle a 43GB log file (1 day pre-compression)? By: Matt Jordan (mjordan) 2012-05-01 14:28:24.042-0500 Well, there's large, and then there's large. If you throttle debug levels to 3, does it still grow at that rate? By: WRP (wrp) 2012-05-01 14:35:25.874-0500 That seems to help considerably. Just measured it at around 5 megs a minute (down from 30). ~7GB uncompressed ok? By: Matt Jordan (mjordan) 2012-05-04 08:30:39.951-0500 Not really :-) We need a DEBUG log that illustrates this particular problem, not a capture from a system that runs all day. Essentially, what we're looking for is the portion from the 'full log' that you illustrates above, only with the DEBUG statements. What you could do would be to look at your queue log for a particular instance of the problem, then use split to split the DEBUG log file into pieces. Determine which piece contains the time period that corresponds to what is shown in the queue log as a problem, and attach only that piece. By: Rusty Newton (rnewton) 2012-06-04 19:11:58.932-0500 WRP, are you able to provide the requested chunk of a full log during the time the issue occurs? By: WRP (wrp) 2012-06-05 11:32:16.165-0500 We haven't had the problem since I entered the ticket. We still have the extra debug enabled. I'd like to keep this ticket open a little longer in case the problem arises again. Thanks! By: WRP (wrp) 2012-06-05 23:46:36.406-0500 Strangely enough, we had the problem arise today. Here is the result from the queue_log: {code:title=queue_log} 1338914202|1338914148.62888|3008|NONE|ENTERQUEUE||4256520697|13 1338915031|1338914970.62920|3011|NONE|ENTERQUEUE||8009597167|1 1338915250|1338914970.62920|3011|SIP/2201|CONNECT|219|1338915248.62933|1 1338915696|1338914970.62920|3011|SIP/2201|COMPLETECALLER|219|446|1 1338915704|1338914148.62888|3008|SIP/2201|RINGNOANSWER|8000 {code} As you can see, 2201 recieved a ring just 8 seconds after completing their previous call. Both queues 3011 and 3008 have wrap up times set to 4 minutes. I do have a complete debug log this time. I'd like to send it to you directly, if possible, as I imagine it has sensitive information in it. Do you have an email address or other means of communication that I can send the 6MB file to? By: Matt Jordan (mjordan) 2012-06-08 09:30:52.789-0500 You can restrict who can access an attachment when you attach the file. You should see a lock icon at the bottom of the file attachment - if you select 'Bug Marshal', then only those with commit access to Asterisk will be able to view the file attachments. By: WRP (wrp) 2012-06-08 11:55:47.212-0500 Asterisk debug log during instance where wrapup time was seemingly ignored By: WRP (wrp) 2012-06-08 11:56:31.417-0500 I've attached the debug file, though it doesn't seem to be showing on the ticket (from my perspective). Are you seeing ignored_wrap_up.tar.gz? By: Jamuel Starkey (jamuel) 2012-06-08 16:49:53.524-0500 I see it appear 3 times under Attachments: Attachments ignored_wrap_up.tar.gz 08/Jun/12 11:55 AM 5.75 MB ignored_wrap_up.tar.gz 08/Jun/12 11:54 AM 5.75 MB ignored_wrap_up.tar.gz 08/Jun/12 11:51 AM 5.75 MB By: WRP (wrp) 2012-06-08 17:10:28.324-0500 Ok. Those were all duplicates. You should be fine with the file that remains. Looking forward to hearing what you find! By: WRP (wrp) 2012-07-17 19:34:26.115-0500 Has anyone had a chance to review the log and debug files? By: Taylor Solazzo (tsolazzo) 2012-12-11 13:42:20.549-0600 Hello! We are continuing to experience this issue. We have since lowered the wrap-up time across all queues to six minutes. Here is another instance with SIP/2364 {code} 1355244739|1355244691.40347|3010|NONE|ENTERQUEUE||3124210030|1 - 3124210030 Enters @ 8:52:19 1355244781|1355244691.40347|3010|SIP/2364|CONNECT|42|1355244773.40350|7 - 2364 Connects with 3124210030 @ 8:53:01 1355244958|1355244914.40354|3010|NONE|ENTERQUEUE||6165347619|1 - 6165347619 - Enters @ 8:55:58 1355245198|1355244691.40347|3010|SIP/2364|COMPLETECALLER|42|417|1 - 2364 Ends call with 3124210030 at 8:59:58 1355245218|1355244914.40354|3010|SIP/2364|RINGNOANSWER|20000 - 2364 rings with 6165347619 @ - 9:00:18 1355245413|1355245383.40381|3016|NONE|ENTERQUEUE||3053712455|1 - 3053712455 Enters @ 9:03:33 1355245691|1355245383.40381|3016|SIP/2364|CONNECT|278|1355245678.40389|12 2364 Connects with 3053712455 @ 9:08:11 1355245862|1355245383.40381|3016|SIP/2364|COMPLETECALLER|278|171|1 -2364 Ends with 3053712455 @ 9:11:02 1355246043|1355245983.40404|3012|NONE|ENTERQUEUE||8012660999|3 - 8012660999 Enters @ 9:14:03 1355246349|1355245983.40404|3012|SIP/2364|CONNECT|306|1355246343.40426|5 - 2364 Connects with 8012660999 @ 9:19:09 1355247612|1355245983.40404|3012|SIP/2364|COMPLETECALLER|306|1263|3 - 2364 Ends with 8012660999 @ 9:40:12 {code} By: Taylor Solazzo (tsolazzo) 2012-12-27 14:18:30.013-0600 Another occurrence today. Since last entry, all queues have been lowered to a wrap-up time of five minutes. {code} 1356629069|1356628995.17178|3012|NONE|DID|8775784000 - 4142890890 Calls 877# @ 17:24:29 1356629069|1356628995.17178|3012|NONE|ENTERQUEUE||4142890890|1 - 4142890890 Enters queue at 17:24:29 1356629071|1356628995.17178|3012|SIP/2339|RINGNOANSWER|2000 - 4142890890 Rings device SIP/2339 with no answer @ 17:24:31 1356629089|1356628995.17178|3012|SIP/2340|CONNECT|20|1356629082.17182|6 -4142890890 connects with SIP/2340 @ 17:24:49 1356629446|1356628995.17178|3012|SIP/2340|COMPLETECALLER|20|357|1 - 4142890890 ends with SIP/2340 @ 17:30:46 1356629445|1356629417.17188|3016|NONE|DID|8775784000 - 3035018834 Calls 877 # @ 17:30:45 1356629445|1356629417.17188|3016|NONE|ENTERQUEUE||3035018834|1 -3035018834 Enters queue @ 17:30:45 1356629448|1356629417.17188|3016|SIP/2162|RINGNOANSWER|3000 - 3035018834 Rings device SIP/2162 @ 17:30:48 1356629467|1356629417.17188|3016|SIP/2340|CONNECT|22|1356629448.17191|17 3035018834 connects with SIP/2340 @ 17:31:07 1356630048|1356629417.17188|3016|SIP/2340|COMPLETECALLER|22|581|1 -3035018834 ends with SIP/2340 @ 17:40:48 {code} By: Taylor Solazzo (tsolazzo) 2012-12-27 18:30:55.356-0600 Another instance today: {code} 1356634997|1356634916.17424|3012|NONE|ENTERQUEUE||2135074924|1 - 2135074924 Enters queue @ 19:03:17 1356635032|1356634916.17424|3012|SIP/2363|CONNECT|35|1356635019.17429|12 - 2135074924 connects with agent SIP/2363 @ 19:03:52 1356635811|1356634916.17424|3012|SIP/2363|COMPLETECALLER|35|779|1 -2135074924 ends call with SIP/2363 @ 19:16:51 1356635808|1356635739.17466|3012|NONE|ENTERQUEUE||8153976282|1 - 8153976282 Enters queue @ 19:16:48 1356635826|1356635739.17466|3012|SIP/2363|RINGNOANSWER|10000 - 8153976282 rings agent SIP/2363 @ 19:17:06 (Note not even a minute has passed before system tries SIP/2363 again even with five minute wrap-up time set.) 1356635965|1356635739.17466|3012|SIP/2246|CONNECT|157|1356635959.17478|5 8153976282 connects with another agent SIP/2246 {code} By: hristo (hristo) 2013-05-06 11:38:41.248-0500 I am also seeing this problem occasionally. The version is 1.8.17. Is it worth it to test with a newer Asterisk version? By: Elwin Formsma (formsmae) 2015-08-07 08:11:17.279-0500 Debug log By: Elwin Formsma (formsmae) 2015-08-07 08:12:43.045-0500 We have also experienced issues with wrapuptime sometimes being ignored. Attached: nowrapup.log First call of the agent (which doesnt get wrapup): 1438675644.64887 Second call (which he gets right after the first): 1438675790.64933 Asterisk version: 1.8.31 By: Elwin Formsma (formsmae) 2015-10-16 09:48:58.877-0500 bump By: Marek Cervenka (cervajs) 2015-12-08 12:34:19.707-0600 i have the same problem on asterisk 13.6.0 By: Martin Tomec (matesstar) 2015-12-14 06:48:57.329-0600 After some investigation, it seems to be race condition in app_queue: function {{can_ring_entry}} is called in the same time (same second) as {{handle_hangup}} for previous call. It gets old (wrong) timestamps in both {{call->member->lastcall}} and {{call->lastcall}}. (tested for version 13.6.0) By: Martin Tomec (matesstar) 2015-12-15 04:51:16.192-0600 So we are able to _simulate_ it. We simulated longer delay between call hangup and update the queue status by locking table queue_log (on myisam engine). And so the wrapuptime was ignored because {{update_queue}} was waiting for DB lock. Maybe innoDB engine can mitigate this issue, but not solve. I will try to consult in mailing list and maybe write patch. By: Martin Tomec (matesstar) 2015-12-18 13:03:04.968-0600 I have tried to patch this by adding "in_call" parameter to member structure. Patch is atached if somebody wants to test it on Chrismas (patch is against master). After some tests, I will commit it to gerrit. By: Elwin Formsma (formsmae) 2016-04-06 06:22:38.948-0500 Martin we seem unable to reproduce the problems, even with locking tables. This makes it hard to verify the patch. Did you manage to verify this patch yourself? |