Summary: | ASTERISK-10997: RINGNOANSWER queue_log without a member name | ||
Reporter: | Hannes Loubser (hloubser) | Labels: | |
Date Opened: | 2007-12-10 02:53:43.000-0600 | Date Closed: | 2008-01-02 13:43:12.000-0600 |
Priority: | Minor | Regression? | No |
Status: | Closed/Complete | Components: | Applications/app_queue |
Versions: | Frequency of Occurrence | ||
Related Issues: | |||
Environment: | Attachments: | ( 0) 11498.patch ( 1) 11498v2.patch ( 2) 11498v3.patch | |
Description: | Some RINGNOANSWER events in the queue_log are missing member names in around 10% of the calls to rna() in app_queue.c . SVN 90163, OS Ubuntu 7.10. ****** ADDITIONAL INFORMATION ****** queue_log extract (see line 2): 1197030019|1197030019.30922|4476|NONE|ENTERQUEUE||caller-20 1197030046|1197030019.30922|4476||RINGNOANSWER|10000 1197030057|1197030019.30922|4476|SIP/1560|CONNECT|38|1197030051.31012 1197030060|1197030019.30922|4476|SIP/1560|COMPLETECALLER|38|3|35 | ||
Comments: | By: Eliel Sardanons (eliel) 2007-12-10 12:26:20.000-0600 Do you know how to reproduce this? By: Mark Michelson (mmichelson) 2007-12-11 11:17:04.000-0600 I have not been able to reproduce this, but looking through the code, I found an apparent way that this could happen. I will upload a patch shortly for you to test. By: Mark Michelson (mmichelson) 2007-12-11 11:18:53.000-0600 I have uploaded 11498.patch to test. Please report if it fixes the issue. By: Hannes Loubser (hloubser) 2007-12-12 05:07:02.000-0600 The patch produces multiple entries for only one extension, e.g. : 1197449785|1197449770.170|4476|SIP/1502|RINGNOANSWER|10000 1197449785|1197449770.171|4476|SIP/1502|RINGNOANSWER|10000 1197449785|1197449760.124|4476|SIP/1502|RINGNOANSWER|10000 1197449786|1197449771.176|4476|SIP/1502|RINGNOANSWER|10000 1197449786|1197449771.177|4476|SIP/1502|RINGNOANSWER|10000 By: Mark Michelson (mmichelson) 2007-12-12 09:36:26.000-0600 Ah, I see what the problem is. I'll get it fixed and have another patch up ASAP. By: Mark Michelson (mmichelson) 2007-12-12 09:54:07.000-0600 I've uploaded 11498v2.patch. The problem with the first patch wasn't actually that multiple queue log entries were made for the same member; what was actually happening was that all the queue log entries were being attributed to the same member even though logically they should have been attributed to the member currently being rung. The second patch corrects this error. I apologize for the erroneous first patch. By: Hannes Loubser (hloubser) 2007-12-13 02:05:54.000-0600 I tested patch 11498v2 - the member is now updated, but the problem remains the same: 1197532261|1197532219.2207|4476|SIP/1539|RINGNOANSWER|10000 1197532261|1197532222.2217|4476||RINGNOANSWER|10000 1197532263|1197532223.2218|4476|SIP/1509|RINGNOANSWER|10000 1197532263|1197532221.2212|4476|SIP/1524|RINGNOANSWER|10000 1197532265|1197532224.2223|4476||RINGNOANSWER|10000 1197532265|1197532224.2226|4476|SIP/1528|RINGNOANSWER|10000 The problem can be reproduced by running a stress test using a pool of soft phones to call into the queue. The queue members are also soft phones, controlled with a java program. This is the line where membername is not set in 10% of the cases (line 2263, rev 90163 of app_queue.c): if (!*to) rna(orig, qe, on, membername); By: Mark Michelson (mmichelson) 2007-12-13 10:56:40.000-0600 That's the line I was thinking had the problem. The problem is that it's possible to reach that line without the membername being set. I thought I had accounted for that possibility with the patches I provided, but unfortunately such is not the case. I'll look into this more to see if I can reproduce the issue or figure out where the problem is. By: Mark Michelson (mmichelson) 2007-12-20 18:55:17.000-0600 I thought I'd give an update regarding this. This bug has pointed out a basic deficiency in the queue application. To put things quite simply, the reason that no membername is printed on those particular RINGNOANSWER situations is that the queue has no way of knowing what the membername should be. If you want a longer explanation as to why, I can provide it, but it'll be rather long and I might not make any sense when I explain it. That being said, logically you would think that the queue SHOULD know exactly which member it was that it tried to ring and who did not pick up. I THINK I may have a fix which will not be too invasive, but I will need to test quite extensively before I can say for sure that my idea for a fix will work. I'd also like to point out for the record that this bug has called my attention to the fact that when using the "ringall" strategy, it is completely random which member's name will show up in the RINGNOANSWER queue log entry if no one picks up the phone. My solution to the reported bug would also fix this deficiency as well. By: Digium Subversion (svnbot) 2007-12-26 13:48:04.000-0600 Repository: asterisk Revision: 94803 A team/mmichelson/queue-log-accuracy/ ------------------------------------------------------------------------ r94803 | mmichelson | 2007-12-26 13:48:03 -0600 (Wed, 26 Dec 2007) | 12 lines Creating this branch as a means of fixing issue ASTERISK-10997. The change necessary is somewhat large, and before all is said and done, it may be too big a change for 1.4. Therefore, I am making this branch against trunk and will backport the change to 1.4 if it makes sense to do so. There are two issues here regarding the queue log. It is possible for certain RINGNOANSWER log entries to not have a member name listed. Also, when using the ringall strategy, if no one answers, then the RINGNOANSWER entry generated will randomly have one of the member's names listed. While this may not sound like a huge deal, it has exposed a need for app_queue to change the way in which it manages which members it is calling. ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=94803 By: Digium Subversion (svnbot) 2007-12-26 16:17:17.000-0600 Repository: asterisk Revision: 94816 D team/mmichelson/queue-log-accuracy/ ------------------------------------------------------------------------ r94816 | mmichelson | 2007-12-26 16:17:17 -0600 (Wed, 26 Dec 2007) | 5 lines Deleting this branch. While working on it, I discovered a non-invasive way to get the desired result. As such, the fix can be applied to the 1.4 branch. I tested my method locally and will be writing a patch to place on issue ASTERISK-10997 for the reporter to test as well. ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=94816 By: Mark Michelson (mmichelson) 2007-12-26 16:27:47.000-0600 I've uploaded 11498v3.patch for testing. Please see if this fixes the issue. Thank you for being patient! By: Hannes Loubser (hloubser) 2008-01-02 06:52:53.000-0600 I tested the 11498v3 patch with around 750 calls and found no errors in the queue log. Thanks putnopvut for fixing the bug! By: Digium Subversion (svnbot) 2008-01-02 11:47:45.000-0600 Repository: asterisk Revision: 95890 U branches/1.4/apps/app_queue.c ------------------------------------------------------------------------ r95890 | mmichelson | 2008-01-02 11:47:44 -0600 (Wed, 02 Jan 2008) | 9 lines A change to improve the accuracy of queue logging in the case where a member does not answer during the specified timeout period. Prior to this change, there was a small chance that the member name recorded in this case would be blank. Also prior to this change, if using the ringall strategy, if no one answered the call during the specified timeout, the member name listed in the queue log would randomly be one of the members that was rung. (closes issue ASTERISK-10997, reported and tested by hloubser, patched by me) ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=95890 By: Digium Subversion (svnbot) 2008-01-02 12:02:23.000-0600 Repository: asterisk Revision: 95891 _U trunk/ U trunk/apps/app_queue.c ------------------------------------------------------------------------ r95891 | mmichelson | 2008-01-02 12:02:21 -0600 (Wed, 02 Jan 2008) | 17 lines Merged revisions 95890 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r95890 | mmichelson | 2008-01-02 11:51:22 -0600 (Wed, 02 Jan 2008) | 9 lines A change to improve the accuracy of queue logging in the case where a member does not answer during the specified timeout period. Prior to this change, there was a small chance that the member name recorded in this case would be blank. Also prior to this change, if using the ringall strategy, if no one answered the call during the specified timeout, the member name listed in the queue log would randomly be one of the members that was rung. (closes issue ASTERISK-10997, reported and tested by hloubser, patched by me) ........ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=95891 By: Digium Subversion (svnbot) 2008-01-02 12:13:04.000-0600 Repository: asterisk Revision: 95892 _U team/file/netsock2/ U team/file/netsock2/apps/app_osplookup.c U team/file/netsock2/apps/app_queue.c U team/file/netsock2/main/Makefile U team/file/netsock2/main/translate.c ------------------------------------------------------------------------ r95892 | file | 2008-01-02 12:13:03 -0600 (Wed, 02 Jan 2008) | 41 lines Merged revisions 95864,95888,95891 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r95864 | russell | 2008-01-02 12:20:26 -0400 (Wed, 02 Jan 2008) | 6 lines For some odd reason, the last set of libresample build changes from Kevin did not work for everyone, but it did for some. This set of changes makes trunk start again for those having problems. Instead of building libresample as a static library, it just links the object files in directly with the asterisk binary. ................ r95888 | qwell | 2008-01-02 13:38:02 -0400 (Wed, 02 Jan 2008) | 4 lines Update osplookup documentation to use commas instead of pipes. Closes issue ASTERISK-11139, patch by Laureano. ................ r95891 | mmichelson | 2008-01-02 14:05:57 -0400 (Wed, 02 Jan 2008) | 17 lines Merged revisions 95890 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r95890 | mmichelson | 2008-01-02 11:51:22 -0600 (Wed, 02 Jan 2008) | 9 lines A change to improve the accuracy of queue logging in the case where a member does not answer during the specified timeout period. Prior to this change, there was a small chance that the member name recorded in this case would be blank. Also prior to this change, if using the ringall strategy, if no one answered the call during the specified timeout, the member name listed in the queue log would randomly be one of the members that was rung. (closes issue ASTERISK-10997, reported and tested by hloubser, patched by me) ........ ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=95892 By: Digium Subversion (svnbot) 2008-01-02 13:43:12.000-0600 Repository: asterisk Revision: 95938 _U team/murf/utf8-whatif/ U team/murf/utf8-whatif/acinclude.m4 U team/murf/utf8-whatif/apps/app_osplookup.c U team/murf/utf8-whatif/apps/app_queue.c U team/murf/utf8-whatif/channels/chan_zap.c U team/murf/utf8-whatif/codecs/Makefile U team/murf/utf8-whatif/codecs/codec_resample.c U team/murf/utf8-whatif/configure U team/murf/utf8-whatif/configure.ac U team/murf/utf8-whatif/include/asterisk/autoconfig.h.in A team/murf/utf8-whatif/include/asterisk/libresample.h U team/murf/utf8-whatif/main/Makefile U team/murf/utf8-whatif/main/translate.c ------------------------------------------------------------------------ r95938 | murf | 2008-01-02 13:43:10 -0600 (Wed, 02 Jan 2008) | 58 lines Merged revisions 95864,95888,95891,95893-95894,95937 via svnmerge from https://origsvn.digium.com/svn/asterisk/trunk ................ r95864 | russell | 2008-01-02 09:20:26 -0700 (Wed, 02 Jan 2008) | 6 lines For some odd reason, the last set of libresample build changes from Kevin did not work for everyone, but it did for some. This set of changes makes trunk start again for those having problems. Instead of building libresample as a static library, it just links the object files in directly with the asterisk binary. ................ r95888 | qwell | 2008-01-02 10:38:02 -0700 (Wed, 02 Jan 2008) | 4 lines Update osplookup documentation to use commas instead of pipes. Closes issue ASTERISK-11139, patch by Laureano. ................ r95891 | mmichelson | 2008-01-02 11:05:57 -0700 (Wed, 02 Jan 2008) | 17 lines Merged revisions 95890 via svnmerge from https://origsvn.digium.com/svn/asterisk/branches/1.4 ........ r95890 | mmichelson | 2008-01-02 11:51:22 -0600 (Wed, 02 Jan 2008) | 9 lines A change to improve the accuracy of queue logging in the case where a member does not answer during the specified timeout period. Prior to this change, there was a small chance that the member name recorded in this case would be blank. Also prior to this change, if using the ringall strategy, if no one answered the call during the specified timeout, the member name listed in the queue log would randomly be one of the members that was rung. (closes issue ASTERISK-10997, reported and tested by hloubser, patched by me) ........ ................ r95893 | kpfleming | 2008-01-02 11:17:15 -0700 (Wed, 02 Jan 2008) | 2 lines umm... this did not compile on x86-64, and could not possibly have worked on any platform as it was passing string pointers to a function expecting ints ................ r95894 | kpfleming | 2008-01-02 11:21:04 -0700 (Wed, 02 Jan 2008) | 2 lines and now just to keep the libresample party going... if the functions from libresample are going to be in the main Asterisk binary, it makes sense for the header that defines them to be available without any special CFLAGS and to out-of-tree modules building against /usr/include/asterisk ................ r95937 | kpfleming | 2008-01-02 12:34:33 -0700 (Wed, 02 Jan 2008) | 4 lines improve AC_C_DEFINE_CHECK to not try to evaluate the macro being checked for, but just check for its existence finish implementation of check for Zaptel HWGAIN support add check for Zaptel ECHOCANCEL_PARAMS support ................ ------------------------------------------------------------------------ http://svn.digium.com/view/asterisk?view=rev&revision=95938 |