Summary:ASTERISK-28701: app_queue: Core reload resets queue stats, even when keepstats=yes
Reporter:Luke Escude (lukeescude)Labels:bounty
Date Opened:2020-01-18 09:07:21.000-0600Date Closed:2021-08-25 18:34:09
Versions:16.7.0 Frequency of
Environment:Centos 7 x64Attachments:
Description:Hello - When keepstats=yes in queues.conf (under general), the overall stats for each queue are kept after issuing a core reload: Avg talk time, hold time, call quantity, etc.

However, the per-agent call stats (# calls taken, last call taken) get cleared out.

This seems to go against the purpose of the keepstats directive.

Expected results: Keepstats would keep stats ;)
Comments:By: Asterisk Team (asteriskteam) 2020-01-18 09:07:22.119-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

By: Joshua C. Colp (jcolp) 2020-01-18 10:58:59.758-0600

The "keepstats" option appears to have been removed over 10 years ago, so enabling or disabling it does nothing. I don't know if the current behavior is expected or not, but I would be concerned with changing it in case there are possible repercussions.

By: Luke Escude (lukeescude) 2020-01-18 11:12:23.581-0600

Hmmm what if a new option were added in, like reallykeepstats=yes, so current behavior is preserved by default unless this new option is supplied.

I can try and find some time to make a patch but it's unlikely I'll be able to anytime soon. I may have some time later today to actually identify the code responsible for clearing the member stats.

By: Joshua C. Colp (jcolp) 2020-01-21 05:00:00.948-0600

I think that'd be the best option.

By: Luke Escude (lukeescude) 2020-01-21 09:48:23.023-0600

Joshua, could it be on Line 11430 on queues.c - struct ast_flags mask = {AST_FLAGS_ALL & ~QUEUE_RESET_STATS,};... It appears to be including the queue reset stats flag, even though we're simply calling a reload.

Additionally Line 9436 calls ao2_callback(q->members, OBJ_NODATA, mark_member_dead, NULL); which might mark a member "dead" during a standard reload event.

I haven't written C++ since college, sorry haha

By: Luke Escude (lukeescude) 2020-01-21 09:54:11.317-0600

Apparently others in the community have the same issue: https://community.asterisk.org/t/why-does-cli-queue-reload-all-reset-queue-stats/69843/6

By: Luke Escude (lukeescude) 2020-01-21 10:55:58.939-0600

Alright, more diagnostics:

queue reload parameters - Good (no stats loss)
queue reload members - Good (no stats loss)
queue reload rules - Good (no stats loss)
queue reload all - Bad (all stats loss)

I haven't even tested queue reset stats but I imagine that resets stats as expected.

This is definitely not supposed to happen with a reload command, I can't imagine why someone would want that.

By: Luke Escude (lukeescude) 2020-01-21 10:58:56.891-0600

The community member is correct about config reload /etc/asterisk/queues.conf - This will safely reload the queues without resetting statistics.

So I'm wondering: Is it possible to issue a "core reload" but have it ignore queues, so that I can manually reload queues.conf?

By: Benjamin Keith Ford (bford) 2020-01-29 11:08:43.865-0600

There's no way to do a reload and leave some modules untouched. I agree that there would need to be some sort of option here that changes behavior.

Would you be willing to submit a patch[1] to Gerrit[2] if you have something in mind already? This ticket is on the verge of being either a bug or an improvement, so it's difficult to tell if it's something that would get worked on sooner rather than later.

[1]: https://wiki.asterisk.org/wiki/display/AST/Gerrit+Usage
[2]: https://gerrit.asterisk.org/q/status:open

By: Luke Escude (lukeescude) 2020-01-29 13:32:52.727-0600

I still can't find the code responsible for it... I'd absolutely get after it if I knew more C... If someone can take a look and come up with some ideas, I don't mind experimenting with it.

It's definitely a bug though, I can't imagine why anyone would only want to keep half the stats during a standard core reload.

By: Benjamin Keith Ford (bford) 2020-01-30 11:53:13.838-0600

I tried doing a {{core reload}} to reset the stats, but that does not appear to affect the queue stats for me. All of the queue stats were preserved. I checked the link you posted in an earlier comment, and did a {{queue reload all}} and that DID reset queue stats, which could be argued that it should not. However, you get your stats erased for members on a {{core reload}}?

By: Luke Escude (lukeescude) 2020-01-30 11:59:21.152-0600

Correct it keeps the overall stats for the queues, but resets the per-member stats. (Core reload).

By: Benjamin Keith Ford (bford) 2020-01-30 12:13:42.696-0600

Is there anything else of note that you can provide debugging wise, such as:
# Queue configuration?
# Static or realtime?
# Is this a fresh install of Asterisk or is this a problem that popped up after an upgrade?
# Are you running any custom patches on your system? Probably unlikely, but good to know.

By: Luke Escude (lukeescude) 2020-01-30 12:39:09.831-0600

Standard ring-all queue, nothing special on the config.

Static config using queues.conf

Fresh install always.

Yes we've got custom patches but those are for network-related stuff like sending a null packet prior to RTP to open NAT ports, or a null video packet for H264 stuff, etc.

By: Benjamin Keith Ford (bford) 2020-02-03 09:49:33.904-0600

My suggestion is to try an install that does not have any custom patches. There's a chance that something could get trampled on even if the patch seems to have nothing to do with where the problem is. If the problem still persists after that, we can try some alternatives.

By: Asterisk Team (asteriskteam) 2020-02-17 12:00:01.241-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidlines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

By: Luke Escude (lukeescude) 2020-02-17 16:09:27.508-0600

It appears as though our patches are not causing this to happen.

Core reload still resets half of the queues' stats (the member stats) but retains the overall stats.

By: Asterisk Team (asteriskteam) 2020-02-17 16:09:27.866-0600

This issue has been reopened as a result of your commenting on it as the reporter. It will be triaged once again as applicable.

By: George Joseph (gjoseph) 2020-02-18 12:26:53.961-0600

I'm looking at this in more detail.   {{core reload}} should _not_ be resetting stats as that code specifically excludes stats.  {{queue reload all}} _does_ reset the stats but shouldn't so I'll open an internal issue for that anyway.  If you're seeing {{core reload}} resetting the member stats then I'm wondering if it has something to do with the queue member config changing.  Stay tuned.

By: George Joseph (gjoseph) 2020-02-19 09:39:33.939-0600

Yeah OK.  It seems that if queues.conf is modified in any way, all member stats are reset whenever {{core reload}}, {{queue reload all}}, {{queue reload members}} are done.  It probably shouldn't do that.

By: Luke Escude (lukeescude) 2020-02-19 09:42:12.473-0600

Kickass! This'll be a great fix, right before a bunch of our customers are going to start using queues.

By: Joshua C. Colp (jcolp) 2020-02-19 09:43:40.127-0600

Be aware that there is no time frame on when or if this will get looked into.

By: Luke Escude (lukeescude) 2020-02-19 09:45:58.353-0600

No worries I figured as much - I'm just glad to have this on the roadmap, since so many people on the forums/online communities have been complaining about this thing for a while now.

By: George Joseph (gjoseph) 2020-02-19 10:03:20.388-0600

[~lukeescude] The fixes should be pretty straightforward if you want to take a crack at them. :)

There are actually 2 issues here and I was able to reproduce them with a simple 2 line queues.conf file and 2 phones.

member => PJSIP/1172

exten = 9998,1,Queue(queue1)

First issue...  {{queue reload all}} resets all stats when it shouldn't.
Call the queue from another phone and talk for a few seconds then...
{{queue show queue1}} should show stats for both the queue and the member.
{{core reload}} preserves the stats as it should.
{{queue reload members}}  preserves the stats as it should.
{{queue reload all}} resets all the stats which it SHOULDN'T.
This is because we're using the wrong flag mask in handle_queue_reload.  When "all" is specified, we pass AST_FLAGS_ALL to reload_handler but AST_FLAGS_ALL also resets the stats besides reloading.  We need to pass {{AST_FLAGS_ALL & ~QUEUE_RESET_STATS}} instead.

Second issue...  {{core reload}}, {{module reload app_queue.so}}, {{config reload queues.conf}} reset ALL member stats if anything in queues.conf has changed even if the change had nothing whatsoever to do with the specific queue or members.    When we do a config reload, we should check to see if the member already exists and carry over the stats instead of resetting them to zero.  We already carry over some data, just not the stats.

By: Luke Escude (lukeescude) 2021-08-17 13:00:20.001-0500

Hey guys, I'd like to post this up as a Bug Bounty - I'm not familiar with the asterisk-dev mailing list nor how to add to the bug bounty list... Any tips or documentation on this?

Basically we will gladly pay someone $599 to fix this issue.

By: Joshua C. Colp (jcolp) 2021-08-17 13:02:12.846-0500

The mailing list is at http://lists.digium.com/mailman/listinfo/asterisk-dev and the wiki contains information on bug bounties[1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Bug+Bounties

By: N A (InterLinked) 2021-08-20 14:11:47.305-0500

Hmm... this is interesting. Certainly explains some of the strange behavior I was noticing with app_queue: https://issues.asterisk.org/jira/browse/ASTERISK-29578?filter=-2

Strictly speaking, I don't think the 2 bugs are related, but I did notice this behavior and I just assumed it was intentional. It makes sense though that it's a bug.

I did notice that app_queue does not reload unless the file was modified, which explains why this bug only surfaced when it the modified timestamp had changed.

My two thoughts are that queue reload shouldn't check the modified timestamp, because the current behavior is buggy so a force reload should be possible without using "touch", which can ensure that the members can be force reloaded.

Might look into this issue as well in a little bit, if nobody else does.

Updated: Gerrit review: https://gerrit.asterisk.org/c/asterisk/+/16343

By: Friendly Automation (friendly-automation) 2021-08-25 18:34:11.138-0500

Change 16372 merged by Friendly Automation:
app_queue: Don't reset queue stats on reload


By: Friendly Automation (friendly-automation) 2021-08-25 18:34:31.272-0500

Change 16343 merged by Friendly Automation:
app_queue: Don't reset queue stats on reload


By: Friendly Automation (friendly-automation) 2021-08-25 18:35:41.238-0500

Change 16371 merged by Friendly Automation:
app_queue: Don't reset queue stats on reload


By: Friendly Automation (friendly-automation) 2021-08-30 07:26:59.708-0500

Change 16370 merged by George Joseph:
app_queue: Don't reset queue stats on reload