-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Visualization collectors iterate over all visualizations 4 times #117367
Comments
Pinging @elastic/kibana-vis-editors (Team:VisEditors) |
@flash1293 i moved it to our 8.1 tasks as it is a performance problem but feel free to move it as you wish! |
Thanks for raising this. We can certainly collapse the 4 requests into one, but switching to aggregations will be more difficult as the information we are after is part of a JSON string which has to be parsed before it can be inspected (and there's no easy way to change this - for example we look into the vega spec which might be hjson). Would reducing the page size and sleeping for a few seconds between pages help as well with the performance problem? |
That would reduce the impact on Kibana, but keeping a PIT open for a long time is also expensive on Elasticsearch so I don't think it would really be an improvement overall. In one case a customer had 27000 visualizations and returning all of them was 27MB so every time usage collection happens we query for and need to parse > 100MB. On it's own this is no problem for Kibana/ES but together with all the other usage collection it adds up to a significant performance cost on a cluster that's maybe already busy. I think as a first step collapsing into a single query would already be a big improvement. Longer term maybe we can stop stringifying/parsing these fields see "JSON Fields" in #43673 |
There are four different visualization usage collectors. Each one of them iterates over all visualizations:
https://github.com/elastic/kibana/blob/main/src/plugins/vis_types/table/server/usage_collector/get_stats.ts#L49
https://github.com/elastic/kibana/blob/main/src/plugins/vis_types/timeseries/server/usage_collector/get_usage_collector.ts#L30
https://github.com/elastic/kibana/blob/main/src/plugins/vis_types/vega/server/usage_collector/get_usage_collector.ts#L74
https://github.com/elastic/kibana/blob/main/src/plugins/visualizations/server/usage_collector/get_usage_collector.ts#L40
When there are many pages of visualizations this becomes an expensive operation. Could we collect this usage data using a single query? Could we use an aggregation instead of iterating over all documents?
The text was updated successfully, but these errors were encountered: