DataTiersUsageTransportAction is incredibly inefficient in large clusters #100230
Closed
Tracked by #77466
Labels
>bug
:Data Management/Data streams
Data streams and their lifecycles
Team:Data Management
Meta label for data/management team
Comments
DaveCTurner added the >bug and :Data Management/Data streams labels on Oct 3, 2023
elasticsearchmachine added the Team:Data Management label on Oct 3, 2023
Pinging @elastic/es-data-management (Team:Data Management)
97 tasks
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue on Oct 4, 2023
This action invokes a subsidiary action but does not set up the proper parent/child relationship, so cancellations of the parent task do not propagate to the child. Relates elastic#100230
elasticsearchmachine pushed a commit that referenced this issue on Oct 4, 2023
This action invokes a subsidiary action but does not set up the proper parent/child relationship, so cancellations of the parent task do not propagate to the child. Relates #100230
Some not entirely dissimilar prior art (along the lines of a "dedicated TransportNodesAction which computes") in #100092, in case somebody is thinking of picking this up.
gmarouli added a commit that referenced this issue on Nov 10, 2023
gmarouli added a commit to gmarouli/elasticsearch that referenced this issue on Nov 10, 2023
…ividual nodes (elastic#100230) (elastic#101599)" This reverts commit ea2035d.
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this issue on Nov 13, 2023
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this issue on Nov 13, 2023
…ividual nodes (elastic#100230) (elastic#101599)" (elastic#102042) Reverting because the new action is not properly handled in a mixed cluster.
gmarouli added a commit to gmarouli/elasticsearch that referenced this issue on Nov 14, 2023
gmarouli added a commit that referenced this issue on Nov 15, 2023
2 tasks
elena-shostak added a commit to elastic/kibana that referenced this issue on Jun 19, 2024
…186370)
## Summary
Calls to `/_xpack/usage` in Elasticsearch do not perform well on large clusters. See elastic/elasticsearch#100230. Some users have reported timeouts on this request path.
Added a filter_path to the `/_xpack/usage` ES call to optimize the call.
### Checklist
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
### For maintainers
- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
__Fixes: https://github.com/elastic/kibana/issues/169449__
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
seanrathier pushed a commit to seanrathier/kibana that referenced this issue on Jun 21, 2024
…lastic#186370)
## Summary
Calls to `/_xpack/usage` in Elasticsearch do not perform well on large clusters. See elastic/elasticsearch#100230. Some users have reported timeouts on this request path.
Added a filter_path to the `/_xpack/usage` ES call to optimize the call.
### Checklist
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
### For maintainers
- [x] This was checked for breaking API changes and was [labeled appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
__Fixes: https://github.com/elastic/kibana/issues/169449__
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Today `DataTiersUsageTransportAction` executes an internal nodes stats action with all the trimmings:
elasticsearch/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/DataTiersUsageTransportAction.java
Lines 77 to 81 in c006d10
In a large cluster this implementation may need hundreds of MiB of heap on the coordinating node to hold onto every statistic about every shard on every node (several KiB per shard) even though we use almost none of them. Worse, the coordinating node is always the elected master because that's how `XPackUsageFeatureTransportAction` derivatives work. It also burns a bunch of CPU and network bandwidth just transporting these stats around the cluster. AFAICT we could push this computation out to the individual nodes with a dedicated `TransportNodesAction` which computes the tiny `TierSpecificStats` on each node in a manner that allows the coordinating node to combine them.
It also does not propagate cancellation down to the nodes stats task (addressed in #100253).
It also captures the cluster state when it's initiated and retains it until completion, which can represent another 100MiB+ of heap usage.
Relates #77466.
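To make the "compute on each node, combine on the coordinating node" idea concrete, here is a minimal sketch in plain Java. It does not use any Elasticsearch classes: `TierStatsSketch`, `TierSummary`, `LocalShard`, `summarizeNode` and `combine` are all hypothetical names, and the three summed fields are illustrative rather than the actual contents of `TierSpecificStats`. Anything that is not a plain sum (a median, say) would need a mergeable representation instead.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these types are Elasticsearch classes.
class TierStatsSketch {

    /** Small per-tier summary a node could compute locally (illustrative fields only). */
    static final class TierSummary {
        final int shardCount;
        final long docCount;
        final long sizeInBytes;

        TierSummary(int shardCount, long docCount, long sizeInBytes) {
            this.shardCount = shardCount;
            this.docCount = docCount;
            this.sizeInBytes = sizeInBytes;
        }

        /** Field-wise sum, so the order in which node responses arrive does not matter. */
        TierSummary plus(TierSummary other) {
            return new TierSummary(
                shardCount + other.shardCount,
                docCount + other.docCount,
                sizeInBytes + other.sizeInBytes);
        }
    }

    /** Stand-in for one shard held by a node (assumed shape, for illustration). */
    static final class LocalShard {
        final String tier;
        final long docCount;
        final long sizeInBytes;

        LocalShard(String tier, long docCount, long sizeInBytes) {
            this.tier = tier;
            this.docCount = docCount;
            this.sizeInBytes = sizeInBytes;
        }
    }

    /** Would run on each node: reduce its own shards to one small summary per tier. */
    static Map<String, TierSummary> summarizeNode(List<LocalShard> shards) {
        Map<String, TierSummary> byTier = new HashMap<>();
        for (LocalShard shard : shards) {
            byTier.merge(shard.tier,
                new TierSummary(1, shard.docCount, shard.sizeInBytes),
                TierSummary::plus);
        }
        return byTier;
    }

    /** Would run on the coordinating node: merge the small per-node maps. */
    static Map<String, TierSummary> combine(List<Map<String, TierSummary>> nodeResponses) {
        Map<String, TierSummary> total = new TreeMap<>();
        for (Map<String, TierSummary> response : nodeResponses) {
            response.forEach((tier, summary) -> total.merge(tier, summary, TierSummary::plus));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, TierSummary> nodeA = summarizeNode(List.of(
            new LocalShard("data_hot", 10, 100),
            new LocalShard("data_hot", 5, 50),
            new LocalShard("data_warm", 1, 7)));
        Map<String, TierSummary> nodeB = summarizeNode(List.of(
            new LocalShard("data_hot", 20, 200)));

        combine(List.of(nodeA, nodeB)).forEach((tier, s) ->
            System.out.println(tier + ": shards=" + s.shardCount
                + " docs=" + s.docCount + " bytes=" + s.sizeInBytes));
    }
}
```

With this shape each node ships one small object per tier rather than several KiB per shard, and the coordinating node only ever holds a handful of summaries at a time.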