Add level=datastreams to Indices Stats #83049

Open
Tracked by #77466
original-brownbear opened this issue Jan 25, 2022 · 4 comments
Labels
:Data Management/Stats Statistics tracking and retrieval APIs >enhancement Team:Data Management Meta label for data/management team

Comments

@original-brownbear
Member

We want to add another level, "datastreams", to the indices stats API, in addition to the existing "shards" and "indices" levels.
This level would group and aggregate index stats by data stream instead of returning stats for each individual index. It would not otherwise change the response format: a data stream would simply appear to callers as a single index.
Indices covered by the request that are not part of a data stream would be included in the response the same way as they are at level "indices".
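
To make the proposed grouping concrete, here is a minimal Python sketch of the aggregation described above. It is not the Elasticsearch implementation: the function name `aggregate_by_data_stream`, the flat metric dictionaries, and the metric names are all assumptions chosen for illustration (the real stats response is nested and not every metric is a plain sum).

```python
from collections import defaultdict
from typing import Dict, Mapping


def aggregate_by_data_stream(
    index_stats: Mapping[str, Mapping[str, int]],
    index_to_data_stream: Mapping[str, str],
) -> Dict[str, Dict[str, int]]:
    """Sum per-index metrics into one entry per data stream.

    Hypothetical helper: assumes each index maps to a flat dict of
    additive numeric metrics. Indices without a data stream keep
    their own name as the key, as they would at level "indices".
    """
    aggregated: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for index_name, metrics in index_stats.items():
        # Fall back to the index name when the index is not a backing index.
        key = index_to_data_stream.get(index_name, index_name)
        for metric, value in metrics.items():
            aggregated[key][metric] += value
    return {name: dict(metrics) for name, metrics in aggregated.items()}


# Illustrative input only; names and numbers are made up.
example_stats = {
    ".ds-logs-000001": {"docs_count": 100, "store_size_bytes": 2000},
    ".ds-logs-000002": {"docs_count": 50, "store_size_bytes": 1000},
    "standalone-index": {"docs_count": 7, "store_size_bytes": 70},
}
example_mapping = {".ds-logs-000001": "logs", ".ds-logs-000002": "logs"}
example_result = aggregate_by_data_stream(example_stats, example_mapping)
```

With this input, the two backing indices collapse into a single `logs` entry, while `standalone-index` is passed through unchanged.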

This is motivated by scalability concerns around the response sizes generated by this API for very large clusters. In addition, the bandwidth used by monitoring, which calls this API every 10s or so for all indices, becomes problematic for large index counts.
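
As a rough illustration of the size argument, the sketch below counts how many top-level entries a response would carry at each level. The function name `response_entry_counts` and the example cluster layout are hypothetical, and entry count is only a proxy for response size in bytes.

```python
from typing import Iterable, Mapping, Tuple


def response_entry_counts(
    index_names: Iterable[str],
    index_to_data_stream: Mapping[str, str],
) -> Tuple[int, int]:
    """Return (entries at level "indices", entries at level "datastreams").

    Hypothetical helper: at the proposed level, every backing index of a
    data stream collapses into one entry, and standalone indices are
    still reported individually.
    """
    names = list(index_names)
    grouped_keys = {index_to_data_stream.get(name, name) for name in names}
    return len(names), len(grouped_keys)


# Made-up layout: 3 data streams with 30 backing indices each, plus 5 standalone indices.
layout = {
    f".ds-stream{s}-{i:06d}": f"stream{s}" for s in range(3) for i in range(1, 31)
}
all_indices = list(layout) + [f"standalone-{n}" for n in range(5)]
per_index_entries, per_stream_entries = response_entry_counts(all_indices, layout)
```

Under these assumptions the response shrinks from 95 entries to 8, and the saving grows with the number of backing indices per data stream.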

relates #77466

@original-brownbear original-brownbear added >enhancement :Data Management/Stats Statistics tracking and retrieval APIs labels Jan 25, 2022
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Jan 25, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@original-brownbear original-brownbear changed the title Add Level=Datastreams to Indices Stats Add level=datastreams to Indices Stats Jan 25, 2022
@matschaffer
Contributor

matschaffer commented Jan 25, 2022

Any rough plan on what the data structure would look like?

I'm curious how well this would shape up for things like spotting a shard whose heavy search activity is causing node trouble.

It seems that to do that we'd still need to publish information about which shards are on which nodes, which would basically be index-level.

At least for indexing operations, those would always be on the latest index for the data stream.

I don't know all the index metrics offhand, but it might be good to start by listing which ones we could avoid publishing for non-write indices taking part in a data stream. Then we could look at potential savings for expected workloads.

@original-brownbear
Member Author

> Any rough plan on what the data structure would look like?

The thinking was to have data streams display the way indices do today when level is set to indices, with no other changes to the structure. But now I'm starting to have questions about our current usage, see below ...

> It seems that to do that we'd still need to publish information about which shards are on which nodes, which would basically be index-level.

Before we go on I think it's important to understand the answer to this question:
Are we currently collecting shard level monitoring data for all indices in our monitoring tooling?
When we discussed this solution to the scalability issues of this API, we were working under the assumption that we are not using shard level stats.

@matschaffer
Contributor

matschaffer commented Jan 25, 2022

> Are we currently collecting shard level monitoring data for all indices in our monitoring tooling?

There are a couple of places in the stack monitoring UI where shards are called out (which nodes have which shards, recent shard migrations).

I'm not sure offhand just how detailed the data is, but I've definitely used the shard/node association to help pinpoint hot ingest situations in the past.
