Add level=datastreams to Indices Stats #83049

original-brownbear · 2022-01-25T14:18:27Z

We want to add another level "datastreams" in addition to the existing "shards" and "indices" to the indices stats API.
This level would group+aggregate index-stats by data-stream instead of returning stats by individual index. It would not otherwise change the response format and a datastream would simply appear as a single index to callers.
Indices covered by the request that are not part of a datastream would be included in the response the same way as they are at level "indices".
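
To make the proposed grouping concrete, here is a minimal sketch of how per-index stats could be rolled up by data stream. It is not the Elasticsearch implementation: the function names `merge_stats` and `aggregate_by_data_stream`, the backing-index-to-data-stream mapping, and the sample index names are all assumptions made for illustration. The stats are a simplified stand-in for the real response, and summing is only appropriate for counter-like metrics.

```python
from typing import Any, Dict

Stats = Dict[str, Any]


def merge_stats(total: Stats, part: Stats) -> Stats:
    """Recursively add the numeric leaves of `part` into `total` (in place)."""
    for key, value in part.items():
        if isinstance(value, dict):
            merge_stats(total.setdefault(key, {}), value)
        else:
            total[key] = total.get(key, 0) + value
    return total


def aggregate_by_data_stream(
    index_stats: Dict[str, Stats],
    backing_index_to_stream: Dict[str, str],
) -> Dict[str, Stats]:
    """Group per-index stats by data stream (hypothetical helper).

    Indices that are not part of a data stream keep their own entry,
    mirroring what level=indices would return for them.
    """
    grouped: Dict[str, Stats] = {}
    for index_name, stats in index_stats.items():
        group = backing_index_to_stream.get(index_name, index_name)
        merge_stats(grouped.setdefault(group, {}), stats)
    return grouped


# Hypothetical sample data: two backing indices of one data stream
# plus one standalone index.
index_stats = {
    ".ds-logs-000001": {"docs": {"count": 100}, "store": {"size_in_bytes": 2048}},
    ".ds-logs-000002": {"docs": {"count": 50}, "store": {"size_in_bytes": 1024}},
    "standalone": {"docs": {"count": 7}, "store": {"size_in_bytes": 512}},
}
backing_index_to_stream = {
    ".ds-logs-000001": "logs",
    ".ds-logs-000002": "logs",
}

result = aggregate_by_data_stream(index_stats, backing_index_to_stream)
```

In this sketch the two backing indices collapse into a single `logs` entry while `standalone` passes through unchanged, so a caller sees the data stream as if it were one index.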

This is motivated by scalability concerns around the response sizes generated by this API for very large clusters. The bandwidth used by monitoring when calling this API every 10s or so for all indices also becomes problematic for large index counts.

relates #77466

elasticmachine · 2022-01-25T14:18:29Z

Pinging @elastic/es-data-management (Team:Data Management)

matschaffer · 2022-01-25T15:06:37Z

Any rough plan on what the data structure would look like?

I'm curious how well this would shape up for things like spotting a shard whose heavy search activity is causing node trouble.

Seems like to do that we'd still need to publish information about which shards are on what nodes, which would be basically index-level.

At least for indexing operations, those would always be on the latest index for the data stream.

I don't know all the index metrics offhand, but it might be good to start by listing which ones we could avoid publishing for non-write indices taking part in a data stream. Then we could look at potential savings for expected workloads.

original-brownbear · 2022-01-25T15:43:49Z

> Any rough plan on what the data structure would look like?

The thinking was to just have data streams display the way indices do today when level is set to indices, with no other changes to the structure. But now I'm starting to have questions about our current usage below ...

> Seems like to do that we'd still need to publish information about which shards are on what nodes, which would be basically index-level.

Before we go on I think it's important to understand the answer to this question:
Are we currently collecting shard level monitoring data for all indices in our monitoring tooling?
When we discussed solutions to the scalability issues of this API, we were working under the assumption that we are not using shard-level stats.

matschaffer · 2022-01-25T16:05:51Z

> Are we currently collecting shard level monitoring data for all indices in our monitoring tooling?

There are a couple of places in the stack monitoring UI where we have shards called out (which nodes have which shards, recent shard migrations).

I'm not sure offhand just how detailed the data is but I've definitely used the shard/node association to help pinpoint hot ingest situations in the past.

original-brownbear added >enhancement :Data Management/Stats Statistics tracking and retrieval APIs labels Jan 25, 2022

elasticmachine added the Team:Data Management Meta label for data/management team label Jan 25, 2022

original-brownbear changed the title ~~Add Level=Datastreams to Indices Stats~~ Add level=datastreams to Indices Stats Jan 25, 2022

original-brownbear mentioned this issue Jan 25, 2022

Fix Large Shard Count Scalability Issues #77466

Open

97 tasks
