add index metrics #85

Merged

Contributor

@matsumana matsumana commented Aug 16, 2017

I would like to monitor index metrics, just like Kibana X-Pack Monitoring does,
so I added the 3 metrics below:

# HELP elasticsearch_indices_docs_primary Count of documents which only primary shards
# TYPE elasticsearch_indices_docs_primary gauge
elasticsearch_indices_docs_primary{cluster="test",index="foo_1"} 2
elasticsearch_indices_docs_primary{cluster="test",index="foo_2"} 3

# HELP elasticsearch_indices_store_size_bytes_primary Current total size of stored index data in bytes which only primary shards on all nodes
# TYPE elasticsearch_indices_store_size_bytes_primary gauge
elasticsearch_indices_store_size_bytes_primary{cluster="test",index="foo_1"} 8425
elasticsearch_indices_store_size_bytes_primary{cluster="test",index="foo_2"} 12420

# HELP elasticsearch_indices_store_size_bytes_total Current total size of stored index data in bytes which all shards on all nodes
# TYPE elasticsearch_indices_store_size_bytes_total gauge
elasticsearch_indices_store_size_bytes_total{cluster="test",index="foo_1"} 16850
elasticsearch_indices_store_size_bytes_total{cluster="test",index="foo_2"} 24840

dominikschulz previously approved these changes Aug 16, 2017
Contributor

@dominikschulz dominikschulz left a comment

LGTM

@matsumana
Contributor Author

Hello, Please review.

Collaborator

@metalmatze metalmatze left a comment

We should either remove all indices metrics from collector/nodes.go and move them into collector/indices.go, or the other way around. But like this it gets confusing.

Member

@zwopir zwopir left a comment

the metrics you are using from _all/_stats are all included in _nodes/_local/stats / _nodes/stats, which we already get from ES in collector/nodes.go. Some of the JSON fields are not (yet) used, but if you are not planning to use the detailed index stats, we don't need to make another HTTP call.

The additional call can still be useful if we evaluate the per-index metrics. I would then make the _all/_stats call optional.

So my suggestion is:

  • retrieve the metrics from this PR from the _nodes/stats endpoint, i.e. extend collector/nodes.go.
  • pick one (for now arbitrary) metric from the index endpoint and expose it as a Prometheus metric as a starting point for future work. We can then later extend indices.go to expose detailed indices metrics.

After merging this PR I would implement the opt-in/opt-out for sub collectors.

@matsumana
Contributor Author

matsumana commented Aug 23, 2017

Hello, Thank you for the comments.

> the metrics you are using from _all/_stats are all included in _nodes/_local/stats / _nodes/stats, which we already get from ES in collector/nodes.go.

Certainly, we can get the docs metrics from _nodes/_local/stats / _nodes/stats.
However, those count the docs of primary + replica shards. I want to collect the docs count of primary shards only.
Kibana X-Pack Monitoring appears to report it that way as well.
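
To illustrate the primary vs. primary + replica distinction: the _all/_stats response carries a `primaries` and a `total` section per index. The following is a minimal sketch, not the code from this PR. The struct names, the `decodeIndexStats` helper and the hand-written sample payload are assumptions for illustration; the store sizes for foo_1 come from the PR description, while the total docs count of 4 assumes one replica.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shardStats holds the two values discussed in this PR. The Go names are
// illustrative; only the JSON keys follow the _all/_stats response.
type shardStats struct {
	Docs struct {
		Count int64 `json:"count"`
	} `json:"docs"`
	Store struct {
		SizeInBytes int64 `json:"size_in_bytes"`
	} `json:"store"`
}

// indexStats separates primary shards from all shards (primary + replica).
type indexStats struct {
	Primaries shardStats `json:"primaries"`
	Total     shardStats `json:"total"`
}

type statsResponse struct {
	Indices map[string]indexStats `json:"indices"`
}

// decodeIndexStats is a hypothetical helper that parses a stats payload.
func decodeIndexStats(body []byte) (statsResponse, error) {
	var r statsResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

// sample is a hand-written payload, not captured from a real cluster.
const sample = `{"indices":{"foo_1":{
  "primaries":{"docs":{"count":2},"store":{"size_in_bytes":8425}},
  "total":{"docs":{"count":4},"store":{"size_in_bytes":16850}}}}}`

func main() {
	r, err := decodeIndexStats([]byte(sample))
	if err != nil {
		panic(err)
	}
	for name, s := range r.Indices {
		fmt.Printf("%s: primary docs=%d, primary bytes=%d, total bytes=%d\n",
			name, s.Primaries.Docs.Count, s.Primaries.Store.SizeInBytes, s.Total.Store.SizeInBytes)
	}
}
```

The node-level stats only expose the combined figure, which is why the per-index `primaries` section is needed for a primary-only count.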

@matsumana
Contributor Author

So I made the following changes:

  • moved all indices metrics from collector/nodes.go to collector/indices.go
  • gathered the metrics collection logic in metrics_collector, to avoid redundant API calls.

What do you think?

@dominikschulz dominikschulz dismissed their stale review August 23, 2017 14:27

need to have a look at the changes

@zwopir
Member

zwopir commented Aug 25, 2017

not sure if moving the index metrics to another call to the ES API is what we want. We need to call _nodes/_local/stats or, depending on the all flag, _nodes/stats for the nodes metrics anyway. I would like to make the indices metrics optional, since people might have many indices, which would create high metrics cardinality and much more JSON to transfer and parse.

What do you think, @dominikschulz ?

@dominikschulz
Contributor

Sounds reasonable. We should make indices metrics optional.

@matsumana
Contributor Author

@zwopir @dominikschulz
I agree with you.
Actually, I have heard that some companies run Elasticsearch clusters with several hundred indices.
I fixed this PR. Please review.

@dominikschulz
Contributor

@matsumana IMHO you're changing/moving a lot of unrelated code that shouldn't be touched by your PR.

@@ -220,49 +217,14 @@ func (c *ClusterHealth) Describe(ch chan<- *prometheus.Desc) {
ch <- c.jsonParseFailures.Desc()
}

func (c *ClusterHealth) fetchAndDecodeClusterHealth() (clusterHealthResponse, error) {
Contributor

this method should remain in this file

"github.com/go-kit/kit/log"
)

func TestClusterHealth(t *testing.T) {
Contributor

this method should remain in this file

indexStatsResponse *indexStatsResponse
}

func NewMetricsCollector(logger log.Logger, client *http.Client, url *url.URL, all bool, exportIndices bool) *MetricsCollector {
Contributor

We should init all metrics collector in main, not in another wrapper method.

c.indices.Collect(ch, clusterHealthResponse, nodeStatsResponse, indexStatsResponse)
}

func (c *ClusterHealth) fetchAndDecodeClusterHealth() (clusterHealthResponse, error) {
Contributor

this method should stay in its own file.

return chr, nil
}

func (c *Nodes) fetchAndDecodeNodeStats() (nodeStatsResponse, error) {
Contributor

this method should stay in its own file.

return
}

func TestClusterHealth(t *testing.T) {
Contributor

this method should stay in its own file.

}
}

func TestNodesStats(t *testing.T) {
Contributor

this method should stay in its own file.

@@ -984,51 +525,14 @@ func (c *Nodes) Describe(ch chan<- *prometheus.Desc) {
ch <- c.jsonParseFailures.Desc()
}

func (c *Nodes) fetchAndDecodeNodeStats() (nodeStatsResponse, error) {
Contributor

this method should remain in this file

"github.com/go-kit/kit/log"
)

func TestNodesStats(t *testing.T) {
Contributor

this method should remain in this file

}

-func (c *ClusterHealth) Collect(ch chan<- prometheus.Metric) {
+func (c *ClusterHealth) Collect(ch chan<- prometheus.Metric, clusterHealthResponse clusterHealthResponse) {
Member

this doesn't work. Collect must fulfill the prometheus.Collector interface, and thus the signature must be

Collect(ch chan<- prometheus.Metric)
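
The point about the signature can be sketched as follows. To keep the example free of external dependencies, `Desc`, `Metric` and `Collector` below are local stand-ins that only mirror the shape of the Prometheus client types; the `Indices` struct, its `fetch` field and `collectAll` are hypothetical. The idea shown is that anything a collector needs is held as a field, so `Collect` keeps the single-channel signature.

```go
package main

import "fmt"

// Desc, Metric and Collector are local stand-ins mirroring the shape of
// prometheus.Desc, prometheus.Metric and prometheus.Collector.
type Desc struct{ name string }

type Metric struct {
	name  string
	value float64
}

type Collector interface {
	Describe(ch chan<- *Desc)
	Collect(ch chan<- Metric)
}

// Indices is a hypothetical collector. Its data source is a field instead
// of an extra Collect parameter.
type Indices struct {
	desc  *Desc
	fetch func() (float64, error) // stand-in for the HTTP call to Elasticsearch
}

func (c *Indices) Describe(ch chan<- *Desc) { ch <- c.desc }

func (c *Indices) Collect(ch chan<- Metric) {
	v, err := c.fetch()
	if err != nil {
		return // a real collector would log and count the failure
	}
	ch <- Metric{name: c.desc.name, value: v}
}

// Compile-time check: this line stops compiling if Collect gains extra
// parameters, which is the problem pointed out in the review.
var _ Collector = (*Indices)(nil)

// collectAll drains everything a collector sends on the channel.
func collectAll(c Collector) []Metric {
	ch := make(chan Metric)
	go func() {
		c.Collect(ch)
		close(ch)
	}()
	var out []Metric
	for m := range ch {
		out = append(out, m)
	}
	return out
}

func main() {
	c := &Indices{
		desc:  &Desc{name: "elasticsearch_indices_docs_primary"},
		fetch: func() (float64, error) { return 2, nil },
	}
	for _, m := range collectAll(c) {
		fmt.Println(m.name, m.value)
	}
}
```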


c.clusterHealth.Collect(ch, clusterHealthResponse)
c.nodes.Collect(ch, nodeStatsResponse)
c.indices.Collect(ch, clusterHealthResponse, nodeStatsResponse, indexStatsResponse)
Member

wrapping collectors (or your custom non-interface-fulfilling Collect()) isn't a good idea. Doing so makes the collectors run sequentially, not in parallel. Also, the JSON retrieval works in parallel if you exclude it from the collectors.
In addition, making collectors optional is then just a matter of wrapping the prometheus.MustRegister() call in an if.

Could you please refer back to your initial intent for this PR: get the docs metrics of the primary shards. To achieve this, I would strongly suggest that you:

  • go back to the start (sorry)
  • implement the indices collector (fulfilling the collector interface) with just the very few metrics included you are interested in. We would really like to keep the overall index metrics we can get via _nodes/stats in the nodes collector. Please only export the structs marshaled from the _all/_stats that are not in _nodes/stats.
  • add this collector optionally with prometheus.MustRegister
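
The sequential vs. parallel argument above can be sketched with plain goroutines. This is only an illustration of the idea under stated assumptions: `collector`, `fakeCollector` and `gather` are hypothetical names, and `gather` merely mimics a registry that runs each registered collector in its own goroutine, which a single wrapping Collect would prevent.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collector is a local stand-in: each collector fetches its own data
// inside Collect and sends the results on the channel.
type collector interface {
	Collect(ch chan<- string)
}

type fakeCollector struct{ name string }

func (f fakeCollector) Collect(ch chan<- string) {
	// A real collector would call its own Elasticsearch endpoint here.
	ch <- f.name + "_up"
}

// gather runs every collector in its own goroutine and merges the output.
func gather(cs []collector) []string {
	ch := make(chan string)
	var wg sync.WaitGroup
	for _, c := range cs {
		wg.Add(1)
		go func(c collector) {
			defer wg.Done()
			c.Collect(ch)
		}(c)
	}
	// Close the channel once all collectors have finished.
	go func() {
		wg.Wait()
		close(ch)
	}()
	var out []string
	for m := range ch {
		out = append(out, m)
	}
	sort.Strings(out) // goroutine order is not deterministic
	return out
}

func main() {
	fmt.Println(gather([]collector{
		fakeCollector{"cluster_health"},
		fakeCollector{"nodes"},
		fakeCollector{"indices"},
	}))
}
```

Because every collector owns its fetch, a slow endpoint delays only that collector's metrics rather than the whole scrape.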

@matsumana
Contributor Author

matsumana commented Aug 28, 2017

@zwopir @dominikschulz
I fixed it.
Would you please review again?

Member

@zwopir zwopir left a comment

Hi Matsumana,

thanks for the update. It almost looks good. Two things I would ask you to change:

  • move the flag that controls whether indices metrics are scraped out to main.go
  • a copy'n'paste error in the indices Collect() func

"err", err,
)
return
}
Member

this clusterHealth block in the indices collector is probably a copy'n'paste error, isn't it?

client *http.Client
url *url.URL
all bool
exportIndices bool
Member

please move the bool controlling whether indices metrics are scraped out of the collector. There is no reason to register the metrics via Describe() and then not collect them. See my other comment in main.go.

func (c *Indices) fetchAndDecodeIndexStats() (indexStatsResponse, error) {
var isr indexStatsResponse

if c.exportIndices {
Member

move if statement to main.go

main.go Outdated
@@ -55,6 +56,7 @@ func main() {

prometheus.MustRegister(collector.NewClusterHealth(logger, httpClient, esURL))
prometheus.MustRegister(collector.NewNodes(logger, httpClient, esURL, *esAllNodes))
prometheus.MustRegister(collector.NewIndices(logger, httpClient, esURL, *esAllNodes, *esExportIndices))
Member

please replace by

if *esExportIndices {
    prometheus.MustRegister(collector.NewIndices(logger, httpClient, esURL, *esAllNodes))
}

(so remove the flag from the NewIndices constructor as well)
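
A minimal sketch of this opt-in registration, runnable without the Prometheus client: `registry` is a local stand-in that only records collector names, `buildRegistry` is a hypothetical helper, and the flag name `es.indices` is an assumption made for this example. The decision is taken once at registration time, so the indices collector itself needs no boolean.

```go
package main

import (
	"flag"
	"fmt"
)

// registry is a local stand-in for the Prometheus registry; it only
// records the names of the collectors that were registered.
type registry struct{ names []string }

func (r *registry) MustRegister(name string) { r.names = append(r.names, name) }

// buildRegistry parses the arguments and registers the collectors.
// The flag name "es.indices" is an assumption made for this sketch.
func buildRegistry(args []string) (*registry, error) {
	fs := flag.NewFlagSet("exporter", flag.ContinueOnError)
	exportIndices := fs.Bool("es.indices", false, "export stats for indices in the cluster")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	r := &registry{}
	r.MustRegister("cluster_health")
	r.MustRegister("nodes")
	// The opt-in lives here, not inside the indices collector.
	if *exportIndices {
		r.MustRegister("indices")
	}
	return r, nil
}

func main() {
	r, err := buildRegistry([]string{"-es.indices"})
	if err != nil {
		panic(err)
	}
	fmt.Println(r.names) // prints "[cluster_health nodes indices]"
}
```

With the flag left at its default, the indices collector is never registered, so its metrics are neither described nor collected.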

metric.Desc,
metric.Type,
metric.Value(indexStats),
metric.Labels(clusterHealthResponse.ClusterName, indexName)...,
Member

copy'n'paste error

Member

please review the labels you want to attach to the indices metrics. It seems there is no cluster name available.

@matsumana
Contributor Author

@zwopir Thank you for the comments.
I fixed it.
Would you please review again?

@zwopir
Member

zwopir commented Aug 29, 2017

looks good to me, thanks for your contribution!

@zwopir zwopir merged commit 46d7b17 into prometheus-community:master Aug 29, 2017
@matsumana matsumana deleted the feature/add-index-metrics branch September 7, 2017 09:56