Consistency in Elastic stack metricsets' code #8308

Merged
merged 30 commits into elastic:master from the metricbeat-xpack-log-errors branch
Oct 25, 2018

@ycombinator ycombinator commented Sep 13, 2018

This PR makes error messages, reporting, and logging consistent across all metricsets for Elastic stack products, including the xpack monitoring code paths:

  • elasticsearch/ccr
  • elasticsearch/cluster_stats
  • elasticsearch/index
  • elasticsearch/index_recovery
  • elasticsearch/index_summary
  • elasticsearch/ml_job
  • elasticsearch/node
  • elasticsearch/node_stats
  • elasticsearch/pending_tasks
  • elasticsearch/shard
  • kibana/status
  • kibana/stats
  • logstash/node
  • logstash/node_stats

@ycombinator ycombinator added the in progress, Metricbeat, needs_backport, v7.0.0-alpha1, and v6.5.0 labels Sep 13, 2018
@ycombinator ycombinator force-pushed the metricbeat-xpack-log-errors branch 2 times, most recently from c2bbcae to 5b22496 on September 13, 2018 20:31
@@ -57,25 +56,35 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
 func (m *MetricSet) Fetch(r mb.ReporterV2) {
 	isMaster, err := elasticsearch.IsMaster(m.HTTP, m.HostData().SanitizedURI+clusterStatsPath)
 	if err != nil {
-		r.Error(fmt.Errorf("Error fetching master info: %s", err))
+		msg := errors.Wrap(err, "error determining if connected Elasticsearch node is master.")
Member

Error strings should not end with punctuation. See https://github.com/golang/go/wiki/CodeReviewComments#error-strings.
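As a minimal sketch of the convention referenced above: a wrapped error message starts lowercase and carries no trailing punctuation, so it composes cleanly when further context is prepended. To stay self-contained, this uses the standard library's `fmt.Errorf` with `%w` instead of `errors.Wrap` from `github.com/pkg/errors` as used in the PR; `isMaster`, `checkMaster`, and `errConnRefused` are hypothetical names for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// errConnRefused is a hypothetical low-level failure used for illustration.
var errConnRefused = errors.New("connection refused")

// isMaster is a hypothetical stand-in for a call that can fail.
func isMaster() (bool, error) {
	return false, errConnRefused
}

// checkMaster adds context to the underlying error. The message has no
// trailing period, so the joined text reads as one sentence fragment.
func checkMaster() error {
	if _, err := isMaster(); err != nil {
		return fmt.Errorf("error determining if connected Elasticsearch node is master: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(checkMaster())
}
```

With a trailing period in the wrapping message, the combined text would contain a stray `.:` in the middle, which is the practical reason for the guideline.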

return &MetricSet{
base,
http,
config.XPack,
log,
Member

You might as well inline the logp.NewLogger statement.

@@ -92,11 +92,15 @@ func (m *MetricSet) Fetch(r mb.ReporterV2) {
 	}

 	if m.MetricSet.XPack {
-		eventsMappingXPack(r, m, content)
+		err = eventsMappingXPack(r, m, content)
Member

This error is never reported. Is that intentional?

Contributor Author

Yeah, we don't want to report X-Pack errors because they would end up in the metricbeat-* index, not the .monitoring-* index which is where all the "success" documents go.

Contributor Author

Obviously this is something we want to change at some point, but for now we're just trying to achieve parity with existing documents in .monitoring-*.

} else {
err = eventsMapping(r, content)
if err != nil {
r.Error(err)
}
}

if err != nil {
m.Log.Error(err)
Member

Why are the reporting and logging handled separately?

Contributor Author

Mainly because we want to log both X-Pack and non-X-Pack errors but report only the non-X-Pack ones. I agree, it's not very pleasant.
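The split described in this thread can be sketched roughly as follows. This is not the Metricbeat code: `sink`, `fetch`, and `errMapping` are hypothetical stand-ins for the reporter, the logger, and the metricset's `Fetch` method, reduced to the control flow only.

```go
package main

import (
	"errors"
	"fmt"
)

// errMapping is a hypothetical mapping failure used for illustration.
var errMapping = errors.New("failure parsing response")

// sink is a hypothetical stand-in for both the metricset reporter and the
// logger; it only records the errors it receives.
type sink struct{ errs []error }

func (s *sink) Error(err error) { s.errs = append(s.errs, err) }

// fetch sketches the flow discussed above: every mapping error is logged,
// but only errors from the non-X-Pack path are reported as events.
func fetch(xpack bool, mapEvents func() error, reporter, logger *sink) {
	err := mapEvents()
	if err == nil {
		return
	}
	if !xpack {
		reporter.Error(err)
	}
	logger.Error(err)
}

func main() {
	failing := func() error { return errMapping }

	reporter, logger := &sink{}, &sink{}
	fetch(true, failing, reporter, logger)  // X-Pack path: logged only
	fetch(false, failing, reporter, logger) // regular path: reported and logged
	fmt.Println(len(reporter.errs), len(logger.errs))
}
```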

@@ -20,6 +20,11 @@ package index_recovery
import (
"encoding/json"

"github.com/elastic/beats/metricbeat/helper/elastic"
Member

This import should be grouped with the other github.com/elastic imports.

err = eventsMapping(r, *info, content)
if err != nil {
r.Error(err)
m.Log.Error(err)
Member

This seems like a common pattern. I wonder if it would make sense to have the Metricbeat framework log any error that is reported by a metricset.

Contributor Author

@ycombinator ycombinator Sep 13, 2018

I was wondering this too, but I have a fairly narrow and relatively recent view of Metricbeat. I'll defer to folks who've worked more broadly in it than I have for their opinion: @jsoriano @ruflin, your thoughts?

Contributor

It would definitely be nice to make this more generic.

Best in a follow up PR.
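One possible shape for the generic approach suggested in this thread is a reporter wrapper that logs whatever is reported. The sketch below is an assumption about how that could look, not an existing Metricbeat API: `Reporter`, `memReporter`, `logBuffer`, and `loggingReporter` are all hypothetical types.

```go
package main

import (
	"errors"
	"fmt"
)

// errParse is a hypothetical failure used for illustration.
var errParse = errors.New("failure parsing response")

// Reporter is a hypothetical, reduced version of a metricset reporter.
type Reporter interface {
	Error(err error)
}

// memReporter records reported errors in memory.
type memReporter struct{ errs []error }

func (m *memReporter) Error(err error) { m.errs = append(m.errs, err) }

// logBuffer is a hypothetical logger that keeps formatted lines in memory.
type logBuffer struct{ lines []string }

func (l *logBuffer) Errorf(format string, args ...interface{}) {
	l.lines = append(l.lines, fmt.Sprintf(format, args...))
}

// loggingReporter wraps a Reporter so that every reported error is also
// logged, which removes the need for each metricset to do both.
type loggingReporter struct {
	next Reporter
	log  *logBuffer
}

func (r loggingReporter) Error(err error) {
	r.log.Errorf("metricset error: %v", err)
	r.next.Error(err)
}

func main() {
	inner, buf := &memReporter{}, &logBuffer{}
	r := loggingReporter{next: inner, log: buf}
	r.Error(errParse)
	fmt.Println(len(inner.errs), buf.lines)
}
```

A wrapper like this would not cover the X-Pack path discussed earlier, where errors are logged but deliberately not reported.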

 		return
 	}

 	if m.MetricSet.XPack {
-		eventMappingXPack(r, m, content)
+		err = eventMappingXPack(r, m, content)
Member

This error is never reported. Is that intentional?

Contributor

Yes, the xpack data structure cannot make sense of the errors.

@@ -32,6 +32,9 @@ import (
// Global clusterIdCache. Assumption is that the same node id never can belong to a different cluster id
var clusterIDCache = map[string]string{}

// ModuleName is the ame of this module
Member

s/ame/name/

And godocs should have proper punctuation. Wouldn't it be nicer if the descriptions here ended in punctuation? 😄

@@ -58,16 +59,14 @@ func eventsMapping(r mb.ReporterV2, info elasticsearch.Info, content []byte) err

 	err := json.Unmarshal(content, &indicesStruct)
 	if err != nil {
-		r.Error(err)
-		return err
+		return errors.Wrap(err, "failure parsing Elasticsearch Stats API response")
 	}

var errors multierror.Errors
for name, index := range indicesStruct.Indices {
event := mb.Event{}
event.MetricSetFields, err = schema.Apply(index)
Member

The event can also contain an error. I think it would make sense to store/report the error alongside the data that it's associated with.

Contributor

I think this is now solved with the more recent commits?

@ycombinator
Contributor Author

Good stuff, @andrewkroh. Appreciate the early feedback so I don't have to make the same fixes in a number of places later 😄!

@ycombinator ycombinator force-pushed the metricbeat-xpack-log-errors branch from b453a84 to ceed913 on September 18, 2018 15:46
@ycombinator ycombinator changed the title from "Consistency in Elastic stack metricsets' error handling" to "Consistency in Elastic stack metricsets' code" Sep 18, 2018
@ycombinator
Contributor Author

CI is failing because of an unrelated flaky test: #8208

@ycombinator ycombinator force-pushed the metricbeat-xpack-log-errors branch 2 times, most recently from 6a8c780 to a29be7d on September 27, 2018 20:26
@ycombinator
Contributor Author

ycombinator commented Sep 29, 2018

@ruflin @andrewkroh The CI intake job is failing with this error:

18:08:40 ./module/elasticsearch/elasticsearch_integration_test.go
18:08:40 Code differs from goimports' style ^
18:08:40 ../libbeat/scripts/Makefile:115: recipe for target 'check' failed

I checked the imports in elasticsearch_integration_test.go and I'm not seeing anything odd. I didn't even touch the imports in that file in this PR 😄 Could you help me figure out what I'm missing? Thanks.

@ruflin
Contributor

ruflin commented Oct 1, 2018

Can you try to run make fmt to see if you get a diff?

ycombinator added a commit that referenced this pull request Oct 1, 2018
…#8497)

This PR makes the Metricbeat `logstash` module code consistent with other Elastic stack modules by:

* using `FetchContent()` instead of `FetchJSON()` to fetch data from LS APIs before passing it to the event mapping function.
* adding unit tests.

Further consistency around event reporting and error handling will be achieved in #8308.
@ycombinator ycombinator force-pushed the metricbeat-xpack-log-errors branch from ae9e1e8 to 189aefd on October 1, 2018 19:02
@ycombinator
Contributor Author

> Can you try to run make fmt to see if you get a diff?

@ruflin I did get a diff, but it was in a file completely unrelated to this PR and to the error shown by the intake job in CI. I still tried committing that diff in da30d94, but it didn't help.

@ruflin
Contributor

ruflin commented Oct 1, 2018

Which go version do you have locally?

@ycombinator
Contributor Author

1.11 😬 Let me try downgrading to 1.10.

@ruflin
Contributor

ruflin commented Oct 1, 2018

Yes, 1.10 produces different indentation, as I learned today.
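Because formatting output depends on the Go toolchain version, it can help to check a file against the formatter of the toolchain actually in use. The sketch below uses the standard library's `go/format` package, which applies gofmt-style formatting; it is only an approximation of the CI check, which runs goimports and also manages import grouping. `isFormatted` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
)

// isFormatted reports whether src is already in canonical gofmt form for
// the Go toolchain this program was built with. A different Go version may
// give a different answer for the same input.
func isFormatted(src []byte) (bool, error) {
	out, err := format.Source(src)
	if err != nil {
		return false, err
	}
	return bytes.Equal(src, out), nil
}

func main() {
	messy := []byte("package p\nfunc  f( ) {}\n")
	ok, err := isFormatted(messy)
	fmt.Println(ok, err)
}
```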

@ycombinator ycombinator force-pushed the metricbeat-xpack-log-errors branch from b5b85d3 to 9824433 on October 2, 2018 03:48
@ycombinator ycombinator added the review label and removed the in progress label Oct 2, 2018
@ycombinator ycombinator force-pushed the metricbeat-xpack-log-errors branch from b38e397 to ab3a9ce on October 23, 2018 07:51
@ycombinator
Contributor Author

@ruflin The other PR (#8592) has been merged into master, and this PR is now rebased on master. Ready for your 👀 again.

Contributor

@ruflin ruflin left a comment

I like the error logging / reporting model we have now in the Elasticsearch and Kibana modules. We should probably make that our standard when changing other modules.

 		event.MetricSetFields, err = schema.Apply(shard)
 		if err != nil {
-			errors = append(errors, err)
+			event.Error = errors.Wrap(err, "failure applying shard schema")
Contributor

I like that we report partial events even on failure. We should probably document this somewhere.
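A rough sketch of the partial-event behaviour mentioned above: when applying the schema fails, the error is attached to the event and the event is still emitted with whatever fields were mapped. `Event`, `applySchema`, and `mapShards` are hypothetical, simplified stand-ins for `mb.Event`, `schema.Apply`, and the event mapping function, and the standard library's `%w` wrapping replaces `errors.Wrap`.

```go
package main

import (
	"fmt"
)

// Event is a hypothetical, reduced version of a Metricbeat event: it can
// carry data and an error at the same time.
type Event struct {
	Fields map[string]interface{}
	Error  error
}

// applySchema is a hypothetical schema step. It copies the fields it knows
// and returns an error, along with the partial result, if "state" is missing.
func applySchema(raw map[string]interface{}) (map[string]interface{}, error) {
	out := map[string]interface{}{}
	if v, ok := raw["index"]; ok {
		out["index"] = v
	}
	state, ok := raw["state"]
	if !ok {
		return out, fmt.Errorf("key state not found")
	}
	out["state"] = state
	return out, nil
}

// mapShards emits one event per shard. A schema failure is stored on the
// event it belongs to instead of aborting the whole batch.
func mapShards(shards []map[string]interface{}) []Event {
	events := make([]Event, 0, len(shards))
	for _, shard := range shards {
		var event Event
		var err error
		event.Fields, err = applySchema(shard)
		if err != nil {
			event.Error = fmt.Errorf("failure applying shard schema: %w", err)
		}
		events = append(events, event)
	}
	return events
}

func main() {
	events := mapShards([]map[string]interface{}{
		{"index": "a", "state": "STARTED"},
		{"index": "b"}, // missing "state"
	})
	for _, e := range events {
		fmt.Println(e.Fields, e.Error)
	}
}
```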

@@ -38,23 +41,23 @@ func eventsMappingXPack(r mb.ReporterV2, m *MetricSet, content []byte) error {
 	for indexName, indexData := range data {
 		indexData, ok := indexData.(map[string]interface{})
 		if !ok {
-			continue
+			return fmt.Errorf("%v is not a map", indexName)
Contributor

Let's move forward with the way it is at the moment and fix it if it breaks.
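The trade-off in this hunk is that a single malformed entry now aborts the loop with an error instead of being skipped. A minimal sketch of that behaviour, with `checkIndices` as a hypothetical reduction of the mapping loop:

```go
package main

import "fmt"

// checkIndices walks a decoded JSON object and fails on the first entry
// whose value is not itself a JSON object, rather than silently skipping it.
func checkIndices(data map[string]interface{}) error {
	for indexName, indexData := range data {
		if _, ok := indexData.(map[string]interface{}); !ok {
			return fmt.Errorf("%v is not a map", indexName)
		}
	}
	return nil
}

func main() {
	data := map[string]interface{}{"my-index": "unexpected string"}
	fmt.Println(checkIndices(data))
}
```

Since Go map iteration order is not fixed, which entry triggers the error is not deterministic when several entries are malformed.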

@ycombinator ycombinator merged commit 3b2399d into elastic:master Oct 25, 2018
@ycombinator ycombinator deleted the metricbeat-xpack-log-errors branch October 25, 2018 21:23
@ycombinator ycombinator added the v6.6.0 label and removed the needs_backport label Oct 25, 2018
ycombinator added a commit that referenced this pull request Nov 5, 2018
…de (#8755)

Cherry-pick of PR #8308 to 6.x branch. Original message: 

This PR makes error messages, reporting, and logging consistent across all metricsets for Elastic stack products, including the xpack monitoring code paths:

* [x] `elasticsearch/ccr`
* [x] `elasticsearch/cluster_stats`
* [x] `elasticsearch/index`
* [x] `elasticsearch/index_recovery`
* [x] `elasticsearch/index_summary`
* [x] `elasticsearch/ml_job`
* [x] `elasticsearch/node`
* [x] `elasticsearch/node_stats`
* [x] `elasticsearch/pending_tasks`
* [x] `elasticsearch/shard`
* [x] `kibana/status`
* [x] `kibana/stats`
* [x] `logstash/node`
* [x] `logstash/node_stats`
ycombinator added a commit that referenced this pull request Dec 19, 2018
The cleanup in #8308 introduced a regression which caused the `xpack.enabled` flag in the `kibana` module to stop having any effect.

This PR fixes that issue by reverting the troublesome bits from #8308, ensuring that the struct field for the parsed configuration setting is exported.
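The kind of regression described here can be illustrated with any reflection-based decoder in Go, since reflection cannot set unexported struct fields. The sketch below uses `encoding/json` as a stand-in for the Beats configuration unpacking, which is an assumption made to keep the example self-contained; `exportedConfig`, `unexportedConfig`, and `parse` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// exportedConfig has an exported field, so the decoder can populate it.
type exportedConfig struct {
	XPack bool
}

// unexportedConfig has the same field unexported; the decoder silently
// ignores it and the value stays at its zero value.
type unexportedConfig struct {
	xpack bool
}

// parse decodes the same input into both structs and returns the resulting
// flag values.
func parse(raw string) (exported, unexported bool, err error) {
	var a exportedConfig
	var b unexportedConfig
	if err = json.Unmarshal([]byte(raw), &a); err != nil {
		return false, false, err
	}
	if err = json.Unmarshal([]byte(raw), &b); err != nil {
		return false, false, err
	}
	return a.XPack, b.xpack, nil
}

func main() {
	fmt.Println(parse(`{"XPack": true}`))
}
```

No error is returned in the unexported case, which is why a setting can stop having any effect without anything failing loudly.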
ycombinator added a commit that referenced this pull request Dec 19, 2018
Cherry-pick of PR #9642 to 6.x branch. Original message: 

The cleanup in #8308 introduced a regression which caused the `xpack.enabled` flag in the `kibana` module to stop having any effect.

This PR fixes that issue by reverting the troublesome bits from #8308, ensuring that the struct field for the parsed configuration setting is exported.