count 429s in elasticsearch output #8056

graphaelli · 2018-08-22T21:22:53Z

As a libbeat user, I'd like more help tracking down when elasticsearch is the bottleneck in event ingestion and could use configuration tuning or more nodes. A rise in 429s can provide some indication of this situation.

This change adds a new observable method ErrTooMany and corresponding monitoring metric output.events.toomany that is only reported by the elasticsearch output.
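For illustration, here is a minimal sketch of the kind of per-item counting this implies. It assumes that 429 items are tallied separately and still treated as retryable failures, while other 4xx items are non-indexable. The names `bulkStats` and `classify` are hypothetical and not taken from the PR; only `http.StatusTooManyRequests` and the `nonIndexable` counter appear in the diff.

```go
package main

import (
	"fmt"
	"net/http"
)

// bulkStats is a hypothetical tally of per-item outcomes for one bulk response.
type bulkStats struct {
	acked, nonIndexable, fails, tooMany int
}

// classify sorts per-item HTTP status codes into the tally.
// Assumption: 429 items are counted and retried like server errors,
// while every other status below 500 is a hard, non-retryable failure.
func classify(statuses []int) bulkStats {
	var s bulkStats
	for _, status := range statuses {
		switch {
		case status < 300:
			// Item was indexed successfully.
			s.acked++
		case status == http.StatusTooManyRequests:
			// Elasticsearch is pushing back: count it, then retry later.
			s.tooMany++
			s.fails++
		case status < 500:
			// Other client-side errors will not succeed on retry.
			s.nonIndexable++
		default:
			// 5xx: temporary server-side failure, retry.
			s.fails++
		}
	}
	return s
}

func main() {
	s := classify([]int{200, 201, 429, 400, 503, 429})
	fmt.Printf("%+v\n", s)
}
```

A rising `tooMany` value relative to `acked` would then be the signal that Elasticsearch, rather than the Beat, is the bottleneck.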

Eventually, a distribution of time spent in queue (ack time - receive time) would be useful across more outputs, but this is a start.

graphaelli · 2018-08-22T21:23:17Z

Open to suggestions on how to test this.

ph · 2018-08-22T22:08:47Z

@graphaelli I am +1 on tracking 429s at the output level. Concerning testing, sadly I don't think we test any of the stats at the client level. I would probably take the following strategy:

Implement a mock http to control the response.
Create a spy observer and do the assert there?
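The strategy above could be sketched roughly as follows, using `net/http/httptest` for the mock and a small spy type for the assertion. This is not the test that eventually landed in the PR; `spyObserver`, `publish`, `newMockES`, and `runScenario` are hypothetical names, and the canned body only mimics the `items[].<action>.status` part of a bulk response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// spyObserver is a hypothetical spy that records ErrTooMany calls.
type spyObserver struct{ tooMany int }

func (s *spyObserver) ErrTooMany(n int) { s.tooMany += n }

// bulkResponse decodes only the per-item status of a bulk reply.
type bulkResponse struct {
	Items []map[string]struct {
		Status int `json:"status"`
	} `json:"items"`
}

// publish is a hypothetical, minimal stand-in for the output client:
// it posts one bulk request and reports 429 items to the observer.
func publish(url string, obs *spyObserver) error {
	resp, err := http.Post(url+"/_bulk", "application/x-ndjson", strings.NewReader("{}\n"))
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var br bulkResponse
	if err := json.NewDecoder(resp.Body).Decode(&br); err != nil {
		return err
	}
	n := 0
	for _, item := range br.Items {
		for _, res := range item {
			if res.Status == http.StatusTooManyRequests {
				n++
			}
		}
	}
	obs.ErrTooMany(n)
	return nil
}

// newMockES returns a server that always answers with one accepted
// item and two items rejected with status 429.
func newMockES() *httptest.Server {
	const body = `{"errors":true,"items":[` +
		`{"index":{"status":201}},` +
		`{"index":{"status":429}},` +
		`{"index":{"status":429}}]}`
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, body)
	}))
}

// runScenario wires the mock and the spy together and returns the
// number of 429 items the spy saw.
func runScenario() (int, error) {
	srv := newMockES()
	defer srv.Close()

	obs := &spyObserver{}
	err := publish(srv.URL, obs)
	return obs.tooMany, err
}

func main() {
	n, err := runScenario()
	if err != nil {
		panic(err)
	}
	fmt.Println("429 items observed:", n)
}
```

In a real test the assertion would sit on the spy's counter after the publish call, so the response body is the only thing that needs to change between cases.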

ruflin · 2018-08-23T11:23:01Z

I think it would be super helpful to have these values also in stack monitoring. If you agree, could you open a meta issue here? https://github.com/elastic/stack-monitoring

graphaelli · 2018-08-24T20:39:44Z

@ph I started adding that and found some existing code that actually works for this. 97cb68c up for review.

ph

LGTM. @ruflin, do you want to take a look since you were commenting?

ph · 2018-08-27T13:45:45Z

libbeat/outputs/elasticsearch/client.go

-			stats.nonIndexable++
-			continue
+		if status < 500 {
+			if status == http.StatusTooManyRequests {


Glad I am not the only one who uses these constants :)

ruflin

LGTM. Please make sure to file a follow-up issue in stack-monitoring so we have it in the Elasticsearch templates and the UI.

graphaelli force-pushed the count-429 branch from 9c31367 to 3e1d70c Compare August 22, 2018 21:37

ph added the libbeat label Aug 22, 2018

graphaelli force-pushed the count-429 branch 2 times, most recently from 2f73a5c to f8dad25 Compare August 22, 2018 21:46

count 429s in elasticsearch output

a397c1f

graphaelli force-pushed the count-429 branch from f8dad25 to a397c1f Compare August 22, 2018 21:52

test bulkCollectPublishFails stats

97cb68c

graphaelli force-pushed the count-429 branch from f3db9ab to 97cb68c Compare August 24, 2018 19:12

ph approved these changes Aug 27, 2018


ruflin approved these changes Aug 27, 2018


ph merged commit 109dcce into elastic:master Aug 27, 2018
