Latest metrics are not always sent first when output fails #5194

Closed
danielnelson opened this issue Dec 26, 2018 · 0 comments · Fixed by #5287

Relevant telegraf.conf:

[agent]
  metric_batch_size = 5
  metric_buffer_limit = 10

[[inputs.mem]]
  fieldpass = ["free"]
[[outputs.file]]
  files = ["stdout"]

System info:

Telegraf 1.9.1

Steps to reproduce:

  1. Modify the file output so that Write() always returns an error, simulating a real output failure.
  2. Run Telegraf.
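
For illustration, step 1 could look roughly like the sketch below. It is self-contained and does not use Telegraf's real plugin code: the `Metric` and `failingOutput` types are hypothetical stand-ins, and only the error text "fake" is taken from the logs in this report.

```go
package main

import (
	"errors"
	"fmt"
)

// Metric is a hypothetical stand-in for a Telegraf metric, reduced to the
// fields visible in the log lines of this report.
type Metric struct {
	Name      string
	Free      int64
	Timestamp int64 // Unix nanoseconds
}

// failingOutput is a hypothetical output: it prints each metric in line
// protocol form, then reports a failure so the caller keeps the batch.
type failingOutput struct{}

// Write prints the batch and always returns the error "fake".
func (failingOutput) Write(batch []Metric) error {
	for _, m := range batch {
		fmt.Printf("%s free=%di %d\n", m.Name, m.Free, m.Timestamp)
	}
	return errors.New("fake")
}

func main() {
	out := failingOutput{}
	batch := []Metric{{Name: "mem", Free: 11845566464, Timestamp: 1545866470000000000}}
	if err := out.Write(batch); err != nil {
		fmt.Printf("E! [agent] Error writing to output [file]: %v\n", err)
	}
}
```

Because the write always fails, the caller never gets to treat the batch as delivered, which is what exposes the batch selection behavior described in the rest of this report.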

Expected behavior:

Latest metrics should always be emitted first:

2018-12-26T23:21:03Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"", Flush Interval:10s
mem free=11845566464i 1545866470000000000
2018-12-26T23:21:20Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
2018-12-26T23:21:30Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
2018-12-26T23:21:40Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
2018-12-26T23:21:50Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
2018-12-26T23:22:00Z E! [agent] Error writing to output [file]: fake
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
mem free=11887988736i 1545866520000000000
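
The expected selection can be sketched as a sliding window over the buffered metrics. This is a minimal model, assuming one metric per interval and a write that always fails. `newestBatch` is a hypothetical helper, not a Telegraf function, and timestamps are shortened to seconds.

```go
package main

import "fmt"

// newestBatch returns up to batchSize of the most recently added entries,
// kept in their original (oldest to newest) order.
func newestBatch(buffer []int64, batchSize int) []int64 {
	if len(buffer) <= batchSize {
		return buffer
	}
	return buffer[len(buffer)-batchSize:]
}

func main() {
	const batchSize = 5 // metric_batch_size
	var buffer []int64
	// One metric arrives per 10s interval; every write fails, so nothing
	// is ever removed from the buffer after a flush.
	for i := 0; i < 7; i++ {
		buffer = append(buffer, 1545866470+int64(i)*10)
		fmt.Println(newestBatch(buffer, batchSize))
	}
}
```

Once a sixth metric has been buffered, the window starts at 1545866480, so the oldest metric is no longer part of the batch.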

Actual behavior:

Annotated log:

2018-12-26T23:21:03Z I! Starting Telegraf                                     
2018-12-26T23:21:03Z I! Using config file: /home/dbn/.telegraf/telegraf.conf  
2018-12-26T23:21:03Z I! Loaded inputs: inputs.mem                             
2018-12-26T23:21:03Z I! Loaded aggregators:                                   
2018-12-26T23:21:03Z I! Loaded processors:                                    
2018-12-26T23:21:03Z I! Loaded outputs: file
2018-12-26T23:21:03Z I! Tags enabled: 
2018-12-26T23:21:03Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"", Flush Interval:10s
mem free=11845566464i 1545866470000000000
2018-12-26T23:21:20Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
2018-12-26T23:21:30Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
2018-12-26T23:21:40Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
2018-12-26T23:21:50Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000

So far this looks good: because each write fails, the earlier metrics are included again in the next batch until the metric batch size is reached.

2018-12-26T23:22:00Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000

That batch is not right: it is identical to the previous one, when it should have cycled the 1545866470000000000 metric out. This continues for some time until the metric buffer limit is reached:

2018-12-26T23:22:10Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
2018-12-26T23:22:20Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
2018-12-26T23:22:30Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
2018-12-26T23:22:40Z E! [agent] Error writing to output [file]: fake
mem free=11845566464i 1545866470000000000
mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
2018-12-26T23:22:50Z E! [agent] Error writing to output [file]: fake

At this point metrics fall off the metric buffer, so the values start to cycle:

mem free=11895586816i 1545866480000000000
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
mem free=11887988736i 1545866520000000000
2018-12-26T23:23:00Z E! [agent] Error writing to output [file]: fake
mem free=11896877056i 1545866490000000000
mem free=11896774656i 1545866500000000000
mem free=11889926144i 1545866510000000000
mem free=11887988736i 1545866520000000000
mem free=11882758144i 1545866530000000000
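
The observed sequence is consistent with a batch that is always taken from the oldest end of the buffer. The sketch below is a hypothetical model of that behavior, not Telegraf's actual buffer code: `boundedBuffer`, `Add`, and `OldestBatch` are illustrative names, and timestamps are shortened to seconds.

```go
package main

import "fmt"

// boundedBuffer is a hypothetical model, not Telegraf's implementation: it
// keeps at most limit entries and drops the oldest one on overflow.
type boundedBuffer struct {
	limit   int
	entries []int64
}

// Add appends ts, discarding the oldest entry if the limit is exceeded.
func (b *boundedBuffer) Add(ts int64) {
	b.entries = append(b.entries, ts)
	if len(b.entries) > b.limit {
		b.entries = b.entries[1:]
	}
}

// OldestBatch returns up to n entries from the front of the buffer, the
// selection that would explain the repeated batches in the log.
func (b *boundedBuffer) OldestBatch(n int) []int64 {
	if len(b.entries) < n {
		n = len(b.entries)
	}
	return b.entries[:n]
}

func main() {
	buf := &boundedBuffer{limit: 10} // metric_buffer_limit
	const batchSize = 5              // metric_batch_size
	for i := 0; i < 12; i++ {
		buf.Add(1545866470 + int64(i)*10)
		// Every write fails, so the batch is never removed from the buffer.
		fmt.Println(buf.OldestBatch(batchSize))
	}
}
```

In this model the batch stays pinned at 1545866470 until the eleventh metric overflows the buffer, after which it advances by one metric per interval, as in the log above.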

Additional info:
