-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send metric tags as riemann tags and attributes #1845
Conversation
allow sending measurement name as tag Signed-off-by: Fabio Berchtold <fabio.berchtold@swisscom.com>
Fixes #1501. |
## set measurement name as a Riemann tag instead of prepending it to the Riemann service name | ||
measurement_as_tag = false | ||
## list of Riemann tags, if specified use these instead of any Telegraf tags | ||
tags = ["telegraf","custom_tag"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain the use-case of this? why do you want to completely overwrite all telegraf tags?
Many tags in telegraf are used to differentiate metrics that are otherwise identical, so using this option has the potential to ruin many collected metrics. For example, cpu metrics typically look like this:
cpu,cpu=cpu0 usage=99
cpu,cpu=cpu1 usage=88
if the user then used this option, all they would see in Riemann is:
cpu,telegraf=custom_tag usage=88
which is basically a useless metric because there is no context on which cpu it has collected, and you've also lost the data of cpu0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sparrc The reasoning behind this is that in Riemann tags are an entirely different thing than in Telegraf.
In Riemann tags are just an array of strings. Tags in Telegraf though are key=value pairs and its more appropriate to send these as Riemann attributes, which are also key=value pairs of type map[string]string
.
It actually does not make too much sense to try and squeeze Telegraf tags into Riemann tags as the old code did, or even the fallback code in this PR.
Maybe it might even be better to always only send Riemann tags if they are specifically configured in the outputs.riemann configuration, and the option renamed to riemann_tags
to make it more clear that they are a different thing?
If you check the Riemann library event struct you can see that attributes
in Riemann are what tags
are to Telegraf/Influxdb, while Riemann tags are not:
https://github.com/amir/raidman/blob/master/raidman.go#L39-L49
Here's an example Telegraf configuration and the resulting Riemann event from one of our systems, without setting outputs.riemann.tags
:
[global_tags]
deployment = "postgresql-dev-2"
id = "1e612b44-e92f-4d27-9f30-5e2f53947870"
az = "z1"
[[outputs.riemann]]
url = "localhost:5555"
measurement_as_tag = true
...
#riemann.codec.Event{
:host "postgresql-1e612b44-e92f-4d27-9f30-5e2f53947870", :state nil, :description nil, :ttl nil,
:measurement "disk", :service "used_percent", :metric 73.90288667552354, :path "/boot", :fstype "ext4",
:tags ["disk" "z1" "postgresql-dev-2" "ext4" "1e612b44-e92f-4d27-9f30-5e2f53947870" "postgresql" "/boot"],
:deployment "postgresql-dev-2", :id "1e612b44-e92f-4d27-9f30-5e2f53947870", :az "z1", :time 1475605281}
And the same example and resulting event with setting outputs.riemann.tags
:
[global_tags]
deployment = "postgresql-dev-2"
id = "1e612b44-e92f-4d27-9f30-5e2f53947870"
az = "z1"
[[outputs.riemann]]
url = "localhost:5555"
measurement_as_tag = true
tags = ["telegraf","postgres_cluster"]
...
#riemann.codec.Event{
:host "postgresql-1e612b44-e92f-4d27-9f30-5e2f53947870", :state nil, :description nil, :ttl nil,
:measurement "disk", :service "used_percent", :metric 73.16736001949994, :path "/boot", :fstype "ext4",
:tags ["telegraf" "postgres_cluster"],
:deployment "postgresql-dev-2", :id "1e612b44-e92f-4d27-9f30-5e2f53947870", :az "z1", :time 1475605021}
As you can see in the first example most :tags
are almost nonsensical and not really useful.
While in the second example they have a specific meaning and Riemann can be more easily configured to filter for those.
The actual Telegraf tags on the other hand are always present as additional event attributes.
I hope my explanation makes sense 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that that may make sense for your use-case, but the vast majority of telegraf users are collecting metrics from a diverse array of sources, not just a single type of source (postgres_cluster) that they're going to want to hardcode into their config file.
What if the riemann output allowed selecting particular telegraf tags to have their values applied as riemann tags? so your config could look something like this?:
[global_tags]
deployment = "postgresql-dev-2"
id = "1e612b44-e92f-4d27-9f30-5e2f53947870"
az = "z1"
description = "postgres_cluster"
[[outputs.riemann]]
url = "localhost:5555"
measurement_as_tag = true
rieman_tag_keys = ["description"]
...
#riemann.codec.Event{
:host "postgresql-1e612b44-e92f-4d27-9f30-5e2f53947870", :state nil, :description nil, :ttl nil,
:measurement "disk", :service "used_percent", :metric 73.16736001949994, :path "/boot", :fstype "ext4",
:tags ["telegraf" "postgres_cluster"],
:deployment "postgresql-dev-2", :id "1e612b44-e92f-4d27-9f30-5e2f53947870", :az "z1", :time 1475605021}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sparrc hmm ok, makes sense. I modified the code to match your proposal. If riemann_tag_keys
are specified / non-empty it will send only their tag values as Riemann tags, otherwise it will send all tag values.
Signed-off-by: Fabio Berchtold <fabio.berchtold@swisscom.com>
|
||
# ## set measurement name as a Riemann tag instead of prepending it to the Riemann service name | ||
# measurement_as_tag = false | ||
# ## list of tag keys to specify, whose values get sent as Riemann tags. If empty, all Telegraf tag values will be sent to Riemann as tags. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines shouldn't be more than 80 characters
@JamesClonk Overall I think the changes look OK, but I'm concerned that the riemann plugin isn't in a very good state, and also lacks unit tests. Currently the Riemann output plugin appears to have a strange behavior that isn't generally very useful. Rather than tacking on features and boolean switches to fix the behavior, I'd prefer if we instead had a full rewrite of the plugin. Given it's current state, I think it's safe to phase out the current plugin as a "legacy" version and phase in a newer one. @JamesClonk would you be willing to work on a rewrite of the plugin? I'd like to see a short writeup on how telegraf data being written to Riemann should be structured, a README giving examples of how to query the data in Riemann, and unit tests specifying exactly how the outputted Riemann data is intended to look. |
@JamesClonk see #1878, I'm going to close this PR for now, let's keep the discussion on the design of the plugin in #1878 until a PR for a new plugin is ready. |
Signed-off-by: Fabio Berchtold fabio.berchtold@swisscom.com