Splunk Metrics serializer #4339
Conversation
in a format that can be consumed by a Splunk metrics index. Can be used with any output including file or HTTP. Please see the docs in README.md
```go
obj["_value"] = v

dataGroup.Event = "metric"
dataGroup.Time = float64(metric.Time().UnixNano() / int64(s.TimestampUnits))
```
I think I see what you're intending to do here, but this won't result in the correct behavior. https://play.golang.org/p/VEDBzbF0j0e
If you want to do math between 2 ints, and get a float result, you need to convert the int to float before the arithmetic.
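A minimal standalone sketch of the difference (the `ns`/`units` values are illustrative stand-ins for `metric.Time().UnixNano()` and `s.TimestampUnits`, not the serializer's actual code):

```go
package main

import "fmt"

// toFloatWrong divides first as integers: the fractional part is
// already gone by the time the float64 conversion happens.
func toFloatWrong(ns, units int64) float64 {
	return float64(ns / units)
}

// toFloatRight converts both operands to float64 before dividing,
// so the sub-second fraction survives.
func toFloatRight(ns, units int64) float64 {
	return float64(ns) / float64(units)
}

func main() {
	ns := int64(1500000000123456789) // nanoseconds since epoch
	units := int64(1000000000)       // nanoseconds per second
	fmt.Println(toFloatWrong(ns, units) == toFloatRight(ns, units)) // false: the results differ
}
```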
```go
	return "", err
}

metricString = string(metricJson)
```
Picking on the string vs byte slice thing some more, you're converting a byte slice to a string, and then later on converting that string back into a byte slice (up on line 50). It would use less memory to keep it as a byte slice.
```go
}

for _, m := range objects {
	serialized = serialized + m + "\n"
```
It would be better if `serialized` were a byte slice. Every time you append to a string, a new string has to be allocated. If your batches are large (in terms of count or size), this can suck up a lot of memory. With a byte slice, an allocation is performed only if the length of the slice exceeds the capacity.
Ditto for other places where this is stored in a string (e.g. line 23).
Also I'm not seeing much point of the `objects` slice. You should be able to append to `serialized` directly.
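A sketch of the byte-slice approach (standalone Go; `joinObjects` is a hypothetical stand-in for the loop above, taking pre-serialized objects):

```go
package main

import (
	"bytes"
	"fmt"
)

// joinObjects appends each pre-serialized metric plus a newline to a
// bytes.Buffer. The buffer only reallocates its backing array when
// capacity is exceeded, unlike string +=, which allocates a fresh
// string on every iteration.
func joinObjects(objects [][]byte) []byte {
	var buf bytes.Buffer
	for _, m := range objects {
		buf.Write(m)
		buf.WriteByte('\n')
	}
	return buf.Bytes()
}

func main() {
	out := joinObjects([][]byte{[]byte(`{"metric_name":"cpu"}`), []byte(`{"metric_name":"mem"}`)})
	fmt.Print(string(out))
}
```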
```go
	return []byte(serialized), nil
}

func (s *serializer) SerializeBatch(metrics []telegraf.Metric) ([]byte, error) {
```
`SerializeBatch` seems like it's just a reinvention of `json.Encoder`. It might be better to use the existing standard lib functionality. It should also simplify your code a lot.
```go
for _, metric := range metrics {
	m, err := s.createObject(metric)
	if err == nil {
```
The error should be logged somewhere when not nil. Otherwise metrics are going to be getting dropped and the user won't have any idea why.
plugins/serializers/registry.go
Outdated
```diff
@@ -73,6 +74,8 @@ func NewSerializer(config *Config) (Serializer, error) {
 	serializer, err = NewGraphiteSerializer(config.Prefix, config.Template, config.GraphiteTagSupport)
 case "json":
 	serializer, err = NewJsonSerializer(config.TimestampUnits)
+case "splunkmetric":
+	serializer, err = NewSplunkmetricSerializer(config.TimestampUnits)
```
Splunk only supports seconds. The serializer should not be using `config.TimestampUnits`.
Splunk absolutely supports sub-second timestamps. From the default datetime.xml:
```xml
<define name="_utcepoch" extract="utcepoch, subsecond">
  <!-- update regex before '2017' -->
  <text><![CDATA[((?<=^|[\s#,"=\(\[\|\{])(?:1[012345]|9)\d{8}|^@[\da-fA-F]{16,24})(?:\.?(\d{1,6}))?(?![\d\(])]]></text>
</define>
```
I didn't say it doesn't. It only supports seconds as the unit. It supports more than that as the precision. Unit != precision.
```go
func (s *serializer) createObject(metric telegraf.Metric) (metricString string, err error) {

	/* Splunk supports one metric per line and has the following required names:
```
This is not entirely accurate. Splunk's http event collector is not line-oriented. It's object oriented. You can shove multiple JSON objects on a single line and it's happy to consume them.
See this example: https://docs.splunk.com/Documentation/Splunk/7.1.1/Data/HTTPEventCollectortokenmanagement#Send_multiple_events_to_HEC_in_one_request
```json
{"event": "Pony 1 has left the barn"}{"event": "Pony 2 has left the barn"}{"event": "Pony 3 has left the barn", "nested": {"key1": "value1"}}
```
I agree, it's not line oriented, but it doesn't support reading an array of JSON objects as metrics. This is OK:

```json
{"event": "Pony 1 has left the barn"}{"event": "Pony 2 has left the barn"}{"event": "Pony 3 has left the barn", "nested": {"key1": "value1"}}
```

This is not OK:

```json
[{"event": "Pony 1 has left the barn"}{"event": "Pony 2 has left the barn"}{"event": "Pony 3 has left the barn", "nested": {"key1": "value1"}}]
```
My point was that developer documentation should be accurate. We shouldn't tell the developers it's "one metric per line" if it's not.
```go
	** metric_name: The name of the metric
	** _value: The value for the metric
	** _time: The timestamp for the metric
	** All other index fields become dimensions.
```
I personally don't think this is a good structure to follow. This will prevent using this serializer with pretty much every single input plugin, as none of them follow this format. I think the implementation over on #4185 has a much more flexible implementation.
The comment is wrong, it should be `time`, not `_time` (I'll fix), but `metric_name` and `_value` are required fields for the metrics store. Naming the fields this way means you don't need a custom props.conf or transforms.conf. In fact, this is the same naming convention that is used in the PR you reference. The point of this serializer is to format the metrics into this format so that it can be used with the generic, well tested HTTP output. Or the file output if you're running telegraf on a machine that is running a Splunk forwarder.
These updates clean up a lot of the handling of the data and allow configuring whether the output should be in a HEC-compatible format or a file/stdout format. Additional documentation files were also updated.
@phemmer Thank you for the comprehensive review, I appreciate all of the feedback and believe I have addressed all/most of the issues with this latest commit. I know that there were some PRs for Splunk HEC outputs, but this was specifically written as a serializer so that you could have Splunk compatible metrics generated for any reasonable output (e.g. file or HTTP). Furthermore, it will allow you to easily redirect different types of metrics to different metrics indexes, which the output based PRs do not allow for. The output based PRs also require that you have/deploy a HEC to be used to get metrics into the Splunk infrastructure vs. being able to use existing infrastructure (forwarders) to get the data into a metrics index.
Looks like it doesn't work properly with the
my config:
… numeric. Some inputs return non-numeric values (e.g. zookeeper returns a version string). Splunk requires that metrics be numeric, but previously the serializer would drop the entire input vs. just the offending metric. Now just drop the offending key/value and improve error logging: 2018-07-20T02:37:00Z E! Can not parse value: 3.4.12-e5259e437540f349646870ea94dc2658c4e44b3b for key: version
This looks great, and 👍 for leveraging the existing http plugin. Can we get this merged and into an official release?
docs/DATA_FORMATS_OUTPUT.md
Outdated
```toml
[[outputs.http]]
# ## URL is the address to send metrics to
```
no need for the initial `#` through line 252
```toml
[[outputs.http]]
# ## URL is the address to send metrics to
```
again, leading `#` isn't necessary
An example configuration of a file based output is:

```toml
# # Send telegraf metrics to file(s)
```
same `#`
```go
import (
	"encoding/json"
	// "errors"
```
remove this commented import
```go
m, err := s.createObject(metric)
if err != nil {
	log.Printf("E! [serializer.splunkmetric] Dropping invalid metric: %v [%v]", metric, m)
	return []byte(""), err
```
pedantic, but maybe `return nil, err` here
Fixup README files as requested in review
We should unify the documentation into just a single place. Let's link from the DATA_FORMATS_OUTPUT documents to the README, and sometime before 1.8 is final we can move the other serializers' documentation to follow suit.
@danielnelson For the sake of clarity, you want me to remove the text in DATA_FORMATS_OUTPUT and replace it with a link to the README, possibly merging some of the text that I put in the DATA_FORMATS_OUTPUT file into the README if additional clarity is required.
That's right, thank you
…plunkmetric README.
Great job on the docs, thanks for taking care of that for me. I took another look over the pull request and noticed a few additional things we should discuss before merging:
```toml
## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
data_format = "splunkmetric"
## Provides time, index, source overrides for the HEC
hec_routing = true
```
Can you rename this variable `splunk_hec_routing`? Due to how the parser variables are injected directly into the plugin, we have started prefixing the variables. Long term, these will probably become tables and the parsers will become more independent.
will do
Just to keep things consistent, let's call it `splunkmetric_hec_routing`
ok, will do.
```go
m, err := s.createObject(metric)
if err != nil {
	log.Printf("D! [serializer.splunkmetric] Dropping invalid metric: %v [%v]", metric, m)
```
We should return this as an error: `return nil, fmt.Errorf("Dropping invalid metric: %s", metric.Name())`
```go
for _, metric := range metrics {
	m, err := s.createObject(metric)
	if err != nil {
		log.Printf("D! [serializer.splunkmetric] Dropping invalid metric: %v [%v]", metric, m)
```
Should return the error. If there are situations where a metric cannot be serialized but it is not an error, have createObject return `nil, nil` and check if m is nil before appending.
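A standalone sketch of that pattern (the `fakeMetric` type and `createObject` logic here are illustrative stand-ins, not the PR's actual code):

```go
package main

import "fmt"

// fakeMetric is a hypothetical stand-in for telegraf.Metric.
type fakeMetric struct {
	Name  string
	Value interface{}
}

// createObject returns (nil, nil) for metrics that are merely skippable
// (e.g. a non-numeric value), reserving a non-nil error for real failures.
func createObject(m fakeMetric) ([]byte, error) {
	if m.Name == "" {
		return nil, fmt.Errorf("dropping invalid metric: empty name")
	}
	v, ok := m.Value.(float64)
	if !ok {
		return nil, nil // skippable, not an error
	}
	return []byte(fmt.Sprintf(`{"metric_name":%q,"_value":%g}`, m.Name, v)), nil
}

func main() {
	metrics := []fakeMetric{{"cpu", 1.5}, {"zk", "3.4.12"}, {"mem", 2.0}}
	var serialized []byte
	for _, m := range metrics {
		obj, err := createObject(m)
		if err != nil {
			panic(err)
		}
		if obj == nil {
			continue // skipped without failing the whole batch
		}
		serialized = append(serialized, obj...)
		serialized = append(serialized, '\n')
	}
	fmt.Print(string(serialized))
}
```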
```go
for k, v := range metric.Fields() {

	if !verifyValue(v) {
		log.Printf("D! Can not parse value: %v for key: %v", v, k)
```
I think here we should just continue on the next field. Otherwise we will never be able to serialize a metric with a string field.
Got it. Will include in the other changes being made.
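A sketch of skipping just the offending field (standalone Go; `numericValue` is a hypothetical helper, not the PR's `verifyValue`):

```go
package main

import "fmt"

// numericValue reports whether v is a type a metrics index can store,
// converting it to float64 when it is. Non-numeric fields (e.g.
// zookeeper's version string) are skipped individually rather than
// dropping the whole metric.
func numericValue(v interface{}) (float64, bool) {
	switch t := v.(type) {
	case float64:
		return t, true
	case int64:
		return float64(t), true
	case uint64:
		return float64(t), true
	default:
		return 0, false
	}
}

func main() {
	fields := map[string]interface{}{
		"avg_latency": int64(3),
		"version":     "3.4.12-e5259e4", // skipped, not fatal
	}
	for k, v := range fields {
		f, ok := numericValue(v)
		if !ok {
			continue // skip just this field
		}
		fmt.Println(k, f)
	}
}
```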
```go
dataGroup := HECTimeSeries{}

for k, v := range metric.Fields() {
```
Use `metric.FieldList()` so that no allocation is performed.
Part of the refactoring mentioned below (thanks for the suggestion)
```go
dataGroup.Event = "metric"
// Convert ns to float seconds since epoch.
dataGroup.Time = float64(metric.Time().UnixNano()) / float64(1000000000)
```
`metric.Time().Unix()` will give you unix time in seconds.
I don't want seconds, I want ms but as a float (this is Splunk's spec...) e.g. 1536295485.123
```go
}

func (s *serializer) createObject(metric telegraf.Metric) (metricJson []byte, err error) {
```
We do a bad job of explaining what is guaranteed in a telegraf.Metric, and actually as I write this I notice some additional checks I need to add.
Here is a brief rundown of things you may or may not need to check:
- metric name may be an empty string
- zero or more tags, tag keys are any non-empty string, tag values may be empty strings
- zero, yes zero, or more fields; field keys are any non-empty string; field values may be any float64 (including NaN, +/-Inf), int64, uint64, string, or bool
- time is any time.Time.
The part about tag/field keys not being empty strings is not true right now, but after writing this I am going to ensure this is the case in 1.8.
Thanks, this actually caused me to re-examine the serializer with several different inputs, and I found a case in which metrics were lost (dropped), so I'm refactoring some of the code to deal with that. (Also, it's been a crazy week... so I hope to get this done over the next few days.)
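A sketch of defensive checks derived from the guarantee list above (the `stubMetric` type is a hypothetical stand-in for telegraf.Metric, not the actual interface):

```go
package main

import "fmt"

// stubMetric is a hypothetical stand-in for telegraf.Metric, carrying
// only what the checks below need.
type stubMetric struct {
	Name   string
	Fields map[string]interface{}
}

// shouldSkip applies checks implied by the guarantee list: the name may
// be an empty string and there may be zero fields, so a serializer has
// to handle both cases rather than assume well-formed input.
func shouldSkip(m stubMetric) bool {
	if m.Name == "" {
		return true // no usable metric_name
	}
	if len(m.Fields) == 0 {
		return true // nothing to serialize
	}
	return false
}

func main() {
	fmt.Println(shouldSkip(stubMetric{Name: "cpu", Fields: map[string]interface{}{"usage": 1.5}})) // false
	fmt.Println(shouldSkip(stubMetric{Name: "", Fields: map[string]interface{}{"usage": 1.5}}))    // true
	fmt.Println(shouldSkip(stubMetric{Name: "cpu"}))                                              // true
}
```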
```diff
@@ -0,0 +1,139 @@
+# Splunk Metrics serialzier
```
spelling
@ronnocol We are hoping to release 1.8.0-rc1 on Wednesday afternoon, let me know if you think that could be a problem for this pr.
Tested http input to Splunk 7.1.2 (w/ hec routing). Verified output of file output.
@danielnelson I believe I have resolved all of your requested changes as well as resolved an issue where metrics might have been dropped. I know that you want to cut an RC1 on Wednesday, I believe that this serializer is ready to be included in that.
This serializer properly formats the metrics data according to the Splunk metrics JSON specification, it can be used with any output that supports serializers, including file or HTTP. I decided that just formatting the data according to the Splunk spec was a better investment than managing the whole output pipeline. Now you can use telegraf without requiring a HEC (by using a file output), or you can use the well tested and maintained HTTP output and just set the extra headers required by the HEC.
Please see the docs in README.md