Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new config for csv column explicit type conversion #4781

Merged

Conversation

rudbast
Copy link
Contributor

@rudbast rudbast commented Oct 1, 2018

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

Resolves #4775.

Copy link
Contributor

@glinton glinton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to make the proper changes to /internal/config/config.go

diff --git a/internal/config/config.go b/internal/config/config.go
index 3d051097..bd919d44 100644
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -1460,6 +1460,18 @@ func getParserConfig(name string, tbl *ast.Table) (*parsers.Config, error) {
 		}
 	}
 
+	if node, ok := tbl.Fields["csv_column_types"]; ok {
+		if kv, ok := node.(*ast.KeyValue); ok {
+			if ary, ok := kv.Value.(*ast.Array); ok {
+				for _, elem := range ary.Value {
+					if str, ok := elem.(*ast.String); ok {
+						c.CSVColumnTypes = append(c.CSVColumnTypes, str.Value)
+					}
+				}
+			}
+		}
+	}
+
 	if node, ok := tbl.Fields["csv_tag_columns"]; ok {
 		if kv, ok := node.(*ast.KeyValue); ok {
 			if ary, ok := kv.Value.(*ast.Array); ok {
@@ -1588,6 +1600,7 @@ func getParserConfig(name string, tbl *ast.Table) (*parsers.Config, error) {
 	delete(tbl.Fields, "grok_custom_pattern_files")
 	delete(tbl.Fields, "grok_timezone")
 	delete(tbl.Fields, "csv_column_names")
+	delete(tbl.Fields, "csv_column_types")
 	delete(tbl.Fields, "csv_comment")
 	delete(tbl.Fields, "csv_delimiter")
 	delete(tbl.Fields, "csv_field_columns")

and /plugins/parsers/registry.go

diff --git a/plugins/parsers/registry.go b/plugins/parsers/registry.go
index c662cf30..3ec82308 100644
--- a/plugins/parsers/registry.go
+++ b/plugins/parsers/registry.go
@@ -127,6 +127,7 @@ type Config struct {
 
 	//csv configuration
 	CSVColumnNames       []string `toml:"csv_column_names"`
+	CSVColumnTypes       []string `toml:"csv_column_types"`
 	CSVComment           string   `toml:"csv_comment"`
 	CSVDelimiter         string   `toml:"csv_delimiter"`
 	CSVHeaderRowCount    int      `toml:"csv_header_row_count"`
@@ -195,6 +196,7 @@ func NewParser(config *Config) (Parser, error) {
 			config.CSVComment,
 			config.CSVTrimSpace,
 			config.CSVColumnNames,
+			config.CSVColumnTypes,
 			config.CSVTagColumns,
 			config.CSVMeasurementColumn,
 			config.CSVTimestampColumn,
@@ -216,6 +218,7 @@ func newCSVParser(metricName string,
 	comment string,
 	trimSpace bool,
 	columnNames []string,
+	columnTypes []string,
 	tagColumns []string,
 	nameColumn string,
 	timestampColumn string,
@@ -249,6 +252,7 @@ func newCSVParser(metricName string,
 		Comment:           comment,
 		TrimSpace:         trimSpace,
 		ColumnNames:       columnNames,
+		ColumnTypes:       columnTypes,
 		TagColumns:        tagColumns,
 		MeasurementColumn: nameColumn,
 		TimestampColumn:   timestampColumn,

@@ -148,6 +149,35 @@ outer:
}
}

// In case if defined column names & types count don't match.
if i < len(p.ColumnTypes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like throwing an error if they don't match before here would be a good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've added the throw error statement on newCSVParser, seems like it fits there better ?

@rudbast rudbast force-pushed the feat/csv/predefined_column_types branch from 9f65406 to 3759eba Compare October 2, 2018 03:19
@rudbast rudbast changed the title Add new config for csv column explict type conversion Add new config for csv column explicit type conversion Oct 2, 2018
@danielnelson danielnelson added this to the 1.9.0 milestone Oct 4, 2018
@danielnelson danielnelson added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Oct 4, 2018
@danielnelson danielnelson merged commit a1f9f63 into influxdata:master Oct 4, 2018
@rudbast rudbast deleted the feat/csv/predefined_column_types branch October 4, 2018 03:00
@FAE-MICHAEL
Copy link

Hi @danielnelson / @glinton / @rudbast

I'm trying to use 'csv_column_types' with only datatypes ["float", "string" ...].
But this is generating the error:

Error parsing /etc/telegraph/telegraph.conf, line 48: field corresponding to csv_column_types 'is not defined in * file.File'

Has anyone ever experienced this?

Example of use:

[[inputs.file]]
   files = ["/home/STATRACK.csv"]
   data_format = "csv"
   csv_delimiter = "|"
   csv_header_row_count = 0
   csv_column_names = ["REPTIMEOBTAINED", "REPSTATSCACHESIZE", "VMMAXMEM", "VMTOTALMEM", "LASTXACTID"]
   csv_column_types = ["string", "float", "float", "float", "string"]

@rudbast
Copy link
Contributor Author

rudbast commented Oct 14, 2018

@FAE-MICHAEL i'm suspecting you're not using the telegraf version in which this changes is included.. could you try using build from source (master) ? btw i'm not sure if we should continue this conversation here, you could try make an issue and reference it here i think (?).

@FAE-MICHAEL
Copy link

Hi @rudbast

I'm using the version of telegraf 1.8 1 downloaded from site influxdb. is there any more recent compilated version? Where can I download it? If it does not work, I'll open another problem here ...

@rudbast
Copy link
Contributor Author

rudbast commented Oct 14, 2018

@FAE-MICHAEL i think this PR is planned for 1.9 and as of now it hasn't released yet. if you still planned to use this feature (field), i suggest you build your own from current master branch https://github.com/influxdata/telegraf#from-source.

@FAE-MICHAEL
Copy link

thank you very much for the answers ... I can point the telegraf to the PR already resolved and that will be released in version 1.9 or will I have to wait?

@rudbast
Copy link
Contributor Author

rudbast commented Oct 15, 2018

i think it'll be a while before 1.9 is released, not sure when..

@FAE-MICHAEL
Copy link

thanks a lot @rudbast ....

I'll wait, I use grafana and the fieldkeys as string inhibits the visualization of the legend, but the graph is normally displayed. I appreciate your help

@danielnelson
Copy link
Contributor

1.9 is tentatively scheduled by end of November, you can using a nightly build before then.

@FAE-MICHAEL
Copy link

Hi @danielnelson and @rudbast

I made the update and the start of the telegraf occurred successfully, but in its log now appears:

Error in plugin [inputs.file]: column type: parse float error strconv.ParseFloat: parsing "18": invalid syntax

Did I just install telegraf-nightly.x86_64.rpm, do I need to update or change some settings?

Thank you

@FAE-MICHAEL
Copy link

@danielnelson , @rudbast

The error occurred due to "blank spaces".
After corrected it gives the error saying of the conflict between the columns of MEASUREMENT.

Actually FIELDTYPE is STRING even doing DROP SERIES FROM MEASUREMENT it creates MEASUREMENT as STRING and not FLOAT as it should ....

See error telegraf.log:

2018-10-17T01:16:10Z E! [outputs.influxdb]: when writing to [http://127.0.0.1:8086]: received error partial write: field type conflict: input field "TR_OPWAITTIME" on measurement "file" is type float, already exists as type string dropped=1; discarding points

@FAE-MICHAEL
Copy link

I did a DROP MEASUREMENT and it was created with the correct FIELDTYPE. Now the problem is solved. Thank you so much for your help @danielnelson and @rudbast .

rgitzel pushed a commit to rgitzel/telegraf that referenced this pull request Oct 17, 2018
otherpirate pushed a commit to otherpirate/telegraf that referenced this pull request Mar 15, 2019
otherpirate pushed a commit to otherpirate/telegraf that referenced this pull request Mar 15, 2019
dupondje pushed a commit to dupondje/telegraf that referenced this pull request Apr 22, 2019
athoune pushed a commit to bearstech/telegraf that referenced this pull request Apr 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants