Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avro Parser put timestamp to the metric fields even if it is not specified in the field list #13821

Closed
BelousovAntonV opened this issue Aug 23, 2023 · 7 comments · Fixed by #13856
Labels
feature request Requests for new plugin and for new features to existing plugins

Comments

@BelousovAntonV
Copy link
Contributor

Use Case

I want to decide which fields should be added to the metric by myself. We do not need timestamp added as a field.

Expected behavior

When option set to false, the timestamp

Actual behavior

At the moment when you specify any field from message as a timestamp it will be added as a field to the metric, even if it is not specified in the field list.

The related option pass_timestamp_to_fields should be added to the parser config which could be true/false and will be set to true by default

Additional info

I've done this change in our solution without any options. So if I want to add timestamp to the fields, I should add it to the field list in the config. But for the backward compatibility, I think there should be the option that will disable the behavior when timestamp just added to the metric fields. When the option is false you will get timestamp as a field only when you specify it in the field list.

@BelousovAntonV BelousovAntonV added the feature request Requests for new plugin and for new features to existing plugins label Aug 23, 2023
@powersj
Copy link
Contributor

powersj commented Aug 23, 2023

I want to decide which fields should be added to the metric by myself. We do not need timestamp added as a field.

This is why we have processors as well as the global configuration options. I would rather us not add fine tuning to each parser and instead use the more global approach.

Is there a reason why those would not work to drop any fields or tags you do not want?

@powersj powersj added the waiting for response waiting for response from contributor label Aug 23, 2023
@BelousovAntonV
Copy link
Contributor Author

Hi @powersj,

Do you mean we could use field drop at the end of Input configuration?
If yes, looks like this issue could be canceled.
I guess, that I did not realize it before.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Aug 23, 2023
@BelousovAntonV
Copy link
Contributor Author

BelousovAntonV commented Aug 23, 2023

But, anyway, it makes no sense for me, why timestamp field should be added to the metric fields even if it is not specified in the field list option of the configuration file...

I mean, this peace of code looks like a bug for me:

// If you have specified your fields in the config, you
// get what you asked for.
fieldList = p.Fields

// Except...if you specify the timestamp field, and it's
// not listed in your fields, you'll get it anyway.
// This will randomize your field ordering, which isn't
// ideal.  If you care, list the timestamp field.
if p.Timestamp != "" {

Why it should be added to the fields if I did not specify it in the avro_fields option? I do not need it, but, I'm getting it and should drop it afterwards. Isn't it OK, when I expect to receive only those fields, that were specified?

@powersj
Copy link
Contributor

powersj commented Aug 25, 2023

@athornton - any thoughts or background as to this design decision?

@athornton
Copy link
Contributor

I don't specifically remember, but probably because I was only considering the case where telegraf is feeding InfluxDB, which requires a timestamp for each measurement. And certainly in the use cases I'm familiar with on my project, we very definitely want the time the measurement was made, not the time that InfluxDB received the measurement.

Oddly, the work I'm doing now on Avro union types has alerted me to the reality that people are using telegraf to send records to things that are not InfluxDB, so I think if we want to relax that restriction it's OK.

I think if there is no timestamp column, InfluxDB probably just uses the time the message was received? I'd have to look. At any rate we should probably put a warning in the documentation that, if there's no timestamp, and you're sending data to InfluxDB, it will...do whatever its default behavior is.

@athornton
Copy link
Contributor

Looking at this the timestamp logic in init is clearly buggy. I'm trying to determine what I meant, but clearly checking whether the timestamp is one of the allowable formats inside a clause that is only active if the timestamp format is the empty string wasn't it.

@athornton
Copy link
Contributor

I have this implemented in #13856. The underlying reason was "that's what I wanted for my use case" but it's easy enough for us to change our own configuration to specify which field we want for timestamps explicitly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Requests for new plugin and for new features to existing plugins
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants