Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow decoupling of main function with runApp to allow custom distributions of Telegraf #11851

Closed
neelayu opened this issue Sep 21, 2022 · 7 comments
Labels
feature request Requests for new plugin and for new features to existing plugins

Comments

@neelayu
Copy link
Contributor

neelayu commented Sep 21, 2022

Use Case

First of all thank you very much for this wonderful project. Telegraf is one of the most popular tools to metrics ingestion and we use it extensively. However, one limitation is that it doesn't easily extend to custom distributions. For ex, Opentelemetry collector allows organisations to write their own Main function and import the necessary components in their distribution. This allows some components to be developed which have a specific use-case and are private to the org, but they want to leverage the telegraf framework.
The major task to incorporate this feature has been done in the following PRs
#11700 for CLI
#11524 for custom builder. Similar to what Opentelemetry has right now.
#11654 for selectively importing plugins.

Expected behavior

Currently, to use telegraf as a service, either we use the custom-builder, or use a telegraf core build. It is not possible to define a main function which replicates what telegraf does. But this allows more flexibility in developing plugins which are private to the org.

Actual behavior

Not possible to write minimal code to create a telegraf binary. Either we have to fork or use custom builder.

Additional info

It would also be good to have a pre run hook which runs before the actual agent. This way we have control over the behaviour of the agent. This behaviour can be defined via flags, which distributions can define apart from the core flags provided.

@neelayu neelayu added the feature request Requests for new plugin and for new features to existing plugins label Sep 21, 2022
@reimda
Copy link
Contributor

reimda commented Sep 21, 2022

Hi @neelayu, I'm curious to know more about the purpose of your custom distribution. Telegraf isn't intended to be reused as a golang module at a high level like you describe.

I don't think the project is completely opposed to accepting a refactor like #11852, but I would like to know more about how you intend to use it and why you can't use the existing extensibility features in telegraf. External teams can already contribute plugins or use exec/execd plugins to write private external plugins. You can also write starlark scripts to process metrics. These extensibility features have been sufficient for other users, why is your use case different?

@neelayu
Copy link
Contributor Author

neelayu commented Sep 23, 2022

Hey @reimda Thanks for going through the proposal. I do understand that telegraf isnt designed to be used as a go module. However, I'll try my best to explain the need for this feature-

  • We want to invoke Telegraf with configuration generated during runtime, based on certain set of parameters. So we convert a JSON based configuration to TOML, a cfg format which Telegraf requires. And the idea is we do this conversion which will have telegraf as a go module dependency. The before run hook will take care of setting the configuration flag.
  • If we use telegraf as a dependency, any breaking change can be caught during the build/compilation itself. In case of execd plugins, this would not be caught. Some might prefer a single code base for this. I dont know if I explained it well. But do let me know.
  • One major reason is that we have been using Opentelemetry collector for quite some time and our build uses otelcol as a go module dependency. Major distributions also include the aws-otel-collector, splunk-otel-collector.

@neelayu
Copy link
Contributor Author

neelayu commented Sep 28, 2022

Hi @reimda did you get a chance to go through the points?

@neelayu
Copy link
Contributor Author

neelayu commented Oct 12, 2022

A friendly reminder on this. Just want to know what the consensus is on this feature. Thanks!

@srebhan
Copy link
Member

srebhan commented Dec 19, 2022

@neelayu first of all thank you for bringing this up and also thanks for submitting a PR on this topic!

We discussed supporting custom distributions in the team and came to the conclusion that we will not guarantee any public interfaces for the Telegraf agent nor for any other parts with the exception of the plugin interfaces defined. This being said, we nevertheless see the value in organizing the Telegraf code in a way to have a clear set of functions called in a sensible, understandable order like (beside CLI options etc) comprising the steps of:

  1. load configurations files and setup the Telegraf configurations (should be in config)
  2. setup accumulator and plugin interconnect infrastructure (should be in agent)
  3. initializing all configured plugins (should be in agent I think)
  4. start plugins (probably from outputs towards inputs). This would be the main loop. (should be in agent)
  5. stop plugins and make sure all remaining data is flushed (probably from input towards outputs) (should be in agent)
  6. tear-down all plugins (should be in agent)
  7. cleanup

This would also allow you and others to use those functions (e.g. to write your own main). However, I want to emphasize that despite this interface might be general we will not guarantee any public interface there! The main reason is that we probably want to make changes on how things inter-operate and do not want to be tied to switching main versions to change those interna. Does that make sense to you @neelayu?

I will close your PR shortly as it will not provide these clear function mentioned above, but we are willing to accept PR(s) reorganizing the code in the way mentioned. What do you think @neelayu?

@srebhan srebhan added the waiting for response waiting for response from contributor label Dec 19, 2022
@neelayu
Copy link
Contributor Author

neelayu commented Dec 20, 2022

Hey @srebhan thanks for going through this proposal. Overall I feel your idea makes sense. And yes, glad we could clear the important part of no guarantee for public interfaces.
I do feel there are some changes which could pave the way to use telegraf as a go dependency using the suggestions you mentioned. I feel that should be enough.

I'll work on some refactoring when I get the time.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Dec 20, 2022
@powersj
Copy link
Contributor

powersj commented Nov 7, 2023

I'm going to close this for now. We are still happy to see a refactor at a later date.

@powersj powersj closed this as not planned Won't fix, can't repro, duplicate, stale Nov 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Requests for new plugin and for new features to existing plugins
Projects
None yet
4 participants