feat: Tool to customize Telegraf builds #11524

srebhan · 2022-07-19T16:24:30Z

Updated associated README.md.
Wrote appropriate unit tests.
Pull request title or commits are in conventional commit format

resolves #9556
replaces #8519
replaces #5809

This PR adds a tool to reduce the binary size of Telegraf by only including user-selected plugins in the build. A Telegraf binary only including the modbus input plugin, the influxdb output plugin as well as the starlark processor, but no parser will be ~18Mb in size. See the included README.md for more details.

powersj · 2022-07-19T20:48:17Z

Thanks, @srebhan, for tackling the minify work. There is clearly a need for users to have a smaller telegraf.

Two initial questions as to the design of this

Why the extra configuration file? We briefly chatted and you mentioned users not having a config yet. I am not sure I agree. If a user has the desire to minify their telegraf build, they should already have a config in the first place. As such that config or configs can be used as input to minify.
What prompted putting minify in front of any build and the removal of the all.go files? I would like to a) keep minify out of the traditional build process and b) allow anyone to build telegraf without ever knowing minify exists in the first place.

srebhan · 2022-07-20T08:28:34Z

Hey @powersj,

Why the extra configuration file? We briefly chatted and you mentioned users not having a config yet. I am not sure I agree. If a user has the desire to minify their telegraf build, they should already have a config in the first place. As such that config or configs can be used as input to minify.

There were multiple reasons to support both ways:

There are people that do know what they need but do not have any working configuration yet. You could argument that they can start with a "full" Telegraf and use the resulting configs, but that might not be possible in some embedded environments. So being able to build a smaller Telegraf right away might be a good option.
Please think of environments where someone builds (in the sense of compiles) Telegraf for others to use. This is often the case in larger companies or if you want to provide Telegraf as a service to your customers. In these cases, sending the configuration files from the user to the "builder" might not be possible due to confidentiality, policies or practical reasons.
You want to build Telegraf for a future state of your system. For example you know that your current system will be extended in the future with new protocols (requiring more plugins) and you want to build Telegraf for this future configuration.

In all the above cases users will be forced to write a "Telegraf configuration-file" without the help of globbing and black-/whitelisting which might be cumbersome. Furthermore, configuring parsers (and maybe serializers in the future) will be near to impossible or at least hard. We will then end up with a degenerated version of a Telegraf file that looks similar but is not valid or we force the user (potentially not familiar with Telegraf configurations) to create a full Telegraf configuration.

My goal was to provide easy ways to cover both, having a working config and only using what is in there and cases where you don't have a config or want to extend it.

What prompted putting minify in front of any build and the removal of the all.go files? I would like to a) keep minify out of the traditional build process and b) allow anyone to build telegraf without ever knowing minify exists in the first place.

My thoughts were the following:

Ease of use. If you want to build a minified version but keep up with newer versions, simply put your build.conf file in your Telegraf root and you will always get your minified version by simply typing make.
Testing. This way, we "eat our own dog food" and will be able to catch issues with minify early. The cost (in terms of time) is negligible I think.
Reduce developer burden. By removing the all.go files we kill of another redundant location for developers of new plugins to touch.
Keep the repo clean. When using minify in-tree, it needs to modify the all.go files. If people then create PRs they might send modified versions of all.go by accident, creating more burden for us to check...

All of the above being said, I think both design decisions are well justified. However, if the decision is to remove both, I can do this. In this case, I think, there is no point in having minify in-tree as it will not be used by the build and potentially creates trouble with in-tree builds (modify tracked files). We should then move it out of the tree in an own repo and I think we have been there.... :-)

powersj · 2022-07-20T16:20:06Z

@srebhan, thanks for the background. It does help to understand your approach.

There are people that do know what they need but do not have any working configuration yet.

In the spec, Telegraf configuration files are the center of the design. The assumption was that minify was for users who already knew what they wanted and desired a smaller Telegraf binary. I would argue that if someone does not know what plugins they may or may not use, then they have no business using minify in the first place. That user must continue to determine what is best for their use cases, and only after determining what they need to use minify.

...builds Telegraf for others to use.

You want to build Telegraf for a future state of your system

Nothing should stop users from accomplishing this with a Telegraf config. While the spec declared that we expected a valid Telegraf configuration, I think it is more accurate to say we expect a valid TOML file that specifies the plugins required. If a user wanted to include additional plugins, then they could pass any number of "empty" plugin sections (e.g. [inpus.mqtt], [outputs.influxdb_v2], etc.) in a TOML file. The settings and configuration values should not be necessary for minify to operate.

re: minify in build, I need more time to digest that.

srebhan · 2022-07-20T16:50:08Z

@powersj first of all thanks for the helpful discussion!

In the spec, Telegraf configuration files are the center of the design.

All specified use cases of the specs work IIRC, so there is no trade-off here.

Nothing should stop users from accomplishing this with a Telegraf config. While the spec declared that we expected a valid Telegraf configuration, I think it is more accurate to say we expect a valid TOML file that specifies the plugins required. If a user wanted to include additional plugins, then they could pass any number of "empty" plugin sections (e.g. [inpus.mqtt], [outputs.influxdb_v2], etc.) in a TOML file. The settings and configuration values should not be necessary for minify to operate.

As I said, it is possible, but please give examples for the following use-cases and estimate their effort

Select all input plugins but only the influxdb_v2 output.
Select only the json and xpath parsers, http input and influxdb output.
Select all plugins except for aggregators and processors.

Please find my examples for the 3 attached (created by hand).
exampe2.conf.txt
example1.conf.txt
example3.conf.txt

You also did not address the issue that in larger organizations or external service companies the one building Telegraf has potentially no access to the config files...

powersj · 2022-07-20T19:36:07Z

We had a virtual discussion about this and I think this boils down to these questions:

Do we want to support building of Telegraf without a Telegraf configuration?
Do we want to support wildcard selection and blacklisting of plugins?
Do we want to support "manually" enabling plugins additional to a config?
How do we want to specify parsers (and later serializers) to include/exclude?

@srebhan feel free to update/re-word those questions :)

edit: I updated the questions based on your invite!

srebhan · 2022-07-28T09:02:07Z

After internal discussion the decision was to

Only support Telegraf configuration files for configuration. Remove the additional build-configuration files.
Keep minify out of the build process. Remove from Makefile and keep all.go files.
Remove feature to do out-of-tree builds.
Parsers (and in the future serializers) are configured by specifying data_format in the Telegraf configuration. If no parser is found, but plugins using data_format, the default parsers are included (influx usually and json for exec).

powersj

Thank you for the updates! I really do appreciate you driving this, and I am excited to see what our users say.

I have a few changes to the README and a couple of questions.

Have you run this at least once on Windows? I can run it on my Mac later as well.

go.mod

tools/minify/README.md

tools/minify/cmd/configure.go

tools/minify/common/buildconfig.go

tools/minify/README.md

reimda · 2022-08-05T20:48:59Z

The docs and code look good to me. Thanks for putting this together! I'd approve now except that I am still concerned about what we're naming this tool. (Sorry to be picky about naming at the last minute and for also rejecting your first name choice, "bob")

I think our audience of users and developers are more familiar with minify or minification being the process of making code (especially interpreted code like javascript or markup like html) smaller by removing unnecessary characters like whitespace, while retaining all the functionality of the original version. When I google "minify" I get things only with this specific definition.

This tool leaves the whitespace alone and builds a smaller binary by removing functionality. That is not minifying.

I think if we use the name "minify" it will be a source of confusion and that when people see it for the first time they will think it doesn't remove functionality. They will also know telegraf is written in a compiled language and will wonder why it even has a tool to remove unnecessary characters in source code.

Previous proposals for this kind of tool used these names

telegraf-lite-builder
bob (the builder) 😃
telegraf-build
bring-your-own-telegraf

I'd like the name to indicate what it does and not be a codename or inside joke. It doesn't need to include "telegraf" because it is in the telegraf repo in the tools directory.

Since this tool produces custom builds that include only the plugins that are necessary for a given configuration, the name should include a term like custom, specialized, reduced, lite, or minimal. Of those I think minimal stands out because it also makes you think small size is the point. It's a more general term than "minify" and doesn't already have a software connotation.

The other tools in telegraf/tools use underscores instead of dashes.

What does everyone think of renaming this to telegraf/tools/minimal_builder?

srebhan · 2022-08-08T08:14:07Z

@reimda no worries, I think naming is important here to make this discoverable....

I like minify as it is concise and short, but I agree that is might overlap with other meanings. Not sure if this is a problem though... Anyway, if we rename the tool, I would rather go in the direction of "configuration", as that's the core step of the tool. If we add "mini..." we will always run into the meaning clash and can leave minify....
So if we change the name, why not simply tailor_to_config or configure?

popey · 2022-08-08T13:31:00Z

Yay! A bikeshed we can paint together! Here's my ~~brush~~ 2¢.

In the future, I envisaged that Telegraf would have three distinct options for users to pick from when selecting which build of Telegraf best suits their use case.

Telegraf Complete - The "full fat" build of telegraf, which contains all the usual plugins, parsers and dependencies. What we often consider the "batteries included" build. This is effectively the (slightly chonky) build we known and love today.
Telegraf Core - A build of Telegraf which contains a subset of all the available plugins and parsers. That (yet to be defined) set of features would be well supported, widely used and known.
Telegraf Custom - A build of Telegraf created (likely using what we now call "minifiy" and previously called "bob") by the customer or a developer on their behalf. This would contain only the minimum necessary plugins, parsers and dependencies required for the customers' needs.

I believe the terms Complete, Core and Custom enable us to very clearly identify what kind of build a customer or community member is using, when helping in issues or other support avenues.

I do agree that "minify" has become the internal "codename" or shorthand for the feature which builds "Telegraf Custom". I also see that it's an overloaded term, as it's often used to obfuscate or shrink an existing source codebase like JavaScript to reduce transmission time, and hide the operation of the code. I don't believe we want to obfuscate or hide anything. So for that reason perhaps we should seek some name other than "minifiy" so as not to be associated with this existing use case.

I like terms to reflect exactly what they do. Given this tool is building a custom telegraf, I feel the words are right in front of us.

telegraf-custom-builder, telegraf-builder, custom-builder, custom-telegraf-builder (ctb or tcb for short?) or similar?

powersj · 2022-08-16T16:35:18Z

team consensus: custom_builder

srebhan · 2022-08-16T17:00:37Z

@powersj name changed...

tools/custom_builder/main.go

powersj · 2022-08-16T17:46:34Z

cmd/telegraf/telegraf.go

@@ -277,7 +280,14 @@ func runAgent(ctx context.Context,

 	logger.SetupLogging(logConfig)

-	log.Printf("I! Starting Telegraf %s", version)
+	log.Printf("I! Starting Telegraf %s%s", version, internal.Customized)
+	log.Printf("I! Available plugins: %d inputs, %d aggregators, %d processors, %d parsers, %d outputs",


I love this additional output, thank you!

I built with a config of just inputs.diskio and outputs.file and I ended up seeing:

2022-08-16T17:43:50Z I! Available plugins: 3 inputs, 0 aggregators, 0 processors, 0 parsers, 2 outputs

Which are:

telegraf-sven on  minify [?] via 🐹 v1.19 ❯ ./telegraf --input-list Available Input Plugins: diskio io system telegraf-sven on  minify [?] via 🐹 v1.19 ❯ ./telegraf --output-list Available Output Plugins: file wavefront

Two questions:

Why is wavefront pulled in? I understand the other inputs due to dependencies

Should we document this in the README, additional plugins may get pulled in due to dependencies. This could be a single sentence added to the readme.

It seems wavefront is coming through the backyard. ;-) The route is

file output imports github.com/influxdata/telegraf/plugins/serializers

serializers/registry.go imports github.com/influxdata/telegraf/plugins/serializers/wavefront

wavefront serializer imports github.com/influxdata/telegraf/plugins/outputs/wavefront 😨

This happens with the epic comment:

// TODO: this dependency is going the wrong way: Move MetricPoint into the serializer.

🤦‍♂️

I guess this has to be fixed, but is not an issue of custom_builder... What do you think?

Should we document this in the README, additional plugins may get pulled in due to dependencies. This could be a single sentence added to the readme.

Absolutely. Will do.

lol @ that comment, let's circle around and fix that after this is merged. I can file an issue.

powersj · 2022-08-16T18:11:51Z

Also tested on Arch, M1 mac, and Windows 10.

powersj · 2022-08-16T18:16:22Z

Cross builds look good too (btw this is why I want us to continue to print out the GOOS and GOARCH values during a build, really nice to see it was correctly passed in logs):

❯ GOARCH=arm64 ./tools/custom_builder/custom_builder --config config.toml 
2022/08/16 12:12:27 Importing configuration file(s)...
2022/08/16 12:12:27 Found 1 configuration files...
-------------------------------------------------------------------------------
Enabled plugins:
-------------------------------------------------------------------------------
aggregators (0):
-------------------------------------------------------------------------------
inputs (1):
  diskio                          plugins/inputs/diskio
-------------------------------------------------------------------------------
outputs (1):
  file                            plugins/outputs/file
-------------------------------------------------------------------------------
parsers (0):
-------------------------------------------------------------------------------
processors (0):
-------------------------------------------------------------------------------
2022/08/16 12:12:27 Running build...
<snip>
go build -tags "custom,inputs.diskio,outputs.file" -ldflags " -X main.commit=fa97580b -X main.branch=minify -X main.goos=linux -X main.goarch=arm64 -X main.version=1.24.0-fa97580b" ./cmd/telegraf
❯ file telegraf 
telegraf: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, Go BuildID=9xJ4G6WJh0b-9Uf0UKD6/yN9qkhpWV9JFhM9ENyVi/ELYdZ_uJi_13QU0hOSZG/a7LNWdwFMeM2yLTBYtpV, with debug_info, not stripped

… plugins per category on startup.

telegraf-tiger · 2022-08-17T20:43:25Z

Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

☺️ This pull request doesn't significantly change the Telegraf binary size (less than 1%)

📦 Click here to get additional PR build artifacts

Artifact URLs

DEB	RPM	TAR GZ	ZIP
amd64.deb	aarch64.rpm	darwin_amd64.tar.gz	windows_amd64.zip
arm64.deb	armel.rpm	darwin_arm64.tar.gz	windows_i386.zip
armel.deb	armv6hl.rpm	freebsd_amd64.tar.gz
armhf.deb	i386.rpm	freebsd_armv7.tar.gz
i386.deb	ppc64le.rpm	freebsd_i386.tar.gz
mips.deb	riscv64.rpm	linux_amd64.tar.gz
mipsel.deb	s390x.rpm	linux_arm64.tar.gz
ppc64el.deb	x86_64.rpm	linux_armel.tar.gz
riscv64.deb		linux_armhf.tar.gz
s390x.deb		linux_i386.tar.gz
		linux_mips.tar.gz
		linux_mipsel.tar.gz
		linux_ppc64le.tar.gz
		linux_riscv64.tar.gz
		linux_s390x.tar.gz
		static_linux_amd64.tar.gz

powersj

🎉 thank you, I really like how this ended up!

telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Jul 19, 2022

This was referenced Jul 19, 2022

feat: Build configuration tool #8519

Closed

A builder for custom, smaller telegraf builds #5809

Closed

srebhan force-pushed the minify branch from dd5b3be to 717b407 Compare July 28, 2022 12:35

srebhan requested review from reimda and powersj July 28, 2022 13:36

powersj reviewed Jul 28, 2022

View reviewed changes

powersj reviewed Jul 29, 2022

View reviewed changes

tools/minify/README.md Outdated Show resolved Hide resolved

srebhan force-pushed the minify branch from 89223d7 to 97146b0 Compare August 3, 2022 17:04

reimda mentioned this pull request Aug 9, 2022

chore: use build tags to select plugins #11654

Merged

srebhan force-pushed the minify branch from 97146b0 to 959c8e9 Compare August 15, 2022 21:04

srebhan changed the title ~~feat: Tool to minify Telegraf~~ feat: Tool to customize Telegraf builds Aug 15, 2022

srebhan requested a review from powersj August 15, 2022 21:05

sspaink approved these changes Aug 16, 2022

View reviewed changes

powersj reviewed Aug 16, 2022

View reviewed changes

srebhan requested review from powersj and sspaink August 17, 2022 20:20

srebhan added 11 commits August 17, 2022 22:20

Add MustCompile function for the filter to handle static use-cases.

49f4340

Export LoadConfigFile() to be used by customizer.

868f903

Add customizer tool.

ca08f52

Build customizer tool and setup gitignore and linter.

73b2087

Tag telegraf as customized if it is and print the number of available…

45690b0

… plugins per category on startup.

Fix compatibility for go < 1.19.

659965d

Actually it's better to use a comma-separated list.

397bc19

Fix make-command environment to include the OS settings.

835d980

Rename 'customizer' to 'custom_builder'.

e017583

Add notes about dependency includes.

63f9352

Only run 'make build' to avoid deleting the custom_builder binary.

32b3cf5

srebhan force-pushed the minify branch from 4c2b97f to 32b3cf5 Compare August 17, 2022 20:23

sspaink approved these changes Aug 18, 2022

View reviewed changes

powersj approved these changes Aug 18, 2022

View reviewed changes

powersj merged commit f1ce84f into influxdata:master Aug 19, 2022

sjwang90 mentioned this pull request Sep 7, 2022

Telegraf 1.24 influxdata/docs-v2#4422

Closed

neelayu mentioned this pull request Sep 21, 2022

Allow decoupling of main function with runApp to allow custom distributions of Telegraf #11851

Closed

srebhan deleted the minify branch November 7, 2022 11:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Tool to customize Telegraf builds #11524

feat: Tool to customize Telegraf builds #11524

srebhan commented Jul 19, 2022 •

edited

Loading

powersj commented Jul 19, 2022

srebhan commented Jul 20, 2022 •

edited

Loading

powersj commented Jul 20, 2022 •

edited

Loading

srebhan commented Jul 20, 2022

powersj commented Jul 20, 2022 •

edited

Loading

srebhan commented Jul 28, 2022

powersj left a comment

reimda commented Aug 5, 2022 •

edited

Loading

srebhan commented Aug 8, 2022

popey commented Aug 8, 2022

powersj commented Aug 16, 2022

srebhan commented Aug 16, 2022

powersj Aug 16, 2022

srebhan Aug 17, 2022

srebhan Aug 17, 2022

powersj Aug 17, 2022

powersj commented Aug 16, 2022

powersj commented Aug 16, 2022

telegraf-tiger bot commented Aug 17, 2022

Artifact URLs

powersj left a comment

feat: Tool to customize Telegraf builds #11524

feat: Tool to customize Telegraf builds #11524

Conversation

srebhan commented Jul 19, 2022 • edited Loading

powersj commented Jul 19, 2022

srebhan commented Jul 20, 2022 • edited Loading

powersj commented Jul 20, 2022 • edited Loading

srebhan commented Jul 20, 2022

powersj commented Jul 20, 2022 • edited Loading

srebhan commented Jul 28, 2022

powersj left a comment

Choose a reason for hiding this comment

reimda commented Aug 5, 2022 • edited Loading

srebhan commented Aug 8, 2022

popey commented Aug 8, 2022

powersj commented Aug 16, 2022

srebhan commented Aug 16, 2022

powersj Aug 16, 2022

Choose a reason for hiding this comment

srebhan Aug 17, 2022

Choose a reason for hiding this comment

srebhan Aug 17, 2022

Choose a reason for hiding this comment

powersj Aug 17, 2022

Choose a reason for hiding this comment

powersj commented Aug 16, 2022

powersj commented Aug 16, 2022

telegraf-tiger bot commented Aug 17, 2022

Artifact URLs

powersj left a comment

Choose a reason for hiding this comment

srebhan commented Jul 19, 2022 •

edited

Loading

srebhan commented Jul 20, 2022 •

edited

Loading

powersj commented Jul 20, 2022 •

edited

Loading

powersj commented Jul 20, 2022 •

edited

Loading

reimda commented Aug 5, 2022 •

edited

Loading