Allow defining inputs only for testing #97

Closed
marc-gr opened this issue Dec 16, 2020 · 5 comments

Comments

@marc-gr
Contributor

marc-gr commented Dec 16, 2020

We have several packages that define inputs only for running system tests.

This is mostly because it is impractical to set up a running instance of the package's regular input (e.g. s3+sqs, httpjson). In these situations we set up a logfile input that we use only for the system tests, but it is still visible to end users through the UI, which can lead to confusion.

It would be nice to have a way of defining testing-only inputs, or an option to hide a whole input from the UI, similar to what we do with variables and the show_user flag.
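To make the request concrete, here is a minimal sketch of the filtering a UI could apply. It assumes a hypothetical input-level `show_user` key that mirrors the existing variable-level flag; the key, the dict layout, and the `visible_inputs` helper are illustrative only and not part of the current spec.

```python
from typing import Any


def visible_inputs(streams: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return only the streams a UI should render.

    Assumption: a missing "show_user" key means the input is visible.
    """
    return [s for s in streams if s.get("show_user", True)]


# Hypothetical stream definitions, written as plain dicts for illustration.
streams = [
    {"input": "httpjson", "title": "Collect via API"},
    {"input": "logfile", "title": "Test-only logfile", "show_user": False},
]

print([s["input"] for s in visible_inputs(streams)])  # prints ['httpjson']
```

With something like this, the logfile input would remain available to system tests while staying out of the end user's view.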

@mtojek
Contributor

mtojek commented Dec 16, 2020

Test cases that can't be expressed with standard logfiles (e.g. S3, SQS) can be expressed with input events. In other words, you are not providing an emulator that produces fake input data; you are using entire recorded events (with all relevant fields). Did you try this option?

Another option is using cloud emulators. I didn't analyze the case for GCP, but maybe it can cover your use case.

Just to be clear, I'm not blocking your proposal, it's a valid call. I'd like us to make sure that this is the best approach for introducing special data streams "for testing only".

/cc @ruflin @ycombinator @kaiyan-sheng

@marc-gr
Contributor Author

marc-gr commented Dec 16, 2020

If I understand you correctly, you are referring to the input events defined in pipeline tests. Our intent when using the setup I mentioned is to be able to test the end-to-end flow using system tests when there is no practical way of emulating the input, so we can test not only the pipeline but also any previous processing we might do. Hope this adds a bit more context about our use case.

I agree that when there is a possibility to emulate, we should favor that over this approach (GCP is one case, as you mentioned).

@mtojek
Contributor

mtojek commented Dec 16, 2020

> so we can test not only the pipeline but also any previous processing we might do.

Understood. With pipeline tests there is a gap indeed (previous processing).

I wonder if introducing a "show_user" option is the right choice. Maybe we need to introduce a full-blown developer mode in Kibana, which could unlock special features, like data streams "for testing". We may not end up with only the "show_user" flag, but may want to introduce more of them in the future.

@ruflin
Contributor

ruflin commented Dec 17, 2020

@marc-gr Any chance you could link us to some example inputs you are trying to test?

The processing you are doing here on the edge and want to test: does it retrieve any local data, or is it independent of the environment, so that in theory it could also be done centrally? I like the idea of e2e testing, and elastic-package should help here, but if the purpose is to test the input processing, maybe parts of it should happen in the input itself?

It seems that what is requested here is not a special data stream but potentially an additional input config + stream.yml? Ideally, we would not mix test configs with production configs, so the final shipped package would have these test configs removed. Could these be somewhere in the _dev folder?
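As a rough sketch of the "shipped package removes these test configs" idea, a build step could drop every file that lives under a `_dev` directory. The file paths and the `shippable` helper below are hypothetical and only illustrate the filtering; this is not how any existing tool is known to implement it.

```python
from pathlib import PurePosixPath


def shippable(paths: list[str]) -> list[str]:
    """Keep only files that do not live under a _dev directory."""
    return [p for p in paths if "_dev" not in PurePosixPath(p).parts]


# Hypothetical package layout; the test-only stream template sits under _dev.
files = [
    "data_stream/log/manifest.yml",
    "data_stream/log/agent/stream/httpjson.yml.hbs",
    "data_stream/log/_dev/test/system/logfile.yml.hbs",
]

for path in shippable(files):
    print(path)
```

Matching on path components rather than on a substring avoids accidentally excluding a file whose name merely contains `_dev`.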

@marc-gr Assuming you are using JS processors, how do you make sure you keep the production and testing JS processors in sync?

@andrewkroh
Member

We have been able to add system tests for most of the input types that we are using today, so the need for test-only inputs is mostly gone. We're emulating and simulating the things we need to test against (e.g. streaming syslog over udp/tcp/tls, making webhook callbacks, emulating google pubsub), and we will continue to put effort into this approach. I'm going to close this proposal. Thanks for discussing the idea.

One place where we don't have a solution yet is AWS SQS testing. This affects AWS CloudTrail, where we'd like to be able to test that the Handlebars template and Filebeat input settings are working correctly. One option might be to set up CI to run a test against AWS.

> @marc-gr Assuming you are using JS processors, how do you make sure you keep the production and testing JS processors in sync?

Unrelated to this proposal, keeping the edge processing configuration in sync across multiple inputs is a problem we have. It's worst when there is a script processor, but even when it's a simple add_fields processor you must remember to update them all. Moving this processing into ingest node should eliminate the problem, but it will take some time to change them all over.
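Until the processing moves to ingest node, a small check could at least flag drift between inputs. The sketch below assumes the processor lists of each input have already been loaded into Python dicts; the `out_of_sync` helper, the input names, and the field values are hypothetical.

```python
import json
from collections import defaultdict
from typing import Any


def out_of_sync(processors_by_input: dict[str, list[dict[str, Any]]]) -> list[list[str]]:
    """Group inputs by their exact processor list.

    Returns an empty list when every input shares the same processors,
    otherwise the groups of input names that differ from each other.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for name, processors in processors_by_input.items():
        # Serialize with sorted keys so dict ordering does not cause false alarms.
        groups[json.dumps(processors, sort_keys=True)].append(name)
    return [] if len(groups) <= 1 else sorted(groups.values())


# Hypothetical configs: the logfile copy of add_fields was not updated.
configs = {
    "httpjson": [{"add_fields": {"target": "", "fields": {"ecs.version": "1.7.0"}}}],
    "logfile": [{"add_fields": {"target": "", "fields": {"ecs.version": "1.6.0"}}}],
}

print(out_of_sync(configs))
```

The comparison is deliberately strict: the order of processors in the list matters, since processors run in sequence.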

rw-access pushed a commit to rw-access/package-spec that referenced this issue Mar 23, 2021
* Updating package-spec version

* Running go get -u

* Updating go.mod