Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

E2E test optimization for multi-distro support #319

Merged
merged 27 commits into from
Mar 5, 2022

Conversation

YrrepNoj
Copy link
Contributor

@YrrepNoj YrrepNoj commented Feb 14, 2022

This PR updates the GitHub Actions to run e2e pipeline in on the native GitHub runners (instead of needing to use chat-ops and Terratest to stand up AWS EC2 instances to perform the tests on.)

This PR enables the following:

  • Modular e2e util code to easily test multiple k8s distros with the same test suite.
  • Ability to run e2e test suite locally on a local cluster (will even stand up a cluster for you if one is not currently running)
  • Updates the GitHub Actions pipeline to automatically run test for the KinD and K3D distros (without needing chat-ops to kick off the test)

@RothAndrew

This comment was marked as resolved.

@YrrepNoj YrrepNoj force-pushed the 100-e2e-test-refresh branch 4 times, most recently from cac7753 to a7f0e21 Compare February 22, 2022 18:41
@YrrepNoj YrrepNoj changed the title WIP Prototype: Initial attempt at reworking e2e tests Initial attempt at reworking e2e tests Feb 22, 2022
@YrrepNoj YrrepNoj marked this pull request as ready for review February 22, 2022 19:39
@RothAndrew
Copy link
Contributor

  • The pull_request event simulates a merge of the pull request, and triggers Actions workflows based on the configuration and code at the merge commit. This is intended for e.g. running tests, and verifying that the code would still work if the pull request was merged. However, since the code in the pull request is potentially malicious, workflows triggered by the pull_request event are run without access to the repository’s secrets.
  • The pull_request_target event triggers Actions workflows based on the configuration and code at the base branch of the pull request. Since the base branch is part of the base repository itself and not part of a fork, workflows triggered by pull_request_target are trusted and run with access to secrets. This is intended for e.g. adding comments and labels to new pull requests (which requires a GitHub API token).

Actions triggered from pull_request can be run from forks, but they won't have secrets, so we would not be able to run Terratest things in this way, since it requires access to the AWS access and secret keys.

That just leaves annoyances like malicious PRs that mine bitcoin or just sit there spending our GHA minutes that are, while annoying, not security risks. Sounds like GitHub will require approval to run any workflows for first time contributors, but then will let them through after that.

Questions:

  • @YrrepNoj how are you planning to do K3s-based testing using the new setup you've been working on?
  • It looks like different distros run in parallel, but the test suite runs serially. Do have that right? If so, is that going to be sustainable as our test suite grows? You mentioned that it runs in the same amount of time now as the current suite does, but as we add more tests the test runs will get longer and longer.

@RothAndrew
Copy link
Contributor

  • How will this handle running in different linux distros?

@RothAndrew
Copy link
Contributor

RothAndrew commented Feb 23, 2022

  • Using this method, how can we handle doing tests on cloud distros like EKS? Workflows that use the pull_request trigger don't have access to repo secrets

EDIT: *Workflows that use the pull_request trigger submitted from forks

@RothAndrew
Copy link
Contributor

I suppose we can do a combination, with part of the test suite being runnable using the pull_request event, and others that require access to secrets being kicked off manually like we do now.

@YrrepNoj
Copy link
Contributor Author

  • Using this method, how can we handle doing tests on cloud distros like EKS? Workflows that use the pull_request trigger don't have access to repo secrets

I'll look through your comments more tomorrow but I don't think this is a fully accurate statement. The test pipeline here still uses the repo REGISTRY credential secrets to sign into repo1 to build packages.

@RothAndrew
Copy link
Contributor

https://docs.github.com/en/actions/managing-workflow-runs/approving-workflow-runs-from-public-forks

image

It's working because you aren't submitting the PR from a fork.

@jeff-mccoy jeff-mccoy changed the title Initial attempt at reworking e2e tests E2E test optimization for multi-distro support Feb 26, 2022
@RothAndrew
Copy link
Contributor

^^^ That comment should not work, since rothandrewseviltwin is not a developer of this repo

@YrrepNoj YrrepNoj force-pushed the 100-e2e-test-refresh branch from b408f5f to 2b596fa Compare March 1, 2022 22:13
@YrrepNoj YrrepNoj added the test label Mar 1, 2022
This commit brings back the terratest supported test for the example game package.
This verifies zarf can init with k3s as a component and deploy a package on it
@YrrepNoj YrrepNoj force-pushed the 100-e2e-test-refresh branch from 629f5d6 to d9b7a6c Compare March 3, 2022 06:18
@YrrepNoj
Copy link
Contributor Author

YrrepNoj commented Mar 3, 2022

@RothAndrew @jeff-mccoy This is ready for review now.

Also, I figured out what we needed to do to test Zarf with it's self-initialized k3s cluster on a native GitHub Runner so we don't need to rely on Terratest standing up an ec2 instance for that. I currently have the slash-dispatch action still there to test the game_example on an ec2 instance but was going to remove it since we don't need that anymore.

Any thoughts on that? I know we'll need it in the future when testing other packages/linux distros but I don't think we need it currently?

@jeff-mccoy
Copy link
Contributor

This is super good news. I didn't realize that native github runners were actually VMs (unlike gitlab land where we were in containers). I think for now we can just go all in on the native runners and drop the EC2 / terratest requirement and the chat-ops requirement. Since native runners are only Ubuntu, we will need to standup self-hosted runners later on for other Linux Distros to satisfy #101, but I'm a huge fan of not having to leverage terraform for the majority of the tests.

One thing I do think we are losing that we need to discuss though is the current terratest setup does a better job of breaking the tests down into smaller identifiable chunks. I know it's possibly more yaml soup--but I think we shouldn't lose that fidelity if can prevent it. Thoughts?

@jeff-mccoy jeff-mccoy removed the test label Mar 4, 2022
@YrrepNoj
Copy link
Contributor Author

YrrepNoj commented Mar 4, 2022

I'm not opposed to it, but I'm also not convinced we need that level of granularity we had before. Were you referencing the way the current tests have a different response status for each e2e test? That would lead to a large number of workflow yaml files but would solve any future issues with the test suite taking too long to run.

Another option is that we could break that down a little more and have each test run in its own step, at the potential cost of a handful more seconds of overhead per test. Currently this PR runs all of the tests in the Run Tests step, the output can be seen in the action details which prints the individual test results as well as the stdout/stderr for any failing tests.

@RothAndrew
Copy link
Contributor

RothAndrew commented Mar 4, 2022

IMO we should not completely strip out the chatops and terratest framework, for 2 reasons:

  1. We need the chatops for any tests that require secrets
  2. We will very soon need a way to do a test on an EKS cluster

I won't hold up this PR if you guys feel differently, though I suspect if we rip that stuff out we're going to end up putting some/most of it back in later. Which is fine as long as we can go pull it from the git history.

@jeff-mccoy
Copy link
Contributor

I personally just like the fidelity in the PR we get for showing test results here vs digging into the logs, but it's super preference. @RothAndrew, you're probably right, we will likely need to bring back something later on unless we discover a new way to manage the secret concern--but I'd rather pull chatops until we are certain we need it and then go from there. What are your thoughts on the test breakdown in this view vs in the logs?

@jeff-mccoy
Copy link
Contributor

Thinking about this a little more I think if we can get the ADR done we can table breaking out the tests more for another issue @YrrepNoj

@YrrepNoj YrrepNoj force-pushed the 100-e2e-test-refresh branch from e7a52d4 to f915f8a Compare March 5, 2022 10:51
Copy link
Contributor

@jeff-mccoy jeff-mccoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a major overhaul of the test, but nothing is too scary in the code. I think this will end up putting us on a much better local and ci testing path. Thanks for working this.


.PHONY: test-e2e
test-e2e: package-example-game test-cloud-e2e-example-game ## DEPRECATED - to be replaced by individual e2e test targets
# TODO: This can be cleaned up a little more when `zarf init` is able to provide the path to the `zarf-init.tar.zst`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You actually don't have to do zarf init, when you run a package deploy or even zarf zarf-init.tar.zst, it should do the exact same thing now. Init is nothing more than a fixed shortcut right now to use the init package in the same directory.

@jeff-mccoy
Copy link
Contributor

/test all

@jeff-mccoy
Copy link
Contributor

Seems a little strange to do that but keeps in line with our existing process

@jeff-mccoy
Copy link
Contributor

actually n/m that won't be possible 😆. Okay I'm going to merge this.

@jeff-mccoy jeff-mccoy merged commit ef8e280 into master Mar 5, 2022
@jeff-mccoy jeff-mccoy deleted the 100-e2e-test-refresh branch March 5, 2022 19:03
Noxsios pushed a commit that referenced this pull request Mar 8, 2023
* add kind testing
* add k3d testing
* add k3s testing
* remove chat based e2e test
* cleanup of e2e test code
* adr for e2e testing

Co-authored-by: Jeff McCoy <code@jeffm.us>
Co-authored-by: Minimind <882485+jeff-mccoy@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants