Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add e2e github actions #2417

Closed
wants to merge 10 commits into from
Closed

Add e2e github actions #2417

wants to merge 10 commits into from

Conversation

annajung
Copy link
Member

Description of your changes:
Based on E2E design proposal shared with the community, add AWS e2e workflow triggered by GitHub actions

TL;DR of steps

  • Install juju
  • Use juju to configure AWS
  • Install Docker, Python, Kustomize, Kubectl, and KinD into the AWS instance
  • Deploy Kubeflow using raw manifests
  • Run the mnist python script
  • If any of the steps fail, generate KinD, kubectl describe, and kubectl pod logs and upload the artifacts
  • Always destroy the AWS instance afterward

/hold

cc @kimwnasptd @DomFleischmann

@annajung
Copy link
Member Author

Secrets are created in the Manifest AWS account and the credential has been successfully tested locally to run through all the steps

The next step is to add the necessary credential to the repo / secrets.

@kimwnasptd
Copy link
Member

Thanks for the heads up! @zijianjoy small ping here that I'll reach out so that we can configure these in the kubeflow/manifests repo

@kimwnasptd
Copy link
Member

@annajung in the interim can you also update this PR so that it triggers the E2E test (have a dummy file somewhere) so that we can verify the test picks up the credentials once we set them up?

@zijianjoy
Copy link
Contributor

Sounds good to me for this github actions, feel free to let me know if approval from me is required.

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: annajung
Once this PR has been reviewed and has the lgtm label, please assign elikatsis for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@annajung
Copy link
Member Author

Updated the trigger to run the tests for any changes to the /apps and /commons directories and added a README file to trigger the e2e test. Once the credentials are added, I believe you can just rerun the failed test

Anna Jung (VMware) added 3 commits July 24, 2023 17:06
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
Anna Jung (VMware) added 4 commits July 24, 2023 17:09
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
@annajung
Copy link
Member Author

annajung commented Aug 1, 2023

The problem with the github action failing to authenticate is due to security best practices enforced by GitHub. By default, GitHub secrets are not shared across forks to prevent any misuse (ref https://securitylab.github.com/research/github-actions-preventing-pwn-requests/)

Due to the dangers inherent to automatic processing of PRs, GitHub’s standard pull_request workflow trigger by default prevents write permissions and secrets access to the target repository.

There are two triggers that enable sharing secrets, thepull_request_target trigger and the workflow_run trigger.

  • The pull_request_target trigger's behavior is the same as the pull request trigger but allows access to secrets. However, it's not recommended to use this by itself as the original intent of the trigger is to use it for PRs that do not require running content of the PR. If the pull_request_target trigger is used, it should be used to with labels to prevent from automatic trigger. By depending on a specific label to be placed to trigger the action, it relies on a human to review the changes first before allowing the action to run via applying the label on a PR. However, for our case, I believe everyone can add a label via prow, so not sure if this is a valid option unless somehow we can control who can add what labels.

  • Theworkflow_run trigger is an action triggered by the previous action based on its status and has access to secrets. It's used to isolate the checking of the untrusted code and other steps that rely on GitHub secret. Meaning, two workflows are needed. One for checking out the source code to determine if it's safe to run the rest of the tests. If it's deemed safe, the code can be saved as an artifact and passed to the second workflow. The second workflow takes the artifact from the first workflow and triggers the test that requires access to the secret. The only thing here is, what would be the right test for the first workflow to allow the trigger of the second workflow?

Another approach or step that can be combined with the two triggers above is manual workflow approval using tools like https://github.com/marketplace/actions/manual-workflow-approval, which requires specific approval from a reviewer to trigger a workflow. Using this with pull_request_target will allow us to not rely on label, but to rely on specific reviewers. And for workflow_run, maybe this can be part of the first workflow so that a workflow with access to secrets is not triggered by any pull request, but only triggered after the approval from a person. The downside is that both approaches rely on a manual review. However, this might be okay, considering that the trigger for the e2e tests is dependent on changes to /app and /common, which are typically updated only during the kubeflow release.

Currently, I think the best way to move forward is to use the workflow_run trigger with the first workflow containing the manual review to approve the workflow. Once approved, the workflow will exit successfully and trigger the second workflow (that has access to secrets), which will run the e2e test. What do you think? @kimwnasptd

Anna Jung (VMware) added 2 commits August 3, 2023 10:56
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
@annajung
Copy link
Member Author

annajung commented Aug 3, 2023

/ok-to-test

@annajung
Copy link
Member Author

annajung commented Aug 3, 2023

Since we do need to gatekeep when the e2e test with github secrets are run, the following changes have been made

  • Add another action that requires a label ok-to-test for all changes in apps/ and common/ directories
  • Once label is added, it run another action, responsible for any pre-check verification before triggering e2e
  • If pre-check for e2e test passes, it should trigger the actual e2e test

If I understand correctly, ok-to-test is a valid command. It might be that it's valid for certain users with higher permissions. If so, we can use that as a way to trigger e2e, but who has permissions also must be reviewed.

If someone can help add ok-to-test label, we can test out rest of the github action flow

Signed-off-by: Anna Jung (VMware) <antheaj@vmware.com>
@annajung
Copy link
Member Author

annajung commented Aug 4, 2023

Looks like there is a problem with using workflow_run. According to workflow run trigger docs, it only triggers if the workflow is already merged into the main branch, so it won't be able to trigger within a PR.

Note: This event will only trigger a workflow run if the workflow file is on the default branch.

I think one way we can test this out before merging into the main branch is to create a dev branch in kubeflow/manifests and add a branch filter to the workflow to test before making a PR from dev to main.

@juliusvonkohout
Copy link
Member

@annajung does our kind cluster have PSA by default? https://kubernetes.io/docs/concepts/security/pod-security-admission/. It seems to be possible https://medium.com/@LachlanEvenson/hands-on-with-kubernetes-pod-security-admission-b6cac495cd11 but we really need it by default to test security staff for Kubeflow 1.9

@annajung
Copy link
Member Author

@juliusvonkohout since PSA feature became GA in K8s v1.25, it's enabled by default if we use kinD version > 1.25

set -eux
juju scp -- -r $(pwd)/ ubuntu/0:~/

- name: Install Docker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why docker is required?

# Download kubectl
sudo curl -L "https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl" -o /usr/local/bin/kubectl
sudo chmod +x /usr/local/bin/kubectl
kubectl version --short --client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the following output

(base) [dlovison@redhat ~]$ sudo curl -L "https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl" -o /usr/local/bin/kubectl
[sudo] password for dlovison: 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 47.5M  100 47.5M    0     0  18.6M      0  0:00:02  0:00:02 --:--:-- 18.6M
(base) [dlovison@redhat ~]$ sudo chmod +x /usr/local/bin/kubectl
(base) [dlovison@redhat ~]$ kubectl version --short --client
error: unknown flag: --short
See 'kubectl version --help' for usage.

@juliusvonkohout
Copy link
Member

juliusvonkohout commented Jan 11, 2024

/close

because anna will not continue with this and its superseded by #2544

@google-oss-prow google-oss-prow bot closed this Jan 11, 2024
Copy link

@juliusvonkohout: Closed this PR.

In response to this:

/close

because anna will not continue with this and its superseeded by #2544

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants