[Test] Improve Katib CI/CD GitHub Actions #2024
One thing that might help is to avoid concurrent builds on the same PR (in case people push multiple commits that trigger separate builds): https://github.com/argoproj/argo-workflows/blob/master/.github/workflows/ci-build.yaml#L12-L14. We should also utilize the cache on GitHub Actions. If the image builds are time-consuming, we should consider pre-building an image cache that Docker can use.
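A minimal sketch of the cancel-in-progress idea, assuming a workflow triggered on pull requests (the group key shown here is illustrative, not taken from the Katib workflows):

```yaml
# Cancel any in-progress run for the same workflow and PR/branch
# when a new commit is pushed, so only the latest build runs.
concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
  cancel-in-progress: true
```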
I agree with you. Change to test on
Also, I think we could separate the e2e test into two stages,
@andreyvelich Thanks for creating this issue.
Makes sense. When I migrated the e2e tests to GitHub Actions, I made all e2e tests build all suggestion images to avoid complicated shell scripts. But as you say, we can avoid the complex scripts by rebuilding the e2e tests to use the Katib Python client.
I added the step to avoid the error
I added the platform linux/amd64 to verify that we can build multi-platform images. In e2e, we only build single-platform images.
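As a hedged sketch of what a platform-specific build verification step could look like (the action versions, Dockerfile path, and step names are illustrative assumptions, not taken from the Katib workflows):

```yaml
# Illustrative check: build (without pushing) for linux/amd64 using
# Buildx, so platform-specific build breakage is caught early.
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v2
- name: Build suggestion image for linux/amd64
  uses: docker/build-push-action@v4
  with:
    context: .
    file: cmd/suggestion/hyperopt/v1beta1/Dockerfile  # hypothetical path
    platforms: linux/amd64
    push: false
```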
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Since @tenzen-y made our E2E actions very stable, we can close this issue. Thanks again for this effort!
/kind feature
/area testing
Recently, we switched to GitHub Actions for our CI/CD pipelines; thanks a lot again to @tenzen-y for driving this.
Since we currently have limitations (20 concurrent jobs, and we haven't set up AWS EC2 instances for our workers yet), we need to make some improvements to reduce execution time.
I think we can try the following:
- Should we run the `postgres` test only for the Random search experiment? We run 3 Trials for the Random experiment, so we can verify that the DB works properly.
- Can we build only the required suggestion images for each e2e test? As far as I can see, the build step takes around 15 min, which is more than half of the e2e time.
- @tenzen-y Are there any specific requirements why we clean the cache for our build image after each e2e run?
- Do we need to build images for `linux/amd64` if that is verified as part of e2e?

In the long term / a separate tracking issue, we can also do this:
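One way to avoid rebuilding suggestion images from scratch on every e2e run is Docker layer caching backed by the GitHub Actions cache. A minimal sketch, assuming Buildx is used in the workflow (the action versions and image tag are illustrative assumptions):

```yaml
# Illustrative layer-cache setup: reuse Docker build layers across
# runs via the GitHub Actions cache backend, instead of cleaning the
# build cache after each e2e run.
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v2
- name: Build suggestion image with layer cache
  uses: docker/build-push-action@v4
  with:
    context: .
    push: false
    tags: katib/suggestion-hyperopt:e2e  # hypothetical tag
    cache-from: type=gha
    cache-to: type=gha,mode=max
```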
@kubeflow/wg-training-leads @tenzen-y @anencore94 Are there any other improvements that you have in your mind ?
GitHub Actions improvements checklist
I can identify the following improvements:
- [ ] Run the `postgres` e2e only for random search.
- [ ] Use the `cancel-in-progress` API.
- [ ] Remove the `linux/amd64` build from the pre-commit check since we verify this in the E2E test.

Please let me know if we should add more items @johnugeorge @anencore94 @terrytangyuan @tenzen-y @gaocegege