-
Notifications
You must be signed in to change notification settings - Fork 584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CI workflow to run android tests on mobile phones #13024
Conversation
- platform: android | ||
arch: armv8.2-a | ||
abi: arm64-v8a | ||
docker_image: "gcr.io/iree-oss/android@sha256:3f641d25786b1e5e430ee4cacb8bfe57540fda5ecaa7ca2802c179c26e77ce09" | ||
build_script: "./build_tools/cmake/build_android.sh" | ||
# No test_script |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It breaks out the Android target from the matrix cross_compile_and_test. I think it makes sense to do so as Android has its own matrix to run tests on multiple devices.
I'd prefer to add to the existing matrix (android_pixel_4, android_pixel_6, etc.). We can add another setup script or more env vars as needed for that. This PR is adding a lot of boilerplate code right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK we are not able to do that because android test workflow needs two different machines: 1. build machine to do cross-compile and 2. mobile phones to run tests. So they must be run in different jobs, but the matrix only supports a single job.
Ideally in the test step we can call a reusable workflow to run tests on different devices, but I don't think it's possible right now (reusable workflow must be called on the job level: https://docs.github.com/en/actions/using-workflows/reusing-workflows#calling-a-reusable-workflow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another way is having a matrix job to cross-compile and upload test artifacts, and having another matrix job to download and run tests on different machines. But that will introduce unnecessary deps between cross-compile and test jobs (android tests need to wait for all cross-compilation finished because it can't just wait on the android cross-compile in the matrix due to Github limitation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because android test workflow needs two different machines: 1. build machine to do cross-compile and 2. mobile phones to run tests
Isn't that the point of cross compiling in general though? Every job in the matrix will want to
- run the already built compiler the generate program artifacts
- build the runtime (fast, so doesn't need a large build machine)
- start an emulator, connect to a device, etc.
- run tests on the emulated platform / connected device
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, but previously we can run build and test in a single job because we only use emulators to run tests on the same build machine. However it's not the case now because android tests are running on mobile phones and those devices don't connect to the build machine.
So we want something like
jobs:
cross_compile_and_test:
matrix:
- arch: arm_64
test_device: pixel-4-rpi
- arch: riscv_64
test_device: riscv-emulator
steps:
cross_compile:
run-on: x86_64_build_machine
runs: ./cross_compile.sh ${matrix.arch}
test:
# This is not possible today because `run-on` can only be at the job level.
run-on: ${matrix.test_device}
runs: ./run_test.sh
But the problem is we can't run steps on different devices in a single job and the matrix is limited to a single job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Desktop GPU machines should be plenty powerful enough to build the runtime and compile any artifacts they need though
? We're only talking about Android here. We definitely would like to run Android emulator tests, but we're still going to need physical devices for real GPU tests and benchmarking.
It's annoying that GitHub Actions doesn't give us the right tools here. Looking at the graph, it seems like making the Android tests depend on the full cross-compilation wouldn't be too awful:
If we do, we should definitely leave a note about what's going on though.
An alternative would be to factor cross-compile into a reusable workflow, so that there's less duplication. I'm not sure whether passing the all the arguments ends up being as complicated as just copy-pasting the whole thing though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we put it in a reusable workflow, can we get a matrix to run multiple jobs? https://docs.github.com/en/actions/using-workflows/reusing-workflows#using-a-matrix-strategy-with-a-reusable-workflow
The reusable workflow would do both compilation and testing and that would be abstracted away from the caller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we put it in a reusable workflow, can we get a matrix to run multiple jobs? https://docs.github.com/en/actions/using-workflows/reusing-workflows#using-a-matrix-strategy-with-a-reusable-workflow
The reusable workflow would do both compilation and testing and that would be abstracted away from the caller
I think that is a good solution to reduce boilerplate code. But Github CI decides to not show the dependency graph for the reusable workflow in a matrix... (the two jobs in the screenshot below should be cross-compile
-> test
). Feel like this makes the readability worse
(from the run https://github.com/openxla/iree/actions/runs/4735027150)
That's pretty slow for pure overhead of something we want to run on presubmit. Will that be a bottleneck if we have multiple PRs and merged commits all queuing to use a single RPi? We might be able to proceed anyways, but the time burnt on overhead and the extra workflow complexity concerns me.
I updated the PR description. This test workflow is intentionally to only run on postsubmit for now (as we are currently doing with the buildkite test pipeline), as we only have two devices for each model.
I also want to mention the big benefit to proceeding is that we can drop the buildkite pipeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After trying more with reusable workflows and matrix jobs, I decided to stick with the original separate android workflow. Added the comment to explain why: https://github.com/openxla/iree/pull/13024/files#diff-b803fcb7f17ed9235f1e5cb1fcd2f5d3b2838429d4368ae4c57ce4436577f03fR935-R941
Basically the requirements of running on physical (non-scalable) devices and limited capabilities of matrix jobs make the Android test more complicated.
Eventually other platforms might also have physical devices to run. At that point we can generalize the current Android workflow to reuse it. But that will require passing more parameters and customisable/dynamic logic in the workflow and I don't think it will happen in the near future, so maybe we shouldn't over-generalized the solution right now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, while I'm not thrilled that we have to do it this way, the other near-term options are not awesome either. I think we should merge this as-is
b06ec61
to
6e5b2d2
Compare
b2f5fb9
to
7f5ea4b
Compare
- platform: android | ||
arch: armv8.2-a | ||
abi: arm64-v8a | ||
docker_image: "gcr.io/iree-oss/android@sha256:3f641d25786b1e5e430ee4cacb8bfe57540fda5ecaa7ca2802c179c26e77ce09" | ||
build_script: "./build_tools/cmake/build_android.sh" | ||
# No test_script |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, while I'm not thrilled that we have to do it this way, the other near-term options are not awesome either. I think we should merge this as-is
01c981a
to
764401e
Compare
Add a new workflow to cross-compile and run tests on Android devices. It breaks out the Android target from the matrix `cross_compile_and_test`. I think it makes sense to do so as Android has its own matrix to run tests on multiple devices. Currently this will only run on postsubmit due to the limited capacity of the mobile devices.
Add a new workflow to cross-compile and run tests on Android devices. It breaks out the Android target from the matrix `cross_compile_and_test`. I think it makes sense to do so as Android has its own matrix to run tests on multiple devices. Currently this will only run on postsubmit due to the limited capacity of the mobile devices.
Add a new workflow to cross-compile and run tests on Android devices.
It breaks out the Android target from the matrix
cross_compile_and_test
. I think it makes sense to do so as Android has its own matrix to run tests on multiple devices.Currently this will only run on postsubmit due to the limited capacity of the mobile devices.