Add Linux build on GPU on Github Actions #9335

Closed

luhenry wants to merge 19 commits

Conversation

luhenry (Contributor) commented Apr 2, 2024

No description provided.

facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Apr 2, 2024
netlify bot commented Apr 2, 2024

Deploy Preview for meta-velox canceled.

Latest commit: 6fbfa92
Latest deploy log: https://app.netlify.com/sites/meta-velox/deploys/661e9cc989da3c0008892f05

luhenry (Contributor, Author) commented Apr 2, 2024

cc @assignUser @Yuhta @pedroerp @kgpai

It's building on all pull requests. The testing would happen on a subset of PRs or on nightlies, as discussed previously, and would be done in a different job. I've tested locally with fmt version 9.1.0 as well, and it compiles with both CUDA 11.8 and 12.4.

Could you please also let me know which CUDA versions are used internally, so we can add them to the test matrix here and not break the build?

facebook-github-bot (Contributor) commented

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor) commented

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

assignUser (Collaborator) left a comment

Looks good, but I think we should just integrate it into the adapters build, as the only difference between the two is the additional installation of CUDA plus -DVELOX_ENABLE_GPU=ON.

That way we only add one additional build instead of two. Another option would be to build only the GPU parts, skipping anything not needed (e.g. the connectors, I would assume?).

If I understand correctly, the tests will be skipped automatically when no CUDA device is detected anyway, right?
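
A rough sketch of what folding this into the adapters job could look like (job layout, step names, and the install helper are illustrative assumptions, not the exact workflow in this PR; only -DVELOX_ENABLE_GPU=ON comes from the discussion above):

```yaml
# Hypothetical fragment of .github/workflows/linux-build.yml:
# the existing adapters job gains a CUDA install step and the extra CMake flag.
adapters:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Install CUDA toolkit
      run: ./scripts/setup-cuda.sh   # hypothetical helper; the real setup step may differ
    - name: Build with GPU support
      run: |
        cmake -B _build -DVELOX_ENABLE_GPU=ON
        cmake --build _build -j "$(nproc)"
```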

Review threads (outdated, resolved): scripts/setup-centos8.sh, .github/workflows/linux-build.yml (×2)
luhenry (Contributor, Author) commented Apr 3, 2024

@assignUser ok, sounds good, I'll fold it into the adapters build.

assignUser (Collaborator) commented

@luhenry Thinking on this a bit more, what are the minimal flags you need? If the build is much smaller (and we potentially skip building the tests), it might be faster to do two standalone builds vs. one additional adapters build.

luhenry (Contributor, Author) commented Apr 9, 2024

@assignUser the build isn't much smaller, as Wave depends (directly or indirectly) on most of the other Velox libraries. Also, it doesn't add much to the existing build in terms of compile time or produced binaries.

If you're concerned about resource/machine usage, we can build against a limited set of CUDA versions (which one(s) would you like to focus on?) and build the larger set on the nightlies.

assignUser (Collaborator) commented

> If you're concerned about resource/machine usage, we can build against a limited set of CUDA versions (which one(s) would you like to focus on?) and build the larger set on the nightlies.

Yep, that was my concern. Sounds like the most efficient way would be adding the Wave build to the existing adapters build with one CUDA version (someone else should choose which one ^^) plus a nightly matrix build with multiple versions. cc @kgpai
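
Such a nightly matrix might look roughly like this (the schedule, job name, and runner are assumptions; 11.8 and 12.4 are the versions mentioned earlier in the thread):

```yaml
# Hypothetical nightly workflow building against several CUDA versions.
name: linux-gpu-nightly
on:
  schedule:
    - cron: "0 3 * * *"   # assumed nightly time
jobs:
  gpu-build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        cuda: ["11.8", "12.4"]   # extend as needed
    env:
      CUDA_VERSION: ${{ matrix.cuda }}
    steps:
      - uses: actions/checkout@v4
      - name: Build with GPU support
        run: |
          cmake -B _build -DVELOX_ENABLE_GPU=ON
          cmake --build _build -j "$(nproc)"
```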

luhenry (Contributor, Author) commented Apr 10, 2024

> If you're concerned about resource/machine usage, we can build against a limited set of CUDA versions (which one(s) would you like to focus on?) and build the larger set on the nightlies.

> Yep, that was my concern. Sounds like the most efficient way would be adding the Wave build to the existing adapters build with one CUDA version (someone else should choose which one ^^) plus a nightly matrix build with multiple versions. cc @kgpai

I've changed it to a CUDA_VERSION env variable, currently set to 11.8. Happy to change that value to anything else. I'll add the nightly in a follow-up PR since it will require working with GPU runners, which I don't know how to access yet.
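
For reference, a minimal sketch of such an env variable and the matching install step (the fragment layout and the bash version-to-package mapping are illustrative assumptions; the package names match the yum install mentioned later in the thread):

```yaml
# Hypothetical fragment: pin one CUDA version for the per-PR adapters build.
env:
  CUDA_VERSION: "11.8"
steps:
  - name: Install CUDA compiler and runtime
    run: |
      # e.g. 11.8 -> 11-8, matching the RPM package naming
      suffix="${CUDA_VERSION//./-}"
      yum install -y "cuda-nvcc-${suffix}" "cuda-cudart-${suffix}"
```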

assignUser (Collaborator) commented

@luhenry looks good, but could you change the test-skipping code so it doesn't cause gtest to report a failed test? Otherwise this will always fail the adapters job, which is a bit unfortunate ^^

I was about to suggest installing CUDA in the Docker image, but it seems that GHA has great mirrors and the install only takes 27s, so that's probably not much slower than the additional ~1 GB to download if we add it to the container :D
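
As a minimal sketch of that kind of skip (not the actual change that landed in this PR; the suite and test names here are made up), a gtest-based GPU test can bail out with a skip so gtest reports it as skipped rather than failed:

```cpp
#include <cuda_runtime.h>
#include <gtest/gtest.h>

// Hypothetical GPU test: skip instead of failing when the runner has no
// usable CUDA device (e.g. the CPU-only adapters job).
TEST(WaveGpuTest, HashTable) {
  int deviceCount = 0;
  if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
    GTEST_SKIP() << "No CUDA device available, skipping GPU test";
  }
  // ... the actual GPU assertions would go here ...
}
```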

luhenry (Contributor, Author) commented Apr 11, 2024

> @luhenry looks good, but could you change the test-skipping code so it doesn't cause gtest to report a failed test? Otherwise this will always fail the adapters job, which is a bit unfortunate ^^

Fixed with 5c3780e.

> I was about to suggest installing CUDA in the Docker image, but it seems that GHA has great mirrors and the install only takes 27s, so that's probably not much slower than the additional ~1 GB to download if we add it to the container :D

It should be adding ~217MB, as that's what gets installed by `yum install cuda-nvcc-11-8 cuda-cudart-11-8` (per the logs: https://github.com/facebookincubator/velox/actions/runs/8627576322/job/23662809679#step:5:89)

assignUser (Collaborator) commented

> It should be adding ~217MB

I did local testing (with both versions, though) and that added ~800 MB to the final image, but in any case it's fine as it is for now. :)

luhenry (Contributor, Author) commented Apr 11, 2024

@assignUser I've just merged in the main branch, as the build is failing with https://github.com/facebookincubator/velox/actions/runs/8643569195

assignUser (Collaborator) commented

@luhenry yeah, the fix is in #9451; feel free to apply it to this PR. I hope it gets merged soon ^^

luhenry (Contributor, Author) commented Apr 11, 2024

> @luhenry yeah, the fix is in #9451; feel free to apply it to this PR. I hope it gets merged soon ^^

Done.

luhenry (Contributor, Author) commented Apr 11, 2024

@assignUser all tests are now passing.

assignUser (Collaborator) left a comment

Looks good on the CMake and workflow front. I'd like a C++ approval as well, but then this should be ready to merge :)

assignUser requested a review from Yuhta on April 12, 2024, 16:00
facebook-github-bot (Contributor) commented

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Yuhta (Contributor) commented Apr 15, 2024

@luhenry Can you rebase onto the latest main?

facebook-github-bot (Contributor) commented

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Review threads (outdated, resolved): velox/experimental/gpu/tests/HashTableTest.cu (×2), velox/experimental/wave/common/Cuda.cu
luhenry (Contributor, Author) commented Apr 16, 2024

@Yuhta fixed.

facebook-github-bot (Contributor) commented

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot (Contributor) commented

@Yuhta merged this pull request in 1e901c9.


Conbench analyzed the 1 benchmark run on commit 1e901c9d.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Jun 7, 2024
Summary: Pull Request resolved: facebookincubator#9335

Reviewed By: kagamiori

Differential Revision: D55646566

Pulled By: Yuhta

fbshipit-source-id: 7c1cc2ef8db3da4e9f9889b9c7bcde52283e6bb2
Labels: CLA Signed, Merged
4 participants