[QST] Should the fallback behavior when NATIVE CUDA arch detection fails just be RAPIDS? #630

Closed
vyasr opened this issue Jun 8, 2024 · 3 comments
Labels
? - Needs Triage (Need team to review and classify), question (Further information is requested)

Comments

@vyasr
Contributor

vyasr commented Jun 8, 2024

Currently, when NATIVE architectures are specified but no local GPUs are detected, rapids_cuda_set_architectures falls back to producing the list of supported architectures. This is done by passing that list of architectures to rapids_cuda_detect_architectures, which then uses it as the fallback output. The result is that if native arch detection fails, NATIVE is equivalent to RAPIDS, except that the latest virtual architecture is not built the way it is for RAPIDS. This behavior seems confusing. If it was an intentional design decision for rapids-cmake to fall back to producing all supported GPU architectures when native detection fails -- and I assume it was, since that would only occur on CPU-only machines, which are very likely being used to build packages for redistribution (e.g. our CI) -- then I would expect this fallback to also produce what we consider to be the default build option for RAPIDS.
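To make the difference concrete, here is a minimal Python sketch of the behavior described above. It is a model, not the rapids-cmake implementation: the function names and the `SUPPORTED` list are hypothetical, and the RAPIDS case is modeled as leaving the latest entry without a suffix. It relies only on the CMAKE_CUDA_ARCHITECTURES convention that a `-real` entry builds SASS only, while an unsuffixed entry builds both SASS and PTX.

```python
# Hypothetical supported list for illustration; the real list is defined by rapids-cmake.
SUPPORTED = ["70", "75", "80", "86", "90"]


def rapids_archs(supported):
    """Model of RAPIDS mode: SASS for every arch, plus PTX for the latest one."""
    *older, latest = supported
    # The latest entry is left unsuffixed so both SASS and PTX are built for it.
    return [f"{arch}-real" for arch in older] + [latest]


def native_archs(detected, supported):
    """Model of NATIVE mode: SASS only, for detected GPUs or else the supported list."""
    base = detected or supported  # an empty detection result triggers the fallback
    return [f"{arch}-real" for arch in base]


print(rapids_archs(SUPPORTED))      # RAPIDS mode
print(native_archs([], SUPPORTED))  # NATIVE with no GPU detected
```

Under these assumptions the two lists differ only in the final entry: the NATIVE fallback ends in `90-real`, while RAPIDS ends in `90`.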

Should we change NATIVE to use the RAPIDS behavior when native detection fails?

vyasr added the question (Further information is requested) and ? - Needs Triage (Need team to review and classify) labels on Jun 8, 2024
@robertmaynard
Contributor

NATIVE never produces any form of SASS. When no GPU is detected on the machine the goal was to fall back to generate SASS for all supported GPUs so the code will run.

I don't think we should expect that NATIVE will ever produce the same as RAPIDS when no CUDA driver / GPU exists.
Consider that, going forward, we might want RAPIDS to start generating 90a code. That wouldn't be needed for NATIVE in fallback mode, as 90 is sufficient for SASS execution.

@robertmaynard
Contributor

robertmaynard commented Jun 9, 2024

Thinking about this more, the proposal to change NATIVE to the now-usable native (via cmake) removes the need to change this logic (#320).

Under cmake native (aka -arch=native), when no GPU / CUDA is found the compiler falls back to -arch=sm_MinX.Y.

So I think we can close this and move forward with deprecating NATIVE in 24.08

@vyasr
Contributor Author

vyasr commented Jun 10, 2024

> NATIVE never produces any form of SASS. When no GPU is detected on the machine the goal was to fall back to generate SASS for all supported GPUs so the code will run.

I assume you mean that NATIVE never produces any form of PTX? It's not clear to me that the fallback to "build all supported SASS" is necessarily a better choice than producing the same behavior as RAPIDS. I get your point with the 90a example, but conversely if the goal is to make it "so the code will run" wouldn't you also want to produce nonzero PTX in case you end up on a newer architecture than the list of supported architectures? That's why we include PTX when generating with RAPIDS.
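The forward-compatibility point can be sketched in a few lines of Python. This is a simplified model, assuming the usual CUDA rules: SASS built for one compute capability runs on GPUs with the same major version and an equal or higher minor version, while PTX can be JIT-compiled for any GPU of equal or higher capability. The helper names and the example lists are hypothetical, and architecture-specific targets such as 90a are deliberately not handled.

```python
def parse(entry):
    """Split a CMAKE_CUDA_ARCHITECTURES-style entry into (cc, has_sass, has_ptx)."""
    if entry.endswith("-real"):
        return int(entry[: -len("-real")]), True, False
    if entry.endswith("-virtual"):
        return int(entry[: -len("-virtual")]), False, True
    return int(entry), True, True  # unsuffixed: both SASS and PTX


def can_run(entries, gpu_cc):
    """Return True if a binary built for `entries` can run on a GPU of capability `gpu_cc`."""
    for entry in entries:
        cc, has_sass, has_ptx = parse(entry)
        # SASS needs the same major version and must not be newer than the GPU.
        if has_sass and cc // 10 == gpu_cc // 10 and cc <= gpu_cc:
            return True
        # PTX can be JIT-compiled for any GPU at least as new as the PTX target.
        if has_ptx and cc <= gpu_cc:
            return True
    return False


# Hypothetical lists: SASS only, versus SASS plus PTX for the latest arch.
sass_only = ["70-real", "80-real", "90-real"]
with_ptx = ["70-real", "80-real", "90"]

print(can_run(sass_only, 100))  # GPU newer than every arch in the list
print(can_run(with_ptx, 100))
```

In this model the SASS-only list cannot run on a GPU from a newer major architecture, while the list that carries PTX for its latest entry can.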

> So I think we can close this and move forward with deprecating NATIVE in 24.08

In any case, I was also thinking about the switch to native when writing up this issue, and I agree that it probably makes this issue moot, so I'm fine closing.

vyasr closed this as completed on Jun 10, 2024