Adding AMDGCN code into the fat binary #258

ethanc8 · 2024-07-28T23:23:32Z

We know that CUDA fat binaries can contain SASS (native binary code for the Streaming Multiprocessor) for each targeted SM generation, along with PTX that the driver can JIT-compile for SM generations with no precompiled SASS. Do we know what the NVIDIA CUDA compiler will reject? For example, can we just put a "gfx901" slice into the fat binary and expect libcuda to ignore it? If so, then with appropriate compiler support we could ship native code for both AMD and NVIDIA GPUs in one binary, and modify ZLUDA to load the AMD slices. Even if the CUDA compiler rejects slices that are not PTX and don't start with "sm_", we might be able to put in slices with arbitrarily high SM versions, such as sm_901.
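To make the idea concrete, here is a rough conceptual model of the slice selection I have in mind. Everything in it is hypothetical: `Slice`, `pick_for_nvidia`, `pick_for_amd`, and the "amdgcn" kind are names made up for illustration, and this is not the real fatbin layout or libcuda's actual logic (which is exactly what the question is about). It also simplifies SASS matching to an exact SM version.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slice:
    """One entry in a (hypothetical, simplified) fat binary."""
    kind: str            # "sass", "ptx", or the hypothetical "amdgcn"
    arch: str            # e.g. "sm_80", "compute_70", "gfx901"
    payload: bytes = b""


def _version(arch: str, prefix: str) -> Optional[int]:
    """Return the numeric suffix of `arch` if it has `prefix`, else None."""
    if not arch.startswith(prefix):
        return None
    suffix = arch[len(prefix):]
    return int(suffix) if suffix.isdigit() else None


def pick_for_nvidia(slices: List[Slice], device_sm: int) -> Optional[Slice]:
    """Assumed NVIDIA-side behavior: unknown slices are silently skipped."""
    # Prefer precompiled SASS (simplified here to an exact SM match).
    for s in slices:
        if s.kind == "sass" and _version(s.arch, "sm_") == device_sm:
            return s
    # Otherwise fall back to the newest PTX that is not newer than the device.
    best, best_v = None, -1
    for s in slices:
        v = _version(s.arch, "compute_") if s.kind == "ptx" else None
        if v is not None and best_v < v <= device_sm:
            best, best_v = s, v
    return best


def pick_for_amd(slices: List[Slice], device_gfx: str) -> Optional[Slice]:
    """What a modified ZLUDA loader might do: take the matching AMD slice."""
    for s in slices:
        if s.kind == "amdgcn" and s.arch == device_gfx:
            return s
    return None


# Example fat binary mixing NVIDIA slices with an AMD slice ("gfx901" is
# just the placeholder name from the question above).
fatbin = [
    Slice("sass", "sm_80"),
    Slice("ptx", "compute_70"),
    Slice("amdgcn", "gfx901"),
]
```

In this model an NVIDIA device only ever looks at "sass" and "ptx" entries, so the extra AMD slice is harmless, while the AMD-side loader picks it up by name. Whether the real tools tolerate an unrecognized slice like this is the open question.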
