We know that CUDA fat binaries can contain SASS (native machine code for the Streaming Multiprocessor) for each SM generation, along with PTX code as a fallback for any SM generations that weren't compiled in ahead of time. Do we know what kinds of slices the NVIDIA CUDA compiler will reject? For example, can we put a "gfx901" slice into the fat binary and expect libcuda to simply ignore it? If so, then with appropriate compiler support we could ship native code for both AMD and NVIDIA GPUs in a single binary, and modify ZLUDA to load the AMD slices. Even if the CUDA compiler rejects slices that are not PTX and don't start with "sm_", we might still be able to put in slices with arbitrarily high SM versions, such as sm_901.
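To make the idea concrete, here is a rough sketch of the slice selection I have in mind. It is a toy model and not the real fatbin container format: the `Slice` record, the `"amdgcn"` kind, and the `pick_nvidia` / `pick_amd` helpers are all hypothetical names made up for illustration, and the NVIDIA-side matching rule is deliberately simplified (exact SASS match, else the newest PTX whose target is not newer than the device). Whether libcuda actually skips unknown slices this way is exactly the open question.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slice:
    """Hypothetical model of one fat binary entry (not the real layout)."""
    kind: str       # "sass", "ptx", or a made-up "amdgcn" kind
    arch: str       # e.g. "sm_80" or "gfx901"
    payload: bytes  # opaque code blob


def sm_version(arch: str) -> Optional[int]:
    """Return NN for an "sm_NN" name, or None for any other arch string."""
    if arch.startswith("sm_") and arch[3:].isdigit():
        return int(arch[3:])
    return None


def pick_nvidia(slices: List[Slice], device_sm: int) -> Optional[Slice]:
    """Simplified NVIDIA-side choice; slices it does not recognise are skipped."""
    # First choice: SASS built for exactly this SM version.
    for s in slices:
        if s.kind == "sass" and sm_version(s.arch) == device_sm:
            return s
    # Fallback: newest PTX whose target is not newer than the device.
    candidates = [
        s for s in slices
        if s.kind == "ptx"
        and sm_version(s.arch) is not None
        and sm_version(s.arch) <= device_sm
    ]
    return max(candidates, key=lambda s: sm_version(s.arch), default=None)


def pick_amd(slices: List[Slice], gfx_arch: str) -> Optional[Slice]:
    """What a modified ZLUDA might do: look for a native slice for this GPU."""
    for s in slices:
        if s.kind == "amdgcn" and s.arch == gfx_arch:
            return s
    return None


# Example fat binary with NVIDIA slices, an AMD slice, and an AMD payload
# disguised under an implausibly high SM version.
fatbin = [
    Slice("sass", "sm_70", b"nv-native"),
    Slice("ptx", "sm_70", b"nv-ptx"),
    Slice("amdgcn", "gfx901", b"amd-native"),
    Slice("sass", "sm_901", b"amd-disguised"),
]
```

In this model, `pick_nvidia(fatbin, 86)` falls back to the `sm_70` PTX slice and never touches the `gfx901` or `sm_901` entries, while `pick_amd(fatbin, "gfx901")` returns the AMD payload. The `sm_901` entry stands in for the fallback trick of hiding AMD code behind an SM version no real device reports.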