-
Notifications
You must be signed in to change notification settings - Fork 739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][CUDA] Enable sub-group loads and stores #2763
Conversation
@premanandrao, could you help with that, please? The crash seems to happen in clang's Sema. |
I will take a look. |
In certain complex template instantiations, address space mismatches cause the compiler to assert instead of issuing errors gracefully. This fixes one such case identified in PR #2763. Signed-off-by: Premanand M Rao <premanand.m.rao@intel.com>
71c47c7
to
8fd3448
Compare
#2798 appears to have fixed the tests, at least locally. Opening this up for review while GitHub repeats the tests. |
@Pennycook, could you please resolve conflicts. Please note that sub_group-related tests are in llvm-test-suite now which is available under Jenkins/Precommit. |
@vladimirlaz So I should move the test changes to llvm-test-suite? Should the changes to the tests be merged before or after the changes to the functionality? |
Rather than implementing the SubgroupBlockReadINTEL and SubgroupBlockWriteINTEL intrinsics in libspirv, this commit adds a fallback path directly to the sub-group header. There are several reasons for this: 1) There are currently no INTEL extensions implemented in libspirv. 2) The load/store functions are expected to be rewritten soon to expose additional functionality, which may map to different SPIR-V. Signed-off-by: John Pennycook <john.pennycook@intel.com>
PRs should be tested together. Submission should be done tests first and then product one after another. |
8fd3448
to
88f0935
Compare
OpGenericCastToPtrExplict is for dynamic cast and OpGenericCastToPtr is for static cast. OpGenericCastToPtrExplict is already transformed to to_{global|local|private} OCL builtins, but the handling for OpGenericCastToPtr is missing. Looks we can transform OpGenericCastToPtr to addrspacecast instruction directly in SPIRV to OCL transformation. Signed-off-by: Cui, Dele <dele.cui@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@21e96548e242860
While implementing this I uncovered a bug in the load_store test, which assumed that it would always be safe to execute a sub-group load or store that accessed invalid memory. Admittedly this is something that the specification is unclear about, and something that we should revisit after some further discussion.
In my tests, everything works except for
half
. Trying to compile the test forhalf
causes the compiler itself to crash, and I don't understand why, which is why I've opened this as a draft PR for now. @bader, can you please recommend somebody to take a look at this error and help fix it?