Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

group_ballot implementation is missing for AMD GPU/HIP targets #6718

Closed
stefanatwork opened this issue Sep 7, 2022 · 1 comment
Closed
Labels
bug Something isn't working compiler Compiler related issue hip Issues related to execution on HIP backend.

Comments

@stefanatwork
Copy link
Contributor

stefanatwork commented Sep 7, 2022

Describe the bug
Code that uses sycl::ext::oneapi::group_ballot will fail with a linker error when compiling to amdgcn-amd-amdhsa.

Error message:
lld: error: undefined hidden symbol: __spirv_GroupNonUniformBallot(unsigned int, bool)

To Reproduce

  1. Create a source file test.cpp with this code:
#include <CL/sycl.hpp>
#define N 256
int main()
{
  sycl::queue queue;
  sycl::buffer<int, 1> A(N);
  sycl::buffer<int, 1> B(N);

  sycl::host_accessor A_host_acc(A, sycl::write_only);
  for (size_t i = (size_t)0; i < N; i++) {
    A_host_acc[i] = rand() % 32;
  }

  queue.submit([&](sycl::handler &cgh) {
    sycl::accessor A_acc(A, cgh, sycl::read_only);
    sycl::accessor B_acc(B, cgh, sycl::write_only, sycl::no_init);

    cgh.parallel_for<class ballot>(N, [=]  (sycl::id<1> idx) { 
      B_acc[idx] = sycl::ext::oneapi::group_ballot(sycl::ext::oneapi::experimental::this_sub_group(), A_acc[idx] > 0.5f).count();
      });
  });
  queue.wait();
  return 0 ;
}
  1. Compile with clang++ -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=gfx1032 -o test test.cpp
  2. Expected result: compiler exits without error, producing an executable named test
  3. Actual result: compiler exits with error
lld: error: undefined hidden symbol: __spirv_GroupNonUniformBallot(unsigned int, bool)
>>> referenced by lto.tmp:(typeinfo name for sycl::_V1::detail::__pf_kernel_wrapper<main::'lambda'(sycl::_V1::handler&)::operator()(sycl::_V1::handler&) const::ballot>)
>>> referenced by lto.tmp:(typeinfo name for sycl::_V1::detail::__pf_kernel_wrapper<main::'lambda'(sycl::_V1::handler&)::operator()(sycl::_V1::handler&) const::ballot>)
>>> referenced by lto.tmp:(_ZTSN4sycl3_V16detail19__pf_kernel_wrapperIZZ4mainENKUlRNS0_7handlerEE_clES4_E6ballotEE_with_offset)
>>> referenced 5 more times

Environment:

  • OS: Ubuntu 20.04
  • Target device and vendor: AMD Radeon RX6600 (gfx1032)
  • DPC++ version: 85a6833
  • Dependencies version: HIP 5.2.0
@stefanatwork stefanatwork added the bug Something isn't working label Sep 7, 2022
@AlexeySachkov AlexeySachkov added the hip Issues related to execution on HIP backend. label Sep 7, 2022
@AerialMantis AerialMantis added the compiler Compiler related issue label Sep 14, 2022
romanovvlad pushed a commit that referenced this issue Dec 2, 2022
This patch is adding group ballot support for HIP (based on initial work
from @abagusetty on #6734 ), but also
extending the sub-group mask implementation to support 64 bit masks, as
a lot of AMD GPUs use 64 bit wavefronts.

Related to issue: #6718
@npmiller
Copy link
Contributor

npmiller commented Dec 2, 2022

The support for group_ballot has now been merged so this should work with the latest compiler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working compiler Compiler related issue hip Issues related to execution on HIP backend.
Projects
None yet
Development

No branches or pull requests

5 participants