[SYCL] Fix PI_KERNEL_MAX_SUB_GROUP_SIZE in OpenCL backend (#6849)
Currently, when no input value is given, PI_KERNEL_MAX_SUB_GROUP_SIZE in
the PI OpenCL backend passes the device's max work-item sizes as the
input to the corresponding OpenCL query to avoid truncation. However,
using the max work-item size in every dimension at once may exceed the
limit on the total number of work-items. To stay within that limit, this
commit changes the query to use the max work-item size in the first
dimension only and 1 in each of the other dimensions.

Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
steffenlarsen authored Sep 23, 2022
1 parent 4d0df22 commit 5998d7c
Showing 1 changed file with 16 additions and 6 deletions.
22 changes: 16 additions & 6 deletions sycl/plugins/opencl/pi_opencl.cpp
@@ -865,14 +865,24 @@ pi_result piKernelGetSubGroupInfo(pi_kernel kernel, pi_device device,
std::shared_ptr<void> implicit_input_value;
if (param_name == PI_KERNEL_MAX_SUB_GROUP_SIZE && !input_value) {
// OpenCL needs an input value for PI_KERNEL_MAX_SUB_GROUP_SIZE so if no
// value is given we use the max work item sizes of the device to avoid
// truncation of max sub-group size.
implicit_input_value = std::shared_ptr<size_t[]>(new size_t[3]);
pi_result pi_ret_err = piDeviceGetInfo(
device, PI_DEVICE_INFO_MAX_WORK_ITEM_SIZES, 3 * sizeof(size_t),
implicit_input_value.get(), nullptr);
// value is given we use the max work item size of the device in the first
// dimension to avoid truncation of max sub-group size.
pi_uint32 max_dims = 0;
pi_result pi_ret_err =
piDeviceGetInfo(device, PI_DEVICE_INFO_MAX_WORK_ITEM_DIMENSIONS,
sizeof(pi_uint32), &max_dims, nullptr);
if (pi_ret_err != PI_SUCCESS)
return pi_ret_err;
std::shared_ptr<size_t[]> WGSizes{new size_t[max_dims]};
pi_ret_err =
piDeviceGetInfo(device, PI_DEVICE_INFO_MAX_WORK_ITEM_SIZES,
max_dims * sizeof(size_t), WGSizes.get(), nullptr);
if (pi_ret_err != PI_SUCCESS)
return pi_ret_err;
for (size_t i = 1; i < max_dims; ++i)
WGSizes.get()[i] = 1;
implicit_input_value = std::move(WGSizes);
input_value_size = max_dims * sizeof(size_t);
input_value = implicit_input_value.get();
}
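
To illustrate why the old input could be invalid, here is a minimal standalone sketch of the shape computation. It does not call PI or OpenCL. The helper names `makeSubGroupQueryInput` and `totalWorkItems` are hypothetical and do not exist in `pi_opencl.cpp`, and the per-dimension sizes in the usage note are made-up values, not taken from any real device.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of pi_opencl.cpp): mirrors the loop in the
// patch by keeping the max work-item size in dimension 0 and setting every
// other dimension to 1.
std::vector<std::size_t>
makeSubGroupQueryInput(const std::vector<std::size_t> &maxWorkItemSizes) {
  assert(!maxWorkItemSizes.empty() && "expected at least one dimension");
  std::vector<std::size_t> shape(maxWorkItemSizes.size(), 1);
  shape[0] = maxWorkItemSizes[0];
  return shape;
}

// Hypothetical helper: total number of work-items in a work-group of the
// given shape, i.e. the product of the extents in all dimensions.
std::size_t totalWorkItems(const std::vector<std::size_t> &shape) {
  std::size_t total = 1;
  for (std::size_t extent : shape)
    total *= extent;
  return total;
}
```

With illustrative per-dimension maxima of {1024, 1024, 64}, the old input describes 1024 * 1024 * 64 work-items in total, far more than a total work-item limit of, say, 1024 would allow. The reshaped input {1024, 1, 1} describes exactly 1024 work-items while still using the largest extent in the first dimension.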
