Skip to content

Commit

Permalink
[SYCL] Support memory clock rate and memory bus width queries (#7412)
Browse files Browse the repository at this point in the history
  • Loading branch information
againull authored Nov 17, 2022
1 parent e7ed860 commit 4f7787c
Show file tree
Hide file tree
Showing 15 changed files with 291 additions and 8 deletions.
75 changes: 75 additions & 0 deletions sycl/doc/extensions/supported/sycl_ext_intel_device_info.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ The Feature Test Macro SYCL\_EXT\_INTEL\_DEVICE\_INFO will be defined as one of
| 3 | HW threads per EU device query is supported |
| 4 | Free device memory query is supported |
| 5 | Device ID is supported |
| 6 | Memory clock rate and bus width queries are supported |



Expand Down Expand Up @@ -425,6 +426,80 @@ Then the free device memory can be obtained using the standard get\_info() inte
auto FreeMemory = dev.get_info<ext::intel::info::device::free_memory>();
}


# Memory Clock Rate #

A new device descriptor is added which provides the maximum clock rate of device's global memory.

This new device descriptor is not available for devices in the OpenCL platform, and the matching aspect is false for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform.

## Version ##

The extension supports this query in version 6 and later.


## Device Information Descriptors ##

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| ext\:\:intel\:\:info\:\:device\:\:memory\_clock\_rate | uint32\_t| Returns the maximum clock rate of device's global memory in MHz. If device doesn't have memory then returns 0. If there are several memories on the device then the minimum of the clock rate values is returned. |


## Aspects ##

A new aspect, ext\_intel\_memory\_clock\_rate, is added.


## Error Condition ##

An invalid object runtime error is thrown if the device does not support aspect\:\:ext\_intel\_memory\_clock\_rate.


## Example Usage ##

Then the memory clock rate can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_memory_clock_rate)) {
auto MemoryClockRate = dev.get_info<ext::intel::info::device::memory_clock_rate>();
}


# Memory Bus Width #

A new device descriptor is added which provides the maximum bus width between device and memory.

This new device descriptor is not available for devices in the OpenCL platform, and the matching aspect is false for those devices. The DPC++ default behavior is to expose GPU devices through the Level Zero platform.

## Version ##

The extension supports this query in version 6 and later.


## Device Information Descriptors ##

| Device Descriptors | Return Type | Description |
| ------------------ | ----------- | ----------- |
| ext\:\:intel\:\:info\:\:device\:\:memory\_bus\_width | uint32\_t| Returns the maximum bus width between device and memory in bits. If device doesn't have memory then returns 0. If there are several memories on the device then the minimum of the bus width values is returned. |


## Aspects ##

A new aspect, ext\_intel\_memory\_bus\_width, is added.


## Error Condition ##

An invalid object runtime error is thrown if the device does not support aspect\:\:ext\_intel\_memory\_bus\_width.


## Example Usage ##

Then the memory bus width can be obtained using the standard get\_info() interface.

if (dev.has(aspect::ext_intel_memory_bus_width)) {
auto MemoryBusWidth = dev.get_info<ext::intel::info::device::memory_bus_width>();
}

# Deprecated queries #

The table below lists deprecated, that would soon be removed and their replacements:
Expand Down
11 changes: 10 additions & 1 deletion sycl/include/sycl/detail/pi.h
Original file line number Diff line number Diff line change
Expand Up @@ -53,9 +53,12 @@
// 10.14 Add PI_EXT_INTEL_DEVICE_INFO_FREE_MEMORY as an extension for
// piDeviceGetInfo.
// 11.15 piEventCreate creates even in the signalled state now.
// 11.16 Add PI_EXT_INTEL_DEVICE_INFO_MEMORY_CLOCK_RATE and
// PI_EXT_INTEL_DEVICE_INFO_MEMORY_BUS_WIDTH as an extension for
// piDeviceGetInfo.

#define _PI_H_VERSION_MAJOR 11
#define _PI_H_VERSION_MINOR 15
#define _PI_H_VERSION_MINOR 16

#define _PI_STRING_HELPER(a) #a
#define _PI_CONCAT(a, b) _PI_STRING_HELPER(a.b)
Expand Down Expand Up @@ -277,6 +280,12 @@ typedef enum {
// Return true if sub-device should do its own program build
PI_DEVICE_INFO_BUILD_ON_SUBDEVICE = 0x10028,
PI_EXT_INTEL_DEVICE_INFO_FREE_MEMORY = 0x10029,
// Return 0 if device doesn't have any memory modules. Return the minimum of
// the clock rate values if there are several memory modules on the device.
PI_EXT_INTEL_DEVICE_INFO_MEMORY_CLOCK_RATE = 0x10030,
// Return 0 if device doesn't have any memory modules. Return the minimum of
// the bus width values if there are several memory modules on the device.
PI_EXT_INTEL_DEVICE_INFO_MEMORY_BUS_WIDTH = 0x10031,
PI_DEVICE_INFO_ATOMIC_64 = 0x10110,
PI_DEVICE_INFO_ATOMIC_MEMORY_ORDER_CAPABILITIES = 0x10111,
PI_DEVICE_INFO_ATOMIC_MEMORY_SCOPE_CAPABILITIES = 0x11000,
Expand Down
2 changes: 1 addition & 1 deletion sycl/include/sycl/feature_test.hpp.in
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ __SYCL_INLINE_VER_NAMESPACE(_V1) {
// Feature test macro definitions

// TODO: Move these feature-test macros to compiler driver.
#define SYCL_EXT_INTEL_DEVICE_INFO 5
#define SYCL_EXT_INTEL_DEVICE_INFO 6
#define SYCL_EXT_INTEL_DEVICE_ARCHITECTURE 1
#define SYCL_EXT_ONEAPI_SUB_GROUP_MASK 1
#define SYCL_EXT_ONEAPI_LOCAL_MEMORY 1
Expand Down
2 changes: 2 additions & 0 deletions sycl/include/sycl/info/aspects.def
Original file line number Diff line number Diff line change
Expand Up @@ -34,3 +34,5 @@ __SYCL_ASPECT(ext_oneapi_cuda_async_barrier, 34)
__SYCL_ASPECT(ext_oneapi_bfloat16, 35)
__SYCL_ASPECT(ext_intel_free_memory, 36)
__SYCL_ASPECT(ext_intel_device_id, 37)
__SYCL_ASPECT(ext_intel_memory_clock_rate, 38)
__SYCL_ASPECT(ext_intel_memory_bus_width, 39)
2 changes: 2 additions & 0 deletions sycl/include/sycl/info/ext_intel_device_traits.def
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,8 @@ __SYCL_PARAM_TRAITS_SPEC(ext::intel, device, gpu_hw_threads_per_eu, pi_uint32, P
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, max_mem_bandwidth, pi_uint64, PI_DEVICE_INFO_MAX_MEM_BANDWIDTH)
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, uuid, detail::uuid_type, PI_DEVICE_INFO_UUID)
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, free_memory, pi_uint64, PI_EXT_INTEL_DEVICE_INFO_FREE_MEMORY)
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, memory_clock_rate, pi_uint32, PI_EXT_INTEL_DEVICE_INFO_MEMORY_CLOCK_RATE)
__SYCL_PARAM_TRAITS_SPEC(ext::intel, device, memory_bus_width, pi_uint32, PI_EXT_INTEL_DEVICE_INFO_MEMORY_BUS_WIDTH)
#ifdef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC
#undef __SYCL_PARAM_TRAITS_TEMPLATE_SPEC_NEEDS_UNDEF
Expand Down
19 changes: 19 additions & 0 deletions sycl/plugins/cuda/pi_cuda.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1926,6 +1926,25 @@ pi_result cuda_piDeviceGetInfo(pi_device device, pi_device_info param_name,
return getInfo(param_value_size, param_value, param_value_size_ret,
FreeMemory);
}
case PI_EXT_INTEL_DEVICE_INFO_MEMORY_CLOCK_RATE: {
int value = 0;
sycl::detail::pi::assertion(
cuDeviceGetAttribute(&value, CU_DEVICE_ATTRIBUTE_MEMORY_CLOCK_RATE,
device->get()) == CUDA_SUCCESS);
sycl::detail::pi::assertion(value >= 0);
// Convert kilohertz to megahertz when returning.
return getInfo(param_value_size, param_value, param_value_size_ret,
value / 1000);
}
case PI_EXT_INTEL_DEVICE_INFO_MEMORY_BUS_WIDTH: {
int value = 0;
sycl::detail::pi::assertion(
cuDeviceGetAttribute(&value,
CU_DEVICE_ATTRIBUTE_GLOBAL_MEMORY_BUS_WIDTH,
device->get()) == CUDA_SUCCESS);
sycl::detail::pi::assertion(value >= 0);
return getInfo(param_value_size, param_value, param_value_size_ret, value);
}

// TODO: Investigate if this information is available on CUDA.
case PI_DEVICE_INFO_DEVICE_ID:
Expand Down
20 changes: 20 additions & 0 deletions sycl/plugins/hip/pi_hip.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1815,6 +1815,26 @@ pi_result hip_piDeviceGetInfo(pi_device device, pi_device_info param_name,
FreeMemory);
}

case PI_EXT_INTEL_DEVICE_INFO_MEMORY_CLOCK_RATE: {
int value = 0;
sycl::detail::pi::assertion(
hipDeviceGetAttribute(&value, hipDeviceAttributeMemoryClockRate,
device->get()) == hipSuccess);
sycl::detail::pi::assertion(value >= 0);
// Convert kilohertz to megahertz when returning.
return getInfo(param_value_size, param_value, param_value_size_ret,
value / 1000);
}

case PI_EXT_INTEL_DEVICE_INFO_MEMORY_BUS_WIDTH: {
int value = 0;
sycl::detail::pi::assertion(
hipDeviceGetAttribute(&value, hipDeviceAttributeMemoryBusWidth,
device->get()) == hipSuccess);
sycl::detail::pi::assertion(value >= 0);
return getInfo(param_value_size, param_value, param_value_size_ret, value);
}

// TODO: Implement.
case PI_DEVICE_INFO_ATOMIC_MEMORY_ORDER_CAPABILITIES:
// TODO: Investigate if this information is available on HIP.
Expand Down
33 changes: 32 additions & 1 deletion sycl/plugins/level_zero/pi_level_zero.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3180,7 +3180,38 @@ pi_result piDeviceGetInfo(pi_device Device, pi_device_info ParamName,
}
return ReturnValue(FreeMemory);
}

case PI_EXT_INTEL_DEVICE_INFO_MEMORY_CLOCK_RATE: {
// If there are not any memory modules then return 0.
if (Device->ZeDeviceMemoryProperties->empty())
return ReturnValue(pi_uint32{0});

// If there are multiple memory modules on the device then we have to report
// the value of the slowest memory.
auto Comp = [](const ze_device_memory_properties_t &A,
const ze_device_memory_properties_t &B) -> bool {
return A.maxClockRate < B.maxClockRate;
};
auto MinIt =
std::min_element(Device->ZeDeviceMemoryProperties->begin(),
Device->ZeDeviceMemoryProperties->end(), Comp);
return ReturnValue(pi_uint32{MinIt->maxClockRate});
}
case PI_EXT_INTEL_DEVICE_INFO_MEMORY_BUS_WIDTH: {
// If there are not any memory modules then return 0.
if (Device->ZeDeviceMemoryProperties->empty())
return ReturnValue(pi_uint32{0});

// If there are multiple memory modules on the device then we have to report
// the value of the slowest memory.
auto Comp = [](const ze_device_memory_properties_t &A,
const ze_device_memory_properties_t &B) -> bool {
return A.maxBusWidth < B.maxBusWidth;
};
auto MinIt =
std::min_element(Device->ZeDeviceMemoryProperties->begin(),
Device->ZeDeviceMemoryProperties->end(), Comp);
return ReturnValue(pi_uint32{MinIt->maxBusWidth});
}
case PI_DEVICE_INFO_GPU_EU_COUNT: {
pi_uint32 count = Device->ZeDeviceProperties->numEUsPerSubslice *
Device->ZeDeviceProperties->numSubslicesPerSlice *
Expand Down
8 changes: 8 additions & 0 deletions sycl/source/detail/device_impl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,14 @@ bool device_impl::has(aspect Aspect) const {
return getPlugin().call_nocheck<detail::PiApiKind::piDeviceGetInfo>(
MDevice, PI_EXT_INTEL_DEVICE_INFO_FREE_MEMORY, 0, nullptr,
&return_size) == PI_SUCCESS;
case aspect::ext_intel_memory_clock_rate:
return getPlugin().call_nocheck<detail::PiApiKind::piDeviceGetInfo>(
MDevice, PI_EXT_INTEL_DEVICE_INFO_MEMORY_CLOCK_RATE, 0, nullptr,
&return_size) == PI_SUCCESS;
case aspect::ext_intel_memory_bus_width:
return getPlugin().call_nocheck<detail::PiApiKind::piDeviceGetInfo>(
MDevice, PI_EXT_INTEL_DEVICE_INFO_MEMORY_BUS_WIDTH, 0, nullptr,
&return_size) == PI_SUCCESS;
case aspect::ext_intel_device_info_uuid: {
auto Result = getPlugin().call_nocheck<detail::PiApiKind::piDeviceGetInfo>(
MDevice, PI_DEVICE_INFO_UUID, 0, nullptr, &return_size);
Expand Down
16 changes: 16 additions & 0 deletions sycl/source/detail/device_info.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1573,6 +1573,22 @@ inline uint64_t get_device_info_host<ext::intel::info::device::free_memory>() {
PI_ERROR_INVALID_DEVICE);
}

template <>
inline uint32_t
get_device_info_host<ext::intel::info::device::memory_clock_rate>() {
throw runtime_error(
"Obtaining the device memory clock rate is not supported on HOST device",
PI_ERROR_INVALID_DEVICE);
}

template <>
inline uint32_t
get_device_info_host<ext::intel::info::device::memory_bus_width>() {
throw runtime_error(
"Obtaining the device memory bus width is not supported on HOST device",
PI_ERROR_INVALID_DEVICE);
}

} // namespace detail
} // __SYCL_INLINE_VER_NAMESPACE(_V1)
} // namespace sycl
2 changes: 2 additions & 0 deletions sycl/test/abi/sycl_symbols_linux.dump
Original file line number Diff line number Diff line change
Expand Up @@ -4160,8 +4160,10 @@ _ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device10gpu_slicesEEENS0_6deta
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device11free_memoryEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device11pci_addressEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device12gpu_eu_countEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device16memory_bus_widthEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device17gpu_eu_simd_widthEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device17max_mem_bandwidthEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device17memory_clock_rateEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device21gpu_hw_threads_per_euEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device23gpu_subslices_per_sliceEEENS0_6detail19is_device_info_descIT_E11return_typeEv
_ZNK4sycl3_V16device8get_infoINS0_3ext5intel4info6device25gpu_eu_count_per_subsliceEEENS0_6detail19is_device_info_descIT_E11return_typeEv
Expand Down
2 changes: 2 additions & 0 deletions sycl/test/abi/sycl_symbols_windows.dump
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,8 @@
??$get_info@Umax_work_item_dimensions@device@info@_V1@sycl@@@device@_V1@sycl@@QEBAIXZ
??$get_info@Umax_write_image_args@device@info@_V1@sycl@@@device@_V1@sycl@@QEBAIXZ
??$get_info@Umem_base_addr_align@device@info@_V1@sycl@@@device@_V1@sycl@@QEBAIXZ
??$get_info@Umemory_bus_width@device@info@intel@ext@_V1@sycl@@@device@_V1@sycl@@QEBAIXZ
??$get_info@Umemory_clock_rate@device@info@intel@ext@_V1@sycl@@@device@_V1@sycl@@QEBAIXZ
??$get_info@Uname@device@info@_V1@sycl@@@device@_V1@sycl@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
??$get_info@Uname@platform@info@_V1@sycl@@@platform@_V1@sycl@@QEBA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@XZ
??$get_info@Unative_vector_width_char@device@info@_V1@sycl@@@device@_V1@sycl@@QEBAIXZ
Expand Down
13 changes: 10 additions & 3 deletions sycl/test/extensions/properties/properties_kernel.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,9 @@ using device_has_all =
aspect::ext_oneapi_native_assert, aspect::host_debuggable,
aspect::ext_intel_gpu_hw_threads_per_eu,
aspect::ext_oneapi_cuda_async_barrier, aspect::ext_oneapi_bfloat16,
aspect::ext_intel_free_memory, aspect::ext_intel_device_id>);
aspect::ext_intel_free_memory, aspect::ext_intel_device_id,
aspect::ext_intel_memory_clock_rate,
aspect::ext_intel_memory_bus_width>);

template <aspect Aspect> inline void singleAspectDeviceHasChecks() {
static_assert(is_property_value<decltype(device_has<Aspect>)>::value);
Expand Down Expand Up @@ -120,14 +122,16 @@ int main() {
singleAspectDeviceHasChecks<aspect::ext_oneapi_bfloat16>();
singleAspectDeviceHasChecks<aspect::ext_intel_free_memory>();
singleAspectDeviceHasChecks<aspect::ext_intel_device_id>();
singleAspectDeviceHasChecks<aspect::ext_intel_memory_clock_rate>();
singleAspectDeviceHasChecks<aspect::ext_intel_memory_bus_width>();

static_assert(is_property_value<decltype(device_has<>)>::value);
static_assert(std::is_same_v<device_has_key, decltype(device_has<>)::key_t>);
static_assert(decltype(device_has<>)::value.size() == 0);

static_assert(is_property_value<device_has_all>::value);
static_assert(std::is_same_v<device_has_key, device_has_all::key_t>);
static_assert(device_has_all::value.size() == 36);
static_assert(device_has_all::value.size() == 38);
static_assert(device_has_all::value[0] == aspect::host);
static_assert(device_has_all::value[1] == aspect::cpu);
static_assert(device_has_all::value[2] == aspect::gpu);
Expand Down Expand Up @@ -174,6 +178,9 @@ int main() {
static_assert(device_has_all::value[33] == aspect::ext_oneapi_bfloat16);
static_assert(device_has_all::value[34] == aspect::ext_intel_free_memory);
static_assert(device_has_all::value[35] == aspect::ext_intel_device_id);

static_assert(device_has_all::value[36] ==
aspect::ext_intel_memory_clock_rate);
static_assert(device_has_all::value[37] ==
aspect::ext_intel_memory_bus_width);
return 0;
}
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,8 @@ static constexpr auto device_has_all = device_has<
aspect::ext_intel_gpu_slices, aspect::ext_oneapi_native_assert,
aspect::host_debuggable, aspect::ext_intel_gpu_hw_threads_per_eu,
aspect::usm_host_allocations, aspect::usm_shared_allocations,
aspect::ext_intel_free_memory, aspect::ext_intel_device_id>;
aspect::ext_intel_free_memory, aspect::ext_intel_device_id,
aspect::ext_intel_memory_clock_rate, aspect::ext_intel_memory_bus_width>;

// CHECK-IR: spir_func void @{{.*}}Func0{{.*}}(){{.*}} #[[DHAttr1:[0-9]+]]
SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(device_has_all) void Func0() {}
Expand Down Expand Up @@ -90,7 +91,9 @@ int main() {
// CHECK-IR-DAG: !{{[0-9]+}} = !{!"usm_shared_allocations", i32 [[usm_shared_allocations_ASPECT_MD:[0-9]+]]}
// CHECK-IR-DAG: !{{[0-9]+}} = !{!"ext_intel_free_memory", i32 [[ext_intel_free_memory_ASPECT_MD:[0-9]+]]}
// CHECK-IR-DAG: !{{[0-9]+}} = !{!"ext_intel_device_id", i32 [[ext_intel_device_id_ASPECT_MD:[0-9]+]]}
// CHECK-IR-DAG: !{{[0-9]+}} = !{!"ext_intel_memory_clock_rate", i32 [[ext_intel_memory_clock_rate_ASPECT_MD:[0-9]+]]}
// CHECK-IR-DAG: !{{[0-9]+}} = !{!"ext_intel_memory_bus_width", i32 [[ext_intel_memory_bus_width_ASPECT_MD:[0-9]+]]}

// CHECK-IR-DAG: attributes #[[DHAttr1]] = { {{.*}}"sycl-device-has"="[[ext_oneapi_cuda_async_barrier_ASPECT_MD]],[[ext_oneapi_bfloat16_ASPECT_MD]],[[custom_ASPECT_MD]],[[fp16_ASPECT_MD]],[[fp64_ASPECT_MD]],[[image_ASPECT_MD]],[[online_compiler_ASPECT_MD]],[[online_linker_ASPECT_MD]],[[queue_profiling_ASPECT_MD]],[[usm_device_allocations_ASPECT_MD]],[[usm_restricted_shared_allocations_ASPECT_MD]],[[usm_system_allocations_ASPECT_MD]],[[ext_intel_pci_address_ASPECT_MD]],[[host_ASPECT_MD]],[[cpu_ASPECT_MD]],[[gpu_ASPECT_MD]],[[accelerator_ASPECT_MD]],[[ext_intel_gpu_eu_count_ASPECT_MD]],[[ext_intel_gpu_subslices_per_slice_ASPECT_MD]],[[ext_intel_gpu_eu_count_per_subslice_ASPECT_MD]],[[ext_intel_max_mem_bandwidth_ASPECT_MD]],[[ext_intel_mem_channel_ASPECT_MD]],[[usm_atomic_host_allocations_ASPECT_MD]],[[usm_atomic_shared_allocations_ASPECT_MD]],[[atomic64_ASPECT_MD]],[[ext_intel_device_info_uuid_ASPECT_MD]],[[ext_oneapi_srgb_ASPECT_MD]],[[ext_intel_gpu_eu_simd_width_ASPECT_MD]],[[ext_intel_gpu_slices_ASPECT_MD]],[[ext_oneapi_native_assert_ASPECT_MD]],[[host_debuggable_ASPECT_MD]],[[ext_intel_gpu_hw_threads_per_eu_ASPECT_MD]],[[usm_host_allocations_ASPECT_MD]],[[usm_shared_allocations_ASPECT_MD]],[[ext_intel_free_memory_ASPECT_MD]],[[ext_intel_device_id_ASPECT_MD]]"
// CHECK-IR-DAG: attributes #[[DHAttr1]] = { {{.*}}"sycl-device-has"="[[ext_oneapi_cuda_async_barrier_ASPECT_MD]],[[ext_oneapi_bfloat16_ASPECT_MD]],[[custom_ASPECT_MD]],[[fp16_ASPECT_MD]],[[fp64_ASPECT_MD]],[[image_ASPECT_MD]],[[online_compiler_ASPECT_MD]],[[online_linker_ASPECT_MD]],[[queue_profiling_ASPECT_MD]],[[usm_device_allocations_ASPECT_MD]],[[usm_restricted_shared_allocations_ASPECT_MD]],[[usm_system_allocations_ASPECT_MD]],[[ext_intel_pci_address_ASPECT_MD]],[[host_ASPECT_MD]],[[cpu_ASPECT_MD]],[[gpu_ASPECT_MD]],[[accelerator_ASPECT_MD]],[[ext_intel_gpu_eu_count_ASPECT_MD]],[[ext_intel_gpu_subslices_per_slice_ASPECT_MD]],[[ext_intel_gpu_eu_count_per_subslice_ASPECT_MD]],[[ext_intel_max_mem_bandwidth_ASPECT_MD]],[[ext_intel_mem_channel_ASPECT_MD]],[[usm_atomic_host_allocations_ASPECT_MD]],[[usm_atomic_shared_allocations_ASPECT_MD]],[[atomic64_ASPECT_MD]],[[ext_intel_device_info_uuid_ASPECT_MD]],[[ext_oneapi_srgb_ASPECT_MD]],[[ext_intel_gpu_eu_simd_width_ASPECT_MD]],[[ext_intel_gpu_slices_ASPECT_MD]],[[ext_oneapi_native_assert_ASPECT_MD]],[[host_debuggable_ASPECT_MD]],[[ext_intel_gpu_hw_threads_per_eu_ASPECT_MD]],[[usm_host_allocations_ASPECT_MD]],[[usm_shared_allocations_ASPECT_MD]],[[ext_intel_free_memory_ASPECT_MD]],[[ext_intel_device_id_ASPECT_MD]],[[ext_intel_memory_clock_rate_ASPECT_MD]],[[ext_intel_memory_bus_width_ASPECT_MD]]"
// CHECK-IR-DAG: attributes #[[DHAttr2]] = { {{.*}}"sycl-device-has" {{.*}}
// CHECK-IR-DAG: attributes #[[DHAttr3]] = { {{.*}}"sycl-device-has"="[[fp16_ASPECT_MD]],[[atomic64_ASPECT_MD]]"
Loading

0 comments on commit 4f7787c

Please sign in to comment.