Adjust block size for reduce-then-scan based on input type #1782

adamfidel · 2024-08-14T20:11:09Z

This PR targets inclusion in #1762. It scales the reduce-then-scan block size up or down based on the size of the input type.

danhoeflinger · 2024-08-22T21:13:29Z

include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl_reduce_then_scan.h

+    constexpr std::uint8_t __block_size_factor = sizeof(_ValueType) > sizeof(std::uint32_t) ? sizeof(_ValueType) / sizeof(std::uint32_t) : sizeof(std::uint32_t) / sizeof(_ValueType);
+    constexpr std::uint16_t __max_inputs_per_item = sizeof(_ValueType) > sizeof(std::uint32_t) ? 128 / __block_size_factor : 128 * __block_size_factor;
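
For reference, the two expressions above can be evaluated at compile time for a few element sizes. This is a minimal sketch, not oneDPL code: `max_inputs_per_item_original` is a hypothetical helper name, and the double-underscore prefixes are dropped because such identifiers are reserved outside the library implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper that mirrors the two expressions in the diff.
// It is not part of oneDPL; it only makes the arithmetic checkable.
template <typename ValueType>
constexpr std::uint16_t max_inputs_per_item_original()
{
    // Ratio between the element size and a 4-byte baseline, always >= 1
    // for the power-of-two sizes checked below.
    constexpr std::uint8_t block_size_factor =
        sizeof(ValueType) > sizeof(std::uint32_t) ? sizeof(ValueType) / sizeof(std::uint32_t)
                                                  : sizeof(std::uint32_t) / sizeof(ValueType);
    // Wider than 4 bytes: shrink the block. Otherwise: grow it.
    return sizeof(ValueType) > sizeof(std::uint32_t) ? 128 / block_size_factor
                                                     : 128 * block_size_factor;
}

static_assert(max_inputs_per_item_original<std::uint8_t>() == 512, "1-byte types");
static_assert(max_inputs_per_item_original<std::uint16_t>() == 256, "2-byte types");
static_assert(max_inputs_per_item_original<std::uint32_t>() == 128, "4-byte baseline");
static_assert(max_inputs_per_item_original<std::uint64_t>() == 64, "8-byte types");
```

With 4-byte types as the baseline of 128, 1-byte types get 512 inputs per item and 8-byte types get 64.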


Can we adjust this so our baseline is for sizeof(double) types: 64 elements, then scale it by however much smaller the ValueType is?

std::uint8_t __scale = std::min(1, sizeof(double) / sizeof(_ValueType));
std::uint16_t __max_inputs_per_item = 64 * __scale;

We could also limit it to a 4x scale if we don't think 1-byte types will benefit from 512.

Thanks, this is a cleaner way to write this expression. Although I believe we want to use std::max here instead of std::min.
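
The suggestion and the std::max correction can be sketched as follows. This is an illustration under stated assumptions rather than the code that was merged: the helper names are hypothetical, `sizeof(double)` is assumed to be 8, and the literal `1` is written as `std::size_t{1}` because `std::max` needs both arguments to have the same type. With `std::min`, the scale could never exceed 1 and would be 0 for types wider than `double`; `std::max` keeps the 64-element baseline as a floor and scales up for narrower types. The second helper shows the optional 4x cap mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

static_assert(sizeof(double) == 8, "sketch assumes an 8-byte double");

// Hypothetical helper: 64 inputs per item for double-sized types,
// scaled up by how much smaller ValueType is than double.
template <typename ValueType>
constexpr std::uint16_t max_inputs_per_item_scaled()
{
    // std::max keeps the scale at 1 or more, even for types wider than double.
    constexpr std::size_t scale = std::max(std::size_t{1}, sizeof(double) / sizeof(ValueType));
    return static_cast<std::uint16_t>(64 * scale);
}

// Hypothetical variant that limits the scale to 4x.
template <typename ValueType>
constexpr std::uint16_t max_inputs_per_item_capped()
{
    constexpr std::size_t scale = std::max(std::size_t{1}, sizeof(double) / sizeof(ValueType));
    return static_cast<std::uint16_t>(64 * std::min(std::size_t{4}, scale));
}

// A 16-byte type, wider than double, used to check the floor.
struct wide_value
{
    std::uint64_t lo;
    std::uint64_t hi;
};

static_assert(max_inputs_per_item_scaled<std::uint8_t>() == 512, "8x scale");
static_assert(max_inputs_per_item_scaled<std::uint32_t>() == 128, "2x scale");
static_assert(max_inputs_per_item_scaled<std::uint64_t>() == 64, "baseline");
static_assert(max_inputs_per_item_scaled<wide_value>() == 64, "floor of 1x");
static_assert(max_inputs_per_item_capped<std::uint8_t>() == 256, "capped at 4x");
```

Note that for types wider than 8 bytes this formulation stays at 64, whereas the expression in the diff continues to shrink the block.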

mmichel11 · 2024-08-22T21:26:01Z

Can we also adjust the __max_inputs_per_item template parameters to be std::uint16_t?
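
A likely motivation for this request, not stated explicitly in the thread, is that values such as 256 and 512 do not fit in `std::uint8_t`, whose maximum is 255. The sketch below uses a hypothetical `scan_config_sketch` template to show a `std::uint16_t` non-type template parameter carrying 512; it is not the actual oneDPL kernel signature.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a kernel template that takes the block size
// as a non-type template parameter. Not the real oneDPL declaration.
template <std::uint16_t MaxInputsPerItem>
struct scan_config_sketch
{
    static constexpr std::uint16_t max_inputs_per_item = MaxInputsPerItem;
};

// std::uint8_t tops out at 255, so 256 or 512 cannot be passed through a
// std::uint8_t template parameter (that would be a narrowing conversion).
static_assert(std::numeric_limits<std::uint8_t>::max() == 255, "uint8_t range");
static_assert(std::numeric_limits<std::uint16_t>::max() >= 512, "uint16_t holds 512");
static_assert(scan_config_sketch<512>::max_inputs_per_item == 512, "value preserved");
```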

danhoeflinger

LGTM

adamfidel requested a review from danhoeflinger August 14, 2024 20:11

adamfidel force-pushed the dev/adamfidel/transform_reduce_then_scan_block_size branch from 7865493 to deb9be3 Compare August 15, 2024 14:53

Adjust block size for reduce-then-scan based on input type

4dd3068

adamfidel force-pushed the dev/adamfidel/transform_reduce_then_scan_block_size branch from deb9be3 to 4dd3068 Compare August 22, 2024 20:01

adamfidel requested a review from mmichel11 August 22, 2024 20:02

Change uint8_t -> uint16_t

013dada

danhoeflinger reviewed Aug 22, 2024

Address review comments

8c90c44

danhoeflinger approved these changes Aug 22, 2024

adamfidel merged commit 8a36d5a into dev/shared/transform_reduce_then_scan Aug 22, 2024

adamfidel deleted the dev/adamfidel/transform_reduce_then_scan_block_size branch August 22, 2024 22:05
