Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EXSWHTEC-272 - Implement tests for warp shfl_up and shfl_down functions #193

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
cea96af
SWDEV-355313 - Move catch tests and samples
gargrahul Oct 26, 2022
909e7e4
SWDEV-355313 - Add README
gargrahul Nov 7, 2022
094b9af
SWDEV-355313 - Update amd-staging branch
gargrahul Nov 28, 2022
9daa6d0
SWDEV-355313 - Update README
gargrahul Dec 2, 2022
c49043e
SWDEV-355313 - Update latest code
gargrahul Dec 6, 2022
edf514a
Migrate basic Cooperative Groups tests and integrate to catch
nives-vukovic Jan 4, 2023
5610a48
Refactor basic Cooperative Groups tests
nives-vukovic Jan 10, 2023
c455740
Rename tiled partition related files and fix minor bug
nives-vukovic Jan 11, 2023
82fc666
Add LaunchCooperativeKernal and LaunchCooperativeKernelMultiDevice tests
nives-vukovic Jan 11, 2023
ef4fa46
Refactor hipCGThreadBlockTileType to use common function
nives-vukovic Jan 12, 2023
9c0f995
Merge remote-tracking branch 'origin/develop' into hipCoopGroups_wip
nives-vukovic Jan 12, 2023
b28aa60
Fix updated file not added during merge
nives-vukovic Jan 12, 2023
cc32117
Add coalesced_group type tests
nives-vukovic Jan 12, 2023
a177e26
Add coalesced_group shuffle_up and shuffle_down tests
nives-vukovic Jan 12, 2023
cdeadcf
Add coalesced_group shuffle tests - test fails
nives-vukovic Jan 13, 2023
7b84ac1
Merge remote-tracking branch 'upstream/develop' into cg_base_dino
music-dino Feb 1, 2023
d414bce
Implement common code for cooperative group tests
music-dino Feb 1, 2023
609fae5
Fixed compilation errror in cooperative_groups_common.hh
music-dino Feb 1, 2023
8cfb58b
Implement busy wait device function
music-dino Feb 1, 2023
5cf02ca
Add thread and block dimensions generators
music-dino Feb 2, 2023
fc11bf9
Move cpu_grid.h and supporting functions to catch/include
nives-vukovic Mar 1, 2023
18f2450
Use warp_size from properties in grid/block dims generators
nives-vukovic Mar 1, 2023
65a1e57
Fix condition for warp size 32 on AMD
nives-vukovic Mar 1, 2023
c01665f
Fix cpu_grid.h for warp function tests
nives-vukovic Mar 2, 2023
e41e642
Add missing include into cpu_grid.h
nives-vukovic Mar 2, 2023
e0e35e9
Merge remote-tracking branch 'origin/develop' into warp_common
nives-vukovic Mar 2, 2023
d4291ae
Add common functions and definitions for warp functions
nives-vukovic Mar 2, 2023
1c154be
Remove unnecessary memset
nives-vukovic Mar 2, 2023
2aed190
Cleanup leftover cooperative groups files
nives-vukovic Mar 2, 2023
2499e31
EXSWHTEC-272 - Implement tests for warp shfl_up and shfl_down functions
nives-vukovic Mar 3, 2023
05e2dd8
EXSWHTEC-272 - Fix minor issues in validation function
nives-vukovic Mar 3, 2023
01970f7
EXSWHTEC-272 - Fix doxygen comments
nives-vukovic Mar 3, 2023
583e30a
Add memory reset after allocation
nives-vukovic Mar 3, 2023
6838eb2
Merge branch 'warp_common' into warp_shfl_up_down_tests
nives-vukovic Mar 3, 2023
64b983a
EXSWHTEC-272 - Fix doxygen comments
milos-mozetic Mar 23, 2023
792358c
Expand Warp Test to include random and predefined test version
nives-vukovic May 3, 2023
a180c2d
Merge branch 'warp_common' into warp_shfl_up_down_tests
nives-vukovic May 3, 2023
de7a5bc
EXSWHTEC-272 - Modify warp shfl up and down tests according to common…
nives-vukovic May 3, 2023
fb1615d
Add comments for block and grid dimensions generate functions
nives-vukovic May 3, 2023
d19342d
Merge branch 'warp_common' into warp_shfl_up_down_tests
nives-vukovic May 3, 2023
d4313b1
Merge branch 'develop' into warp_shfl_up_down_tests
rakesroy Jul 11, 2023
22eb41a
Reduce common code for warp tests
nives-vukovic Jul 13, 2023
8a07cb5
Merge branch 'warp_common' into warp_shfl_up_down_tests
nives-vukovic Jul 13, 2023
9da7394
EXSWHTEC-272 - Create separate warp shfl common code
nives-vukovic Jul 13, 2023
a0a1e66
Merge remote-tracking branch 'upstream/develop' into warp_shfl_up_dow…
mirza-halilcevic Sep 29, 2023
e8178f7
Merge remote-tracking branch 'origin/develop' into warp_shfl_up_down_…
nives-vukovic Dec 8, 2023
4d90638
Merge branch 'develop' into warp_shfl_up_down_tests
mirza-halilcevic Dec 28, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions catch/unit/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ add_subdirectory(multiThread)
add_subdirectory(compiler)
add_subdirectory(errorHandling)
add_subdirectory(cooperativeGrps)
add_subdirectory(warp)
add_subdirectory(context)
add_subdirectory(warp)
add_subdirectory(dynamicLoading)
Expand Down
2 changes: 2 additions & 0 deletions catch/unit/warp/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# Common Tests - Test independent of all platforms
set(TEST_SRC
warp_shfl_up.cc
warp_shfl_down.cc
warp_shfl_xor.cc
warp_shfl.cc
)
Expand Down
121 changes: 121 additions & 0 deletions catch/unit/warp/warp_shfl_down.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
/*
Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#include "warp_shfl_common.hh"

#include <bitset>

/**
* @addtogroup shfl_down shfl_down
* @{
* @ingroup DeviceLanguageTest
* `T __shfl_down(T var, unsigned int lane_delta, int width = warpSize)` -
* Contains unit test for warp shfl_down function
*/

namespace cg = cooperative_groups;

template <typename T>
__global__ void shfl_down(T* const out, const T* const in, const uint64_t* const active_masks,
const unsigned int* const deltas, const int width) {
if (deactivate_thread(active_masks)) {
return;
}

const auto grid = cg::this_grid();
const auto block = cg::this_thread_block();
T var = in[grid.thread_rank()];
out[grid.thread_rank()] = __shfl_down(var, deltas[block.thread_rank() % width], width);
}

template <typename T> class WarpShflDown : public WarpShflTest<WarpShflDown<T>, T> {
public:
void launch_kernel(T* const arr_dev, T* const input_dev, const uint64_t* const active_masks) {
width_ = generate_width(this->warp_size_);
INFO("Width: " << width_);
const auto alloc_size = width_ * sizeof(unsigned int);
LinearAllocGuard<unsigned int> deltas_dev(LinearAllocs::hipMalloc, alloc_size);
deltas_.resize(width_);
std::generate(deltas_.begin(), deltas_.end(),
[this] { return GenerateRandomInteger(0u, static_cast<unsigned int>(width_)); });
HIP_CHECK(hipMemcpy(deltas_dev.ptr(), deltas_.data(), alloc_size, hipMemcpyHostToDevice));
shfl_down<<<this->grid_.grid_dim_, this->grid_.block_dim_>>>(arr_dev, input_dev, active_masks,
deltas_dev.ptr(), width_);
}

void validate(const T* const arr, const T* const input) {
ArrayAllOf(arr, this->grid_.thread_count_, [this, &input](unsigned int i) -> std::optional<T> {
const int rank_in_block = this->grid_.thread_rank_in_block(i).value();
const auto rank_in_warp = rank_in_block % this->warp_size_;
const auto rank_in_partition = rank_in_block % width_;
const auto mask_idx = this->warps_in_block_ * (i / this->grid_.threads_in_block_count_) +
rank_in_block / this->warp_size_;
const unsigned int delta = deltas_[rank_in_partition] % width_;
const std::bitset<sizeof(uint64_t) * 8> active_mask(this->active_masks_[mask_idx]);

const int target = rank_in_block % width_ + delta;
if (!active_mask.test(rank_in_warp) ||
(target < width_ && !active_mask.test(rank_in_warp + delta)) ||
(target < width_ && rank_in_block + delta >= this->grid_.threads_in_block_count_)) {
return std::nullopt;
}

return (target >= width_ ? input[i] : input[i + delta]);
});
};

private:
std::vector<unsigned int> deltas_;
int width_;
};

/**
* Test Description
* ------------------------
* - Validates the warp shuffle down behavior for all valid width sizes {2, 4, 8, 16, 32,
* 64(if supported)} for generated delta values. The threads are deactivated based on the
* passed active mask. The test is run for all overloads of shfl_down.
* Test source
* ------------------------
* - unit/warp/warp_shfl_down.cc
* Test requirements
* ------------------------
* - HIP_VERSION >= 5.2
* - Device supports warp shuffle
*/
TEMPLATE_TEST_CASE("Unit_Warp_Shfl_Down_Positive_Basic", "", int, unsigned int, long, unsigned long,
long long, unsigned long long, float, double) {
int device;
hipDeviceProp_t device_properties;
HIP_CHECK(hipGetDevice(&device));
HIP_CHECK(hipGetDeviceProperties(&device_properties, device));

if (!device_properties.arch.hasWarpShuffle) {
HipTest::HIP_SKIP_TEST("Device doesn't support Warp Shuffle!");
return;
}

SECTION("Shfl Down with specified active mask and input values") {
WarpShflDown<TestType>().run(false);
}

SECTION("Shfl Down with random active mask and input values") {
WarpShflDown<TestType>().run(true);
}
}
120 changes: 120 additions & 0 deletions catch/unit/warp/warp_shfl_up.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
/*
Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/

#include "warp_shfl_common.hh"

#include <bitset>

/**
* @addtogroup shfl_up shfl_up
* @{
* @ingroup DeviceLanguageTest
* `T __shfl_up(T var, unsigned int lane_delta, int width = warpSize)` -
* Contains unit test for warp shfl_up function
*/

namespace cg = cooperative_groups;

template <typename T>
__global__ void shfl_up(T* const out, const T* const in, const uint64_t* const active_masks,
const unsigned int* const deltas, const int width) {
if (deactivate_thread(active_masks)) {
return;
}

const auto grid = cg::this_grid();
const auto block = cg::this_thread_block();
T var = in[grid.thread_rank()];
out[grid.thread_rank()] = __shfl_up(var, deltas[block.thread_rank() % width], width);
}

template <typename T> class WarpShflUp : public WarpShflTest<WarpShflUp<T>, T> {
public:
void launch_kernel(T* const arr_dev, T* const input_dev, const uint64_t* const active_masks) {
width_ = generate_width(this->warp_size_);
INFO("Width: " << width_);
const auto alloc_size = width_ * sizeof(unsigned int);
LinearAllocGuard<unsigned int> deltas_dev(LinearAllocs::hipMalloc, alloc_size);
deltas_.resize(width_);
std::generate(deltas_.begin(), deltas_.end(),
[this] { return GenerateRandomInteger(0u, static_cast<unsigned int>(width_)); });
HIP_CHECK(hipMemcpy(deltas_dev.ptr(), deltas_.data(), alloc_size, hipMemcpyHostToDevice));
shfl_up<<<this->grid_.grid_dim_, this->grid_.block_dim_>>>(arr_dev, input_dev, active_masks,
deltas_dev.ptr(), width_);
}

void validate(const T* const arr, const T* const input) {
ArrayAllOf(arr, this->grid_.thread_count_, [this, &input](unsigned int i) -> std::optional<T> {
const auto rank_in_block = this->grid_.thread_rank_in_block(i).value();
const auto rank_in_warp = rank_in_block % this->warp_size_;
const auto rank_in_partition = rank_in_block % width_;
const auto mask_idx = this->warps_in_block_ * (i / this->grid_.threads_in_block_count_) +
rank_in_block / this->warp_size_;
const unsigned int delta = deltas_[rank_in_partition] % width_;
const std::bitset<sizeof(uint64_t) * 8> active_mask(this->active_masks_[mask_idx]);

const int target = rank_in_block % width_ - delta;
if (!active_mask.test(rank_in_warp) ||
(target >= 0 && !active_mask.test(rank_in_warp - delta))) {
return std::nullopt;
}

return (target < 0 ? input[i] : input[i - delta]);
});
};

private:
std::vector<unsigned int> deltas_;
int width_;
};

/**
* Test Description
* ------------------------
* - Validates the warp shuffle up behavior for all valid width sizes {2, 4, 8, 16, 32,
* 64(if supported)} for generated delta values. The threads are deactivated based on the
* passed active mask. The test is run for all overloads of shfl_up.
* Test source
* ------------------------
* - unit/warp/warp_shfl_up.cc
* Test requirements
* ------------------------
* - HIP_VERSION >= 5.2
* - Device supports warp shuffle
*/
TEMPLATE_TEST_CASE("Unit_Warp_Shfl_Up_Positive_Basic", "", int, unsigned int, long, unsigned long,
long long, unsigned long long, float, double) {
int device;
hipDeviceProp_t device_properties;
HIP_CHECK(hipGetDevice(&device));
HIP_CHECK(hipGetDeviceProperties(&device_properties, device));

if (!device_properties.arch.hasWarpShuffle) {
HipTest::HIP_SKIP_TEST("Device doesn't support Warp Shuffle!");
return;
}

SECTION("Shfl Up with specified active mask and input values") {
WarpShflUp<TestType>().run(false);
}

SECTION("Shfl Down with random active mask and input values") {
WarpShflUp<TestType>().run(true);
}
}