
Half matrix and components #1708

Open · wants to merge 12 commits into base: half_type

Conversation

@yhmtsai yhmtsai (Member) commented Oct 24, 2024

This PR adds half precision support to the matrices and components (such as array/device_matrix_data).

Also, to avoid touching files that are unrelated to this PR, I additionally introduce several type lists with half.
For example, RealValueTypes -> RealValueTypesWithHalf and next_precision -> next_precision_with_half.
(We will add bfloat16 in the future, so maybe we should not name it Half.)

For the friend declaration and the corresponding conversion function:

friend class <prev<value_type>>    // the class with the previous precision can access this class
function(class <next<value_type>>) // this class can access the class with the next precision, because it is the class with the previous precision of that class

If we only used next in both the friend declaration and the function, we would need:

friend class <next<next<value_type>>>
function(class <next<value_type>>)

Moreover, the second variant does not work when we fall back from next_precision_with_half to next_precision by disabling half, because next<next<value_type>> is then value_type itself. The first variant always works.

TODO:

@yhmtsai yhmtsai requested review from a team October 24, 2024 15:30
@yhmtsai yhmtsai self-assigned this Oct 24, 2024
@ginkgo-bot ginkgo-bot added the labels reg:build, reg:testing, type:solver, type:matrix-format, type:factorization, type:reordering, and mod:all on Oct 24, 2024
@yhmtsai yhmtsai mentioned this pull request Oct 24, 2024
@yhmtsai yhmtsai force-pushed the half_matrix branch 2 times, most recently from 8f3a17d to b7d4a15 Compare October 25, 2024 08:20
@yhmtsai yhmtsai added this to the Ginkgo 1.9.0 milestone Oct 25, 2024
@yhmtsai yhmtsai added the 1:ST:ready-for-review This PR is ready for review label Oct 25, 2024
@MarcelKoch MarcelKoch self-requested a review October 25, 2024 15:54
@MarcelKoch MarcelKoch (Member) left a comment

Mostly looks good. There are still some places where the new half-enabled types are missing.

include/ginkgo/core/base/segmented_array.hpp (resolved)
include/ginkgo/core/base/precision_dispatch.hpp (outdated, resolved)
include/ginkgo/core/matrix/row_gatherer.hpp (outdated, resolved)
core/test/utils.hpp (resolved)
core/test/utils.hpp (resolved)
hip/components/cooperative_groups.hip.hpp (outdated, resolved)
static_assert(sizeof(ValueType) == sizeof(ResultType),
"The type to reinterpret to must be of the same size as the "
"original type.");
return reinterpret_cast<ResultType&>(val);
Member:
maybe just use memcpy here directly.

}
#else
// UB?
uint16_t* address_as_converter = reinterpret_cast<uint16_t*>(&out);
Member:
I guess this can't be done without either UB, or falling back to omp critical.

reference/matrix/ell_kernels.cpp (outdated, resolved)
include/ginkgo/core/base/math.hpp (resolved)
@thoasm thoasm (Member) left a comment

Part 1 / 2 of my review. So far, I only have small comments.

option(GINKGO_ENABLE_HALF "Enable the use of half precision" ON)
# We do not support MSVC. SYCL will come later
if(MSVC OR GINKGO_BUILD_SYCL)
message(STATUS "HALF is not supported in MSVC, and later support in SYCL")
Member:
This needs to be rephrased since I really don't know what you mean by "and later support in SYCL".
Do you mean that SYCL does support half-precision in a later version?

@yhmtsai yhmtsai (Member, Author) replied Nov 5, 2024:

Yes, we will enable the support in #1710.
As half is now trivially copyable again, we might not need the device_type mapping, though.

struct device_numeric_limits<__half> {
// from __half documentation, it accepts unsigned short
// __half does not have constexpr
static GKO_ATTRIBUTES GKO_INLINE auto inf()
Member:
Will this also work for host code?

yhmtsai (Author):
Yes, the constructors are available on both sides.
Side note: the operations are not.

Comment on lines +36 to +38
static constexpr auto inf() { return std::numeric_limits<T>::infinity(); }
static constexpr auto max() { return std::numeric_limits<T>::max(); }
static constexpr auto min() { return std::numeric_limits<T>::min(); }
Member:
Is there a reason you made these into functions?

yhmtsai (Author):
__half does not have a constexpr constructor, so I cannot put them into static constexpr data members here.
I have not tried keeping them as data yet. If we made them data members instead of functions, the definition would live on the host. Could CUDA still use them in device code? I doubt it, because we do not pass them through the kernel call.

Thus, I made them into functions.

}


// Dircetly call float versrion from here?
Member:
typos:

Suggested change
// Dircetly call float versrion from here?
// Directly call float version from here?

Also, why is abs specialized, while sqrt above is a separate function? Because that's how thrust does it?

yhmtsai (Author):
I think sqrt can also be specialized.

Comment on lines 119 to 120
// It is required by NVHPC 23.3, isnan is undefined when NVHPC are only as host
// compiler.
Member:
I don't quite get the meaning, maybe:

Suggested change
// It is required by NVHPC 23.3, isnan is undefined when NVHPC are only as host
// compiler.
// It is required by NVHPC 23.3, `isnan` is undefined when NVHPC is only used as a host
// compiler.

yhmtsai (Author):

If I recall correctly, CUDA compiles the code twice: once for device and once for the rest.
NVCC does not complain, but NVHPC complains that isnan is not defined.
TBH, I forgot whether I put __device__ on it when I encountered this issue.
I will check again.

THRUST_HALF_FRIEND_OPERATOR(+, +=)
THRUST_HALF_FRIEND_OPERATOR(-, -=)
THRUST_HALF_FRIEND_OPERATOR(*, *=)
THRUST_HALF_FRIEND_OPERATOR(/, /=)
Member:
Do you need this macro afterward? If not, maybe just undef the macro.

exec,
[] GKO_KERNEL(auto idx, auto array) {
if constexpr (std::is_same_v<remove_complex<ValueType>, half>) {
// __half can not be from int64_t
Member:
What do you mean by that?
That half can't be converted to int64_t?

yhmtsai (Author):

No, __half cannot be converted from int64_t.
CUDA only provides conversions from short, int, long long, and the corresponding unsigned versions.
Unfortunately, it does not accept int64_t, even though long long and int64_t are technically the same.

@ginkgo-bot:
Error: The following files need to be formatted:

common/cuda_hip/components/atomic.hpp

You can find a formatting patch under Artifacts here, or run format! if you have write access to Ginkgo.

Labels: 1:ST:ready-for-review, mod:all, reg:build, reg:testing, type:factorization, type:matrix-format, type:reordering, type:solver
Projects: None yet
5 participants