[libc][math][c23] Add f16divf C23 math function #96131
cc @lntue
@llvm/pr-subscribers-libc
Author: OverMighty (overmighty)
Changes
Part of #93566.
Patch is 31.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96131.diff
18 Files Affected:
diff --git a/libc/config/linux/aarch64/entrypoints.txt b/libc/config/linux/aarch64/entrypoints.txt
index dfed6acbdf257..e0111239c89fa 100644
--- a/libc/config/linux/aarch64/entrypoints.txt
+++ b/libc/config/linux/aarch64/entrypoints.txt
@@ -504,6 +504,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.canonicalizef16
libc.src.math.ceilf16
libc.src.math.copysignf16
+ libc.src.math.f16divf
libc.src.math.f16fmaf
libc.src.math.f16sqrtf
libc.src.math.fabsf16
diff --git a/libc/config/linux/x86_64/entrypoints.txt b/libc/config/linux/x86_64/entrypoints.txt
index cfe35167ca32e..512a3a3f50adf 100644
--- a/libc/config/linux/x86_64/entrypoints.txt
+++ b/libc/config/linux/x86_64/entrypoints.txt
@@ -536,6 +536,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.canonicalizef16
libc.src.math.ceilf16
libc.src.math.copysignf16
+ libc.src.math.f16divf
libc.src.math.f16fmaf
libc.src.math.f16sqrtf
libc.src.math.fabsf16
diff --git a/libc/docs/math/index.rst b/libc/docs/math/index.rst
index 293edd1c15100..185ae1fd82e8f 100644
--- a/libc/docs/math/index.rst
+++ b/libc/docs/math/index.rst
@@ -124,6 +124,8 @@ Basic Operations
+------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
| dsub | N/A | N/A | | N/A | | 7.12.14.2 | F.10.11 |
+------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
+| f16div | |check| | | | N/A | | 7.12.14.4 | F.10.11 |
++------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
| f16fma | |check| | | | N/A | | 7.12.14.5 | F.10.11 |
+------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
| fabs | |check| | |check| | |check| | |check| | |check| | 7.12.7.3 | F.10.4.3 |
diff --git a/libc/spec/stdc.td b/libc/spec/stdc.td
index f9c79ee106bbb..ac3badc8096b9 100644
--- a/libc/spec/stdc.td
+++ b/libc/spec/stdc.td
@@ -716,6 +716,8 @@ def StdC : StandardSpec<"stdc"> {
GuardedFunctionSpec<"totalordermagf16", RetValSpec<IntType>, [ArgSpec<Float16Ptr>, ArgSpec<Float16Ptr>], "LIBC_TYPES_HAS_FLOAT16">,
+ GuardedFunctionSpec<"f16divf", RetValSpec<Float16Type>, [ArgSpec<FloatType>, ArgSpec<FloatType>], "LIBC_TYPES_HAS_FLOAT16">,
+
GuardedFunctionSpec<"f16sqrtf", RetValSpec<Float16Type>, [ArgSpec<FloatType>], "LIBC_TYPES_HAS_FLOAT16">,
]
>;
diff --git a/libc/src/__support/FPUtil/generic/CMakeLists.txt b/libc/src/__support/FPUtil/generic/CMakeLists.txt
index a8a95ba3f15ff..a7b912e0bab98 100644
--- a/libc/src/__support/FPUtil/generic/CMakeLists.txt
+++ b/libc/src/__support/FPUtil/generic/CMakeLists.txt
@@ -45,3 +45,17 @@ add_header_library(
libc.src.__support.FPUtil.rounding_mode
libc.src.__support.macros.optimization
)
+
+add_header_library(
+ div
+ HDRS
+ div.h
+ DEPENDS
+ libc.hdr.fenv_macros
+ libc.src.__support.CPP.bit
+ libc.src.__support.CPP.type_traits
+ libc.src.__support.FPUtil.fenv_impl
+ libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.FPUtil.dyadic_float
+ libc.src.__support.FPUtil.rounding_mode
+)
diff --git a/libc/src/__support/FPUtil/generic/div.h b/libc/src/__support/FPUtil/generic/div.h
new file mode 100644
index 0000000000000..c3cb702d2a97a
--- /dev/null
+++ b/libc/src/__support/FPUtil/generic/div.h
@@ -0,0 +1,180 @@
+//===-- Division of IEEE 754 floating-point numbers -------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_FPUTIL_GENERIC_DIV_H
+#define LLVM_LIBC_SRC___SUPPORT_FPUTIL_GENERIC_DIV_H
+
+#include "hdr/fenv_macros.h"
+#include "src/__support/CPP/bit.h"
+#include "src/__support/CPP/type_traits.h"
+#include "src/__support/FPUtil/FEnvImpl.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/dyadic_float.h"
+#include "src/__support/FPUtil/rounding_mode.h"
+
+namespace LIBC_NAMESPACE::fputil::generic {
+
+template <typename OutType, typename InType>
+cpp::enable_if_t<cpp::is_floating_point_v<OutType> &&
+ cpp::is_floating_point_v<InType> &&
+ sizeof(OutType) <= sizeof(InType),
+ OutType>
+div(InType x, InType y) {
+ using OutFPBits = FPBits<OutType>;
+ using OutStorageType = typename OutFPBits::StorageType;
+ using InFPBits = FPBits<InType>;
+ using InStorageType = typename InFPBits::StorageType;
+ using DyadicFloat =
+ DyadicFloat<cpp::bit_ceil(static_cast<size_t>(InFPBits::FRACTION_LEN))>;
+ using DyadicMantissaType = typename DyadicFloat::MantissaType;
+
+ // +1 for the implicit bit.
+ constexpr int DYADIC_EXTRA_MANTISSA_LEN =
+ DyadicMantissaType::BITS - (InFPBits::FRACTION_LEN + 1);
+ // +1 for the extra fractional bit in q.
+ constexpr int Q_EXTRA_FRACTION_LEN =
+ InFPBits::FRACTION_LEN + 1 - OutFPBits::FRACTION_LEN;
+
+ InFPBits x_bits(x);
+ InFPBits y_bits(y);
+
+ if (x_bits.is_nan() || y_bits.is_nan()) {
+ if (x_bits.is_signaling_nan() || y_bits.is_signaling_nan())
+ raise_except_if_required(FE_INVALID);
+
+ // TODO: Handle NaN payloads.
+ return OutFPBits::quiet_nan().get_val();
+ }
+
+ Sign result_sign = x_bits.sign() == y_bits.sign() ? Sign::POS : Sign::NEG;
+
+ if (x_bits.is_inf()) {
+ if (y_bits.is_inf()) {
+ raise_except_if_required(FE_INVALID);
+ return OutFPBits::quiet_nan().get_val();
+ }
+
+ return OutFPBits::inf(result_sign).get_val();
+ }
+
+ if (y_bits.is_inf())
+ return OutFPBits::zero(result_sign).get_val();
+
+ if (y_bits.is_zero()) {
+ if (x_bits.is_zero()) {
+ raise_except_if_required(FE_INVALID);
+ return OutFPBits::quiet_nan().get_val();
+ }
+
+ raise_except_if_required(FE_DIVBYZERO);
+ return OutFPBits::inf(result_sign).get_val();
+ }
+
+ if (x_bits.is_zero())
+ return OutFPBits::zero(result_sign).get_val();
+
+ DyadicFloat xd(x);
+ DyadicFloat yd(y);
+
+ bool would_q_be_subnormal = xd.mantissa < yd.mantissa;
+ int q_exponent = xd.get_unbiased_exponent() - yd.get_unbiased_exponent() -
+ would_q_be_subnormal;
+
+ if (q_exponent + OutFPBits::EXP_BIAS >= OutFPBits::MAX_BIASED_EXPONENT) {
+ switch (quick_get_round()) {
+ case FE_TONEAREST:
+ case FE_UPWARD:
+ return OutFPBits::inf().get_val();
+ default:
+ return OutFPBits::max_normal().get_val();
+ }
+ }
+
+ if (q_exponent < -OutFPBits::EXP_BIAS - OutFPBits::FRACTION_LEN) {
+ switch (quick_get_round()) {
+ case FE_UPWARD:
+ return OutFPBits::min_subnormal().get_val();
+ default:
+ return OutFPBits::zero().get_val();
+ }
+ }
+
+ InStorageType q = 1;
+ InStorageType xd_mant_in = static_cast<InStorageType>(
+ xd.mantissa >> (DYADIC_EXTRA_MANTISSA_LEN - would_q_be_subnormal));
+ InStorageType yd_mant_in =
+ static_cast<InStorageType>(yd.mantissa >> DYADIC_EXTRA_MANTISSA_LEN);
+ InStorageType r = xd_mant_in - yd_mant_in;
+
+ for (size_t i = 0; i < InFPBits::FRACTION_LEN + 1; i++) {
+ q <<= 1;
+ InStorageType t = r << 1;
+ if (t < yd_mant_in) {
+ r = t;
+ } else {
+ q += 1;
+ r = t - yd_mant_in;
+ }
+ }
+
+ bool round;
+ bool sticky;
+ OutStorageType result;
+
+ if (q_exponent > -OutFPBits::EXP_BIAS) {
+ // Result is normal.
+
+ round = (q & (InStorageType(1) << (Q_EXTRA_FRACTION_LEN - 1))) != 0;
+ sticky = (q & ((InStorageType(1) << (Q_EXTRA_FRACTION_LEN - 1)) - 1)) != 0;
+
+ result = OutFPBits::create_value(
+ result_sign,
+ static_cast<OutStorageType>(q_exponent + OutFPBits::EXP_BIAS),
+ static_cast<OutStorageType>((q >> Q_EXTRA_FRACTION_LEN) &
+ OutFPBits::SIG_MASK))
+ .uintval();
+
+ } else {
+ // Result is subnormal.
+
+ // +1 because the leading bit is now part of the fraction.
+ int underflow_extra_fraction_len =
+ Q_EXTRA_FRACTION_LEN + 1 - q_exponent - OutFPBits::EXP_BIAS;
+
+ InStorageType round_bit_mask = InStorageType(1)
+ << (underflow_extra_fraction_len - 1);
+ round = (q & round_bit_mask) != 0;
+ InStorageType sticky_bits_mask = round_bit_mask - 1;
+ sticky = (q & sticky_bits_mask) != 0;
+
+ result = OutFPBits::create_value(
+ result_sign, 0,
+ static_cast<OutStorageType>(q >> underflow_extra_fraction_len))
+ .uintval();
+ }
+
+ bool lsb = (result & 1) != 0;
+
+ switch (quick_get_round()) {
+ case FE_TONEAREST:
+ if (round && (lsb || sticky))
+ ++result;
+ break;
+ case FE_UPWARD:
+ ++result;
+ break;
+ default:
+ break;
+ }
+
+ return cpp::bit_cast<OutType>(result);
+}
+
+} // namespace LIBC_NAMESPACE::fputil::generic
+
+#endif // LLVM_LIBC_SRC___SUPPORT_FPUTIL_GENERIC_DIV_H
diff --git a/libc/src/math/CMakeLists.txt b/libc/src/math/CMakeLists.txt
index 4472367d6c073..53b5f44e8ef31 100644
--- a/libc/src/math/CMakeLists.txt
+++ b/libc/src/math/CMakeLists.txt
@@ -99,6 +99,8 @@ add_math_entrypoint_object(exp10f)
add_math_entrypoint_object(expm1)
add_math_entrypoint_object(expm1f)
+add_math_entrypoint_object(f16divf)
+
add_math_entrypoint_object(f16fmaf)
add_math_entrypoint_object(f16sqrtf)
diff --git a/libc/src/math/f16divf.h b/libc/src/math/f16divf.h
new file mode 100644
index 0000000000000..a3359d9e47944
--- /dev/null
+++ b/libc/src/math/f16divf.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for f16divf -----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_MATH_F16DIVF_H
+#define LLVM_LIBC_SRC_MATH_F16DIVF_H
+
+#include "src/__support/macros/properties/types.h"
+
+namespace LIBC_NAMESPACE {
+
+float16 f16divf(float x, float y);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_MATH_F16DIVF_H
diff --git a/libc/src/math/generic/CMakeLists.txt b/libc/src/math/generic/CMakeLists.txt
index aa0069d821d0d..f14c70bedcfeb 100644
--- a/libc/src/math/generic/CMakeLists.txt
+++ b/libc/src/math/generic/CMakeLists.txt
@@ -3602,6 +3602,19 @@ add_entrypoint_object(
-O3
)
+add_entrypoint_object(
+ f16divf
+ SRCS
+ f16divf.cpp
+ HDRS
+ ../f16divf.h
+ DEPENDS
+ libc.src.__support.macros.properties.types
+ libc.src.__support.FPUtil.generic.div
+ COMPILE_OPTIONS
+ -O3
+)
+
add_entrypoint_object(
f16fmaf
SRCS
diff --git a/libc/src/math/generic/f16divf.cpp b/libc/src/math/generic/f16divf.cpp
new file mode 100644
index 0000000000000..45874fbac2055
--- /dev/null
+++ b/libc/src/math/generic/f16divf.cpp
@@ -0,0 +1,19 @@
+//===-- Implementation of f16divf function --------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/math/f16divf.h"
+#include "src/__support/FPUtil/generic/div.h"
+#include "src/__support/common.h"
+
+namespace LIBC_NAMESPACE {
+
+LLVM_LIBC_FUNCTION(float16, f16divf, (float x, float y)) {
+ return fputil::generic::div<float16>(x, y);
+}
+
+} // namespace LIBC_NAMESPACE
diff --git a/libc/test/src/math/CMakeLists.txt b/libc/test/src/math/CMakeLists.txt
index bb364c3f0a175..ba588662f469e 100644
--- a/libc/test/src/math/CMakeLists.txt
+++ b/libc/test/src/math/CMakeLists.txt
@@ -1890,6 +1890,19 @@ add_fp_unittest(
libc.src.__support.FPUtil.fp_bits
)
+add_fp_unittest(
+ f16divf_test
+ NEED_MPFR
+ SUITE
+ libc-math-unittests
+ SRCS
+ f16divf_test.cpp
+ HDRS
+ DivTest.h
+ DEPENDS
+ libc.src.math.f16divf
+)
+
add_fp_unittest(
f16fmaf_test
NEED_MPFR
diff --git a/libc/test/src/math/DivTest.h b/libc/test/src/math/DivTest.h
new file mode 100644
index 0000000000000..39e8a6b67bd90
--- /dev/null
+++ b/libc/test/src/math/DivTest.h
@@ -0,0 +1,74 @@
+//===-- Utility class to test different flavors of float div ----*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_TEST_SRC_MATH_DIVTEST_H
+#define LLVM_LIBC_TEST_SRC_MATH_DIVTEST_H
+
+#include "test/UnitTest/FEnvSafeTest.h"
+#include "test/UnitTest/FPMatcher.h"
+#include "test/UnitTest/Test.h"
+#include "utils/MPFRWrapper/MPFRUtils.h"
+
+namespace mpfr = LIBC_NAMESPACE::testing::mpfr;
+
+template <typename OutType, typename InType>
+class DivTest : public LIBC_NAMESPACE::testing::FEnvSafeTest {
+
+ struct InConstants {
+ DECLARE_SPECIAL_CONSTANTS(InType)
+ };
+
+ using InFPBits = typename InConstants::FPBits;
+ using InStorageType = typename InConstants::StorageType;
+
+ static constexpr InStorageType IN_MAX_NORMAL_U =
+ InFPBits::max_normal().uintval();
+ static constexpr InStorageType IN_MIN_NORMAL_U =
+ InFPBits::min_normal().uintval();
+ static constexpr InStorageType IN_MAX_SUBNORMAL_U =
+ InFPBits::max_subnormal().uintval();
+ static constexpr InStorageType IN_MIN_SUBNORMAL_U =
+ InFPBits::min_subnormal().uintval();
+
+public:
+ typedef OutType (*DivFunc)(InType, InType);
+
+ void test_subnormal_range(DivFunc func) {
+ constexpr InStorageType COUNT = 100'001;
+ constexpr InStorageType STEP =
+ (IN_MAX_SUBNORMAL_U - IN_MIN_SUBNORMAL_U) / COUNT;
+ for (InStorageType i = 0, v = 0, w = IN_MAX_SUBNORMAL_U; i <= COUNT;
+ ++i, v += STEP, w -= STEP) {
+ InType x = InFPBits(v).get_val();
+ InType y = InFPBits(w).get_val();
+ mpfr::BinaryInput<InType> input{x, y};
+ EXPECT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Div, input, func(x, y),
+ 0.5);
+ }
+ }
+
+ void test_normal_range(DivFunc func) {
+ constexpr InStorageType COUNT = 100'001;
+ constexpr InStorageType STEP = (IN_MAX_NORMAL_U - IN_MIN_NORMAL_U) / COUNT;
+ for (InStorageType i = 0, v = 0, w = IN_MAX_NORMAL_U; i <= COUNT;
+ ++i, v += STEP, w -= STEP) {
+ InType x = InFPBits(v).get_val();
+ InType y = InFPBits(w).get_val();
+ mpfr::BinaryInput<InType> input{x, y};
+ EXPECT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Div, input, func(x, y),
+ 0.5);
+ }
+ }
+};
+
+#define LIST_DIV_TESTS(OutType, InType, func) \
+ using LlvmLibcDivTest = DivTest<OutType, InType>; \
+ TEST_F(LlvmLibcDivTest, SubnormalRange) { test_subnormal_range(&func); } \
+ TEST_F(LlvmLibcDivTest, NormalRange) { test_normal_range(&func); }
+
+#endif // LLVM_LIBC_TEST_SRC_MATH_DIVTEST_H
diff --git a/libc/test/src/math/f16divf_test.cpp b/libc/test/src/math/f16divf_test.cpp
new file mode 100644
index 0000000000000..85be1ebcd55c9
--- /dev/null
+++ b/libc/test/src/math/f16divf_test.cpp
@@ -0,0 +1,13 @@
+//===-- Unittests for f16divf ---------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "DivTest.h"
+
+#include "src/math/f16divf.h"
+
+LIST_DIV_TESTS(float16, float, LIBC_NAMESPACE::f16divf)
diff --git a/libc/test/src/math/smoke/CMakeLists.txt b/libc/test/src/math/smoke/CMakeLists.txt
index a67d0437592d5..49f837d2f60e6 100644
--- a/libc/test/src/math/smoke/CMakeLists.txt
+++ b/libc/test/src/math/smoke/CMakeLists.txt
@@ -3553,6 +3553,19 @@ add_fp_unittest(
libc.src.math.totalordermagf16
)
+add_fp_unittest(
+ f16divf_test
+ SUITE
+ libc-math-smoke-tests
+ SRCS
+ f16divf_test.cpp
+ HDRS
+ DivTest.h
+ DEPENDS
+ libc.hdr.fenv_macros
+ libc.src.math.f16divf
+)
+
add_fp_unittest(
f16fmaf_test
SUITE
diff --git a/libc/test/src/math/smoke/DivTest.h b/libc/test/src/math/smoke/DivTest.h
new file mode 100644
index 0000000000000..8cd528d1111b9
--- /dev/null
+++ b/libc/test/src/math/smoke/DivTest.h
@@ -0,0 +1,67 @@
+//===-- Utility class to test different flavors of float div --------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_TEST_SRC_MATH_SMOKE_DIVTEST_H
+#define LLVM_LIBC_TEST_SRC_MATH_SMOKE_DIVTEST_H
+
+#include "hdr/fenv_macros.h"
+#include "test/UnitTest/FEnvSafeTest.h"
+#include "test/UnitTest/FPMatcher.h"
+#include "test/UnitTest/Test.h"
+
+template <typename OutType, typename InType>
+class DivTest : public LIBC_NAMESPACE::testing::FEnvSafeTest {
+
+ DECLARE_SPECIAL_CONSTANTS(OutType)
+
+public:
+ typedef OutType (*DivFunc)(InType, InType);
+
+ void test_special_numbers(DivFunc func) {
+ EXPECT_FP_IS_NAN(func(aNaN, aNaN));
+
+ EXPECT_FP_EQ(inf, func(inf, zero));
+ EXPECT_FP_EQ(neg_inf, func(neg_inf, zero));
+ EXPECT_FP_EQ(neg_inf, func(inf, neg_zero));
+ EXPECT_FP_EQ(inf, func(neg_inf, neg_zero));
+
+ EXPECT_FP_EQ(inf, func(inf, zero));
+ EXPECT_FP_EQ(neg_inf, func(neg_inf, zero));
+ EXPECT_FP_EQ(neg_inf, func(inf, neg_zero));
+ EXPECT_FP_EQ(inf, func(neg_inf, neg_zero));
+ }
+
+ void test_division_by_zero(DivFunc func) {
+ EXPECT_FP_EQ_WITH_EXCEPTION(inf, func(InType(1.0), zero), FE_DIVBYZERO);
+ EXPECT_FP_EQ_WITH_EXCEPTION(neg_inf, func(InType(-1.0), zero),
+ FE_DIVBYZERO);
+ EXPECT_FP_EQ_WITH_EXCEPTION(neg_inf, func(InType(1.0), neg_zero),
+ FE_DIVBYZERO);
+ EXPECT_FP_EQ_WITH_EXCEPTION(inf, func(InType(-1.0), neg_zero), FE_DIVBYZERO);
+ }
+
+ void test_invalid_operations(DivFunc func) {
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(zero, zero), FE_INVALID);
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_zero, zero), FE_INVALID);
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(zero, neg_zero), FE_INVALID);
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_zero, neg_zero), FE_INVALID);
+
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(inf, inf), FE_INVALID);
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_inf, inf), FE_INVALID);
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(inf, neg_inf), FE_INVALID);
+ EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_inf, neg_inf), FE_INVALID);
+ }
+};
+
+#define LIST_DIV_TESTS(OutType, InType, func) \
+ using LlvmLibcDivTest = DivTest<OutType, InType>; \
+ TEST_F(LlvmLibcDivTest, SpecialNumbers) { test_special_numbers(&func); } \
+ TEST_F(LlvmLibcDivTest, DivisionByZero) { test_division_by_zero(&func); } \
+ TEST_F(LlvmLibcDivTest, InvalidOperations) { test_invalid_operations(&func); }
+
+#endif // LLV...
[truncated]
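The quotient loop in `fputil::generic::div` (div.h in the diff above) is a shift-and-subtract division on the two mantissas, followed by a rounding step driven by round, sticky, and least-significant bits. The following is a minimal standalone sketch of that idea on plain 64-bit integers, not the patch's code: the names `Quotient`, `divide_mantissas`, and `round_nearest_even` are hypothetical, and folding a nonzero remainder into the sticky bit is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: quotient bits plus an "inexact" flag.
struct Quotient {
  std::uint64_t bits; // quotient with `steps` fractional bits, leading bit set
  bool inexact;       // true if a nonzero remainder was left over
};

// Shift-and-subtract division of mx by my, assuming my <= mx < 2 * my so
// that the true quotient lies in [1, 2). Produces `steps` fractional bits.
Quotient divide_mantissas(std::uint64_t mx, std::uint64_t my, int steps) {
  assert(my != 0 && my < (std::uint64_t{1} << 63));
  assert(mx >= my && mx - my < my);
  assert(steps >= 0 && steps < 63);
  std::uint64_t q = 1;       // integer part of the quotient
  std::uint64_t r = mx - my; // running remainder, always < my
  for (int i = 0; i < steps; ++i) {
    q <<= 1;
    r <<= 1; // cannot overflow: r < my < 2^63
    if (r >= my) {
      q |= 1;
      r -= my;
    }
  }
  return {q, r != 0};
}

// Drops the low `extra` bits of q using round-to-nearest, ties-to-even.
// The caller must renormalize if the increment carries into a new top bit.
std::uint64_t round_nearest_even(std::uint64_t q, int extra, bool inexact) {
  assert(extra >= 1 && extra < 64);
  std::uint64_t kept = q >> extra;
  const bool round = ((q >> (extra - 1)) & 1) != 0;
  const bool sticky =
      inexact || (q & ((std::uint64_t{1} << (extra - 1)) - 1)) != 0;
  const bool lsb = (kept & 1) != 0;
  if (round && (lsb || sticky))
    ++kept;
  return kept;
}
```

For instance, `divide_mantissas(3, 2, 4)` yields the bit pattern for 1.5 with four fractional bits and no remainder, while `divide_mantissas(4, 3, 4)` leaves a remainder and therefore reports the result as inexact.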
if (x_bits.is_nan() || y_bits.is_nan()) {
if (x_bits.is_signaling_nan() || y_bits.is_signaling_nan())
I should try putting all of these ifs for special values under a single if (LIBC_UNLIKELY(...)) and running a benchmark to compare.
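A rough sketch of what grouping the special-value checks under one cold branch could look like is given here. It is written against plain `float` and the standard library rather than `FPBits`. `SKETCH_UNLIKELY` is a stand-in for `LIBC_UNLIKELY` (whose real definition may differ), `sketch_div` is a hypothetical name, and exception raising is left out for brevity.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Stand-in for LIBC_UNLIKELY; assumes a GCC/Clang-style builtin when present.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Hypothetical division with every special case behind one unlikely branch.
// Floating-point exceptions are intentionally not raised in this sketch.
float sketch_div(float x, float y) {
  const bool special = !std::isfinite(x) || !std::isfinite(y) ||
                       x == 0.0f || y == 0.0f;
  if (SKETCH_UNLIKELY(special)) {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    const float inf = std::numeric_limits<float>::infinity();
    const float sign = (std::signbit(x) != std::signbit(y)) ? -1.0f : 1.0f;
    if (std::isnan(x) || std::isnan(y))
      return nan;
    if (std::isinf(x))
      return std::isinf(y) ? nan : sign * inf; // inf / inf is invalid
    if (std::isinf(y))
      return sign * 0.0f; // finite / inf is a signed zero
    if (y == 0.0f)
      return x == 0.0f ? nan : sign * inf; // 0 / 0 invalid, x / 0 infinite
    return sign * 0.0f; // 0 / finite nonzero
  }
  // Hot path: stands in for the integer quotient loop of the real code.
  return x / y;
}
```

With this shape, ordinary finite nonzero inputs pay for a single predictable branch, and the individual special-case tests only run once that branch is taken.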
+1
I pushed the change to if (LIBC_UNLIKELY(...)). It is indeed faster:
Performance tests with inputs in denormal range:
-- My function --
Total time : 590356165 ns
Average runtime : 29.5178 ns/op
Ops per second : 33877887 op/s
-- Other function --
Total time : 612659035 ns
Average runtime : 30.6329 ns/op
Ops per second : 32644617 op/s
-- Average runtime ratio --
Mine / Other's : 0.963597
Performance tests with inputs in normal range:
-- My function --
Total time : 145477558 ns
Average runtime : 7.27387 ns/op
Ops per second : 137478386 op/s
-- Other function --
Total time : 172587200 ns
Average runtime : 8.62935 ns/op
Ops per second : 115883564 op/s
-- Average runtime ratio --
Mine / Other's : 0.842922
Performance tests with inputs in normal range with exponents close to each other:
-- My function --
Total time : 522964890 ns
Average runtime : 26.1482 ns/op
Ops per second : 38243523 op/s
-- Other function --
Total time : 550152260 ns
Average runtime : 27.5076 ns/op
Ops per second : 36353608 op/s
-- Average runtime ratio --
Mine / Other's : 0.950582
It's nice to have perf tests to confirm improvements.
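For readers who want to reproduce this kind of comparison outside the LLVM libc test harness, a minimal timing sketch follows. It is not the harness that produced the numbers in this thread: `ns_per_op`, `runtime_ratio`, and `time_calls` are hypothetical helpers built on `std::chrono::steady_clock`.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Average cost of one call, in nanoseconds.
double ns_per_op(std::uint64_t total_ns, std::uint64_t ops) {
  assert(ops != 0);
  return static_cast<double>(total_ns) / static_cast<double>(ops);
}

// "Mine / Other's" ratio: a value below 1.0 means the first one is faster.
double runtime_ratio(std::uint64_t mine_ns, std::uint64_t other_ns) {
  assert(other_ns != 0);
  return static_cast<double>(mine_ns) / static_cast<double>(other_ns);
}

// Times `ops` calls of func(x, y) and returns the elapsed nanoseconds.
// The volatile sink keeps the compiler from discarding the calls.
template <typename Func>
std::uint64_t time_calls(Func func, float x, float y, std::uint64_t ops) {
  volatile float sink = 0.0f;
  const auto start = std::chrono::steady_clock::now();
  for (std::uint64_t i = 0; i < ops; ++i)
    sink = func(x, y);
  const auto stop = std::chrono::steady_clock::now();
  (void)sink;
  return static_cast<std::uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
          .count());
}
```

Applied to the first denormal-range run above, `runtime_ratio(590356165, 612659035)` gives about 0.9636, which agrees with the reported 0.963597.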
Even with the changes to DyadicFloat::operator T(), the new implementation with fewer iterations is still faster than the previous one:
Performance tests with inputs in denormal range:
-- My function --
Total time : 232472260 ns
Average runtime : 46.4944 ns/op
Ops per second : 21507964 op/s
-- Other function --
Total time : 519384676 ns
Average runtime : 103.877 ns/op
Ops per second : 9626785 op/s
-- Average runtime ratio --
Mine / Other's : 0.447592
Performance tests with inputs in normal range:
-- My function --
Total time : 511714277 ns
Average runtime : 102.343 ns/op
Ops per second : 9771087 op/s
-- Other function --
Total time : 637747828 ns
Average runtime : 127.549 ns/op
Ops per second : 7840097 op/s
-- Average runtime ratio --
Mine / Other's : 0.802377
Performance tests with inputs in normal range with exponents close to each other:
-- My function --
Total time : 309770755 ns
Average runtime : 61.9541 ns/op
Ops per second : 16140984 op/s
-- Other function --
Total time : 512884190 ns
Average runtime : 102.577 ns/op
Ops per second : 9748799 op/s
-- Average runtime ratio --
Mine / Other's : 0.603978
Performance test of the current version (6439428) vs the one with more iterations (4ff508d):
Performance tests with inputs in denormal range:
-- My function --
Total time : 224276882 ns
Average runtime : 44.8553 ns/op
Ops per second : 22293893 op/s
-- Other function --
Total time : 520018166 ns
Average runtime : 104.004 ns/op
Ops per second : 9615058 op/s
-- Average runtime ratio --
Mine / Other's : 0.431287
Performance tests with inputs in normal range:
-- My function --
Total time : 506187486 ns
Average runtime : 101.237 ns/op
Ops per second : 9877772 op/s
-- Other function --
Total time : 642170120 ns
Average runtime : 128.434 ns/op
Ops per second : 7786106 op/s
-- Average runtime ratio --
Mine / Other's : 0.788245
Performance tests with inputs in normal range with exponents close to each other:
-- My function --
Total time : 303552644 ns
Average runtime : 60.7105 ns/op
Ops per second : 16471623 op/s
-- Other function --
Total time : 513769699 ns
Average runtime : 102.754 ns/op
Ops per second : 9731996 op/s
-- Average runtime ratio --
Mine / Other's : 0.590834
Force-pushed from d00fa4a to e4ecb9e.
Rebased and resolved merge conflicts.
Add LIBC_INLINE to fputil::generic::div.
Optimize special value checks.
Force-pushed from 8ce9179 to 4ff508d.
Had to rebase and force-push to get |
Restore include of "hdr/errno_macros.h".
Fix and optimize exception signaling, and fix missing include.
Part of #93566.