Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc][math][c23] Add f16divf C23 math function #96131

Merged
merged 8 commits into from
Jun 25, 2024

Conversation

overmighty
Copy link
Member

Part of #93566.

@overmighty
Copy link
Member Author

cc @lntue

@llvmbot
Copy link
Member

llvmbot commented Jun 20, 2024

@llvm/pr-subscribers-libc

Author: OverMighty (overmighty)

Changes

Part of #93566.


Patch is 31.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96131.diff

18 Files Affected:

  • (modified) libc/config/linux/aarch64/entrypoints.txt (+1)
  • (modified) libc/config/linux/x86_64/entrypoints.txt (+1)
  • (modified) libc/docs/math/index.rst (+2)
  • (modified) libc/spec/stdc.td (+2)
  • (modified) libc/src/__support/FPUtil/generic/CMakeLists.txt (+14)
  • (added) libc/src/__support/FPUtil/generic/div.h (+180)
  • (modified) libc/src/math/CMakeLists.txt (+2)
  • (added) libc/src/math/f16divf.h (+20)
  • (modified) libc/src/math/generic/CMakeLists.txt (+13)
  • (added) libc/src/math/generic/f16divf.cpp (+19)
  • (modified) libc/test/src/math/CMakeLists.txt (+13)
  • (added) libc/test/src/math/DivTest.h (+74)
  • (added) libc/test/src/math/f16divf_test.cpp (+13)
  • (modified) libc/test/src/math/smoke/CMakeLists.txt (+13)
  • (added) libc/test/src/math/smoke/DivTest.h (+67)
  • (added) libc/test/src/math/smoke/f16divf_test.cpp (+13)
  • (modified) libc/utils/MPFRWrapper/MPFRUtils.cpp (+53-31)
  • (modified) libc/utils/MPFRWrapper/MPFRUtils.h (+28-11)
diff --git a/libc/config/linux/aarch64/entrypoints.txt b/libc/config/linux/aarch64/entrypoints.txt
index dfed6acbdf257..e0111239c89fa 100644
--- a/libc/config/linux/aarch64/entrypoints.txt
+++ b/libc/config/linux/aarch64/entrypoints.txt
@@ -504,6 +504,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.canonicalizef16
     libc.src.math.ceilf16
     libc.src.math.copysignf16
+    libc.src.math.f16divf
     libc.src.math.f16fmaf
     libc.src.math.f16sqrtf
     libc.src.math.fabsf16
diff --git a/libc/config/linux/x86_64/entrypoints.txt b/libc/config/linux/x86_64/entrypoints.txt
index cfe35167ca32e..512a3a3f50adf 100644
--- a/libc/config/linux/x86_64/entrypoints.txt
+++ b/libc/config/linux/x86_64/entrypoints.txt
@@ -536,6 +536,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.canonicalizef16
     libc.src.math.ceilf16
     libc.src.math.copysignf16
+    libc.src.math.f16divf
     libc.src.math.f16fmaf
     libc.src.math.f16sqrtf
     libc.src.math.fabsf16
diff --git a/libc/docs/math/index.rst b/libc/docs/math/index.rst
index 293edd1c15100..185ae1fd82e8f 100644
--- a/libc/docs/math/index.rst
+++ b/libc/docs/math/index.rst
@@ -124,6 +124,8 @@ Basic Operations
 +------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
 | dsub             | N/A              | N/A             |                        | N/A                  |                        | 7.12.14.2              | F.10.11                    |
 +------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
+| f16div           | |check|          |                 |                        | N/A                  |                        | 7.12.14.4              | F.10.11                    |
++------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
 | f16fma           | |check|          |                 |                        | N/A                  |                        | 7.12.14.5              | F.10.11                    |
 +------------------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
 | fabs             | |check|          | |check|         | |check|                | |check|              | |check|                | 7.12.7.3               | F.10.4.3                   |
diff --git a/libc/spec/stdc.td b/libc/spec/stdc.td
index f9c79ee106bbb..ac3badc8096b9 100644
--- a/libc/spec/stdc.td
+++ b/libc/spec/stdc.td
@@ -716,6 +716,8 @@ def StdC : StandardSpec<"stdc"> {
 
           GuardedFunctionSpec<"totalordermagf16", RetValSpec<IntType>, [ArgSpec<Float16Ptr>, ArgSpec<Float16Ptr>], "LIBC_TYPES_HAS_FLOAT16">,
 
+          GuardedFunctionSpec<"f16divf", RetValSpec<Float16Type>, [ArgSpec<FloatType>, ArgSpec<FloatType>], "LIBC_TYPES_HAS_FLOAT16">,
+
           GuardedFunctionSpec<"f16sqrtf", RetValSpec<Float16Type>, [ArgSpec<FloatType>], "LIBC_TYPES_HAS_FLOAT16">,
       ]
   >;
diff --git a/libc/src/__support/FPUtil/generic/CMakeLists.txt b/libc/src/__support/FPUtil/generic/CMakeLists.txt
index a8a95ba3f15ff..a7b912e0bab98 100644
--- a/libc/src/__support/FPUtil/generic/CMakeLists.txt
+++ b/libc/src/__support/FPUtil/generic/CMakeLists.txt
@@ -45,3 +45,17 @@ add_header_library(
     libc.src.__support.FPUtil.rounding_mode
     libc.src.__support.macros.optimization
 )
+
+add_header_library(
+  div
+  HDRS
+    div.h
+  DEPENDS
+    libc.hdr.fenv_macros
+    libc.src.__support.CPP.bit
+    libc.src.__support.CPP.type_traits
+    libc.src.__support.FPUtil.fenv_impl
+    libc.src.__support.FPUtil.fp_bits
+    libc.src.__support.FPUtil.dyadic_float
+    libc.src.__support.FPUtil.rounding_mode
+)
diff --git a/libc/src/__support/FPUtil/generic/div.h b/libc/src/__support/FPUtil/generic/div.h
new file mode 100644
index 0000000000000..c3cb702d2a97a
--- /dev/null
+++ b/libc/src/__support/FPUtil/generic/div.h
@@ -0,0 +1,180 @@
+//===-- Division of IEEE 754 floating-point numbers -------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_FPUTIL_GENERIC_DIV_H
+#define LLVM_LIBC_SRC___SUPPORT_FPUTIL_GENERIC_DIV_H
+
+#include "hdr/fenv_macros.h"
+#include "src/__support/CPP/bit.h"
+#include "src/__support/CPP/type_traits.h"
+#include "src/__support/FPUtil/FEnvImpl.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/dyadic_float.h"
+#include "src/__support/FPUtil/rounding_mode.h"
+
+namespace LIBC_NAMESPACE::fputil::generic {
+
+template <typename OutType, typename InType>
+cpp::enable_if_t<cpp::is_floating_point_v<OutType> &&
+                     cpp::is_floating_point_v<InType> &&
+                     sizeof(OutType) <= sizeof(InType),
+                 OutType>
+div(InType x, InType y) {
+  using OutFPBits = FPBits<OutType>;
+  using OutStorageType = typename OutFPBits::StorageType;
+  using InFPBits = FPBits<InType>;
+  using InStorageType = typename InFPBits::StorageType;
+  using DyadicFloat =
+      DyadicFloat<cpp::bit_ceil(static_cast<size_t>(InFPBits::FRACTION_LEN))>;
+  using DyadicMantissaType = typename DyadicFloat::MantissaType;
+
+  // +1 for the implicit bit.
+  constexpr int DYADIC_EXTRA_MANTISSA_LEN =
+      DyadicMantissaType::BITS - (InFPBits::FRACTION_LEN + 1);
+  // +1 for the extra fractional bit in q.
+  constexpr int Q_EXTRA_FRACTION_LEN =
+      InFPBits::FRACTION_LEN + 1 - OutFPBits::FRACTION_LEN;
+
+  InFPBits x_bits(x);
+  InFPBits y_bits(y);
+
+  if (x_bits.is_nan() || y_bits.is_nan()) {
+    if (x_bits.is_signaling_nan() || y_bits.is_signaling_nan())
+      raise_except_if_required(FE_INVALID);
+
+    // TODO: Handle NaN payloads.
+    return OutFPBits::quiet_nan().get_val();
+  }
+
+  Sign result_sign = x_bits.sign() == y_bits.sign() ? Sign::POS : Sign::NEG;
+
+  if (x_bits.is_inf()) {
+    if (y_bits.is_inf()) {
+      raise_except_if_required(FE_INVALID);
+      return OutFPBits::quiet_nan().get_val();
+    }
+
+    return OutFPBits::inf(result_sign).get_val();
+  }
+
+  if (y_bits.is_inf())
+    return OutFPBits::inf(result_sign).get_val();
+
+  if (y_bits.is_zero()) {
+    if (x_bits.is_zero()) {
+      raise_except_if_required(FE_INVALID);
+      return OutFPBits::quiet_nan().get_val();
+    }
+
+    raise_except_if_required(FE_DIVBYZERO);
+    return OutFPBits::inf(result_sign).get_val();
+  }
+
+  if (x_bits.is_zero())
+    return OutFPBits::zero(result_sign).get_val();
+
+  DyadicFloat xd(x);
+  DyadicFloat yd(y);
+
+  bool would_q_be_subnormal = xd.mantissa < yd.mantissa;
+  int q_exponent = xd.get_unbiased_exponent() - yd.get_unbiased_exponent() -
+                   would_q_be_subnormal;
+
+  if (q_exponent + OutFPBits::EXP_BIAS >= OutFPBits::MAX_BIASED_EXPONENT) {
+    switch (quick_get_round()) {
+    case FE_TONEAREST:
+    case FE_UPWARD:
+      return OutFPBits::inf().get_val();
+    default:
+      return OutFPBits::max_normal().get_val();
+    }
+  }
+
+  if (q_exponent < -OutFPBits::EXP_BIAS - OutFPBits::FRACTION_LEN) {
+    switch (quick_get_round()) {
+    case FE_UPWARD:
+      return OutFPBits::min_subnormal().get_val();
+    default:
+      return OutFPBits::zero().get_val();
+    }
+  }
+
+  InStorageType q = 1;
+  InStorageType xd_mant_in = static_cast<InStorageType>(
+      xd.mantissa >> (DYADIC_EXTRA_MANTISSA_LEN - would_q_be_subnormal));
+  InStorageType yd_mant_in =
+      static_cast<InStorageType>(yd.mantissa >> DYADIC_EXTRA_MANTISSA_LEN);
+  InStorageType r = xd_mant_in - yd_mant_in;
+
+  for (size_t i = 0; i < InFPBits::FRACTION_LEN + 1; i++) {
+    q <<= 1;
+    InStorageType t = r << 1;
+    if (t < yd_mant_in) {
+      r = t;
+    } else {
+      q += 1;
+      r = t - yd_mant_in;
+    }
+  }
+
+  bool round;
+  bool sticky;
+  OutStorageType result;
+
+  if (q_exponent > -OutFPBits::EXP_BIAS) {
+    // Result is normal.
+
+    round = (q & (InStorageType(1) << (Q_EXTRA_FRACTION_LEN - 1))) != 0;
+    sticky = (q & ((InStorageType(1) << (Q_EXTRA_FRACTION_LEN - 1)) - 1)) != 0;
+
+    result = OutFPBits::create_value(
+                 result_sign,
+                 static_cast<OutStorageType>(q_exponent + OutFPBits::EXP_BIAS),
+                 static_cast<OutStorageType>((q >> Q_EXTRA_FRACTION_LEN) &
+                                             OutFPBits::SIG_MASK))
+                 .uintval();
+
+  } else {
+    // Result is subnormal.
+
+    // +1 because the leading bit is now part of the fraction.
+    int underflow_extra_fraction_len =
+        Q_EXTRA_FRACTION_LEN + 1 - q_exponent - OutFPBits::EXP_BIAS;
+
+    InStorageType round_bit_mask = InStorageType(1)
+                                   << (underflow_extra_fraction_len - 1);
+    round = (q & round_bit_mask) != 0;
+    InStorageType sticky_bits_mask = round_bit_mask - 1;
+    sticky = (q & sticky_bits_mask) != 0;
+
+    result = OutFPBits::create_value(
+                 result_sign, 0,
+                 static_cast<OutStorageType>(q >> underflow_extra_fraction_len))
+                 .uintval();
+  }
+
+  bool lsb = (result & 1) != 0;
+
+  switch (quick_get_round()) {
+  case FE_TONEAREST:
+    if (round && (lsb || sticky))
+      ++result;
+    break;
+  case FE_UPWARD:
+    ++result;
+    break;
+  default:
+    break;
+  }
+
+  return cpp::bit_cast<OutType>(result);
+}
+
+} // namespace LIBC_NAMESPACE::fputil::generic
+
+#endif // LLVM_LIBC_SRC___SUPPORT_FPUTIL_GENERIC_DIV_H
diff --git a/libc/src/math/CMakeLists.txt b/libc/src/math/CMakeLists.txt
index 4472367d6c073..53b5f44e8ef31 100644
--- a/libc/src/math/CMakeLists.txt
+++ b/libc/src/math/CMakeLists.txt
@@ -99,6 +99,8 @@ add_math_entrypoint_object(exp10f)
 add_math_entrypoint_object(expm1)
 add_math_entrypoint_object(expm1f)
 
+add_math_entrypoint_object(f16divf)
+
 add_math_entrypoint_object(f16fmaf)
 
 add_math_entrypoint_object(f16sqrtf)
diff --git a/libc/src/math/f16divf.h b/libc/src/math/f16divf.h
new file mode 100644
index 0000000000000..a3359d9e47944
--- /dev/null
+++ b/libc/src/math/f16divf.h
@@ -0,0 +1,20 @@
+//===-- Implementation header for f16divf -----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_MATH_F16DIVF_H
+#define LLVM_LIBC_SRC_MATH_F16DIVF_H
+
+#include "src/__support/macros/properties/types.h"
+
+namespace LIBC_NAMESPACE {
+
+float16 f16divf(float x, float y);
+
+} // namespace LIBC_NAMESPACE
+
+#endif // LLVM_LIBC_SRC_MATH_F16DIVF_H
diff --git a/libc/src/math/generic/CMakeLists.txt b/libc/src/math/generic/CMakeLists.txt
index aa0069d821d0d..f14c70bedcfeb 100644
--- a/libc/src/math/generic/CMakeLists.txt
+++ b/libc/src/math/generic/CMakeLists.txt
@@ -3602,6 +3602,19 @@ add_entrypoint_object(
     -O3
 )
 
+add_entrypoint_object(
+  f16divf
+  SRCS
+    f16divf.cpp
+  HDRS
+    ../f16divf.h
+  DEPENDS
+    libc.src.__support.macros.properties.types
+    libc.src.__support.FPUtil.generic.div
+  COMPILE_OPTIONS
+    -O3
+)
+
 add_entrypoint_object(
   f16fmaf
   SRCS
diff --git a/libc/src/math/generic/f16divf.cpp b/libc/src/math/generic/f16divf.cpp
new file mode 100644
index 0000000000000..45874fbac2055
--- /dev/null
+++ b/libc/src/math/generic/f16divf.cpp
@@ -0,0 +1,19 @@
+//===-- Implementation of f16divf function --------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/math/f16divf.h"
+#include "src/__support/FPUtil/generic/div.h"
+#include "src/__support/common.h"
+
+namespace LIBC_NAMESPACE {
+
+LLVM_LIBC_FUNCTION(float16, f16divf, (float x, float y)) {
+  return fputil::generic::div<float16>(x, y);
+}
+
+} // namespace LIBC_NAMESPACE
diff --git a/libc/test/src/math/CMakeLists.txt b/libc/test/src/math/CMakeLists.txt
index bb364c3f0a175..ba588662f469e 100644
--- a/libc/test/src/math/CMakeLists.txt
+++ b/libc/test/src/math/CMakeLists.txt
@@ -1890,6 +1890,19 @@ add_fp_unittest(
     libc.src.__support.FPUtil.fp_bits
 )
 
+add_fp_unittest(
+  f16divf_test
+  NEED_MPFR
+  SUITE
+    libc-math-unittests
+  SRCS
+    f16divf_test.cpp
+  HDRS
+    DivTest.h
+  DEPENDS
+    libc.src.math.f16divf
+)
+
 add_fp_unittest(
   f16fmaf_test
   NEED_MPFR
diff --git a/libc/test/src/math/DivTest.h b/libc/test/src/math/DivTest.h
new file mode 100644
index 0000000000000..39e8a6b67bd90
--- /dev/null
+++ b/libc/test/src/math/DivTest.h
@@ -0,0 +1,74 @@
+//===-- Utility class to test different flavors of float div ----*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_TEST_SRC_MATH_DIVTEST_H
+#define LLVM_LIBC_TEST_SRC_MATH_DIVTEST_H
+
+#include "test/UnitTest/FEnvSafeTest.h"
+#include "test/UnitTest/FPMatcher.h"
+#include "test/UnitTest/Test.h"
+#include "utils/MPFRWrapper/MPFRUtils.h"
+
+namespace mpfr = LIBC_NAMESPACE::testing::mpfr;
+
+template <typename OutType, typename InType>
+class DivTest : public LIBC_NAMESPACE::testing::FEnvSafeTest {
+
+  struct InConstants {
+    DECLARE_SPECIAL_CONSTANTS(InType)
+  };
+
+  using InFPBits = typename InConstants::FPBits;
+  using InStorageType = typename InConstants::StorageType;
+
+  static constexpr InStorageType IN_MAX_NORMAL_U =
+      InFPBits::max_normal().uintval();
+  static constexpr InStorageType IN_MIN_NORMAL_U =
+      InFPBits::min_normal().uintval();
+  static constexpr InStorageType IN_MAX_SUBNORMAL_U =
+      InFPBits::max_subnormal().uintval();
+  static constexpr InStorageType IN_MIN_SUBNORMAL_U =
+      InFPBits::min_subnormal().uintval();
+
+public:
+  typedef OutType (*DivFunc)(InType, InType);
+
+  void test_subnormal_range(DivFunc func) {
+    constexpr InStorageType COUNT = 100'001;
+    constexpr InStorageType STEP =
+        (IN_MAX_SUBNORMAL_U - IN_MIN_SUBNORMAL_U) / COUNT;
+    for (InStorageType i = 0, v = 0, w = IN_MAX_SUBNORMAL_U; i <= COUNT;
+         ++i, v += STEP, w -= STEP) {
+      InType x = InFPBits(v).get_val();
+      InType y = InFPBits(w).get_val();
+      mpfr::BinaryInput<InType> input{x, y};
+      EXPECT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Div, input, func(x, y),
+                                     0.5);
+    }
+  }
+
+  void test_normal_range(DivFunc func) {
+    constexpr InStorageType COUNT = 100'001;
+    constexpr InStorageType STEP = (IN_MAX_NORMAL_U - IN_MIN_NORMAL_U) / COUNT;
+    for (InStorageType i = 0, v = 0, w = IN_MAX_NORMAL_U; i <= COUNT;
+         ++i, v += STEP, w -= STEP) {
+      InType x = InFPBits(v).get_val();
+      InType y = InFPBits(w).get_val();
+      mpfr::BinaryInput<InType> input{x, y};
+      EXPECT_MPFR_MATCH_ALL_ROUNDING(mpfr::Operation::Div, input, func(x, y),
+                                     0.5);
+    }
+  }
+};
+
+#define LIST_DIV_TESTS(OutType, InType, func)                                  \
+  using LlvmLibcDivTest = DivTest<OutType, InType>;                            \
+  TEST_F(LlvmLibcDivTest, SubnormalRange) { test_subnormal_range(&func); }     \
+  TEST_F(LlvmLibcDivTest, NormalRange) { test_normal_range(&func); }
+
+#endif // LLVM_LIBC_TEST_SRC_MATH_DIVTEST_H
diff --git a/libc/test/src/math/f16divf_test.cpp b/libc/test/src/math/f16divf_test.cpp
new file mode 100644
index 0000000000000..85be1ebcd55c9
--- /dev/null
+++ b/libc/test/src/math/f16divf_test.cpp
@@ -0,0 +1,13 @@
+//===-- Unittests for f16divf ---------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "DivTest.h"
+
+#include "src/math/f16divf.h"
+
+LIST_DIV_TESTS(float16, float, LIBC_NAMESPACE::f16divf)
diff --git a/libc/test/src/math/smoke/CMakeLists.txt b/libc/test/src/math/smoke/CMakeLists.txt
index a67d0437592d5..49f837d2f60e6 100644
--- a/libc/test/src/math/smoke/CMakeLists.txt
+++ b/libc/test/src/math/smoke/CMakeLists.txt
@@ -3553,6 +3553,19 @@ add_fp_unittest(
     libc.src.math.totalordermagf16
 )
 
+add_fp_unittest(
+  f16divf_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    f16divf_test.cpp
+  HDRS
+    DivTest.h
+  DEPENDS
+    libc.hdr.fenv_macros
+    libc.src.math.f16divf
+)
+
 add_fp_unittest(
   f16fmaf_test
   SUITE
diff --git a/libc/test/src/math/smoke/DivTest.h b/libc/test/src/math/smoke/DivTest.h
new file mode 100644
index 0000000000000..8cd528d1111b9
--- /dev/null
+++ b/libc/test/src/math/smoke/DivTest.h
@@ -0,0 +1,67 @@
+//===-- Utility class to test different flavors of float div --------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_TEST_SRC_MATH_SMOKE_DIVTEST_H
+#define LLVM_LIBC_TEST_SRC_MATH_SMOKE_DIVTEST_H
+
+#include "hdr/fenv_macros.h"
+#include "test/UnitTest/FEnvSafeTest.h"
+#include "test/UnitTest/FPMatcher.h"
+#include "test/UnitTest/Test.h"
+
+template <typename OutType, typename InType>
+class DivTest : public LIBC_NAMESPACE::testing::FEnvSafeTest {
+
+  DECLARE_SPECIAL_CONSTANTS(OutType)
+
+public:
+  typedef OutType (*DivFunc)(InType, InType);
+
+  void test_special_numbers(DivFunc func) {
+    EXPECT_FP_IS_NAN(func(aNaN, aNaN));
+
+    EXPECT_FP_EQ(inf, func(inf, zero));
+    EXPECT_FP_EQ(neg_inf, func(neg_inf, zero));
+    EXPECT_FP_EQ(neg_inf, func(inf, neg_zero));
+    EXPECT_FP_EQ(inf, func(neg_inf, neg_zero));
+
+    EXPECT_FP_EQ(inf, func(inf, zero));
+    EXPECT_FP_EQ(neg_inf, func(neg_inf, zero));
+    EXPECT_FP_EQ(neg_inf, func(inf, neg_zero));
+    EXPECT_FP_EQ(inf, func(neg_inf, neg_zero));
+  }
+
+  void test_division_by_zero(DivFunc func) {
+    EXPECT_FP_EQ_WITH_EXCEPTION(inf, func(InType(1.0), zero), FE_DIVBYZERO);
+    EXPECT_FP_EQ_WITH_EXCEPTION(neg_inf, func(InType(-1.0), zero),
+                                FE_DIVBYZERO);
+    EXPECT_FP_EQ_WITH_EXCEPTION(neg_inf, func(InType(1.0), neg_zero),
+                                FE_DIVBYZERO);
+    EXPECT_FP_EQ_WITH_EXCEPTION(inf, func(InType(1.0), zero), FE_DIVBYZERO);
+  }
+
+  void test_invalid_operations(DivFunc func) {
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(zero, zero), FE_INVALID);
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_zero, zero), FE_INVALID);
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(zero, neg_zero), FE_INVALID);
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_zero, neg_zero), FE_INVALID);
+
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(inf, inf), FE_INVALID);
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_inf, inf), FE_INVALID);
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(inf, neg_inf), FE_INVALID);
+    EXPECT_FP_IS_NAN_WITH_EXCEPTION(func(neg_inf, neg_inf), FE_INVALID);
+  }
+};
+
+#define LIST_DIV_TESTS(OutType, InType, func)                                  \
+  using LlvmLibcDivTest = DivTest<OutType, InType>;                            \
+  TEST_F(LlvmLibcDivTest, SpecialNumbers) { test_special_numbers(&func); }     \
+  TEST_F(LlvmLibcDivTest, DivisionByZero) { test_division_by_zero(&func); }    \
+  TEST_F(LlvmLibcDivTest, InvalidOperations) { test_invalid_operations(&func); }
+
+#endif // LLV...
[truncated]

Comment on lines 46 to 47
if (x_bits.is_nan() || y_bits.is_nan()) {
if (x_bits.is_signaling_nan() || y_bits.is_signaling_nan())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should try putting all of these ifs for special values under a single if (LIBC_UNLIKELY(...)) and running a benchmark to compare.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the change to if (LIBC_UNLIKELY(...)). It is indeed faster:

 Performance tests with inputs in denormal range:
-- My function --
     Total time      : 590356165 ns 
     Average runtime : 29.5178 ns/op 
     Ops per second  : 33877887 op/s 
-- Other function --
     Total time      : 612659035 ns 
     Average runtime : 30.6329 ns/op 
     Ops per second  : 32644617 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.963597 

 Performance tests with inputs in normal range:
-- My function --
     Total time      : 145477558 ns 
     Average runtime : 7.27387 ns/op 
     Ops per second  : 137478386 op/s 
-- Other function --
     Total time      : 172587200 ns 
     Average runtime : 8.62935 ns/op 
     Ops per second  : 115883564 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.842922 

 Performance tests with inputs in normal range with exponents close to each other:
-- My function --
     Total time      : 522964890 ns 
     Average runtime : 26.1482 ns/op 
     Ops per second  : 38243523 op/s 
-- Other function --
     Total time      : 550152260 ns 
     Average runtime : 27.5076 ns/op 
     Ops per second  : 36353608 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.950582 

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice to have perf tests to confirm improvements.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the changes to DyadicFloat::operator T(), the new implementation with fewer iterations is still faster than the previous one:

 Performance tests with inputs in denormal range:
-- My function --
     Total time      : 232472260 ns 
     Average runtime : 46.4944 ns/op 
     Ops per second  : 21507964 op/s 
-- Other function --
     Total time      : 519384676 ns 
     Average runtime : 103.877 ns/op 
     Ops per second  : 9626785 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.447592 

 Performance tests with inputs in normal range:
-- My function --
     Total time      : 511714277 ns 
     Average runtime : 102.343 ns/op 
     Ops per second  : 9771087 op/s 
-- Other function --
     Total time      : 637747828 ns 
     Average runtime : 127.549 ns/op 
     Ops per second  : 7840097 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.802377 

 Performance tests with inputs in normal range with exponents close to each other:
-- My function --
     Total time      : 309770755 ns 
     Average runtime : 61.9541 ns/op 
     Ops per second  : 16140984 op/s 
-- Other function --
     Total time      : 512884190 ns 
     Average runtime : 102.577 ns/op 
     Ops per second  : 9748799 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.603978 

Copy link
Member Author

@overmighty overmighty Jun 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance test of the current version (6439428) vs the one with more iterations (4ff508d):

 Performance tests with inputs in denormal range:
-- My function --
     Total time      : 224276882 ns 
     Average runtime : 44.8553 ns/op 
     Ops per second  : 22293893 op/s 
-- Other function --
     Total time      : 520018166 ns 
     Average runtime : 104.004 ns/op 
     Ops per second  : 9615058 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.431287 

 Performance tests with inputs in normal range:
-- My function --
     Total time      : 506187486 ns 
     Average runtime : 101.237 ns/op 
     Ops per second  : 9877772 op/s 
-- Other function --
     Total time      : 642170120 ns 
     Average runtime : 128.434 ns/op 
     Ops per second  : 7786106 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.788245 

 Performance tests with inputs in normal range with exponents close to each other:
-- My function --
     Total time      : 303552644 ns 
     Average runtime : 60.7105 ns/op 
     Ops per second  : 16471623 op/s 
-- Other function --
     Total time      : 513769699 ns 
     Average runtime : 102.754 ns/op 
     Ops per second  : 9731996 op/s 
-- Average runtime ratio --
     Mine / Other's  : 0.590834 

@overmighty
Copy link
Member Author

Rebased and resolved merge conflicts.

libc/test/src/math/smoke/DivTest.h Outdated Show resolved Hide resolved
libc/test/src/math/DivTest.h Outdated Show resolved Hide resolved
libc/src/__support/FPUtil/generic/div.h Outdated Show resolved Hide resolved
libc/src/__support/FPUtil/generic/div.h Outdated Show resolved Hide resolved
libc/src/__support/FPUtil/generic/div.h Outdated Show resolved Hide resolved
libc/src/__support/FPUtil/generic/div.h Outdated Show resolved Hide resolved
libc/src/__support/FPUtil/generic/div.h Outdated Show resolved Hide resolved
@overmighty
Copy link
Member Author

Had to rebase and force-push to get fputil::getpayload.

Restore include of "hdr/errno_macros.h".
Fix and optimize exception signaling, and fix missing include.
@lntue lntue merged commit edbe698 into llvm:main Jun 25, 2024
7 checks passed
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants