Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constant Fold Logf128 calls #84501

Merged
merged 9 commits into from
Apr 18, 2024
Merged

Constant Fold Logf128 calls #84501

merged 9 commits into from
Apr 18, 2024

Conversation

MDevereau
Copy link
Contributor

@MDevereau MDevereau commented Mar 8, 2024

This patch enables constant folding for 128 bit floating-point logf calls. This is achieved by querying if the host system has the logf128() symbol available with a CMake test. If so, replace the runtime call with the compile time value returned from logf128.

Copy link

github-actions bot commented Mar 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

✅ With the latest revision this PR passed the Python code formatter.

@llvmbot
Copy link
Member

llvmbot commented Mar 25, 2024

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-support
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-ir

Author: Matthew Devereau (MDevereau)

Changes

This patch is mainly a concept/test patch to gauge if this approach to constant folding a fp128 return from a logf call is at all feasible, or if there are any concerns which can be addressed to make this patch an easy win.

This is achieved by querying with CMake if the host system has the logf128 symbol available. If so, replace the runtime call with the compile time constant returned from logf128.

There are a few concerns with this approach:

  1. The implementation of logf128 may also yield different results on different targets, such as x86 using fp80 precision instead of the full fp128 range on other targets such as aarch64.
  2. This approach relies on unit tests, as more commonplace Clang/C tests and opt/llc/IR tests are not applicable since they are ignorant to the result of the compile time CMake check.
  3. Cross compiling the compiler and moving it to a different machine might cause issues if logf128 is no longer present.
  4. This patch requires the host to have access to __float128 enabled. To enable this for aarch64 or other targets, a patch similar to #85070 would be required.

Full diff: https://github.com/llvm/llvm-project/pull/84501.diff

10 Files Affected:

  • (modified) llvm/include/llvm/ADT/APFloat.h (+19)
  • (modified) llvm/include/llvm/ADT/APInt.h (+18)
  • (modified) llvm/include/llvm/IR/Constants.h (+4)
  • (added) llvm/include/llvm/Support/float128.h (+18)
  • (modified) llvm/lib/Analysis/CMakeLists.txt (+6)
  • (modified) llvm/lib/Analysis/ConstantFolding.cpp (+9-4)
  • (modified) llvm/lib/IR/Constants.cpp (+18)
  • (modified) llvm/lib/Support/APFloat.cpp (+30)
  • (modified) llvm/unittests/Analysis/CMakeLists.txt (+7)
  • (added) llvm/unittests/Analysis/ConstantLogf128.cpp (+69)
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 8c247bbcec90a2..bd13e23f0ea649 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -19,6 +19,7 @@
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/FloatingPointMode.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/float128.h"
 #include <memory>
 
 #define APFLOAT_DISPATCH_ON_SEMANTICS(METHOD_CALL)                             \
@@ -299,6 +300,9 @@ class IEEEFloat final : public APFloatBase {
   IEEEFloat(const fltSemantics &, integerPart);
   IEEEFloat(const fltSemantics &, uninitializedTag);
   IEEEFloat(const fltSemantics &, const APInt &);
+#ifdef __FLOAT128__
+  explicit IEEEFloat(float128 ld);
+#endif
   explicit IEEEFloat(double d);
   explicit IEEEFloat(float f);
   IEEEFloat(const IEEEFloat &);
@@ -354,6 +358,9 @@ class IEEEFloat final : public APFloatBase {
   Expected<opStatus> convertFromString(StringRef, roundingMode);
   APInt bitcastToAPInt() const;
   double convertToDouble() const;
+#ifdef __FLOAT128__
+  float128 convertToQuad() const;
+#endif
   float convertToFloat() const;
 
   /// @}
@@ -942,6 +949,9 @@ class APFloat : public APFloatBase {
   APFloat(const fltSemantics &Semantics, uninitializedTag)
       : U(Semantics, uninitialized) {}
   APFloat(const fltSemantics &Semantics, const APInt &I) : U(Semantics, I) {}
+#ifdef __FLOAT128__
+  explicit APFloat(float128 ld) : U(IEEEFloat(ld), IEEEquad()) {}
+#endif
   explicit APFloat(double d) : U(IEEEFloat(d), IEEEdouble()) {}
   explicit APFloat(float f) : U(IEEEFloat(f), IEEEsingle()) {}
   APFloat(const APFloat &RHS) = default;
@@ -1218,6 +1228,15 @@ class APFloat : public APFloatBase {
   /// shorter semantics, like IEEEsingle and others.
   double convertToDouble() const;
 
+  /// Converts this APFloat to host float value.
+  ///
+  /// \pre The APFloat must be built using semantics, that can be represented by
+  /// the host float type without loss of precision. It can be IEEEquad and
+  /// shorter semantics, like IEEEdouble and others.
+#ifdef __FLOAT128__
+  float128 convertToQuad() const;
+#endif
+
   /// Converts this APFloat to host float value.
   ///
   /// \pre The APFloat must be built using semantics, that can be represented by
diff --git a/llvm/include/llvm/ADT/APInt.h b/llvm/include/llvm/ADT/APInt.h
index 6c05367cecb1ea..b7d73107dc0831 100644
--- a/llvm/include/llvm/ADT/APInt.h
+++ b/llvm/include/llvm/ADT/APInt.h
@@ -17,6 +17,7 @@
 
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/MathExtras.h"
+#include "llvm/Support/float128.h"
 #include <cassert>
 #include <climits>
 #include <cstring>
@@ -1663,6 +1664,13 @@ class [[nodiscard]] APInt {
   /// any bit width. Exactly 64 bits will be translated.
   double bitsToDouble() const { return llvm::bit_cast<double>(getWord(0)); }
 
+#ifdef __FLOAT128__
+  float128 bitsToQuad() const {
+    __uint128_t ul = ((__uint128_t)U.pVal[1] << 64) + U.pVal[0];
+    return llvm::bit_cast<float128>(ul);
+  }
+#endif
+
   /// Converts APInt bits to a float
   ///
   /// The conversion does not do a translation from integer to float, it just
@@ -1688,6 +1696,16 @@ class [[nodiscard]] APInt {
     return APInt(sizeof(float) * CHAR_BIT, llvm::bit_cast<uint32_t>(V));
   }
 
+#ifdef __FLOAT128__
+  static APInt longDoubleToBits(float128 V) {
+    const uint64_t Words[2] = {
+        static_cast<uint64_t>(V),
+        static_cast<uint64_t>(llvm::bit_cast<__uint128_t>(V) >> 64),
+    };
+    return APInt(sizeof(float128) * CHAR_BIT, 2, Words);
+  }
+#endif
+
   /// @}
   /// \name Mathematics Operations
   /// @{
diff --git a/llvm/include/llvm/IR/Constants.h b/llvm/include/llvm/IR/Constants.h
index c0ac9a4aa6750c..e924130f66f381 100644
--- a/llvm/include/llvm/IR/Constants.h
+++ b/llvm/include/llvm/IR/Constants.h
@@ -289,6 +289,10 @@ class ConstantFP final : public ConstantData {
   /// host double and as the target format.
   static Constant *get(Type *Ty, double V);
 
+#ifdef __FLOAT128__
+  static Constant *get128(Type *Ty, float128 V);
+#endif
+
   /// If Ty is a vector type, return a Constant with a splat of the given
   /// value. Otherwise return a ConstantFP for the given value.
   static Constant *get(Type *Ty, const APFloat &V);
diff --git a/llvm/include/llvm/Support/float128.h b/llvm/include/llvm/Support/float128.h
new file mode 100644
index 00000000000000..6ff844cd7b35ab
--- /dev/null
+++ b/llvm/include/llvm/Support/float128.h
@@ -0,0 +1,18 @@
+//===-- llvm/Support/float128.h - Compiler abstraction support --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_FLOAT128
+#define LLVM_FLOAT128
+
+#if defined(__clang__) && defined(__FLOAT128__)
+typedef __float128 float128;
+#elif defined(__FLOAT128__) && (defined(__GNUC__) || defined(__GNUG__))
+typedef _Float128 float128;
+#endif
+
+#endif // LLVM_FLOAT128
diff --git a/llvm/lib/Analysis/CMakeLists.txt b/llvm/lib/Analysis/CMakeLists.txt
index 35ea03f42f82b1..4473e888afa979 100644
--- a/llvm/lib/Analysis/CMakeLists.txt
+++ b/llvm/lib/Analysis/CMakeLists.txt
@@ -161,3 +161,9 @@ add_llvm_component_library(LLVMAnalysis
   Support
   TargetParser
   )
+
+include(CheckCXXSymbolExists)
+check_cxx_symbol_exists(logf128 math.h HAS_LOGF128)
+if(HAS_LOGF128)
+ target_compile_definitions(LLVMAnalysis PRIVATE HAS_LOGF128)
+endif()
\ No newline at end of file
diff --git a/llvm/lib/Analysis/ConstantFolding.cpp b/llvm/lib/Analysis/ConstantFolding.cpp
index 8b7031e7fe4a6f..dbf565cc1cd34b 100644
--- a/llvm/lib/Analysis/ConstantFolding.cpp
+++ b/llvm/lib/Analysis/ConstantFolding.cpp
@@ -1678,9 +1678,8 @@ bool llvm::canConstantFoldCallTo(const CallBase *Call, const Function *F) {
            Name == "floor" || Name == "floorf" ||
            Name == "fmod" || Name == "fmodf";
   case 'l':
-    return Name == "log" || Name == "logf" ||
-           Name == "log2" || Name == "log2f" ||
-           Name == "log10" || Name == "log10f";
+    return Name == "log" || Name == "logf" || Name == "log2" ||
+           Name == "log2f" || Name == "log10" || Name == "log10f";
   case 'n':
     return Name == "nearbyint" || Name == "nearbyintf";
   case 'p':
@@ -2094,7 +2093,8 @@ static Constant *ConstantFoldScalarCall1(StringRef Name,
     if (IntrinsicID == Intrinsic::canonicalize)
       return constantFoldCanonicalize(Ty, Call, U);
 
-    if (!Ty->isHalfTy() && !Ty->isFloatTy() && !Ty->isDoubleTy())
+    if (!Ty->isHalfTy() && !Ty->isFloatTy() && !Ty->isDoubleTy() &&
+        !Ty->isFP128Ty())
       return nullptr;
 
     // Use internal versions of these intrinsics.
@@ -2209,6 +2209,11 @@ static Constant *ConstantFoldScalarCall1(StringRef Name,
     switch (IntrinsicID) {
       default: break;
       case Intrinsic::log:
+#if defined(__FLOAT128__) && defined(HAS_LOGF128)
+        if (Ty->isFP128Ty()) {
+          return ConstantFP::get(Ty, logf128(APF.convertToQuad()));
+        }
+#endif
         return ConstantFoldFP(log, APF, Ty);
       case Intrinsic::log2:
         // TODO: What about hosts that lack a C99 library?
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index e6b92aad392f66..69cdbcaf22b3fc 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -976,6 +976,24 @@ Constant *ConstantFP::get(Type *Ty, double V) {
   return C;
 }
 
+#ifdef __FLOAT128__
+Constant *ConstantFP::get128(Type *Ty, float128 V) {
+  LLVMContext &Context = Ty->getContext();
+
+  APFloat FV(V);
+  bool ignored;
+  FV.convert(Ty->getScalarType()->getFltSemantics(),
+             APFloat::rmNearestTiesToEven, &ignored);
+  Constant *C = get(Context, FV);
+
+  // For vectors, broadcast the value.
+  if (VectorType *VTy = dyn_cast<VectorType>(Ty))
+    return ConstantVector::getSplat(VTy->getElementCount(), C);
+
+  return C;
+}
+#endif
+
 Constant *ConstantFP::get(Type *Ty, const APFloat &V) {
   ConstantFP *C = get(Ty->getContext(), V);
   assert(C->getType() == Ty->getScalarType() &&
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index 0a4f5ac01553f1..6c58a36408c8b7 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -3670,6 +3670,15 @@ double IEEEFloat::convertToDouble() const {
   return api.bitsToDouble();
 }
 
+#ifdef __FLOAT128__
+float128 IEEEFloat::convertToQuad() const {
+  assert(semantics == (const llvm::fltSemantics *)&semIEEEquad &&
+         "Float semantics are not IEEEquads");
+  APInt api = bitcastToAPInt();
+  return api.bitsToQuad();
+}
+#endif
+
 /// Integer bit is explicit in this format.  Intel hardware (387 and later)
 /// does not support these bit patterns:
 ///  exponent = all 1's, integer bit 0, significand 0 ("pseudoinfinity")
@@ -3958,6 +3967,12 @@ IEEEFloat::IEEEFloat(double d) {
   initFromAPInt(&semIEEEdouble, APInt::doubleToBits(d));
 }
 
+#ifdef __FLOAT128__
+IEEEFloat::IEEEFloat(float128 ld) {
+  initFromAPInt(&semIEEEquad, APInt::longDoubleToBits(ld));
+}
+#endif
+
 namespace {
   void append(SmallVectorImpl<char> &Buffer, StringRef Str) {
     Buffer.append(Str.begin(), Str.end());
@@ -5265,6 +5280,21 @@ double APFloat::convertToDouble() const {
   return Temp.getIEEE().convertToDouble();
 }
 
+#ifdef __FLOAT128__
+float128 APFloat::convertToQuad() const {
+  if (&getSemantics() == (const llvm::fltSemantics *)&semIEEEquad)
+    return getIEEE().convertToQuad();
+  assert(getSemantics().isRepresentableBy(semIEEEquad) &&
+         "Float semantics is not representable by IEEEquad");
+  APFloat Temp = *this;
+  bool LosesInfo;
+  opStatus St = Temp.convert(semIEEEquad, rmNearestTiesToEven, &LosesInfo);
+  assert(!(St & opInexact) && !LosesInfo && "Unexpected imprecision");
+  (void)St;
+  return Temp.getIEEE().convertToQuad();
+}
+#endif
+
 float APFloat::convertToFloat() const {
   if (&getSemantics() == (const llvm::fltSemantics *)&semIEEEsingle)
     return getIEEE().convertToFloat();
diff --git a/llvm/unittests/Analysis/CMakeLists.txt b/llvm/unittests/Analysis/CMakeLists.txt
index b1aeaa6e71fd4c..796a31cc216812 100644
--- a/llvm/unittests/Analysis/CMakeLists.txt
+++ b/llvm/unittests/Analysis/CMakeLists.txt
@@ -51,6 +51,7 @@ set(ANALYSIS_TEST_SOURCES
   ValueLatticeTest.cpp
   ValueTrackingTest.cpp
   VectorUtilsTest.cpp
+  ConstantLogf128.cpp
   )
 
 set(MLGO_TESTS TFUtilsTest.cpp)
@@ -80,5 +81,11 @@ if(NOT WIN32)
   export_executable_symbols_for_plugins(AnalysisTests)
 endif()
 
+include(CheckCXXSymbolExists)
+check_cxx_symbol_exists(logf128 math.h HAS_LOGF128)
+if(HAS_LOGF128)
+  target_compile_definitions(AnalysisTests PRIVATE HAS_LOGF128)
+endif()
+
 add_subdirectory(InlineAdvisorPlugin)
 add_subdirectory(InlineOrderPlugin)
diff --git a/llvm/unittests/Analysis/ConstantLogf128.cpp b/llvm/unittests/Analysis/ConstantLogf128.cpp
new file mode 100644
index 00000000000000..1be7e9b4ab9c49
--- /dev/null
+++ b/llvm/unittests/Analysis/ConstantLogf128.cpp
@@ -0,0 +1,69 @@
+//===- unittests/CodeGen/BufferSourceTest.cpp - MemoryBuffer source tests -===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/CodeGen/GlobalISel/CallLowering.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstrTypes.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+
+namespace {
+
+TEST(ConstantFoldLogf128Fixture, ConstantFoldLogf128) {
+#ifdef __FLOAT128__
+  LLVMContext Context;
+  IRBuilder<> Builder(Context);
+  Module MainModule("Logf128TestModule", Context);
+  MainModule.setTargetTriple("aarch64-unknown-linux");
+
+  Type *FP128Ty = Type::getFP128Ty(Context);
+  FunctionType *FP128Prototype = FunctionType::get(FP128Ty, false);
+  Function *Logf128TestFunction = Function::Create(
+      FP128Prototype, Function::ExternalLinkage, "logf128test", MainModule);
+  BasicBlock *EntryBlock =
+      BasicBlock::Create(Context, "entry", Logf128TestFunction);
+  Builder.SetInsertPoint(EntryBlock);
+
+  FunctionType *FP128FP128Prototype =
+      FunctionType::get(FP128Ty, {FP128Ty}, false);
+  Constant *Constant2L = ConstantFP::get128(FP128Ty, 2.0L);
+  Function *Logf128 =
+      Function::Create(FP128FP128Prototype, Function::ExternalLinkage,
+                       "llvm.log.f128", MainModule);
+  CallInst *Logf128Call = Builder.CreateCall(Logf128, Constant2L);
+
+  TargetLibraryInfoImpl TLII(Triple(MainModule.getTargetTriple()));
+  TargetLibraryInfo TLI(TLII, Logf128TestFunction);
+  Constant *FoldResult =
+      ConstantFoldCall(Logf128Call, Logf128, Constant2L, &TLI);
+
+#ifndef HAS_LOGF128
+  ASSERT_TRUE(FoldResult == nullptr);
+#else
+  auto ConstantLog = dyn_cast<ConstantFP>(FoldResult);
+  ASSERT_TRUE(ConstantLog);
+
+  APFloat APF = ConstantLog->getValueAPF();
+  char LongDoubleHexString[0xFF];
+  unsigned Size =
+      APF.convertToHexString(LongDoubleHexString, 32, true,
+                             APFloatBase::roundingMode::NearestTiesToAway);
+  EXPECT_GT(Size, 0U);
+
+  ASSERT_STREQ(LongDoubleHexString,
+               std::string("0X1.62E42FEFA39EF000000000000000000P-1").c_str());
+#endif // HAS_LOGF128
+#else  // __FLOAT128__
+  ASSERT_TRUE(true);
+#endif
+}
+
+} // namespace
\ No newline at end of file

@llvmbot
Copy link
Member

llvmbot commented Mar 25, 2024

@llvm/pr-subscribers-llvm-adt

Author: Matthew Devereau (MDevereau)

Changes

This patch is mainly a concept/test patch to gauge if this approach to constant folding a fp128 return from a logf call is at all feasible, or if there are any concerns which can be addressed to make this patch an easy win.

This is achieved by querying with CMake if the host system has the logf128 symbol available. If so, replace the runtime call with the compile time constant returned from logf128.

There are a few concerns with this approach:

  1. The implementation of logf128 may also yield different results on different targets, such as x86 using fp80 precision instead of the full fp128 range on other targets such as aarch64.
  2. This approach relies on unit tests, as more commonplace Clang/C tests and opt/llc/IR tests are not applicable since they are ignorant to the result of the compile time CMake check.
  3. Cross compiling the compiler and moving it to a different machine might cause issues if logf128 is no longer present.
  4. This patch requires the host to have access to __float128 enabled. To enable this for aarch64 or other targets, a patch similar to #85070 would be required.

Full diff: https://github.com/llvm/llvm-project/pull/84501.diff

10 Files Affected:

  • (modified) llvm/include/llvm/ADT/APFloat.h (+19)
  • (modified) llvm/include/llvm/ADT/APInt.h (+18)
  • (modified) llvm/include/llvm/IR/Constants.h (+4)
  • (added) llvm/include/llvm/Support/float128.h (+18)
  • (modified) llvm/lib/Analysis/CMakeLists.txt (+6)
  • (modified) llvm/lib/Analysis/ConstantFolding.cpp (+9-4)
  • (modified) llvm/lib/IR/Constants.cpp (+18)
  • (modified) llvm/lib/Support/APFloat.cpp (+30)
  • (modified) llvm/unittests/Analysis/CMakeLists.txt (+7)
  • (added) llvm/unittests/Analysis/ConstantLogf128.cpp (+69)
diff --git a/llvm/include/llvm/ADT/APFloat.h b/llvm/include/llvm/ADT/APFloat.h
index 8c247bbcec90a2..bd13e23f0ea649 100644
--- a/llvm/include/llvm/ADT/APFloat.h
+++ b/llvm/include/llvm/ADT/APFloat.h
@@ -19,6 +19,7 @@
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/FloatingPointMode.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/float128.h"
 #include <memory>
 
 #define APFLOAT_DISPATCH_ON_SEMANTICS(METHOD_CALL)                             \
@@ -299,6 +300,9 @@ class IEEEFloat final : public APFloatBase {
   IEEEFloat(const fltSemantics &, integerPart);
   IEEEFloat(const fltSemantics &, uninitializedTag);
   IEEEFloat(const fltSemantics &, const APInt &);
+#ifdef __FLOAT128__
+  explicit IEEEFloat(float128 ld);
+#endif
   explicit IEEEFloat(double d);
   explicit IEEEFloat(float f);
   IEEEFloat(const IEEEFloat &);
@@ -354,6 +358,9 @@ class IEEEFloat final : public APFloatBase {
   Expected<opStatus> convertFromString(StringRef, roundingMode);
   APInt bitcastToAPInt() const;
   double convertToDouble() const;
+#ifdef __FLOAT128__
+  float128 convertToQuad() const;
+#endif
   float convertToFloat() const;
 
   /// @}
@@ -942,6 +949,9 @@ class APFloat : public APFloatBase {
   APFloat(const fltSemantics &Semantics, uninitializedTag)
       : U(Semantics, uninitialized) {}
   APFloat(const fltSemantics &Semantics, const APInt &I) : U(Semantics, I) {}
+#ifdef __FLOAT128__
+  explicit APFloat(float128 ld) : U(IEEEFloat(ld), IEEEquad()) {}
+#endif
   explicit APFloat(double d) : U(IEEEFloat(d), IEEEdouble()) {}
   explicit APFloat(float f) : U(IEEEFloat(f), IEEEsingle()) {}
   APFloat(const APFloat &RHS) = default;
@@ -1218,6 +1228,15 @@ class APFloat : public APFloatBase {
   /// shorter semantics, like IEEEsingle and others.
   double convertToDouble() const;
 
+  /// Converts this APFloat to host float value.
+  ///
+  /// \pre The APFloat must be built using semantics, that can be represented by
+  /// the host float type without loss of precision. It can be IEEEquad and
+  /// shorter semantics, like IEEEdouble and others.
+#ifdef __FLOAT128__
+  float128 convertToQuad() const;
+#endif
+
   /// Converts this APFloat to host float value.
   ///
   /// \pre The APFloat must be built using semantics, that can be represented by
diff --git a/llvm/include/llvm/ADT/APInt.h b/llvm/include/llvm/ADT/APInt.h
index 6c05367cecb1ea..b7d73107dc0831 100644
--- a/llvm/include/llvm/ADT/APInt.h
+++ b/llvm/include/llvm/ADT/APInt.h
@@ -17,6 +17,7 @@
 
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/MathExtras.h"
+#include "llvm/Support/float128.h"
 #include <cassert>
 #include <climits>
 #include <cstring>
@@ -1663,6 +1664,13 @@ class [[nodiscard]] APInt {
   /// any bit width. Exactly 64 bits will be translated.
   double bitsToDouble() const { return llvm::bit_cast<double>(getWord(0)); }
 
+#ifdef __FLOAT128__
+  float128 bitsToQuad() const {
+    __uint128_t ul = ((__uint128_t)U.pVal[1] << 64) + U.pVal[0];
+    return llvm::bit_cast<float128>(ul);
+  }
+#endif
+
   /// Converts APInt bits to a float
   ///
   /// The conversion does not do a translation from integer to float, it just
@@ -1688,6 +1696,16 @@ class [[nodiscard]] APInt {
     return APInt(sizeof(float) * CHAR_BIT, llvm::bit_cast<uint32_t>(V));
   }
 
+#ifdef __FLOAT128__
+  static APInt longDoubleToBits(float128 V) {
+    const uint64_t Words[2] = {
+        static_cast<uint64_t>(V),
+        static_cast<uint64_t>(llvm::bit_cast<__uint128_t>(V) >> 64),
+    };
+    return APInt(sizeof(float128) * CHAR_BIT, 2, Words);
+  }
+#endif
+
   /// @}
   /// \name Mathematics Operations
   /// @{
diff --git a/llvm/include/llvm/IR/Constants.h b/llvm/include/llvm/IR/Constants.h
index c0ac9a4aa6750c..e924130f66f381 100644
--- a/llvm/include/llvm/IR/Constants.h
+++ b/llvm/include/llvm/IR/Constants.h
@@ -289,6 +289,10 @@ class ConstantFP final : public ConstantData {
   /// host double and as the target format.
   static Constant *get(Type *Ty, double V);
 
+#ifdef __FLOAT128__
+  static Constant *get128(Type *Ty, float128 V);
+#endif
+
   /// If Ty is a vector type, return a Constant with a splat of the given
   /// value. Otherwise return a ConstantFP for the given value.
   static Constant *get(Type *Ty, const APFloat &V);
diff --git a/llvm/include/llvm/Support/float128.h b/llvm/include/llvm/Support/float128.h
new file mode 100644
index 00000000000000..6ff844cd7b35ab
--- /dev/null
+++ b/llvm/include/llvm/Support/float128.h
@@ -0,0 +1,18 @@
+//===-- llvm/Support/float128.h - Compiler abstraction support --*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_FLOAT128
+#define LLVM_FLOAT128
+
+#if defined(__clang__) && defined(__FLOAT128__)
+typedef __float128 float128;
+#elif defined(__FLOAT128__) && (defined(__GNUC__) || defined(__GNUG__))
+typedef _Float128 float128;
+#endif
+
+#endif // LLVM_FLOAT128
diff --git a/llvm/lib/Analysis/CMakeLists.txt b/llvm/lib/Analysis/CMakeLists.txt
index 35ea03f42f82b1..4473e888afa979 100644
--- a/llvm/lib/Analysis/CMakeLists.txt
+++ b/llvm/lib/Analysis/CMakeLists.txt
@@ -161,3 +161,9 @@ add_llvm_component_library(LLVMAnalysis
   Support
   TargetParser
   )
+
+include(CheckCXXSymbolExists)
+check_cxx_symbol_exists(logf128 math.h HAS_LOGF128)
+if(HAS_LOGF128)
+ target_compile_definitions(LLVMAnalysis PRIVATE HAS_LOGF128)
+endif()
\ No newline at end of file
diff --git a/llvm/lib/Analysis/ConstantFolding.cpp b/llvm/lib/Analysis/ConstantFolding.cpp
index 8b7031e7fe4a6f..dbf565cc1cd34b 100644
--- a/llvm/lib/Analysis/ConstantFolding.cpp
+++ b/llvm/lib/Analysis/ConstantFolding.cpp
@@ -1678,9 +1678,8 @@ bool llvm::canConstantFoldCallTo(const CallBase *Call, const Function *F) {
            Name == "floor" || Name == "floorf" ||
            Name == "fmod" || Name == "fmodf";
   case 'l':
-    return Name == "log" || Name == "logf" ||
-           Name == "log2" || Name == "log2f" ||
-           Name == "log10" || Name == "log10f";
+    return Name == "log" || Name == "logf" || Name == "log2" ||
+           Name == "log2f" || Name == "log10" || Name == "log10f";
   case 'n':
     return Name == "nearbyint" || Name == "nearbyintf";
   case 'p':
@@ -2094,7 +2093,8 @@ static Constant *ConstantFoldScalarCall1(StringRef Name,
     if (IntrinsicID == Intrinsic::canonicalize)
       return constantFoldCanonicalize(Ty, Call, U);
 
-    if (!Ty->isHalfTy() && !Ty->isFloatTy() && !Ty->isDoubleTy())
+    if (!Ty->isHalfTy() && !Ty->isFloatTy() && !Ty->isDoubleTy() &&
+        !Ty->isFP128Ty())
       return nullptr;
 
     // Use internal versions of these intrinsics.
@@ -2209,6 +2209,11 @@ static Constant *ConstantFoldScalarCall1(StringRef Name,
     switch (IntrinsicID) {
       default: break;
       case Intrinsic::log:
+#if defined(__FLOAT128__) && defined(HAS_LOGF128)
+        if (Ty->isFP128Ty()) {
+          return ConstantFP::get(Ty, logf128(APF.convertToQuad()));
+        }
+#endif
         return ConstantFoldFP(log, APF, Ty);
       case Intrinsic::log2:
         // TODO: What about hosts that lack a C99 library?
diff --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index e6b92aad392f66..69cdbcaf22b3fc 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -976,6 +976,24 @@ Constant *ConstantFP::get(Type *Ty, double V) {
   return C;
 }
 
+#ifdef __FLOAT128__
+Constant *ConstantFP::get128(Type *Ty, float128 V) {
+  LLVMContext &Context = Ty->getContext();
+
+  APFloat FV(V);
+  bool ignored;
+  FV.convert(Ty->getScalarType()->getFltSemantics(),
+             APFloat::rmNearestTiesToEven, &ignored);
+  Constant *C = get(Context, FV);
+
+  // For vectors, broadcast the value.
+  if (VectorType *VTy = dyn_cast<VectorType>(Ty))
+    return ConstantVector::getSplat(VTy->getElementCount(), C);
+
+  return C;
+}
+#endif
+
 Constant *ConstantFP::get(Type *Ty, const APFloat &V) {
   ConstantFP *C = get(Ty->getContext(), V);
   assert(C->getType() == Ty->getScalarType() &&
diff --git a/llvm/lib/Support/APFloat.cpp b/llvm/lib/Support/APFloat.cpp
index 0a4f5ac01553f1..6c58a36408c8b7 100644
--- a/llvm/lib/Support/APFloat.cpp
+++ b/llvm/lib/Support/APFloat.cpp
@@ -3670,6 +3670,15 @@ double IEEEFloat::convertToDouble() const {
   return api.bitsToDouble();
 }
 
+#ifdef __FLOAT128__
+float128 IEEEFloat::convertToQuad() const {
+  assert(semantics == (const llvm::fltSemantics *)&semIEEEquad &&
+         "Float semantics are not IEEEquads");
+  APInt api = bitcastToAPInt();
+  return api.bitsToQuad();
+}
+#endif
+
 /// Integer bit is explicit in this format.  Intel hardware (387 and later)
 /// does not support these bit patterns:
 ///  exponent = all 1's, integer bit 0, significand 0 ("pseudoinfinity")
@@ -3958,6 +3967,12 @@ IEEEFloat::IEEEFloat(double d) {
   initFromAPInt(&semIEEEdouble, APInt::doubleToBits(d));
 }
 
+#ifdef __FLOAT128__
+IEEEFloat::IEEEFloat(float128 ld) {
+  initFromAPInt(&semIEEEquad, APInt::longDoubleToBits(ld));
+}
+#endif
+
 namespace {
   void append(SmallVectorImpl<char> &Buffer, StringRef Str) {
     Buffer.append(Str.begin(), Str.end());
@@ -5265,6 +5280,21 @@ double APFloat::convertToDouble() const {
   return Temp.getIEEE().convertToDouble();
 }
 
+#ifdef __FLOAT128__
+float128 APFloat::convertToQuad() const {
+  if (&getSemantics() == (const llvm::fltSemantics *)&semIEEEquad)
+    return getIEEE().convertToQuad();
+  assert(getSemantics().isRepresentableBy(semIEEEquad) &&
+         "Float semantics is not representable by IEEEquad");
+  APFloat Temp = *this;
+  bool LosesInfo;
+  opStatus St = Temp.convert(semIEEEquad, rmNearestTiesToEven, &LosesInfo);
+  assert(!(St & opInexact) && !LosesInfo && "Unexpected imprecision");
+  (void)St;
+  return Temp.getIEEE().convertToQuad();
+}
+#endif
+
 float APFloat::convertToFloat() const {
   if (&getSemantics() == (const llvm::fltSemantics *)&semIEEEsingle)
     return getIEEE().convertToFloat();
diff --git a/llvm/unittests/Analysis/CMakeLists.txt b/llvm/unittests/Analysis/CMakeLists.txt
index b1aeaa6e71fd4c..796a31cc216812 100644
--- a/llvm/unittests/Analysis/CMakeLists.txt
+++ b/llvm/unittests/Analysis/CMakeLists.txt
@@ -51,6 +51,7 @@ set(ANALYSIS_TEST_SOURCES
   ValueLatticeTest.cpp
   ValueTrackingTest.cpp
   VectorUtilsTest.cpp
+  ConstantLogf128.cpp
   )
 
 set(MLGO_TESTS TFUtilsTest.cpp)
@@ -80,5 +81,11 @@ if(NOT WIN32)
   export_executable_symbols_for_plugins(AnalysisTests)
 endif()
 
+include(CheckCXXSymbolExists)
+check_cxx_symbol_exists(logf128 math.h HAS_LOGF128)
+if(HAS_LOGF128)
+  target_compile_definitions(AnalysisTests PRIVATE HAS_LOGF128)
+endif()
+
 add_subdirectory(InlineAdvisorPlugin)
 add_subdirectory(InlineOrderPlugin)
diff --git a/llvm/unittests/Analysis/ConstantLogf128.cpp b/llvm/unittests/Analysis/ConstantLogf128.cpp
new file mode 100644
index 00000000000000..1be7e9b4ab9c49
--- /dev/null
+++ b/llvm/unittests/Analysis/ConstantLogf128.cpp
@@ -0,0 +1,69 @@
+//===- unittests/CodeGen/BufferSourceTest.cpp - MemoryBuffer source tests -===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/CodeGen/GlobalISel/CallLowering.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstrTypes.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+
+namespace {
+
+TEST(ConstantFoldLogf128Fixture, ConstantFoldLogf128) {
+#ifdef __FLOAT128__
+  LLVMContext Context;
+  IRBuilder<> Builder(Context);
+  Module MainModule("Logf128TestModule", Context);
+  MainModule.setTargetTriple("aarch64-unknown-linux");
+
+  Type *FP128Ty = Type::getFP128Ty(Context);
+  FunctionType *FP128Prototype = FunctionType::get(FP128Ty, false);
+  Function *Logf128TestFunction = Function::Create(
+      FP128Prototype, Function::ExternalLinkage, "logf128test", MainModule);
+  BasicBlock *EntryBlock =
+      BasicBlock::Create(Context, "entry", Logf128TestFunction);
+  Builder.SetInsertPoint(EntryBlock);
+
+  FunctionType *FP128FP128Prototype =
+      FunctionType::get(FP128Ty, {FP128Ty}, false);
+  Constant *Constant2L = ConstantFP::get128(FP128Ty, 2.0L);
+  Function *Logf128 =
+      Function::Create(FP128FP128Prototype, Function::ExternalLinkage,
+                       "llvm.log.f128", MainModule);
+  CallInst *Logf128Call = Builder.CreateCall(Logf128, Constant2L);
+
+  TargetLibraryInfoImpl TLII(Triple(MainModule.getTargetTriple()));
+  TargetLibraryInfo TLI(TLII, Logf128TestFunction);
+  Constant *FoldResult =
+      ConstantFoldCall(Logf128Call, Logf128, Constant2L, &TLI);
+
+#ifndef HAS_LOGF128
+  ASSERT_TRUE(FoldResult == nullptr);
+#else
+  auto ConstantLog = dyn_cast<ConstantFP>(FoldResult);
+  ASSERT_TRUE(ConstantLog);
+
+  APFloat APF = ConstantLog->getValueAPF();
+  char LongDoubleHexString[0xFF];
+  unsigned Size =
+      APF.convertToHexString(LongDoubleHexString, 32, true,
+                             APFloatBase::roundingMode::NearestTiesToAway);
+  EXPECT_GT(Size, 0U);
+
+  ASSERT_STREQ(LongDoubleHexString,
+               std::string("0X1.62E42FEFA39EF000000000000000000P-1").c_str());
+#endif // HAS_LOGF128
+#else  // __FLOAT128__
+  ASSERT_TRUE(true);
+#endif
+}
+
+} // namespace
\ No newline at end of file

@MDevereau MDevereau changed the title WIP/POC: Constant Fold Logf128 calls Constant Fold Logf128 calls Mar 25, 2024
@MDevereau
Copy link
Contributor Author

To give some context as to why this is desirable, there's been instances where calls to long double have been generated when they should be constant folded. As they are 128-bit floats on aarch64 these calls end up being particularly slow compared to targets where the long double type is 64 bits in length.

The mersenne twister in libstdc++ contains this code:

const long double __r = static_cast<long double>(__urng.max())        
                      - static_cast<long double>(__urng.min()) + 1.0L;
const size_t __log2r = std::log(__r) / std::log(2.0L);  

GCC will constant fold both calls to logl here but clang will keep the calls. The proposed approach in the pull request is in thoery an easy-win which other systems could also benefit from.

@efriedma-quic
Copy link
Collaborator

The primary reason we haven't continued to expand float constant-folding is that we don't really want to use the host's libm for target calculations. Ideally, we want the compiler to be portable: we want to consistently produce the same results. So we want to use our own library routines.

We have ConstantFoldFP which already uses the host libc, but we don't really want to expand that usage. gcc uses MFPR, but LLVM can't due to licensing issues. Maybe we can borrow code from llvm-libc?

@efriedma-quic
Copy link
Collaborator

This approach relies on unit tests, as more commonplace Clang/C tests and opt/llc/IR tests are not applicable since they are ignorant to the result of the compile time CMake check.

You can write code in lit.cfg.py to add a feature to config.available_features, then check for it using a REQUIRES clause.

if (Ty->isFP128Ty()) {
return ConstantFP::get(Ty, logf128(APF.convertToQuad()));
}
#endif
return ConstantFoldFP(log, APF, Ty);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This falls through to using the double-precision log() for long double values on hosts which don't have float128; that seems wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not already what was happening previously? Inside ConstantFoldFP the result is handled as a double. I'm not sure if you're asking me to start preserving the long double type in this pre-existing code or whether what I've introduced has changed established behaviour

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this patch, we wouldn't try to fold fp128 at all. With this patch, we try to fold it using "double" routines. (Previously, we would only fold 64-bit or 32-bit floats.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see what you mean now, thanks for clarifying. I've rearranged this to stop any fall-through happening

#define LLVM_FLOAT128

#if defined(__clang__) && defined(__FLOAT128__)
typedef __float128 float128;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespace llvm?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@@ -0,0 +1,69 @@
//===- unittests/CodeGen/BufferSourceTest.cpp - MemoryBuffer source tests -===//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy paste error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. The file has been removed now though.


namespace {

TEST(ConstantFoldLogf128Fixture, ConstantFoldLogf128) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easier to write InstSimplify tests for this. Also should test all the edge case special values

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this file and moved/added new tests to InstSimplify. Hopefully that should prove easier to test.

@@ -289,6 +289,10 @@ class ConstantFP final : public ConstantData {
/// host double and as the target format.
static Constant *get(Type *Ty, double V);

#ifdef __FLOAT128__
static Constant *get128(Type *Ty, float128 V);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use the APFloat version for float128?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I've removed this function now.

@@ -1688,6 +1696,16 @@ class [[nodiscard]] APInt {
return APInt(sizeof(float) * CHAR_BIT, llvm::bit_cast<uint32_t>(V));
}

#ifdef __FLOAT128__
static APInt longDoubleToBits(float128 V) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think APInt/APFloat should be exposing APIs that are only conditionally available. That kind of defeats the point of using them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I can't think of a better way to do this currently since logf128 isn't available on all targets but is the only type that guarantees a floating-point type of 128 bits.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why do you need to directly deal with float128? Just keep everything as APFloat in the compiler

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The libc logf128() function takes a float128 argument, and returns a float128 result. If we're using it, we need to convert between float128 and APFloat. (If APFloat::log() existed, this patch would be much simpler... but it doesn't, and it's difficult to write.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's correct to remove this (and have done so) since APFloat already has internal representation of float128's. I'm not sure about the convertToQuad function I've added though, since that is necessary to extract the float128 information from APFloat in order to pass it to the C logf128 function.

@jcranmer-intel
Copy link
Contributor

The primary reason we haven't continued to expand float constant-folding is that we don't really want to use the host's libm for target calculations. Ideally, we want the compiler to be portable: we want to consistently produce the same results. So we want to use our own library routines.

We have ConstantFoldFP which already uses the host libc, but we don't really want to expand that usage. gcc uses MFPR, but LLVM can't due to licensing issues. Maybe we can borrow code from llvm-libc?

I whole-heartedly agree with this sentiment. llvm-libc is working on getting correctly-rounded versions of all the math functions, but the only one implemented for all the float sizes is sqrt, and otherwise they're largely implemented only for float at the moment.

When we talked at the last LLVM FP working group meeting, llvm-libc people were receptive to the idea of reusing code for APFloat, although I would probably want to wait to see how things work out for reusing libc code in libc++ first.

@llvmbot llvmbot added cmake Build system in general and CMake in particular llvm:transforms labels Mar 28, 2024
@MDevereau
Copy link
Contributor Author

You can write code in lit.cfg.py to add a feature to config.available_features, then check for it using a REQUIRES clause.

I've implemented this and removed the unit tests. Instead I'm now relying on InstSimplify tests.

;
%A = call fp128 @llvm.log.f128(fp128 noundef 0xL0000000000000000C000000000000000)
ret fp128 %A
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test inf/nan/zero

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

;
%A = call fp128 @llvm.log.f128(fp128 noundef 0xL00000000000000007FFF000000000001)
ret fp128 %A
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test vectors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean, sorry. I initially thought you meant the log function can take a vector type parameter, but after checking https://llvm.org/docs/LangRef.html#llvm-log-intrinsic I can't see one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define <2 x fp128> @log_e_negative_2(){
  %A = call <2 x fp128> @llvm.log.v2f128(<2 x fp128> <fp128 0xL0000000000000000C000000000000000, fp128 0xL0000000000000000C000000000000001>)
  ret <2 x fp128> %A
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've added that test now.

ret fp128 %A
}

define fp128 @log_e_negative_0(){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space after define in all of these

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@@ -299,6 +300,9 @@ class IEEEFloat final : public APFloatBase {
IEEEFloat(const fltSemantics &, integerPart);
IEEEFloat(const fltSemantics &, uninitializedTag);
IEEEFloat(const fltSemantics &, const APInt &);
#ifdef __FLOAT128__
explicit IEEEFloat(float128 ld);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't expose non-portable APFloat APIs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@@ -1688,6 +1696,16 @@ class [[nodiscard]] APInt {
return APInt(sizeof(float) * CHAR_BIT, llvm::bit_cast<uint32_t>(V));
}

#ifdef __FLOAT128__
static APInt longDoubleToBits(float128 V) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why do you need to directly deal with float128? Just keep everything as APFloat in the compiler

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the unnecessary target requires

; RUN: opt < %s -passes=instsimplify -S | FileCheck %s

; REQUIRES: has_logf128
; REQUIRES: aarch64-registered-target
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't require aarch64

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. The initial reasoning behind this was that I was unsure if it was safe to use these tests for long double on multiple targets, like on x86 where the width may be 80 bits instead of the full 128 bits like on AArch64.

@efriedma-quic
Copy link
Collaborator

Please fix commit message; I think you've addressed most of the "concerns".

I'm still not happy with the dependency on host libc, given the increasing pressure to have constant-folding functionality reliably available. See also https://discourse.llvm.org/t/rfc-project-hand-in-hand-llvm-libc-libc-code-sharing/77701/18 .

Other than that, looks fine.

@MDevereau MDevereau changed the title Constant Fold Logf128 calls [clang] Constant Fold Logf128 calls Apr 10, 2024
@MDevereau MDevereau changed the title [clang] Constant Fold Logf128 calls Constant Fold Logf128 calls Apr 10, 2024
@MDevereau
Copy link
Contributor Author

@efriedma-quic I've changed the commit message to be brief and removed any WIP-like language

check_cxx_symbol_exists(logf128 math.h HAS_LOGF128)
if(HAS_LOGF128)
target_compile_definitions(LLVMAnalysis PRIVATE HAS_LOGF128)
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub is unhappy about the end of file here

This is a proof of concept/work in progress patch.

This patch enables ConstantFolding of log FP128 calls.

This is achieved by querying with CMake if the host
system has the logf128 symbol available. If so, replace
the runtime call with the compile time constant returned
from logf128.

This approach could be considered controversial as cross-compiled
llvm executables using shared objects may not have the logf128 symbol
available at runtime.

The implementation of logf128 may also yield different results
on different targets, such as x86 using fp80 precision instead
of the full fp128 range on other targets.

This approach relies on unit tests, as more commonplace Clang/C
tests and opt/llc/IR tests are not applicable since they
are ignorant to the result of the compile time CMake check.
Use __float128 for clang and _Float128 for gcc.
Default to long double for other compilers.
@MDevereau MDevereau merged commit e90bc9c into llvm:main Apr 18, 2024
4 checks passed
MDevereau added a commit that referenced this pull request Apr 18, 2024
@MDevereau MDevereau deleted the logf128 branch April 30, 2024 13:45
MDevereau added a commit to MDevereau/llvm-project that referenced this pull request Apr 30, 2024
This is a second attempt to land llvm#84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is available
and checking if the host does not use __ibm128 which is prevalent on power pc
targets and replaces IEEE754 float128s.
MDevereau added a commit that referenced this pull request May 1, 2024
This is a second attempt to land #84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is available
and checking if the host does not use __ibm128 which is prevalent on power pc
targets and replaces IEEE754 float128s.
MDevereau added a commit to MDevereau/llvm-project that referenced this pull request May 1, 2024
This is a second attempt to land llvm#84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is available
and checking if the host does not use __ibm128 which is prevalent on power pc
targets and replaces IEEE754 float128s.
MDevereau added a commit to MDevereau/llvm-project that referenced this pull request May 1, 2024
This is a second attempt to land llvm#84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is available
and checking if the host does not use __ibm128 which is prevalent on power pc
targets and replaces IEEE754 float128s.
MDevereau added a commit that referenced this pull request May 29, 2024
This is a second attempt to land #84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is
available and checking if the host does not use __ibm128 which is
prevalent on power pc targets and replaces IEEE754 float128s.
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
This is a second attempt to land llvm#84501 which failed on several targets.

This patch adds the HAS_IEE754_FLOAT128 define which makes the check for
typedef'ing float128 more precise by checking whether __uint128_t is
available and checking if the host does not use __ibm128 which is
prevalent on power pc targets and replaces IEEE754 float128s.
@nickdesaulniers
Copy link
Member

@efriedma-quic could we do something like this to solve #90685 (comment) (if the target matched the host)?

@efriedma-quic
Copy link
Collaborator

@davemgreen
Copy link
Collaborator

That sounds like the same issue we were trying to solve (depending on what logl means on the given target). There is a commit in #96287 and #104929 that got so very close, in that I believe it was working on all the buildbots. This can cause a really surprising perf swing on AArch64 where soft-float routines get called, so it would be good if we can get something fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants