Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IR] Allow MDString in operand bundles #110805

Merged
merged 4 commits into from
Oct 11, 2024
Merged

Conversation

spavloff
Copy link
Collaborator

@spavloff spavloff commented Oct 2, 2024

This change implements support of metadata strings in operand bundle values. It makes possible calls like:

call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]

It requires some extension of the bitcode serialization. As SSA values and metadata are stored in different tables, there must be a way to distinguish them during deserialization. It is implemented by putting a special marker before the metadata index. The marker cannot be treated as a reference to any SSA value, so it unambiguously identifies metadata. It allows extending the bitcode serialization without breaking compatibility.

Metadata as operand bundle values are intended to be used in floating-point function calls. They would represent the same information as now is passed by the constrained intrinsic arguments.

@llvmbot
Copy link
Member

llvmbot commented Oct 2, 2024

@llvm/pr-subscribers-llvm-ir

Author: Serge Pavlov (spavloff)

Changes

This change implements support of metadata strings in operand bundle values. It makes possible calls like:

call void @<!-- -->some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]

It requires some extension of the bitcode serialization. As SSA values and metadata are stored in different tables, there must be a way to distinguish them during deserialization. It is implemented by putting a special marker before the metadata index. The marker cannot be treated as a reference to any SSA value, so it unambiguously identifies metadata. It allows extending the bitcode serialization without breaking compatibility.

Metadata as operand bundle values are intended to be used in floating-point function calls. They would represent the same information as now is passed by the constrained intrinsic arguments.


Full diff: https://github.com/llvm/llvm-project/pull/110805.diff

8 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+3-3)
  • (modified) llvm/docs/ReleaseNotes.md (+2)
  • (modified) llvm/include/llvm/AsmParser/LLParser.h (+1)
  • (modified) llvm/include/llvm/Bitcode/LLVMBitCodes.h (+3)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+14)
  • (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+19-2)
  • (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+16-1)
  • (modified) llvm/test/Bitcode/operand-bundles.ll (+24)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3f39d58b322a4f..8f5cad8a530053 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -2665,8 +2665,8 @@ are grouped into a single :ref:`attribute group <attrgrp>`.
 Operand Bundles
 ---------------
 
-Operand bundles are tagged sets of SSA values that can be associated
-with certain LLVM instructions (currently only ``call`` s and
+Operand bundles are tagged sets of SSA values or metadata strings that can be
+associated with certain LLVM instructions (currently only ``call`` s and
 ``invoke`` s).  In a way they are like metadata, but dropping them is
 incorrect and will change program semantics.
 
@@ -2674,7 +2674,7 @@ Syntax::
 
     operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
     operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
-    bundle operand ::= SSA value
+    bundle operand ::= SSA value | metadata string
     tag ::= string constant
 
 Operand bundles are **not** part of a function's signature, and a
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index f44d636a203374..de836b7a50b893 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -86,6 +86,8 @@ Changes to the LLVM IR
   * `llvm.nvvm.ptr.shared.to.gen`
   * `llvm.nvvm.ptr.constant.to.gen`
   * `llvm.nvvm.ptr.local.to.gen`
+  
+* Operand bundle values can now be metadata strings.
 
 Changes to LLVM infrastructure
 ------------------------------
diff --git a/llvm/include/llvm/AsmParser/LLParser.h b/llvm/include/llvm/AsmParser/LLParser.h
index 9576b935198dd4..ccb5c4450b76c7 100644
--- a/llvm/include/llvm/AsmParser/LLParser.h
+++ b/llvm/include/llvm/AsmParser/LLParser.h
@@ -66,6 +66,7 @@ namespace llvm {
       t_EmptyArray,          // No value:  []
       t_Constant,            // Value in ConstantVal.
       t_ConstantSplat,       // Value in ConstantVal.
+      t_MDString,            // Value in StrVal.
       t_InlineAsm,           // Value in FTy/StrVal/StrVal2/UIntVal.
       t_ConstantStruct,      // Value in ConstantStructElts.
       t_PackedConstantStruct // Value in ConstantStructElts.
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index cbd92fd52fc75a..ba2efee9414218 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -529,6 +529,9 @@ enum PossiblyExactOperatorOptionalFlags { PEO_EXACT = 0 };
 /// PossiblyDisjointInst's SubclassOptionalData contents.
 enum PossiblyDisjointInstOptionalFlags { PDI_DISJOINT = 0 };
 
+/// Mark to distinguish metadata from value in an operator bundle.
+enum MetadataOperandBundleValueMarker { OB_METADATA = 0x80000000 };
+
 /// GetElementPtrOptionalFlags - Flags for serializing
 /// GEPOperator's SubclassOptionalData contents.
 enum GetElementPtrOptionalFlags {
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index d84521d2e6e10d..5553850d31f5a8 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -3941,6 +3941,15 @@ bool LLParser::parseValID(ValID &ID, PerFunctionState *PFS, Type *ExpectedTy) {
     ID.Kind = ValID::t_Constant;
     return false;
 
+  case lltok::exclaim: {
+    Lex.Lex();
+    ID.StrVal = Lex.getStrVal();
+    if (parseToken(lltok::StringConstant, "expected string"))
+      return true;
+    ID.Kind = ValID::t_MDString;
+    return false;
+  }
+
   case lltok::kw_asm: {
     // ValID ::= 'asm' SideEffect? AlignStack? IntelDialect? STRINGCONSTANT ','
     //             STRINGCONSTANT
@@ -6216,6 +6225,11 @@ bool LLParser::convertValIDToValue(Type *Ty, ValID &ID, Value *&V,
     V = ConstantVector::getSplat(cast<VectorType>(Ty)->getElementCount(),
                                  ID.ConstantVal);
     return false;
+  case ValID::t_MDString:
+    if (!Ty->isMetadataTy())
+      return error(ID.Loc, "invalid type for metadata string");
+    V = MetadataAsValue::get(Context, MDString::get(Context, ID.StrVal));
+    return false;
   case ValID::t_ConstantStruct:
   case ValID::t_PackedConstantStruct:
     if (StructType *ST = dyn_cast<StructType>(Ty)) {
diff --git a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
index 6f997510b03609..8ee93253bc2447 100644
--- a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -792,6 +792,24 @@ class BitcodeReader : public BitcodeReaderBase, public GVMaterializer {
     return ResVal == nullptr;
   }
 
+  bool getValueOrMetadata(const SmallVectorImpl<uint64_t> &Record,
+                          unsigned &Slot, unsigned InstNum, Value *&ResVal,
+                          BasicBlock *ConstExprInsertBB) {
+    if (Slot == Record.size())
+      return true;
+    unsigned ValID = Record[Slot++];
+    if (ValID != bitc::OB_METADATA) {
+      unsigned TypeId;
+      return getValueTypePair(Record, --Slot, InstNum, ResVal, TypeId,
+                              ConstExprInsertBB);
+    }
+    if (Slot == Record.size())
+      return true;
+    unsigned ValNo = InstNum - (unsigned)Record[Slot++];
+    ResVal = MetadataAsValue::get(Context, getFnMetadataByID(ValNo));
+    return false;
+  }
+
   /// Read a value out of the specified record from slot 'Slot'. Increment Slot
   /// past the number of slots used by the value in the record. Return true if
   /// there is an error.
@@ -6767,8 +6785,7 @@ Error BitcodeReader::parseFunctionBody(Function *F) {
       unsigned OpNum = 1;
       while (OpNum != Record.size()) {
         Value *Op;
-        unsigned OpTypeID;
-        if (getValueTypePair(Record, OpNum, NextValueNo, Op, OpTypeID, CurBB))
+        if (getValueOrMetadata(Record, OpNum, NextValueNo, Op, CurBB))
           return error("Invalid record");
         Inputs.push_back(Op);
       }
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index d9086bfebbd2a9..bec0caef58afa8 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -395,6 +395,8 @@ class ModuleBitcodeWriter : public ModuleBitcodeWriterBase {
   void writeModuleConstants();
   bool pushValueAndType(const Value *V, unsigned InstID,
                         SmallVectorImpl<unsigned> &Vals);
+  bool pushValueOrMetadata(const Value *V, unsigned InstID,
+                           SmallVectorImpl<unsigned> &Vals);
   void writeOperandBundles(const CallBase &CB, unsigned InstID);
   void pushValue(const Value *V, unsigned InstID,
                  SmallVectorImpl<unsigned> &Vals);
@@ -2931,6 +2933,19 @@ bool ModuleBitcodeWriter::pushValueAndType(const Value *V, unsigned InstID,
   return false;
 }
 
+bool ModuleBitcodeWriter::pushValueOrMetadata(const Value *V, unsigned InstID,
+                                              SmallVectorImpl<unsigned> &Vals) {
+  bool IsMetadata = V->getType()->isMetadataTy();
+  if (IsMetadata) {
+    Vals.push_back(bitc::OB_METADATA);
+    Metadata *MD = cast<MetadataAsValue>(V)->getMetadata();
+    unsigned ValID = VE.getMetadataID(MD);
+    Vals.push_back(InstID - ValID);
+    return false;
+  }
+  return pushValueAndType(V, InstID, Vals);
+}
+
 void ModuleBitcodeWriter::writeOperandBundles(const CallBase &CS,
                                               unsigned InstID) {
   SmallVector<unsigned, 64> Record;
@@ -2941,7 +2956,7 @@ void ModuleBitcodeWriter::writeOperandBundles(const CallBase &CS,
     Record.push_back(C.getOperandBundleTagID(Bundle.getTagName()));
 
     for (auto &Input : Bundle.Inputs)
-      pushValueAndType(Input, InstID, Record);
+      pushValueOrMetadata(Input, InstID, Record);
 
     Stream.EmitRecord(bitc::FUNC_CODE_OPERAND_BUNDLE, Record);
     Record.clear();
diff --git a/llvm/test/Bitcode/operand-bundles.ll b/llvm/test/Bitcode/operand-bundles.ll
index ab28cffd84aa29..a8e086f784c6cf 100644
--- a/llvm/test/Bitcode/operand-bundles.ll
+++ b/llvm/test/Bitcode/operand-bundles.ll
@@ -56,6 +56,13 @@ define void @f4(i32* %ptr) {
   ret void
 }
 
+define void @f5(i32 %x) {
+entry:
+  call void @callee1(i32 10, i32 %x) [ "foo"(i32 42, metadata !"abc"), "bar"(metadata !"abcde", metadata !"qwerty") ]
+; CHECK: call void @callee1(i32 10, i32 %x) [ "foo"(i32 42, metadata !"abc"), "bar"(metadata !"abcde", metadata !"qwerty") ]
+  ret void
+}
+
 ; Invoke versions of the above tests:
 
 
@@ -150,3 +157,20 @@ exception:
 normal:
   ret void
 }
+
+define void @g5(ptr %ptr) personality i8 3 {
+entry:
+  %l = load i32, ptr %ptr, align 4
+  %x = add i32 42, 1
+  invoke void @callee1(i32 10, i32 %x) [ "foo"(i32 42, metadata !"abc"), "bar"(metadata !"abcde", metadata !"qwerty") ]
+          to label %normal unwind label %exception
+; CHECK:   invoke void @callee1(i32 10, i32 %x) [ "foo"(i32 42, metadata !"abc"), "bar"(metadata !"abcde", metadata !"qwerty") ]
+
+exception:                                        ; preds = %entry
+  %cleanup = landingpad i8
+          cleanup
+  br label %normal
+
+normal:                                           ; preds = %exception, %entry
+  ret void
+}

@@ -6216,6 +6225,11 @@ bool LLParser::convertValIDToValue(Type *Ty, ValID &ID, Value *&V,
V = ConstantVector::getSplat(cast<VectorType>(Ty)->getElementCount(),
ID.ConstantVal);
return false;
case ValID::t_MDString:
if (!Ty->isMetadataTy())
return error(ID.Loc, "invalid type for metadata string");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test the error case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not present anymore.

case lltok::exclaim: {
Lex.Lex();
ID.StrVal = Lex.getStrVal();
if (parseToken(lltok::StringConstant, "expected string"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test error case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not present anymore.

@nikic
Copy link
Contributor

nikic commented Oct 2, 2024

To check my understanding, the reason why we need the bitcode change is that we only encode the operand type if the value ID is larger than the instruction ID, which is never going to be the case for metadata, so we can't use the type to distinguish metadata? And the reason this is not a problem for normal call arguments is that it doesn't encode the types and instead takes them from the function type?

@@ -6216,6 +6225,11 @@ bool LLParser::convertValIDToValue(Type *Ty, ValID &ID, Value *&V,
V = ConstantVector::getSplat(cast<VectorType>(Ty)->getElementCount(),
ID.ConstantVal);
return false;
case ValID::t_MDString:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding this here, shouldn't we be calling parseMetadataAsValue() for metadata types in parseOptionalOperandBundles? That way we'd match how call arguments are parsed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is determined by the logic of parseOptionalOperandBundles. It parses type then value:

if (parseType(Ty) || parseValue(Ty, Input, PFS))

Parsing value involves first parsing to ValID, then conversion to the value:

bool LLParser::parseValue(Type *Ty, Value *&V, PerFunctionState *PFS) {
V = nullptr;
ValID ID;
return parseValID(ID, PFS, Ty) ||
convertValIDToValue(Ty, ID, V, PFS);
}

Processing metadata values was implemented according to this logic.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is what it currently does. To support metadata arguments, you need to do something like this instead (ignoring the part about attributes):

if (parseType(ArgTy, ArgLoc))
return true;
AttrBuilder ArgAttrs(M->getContext());
if (ArgTy->isMetadataTy()) {
if (parseMetadataAsValue(V, PFS))
return true;
} else {
// Otherwise, handle normal operands.
if (parseOptionalParamAttrs(ArgAttrs) || parseValue(ArgTy, V, PFS))

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed parseOptionalOperandBundles accordingly.

@spavloff
Copy link
Collaborator Author

spavloff commented Oct 2, 2024

To check my understanding, the reason why we need the bitcode change is that we only encode the operand type if the value ID is larger than the instruction ID, which is never going to be the case for metadata, so we can't use the type to distinguish metadata?

Yes, the type index cannot be used for this purpose, because it is absent.

Also, there is no bitcode change here, everything that worked previously produces the same bit code. It is only the new feature (metadata in operande bundles) that require the new rule.

And the reason this is not a problem for normal call arguments is that it doesn't encode the types and instead takes them from the function type?

Because previously deserializer always could expect either SSA value, or metadata, but not both.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please wait for a second approval.

@@ -56,6 +56,13 @@ define void @f4(i32* %ptr) {
ret void
}

define void @f5(i32 %x) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also be tested in compatibility.ll

@spavloff
Copy link
Collaborator Author

Can someone else review theses changes? It needs two approvals to launch.

Thanks.

Comment on lines 3210 to 3212
} else {
if (parseValue(Ty, Input, PFS))
return true;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be just:

else if (parseValue(Ty, Input, PFS))
  return true;

This change implements support of metadata strings in operand bundle
values. It makes possible calls like:

    call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]

It requires some extension of the bitcode serialization. As SSA values
and metadata are stored in different tables, there must be a way to
distinguish them during deserialization. It is implemented by putting
a special marker before the metadata index. The marker cannot be
treated as a reference to any SSA value, so it unambiguously identifies
metadata. It allows extending the bitcode serialization without
breaking compatibility.

Metadata as operand bundle values are intended to be used in
floating-point function calls. They would represent the same
information as now is passed by the constrained intrinsic arguments.
@spavloff spavloff merged commit 15de239 into llvm:main Oct 11, 2024
9 checks passed
@R-Goc
Copy link
Contributor

R-Goc commented Oct 14, 2024

This commit causes a warning during build time:

C:\Users\rysza\lib\llvm-project\llvm\lib\Bitcode\Reader\BitcodeReader.cpp(801,15): warning: comparison of integers of different signs: 'unsigned int' and 'llvm::bitc::MetadataOperandBundleValueMarker' [-Wsign-compare]
  801 |     if (ValID != bitc::OB_METADATA) {
      |         ~~~~~ ^  ~~~~~~~~~~~~~~~~~

DanielCChen pushed a commit to DanielCChen/llvm-project that referenced this pull request Oct 16, 2024
This change implements support of metadata strings in operand bundle
values. It makes possible calls like:

    call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]

It requires some extension of the bitcode serialization. As SSA values
and metadata are stored in different tables, there must be a way to
distinguish them during deserialization. It is implemented by putting a
special marker before the metadata index. The marker cannot be treated
as a reference to any SSA value, so it unambiguously identifies
metadata. It allows extending the bitcode serialization without breaking
compatibility.

Metadata as operand bundle values are intended to be used in
floating-point function calls. They would represent the same information
as now is passed by the constrained intrinsic arguments.
bricknerb pushed a commit to bricknerb/llvm-project that referenced this pull request Oct 17, 2024
This change implements support of metadata strings in operand bundle
values. It makes possible calls like:

    call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]

It requires some extension of the bitcode serialization. As SSA values
and metadata are stored in different tables, there must be a way to
distinguish them during deserialization. It is implemented by putting a
special marker before the metadata index. The marker cannot be treated
as a reference to any SSA value, so it unambiguously identifies
metadata. It allows extending the bitcode serialization without breaking
compatibility.

Metadata as operand bundle values are intended to be used in
floating-point function calls. They would represent the same information
as now is passed by the constrained intrinsic arguments.
spavloff added a commit to spavloff/llvm-project that referenced this pull request Oct 23, 2024
Insert explicit cast from an enumerator to unsigned int, because
some compilers issue a warning on signed vs unsigned comparison,
see: llvm#110805 (comment).
@spavloff
Copy link
Collaborator Author

This commit causes a warning during build time:

C:\Users\rysza\lib\llvm-project\llvm\lib\Bitcode\Reader\BitcodeReader.cpp(801,15): warning: comparison of integers of different signs: 'unsigned int' and 'llvm::bitc::MetadataOperandBundleValueMarker' [-Wsign-compare]
  801 |     if (ValID != bitc::OB_METADATA) {
      |         ~~~~~ ^  ~~~~~~~~~~~~~~~~~

@R-Goc Could you please check, if #113428 solves this problem?

@R-Goc
Copy link
Contributor

R-Goc commented Oct 23, 2024

Fixes it.

@spavloff
Copy link
Collaborator Author

Thanks!

spavloff added a commit that referenced this pull request Oct 23, 2024
Insert explicit cast from an enumerator to unsigned int, because some
compilers issue a warning on signed vs unsigned comparison, see:
#110805 (comment).
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Insert explicit cast from an enumerator to unsigned int, because some
compilers issue a warning on signed vs unsigned comparison, see:
llvm#110805 (comment).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants