Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WebAssembly] Protect memory.fill and memory.copy from zero-length ranges. #112617

Merged
merged 9 commits into from
Oct 24, 2024

Conversation

sunfishcode
Copy link
Member

WebAssembly's memory.fill and memory.copy instructions trap if the pointers are out of bounds, even if the length is zero. This is different from LLVM, which expects that it can call memcpy on arbitrary invalid pointers if the length is zero. To avoid spurious traps, branch around memory.fill and memory.copy when the length is zero.

@llvmbot
Copy link
Member

llvmbot commented Oct 16, 2024

@llvm/pr-subscribers-backend-webassembly

Author: Dan Gohman (sunfishcode)

Changes

WebAssembly's memory.fill and memory.copy instructions trap if the pointers are out of bounds, even if the length is zero. This is different from LLVM, which expects that it can call memcpy on arbitrary invalid pointers if the length is zero. To avoid spurious traps, branch around memory.fill and memory.copy when the length is zero.


Patch is 34.19 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112617.diff

6 Files Affected:

  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISD.def (+4)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+140)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td (+87-12)
  • (modified) llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp (+14-5)
  • (modified) llvm/test/CodeGen/WebAssembly/bulk-memory.ll (+84-18)
  • (modified) llvm/test/CodeGen/WebAssembly/bulk-memory64.ll (+105-36)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISD.def b/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
index b8954f4693f0a0..149f0cd70262bb 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
@@ -50,3 +50,7 @@ HANDLE_MEM_NODETYPE(GLOBAL_GET)
 HANDLE_MEM_NODETYPE(GLOBAL_SET)
 HANDLE_MEM_NODETYPE(TABLE_GET)
 HANDLE_MEM_NODETYPE(TABLE_SET)
+
+// Bulk memory instructions that require branching to handle empty ranges.
+HANDLE_NODETYPE(MEMCPY)
+HANDLE_NODETYPE(MEMSET)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index 5f76d666823e28..643947d23b366f 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -561,6 +561,138 @@ static MachineBasicBlock *LowerFPToInt(MachineInstr &MI, DebugLoc DL,
   return DoneMBB;
 }
 
+// Lower a `MEMCPY` instruction into a CFG triangle around a `MEMORY_COPY`
+// instuction to handle the zero-length case.
+static MachineBasicBlock *LowerMemcpy(MachineInstr &MI, DebugLoc DL,
+                                      MachineBasicBlock *BB,
+                                      const TargetInstrInfo &TII, bool Int64) {
+  MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
+
+  MachineOperand DstMem = MI.getOperand(0);
+  MachineOperand SrcMem = MI.getOperand(1);
+  MachineOperand Dst = MI.getOperand(2);
+  MachineOperand Src = MI.getOperand(3);
+  MachineOperand Len = MI.getOperand(4);
+
+  // We're going to add an extra use to `Len` to test if it's zero; that
+  // use shouldn't be a kill, even if the original use is.
+  MachineOperand NoKillLen = Len;
+  NoKillLen.setIsKill(false);
+
+  // Decide on which `MachineInstr` opcode we're going to use.
+  unsigned Eqz = Int64 ? WebAssembly::EQZ_I64 : WebAssembly::EQZ_I32;
+  unsigned MemoryCopy =
+      Int64 ? WebAssembly::MEMORY_COPY_A64 : WebAssembly::MEMORY_COPY_A32;
+
+  // Create two new basic blocks; one for the new `memory.fill` that we can
+  // branch over, and one for the rest of the instructions after the original
+  // `memory.fill`.
+  const BasicBlock *LLVMBB = BB->getBasicBlock();
+  MachineFunction *F = BB->getParent();
+  MachineBasicBlock *TrueMBB = F->CreateMachineBasicBlock(LLVMBB);
+  MachineBasicBlock *DoneMBB = F->CreateMachineBasicBlock(LLVMBB);
+
+  MachineFunction::iterator It = ++BB->getIterator();
+  F->insert(It, TrueMBB);
+  F->insert(It, DoneMBB);
+
+  // Transfer the remainder of BB and its successor edges to DoneMBB.
+  DoneMBB->splice(DoneMBB->begin(), BB, std::next(MI.getIterator()), BB->end());
+  DoneMBB->transferSuccessorsAndUpdatePHIs(BB);
+
+  // Connect the CFG edges.
+  BB->addSuccessor(TrueMBB);
+  BB->addSuccessor(DoneMBB);
+  TrueMBB->addSuccessor(DoneMBB);
+
+  // Create a virtual register for the `Eqz` result.
+  unsigned EqzReg;
+  EqzReg = MRI.createVirtualRegister(&WebAssembly::I32RegClass);
+
+  // Erase the original `memory.copy`.
+  MI.eraseFromParent();
+
+  // Test if `Len` is zero.
+  BuildMI(BB, DL, TII.get(Eqz), EqzReg).add(NoKillLen);
+
+  // Insert a new `memory.copy`.
+  BuildMI(TrueMBB, DL, TII.get(MemoryCopy))
+      .add(DstMem)
+      .add(SrcMem)
+      .add(Dst)
+      .add(Src)
+      .add(Len);
+
+  // Create the CFG triangle.
+  BuildMI(BB, DL, TII.get(WebAssembly::BR_IF)).addMBB(DoneMBB).addReg(EqzReg);
+  BuildMI(TrueMBB, DL, TII.get(WebAssembly::BR)).addMBB(DoneMBB);
+
+  return DoneMBB;
+}
+
+// Lower a `MEMSET` instruction into a CFG triangle around a `MEMORY_FILL`
+// instuction to handle the zero-length case.
+static MachineBasicBlock *LowerMemset(MachineInstr &MI, DebugLoc DL,
+                                      MachineBasicBlock *BB,
+                                      const TargetInstrInfo &TII, bool Int64) {
+  MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
+
+  MachineOperand Mem = MI.getOperand(0);
+  MachineOperand Dst = MI.getOperand(1);
+  MachineOperand Val = MI.getOperand(2);
+  MachineOperand Len = MI.getOperand(3);
+
+  // We're going to add an extra use to `Len` to test if it's zero; that
+  // use shouldn't be a kill, even if the original use is.
+  MachineOperand NoKillLen = Len;
+  NoKillLen.setIsKill(false);
+
+  // Decide on which `MachineInstr` opcode we're going to use.
+  unsigned Eqz = Int64 ? WebAssembly::EQZ_I64 : WebAssembly::EQZ_I32;
+  unsigned MemoryFill =
+      Int64 ? WebAssembly::MEMORY_FILL_A64 : WebAssembly::MEMORY_FILL_A32;
+
+  // Create two new basic blocks; one for the new `memory.fill` that we can
+  // branch over, and one for the rest of the instructions after the original
+  // `memory.fill`.
+  const BasicBlock *LLVMBB = BB->getBasicBlock();
+  MachineFunction *F = BB->getParent();
+  MachineBasicBlock *TrueMBB = F->CreateMachineBasicBlock(LLVMBB);
+  MachineBasicBlock *DoneMBB = F->CreateMachineBasicBlock(LLVMBB);
+
+  MachineFunction::iterator It = ++BB->getIterator();
+  F->insert(It, TrueMBB);
+  F->insert(It, DoneMBB);
+
+  // Transfer the remainder of BB and its successor edges to DoneMBB.
+  DoneMBB->splice(DoneMBB->begin(), BB, std::next(MI.getIterator()), BB->end());
+  DoneMBB->transferSuccessorsAndUpdatePHIs(BB);
+
+  // Connect the CFG edges.
+  BB->addSuccessor(TrueMBB);
+  BB->addSuccessor(DoneMBB);
+  TrueMBB->addSuccessor(DoneMBB);
+
+  // Create a virtual register for the `Eqz` result.
+  unsigned EqzReg;
+  EqzReg = MRI.createVirtualRegister(&WebAssembly::I32RegClass);
+
+  // Erase the original `memory.fill`.
+  MI.eraseFromParent();
+
+  // Test if `Len` is zero.
+  BuildMI(BB, DL, TII.get(Eqz), EqzReg).add(NoKillLen);
+
+  // Insert a new `memory.copy`.
+  BuildMI(TrueMBB, DL, TII.get(MemoryFill)).add(Mem).add(Dst).add(Val).add(Len);
+
+  // Create the CFG triangle.
+  BuildMI(BB, DL, TII.get(WebAssembly::BR_IF)).addMBB(DoneMBB).addReg(EqzReg);
+  BuildMI(TrueMBB, DL, TII.get(WebAssembly::BR)).addMBB(DoneMBB);
+
+  return DoneMBB;
+}
+
 static MachineBasicBlock *
 LowerCallResults(MachineInstr &CallResults, DebugLoc DL, MachineBasicBlock *BB,
                  const WebAssemblySubtarget *Subtarget,
@@ -718,6 +850,14 @@ MachineBasicBlock *WebAssemblyTargetLowering::EmitInstrWithCustomInserter(
   case WebAssembly::FP_TO_UINT_I64_F64:
     return LowerFPToInt(MI, DL, BB, TII, true, true, true,
                         WebAssembly::I64_TRUNC_U_F64);
+  case WebAssembly::MEMCPY_A32:
+    return LowerMemcpy(MI, DL, BB, TII, false);
+  case WebAssembly::MEMCPY_A64:
+    return LowerMemcpy(MI, DL, BB, TII, true);
+  case WebAssembly::MEMSET_A32:
+    return LowerMemset(MI, DL, BB, TII, false);
+  case WebAssembly::MEMSET_A64:
+    return LowerMemset(MI, DL, BB, TII, true);
   case WebAssembly::CALL_RESULTS:
   case WebAssembly::RET_CALL_RESULTS:
     return LowerCallResults(MI, DL, BB, Subtarget, TII);
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
index 7aeae54d95a8c9..de79f2d44cd328 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
@@ -21,16 +21,33 @@ multiclass BULK_I<dag oops_r, dag iops_r, dag oops_s, dag iops_s,
 }
 
 // Bespoke types and nodes for bulk memory ops
+
+// memory.copy (may trap on empty ranges)
+def wasm_memory_copy_t : SDTypeProfile<0, 5,
+  [SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
+>;
+def wasm_memory_copy : SDNode<"WebAssemblyISD::MEMORY_COPY", wasm_memory_copy_t,
+                              [SDNPHasChain, SDNPMayLoad, SDNPMayStore]>;
+
+// memory.copy with a branch to avoid trapping
 def wasm_memcpy_t : SDTypeProfile<0, 5,
   [SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
 >;
-def wasm_memcpy : SDNode<"WebAssemblyISD::MEMORY_COPY", wasm_memcpy_t,
+def wasm_memcpy : SDNode<"WebAssemblyISD::MEMCPY", wasm_memcpy_t,
                          [SDNPHasChain, SDNPMayLoad, SDNPMayStore]>;
 
+// memory.fill (may trap on empty ranges)
+def wasm_memory_fill_t : SDTypeProfile<0, 4,
+  [SDTCisInt<0>, SDTCisPtrTy<1>, SDTCisInt<2>, SDTCisInt<3>]
+>;
+def wasm_memory_fill : SDNode<"WebAssemblyISD::MEMORY_FILL", wasm_memory_fill_t,
+                              [SDNPHasChain, SDNPMayStore]>;
+
+// memory.fill with a branch to avoid trapping
 def wasm_memset_t : SDTypeProfile<0, 4,
   [SDTCisInt<0>, SDTCisPtrTy<1>, SDTCisInt<2>, SDTCisInt<3>]
 >;
-def wasm_memset : SDNode<"WebAssemblyISD::MEMORY_FILL", wasm_memset_t,
+def wasm_memset : SDNode<"WebAssemblyISD::MEMSET", wasm_memset_t,
                          [SDNPHasChain, SDNPMayStore]>;
 
 multiclass BulkMemoryOps<WebAssemblyRegClass rc, string B> {
@@ -51,25 +68,83 @@ defm DATA_DROP :
          [],
          "data.drop\t$seg", "data.drop\t$seg", 0x09>;
 
+}
+
+defm : BulkMemoryOps<I32, "32">;
+defm : BulkMemoryOps<I64, "64">;
+
+// Define copy/fill manually instead of using the `BulkMemoryOps` multiclass
+// because when a multiclass defines opcodes, it gives them anonymous names
+// and we need opcodes with names so that we can handle them with custom code.
+
 let mayLoad = 1, mayStore = 1 in
-defm MEMORY_COPY_A#B :
+defm MEMORY_COPY_A32 :
   BULK_I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
-                      rc:$dst, rc:$src, rc:$len),
+                      I32:$dst, I32:$src, I32:$len),
          (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
-         [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
-           rc:$dst, rc:$src, rc:$len
+         [(wasm_memory_copy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+           I32:$dst, I32:$src, I32:$len
          )],
          "memory.copy\t$src_idx, $dst_idx, $dst, $src, $len",
          "memory.copy\t$src_idx, $dst_idx", 0x0a>;
 
 let mayStore = 1 in
-defm MEMORY_FILL_A#B :
-  BULK_I<(outs), (ins i32imm_op:$idx, rc:$dst, I32:$value, rc:$size),
+defm MEMORY_FILL_A32 :
+  BULK_I<(outs), (ins i32imm_op:$idx, I32:$dst, I32:$value, I32:$size),
          (outs), (ins i32imm_op:$idx),
-         [(wasm_memset (i32 imm:$idx), rc:$dst, I32:$value, rc:$size)],
+         [(wasm_memory_fill (i32 imm:$idx), I32:$dst, I32:$value, I32:$size)],
          "memory.fill\t$idx, $dst, $value, $size",
          "memory.fill\t$idx", 0x0b>;
-}
 
-defm : BulkMemoryOps<I32, "32">;
-defm : BulkMemoryOps<I64, "64">;
+let mayLoad = 1, mayStore = 1 in
+defm MEMORY_COPY_A64 :
+  BULK_I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+                      I64:$dst, I64:$src, I64:$len),
+         (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+         [(wasm_memory_copy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+           I64:$dst, I64:$src, I64:$len
+         )],
+         "memory.copy\t$src_idx, $dst_idx, $dst, $src, $len",
+         "memory.copy\t$src_idx, $dst_idx", 0x0a>;
+
+let mayStore = 1 in
+defm MEMORY_FILL_A64 :
+  BULK_I<(outs), (ins i32imm_op:$idx, I64:$dst, I32:$value, I64:$size),
+         (outs), (ins i32imm_op:$idx),
+         [(wasm_memory_fill (i32 imm:$idx), I64:$dst, I32:$value, I64:$size)],
+         "memory.fill\t$idx, $dst, $value, $size",
+         "memory.fill\t$idx", 0x0b>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayLoad = 1, mayStore = 1 in
+defm MEMCPY_A32 : I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+                                 I32:$dst, I32:$src, I32:$len),
+                    (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+                    [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+                      I32:$dst, I32:$src, I32:$len
+                    )],
+                    "", "", 0>,
+                  Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayStore = 1 in
+defm MEMSET_A32 : I<(outs), (ins i32imm_op:$idx, I32:$dst, I32:$value, I32:$size),
+                    (outs), (ins i32imm_op:$idx),
+                    [(wasm_memset (i32 imm:$idx), I32:$dst, I32:$value, I32:$size)],
+                    "", "", 0>,
+                  Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayLoad = 1, mayStore = 1 in
+defm MEMCPY_A64 : I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+                                 I64:$dst, I64:$src, I64:$len),
+                    (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+                    [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+                      I64:$dst, I64:$src, I64:$len
+                    )],
+                    "", "", 0>,
+                  Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayStore = 1 in
+defm MEMSET_A64 : I<(outs), (ins i32imm_op:$idx, I64:$dst, I32:$value, I64:$size),
+                    (outs), (ins i32imm_op:$idx),
+                    [(wasm_memset (i32 imm:$idx), I64:$dst, I32:$value, I64:$size)],
+                    "", "", 0>,
+                  Requires<[HasBulkMemory]>;
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp b/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
index 74af4c8873f735..d51bfeb6d8592c 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
@@ -28,9 +28,13 @@ SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemcpy(
 
   SDValue MemIdx = DAG.getConstant(0, DL, MVT::i32);
   auto LenMVT = ST.hasAddr64() ? MVT::i64 : MVT::i32;
-  return DAG.getNode(WebAssemblyISD::MEMORY_COPY, DL, MVT::Other,
-                     {Chain, MemIdx, MemIdx, Dst, Src,
-                      DAG.getZExtOrTrunc(Size, DL, LenMVT)});
+
+  // Use `MEMCPY` here instead of `MEMORY_COPY` because `memory.copy` traps
+  // if the pointers are invalid even if the length is zero. `MEMCPY` gets
+  // extra code to handle this in the way that LLVM IR expects.
+  return DAG.getNode(
+      WebAssemblyISD::MEMCPY, DL, MVT::Other,
+      {Chain, MemIdx, MemIdx, Dst, Src, DAG.getZExtOrTrunc(Size, DL, LenMVT)});
 }
 
 SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemmove(
@@ -52,8 +56,13 @@ SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemset(
 
   SDValue MemIdx = DAG.getConstant(0, DL, MVT::i32);
   auto LenMVT = ST.hasAddr64() ? MVT::i64 : MVT::i32;
+
+  // Use `MEMSET` here instead of `MEMORY_FILL` because `memory.fill` traps
+  // if the pointers are invalid even if the length is zero. `MEMSET` gets
+  // extra code to handle this in the way that LLVM IR expects.
+  //
   // Only low byte matters for val argument, so anyext the i8
-  return DAG.getNode(WebAssemblyISD::MEMORY_FILL, DL, MVT::Other, Chain, MemIdx,
-                     Dst, DAG.getAnyExtOrTrunc(Val, DL, MVT::i32),
+  return DAG.getNode(WebAssemblyISD::MEMSET, DL, MVT::Other, Chain, MemIdx, Dst,
+                     DAG.getAnyExtOrTrunc(Val, DL, MVT::i32),
                      DAG.getZExtOrTrunc(Size, DL, LenMVT));
 }
diff --git a/llvm/test/CodeGen/WebAssembly/bulk-memory.ll b/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
index dc29dc81c13ec2..ae170d757a305a 100644
--- a/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
+++ b/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
@@ -17,7 +17,12 @@ declare void @llvm.memset.p0.i32(ptr, i8, i32, i1)
 ; CHECK-LABEL: memcpy_i8:
 ; NO-BULK-MEM-NOT: memory.copy
 ; BULK-MEM-NEXT: .functype memcpy_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
 ; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
   call void @llvm.memcpy.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -27,7 +32,12 @@ define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
 ; CHECK-LABEL: memmove_i8:
 ; NO-BULK-MEM-NOT: memory.copy
 ; BULK-MEM-NEXT: .functype memmove_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
 ; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
   call void @llvm.memmove.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -37,7 +47,12 @@ define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
 ; CHECK-LABEL: memset_i8:
 ; NO-BULK-MEM-NOT: memory.fill
 ; BULK-MEM-NEXT: .functype memset_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
 ; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
   call void @llvm.memset.p0.i8(ptr %dest, i8 %val, i8 %len, i1 0)
@@ -47,7 +62,12 @@ define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
 ; CHECK-LABEL: memcpy_i32:
 ; NO-BULK-MEM-NOT: memory.copy
 ; BULK-MEM-NEXT: .functype memcpy_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
 ; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memcpy_i32(ptr %dest, ptr %src, i32 %len) {
   call void @llvm.memcpy.p0.p0.i32(ptr %dest, ptr %src, i32 %len, i1 0)
@@ -57,7 +77,12 @@ define void @memcpy_i32(ptr %dest, ptr %src, i32 %len) {
 ; CHECK-LABEL: memmove_i32:
 ; NO-BULK-MEM-NOT: memory.copy
 ; BULK-MEM-NEXT: .functype memmove_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
 ; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memmove_i32(ptr %dest, ptr %src, i32 %len) {
   call void @llvm.memmove.p0.p0.i32(ptr %dest, ptr %src, i32 %len, i1 0)
@@ -67,7 +92,12 @@ define void @memmove_i32(ptr %dest, ptr %src, i32 %len) {
 ; CHECK-LABEL: memset_i32:
 ; NO-BULK-MEM-NOT: memory.fill
 ; BULK-MEM-NEXT: .functype memset_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
 ; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memset_i32(ptr %dest, i8 %val, i32 %len) {
   call void @llvm.memset.p0.i32(ptr %dest, i8 %val, i32 %len, i1 0)
@@ -107,8 +137,14 @@ define void @memset_1(ptr %dest, i8 %val) {
 ; CHECK-LABEL: memcpy_1024:
 ; NO-BULK-MEM-NOT: memory.copy
 ; BULK-MEM-NEXT: .functype memcpy_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
 ; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L2]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memcpy_1024(ptr %dest, ptr %src) {
   call void @llvm.memcpy.p0.p0.i32(ptr %dest, ptr %src, i32 1024, i1 0)
@@ -118,8 +154,14 @@ define void @memcpy_1024(ptr %dest, ptr %src) {
 ; CHECK-LABEL: memmove_1024:
 ; NO-BULK-MEM-NOT: memory.copy
 ; BULK-MEM-NEXT: .functype memmove_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
 ; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L2]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
 ; BULK-MEM-NEXT: return
 define void @memmove_1024(ptr %dest, ptr %src) {
   call void @llvm.memmove.p0.p0.i32(ptr %dest, ptr %src, i32 1024, i1 0)
@@ -129,8 +171,14 @@ define void @memmove_1024(ptr %dest, ptr %src) {
 ; CHECK-LABEL: memset_1024:
 ; NO-BULK-MEM-NOT: memory.fill
 ; BULK-MEM-NEXT: .functype memset_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
 ; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.fill 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: me...
[truncated]

Comment on lines 26 to 27
def wasm_memory_copy_t : SDTypeProfile<0, 5,
[SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems identical to the existing wasm_memcpy_t.. Can we reuse it in both places? The same for wasm_memory_fill_t.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 76 to 78
// Define copy/fill manually instead of using the `BulkMemoryOps` multiclass
// because when a multiclass defines opcodes, it gives them anonymous names
// and we need opcodes with names so that we can handle them with custom code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this means. We use all opcodes in code by WebAssembly::INSTRNAME and most of them all inherit from some multiclass..? For example, this is from LowerFPToInt:

unsigned Abs = Float64 ? WebAssembly::ABS_F64 : WebAssembly::ABS_F32;
unsigned FConst = Float64 ? WebAssembly::CONST_F64 : WebAssembly::CONST_F32;
unsigned LT = Float64 ? WebAssembly::LT_F64 : WebAssembly::LT_F32;
unsigned GE = Float64 ? WebAssembly::GE_F64 : WebAssembly::GE_F32;
unsigned IConst = Int64 ? WebAssembly::CONST_I64 : WebAssembly::CONST_I32;
unsigned Eqz = WebAssembly::EQZ_I32;
unsigned And = WebAssembly::AND_I32;

All instructions here inherit from some multiclass. For example, ABS_F32 inherits from the multiclass UNARY_FP, and we can use their names fine:
defm ABS : UnaryFP<fabs, "abs ", 0x8b, 0x99>;

Am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it appears to be due to the defm : BulkMemoryOps<...>, which makes the opcodes "anonymous", and tablegen names their opcodes as anonymous_9000MEMORY_COPY_A64 and anonymous_9000MEMORY_COPY_A64 and anonymous_9000MEMORY_FILL_A64. If I change it to defm : MEMORY_ : BulkMemoryOps<...> and then remove the MEMORY_ prefix from all the names, then they come out as just MEMORY_COPY_A64 and MEMORY_FILL_A64.

I've now updated the code to use multiclasses for all instructions that have 32/64 variants.

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests don't seem to be working now

llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td Outdated Show resolved Hide resolved
llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td Outdated Show resolved Hide resolved
@sunfishcode sunfishcode force-pushed the sunfishcode/memcpy-empty-range branch from 41141c8 to 40c4c71 Compare October 22, 2024 23:43
@aheejin
Copy link
Member

aheejin commented Oct 23, 2024

The tests still don't seem to be passing.

…nges.

WebAssembly's `memory.fill` and `memory.copy` instructions trap if the
pointers are out of bounds, even if the length is zero. This is
different from LLVM, which expects that it can call `memcpy` on
arbitrary invalid pointers if the length is zero. To avoid spurious
traps, branch around `memory.fill` and `memory.copy` when the
length is zero.
@sunfishcode sunfishcode force-pushed the sunfishcode/memcpy-empty-range branch from e9d0707 to fa5ee93 Compare October 23, 2024 18:54
llvm/lib/Target/WebAssembly/WebAssemblyISD.def Outdated Show resolved Hide resolved
llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td Outdated Show resolved Hide resolved
llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td Outdated Show resolved Hide resolved
sunfishcode and others added 3 commits October 24, 2024 07:55
Co-authored-by: Heejin Ahn <aheejin@gmail.com>
Co-authored-by: Heejin Ahn <aheejin@gmail.com>
Co-authored-by: Heejin Ahn <aheejin@gmail.com>
@sunfishcode sunfishcode merged commit 1184458 into llvm:main Oct 24, 2024
8 checks passed
@sunfishcode sunfishcode deleted the sunfishcode/memcpy-empty-range branch October 24, 2024 21:14
@frobtech frobtech mentioned this pull request Oct 25, 2024
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
…nges. (llvm#112617)

WebAssembly's `memory.fill` and `memory.copy` instructions trap if the
pointers are out of bounds, even if the length is zero. This is
different from LLVM, which expects that it can call `memcpy` on
arbitrary invalid pointers if the length is zero. To avoid spurious
traps, branch around `memory.fill` and `memory.copy` when the length is
zero.

---------

Co-authored-by: Heejin Ahn <aheejin@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants