[WebAssembly] Protect memory.fill and memory.copy from zero-length ranges. #112617
@llvm/pr-subscribers-backend-webassembly
Author: Dan Gohman (sunfishcode)
Changes
Patch is 34.19 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112617.diff
6 Files Affected:
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISD.def b/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
index b8954f4693f0a0..149f0cd70262bb 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
@@ -50,3 +50,7 @@ HANDLE_MEM_NODETYPE(GLOBAL_GET)
HANDLE_MEM_NODETYPE(GLOBAL_SET)
HANDLE_MEM_NODETYPE(TABLE_GET)
HANDLE_MEM_NODETYPE(TABLE_SET)
+
+// Bulk memory instructions that require branching to handle empty ranges.
+HANDLE_NODETYPE(MEMCPY)
+HANDLE_NODETYPE(MEMSET)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index 5f76d666823e28..643947d23b366f 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -561,6 +561,138 @@ static MachineBasicBlock *LowerFPToInt(MachineInstr &MI, DebugLoc DL,
return DoneMBB;
}
+// Lower a `MEMCPY` instruction into a CFG triangle around a `MEMORY_COPY`
+// instruction to handle the zero-length case.
+static MachineBasicBlock *LowerMemcpy(MachineInstr &MI, DebugLoc DL,
+ MachineBasicBlock *BB,
+ const TargetInstrInfo &TII, bool Int64) {
+ MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
+
+ MachineOperand DstMem = MI.getOperand(0);
+ MachineOperand SrcMem = MI.getOperand(1);
+ MachineOperand Dst = MI.getOperand(2);
+ MachineOperand Src = MI.getOperand(3);
+ MachineOperand Len = MI.getOperand(4);
+
+ // We're going to add an extra use to `Len` to test if it's zero; that
+ // use shouldn't be a kill, even if the original use is.
+ MachineOperand NoKillLen = Len;
+ NoKillLen.setIsKill(false);
+
+ // Decide on which `MachineInstr` opcode we're going to use.
+ unsigned Eqz = Int64 ? WebAssembly::EQZ_I64 : WebAssembly::EQZ_I32;
+ unsigned MemoryCopy =
+ Int64 ? WebAssembly::MEMORY_COPY_A64 : WebAssembly::MEMORY_COPY_A32;
+
+ // Create two new basic blocks; one for the new `memory.copy` that we can
+ // branch over, and one for the rest of the instructions after the original
+ // `memory.copy`.
+ const BasicBlock *LLVMBB = BB->getBasicBlock();
+ MachineFunction *F = BB->getParent();
+ MachineBasicBlock *TrueMBB = F->CreateMachineBasicBlock(LLVMBB);
+ MachineBasicBlock *DoneMBB = F->CreateMachineBasicBlock(LLVMBB);
+
+ MachineFunction::iterator It = ++BB->getIterator();
+ F->insert(It, TrueMBB);
+ F->insert(It, DoneMBB);
+
+ // Transfer the remainder of BB and its successor edges to DoneMBB.
+ DoneMBB->splice(DoneMBB->begin(), BB, std::next(MI.getIterator()), BB->end());
+ DoneMBB->transferSuccessorsAndUpdatePHIs(BB);
+
+ // Connect the CFG edges.
+ BB->addSuccessor(TrueMBB);
+ BB->addSuccessor(DoneMBB);
+ TrueMBB->addSuccessor(DoneMBB);
+
+ // Create a virtual register for the `Eqz` result.
+ unsigned EqzReg;
+ EqzReg = MRI.createVirtualRegister(&WebAssembly::I32RegClass);
+
+ // Erase the original `memory.copy`.
+ MI.eraseFromParent();
+
+ // Test if `Len` is zero.
+ BuildMI(BB, DL, TII.get(Eqz), EqzReg).add(NoKillLen);
+
+ // Insert a new `memory.copy`.
+ BuildMI(TrueMBB, DL, TII.get(MemoryCopy))
+ .add(DstMem)
+ .add(SrcMem)
+ .add(Dst)
+ .add(Src)
+ .add(Len);
+
+ // Create the CFG triangle.
+ BuildMI(BB, DL, TII.get(WebAssembly::BR_IF)).addMBB(DoneMBB).addReg(EqzReg);
+ BuildMI(TrueMBB, DL, TII.get(WebAssembly::BR)).addMBB(DoneMBB);
+
+ return DoneMBB;
+}
+
+// Lower a `MEMSET` instruction into a CFG triangle around a `MEMORY_FILL`
+// instruction to handle the zero-length case.
+static MachineBasicBlock *LowerMemset(MachineInstr &MI, DebugLoc DL,
+ MachineBasicBlock *BB,
+ const TargetInstrInfo &TII, bool Int64) {
+ MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
+
+ MachineOperand Mem = MI.getOperand(0);
+ MachineOperand Dst = MI.getOperand(1);
+ MachineOperand Val = MI.getOperand(2);
+ MachineOperand Len = MI.getOperand(3);
+
+ // We're going to add an extra use to `Len` to test if it's zero; that
+ // use shouldn't be a kill, even if the original use is.
+ MachineOperand NoKillLen = Len;
+ NoKillLen.setIsKill(false);
+
+ // Decide on which `MachineInstr` opcode we're going to use.
+ unsigned Eqz = Int64 ? WebAssembly::EQZ_I64 : WebAssembly::EQZ_I32;
+ unsigned MemoryFill =
+ Int64 ? WebAssembly::MEMORY_FILL_A64 : WebAssembly::MEMORY_FILL_A32;
+
+ // Create two new basic blocks; one for the new `memory.fill` that we can
+ // branch over, and one for the rest of the instructions after the original
+ // `memory.fill`.
+ const BasicBlock *LLVMBB = BB->getBasicBlock();
+ MachineFunction *F = BB->getParent();
+ MachineBasicBlock *TrueMBB = F->CreateMachineBasicBlock(LLVMBB);
+ MachineBasicBlock *DoneMBB = F->CreateMachineBasicBlock(LLVMBB);
+
+ MachineFunction::iterator It = ++BB->getIterator();
+ F->insert(It, TrueMBB);
+ F->insert(It, DoneMBB);
+
+ // Transfer the remainder of BB and its successor edges to DoneMBB.
+ DoneMBB->splice(DoneMBB->begin(), BB, std::next(MI.getIterator()), BB->end());
+ DoneMBB->transferSuccessorsAndUpdatePHIs(BB);
+
+ // Connect the CFG edges.
+ BB->addSuccessor(TrueMBB);
+ BB->addSuccessor(DoneMBB);
+ TrueMBB->addSuccessor(DoneMBB);
+
+ // Create a virtual register for the `Eqz` result.
+ unsigned EqzReg;
+ EqzReg = MRI.createVirtualRegister(&WebAssembly::I32RegClass);
+
+ // Erase the original `memory.fill`.
+ MI.eraseFromParent();
+
+ // Test if `Len` is zero.
+ BuildMI(BB, DL, TII.get(Eqz), EqzReg).add(NoKillLen);
+
+ // Insert a new `memory.fill`.
+ BuildMI(TrueMBB, DL, TII.get(MemoryFill)).add(Mem).add(Dst).add(Val).add(Len);
+
+ // Create the CFG triangle.
+ BuildMI(BB, DL, TII.get(WebAssembly::BR_IF)).addMBB(DoneMBB).addReg(EqzReg);
+ BuildMI(TrueMBB, DL, TII.get(WebAssembly::BR)).addMBB(DoneMBB);
+
+ return DoneMBB;
+}
+
static MachineBasicBlock *
LowerCallResults(MachineInstr &CallResults, DebugLoc DL, MachineBasicBlock *BB,
const WebAssemblySubtarget *Subtarget,
@@ -718,6 +850,14 @@ MachineBasicBlock *WebAssemblyTargetLowering::EmitInstrWithCustomInserter(
case WebAssembly::FP_TO_UINT_I64_F64:
return LowerFPToInt(MI, DL, BB, TII, true, true, true,
WebAssembly::I64_TRUNC_U_F64);
+ case WebAssembly::MEMCPY_A32:
+ return LowerMemcpy(MI, DL, BB, TII, false);
+ case WebAssembly::MEMCPY_A64:
+ return LowerMemcpy(MI, DL, BB, TII, true);
+ case WebAssembly::MEMSET_A32:
+ return LowerMemset(MI, DL, BB, TII, false);
+ case WebAssembly::MEMSET_A64:
+ return LowerMemset(MI, DL, BB, TII, true);
case WebAssembly::CALL_RESULTS:
case WebAssembly::RET_CALL_RESULTS:
return LowerCallResults(MI, DL, BB, Subtarget, TII);
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
index 7aeae54d95a8c9..de79f2d44cd328 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
@@ -21,16 +21,33 @@ multiclass BULK_I<dag oops_r, dag iops_r, dag oops_s, dag iops_s,
}
// Bespoke types and nodes for bulk memory ops
+
+// memory.copy (may trap on empty ranges)
+def wasm_memory_copy_t : SDTypeProfile<0, 5,
+ [SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
+>;
+def wasm_memory_copy : SDNode<"WebAssemblyISD::MEMORY_COPY", wasm_memory_copy_t,
+ [SDNPHasChain, SDNPMayLoad, SDNPMayStore]>;
+
+// memory.copy with a branch to avoid trapping
def wasm_memcpy_t : SDTypeProfile<0, 5,
[SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
>;
-def wasm_memcpy : SDNode<"WebAssemblyISD::MEMORY_COPY", wasm_memcpy_t,
+def wasm_memcpy : SDNode<"WebAssemblyISD::MEMCPY", wasm_memcpy_t,
[SDNPHasChain, SDNPMayLoad, SDNPMayStore]>;
+// memory.fill (may trap on empty ranges)
+def wasm_memory_fill_t : SDTypeProfile<0, 4,
+ [SDTCisInt<0>, SDTCisPtrTy<1>, SDTCisInt<2>, SDTCisInt<3>]
+>;
+def wasm_memory_fill : SDNode<"WebAssemblyISD::MEMORY_FILL", wasm_memory_fill_t,
+ [SDNPHasChain, SDNPMayStore]>;
+
+// memory.fill with a branch to avoid trapping
def wasm_memset_t : SDTypeProfile<0, 4,
[SDTCisInt<0>, SDTCisPtrTy<1>, SDTCisInt<2>, SDTCisInt<3>]
>;
-def wasm_memset : SDNode<"WebAssemblyISD::MEMORY_FILL", wasm_memset_t,
+def wasm_memset : SDNode<"WebAssemblyISD::MEMSET", wasm_memset_t,
[SDNPHasChain, SDNPMayStore]>;
multiclass BulkMemoryOps<WebAssemblyRegClass rc, string B> {
@@ -51,25 +68,83 @@ defm DATA_DROP :
[],
"data.drop\t$seg", "data.drop\t$seg", 0x09>;
+}
+
+defm : BulkMemoryOps<I32, "32">;
+defm : BulkMemoryOps<I64, "64">;
+
+// Define copy/fill manually instead of using the `BulkMemoryOps` multiclass
+// because when a multiclass defines opcodes, it gives them anonymous names
+// and we need opcodes with names so that we can handle them with custom code.
+
let mayLoad = 1, mayStore = 1 in
-defm MEMORY_COPY_A#B :
+defm MEMORY_COPY_A32 :
BULK_I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
- rc:$dst, rc:$src, rc:$len),
+ I32:$dst, I32:$src, I32:$len),
(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
- [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
- rc:$dst, rc:$src, rc:$len
+ [(wasm_memory_copy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I32:$dst, I32:$src, I32:$len
)],
"memory.copy\t$src_idx, $dst_idx, $dst, $src, $len",
"memory.copy\t$src_idx, $dst_idx", 0x0a>;
let mayStore = 1 in
-defm MEMORY_FILL_A#B :
- BULK_I<(outs), (ins i32imm_op:$idx, rc:$dst, I32:$value, rc:$size),
+defm MEMORY_FILL_A32 :
+ BULK_I<(outs), (ins i32imm_op:$idx, I32:$dst, I32:$value, I32:$size),
(outs), (ins i32imm_op:$idx),
- [(wasm_memset (i32 imm:$idx), rc:$dst, I32:$value, rc:$size)],
+ [(wasm_memory_fill (i32 imm:$idx), I32:$dst, I32:$value, I32:$size)],
"memory.fill\t$idx, $dst, $value, $size",
"memory.fill\t$idx", 0x0b>;
-}
-defm : BulkMemoryOps<I32, "32">;
-defm : BulkMemoryOps<I64, "64">;
+let mayLoad = 1, mayStore = 1 in
+defm MEMORY_COPY_A64 :
+ BULK_I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+ I64:$dst, I64:$src, I64:$len),
+ (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+ [(wasm_memory_copy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I64:$dst, I64:$src, I64:$len
+ )],
+ "memory.copy\t$src_idx, $dst_idx, $dst, $src, $len",
+ "memory.copy\t$src_idx, $dst_idx", 0x0a>;
+
+let mayStore = 1 in
+defm MEMORY_FILL_A64 :
+ BULK_I<(outs), (ins i32imm_op:$idx, I64:$dst, I32:$value, I64:$size),
+ (outs), (ins i32imm_op:$idx),
+ [(wasm_memory_fill (i32 imm:$idx), I64:$dst, I32:$value, I64:$size)],
+ "memory.fill\t$idx, $dst, $value, $size",
+ "memory.fill\t$idx", 0x0b>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayLoad = 1, mayStore = 1 in
+defm MEMCPY_A32 : I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+ I32:$dst, I32:$src, I32:$len),
+ (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+ [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I32:$dst, I32:$src, I32:$len
+ )],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayStore = 1 in
+defm MEMSET_A32 : I<(outs), (ins i32imm_op:$idx, I32:$dst, I32:$value, I32:$size),
+ (outs), (ins i32imm_op:$idx),
+ [(wasm_memset (i32 imm:$idx), I32:$dst, I32:$value, I32:$size)],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayLoad = 1, mayStore = 1 in
+defm MEMCPY_A64 : I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+ I64:$dst, I64:$src, I64:$len),
+ (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+ [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I64:$dst, I64:$src, I64:$len
+ )],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayStore = 1 in
+defm MEMSET_A64 : I<(outs), (ins i32imm_op:$idx, I64:$dst, I32:$value, I64:$size),
+ (outs), (ins i32imm_op:$idx),
+ [(wasm_memset (i32 imm:$idx), I64:$dst, I32:$value, I64:$size)],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp b/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
index 74af4c8873f735..d51bfeb6d8592c 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
@@ -28,9 +28,13 @@ SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemcpy(
SDValue MemIdx = DAG.getConstant(0, DL, MVT::i32);
auto LenMVT = ST.hasAddr64() ? MVT::i64 : MVT::i32;
- return DAG.getNode(WebAssemblyISD::MEMORY_COPY, DL, MVT::Other,
- {Chain, MemIdx, MemIdx, Dst, Src,
- DAG.getZExtOrTrunc(Size, DL, LenMVT)});
+
+ // Use `MEMCPY` here instead of `MEMORY_COPY` because `memory.copy` traps
+ // if the pointers are invalid even if the length is zero. `MEMCPY` gets
+ // extra code to handle this in the way that LLVM IR expects.
+ return DAG.getNode(
+ WebAssemblyISD::MEMCPY, DL, MVT::Other,
+ {Chain, MemIdx, MemIdx, Dst, Src, DAG.getZExtOrTrunc(Size, DL, LenMVT)});
}
SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemmove(
@@ -52,8 +56,13 @@ SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemset(
SDValue MemIdx = DAG.getConstant(0, DL, MVT::i32);
auto LenMVT = ST.hasAddr64() ? MVT::i64 : MVT::i32;
+
+ // Use `MEMSET` here instead of `MEMORY_FILL` because `memory.fill` traps
+ // if the pointers are invalid even if the length is zero. `MEMSET` gets
+ // extra code to handle this in the way that LLVM IR expects.
+ //
// Only low byte matters for val argument, so anyext the i8
- return DAG.getNode(WebAssemblyISD::MEMORY_FILL, DL, MVT::Other, Chain, MemIdx,
- Dst, DAG.getAnyExtOrTrunc(Val, DL, MVT::i32),
+ return DAG.getNode(WebAssemblyISD::MEMSET, DL, MVT::Other, Chain, MemIdx, Dst,
+ DAG.getAnyExtOrTrunc(Val, DL, MVT::i32),
DAG.getZExtOrTrunc(Size, DL, LenMVT));
}
diff --git a/llvm/test/CodeGen/WebAssembly/bulk-memory.ll b/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
index dc29dc81c13ec2..ae170d757a305a 100644
--- a/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
+++ b/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
@@ -17,7 +17,12 @@ declare void @llvm.memset.p0.i32(ptr, i8, i32, i1)
; CHECK-LABEL: memcpy_i8:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
call void @llvm.memcpy.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -27,7 +32,12 @@ define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
; CHECK-LABEL: memmove_i8:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
call void @llvm.memmove.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -37,7 +47,12 @@ define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
; CHECK-LABEL: memset_i8:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
call void @llvm.memset.p0.i8(ptr %dest, i8 %val, i8 %len, i1 0)
@@ -47,7 +62,12 @@ define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
; CHECK-LABEL: memcpy_i32:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_i32(ptr %dest, ptr %src, i32 %len) {
call void @llvm.memcpy.p0.p0.i32(ptr %dest, ptr %src, i32 %len, i1 0)
@@ -57,7 +77,12 @@ define void @memcpy_i32(ptr %dest, ptr %src, i32 %len) {
; CHECK-LABEL: memmove_i32:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_i32(ptr %dest, ptr %src, i32 %len) {
call void @llvm.memmove.p0.p0.i32(ptr %dest, ptr %src, i32 %len, i1 0)
@@ -67,7 +92,12 @@ define void @memmove_i32(ptr %dest, ptr %src, i32 %len) {
; CHECK-LABEL: memset_i32:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_i32(ptr %dest, i8 %val, i32 %len) {
call void @llvm.memset.p0.i32(ptr %dest, i8 %val, i32 %len, i1 0)
@@ -107,8 +137,14 @@ define void @memset_1(ptr %dest, i8 %val) {
; CHECK-LABEL: memcpy_1024:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L2]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_1024(ptr %dest, ptr %src) {
call void @llvm.memcpy.p0.p0.i32(ptr %dest, ptr %src, i32 1024, i1 0)
@@ -118,8 +154,14 @@ define void @memcpy_1024(ptr %dest, ptr %src) {
; CHECK-LABEL: memmove_1024:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L2]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_1024(ptr %dest, ptr %src) {
call void @llvm.memmove.p0.p0.i32(ptr %dest, ptr %src, i32 1024, i1 0)
@@ -129,8 +171,14 @@ define void @memmove_1024(ptr %dest, ptr %src) {
; CHECK-LABEL: memset_1024:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.fill 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: me...
[truncated]
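The `LowerMemcpy` and `LowerMemset` routines in the patch split one block into a CFG triangle around the real bulk-memory instruction. The following is a minimal sketch of that shape only. `Block`, `splitIntoTriangle`, and the instruction strings are hypothetical stand-ins invented for illustration; they are not LLVM APIs and do not model registers, PHIs, or kill flags.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine basic block (hypothetical, not LLVM's type).
struct Block {
  std::string name;
  std::vector<std::string> insts;
  std::vector<std::size_t> succs; // indices into the enclosing block list
};

// Replace the pseudo-instruction at blocks[bb].insts[at] with a triangle:
//   BB -> {True, Done}, True -> {Done}.
// Returns the index of the Done block.
std::size_t splitIntoTriangle(std::vector<Block> &blocks, std::size_t bb,
                              std::size_t at, const std::string &realOp) {
  assert(bb < blocks.size() && at < blocks[bb].insts.size());
  const std::size_t trueIdx = blocks.size();
  const std::size_t doneIdx = trueIdx + 1;

  Block &orig = blocks[bb];
  const auto pseudo = orig.insts.begin() + static_cast<std::ptrdiff_t>(at);

  // Done takes everything after the pseudo plus BB's old successor edges.
  Block done;
  done.name = "done";
  done.insts.assign(pseudo + 1, orig.insts.end());
  done.succs = orig.succs;

  // True holds only the real bulk-memory instruction and a branch to Done.
  Block taken;
  taken.name = "true";
  taken.insts.push_back(realOp);
  taken.insts.push_back("br done");
  taken.succs.push_back(doneIdx);

  // BB keeps what preceded the pseudo, then tests the length and branches
  // straight to Done when the length is zero.
  orig.insts.erase(pseudo, orig.insts.end());
  orig.insts.push_back("eqz len");
  orig.insts.push_back("br_if done");
  orig.succs = {trueIdx, doneIdx};

  // Appending may reallocate, so `orig` is not used past this point.
  blocks.push_back(std::move(taken));
  blocks.push_back(std::move(done));
  return doneIdx;
}
```

Calling `splitIntoTriangle(blocks, 0, 1, "memory.copy")` on a single block holding `local.get`, `MEMCPY`, `return` leaves `local.get`, `eqz len`, `br_if done` in the original block, puts `memory.copy` in the new "true" block, and moves `return` into the "done" block.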
def wasm_memory_copy_t : SDTypeProfile<0, 5,
[SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
>;
This seems identical to the existing `wasm_memcpy_t`. Can we reuse it in both places? The same for `wasm_memory_fill_t`.
Done
// Define copy/fill manually instead of using the `BulkMemoryOps` multiclass
// because when a multiclass defines opcodes, it gives them anonymous names
// and we need opcodes with names so that we can handle them with custom code.
Not sure what this means. We use all opcodes in code as `WebAssembly::INSTRNAME`, and most of them inherit from some multiclass..? For example, this is from `LowerFPToInt`:
llvm-project/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
Lines 481 to 487 in 27f66c9
unsigned Abs = Float64 ? WebAssembly::ABS_F64 : WebAssembly::ABS_F32;
unsigned FConst = Float64 ? WebAssembly::CONST_F64 : WebAssembly::CONST_F32;
unsigned LT = Float64 ? WebAssembly::LT_F64 : WebAssembly::LT_F32;
unsigned GE = Float64 ? WebAssembly::GE_F64 : WebAssembly::GE_F32;
unsigned IConst = Int64 ? WebAssembly::CONST_I64 : WebAssembly::CONST_I32;
unsigned Eqz = WebAssembly::EQZ_I32;
unsigned And = WebAssembly::AND_I32;
All instructions here inherit from some multiclass. For example, `ABS_F32` inherits from the multiclass `UNARY_FP`, and we can use their names fine:
defm ABS : UnaryFP<fabs, "abs ", 0x8b, 0x99>;
Am I missing something?
Ah, it appears to be due to the `defm : BulkMemoryOps<...>`, which makes the opcodes "anonymous", so tablegen gives them names such as `anonymous_9000MEMORY_COPY_A64` and `anonymous_9000MEMORY_FILL_A64`. If I change it to `defm MEMORY_ : BulkMemoryOps<...>` and then remove the `MEMORY_` prefix from all the names, then they come out as just `MEMORY_COPY_A64` and `MEMORY_FILL_A64`.
I've now updated the code to use multiclasses for all instructions that have 32/64 variants.
The tests don't seem to be working now
41141c8 to 40c4c71
The tests still don't seem to be passing.
…nges. WebAssembly's `memory.fill` and `memory.copy` instructions trap if the pointers are out of bounds, even if the length is zero. This is different from LLVM, which expects that it can call `memcpy` on arbitrary invalid pointers if the length is zero. To avoid spurious traps, branch around `memory.fill` and `memory.copy` when the length is zero.
e9d0707 to fa5ee93
Co-authored-by: Heejin Ahn <aheejin@gmail.com>
…nges. (llvm#112617) WebAssembly's `memory.fill` and `memory.copy` instructions trap if the pointers are out of bounds, even if the length is zero. This is different from LLVM, which expects that it can call `memcpy` on arbitrary invalid pointers if the length is zero. To avoid spurious traps, branch around `memory.fill` and `memory.copy` when the length is zero. --------- Co-authored-by: Heejin Ahn <aheejin@gmail.com>
WebAssembly's `memory.fill` and `memory.copy` instructions trap if the pointers are out of bounds, even if the length is zero. This is different from LLVM, which expects that it can call `memcpy` on arbitrary invalid pointers if the length is zero. To avoid spurious traps, branch around `memory.fill` and `memory.copy` when the length is zero.
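The mismatch described above can be illustrated with a toy model of a linear memory. This is a sketch under stated assumptions, not engine or LLVM code: `LinearMemory`, `memoryFill`, `memoryCopy`, `guardedFill`, `guardedCopy`, and `zeroLengthDemo` are hypothetical names, and a C++ exception stands in for a WebAssembly trap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Toy linear memory (hypothetical). A thrown exception models a trap.
struct LinearMemory {
  std::vector<std::uint8_t> bytes;

  explicit LinearMemory(std::size_t size) : bytes(size, 0) {}

  // The range check runs before any access, even when len == 0.
  void checkRange(std::uint64_t addr, std::uint64_t len) const {
    if (addr > bytes.size() || len > bytes.size() - addr)
      throw std::out_of_range("trap: out of bounds memory access");
  }

  // Models `memory.fill`: may trap on an empty range.
  void memoryFill(std::uint64_t dst, std::uint8_t val, std::uint64_t len) {
    checkRange(dst, len);
    if (len != 0)
      std::memset(bytes.data() + dst, val, static_cast<std::size_t>(len));
  }

  // Models `memory.copy`: may trap on an empty range; ranges may overlap.
  void memoryCopy(std::uint64_t dst, std::uint64_t src, std::uint64_t len) {
    checkRange(dst, len);
    checkRange(src, len);
    if (len != 0)
      std::memmove(bytes.data() + dst, bytes.data() + src,
                   static_cast<std::size_t>(len));
  }

  // Models the guarded form: skip the instruction when the length is zero.
  void guardedFill(std::uint64_t dst, std::uint8_t val, std::uint64_t len) {
    if (len == 0)
      return;
    memoryFill(dst, val, len);
  }

  void guardedCopy(std::uint64_t dst, std::uint64_t src, std::uint64_t len) {
    if (len == 0)
      return;
    memoryCopy(dst, src, len);
  }
};

// Usage: an out-of-bounds address with a zero length traps only when the
// call is unguarded.
void zeroLengthDemo() {
  LinearMemory mem(16);
  bool trapped = false;
  try {
    mem.memoryFill(100, 0xAA, 0);
  } catch (const std::out_of_range &) {
    trapped = true;
  }
  assert(trapped);
  mem.guardedFill(100, 0xAA, 0);  // skipped, no trap
  mem.guardedCopy(100, 200, 0);   // skipped, no trap
}
```

The guard changes behavior only for the zero-length case. A nonzero length with an out-of-bounds address still traps in both forms.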