Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WebAssembly] zero-length memcpy does not result in no-op when bulk-memory is enabled #63755

Closed
Luukdegram opened this issue Jul 8, 2023 · 14 comments

Comments

@Luukdegram
Copy link

Luukdegram commented Jul 8, 2023

According to the LLVM Language Reference:

If <len> is 0, it is no-op modulo the behavior of attributes attached to the arguments. If <len> is not a well-defined value, the behavior is undefined. If <len> is not zero, both <dest> and <src> should be well-defined, otherwise the behavior is undefined.

performing a memory copy with a zero-length results in a no-op module. This means I'd expect for the WebAssembly target with the bulk-memory feature enabled, to result in a no-op when the length is 0. However, this seems to emit a memory.copy instruction with no checks, meaning it can result in a trap when the destination address is out-of-bounds.

i.e. given the following IR:

; Function Attrs: noredzone nounwind
define dso_local void @_start() #0 !dbg !256 {
Entry:
  %0 = alloca { ptr, i32 }, align 4
  %1 = alloca { ptr, i32 }, align 4
  %2 = call fastcc { ptr, i32 } @foo.foo(), !dbg !265
  store { ptr, i32 } %2, ptr %1, align 4, !dbg !265
  call void @llvm.dbg.declare(metadata ptr %1, metadata !262, metadata !DIExpression()), !dbg !265
  %3 = call fastcc { ptr, i32 } @foo.foo(), !dbg !266
  store { ptr, i32 } %3, ptr %0, align 4, !dbg !266
  call void @llvm.dbg.declare(metadata ptr %0, metadata !264, metadata !DIExpression()), !dbg !266
  %4 = load { ptr, i32 }, ptr %0, align 4, !dbg !267
  %5 = load { ptr, i32 }, ptr %1, align 4, !dbg !267
  %6 = extractvalue { ptr, i32 } %5, 0, !dbg !267
  %7 = extractvalue { ptr, i32 } %4, 1, !dbg !267
  %8 = extractvalue { ptr, i32 } %4, 0, !dbg !267
  call void @llvm.memcpy.p0.p0.i32(ptr align 1 %8, ptr align 1 %6, i32 %7, i1 false), !dbg !267
  ret void, !dbg !267
}

; Function Attrs: noredzone nounwind
define internal fastcc { ptr, i32 } @foo.foo() unnamed_addr #0 !dbg !268 {
Entry:
  %0 = alloca i32, align 4
  store i32 -1, ptr %0, align 4, !dbg !274
  call void @llvm.dbg.declare(metadata ptr %0, metadata !272, metadata !DIExpression()), !dbg !274
  ret { ptr, i32 } { ptr inttoptr (i32 -1 to ptr), i32 0 }, !dbg !275
}

Will emit a memory.copy instruction although the length of the operand is 0. In this case, the destination address is out-of-bounds, therefore resulting in a trap. According to the WebAssembly specification any out-of-bounds memory operation will result in a trap, regardless of the fact we have a zero-length operand. This is not the behavior I'd expect according to the LLVM Language Reference.
This behavior only occurs when the bulk-memory feature is enabled as without this feature, we don't have access to the above mentioned instruction and LLVM will emit a loop instead (Which is safe for zero-length copies).

This issue also occurs for the llvm.memset intrinsic.

@llvmbot
Copy link
Member

llvmbot commented Jul 8, 2023

@llvm/issue-subscribers-backend-webassembly

@tlively
Copy link
Collaborator

tlively commented Jul 10, 2023

Hmm, I guess a proper fix here would require lowering the LLVM intrinsic to a length check + bulk memory instruction instead of just a bulk memory instruction. That seems unfortunate.

Alternative fixes would be to relax either the LLVM or WebAssembly semantics. I filed WebAssembly/design#1482 to discuss this on the WebAssembly side, but given the high cost of changing the WebAssembly spec, it might be easier to change the LLVM semantics to allow trapping on OOB accesses even if the length is zero.

@Luukdegram
Copy link
Author

Thank you for filing the ticket on the WebAssembly side, I'll follow the discussion closely.

@tlively
Copy link
Collaborator

tlively commented Jul 13, 2023

We decided not to change the WebAssembly semantics, so I will take a look at fixing this in the code unless someone chimes in who is interested in trying to get the LLVM semantics relaxed.

@nikic
Copy link
Contributor

nikic commented Jul 13, 2023

Changing the LLVM semantics is not feasible. You will have to emit the check in the wasm backend.

@dwblaikie
Copy link
Collaborator

Isn't the same true for the standard memcpy? So this webassembly situation could probably reuse however we handle ::memcpy on *nix targets?

@tlively
Copy link
Collaborator

tlively commented Jul 19, 2023

@dwblaikie, can you elaborate? Are you talking about target OSes when compiling C++ std::memcpy? Where is the handling you're referring to?

@dwblaikie
Copy link
Collaborator

Hmm, can't quite reproduce/tickle what I thought I was. But I thought standard memcpy (std::memcpy or global memcpy in C) was UB if the input or output is invalid, even if len == 0 (something like that, but maybe I'm misremembering - but I'm guessing the WebAssembly choice came from this choice in standard memcpy).

So I would've thought that LLVM would already have some code for non-WebAssembly targets to lower memcpy LLVM intrinsic to conditional memcpy call, and WA could use the same thing...

@tlively
Copy link
Collaborator

tlively commented Jul 19, 2023

Yes, cppreference at least does explicitly say an OOB read of length zero is UB. But maybe backends that emit memcpy calls are piercing that abstraction and emitting memcpy calls directly anyway, knowing that nothing bad will happen?

@dwblaikie
Copy link
Collaborator

But maybe backends that emit memcpy calls are piercing that abstraction and emitting memcpy calls directly anyway, knowing that nothing bad will happen?

Perhaps, on platforms it knows that it'll be OK? Maybe there are platforms it doesn't know about the memcpy implementation details it'll still do the right thing and add a conditional for safety?

@dschuff
Copy link
Member

dschuff commented Oct 7, 2024

@sbc100 @aheejin we discussed today what we'd need to do to push some of the new features on-by-default.
This is one we should probably fix before enabling bulk memory.

@aheejin
Copy link
Member

aheejin commented Oct 10, 2024

@dschuff What are we supposed to fix here? If it's UB, doesn't that mean the current status quo is fine? Or, because the behavior is different from when bulk memory is disabled, it could be confusing, so you'd like it to be a no-op?

@nikic
Copy link
Contributor

nikic commented Oct 10, 2024

@aheejin llvm.memcpy(a, b, len) needs to be lowered to if (len != 0) { memory.copy(a, b, len); } if len cannot be proven to be non-zero. It is not UB in LLVM IR (and also C2y, but this isn't really relevant as far as the Wasm backend is concerned).

@kleisauke
Copy link

I think this was fixed via PR #112617.

@nikic nikic closed this as completed Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants