Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Old EH] JS stack unwinding support corrupts some floats #22743

Open
hoodmane opened this issue Oct 15, 2024 · 17 comments
Open

[Old EH] JS stack unwinding support corrupts some floats #22743

hoodmane opened this issue Oct 15, 2024 · 17 comments

Comments

@hoodmane
Copy link
Contributor

hoodmane commented Oct 15, 2024

main.c++

#include <stdexcept>
#include <stdio.h>
using namespace std;
typedef unsigned int uint;
uint float_to_uint(float f);
float uint_to_float(uint i);

int main() {
    uint x = 4287976789;
    float f;
    try {
        f = uint_to_float(x);
    } catch(runtime_error& e) {
        printf("Caught runtime error...");
    }
    uint y = float_to_uint(f);
    printf("x: %ud, y: %ud, y - x: %ud\n", x, y, y - x);
    if (x != y) {
      printf("Oops!\n");
      return 1;
    }
    return 0;
}

helper.c++

typedef unsigned int uint;

typedef union { 
    uint i; 
    float f; 
} U; 

uint float_to_uint(float f) {
    U u;
    u.f = f;
    return u.i;
}

float uint_to_float(uint i) {
    U u;
    u.i = i;
    return u.f;
}

Running it

Compile and run with

em++ main.c++ -fexceptions -c && em++ helper.c++ -fexceptions -c &&  em++ main.o helper.o -fexceptions -o repro.js && node repro.js

it prints:

x: 4287976789d, y: 4292171093d, y - x: 4194304d
Oops!

On the other hand, if we compile and run with -fwasm-exceptions:

em++ main.c++ -fwasm-exceptions -c && em++ helper.c++ -fwasm-exceptions -c &&  em++ main.o helper.o -fwasm-exceptions -o repro.js && node repro.js

It prints the expected output:

x: 4287976789d, y: 4287976789d, y - x: 0d

Emscripten version

$ emcc -v
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.68 (ceee49d2ecdab36a3feb85a684f8e5a453dde910)
clang version 20.0.0git (https:/github.com/llvm/llvm-project 5cc64bf60bc04b9315de3c679eb753de4d554a8a)
Target: wasm32-unknown-emscripten
Thread model: posix
@hoodmane hoodmane changed the title JS stack unwinding corrupts some floats? JS stack unwinding support corrupts some floats? Oct 15, 2024
@hoodmane hoodmane changed the title JS stack unwinding support corrupts some floats? JS stack unwinding support corrupts some floats Oct 15, 2024
@kripken
Copy link
Member

kripken commented Oct 15, 2024

Hmm, the integer here is turned into a NaN by uint_to_float. And then the difference between old JS EH and new wasm EH is that the former goes through JS. Avoiding the try-catch around that call leads to the same behavior as new wasm EH, as nothing is thrown, so this is a bug in old EH (and not related to new EH).

The NaN gets canonicalized at the JS boundary, I believe (calling invoke_fi returns that NaN as a JS result) which is what leads to this issue. That is, we need a very particular pattern of NaN bits here, not the canonical NaN that JS (but not wasm) forces everything into.

We could in theory fix this by making invoke_fi return an integer and not a float directly, and we could do a reinterpret on the wasm side. That is, we could avoid sending floats through JS. I don't think that would be too much work, but OTOH this is for old EH, which is being phased out anyhow, so perhaps it isn't worth it?

@kripken kripken changed the title JS stack unwinding support corrupts some floats [Old EH] JS stack unwinding support corrupts some floats Oct 15, 2024
@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 16, 2024

Well anyone who uses rust is stuck with the old error handling until someone updates the rust try intrinsic to emit ir compatible with the new error handling:
#19612

@kripken
Copy link
Member

kripken commented Oct 16, 2024

@hoodmane is that an issue on the Rust side, or LLVM? (the link is to an Emscripten issue, but I'm not sure I understood from it where the underlying problem is)

In general, getting Rust to emit IR compatible with new wasm EH sounds very important for any Rust/C++ combination, Emscripten or otherwise. Is there a Rust issue for that?

With all that said, I wouldn't be opposed to fixing float invokes for old EH, using the approach I mentioned above, if someone has time for it.

@hoodmane
Copy link
Contributor Author

getting Rust to emit IR compatible with new wasm EH sounds very important

Yeah I would love to switch Pyodide to use the new wasm EH and this is the only blocker.

The issue is in the rust compiler. Specifically, the try intrinsic needs to be updated:
https://github.com/rust-lang/rust/blob/master/compiler/rustc_codegen_llvm/src/intrinsic.rs?plain=1#L971

@kleisauke
Copy link
Collaborator

The tracking issue for WebAssembly EH in Rust is at rust-lang/rust#118168. Looking at PR rust-lang/rust#111322, I'm unsure why it was conditionally excluded for Emscripten:
https://github.com/rust-lang/rust/blob/0037048da8ed19feef7ee1437f123a88a7342b33/compiler/rustc_codegen_ssa/src/base.rs#L389-L391

Could you try to remove the && sess.target.os != "emscripten" there? This would make it use codegen_wasm_try instead of codegen_emcc_try.
https://github.com/rust-lang/rust/blob/0037048da8ed19feef7ee1437f123a88a7342b33/compiler/rustc_codegen_llvm/src/intrinsic.rs#L666-L669

@kripken
Copy link
Member

kripken commented Oct 16, 2024

Thanks @kleisauke !

@walkingeyerobot fyi, the last comment has some details that might be relevant for future Rust/Emscripten work.

@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 16, 2024

Well using that is good progress. I can get it to link with the new codegen and the try intrinsic looks perfect, but there are problems elsewhere in rust/library/panic_unwind. First I have to change the argument to __cxa_begin_catch I think it's supposed to be the wasm exception argument, the rust code for old EH dereferences it once first. This hits all the right code paths but then I'm running into the problem that exception_header->adjustedPtr is null. I'm not sure who was supposed to set adjustedPtr, I'll have to look into it more.

@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 16, 2024

Looks like adjustedPtr was supposed to be filled in by _Unwind_CallPersonality, but as far as I can tell it never gets called. In ir, the catchpad looks like:

catchpad:
   %tok = catchpad within %cs [null]
   %ptr = call @llvm.wasm.get.exception(token %tok)
   %sel = call @llvm.wasm.get.ehselector(token %tok)
   call %catch_func(%data, %ptr)
   catchret from %tok to label %caught

Was @llvm.wasm.get.exception supposed to call it? The catch block in the generated wasm only has:

catch $tag0
      local.set $var4
      local.get $var3
      global.set $__stack_pointer
      local.get $var1
      local.get $var4
      local.get $var2
      call_indirect (param i32 i32)
      i32.const 1
      local.set $var5
      local.get $var5
end

and the call_indirect is call $catch_func.

@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 16, 2024

Indeed WasmEHPrepare is supposed to turn @llvm.wasm.get.exception into a call to _Unwind_CallPersonality:

WebAssembly exception handling uses Windows exception IR for the middle level
representation. This pass does the following transformation for every
catchpad block:
(In C-style pseudocode)

  • Before:
 catchpad ...
 exn = wasm.get.exception();
 selector = wasm.get.selector();
 ...
  • After:
  catchpad ...
  exn = wasm.catch(WebAssembly::CPP_EXCEPTION);
  // Only add below in case it's not a single catch (...)
  wasm.landingpad.index(index);
  __wasm_lpad_context.lpad_index = index;
  __wasm_lpad_context.lsda = wasm.lsda();
  _Unwind_CallPersonality(exn);
  selector = __wasm_lpad_context.selector;
  ...

https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/WasmEHPrepare.cpp?plain=1#L13-L33

I wonder if I can get llvm to print the result of this pass...

@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 17, 2024

After the WasmEHPrepare pass the ir looks like:

catchpad:                                         ; preds = %catchswitch
  %catchpad2 = catchpad within %catchswitch1 [ptr null]
  %exn = call ptr @llvm.wasm.catch(i32 0)
  call void %2(ptr %1, ptr %exn) [ "funclet"(token %catchpad2) ]
  catchret from %catchpad2 to label %caught

It seems we took that !NeedPersonality branch in the pass:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/WasmEHPrepare.cpp?plain=1#L342-L349

  // In case it is a catchpad with single catch (...) or a cleanuppad, we don't
  // need to call personality function because we don't need a selector.
  if (!NeedPersonality) {
    if (GetSelectorCI) {
      assert(GetSelectorCI->use_empty() &&
             "wasm.get.ehselector() still has uses!");
      GetSelectorCI->eraseFromParent();
    }
    return;
  }

So as the comment says, this would normally happen in C++ when there is a blanket catch (...) and so the return value of __cxa_begin_catch is unused. Rust only has one llvm catch block to handle all the different cases, but it wants to know what exception was thrown. I think the simplest solution for this is for Rust to explicitly call the personality function in this case, at least if that doesn't make the linker angry...

@hoodmane
Copy link
Contributor Author

Adding the call to _Unwind_CallPersonality doesn't help because _Unwind_GetLanguageSpecificData returns null.

@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 17, 2024

Right, the ldsa would be initialized in the ir injected by WasmEHPrepare if there were multiple catch blocks:

__wasm_lpad_context.lpad_index = index;
__wasm_lpad_context.lsda = wasm.lsda();

@kleisauke
Copy link
Collaborator

Have you already tried using rust/library/panic_unwind/src/gcc.rs instead of the rust/library/panic_unwind/src/emcc.rs implementation?
https://github.com/rust-lang/rust/blob/06d261daf62620e3449ba451898c7ad9ae750f41/library/panic_unwind/src/lib.rs#L36-L38

The former implementation is also used by target_family = "wasm".

@hoodmane
Copy link
Contributor Author

Nope, trying that now thanks.

@kleisauke
Copy link
Collaborator

FWIW, you may also need to adjust the following:
https://github.com/rust-lang/rust/blob/06d261daf62620e3449ba451898c7ad9ae750f41/library/unwind/src/lib.rs#L7-L10

--- a/library/unwind/src/lib.rs
+++ b/library/unwind/src/lib.rs
@@ -4,10 +4,7 @@
 #![feature(staged_api)]
 #![feature(strict_provenance)]
 #![cfg_attr(not(target_env = "msvc"), feature(libc))]
-#![cfg_attr(
-    all(target_family = "wasm", not(target_os = "emscripten")),
-    feature(simd_wasm64, wasm_exception_handling_intrinsics)
-)]
+#![cfg_attr(target_family = "wasm", feature(simd_wasm64, wasm_exception_handling_intrinsics))]
 #![allow(internal_features)]
 
 // Force libc to be included even if unused. This is required by many platforms.

@hoodmane
Copy link
Contributor Author

Yup that works. Only missing thing is debug info. Emscripten has this code to attach more information to the error in debug releases:
https://github.com/emscripten-core/emscripten/blob/main/system/lib/libcxxabi/src/cxa_exception.cpp?plain=1#L299-L307
Ideally Emscripten should probably move that into _Unwind_RaiseException().

@hoodmane
Copy link
Contributor Author

Well it seems like __throw_exception_with_stack_trace fails on rust exceptions so that is a TODO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants