Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wasm] WebAssembly CodeGen disables PIC #7796

Open
derek-gerstmann opened this issue Aug 22, 2023 · 26 comments
Open

[wasm] WebAssembly CodeGen disables PIC #7796

derek-gerstmann opened this issue Aug 22, 2023 · 26 comments
Assignees

Comments

@derek-gerstmann
Copy link
Contributor

derek-gerstmann commented Aug 22, 2023

It appears that the WebAssembly CodeGen explicitly disables PIC, which means that objects and modules being generated for wasm cannot be linked into a shared object and dynamically loaded (eg emcc -s SIDE_MODULE=2).

Is this necessary?

b292ab5dae (Steven Johnson 2019-03-15 11:33:32 -0700 364) bool CodeGen_WebAssembly::use_pic() const {
b292ab5dae (Steven Johnson 2019-03-15 11:33:32 -0700 365)     return false;
b292ab5dae (Steven Johnson 2019-03-15 11:33:32 -0700 366) }
@steven-johnson
Copy link
Contributor

...I don't remember :-)

Try changing it and see what breaks?

@derek-gerstmann
Copy link
Contributor Author

Okay, will do ... lemme see if things go boom.

@derek-gerstmann
Copy link
Contributor Author

derek-gerstmann commented Aug 23, 2023

Just enabling the flag doesn't seem to be enough to enable relocation ... the LLVM module gets created with the correct halide_use_pic flag, and PIC level flags, but I still get errors emitted from wasm-ld:

wasm-ld: error: relocation R_WASM_MEMORY_ADDR_SLEB cannot be used against symbol.Lstr; recompile with -fPIC

There may be some other LLVM feature flags that may need to get set to make this work. I'll keep investigating.

@derek-gerstmann
Copy link
Contributor Author

This seems relevant:
https://bugs.llvm.org/show_bug.cgi?id=42714

Which would imply that the triple flag needs to have the os set to emscripten to allow dynamic linking.

@sbc100
Copy link

sbc100 commented Aug 23, 2023

I believe that bug was fixed in https://reviews.llvm.org/D153293. I just closes the corresponding github issue: llvm/llvm-project#42059

@sbc100
Copy link

sbc100 commented Aug 23, 2023

Note that if you want you could unconditionally enable PIC code generation. It does at a little bloat to the object files, due to the extra indirection, but when linked into a static binary 100% of this can be removed by wasm-opt in release builds.

@steven-johnson
Copy link
Contributor

Note that if you want you could unconditionally enable PIC code generation. It does at a little bloat to the object files, due to the extra indirection, but when linked into a static binary 100% of this can be removed by wasm-opt in release builds.

That sounds like the simplest option, TBH -- is there a reason this isn't done in general? (Is wasm-opt not usually in use, or is the bloat more than "a little"?)

(Thanks for commenting, BTW, didn't realize you were watching these issues/PRs :-)

@sbc100
Copy link

sbc100 commented Aug 23, 2023

I was pointed here by an emscripten who uses halide

@sbc100
Copy link

sbc100 commented Aug 23, 2023

Maybe you can just remove the bool use_pic() const override; override and take the PIC setting from the command line?

@slomp
Copy link
Contributor

slomp commented Aug 23, 2023

Here's an idea: could we just always set PIC everywhere, and then have a Halide target flag (e.g., -no_pic) to disable it?

@steven-johnson
Copy link
Contributor

PIC setting from the command line?

It not quite that simple, alas, since Halide codegen isn't always (or even usually) directly invoked from the command line, but rather, indirectly from build rules.

Here's an idea: could we just always set PIC everywhere, and then have a Halide target flag (e.g., -no_pic) to disable it?

That would probably be a much more tractable approach. It looks like we (currently) default to using PIC for all LLVM-based backends, with no way to override them; WebAssembly is the only backend that changes this (to false, also with no way to override it.)

So, strawman:

  • remove the override so that all LLVM backends default to PIC
  • add no_pic Target Feature that flips this setting (for all all LLVM backends, because, why not?)
  • Feature has no effect on non-LLVM backends
    ?

@derek-gerstmann
Copy link
Contributor Author

derek-gerstmann commented Aug 23, 2023

Thanks very much for the reply! It appears just setting use_pic() isn’t enough … we still get linking errors when trying to use the static libraries or object files create by Halide in a shared object created by emcc.

In the link you provided it appears that the target triple includes wasm32-wasi … is wasi required for dynamic linking?

@steven-johnson
Copy link
Contributor

we still get linking errors

OK, we should clearly hold off until we know the entire scope of the issue. What are the linking errors?

@steven-johnson
Copy link
Contributor

I believe that bug was fixed in https://reviews.llvm.org/D153293. I just closes the corresponding github issue: llvm/llvm-project#42059

A bit of a threadjack here, but: it's been a minute since I've done more than minor fixes to our wasm backend... which was written when WASI was still mostly just a proposal. Looking at the link, it seems likely that upgrading our codegen to be WASI-compliant (if needed) would be highly desirable (to allow for easier use in a runtime env that doesn't include JS support)... and I don't know anything at all about the Component Model proposal.

Unfortunately, our team is fairly understaffed right now, and there's ~no way that I'm likely to get any bandwidth to look into this anytime soon (unless my manager(s) make it a priority for me), but the Halide team would absolutely welcome any community contribution that could be made here.

@derek-gerstmann
Copy link
Contributor Author

we still get linking errors

OK, we should clearly hold off until we know the entire scope of the issue. What are the linking errors?

wasm-ld: error: relocation R_WASM_MEMORY_ADDR_SLEB cannot be used against symbol.Lstr; recompile with -fPIC

@sbc100
Copy link

sbc100 commented Aug 23, 2023

@steven-johnson. Note that both wasi and the component model as still very early in the standard process. Its likely worth supporting the different target triples but I think that is all you would need to do. I would imagine you could even target wasm32-unknown-unknown and you object files should work with both wasi-sdk and emscripten (I think there was one ABI difference remaining, but there are efforts to remove that too).

@derek-gerstmann
Copy link
Contributor Author

I attempted to override the llvm::triple inside of LLVM_Runtime_Linker.cpp:

} else if (target.arch == Target::WebAssembly) {

I tried wasm32-unknown-unkown and wasm32-unknown-wasi, but this didn't seem to help. Is there a different format necessary for wasm static libraries or object files that's needed?

@slomp
Copy link
Contributor

slomp commented Aug 23, 2023

@sbc100
Copy link

sbc100 commented Aug 23, 2023

I attempted to override the llvm::triple inside of LLVM_Runtime_Linker.cpp:

} else if (target.arch == Target::WebAssembly) {

I tried wasm32-unknown-unkown and wasm32-unknown-wasi, but this didn't seem to help. Is there a different format necessary for wasm static libraries or object files that's needed?

Not a different format, no. But the relocation types used are different. In PIC code generatation you should shouldn't get any R_WASM_MEMORY_ADDR_SLEB relocations being generated. Hence the error message telling you to recompile with -fPIC. So I guess that halide compiler is doing something a little different to the clang here. I don't think it will be hard for someone with some knowledge of hadlide to track down and fix.

@derek-gerstmann
Copy link
Contributor Author

Okay, good to know. It looks like the generated object files can be inspected with wabt. From the HelloWasm app:

> wasm-objdump -x ./bin/wasm/reaction_diffusion_init.o

reaction_diffusion_init.o:	file format wasm 0x1

Section Details:

Type[6]:
 - type[0] (i32) -> i32
 - type[1] (i32, i32) -> i32
 - type[2] (i32, i32, i32, i32) -> i32
 - type[3] (i32, i32, i32, i32, i32) -> i32
 - type[4] (i32, i32, i64, i64) -> i32
 - type[5] () -> i32
Import[11]:
 - memory[0] pages: initial=1 <- env.__linear_memory
 - global[0] i32 mutable=1 <- env.__stack_pointer
 - func[0] sig=1 <env.halide_error_buffer_argument_is_null> <- env.halide_error_buffer_argument_is_null
 - func[1] sig=2 <env.halide_error_bad_type> <- env.halide_error_bad_type
 - func[2] sig=2 <env.halide_error_bad_dimensions> <- env.halide_error_bad_dimensions
 - func[3] sig=2 <env.halide_error_buffer_extents_negative> <- env.halide_error_buffer_extents_negative
 - func[4] sig=3 <env.halide_error_constraint_violated> <- env.halide_error_constraint_violated
 - func[5] sig=4 <env.halide_error_buffer_allocation_too_large> <- env.halide_error_buffer_allocation_too_large
 - func[6] sig=4 <env.halide_error_buffer_extents_too_large> <- env.halide_error_buffer_extents_too_large
 - func[7] sig=1 <env.halide_error_device_dirty_with_no_device_support> <- env.halide_error_device_dirty_with_no_device_support
 - func[8] sig=1 <env.halide_error_host_is_null> <- env.halide_error_host_is_null
Function[3]:
 - func[9] sig=0 <reaction_diffusion_init>
 - func[10] sig=0 <reaction_diffusion_init_argv>
 - func[11] sig=5 <reaction_diffusion_init_metadata>
DataCount:
 - data count: 9
Code[3]:
 - func[9] size=1697 <reaction_diffusion_init>
 - func[10] size=13 <reaction_diffusion_init_argv>
 - func[11] size=8 <reaction_diffusion_init_metadata>
Data[9]:
 - segment[0] <.rodata..L__unnamed_1> memory=0 size=24 - init i32=0
  - 0000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  - 0000010: 0000 0000 0000 0000                      ........
 - segment[1] <.rodata..Lstr> memory=0 size=7 - init i32=32
  - 0000020: 6f75 7470 7574 00                        output.
 - segment[2] <.rodata..L__unnamed_2> memory=0 size=36 - init i32=48
  - 0000030: 2000 0000 0200 0000 0300 0000 0220 0100   ............ ..
  - 0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
  - 0000050: 0000 0000                                ....
 - segment[3] <.rodata..Lstr.4> memory=0 size=26 - init i32=96
  - 0000060: 7761 736d 2d33 322d 7761 736d 7274 2d6e  wasm-32-wasmrt-n
  - 0000070: 6f5f 7275 6e74 696d 6500                 o_runtime.
 - segment[4] <.rodata..Lstr.5> memory=0 size=24 - init i32=128
  - 0000080: 7265 6163 7469 6f6e 5f64 6966 6675 7369  reaction_diffusi
  - 0000090: 6f6e 5f69 6e69 7400                      on_init.
 - segment[5] <.rodata..Lreaction_diffusion_init_metadata_storage> memory=0 size=20 - init i32=160
  - 00000a0: 0100 0000 0100 0000 3000 0000 6000 0000  ........0...`...
  - 00000b0: 8000 0000                                ....
 - segment[6] <.rodata..Lstr.6> memory=0 size=21 - init i32=192
  - 00000c0: 4f75 7470 7574 2062 7566 6665 7220 6f75  Output buffer ou
  - 00000d0: 7470 7574 00                             tput.
 - segment[7] <.rodata..Lstr.7> memory=0 size=16 - init i32=224
  - 00000e0: 6f75 7470 7574 2e73 7472 6964 652e 3000  output.stride.0.
 - segment[8] <.rodata..Lstr.8> memory=0 size=2 - init i32=256
  - 0000100: 3100                                     1.
Custom:
 - name: "linking"
  - symbol table [count=22]
   - 0: F <reaction_diffusion_init> func=9 [ binding=global vis=default ]
   - 1: G <env.__stack_pointer> global=0 [ undefined binding=global vis=default ]
   - 2: D <.Lstr> segment=1 offset=0 size=7 [ binding=local vis=default ]
   - 3: F <env.halide_error_buffer_argument_is_null> func=0 [ undefined binding=global vis=default ]
   - 4: D <.Lstr.6> segment=6 offset=0 size=21 [ binding=local vis=default ]
   - 5: F <env.halide_error_bad_type> func=1 [ undefined binding=global vis=default ]
   - 6: F <env.halide_error_bad_dimensions> func=2 [ undefined binding=global vis=default ]
   - 7: F <env.halide_error_buffer_extents_negative> func=3 [ undefined binding=global vis=default ]
   - 8: D <.Lstr.7> segment=7 offset=0 size=16 [ binding=local vis=default ]
   - 9: D <.Lstr.8> segment=8 offset=0 size=2 [ binding=local vis=default ]
   - 10: F <env.halide_error_constraint_violated> func=4 [ undefined binding=global vis=default ]
   - 11: F <env.halide_error_buffer_allocation_too_large> func=5 [ undefined binding=global vis=default ]
   - 12: F <env.halide_error_buffer_extents_too_large> func=6 [ undefined binding=global vis=default ]
   - 13: F <env.halide_error_device_dirty_with_no_device_support> func=7 [ undefined binding=global vis=default ]
   - 14: F <env.halide_error_host_is_null> func=8 [ undefined binding=global vis=default ]
   - 15: F <reaction_diffusion_init_argv> func=10 [ binding=global vis=default ]
   - 16: F <reaction_diffusion_init_metadata> func=11 [ binding=global vis=default ]
   - 17: D <.Lreaction_diffusion_init_metadata_storage> segment=5 offset=0 size=20 [ binding=local vis=default ]
   - 18: D <.L__unnamed_1> segment=0 offset=0 size=24 [ binding=local vis=default ]
   - 19: D <.L__unnamed_2> segment=2 offset=0 size=36 [ binding=local vis=default ]
   - 20: D <.Lstr.4> segment=3 offset=0 size=26 [ binding=local vis=default ]
   - 21: D <.Lstr.5> segment=4 offset=0 size=24 [ binding=local vis=default ]
  - segment info [count=9]
   - 0: .rodata..L__unnamed_1 p2align=4 [ ]
   - 1: .rodata..Lstr p2align=5 [ ]
   - 2: .rodata..L__unnamed_2 p2align=4 [ ]
   - 3: .rodata..Lstr.4 p2align=5 [ ]
   - 4: .rodata..Lstr.5 p2align=5 [ ]
   - 5: .rodata..Lreaction_diffusion_init_metadata_storage p2align=4 [ ]
   - 6: .rodata..Lstr.6 p2align=5 [ ]
   - 7: .rodata..Lstr.7 p2align=5 [ ]
   - 8: .rodata..Lstr.8 p2align=5 [ ]
Custom:
 - name: "reloc.CODE"
  - relocations for section: 4 (Code) [34]
   - R_WASM_GLOBAL_INDEX_LEB offset=0x00000f(file=0x0001ff) symbol=1 <env.__stack_pointer>
   - R_WASM_GLOBAL_INDEX_LEB offset=0x00001a(file=0x00020a) symbol=1 <env.__stack_pointer>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x00002a(file=0x00021a) symbol=2 <.Lstr>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x000030(file=0x000220) symbol=3 <env.halide_error_buffer_argument_is_null>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x000298(file=0x000488) symbol=4 <.Lstr.6>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0002a4(file=0x000494) symbol=5 <env.halide_error_bad_type>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x0002b1(file=0x0004a1) symbol=4 <.Lstr.6>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0002bb(file=0x0004ab) symbol=6 <env.halide_error_bad_dimensions>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x0002c8(file=0x0004b8) symbol=4 <.Lstr.6>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0002d2(file=0x0004c2) symbol=7 <env.halide_error_buffer_extents_negative>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x0002df(file=0x0004cf) symbol=4 <.Lstr.6>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0002e9(file=0x0004d9) symbol=7 <env.halide_error_buffer_extents_negative>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x0002f6(file=0x0004e6) symbol=4 <.Lstr.6>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x000300(file=0x0004f0) symbol=7 <env.halide_error_buffer_extents_negative>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x00030d(file=0x0004fd) symbol=8 <.Lstr.7>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x000315(file=0x000505) symbol=9 <.Lstr.8>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x00031d(file=0x00050d) symbol=10 <env.halide_error_constraint_violated>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x00032a(file=0x00051a) symbol=2 <.Lstr>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x000338(file=0x000528) symbol=11 <env.halide_error_buffer_allocation_too_large>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x000345(file=0x000535) symbol=2 <.Lstr>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x000353(file=0x000543) symbol=11 <env.halide_error_buffer_allocation_too_large>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x000360(file=0x000550) symbol=2 <.Lstr>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x00036e(file=0x00055e) symbol=12 <env.halide_error_buffer_extents_too_large>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x00037b(file=0x00056b) symbol=2 <.Lstr>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x000389(file=0x000579) symbol=11 <env.halide_error_buffer_allocation_too_large>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x000396(file=0x000586) symbol=2 <.Lstr>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0003a4(file=0x000594) symbol=12 <env.halide_error_buffer_extents_too_large>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x0003b1(file=0x0005a1) symbol=4 <.Lstr.6>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0003b7(file=0x0005a7) symbol=13 <env.halide_error_device_dirty_with_no_device_support>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x0003c4(file=0x0005b4) symbol=4 <.Lstr.6>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0003ca(file=0x0005ba) symbol=14 <env.halide_error_host_is_null>
   - R_WASM_GLOBAL_INDEX_LEB offset=0x00069c(file=0x00088c) symbol=1 <env.__stack_pointer>
   - R_WASM_FUNCTION_INDEX_LEB offset=0x0006ac(file=0x00089c) symbol=0 <reaction_diffusion_init>
   - R_WASM_MEMORY_ADDR_SLEB offset=0x0006b5(file=0x0008a5) symbol=17 <.Lreaction_diffusion_init_metadata_storage>
Custom:
 - name: "reloc.DATA"
  - relocations for section: 5 (Data) [5]
   - R_WASM_MEMORY_ADDR_I32 offset=0x00002f(file=0x0008e0) symbol=2 <.Lstr>
   - R_WASM_MEMORY_ADDR_I32 offset=0x00004f(file=0x000900) symbol=18 <.L__unnamed_1>
   - R_WASM_MEMORY_ADDR_I32 offset=0x00009f(file=0x000950) symbol=19 <.L__unnamed_2>
   - R_WASM_MEMORY_ADDR_I32 offset=0x0000a3(file=0x000954) symbol=20 <.Lstr.4>
   - R_WASM_MEMORY_ADDR_I32 offset=0x0000a7(file=0x000958) symbol=21 <.Lstr.5>
Custom:
 - name: "producers"

@derek-gerstmann
Copy link
Contributor Author

As @sbc100 indicated, the above object file contains R_WASM_MEMORY_ADDR_SLEB entries which shouldn't exist if PIC was enabled. I've scrutinized the WebAssembly clang driver and noticed that "mutable-globals" appear to be required for the linker:

https://github.com/llvm/llvm-project/blob/dfc759929644ed1ea3224ab30e1086f7acc60da6/clang/lib/Driver/ToolChains/WebAssembly.cpp#L276

So I added this to Halide's CodeGen_WebAssembly::mattrs() so that emits "+mutable-globals" and verified this shows up in the generated object files:

Custom:
 - name: "target_features"
  - [+] mutable-globals

But the linker still complains and we still have R_WASM_MEMORY_ADDR_SLEB symbols being generated.

I'll keep digging ....

@derek-gerstmann
Copy link
Contributor Author

derek-gerstmann commented Aug 23, 2023

I haven't found anything concrete ... my guess is we may be missing one or more LLVM passes that may transform the module into a PIC compatible layout for wasm and/or are missing attributes on global values or functions (ie dso_local) that would allow them to get stored correctly.

We don't use the standard llvm PassBuilder::buildModuleOptimizationPipeline() optimizations so things like RelLookupTableConverterPass don't get run.

Anyone have any other ideas? @steven-johnson @abadams @sbc100

@sbc100
Copy link

sbc100 commented Aug 23, 2023

The relevant code is in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp.

You can see that LowerGlobalAddress produces different code if isPositionIndependent is true. You might want to check that isPositionIndependent is returning true here?

@sbc100
Copy link

sbc100 commented Aug 23, 2023

@derek-gerstmann, also are you using llvm tip-of-tree? Because llvm/llvm-project#42059 only recently landed.

@derek-gerstmann
Copy link
Contributor Author

Ah, thanks! So far I've been testing with LLVM releases 15-17. I'll pull llvm from main and rebuild.

@derek-gerstmann
Copy link
Contributor Author

derek-gerstmann commented Aug 24, 2023

Okay, fix confirmed. Testing against LLVM main (18.x), with the following changes to Halide enabled me to produce wasm object files that could be relocated with emcc:

@@ -362,7 +370,7 @@ bool CodeGen_WebAssembly::use_soft_float_abi() const {
 }
 
 bool CodeGen_WebAssembly::use_pic() const {
-    return false;
+    return true;
 }
 

+++ b/src/CodeGen_WebAssembly.cpp
@@ -344,6 +344,14 @@ string CodeGen_WebAssembly::mattrs() const {
         sep = ",";
     }
 
+    // PIC implies +mutable-globals because the PIC ABI used by the linker
+    // depends on importing and exporting mutable globals. Also -pthread implies
+    // mutable-globals too, so quitely enable it if either of these are specified.
+    if(use_pic() || target.has_feature(Target::WasmThreads)) {
+        s << sep << "+mutable-globals";
+        sep = ",";
+    }
+

Optional, but likely needed (based on comments from LLVM devs):

@@ -1151,6 +1156,16 @@ void CodeGen_LLVM::optimize_module() {
     using OptimizationLevel = llvm::OptimizationLevel;
     OptimizationLevel level = OptimizationLevel::O3;
 
+#if LLVM_VERSION >= 180
+    if(tm->isPositionIndependent()) {
+        pb.registerOptimizerLastEPCallback(
+            [&](ModulePassManager &mpm, OptimizationLevel level) {
+                mpm.addPass(RelLookupTableConverterPass());        
+            });
+    }
+#endif
+

Symbols in the wasm object files now appear as R_WASM_MEMORY_ADDR_REL_SLEB rather than absolute offsets via R_WASM_MEMORY_ADDR_SLEB.

I'll open a PR, and we can iron out the details of the proposed strawman. Thanks everyone! @slomp @sbc100 @steven-johnson

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants