[WebAssembly] Change the default linker for wasm32-wasip2 #84569

Merged
merged 2 commits into from
Mar 19, 2024

Conversation

alexcrichton
Contributor

This commit changes the default linker in the WebAssembly toolchain for the wasm32-wasip2 target. This target is being added to the WebAssembly/wasi-sdk and WebAssembly/wasi-libc projects to target the Component Model by default, in contrast with the preexisting wasm32-wasi target (in the process of being renamed to wasm32-wasip1) which outputs a core WebAssembly module by default.

The wasm-component-ld project currently lives in my GitHub account at https://github.com/alexcrichton/wasm-component-ld and isn't necessarily "official" yet, but it's expected to continue to evolve as the wasm32-wasip2 target continues to shape up and evolve.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Mar 8, 2024
@llvmbot
Member

llvmbot commented Mar 8, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Alex Crichton (alexcrichton)

Full diff: https://github.com/llvm/llvm-project/pull/84569.diff

3 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/WebAssembly.cpp (+6)
  • (modified) clang/lib/Driver/ToolChains/WebAssembly.h (+1-1)
  • (modified) clang/test/Driver/wasm-toolchain.c (+8)
diff --git a/clang/lib/Driver/ToolChains/WebAssembly.cpp b/clang/lib/Driver/ToolChains/WebAssembly.cpp
index b8c2573d6265fb..a6c43c627f7206 100644
--- a/clang/lib/Driver/ToolChains/WebAssembly.cpp
+++ b/clang/lib/Driver/ToolChains/WebAssembly.cpp
@@ -221,6 +221,12 @@ WebAssembly::WebAssembly(const Driver &D, const llvm::Triple &Triple,
   }
 }
 
+const char *WebAssembly::getDefaultLinker() const {
+  if (getOS() == "wasip2")
+    return "wasm-component-ld";
+  return "wasm-ld";
+}
+
 bool WebAssembly::IsMathErrnoDefault() const { return false; }
 
 bool WebAssembly::IsObjCNonFragileABIDefault() const { return true; }
diff --git a/clang/lib/Driver/ToolChains/WebAssembly.h b/clang/lib/Driver/ToolChains/WebAssembly.h
index ae60f464c10818..76e0ca39bd748d 100644
--- a/clang/lib/Driver/ToolChains/WebAssembly.h
+++ b/clang/lib/Driver/ToolChains/WebAssembly.h
@@ -67,7 +67,7 @@ class LLVM_LIBRARY_VISIBILITY WebAssembly final : public ToolChain {
                            llvm::opt::ArgStringList &CmdArgs) const override;
   SanitizerMask getSupportedSanitizers() const override;
 
-  const char *getDefaultLinker() const override { return "wasm-ld"; }
+  const char *getDefaultLinker() const override;
 
   CXXStdlibType GetDefaultCXXStdlibType() const override {
     return ToolChain::CST_Libcxx;
diff --git a/clang/test/Driver/wasm-toolchain.c b/clang/test/Driver/wasm-toolchain.c
index f950283ec42aa0..5f18e56f79b657 100644
--- a/clang/test/Driver/wasm-toolchain.c
+++ b/clang/test/Driver/wasm-toolchain.c
@@ -197,3 +197,11 @@
 // RUN: not %clang -### %s --target=wasm32-unknown-unknown --sysroot=%s/no-sysroot-there -fPIC -mno-mutable-globals %s 2>&1 \
 // RUN:   | FileCheck -check-prefix=PIC_NO_MUTABLE_GLOBALS %s
 // PIC_NO_MUTABLE_GLOBALS: error: invalid argument '-fPIC' not allowed with '-mno-mutable-globals'
+
+// Test that `wasm32-wasip2` invokes a the `wasm-component-ld` linker by default
+// instead of `wasm-ld`.
+
+// RUN: %clang -### -O2 --target=wasm32-wasip2 %s --sysroot /foo 2>&1 \
+// RUN:   | FileCheck -check-prefix=LINK_WASIP2 %s
+// LINK_WASIP2: "-cc1" {{.*}} "-o" "[[temp:[^"]*]]"
+// LINK_WASIP2: wasm-component-ld{{.*}}" "-L/foo/lib/wasm32-wasip2" "crt1.o" "[[temp]]" "-lc" "{{.*[/\\]}}libclang_rt.builtins-wasm32.a" "-o" "a.out"

@alexcrichton
Contributor Author

cc @sunfishcode and @sbc100

Collaborator

@sbc100 sbc100 left a comment

Does wasm-component-ld accept any other input types other than the ones that wasm-ld accepts?

Does wasm-component-ld call wasm-ld internally?

Do we expect clang users to be building compound components using a single clang command? i.e. will they be somehow supplying input files that describe how the component links? Would it make more sense to have clang default to building core modules and have the component creation be a higher level thing built on top of clang outputs?

@sbc100
Collaborator

sbc100 commented Mar 8, 2024

Regarding WebAssembly/wasi-sdk and WebAssembly/wasi-libc, is there any reason why simple programs wouldn't be core modules? Won't most C/C++ programs still be build-able as just core modules?

@alexcrichton
Contributor Author

Currently it accepts no extra inputs, but in the future I'd expect that it'll grow an option or two of its own. Currently it hand-codes the list of options to forward to wasm-ld which I'm sure will need updates over time, but that should be ok. And yes, the purpose of wasm-component-ld is twofold:

  1. First it invokes wasm-ld internally, producing the core wasm that's expected today.
  2. Next it invokes the "componentization process" which activates some Rust crates that convert this output core wasm module into a component.

In step (2) it's using component type information smuggled through LLVM/wasm-ld through custom sections to create the final output component.
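
To make the two steps concrete, here is a minimal C++ sketch of the pipeline as described in this comment. It is only a model: the real tool is not written in C++, and every name below (`ObjectFile`, `CoreModule`, `Component`, `linkCore`, `componentize`) is hypothetical. The only thing taken from the discussion is the shape of the flow: link first, then wrap the core module using type information carried in a custom section.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model types; none of these exist in LLVM or wasm-component-ld.
struct ObjectFile {
  std::string code;
  std::optional<std::string> typeInfo; // component type info, if embedded
};

struct CoreModule {
  std::string code;
  std::optional<std::string> typeSection; // carried through as a custom section
};

struct Component {
  std::string types; // recovered from the custom section
  CoreModule core;   // exactly one "main module" inside
};

// Step 1: stand-in for the wasm-ld invocation. Code is linked together and
// any embedded type information is carried along untouched.
CoreModule linkCore(const std::vector<ObjectFile> &objects) {
  CoreModule module;
  for (const ObjectFile &object : objects) {
    module.code += object.code;
    if (object.typeInfo)
      module.typeSection = module.typeSection.value_or("") + *object.typeInfo;
  }
  return module;
}

// Step 2: stand-in for componentization. Without the carried type
// information there is nothing to build the component's types from.
Component componentize(const CoreModule &module) {
  if (!module.typeSection)
    throw std::runtime_error("no component type information found");
  return Component{*module.typeSection, module};
}
```

Calling `componentize(linkCore(objects))` mirrors the order of steps (1) and (2) above; skipping the second call corresponds to stopping at the core module.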

> Do we expect clang users to be building compound components using a single clang command?

If by compound you mean linking multiple components internally, then no. The intention is that wasm32-wasip2 outputs a component but internally it's "just" a core wasm module. The next phase of composing components together we've got other tooling for, and that's intended to be outside the scope of individual language toolchains. (e.g. in theory composition works the same regardless of source language)

> Would it make more sense to have clang default to building core modules and have the component creation be a higher level thing built on top of clang outputs?

That's actually what we currently have today with converting wasm32-wasip1 outputs into components. The purpose of wasm32-wasip2, however, is that core wasms are not natively usable as-is because WASI APIs are defined at the level of the component model, not core wasm. There are of course definitions for core wasm using the canonical ABI, but that's not the focus of WASI nowadays.

I do plan on having a flag (also answering the question about wasm-component-ld-specific-flags) to not emit a component and instead skip the componentization step. This step could also be done by using -fuse-ld=wasm-ld (or the equivalent thereof) to just switch the default linker back to wasm-ld.

> Regarding WebAssembly/wasi-sdk and WebAssembly/wasi-libc, is there any reason why simple programs wouldn't be core modules? Won't most C/C++ programs still be build-able as just core modules?

Yep, they'll all still be buildable as core modules (as that's the internals of a component anyway, a big thing produced by LLVM). With the p2 target though the thinking is that components are the default output, not core wasm modules.


Another future feature on the theoretical roadmap for wasm-component-ld is that Joel has componentize-py built on "dynamic linking" of a sort which gets dlopen working enough for Python to open its native extensions. This is all built on the Emscripten dynamic linking model and a component packages that all up into a single component where the dynamic libraries are statically known at build time. This is all wrapped up in wasm-tools component link and is something I'd like to also automatically bake in to wasm-component-ld so building that style of component is much easier.

@sbc100
Collaborator

sbc100 commented Mar 8, 2024

I see, so wasm-ld builds "core module + metadata" and then wasm-component-ld takes that metadata and uses it to wrap the core module into a component (with exactly that one core module inside of it).

So the core module + metadata is kind of isomorphic with that single-core-module-component?

@sbc100
Collaborator

sbc100 commented Mar 8, 2024

> Currently it accepts no extra inputs, but in the future I'd expect that it'll grow an option or two of its own

I was more asking about whether the file types it accepts are anything more than object files and libraries. i.e. can you pass other core modules and have wasm-component-ld build a component that contains more than one core module?

Also, can you say more about the metadata that is being used to drive wasm-component-ld? If the core module is all based on canonical ABI what extra metadata is needed/planned?

@alexcrichton
Contributor Author

> So the core module + metadata is kind of isomorphic with that single-core-module-component?

Sort of and sort of not. At a high level you're correct, but at a technical level this isn't correct. The subtle differences are:

  • Most wasm binaries today using WASI probably use wasi_snapshot_preview1 imports. That means they need an "adapter" to switch to WASIp2-based imports. That adapter is a core wasm which is currently baked into wasm-component-ld (see the adapters in this folder)
  • Due to the way components and lifting and lowering works many imports need a "shim". For example in the component model when you "lower" a function from a component function to a core function you need to specify a linear memory. A linear memory isn't available until you instantiate the core module, but the core module needs the lowered imports to be instantiated. To break this cycle we introduce a shim wasm module which uses call_indirect through a table for its exports, so that's used to pass in as imports when instantiating the main module, and then after the main module is instantiated we instantiate another shim module that fills in the table.
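
The table-based shim in the second bullet can be modelled in a few lines of C++. This is a rough analogy, not how components are actually encoded: `Table`, `shimExport`, `MainModule`, `lowerImport` and `instantiateAndRun` are hypothetical names, and a `std::function` slot stands in for a wasm table entry reached via `call_indirect`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using CoreFn = std::function<int(int)>;

// Stand-in for a wasm funcref table; slots start out empty.
struct Table {
  std::vector<CoreFn> slots;
};

// Stand-in for the shim module's export: it forwards through the table,
// so it can be created before the real target exists.
CoreFn shimExport(Table &table, std::size_t slot) {
  return [&table, slot](int arg) {
    assert(table.slots[slot] && "slot must be filled before the first call");
    return table.slots[slot](arg);
  };
}

struct LinearMemory {
  std::vector<unsigned char> bytes;
};

// Stand-in for the main module: it owns the linear memory and needs its
// import at instantiation time.
struct MainModule {
  LinearMemory memory;
  CoreFn importedFn;
  int run(int x) { return importedFn(x); }
};

// Stand-in for "lowering" a component function: it needs the memory.
CoreFn lowerImport(LinearMemory &memory) {
  return [&memory](int arg) {
    return arg + static_cast<int>(memory.bytes.size());
  };
}

int instantiateAndRun(int arg) {
  Table table{std::vector<CoreFn>(1)};
  // 1. The shim is available first; no memory is required yet.
  CoreFn shim = shimExport(table, 0);
  // 2. Instantiate the main module with the shim as its import.
  MainModule mainModule{LinearMemory{std::vector<unsigned char>(16)}, shim};
  // 3. The memory now exists, so lower the real import and fill the table.
  table.slots[0] = lowerImport(mainModule.memory);
  return mainModule.run(arg);
}
```

The ordering in `instantiateAndRun` is the point: the import handed to the main module is valid before the memory exists, and the table is patched afterwards.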

Going from core module + metadata into a component is a pretty nontrivial operation, so the raw output of wasm-ld (module + metadata) isn't suitable as an intermediate artifact for components.

> I was more asking about whether the file types it accepts are anything more than object files and libraries. i.e. can you pass other core modules and have wasm-component-ld build a component that contains more than one core module?

Ah sorry I misunderstood! Currently wasm-component-ld takes no other inputs and it assumes everything goes into wasm-ld. In the future for the dynamic linking case above it may start taking in *.wasm binaries compiled as shared libraries to bundle and/or refer to in the output component, but that's in the future. That means that at this time wasm-component-ld will always build a component with a single "main module", e.g. the one from wasm-ld.

> Also, can you say more about the metadata that is being used to drive wasm-component-ld? If the core module is all based on canonical ABI what extra metadata is needed/planned?

Certainly! The canonical ABI gives the ability to say that for any component model function type it corresponds to a particular core wasm function type. That operation is not reversible, though, you can't go from core wasm back to the component model function type. Thus the metadata carries along this information. That way the componentization process draws a mapping from component model import/export to core wasm import/export and then can synthesize the right type in the component binary format.
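
A small C++ sketch of why the mapping is not reversible. It covers only a handful of component model types and ignores details such as the limit on flattened parameters; the enum and function names are made up for illustration. The flattening rules used (integers and `bool` to `i32`, 64-bit integers to `i64`, `string` and `list` to a pointer/length pair of `i32`s) follow the canonical ABI.

```cpp
#include <vector>

enum class CoreType { I32, I64, F32, F64 };

// A small, illustrative subset of component model value types.
enum class ComponentType { Bool, U32, S32, U64, F32, F64, String, ListU8 };

// Flatten one component-level type into its core wasm representation.
std::vector<CoreType> flatten(ComponentType type) {
  switch (type) {
  case ComponentType::Bool:
  case ComponentType::U32:
  case ComponentType::S32:
    return {CoreType::I32};
  case ComponentType::U64:
    return {CoreType::I64};
  case ComponentType::F32:
    return {CoreType::F32};
  case ComponentType::F64:
    return {CoreType::F64};
  case ComponentType::String:
  case ComponentType::ListU8:
    return {CoreType::I32, CoreType::I32}; // pointer and length
  }
  return {};
}

// Flatten a whole parameter list.
std::vector<CoreType> flattenParams(const std::vector<ComponentType> &params) {
  std::vector<CoreType> result;
  for (ComponentType param : params) {
    std::vector<CoreType> flat = flatten(param);
    result.insert(result.end(), flat.begin(), flat.end());
  }
  return result;
}
```

Here `flattenParams({String})`, `flattenParams({ListU8})` and `flattenParams({U32, Bool})` all yield `(i32, i32)`, so the core signature alone cannot say which component-level signature was meant. That gap is what the carried metadata fills.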

The short answer to your question, though, is wasm-tools component wit ./my-wit-files --wasm. That output blob, which is itself a component but only with type information, is embedded in core wasm. Put another way we start with WIT files, then those are converted into their binary format as a component with only types, that gets fed through LLVM/wasm-ld, then on the other end we deserialize the component-with-type-information, turn it back into WIT, and then make the real component.

@alexcrichton
Contributor Author

ping @sbc100, happy to answer any more questions if you have them!

I was tentatively hoping this could get backported to an 18.1.x release so we could get a wasi-sdk release with p1/p2/etc

@alexcrichton alexcrichton force-pushed the wasm32-wasip2-new-linker branch from fb90a73 to d1a1b22 Compare March 18, 2024 19:25
@alexcrichton
Contributor Author

At @sunfishcode's request I've pushed up a second commit which passes the path to wasm-ld to wasm-component-ld, and in testing it I also added the ability for -fuse-ld=lld to explicitly request that wasm-ld is used, regardless of target.
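
As a rough illustration of the resulting behaviour, the selection logic can be sketched in standalone C++. This is not the code from the second commit: `chooseLinker` and `extraLinkerArgs` are hypothetical helpers, and only the `-fuse-ld=lld` case is modelled (other `-fuse-ld=` values are out of scope here). The `--wasm-ld-path` flag name is taken from the `wasm-component-ld` usage message quoted later in this thread.

```cpp
#include <string>
#include <vector>

// Pick the linker name. `os` is the OS part of the target (e.g. "wasip2"),
// and `userRequestedLld` is true when -fuse-ld=lld was passed.
std::string chooseLinker(const std::string &os, bool userRequestedLld) {
  if (userRequestedLld)
    return "wasm-ld"; // explicit request wins, regardless of target
  if (os == "wasip2")
    return "wasm-component-ld";
  return "wasm-ld";
}

// Extra arguments placed on the link line. Only wasm-component-ld is told
// where the wasm-ld that ships alongside Clang lives.
std::vector<std::string> extraLinkerArgs(const std::string &linker,
                                         const std::string &wasmLdPath) {
  if (linker == "wasm-component-ld")
    return {"--wasm-ld-path", wasmLdPath};
  return {};
}
```

With this shape, `wasm32-wasip2` users can get a plain core module back by asking for `wasm-ld` explicitly, which matches the escape hatch mentioned earlier in the conversation.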


github-actions bot commented Mar 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Member

@sunfishcode sunfishcode left a comment

Looks good!

This commit adds an explicit argument that's passed to
`wasm-component-ld` containing the location of `wasm-ld` itself. This
enables `wasm-component-ld` to avoid hunting around looking for it and
instead use the install that's paired with Clang itself.

Additionally this reinterprets the `-fuse-ld=lld` argument to explicitly
request the `wasm-ld` linker flavor, even on `wasm32-wasip2` targets.
@alexcrichton alexcrichton force-pushed the wasm32-wasip2-new-linker branch from d1a1b22 to 7284fd4 Compare March 18, 2024 19:29
@sunfishcode sunfishcode merged commit d66121d into llvm:main Mar 19, 2024
4 checks passed
@alexcrichton alexcrichton deleted the wasm32-wasip2-new-linker branch March 19, 2024 03:01
@sunfishcode sunfishcode added this to the LLVM 18.X Release milestone Mar 19, 2024
@sunfishcode
Member

/cherry-pick d66121d

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Mar 19, 2024
(cherry picked from commit d66121d)
@llvmbot
Member

llvmbot commented Mar 19, 2024

/pull-request #85802

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Mar 19, 2024
(cherry picked from commit d66121d)
@TerrorJack

@alexcrichton This is breaking wasi-sdk build with:

/workspace/wasi-sdk/build/install/opt/wasi-sdk/bin/clang --target=wasm32-wasip2 -nodefaultlibs -shared --sysroot=/workspace/wasi-sdk/build/install/opt/wasi-sdk/share/wasi-sysroot \
-o /workspace/wasi-sdk/build/install/opt/wasi-sdk/share/wasi-sysroot/lib/wasm32-wasip2/libc.so -Wl,--whole-archive build/wasm32-wasip2/libc.so.a -Wl,--no-whole-archive /workspace/wasi-sdk/build/install/opt/wasi-sdk/lib/clang/19/lib/wasi/libclang_rt.builtins-wasm32.a
error: unexpected argument '--entry' found

  tip: a similar argument exists: '--no-entry'

Usage: wasm-component-ld -o <OUTPUT> --wasm-ld-path <WASM_LD_PATH> <--export <EXPORT>|-z <Z_OPTS>|--stack-first|--allow-undefined|--fatal-warnings|--no-demangle|--gc-sections|-O <OPTIMIZE>|-L <LINK_PATH>|-l <LIBRARIES>|--no-entry|-m <TARGET_EMULATION>|--strip-all|OBJECTS>

For more information, try '--help'.
clang: error: linker command failed with exit code 2 (use -v to see invocation)
make[1]: *** [Makefile:569: /workspace/wasi-sdk/build/install/opt/wasi-sdk/share/wasi-sysroot/lib/wasm32-wasip2/libc.so] Error 2
make[1]: Leaving directory '/workspace/wasi-sdk/src/wasi-libc'
make: *** [Makefile:142: build/wasi-libc.BUILT] Error 2
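
The error above appears to come from wasm-component-ld rejecting an option that is not in its hand-coded list. A minimal C++ sketch of that kind of allowlist check follows. It is not the tool's actual parser (the real tool is not written in C++); `firstRejectedArg` is a hypothetical helper, and the two option sets are copied from the usage message printed in the log.

```cpp
#include <cstddef>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Options that take a value, copied from the usage message in the log.
static const std::set<std::string> kValueFlags = {
    "-o", "--wasm-ld-path", "--export", "-z", "-O", "-L", "-l", "-m"};

// Options that stand alone, copied from the same usage message.
static const std::set<std::string> kPlainFlags = {
    "--stack-first", "--allow-undefined", "--fatal-warnings", "--no-demangle",
    "--gc-sections", "--no-entry",        "--strip-all"};

// Return the first argument that is not on the allowlist, if any.
std::optional<std::string>
firstRejectedArg(const std::vector<std::string> &args) {
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string &arg = args[i];
    if (arg.empty() || arg[0] != '-')
      continue; // positional input such as an object file
    if (kPlainFlags.count(arg))
      continue;
    if (kValueFlags.count(arg)) {
      ++i; // the value is the next argument
      continue;
    }
    // Joined short form such as -L/foo/lib, -lc or -O2.
    if (arg.size() > 2 && arg[1] != '-' && kValueFlags.count(arg.substr(0, 2)))
      continue;
    return arg;
  }
  return std::nullopt;
}
```

Under this model any driver-generated flag outside the two sets, such as `--entry`, is reported instead of being forwarded to wasm-ld.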

@alexcrichton
Contributor Author

Yes it's expected that wasi-sdk will need changes to incorporate this PR. I'm waiting for this to be in an LLVM release before updating wasi-sdk.

chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
@justdan96

@alexcrichton Not sure if you've seen but this has been released as part of LLVM 18.1.2: https://github.com/llvm/llvm-project/tree/llvmorg-18.1.2/clang/lib/Driver/ToolChains

@alexcrichton
Contributor Author

Oop no I did not, thank you for the heads up! I'll work on updating wasi-sdk now.

@justdan96

No problem, I have wasi-sdk all working in a Docker container now with your latest version of wasm-component-ld so I should be thanking you!
