Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add target-prefixed executables to the installation bin directory #388

Merged

Conversation

alexcrichton
Copy link
Collaborator

@alexcrichton alexcrichton commented Mar 2, 2024

Currently the bin directory installed by wasi-sdk is not currently
suitable for putting in $PATH in all cases because it can shadow a
system-installed clang executable which is intended for native
binaries. For Linux distributions with gcc-based cross-compilers I've
often seen the pattern where they're installed as $target-gcc and so
I've taken a leaf out of their books to do that here as well.

This commit adds, currently alongside the preexisting clang
executable, target-prefixed executables such as wasm32-wasi-clang and
wasm32-wasi-clang++. These executables are symlinks to the clang
executable itself in the same manner that clang++ is a symlink to
clang itself.

I'll also note that this doesn't fix the problem of "add the wasi-sdk
bin dir to PATH" because clang and other prominent executables are
still there. My hope though is that this opens up a future refactoring
for doing so.

@alexcrichton
Copy link
Collaborator Author

I should say that another motivation for this is that we'll want the preview2 target to output a component by default, not a core module, and for that purpose rustc will use wasm-component-ld and I think we'll want to do the same here. To help configure this by default a wasm32-wasip2-clang shim executable seems like a good place to inject a -fuse-ld=/.... argument to wasm-component-ld.

@sunfishcode
Copy link
Member

I've been imagining we'd add the -fuse-ld=.../wasm-component-ld or whatever to upstream LLVM, rather than having a wrapper in wasi-sdk. Many C/C++ users don't use wasi-sdk per se; they use a clang from their distro, or from http://apt.llvm.org/, or other places. We'll still need them to install wasm-component-ld, of course.

It isn't really idiomatic for clang to have target-prefixed executables, because clang has the ability to support multiple targets in the same build. What would you think of instead installing wasi-sdk-clang and wasi-sdk-clang++ as symlinks/copies of the existing clang and clang++? Users would then pick between p1 and p2 with the --target flag. That feels closer to the "it's just clang" philosophy of wasi-sdk, and aligns with users using other clang packages.

Also, regardless of what we call the new commands, what would you think of putting them in a prefixed-bin directory, instead of bin? That would fix the "add the wasi-sdk bin dir to PATH" problem, and give users an easy choice of what they want to include in their PATH.

@alexcrichton
Copy link
Collaborator Author

Ah ok, I think I'm too used to distros and gcc things then. If it's not idiomatic for clang to have target-prefixes I agree this needs to change.

I suppose then this raises a bit of a larger question of if there's a longer-term installation plan for wasi-sdk perhaps? For example is the goal on macOS, for example, to enable brew install wasi-sdk? Or for Linux users to package-manager install wasi-sdk? In those situations wasi-sdk is more of a sysroot than a build of clang I suppose? That would further motivate getting any "special logic" into Clang itself (e.g. wasm-component-ld as a linker for wasm32-wasip2).

Do you know how feasible it would be to modify Clang to support wasm32-wasip2 and wasm-component-ld? I think it'd be best to get that change in the installation of wasi-sdk itself sooner-rather-than-later to avoid ossifying the output of wasm32-wasip2 as a core wasm module as opposed to a component.

Also, yeah, I'm happy to make a prefixed-bin directory with shims pointing to clang and clang++.

@sbc100
Copy link
Member

sbc100 commented Mar 5, 2024

I'll also note that this doesn't fix the problem of "add the wasi-sdk bin dir to PATH" because clang and other prominent executables are still there. My hope though is that this opens up a future refactoring for doing so.

I think as long as you add wasi-sdk to the end of the $PATH this should work fine right? When clang would resolve to the system clang and wasm32-wasi-clang would resolve to wasi-sdk

@sbc100
Copy link
Member

sbc100 commented Mar 5, 2024

It isn't really idiomatic for clang to have target-prefixed executables, because clang has the ability to support multiple targets in the same build

I'm not totally sure that is totally true. I believe there is precedent out there for target-prefixed clang binaries. For example android NDK ship no less that 57 target prefixed versions of clang++:

$ find android-ndk-r26c/ -iname *clang++ | head
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/x86_64-linux-android21-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/i686-linux-android27-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/armv7a-linux-androideabi33-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/armv7a-linux-androideabi32-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/i686-linux-android23-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/armv7a-linux-androideabi22-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/x86_64-linux-android23-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/i686-linux-android21-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/i686-linux-android22-clang++
android-ndk-r26c/toolchains/llvm/prebuilt/linux-x86_64/bin/i686-linux-android26-clang++
$ find android-ndk-r26c/ -iname *clang++ | wc -l 
57

I believe the mechansim for setting the default target based on the executable name is built into clang itself, so these can all by symlinks or copies of the primary clang++ executable.

I've personally found it useful in the past to be able do CC=myarch-clang rather than CC=clang --target=myarch (IIRC some build systems like CC to be just an executable).

Another advantage of myarch-clang is that you can put the directory contains those binaries at the end if your path while still allowing /usr/bin/clang to be found first when the arch prefix is not specified. So you can do things like CC=myarch-clang HOST_CC=clang

@sbc100
Copy link
Member

sbc100 commented Mar 5, 2024

I wonder if we could use clang's own builtin mechanism for creating links / copies of the clang binary: https://github.com/llvm/llvm-project/blob/1e828f838cc0f15074f3dbbb04929c06ef0c9729/clang/cmake/modules/AddClang.cmake#L185-L197

I thought there would be some existing CMake code in clang for creating arch-prefixed links, but I didn't find it in my initial search. I looks like we could set CLANG_LINKS_TO_CREATE to a list of aliases that we want.

That clang driver also has a builtin mechanism for setting the default target based on the executable name, so I don't think there is any need to invent our own one here.

@sbc100
Copy link
Member

sbc100 commented Mar 5, 2024

@yamt
Copy link
Contributor

yamt commented Mar 6, 2024

as clang recognizes prefixes as @sbc100 said, i suppose it makes sense for us to ship prefixed symlinks in a separate bin dir.

spacetanuki% /opt/wasi-sdk-21.0/bin/clang -print-effective-triple
wasm32-unknown-wasi
spacetanuki% ln -s /opt/wasi-sdk-21.0/bin/clang wasm64-clang
spacetanuki% ./wasm64-wasi-clang -print-effective-triple
wasm64-unknown-wasi
spacetanuki% 

@agoode
Copy link

agoode commented Mar 6, 2024

Prefixed binaries will help simplify MLton/mlton#550. I appreciate this proposed improvement.

@sunfishcode
Copy link
Member

Thanks @sbc100! I was unaware clang had that, and that other cross-sdks used it that way. That sounds like a good approach to me.

@loganek
Copy link

loganek commented Mar 7, 2024

I should say that another motivation for this is that we'll want the preview2 target to output a component by default, not a core module, and for that purpose rustc will use wasm-component-ld and I think we'll want to do the same here.

@alexcrichton I understand the motivation behind having a component as a default one for the preview2, but will there be an option to use wasm-ld directly for rustc/clang? I think there are usecases where people just want to build a core module with preview2 canonical ABI only, so having a flag to disable component model would be really helpful.

Currently the `bin` directory installed by wasi-sdk is not currently
suitable for putting in `$PATH` in all cases because it can shadow a
system-installed `clang` executable which is intended for native
binaries. For Linux distributions with gcc-based cross-compilers I've
often seen the pattern where they're installed as `$target-gcc` and so
I've taken a leaf out of their books to do that here as well.

This commit adds, currently alongside the preexisting `clang`
executable, target-prefixed executables such as `wasm32-wasi-clang` and
`wasm32-wasi-clang++`. These executables are symlinks to the `clang`
executable itself in the same manner that `clang++` is a symlink to
`clang` itself.

I'll also note that this doesn't fix the problem of "add the wasi-sdk
bin dir to PATH" because `clang` and other prominent executables are
still there. My hope though is that this opens up a future refactoring
for doing so.
@alexcrichton alexcrichton force-pushed the target-prefixed-executables branch from 5cc4883 to c2e39c9 Compare March 8, 2024 01:34
@alexcrichton
Copy link
Collaborator Author

I think as long as you add wasi-sdk to the end of the $PATH this should work fine right?

If executing target-prefixed executables, yes, but IMO that's pretty brittle. In the long term I still think it'd be good to have a directory that can be unambiguously added to $PATH. (I'll note I'm not proposing that in this PR though)

I looks like we could set CLANG_LINKS_TO_CREATE to a list of aliases that we want.

Thanks for pointing that out! (as well as the precedent and code Clang already has to handle this!)

I've now updated this PR to using this strategy instead to install symlinks for each target.

will there be an option to use wasm-ld directly for rustc/clang?

Yes both clang and rustc allow configuring the linker, there's no need to use the built-in default. And also, yes, my plan is to add a flag of some sort to wasm-component-ld to emit a core module instead of a component. That'll likely be an integral part of "dynamic" linking a la what componentize-py does today.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Nice to see that CLANG_LINKS_TO_CREATE works.

# list with a defined separator.
noop =
space = $(noop) $(noop)
join-with = $(subst $(space),$1,$(strip $2))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shame that we have to resort for this kind of Makefile magic, but I am not well versed enough to suggest anything better :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed :(

I went through a number of iterations of "copy some bits and pieces from stack overflow" until I found this which ended up working well enough.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the PR description now that a new technique is being used?

@alexcrichton
Copy link
Collaborator Author

Ah yes indeed, and thanks for the review!

@alexcrichton alexcrichton merged commit 3e93db0 into WebAssembly:main Mar 8, 2024
5 checks passed
@alexcrichton alexcrichton deleted the target-prefixed-executables branch March 8, 2024 19:36
alexcrichton added a commit to alexcrichton/wasi-sdk that referenced this pull request Mar 8, 2024
I noticed that WebAssembly#388 increased the size of the Windows distribution by
200M+ and my guess for that is that all the symlinks which point to
clang get copied as real files (due to symlinks not always working on
Windows). To help counteract that this commit enables the
`LLVM_LINK_LLVM_DYLIB=ON` option when building LLVM and Clang. That
should build a `libLLVM.so` which helps reduce the size of all the tools
since they no longer all statically link LLVM. Locally the size of a
full build is ~100M less as a result of this change. On Windows where
all the binaries are copied around it should hopefully help much more.

I've additionally taken a leaf out of rust-lang/rust's book of building
LLVM to pass the `LLVM_VERSION_SUFFIX` option here too. That I believe
helps avoid `libLLVM.so` conflicting with a system-installed
`libLLVM.so` by accident by ensuring there's a suffix present on the
binaries built for wasi-sdk.
alexcrichton added a commit to alexcrichton/wasi-sdk that referenced this pull request Mar 9, 2024
I noticed that WebAssembly#388 increased the size of the Windows distribution by
200M+ and my guess for that is that all the symlinks which point to
clang get copied as real files (due to symlinks not always working on
Windows). To help counteract that this commit enables the
`LLVM_LINK_LLVM_DYLIB=ON` option when building LLVM and Clang. That
should build a `libLLVM.so` which helps reduce the size of all the tools
since they no longer all statically link LLVM. Locally the size of a
full build is ~100M less as a result of this change. On Windows where
all the binaries are copied around it should hopefully help much more.

I've additionally taken a leaf out of rust-lang/rust's book of building
LLVM to pass the `LLVM_VERSION_SUFFIX` option here too. That I believe
helps avoid `libLLVM.so` conflicting with a system-installed
`libLLVM.so` by accident by ensuring there's a suffix present on the
binaries built for wasi-sdk.
@alexcrichton
Copy link
Collaborator Author

For posterity here: For the wasm32-wasip2 targets and using wasm-component-ld by default I submitted llvm/llvm-project#84569 which was backported to 18.1.x in llvm/llvm-project#85802 which I believe will target the April 2 release of 18.1.2 for LLVM.

agoode added a commit to agoode/twelf that referenced this pull request May 13, 2024
This is to reflect the upstream change in
WebAssembly/wasi-sdk#388.
jcreedcmu pushed a commit to standardml/twelf that referenced this pull request May 13, 2024
This is to reflect the upstream change in
WebAssembly/wasi-sdk#388.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants