Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set PLATFORM_TRIPLET, include platform in so names, only load compatible so files #2299

Merged
merged 2 commits into from
Mar 29, 2022

Conversation

hoodmane
Copy link
Member

For reasons that are a bit beyond me, --host and PLATFORM_TRIPLET
seem to be independent, in particular we've had an empty
PLATFORM_TRIPLET. This is unfortunate because PLATFORM_TRIPLET
is used to generate the SOABI config variable which in turn is used
to decide whether a .so file is a good match for loading. We'd like
for linux Pythons not to try to import emscripten .so files (it
raises ImportError: some_file.so: invalid ELF header). Similarly,
we'd like to avoid attempting to load linux .so files in wasm. These
platform tags are our friends.

Anyways, this PR sets PLATFORM_TRIPLET and ensures that .so files
built by pywasmcross are tagged with our SOABI tag.

I moved the .so file renaming from pywasmcross to buildpkg just
before running the post script. That is a better place to put it in
case the package wants to look at the .so file after linking it. It
might be surprised that we moved it.

I also improved the error message if we try to loadWebAssemblyModule
something that is actually say a Linux .so file and updated get_dynlibs
to filter out .so files that have an incompatible abi tag.

…ble so files

For reasons that are a bit beyond me, `--host` and `PLATFORM_TRIPLET`
seem to be independent, in particular we've had an empty
`PLATFORM_TRIPLET`. This is unfortunate because `PLATFORM_TRIPLET`
is used to generate the SOABI config variable which in turn is used
to decide whether a .so file is a good match for loading. We'd like
for linux Pythons not to try to import emscripten .so files (it
raises `ImportError: some_file.so: invalid ELF header`). Similarly,
we'd like to avoid attempting to load linux .so files in wasm. These
platform tags are our friends.

Anyways, this PR sets `PLATFORM_TRIPLET` and ensures that .so files
built by pywasmcross are tagged with our SOABI tag.

I moved the .so file renaming from pywasmcross to buildpkg just
before running the post script. That is a better place to put it in
case the package wants to look at the .so file after linking it. It
might be surprised that we moved it.

I also improved the error message if we try to `loadWebAssemblyModule`
something that is actually say a Linux .so file and updated get_dynlibs
to filter out .so files that have an incompatible abi tag.
@hoodmane
Copy link
Member Author

In the future, we might consider updating the platform tag to include the emscripten version cf emscripten-core/emscripten#15917.

@rth
Copy link
Member

rth commented Mar 22, 2022

For reasons that are a bit beyond me, --host and PLATFORM_TRIPLET
seem to be independent, in particular, we've had an empty
PLATFORM_TRIPLET.

Hmm, maybe we should add the wasm32-emscripten platform triplet to https://github.com/python/cpython/blob/4aea656d62860e78cd8384f2de375f0d4f1db579/configure.ac#L836 in CPython. WDYT @tiran?

Also, is it expected that --host=wasm32-unknown-emscripten, but PLATFORM_TRIPLET would be wasm32-emscripten: shouldn't they match?

Similarly, we'd like to avoid attempting to load linux .so files in wasm.

Right, but ideally they shouldn't be in the wasm artifact anyway. And I think they aren't since we try to pre-load all .so at init?

@hoodmane
Copy link
Member Author

maybe we should add the wasm32-emscripten platform triplet to python/cpython@4aea656/configure.ac#L836 in CPython.

I tried that, for some reason it didn't set PLATFORM_TRIPLET.

Also, is it expected that --host=wasm32-unknown-emscripten, but PLATFORM_TRIPLET would be wasm32-emscripten: shouldn't they match?

Not sure, as far as I can tell PLATFORM_TRIPLET is allowed to leave out the vendor if it is unknown. I think we really ought to be wasm32-posix-emscripten or something since emscripten is really more like gnu and less like posix. But I think the main point of PLATFORM_TRIPLET is that it is used to set SOABI which is used by the loader system. It also is used to set MULTIARCH but that is not used anywhere Python core, though it is exposed via sysconfig.

ideally they shouldn't be in the wasm artifact anyway. And I think they aren't since we try to pre-load all .so at init?

Yeah they aren't unless we mess up. We can leave off the double checking code if you prefer. There are wheels out there that have a bunch of .so files for several different architectures. Using something like that in the browser wouldn't be ideal for download size reasons, but if you are using it with node it would be perfectly reasonable.

@hoodmane
Copy link
Member Author

maybe we should add the wasm32-emscripten platform triplet to python/cpython@4aea656/configure.ac#L836 in CPython.

I tried that, for some reason it didn't set PLATFORM_TRIPLET.

if $CPP $CPPFLAGS conftest.c >conftest.out 2>/dev/null; then
  PLATFORM_TRIPLET=`grep -v '^#' conftest.out | grep -v '^ *$' | tr -d ' 	

presumably the issue is that $CPP $CPPFLAGS conftset.c is returning a nonzero exit code.

@tiran
Copy link

tiran commented Mar 23, 2022

To make it more confusing a typical platform triplet has two, three, or even four elements. The three element syntax can either specify machine-vendor-os or machine-os-abi where abi specifies libc and flavor. arm-linux-gnueabihf is 32bit ARM CPU, Linux with GNU libc and hardware float ABI. You can use config.sub to normalize a short name to a canonical long name:

$ ./config.sub wasm32-emscripten
wasm32-unknown-emscripten
$ ./config.sub arm-linux-gnueabihf
arm-unknown-linux-gnueabihf

wasm32-posix-emscripten would be wrong and confusing on multiple accounts. POSIX is a standard, not a vendor. Vendors are typically companies like Fujitsu, Hitachi, IBM, or SUN. Neither Emscripten nor WASI are near POSIX compatible.

I recommend that you use wasm32-unknown-emscripten as preferred name and support wasm32-emscripten as alias. Rust, Emscripten, and other projects refer to Emscripten as wasm32-unknown-emscripten.

maybe we should add the wasm32-emscripten platform triplet to python/cpython@4aea656/configure.ac#L836 in CPython.

I tried that, for some reason it didn't set PLATFORM_TRIPLET.

AFAIK that check is only used for multiarch and has no affect on cross compilation.

@tiran
Copy link

tiran commented Mar 24, 2022

Ooops, I misunderstood the issue! I should have read the initial comment...

If I understand you correctly you want to have _somemodule.cpython-311-wasm32-emscripten.so for a WASM32 Emscripten shared extension on Python 3.11? PR python/cpython#32095 should solve your problem.

@hoodmane
Copy link
Member Author

hoodmane commented Mar 24, 2022

PR python/cpython#32095 should solve your problem.

Well it will solve our problem in a year when we switch to Python 3.11 =) For now, I think manually setting PLATFORM_TRIPLET in the Makefile should be fine. Thanks @tiran!

@hoodmane
Copy link
Member Author

@tiran another question. Currently every patch version bump in emscripten is considered to be an ABI break:
emscripten-core/emscripten#15917
So it would make sense to include the emscripten version as part of the ABI tag for the .so files and refuse to load .so files with a different version of the emscripten compiler. Would ABI tagging with e.g., wasm32-emscripten3.0.6 be legal? Or is there some other recommended way to handle this?

@rth
Copy link
Member

rth commented Mar 28, 2022

Thanks! This PR LGTM as a temporary solution.

@hoodmane hoodmane merged commit 37057d1 into pyodide:main Mar 29, 2022
@hoodmane hoodmane deleted the soabi branch March 29, 2022 03:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants