Add simple, explicit CLI to install a .whl file #92

takluyver · 2022-01-03T13:14:05Z

In the interests of moving the CLI question forwards, here's my take on a minimal CLI. That's minimal in the sense of a low-level interface to what installer can do, with no additional abstractions.

As a starting point, this requires that all the destination directories - purelib, platlib, headers, scripts, data - are specified explicitly as inputs. This would obviously be pretty verbose, but I think it's still useful, because downstream packaging is often built around shell commands, so even a long one is more convenient than embedding Python code to use the library. I also like that it makes the destination directories explicit, rather than downstreams finding ways to patch or hack around sysconfig or distutils to control where files end up.

But this is a starting point, and we could decide to add various convenience features:

Make --script-kind optional and write POSIX launchers by default, which is probably what 90% of downstreams want, given the diverse packaging ecosystems on Linux especially.
Make --platlib optional, use the same directory as --purelib by default
Make --headers required only if the wheel contains a foo-1.2.data/headers directory (AFAIK this is either never or almost never used - the only thing I'm familiar with that ships headers is numpy, and it has them inside the package directory)
Add a --destdir or --root option, for a path to prefix to all destination directories (Enable installation under a specified prefix #58). This is purely a convenience, because the caller can add the prefix themselves (--purelib ${destdir}/path/to/site-packages etc.), but maybe passing it 5 times is inconvenient enough to be worth a shortcut?
Add an option - or an entirely separate entry point, like python -m installer.thispython - to do an installation for the Python & platform where installer is running (like pip, or the proposal in Add CLI #66).
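
As a rough illustration of the interface described above, a minimal sketch of such a parser might look like this. The option names are the ones discussed in this thread; `build_parser`, `SCHEME_KEYS`, the help strings and the example paths are hypothetical, and the real `src/installer/__main__.py` in this PR may differ.

```python
import argparse

# The five wheel scheme keys that must each be given an explicit directory.
SCHEME_KEYS = ("purelib", "platlib", "headers", "scripts", "data")


def build_parser():
    """Build a parser in which every destination directory is explicit."""
    ap = argparse.ArgumentParser("python -m installer")
    ap.add_argument("wheel_file", help="Path to a .whl file to install")
    ap.add_argument("--interpreter", required=True,
                    help="Interpreter path to use in script shebangs")
    ap.add_argument("--script-kind", required=True,
                    help="Kind of launcher to create for scripts, e.g. posix")
    # One required option per scheme key; nothing is read from sysconfig.
    for key in SCHEME_KEYS:
        ap.add_argument(f"--{key}", required=True,
                        help=f"Directory to install {key} files into")
    return ap


# Example invocation with made-up paths.
example = build_parser().parse_args([
    "--interpreter", "/usr/bin/python3",
    "--script-kind", "posix",
    "--purelib", "/tmp/root/usr/lib/python3/site-packages",
    "--platlib", "/tmp/root/usr/lib/python3/site-packages",
    "--headers", "/tmp/root/usr/include/python3",
    "--scripts", "/tmp/root/usr/bin",
    "--data", "/tmp/root/usr",
    "dist/example-1.0-py3-none-any.whl",
])
scheme = {key: getattr(example, key) for key in SCHEME_KEYS}
```

The resulting `scheme` dictionary is the kind of mapping the library side would need; every value comes from the command line rather than from the running Python.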


pradyunsg · 2022-01-03T15:16:47Z

Thanks for filing this!

I imagine that the tests will complain about not having coverage. I'm pretty sure that we should have #58 satisfied (happy with that being a follow up), but beyond that, I'm totally on board with this.

src/installer/__main__.py

layday · 2022-01-03T21:28:36Z

Would redistributors have use for something so skimpy?

because downstream packaging is often built around shell commands, so even a long one is more convenient than embedding Python code to use the library

Yeah, but they still have to do all the work collecting the paths and switching the prefix out, right? And they don't have the facility to do that more generally, I imagine. How would this work in practice? Would it be...

paths=$(python -c '
    a whole lotta path mangling
')
python -m installer \
    --interpreter $(command -v python) \
    --script-kind posix \
    --purelib $(paths | jq??) \
    --platlib $(paths | jq??) \
    --headers $(paths | jq??) \
    --scripts $(paths | jq??) \
    --data $(paths | jq??) \
    wheel.whl

And at that point, why not write everything in Python, including the installer invocation?

Of course, it might be simpler than I'm imagining, but we should still have a little think about how the CLI's going to be used.

jameshilliard · 2022-01-03T22:08:43Z

Would redistributors have use for something so skimpy?

Yeah, it would be helpful as it provides a cleaner way to interface with installer from the cli.

Yeah, but they still have to do all the work collecting the paths and switching the prefix out, right? And they don't have the facility to do that more generally, I imagine. How would this work in practice?

We're pretty much set up for this already, for us at least we generally pass around cli params like this in make variables already, like this.

And at that point, why not write everything in Python, including the installer invocation?

We have to do invocations from make to python at some point anyways, this generally works best if there is a built in cli interface in the tool we're calling.

jameshilliard · 2022-01-03T22:15:34Z

src/installer/__main__.py

+def main():
+    """Entry point for CLI."""
+    ap = argparse.ArgumentParser("python -m installer")
+    ap.add_argument("wheel_file", help="Path to a .whl file to install")


Does this support installing the wheel file if passed as a glob, like dist/*.whl?

No, it won't be unglobbed in Python, but why would it not be unglobbed in the shell?

Currently we invoke python setup.py install while in the package directory, maybe we should make the wheel_file argument optional and autodetect the dist folder and install from there if present and a .whl file is there.

It will work if the shell expands the glob to a single wheel. At present, it won't accept multiple wheels to install - that's easy enough to do if necessary, but I wanted to start with the simplest thing.

maybe we should make the wheel_file argument optional and autodetect...

I think the crucial 'what to install' input should be 100% explicit. That's in keeping with the general pattern of this library. If you know there's a single wheel under dist, and you're running it from a shell, you can pass dist/*.whl and let the shell expand it.
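
For callers that are not going through a shell, the same "exactly one wheel, stated explicitly" rule could be enforced before invoking the CLI. This is only a sketch; `resolve_single_wheel` is a hypothetical helper and is not part of installer.

```python
import glob


def resolve_single_wheel(pattern):
    """Expand ``pattern`` and insist on exactly one match."""
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        # Refuse to guess: zero or several wheels is an error, not a choice.
        raise SystemExit(
            f"expected exactly one wheel for {pattern!r}, found {len(matches)}"
        )
    return matches[0]
```

A caller would pass the result as the `wheel_file` argument, so the CLI itself never has to expand globs or pick between candidates.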

it won't accept multiple wheels to install

Oh, not wanting it to do that, just pick up a single wheel from dist actually.

If you know there's a single wheel under dist, and you're running it from a shell, you can pass dist/*.whl and let the shell expand it.

Yeah, was going to do something like that otherwise, figured would be kinda nice to have similar semantics to existing install methods. Maybe have it use only the newest file in dist/*.whl would make sense or something if a glob is passed?

layday · 2022-01-03T22:30:13Z

We're pretty much set up for this already, for us at least we generally pass around cli params like this in make variables already, like this.

Will you be hardcoding the paths? So, for example, instead of:

PKG_PYTHON_SETUPTOOLS_INSTALL_TARGET_OPTS = \
    --executable=/usr/bin/python \
    --script-kind posix \
    --root=$(TARGET_DIR)

You'll have:

PKG_PYTHON_INSTALLER_TARGET_OPTS = \
    --interpreter=/usr/bin/python \
    --script-kind=posix \
    --purelib=$(TARGET_DIR)/lib/python?.?/site-packages \
    --platlib=$(TARGET_DIR)/lib/python?.?/site-packages \
    --headers=$(TARGET_DIR)/include/python?.? \
    --scripts=$(TARGET_DIR)/bin \
    --data=$(TARGET_DIR)

Is that correct? Or do you have some other way of retrieving the paths?

jameshilliard · 2022-01-03T22:56:03Z

Is that correct?

Should be a little different.

Probably something like this I think:

PKG_PYTHON_INSTALLER_TARGET_OPTS = \
    --interpreter=/usr/bin/python \
    --script-kind=posix \
    --purelib=$(TARGET_DIR)/lib/python$(PYTHON3_VERSION_MAJOR)/site-packages \
    --platlib=$(TARGET_DIR)/lib/python$(PYTHON3_VERSION_MAJOR)/site-packages \
    --headers=$(STAGING_DIR)/usr/include/python$(PYTHON3_VERSION_MAJOR)  \
    --scripts=$(TARGET_DIR)/usr/bin \
    --data=$(TARGET_DIR)/usr

layday · 2022-01-03T23:24:43Z

I see, thank you.

takluyver · 2022-01-04T11:43:59Z

@layday I don't imagine this will be useful for everyone. Certainly, if you have to get the destination paths from Python as a JSON blob and then use jq to extract them and parse them back, you're better off staying in Python and calling installer as a library. I can see the case for having a separate 'install for this Python' CLI, but I think it's in keeping with the design of installer to have a fully explicit CLI that doesn't take anything from the running Python. And it seems like this has at least one use case. 🙂

@pradyunsg I'll look at adding some tests.

For destdir support, I can see 3 possible approaches:

Fully manual: python -m installer --purelib ${DESTDIR}/path/to/site-packages --scripts ${DESTDIR}/path/to/bin ...
Separate argument: python -m installer --destdir ${DESTDIR} --purelib /path/to/site-packages --scripts /path/to/bin ... (or 2a, --root, in keeping with setuptools/distutils)
Use the environment variable implicitly: python -m installer --purelib /path/to/site-packages --scripts /path/to/bin ...
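
To make the difference between the second and third approaches concrete, here is a small sketch of how a prefix could be applied to an already-parsed set of destination paths. `apply_destdir` and `destdir_from_env` are hypothetical names, not existing installer functions.

```python
import os


def apply_destdir(scheme, destdir):
    """Approach 2: prefix every path in ``scheme`` with an explicit destdir."""
    prefixed = {}
    for key, path in scheme.items():
        # os.path.join would discard destdir if the path stayed absolute,
        # so drop any drive and leading separators first.
        _, tail = os.path.splitdrive(path)
        prefixed[key] = os.path.join(destdir, tail.lstrip(os.sep + "/"))
    return prefixed


def destdir_from_env(scheme, environ=os.environ):
    """Approach 3: pick up DESTDIR implicitly from the environment."""
    destdir = environ.get("DESTDIR")
    return apply_destdir(scheme, destdir) if destdir else dict(scheme)
```

With the fully manual approach neither helper is needed, because the caller has already written the prefixed paths on the command line.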

Do you have a preference?

pradyunsg · 2022-01-04T12:04:57Z

Let's leave it fully manual for now?

If folks strongly feel that is not "good enough", then we can add things based on feedback then. Removing a separate option or implicit behaviour would be more tricky. :)

FFY00 · 2022-01-04T15:23:40Z

I have been meaning to write a reply in my PR detailing my opinion in your feedback and how to move forward regarding the CLI as I abandon that work, but since this is moving fast, I will try to quickly summarize it here in hope it can provide some clarity before any decision is made.

IMO this is the wrong approach to this issue. We are currently having a lot of trouble due to downstream customizations of the Python install layout, to the point that we rushed suboptimal changes to Python core that proved problematic (see bpo-45413, which is a result of the mechanism that was introduced to support the downstream patching, mainly affecting pip; see pypa/distutils#88 (comment), introduced in python/cpython#24549, which aims to make distutils rely on sysconfig to support vendor patching in sysconfig instead as distutils is moved out of core). I believe this approach just further propagates the same mentality and will result in people hardcoding install locations or abusing this CLI to do things it wasn't intended to. The hardcoded install locations need to sooner or later be manually updated, and having them manually hardcoded can only result in more bugs. Install locations in Python are not even canonically static anymore and can vary at runtime, further introducing more bugs when this CLI is used in a slightly different environment, resulting in non-obvious broken installs.

Figuring out the install locations of a Python should not be hard, but we currently do have an issue, as presented in bpo-44445. This means wheel installation can not yet be done without querying the headers path from distutils, which is a big problem, but optimally, you would just fetch the install location map from sysconfig and plug it into SchemeDictionaryDestination. This is how easy it should be, but some work still needs to be done for that to be achieved, and that needs to happen before the distutils deprecation.
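
A sketch of what "fetch the install location map from sysconfig and plug it into SchemeDictionaryDestination" could look like is below. The installer calls follow its documented public API as far as I know and should be checked against the installed version. Using sysconfig's `include` path for the wheel `headers` key is an assumption made only for illustration, and is exactly the gap that bpo-44445 is about.

```python
import sys
import sysconfig

WHEEL_SCHEME_KEYS = ("purelib", "platlib", "headers", "scripts", "data")


def scheme_for_running_python():
    """Map sysconfig's paths for the running interpreter onto wheel scheme keys."""
    paths = sysconfig.get_paths()
    scheme = {key: paths[key] for key in ("purelib", "platlib", "scripts", "data")}
    # Assumption: sysconfig has no "headers" key, so "include" stands in for it.
    scheme["headers"] = paths["include"]
    return scheme


def install_for_running_python(wheel_path, script_kind="posix"):
    """Install ``wheel_path`` into the locations reported by sysconfig."""
    # Imported lazily so the helper above is usable without installer present.
    from installer import install
    from installer.destinations import SchemeDictionaryDestination
    from installer.sources import WheelFile

    destination = SchemeDictionaryDestination(
        scheme_for_running_python(),
        interpreter=sys.executable,
        script_kind=script_kind,
    )
    with WheelFile.open(wheel_path) as source:
        install(source=source, destination=destination, additional_metadata={})
```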

I have dedicated a big amount of time trying to understand and sort out all this mess, and this is my opinion from the experience gathered from that. I don't have the energy to try to prevent everybody from making mistakes, so I am not going to; I am just gonna leave this warning here.

@pradyunsg, if you think this is more than you are willing to handle, please just don't provide a CLI, instead of merging an arguably bad interface like this (No hard feelings @takluyver, I really appreciate your work here, I just think the approach is very understandably misguided, there are a lot of moving pieces here, and something like this will have a lot of implications on the rest of the ecosystem. This feedback is based on my experience actively working in sorting all of the install stuff out for the past 1~2 years, and still, I might be wrong in some of my takeaways. I think this proposal is understandable, but overly optimistic.)

TLDR: Please please please, can we stop with all this install location customization stuff? IMO the only sane, unproblematic way to deal with this is by querying the install locations from the Python interpreter. Manually specifying install locations will result in more bugs and extend the mess that we are already currently facing.

takluyver · 2022-01-04T18:09:00Z

@FFY00 This kind of criticism is absolutely fine - I was hoping to provoke this sort of discussion, and the code in this PR didn't take a lot of effort, so I won't be upset if we decide it's not useful. I just wanted to make the alternative approach concrete.

FWIW, the issues and PRs you point to seem to me - at least at a quick look - like an argument for this sort of interface, where paths are specified explicitly. Linux distros (and possibly other downstream environments) want to install files in different places to the default locations Python would normally use. In the absence of an explicit way to specify destination paths, they resort to patching or monkeypatching sysconfig and/or distutils, which ultimately leads to more complexity. I can't stop them customising install locations, so the best way to avoid that complexity seems to be providing a simple way to specify destination paths (as in this PR).

This is also useful if you want to do the installation on a different platform, or with a different Python, to the one which will run the code. I think this is @jameshilliard's use case, and I've also wanted to do something similar in Pynsist, to build Windows installers on Linux or Mac.

We may well want an 'install for this Python' entry point as well - although for downstreams which are happy using pip with its bundled dependencies, that's already covered. But I think there is value in a lower-level 'install to these directories' interface.

FFY00 · 2022-01-04T18:39:28Z

Like I said in my reply, I don't have the energy to stop people from making mistakes, so I am not gonna engage in an argument, I am just gonna try to clarify what I mean.

FWIW, the issues and PRs you point to seem to me - at least at a quick look - like an argument for this sort of interface, where paths are specified explicitly. Linux distros (and possibly other downstream environments) want to install files in different places to the default locations Python would normally use. In the absence of an explicit way to specify destination paths, they resort to patching or monkeypatching sysconfig and/or distutils, which ultimately leads to more complexity. I can't stop them customising install locations, so the best way to avoid that complexity seems to be providing a simple way to specify destination paths (as in this PR).

The current issues we have are due to people bypassing existing mechanisms, which is problematic because things are very tightly coupled and changing the paths in one place will result in a plethora of inconsistencies across the ecosystem, which all need to keep up with each other, and that is simply not maintainable. The approach in this PR plays and contributes into that, which I believe is the wrong choice and will result in more long term pain.

I have put a lot of effort in trying to get people to move away from this approach and let the Python interpreter be the source of truth for install locations (see https://gist.github.com/FFY00/625f65681fbcd7fc039dd4d727bb2c2f), because as we have seen over the years, the current approach of just letting everyone change locations where they want and hope everything works properly together is just not maintainable. It is painful to users, who will have stuff break, it is painful for developers as they will have to implement custom handling for specific users (e.g. Debian) and make sure their codebase takes inconsistencies into account and make sure it works with external changes (see pypa/pip#9617, mesonbuild/meson#9288, etc.), and it's painful for Python core as the developers lose a lot of room for change and improvement as the whole ecosystem becomes dependent on how things are implemented, even private APIs, and makes the code incredibly fragile, making any sort of change potentially problematic for thousands of users (an example of this is the distutils deprecation, see https://ffy00.github.io/blog/02-python-debian-and-the-install-locations/).

This is also useful if you want to do the installation on a different platform, or with a different Python, to the one which will run the code. I think this is @jameshilliard's use case, and I've also wanted to do something similar in Pynsist, to build Windows installers on Linux or Mac.

That is a very specific use-case, and the API is more than enough to satisfy it, but IMO this shouldn't be exposed as the CLI for this project. It can be easily integrated in pynsist and cross compiling tooling.

We may well want an 'install for this Python' entry point as well - although for downstreams which are happy using pip with its bundled dependencies, that's already covered. But I think there is value in a lower-level 'install to these directories' interface.

I am not challenging the value of it, but rather what it enables; this is somewhere where I think we should tread very carefully. Of course this has value, and the tooling that needs that value can still make use of this project, but I don't think we should be exposing this to general users, who will use and abuse the interface without understanding the impact it has.

Hopefully that clarifies my point, and what I am trying to say, a bit better.

Cheers,
Filipe

pradyunsg · 2022-01-04T18:42:19Z

@FFY00 your inputs and opinions are very welcome! ^.^

querying the install locations from the Python interpreter

Is there a single unambiguous portable-across-redistributors way to do this? My understanding is that there isn't, and that's why I want to not get involved in this.

Honestly, if there's even a single mechanism that we want to "bless" that's 3.7+ or even 3.11+, I'm onboard for having that be the only thing that the CLI provides support for, as long as:

it does not do anything other than unpacking the wheel.
it does not bake in "this Python" assumption into the CLI (avoiding a subprocess when using the same interpreter is fine, but require the users to explicitly specify it).

FFY00 · 2022-01-04T18:54:58Z

Is there a single unambiguous portable-across-redistributors way to do this? My understanding is that there isn't, and that's why I want to not get involved in this.

The issue here is people patching distutils for the install locations and not sysconfig (https://ffy00.github.io/blog/02-python-debian-and-the-install-locations/ gets into this a bit). This should not be happening as of Python 3.10, when the downstream customization mechanism in sysconfig was introduced. It will certainly not be happening as of Python 3.12.

it does not do anything other than unpacking the wheel.

What about validating compatibility? And the "spread" step specified in PEP 427?

it does not bake in "this Python" assumption into the CLI (avoiding a subprocess when using the same interpreter is fine, but require the users to explicitly specify it).

There is no getting around this. The interpreter is the only reasonable source of truth for the install locations. We can have an introspection script that we run on the target Python to get the install location if you really want this.
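
A minimal sketch of such an introspection step, assuming the target interpreter can actually be executed on the machine doing the install, could be as small as the following. `query_install_paths` is a hypothetical name; the only real APIs used are `subprocess.run` and `sysconfig.get_paths`.

```python
import json
import subprocess
import sys

# Runs inside the target interpreter and prints its install paths as JSON.
_INTROSPECT = "import json, sysconfig; print(json.dumps(sysconfig.get_paths()))"


def query_install_paths(interpreter):
    """Ask ``interpreter`` for its own install locations via sysconfig."""
    result = subprocess.run(
        [interpreter, "-c", _INTROSPECT],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Here the "target" is simply the interpreter running this script.
paths = query_install_paths(sys.executable)
```

Any downstream patching of sysconfig in the target interpreter is picked up automatically, which is the point of making the interpreter the source of truth.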

layday · 2022-01-04T19:16:43Z

There is no getting around this. The interpreter is the only reasonable source of truth for the install locations. We can have an introspection script that we run on the target Python to get the install location if you really want this.

https://github.com/layday/pyproject-install does this if anybody wants it, which I honestly don't think they will; distros don't care about installing with/into arbitrary Pythons.

Linux distros (and possibly other downstream environments) want to install files in different places to the default locations Python would normally use. In the absence of an explicit way to specify destination paths, they resort to patching or monkeypatching sysconfig and/or distutils, which ultimately leads to more complexity.

But distutils does allow passing scheme paths via the command line like you are doing here. If they'd rather patch distutils, there must be another reason that they do that.

jameshilliard · 2022-01-04T21:58:39Z

@takluyver

I can see the case for having a separate 'install for this Python' CLI, but I think it's in keeping with the design of installer to have a fully explicit CLI that doesn't take anything from the running Python.

Yeah, for cross compilation I think we need explicit overrides here since we can't actually run the target interpreter during install at all(only the host interpreter).

@pradyunsg

Let's leave it fully manual for now?

Seems like a good starting point.

@FFY00

IMO the only sane, unproblematic way to deal with this is by querying the install locations from the Python interpreter.

So how would we actually do this if we can't even run the target interpreter during install(ie when cross compiling?) and need to choose if a package is being installed for the host or the target?

@takluyver

In the absence of an explicit way to specify destination paths, they resort to patching or monkeypatching sysconfig and/or distutils, which ultimately leads to more complexity.

Agreed, we already modify sysconfig a good bit, trying to minimize downstream customizations is ideal, making it more difficult for downstreams to make required overrides isn't going to make things better overall IMO.

@FFY00

The current issues we have are due to people bypassing existing mechanisms, which is problematic because things are very tightly coupled and changing the paths in one place will result in a plethora of inconsistencies across the ecosystem, which all need to keep up with each other, and that is simply not maintainable.

The existing mechanisms simply don't provide the required customization needed for cross compilation from my understanding, at least if the mechanism is provided upstream it can be reworked in one place instead of in a bunch of downstream locations.

I have put a lot of effort in trying to get people to move away from this approach and let the Python interpreter be the source of truth for install locations (see https://gist.github.com/FFY00/625f65681fbcd7fc039dd4d727bb2c2f), because as we have seen over the years, the current approach of just letting everyone change locations where they want and hope everything works properly together is just not maintainable.

The site packages directories from the default install scheme should have a constant value, independently from the distribution we are using, and should always be used by the site module.

Except this assumption seems to be effectively entirely ignoring normal cross compilation scenarios if I'm understanding things correctly.

We're not actually trying to modify the runtime install layout (we want everything installed in the same site-packages for the target interpreter) but we have to use a different interpreter (the build host interpreter) from the target interpreter for build+installation and that has to have a separate runtime site-packages that is entirely independent from the target site-packages since we can't use target site-packages for build+installation at all (since the target interpreter can't be executed during build+installation as that process happens entirely on the build host).

We do try to correctly patch/override sysconfig for cross building at least and we outright don't support any sort of runtime python package installation/building on the target which means there's no need for us to have a separate distro-packages at all(as that seems to be designed to deal with an entirely different issue). On the target it's expected for us that all python packages will be installed in the same site-packages by the time the target interpreter runs for the first time.

Note we actually have effectively multiple sysconfigs here being used by the same host interpreter because our host build dependencies must be installed to the host interpreter site-packages and built against host libs while target runtime dependencies must go in the target site-packages and built against target libs. We currently are using our infrastructure to set this when packages are built/installed so that things end up in the right place.

That is a very specific use-case, and the API is more than enough to satisfy it, but IMO this shouldn't be exposed as the CLI for this project. It can be easily integrated in pynsist and cross compiling tooling.

We really prefer to interface with this sort of thing from the CLI, a python API is much more annoying and less maintainable since our cross compilation tooling is make based and not python based.

There is no getting around this. The interpreter is the only reasonable source of truth for the install locations. We can have an introspection script that we run on the target Python to get the install location if you really want this.

Except it's not possible in cross-compilation scenarios, the interpreter doing the install is the host toolchain interpreter, not the target one(we can't even run the target interpreter during install since it is generally built for an incompatible architecture), we have to be able to change this on the fly as well since we are doing both host and target builds+installations from the same host interpreter.

takluyver · 2022-01-05T11:14:25Z

I think I'm convinced that the need for this is a niche use case (where it's not possible to run the target Python to find the paths), and the main python -m installer CLI should install for the running Python, as in #66. Maybe this CLI can live at something like python -m installer.manual, or maybe it can just be maintained as an internal tool by people like @jameshilliard who need it.

What about validating compatibility?

From his comments on #66, I think Pradyun would rather that the CLI, like the Python API, just install the wheel it's given, with no extra checks. I'm inclined to agree - you've already made those checks soft-fail if dependencies are missing, for bootstrapping purposes, and that feels like a messy compromise for a low level tool.

And the "spread" step specified in PEP 427?

From how installer already works, I'm pretty sure Pradyun was including the 'spread' step as part of unpacking. PEP 427's suggestion that you first unpack the entire wheel into site-packages and then move files around is somewhat strange to me - it's surely more reliable to unpack files directly to the destination directories.
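
To illustrate what unpacking "directly to the destination directories" means, here is a sketch of the per-file routing rule from the wheel format: files under `<name>-<version>.data/<key>/` go to the directory for that key, and everything else goes to purelib or platlib. `destination_for` is a hypothetical helper written for this example and is not installer's actual implementation.

```python
import posixpath


def destination_for(archive_path, data_dir, scheme, root_is_purelib=True):
    """Return the final path for one file inside a wheel archive."""
    prefix = data_dir + "/"
    if archive_path.startswith(prefix):
        # "<name>-<version>.data/<key>/<rest>" goes to the directory for <key>.
        key, _, rest = archive_path[len(prefix):].partition("/")
        return posixpath.join(scheme[key], rest)
    # Everything else lands in purelib or platlib, per Root-Is-Purelib in WHEEL.
    root = scheme["purelib" if root_is_purelib else "platlib"]
    return posixpath.join(root, archive_path)
```

Because each file is written once to its final location, there is no intermediate state in which scripts or headers sit inside site-packages.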

takluyver · 2022-01-05T12:33:34Z

I've opened #94, building off @FFY00's branch for #66, as an alternative for consideration. That one specifically installs for the Python interpreter running installer.

FFY00 · 2022-01-05T15:23:36Z

I think I'm convinced that the need for this is a niche use case (where it's not possible to run the target Python to find the paths), and the main python -m installer CLI should install for the running Python, as in #66. Maybe this CLI can live at something like python -m installer.manual, or maybe it can just be maintained as an internal tool by people like @jameshilliard who need it.

I would naturally prefer it to be internal to use-case specific tooling, but I think exposing a CLI like that could be a reasonable compromise, though I am really not jazzed about it. What I am hard against is that it be the main or only CLI provided by this package. So far, I do not think any of the presented use-cases warrant it.

So how would we actually do this if we can't even run the target interpreter during install(ie when cross compiling?) and need to choose if a package is being installed for the host or the target?

You should customize the Python you build to select a different install scheme when cross compiling (via https://docs.python.org/3/library/sysconfig.html#sysconfig._get_preferred_schemes, or https://gist.github.com/FFY00/625f65681fbcd7fc039dd4d727bb2c2f#--with-vendor-config if that gets accepted). When doing this, you will need to ensure the paths are correct, so you need to be careful, but you have to do the same thing for every other cross compiling customization.
You currently already have to patch sysconfig, this should be no different. There is already an upstream supported mechanism to customize the selected layout at runtime, there's no need to have a custom installer for that use-case, we already have hooks to do that in Python directly.
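
For reference, this is roughly how a tool would read the scheme that a (possibly vendor-customized) interpreter prefers, rather than hardcoding paths. It is a sketch only: `preferred_prefix_scheme` is a hypothetical helper, `sysconfig.get_preferred_scheme` needs Python 3.10 or later, and the fallback for older versions is an assumption of mine.

```python
import os
import sysconfig


def preferred_prefix_scheme():
    """Return the name of the install scheme this interpreter prefers."""
    get_preferred = getattr(sysconfig, "get_preferred_scheme", None)
    if get_preferred is not None:
        # Python 3.10+: honours any downstream override of the preferred schemes.
        return get_preferred("prefix")
    # Assumed fallback for older Pythons, which lack the public function.
    return "nt" if os.name == "nt" else "posix_prefix"


scheme_name = preferred_prefix_scheme()
paths = sysconfig.get_paths(scheme=scheme_name)
```

A redistributor that changes the preferred scheme inside its own Python build changes what `paths` contains here, without the installing tool needing any special case.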

FFY00 · 2022-01-05T15:27:39Z

I can't find the link right now, but IIRC Fedora does a similar thing to select their custom install scheme in Python 3.10. Perhaps @hroncok can point you to it.

eli-schwartz · 2022-01-05T16:18:49Z

Except it's not possible in cross-compilation scenarios, the interpreter doing the install is the host toolchain interpreter, not the target one(we can't even run the target interpreter during install since it is generally built for an incompatible architecture), we have to be able to change this on the fly as well since we are doing both host and target builds+installations from the same host interpreter.

For e.g. packages which build using meson, cross compilation would require the use of exe_wrapper in order to have meson's target python introspection (essentially the same thing as @layday's example, although meson does scrape for both sysconfig paths and sysconfig vars) and generally target introspection as a whole, run the target python using, say, qemu-user.

Similarly, it would be possible (if not perfectly convenient) to use a pyproject-install CLI that expected an --interpreter argument. You could set it to ${hostbins}/qemu-target-python or something and have that be a shell script that invokes python using qemu. (Then this project doesn't need to explicitly code support for a cross compilation wrapper that is prepended to the interpreter argument).

Under this scenario, --interpreter is useful/usable in probably all cases, while manually hardcoding every single path as CLI arguments is useful nowhere other than "buildroot where we want to use the python API for maximum flexibility, but using CLI command arguments to pass function data because buildroot itself isn't written in python".

tl;dr I'm skeptical that trying to more or less implement the python API as CLI arguments is the ideal implementation form. It could exist as an advanced mode, but I strongly advise against it being the primary, or the only, mode of operation. The worst thing about it is that it encourages people to think it's a good idea to use it for purposes other than "a workaround for cross compilation because the interpreter in question isn't built for this CPU and OS".

Yes, that's right. Manually specifying every path is absolutely a workaround, not an inherent use case. It's not something you want to do, it's something you do because you can't (or find it difficult to) do the --interpreter method.

As such, it is a bad default design.

hroncok · 2022-01-05T17:41:45Z

I can't find the link right now, but IIRC Fedora does a similar thing to select their custom install scheme in Python 3.10. Perhaps @hroncok can point you to it.

fedora-python/cpython@f77f87b#diff-d593bd299ba58e440ba411ffa0640ccd9d20d518b0cf2644ed4bdb75a82a3e70

FFY00 · 2022-01-05T18:49:24Z

Thanks! I see you are not using sysconfig._get_preferred_schemes, so it actually differs a bit from what I was proposing 😅

hroncok · 2022-01-05T18:53:19Z

Eventually, we intend to get there, but not all tools respected that when we tried.

jameshilliard · 2022-01-05T20:06:33Z

@takluyver

Maybe this CLI can live at something like python -m installer.manual

This would be fine.

maybe it can just be maintained as an internal tool by people like @jameshilliard who need it

I'm really trying to avoid having to maintain something like this downstream since it means we are less likely to be in sync with upstream changes, if a maintained cli option is simply removed/changed in the future we want it to throw a hard error so the failure can be identified easily and so that we can migrate as needed, if we have to use the python API we're more likely to hit subtle difficult to trace breakage IMO.

From his comments on #66, I think Pradyun would rather that the CLI, like the Python API, just install the wheel it's given, with no extra checks.

Yeah, I think that's fine for our use case, we're fairly experienced with debugging these sort of issues in general so it's not a big deal as long as the functionality needed to fix the issue is properly exposed.

PEP 427's suggestion that you first unpack the entire wheel into site-packages and then move files around is somewhat strange to me - it's surely more reliable to unpack files directly to the destination directories.

Yeah, this also seems very error prone. We also need to make sure some things, like headers, end up not in the target rootfs but in the sysroot/staging directory (where headers and other files for the target package builds live; it is separate from the target rootfs and is not present at runtime on the target), so that they are usable by other packages during build. So I would expect something like this to be undesirable, as we may accidentally end up with things like headers in the target rootfs.

@FFY00

What I am hard against is that it be the main or only CLI provided by this package.

Yeah, I'm not against having a normal non-manual CLI, that functionality is just fairly well covered by other tools like pip I think.

You should customize the Python you build to select a different install scheme when cross compiling (via https://docs.python.org/3/library/sysconfig.html#sysconfig._get_preferred_schemes, or https://gist.github.com/FFY00/625f65681fbcd7fc039dd4d727bb2c2f#--with-vendor-config if that gets accepted). When doing this, you will need to ensure the paths are correct, so you need to be careful, but you have to do the same thing for every other cross compiling customization.

It's not really clear if this would be sufficient for cross compilation. Cross compilation effectively requires a special hybrid setup, with dynamic scheme selection for each package build/install invocation, some of which are mixed-scheme in a way.

there's no need to have a custom installer for that use-case, we already have hooks to do that in Python directly

Our build system is Make-based, so calling Python APIs directly is not very maintainable.

@eli-schwartz

For e.g. packages which build using meson, cross compilation would require the use of exe_wrapper in order to have meson's target python introspection (essentially the same thing as @layday's example, although meson does scrape for both sysconfig paths and sysconfig vars) and generally target introspection as a whole, run the target python using, say, qemu-user.

We don't use exe_wrapper for that. We do have some special-cased overrides along those lines for gobject-introspection, but exe_wrapper doesn't work well in the general case for us, since our infrastructure generally expects host build tools to be compiled for the host architecture. Also, using qemu wrappers for everything would be quite slow, I think.

We override the binaries in the meson cross compilation config with scripts that call qemu wrappers.

Similarly, it would be possible (if not perfectly convenient) to use a pyproject-install CLI that expected an --interpreter argument. You could set it to ${hostbins}/qemu-target-python or something and have that be a shell script that invokes python using qemu. (Then this project doesn't need to explicitly code support for a cross compilation wrapper that is prepended to the interpreter argument).

I mean, at a minimum this doesn't make much sense for our use case, since things like the headers directory won't even exist on the target rootfs, so as I understand it the target interpreter config isn't really what we want. We need dynamic override capability in some form.

Yes, that's right. Manually specifying every path is absolutely a workaround, not an inherent use case. It's not something you want to do, it's something you do because you can't (or find it difficult to) do the --interpreter method.

I mean I don't really think we have a good way to get this from the target interpreter since it's not expected to have the cross compilation target environment configured for itself in general but rather the target runtime configuration...it's complicated.

@hroncok @FFY00

fedora-python/cpython@f77f87b#diff-d593bd299ba58e440ba411ffa0640ccd9d20d518b0cf2644ed4bdb75a82a3e70

Hmm, interesting, so is the intention to be able to use env variables to override site scheme on the fly in upstream python?

FFY00 · 2022-01-05T20:25:39Z

It's not really clear if this would be sufficient for cross compilation. Cross compilation effectively requires a special hybrid setup, with dynamic scheme selection for each package build/install invocation, some of which are mixed-scheme in a way.

I am aware of the needs of cross-compilation. The approach I described should be sufficient.

jameshilliard · 2022-01-05T20:55:05Z

I am aware of the needs of cross-compilation. The approach I described should be sufficient.

It's just somewhat unclear how it would fit into our Python package infrastructure. Keep in mind that something sufficient for a generalized cross-compilation build may not be sufficient for integration with our infrastructure, which often has additional requirements: we aren't a normal distribution (we don't support binary package management/installation like deb/rpm/opkg) but rather are essentially a source-based rootfs generator tool.

eli-schwartz · 2022-01-05T23:42:02Z

We don't use exe_wrapper for that. We do have some special-cased overrides along those lines for gobject-introspection, but exe_wrapper doesn't work well in the general case for us, since our infrastructure generally expects host build tools to be compiled for the host architecture. Also, using qemu wrappers for everything would be quite slow, I think.

We override the binaries in the meson cross compilation config with scripts that call qemu wrappers.

Meson's exe_wrapper isn't about running the target system's native compiler using emulation. It's quite reasonable to use cross compilers rather than emulating native compilers. The exe_wrapper is used for things like... running the cross target's python, which is required in order to get include paths, the EXT_SUFFIX, whether to link to libpython, etc. But also when, say, running the testsuite.
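As a rough illustration of the kind of introspection described here, the sketch below asks a given interpreter for its sysconfig paths and its EXT_SUFFIX. The helper name `introspect` and the convention of passing the interpreter as a list of argv items are assumptions made for this example; they are not part of meson or installer. In a cross build the list would start with whatever wrapper runs the target Python (for example a qemu-user command), while here it simply reuses the running interpreter.

```python
import json
import subprocess
import sys

# Script executed by the interpreter being inspected; it prints JSON on stdout.
_PROBE = """
import json, sysconfig
print(json.dumps({
    "paths": sysconfig.get_paths(),
    "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
}))
"""


def introspect(interpreter_cmd):
    """Run the probe under interpreter_cmd (a list of argv items) and parse its output."""
    result = subprocess.run(
        [*interpreter_cmd, "-c", _PROBE],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Hypothetical usage: inspect the interpreter that is running this script.
    info = introspect([sys.executable])
    print(info["paths"]["purelib"], info["ext_suffix"])
```

Because the command is just a list, a wrapper can be prepended without the probing code knowing anything about emulation.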

...

So essentially the problem is that you want to install everything as per sysconfig paths, except for headers because "the embedded rootfs doesn't need that".

And the solution is to override the paths and install headers outside of the staging directory, instead of for example to have a cleanup stage immediately before baking the image, which prunes unwanted contents?

Personally, I think this is a confusing setup. But if that's what you really need, okay. I do think that this sounds like a choice rather than a requirement, and that a CLI installer should not feel bound to support it, even if it does choose to do so anyway in an advanced, non-default mode.

jameshilliard · 2022-01-05T23:58:46Z

The exe_wrapper is used for things like... running the cross target's python, which is required in order to get include paths, the EXT_SUFFIX, whether to link to libpython, etc. But also when, say, running the testsuite.

Yeah, I'm aware...we don't want testsuite running so we only use qemu when absolutely needed for gobject-introspection...and since the target python will have correct info in a number of cases here so it's kinda pointless to use it for configuration IMO, also it's really not necessary.

So essentially the problem is that you want to install everything as per sysconfig paths, except for headers because "the embedded rootfs doesn't need that".

Well, sysconfig paths are, I think, mixed between sysroot and host interpreter right now, but we def don't want headers ending up in the target rootfs, since that's the wrong location (the target rootfs directory is where the runtime stuff lives, not compile-time dependencies).

And the solution is to override the paths and install headers outside of the staging directory, instead of for example to have a cleanup stage immediately before baking the image, which prunes unwanted contents?

Staging is the target sysroot (i.e. it does not get installed into the rootfs, but is where the files needed for building target binaries live).
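To make the rootfs/staging split concrete, here is a small sketch of how the five destination directories named in the PR description might be computed when runtime files go to the target rootfs and headers go to staging. The function name, the example directories and the `usr/...` layout are all assumptions for illustration; they are not taken from any real build system configuration.

```python
from pathlib import PurePosixPath


def destination_paths(rootfs, staging, py_version="3.10", dist_name="example"):
    """Return the five wheel destination directories for a cross install.

    Runtime files are placed under rootfs; headers are placed under staging
    (the sysroot used only while building other target packages).
    """
    rootfs = PurePosixPath(rootfs)
    staging = PurePosixPath(staging)
    site_packages = rootfs / "usr" / "lib" / f"python{py_version}" / "site-packages"
    return {
        "purelib": str(site_packages),
        "platlib": str(site_packages),
        "scripts": str(rootfs / "usr" / "bin"),
        "data": str(rootfs / "usr"),
        # Headers are a build-time dependency, so they stay out of the rootfs.
        "headers": str(staging / "usr" / "include" / f"python{py_version}" / dist_name),
    }


if __name__ == "__main__":
    # Hypothetical build directories.
    for key, path in destination_paths("/build/target", "/build/staging").items():
        print(f"{key}: {path}")
```

A mapping like this is what an explicit-paths interface would take as input, either as one command-line option per key or as a dictionary passed to the Python API.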

pradyunsg · 2022-01-06T07:22:50Z

Closing this out in favour of #94.

I’m convinced that such a CLI being the default / easier-to-use than install-using-sysconfig paths is a bad idea. If someone really wants this interface, they can write a Python script that does this, using the API.

jameshilliard · 2022-01-06T07:34:12Z

I’m convinced that such a CLI being the default / easier-to-use than install-using-sysconfig paths is a bad idea.

So maybe this should be reworked to go under python -m installer.manual or something as a non-default cli?

takluyver · 2022-01-06T10:56:37Z

I'm happy to move this CLI to a different name under the installer module if @pradyunsg wants. But I also get @FFY00's argument that it could be a solution that lets people easily get round an immediate problem, but leads to more problems down the line.

The CLI I proposed here is built entirely on the public, documented Python API of the installer module, so if there are only niche use cases, I do think it's reasonable to say the downstream users who have that use case can maintain a wrapper like this for themselves. I don't agree that changes in a Python API are any more likely or harder to debug than changes in a CLI.

jameshilliard · 2022-01-06T19:32:54Z

I do think it's reasonable to say the downstream users who have that use case can maintain a wrapper like this for themselves.

Isn't that just going to lead to significant implementation fragmentation? I mean that's a big issue right now in general with python build/install integrations.

takluyver · 2022-01-07T09:28:18Z

At present, our guess is that there aren't many downstreams that actually need this fully manual interface, in which case fragmentation isn't too big a concern. If that turns out to be wrong, I think we'd probably want to incorporate this.

jameshilliard · 2022-01-10T07:45:25Z

@takluyver

our guess is that there aren't many downstreams that actually need this fully manual interface

I think anyone cross compiling will need something like this, although it seems the important part is going to be a root override functionality. You can see what my latest integration attempt looks like here.

in which case fragmentation isn't too big a concern

It's a concern for downstream maintainers...we can of course manage to work around these issues but it's just adding more to the pile of hacks needed for python cross compilation due to the lack of upstream support...

takluyver · 2022-01-10T09:44:05Z

Yours is the only downstream I've actually heard from that really focuses on cross compiling. Most downstream packagers we deal with (Linux distros, things like Anaconda & Spack) build on the target platform. Anaconda has a 'noarch' shortcut to allow pure-Python packages to be built once, but I don't think they'd use installer for that in any case. Spack supports some kind of cross-compiling, but it's not the main way it's used, and they're working on using pip for building Python packages.

If there are other packaging systems in a similar situation, I think we'd be interested in hearing from them. But hearing repeatedly from one downstream system doesn't really make the case that it's a general need.

FFY00 · 2022-01-10T10:12:50Z

I think anyone cross compiling will need something like this, although it seems the important part is going to be a root override functionality. You can see what my latest integration attempt looks like here.

As I have said, you can achieve this with #66 or #94 by customizing sysconfig. You will need to add an install scheme with those paths, and make sysconfig._get_preferred_schemes return it when you are cross compiling.
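A minimal sketch of what such a customization could look like, under several assumptions: the scheme name `cross_example`, the `{rootfs}` and `{staging}` placeholders and the example directories are invented for illustration, and `sysconfig._INSTALL_SCHEMES` is a private CPython detail. A distributor would normally patch sysconfig in the Python they build rather than monkeypatch it at runtime as done here.

```python
import sysconfig

# Hypothetical scheme name; a real distributor would pick its own.
SCHEME = "cross_example"

# Register the scheme. {rootfs} and {staging} are custom placeholders that are
# filled in through the `vars` argument of sysconfig.get_paths() below.
sysconfig._INSTALL_SCHEMES[SCHEME] = {
    "stdlib": "{rootfs}/usr/lib/python{py_version_short}",
    "platstdlib": "{rootfs}/usr/lib/python{py_version_short}",
    "purelib": "{rootfs}/usr/lib/python{py_version_short}/site-packages",
    "platlib": "{rootfs}/usr/lib/python{py_version_short}/site-packages",
    "include": "{staging}/usr/include/python{py_version_short}",
    "platinclude": "{staging}/usr/include/python{py_version_short}",
    "scripts": "{rootfs}/usr/bin",
    "data": "{rootfs}/usr",
}


def _get_preferred_schemes():
    """Make every lookup key ('prefix', 'home', 'user') resolve to the custom scheme."""
    return {"prefix": SCHEME, "home": SCHEME, "user": SCHEME}


# Install the hook that sysconfig consults when choosing a default scheme.
sysconfig._get_preferred_schemes = _get_preferred_schemes

# Read the hook directly here; sysconfig.get_preferred_scheme() may bypass it
# in some situations (for example inside a virtual environment).
scheme = sysconfig._get_preferred_schemes()["prefix"]
paths = sysconfig.get_paths(
    scheme=scheme,
    vars={"rootfs": "/build/target", "staging": "/build/staging"},
)

if __name__ == "__main__":
    for key in ("purelib", "platlib", "include", "scripts", "data"):
        print(f"{key}: {paths[key]}")
```

Whether this is enough for a given cross-compilation setup is exactly what the rest of the thread debates; the sketch only shows the mechanism.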

jameshilliard · 2022-01-10T12:03:41Z

Yours is the only downstream I've actually heard from that really focuses on cross compiling.

There are others, like OpenEmbedded and OpenWrt; we're significantly more upstream-first focused than those, though.

FRidh · 2022-01-16T10:54:14Z

Cross-compilation is also supported in Nixpkgs. Whenever the Python interpreter is used, a hook is executed that sets

        export _PYTHON_HOST_PLATFORM='${pythonHostPlatform}'
        export _PYTHON_SYSCONFIGDATA_NAME='${pythonSysconfigdataName}'
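To illustrate the effect of one of these variables, the sketch below starts a child interpreter with `_PYTHON_HOST_PLATFORM` set and reports what `sysconfig.get_platform()` returns there. This relies on the assumption that CPython on POSIX systems returns the value of that variable from `get_platform()`; both variables are undocumented internals, and the helper name and the example platform string are made up for this example.

```python
import os
import subprocess
import sys


def platform_with_override(host_platform):
    """Return sysconfig.get_platform() as reported by a child interpreter
    started with _PYTHON_HOST_PLATFORM set to host_platform."""
    env = dict(os.environ, _PYTHON_HOST_PLATFORM=host_platform)
    result = subprocess.run(
        [sys.executable, "-c", "import sysconfig; print(sysconfig.get_platform())"],
        check=True,
        capture_output=True,
        text=True,
        env=env,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical target platform string.
    print(platform_with_override("linux-aarch64"))
```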

takluyver and others added 4 commits January 3, 2022 12:47

Add simple, explicit CLI to install a .whl file

6599a8c

[pre-commit.ci] auto fixes from pre-commit.com hooks

6eaf6e5

for more information, see https://pre-commit.ci

Add mising docstrings

1cffdcd

Add periods to docstrings

bc84d6a

takluyver mentioned this pull request Jan 3, 2022

Add CLI #66

Closed

jameshilliard reviewed Jan 3, 2022

src/installer/__main__.py

jameshilliard reviewed Jan 3, 2022

Make --platlib optional

4a23b5e

takluyver mentioned this pull request Jan 5, 2022

CLI to install to Python running installer #94

Merged

pradyunsg closed this Jan 6, 2022

abravalheri mentioned this pull request Jan 7, 2022

easy_install installation location not controlled by --prefix pypa/setuptools#1930

Open

pradyunsg mentioned this pull request Jan 25, 2022

Support for --prefix #98

Closed

pradyunsg mentioned this pull request Jul 27, 2022

Add an option for pip to manage a different "environment" pypa/pip#11307

Closed
