Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lockfiles: determine if Poetry meets our requirements #12568

Closed
Eric-Arellano opened this issue Aug 13, 2021 · 3 comments
Closed

Lockfiles: determine if Poetry meets our requirements #12568

Eric-Arellano opened this issue Aug 13, 2021 · 3 comments

Comments

@Eric-Arellano
Copy link
Contributor

#12470 determined we cannot use pip-compile. This is to determine if we can use Poetry vs. need to teach Pex how to generate lockfiles.

(Credit to @jsirois for doing most of this initial investigation. Thank you!)

Problems solved by Poetry

Handling env markers

Poetry generates a lockfile intended to work for all requested environments, meaning: all requested Python interpreters and Linux, Windows, and macOS. This works consistently regardless of which machine you run Poetry from.

Solves #12200.

Solved issues

Poetry issues we can work around.

Poetry includes unused / unvetted dists

See #12458. This is a supply chain risk.

Our solution is to post-process poetry.lock to remove unused dists. It is still unresolved, though, how we will know what environments will be used and what will not.

Support for --platforms

See #12557. We want to only generate a lockfile if it is valid for the requested platforms, and not include entries/dists for unused platforms.

We can do this via post-processing poetry.lock.

There is one known edge case where we will "cry wolf" when we didn't need to: #12557 (comment), although there is a decent workaround.

Handling of setuptools and wheel

Poetry 1.1 and earlier does not include these in lockfiles, which we need. Poetry 1.2 fixes it, but that's an alpha release and several months from becoming stable.

We worked around this by monkey matching to mark these dependencies as "safe".

Unsolved issues

Handling of "bifurcated requirements"

Poetry does not properly handle dependencies like:

default_extra_requirements = [
"setuptools<45; python_full_version == '2.7.*'",
"setuptools; python_version > '2.7'",
]

Resulting in:

setuptools==44.1.1; python_full_version == "2.7.*" or python_version > "2.7" \
--hash=sha256:27a714c09253134e60a6fa68130f78c7037e5562c4f21f8f318f2ae900d152d5 \
--hash=sha256:c67aa55db532a0dadc4d2e20ba9961cbd3ccc84d544e9029699822542b5a476b \
--hash=sha256:a49230977aa6cfb9d933614d2f7b79036e9945c4cdd7583163f4e920b83418d6 \
--hash=sha256:6bac238ffdf24e8806c61440e755192470352850f3419a52f26ffe0a1a64f465

See python-poetry/poetry#4381 for an issue John opened. We'll see if they're open to fixing it..

No support for --find-links

Poetry supports indexes ([python-repos].indexes), but not --find-links ([python-repos].repos). The issue has been open since 2019 with little traction: https://github.com/python-poetry/poetry/issues/1391, and no maintainer resposne despite being highly upvoted.

This is a blocker, fwict. There are no documented workarounds, so we would likely need to contribute a patch and backport it to 1.1.

Leveraging Poetry's cache

At a minimum, we need to hook up Poetry to named_caches. Even better is teaching Pex how to consume Poetry's cache for downloading artifacts. A sample entry:

/Users/ericarellano/Library/Caches/pypoetry/artifacts/
├── 0a
│   └── 67
│       └── 21
│           └── 620031f7f2938bad77b057a68a4c2a334542a7402a3214ed48690caf8c
│               └── pathspec-0.9.0-py2.py3-none-any.whl

Poetry's support is broken for changing the cache dir, though, and they have not fixed it in over a year: python-poetry/poetry#2445. On Linux, we can hack around it by setting XDG_CACHE_HOME, but there is no workaround on macOS. For this to work, we need to land a patch to Poetry itself, particularly Poetry 1.1 because 1.2 is alpha.

I do not know how hard it would be to hook this up to Pex.

pip options on individual requirements (TODO)

Users have been asking for us to add support for pip options on individual requirements, rather than global options: #12090.

TODO: which of these pip options could be mapped to Poetry's pyproject.toml, if any?

Certs / auth (TODO)

Poetry does have support for credentials: https://python-poetry.org/docs/repositories/. TODO to confirm we can wire this all up properly.

Unsolvable issues

These are unlikely to be solved if we go with Poetry, e.g. they are unlikely to accept a patch or it's a fundamental problem. If we go with Poetry, we know we are making this tradeoff.

Using a new resolver

Switching from pip's new resolver (resolvelib) to Poetry's will risk breaking some users. Something that pip can resolve might not resolve for Poetry, and vice versa. Or they may resolve differently.

To assess this risk, we would want to write an experiment that resolves using Poetry vs. using pip over N dependencies and check for discrepancies.

How to force transitive deps to a certain value

Pip's constraints files were originally intended to pin certain transitive deps, which you need to do if you don't like how the resolver did things.

There is no way to emulate this in Poetry, to force its resolver to use a particular version. See python-poetry/poetry#3225 for an issue proposing it. Someone tried implementing it in python-poetry/poetry#4005, but the maintainer was "not really enthused about this". After two months of no review, the author gave up and closed the issue.

There is a workaround: our users can hand edit the lockfiles we generate. They can also ignore Pants's generation and provide their own hand-generated lockfile. Of course, that is a subpar UX and harder for the user to get things right.

@Eric-Arellano
Copy link
Contributor Author

@jsirois scoped out what it'd take to do this in Pex. Copied from Slack:

Alright - it does turn out you can get pip to do what's needed to generate Poetry / PDM style lock files. Its about 100 lines of additional patching here: https://github.com/pantsbuild/pex/blob/ffa0a513d18e2ef16895fb92f80bf0e918295dad/pex/pip.py#L334-L365

That gives you (optionally) platform independent locks, pip legacy and pip 2020 support, --find-links support, constraints support, pip args support, --platform support without crying wolf and tests locking pantsbuild.pants show slightly fatser (~10.4s lock vs ~11s lock with Poetry).

Let me know what further you want me to do if anything. To clean up my weekend hacks and do more exhaustive testing to get this added as a feature to Pex will probably be a few weeks of work.

The interesting cases I hadn't considered and that came up in my hacking were:

  1. sdist resolution: Pip runs python setup.py egg_info (or PEP-517) to get the metadata to trace dependencies and you can't do this when you want to do a platform agnostic lock since you may not have an appropriate interpreter to run that setup.py with (some setup.py actively check sys.executable and fail fast if its version is undersired - functools32 is an exapmple). Poetry punts here and simply expects the sdist has a PKG-INFO file in it already. I expect this is ok for any modern sdist and it just dies on old ones. I had to hack Pip to do similar and use PKG-INFO if present.
  2. You can't simply turn off tag evaulation and marker evaluation. For marker evaluation you must handle the extras marker and you must handle python_version and python_full_version - but just those three and no others.

Thanks @jsirois, that's a really exciting find.

  1. To be unambiguous, "platform independent locks" also means generating a lockfile that works with multiple Python versions, right? Interpreter constraints do not play well with lock files. #12200. As w/ Poetry, you only need a single interpreter to run the lockfile generation, and that interpreter need not be in the range used for the lockfile?
  2. How would this approach Pants lockfile generation includes un-used dists and thus un-vetted dists. #12458? Are you still thinking an option to control this behavior, as proposed in the issue?
  3. It sounds like Pex would directly generate a requirements.txt-style lockfile, w/ no intermediate format like PEP 665?

I expect this is ok for any modern sdist and it just dies on old ones. I had to hack Pip to do similar and use PKG-INFO if present.

Agreed, sounds reasonable given the tradeoffs.

For marker evaluation you must handle the extras marker and you must handle python_version and python_full_version

Hm, how would that work with my first question of handling multiple Python versions?

@jsirois
Copy link
Contributor

jsirois commented Aug 16, 2021

  1. To be unambiguous, "platform independent locks" also means generating a lockfile that works with multiple Python versions, right? Interpreter constraints do not play well with lock files. #12200. As w/ Poetry, you only need a single interpreter to run the lockfile generation, and that interpreter need not be in the range used for the lockfile?

Correct.

  1. How would this approach Pants lockfile generation includes un-used dists and thus un-vetted dists. #12458? Are you still thinking an option to control this behavior, as proposed in the issue?

Correct. Lockfile generation is simple when generating the only lock you know works. Its only when you generate a platform independent lock where you don't know it actually works / is safe, etc that things get hard using Pip and you need to patch it. So that patching will be toggleable via a switch.

  1. It sounds like Pex would directly generate a requirements.txt-style lockfile, w/ no intermediate format like PEP 665?

Who knows. I have not given that any thought. As mentioned elsewhere - the format has ~0 to do with anything here. Pex could generate the lockfile in any random format or support multiple.

@Eric-Arellano
Copy link
Contributor Author

Closing for now as answered: we have a hypothesis that Pex's lockfile implementation will do what we need and @jsirois has been moving forward on an implementation. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants