Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using install_from_resolve with a manually generated lockfile raises AssertionError #18932

Closed
d-tw opened this issue May 7, 2023 · 7 comments
Assignees
Labels

Comments

@d-tw
Copy link

d-tw commented May 7, 2023

Describe the bug
Using install_from_resolve with [pytest] raises an AssertionError (from assert loaded_lockfile.is_pex_native in source) when the associated lockfile is a manually generated requirements-style lockfile.

The relevant parts of the pants.toml:

[GLOBAL]
pants_version = "2.16.0rc0"

[python]
enable_resolves = true
interpreter_constraints = ['==3.10.*']
resolves_generate_lockfiles = false

[python.resolves]
python-default = "src/py/deps.lock"

[pytest]
install_from_resolve = "python-default"

The lockfile contains the following dependencies (and their transitive dependencies):

poetry = "1.4.2"
pytest-cov = ">=2.12,!=2.12.1,<3.1"
pytest-xdist = ">=2.5,<3"

Am I missing something in my config? Is this use-case not supported?

Pants version
2.16.0rc0

OS
Tested on macOS. Not tested on other platforms.

Additional info
The contents of src/py/deps.lock is in this gist.

@d-tw d-tw added the bug label May 7, 2023
@benjyw
Copy link
Contributor

benjyw commented May 8, 2023

Yeah, this isn't supported. A tool lockfile must be a pants-generated one.

We have been thinking of deprecating those hand-knitted requirements.txt-style lockfiles. Do you have to use them, or could you switch to the (vastly better :) ) pants-generated ones?

@benjyw benjyw closed this as completed May 8, 2023
@d-tw
Copy link
Author

d-tw commented May 9, 2023

Ok, I understand.

My 2c, deprecating the requirements.txt would have a pretty big impact for us. We use pants as part of a wider tool chain.

We might be able to limit this impact if there's a clear way of generating a PEX-style lockfile from a poetry.lock.

Is the PEX-style lockfile format documented somewhere?

@benjyw
Copy link
Contributor

benjyw commented May 10, 2023

It's purposely undocumented. The only blessed way to create one is pants generate-lockfiles. What is the input from which you generate your lockfiles? You can have Pants generate one from that.

@d-tw
Copy link
Author

d-tw commented May 10, 2023

We have a pyproject.toml that is used to generate a poetry.lock file. We export this poetry.lock file into a requirements.txt that is used as a lock file by pants.

The requirements.txt is of the form:

adal==1.2.7 ; python_full_version >= "3.10.0" and python_full_version < "3.11.0" \
    --hash=sha256:2a7451ed7441ddbc57703042204a3e30ef747478eea022c70f789fc7f084bc3d \
    --hash=sha256:d74f45b81317454d96e982fd1c50e6fb5c99ac2223728aea8764433a39f566f1
adlfs==2023.1.0 ; python_full_version >= "3.10.0" and python_full_version < "3.11.0" \
    --hash=sha256:ccbdd6d33d5b7bce99a4a07c884c82fb135745d39b2d9b7b672f8e9d4d04407f \
    --hash=sha256:eca53f53d88fc8e2e7a2f1d8f5a40b8a1c56aa3b541e4aa2e7eaf55a6a789262
...

The poetry.lock file is also consumed by a variety of other tools, such as Dependabot for dependency vulnerability scanning.

In order for pants to use a PEX-style lockfile, I see 4 possibilities (please correct me where my assumptions are wrong):

  1. We replace the poetry.lock with a pants-managed lockfile.

This doesn't work for us, as we need the poetry.lock to be consumed by other systems. As you mentioned, the PEX-style lockfile is an undocumented proprietary format, so not suitable for consumption by other tools.

  1. We have two lockfiles, the poetry.lock and the PEX-style lockfile. Pants can read directly from the pyproject.toml to generate its own managed lockfile.

This goes against the whole idea of a lockfile, since we'll have two lockfiles built using different solvers at potentially different moments in time. There is a possibility of ending up with two different transitive closures, or different checksums. This is potential source of very subtle bugs or security vulnerabilities.

  1. Have pants generate a PEX-style lockfile using the requirements.txt as an input.

I'm assuming that from a version point of view, since the requirements.txt is strictly locked, the resulting PEX-style lockfile would have the exact same versions. That said, I believe that there is a possibility of drift on the checksums - does pants use the checksums as part of the resolve? Or could we potentially end up with two lock files with identical versions but different hashes?

In addition, this approach involves a redundant solve step, although that's not a blocker.

  1. Export the poetry.lock as a PEX-style lockfile without introducing an intermediate solving step.

This seems like the most appropriate approach, as it preserves our poetry.lock whilst providing pants with a PEX-style lockfile that has an identical transitive closure.

I understand that there is only one blessed way of generating PEX-style lockfiles, so producing one via custom code introduces a risk if the pants team decide to change the format down the line, but I believe this risk to be manageable as it would only occur on an upgrade, and that's an opt-in process on our side.

Whilst there is no documentation of the format (I'm guessing you don't want to commit to a specific structure or have to maintain backwards compatibility, which is understandable), it seems relatively straight-forward.

I still think some form of docs, or at least commented code, with a big red "PROCEED AT YOUR OWN RISK" would be useful :)

@benjyw
Copy link
Contributor

benjyw commented May 14, 2023

Re #4, the format is easy to reverse-engineer, but again can change without warning. So if it's not hard to cobble together that converter (I don't know enough about poetry.lock to say) then probably worth a try!

The alternative is to let Pants support poetry lockfiles for tools etc. I'll keep that in mind as I tinker with this functionality.

@benjyw
Copy link
Contributor

benjyw commented May 14, 2023

Re #3, you can run pex directly to do this, and I think the checksums would be stable, why wouldn't they be?

@d-tw
Copy link
Author

d-tw commented May 14, 2023

I'm going to give the converter a go - my mileage may vary...

Re-checksums, I would imagine that in most cases you're correct, the checksums would be identical, unless someone has changed the underlying files (either maliciously or otherwise) between the two resolves - then that becomes a vulnerability because you're not actually using the original checksum to check the code being downloaded.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants