refactor requires-python marker simplification #6268
╰─▶ Because your project depends on datasets<2.19 and datasets>=2.19, we can conclude that your project's requirements are unsatisfiable.
╰─▶ Because only datasets<2.19 is available and your project depends on datasets>=2.19, we can conclude that your project's requirements are unsatisfiable.
"###);
Yeah — this seems wrong in this context.
Looking at this..
Moving the transformation call so we can see the display before mutation...
diff --git a/crates/uv-resolver/src/error.rs b/crates/uv-resolver/src/error.rs
index 4a38e3e30..dc9eeecf8 100644
--- a/crates/uv-resolver/src/error.rs
+++ b/crates/uv-resolver/src/error.rs
@@ -222,7 +222,6 @@ impl std::fmt::Display for NoSolutionError {
// Transform the error tree for reporting
let mut tree = self.error.clone();
- simplify_derivation_tree_markers(&self.python_requirement, &mut tree);
let should_display_tree = std::env::var_os("UV_INTERNAL__SHOW_DERIVATION_TREE").is_some()
|| tracing::enabled!(tracing::Level::TRACE);
@@ -230,6 +229,7 @@ impl std::fmt::Display for NoSolutionError {
display_tree(&tree, "Resolver derivation tree before reduction");
}
+ simplify_derivation_tree_markers(&self.python_requirement, &mut tree);
collapse_no_versions_of_workspace_members(&mut tree, &self.workspace_members);
Using a cargo insta test invocation:
UV_INTERNAL__SHOW_DERIVATION_TREE=1 cit unconditional_overlapping_marker_disjoint_version_constraints
We get the following trees:
Resolver derivation tree before reduction
root==0a0.dev0 depends on project*
project==0.1.0 depends on datasets{python_full_version >= '3.11'}>=2.19
no versions of datasets{python_full_version >= '3.11'}>=2.19
no versions of project<0.1.0 | >0.1.0
Resolver derivation tree after reduction
project==0.1.0 depends on datasets>=2.19
no versions of datasets>=2.19
It looks like your simplifications here are fine, the tree is wrong before then.
Here are the (correct) trees from main:
Resolver derivation tree before reduction
root==0a0.dev0 depends on project*
project==0.1.0 depends on datasets>=2.19
project==0.1.0 depends on datasets<2.19
no versions of project<0.1.0 | >0.1.0
Resolver derivation tree after reduction
project==0.1.0 depends on datasets>=2.19
project==0.1.0 depends on datasets<2.19
Without `UV_EXCLUDE_NEWER`, the message is entirely unhinged.
Resolver derivation tree before reduction
root==0a0.dev0 depends on project*
project==0.1.0 depends on datasets{python_full_version >= '3.11'}>=2.19
project==0.1.0 depends on datasets<2.19
no versions of datasets{python_full_version >= '3.11'}>2.19.0, <2.19.1 | >2.19.1, <2.19.2 | >2.19.2, <2.20.0 | >2.20.0, <2.21.0 | >2.21.0
no versions of project<0.1.0 | >0.1.0
Resolver derivation tree after reduction
project==0.1.0 depends on datasets>=2.19
project==0.1.0 depends on datasets<2.19
no versions of datasets>2.19.0, <2.19.1 | >2.19.1, <2.19.2 | >2.19.2, <2.20.0 | >2.20.0, <2.21.0 | >2.21.0
× No solution found when resolving dependencies:
╰─▶ Because only the following versions of datasets are available:
datasets<=2.19.0
datasets==2.19.1
datasets==2.19.2
datasets==2.20.0
datasets==2.21.0
and your project depends on datasets<2.19, we can conclude that your project and datasets>=2.19.0 are incompatible.
And because your project depends on datasets>=2.19, we can conclude that your project's requirements are unsatisfiable.
Here is my current state: I still cannot fully explain the diff on the
I've spent a lot of time looking at the
[project]
name = "transformers"
version = "4.39.0.dev0"
requires-python = ">=3.9.0"
dependencies = []
[project.urls]
repository = "https://github.com/huggingface/transformers"
[project.optional-dependencies]
accelerate = [
"accelerate>=0.21.0",
]
flax = [
"flax>=0.4.1,<=0.7.0",
]
torch-speech = [
"librosa",
]
quality = [
"datasets!=2.5.0",
]
all = [
"tensorflow>=2.6,<2.16",
"onnxconverter-common",
"tensorflow-text<2.16",
"jax>=0.4.1,<=0.4.13",
]
agents = [
"opencv-python",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
Then lock this using
Here's one example of a difference. This is from
[[package]]
name = "tensorflow-macos"
version = "2.15.1"
source = { registry = "https://pypi.org/simple" }
resolution-markers = [
"python_full_version < '3.10' and platform_machine == 'aarch64' and platform_system == 'Linux'",
]
dependencies = [
{ name = "tensorflow-cpu-aws", marker = "python_full_version < '3.10' and platform_machine == 'aarch64' and platform_system == 'Linux'" },
]
wheels = [
{ url = "https://files.pythonhosted.org/packages/b3/c8/b90dc41b1eefc2894801a120cf268b1f25440981fcf966fb055febce8348/tensorflow_macos-2.15.1-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:b8f01d7615fe4ff3b15a12f84471bd5344fed187543c4a091da3ddca51b6dc26", size = 2158 },
{ url = "https://files.pythonhosted.org/packages/bc/11/b73387ad260614ec43c313a630d14fe5522455084abc207fce864aaa3d73/tensorflow_macos-2.15.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:58fca6399665f19e599c591c421672d9bc8b705409d43ececd0931d1d3bc6a7e", size = 2159 },
{ url = "https://files.pythonhosted.org/packages/3a/54/95b9459cd48d92a0522c8dd59955210e51747a46461bcedb64a9a77ba822/tensorflow_macos-2.15.1-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:cca3c9ba5b96face05716792cb1bcc70d84c5e0c34bfb7735b39c65d0334b699", size = 2158 },
]
And this is from this branch:
[[package]]
name = "tensorflow-macos"
version = "2.15.1"
source = { registry = "https://pypi.org/simple" }
resolution-markers = [
"python_full_version < '3.10' and platform_machine == 'arm64' and platform_system == 'Darwin'",
]
wheels = [
{ url = "https://files.pythonhosted.org/packages/b3/c8/b90dc41b1eefc2894801a120cf268b1f25440981fcf966fb055febce8348/tensorflow_macos-2.15.1-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:b8f01d7615fe4ff3b15a12f84471bd5344fed187543c4a091da3ddca51b6dc26", size = 2158 },
{ url = "https://files.pythonhosted.org/packages/bc/11/b73387ad260614ec43c313a630d14fe5522455084abc207fce864aaa3d73/tensorflow_macos-2.15.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:58fca6399665f19e599c591c421672d9bc8b705409d43ececd0931d1d3bc6a7e", size = 2159 },
{ url = "https://files.pythonhosted.org/packages/3a/54/95b9459cd48d92a0522c8dd59955210e51747a46461bcedb64a9a77ba822/tensorflow_macos-2.15.1-cp39-cp39-macosx_12_0_arm64.whl", hash = "sha256:cca3c9ba5b96face05716792cb1bcc70d84c5e0c34bfb7735b39c65d0334b699", size = 2158 },
]
Here's another example. From
[[package]]
name = "flatbuffers"
version = "24.3.25"
source = { registry = "https://pypi.org/simple" }
resolution-markers = [
"python_full_version < '3.10' and platform_machine == 'aarch64' and platform_system == 'Linux'",
]
From this branch:
[[package]]
name = "flatbuffers"
version = "24.3.25"
source = { registry = "https://pypi.org/simple" }
resolution-markers = [
"python_full_version < '3.10' and platform_machine == 'arm64' and platform_system == 'Darwin'",
]
This represents a key thing I don't yet understand: why does
One thing I do know is that this is related to fork prioritization. If I remove
Because of that, I spent some time trying to make fork prioritization in this branch work in the same way as in
Here are my thoughts in summary for where to go from here:
This is rather odd overall
I think the
Conversions from strings to paths are always infallible.
When I first wrote this routine, it was intended to only emit a trace for the final "unioned" resolution. But we actually moved that semantic operation to the construction of the resolution *graph*. So there is no unioned `Resolution` any more. But this is still useful to see. So I changed this to just emit a trace of *every* resolution right before constructing the graph. It might be nice to also emit a trace of the unioned graph too. Or perhaps we should do that instead if this proves too noisy. (Although this is only emitted at TRACE level.)
This commit refactors how we deal with `requires-python` so that instead of simplifying markers of dependencies inside the resolver, we do it at the edges of our system. When writing markers to output, we simplify when there's an obvious `requires-python` context. And when reading markers as input, we complexify markers with the relevant `requires-python` constraint.
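To make the "simplify on output, complexify on input" idea concrete, here is a toy sketch. It is not uv's implementation: the `Marker` type only models a lower bound on `python_full_version`, and the names `evaluate`, `simplify`, and `complexify` are hypothetical helpers chosen for illustration.

```rust
/// Toy version: (major, minor). Assumption for illustration only.
type Version = (u32, u32);

/// Toy marker: `Some(v)` means `python_full_version >= v`;
/// `None` means "always true".
type Marker = Option<Version>;

/// Evaluate a marker against a concrete Python version.
fn evaluate(marker: Marker, python: Version) -> bool {
    marker.map_or(true, |lower| python >= lower)
}

/// Drop a lower bound that `requires-python` already implies.
/// The result is only meaningful under that `requires-python` context.
fn simplify(marker: Marker, requires_python: Version) -> Marker {
    marker.filter(|&lower| lower > requires_python)
}

/// Re-attach the `requires-python` lower bound to a simplified marker,
/// so that it can be interpreted without any outside context.
fn complexify(marker: Marker, requires_python: Version) -> Marker {
    Some(marker.map_or(requires_python, |lower| lower.max(requires_python)))
}
```

With `requires-python = ">=3.9"`, the marker `python_full_version >= '3.8'` simplifies to "always true". Evaluated on its own for Python 3.7, the simplified form says `true` while the original says `false`, which is why the simplified form should only exist at the edges. Complexifying it gives `python_full_version >= '3.9'`, which agrees with the original for every version the project supports.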
This update changes the error message to one that is worse than the status quo, but it is still correct because `datasets >= 2.19` doesn't actually exist given our `EXCLUDE_NEWER` in tests at present. The underlying cause here seems to be in how PubGrub deals with reporting incompatibilities. Namely, when it has `foo < 1` and `foo >= 1`, it reports an incompatibility immediately before looking for versions. But when it has `foo < 1` and `foo >= 1 ; marker`, then because they aren't both the same pubgrub "package," it starts requesting versions first and hits the "not available" error path instead of the "incompatible" error path. Since this is more of an underlying issue with how we set up `PubGrubPackage` and our interaction with pubgrub, we ended up deciding to move forward here with the regression since this PR is fixing a correctness issue. In particular, if one changes the `requires-python` to `>=3.8`, then both `main` and this PR produce similarly bad error messages.
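The package-identity point can be illustrated with a deliberately tiny model. This is not PubGrub's algorithm or uv's `PubGrubPackage`; `PackageKey`, `VersionRange`, and `add_constraint` are hypothetical names, and version sets are reduced to half-open integer ranges. The only thing it shows is that two disjoint constraints are caught up front when they share a key, and go unnoticed at this stage when a marker makes the keys differ.

```rust
use std::collections::HashMap;

/// Toy package identity: a name plus an optional marker string.
type PackageKey = (String, Option<String>);

/// Toy version set: the half-open range [min, max) over integer versions.
type VersionRange = (u32, u32);

/// Record a constraint under its package key, intersecting it with any
/// constraint already stored for the same key. Returns `false` when the
/// intersection is empty, i.e. the conflict is visible immediately.
fn add_constraint(
    seen: &mut HashMap<PackageKey, VersionRange>,
    key: PackageKey,
    range: VersionRange,
) -> bool {
    let merged = match seen.get(&key) {
        Some(&(lo, hi)) => (lo.max(range.0), hi.min(range.1)),
        None => range,
    };
    seen.insert(key, merged);
    merged.0 < merged.1
}
```

In this model, `foo < 1` followed by `foo >= 1` under the same key returns `false` right away. If the second constraint is keyed as `("foo", Some(marker))`, both calls return `true`, and the conflict can only surface later, once versions are requested.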
It is not clear whether this update is correct or not. Moreover, it's not clear whether the status quo is correct or not. The problem is that `transformers` is so big that it's very hard to understand what the right output is without a deeper investigation. One thing that is interesting is if fork prioritization is removed in this PR *and* on `main`, then the differences in this ecosystem test go away. We've decided for now to move forward with this update even though we're uncertain because this PR fixes a few outstanding correctness issues.
Unlike the previous update, this message is specifically referring to a fork's markers inside the resolver. We probably *could* massage the message to be simplified with respect to requires-python, but it's not obvious to me that that is the right thing to do.
A `Dependency` now has both "simplified" and "complexified" markers, so just update the snapshots to match the new reality.
The output no longer results in installing two different versions of astroid unconditionally on Python 3.10. Fixes #6269
The `tomli` dependency is now included for `python_version <= 3.11`, which is what is expected. Fixes #6412
The `importlib-metadata` dependency is no longer unconditionally repeated in the output for Python 3.10 (or Python 3.7). Fixes #6836
Previously we were using `[+-~]`, but this includes the full range of characters from `+` to `~`. Incidentally, this does include `-`. We instead rewrite this as `[-+~]`, which probably matches the intent.
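As a quick illustration of the difference, the two character classes can be modeled with plain `char` comparisons. The helper names are hypothetical and this sketch does not use the `regex` crate; it only mirrors what each class accepts.

```rust
/// Models the character class `[+-~]`: one range from '+' (U+002B)
/// through '~' (U+007E), which covers digits, letters, and more.
fn in_range_class(c: char) -> bool {
    ('+'..='~').contains(&c)
}

/// Models the character class `[-+~]`: exactly three literal characters.
fn in_literal_class(c: char) -> bool {
    matches!(c, '-' | '+' | '~')
}
```

Both accept `-`, `+`, and `~`, but only the range form also accepts characters such as `a` or `7`.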
The key change here is to use raw strings so that we don't need to double-escape things like `\d`. And in particular, we rely on the fact that `"\n"` and `r"\n"` are precisely equivalent when fed to `Regex::new` in the `regex` crate.
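A small sketch of the string-level difference, using only the standard library (the function names are hypothetical). For `\d`, the escaped and raw spellings are the same string. For `\n`, they are different strings: one is a literal newline, the other is a backslash followed by `n`, which the regex engine then interprets as a newline escape.

```rust
/// A digit pattern written with a doubled backslash in a normal string.
fn digit_pattern_escaped() -> &'static str {
    "\\d+"
}

/// The same pattern written as a raw string, with no double-escaping.
fn digit_pattern_raw() -> &'static str {
    r"\d+"
}

/// Byte lengths of "\n" (a single newline) and r"\n" (backslash, then 'n').
fn newline_lengths() -> (usize, usize) {
    ("\n".len(), r"\n".len())
}
```

So switching to raw strings leaves patterns like `\d` unchanged, and for `\n` it swaps a literal newline for an escape sequence that matches the same character.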
This is to account for slight differences on our Windows CI.
And otherwise make the regexes a little more robust.
I split this change into its own commit because I'm hoping it crystalizes what it means when we say "a `MarkerTree` has hidden state." That is, it isn't so much that there is some explicit member of a `MarkerTree` that is omitted, but rather, the lower and upper version bounds on `python_full_version` are rewritten as "unbounded" when traversing the ADD for display. We will actually retain this functionality, but rejigger it so that it's explicit when we do this. In particular, this simplification has been problematic for us because it fundamentally changes the truth tables of a marker expression *unless* you are extremely careful to interpret it only under the original context in which it was simplified. This is quite difficult to do generally, and in prior work in #6268, we completed a refactor where we worked around this type of simplification and moved it to the edges of uv. In subsequent commits, we'll re-implement this form of simplification as a more explicit step.
This commit refactors how we deal with `requires-python` so that instead of simplifying markers of dependencies inside the resolver, we do it at the edges of our system. When writing markers to output, we simplify when there's an obvious `requires-python` context. And when reading markers as input, we complexify markers with the relevant `requires-python` constraint.
There are some lingering problems though:
- force folks into dealing with marker serialization correctly. It's a very subtle problem and I think we should invest in that, but I think it would be better in a follow-up PR since this one is already pretty big.
- still others require investigation to figure out whether they're correct or not.
- `MarkerTree`'s hidden state by implementing requires-python marker simplification in a cursed way: run `restrict_versions`, serialize the marker to string form and then re-parse it. This should ensure any hidden state in the `MarkerTree` is removed. Of course, this isn't what we should do. I think we should just remove the hidden state entirely, but I took a shortcut to prove out the idea.
Fixes #6269, Fixes #6412, Fixes #6836