`cargo tree` doesn't handle transitive duplications properly #9599

ehuss · 2021-06-18T18:32:39Z

Problem
There are some situations where cargo tree will mark a package as a duplicate (*) when it probably shouldn't (and shows the wrong features with --no-dedupe). This arises when a package is built twice with the same features, but with different dependencies.

A real-world example is when looking at the following:

[package]
name = "foo"
version = "0.1.0"
resolver = "2"

[dependencies]
diesel = { version = "=1.4.7", features = ["postgres"] }
diesel_migrations = "=1.4.0"

Running cargo tree -f '{p} {f}' results in:

~/Proj/rust/cargo/scratch/diesel-issue> cargo tree -f '{p} {f}'
diesel-issue v0.1.0 (/Users/eric/Proj/rust/cargo/scratch/diesel-issue)
├── diesel v1.4.7 32-column-tables,bitflags,default,postgres,pq-sys,with-deprecated
│   ├── bitflags v1.2.1 default
│   ├── byteorder v1.4.3 default,std
│   ├── diesel_derives v1.4.1 (proc-macro) default,postgres
│   │   ├── proc-macro2 v1.0.27 default,proc-macro
│   │   │   └── unicode-xid v0.2.2 default
│   │   ├── quote v1.0.9 default,proc-macro
│   │   │   └── proc-macro2 v1.0.27 default,proc-macro (*)
│   │   └── syn v1.0.73 clone-impls,default,derive,extra-traits,fold,full,parsing,printing,proc-macro,quote
│   │       ├── proc-macro2 v1.0.27 default,proc-macro (*)
│   │       ├── quote v1.0.9 default,proc-macro (*)
│   │       └── unicode-xid v0.2.2 default
│   └── pq-sys v0.4.6
└── diesel_migrations v1.4.0 default
    ├── migrations_internals v1.4.1 default
    │   └── diesel v1.4.7 32-column-tables,bitflags,default,postgres,pq-sys,with-deprecated (*)
    └── migrations_macros v1.4.2 (proc-macro) default
        ├── migrations_internals v1.4.1 default (*)    <-- PROBLEM IS HERE
        ├── proc-macro2 v1.0.27 default,proc-macro (*)
        ├── quote v1.0.9 default,proc-macro (*)
        └── syn v1.0.73 clone-impls,default,derive,extra-traits,fold,full,parsing,printing,proc-macro,quote (*)

The problem is the diesel-issue → diesel_migrations → migrations_macros → migrations_internals. It has a (*) to indicate that it is duplicated. However, migrations_internals is built twice, and this is deceiving because the second migrations_internals has a dependency on diesel with no features.

Running with --no-dedupe is even worse, because it shows the wrong features for diesel under migrations_internals.

Steps
The following is an example in Cargo's testsuite format:

#[cargo_test]
fn dedupe_transitive_features2() {
    Package::new("differs", "1.0.0")
        .feature("some-feat", &[])
        .publish();
    Package::new("shared", "1.0.0")
        .dep("differs", "1.0")
        .publish();

    let p = project()
        .file(
            "Cargo.toml",
            r#"
                [package]
                name = "foo"
                version = "0.1.0"
                resolver = "2"

                [dependencies]
                shared = "1.0"
                differs = {version="1.0", features=["some-feat"]}
                bar = {path="bar"}

                [build-dependencies]
            "#,
        )
        .file("src/lib.rs", "")
        .file(
            "bar/Cargo.toml",
            r#"
                [package]
                name = "bar"
                version = "0.1.0"

                [build-dependencies]
                shared = "1.0"
            "#,
        )
        .file("bar/src/lib.rs", "")
        .build();

    // ISSUE HERE: The last `shared` shouldn't be `(*)`
    p.cargo("tree -f")
        .arg("{p} [{f}]")
        .with_stdout(
            "\
foo v0.1.0 [..]
├── bar v0.1.0 [..]
│   [build-dependencies]
│   └── shared v1.0.0 []
│       └── differs v1.0.0 []
├── differs v1.0.0 [some-feat]
└── shared v1.0.0 [] (*)
",
        )
        .run();
}

    // ISSUE HERE: The last `differs` shows with no features!
    p.cargo("tree --no-dedupe -f")
        .arg("{p} [{f}]")
        .with_stdout(
            "\
foo v0.1.0 [..]
├── bar v0.1.0 [..]
│   [build-dependencies]
│   └── shared v1.0.0 []
│       └── differs v1.0.0 []
├── differs v1.0.0 [some-feat]
└── shared v1.0.0 []
    └── differs v1.0.0 [some-feat]
",
        )
        .run();

Possible Solution(s)
The issue is that cargo tree doesn't have the same logic that was added in #8701 to accommodate dependencies that are the same, but link to different dependencies.

I have toyed with the idea of changing cargo tree to use the UnitGraph computed for a normal build instead of trying to recreate how some of these things are computed. There are some complexities and downsides to that approach, but I continue to feel that trying to replicate this logic in multiple places is not a good idea.

Notes

Output of cargo version: cargo 1.54.0-nightly (4445667 2021-06-12)

The text was updated successfully, but these errors were encountered:

weihanglo · 2021-06-26T09:03:19Z

I also feel not right to replicate the logic. Love to know the downsides about reuse UnitGraph. Could you explain more?

ehuss · 2021-06-28T16:34:40Z

The main problem is that UnitGraph is based on compilation units, whereas cargo tree wants the dependencies between packages. There are dependencies in UnitGraph that need to be filtered out.

For example, all the intra-package dependencies would need to be removed. Also, UnitGraph has multiple roots for a package (for example, if the package has multiple bins). Another example is build scripts, where the "run" unit depends on all transitive build scripts being run. Those are not real dependencies to be displayed. And conversely there is information missing from UnitGraph (like which dependencies are build-dependencies or dev-dependencies). That information is lost since the Resolve is discarded after the graph is computed.

Another problem is that the Graph built for cargo tree has a fundamentally different structure, and changes based on command-line options. All that code for build the graph will probably need to stay to handle all of that.

A concern is that attempting to share the code will end up needing more code to handle the impedance mismatch, and will be harder to maintain.

So, I think this is a fairly hard problem, mostly trying to balance the maintainability of the code. I don't really know what the right solution is.

alsuren · 2022-11-12T22:34:07Z

Hello. I would like to pick this task up. I will timebox 1 week for this, before I start my new job. If I haven't made any progress by then, I will park it and let someone else have a go.

(for context: I am currently using a copy-pasta of cargo::ops::tree::graph::Graph in my cargo-quickbuild prototype, and its proc-macro handling is causing me to calculate the wrong sets of features for some transitive dependencies. This causes fingerprint mismatches, so cargo build then recompiles large swathes of the crate graph and ruins all my fun)

What should I be aiming for here? One possible approach might be:

1) a function which converts from a UnitGraph to a cargo::ops::tree::graph::Graph
2) a function which converts back again for use in tests (I realise that some information is lost in each conversion process, so some extra context is needed in each direction).
3) a fuzzer/proptest that drives the above functions in a round-trip, then checks for equality.
4) replace cargo::ops::tree::build() with the one that goes via UnitGraph, and make sure its tests still pass.

This is quite an invasive approach, and probably isn't feasible in 1 week. I know I could just try to fix my immediate problem as a one-off, but it feels like it will be a game of whack-a-mole (see also #10651). Having a fuzzer would let me feel much more confident in the correctness of any fix I write).

I wonder if we could write the UnitGraph -> Graph conversion function from (1) (but not its inverse), and then write a fuzzer that sets up a random cargo workspace, and then checks that cargo::ops::tree::build() (with a very specific set of command-line options) matches the output from the UnitGraph -> Graph conversion function.

I haven't quite put my finger on what a random workspace generator would look like, or whether it would be able to cover all of the cases that we care about.

Does anyone have any thoughts, or should I just have a crack at it and see how far I get?

weihanglo · 2022-11-15T11:30:19Z

Looks like a big plan! Previously I was asking using unit graph instead, but the reply from ehuss made me wonder it may be really challenging.

I would recommend experimenting on what you list in the first step, and then assert against the current output first. Fuzzer/protest can be added after we're all happy with the change.

I cannot guarantee things will get merged, though it is still great to explore any possibility to reduce code complexity and discrepancy between cargo-tree and other build commands :)

alsuren · 2022-11-15T12:04:36Z

Thanks for your feedback. I will be working in the open on #11379, if you want to follow along (it's not very interesting at the moment though).

I would recommend experimenting on what you list in the first step, and then assert against the current output first. Fuzzer/protest can be added after we're all happy with the change.

Yeah, it looks like there are already quite a lot of cargo tree integration tests, so I should definitely get those passing before I think about fuzz tests.

I cannot guarantee things will get merged, though it is still great to explore any possibility to reduce code complexity and discrepancy between cargo-tree and other build commands :)

That's fine - I need to build something like this for cargo quickbuild anyway, so it won't be wasted effort.

davemilter · 2024-06-07T15:13:16Z

I wonder is this why I got such strange output from cargo tree -d --format "{p} {f}" for such Cargo.toml:

[dependencies]
async-compression = "=0.3.15"
nmea = "=0.2.0"

[build-dependencies]
bindgen = { version = "0.64.0", default-features = false, features = ["runtime"] }

memchr v2.7.2 alloc,default,std
└── async-compression v0.3.15 default
    └── foo v0.1.0 (/private/tmp/foo) 

memchr v2.7.2 alloc,std
└── nom v7.1.3 alloc,std
    ├── cexpr v0.6.0 
    │   └── bindgen v0.64.0 runtime
    │       [build-dependencies]
    │       └── foo v0.1.0 (/private/tmp/foo) 
    └── nmea v0.2.0 default,std
        └── foo v0.1.0 (/private/tmp/foo)

according to cargo tree each runtime dependency async-compression and nmea
will use memchr with different feature sets.

weihanglo · 2024-06-09T17:58:37Z

cargo/src/cargo/ops/tree/graph.rs

Lines 217 to 219 in 580dbc6

    
           /// Returns a list of nodes that are considered "duplicates" (same package 
        
           /// name, with different versions/features/source/etc.). 
        
           pub fn find_duplicates(&self) -> Vec<usize> {

This is how it caculates duplicates currently, so by the implementation a dependency with a different feature set is considered a duplicate.

ehuss added C-bug Category: bug Command-tree labels Jun 18, 2021

lopopolo mentioned this issue Mar 31, 2022

cargo tree -d reports duplicates when the same dependency is included in dependencies and build-dependencies #10528

Closed

alsuren mentioned this issue Nov 12, 2022

Fix builds involving proc-macro crates (brute-force the thing?) cargo-quick/cargo-quick#36

Open

11 tasks

alsuren mentioned this issue Nov 15, 2022

WIP: try making cargo tree's Graph from a UnitGraph #11379

Closed

35 tasks

weihanglo mentioned this issue Mar 2, 2023

Improve contributor experience by labeling complexity, ambiguity, and consensus of an issue #11788

Closed

weihanglo added S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. E-medium Experience: Medium labels Jun 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`cargo tree` doesn't handle transitive duplications properly #9599

`cargo tree` doesn't handle transitive duplications properly #9599

ehuss commented Jun 18, 2021

weihanglo commented Jun 26, 2021

ehuss commented Jun 28, 2021 •

edited

Loading

alsuren commented Nov 12, 2022

weihanglo commented Nov 15, 2022

alsuren commented Nov 15, 2022

davemilter commented Jun 7, 2024

weihanglo commented Jun 9, 2024

cargo tree doesn't handle transitive duplications properly #9599

cargo tree doesn't handle transitive duplications properly #9599

Comments

ehuss commented Jun 18, 2021

weihanglo commented Jun 26, 2021

ehuss commented Jun 28, 2021 • edited Loading

alsuren commented Nov 12, 2022

weihanglo commented Nov 15, 2022

alsuren commented Nov 15, 2022

davemilter commented Jun 7, 2024

weihanglo commented Jun 9, 2024

`cargo tree` doesn't handle transitive duplications properly #9599

`cargo tree` doesn't handle transitive duplications properly #9599

ehuss commented Jun 28, 2021 •

edited

Loading