Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non determinist vendoring of git dependencies #13988

Closed
stormshield-guillaumed opened this issue May 31, 2024 · 10 comments · Fixed by #13989 or #14004
Closed

Non determinist vendoring of git dependencies #13988

stormshield-guillaumed opened this issue May 31, 2024 · 10 comments · Fixed by #13989 or #14004
Labels
C-bug Category: bug Command-vendor S-triage Status: This issue is waiting on initial triage.

Comments

@stormshield-guillaumed
Copy link
Contributor

stormshield-guillaumed commented May 31, 2024

Problem

The output of cargo vendor is not determinist for git dependencies. Depending on the machine where the command is executed, the Cargo.toml of the vendored dependencies can be different. All tests were done on various versions of Ubuntu but with the same version of cargo. The difference is only about order of arrays of tables in the Cargo.toml.

I made a minimal reproduction example in this repo for which I ran the vendor command on my computer. You can see in this CI run that running cargo vendor results in a diff with master. Also, the diff is not the same between the 3 jobs, so it doesn't seems like a mistake on my part.

Steps

No response

Possible Solution(s)

I'm not very familiar with the cargo codebase, but after digging a bit, it seems to only happen with git dependencies because they are treated differently. Those dependencies are not published on crates.io so their Cargo.toml is not normalised. Due to this, the Cargo.toml is normalised before the copy to the vendor directory. This normalisation yields different results depending on the machine.

Notes

No response

Version

cargo 1.80.0-nightly (431db31d0 2024-05-28)
release: 1.80.0-nightly
commit-hash: 431db31d0dbeda320caf8ef8535ea48eb3093407
commit-date: 2024-05-28
host: x86_64-unknown-linux-gnu
libgit2: 1.7.2 (sys:0.18.3 vendored)
libcurl: 8.6.0-DEV (sys:0.4.72+curl-8.6.0 vendored ssl:OpenSSL/1.1.1w)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Ubuntu 22.4.0 (jammy) [64-bit]
@stormshield-guillaumed stormshield-guillaumed added C-bug Category: bug S-triage Status: This issue is waiting on initial triage. labels May 31, 2024
@epage
Copy link
Contributor

epage commented May 31, 2024

From the CI log:

 diff --git a/vendor/axum-server/Cargo.toml b/vendor/axum-server/Cargo.toml
index 21eb1f1..fe22018 100644
--- a/vendor/axum-server/Cargo.toml
+++ b/vendor/axum-server/Cargo.toml
@@ -89,28 +89,28 @@ doc-scrape-examples = true
 required-features = ["tls-rustls"]
 
 [[example]]
-name = "shutdown"
-path = "examples/shutdown.rs"
+name = "graceful_shutdown"
+path = "examples/graceful_shutdown.rs"
 
 [[example]]
 name = "remote_address"
 path = "examples/remote_address.rs"
 
 [[example]]
-name = "from_std_listener"
-path = "examples/from_std_listener.rs"
+name = "multiple_addresses"
+path = "examples/multiple_addresses.rs"
 
 [[example]]
-name = "graceful_shutdown"
-path = "examples/graceful_shutdown.rs"
+name = "shutdown"
+path = "examples/shutdown.rs"
 
 [[example]]
-name = "hello_world"
-path = "examples/hello_world.rs"
+name = "from_std_listener"
+path = "examples/from_std_listener.rs"
 
 [[example]]
-name = "multiple_addresses"
-path = "examples/multiple_addresses.rs"
+name = "hello_world"
+path = "examples/hello_world.rs"
 
 [[example]]
 name = "remote_address_using_tower"

In #13713, we resolve all implicit targets. Those aren't being done in a deterministic order, causing each run to be different.

epage added a commit to epage/cargo that referenced this issue May 31, 2024
With rust-lang#13713, we enumerate all targets in `Cargo.toml` on `cargo publish`
and `cargo vendor`.
However, the order of the targets is non-determistic.
This can be annoying for comparing the published `Cargo.toml` across releases but even
worse is the churn it causes for `cargo vendor`.

So we sort all the targets during publish.
This keeps costs minimal with the following risks
- If the non-determinism shows up in a way that affects developers
  during development
- If there is a reason the user wants to control target order for
  explicit targets

Fixes rust-lang#13988
bors added a commit that referenced this issue May 31, 2024
fix(toml): Ensure targets are in a deterministic order

### What does this PR try to resolve?

With #13713, we enumerate all targets in `Cargo.toml` on `cargo publish` and `cargo vendor`.
However, the order of the targets is non-determistic. This can be annoying for comparing the published `Cargo.toml` across releases but even worse is the churn it causes for `cargo vendor`.

So we sort all the targets during publish.
This keeps costs minimal with the following risks
- If the non-determinism shows up in a way that affects developers during development
- If there is a reason the user wants to control target order for explicit targets

Fixes #13988

### How should we test and review this PR?

As for writing of tests, I'm unsure why none of our existing tests failed which makes it unclear to me what would be needed to write a test for this.

### Additional information
@bors bors closed this as completed in 40b9fec May 31, 2024
@stormshield-guillaumed
Copy link
Contributor Author

Sorry to revive this issue, but I updated the toolchain of the reproduction example and the problem is still there.

What I did to test

  1. Update rust-toolchain.toml file to use newer nightly containing your commit (checked with cargo version --verbose)
  2. Clean my local cargo cache of downloaded dependencies
  3. Delete .cargo/config.toml and the vendor directory
  4. Ran cargo vendor > .cargo/config.toml

On my computer, the vendor directory was the same as the remote repository, so no changes to commit. Also, the diff is still different between the 3 jobs.

@epage
Copy link
Contributor

epage commented Jun 3, 2024

What nightly did you try this on? We need to verify that it had the fix.

Also, did you update the vendored packages with that nightly before checking for consistency?

@epage
Copy link
Contributor

epage commented Jun 3, 2024

Also, can you include a copy of a failed?

@stormshield-guillaumed
Copy link
Contributor Author

stormshield-guillaumed commented Jun 3, 2024

cargo version --verbose on my computer :

cargo 1.80.0-nightly (7a6fad098 2024-05-31)
release: 1.80.0-nightly
commit-hash: 7a6fad0984d28c8330974636972aa296b67c4513
commit-date: 2024-05-31
host: x86_64-unknown-linux-gnu
libgit2: 1.7.2 (sys:0.18.3 vendored)
libcurl: 8.6.0-DEV (sys:0.4.72+curl-8.6.0 vendored ssl:OpenSSL/1.1.1w)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Ubuntu 22.4.0 (jammy) [64-bit]

cargo version --verbose on the CI :

cargo 1.80.0-nightly (7a6fad098 2024-05-31)
release: 1.80.0-nightly
commit-hash: 7a6fad0984d28c8330974636972aa296b67c4513
commit-date: 2024-05-31
host: x86_64-unknown-linux-gnu
libgit2: 1.7.2 (sys:0.18.3 vendored)
libcurl: 8.6.0-DEV (sys:0.4.72+curl-8.6.0 vendored ssl:OpenSSL/1.1.1w)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Ubuntu 24.4.0 (noble) [64-bit]

Diff on the 24.04 runner :

diff --git a/vendor/axum-server/Cargo.toml b/vendor/axum-server/Cargo.toml
index 21eb1f1..fe22018 100644
--- a/vendor/axum-server/Cargo.toml
+++ b/vendor/axum-server/Cargo.toml
@@ -89,28 +89,28 @@ doc-scrape-examples = true
 required-features = ["tls-rustls"]
 
 [[example]]
-name = "shutdown"
-path = "examples/shutdown.rs"
+name = "graceful_shutdown"
+path = "examples/graceful_shutdown.rs"
 
 [[example]]
 name = "remote_address"
 path = "examples/remote_address.rs"
 
 [[example]]
-name = "from_std_listener"
-path = "examples/from_std_listener.rs"
+name = "multiple_addresses"
+path = "examples/multiple_addresses.rs"
 
 [[example]]
-name = "graceful_shutdown"
-path = "examples/graceful_shutdown.rs"
+name = "shutdown"
+path = "examples/shutdown.rs"
 
 [[example]]
-name = "hello_world"
-path = "examples/hello_world.rs"
+name = "from_std_listener"
+path = "examples/from_std_listener.rs"
 
 [[example]]
-name = "multiple_addresses"
-path = "examples/multiple_addresses.rs"
+name = "hello_world"
+path = "examples/hello_world.rs"
 
 [[example]]
 name = "remote_address_using_tower"

Diff on the 22.04 runner :

diff --git a/vendor/axum-server/Cargo.toml b/vendor/axum-server/Cargo.toml
index 21eb1f1..2255a52 100644
--- a/vendor/axum-server/Cargo.toml
+++ b/vendor/axum-server/Cargo.toml
@@ -89,13 +89,21 @@ doc-scrape-examples = true
 required-features = ["tls-rustls"]
 
 [[example]]
-name = "shutdown"
-path = "examples/shutdown.rs"
+name = "multiple_addresses"
+path = "examples/multiple_addresses.rs"
+
+[[example]]
+name = "hello_world"
+path = "examples/hello_world.rs"
 
 [[example]]
 name = "remote_address"
 path = "examples/remote_address.rs"
 
+[[example]]
+name = "shutdown"
+path = "examples/shutdown.rs"
+
 [[example]]
 name = "from_std_listener"
 path = "examples/from_std_listener.rs"
@@ -104,14 +112,6 @@ path = "examples/from_std_listener.rs"
 name = "graceful_shutdown"
 path = "examples/graceful_shutdown.rs"
 
-[[example]]
-name = "hello_world"
-path = "examples/hello_world.rs"
-
-[[example]]
-name = "multiple_addresses"
-path = "examples/multiple_addresses.rs"
-
 [[example]]
 name = "remote_address_using_tower"
 path = "examples/remote_address_using_tower.rs"

@stormshield-guillaumed
Copy link
Contributor Author

In the example repo, i modified the rust-toolchain.toml file to use nightly-2024-06-03, and then i ran those commands :

rm -fr vendor .cargo/config.toml
cargo cache -r all # To remove all downloaded dependencies in my cargo cache
cargo vendor > .cargo/config.toml
git add .
git commit -m "..."
git push

@epage
Copy link
Contributor

epage commented Jun 3, 2024

So those diffs are from 2024-05-31. Did the problem persist with 2024-06-03?

@stormshield-guillaumed
Copy link
Contributor Author

Well, I don't know why but :

$ cargo +nightly-2024-06-03 version --verbose
cargo 1.80.0-nightly (7a6fad098 2024-05-31)
release: 1.80.0-nightly
commit-hash: 7a6fad0984d28c8330974636972aa296b67c4513
commit-date: 2024-05-31
host: x86_64-unknown-linux-gnu
libgit2: 1.7.2 (sys:0.18.3 vendored)
libcurl: 8.6.0-DEV (sys:0.4.72+curl-8.6.0 vendored ssl:OpenSSL/1.1.1w)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Ubuntu 22.4.0 (jammy) [64-bit]

The commit 7a6fad0 of the used cargo is newer than your commit 40b9fec so your fix should be present.

@stormshield-guillaumed
Copy link
Contributor Author

The prepare_toml_for_publish function doesn't seem to be called when vendoring git dependencies, but maybe I'm missing something. After a bit more digging, the non determinism when vendoring git dependencies could come from either one of these :

  • The download could fetch the file differently depending on the machine but this would be weird
  • The deserialization, serialization or resolution of the Cargo.toml which is done locally when vendoring those kind of dependencies

@epage epage reopened this Jun 3, 2024
@epage
Copy link
Contributor

epage commented Jun 3, 2024

Looks like I sorted too late in the process. I had assumed we had tests and they happened to be sorted but it doesn't look like we have them. I have a follow up I'm about to post.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug Command-vendor S-triage Status: This issue is waiting on initial triage.
Projects
None yet
2 participants