Make source tarball generation more reproducible #123246

Merged
bors merged 5 commits into rust-lang:master from tarball-reproducible
Mar 31, 2024

Conversation

Kobzol
Contributor

@Kobzol Kobzol commented Mar 30, 2024

This PR makes several changes to source tarball generation (x dist rustc-src) to make it more reproducible (in light of the recent "xz backdoor"...). I want to follow up by adding a separate CI workflow for generating the tarball.

After this PR, running this locally produces identical checksums:

$ ./x dist rustc-src
$ sha256sum build/dist/rustc-1.79.0-src.tar.gz

$ ./x dist rustc-src
$ sha256sum build/dist/rustc-1.79.0-src.tar.gz

Here is a guide on how to reproduce the published archives locally.
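
The local check above can be scripted instead of comparing the two checksums by eye. This is only a minimal sketch: it assumes GNU coreutils `sha256sum`, and `same_sha256` is a hypothetical helper name, not something bootstrap provides.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of bootstrap): succeeds only when both
# files are readable and have the same SHA-256 digest.
same_sha256() {
    local a b
    # Bail out early so a missing file cannot compare as "equal".
    [ -r "$1" ] && [ -r "$2" ] || return 2
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# Usage, from a checkout of rust (paths taken from the commands above):
#   ./x dist rustc-src && cp build/dist/rustc-1.79.0-src.tar.gz /tmp/first.tar.gz
#   ./x dist rustc-src && same_sha256 /tmp/first.tar.gz build/dist/rustc-1.79.0-src.tar.gz
```

The helper returns 0 for identical digests, 1 for different digests, and 2 when either file cannot be read.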

r? @Mark-Simulacrum

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Mar 30, 2024
@Mark-Simulacrum
Member

@bors try

Probably makes sense to check against a try-built tarball (I think we publish those) before merging, just in case there are other obvious problems we can fix. Otherwise, though, r=me.

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 30, 2024
Make source tarball generation more reproducible

@bors
Contributor

bors commented Mar 30, 2024

⌛ Trying commit 45c3662 with merge b429057...

@Kobzol
Contributor Author

Kobzol commented Mar 30, 2024

@bors try

@bors
Contributor

bors commented Mar 30, 2024

⌛ Trying commit f57139c with merge bec0625...

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 30, 2024
Make source tarball generation more reproducible

@bors
Contributor

bors commented Mar 31, 2024

☀️ Try build successful - checks-actions
Build commit: bec0625 (bec0625059e9a7ebcc1b2da983a560e1fc25535f)


@rustbot rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Mar 31, 2024
@Kobzol
Contributor Author

Kobzol commented Mar 31, 2024

@bors try

@bors
Contributor

bors commented Mar 31, 2024

⌛ Trying commit a74d4c9 with merge 07b3deb...

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 31, 2024
Make source tarball generation more reproducible

@bors
Contributor

bors commented Mar 31, 2024

☀️ Try build successful - checks-actions
Build commit: 07b3deb (07b3deb1d6cdad26997767f32feadc1140b781fa)

@Kobzol
Contributor Author

Kobzol commented Mar 31, 2024

It's still not trivial to reproduce the archive from CI because of two things:

  1. Local junk present in src and other directories (we should tell people to use a clean checkout of rust to perform the check)
  2. Different config.toml used on CI vs locally (I plan to tackle this in another PR)

But at least locally the archive should now be reproducible, which is a good start.
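
For point 1, git itself can show whether a checkout contains anything beyond tracked, unmodified files. A rough sketch, assuming the check runs against a git checkout; `list_local_changes` is a hypothetical name, and which of the reported paths would actually end up in the archive depends on bootstrap, so treat the output as a hint only:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print modified, untracked ("??") and ignored ("!!")
# paths of the checkout at $1. Empty output means the working tree holds
# only tracked, unmodified files.
list_local_changes() {
    git -C "$1" status --porcelain --ignored
}

# Usage:
#   list_local_changes /path/to/rust
```

Note that after a first `./x dist rustc-src` run the build output directory will typically show up as ignored, so an empty listing is only expected on a fresh clone.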

@bors r=Mark-Simulacrum

@bors
Contributor

bors commented Mar 31, 2024

📌 Commit 877e8d4 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 31, 2024
@bors
Contributor

bors commented Mar 31, 2024

⌛ Testing commit 877e8d4 with merge a8cfc83...

@bors
Contributor

bors commented Mar 31, 2024

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing a8cfc83 to master...


@bors bors added merged-by-bors This PR was explicitly merged by bors. labels Mar 31, 2024
@bors bors merged commit a8cfc83 into rust-lang:master Mar 31, 2024
12 checks passed
@rustbot rustbot added this to the 1.79.0 milestone Mar 31, 2024
@Kobzol Kobzol deleted the tarball-reproducible branch March 31, 2024 14:36
@rust-timer
Collaborator

Finished benchmarking commit (a8cfc83): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.8% | [2.8%, 2.8%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 2.8% | [2.8%, 2.8%] | 1 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 668.134s -> 669.417s (0.19%)
Artifact size: 315.67 MiB -> 315.68 MiB (0.00%)

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jun 29, 2024
…-ozkan

Make mtime of reproducible tarballs dependent on git commit

Since rust-lang#123246, our tarballs should be fully reproducible. That means that the mtime of all files and directories in the tarballs is set to the date of the first Rust commit (from 2006). However, this is causing some mtime invalidation issues (rust-lang#125578 (comment)).

Ideally, we would like to keep the mtime reproducible, but still update it with new versions of Rust. That's what this PR does. It modifies the tarball installer bootstrap invocation so that if the current rustc directory is managed by git, we will set the UTC timestamp of the latest commit as the mtime for all files in the archive. This means that the archive should still be fully reproducible from a given commit SHA, but it will also change with new beta bumps and `download-rustc` versions.

Note that only files are set to this mtime; directories are still set to the year 2006, because the `tar` library used by `rust-installer` doesn't allow us to selectively override mtime for directories (or at least I haven't found a way to do it). We could work around that by doing all the mtime modifications in bootstrap, but that would require more changes. I think/hope that just modifying the file mtimes should be enough. It should at least fix cargo `rustc` mtime invalidation.
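
The idea of tying the mtime to a commit can be illustrated outside of bootstrap. The sketch below is not how `rust-installer` does it (that goes through the Rust `tar` library); it assumes GNU tar with `--sort` support (not available in very old releases or in BSD tar) and gzip, and `pack_with_mtime` is a hypothetical name. Unlike the PR, this variant also applies the timestamp to directories.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pack directory $1 into gzip-compressed tarball $2,
# forcing the mtime of every entry to the Unix timestamp $3.
# The body runs in a subshell so that pipefail does not leak to the caller.
pack_with_mtime() (
    set -o pipefail
    src=$1 out=$2 mtime=$3
    # --sort=name      fixed entry order, independent of the filesystem
    # --mtime=@N       same timestamp for all entries
    # --owner/--group  do not record the local user
    # --format=gnu     avoid pax headers that may carry atime/ctime
    # gzip -n          do not store a name or timestamp in the gzip header
    tar --format=gnu --sort=name --mtime="@${mtime}" \
        --owner=0 --group=0 --numeric-owner \
        -C "$src" -cf - . | gzip -n > "$out"
)

# Usage inside a git checkout, using the committer timestamp of HEAD:
#   pack_with_mtime src /tmp/src.tar.gz "$(git log -1 --format=%ct)"
```

With this setup the archive bytes depend on the file contents and the chosen commit timestamp, not on when the files were last touched locally.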

Fixes: rust-lang#125578

r? `@onur-ozkan`
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 29, 2024
Make mtime of reproducible tarballs dependent on git commit


try-job: x86_64-gnu-distcheck
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 2, 2024
…zkan

Make mtime of reproducible tarballs dependent on git commit


try-job: x86_64-gnu-distcheck
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jul 2, 2024
…-ozkan

Make mtime of reproducible tarballs dependent on git commit


try-job: x86_64-gnu-distcheck
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jul 3, 2024
…-ozkan

Make mtime of reproducible tarballs dependent on git commit


try-job: x86_64-gnu-distcheck
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jul 3, 2024
Rollup merge of rust-lang#127050 - Kobzol:reproducibility-git, r=onur-ozkan

Make mtime of reproducible tarballs dependent on git commit


try-job: x86_64-gnu-distcheck
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.

6 participants