Make order of files in repaired wheel deterministic #507
In order to make the output zip file reproducible (independent of the underlying filesystem's directory traversal order), sort each list of subdirectories and each list of files before adding them to the zip file. (Note that we want to sort the dirs list in place, causing os.walk to traverse the subdirectories in order.)
In order to make the output zip file reproducible (independent of the underlying filesystem's directory traversal order), sort each list of subdirectories and each list of files while we are generating the RECORD file. (Note that we want to sort the dirs list in place, causing os.walk to traverse the subdirectories in order.)
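The in-place sort mentioned in the note can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper name iter_files_sorted is hypothetical. It relies only on the documented behavior of os.walk in its default top-down mode, where mutating the dirs list in place controls which subdirectories are visited and in what order.

```python
import os


def iter_files_sorted(root):
    """Yield file paths under `root` in a filesystem-independent order.

    Hypothetical helper for illustration only.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Sort in place: os.walk (top-down) consults this same list object
        # to decide which subdirectories to descend into next.
        # Rebinding (dirnames = sorted(dirnames)) would have no effect.
        dirnames.sort()
        # The file list is not reused by os.walk, so a sorted copy is enough.
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```

Any consumer of this generator, whether it writes zip entries or RECORD lines, then sees the same sequence regardless of the order in which the filesystem returns directory entries.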
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #507      +/-   ##
==========================================
+ Coverage   92.25%   92.28%   +0.02%
==========================================
  Files          20       20
  Lines        1266     1270       +4
  Branches      305      305
==========================================
+ Hits         1168     1172       +4
  Misses         56       56
  Partials       42       42
Thanks for the PR
According to https://peps.python.org/pep-0427/#recommended-archiver-features, it is recommended to place the .dist-info folder at the end of the archive.
If we're ensuring the order for build reproducibility, can this be taken into account, please?
If the wheel metadata files are physically located at the end of the zip file, this allows other tools to modify the metadata without rewriting the entire archive.
Sure, that makes sense and is easy to do.
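One way to combine a deterministic order with the PEP 427 recommendation is to sort on a two-part key, so that everything under the .dist-info directory sorts after all other entries. The sketch below is an assumption about how this could be done, not the code that was merged; order_for_archive is a hypothetical name, and the input is assumed to be a list of archive-relative paths.

```python
import os


def order_for_archive(relpaths):
    """Return archive-relative paths sorted, with *.dist-info entries last.

    Hypothetical helper for illustration only.
    """
    def key(path):
        # Normalize separators so the order does not depend on the platform.
        norm = path.replace(os.sep, "/")
        top_level = norm.split("/", 1)[0]
        # False sorts before True, so non-metadata files come first;
        # within each group, entries are ordered by their normalized path.
        return (top_level.endswith(".dist-info"), norm)

    return sorted(relpaths, key=key)
```

For example, given pkg-1.0.dist-info/RECORD, pkg/__init__.py and pkg-1.0.dist-info/METADATA, the result lists pkg/__init__.py first, followed by the two metadata files in alphabetical order.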
Currently, when running auditwheel repair, the contents of the output whl file are unpredictable in two ways: the order of files within the zip archive, and the order of entries in the RECORD file. In both cases, the order depends on the order of entries returned by os.walk.
This is a problem for build reproducibility: provided that the build process is sufficiently well defined, different people should be able to run the same process on different machines and get identical outputs.
Note that when setuptools or wheel generates a whl file, it does something similar (see WheelFile.write_files in wheel.wheelfile). The code here won't do quite the same as what setuptools does, but that shouldn't be a problem.