Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check in busybox to avoid historically flakey download in CI and simplify process #51529

Closed
wants to merge 3 commits into from

Conversation

@LilithHafner LilithHafner added the ci Continuous integration label Sep 30, 2023
@LilithHafner
Copy link
Member Author

(this is totally untested because I don't have a windows machine so I'm using CI to debug)

@LilithHafner LilithHafner marked this pull request as draft September 30, 2023 16:16
@LilithHafner
Copy link
Member Author

I have not reviewed the Binary file I'm checking in. However, I do attest that at the time of this PR, the binary I'm checking in is the same as a file I downloaded from https://frippery.org/files/busybox/busybox.exe and the same as a file I downloaded from https://web.archive.org/web/20230828161044/https://frippery.org/files/busybox/busybox.exe all of which have a sha512 checksum of a5b7c95f1aa9caeff134557f18070ec58c352f78ae09e3b9cc379547ff9278b207395ce62d9b73ab194f4d865bbd7f6875eb35336bfb9989b0594f493a443cbf
Hopefully we trust frippery.org, because we've been automatically downloading and executing binaries from them during CI for a while now.

@LilithHafner LilithHafner marked this pull request as ready for review September 30, 2023 20:34
@IanButterworth
Copy link
Sponsor Member

See #51531

Copy link
Member

@DilumAluthge DilumAluthge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should check binary .exe files into this repo.

If we really need to, here's what we can do instead:

  1. Create a new GitHub repo that just contains a README.
  2. Upload the BusyBox files as GitHub Releases on that repo.

@DilumAluthge DilumAluthge added the DO NOT MERGE Do not merge this PR! label Sep 30, 2023
@DilumAluthge DilumAluthge marked this pull request as draft September 30, 2023 21:03
@LilithHafner
Copy link
Member Author

@DilumAluthge, why not? It seems like a fixed binary is at least as secure and auditable as a bit of code that downloads and runs a binary from a remote source.

@IanButterworth
Copy link
Sponsor Member

I think we can close this?

@DilumAluthge DilumAluthge deleted the lh/checkin-busybox branch October 4, 2023 00:41
@LilithHafner
Copy link
Member Author

I'm still not sure why it's so bad to check in a binary file rather than download it each time. Would one of you explain the reasoning behind that?

@IanButterworth
Copy link
Sponsor Member

Perhaps this should be reopened with a different title? I've got no stance on the question you're asking

@DilumAluthge
Copy link
Member

It's generally recommended to only check text files into Git. For binary files, people usually use Git LFS or some other solution.

@IanButterworth
Copy link
Sponsor Member

For the security concern we could add a hash check of the downloaded file?

@DilumAluthge
Copy link
Member

Yeah, if there are security concerns, checksums are the way to go. We already do that for all our build dependencies, so it seems reasonable to add checksums for our various test dependencies as well.

@LilithHafner LilithHafner changed the title Check in busybox to fix CI failures Check in busybox to a void historically flakey download in CI and simplify process Oct 4, 2023
@LilithHafner LilithHafner restored the lh/checkin-busybox branch October 4, 2023 12:15
@LilithHafner LilithHafner reopened this Oct 4, 2023
@LilithHafner
Copy link
Member Author

LilithHafner commented Oct 4, 2023

Perhaps this should be reopened with a different title?

Okay, done.

It's generally recommended to only check text files into Git.

Who recommends this and why?

For binary files, people usually use Git LFS or some other solution.

I assume that Git LFS is appropriate for storing large files. is 608 KB too large?

@LilithHafner LilithHafner marked this pull request as ready for review October 4, 2023 12:22
@LilithHafner LilithHafner changed the title Check in busybox to a void historically flakey download in CI and simplify process Check in busybox to avoid historically flakey download in CI and simplify process Oct 9, 2023
@LilithHafner
Copy link
Member Author

@DilumAluthge, I appreciate that you block PRs that should not be merged. However, I think it is appropriate to provide an explanation of why you choose to block such a PR and engage in a brief dialogue about it. I still don't understand why it is bad to check in a small binary file.

@gbaraldi
Copy link
Member

gbaraldi commented Oct 9, 2023

Because it gets added to the julia repo, and it doesn't make sense to be there. Every git clone will download it.

@DilumAluthge
Copy link
Member

Just a couple quick quotes:

Unfortunately git does not handle binary files elegantly (unless you use git-lfs). You can inflate storage rapidly by, say, editing a 10M zip file a few times. I've had to GC more than one repo where someone accidentally added an innocuous binary file, and the next thing you know the repo has exceeded 2G of storage space.

(Source)


It’s important to never commit binary files because once you’ve commit them they are in the repository history and are very annoying to remove. You can delete the files from the current version of the project - but they’ll remain in the repository history, meaning that the overall repository size will still be large.

(Source)

@vchuravy
Copy link
Member

vchuravy commented Oct 9, 2023

Yeah checking binaries is a no-go. We can try and address this on the CI side of things (provide the binary in the environment there instead of downloading it everytime),
but checking in a binary should be a measure of last resort, if at all.

@LilithHafner
Copy link
Member Author

Thanks, all, for the explanations. I appreciate understanding the reasoning behind this. IIUC the issue is all about repo file size bloat.

@LilithHafner LilithHafner deleted the lh/checkin-busybox branch October 10, 2023 01:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous integration DO NOT MERGE Do not merge this PR!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants