Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finally moving away from weak hash algorithms (and extra-files while we are at it) #25876

Closed
9 tasks done
hannesm opened this issue May 14, 2024 · 5 comments · Fixed by #25962
Closed
9 tasks done

Finally moving away from weak hash algorithms (and extra-files while we are at it) #25876

hannesm opened this issue May 14, 2024 · 5 comments · Fixed by #25962

Comments

@hannesm
Copy link
Member

hannesm commented May 14, 2024

Dear values maintainers and OCaml developers and watchers of this repository,

since a long time it has been known that MD5 is a weak hash algorithm. drum beat

Finally there's a CI lint check to avoid fresh md5 checksums being added to this repository (ref ocurrent/opam-repo-ci#304). \o/

Now, there's a bit of legacy: we have around 19099 occurences of the string "md5=" in this repository.

This is distributed across the URL field of opam files and extra-files of opam files. Furthermore, extra-files are slightly painful, and the extra-source fields are preferred. I propose to change the following in one PR:

  • Add to all checksum where there's only md5 a sha256,
  • Replace all extra-files with extra-source pointing to a remote git repository (preferably under the ocaml or ocaml-opam github organisation) EDIT: I'll just use the opam-source-archives repository for that, and use sha256 as checksum for these files,
  • Get a lint check in place that prevents us from accepting any opam packages with extra-files

(Of course, we can as well use sha512 if someone knows about feasible attacks on sha256, the former is unfortunately way slower esp. on e.g. Raspberry Pi hardware.)

For the first two items, based on opam admin from opam 2.2 I have OCaml code that does this automatically (see https://github.com/hannesm/opam/tree/migrate-extra-files). I'm also fine to submit a opam-repo-ci PR for the lint check.

The only leftover question is whether there is any pushback or concerns? (I'm curious whether this must wait for opam 2.1.6 since earlier versions are not able to remove files on some operating systems (such as macOS)?) -- any insights?

Please raise your concerns via comments, and express your support (👍🏾) or not support (👎🏾) via reactions here.

(PS: I proposed an earlier document/PR at #25874 - where @mseri gave positive feedback, but I believe when we do such a change, let's do it at once.)

Earlier discussion at #17315 https://discuss.ocaml.org/t/opam-repository-security-and-data-integrity-posture/6478 -- Now 4 years later, I think we can act.

This is as well a necessary step for getting cryptographic signatures -- well, or at least it simplifies the design tremendously -- both having non-weak checksums of external data, and also not having arbitrary files around.

Regards from the security engine room

Update: this has been discussed and approved in the 2024 May 22nd opam-repository meeting https://github.com/ocaml/opam-repository/wiki/Meeting-notes#2024-05-22 -- there's no blocker to move forward.

Tasks to do:

@raphael-proust
Copy link
Collaborator

This issue was discussed during the opam-repository maintainer meeting and we decided that this should go ahead. Further discussions and such can happen in the different PRs that this issue is manifesting as.

@hannesm
Copy link
Member Author

hannesm commented May 24, 2024

Dear everyone (cc'ing manually @shonfeder @kit-ty-kate @raphael-proust),

I've also done the third and fourth step. And for you to have an easier time to validate the results (well, and me verifying that I didn't break anything), there's now a tool -- https://github.com/hannesm/opam-check-checksum -- You run it with the current opam repository (use 6ed19e3) on the one side (--old-opam-repo=DIR), and the updated one on the other side (--opam-repo=DIR -- use 2730ed6, the HEAD of https://github.com/hannesm/opam-repository/tree/extra-file-to-extra-source).

What does opam-check-checksum do? Well, it will read all opam files in old-opam-repo, and check opam files for having extra-files or only a md5 checksum. If this is the case, opam-repo is checked that the old hashes are included (for url), and for extra-files that a extra-sources entry is present and also contains the old checksum.

For me this ran without issues after I fixed a silly mistake (it should retain the sha512 if it was present) in the migration command.

To give some motivation, if we merge that branch, the MD5-only checksums go down from 12577 to 70 (measured with git grep 'checksum: "md5=' | wc -l). The remaining MD5-only checksums are due to unavailable sources/tarballs.

It would be great if you could either merge the opam-source-archives PR and my branch (named extra-file-to-extra-source), or tell me whether there are any steps I should do before this can be merged. You can of course take your time and look at the diff, or reproduce the results -- as mentioned, the two big commits are automated, the in-between one was a manual move to reconcile the fix-gcc patch.

@hannesm
Copy link
Member Author

hannesm commented May 24, 2024

If you'd like to read the diff for opam-repository, I recommend to use git diff -D 6ed19e325e5016a43606d8073ca73998f9ebf68f..2730ed62d22b8488a96c5683498a99a8426ce178 (which avoids the full text of the deleted files).

@emillon
Copy link
Contributor

emillon commented May 29, 2024

FYI, #25960 added hashes to +trunk packages, see

"sha256=1553dcce22558d985c805561b1a18a175d98e27740da2aaa9da73d971f19161f"
. I believe those should stay without hashes.

@hannesm
Copy link
Member Author

hannesm commented May 29, 2024

Thanks @emillon for your headsup, in #25962 these checksums got removed again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants