Document that base32 hashes are not **the** base32 everyone expects #4354

Open
piegamesde opened this issue Dec 11, 2020 · 14 comments

@piegamesde
Member

Is your feature request related to a problem? Please describe.
I just lost three hours (and 4 more percent of my sanity) until I figured out that base 32 hashes in Nix do their own custom foo.

Describe the solution you'd like
Please document it somewhere. No, not somewhere, but everywhere. Everywhere where "base32" is mentioned. While we're at it, we could also document which hashes are represented in which base and that no padding is used.

This includes nix-hash, nix-prefetch-url, the Nix manual, and every other place someone might look when running into this issue.
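
As a rough illustration of what such documentation could state, the sketch below computes how long a digest is in each encoding. It assumes that the Nix base32 form is unpadded and has ceil(bits / 5) characters, and that the base64 form keeps its `=` padding; the function name `encoded_lengths` is hypothetical and not part of any Nix tool.

```python
def encoded_lengths(digest_bytes: int) -> dict:
    """Return the string length of a digest of `digest_bytes` bytes per encoding."""
    return {
        # Two hex characters per byte.
        "base16": digest_bytes * 2,
        # One character per started group of 5 bits, no padding (assumption).
        "nix32": (digest_bytes * 8 - 1) // 5 + 1,
        # Four characters per started group of 3 bytes, "=" padding included.
        "base64": (digest_bytes + 2) // 3 * 4,
    }


# Digest sizes in bytes for the common hash algorithms.
for name, size in [("md5", 16), ("sha1", 20), ("sha256", 32), ("sha512", 64)]:
    print(name, encoded_lengths(size))
```

Under these assumptions a sha256 digest is 64 characters in base16, 52 in Nix base32, and 44 in base64.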

@andir
Member

andir commented Dec 12, 2020

IMHO we should probably start with s/base32/nix32/g across the entire code base + some backwards compatibility in the man pages & commands such as nix-hash.

@zimbatm
Member

zimbatm commented Dec 12, 2020

Note that base32 isn't standardized as much as base64. There are many variants out there: https://en.wikipedia.org/wiki/Base32

@piegamesde
Member Author

There's an RFC from 2006, so I think it's safe to assume that all newer applications use that one. Also, I'm not sure if the nix32 algorithm is equivalent to base32 except for character substitution (a quick test suggests "no").

I went ahead and created a wiki page about hashes, as there are quite a few other details about them that are under-documented. Please proof-read this, as I'm still pretty unsure about a lot of the edge cases.

@colemickens
Member

I added links to Rust and Go implementations.

Can someone review the wiki page, specifically the bit about SRI hashes?

@piegamesde I ask this because:

# try just prefixing "sha256-" as you documented on the wiki page
$ nix to-sri sha256-1g6ycnji10q5dd0avm6bz4lqpif82ppxjjq4x7vd8xihpgg3dm91
error: --- BadHash ----------------------------------------------------------------------------- nix
invalid SRI hash '1g6ycnji10q5dd0avm6bz4lqpif82ppxjjq4x7vd8xihpgg3dm91'

# on the other hand, if we prepend `sha256:` and call `nix to-sri` we get a **different**, valid SRI hash
$ nix to-sri sha256:1g6ycnji10q5dd0avm6bz4lqpif82ppxjjq4x7vd8xihpgg3dm91
sha256-IdU23rswdtT26QRL2e8VyMWLKfnL1K1AawWDEKVl3rw=

@edolstra
Member

As the Wikipedia page makes clear, there is no standard base-32. So I don't really see a compelling reason to rename "our" base-32 after 17 years. However, nowadays it's better to use SRI hashes, which is what the new CLI defaults to.

@piegamesde
Member Author

Thanks for the additions. Also thanks for mentioning nix to-sri; I hadn't seen that before, and it looks like something I'd want to mention on the wiki page.

Regarding your question, it seems like the only valid encoding for SRI hashes is base64, so simply prepending sha256- is not enough.

However, now I am a bit confused about the "old <type>:<hash> format" that you used (sha256:1g6ycnji10q5dd0avm6bz4lqpif82ppxjjq4x7vd8xihpgg3dm91): why is there a colon instead of a dash and for which tools is this a valid input?

@edolstra
Member

@colemickens This is the correct command:

$ nix to-sri --type sha256 1g6ycnji10q5dd0avm6bz4lqpif82ppxjjq4x7vd8xihpgg3dm91
sha256-IdU23rswdtT26QRL2e8VyMWLKfnL1K1AawWDEKVl3rw=

The <type>:<hash> format is (AFAIK) undocumented and shouldn't be used.
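
For readers who want to see what this conversion amounts to, here is a minimal Python sketch; it is not Nix's actual implementation. It assumes the Nix base32 alphabet is `0123456789abcdfghijklmnpqrsvwxyz` and that the characters spell out the digest read as a little-endian number, most significant 5-bit group first. The helper names `nix32_decode` and `to_sri` are made up for illustration.

```python
import base64

# Assumed alphabet of Nix's base32 variant (the letters e, o, t, u are omitted).
NIX32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix32_decode(text: str, size: int) -> bytes:
    """Decode a Nix base32 string into `size` raw digest bytes."""
    value = 0
    for ch in text:
        # Each character contributes 5 bits, most significant group first.
        # str.index raises ValueError for characters outside the alphabet.
        value = (value << 5) | NIX32_ALPHABET.index(ch)
    # The accumulated number is the digest interpreted as little-endian.
    return value.to_bytes(size, "little")


def to_sri(algo: str, nix32: str, size: int) -> str:
    """Build an SRI-style string: algorithm name, a dash, padded base64 digest."""
    digest = nix32_decode(nix32, size)
    return f"{algo}-{base64.b64encode(digest).decode('ascii')}"


print(to_sri("sha256", "1g6ycnji10q5dd0avm6bz4lqpif82ppxjjq4x7vd8xihpgg3dm91", 32))
# prints: sha256-IdU23rswdtT26QRL2e8VyMWLKfnL1K1AawWDEKVl3rw=
```

This also shows why prefixing `sha256-` to the base32 string cannot work: the SRI form carries the same digest bytes, but re-encoded as base64.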

@zimbatm
Member

zimbatm commented Dec 14, 2020

To add a bit more historical context, <type>:<hash> precedes the use of SRI. It was trying to fill the same need. It mainly appears in the expected: <hash>, got: <hash> error message, which will default to SRI in the next Nix release. That format is also being used in the .narinfo files in the binary cache and maybe a bunch of other places.

@zimbatm
Member

zimbatm commented Dec 14, 2020

Regarding Base32: RFC 4648 was published in October 2006, and Eelco's thesis in January. Technically this means that everybody else should adopt Nix's curse-proof variant of Base32 :-p

@stale

stale bot commented Jun 12, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the stale label Jun 12, 2021
@andir
Member

andir commented Jun 12, 2021

Still relevant. Bumping to make this "not stale".

Long version:

@zimbatm technically it has already been discussed in November 2005: https://datatracker.ietf.org/doc/html/draft-josefsson-rfc3548bis-00

@stale stale bot removed the stale label Jun 12, 2021
@Ericson2314
Member

This does seem like a simple and good thing to fix: somewhere the docs should note that our Base32 is not RFC 4648 because it predates its ratification.

@stale

stale bot commented Jan 3, 2022

I marked this as stale due to inactivity. → More info

@nbraud
Contributor

nbraud commented Sep 14, 2023

> There's an RFC from 2006, so I think it's safe to assume that all newer applications use that one. Also, I'm not sure if the nix32 algorithm is equivalent to base32 except for character substitution (a quick test suggests "no").

It's indeed not. (Source: I tried that before knuckling down and writing my own implementation, a few days ago)
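
A small Python sketch of that experiment, under the assumption that Nix's alphabet is `0123456789abcdfghijklmnpqrsvwxyz` and that it emits 5-bit groups of the digest read as a little-endian number, starting from the most significant group. The function names are hypothetical; the digest is the one behind the SRI hash quoted earlier in this thread.

```python
import base64

NIX32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"    # assumed Nix alphabet
RFC4648_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # standard base32 alphabet


def nix32_encode(data: bytes) -> str:
    """Encode bytes the way Nix's base32 is assumed to work (unpadded)."""
    length = (len(data) * 8 - 1) // 5 + 1
    value = int.from_bytes(data, "little")
    # Emit 5-bit groups from the most significant end of the number.
    return "".join(
        NIX32_ALPHABET[(value >> (5 * k)) & 0x1F] for k in reversed(range(length))
    )


def rfc4648_with_nix_alphabet(data: bytes) -> str:
    """RFC 4648 base32, padding stripped, characters swapped to Nix's alphabet."""
    standard = base64.b32encode(data).decode("ascii").rstrip("=")
    return standard.translate(str.maketrans(RFC4648_ALPHABET, NIX32_ALPHABET))


digest = base64.b64decode("IdU23rswdtT26QRL2e8VyMWLKfnL1K1AawWDEKVl3rw=")
print(nix32_encode(digest))               # the Nix-style string
print(rfc4648_with_nix_alphabet(digest))  # same length, different characters
```

The two results have the same length but differ, because RFC 4648 consumes bits from the first byte onward while the Nix-style encoding, as sketched here, walks the bytes in the opposite order.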

> I went ahead and created a wiki page about hashes, as there are quite a few other details about them that are under-documented. Please proof-read this, as I'm still pretty unsure about a lot of the edge cases.

Thanks, that would have saved me some hours and a significant fraction of my remaining sanity... had I found it then 😅

I'll try and give it a read 👍

> I don't really see a compelling reason to rename "our" base-32 after 17 years.

  • Preserving the time and sanity of every contributor who keeps stepping on that particular rake.
  • Making it easy to search for, so people can find implementations or specifications for it. As said above, that would have made writing my “sha256 to hash” maintainer script so much easier.
