Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0131] Enforce the usage of SRI hashes in Nixpkgs #131

Closed
wants to merge 3 commits into from

Conversation

winterqt
Copy link
Member

@winterqt winterqt commented Jul 26, 2022

# Motivation
[motivation]: #motivation

Currently in Nixpkgs, the hash formats that are used in parameters to fetchers range from SRI and base-32 (the most common ones) to hex. In addition to this, packages use the algorithm-specific arguments (`sha256`, `sha512`, etc.) or the generic `hash` argument at random. This creates inconsistencies that we'd rather not have.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is SRI in this context?

Copy link
Member Author

@winterqt winterqt Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subresource Integrity, whose hashes look like sha256-2mE26ES+fYSWdfMr8uTsX2VVGTNMDQ9MXEk5E/L95UI=. Maybe I should add at least a definition of SRI in the RFC?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates inconsistencies that we'd rather not have.

Why? This is not a motivation, just a claim.

Using the different hash encodings genuinely has value, for example I frequently just copy over the base16 hash from a SHA256SUMS file provided by upstream, requiring less button presses, round trips and most importantly allows others to quickly and easily match the hashes textually when reviewing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a link to what SRI is in the RFC would be good to have. I also didn't know what SRI was.

Copy link
Member Author

@winterqt winterqt Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the different hash encodings genuinely has value...

I agree with this point, but at the same time most people use either the SRI hashes given by Nix's hash mismatch errors (which I now realize is only in 2.4), or convert those to base-32 (the default in 2.3 and below).

Copy link
Member

@sternenseemann sternenseemann Aug 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vojta001 That's why these interfaces tend to have an assert or two behind the scenes…

nix hash to-sri

Well that's actually a very good point: There is no stable interface for converting hashes into the SRI format, because nix-hash(1) only supports SRI as an input, not an output format.

Copy link

@milahu milahu Sep 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the different hash encodings genuinely has value, for example I frequently just copy over the base16 hash from a SHA256SUMS file provided by upstream, requiring less button presses, round trips and most importantly allows others to quickly and easily match the hashes textually when reviewing.

no need to convert the hash to base64

SRI hashes only require a prefix:

{
  sha256 = "1cc9110705165dc09dd09974dd7c0b6709c9351d6b6b1cef5a711055f891dd0f";
  sha256 = "sha256-1cc9110705165dc09dd09974dd7c0b6709c9351d6b6b1cef5a711055f891dd0f";
  hash = "sha256-1cc9110705165dc09dd09974dd7c0b6709c9351d6b6b1cef5a711055f891dd0f";
}

base16 hashes are found with
grep -r -E 'sha256 = "[0-9a-f]{64}";'

new and better hash technologies appear everywhere.

sha256 is de-facto standard in nixpkgs,
because its good enough for our use case = global IDs

Copy link
Member

@AndersonTorres AndersonTorres Sep 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sha256 is de-facto standard in nixpkgs,

MD5 was de facto standard no more than a decade ago.

Also, this was not about "good enough algorithms", but about a better format for the expression - namely, hash = "hashAlgorithm-..." instead of hashAlgorithm = "...".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates inconsistencies that we'd rather not have.

@winterqt could you add something more specific here, considering this is the whole motivation for the RFC?

The discussion above has brought up some more points:

  • semantic vs syntactical mutual exclusion
  • unique attribute name rather than multiple, algo-specific ones
  • ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@milahu: conversion would be needed because SRI is currently only accepting base64. However, it would be relatively easy to change: NixOS/nixpkgs#199660

rfcs/0131-sri-hashes.md Outdated Show resolved Hide resolved
# Drawbacks
[drawbacks]: #drawbacks

None at this time.
Copy link
Member

@sternenseemann sternenseemann Jul 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's (anecdotically) still a significant number of Nix 2.3 users who contributing to nixpkgs would be made unnecessarily harder for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, another ten years of deprecation window...

Copy link
Member

@piegamesde piegamesde Aug 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You joke, but I seriously think this would be a sensible thing to do. (Also, things can be deprecated without the intent to eventually removing them, see Java which did not remove deprecated APIs until Java 9?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall we have developed a strange culture of removing stuff unnecessarily in nixpkgs, I feel like.

In this case, we shouldn't remove support for these formats from Nix anyways (to continue to be able to evaluate older nixpkgs revisions (the ability to reproduce ancient stuff truely is a strength of Nix)) and for existing APIs in nixpkgs it's laughably simple to keep supporting it (you don't need to change anything, in fact!). For all I care, stop supporting the sha256 and sha512 attributes for new APIs…

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text does not talk about removal, I suppose, but about enforcement. Older expressions can be rendered by newer versions, whereas new expressions should be written in newer versions of Nix.

Copy link
Member Author

@winterqt winterqt Sep 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So how does something like this sound: we strongly encourage using SRI hashes and hash in new packages, if there is no justifiable reason to not use them. Justifiable reasons include, but are not limited to, taking hashes directly from upstream, and using automated tooling that does not use SRI hashes.

I'm not sure about that last reason, but I also want to provide more than one example of a reason.

I'm also not sure about how to handle converting old packages (with the same criteria of having no justifiable reasoning), if at all. Thoughts welcome.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old packages can be converted by anyone interested ion converting them. I have started a preliminar work a long time ago :P

NixOS/nixpkgs#176283

Further I believe this should be decided on a case by case basis. Maybe allowing old Nix tools is acceptable, whereas upstream hashes is not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warns the user when the code uses dangerous functions

Not sure how apt this comparison is, as there's nothing fundamentally different about a sha256 hash in SRI format or one in base16 format.

I'm open to changing it to "discourage", but... I don't know, if non-SRI hashes keep slipping in anyways because it's not required, what's the point of this? I guess that's why I chose that strong of a word.

Stale hashes is not something caused by a particular hash format. It happens because expressions bitrot and we'll have stale hashes in SRI format before long. Doing a mass conversion of hashes may turn up stale hashes, but doesn't have to since you would likely just convert them without refetching, since that is much quicker.

If you are concerned about stale hashes, I'd suggest trying to refetch fixed output derivations in nixpkgs with cache disabled. I believe this has been done quite a few times over the years and it should be easy to turn up useful tools for this.

strongly encourage

Newer Nix versions already default to this, so there is already a lot of encouragement…

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"A lot of encouragement" that ends with "my version is too old" or "this is mere personal preferences, don't force them on me"? Wow cool.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version is probably too old for the format to be convenient to produce but can evaluate the expressions with either hash format. For «mere personal preferences» argument new Nix making use of SRI format easier than non-SRI is an encouragement, as submitters will typically use whatever is the easiest and changing the format is pointless churn so the initial hash format will stay.

# Motivation
[motivation]: #motivation

Currently in Nixpkgs, the hash formats that are used in parameters to fetchers range from SRI and base-32 (the most common ones) to hex. In addition to this, packages use the algorithm-specific arguments (`sha256`, `sha512`, etc.) or the generic `hash` argument at random. This creates inconsistencies that we'd rather not have.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates inconsistencies that we'd rather not have.

Why? This is not a motivation, just a claim.

Using the different hash encodings genuinely has value, for example I frequently just copy over the base16 hash from a SHA256SUMS file provided by upstream, requiring less button presses, round trips and most importantly allows others to quickly and easily match the hashes textually when reviewing.

- Do we want to eventually deprecate the use of `sha256`, `sha512`, etc. in fetchers that support SRI hashes, leading to their eventual removal?
- This would require updating every script that uses these arguments, and would cause breakage for any usage of them out-of-tree; we'd need to weigh the benefits of this
- Is `hash` the correct name to use for the argument?
- Nix builtins use `narHash`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care about the name, but built-in fetchers and nixpkgs fetchers should be consistent.


- How can we do the treewide migration efficently?
- Text replacement for the majority of cases, with an AST-based solution for others?
- Do we want to eventually deprecate the use of `sha256`, `sha512`, etc. in fetchers that support SRI hashes, leading to their eventual removal?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can deprecate them without removing them.

@spacekookie
Copy link
Member

Hello everyone, this RFC is now open for nominations. If you think of someone (including yourself) who you think would make a good shepherd on this RFC, please nominate them now

@tomberek
Copy link
Contributor

tomberek commented Sep 21, 2022

Recommend looking through Nixpkgs for uses in the wild to find additional shepherds.

Edit: for examples, people using or recommending hash vs sha256 already

@winterqt
Copy link
Member Author

What do you mean by this (what "uses"), who would we be looking for?

@AndersonTorres
Copy link
Member

Nominating myself

@7c6f434c
Copy link
Member

I nominate @sternenseemann

@edolstra
Copy link
Member

edolstra commented Oct 5, 2022

We still need at least two more shepherds. @sternenseemann Are you willing to help shepherd this RFC?

@sternenseemann
Copy link
Member

We still need at least two more shepherds. @sternenseemann Are you willing to help shepherd this RFC?

I won't have time until December sadly.

Comment on lines +44 to +45
- Is `hash` the correct name to use for the argument?
- Nix builtins use `narHash`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much value in renaming hash. A lot of work which we can safe ourselves.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See NixOS/nixpkgs#79987 (comment) for context, I added it here just to include it since the question was raised (should probably cite that discussion, though).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are not talking about narHashes here, so this should be removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NAR hashes are necessary for any cases where the result is more than a single file, e.g. fetchFromGitHub.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, as we have flat hashes for normal files and NAR hashes over arbitrary trees (or normal files), I'd say it makes sense to call them differently. At least if I don't consider any backwards compatibility questions for now. For example, keep hash for normal ones and rename NAR hashes to narHash. It might even avoid some confusion, and functions could allow specifying either (though of course the value differs, as something different is hashed).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that could be done under the hood in the fetchers? It seems confusing for new contributors to have to think about whether to use a flat hash or a NAR hash.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We already do this with outputHashMode.

@winterqt winterqt mentioned this pull request Oct 18, 2022
13 tasks
@lheckemann
Copy link
Member

@sternenseemann December would be fine IMHO, would you be willing to join the shepherd team then?

Alternatively, I nominate @vcunat -- would you be willing to shepherd?

# Alternatives
[alternatives]: #alternatives

- Do nothing: further inconsistencies will continue to be introduced
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are a number of alternatives.

Different rules can be considered. For each of the following formats it could be decided to allow/disallow/preferred:

  • hash = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
  • sha256 = "0000000000000000000000000000000000000000000000000000000000000000";
  • sha256 = "sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";
  • sha256 = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=";

I've gathered there are a number of ways these rules can be 'enforced':

  • Throw evaluation errors upon not conforming to rule.
  • Have a it be a potential future linter rule.
  • Have PR reviewers nudge contributors to follow the rule
  • Do nothing: keep things like the are right now. Reviewers will suggest their own opinion in PRs if they'd like.
  • Allow any format: do not allow PR reviewers to suggest changing the format in PRs to their preferred format. If the format is allowed, just let it be. This avoids PRs being always potentially invalid due to reviewers having different opinions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allow anything: Do not allow PR reviewers to suggest changing the format in PRs. If the format is allowed, just let it be. Avoid conflicting reviewer opinions.

I'm mentioning this option, as at the moment PRs are being reviewed where the reviewers suggest their own 'opinion' on the matter; even though there is no real rule. One outcome of this RFC could be to just let it be.

@edolstra
Copy link
Member

It appears that we need one more shepherd. Any takers?

# Summary
[summary]: #summary

Use SRI hashes and the `hash` parameter for fetchers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"SRI hash" needs a reference link with a definition and globally this RFC needs some more context to be explained.

FWIW, since I became aware that this kind of hash was preferred, I have tried to use them more systematically, but I'm annoyed by nix-prefetch-url not generating this type of hash (at least all the way to Nix 2.12) and not knowing how to convert a hash generated by nix-prefetch-url into an SRI hash.

Copy link
Member

@vcunat vcunat Jan 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, for conversion I'm using

nix hash to-sri --type sha256 theHash

EDIT: or you could use

nix hash file path/to/file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! But another thing that is becoming annoying to me lately is to be forced to activate the nix-command experimental feature for more and more basic things that I cannot avoid.

Another solution that I've found meanwhile was openssl dgst -sha256 -binary /path/to/file | openssl base64 -A, but it's not as convenient.

Copy link

@ShamrockLee ShamrockLee Jan 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But another thing that is becoming annoying to me lately is to be forced to activate the nix-command experimental feature for more and more basic things that I cannot avoid.

I would rather view it as a missing feature of nix-hash that needs to be implemented to get this RFC merged. In other words, we should have

nix-hash --sri --type sha256 path/to/file

and

nix-hash --to-sri --type sha256 theHash

Update: I just created NixOS/nix#7690. Please take a look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! And thanks to you, I also learned that, in the meantime, I can use nix --extra-experimental-features nix-command file to-sri --type sha256 .... I didn't realize that I don't need to activate the experimental feature globally, and this is already better than nothing.

@tomberek
Copy link
Contributor

From RFC Steering Committee: please nominate yourselves or someone you think would do a good job as a shepherd.

@sternenseemann
Copy link
Member

From RFC Steering Committee: please nominate yourselves or someone you think would do a good job as a shepherd.

I think this should be closed due to a lack of interest according to RFC 130 – Search for shepherds started in August and the two month period has long passed.

@winterqt
Copy link
Member Author

I think this should be closed due to a lack of interest according to RFC 130

I'm somewhat inclined to agree, though I suppose it wouldn't hurt to have a phase where we create a new RFC based on these ideas and reopen it later.

If anyone's interested in doing this, please reach out. I'd make a Matrix room, but that's proving harder than I'd expect... 😅

Search for shepherds started in August and the two month period has long passed.

I don't think @vcunat responded to his nomination, though I don't think that would make that much of a difference.

@vcunat
Copy link
Member

vcunat commented Jan 25, 2023

I didn't feel like accepting, because I couldn't see how to progress this – except for one minor improvement for which I filed a nixpkgs PR but got (almost?) no reaction. (and I still see it the same)

@ShamrockLee
Copy link

ShamrockLee commented Jan 26, 2023

The followings sound more actionable:

  1. Ensure that SRI hashes can be computed without experimental-features. (nix-hash: support base-64 and SRI format nix#7690)
  2. Ensure that all the in-tree fetchers / builders / lang-to-nix expression generators provide hash-algorithm-agnostic attributes (e.g. hash) to receive SRI hashes. We can start from the fetchers that appears in the Nixpkgs manual.
  3. Ensure that the hash-mismatch error message produced by the Nix buder always show hashes in SRI format (Thanks @7c6f434c for pointing out this part. See [RFC 0131] Enforce the usage of SRI hashes in Nixpkgs #131 (comment))
  4. Change the Nixpkgs documentation and recommend the use of hash-algorithm-agnostic attribute and SRI hashes.
  5. Upstream corresponding changes to out-of-tree lang-to-nix expression generators.

@7c6f434c
Copy link
Member

@ShamrockLee should there be the part where Nix never outputs a «new» non-SRI hash (new as in not the exact same meaningful characters as some non-SRI hash that was passed in)?

@ShamrockLee
Copy link

ShamrockLee commented Jan 28, 2023

the part where Nix never outputs a «new» non-SRI hash (new as in not the exact same meaningful characters as some non-SRI hash that was passed in)

@7c6f434c Sorry, but which hashes Nix would output are you referring to?

@7c6f434c
Copy link
Member

7c6f434c commented Jan 28, 2023 via email

@ShamrockLee
Copy link

ShamrockLee commented Jan 28, 2023

I see. IIUC, the hash is produced according to the hashAlgo of the source derivation set by the fetcher, and the fixes should go to each fetcher. Is that the case?

Updated.

@7c6f434c
Copy link
Member

7c6f434c commented Jan 28, 2023 via email

@ShamrockLee
Copy link

Ensure that the hash-mismatch error message produced by the Nix buder always show hashes in SRI format (Thanks @7c6f434c for pointing out this part. See [RFC 0131] Enforce the usage of SRI hashes in Nixpkgs #131 (comment))

Just change the content of the third point.

@edolstra
Copy link
Member

The RFC steering committee has decided to close this RFC due to a lack of shepherd nominations. However, it can be reopened if people are interested in shepherding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: open for nominations Open for shepherding team nominations
Projects
None yet
Development

Successfully merging this pull request may close these issues.