-
-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC 0171] Default name of fetchFromGithub FOD to include revision #171
Conversation
Co-authored-by: Robert Schütz <github@dotlambda.de>
17a0a70
to
7534dd4
Compare
Just one silly question: why only
|
I did reference this rfcs/rfcs/0171-fetch-from-github.md Line 104 in 1ac472d
I mention fetchFromGitHub because it's about 20x as common:
I'm not opposed to doing a similar treatment to the other fetchFromX helpers. I think this RFC could be expanded to encompass them all eventually. To your point, maybe this should be relabeled as "Default name of fetchFromX FOD helpers to include revision" |
There's already a speculative impl as of 2 weeks ago: NixOS/nixpkgs#294068 Though the format is slightly different ( |
I choose the format to make it easier to inspect which source FOD is going wrong when several packages got rebuild at the same time. Update: Just came across NixOS/nixpkgs#49862, which seems to work on the package |
rfcs/0171-fetch-from-github.md
Outdated
- "Interchangeability" with other fetchers is diminished as the derivation name is different | ||
- In practice, fetchFromGitHub is never used in this way. It is generally the only fetcher, so there is never another FOD to dedupilicate. | ||
- Out-of-tree repositories may get hash mismatch errors | ||
- If the cause of the mismatch is staleness, this is good and working as intended | ||
- If the cause is non-determinism, this is frustrating. | ||
- Some derivations assume "source" to be the name of sourceRoot | ||
- This has been mitigated over two years within Nixpkgs | ||
- Out-of-tree code may break if they assume "source" is the name | ||
- Can be mitigated with release notes describing the issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are caused by the assumption of src.name == "source"
being broken.
However, we have already deprecated such assumption as early as Nixpkgs 21.11. Nixpkgs Manual now requires sourceRoot
to start with "${src.name}"
instead of "source"
when src
is constructed with by fetchgit
-based fetchers, and tree-wide conversions (NixOS/nixpkgs#245388, NixOS/nixpkgs#247977, NixOS/nixpkgs#248528, NixOS/nixpkgs#248683, NixOS/nixpkgs#294334) have been merged.
fetchFromGitHub
and other output-as-a-directory fetchers can still be used interchangeably if we stick to ${src.name}
instead of "source"
.
In my opinion, we do not provide bug-level compatibility to packages failing to follow already-stablized specifications, in-tree or out-of-tree.
IMO, "fetchers for version-controlled repositories" would be a suitable target. This includes fetchers based on specific version control systems (e.g. |
Hehe per #133 (comment), my dream of single hash git fetchers (avoiding the need to put things in the name like this) is now a good bit closer! |
Congratulations! We still need to fix those fetchers at Nixpkgs level, though, since Nixpkgs often takes years to adopt new Nix features. |
- Full commit hashes could be truncated. This sacrifices a bit of simplicity for better looking derivation names: | ||
``` | ||
let | ||
version = builtins.replaceStrings [ "refs/tags/" ] [ "" ] rev; | ||
# Git commit hashes are 40 characters long, assume that very long versions are version-control labels | ||
ref = if (builtins.stringLength rev) > 15 then builtins.substring 0 8 version else version; | ||
in lib.sanitizeDerivationName "${repo}-${ref}-src"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't recommend this. Short hashes are not guaranteed to be stable for long term storage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
40 characters are not really that long for a machine-generated derivation.
People would most likely see the store hash and (hopefully) part of the name
when it flashes through their terminal. During debugging, a path like that would occupy at most a bit more than one line inside the terminal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current "source" isn't that stable either. 8 characters is still a fair amount of entropy, and likely to be different enough for most repositories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also remember, that it doesn't have to be "unique across all time". It just needs to be different than what was there before, so that the combination of name + hash are different.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-04-02/42643/1 |
I'd like to nominate myself as a shepherd. I previously authored a related implementation, NixOS/nixpkgs#294068, which could inform the discussion around this RFC's design. I'm excited about this RFC and look forward to working with the community to shepherd it through the process. |
Co-authored-by: Yueh-Shun Li <shamrocklee@posteo.net>
If we have a shepards meeting, we can refine the scope. |
Yes, NixOS/nixpkgs#49862 is very related, in its latest reincarnation it allows you to keep both |
To code the fetcher name into the |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: |
@oxij I would like to nominate you as a shepherd, looks like you've thought about this longer than I've been involved in nixpkgs. I like your addition of a lib function to make derivation names more consistent in NixOS/nixpkgs#49862 |
Thanks for a nomination, but I don't think this RFC is a proper solution to the stated problem.
Eelco and others objecting to the `<hash>-<name>-<version or revision>-source` naming scheme (and to the original 2018 version of #49862 which wanted to do exactly this, but for all fetchers, not just `fetchFromGitHub`, and, primarily, for a different reason of `/nix/store` discoverability) do have a point: the current most common `<hash>-source` naming scheme is awfully convenient for running a build server (like Hydra) and when you have to frequently switch between fetchers (e.g. when using out-of-Nixpkgs derivations and/or flakes).
IMHO, the problem this RFC tries to address is actually a set of features/bugs/issues of Nix itself:
- Nix reuses existing fixed-output outputs between different derivations without actually checking the derivation actually builds into an output with the given output hash; which is certainly a feature, given how useful it is for switching between different fetchers;
- but, IIRC, last time I tried, explicit `nix-store -r --check <derivation>` over such a derivation will fail to re-build and re-check it; which is either a bug, or there should be a separate option for doing this; for simplicity, let's assume we want a new option for this, and we want to call it `--force-rebuild`;
- `nix-build` and `nix build` lack `--force-rebuild` option (similarly to how they lack `--check` option too), you have to call `nix-store -r` manually (an issue).
(Now that I think about it, I should probably open Nix issues for these. Later, I'm a bit busy ATM.)
If the latter two issues were not such, then you could just do `nix-build --force-rebuild -A hello.src` instead of any of this, and OfBorg could also run that on all changed fixed-output derivations in a PR, ensuring the author did not screw it up regardless of any (non-)changes to derivation names.
Which would be a better solution to the stated problem, since this RFC will fail to solve it for derivations with custom `name`s anyway.
In the meantime, IMHO, since changes like #245388, #247977, #248528, #248683 are generally useful and get merged without objections, I think you should apply #49862 to your Nixpkgs, split yet-unfixed changes to `sourceRoot`s from your #153386 into a separate patchset (I think I covered most of your changes with my PRs, but I probably missed at least some of those changes), rebuild your system with #49862 applied and `nameSourcesPrettily = true` or `nameSourcesPrettily = full` set, and PR any changes you need to make it all work.
Posting a nice review in #49862 so that it would get a chance to actually get merged eventually would be nice too, I suppose.
Then, if nobody ever wants to work on the Nix issues described above, and #49862 gets merged, we could discuss setting `nameSourcesPrettily = "full"` for OfBorg as a workaround.
And, IMHO, it should be `"full"`, not `true` (which is essentially what this RFC proposes) so that changing between `fetchgit`, `fetchFromGitHub`, `fetchFromGitLab`, etc would force a recheck too.
|
I certainly think this conversation needs to be revived because of the security implications of
which leaves us open to a potential cache poisoning attack on |
@risicle You are correct, it does have security implications, but even doing the equivalent of `nameSourcesPrettily = "full"` of NixOS/nixpkgs#49862 wouldn't really help in the most general case: @tobiasBora in NixOS/ofborg#68 (comment) shows how to easily poison OfBorg's cache.
Obviously, it won't work for poisoning Hydra, since the evil PR of his method won't ever get merged.
But now thinking about this, I have an IMHO much simpler and thus scarier attack that _will_ work against Hydra:
- Malice makes an evil PR with a package update that changes the source's revision to a new version but points the hash to a hash of an older version of the same package, a version which has a well-known CVE.
- A very paranoid Nixpkgs maintainer Alice (who does not use any build caches) looks at it, it looks fine, so she builds it, and it works fine, since Alice has the older version of the source in `/nix/store`, and so she merges it.
- All tools that track unfixed CVEs in Nixpkgs only look at package versions, so those will be fooled too.
- Malice can now exploit the CVE.
In fact, if that source derivation also has a custom `name` (which is not uncommon), then _nothing_ will catch this (including me, with all my machines not using the Hydra cache and all using the equivalent of `nameSourcesPrettily = "full"` since 2018).
A scary thought.
So, I guess NixOS/nix#969 is actually a critical bug, then.
|
You're close to the scenario detailed by that author to the security team. Ultimately we can never completely stop a malicious change missing the attention of a reviewer (and this goes for all projects, hashed caching scheme or not), but we can make it so the submitter has to jump through some more hoops that will hopefully make the PR look weirder and provoke more attention. Hence small ("small") changes like this. |
The objection(NixOS/nixpkgs#49862 (comment) and NixOS/nixpkgs#59858 (comment)) is no longer relevant, as |
I still think these are two different points:
- Changing most source derivation names no longer breaks most builds, yes.
(Some do depend on customly named source derivations still, AFAIK. E.g. Apache Arrow, Datafusion, etc.)
(Also, as the author of both ccbb065c88080966e14872ea9e0ac577f753d902 and 775f21b9fdc1f2bb46518a575339fac8ff867959 in Nixpkgs I feel it kind of a conflict of interest to push for this for those reasons.)
- But moving away from `*-source` naming scheme will still make using flakes inconvenient.
|
Custom
Good point! That would be an unfortunate cache miss. Nevertheless, isn't it "the Nixy way" to pursue dependency isolation even at the cost of rebuilds (i.e. changing the version of compiler used to compile Bash triggers a world rebuild)? |
I wrote a long post with pros/cons of different possible solutions in NixOS/nix#969 (comment)
So, to
That would be an unfortunate cache miss. Nevertheless, isn't it "the Nixy way" to pursue dependency isolation even at the cost of rebuilds (i.e. changing the version of compiler used to compile Bash triggers a world rebuild)?
my answer is that, personally, I would like my build machines to always check all new fixed-output things coming from Nixpkgs, but never check those made by me (unless I ask for it explicitly).
The linked comment describes one way to implement that.
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-05-14/45414/1 |
RFCSC: @oxij We notice that you didn’t accept the nomination, but keep in mind that you don’t need to agree with the RFC to be a shepherd, contrasting views can be very helpful. The role of the shepherd team is to decide whether the community as a whole would benefit from the RFC, having different views in the shepherd team can be helpful to ensure the entire community is heard. Please reconsider if you’d like to be a shepherd after all and let us know. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-05-28/46113/1 |
@kevincox
Alternatively, someone could just merge a noop-by-default NixOS/nixpkgs#316668 (which I split off from NixOS/nixpkgs#49862) which will also optionally implement this RFC for anyone with `nameSourcesPrettily = true` in their `config.nix` of Nixpkgs or `nixpkgs.config` of `configuration.nix`.
Then, someone could merge the latest incarnation of NixOS/nixpkgs#49862 (which still needs to go to `staging`, given it is a mass-rebuild) to make all `fetch*` derivation names consistent across the whole Nixpkgs.
Which will then reduce this RFC to an argument about changing the default value of `nameSourcesPrettily`.
This seems a much better plan to me, since with those PRs merged both sides of the argument can get what they want at any time by changing one `config` option, and you can actually try out what both sides want first and check if the trade-offs this discusses matter to you.
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-06-10/46817/1 |
I don't see much point in merging an option if we don't know that we want to use it. Otherwise we need to consider taking it back out if it ends up that this isn't desirable. I would only recommend merging the implementation first if there are unknowns that would have a major impact on the RFC itself and that can not be properly tested without shipping to most users behind a config option. I would recommend going forward with the RFC, then if it is accepted the mentioned PRs should be fairly easy to get in. (Code quality concerns would still have to be addressed of course, but the intention will be known to be desirable.) |
Agreed, the main reason why I drafted the RFC is for correctness. I would like for it to be opt-out rather than opt-in. Using the opt-in might be nice for PR stabilization, but I wouldn't want that for an end state. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-06-24/47589/1 |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-07-08/48678/1 |
I'm going to resign as shepherd, since I don't know how to do the job. I agree with this RFC broadly that making, using, and updating fixed-output derivations is too hard in Nix and Nixpkgs. There are a lot of good suggestions here about how to make it better. Without taking a strong position on any particular suggestion, it's a shame that doing these sorts of upgrades on the status quo are so difficult, and the levers of action seem both far away and not able to be pulled by roughly anybody. I do see some positive action in Lix-land: https://gerrit.lix.systems/c/lix/+/1536 |
Ah that's unfortunate, thanks for being explicit about stepping down. Shepherding has an (imo) decent explanation here:
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-08-05/50170/1 |
RFCSC: This RFC is being closed due to lack interest. If enough shepherds are found this issue can be reopened. If you don't have permission to reopen please open an issue for the NixOS RFC Steering Committee linking to this PR. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2024-08-19/50831/1 |
Make fetchFromGithub FOD name more meaningful. This avoids stale artifacts and gives more content fidelity when looking at nix store paths.
Rendered: https://github.com/jonringer/rfcs/blob/jringer/fetch-from-github/rfcs/0171-fetch-from-github.md
Items for further refinement:
src
label