System-agnostic builders #6697

Open
YorikSar opened this issue Jun 21, 2022 · 8 comments
Labels
feature Feature request or proposal significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.

Comments

@YorikSar
Contributor

Problem description

nixpkgs provides a set of trivial builders that allow you to create derivations for simple tasks like:

  • writing a file to the store
  • writing a shell script to the store
  • creating a bunch of symlinks in the store

All those tasks are very simple, and yet they are implemented as shell scripts that run on the specific system that nixpkgs has been instantiated with. It means that every time I want to rebuild a NixOS system configuration on my macOS laptop, I have to:

  • fetch all binary packages from the cache on my laptop,
  • fetch them again on some Linux builder (or push them there),
  • all this to create a number of small text files and symlinks on the Linux builder,
  • and then fetch them back to my laptop.

This also means that I cannot create a NixOS VM without having a Linux builder somewhere (see discussion in NixOS/nixpkgs#108984), even though you don't really need to run any Linux-specific tasks to build it, unless you're doing something exotic with your packages.

We could work around this by overriding these trivial builders depending on the system where nixpkgs is being evaluated, but then we would get different closures on different systems, because we don't trust that cat, mkdir, mv or ln produce the same result on Linux as on macOS.

Proposed solution

I want to be able to create derivations that can run on any system where Nix can run, so that these most trivial builders don't require Linux to run. Because we still need to make sure that these builders produce reproducible results, we can't run native binaries as no amount of sandboxing can provide enough guarantees for those.

I propose to use a WASM virtual machine for this. With WASI having quite enough APIs, I think we could allow the user to specify a wasm32-wasi platform and provide a .wasm binary as a builder (or maybe .wat as well?). Then we can use wasmtime to run these builders, providing only a minimal set of capabilities (pretty much only access to files and environment variables from the derivation). We could also embed wasmtime in Nix, but that feels like unnecessary bloat. We should probably just allow the user to configure the path to the wasmtime binary, and then allow wasm32-wasi to run on this host.
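
As a rough illustration of what this proposal might look like from the Nix side, here is a hypothetical derivation. Nothing in it works today: `wasm32-wasi` as a builder platform, running the module through wasmtime, and the `writeTextWasm` argument are all assumptions made for the sketch.

```nix
# Hypothetical sketch: Nix cannot execute wasm32-wasi builders today.
# `writeTextWasm` is an assumed argument: the path of a prebuilt WASI module
# that writes the `text` environment variable to `$out`.
{ writeTextWasm }:

derivation {
  name = "motd";
  # Proposed platform name: identical on Linux and macOS hosts,
  # so the derivation does not depend on where it is evaluated.
  system = "wasm32-wasi";
  builder = writeTextWasm;
  # Plain attributes are passed to the builder as environment variables.
  text = "Hello from any platform\n";
}
```

The point of the sketch is that `system` no longer names a host platform, so the same derivation could in principle be built on whichever machine is doing the evaluation.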

Note that in this scenario builders themselves will most likely be built in platform-dependent derivations, but they will most likely be cached (and with CA we could reuse them for all platforms) and can be used on any system afterwards. You won't have to build writeText.wasm for a NixOS configuration on a macOS machine, but you will be able to use it there. Although for such simple cases we might want to emit the WASM text format (.wat) from a Nix specification.

Possible alternatives

  • We could allow specifying different native builders for different platforms in a derivation (essentially, allow simple cross-compilation at this point), but that would require us to trust these binaries to produce the same results, which is not acceptable.
  • We could embed some scripting language (Lua? "Nix with side effects"?) but that would prevent us from reusing existing tools written in other languages that can be compiled to WASM.
  • We could add more builtin builders to Nix, but this seems similar to the previous one.
@aakropotkin
Contributor

aakropotkin commented Jun 22, 2022

In cases like this I write derivations that directly invoke things like cp, ln, tar, etc. These on their own aren't platform agnostic, but because they don't have the overhead of mkDerivation as runCommandNoCC does - they're incredibly fast. Composing these with built-ins can avoid a lot of the sluggishness you normally see with Nixpkgs derivations.
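
A minimal sketch of this pattern, assuming `<nixpkgs>` is available and using `hello` purely as a placeholder link target. It is not taken from the repositories linked in this comment.

```nix
# A "direct" derivation: the builder is `ln` itself, with no stdenv or bash setup.
{ pkgs ? import <nixpkgs> { } }:

derivation {
  name = "hello-link";
  # Still tied to one platform, because `ln` is a native binary.
  system = pkgs.stdenv.buildPlatform.system;
  builder = "${pkgs.coreutils}/bin/ln";
  # Equivalent to: ln -s <hello store path> <this derivation's output path>
  args = [ "-s" "${pkgs.hello}" (builtins.placeholder "out") ];
}
```

This keeps the derivation small, but the `system` attribute still pins it to one platform.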

For system agnostic fetchers we have builtins btw. Additionally you have toFile which is platform agnostic - this alone might solve a lot of your use case. Similarly the built-ins path, storePath, filterSource, and fetchTree are all platform agnostic AFAIK and they absolutely fly. The built-in fetchers blow nixpkgs.fetch* out of the water.
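
For example, a small sketch using two of these built-ins (the file name, contents and `./.` path are placeholders):

```nix
let
  # Written to the store during evaluation; no builder and no `system` involved.
  motd = builtins.toFile "motd.txt" "Hello from any platform\n";

  # Copies a local directory into the store during evaluation.
  # `./.` (the directory containing this file) is only a placeholder.
  src = builtins.path {
    path = ./.;
    name = "example-src";
  };
in
{ inherit motd src; }
```

One limitation to keep in mind: `toFile` rejects strings that refer to derivation outputs, so it cannot stand in for `writeText` in every case.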

nixpkgs.lib.cli even has a few QoL functions that make "direct" derivations painless.

github.com/aakropotkin/ak-nix under pkgs/build-support/trivial has examples. This one is for ln https://github.com/aakropotkin/ak-nix/blob/main/pkgs/build-support/trivial/link.nix

And for tar https://github.com/aakropotkin/ak-nix/blob/main/pkgs/build-support/trivial/tar.nix

I've used this same pattern for gcc, and several other CLI tools with great results.
gcc + ar: https://github.com/aakropotkin/gourou-nix/blob/main/support.nix

I went wild with fetchurl and fetchTree in this project building a partial replacement for NPM/Yarn ( in progress, alpha ) https://github.com/aameen-tulip/at-node-nix/blob/main/lib/registry.nix this is a good starting point but this whole repo has examples of using minimal derivations or raw built-ins ( everything in lib is system agnostic ) to build Node.js projects with registries and local trees.

I guess my point is: we have most of these features, they're just scarcely used in Nixpkgs and there aren't good guides or best practices on a lot of them, which might be a task worth focusing on. fetchTree being undocumented is a travesty because it's literally one of the most useful built-ins. Similarly callFlake being hidden is a shame. The built-ins path and filterSource behave like cp for single files or directories which should be clarified in docs. Generally more docs on using "raw" Nix with minimal Nixpkgs support would be awesome, since that's really where flakes shine and the eval caches become performant.

@YorikSar
Contributor Author

The problem is specifically in small derivations like these requiring a builder with specific platform, even though they could produce the same result on any platform. We could create builtin builders for them or just one that would provide some scripting interface to do simple stuff, but it seems like inventing a new VM.

@aakropotkin
Contributor

> The problem is specifically in small derivations like these requiring a builder with specific platform, even though they could produce the same result on any platform. We could create builtin builders for them or just one that would provide some scripting interface to do simple stuff, but it seems like inventing a new VM.

Gotcha. I see; yeah that makes things tough. I was fooling around with the builtin fetchurl and buildenv stuff a bit today which were interesting - but I get a strong impression that those are internal.

Yeah I agree with you, having builtins for dead simple derivations like ln, cp, fetchurl ( technically exists but is hidden ), etc could be a huge optimization for certain use cases.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-33/20048/1

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/use-a-reference-to-a-derivation-in-tofile/20259/2

@roberth
Member

roberth commented Feb 27, 2023

Just brainstorming here.

How about we add a new hashing mode to support this use case. It achieves system-independence by using an alternate hashing scheme similar to fixed-output derivations (but not actually fixed output).

As a possible design point, this has the huge benefit that it avoids having to make wasm part of the (virtual) Nix specification for which we commit to reproducibility long term.

I'll acknowledge right away that it can be abused to poke a hole in evaluation purity if you really want to, but to me this seems favorable compared to committing to a wasm implementation.

This hashing scheme:

  • excludes the builder (aka argv0) from hashing, similar to what we do for fixed-output derivations.
  • expects a function in builder, to which "currentSystem" is passed. This generates the wasm interpreter, or really any interpreter that the expression wishes to vouch system-independence for. The returned builder is never returned to the expression that invoked derivation, so that you'd have to go out of your way to find an impurity. Memoizing the wasm interpreter can be the expression's responsibility.
  • at build time, recovers the builder using the deriver field in the store db
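
A toy model of the first two points, written as a plain Nix expression. This is not how Nix computes derivation hashes; `mkBuilder`, `agnosticHash` and the interpreter paths are made up for illustration.

```nix
# Toy model only: this is not how Nix computes derivation hashes.
let
  # Hypothetical: pick a per-system interpreter; the paths are made up.
  mkBuilder = system: "/example/${system}/bin/interpreter";

  # Hash every attribute except `builder`, so the result cannot depend
  # on which interpreter was chosen.
  agnosticHash = attrs:
    builtins.hashString "sha256"
      (builtins.toJSON (builtins.removeAttrs attrs [ "builder" ]));

  mkDrvAttrs = system: {
    name = "motd";
    args = [ "write-text" "Hello\n" ];
    builder = mkBuilder system;
  };

  onLinux = agnosticHash (mkDrvAttrs "x86_64-linux");
  onDarwin = agnosticHash (mkDrvAttrs "aarch64-darwin");
in
{
  inherit onLinux onDarwin;
  # true: the two attribute sets differ only in `builder`
  same = onLinux == onDarwin;
}
```

In this model, two machines with different interpreters would agree on the identity of the output. That is also why the scheme relies on the expression vouching that the interpreters behave identically.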

Instead of currentSystem, the evaluator could potentially even take into account the system of the derivation(s) that use the system-agnostic builder, if we make the necessary changes to hold on to the builder function a bit longer. That's probably not much more than an extra string context constructor.

> that would require us to trust these binaries to produce same results, which is not acceptable

Fair, but I do think it's at least as acceptable as making wasm part of Nix. Maybe that's just me though, or maybe neither idea is acceptable. Or both? A conversation to have.

> We could add more builtin builders to Nix, but this seems similar to the previous one.

We wouldn't have to!

@thufschmitt thufschmitt moved this to ⏰ Postponed in Nix team Mar 24, 2023
@fricklerhandwerk
Contributor

Triaged in the Nix team meeting 2023-03-23:

  • Potentially a big rabbit-hole
  • We don't have the capacity to deal with this right now
  • Postpone, will reconsider if someone wants to invest time and effort in it.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-03-23-nix-team-meeting-minutes-43/26758/1

@roberth roberth added the significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. label Jun 2, 2023
@thufschmitt thufschmitt removed this from Nix team Feb 28, 2024