RFC: ensure-style options in NixOS modules #206467

Open · RaitoBezarius opened this issue Dec 16, 2022 · 19 comments

@RaitoBezarius (Member) commented Dec 16, 2022

Context & motivation

In NixOS, a desirable property is that the current state of the system configuration is a pure function of the Nix expression evaluated.

For example, NGINX virtual hosts are directly a pure function of the Nix expressions describing them.

Another example would be that, under users.mutableUsers = false;, UNIX users are directly a pure function of the Nix expressions describing them, including their attributes. (Please correct me if this is wrong.)

A non-example of this is services.postgresql.ensureUsers: it is possible to manually remove a PostgreSQL user and perform multiple rebuild switches without reviving the user in question, thereby creating a drift between the NixOS expression and the actual PostgreSQL configuration.
The same goes for hardware.ensurePrinters (which attempts to reconcile the expression with reality, without removing any printers, though).
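
For concreteness, a minimal sketch of the kind of declaration being discussed (database and user names are illustrative):

```nix
{
  services.postgresql = {
    enable = true;
    # Declared once; the setup logic only ever creates, it never removes.
    ensureDatabases = [ "myapp" ];
    ensureUsers = [ { name = "myapp"; } ];
  };
}
```

Dropping the myapp role by hand afterwards is, as described above, not necessarily undone by later rebuilds, which is exactly the drift being discussed.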

This can be generalized to all kinds of options that "prefill" data or state, which could be seen as "static configuration state" (e.g. what my users are, what their permissions are, etc.) but could also have been dynamic.

Recently, there has been some activity to extend ensure-style options to existing NixOS modules for the sake of usability; under "nice assumptions" (no manual removal, not too much buggy software, etc.), they would even respect the "purity" predicate given above (i.e. the configuration state is a pure function of the NixOS expression), see:

Arguments against the proliferation of these options

In #164235, @aanderse argued against these kinds of options because they break many assumptions people tend to have about a NixOS system, and recommended using tooling that would actually try to reconcile the state, e.g. Terraform (or NixOps, I would say).

Open questions

Personally, I would argue that there are practical advantages to limiting the amount of tooling used to deploy a system, and the Terraform/NixOS integration is not necessarily optimal; therefore, I think this matter deserves more discussion.

(1) Who is using NixOS with strong assumptions based on the fact that they can derive more or less static configuration state from the NixOS expression? Is there a term for this property that we can start using in the community and document?
(2) Should we introduce a tainting mechanism whenever a property-breaking option is used, so that this group of users can isolate those systems? Should we do nothing and just actively try to remove these mechanisms? What is a good story for these competing needs?
(3) What is an acceptable way to perform these "reconciliation" operations à la Kubernetes/Terraform/Ansible in NixOS? Should we start work on a framework to contribute those to nixpkgs?

Another problem, related but not directly connected, is the "automatic migration" mechanism that tends to be present in NixOS modules for simplicity but creates real issues in combination with the rollback feature: e.g. you upgrade Gitea, Gitea is broken, you roll back, and Gitea revision N - 1 is not forward compatible with the new DB schema, so Gitea is broken on the previous revision too. I do think answering the questions here would provide some insights into this problem as well.

@schuelermine (Contributor)

Maybe this is best submitted to the https://github.com/NixOS/rfcs repository?

@RaitoBezarius (Member, Author)

Maybe this is best submitted to the https://github.com/NixOS/rfcs repository?

I do not plan to write an RFC on the subject right now; it's an RFC as in "request for comments" from the whole community. If it turns out we actually want to write a proper RFC, we can do it later. :)

@schuelermine (Contributor)

You can write a silly, uninformed, non-community-consensus driven RFC all you want. I have done so three times :)

@grahamc (Member) commented Dec 17, 2022

ensure smacks of convergent configuration management. NixOS is special because it describes what is.

@roberth (Member) commented Dec 17, 2022

ensure smacks of convergent configuration management. NixOS is special because it describes what is.

I agree that on the user-facing side of NixOS we should try to avoid convergent options where possible.
mutableUsers shows that we can make convergent logic behave declaratively. I don't know how we got there, but I suspect that a lot of trial, error and contributions were involved. It would be great for NixOS to offer both a conceptual and technical framework to support the mixing of declaratively managed data and live data.
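
As a concrete reference point, the declarative user handling mentioned here looks roughly like this (user name and groups are illustrative):

```nix
{
  # With mutableUsers = false, the user and group databases are regenerated
  # from the configuration on every activation; accounts added manually with
  # useradd do not survive the next rebuild.
  users.mutableUsers = false;
  users.users.alice = {
    isNormalUser = true;
    extraGroups = [ "wheel" ];
  };
}
```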

@costrouc (Member)

I personally find myself wanting that extra small bit from the NixOS modules, especially around managing state (Ansible and Terraform feel like way too heavy a dependency, especially since I already have NixOS). I get that trying to declaratively manage something that stores state is not perfect, but as others have mentioned, it is already done throughout Nix.

As a developer I often need to link an authentication store + database + message queue + web server, and NixOS is SO close to being able to manage this fully declaratively. That it is not perfect is something the people using these tools will need to be aware of.

@zimbatm (Member) commented Dec 28, 2022

Whether we want it or not, having some sort of state is inevitable. We still want to push as much of the configuration as possible to be congruent in the /nix/store, but it would be nice if we had tighter control over that state.

On my machine, /run/current-system/activate is 479 lines of bash, pushing state around on the machine. Most of the data in /var is loosely managed. It would be nice to have some declarative tool that does something close to terraform, but only for local resources.
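
Some of that /var state is already nudged into place declaratively via tmpfiles rules, which hints at what a more principled local-resource tool could generalize (path and names are illustrative):

```nix
{
  # Declaratively ensure a state directory exists with the right ownership;
  # the contents of the directory remain unmanaged.
  systemd.tmpfiles.rules = [
    "d /var/lib/myapp 0750 myapp myapp -"
  ];
}
```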

@felschr (Member) commented Jan 14, 2023

Similar to the allowUnfree setting, there could be something like allowScriptedState to opt in to or out of this behaviour.
This could be a good central place to make users aware of and help them understand the risks of using ensure-style and similar options, especially if we decide to make this setting opt-in in the future.
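
As a sketch of what this could look like (the second option is hypothetical and does not exist today):

```nix
{
  # Existing escape hatch for unfree packages:
  nixpkgs.config.allowUnfree = true;

  # Hypothetical analogue for stateful, ensure-style behaviour;
  # illustrative name only, not an existing NixOS option:
  system.allowScriptedState = true;
}
```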

@bjornfor (Contributor)

A non-example of this is services.postgresql.ensureUsers: it is possible to manually remove a PostgreSQL user and perform multiple rebuild switches without reviving the user in question, thereby creating a drift between the NixOS expression and the actual PostgreSQL configuration.

Wow, that's horribly broken. We cannot base the decision on whether we should (not) have ensure* options in NixOS on that broken behaviour. I'd expect ensure* to reproducibly configure their little state area upon nixos-rebuild, while merging/allowing other state changes on the side. Like users.mutableUsers = true.

What's the alternative if we don't have ensure* options? Configure that state manually? 😱 (I came here because I was looking for a way to configure minio buckets declaratively -- I won't use it if I have to do manual configuration.)

@roberth (Member) commented Apr 13, 2023

What's the alternative if we don't have ensure* options? Configure that state manually? 😱

Putting in the effort and making the module implementations more stateful, just like mutableUsers.

Nix isn't magic. Stateless or declarative doesn't mean no state. What Nix does is it reduces the deployed software variables (from traditional package management entropy) into a single variable: a profile. It does not magically reduce the other variables; that would be called "catastrophic data loss".

If NixOS only manages the profile, we've done a shit job. And that's ok. This is OSS, and something is better than nothing, but damn. We take away control over individual services by making the whole system software into a single variable, but then we don't help out with the actual setup and migrations? How's that supposed to be any better?

Now it's not all bad. We do have a good example: mutableUsers, and we could probably come up with some guidelines for how to deal with actually interesting state. And we have this amazing tool called the NixOS test framework. We can TDD the shit out of this problem.
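
A minimal sketch of what such a test could look like with the existing framework (node configuration and assertions are illustrative):

```nix
# Boots a VM with a declarative user database and checks the result.
pkgs.testers.runNixOSTest {
  name = "declarative-users-smoke";
  nodes.machine = {
    users.mutableUsers = false;
    users.users.alice.isNormalUser = true;
  };
  testScript = ''
    machine.wait_for_unit("multi-user.target")
    machine.succeed("id alice")
  '';
}
```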

@RaitoBezarius (Member, Author)

I totally agree with @roberth, and I would go further: in my personal opinion, this is an open area of research to get things right, and Nix is uniquely positioned to experiment with something new in this regard (like designing ways to compose small primitives to converge such state).

Of course, it's a matter of time and of striking a balance between "this option is hard, requires extra carefulness, scrutiny and tests" and "this option is easy and can be added in a harmless way", and that's why I am a bit picky about which ones to accept and which not, also because the data loss is really annoying.

@bendlas (Contributor) commented Nov 10, 2023

ensure smacks of convergent configuration management. NixOS is special because it describes what is.

It does and it is. 2 points:

  • I'd still rather have a principled form of convergent configuration than just a bunch of scripts. But I agree that at NixOS we don't do "good enough", we do "necessarily radical, otherwise pragmatic".
  • Do you think there is a fighting chance to ever make upgrading state as atomic and robust as nixos-rebuild switch/boot, @grahamc?
    The main problem I'm seeing is: The content of my database will never be declarative, unless I allow my users to declare stuff. Even with the best system, I'll have to restore a backup of my actual data, instead of just unfolding the latest descriptor into a new instance. Otherwise, it would mean 1 database transaction == 1 backup snapshot ... I mean that would be like trying to rewrite the RPATHs of every single binary and expecting the result to work 😈

@bendlas (Contributor) commented Nov 10, 2023

Thinking about it, not even switch/boot is fully atomic, and thus not fully declarative: what if something goes wrong after setting /nix/var/nix/profiles and before/during writing the bootloader? What about buggy state transitions in switch?

Maybe it's time to include the update process in the "what is" that @grahamc mentioned?

What if we approach the problem as a generalization of the bootloader problem, and have nixos-rebuild declare a sort of "letter of intent" of pending stateful actions, each one monitored to hell and back as well as preferably undoable?

One huge benefit would be that this allows checking for more errors that wouldn't show up during eval before actually eating them at runtime; see https://matrix.to/#/!aGqRytqbCECitOFhbt:nixos.org/$-Luxmact8b2jLtsAnJxBIZM9FDi2C7Gmx0ypVPLnF2E?via=matrix.org&via=lpc.events

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/breaking-changes-announcement-for-unstable/17574/39

RaitoBezarius pushed a commit to Ma27/nixpkgs that referenced this issue Nov 13, 2023
…sql15

Closes NixOS#216989

First of all, a bit of context: in PostgreSQL, newly created users don't
have the CREATE privilege on the public schema of a database even with
`ALL PRIVILEGES` granted via `ensurePermissions` which is how most of
the DB users are currently set up "declaratively"[1]. This means e.g. a
freshly deployed Nextcloud service will break early because Nextcloud
itself cannot CREATE any tables in the public schema anymore.

The other issue here is that `ensurePermissions` is a mere hack. It's
effectively a mixture of SQL code (e.g. `DATABASE foo` is relying on how
a value is substituted in a query. You'd have to parse a subset of SQL
to actually know which objects permissions are granted to for a user).

After analyzing the existing modules I realized that in every case with
a single exception[2] the UNIX system user is equal to the db user is
equal to the db name and I don't see a compelling reason why people
would change that in 99% of the cases. In fact, some modules would even
break if you'd change that because the declarations of the system user &
the db user are mixed up[3].

So I decided to go with something new which restricts the ways to use
`ensure*` options rather than expanding those[4]. Effectively this means
that

* The DB user _must_ be equal to the DB name.
* Permissions are granted via `ensureDBOwnership` for an attribute-set in
  `ensureUsers` (sketched below). That way, the user is actually the owner
  and can perform `CREATE`.
* For such a postgres user, a database must be declared in
  `ensureDatabases`.
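
Put together, a configuration satisfying these constraints looks roughly
like this (a sketch; the service/database name is illustrative):

```nix
{
  services.postgresql = {
    enable = true;
    ensureDatabases = [ "nextcloud" ];
    ensureUsers = [
      {
        name = "nextcloud";        # must equal the database name
        ensureDBOwnership = true;  # the user becomes the owner, so CREATE works
      }
    ];
  };
}
```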

For anything else, custom state management should be implemented. This
can either be `initialScript`, doing it manually outside of the module, or
implementing proper state management for postgresql[5]; the current state
of `ensure*` isn't even declarative, but a convergent tool, which is what
Nix actually claims to _not_ do.

Regarding existing setups: there are effectively two options:

* Leave everything as-is (assuming that system user == db user == db
  name): then the DB user will automatically become the DB owner and
  everything else stays the same.

* Drop the `createDatabase = true;` declarations: nothing will change
  because a removal of `ensure*` statements is ignored, so it doesn't
  matter at all whether this option is kept after the first deploy (and
  later on you'd usually restore from backups anyways).

  The DB user isn't the owner of the DB then, but for an existing setup
  this is irrelevant because CREATE on the public schema isn't revoked
  from existing users (only not granted for new users).

[1] not really declarative though because removals of these statements
    are simply ignored for instance: NixOS#206467
[2] `services.invidious`: I removed the `ensure*` part temporarily
    because it IMHO falls into the category "manage the state on your
    own" (see the commit message). See also
    NixOS#265857
[3] e.g. roundcube had `"DATABASE ${cfg.database.username}" = "ALL PRIVILEGES";`
[4] As opposed to other changes that are considered a potential fix, but
    also add more things like collation for DBs or passwords that are
    _never_ touched again when changing those.
[5] As suggested in e.g. NixOS#206467
@ibizaman (Contributor) commented Nov 13, 2023

In my experience at work, state management is hard. That Nix has a hard time dealing with it does not mean in the slightest that Nix is not good. With that out of the way, here are a few interesting state migration examples and principles I read about and rediscovered at work:

  • To automatically migrate (and downgrade) a database, you need an up (and down) script that transitions the database to the new (or the old) version. Taking the Gitea example above, it seems they don’t provide a down script. If they don’t, there’s nothing we at Nix can do there. Also, applying a script can take a while and render your system useless while the migration happens, if not done correctly.
  • You can backup before each migration. This could help with the down script missing. But it’s sometimes not doable. Restoring the db could be a very costly operation that requires a lot of downtime. It shouldn’t, but if you’re talking about Tbs of data, then everything takes time. Also you lose any changes done by users when you restore backups.
  • We rarely update software and database state at the same time. Let’s take a concrete example which updates a database column type. We usually do it like so:
    1. Add a new column with the new type with NULL data in it.
    2. Update software to use new column and fallback to previous column if the value is NULL.
    3. Run a background script copying data from the old column to the new one, transforming the data as required.
    4. Make sure everything runs fine with the new column with a canary, smoke tests, etc.
    5. Update code to only use new column.
    6. Drop column from table.
      Each step should be tested extensively too. Each step becomes more complicated if you have multiple nodes in your cluster, sharding, AZ replication, a lot of data.

I’m sure I’m forgetting a lot here as I’m writing from memory. But I want to convey that IME database updates are rarely free, snappy and easy. I’m not sure how all these steps can happen in one deploy command. Nor how Nix can solve this problem space.

The main issue IMO is Nix automatically produces a diff of old state to new state which works great to deploy new stateless binaries. But to deploy stateful stuff, we should be able to let the user override the internal automatic diffing and describe how the migration should happen, in multiple discrete steps.

Also, it doesn’t help that when you update to a new nixpkgs commit, you update everything at once. It would be nice to be able to update to a new nixpkgs commit but be able to apply changes with other strategies than « all at once », like one binary at a time.

Another tangential comment, which is in no way a criticism: we should invest more in our observability tooling. Most of the big open source projects I host do not provide a good story there. We should embrace extensive tracing and metrics on top of structured logging. This would help tremendously IMO.

@fricklerhandwerk (Contributor)

Related: NixOS/rfcs#155

@RaitoBezarius (Member, Author)

In my experience at work, state management is hard. That Nix has a hard time dealing with it does not mean in the slightest that Nix is not good. With that out of the way, here are a few interesting state migration examples and principles I read about and rediscovered at work:

Thank you for your lengthy feedback; I mostly agree with you (except on some points where I don't see a compelling case for agreeing).

  • To automatically migrate (and downgrade) a database, you need an up (and down) script that transitions the database to the new (or the old) version. Taking the Gitea example above, it seems they don’t provide a down script. If they don’t, there’s nothing we at Nix can do there. Also, applying a script can take a while and render your system useless while the migration happens, if not done correctly.

Yep, it's a classical problem, and why we should probably invest in standardizing automaticMigrations knobs for each NixOS module to enable manual migrations; of course, some of them don't, which leads us to the next point you wrote.
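
The shape I have in mind would be something along these lines (hypothetical option, purely illustrative; no such option exists today):

```nix
{
  # Hypothetical per-module knob: opt out of automatic schema migrations
  # and handle them manually instead.
  services.gitea.automaticMigrations = false;
}
```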

  • You can backup before each migration. This could help with the down script missing. But it’s sometimes not doable. Restoring the db could be a very costly operation that requires a lot of downtime. It shouldn’t, but if you’re talking about Tbs of data, then everything takes time. Also you lose any changes done by users when you restore backups.

That is true, but this is a self-inflicted limitation. You don't need to pay the backup cost (though you should always pay it for disaster recovery recipes anyway, right?); you can simply reuse your filesystem or rely on application-specific backup/snapshot technologies.

It's not always clear to a user, but there's a treasure trove of technologies we are usually sitting on and not using; massaging filesystem snapshots to make use of them in this context is a trivial example of that. (And yes, you can make the database cooperate with flushing the pages, etc. It requires work, but it's not hard.)

The delta lost is unfortunate but also inevitable: why would you even expose your application to your users if you didn't finish validating the deployment? And if you do so because there is a weird bug in the application, we are in the set of cases where it will be almost impossible to automate any meaningful answer to this problem and manual work is required every time, so I would say that losing a delta of your state by rolling back a half-broken application is out of scope.

We can tolerate broken applications, but it is very complicated, if not impossible, to tolerate half-broken applications. Failure modes are part of engineering, and failing hard is important. Failure to do so, well… will create disasters.

Disasters require disaster recovery plans, and no automation can save us from that; we can just make it easier at most.

  • We rarely update software and database state at the same time. Let’s take a concrete example which updates a database column type. We usually do it like so:

    1. Add a new column with the new type with NULL data in it.
    2. Update software to use new column and fallback to previous column if the value is NULL.
    3. Run a background script copying data from the old column to the new one, transforming the data as required.
    4. Make sure everything runs fine with the new column with a canary, smoke tests, etc.
    5. Update code to only use new column.
    6. Drop column from table.
      Each step should be tested extensively too. Each step becomes more complicated if you have multiple nodes in your cluster, sharding, AZ replication, a lot of data.

I’m sure I’m forgetting a lot here as I’m writing from memory. But I want to convey that IME database updates are rarely free, snappy and easy. I’m not sure how all these steps can happen in one deploy command. Nor how Nix can solve this problem space.

I respectfully disagree on "rarely free, snappy and easy": in my experience, they are almost always free, snappy and easy; they rarely fail!

The problem is that when they fail, it's rarely actionable because people are not used to them failing. People are not aware of everything that can go wrong. That's understandable; it mostly goes right.

"Solving a problem space" makes little sense to me. The Nix expression language trivially decreases (as demonstrated by now) the difficulty of intertwining complex application dependencies at the meta-level in a reusable fashion, via the NixOS module system (sometimes called an expert system).

The Nix expression language can also trivially decrease the difficulty of taming the state convergence situation by providing abstractions to describe state convergence at the NixOS module level and letting it be an emergent (complex) system.

Of course, this is not for the faint of heart and will probably never be useful for AZ replication/sharding use cases for now, as we don't even have "remote systemd" (which is key to enabling Kubernetes-style use cases natively with Nix), but I bet this can tremendously help for the rare cases where it fails, because those cases are usually simple and easy.

Even backing up automatically before performing a state transition would be largely welcomed by many of us, because we have the backup storage and just don't have the opportunity to enable such measures.
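
A rough sketch of what "back up automatically before a state transition" could look like at the module level today (the service name and paths are illustrative):

```nix
{ pkgs, ... }:
{
  # Take a tarball of Gitea's state directory before the service (and any
  # automatic schema migration it runs on startup) is started.
  systemd.services.gitea-pre-migration-backup = {
    before = [ "gitea.service" ];
    requiredBy = [ "gitea.service" ];
    path = [ pkgs.gnutar pkgs.gzip ];
    serviceConfig.Type = "oneshot";
    script = ''
      mkdir -p /var/backup/gitea
      tar czf /var/backup/gitea/pre-migration-$(date +%s).tar.gz -C /var/lib gitea
    '';
  };
}
```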

The main issue IMO is Nix automatically produces a diff of old state to new state which works great to deploy new stateless binaries. But to deploy stateful stuff, we should be able to let the user override the internal automatic diffing and describe how the migration should happen, in multiple discrete steps.

Nix is able to diff closures (of .drvs or realized store paths). Not state. It is us who decide to give meaning to a diff of closures.

For what it's worth, we can totally introduce more steps into the switch-to-configuration.pl logic to let users override the automatic diffing and perform policy-based deployments; I will mention NixOS/nixops#1245 here again.

There is a tension between NixOS being a normal operating system and NixOS completely embracing operations-style logic and offering it by default with an empty policy, which would make it trivial for people like us to implement our own policies on top of that.

More advanced convergence engines could be built on top of the existing pieces; there's no reason to use the default one provided by NixOS, and no reason to have a unique implementation. What we need, though, is to capture the expressivity required to understand what it means to perform state convergence, and to express database migrations or anything else simply as an act of state convergence.

Also, it doesn’t help that when you update to a new nixpkgs commit, you update everything at once. It would be nice to be able to update to a new nixpkgs commit but be able to apply changes with other strategies than « all at once », like one binary at a time.

I don't think "one binary at a time" will ever make sense for NixOS, partial updates are physically impossible for a good reason. What you are looking for though is a way to keep old systemd units running with their old paths and swap them with their new version, one at a time.

But you need to bring your own rollback policy in case of failures to roll out.

Another tangential comment, which is in no way a criticism: we should invest more in our observability tooling. Most of the big open source projects I host do not provide a good story there. We should embrace extensive tracing and metrics on top of structured logging. This would help tremendously IMO.

I am not sure I understand how this is related to the matter, though. This is a problem of the software you are running and should be tracked somewhere else (even in the issue tracker of the software you are using!).

OTEL and whatnot are available in nixpkgs within reasonable limits; we are not really the place to hold upstreams accountable for this. :)

@ibizaman (Contributor) commented Nov 21, 2023

Thanks for answering and opening my eyes, see below. Btw, I didn’t copy here what I agree with.

That is true, but this is a self-inflicted limitation. You don't need to pay the backup cost (though you should always pay it for disaster recovery recipes anyway, right?); you can simply reuse your filesystem or rely on application-specific backup/snapshot technologies.

Not disagreeing, but I just wanted to clarify that the cost I was thinking about at the time of writing was the time it takes to make a backup, which can be long. This makes deployments cumbersome if backups happen before deploying.

It's not always clear to a user, but there's a treasure trove of technologies we are usually sitting on and not using; massaging filesystem snapshots to make use of them in this context is a trivial example of that. (And yes, you can make the database cooperate with flushing the pages, etc. It requires work, but it's not hard.)

I never really considered using the file system for this. You're referring to snapshots like those in ZFS or LVM, right? I've been reading about that now, and I imagine we could just create a snapshot every time we deploy, which is a very quick operation. This makes me want to redo my whole backup strategy 😁 I wonder if we could even have a dataset (in ZFS terminology) per application we deploy, which could allow pretty seamless relocation of the app.
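
For reference, the per-application dataset idea can already be expressed in the configuration (pool and dataset names are illustrative):

```nix
{
  # One ZFS dataset per stateful service; each can be snapshotted, rolled
  # back, or sent elsewhere independently of the rest of /var.
  fileSystems."/var/lib/nextcloud" = {
    device = "rpool/safe/nextcloud";
    fsType = "zfs";
  };
}
```

A snapshot before each deploy is then just e.g. zfs snapshot rpool/safe/nextcloud@pre-deploy.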

The delta lost is unfortunate but also inevitable: why would you even expose your application to your users if you didn't finish validating the deployment? And if you do so because there is a weird bug in the application, we are in the set of cases where it will be almost impossible to automate any meaningful answer to this problem and manual work is required every time, so I would say that losing a delta of your state by rolling back a half-broken application is out of scope.

Makes sense.

I respectfully disagree on "rarely free, snappy and easy": in my experience, they are almost always free, snappy and easy; they rarely fail!

The problem is that when they fail, it's rarely actionable because people are not used to them failing. People are not aware of everything that can go wrong. That's understandable; it mostly goes right.

I'm really curious how you do the kind of deploy I mentioned. Like you said, the issue is when they fail. We split each step into its own deploy, so if something goes wrong, we know without doubt which step failed. But also, reasoning about failure modes is easier if a deploy step does not have too many moving parts.

Nix is able to diff closures (of .drvs or realized store paths). Not state. It is us who decide to give meaning to a diff of closures.

For what it's worth, we can totally introduce more steps into the switch-to-configuration.pl logic to let users override the automatic diffing and perform policy-based deployments; I will mention NixOS/nixops#1245 here again.

Ah yes, I meant the same. We can do whatever we want, but it's not there yet. I expressed myself badly; I didn't mean that the current behavior is set in stone.

I don't think "one binary at a time" will ever make sense for NixOS, partial updates are physically impossible for a good reason. What you are looking for though is a way to keep old systemd units running with their old paths and swap them with their new version, one at a time.

But you need to bring your own rollback policy in case of failures to roll out.

I mostly agree. I could imagine deploying updates to Nextcloud and Home Assistant independently because they don’t relate to each other.

At first, I tried to deploy my server using disnix which does what you describe pretty well. It has some shortcomings I couldn’t push through so I switched to a more classic deploy system. But I really liked disnix.

OTEL and whatnot are available in nixpkgs within reasonable limits; we are not really the place to hold upstreams accountable for this. :)

Yes, I'm not sure how my last rant related to the issue at hand. I think I was thinking that having proper monitoring helps to know whether a new deploy behaves correctly and whether we need to roll back or not. I'm sad so few apps I use have any introspection features.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/what-about-state-management/37082/1
