macOS 15 Sequoia clobbers _nixbld1-4 users #10892

Open
abathur opened this issue Jun 11, 2024 · 51 comments
Labels
installer macos Nix on macOS, aka OS X, aka darwin

Comments

@abathur
Member

abathur commented Jun 11, 2024

Note: I keep this first comment up-to-date. It includes fixes. You do not have to read the thread unless you are having trouble with its suggestions!

The macOS 15 Sequoia update takes 4 UIDs in the range we've been using, clobbering any _nixbldN users in the way (typically _nixbld1-4).

This manifests as:

  • On existing installs, build errors like: error: the user '_nixbld1' in the group 'nixbld' does not exist

    To fix this, run our migration script (which relocates/replaces _nixbld users) before or after taking the macOS 15 Sequoia update:

    curl --proto '=https' --tlsv1.2 -sSf -L https://github.com/NixOS/nix/raw/master/scripts/sequoia-nixbld-user-migration.sh | bash -
    

    Caution: If you installed Nix with a third-party installer, you should check with them for additional/different instructions.

  • On fresh installs (with unpatched installer versions), the following error creating user _nixbld1:

    <main> attribute status: eDSRecordAlreadyExists
    <dscl_cmd> DS Error: -14135 (eDSRecordAlreadyExists)
    

    While the 2.24.6+ installers are fixed, older installers don't all work at the moment. (The installer fixes for this have been backported for every release back to the 2.18 series--but these aren't all quite released yet.)

    If you're trying to install a version between 2.20.0 and 2.24.5, you can explicitly override the starting UID:

    NIX_FIRST_BUILD_UID="351" sh <(curl -L <whatever release-specific installer URL you need>)
    

    If you run into this error with versions older than 2.18, you'll need to download the installer tarball for your platform, unpack it, and update the first UID in install-darwin-multi-user.sh to 351.
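To check whether an existing install has been hit, one option is to look for gaps in the _nixbldN sequence. The following is only a minimal sketch: it assumes the default of 32 build users and the two-column "name uid" output of dscl . list /Users UniqueID, and the function name missing_nixbld_users is made up for illustration (it is not part of Nix or the migration script).

```shell
# Hypothetical helper, not part of Nix: reads "name uid" lines on stdin (the
# format printed by `dscl . list /Users UniqueID`) and prints every _nixbldN
# in 1..COUNT that is absent. COUNT defaults to 32, the installer default.
missing_nixbld_users() (
  count="${1:-32}"
  # Keep only the _nixbldN account names from the listing.
  present="$(awk '$1 ~ /^_nixbld[0-9]+$/ { print $1 }')"
  i=1
  while [ "$i" -le "$count" ]; do
    # -x matches the whole line, so _nixbld1 does not match _nixbld10.
    if ! printf '%s\n' "$present" | grep -qxF "_nixbld$i"; then
      echo "_nixbld$i"
    fi
    i=$((i + 1))
  done
)
```

On macOS this could be used as dscl . list /Users UniqueID | missing_nixbld_users; empty output would mean all 32 users are still present.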

More background/context on the issue

Edit: As the macOS release is near, I'm tucking context away to focus the first comment on how users can fix broken installs.

Reports are percolating about the upcoming macOS Sequoia 15 (from people trying the beta out) using 4 UIDs in the range we've been using:

History on our previous change and ID range selection is in:

PRs to address:

@fbettag

fbettag commented Jun 11, 2024

cat /etc/passwd on sequoia shows

_aonsensed:*:300:300:Always On Sense Daemon:/var/db/aonsensed:/usr/bin/false
_modelmanagerd:*:301:301:Model Manager:/var/db/modelmanagerd:/usr/bin/false
_reportsystemmemory:*:302:302:ReportSystemMemory:/var/empty:/usr/bin/false
_swtransparencyd:*:303:303:Software Transparency Services:/var/db/swtransparencyd:/usr/bin/false
_naturallanguaged:*:304:304:Natural Language Services:/var/db/com.apple.naturallanguaged:/usr/bin/false
_oahd:*:441:441:OAH Daemon:/var/empty:/usr/bin/false

@ryanbooker

ryanbooker commented Jun 11, 2024

Perhaps count down from 400? That's what I've done locally to fix the immediate issue. FYI, everything works with an arbitrary range, e.g. 3001–3032.

@abathur
Member Author

abathur commented Jun 11, 2024

No personal objection to counting down as a strategy, but:

  • IIRC the code that does this is in the general installer so we'd be changing it (and the "first" UID defaults) for linux as well.
  • Since the starting UIDs are overridable via environment variables, a naive flip in the order would technically be a breaking interface change for anyone actually doing scripted installs that pre-set UIDs. (No clue how common this is in the wild.) We could try to preserve the semantics/behavior of the current env by doing math on the value, but it may turn into a bit of a stumbling block over time if people get used to it going in reverse order and expect the variable to control the number it counts down from?
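As a rough illustration of the "doing math on the value" idea: if the value of NIX_FIRST_BUILD_UID keeps meaning "lowest UID in the range", counting down only changes which build user gets which UID, not which UIDs are occupied. The helper names below are hypothetical and are not taken from the installer.

```shell
# Illustrative only: these helpers are not part of the Nix installer.

# uid_counting_up FIRST N -> UID of _nixbldN when counting up from FIRST.
uid_counting_up() { echo $(( $1 + $2 - 1 )); }

# uid_counting_down FIRST COUNT N -> UID of _nixbldN when counting down,
# while still treating FIRST as the lowest UID of the occupied range.
uid_counting_down() { echo $(( $1 + $2 - $3 )); }
```

With FIRST=351 and 32 users, both variants occupy 351 through 382; counting down simply gives _nixbld1 the top of that range instead of the bottom. This keeps existing scripted installs working, at the cost of the variable name no longer describing where numbering visibly starts.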

@roberth roberth added the macos Nix on macOS, aka OS X, aka darwin label Jun 12, 2024
@roberth
Member

roberth commented Jun 12, 2024

Also affects nix-darwin

@stepbrobd
Member

Question 1:
Would it be possible to work around this without reinstalling Nix on macOS systems with the ids option (exposed but not in docs, and changing the settings doesn't change anything)?

Q1 RFC:
In this file, nixbld = 300; is set but this is not used anywhere. Perhaps we can add an idempotent shell script to add/remove nixbld group and nixbld* users on every rebuild?

Question 2:
For NixOS systems, if I'm understanding the docs correctly, nixbld* users are not needed to perform builds when auto-allocate-uids or cgroups is enabled. Is there anything equivalent to these on macOS systems?

@abathur
Member Author

abathur commented Jun 12, 2024

@michaelvanstraten noted in #6153 (comment) that you can get unblocked on new installs for testing with:

To repeat @ikuz's solution again for a quick fix on macOS 15 install/reinstall with:

 NIX_FIRST_BUILD_UID="351" sh <(curl -L https://nixos.org/nix/install)

@emilazy
Member

emilazy commented Jun 15, 2024

Some thoughts on where to reassign the IDs:

We need to be below (or equal to?) UID 400, and Apple has now used up to UID 304. Clearly we should expect they might keep adding users to the low end of this range occasionally. However, running right up against the 400 limit doesn’t seem safe to me either; /etc/group on my Sonoma machine contains groups from 395 to 400, so it seems like Apple considers the upper end of the system range to be its for the taking as well. The natural place to go, then, would be in the middle of the range.

We default to 32 build users currently. The main reason you might want more is that it limits the number of concurrent build jobs, and the number of those you might want to run is proportional to the cores/threads on your machine. The highest number of cores on a currently-shipping Mac is the 24‐core M2 Ultra, they used to ship Xeons with 48 threads (24 cores × 2 threads/core), and there are rumours of a 32‐core M3 Ultra and even a 64‐core M3 Ultimate. We can’t fit more than 96 users at the absolute maximum as of Sequoia, and that would obviously be risky, so let’s say we want to plan to have space for around 64 to 80 build users.

I suggest we start at 331 (if we want to keep the last digit matching the build user number) or 330 (if we don’t). That gives us enough margin for Apple to add ~26 new users before we run into problems again, 38 empty spaces above the top UID for the current default of 32 users, and just enough space to squeeze in 64 users before hitting the 395 ID that Apple has already used for a group.

The other good candidate would be 321/320; that reduces the margin on the low end to ~16 new Apple users, but increases the margin on the upper end, to (if using 320) just barely allow squeezing in 80 users in the available range. Personally I feel like the release of an 80‐core Mac would make me scared for 96 to 128‐core Macs that we have no realistic way of adding enough users for with our current approach anyway, and we have precedent that Apple is happy to add users on the low end of the range, so I lean towards 331 to give us more margin on that side. But I’m ambivalent if people have a strong preference for more margin to add and think that Apple will continue adding system users at a restrained enough pace compared to core count inflation. 331 spreads our users around the middle of the available range for 32 users, 321 does the same for 64 users.
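The margins quoted above can be reproduced with a few lines of shell arithmetic. This is a sketch under the numbers given in this thread (Apple's highest UID below 400 is 304 on Sequoia, and 395 is the lowest ID Apple already uses near the top of the range); uid_margins is a made-up name.

```shell
# Sketch of the margin arithmetic; the constants come from this thread and
# uid_margins is a hypothetical helper name.
uid_margins() (
  first="$1"; count="$2"
  apple_max=304   # highest Apple UID below 400 on Sequoia
  top_taken=395   # lowest ID Apple already uses near the top (a group on Sonoma)
  limit=400
  last=$(( first + count - 1 ))
  # low_margin:     free UIDs between Apple's users and our first UID
  # free_to_400:    free UIDs above our last UID, up to and including 400
  # fits_below_395: how many users fit from FIRST before reaching 395
  printf 'last=%d low_margin=%d free_to_400=%d fits_below_395=%d\n' \
    "$last" "$(( first - apple_max - 1 ))" "$(( limit - last ))" "$(( top_taken - first ))"
)
```

For example, uid_margins 331 32 prints last=362 low_margin=26 free_to_400=38 fits_below_395=64, which lines up with the ~26, 38, and 64 figures for the 331 proposal, and uid_margins 321 32 reports low_margin=16.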

This would also be a good opportunity to change the group ID; the current default of 30000 has the unfortunate effect of making the group show up in System Settings, unlike system groups. I would suggest using a related ID of 330 or 320 depending on the choice, and perhaps renaming it to _nixbld for consistency with the user names and other system groups (unless there’s any reason not to?).

Since we have to coordinate this with two installers and nix-darwin and get a migration plan in place before the release of Sequoia, I hope we can commit to a set of IDs ASAP to enable this to go smoothly.

@emilazy
Member

emilazy commented Jun 15, 2024

Incidentally I note that _oahd has UID 441 which makes me wonder if Apple has secretly expanded the system user range without updating the meagre documentation of it? But I remember it being a pain to test and reproduce the issues that led us to use the system UID range in the first place, so it’d need careful verification of the bounds if we wanted to see if we could go beyond 400.

@emilazy
Member

emilazy commented Jun 15, 2024

Some more investigation:

It seems like groups with GID < 500 don’t show up in System Settings. This may imply that the system UID range has also expanded, but I don’t know a convenient way to check, and anyway considering that Apple is using the middle of that range with 441 it might be awkward real estate to occupy; who knows what values Apple might pick in future. (There are also groups with higher GIDs that don’t show up in System Settings (e.g. com.apple.sharepoint.group.1/“<name>’s Public Folder”) that have dsAttrTypeNative:IsHidden: 1 in dscl(1) and Directory Utility, but setting that for the nixbld group didn’t seem to help.)

The maximum in‐use UID < 400 went from 297 to 304 in one version. I don’t know what the historical growth rate is like, but I’d definitely be more comfortable with 331 than 321 given that. If the growth continued at the rate of Sequoia (which seems unlikely, but still), we’d have to think of a new idea in 4 OS releases rather than 3. In general it seems like we’re on borrowed time here and I’m not really sure what the long‐term solution is. We may have to migrate now with the expectation of migrating again later.
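The "4 OS releases rather than 3" estimate can be checked with a quick calculation. This sketch assumes, purely for illustration, that Apple's highest UID keeps growing linearly by 7 per release (297 to 304); releases_until_clash is a hypothetical name.

```shell
# Hypothetical helper: estimates in which future macOS release (1 = the
# release after Sequoia) Apple's highest UID would reach FIRST, assuming the
# Sequoia growth rate continues unchanged.
releases_until_clash() (
  first="$1"
  apple_max=304    # highest Apple UID below 400 on Sequoia (from this thread)
  per_release=7    # 297 -> 304 in one release
  gap=$(( first - apple_max - 1 ))   # free UIDs below FIRST today
  # Integer division gives the number of releases that still fit in the gap;
  # the clash happens in the release after those.
  echo $(( gap / per_release + 1 ))
)
```

Under that assumption, releases_until_clash 331 gives 4 and releases_until_clash 321 gives 3.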

If we could verify that UIDs < 500 now work fine, I suppose one solution would be to start at, say, 360 now, and hope Apple don’t eat up the lower 400s range if we start having machines that want 64 users. But I don’t remember how to test that.

@abathur
Member Author

abathur commented Jun 15, 2024

Incidentally I note that _oahd has UID 441 which makes me wonder if Apple has secretly expanded the system user range without updating the meagre documentation of it?

Does this mean you looked at the usage info for sysadminctl and it still says 200-400?

But I remember it being a pain to test and reproduce the issues that led us to use the system UID range in the first place, so it’d need careful verification of the bounds if we wanted to see if we could go beyond 400.

Yeah. The main manifestation I recall was the system giving an obtuse adrenaline-inducing error message and booting into recovery mode during system updates that require a full reboot cycle. Looking back over the thread, it looks like that got less-scary on the next point release. The other was the build users showing up in a user list.

If we wanted to try moving outside of the current range, I suspect a workable protocol would be installing into the new range on a sub-sequoia version and then running the sequoia update and seeing if it blows up. (I don't currently have a spare mac that's eligible for this update, so someone else will have to drive...)

It can't save us from needing to fix this UID issue in the short run, but one way to address the long-run problem would be to figure out if we can get the various issues with auto-allocate-uids sorted out in order to make it the default. (The detsys folks tried defaulting to this in their installer and ran into trouble that compelled them to revert it on both macOS and Linux.)

After I hit post here I'll start drafting a feedback/radar report for Apple. Once I do, I'll also email their devrel about it, and post the FB number here in case anyone wants to refer to it from their own FB report. I'm not terribly optimistic about that helping (for example, I never got a response on the reports I opened about the big sur issue in 2021), but I guess there's an outside chance they'll improve their updater to relocate any UID/GID they trample to another valid ID.

(That would leave people with Weird installs--they might fail to fully clean up when they follow the uninstall instructions for example--but I think it would at least not break every existing macOS install made since early 2022 or whenever the multi-user default was released...)


For searchability, here's one real manifestation of this on an existing install (from #10912):

...
these 13 derivations will be built:
  /nix/store/jcrd05mlpsw8wmixwd133pv3q3xbm18w-nerdfonts-3.2.1.drv
  ...
error: the user '_nixbld1' in the group 'nixbld' does not exist

@emilazy
Member

emilazy commented Jun 15, 2024

Does this mean you looked at the usage info for sysadminctl and it still says 200-400?

On Sonoma, which already has that _oahd user, yes; so if that’s within the system UID range and there’s not something weird going on with that user specifically and groups in the 400 to 500 range (perfectly possible! OAH is the internal name for Rosetta 2, as I understand it, so I wouldn’t be surprised if there are strange things going on there), then they expanded it without updating what passes for the “documentation”. I haven’t tried Sequoia yet, so I can’t comment on what the command says there.

Yeah. The main manifestation I recall was the system giving an obtuse adrenaline-inducing error message and booting into recovery mode during system updates that require a full reboot cycle. Looking back over the thread, it looks like that got less-scary on the next point release. The other was the build users showing up in a user list.

I think filling up the visible user list with 32 random daemon users is scary and off‐putting to users (especially in the absence of an official upstream uninstaller), so even if the more fundamental issues might be solved now I’d be reluctant to settle on that unless we can find another way to hide them.

It can't save us from needing to fix this UID issue in the short run, but one way to address the long-run problem would be to figure out if we can get the various issues with auto-allocate-uids sorted out in order to make it the default. (The detsys folks tried defaulting to this in their installer and ran into trouble that compelled them to revert it on both macOS and Linux.)

Yes, I would love this. If we can commit in the interim to e.g. UIDs starting at 331 and a GID of 330, hopefully that would give us enough runway to make something workable out of auto-allocate-uids. I remember hearing that the problems were worse on macOS than Linux, though (e.g. DeterminateSystems/nix-installer#521, DeterminateSystems/nix-installer#580 (comment) – I guess the lack of user namespaces really makes it tricky).

@abathur
Member Author

abathur commented Jun 15, 2024

Ok, I've reported this in FB13917314 and emailed the devrel about it. For reference, report is roughly:

macOS 15 Sequoia beta installer clobbering existing role users with UIDs 301-304

We're getting reports (example: #10912) that the Sequoia update is clobbering existing build users for the Nix package manager, causing later errors such as:

error: the user '_nixbld1' in the group 'nixbld' does not exist

Users who have taken the update report seeing new users in this range in /etc/passwd:

_aonsensed:*:300:300:Always On Sense Daemon:/var/db/aonsensed:/usr/bin/false
_modelmanagerd:*:301:301:Model Manager:/var/db/modelmanagerd:/usr/bin/false
_reportsystemmemory:*:302:302:ReportSystemMemory:/var/empty:/usr/bin/false
_swtransparencyd:*:303:303:Software Transparency Services:/var/db/swtransparencyd:/usr/bin/false
_naturallanguaged:*:304:304:Natural Language Services:/var/db/com.apple.naturallanguaged:/usr/bin/false
_oahd:*:441:441:OAH Daemon:/var/empty:/usr/bin/false

A few years ago, the Nix installer used UIDs from 30001-30032 by default. The issue I reported in FB8997501 started causing trouble when users with these UIDs were present, so in response we took the hint from the usage note in sysadminctl ("Role accounts require name starting with _ and UID in 200-400 range") and migrated our build user UID defaults to 301-332.

The current behavior of the beta installer will break all existing multi-user Nix installs on macOS made in the last few years, confusing a lot of users in the process.

I can imagine at least two improvements that would help us out, here:

  • If these UIDs don't need to be hardcoded on your end, avoid clobbering existing role users and select a UID that doesn't clash.
  • If these UIDs need to be hardcoded, relocate any existing users to new unoccupied UIDs in the role user range.

@emilazy
Member

emilazy commented Jun 15, 2024

I can confirm that Sequoia’s sysadminctl says the same thing, so if there’s been any change it remains undocumented.

@ahcm

ahcm commented Jul 7, 2024

On macOS 15 beta, users added with -roleAccount but no leading _ get an automatic user ID >500.
With _ underscore one gets the following message:
Role account requires specified UID in 450-499 range.

While the help still gives the footnote:
*Role accounts require name starting with _ and UID in 200-400 range.

@lloeki

lloeki commented Jul 12, 2024

For searchability too (thanks @abathur!), I had error: the user '_nixbld1' in the group 'nixbld' does not exist, if anyone needs to get out of this hole in a pinch, here's what I did to fill in the blanks:

# check your _nixbld users for who's missing
dscl . list /Users UniqueID | grep _nixbld | sort -n -k2

# check what are the non-nix used ones, so that you can find a hole in there
dscl . list /Users UniqueID | grep -v _nixbld | sort -n -k2

# fill in the blanks, mine were 1 to 4, I picked 401 as the start:
for i in {1..4}; do
  sudo dscl . -create "/Users/_nixbld${i}" UniqueID $(( 400 + ${i} ))
  sudo dscl . -create "/Users/_nixbld${i}" PrimaryGroupID 30000
  sudo dscl . -create "/Users/_nixbld${i}" IsHidden 1
  sudo dscl . -create "/Users/_nixbld${i}" RealName "_nixbld${i}"
  sudo dscl . -create "/Users/_nixbld${i}" NFSHomeDirectory '/var/empty'
: sudo createhomedir -cu "_nixbld${i}"
  sudo dscl . -create "/Users/_nixbld${i}" UserShell /sbin/nologin
done

# that's just for having the survivors be consistent, also using 401 as the start
for i in {5..32}; do
  id="$(dscl . -read "/Users/_nixbld${i}" UniqueID | cut -d' ' -f2)"
  sudo dscl . change "/Users/_nixbld${i}" UniqueID $id $(( 400 + ${i} ))
done
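Finding "a hole" by eye works, but it can also be scripted. Below is a minimal sketch that looks for the lowest run of N consecutive unused IDs within a range; first_free_run is a hypothetical helper, and it only knows about the IDs it is fed (so group IDs or IDs Apple might claim later are not accounted for).

```shell
# Hypothetical helper: first_free_run LO HI N
# Reads used IDs (one per line) on stdin and prints the lowest START in
# LO..HI such that START..START+N-1 are all unused and do not exceed HI.
# Prints nothing and returns 1 if no such run exists.
first_free_run() (
  lo="$1"; hi="$2"; n="$3"
  awk -v lo="$lo" -v hi="$hi" -v n="$n" '
    { used[$1 + 0] = 1 }            # remember every ID seen on stdin
    END {
      run = 0
      for (id = lo; id <= hi; id++) {
        if (id in used) { run = 0; continue }   # a used ID resets the run
        run++
        if (run == n) { print id - n + 1; exit 0 }
      }
      exit 1
    }'
)
```

On macOS the input could come from dscl . list /Users UniqueID | awk '{ print $2 }'. Fed the Sequoia IDs listed earlier in this thread (300 through 304, plus 441), first_free_run 300 400 32 prints 305; whether a given start is a wise choice is a separate question discussed above.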

@lloeki

lloeki commented Jul 16, 2024

FYI on my work laptop (still Sonoma 14) I have these:

-----8<-----
_backgroundassets        291
_mobilegestalthelper     293
_audiomxd                294
_terminusd               295
_neuralengine            296
_eligibilityd            297
-----8<-----
_nixbld{1-32} {301-332}
-----8<-----
_oahd                    441
_sentinelguard           498
_sentinel                499

Note:

  • _oahd 441 is already there; apparently it's Rosetta 2 related (/usr/libexec/rosetta/oahd) and part of the AOT compiler
  • I think _sentinel 498 and _sentinelguard 499 come from some company-managed security software (SentinelOne)

EDIT: confirmed that these two are SentinelOne

@abathur
Member Author

abathur commented Sep 23, 2024

No, sorry I was unclear -- they were created in the script I (we) linked, it specifically noted moving them "temporarily," but in my case they were never moved back for some reason. Look for ((TEMP_NIX_FIRST_BUILD_UID=31000))

That sounds more manageable :)

Not super confident without a log, but one cause I'm aware of would be if there was already a gap in your set of _nixbldN users (i.e., if _nixbld33 were missing).

Or maybe it's not curl in this case (and I'm remembering wrong), currently I get an error when unpacking the archive:

$ sh <(curl -L https://nixos.org/nix/install)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  4267  100  4267    0     0  10441      0 --:--:-- --:--:-- --:--:--     0
downloading Nix 2.24.7 binary tarball for aarch64-darwin from 'https://releases.nixos.org/nix/nix-2.24.7/nix-2.24.7-aarch64-darwin.tar.xz' to '/var/folders/kb/tw_lp_xd2_bbv0hqk4m0bvt80000gn/T/nix-binary-tarball-unpack.nJf7SW58hu'...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14.6M  100 14.6M    0     0  18.8M      0 --:--:-- --:--:-- --:--:-- 18.8M
tar (child): xz: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
/dev/fd/63: failed to unpack 'https://releases.nixos.org/nix/nix-2.24.7/nix-2.24.7-aarch64-darwin.tar.xz'

Does type -a tar show a location other than /usr/bin/tar? (Maybe gnutar from Nix?) If so, I think it's actually that causing the trouble.

If so, can you open a new issue for the problem and symptoms?

(It sounds like there was a previous attempt to address this, but I guess it was only a partial fix:

Even if there isn't a good way around, documenting it may help others.)

@n8henrie
Contributor

Does type -a tar show a location other than /usr/bin/tar?

Yes, this is what I was trying to say above. I prefer having the GNU utilities so I can use the same awk / sed / grep incantations across my linux and macos machines.

$ type -a tar
tar is /etc/profiles/per-user/n8henrie/bin/tar
tar is /run/current-system/sw/bin/tar
tar is /usr/bin/tar

If so, can you open a new issue for the problem and symptoms?

#11570
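For anyone wanting to check the same thing non-interactively, here is a small sketch that reports which executable a command name resolves to under a given PATH value. which_in_path is a made-up helper name; it relies only on the standard command -v builtin.

```shell
# Hypothetical helper: which_in_path CMD PATH_VALUE
# Prints the executable CMD would resolve to if PATH were PATH_VALUE.
# Runs in a subshell so the caller's PATH is left untouched.
which_in_path() (
  PATH="$2"
  command -v "$1"
)
```

For example, which_in_path tar "$PATH" shows the tar the installer would pick up, and which_in_path tar "/usr/bin:$PATH" shows what it would pick up with /usr/bin in front. If the first result is not /usr/bin/tar, one thing to try (an untested assumption, not a confirmed fix) is putting /usr/bin first for just the installer run: PATH="/usr/bin:$PATH" sh <(curl -L https://nixos.org/nix/install).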

@juboba

juboba commented Sep 28, 2024

I love this community.

@torgeir

torgeir commented Oct 9, 2024

I recommend curl --proto '=https' --tlsv1.2 -sSf -L https://github.com/NixOS/nix/raw/master/scripts/sequoia-nixbld-user-migration.sh | bash -, which enforces secure protocols, handles errors a bit better, and has a slightly more legible “obviously from the upstream Nix repository” flavour to it. This is patterned on the commands used by rustup and the Determinate Systems installers.

And you'd make for an even more trustworthy suggestion of a command for people to pipe into their shell by also pinning it to the current revision, also improving its relevance for historical purposes.

Suggestion to change the url in your first post to
https://github.com/NixOS/nix/blob/8b2ffbae3adc2418a6221c24619d9bca51852d05/scripts/sequoia-nixbld-user-migration.sh @abathur

(obtained by pressing y when viewing it on github)

@mkenigs
Contributor

mkenigs commented Oct 9, 2024

And you'd make for an even more trustworthy suggestion of a command for people to pipe into their shell by also pinning it to the current revision, also improving its relevance for historical purposes.

Suggestion to change the url in your first post to https://github.com/NixOS/nix/blob/8b2ffbae3adc2418a6221c24619d9bca51852d05/scripts/sequoia-nixbld-user-migration.sh @abathur

(obtained by pressing y when viewing it on github)

Commands like this get copied and pasted around, and if there are patches to the script, it's preferable that people get the latest version rather than a pinned one that might be stale and lack fixes.

@emilazy
Member

emilazy commented Oct 9, 2024

Yes, if the Nix repository is compromised then users have bigger problems. Especially since the very comment containing the command to run is in the Nix repository.

@butterflyhug

I have two Macs with Nix, both installed via the official https://nixos.org/nix/install installer. One of these machines is running MacOS 14.7 and the other has been upgraded to 15.0.1 (via 15.0). On both machines, the recommended sequoia-nixbld-user-migration.sh script is erroring out on its first dscl command when I run it from my admin user account, more or less immediately after I enter my password for the script's invocation of sudo. The full command output on both machines is identical:

$ curl --proto '=https' --tlsv1.2 -sSf -L https://github.com/NixOS/nix/raw/master/scripts/sequoia-nixbld-user-migration.sh > /tmp/sequoia-nixbld-user-migration.sh
$ bash /tmp/sequoia-nixbld-user-migration.sh
Attempting to migrate _nixbld users.

Step 1: move existing _nixbld users out of the destination UID range.
Password:
<main> attribute status: eDSPermissionError
<dscl_cmd> DS Error: -14120 (eDSPermissionError)

(I believe the only difference here vs the recommended one-liner in the issue description is that I'm separating download from execution. I make a habit of keeping a local copy of all download-and-execute scripts like this, so that I can manually inspect the exact script that I have executed if/when anything goes wrong.)

Unsurprisingly given that the two scripts appear to be invoking dscl in the same way, the Determinate Systems installer's repair script that was also suggested upthread ultimately produces this same dscl error for me. It looks like my Nix installs are a bit old at this point (2.18.8 and 2.17.0, respectively), so I should upgrade Nix anyway... but it also doesn't seem like my stale Nix versions should really affect these scripts given my (limited) understanding of what the scripts are doing?

@abathur
Member Author

abathur commented Oct 11, 2024

I agree that those versions shouldn't have anything to do with the issue. Afaik this is our first report of the problem, so the tractability of this will likely hinge on what you can figure out locally (at least until we figure out how to reproduce it).

A few questions to get us started:

  • Are these macs ~managed by an org (using mdm profiles)?
  • Have you restored them from time machine backups or used migration assistant (or any other means of image deployment or porting/recovering data)?
  • Can you manually invoke the failing dscl command while watching system logs (via Console.app or the log command) and see if it coughs up any clues about the failure?

@butterflyhug

butterflyhug commented Oct 11, 2024

No active MDM profiles on either Mac. Technically, the 14.7 machine is erroneously listed on a MDM auto-enrollment list from its previous life as an MDM-managed machine for about 6 months in 2020, but has never actually been re-enrolled after the previous owner removed their MDM profile and Recovery-wiped the machine for resale. (Yeah, I'm annoyed that the former owners could never be bothered to fix their mess after confirming that their auto-enrollment claim is erroneous, but that whole sordid tale should be irrelevant for our current purposes.) The 15.0.1 machine has no previous owners and has always been completely MDM-free; it originally shipped directly from Apple in early 2023 with 13.x preinstalled, and then has been kept regularly updated with production (non-beta) OS releases since.

Also no restored backups; I set up both machines from scratch from their clean OS installs as soon as they entered my hands and never looked back. As a result they have both been through multiple major MacOS upgrades over time. IIRC my original Nix installation on each of these machines was after upgrading the corresponding Mac's OS to 14.x.

Adding set -o xtrace to the top of the Nix-maintained script reveals that the exact command that is failing is sudo dscl . -create /Users/_nixbld5 UniqueID 31000. Here's a copy of Console.app's logs from the time period covering a manual invocation of that command on the machine running MacOS 15.0.1, although unfortunately I'm not really spotting anything we didn't already know in there.
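For readers unfamiliar with the technique: with xtrace enabled, the shell prints each command to stderr (prefixed with "+ " by default) just before running it, so the last traced line before an error message is the command that failed. A tiny self-contained sketch, where the failing ls is only a stand-in for the failing dscl call:

```shell
# Demo only: a three-step "script" whose second step fails.
# The body runs in a subshell so xtrace does not leak into the caller.
demo_trace() (
  set -o xtrace                               # same effect as `sh -x script`
  echo "step 1" >/dev/null
  ls /nonexistent-path-for-demo >/dev/null    # stand-in for the failing command
  echo "step 3" >/dev/null
)
```

Running demo_trace shows a "+ ls /nonexistent-path-for-demo" line on stderr immediately before the error from ls, which is the same way the failing dscl invocation was pinned down here.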

@abathur
Member Author

abathur commented Oct 11, 2024

Iirc warp is a terminal app, yeah? Are you using it on both systems? If so can you try again in Terminal.app?

@butterflyhug

Oh hey, I found this old comment which suggested that the problem might be my choice of terminal emulator (I had previously discounted that as being potentially relevant because I'm using different third-party terminal apps on each of the two machines), and indeed the script works as expected in Terminal.app.

(fun, looks like we got to the same place at the same time 🙂 )

@Enzime
Member

Enzime commented Oct 11, 2024

@butterflyhug have you granted full disk access to either Terminal.app or Warp?

@abathur
Member Author

abathur commented Oct 12, 2024

@Enzime fair question. In my case, the terminal.app hunch was because the log attached earlier shows some sandbox/tccd errors related to warp, which makes me think macos is further restricting permissions here.

I suspect we'd have heard by now if FDA was required to run these dscl commands broadly, but it is certainly still possible that there's something specific up with these systems and FDA explains why the command succeeded in Terminal (but I am hoping this isn't the reason).

@n8henrie
Contributor

Terminal.app doesn't have FDA by default either.

@mwilsoncoding

mwilsoncoding commented Oct 23, 2024

[Edit] Taking this comment to the correct place to avoid clutter. Thanks very much @cole-h for the redirect!

original comment

I missed this during an upgrade to Sequoia mandated by MDM for work.

I installed nix using the determinatesystems installer originally.

Before researching and finding this thread, I tried /nix/nix-installer repair, /nix/nix-installer install, and /nix/nix-installer uninstall, but all of them failed.

uninstall gave the following output:

`nix-installer` needs to run as `root`, attempting to escalate now via `sudo`...
Nix uninstall plan (v0.18.0)

Planner: macos (with default settings)

Planned actions:

  • Unconfigure Nix daemon related settings with launchctl
  • Delete file /Library/LaunchDaemons/systems.determinate.nix-installer.nix-hook.plist
  • Remove the Nix configuration from zsh's non-login shells
  • Unconfigure the shell profiles
  • Remove the Nix configuration in /etc/nix/nix.conf
  • Unset the default Nix profile
  • Remove time machine exclusions
  • Remove Nix users and group
  • Remove the directory tree in /nix
  • Remove the APFS volume Nix Store on disk3

Proceed? ([Y]es/[n]o/[e]xplain):
INFO Revert: Remove directory /nix/temp-install-dir
INFO Revert: Configure Nix daemon related settings with launchctl
INFO Revert: Create a launchctl plist to put Nix into your PATH
INFO Revert: Configuring zsh to support using Nix in non-interactive shells
INFO Revert: Configure Nix
INFO Revert: Configure Time Machine exclusions
INFO Revert: Create build users (UID 301-332) and group (GID 30000)
INFO Revert: Provision Nix
INFO Revert: Create an encrypted APFS volume Nix Store for Nix on disk3 and add it to /etc/fstab mounting on /nix
ERROR Uninstallation complete, some errors encountered
Error:
0: Error reverting
0: Action create_users_and_group errored
Multiple child errors

  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld1"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld2"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld3"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld4"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)




  Action `create_apfs_volume` errored
  Failed to execute command with status 1 `"/usr/sbin/diskutil" "apfs" "deleteVolume" "Nix Store"`, stdout: Started APFS operation
  Deleting APFS Volume from its APFS Container
  Unmounting disk3s7
  The volume "Nix Store" on disk3s7 couldn't be unmounted because it is in use by process 0 (kernel)

  stderr: Error: -69888: Couldn't unmount disk

0:

Location:
src/cli/subcommand/uninstall.rs:192

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Consider reporting this error using this URL: ...

I tried /nix/nix-installer uninstall once more (out of insanity) and received the following output:

Nix uninstall plan (v0.18.0)

Planner: macos (with default settings)

Planned actions:

  • Unconfigure Nix daemon related settings with launchctl
  • Delete file /Library/LaunchDaemons/systems.determinate.nix-installer.nix-hook.plist
  • Remove the Nix configuration from zsh's non-login shells
  • Unconfigure the shell profiles
  • Remove the Nix configuration in /etc/nix/nix.conf
  • Unset the default Nix profile
  • Remove time machine exclusions
  • Remove Nix users and group
  • Remove the directory tree in /nix
  • Remove the APFS volume Nix Store on disk3

Proceed? ([Y]es/[n]o/[e]xplain):
INFO Revert: Remove directory /nix/temp-install-dir
INFO Revert: Configure Nix daemon related settings with launchctl
INFO Revert: Create a launchctl plist to put Nix into your PATH
INFO Revert: Configuring zsh to support using Nix in non-interactive shells
INFO Revert: Configure Nix
INFO Revert: Configure Time Machine exclusions
INFO Revert: Create build users (UID 301-332) and group (GID 30000)
INFO Revert: Provision Nix
INFO Revert: Create an encrypted APFS volume Nix Store for Nix on disk3 and add it to /etc/fstab mounting on /nix
ERROR Uninstallation complete, some errors encountered
Error:
0: Error reverting
0: Action create_nix_hook_service errored
Remove file /Library/LaunchDaemons/systems.determinate.nix-installer.nix-hook.plist

  Action `configure_nix` errored
  Multiple child errors

  Action `configure_shell_profile` errored
  Multiple child errors

  Action `create_directory` errored
  Read path `/etc/profile.d`

  Action `create_directory` errored
  Read path `/etc/zsh`


  Action `place_nix_configuration` errored
  Multiple child errors

  Action `create_or_merge_nix_config` errored
  Remove file `/etc/nix/nix.conf`

  Action `create_directory` errored
  Read path `/etc/nix`



  Action `set_tmutil_exclusion` errored
  Failed to execute command with status 22 `"tmutil" "removeexclusion" "/nix/var"`, stdout: 
  stderr: /nix/var: Error (-43) while attempting to change exclusion setting.



  Action `create_users_and_group` errored
  Multiple child errors

  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld1"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld2"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld3"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld4"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld5"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld6"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld7"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld8"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld9"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld10"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld11"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld12"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld13"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld14"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld15"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld16"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld17"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld18"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld19"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld20"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld21"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld22"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld23"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld24"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld25"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld26"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld27"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld28"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld29"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld30"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld31"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_user` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Users/_nixbld32"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)



  Action `create_group` errored
  Failed to execute command with status 185 `"/usr/bin/dscl" "." "-delete" "/Groups/nixbld"`, stdout: delete: Invalid Path

  stderr: <dscl_cmd> DS Error: -14009 (eDSUnknownNodeName)




  Action `create_nix_tree` errored
  Multiple child errors

  Action `create_directory` errored
  Read path `/nix/var/nix/daemon-socket`

  Action `create_directory` errored
  Read path `/nix/var/nix/userpool`

  Action `create_directory` errored
  Read path `/nix/var/nix/temproots`

  Action `create_directory` errored
  Read path `/nix/var/nix/profiles/per-user`

  Action `create_directory` errored
  Read path `/nix/var/nix/profiles`

  Action `create_directory` errored
  Read path `/nix/var/nix/gcroots/per-user`

  Action `create_directory` errored
  Read path `/nix/var/nix/gcroots`

  Action `create_directory` errored
  Read path `/nix/var/nix/db`

  Action `create_directory` errored
  Read path `/nix/var/nix`

  Action `create_directory` errored
  Read path `/nix/var/log/nix/drvs`

  Action `create_directory` errored
  Read path `/nix/var/log/nix`

  Action `create_directory` errored
  Read path `/nix/var/log`

  Action `create_directory` errored
  Read path `/nix/var`


  Action `create_nix_volume` errored
  Multiple child errors

  Action `bootstrap_launchctl_service` errored
  Failed to execute command with status 5 `"launchctl" "bootout" "system" "/Library/LaunchDaemons/org.nixos.darwin-store.plist"`, stdout: 
  stderr: Boot-out failed: 5: Input/output error



  Action `create_volume_service` errored
  Remove file `/Library/LaunchDaemons/org.nixos.darwin-store.plist`

  Action `encrypt_apfs_volume` errored
  Failed to execute command with status 44 `"/usr/bin/security" "delete-generic-password" "-a" "Nix Store" "-s" "Nix Store" "-l" "disk3 encryption password" "-D" "Encrypted volume password" "-j" "Added automatically by the Nix installer for use by /Library/LaunchDaemons/org.nixos.darwin-store.plist"`, stdout: 
  stderr: security: SecKeychainSearchCopyNext: The specified item could not be found in the keychain.



  Action `create_apfs_volume` errored
  Failed to execute command with status 1 `"/usr/sbin/diskutil" "apfs" "deleteVolume" "Nix Store"`, stdout: Started APFS operation
  Deleting APFS Volume from its APFS Container
  Unmounting disk3s7
  The volume "Nix Store" on disk3s7 couldn't be unmounted because it is in use by process 0 (kernel)

  stderr: Error: -69888: Couldn't unmount disk

0:

Location:
src/cli/subcommand/uninstall.rs:192

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Consider reporting this error using this URL: ... (much longer URL)

When I run the recommended curl --proto '=https' --tlsv1.2 -sSf -L https://install.determinate.systems/nix/tag/v0.26.0 | sh -s -- repair sequoia --move-existing-users, I get the following output:

info: downloading installer https://install.determinate.systems/nix/tag/v0.26.0/nix-installer-aarch64-darwin
INFO nix-installer v0.26.0
`nix-installer` needs to run as `root`, attempting to escalate now via `sudo`...
INFO nix-installer v0.26.0
WARN get_existing_receipt: Could not parse receipt. Your receipt will not be updated to account for the new UIDs
Will move the _nixbld users to the Sequoia-compatible 350+ UID range and WILL NOT update the receipt

Proceed? ([Y]es/[n]o):
WARN get_existing_receipt: Could not parse receipt. Your receipt will not be updated to account for the new UIDs
WARN Unable to find create_users_and_group in receipt (receipt didn't exist or is unable to be parsed by this version of the installer). Your receipt at /nix/receipt.json will not reflect the changed UIDs, but the users will still be relocated to the new Sequoia-compatible UID range, starting at 350, and uninstallation will continue to work as normal, even if the UIDs do not match.
Error:
0: Failed to execute command with status 56 "/usr/bin/dscl" "-plist" "." "-read" "/Groups/nixbld", stdout:
stderr: <dscl_cmd> DS Error: -14136 (eDSRecordNotFound)

0:

Location:
src/cli/subcommand/repair.rs:259

Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Consider reporting this error using this URL: ...

At this point, I'm just wondering: how best to recover?

@cole-h
Member

cole-h commented Oct 23, 2024

@mwilsoncoding Hi!

(Next time, please file issues with our installer against our installer at https://github.com/DeterminateSystems/nix-installer/issues, so we don't clutter upstream issues!)

To answer your question:

> At this point, I'm just wondering: how best to recover?

Based on the output you've posted, Nix is (mostly) successfully uninstalled.

The first errors about "user _nixbld1 not found" are what would have been fixed by the repair sequoia command (macOS Sequoia takes the users with those UIDs for its own purposes, deleting them in the process). The "disk in use" issue is typically resolved by rebooting.

After rebooting, you should be able to attempt another install. If it complains about the disk, it should give you remediation steps you can try before running it again.

For future reference: you should run the repair tool while Nix is installed -- that's the only time it will work. If Nix is not installed, then it can't do anything.
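
As a rough illustration of the point above, the following sketch checks whether the `nixbld` group record still exists before deciding between a repair and a reinstall. It reuses the `dscl . -read /Groups/nixbld` call that appears in the failing repair output. The function names, the `DSCL` override variable, and the printed suggestions are hypothetical, and treating a missing group as "Nix is no longer installed" is only a heuristic, not something the installer itself documents.

```shell
#!/bin/sh
# DSCL is overridable so the check can be exercised off macOS (assumption
# for illustration); /usr/bin/dscl is the path shown in the logs above.
DSCL="${DSCL:-/usr/bin/dscl}"

# Succeeds only if the nixbld group record can be read from the local
# directory node. Any failure (including a missing dscl) counts as absent.
nixbld_group_exists() {
  "$DSCL" . -read /Groups/nixbld >/dev/null 2>&1
}

# Print a suggestion based on whether the group is still present.
# HEURISTIC: a missing group is taken to mean Nix is already uninstalled.
suggest_next_step() {
  if nixbld_group_exists; then
    echo "nixbld group found: repair can be attempted"
  else
    echo "nixbld group missing: reboot, then reinstall"
  fi
}

# On macOS:
#   suggest_next_step
```

If the group is missing, as in the `eDSRecordNotFound` error above, the repair command has nothing to relocate, which matches the advice to reboot and reinstall instead.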

@dapkdapk

dapkdapk commented Nov 5, 2024

I don't know what was included in this macOS update. But the problem seems to be fixed in version Sequoia 15.1 (24B83). At least for me. Can anyone else confirm this?

@abathur
Member Author

abathur commented Nov 5, 2024

> I don't know what was included in this macOS update. But the problem seems to be fixed in version Sequoia 15.1 (24B83). At least for me. Can anyone else confirm this?

Can you clarify some details here? Maybe like:

  • did you install 15.1 from 14 or earlier, or do you mean your Nix install was broken on 15.0 and is now working?
  • what gid/uid do your _nixbld group and users have?
  • when did you install Nix?
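
One way to gather the gid/uid details asked for above is to list the local users with their `UniqueID` and flag the `_nixbld` entries that sit below the relocated range. This is only a sketch: the function name `classify_nixbld_uids` is hypothetical, and the 350 threshold is an assumption taken from the installer message quoted earlier in this thread ("Sequoia-compatible 350+ UID range"); other installers may use a different starting UID.

```shell
#!/bin/sh
# Classify _nixbld users by UID. Reads "name uid" pairs on stdin, the
# shape printed by `dscl . -list /Users UniqueID` on macOS.
# ASSUMPTION: 350 is the start of the Sequoia-compatible range; pass a
# different base as the first argument if your installer uses another one.
classify_nixbld_uids() {
  awk -v base="${1:-350}" '
    $1 ~ /^_nixbld[0-9]+$/ {
      status = ($2 + 0 >= base + 0) ? "ok" : "legacy-range"
      print $1, $2, status
    }'
}

# On macOS:
#   dscl . -list /Users UniqueID | classify_nixbld_uids
#   dscl . -read /Groups/nixbld PrimaryGroupID
```

Users reported as `legacy-range` still have UIDs in the old range (for example 301-332) and are the ones a migration would need to move.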

@dapkdapk

dapkdapk commented Nov 7, 2024

> > I don't know what was included in this macOS update. But the problem seems to be fixed in version Sequoia 15.1 (24B83). At least for me. Can anyone else confirm this?
>
> Can you clarify some details here?..

Yes, sorry.

  1. I had previously installed macOS 15.0 and had exactly the problems described here. Before that, I had 14.x and no problems. On 15.0, I uninstalled Nix completely and also deleted the volume. Among other things, I also deleted all the nixbld users (as @Teebor-Choka pointed out). A fresh installation did not help, so I gave up. This happened in September. Only when update 15.1 came along did I try again, doing a Nix update over the old failed attempt. The bash files had to be reset again, and then everything went smoothly.

  2. Now I have the users _nixbld1-32

I can no longer say exactly which Nix version I had before, but it is now Nix 2.24.10. I use a Mac14,6 (Apple M2 Max).
