Root password lost after an OS upgrade #1982

Closed
ldevulder opened this issue Feb 28, 2024 · 3 comments · Fixed by #1984 or #1990
Labels
kind/bug Something isn't working

Comments

@ldevulder
Contributor

ldevulder commented Feb 28, 2024

elemental-toolkit version: 1.1.1

elemental-operator version: 1.5.0 (Dev)

CPU architecture, OS, and Version: x86, Elemental OS v2.0.2 (based on SLE Micro 5.5)

Describe the bug

After upgrading from Elemental OS v2.0.2 (Stable) to v2.1.0 (Dev) using the managedOSVersionName method, the root password configured during deployment no longer works. Has it been erased?

To Reproduce

Install Rancher Manager v2.8.2 and elemental-operator Dev (1.5.0), deploy a simple one-node cluster with Elemental OS Stable (2.0.2), and update it to Dev (2.1.0) with managedOSVersionName. Wait for the node to reboot and try to connect either over SSH or directly on the console (for bare metal or VM).

ManagedOSImage used:

apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSImage
metadata:
  name: with-os-version-name
  namespace: fleet-default
spec:
  clusterTargets:
    - clusterName: cluster-k3s
  managedOSVersionName: v2.1.0-unstable

Expected behavior

The OS upgrade completes and root login still works.

Logs

elemental-operator.log

Additional context

N/A

@ldevulder ldevulder added the kind/bug Something isn't working label Feb 28, 2024
@ldevulder ldevulder moved this to 🗳️ To Do in Elemental Feb 28, 2024
@frelon
Contributor

frelon commented Feb 28, 2024

How is the root password set during deployment?

@ldevulder
Contributor Author

How is the root password set during deployment?

It is done with cloud-config in the MachineRegistration, with this template.

@davidcassany
Contributor

OK, found the root cause, and it is a tricky one. On reboot, none of the elemental-setup-* services that should run after switching root are executed. The reason is that they are hidden behind the bind mount of the previous persistent /etc/systemd/system. With the current setup, the image contents are only synced once, on the first deployment. If we sync on every boot, we need to make sure the rsync call does not include the --delete flag. Syncing on every boot also comes with the arguable behavior of reverting any file that was modified compared to the original.
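
To make the trade-off above concrete, here is a minimal in-memory sketch in Go of what a sync on every boot would do to the persistent directory, with and without delete semantics. It is only a model of the behavior described in this comment, not elemental-toolkit code: the `syncTree` function, the unit names, and the file contents are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// syncTree models a one-way sync from the image tree into the persistent
// tree. Every image entry overwrites the persistent one. When deleteExtra is
// true, entries that exist only in persistent are removed first, which is
// the effect rsync's --delete flag would have.
func syncTree(image, persistent map[string]string, deleteExtra bool) {
	if deleteExtra {
		for path := range persistent {
			if _, ok := image[path]; !ok {
				delete(persistent, path)
			}
		}
	}
	for path, content := range image {
		persistent[path] = content
	}
}

func main() {
	// Hypothetical unit names and contents, for illustration only.
	image := map[string]string{
		"elemental-setup-example.service": "from new image",
		"shared.service":                  "image default",
	}
	persistent := map[string]string{
		"shared.service":     "edited on the node",
		"local-only.service": "created on the node",
	}

	// Sync without delete semantics, as proposed for every boot.
	syncTree(image, persistent, false)

	paths := make([]string, 0, len(persistent))
	for p := range persistent {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Printf("%s: %s\n", p, persistent[p])
	}
}
```

In this model, syncing without delete makes the new unit from the image visible and keeps the unit that only exists on the node, but the locally edited `shared.service` is reverted to the image default. That last effect is the arguable behavior mentioned above.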

This requires some thought and discussion to agree on the best behavior to support existing deployments and on the best behavior for the future (we should probably switch to overlay by default). Also, is it possible to migrate persistent paths from bind mounts to overlay?

To fill the gap and not block testing on this issue, we could make the --delete flag toggleable in the rsync wrapper and make the sync for bind mounts run in all cases. This would keep the previous behavior, which is not ideal and should be improved IMHO.
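
As a rough illustration of the stop-gap described above, the sketch below builds an rsync argument list in Go with `--delete` behind a toggle. It is an assumption about what such a wrapper could look like, not the actual elemental-toolkit wrapper: the `syncOptions` type, the `rsyncArgs` function, and the paths are hypothetical. Only the `--archive` and `--delete` rsync flags are real.

```go
package main

import (
	"fmt"
	"strings"
)

// syncOptions is a hypothetical option set, not the elemental-toolkit API.
type syncOptions struct {
	Delete bool // when true, pass --delete to rsync
}

// rsyncArgs builds the argument list for an rsync call that copies the
// contents of source into target. --delete is only added when requested, so
// a sync that runs on every boot can leave files that exist only in target.
func rsyncArgs(source, target string, opts syncOptions) []string {
	args := []string{"--archive"}
	if opts.Delete {
		args = append(args, "--delete")
	}
	// A trailing slash on the source makes rsync copy the directory
	// contents rather than the directory itself.
	return append(args, strings.TrimSuffix(source, "/")+"/", target)
}

func main() {
	// Hypothetical paths, for illustration only.
	args := rsyncArgs(
		"/run/example/image/etc/systemd/system",
		"/run/example/persistent/etc/systemd/system",
		syncOptions{Delete: false},
	)
	fmt.Println("rsync", strings.Join(args, " "))
}
```

The program only prints the command line it would run. Keeping `Delete` off by default matches the idea of syncing bind mounts in all cases without removing anything that was added on the node.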

@davidcassany davidcassany self-assigned this Feb 29, 2024
@davidcassany davidcassany moved this from 🗳️ To Do to 🏃🏼‍♂️ In Progress in Elemental Feb 29, 2024
@github-project-automation github-project-automation bot moved this from 🏃🏼‍♂️ In Progress to ✅ Done in Elemental Feb 29, 2024