Flatcar instance at Linode loses network connectivity after 3227.2.0 upgrade #807

Closed
salfter opened this issue Jul 23, 2022 · 7 comments
Labels
area/network Issues related to network. kind/bug Something isn't working

salfter commented Jul 23, 2022

Description

I have two Flatcar instances running, both at 3227.2.0. One is a bare-metal instance on a home server (an Asrock Rack X470D4U2-2T with a Ryzen 5 2600) that continues to run properly, but the other is a Linode VM (messages at boot time indicate they're using kvm). I tried accessing a service that should be running on it, but got nowhere. I tried to ssh in...no dice. I brought up the web console interface and got in that way, but saw the following:

Flatcar Container Linux by Kinvolk stable 3227.2.0 for QEMU
Failed Units: 2
  systemd-networkd.service
  systemd-networkd.socket

Rebooting made no difference. I tried restarting the indicated service, but that just produced an error.

The Linode console won't let me copy text out of it, so this is a screen dump of systemctl status systemd-networkd.service:

[Screenshot: magnetico-error-1]

and of journalctl -xeu systemd-networkd.service:

[Screenshot: magnetico-error-2]

Impact

I have a VM that is unreachable from across the network. How do I fix this?

Environment and steps to reproduce

No particular steps were taken on my part; this VM has been running without issue for the past few months, until the recent update to 3227.2.0.

Expected behavior

At a minimum, I'd be able to ssh in. Ideally, other services would be responsive. (I'd sometimes have to restart them manually after Flatcar updated itself, but not always.)

@salfter salfter added the kind/bug Something isn't working label Jul 23, 2022

till commented Jul 24, 2022

I had something similar once (not on Linode) where DHCP failed to hand out a new lease.

Can you dig up more from the journal to see if you find something there?

Basically, search the output of sudo journalctl around the time the units failed. Maybe check the boot log, since it seems to happen then?
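One way to narrow the journal down, as a minimal sketch: the first function limits the output to the current boot and the two failed units, and the second shows everything logged inside a time window. The function names and the `JOURNALCTL` override are hypothetical helpers for illustration; the `journalctl` flags themselves (`-b`, `-u`, `--since`, `--until`, `--no-pager`) are standard.

```shell
#!/bin/sh
# Sketch only: helper names and the JOURNALCTL override are assumptions,
# not something from this thread. Set JOURNALCTL="sudo journalctl" if the
# current user cannot read the system journal.
JOURNALCTL="${JOURNALCTL:-journalctl}"

# Messages from the current boot for the two units reported as failed.
networkd_boot_log() {
    $JOURNALCTL -b --no-pager -u systemd-networkd.service -u systemd-networkd.socket
}

# Everything logged between two timestamps (any format journalctl accepts).
# $1 = start of the window, $2 = end of the window.
journal_window() {
    $JOURNALCTL --no-pager --since "$1" --until "$2"
}
```

For example, `networkd_boot_log` followed by `journal_window "2022-07-23 00:00" "2022-07-23 01:00"`; the timestamps here are placeholders and should be replaced with the time the units actually failed.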

jepio commented Jul 25, 2022

Could anyone here attach the full journalctl -b0 output?

jepio commented Jul 25, 2022

Would you be able to save some debug logs to disk, and then extract them after performing a manual rollback? https://www.flatcar.org/docs/latest/setup/debug/manual-rollbacks/#performing-a-manual-rollback

The first things that would help:

dmesg
journalctl -b0
networkctl status
ip link
ip addr
ls -la /usr/lib/systemd/systemd-networkd
sestatus
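The commands above could be captured to disk in one pass, so the files are still there after a rollback. This is a minimal sketch: the function names, the default output directory `/var/tmp/networkd-debug`, and the file names are assumptions for illustration. Each command is run exactly as listed, and a failing or missing command does not stop the rest.

```shell
#!/bin/sh
# Sketch only: directory and file names are illustrative assumptions.

# run_to_file DIR NAME COMMAND...: write stdout and stderr of COMMAND to
# DIR/NAME.txt, and note the failure in the same file instead of aborting.
run_to_file() {
    dir="$1"; name="$2"; shift 2
    "$@" > "$dir/$name.txt" 2>&1 || echo "command failed: $*" >> "$dir/$name.txt"
}

# collect_debug_logs [DIR]: save the requested diagnostics under DIR.
collect_debug_logs() {
    out_dir="${1:-/var/tmp/networkd-debug}"
    mkdir -p "$out_dir" || return 1

    run_to_file "$out_dir" dmesg      dmesg
    run_to_file "$out_dir" journal    journalctl -b0
    run_to_file "$out_dir" networkctl networkctl status
    run_to_file "$out_dir" ip-link    ip link
    run_to_file "$out_dir" ip-addr    ip addr
    run_to_file "$out_dir" networkd   ls -la /usr/lib/systemd/systemd-networkd
    run_to_file "$out_dir" sestatus   sestatus
}
```

Running `collect_debug_logs` as root (for example from the web console) leaves one text file per command in the output directory; pick a location that is not wiped by the rollback.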

jepio commented Jul 25, 2022

And could you verify whether the issue also happens when a fresh VM is provisioned with 3227.2.0? Or does it require updating?

@tormath1 tormath1 added the area/network Issues related to network. label Jul 25, 2022
@whites11

> And could you verify whether the issue also happens when a fresh VM is provisioned with 3227.2.0? Or does it require updating.

Not 100% sure I have the same issue, but I had problems with systemd-resolved not respecting DHCP settings, and it was 100% on a new VM (no upgrades involved at all).

See https://kubernetes.slack.com/archives/C03GQ8B5XNJ/p1658762999205499

jepio commented Aug 1, 2022

Hi @salfter,
any chance you could provide some more information? We would really love to track this down and fix it before the next bugfix release.

salfter commented Sep 1, 2022

This issue got away from me for a bit, but whatever was broken was fixed when I grabbed the most recent image this evening and ran through some of the steps in my Flatcar-on-Linode guide to get it up and running again:

https://alfter.us/2021/09/20/installing-flatcar-container-linux-on-linode/

Hopefully it was a one-time glitch, as this node had been through several updates previously without any problems. Hopefully it will be back to that behavior now.

@salfter salfter closed this as completed Sep 1, 2022
Repository owner moved this from Planned / ToDo to Implemented in Flatcar tactical, release planning, and roadmap Sep 1, 2022