Handle subnet lease getting expired #29
Comments
Is there any work under way for this? It'd be incredibly useful: right now, if a machine loses its lease and gets a new one, any containers on that machine are left with no network connectivity.
One implementation idea for this is in #610
Also see #520 for some good questions about how flannel handles this at the moment.
When fixing this, we should make sure this failure scenario is discussed clearly in the docs.
FWIW, the system design that we've converged on for Cloud Foundry is that hosts are preferentially assigned their prior lease, even if it "expired." And if a new host appears, it is assigned a lease in the following priority order:
This is meant to minimize the probability that a lease is "stolen" from a live, but partitioned, container host. But if that does occur, once the partition heals and the "victim" host re-connects, it will discover that its lease is no longer valid. In this case, the victim host falls into a special, noisy failure mode which will (1) prevent any new workloads from being scheduled and (2) trigger the orchestration system to evacuate any existing workloads. Once the evacuation is complete, the host will clean up any leftover networking state (e.g. remove the VXLAN device), acquire a new lease for itself and begin accepting new workloads. We think this is the right plan. Feedback welcome.
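The recovery sequence described above can be sketched as a small state machine. This is only an illustration of the described behavior, not Cloud Foundry's or flannel's actual code: the `Host` type, the state names, the `Step` method, and the `acquire` callback are all hypothetical names introduced here.

```go
package main

import "fmt"

// State is a hypothetical host state; the names are illustrative only.
type State int

const (
	Healthy    State = iota // lease valid, accepting workloads
	Evacuating              // lease lost: no new workloads, existing ones being moved
	CleaningUp              // workloads gone: tear down network state, get a new lease
)

var stateNames = []string{"Healthy", "Evacuating", "CleaningUp"}

// Host is a hypothetical, simplified model of a container host.
type Host struct {
	State     State
	Subnet    string
	Workloads int
}

// Schedulable reports whether new workloads may be placed on the host.
func (h *Host) Schedulable() bool { return h.State == Healthy }

// Step advances the host by one transition. leaseValid is the result of
// checking the current lease; acquire is an assumed callback that returns
// a freshly leased subnet.
func (h *Host) Step(leaseValid bool, acquire func() string) {
	switch h.State {
	case Healthy:
		if !leaseValid {
			h.State = Evacuating // the noisy failure mode: stop scheduling
		}
	case Evacuating:
		if h.Workloads == 0 {
			h.State = CleaningUp // evacuation complete
		}
	case CleaningUp:
		// Real cleanup (e.g. removing the VXLAN device) would happen here.
		// leaseValid is ignored: the old lease is already known to be gone.
		h.Subnet = acquire()
		h.State = Healthy
	}
}

func main() {
	h := &Host{Subnet: "10.1.1.0/24", Workloads: 2}
	acquire := func() string { return "10.1.2.0/24" }

	for i := 0; i < 5; i++ {
		h.Step(false, acquire)
		fmt.Println(stateNames[h.State], h.Subnet, "schedulable:", h.Schedulable())
		if h.State == Healthy {
			break
		}
		if h.Workloads > 0 {
			h.Workloads-- // stand-in for the orchestrator evacuating one workload
		}
	}
}
```

The point of keeping `Evacuating` separate from `CleaningUp` is that network state is only torn down once no workload depends on it.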
Added feature to allow flannel to restart in case of etcd failures and still keep the same subnet address for the hosts. Fixes flannel-io#610 flannel-io#29
This is now fixed in v0.8.0
Although flannel starts renewing the lease an hour prior to expiration, the lease can still be lost, e.g. if the VM is suspended. Flannel should try to get the same subnet assignment if it's still available, but fall back to a new lease and signal the fact.
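A minimal sketch of the requested behavior, assuming an in-memory lease table in place of flannel's real backing store: the `Registry` type, `NewRegistry`, and `Acquire` are hypothetical names for illustration and are not flannel's API. `Acquire` prefers the host's previous subnet when it is free (or still owned by that host) and otherwise falls back to any free subnet, returning a `reused` flag so the caller can signal the change.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoFreeSubnet is returned when every subnet in the pool is leased.
var ErrNoFreeSubnet = errors.New("no free subnet in pool")

// Registry is a hypothetical in-memory lease table (subnet -> owning host).
type Registry struct {
	pool   []string
	leases map[string]string
}

// NewRegistry creates an empty lease table over the given subnet pool.
func NewRegistry(pool []string) *Registry {
	return &Registry{pool: pool, leases: make(map[string]string)}
}

// inPool reports whether subnet belongs to the configured pool.
func (r *Registry) inPool(subnet string) bool {
	for _, s := range r.pool {
		if s == subnet {
			return true
		}
	}
	return false
}

// Acquire leases a subnet for host. It tries the previous subnet first and
// falls back to the first free subnet. reused is false when the host ended
// up with a different subnet than it had before.
func (r *Registry) Acquire(host, previous string) (subnet string, reused bool, err error) {
	if previous != "" && r.inPool(previous) {
		if owner, taken := r.leases[previous]; !taken || owner == host {
			r.leases[previous] = host
			return previous, true, nil
		}
	}
	for _, s := range r.pool {
		if _, taken := r.leases[s]; !taken {
			r.leases[s] = host
			return s, false, nil
		}
	}
	return "", false, ErrNoFreeSubnet
}

func main() {
	r := NewRegistry([]string{"10.1.1.0/24", "10.1.2.0/24"})
	// Simulate host-a's old subnet being handed to host-b while host-a was suspended.
	r.leases["10.1.1.0/24"] = "host-b"

	subnet, reused, err := r.Acquire("host-a", "10.1.1.0/24")
	if err != nil {
		fmt.Println("acquire failed:", err)
		return
	}
	if !reused {
		// "Signal the fact": here just a log line; a real daemon might emit an event.
		fmt.Printf("WARNING: previous subnet unavailable, now using %s\n", subnet)
	}
}
```

Returning `reused` instead of logging inside `Acquire` leaves the choice of signal (log line, event, restart) to the caller.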