Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flannel reads from created subnet.env file on startup #752

Merged
merged 1 commit into from
Jun 22, 2017

Conversation

mgleung
Copy link
Contributor

@mgleung mgleung commented Jun 19, 2017

Description

Added feature to allow flannel to restart in case of etcd failures and
still keep the same subnet address for the hosts.

Fixes #610 #29

Testing

Added unit tests in subnet_test.go

Ran etcd locally and killed the process. Made sure that restarting flannel after an etcd restart produced the same subnet address on the lease.

Ran etcd locally and killed the process. When restarting etcd, changed the network configuration so the previous subnet address for the lease no longer matched the SubnetLen. On running flannel, checked that a new subnet was allocated for the lease with the correct subnet length.

Ran etcd locally and killed the process. When restarting etcd, changed the network address so the previous subnet address did not fall within it. On running flannel, checked that a new subnet was allocated for the lease in the appropriate range.

@mgleung mgleung force-pushed the prev-lease branch 3 times, most recently from 26e0fc1 to 7743549 Compare June 19, 2017 20:52
@tomdee
Copy link
Contributor

tomdee commented Jun 19, 2017

Can you include the documentation change in this PR too?

@mgleung mgleung force-pushed the prev-lease branch 4 times, most recently from 7bbf2e3 to cb92205 Compare June 19, 2017 22:46
@mgleung
Copy link
Contributor Author

mgleung commented Jun 19, 2017

Added some documentation. Let me know if it is clear enough.

Copy link
Contributor

@tomdee tomdee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few changes needed for the docs and some questions

@@ -25,6 +25,16 @@ This shows that there is a single lease (`10.5.34.0/24`) which will expire in 85
The `"PublicIP"` value is how flannel knows to reuse this lease when restarted.
This means that if the public IP changes, then the flannel subnet will change too.

In case something happens to etcd and it is either unavailable or loses the record of its leases, flannel will then attempt to renew the last lease that it has saved in its subnet config file (which, unless specified, is located at `/var/run/flannel/subnet.env`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this wording - if etcd is unavailable then flannel can't start, and why would etcd lose the record of its leases?

FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
```
In this case, if flannel fails to retrieve a lease from etcd, it will attempt to renew lease specified in `FLANNEL_SUBNET` (`10.5.34.1/24`). It will only renew this lease if the subnet specified is valid for the current etcd network configuration otherwise it will allocate a new lease.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retrieve an existing lease from etcd,

@@ -46,6 +46,7 @@ FLANNEL_SUBNET=10.5.72.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
```
The `FLANNEL_SUBNET` value written in this subnet config file will be checked each time flannel starts and will be reused if the subnet still matches the values stored in the etcd config.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit ambiguous so will be confusing for uses. Can you make it clearer and more explicit?

main.go Outdated
@@ -410,3 +414,17 @@ func mustRunHealthz() {
panic(err)
}
}

func ReadSubnetFromSubnetFile(path string, config *etcdv2.EtcdConfig) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're adding a new library for reading the file, why not use the same library for writing the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

godotenv only reads from env files sadly

@@ -33,7 +33,8 @@ const (
)

type LocalManager struct {
registry Registry
registry Registry
previousSubnet ip.IP4Net
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reasoning for choosing this mechanism for passing the previousSubnet through to the tryAcquireLease function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid changing the AcquireLease function to accept additional arguments since that would require the kube manager to also be updated when it does not need a subnet ip (since it is not affected by this). I could change it but I wanted to avoid changing the interface signatures.

Prefix string
Username string
Password string
PreviousSubnet ip.IP4Net
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning behind adding this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I was adding a variable to the local manager, I thought that since it was currently being built according to the values passed in from this EtcdConfig that this would be the best place to pass in that value. Looking back at it, this config really only applies to values used to set up a connection to etcd so the value should be moved out for organizational reasons. Downside of moving it out is it clutters up the NewLocalManager constructor a little.

@tomdee
Copy link
Contributor

tomdee commented Jun 22, 2017

@mgleung as discussed offline, this looks good to go but it's back with you to make the change in the EtcdConfig

Added feature to allow flannel to restart in case of etcd failures and
still keep the same subnet address for the hosts.

Fixes flannel-io#610 flannel-io#29
@tomdee tomdee merged commit f2d2304 into flannel-io:master Jun 22, 2017
@mgleung mgleung deleted the prev-lease branch June 22, 2017 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants