Make ECK IPv6 compatible #3654

Merged (20 commits, Sep 4, 2020)

Conversation

@pebrc (Collaborator) commented Aug 24, 2020

Fixes #3649

Summary of changes

  • put literal IPv6 addresses in brackets when used in URLs or in literal form (e.g. in readiness probes); see the sketch after this list
  • switch to the correct variant of well-known IPs like the loopback or zero-valued IP
  • allow IPv6 in SAN validation and CSR
  • switch from an IP-based network.publish_host to a DNS-based one:
    • Reason: we used to inject the Pod IP via the downward API, which does not bracket IPv6 addresses, leading to invalid configuration
    • we were discussing switching to a DNS-based publish_host anyway
    • sticking with IPs would require us to change the publish_host once the Pod gets its IP and force another delete/create cycle for every node, which seems worse than just using a DNS name to begin with
  • Kibana and Enterprise Search are sensitive to the correct specification of the 'unspecified', i.e. all-zero, IP address (INADDR_ANY) used to bind to all addresses on the local machine
    • this PR introduces a form of IPv6 autodetection on operator startup by injecting the operator's IP into the environment
    • this information is then passed into the configuration generation code for both applications to insert the correct IP
    • we could amend this mechanism in the future, for example by introducing an IPFamily attribute on the corresponding CRDs to allow users to express a preference in dual-stack environments, or by adding a global flag at the operator level to do the same
  • 59c217f changes the discovery mechanism from IPs to DNS names. This change is not strictly necessary to support IPv6; I only changed it for consistency (see also Anya's comment on #3654 below). It might carry some risk of slow cluster formation due to cached negative DNS lookups of not (yet) existing Pods, and I am more than happy to revert it. We discussed this out of band, and the only potential benefit is that we might be able to remove the IP addresses from the certificates we generate and reduce the number of times we have to regenerate them. At the same time, this benefit is probably limited, as certificates are one of the things Elasticsearch can reload on the fly. @nkvoll, what are your thoughts for or against DNS-based discovery?
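
As a concrete illustration of the bracketing point above, Go's standard library already handles this; a minimal sketch (the addresses and port are made up for the example):

package main

import (
	"fmt"
	"net"
	"net/url"
)

func main() {
	// net.JoinHostPort brackets IPv6 literals but leaves IPv4 untouched.
	fmt.Println(net.JoinHostPort("2001:db8::1", "9200")) // [2001:db8::1]:9200
	fmt.Println(net.JoinHostPort("10.0.0.1", "9200"))    // 10.0.0.1:9200

	// The bracketed form is what must end up in URLs, e.g. in readiness probes.
	u := url.URL{Scheme: "https", Host: net.JoinHostPort("2001:db8::1", "9200"), Path: "/"}
	fmt.Println(u.String()) // https://[2001:db8::1]:9200/
}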

Implications of the switch to DNS names

@sebgl's questions from #2830 still apply:

can DNS caching be an issue for nodes to discover themselves?

yes, negative lookup caching can be an issue: if a Pod was just created, we might see a delay until its name can be resolved, if some entity did a negative lookup just before it was created

are there cases (security constraints) where Pods are not allowed/not supposed to rely on DNS?

I don't know.

any issue with the DNS server will lead to Elasticsearch not behaving correctly: are we ok with that?

I would assume that any issues with k8s DNS server will have more widespread effects than just Elasticsearch not behaving correctly, so I would say yes

what's the correct DNS name to use?

I believe it is <pod-name>.<statefulset-name> based on this

Breaking?

Should we mark this as breaking because we now include non-ready Pods in the headless service? I am on the fence: I believe the headless service should be considered an implementation detail and is not meant for "public consumption". Having said that, I am sure there are users already relying on it, so at the very least we should highlight it in the release notes.

Testing

You need to run this against kind under Linux to test the IPv6 part (or set up your own vanilla IPv6 k8s cluster from scratch).
I used the following spec:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  ipFamily: ipv6
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker

after reconfiguring Docker to use IPv6
cat /etc/docker/daemon.json

{
  "ipv6": true,
  "fixed-cidr-v6": "2001:db8:1::/64"
}
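
With both in place, the IPv6 cluster can then be created with kind create cluster --config <file>, assuming the spec above is saved to a file.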

I am thinking we should introduce another test job to run the e2e tests against an IPv6 kind cluster (or change the existing job to IPv6). That will be a separate PR though.

@botelastic botelastic bot added the triage label Aug 24, 2020
@pebrc pebrc added the >bug Something isn't working label Aug 24, 2020
@botelastic botelastic bot removed the triage label Aug 24, 2020
@pebrc pebrc added the v1.3.0 label Aug 24, 2020
@@ -54,6 +54,8 @@ func HeadlessService(es *esv1.Elasticsearch, ssetName string) corev1.Service {
 				Port: network.HTTPPort,
 			},
 		},
+		// allow nodes to discover themselves via DNS while they are booting up ie. are not ready yet
+		PublishNotReadyAddresses: true,
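
For readers less familiar with the field: in the rendered headless Service this lands under spec, roughly as follows (a sketch with illustrative names, not ECK's exact output):

apiVersion: v1
kind: Service
metadata:
  name: my-cluster-es-default       # headless service backing one StatefulSet (name illustrative)
spec:
  clusterIP: None
  publishNotReadyAddresses: true    # DNS records are published before Pods become ready
  selector:
    elasticsearch.k8s.elastic.co/statefulset-name: my-cluster-es-default
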
A contributor commented:

I think we will want to flag this in the release notes since it could cause issues if people were relying on the old behavior

@anyasabo anyasabo added the release-highlight Candidate for the ECK release highlight summary label Aug 26, 2020
@anyasabo (Contributor) commented Aug 26, 2020

This approach LGTM 👍

Question: Should we use DNS names in discovery as well to be consistent?

I think yes? But I haven't thought about it too much.

This PR also allows us to pick up #2833 since it adds DNS names to the cert

@pebrc pebrc marked this pull request as ready for review August 27, 2020 12:58
@pebrc pebrc added >breaking and removed >breaking labels Aug 27, 2020
@anyasabo (Contributor) commented:

I was curious what effects negative caching might have, and it turned out to be a bit complicated. CoreDNS has a 5s TTL by default, BUT the default config shipped with k8s has a 30s TTL (to match the dnsmasq-based kube-dns).

If you run coredns in cache mode (as you would for node local DNS), the default is 30 minutes for nxdomain responses. But in GKE, they configure node local DNS to have a 5s TTL for nxdomain responses.

Openshift (v4 at least) also runs with a 30s TTL.

It appears that most configs would have a 5-30s TTL, which seems fine to me.
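
For reference, the TTL in question is the cache directive in the CoreDNS Corefile; a sketch of the stock kubernetes config (abbreviated):

.:53 {
    errors
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
    cache 30    # max TTL in seconds, applied to positive and negative answers
    loop
}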

@pebrc (Collaborator, Author) commented Aug 27, 2020

Elasticsearch/the JVM caches negative lookups for 10 seconds and positive ones for 60 seconds in recent versions: https://www.elastic.co/guide/en/elasticsearch/reference/current/networkaddress-cache-ttl.html
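
Per the linked docs, those defaults can be overridden via system properties in jvm.options, e.g.:

-Des.networkaddress.cache.ttl=60
-Des.networkaddress.cache.negative.ttl=10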

@barkbay (Contributor) left a comment:

I'm still doing some tests. I just left a minor comment, but feel free to ignore it as I'm not an IPv6 expert.

args: args{
ipStr: "::FFFF:129.144.52.38",
},
want: corev1.IPv4Protocol,
@barkbay (Contributor) commented Aug 31, 2020:

nit: I must confess I spent some time trying to understand the difference between IPv4-embedded and IPv4-mapped IPv6 addresses, and wondering whether this is what I would expect from ToIPFamily, since ::FFFF:129.144.52.38 is an IPv6 address.

But I think I'm 👍 with the current code.

@pebrc (Collaborator, Author) replied:

Yes, I found that confusing too, but I am following net.ParseIP's behaviour here.
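
A quick illustration of that behaviour, using only the standard library:

package main

import (
	"fmt"
	"net"
)

func main() {
	// IPv4-mapped IPv6 addresses convert to the 4-byte form, so To4() is
	// non-nil and net.ParseIP effectively treats them as IPv4.
	ip := net.ParseIP("::FFFF:129.144.52.38")
	fmt.Println(ip.To4()) // 129.144.52.38
}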

func MaybeIPTo4(ipAddress net.IP) net.IP {
	if ip := ipAddress.To4(); ip != nil {
		// use the 4-byte representation for anything that is (or maps to) IPv4
		return ip
	}
	return ipAddress
}

// IPToRFCForm normalizes the IP address given to fit the expected network byte order octet form described in
// https://tools.ietf.org/html/rfc5280#section-4.2.1.6
A contributor commented:

IIUC, in addition to the byte-ordering requirement, we also want to avoid the compressed zero form of IPv6 addresses using "::"? Maybe we can also explain this in the comment?

@pebrc (Collaborator, Author) replied:

Well, it is not actually removing the compressed form, which is fine, with the exception of Enterprise Search, which currently has an issue with compressed IPv6 addresses. I missed this because I was testing with manually configured overrides 😞 So thanks for pointing this out; I will fix it in another commit.

@nkvoll (Member) commented Sep 1, 2020

I think DNS-based discovery makes sense. ES wants to discover a node with a name (which also usually has a PVC attached). The Pod is ephemeral and doesn't matter, and the same ephemerality thus applies to its IP. Having bad TTLs for positive and negative hits seems like an operator error that would also cause issues for other services.

I would vote for creating another specific issue/PR for the switch to DNS-based discovery so it gets a little more (deserved) visibility.

@@ -440,6 +441,7 @@ func startOperator(stopChan <-chan struct{}) error {
 	}
 	params := operator.Parameters{
 		Dialer: dialer,
+		IPFamily: net.ToIPFamily(os.Getenv(settings.EnvPodIP)),
A contributor commented:

I'm wondering if we should expose that as an operator config setting that defaults to autodetect if empty, e.g. --ip-family=ipv4|ipv6|"", so users can override it if, for some reason, the IP family of the operator is not the IP family they want to use for the managed workloads.

@pebrc (Collaborator, Author) replied:

I thought this would be a future addition, but I think you are right: it would be good to have an escape hatch right from the start. Will add it.
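
For context, autodetecting the IP family from the injected Pod IP can look roughly like the following sketch (the function name mirrors the diff above, the package name is an assumption; the exact ECK implementation may differ):

package netutil

import (
	"net"

	corev1 "k8s.io/api/core/v1"
)

// ToIPFamily infers the IP family from a textual IP address, e.g. the
// operator's own Pod IP injected via the downward API. IPv4-mapped IPv6
// addresses are treated as IPv4, matching net.ParseIP; unparseable input
// falls back to IPv4.
func ToIPFamily(ipStr string) corev1.IPFamily {
	if ip := net.ParseIP(ipStr); ip != nil && ip.To4() == nil {
		return corev1.IPv6Protocol
	}
	return corev1.IPv4Protocol
}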

if ipFamily == corev1.IPv4Protocol {
	return net.IPv4zero.String()
}
// Enterprise Search even in its most recent version 7.9.0 cannot properly handle contracted IPv6 addresses like "::"
A contributor commented:

sounds like we should open an issue in the enterprise search repo

@pebrc (Collaborator, Author) replied:

Already done.
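
Since net.IP.String() in Go always produces the compressed form, returning an uncompressed all-zero address means expanding it manually; a minimal sketch with a hypothetical helper (not necessarily ECK's exact code):

package netutil

import (
	"fmt"
	"net"
	"strings"
)

// expandIPv6 returns the fully expanded textual form of an IPv6 address,
// e.g. "::" becomes "0000:0000:0000:0000:0000:0000:0000:0000".
// It assumes a valid, non-nil IP.
func expandIPv6(ip net.IP) string {
	ip16 := ip.To16()
	groups := make([]string, 8)
	for i := range groups {
		groups[i] = fmt.Sprintf("%02x%02x", ip16[2*i], ip16[2*i+1])
	}
	return strings.Join(groups, ":")
}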

@barkbay (Contributor) left a comment:

I did some tests: deployed Elasticsearch, Kibana, APM Server, Enterprise Search, and Filebeat on kind as suggested, and it worked as expected.

Also 👍 to adding a dedicated e2e test pipeline.

Regarding the questions around DNS caching, I'm not sure I have a good answer. It might happen at the infra level; I guess understanding how it would disrupt restarting Pods requires understanding how publish_host is used by the stack and the clients? Should we do some tests on an infrastructure that makes aggressive use of DNS caching to understand the edge cases?


Resolved (outdated) review threads: pkg/controller/elasticsearch/nodespec/readiness_probe.go, pkg/controller/kibana/config_settings.go
@pebrc pebrc merged commit 8cf990a into elastic:master Sep 4, 2020
Labels: >breaking, >bug (Something isn't working), release-highlight (Candidate for the ECK release highlight summary), v1.3.0
Successfully merging this pull request may close: Make ECK IPv6 compatible (#3649)
6 participants