Use Pod DNS names instead of IP addresses for discovery and TLS certificates #2830

Closed
sebgl opened this issue Apr 8, 2020 · 3 comments
Labels
>enhancement Enhancement of existing functionality

Comments

@sebgl
Contributor

sebgl commented Apr 8, 2020

Related to #2823.

We configure nodes to advertise themselves and discover each other through their IP addresses.
For that reason, we are likely to regenerate TLS certificates whenever a Pod is restarted, because its IP address changes.

This also does not help with the bug where Elasticsearch may serve a certificate associated with the wrong IP address.

Things would be simpler if we only manipulated Pod DNS names instead of IP addresses. It would also make more sense for TLS certificates, which are usually bound to a DNS name, not an IP address.
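To illustrate the certificate point, here is a minimal sketch (not ECK's actual reconciliation logic) of why a certificate whose subject alternative names include the Pod IP goes stale after a restart, while one bound only to a DNS name does not. The `PodIdentity` type, the `certificate_is_stale` function, and the sample names and addresses are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PodIdentity:
    """Hypothetical view of what identifies a Pod on the network."""
    dns_name: str  # stable across restarts for StatefulSet Pods
    ip: str        # may change whenever the Pod is restarted


def certificate_is_stale(cert_sans: set[str], pod: PodIdentity, use_ip: bool) -> bool:
    """Return True if the certificate SANs no longer cover the Pod's identity."""
    expected = {pod.dns_name}
    if use_ip:
        expected.add(pod.ip)
    return not expected.issubset(cert_sans)


# Same Pod before and after a restart: the name is kept, the IP is not.
before = PodIdentity(dns_name="es-default-0.es-default", ip="10.0.0.5")
after = PodIdentity(dns_name="es-default-0.es-default", ip="10.0.0.9")

ip_bound_sans = {before.dns_name, before.ip}  # certificate issued with the IP
dns_bound_sans = {before.dns_name}            # certificate issued with DNS only

print(certificate_is_stale(ip_bound_sans, after, use_ip=True))
print(certificate_is_stale(dns_bound_sans, after, use_ip=False))
```

With IP-bound SANs the restart forces a new certificate; with DNS-only SANs the existing one stays valid as long as the Pod keeps its name.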

Questions (to investigate):

  • can DNS caching be an issue for nodes discovering each other?
  • are there cases (security constraints) where Pods are not allowed/not supposed to rely on DNS?
  • any issue with the DNS server will lead to Elasticsearch not behaving correctly: are we ok with that?
  • what's the correct DNS name to use? <pod-name>, <pod-name>.<namespace>
@anyasabo
Contributor

I started poking at this but won't be picking it up any time soon, so I'm leaving notes here in case they're helpful:
master...anyasabo:pubhost

One thing I think we will need to account for is that sset pods are given a name pod-name.sset-service-name, and we use separate services for each sset (rather than one single headless service that all of the ssets use). I think this is fine, and makes it easier for people who want to target only a specific set of nodes (e.g. ingress), but we will need to make some plumbing changes on our side so that the service name is available.
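As a rough sketch of the naming scheme described above, the helper below builds the `pod-name.sset-service-name` form and, optionally, the fully qualified form that Kubernetes documents for StatefulSet Pods behind a headless service. The function name and sample values are hypothetical, and `cluster.local` is only the common default cluster domain, so it is an assumption here.

```python
from typing import Optional


def pod_dns_name(
    pod_name: str,
    sset_service: str,
    namespace: Optional[str] = None,
    cluster_domain: str = "cluster.local",  # assumption: default cluster domain
) -> str:
    """Build a DNS name for a StatefulSet Pod behind its own headless service."""
    short = f"{pod_name}.{sset_service}"
    if namespace is None:
        # Short form, resolvable from within the same namespace.
        return short
    # Fully qualified form: <pod>.<service>.<namespace>.svc.<cluster-domain>
    return f"{short}.{namespace}.svc.{cluster_domain}"


print(pod_dns_name("es-masters-0", "es-masters"))
print(pod_dns_name("es-masters-0", "es-masters", namespace="default"))
```

Because each StatefulSet has its own service, the service name has to be passed in alongside the Pod name, which is the plumbing change mentioned above.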

The other thing I noticed is that when ES starts up, it verifies it can resolve the name you tell it to publish. Currently our services do not publish not-ready pods, so the lookup fails and ES fails to start. We can change this on the service though and I do not see a downside to doing so. If people want different behavior they can create their own service.
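The service change mentioned above maps to the `publishNotReadyAddresses` field of the Kubernetes Service spec. The following is only a sketch of what such a headless service manifest might look like, expressed as a Python dictionary; the helper name, labels, and resource names are hypothetical, and this is not necessarily what the operator generates.

```python
from typing import Any, Dict, Mapping


def headless_service_manifest(
    name: str,
    namespace: str,
    selector: Mapping[str, str],
    port: int = 9300,  # Elasticsearch's default transport port
) -> Dict[str, Any]:
    """Build a headless Service that also publishes not-ready Pods in DNS."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Headless: DNS returns per-Pod records instead of a virtual IP.
            "clusterIP": "None",
            # Publish DNS records before Pods pass readiness, so the name
            # Elasticsearch is told to publish resolves at startup.
            "publishNotReadyAddresses": True,
            "selector": dict(selector),
            "ports": [{"name": "transport", "port": port}],
        },
    }


manifest = headless_service_manifest(
    name="es-masters",
    namespace="default",
    selector={"example.org/statefulset-name": "es-masters"},  # hypothetical label
)
print(manifest["spec"]["publishNotReadyAddresses"])
```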

@pebrc
Collaborator

pebrc commented Sep 10, 2020

Just to update the discussion with the current state: when we moved to a DNS-based publish_host for IPv6 support, we ran into issues with stale DNS entries that cause Elasticsearch nodes to start up with an old IP for their Pod. That IP is not routable, and the condition permanently prevents the node from joining the Elasticsearch cluster; see #3723 (comment).

We should investigate whether we can solve this problem by somehow avoiding this first DNS lookup, or by making sure it is always correct by relying on the /etc/hosts file, as suggested in the linked issue.
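To make the /etc/hosts idea concrete, here is a small sketch that resolves a Pod's own name from hosts-file content instead of querying DNS. It assumes the hosts file inside the Pod holds the Pod's current IP next to its hostname; the function name and sample content are hypothetical, and whether Elasticsearch can be made to rely on this is exactly the open question.

```python
from typing import Optional


def lookup_in_hosts(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the first IP mapped to `hostname` in hosts-file content, if any."""
    for raw_line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname in names:
            return ip
    return None


# Hypothetical hosts-file content for a Pod named es-default-0.
sample_hosts = """\
# Kubernetes-managed hosts file.
127.0.0.1	localhost
10.0.0.9	es-default-0.es-default.default.svc.cluster.local	es-default-0
"""

print(lookup_in_hosts(sample_hosts, "es-default-0"))
```

In a real Pod the content would be read from `/etc/hosts` rather than from a string.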

@pebrc
Collaborator

pebrc commented Oct 20, 2021

I am closing this based on #2830 (comment). We can reopen it if we want to reconsider and think it is worth taking another crack at it.

pebrc closed this as completed Oct 20, 2021