We experimented with "full" TLS verification in #2659, then switched back to "certificate" in #2831.
A bug in Elasticsearch causes nodes to present an outdated certificate with the wrong IP address during and after rolling upgrades.
As a result, with full TLS verification enabled, nodes cannot trust each other when the bug occurs.
Once we reintroduce full verification, we must make sure we don't break existing clusters impacted by the bug, even though we have a workaround for that bug. What may happen (that we want to avoid) is the following:
1. The change to full TLS verification (and the TLS cert bugfix) triggers a rolling upgrade of the cluster.
2. The first Pod starts with full verification enabled: it cannot contact the other nodes in the cluster that are impacted by the TLS cert bug.
3. The rolling upgrade can never progress.
We may want to perform a double rolling upgrade instead, where we first upgrade all nodes to include the TLS certs bugfix, then upgrade all nodes to enable full TLS verification:
1. A cluster was created with ECK 1.0. It may or may not include Pods impacted by the bug, serving a certificate with the wrong IP.
2. ECK is upgraded to a version that includes the init container fix.
3. The cluster gets a rolling upgrade. Once done, we are fairly confident no Pod is impacted by the TLS certs bug. At this point TLS verification is still set to "certificate".
4. The cluster gets a second rolling upgrade to set TLS verification to "full", because we know no Pod is impacted by the TLS certs bug.
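
The gating logic behind the double rolling upgrade could look roughly like the sketch below. It is only an illustration: the `Pod` type, the `HasCertFix` field, and the `verificationMode` function are hypothetical names, and how the operator would actually detect that a Pod runs with the fix is left open.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified view of an Elasticsearch Pod.
type Pod struct {
	Name string
	// HasCertFix is assumed to be true once the Pod has been restarted
	// with the TLS certs bugfix (first rolling upgrade).
	HasCertFix bool
}

// verificationMode returns the TLS verification mode that is safe to roll out:
// "full" only once every Pod includes the fix, "certificate" otherwise.
func verificationMode(pods []Pod) string {
	for _, p := range pods {
		if !p.HasCertFix {
			return "certificate"
		}
	}
	return "full"
}

func main() {
	pods := []Pod{{Name: "es-0", HasCertFix: true}, {Name: "es-1", HasCertFix: false}}
	fmt.Println(verificationMode(pods)) // certificate: first rolling upgrade not finished

	pods[1].HasCertFix = true
	fmt.Println(verificationMode(pods)) // full: second rolling upgrade can start
}
```

With this kind of gate, the switch to "full" never reaches a Pod while some of its peers may still serve a certificate with the wrong IP.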
Related to #2823.
We eventually want to reintroduce full verification, but we need to make sure we correctly handle the bug above. One option is to rely on DNS names instead of IP addresses, another one is to check the IP address in certificates before starting Elasticsearch.