Delay Kibana version upgrade until Elasticsearch is fully upgraded #2353
I'm a +1 for this. Note that for the versions we support, ES and Kibana are compatible if and only if their major and minor versions are the same. We've had other issues with out-of-order upgrades. We should also implement this for APM, in my opinion (though we need to block on both Kibana and ES). See the documented upgrade order.
One implementation detail that might be worth discussing: should this be a webhook validation? My initial thinking was that it should be, to make the problem obvious to the user. But that may be challenging for, say, new deployments where you apply Kibana and Elasticsearch at the same time at the same version. We could let validation pass when a Kibana resource references an Elasticsearch resource that does not exist yet to alleviate that, but then it becomes hard to validate if the Elasticsearch resource is created later with an incompatible version. That might still be acceptable: the webhook would still catch the vast majority of issues. The alternative I see is to not validate at admission, but instead check the version of the referenced Elasticsearch during reconciliation before upgrading, and emit a warning if they are incompatible. That seems harder to implement, and it is harder for the user to see that there is a problem. A third option is to do both: the webhook catches the vast majority of issues, and the reconciliation check catches the remainder.
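To make the trade-off concrete, here is a minimal sketch of what the admission-time check could look like, assuming a simple semver comparison and allowing the "Elasticsearch does not exist yet" case through. The function and package names are illustrative, not ECK's actual webhook code, and blang/semver is used only for parsing.

```go
package validation

import (
	"fmt"

	"github.com/blang/semver/v4"
)

// validateKibanaVersion is a hypothetical admission check: reject a Kibana
// spec whose major.minor is ahead of the referenced Elasticsearch. If the
// Elasticsearch resource does not exist yet (e.g. Kibana and ES applied
// together), the check passes and a reconciliation-time check would have
// to catch any later mismatch.
func validateKibanaVersion(kbVersion, esVersion string, esFound bool) error {
	if !esFound {
		return nil // allow: the ES resource may simply not be created yet
	}
	kb, err := semver.Parse(kbVersion)
	if err != nil {
		return fmt.Errorf("invalid Kibana version %q: %w", kbVersion, err)
	}
	es, err := semver.Parse(esVersion)
	if err != nil {
		return fmt.Errorf("invalid Elasticsearch version %q: %w", esVersion, err)
	}
	if kb.Major > es.Major || (kb.Major == es.Major && kb.Minor > es.Minor) {
		return fmt.Errorf("Kibana %s is ahead of Elasticsearch %s: upgrade Elasticsearch first", kbVersion, esVersion)
	}
	return nil
}
```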
Note that there is another compatibility matrix which offers a bit more flexibility, supporting minor-version upgrades as long as you upgrade Elasticsearch first.
As discussed over Zoom, we were curious about situations where Kibana does not automatically recover once Elasticsearch is upgraded. Going through this web of issues, we ran into one such case. There is an enhancement issue open in Kibana (which mentions it was possible to hit this by disabling allocation in ES before upgrading Kibana, which I couldn't reproduce), and the original Kibana issue has a number of people chiming in. Kibana has docs on how to resolve the situation manually. So I think we need to nail down exactly which situations lead to issues in ECK to decide whether this is something we want to guard against. If Kibana comes up on its own without intervention once ES is upgraded, I'm okay with the current behaviour and don't think we need to change anything in ECK.
I left some notes in #2352 (comment). The problem does not occur when upgrading step by step from 6.8.0 to 7.1.0 to 7.2.0 to 7.3.0 to 7.4.0 to 7.5.0. In that situation, Kibana is unavailable during the Elasticsearch upgrade since it detects that the Elasticsearch version differs, but it becomes available as soon as the Elasticsearch version upgrade is over.
The way the Elasticsearch and Kibana controllers are decoupled in ECK does not make it easy to guarantee a version upgrade order ("don't upgrade Kibana until ES is fully upgraded"). If we want to keep that separation (the Kibana controller has nothing to do with Elasticsearch resources), a safeguard on the version upgrade would require some communication between the Kibana controller and the Elasticsearch-Kibana association controller. The association controller could indicate (through an annotation?) on the Kibana resource the highest allowed Kibana version, based on the lowest Elasticsearch version currently running. The Kibana controller would then not upgrade the existing Kibana deployment if the desired Kibana version does not (yet) match that annotation.
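A rough sketch of that idea, assuming the association controller publishes the constraint as an annotation on the Kibana resource. The annotation key and helper below are hypothetical, not part of ECK.

```go
package kibana

import "github.com/blang/semver/v4"

// Hypothetical annotation the association controller would set on the Kibana
// resource, holding the highest Kibana version allowed by the lowest
// Elasticsearch node version currently running.
const maxAllowedVersionAnnotation = "association.k8s.elastic.co/max-allowed-kibana-version"

// shouldDelayUpgrade reports whether the Kibana controller should hold off
// upgrading the deployment to desiredVersion, based on that annotation.
func shouldDelayUpgrade(annotations map[string]string, desiredVersion string) (bool, error) {
	allowed, ok := annotations[maxAllowedVersionAnnotation]
	if !ok {
		return false, nil // no constraint published yet: proceed as today
	}
	desired, err := semver.Parse(desiredVersion)
	if err != nil {
		return false, err
	}
	maxAllowed, err := semver.Parse(allowed)
	if err != nil {
		return false, err
	}
	// Delay while the desired version is higher than what Elasticsearch currently supports.
	return desired.GT(maxAllowed), nil
}
```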
There was a regression in Kibana 7.5 where we no longer waited for all ES nodes to be of a compatible version before starting migrations. 7.6 includes a fix for this.
@rudolf Do I understand correctly that this means we don't need to handle this at all (i.e. there is no risk if Kibana is upgraded first)? Do you happen to know if APM/Beats have this kind of check too?
Yes, upgrading all nodes at the same time is supported (the error behaviour noted in this thread was a bug). Just to be clear: the old Kibana node should be taken down first, and only then can the ES and Kibana nodes be upgraded. Kibana polls Elasticsearch every 2.5 seconds by default, so if an outdated Kibana node is left running while ES is being upgraded there is a 2.5-second window of potentially undefined behaviour. However, once Kibana is restarted, it won't fully start up until the version check is complete. So the behaviour of Kibana being "stuck at startup" is expected.
Closing in favour of #2600.
If we upgrade both Elasticsearch and Kibana from e.g. v7.1.1 to v7.2.1, Kibana is stuck in its startup phase and logs an error about incompatible Elasticsearch node versions.
Once all Elasticsearch nodes are upgraded, Kibana is available again.
When the Kibana resource points to an Elasticsearch resource managed by ECK, could we wait to upgrade the Kibana version until all Elasticsearch nodes are running a version greater than or equal to the target Kibana version?
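For illustration, a minimal sketch of that guard, assuming the operator can list the versions of all Elasticsearch nodes. The helper name and the use of blang/semver are assumptions, not ECK code.

```go
package upgrade

import "github.com/blang/semver/v4"

// canUpgradeKibana returns true only once every Elasticsearch node runs a
// version greater than or equal to the target Kibana version.
func canUpgradeKibana(esNodeVersions []string, targetKibanaVersion string) (bool, error) {
	kb, err := semver.Parse(targetKibanaVersion)
	if err != nil {
		return false, err
	}
	for _, v := range esNodeVersions {
		node, err := semver.Parse(v)
		if err != nil {
			return false, err
		}
		if node.LT(kb) {
			return false, nil // at least one node is still behind: delay the Kibana upgrade
		}
	}
	return true, nil
}
```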