-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail v2 Migrations when unknown document types are found #101052
Comments
Pinging @elastic/kibana-core (Team:Core) |
@joshdover, I don't recall being part of a conversation about only failing if the
This almost certainly has to be by accident, because it won't allow users to follow our documentation for removing a plugin that relies on saved-object migrations: https://www.elastic.co/guide/en/kibana/7.13/kibana-plugins.html#update-remove-plugin. Will either option that is proposed allow users to remove plugins following our documentation? If not, we're going to be introducing a breaking change. However, if we think that the breaking change is subtle and a large number of our users shouldn't be impacted, we can document it as a breaking change and proceed. |
Yea, this is a bug. Which has been present for a long time, probably since the beginning of SO migrations, but a bug nevertheless.
Technically we would not be introducing anything, as the problem has always been there. We would just be transforming a bug into a feature ™️ .
Currently most of our users shouldn't be impacted, as triggering this situations requires to disable a plugin registering a SO type you were effectively using before upgrading Kibana. Effectively, there isn't really any reason to do that, so it's unlikely our customers are doing it (and the total lack of SDH issues about this problem seems like a good confirmation). Now, if we think long-term, this feels more problematic. The best example would be third party plugins, where a scenario where a user tries a plugin for a bit and then choose to uninstall it. However, committing to to option (3) from the RFC now, and choosing to change the behavior later is also a possibility. |
We should not generalize the approach of "transforming a bug into a feature". The goal of not creating breaking changes is to provide a good experience for our users, so that Elastic products provide them joy. If this bug was to affect a large number of our users, this is a bug that we should absolutely be fixing and not embracing. However, if we can use the existence of a bug for a long time to justify the minimal impact of introducing a breaking change, that's fine. Ignoring my concerns with the "transforming a bug into a feature" logic, the v1 and v2 migrations only fail when the
👍 This is a scenario we should not be ignoring. If it's not incredibly common today, it will become incredibly common. |
My 2c worth is I see the POV from both sides. However, as a current elastic user, I'd like the flexibility of being able to try a feature, disable it and still retain the possibility of being able to use my old data in some way or another. Be it through snapshots, exporting data before disabling the feature or any other way of holding on to it without impacting perf would be a huge bonus. That being said, there is the cost of hanging on to docs that aren't being used indefinitely. We saw that with the fleet-agent docs that caused massive SO count explosions! Whichever way we go, we need to make sure it's documented well and that folks know the risks involved. Remembering back to when fleet was first introduced as beta, even with the very clear warning that there wasn't a migration path for the beta version, folks were still taken by surprise when they did migrate. |
You probably misunderstood me on that one (I may have set my french cursor a bit too high). I'm strongly in favor of trying to find a solution for this potential ticking bomb instead of just keeping this behavior as it is currently. I was just saying that effectively, not fixing the issue should probably not be considered a breaking change, as this issue has always been there.
I strongly agree. One potential option I see would be to
Ideally, we would also be able to update the
My gut feeling is that, now that we're performing full client-side reindex during the migration, we'll want to have #96626 quite soon. This would address that problem. |
Another thing to take into account if we decide to copy unknown documents to the target index without failing the migration is the multi-instance scenario
So even if we decide to copy unknown docs during migration, I think we still should declare that having multiple instance with different plugin sets is not something that we're supporting. |
Doesn't it require the first option anyway? Or you propose the migration should fail and we ask a Kibana user to navigate to this UI to perform cleanup?
IMO it sounds like the most practical solution, as it's mostly an edge case.
Btw how Kibana behaves when unknown SO types are imported? I'd assume it fails as well. |
This should probably be the case anyways, though we'll need to determine how we're going to handle this if we ever support dedicated node types. We should document this in the "settings that must be the same" section: https://www.elastic.co/guide/en/kibana/7.13/production.html#load-balancing-kibana
I'm +1 on this and this was the intended behavior during 7.x. I think essentially fixing this bug for 7.x is the best option we have right now, given that users use the ability to disable plugins in some cases. I don't think failing migrations with unknown document types is a practical option unless and until we adopt not letting any 1st party plugins from being disabled (#89584). That change could only happen at 8.0 at the earliest. If there's no objections, I propose we close this issue and open a new one to list out the steps needed to update DocumentMigrator to support preserving documents of an unknown type with a |
Agreed, 100%
👍 |
Closing in favor of #101351 |
In our initial design of the v2 Migrations, we decided that the ideal way to handle documents from disabled plugins was to fail migrations. See the option (3) in the RFC section describing why. However, in some scenarios, v1 Migrations allowed documents of unknown types to be preserved, so we decided to defer on this change until 8.0 since it was considered breaking.
During testing in #100171, we found that in some scenarios both v1 and v2 Migrations actually already exhibits this behavior, though somewhat by accident. If a customer disables a plugin, it's been growing more and more likely that this scenario would be encountered. With v2 migrations, this situation can actually be encountered on any upgrade where there are documents of an unknown type (or belonging to a disabled plugin) that have a
migrationVersion
specified. More details about the observed behavior going back to 6.8 can be viewed here: #100171 (comment).We've decided that we could go ahead and commit to option (3) from the RFC in 7.x since this is already the behavior in most scenarios but not all:
Committing to this option entails that we:
OUTDATED_DOCUMENTS_*
stepsThe primary risk I see to this change is that it effectively no longer allows users to disable plugins that have any stored Saved Objects, since Kibana will fail to startup. This seems like a major issue for cases where disabling things like
task_manager
ortelemetry
would are helpful for debugging a problem.An alternative could be to only fail if the
migrationVersion
is newer than the current Kibana version, rather than failing if we don't have a known migration, as @pgayvallet shared here: #100171 (comment). This was previously considered, but discarded:@kobelb Do you remember what the risks were to this option?
The text was updated successfully, but these errors were encountered: