Cleanup migrated docs on OUTDATED_DOCUMENTS_SEARCH step #97965

Closed
mshustov opened this issue Apr 22, 2021 · 4 comments
Labels: bug, project:ResilientSavedObjectMigrations, Team:Core, v7.14.0

Comments

mshustov commented Apr 22, 2021

SO migrations v2 currently has a problem: documents migrated during the OUTDATED_DOCUMENTS_SEARCH step are not deleted from the index if their id changed during migration.
This can cause data loss during the next migration, when an already-migrated SO is overwritten by the migration of its outdated counterpart.
As discussed in #97222 (comment), we can extend the OUTDATED_DOCUMENTS_SEARCH step logic to remove documents that have been migrated during this step.

// OUTDATED_DOCUMENTS_TRANSFORM: (input) => output
const output = await migrate(input);
const elementsToDelete = input.filter((input_element) => output.every((output_element) => output_element.id !== input_element.id));
await es.delete(elementsToDelete);
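
The selection logic in the snippet above can be isolated as a pure function. This is only a minimal sketch: the `Doc` shape and the `findDocsToDelete` name are hypothetical and not taken from the Kibana codebase, and the actual delete call is left out.

```typescript
// Hypothetical minimal document shape; real saved object documents carry more fields.
interface Doc {
  id: string;
}

// Returns the input documents whose id does not appear in the migrated output,
// i.e. documents whose id changed during migration and which would otherwise
// stay behind in the index as outdated copies.
function findDocsToDelete(input: Doc[], output: Doc[]): Doc[] {
  const migratedIds = new Set(output.map((doc) => doc.id));
  return input.filter((doc) => !migratedIds.has(doc.id));
}

// Example: "b:2" was renamed to "c:2" by a migration, so the old "b:2" must be removed.
const exampleInput: Doc[] = [{ id: 'a:1' }, { id: 'b:2' }];
const exampleOutput: Doc[] = [{ id: 'a:1' }, { id: 'c:2' }];
const exampleToDelete = findDocsToDelete(exampleInput, exampleOutput);
```

Using a `Set` for the lookup avoids the nested `filter`/`every` scan, which matters only for large batches.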

Things to consider:

  • Handling 404s on deletes when running multiple Kibana nodes at the same time. We should include a test for this specific case.
@mshustov mshustov added v7.14.0 project:ResilientSavedObjectMigrations Reduce Kibana upgrade failures by making saved object migrations more resilient labels Apr 22, 2021
@botelastic botelastic bot added the needs-team Issues missing a team label label Apr 22, 2021
@mshustov mshustov added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc and removed needs-team Issues missing a team label labels Apr 22, 2021
@elasticmachine

Pinging @elastic/kibana-core (Team:Core)

@joshdover joshdover added the bug Fixes for quality problems that affect the customer experience label Apr 22, 2021
@joshdover joshdover removed the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Apr 29, 2021
@botelastic botelastic bot added the needs-team Issues missing a team label label Apr 29, 2021
@joshdover joshdover added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Apr 29, 2021
@botelastic botelastic bot removed the needs-team Issues missing a team label label Apr 29, 2021
@joshdover joshdover self-assigned this May 5, 2021
@joshdover

I've run into one issue here that I don't have a great way of solving.

When deleting these outdated documents, I want to be sure that we only delete them if no other changes have been written to them since they were first read. In order to do that, I need to use the _bulk API so that I can use the if_seq_no and if_primary_term options in order to use optimistic concurrency control.
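
As a rough illustration of such a request, the sketch below builds the newline-delimited body for a _bulk call in which every delete action carries the if_seq_no and if_primary_term values captured when the document was read. The `ReadDoc` shape and the `buildBulkDeleteBody` helper are hypothetical, and sending the body to Elasticsearch is deliberately left out.

```typescript
// Hypothetical shape: the fields we would need to keep from the original read.
interface ReadDoc {
  _id: string;
  _seq_no: number;
  _primary_term: number;
}

// Builds an NDJSON _bulk body with one delete action per document.
// Each action is conditional on the sequence number and primary term seen at read time,
// so Elasticsearch rejects the delete if the document has changed since then.
function buildBulkDeleteBody(index: string, docs: ReadDoc[]): string {
  const lines = docs.map((doc) =>
    JSON.stringify({
      delete: {
        _index: index,
        _id: doc._id,
        if_seq_no: doc._seq_no,
        if_primary_term: doc._primary_term,
      },
    })
  );
  // The _bulk API requires the body to end with a newline.
  return lines.join('\n') + '\n';
}

const exampleBody = buildBulkDeleteBody('test-mg', [
  { _id: '123', _seq_no: 1, _primary_term: 1 },
]);
```
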

This works well when the document was modified after it was read; the delete returns an error like:

{
  "took" : 317,
  "errors" : true,
  "items" : [
    {
      "delete" : {
        "_index" : "test-mg",
        "_type" : "_doc",
        "_id" : "123",
        "status" : 409,
        "error" : {
          "type" : "version_conflict_engine_exception",
          "reason" : "[123]: version conflict, required seqNo [1], primary term [1]. current document has seqNo [2] and primary term [1]",
          "index_uuid" : "9MFEiRe2QJuWYNYvttzoiA",
          "shard" : "0",
          "index" : "test-mg"
        }
      }
    }
  ]
}

However, it gives the same error type when the document was already deleted, which can easily happen when multiple nodes are cleaning up the outdated documents at once:

{
  "took" : 3,
  "errors" : true,
  "items" : [
    {
      "delete" : {
        "_index" : "test-mg",
        "_type" : "_doc",
        "_id" : "123",
        "status" : 409,
        "error" : {
          "type" : "version_conflict_engine_exception",
          "reason" : "[123]: version conflict, required seqNo [2], primary term [1]. but no document was found",
          "index_uuid" : "9MFEiRe2QJuWYNYvttzoiA",
          "shard" : "0",
          "index" : "test-mg"
        }
      }
    }
  ]
}

I want to be able to ignore errors of this second form while failing the migration on errors of the first form. However, the only difference between the two is the error.reason field, which is a human-readable string and could easily change.

I'm going to move forward by filtering out errors whose reason contains "no document was found", but I will also get in touch with the Elasticsearch team to see if there's a better option that I'm missing.
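
A minimal sketch of that filtering, assuming the bulk response items look like the two examples above. The `BulkDeleteItem` type and the `findFatalDeleteErrors` name are hypothetical, and the match on the reason text is exactly the fragile part described here.

```typescript
// Hypothetical, trimmed-down shape of one item in a _bulk delete response.
interface BulkDeleteItem {
  delete: {
    _id: string;
    status: number;
    error?: { type: string; reason: string };
  };
}

// Assumption: this substring is what Elasticsearch currently emits for a
// conditional delete on a document that no longer exists.
const ALREADY_DELETED_REASON = 'no document was found';

// Returns only the items that should fail the migration. Version conflicts caused
// by another node having already deleted the document are treated as harmless.
function findFatalDeleteErrors(items: BulkDeleteItem[]): BulkDeleteItem[] {
  return items.filter((item) => {
    const error = item.delete.error;
    if (error === undefined) return false; // delete succeeded
    const alreadyDeleted =
      error.type === 'version_conflict_engine_exception' &&
      error.reason.includes(ALREADY_DELETED_REASON);
    return !alreadyDeleted;
  });
}

const exampleItems: BulkDeleteItem[] = [
  {
    delete: {
      _id: '123',
      status: 409,
      error: {
        type: 'version_conflict_engine_exception',
        reason:
          '[123]: version conflict, required seqNo [1], primary term [1]. current document has seqNo [2] and primary term [1]',
      },
    },
  },
  {
    delete: {
      _id: '456',
      status: 409,
      error: {
        type: 'version_conflict_engine_exception',
        reason: '[456]: version conflict, required seqNo [2], primary term [1]. but no document was found',
      },
    },
  },
];
const exampleFatal = findFatalDeleteErrors(exampleItems);
```

Only the first item, where the document was modified after it was read, would fail the migration in this sketch.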

@joshdover

I've opened elastic/elasticsearch#73895 for the above. Since a change there would be a breaking change, in the meantime we'll need to filter on this reason string and include an integration test to catch when/if Elasticsearch changes it.

@joshdover

Based on the discussion and path forward in #101351, we no longer need this step, since we will throw an error on any unknown types in the index prior to upgrade.

3 participants