Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewriting SO id during migration #97222

Merged
merged 40 commits into from
Apr 26, 2021
Merged

Conversation

mshustov
Copy link
Contributor

@mshustov mshustov commented Apr 15, 2021

Summary

Implementation is based on algorithm changes described in #93002
TLDR: Kibana doesn't reindex SO in ES, but instead loads them from Elasticsearch server, transforms, rewrites id for SO registered with namespace: multiple and namespace: multiple-isolated, writes SO back to Elasticsearch server.

Performance concerns.

Tested on the migration of 100 000 SavedObjects https://github.com/elastic/kibana/blob/6712bc07279c9582094d31eb22ebf72bdfaba2f7/src/core/server/saved_objects/migrationsv2/integration_tests/migration_7.7.2_xpack_100k.test.ts

Performance seems to be better in the worst-case scenario when we have a lot of outdated documents, as readWithPit seems to be more performant than searchForOutdatedDocuments. See logs:

Master

batchSize 1_000: 154sec
migration_test_kibana.log

batchSize 10_000: 104sec

Branch

batchSize 1_000: 97sec
migration_test_kibana.log
batchSize 10_000: 105sec

Checklist

Delete any items that are not applicable to this PR.

@mshustov mshustov added v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.13.0 labels Apr 15, 2021
@@ -229,48 +229,6 @@ describe('KibanaMigrator', () => {
jest.clearAllMocks();
});

it('creates a V2 migrator that initializes a new index and migrates an existing index', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the implementation is tested in the integration tests

transformRawDocs: TransformRawDocs,
outdatedDocuments: SavedObjectsRawDoc[],
index: string,
refresh: estypes.Refresh
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move to named arguments instead, but let's do it in a separate PR for all the actions at once

Copy link
Contributor

@rudolf rudolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick first pass and the implementation looks very solid!

@@ -14,7 +14,6 @@
import _ from 'lodash';
import { estypes } from '@elastic/elasticsearch';
import { MigrationEsClient } from './migration_es_client';
import { CountResponse, SearchResponse } from '../../../elasticsearch';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no changes in the file, a simple cleanup

await new Promise((resolve) => setTimeout(resolve, 10000));
});

it('rewrites id deterministically for SO with namespaceType: "multiple" and "multiple-isolated"', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -207,7 +207,7 @@
"fields": "[{\"name\":\"_id\",\"type\":\"string\",\"esTypes\":[\"_id\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"_index\",\"type\":\"string\",\"esTypes\":[\"_index\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"_score\",\"type\":\"number\",\"count\":0,\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"_source\",\"type\":\"_source\",\"esTypes\":[\"_source\"],\"count\":0,\"scripted\":false,\"searchable\":false,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"_type\",\"type\":\"string\",\"esTypes\":[\"_type\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":false},{\"name\":\"message\",\"type\":\"string\",\"esTypes\":[\"text\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"message.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"message\"}}},{\"name\":\"user\",\"type\":\"string\",\"esTypes\":[\"text\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":false,\"readFromDocValues\":false},{\"name\":\"user.keyword\",\"type\":\"string\",\"esTypes\":[\"keyword\"],\"count\":0,\"scripted\":false,\"searchable\":true,\"aggregatable\":true,\"readFromDocValues\":true,\"subType\":{\"multi\":{\"parent\":\"user\"}}}]",
"title": "test_index*"
},
"type": "test_index*"
"type": "index-pattern"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalid type that used to fail migration.

Copy link
Contributor

@jportner jportner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry 😅

@mshustov
Copy link
Contributor Author

@jportner reverted, I will skip the test on 7.x branch then

@mshustov mshustov requested a review from jportner April 26, 2021 07:14
Copy link
Contributor

@jportner jportner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@mshustov mshustov merged commit e6ba8cc into elastic:master Apr 26, 2021
@mshustov mshustov deleted the migration-id-gen branch April 26, 2021 13:57
mshustov added a commit to mshustov/kibana that referenced this pull request Apr 26, 2021
* some typos

* implement an alternative client-side migration algorithm

required to enforce idempotent id generation for SO

* update tests

* lol

* remove unnecessary param from request generic

* remove unused parameter

* optimize search when quierying SO for migration

* fix wrong type in fixtures

* try shard_doc asc

* add an integration test

* cleanup

* track_total_hits: false to improve perf

* add happy path test for transformDocs action

* remove unused types

* fix wrong typing

* add cleanup phase

* add an integration test for cleanup phase

* add unit-tests for cleanup function

* address comments

* Fix functional test

* set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase

* sourceIndex: Option.some<> for consistency

* Revert "set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase"

This reverts commit a128d7b.

* address comments from Pierre

* fix test

* Revert "fix test"

This reverts commit 97315b6.

* revert min convert version back to 8.0

Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>
mshustov added a commit that referenced this pull request Apr 26, 2021
* Rewriting SO id during migration (#97222)

* some typos

* implement an alternative client-side migration algorithm

required to enforce idempotent id generation for SO

* update tests

* lol

* remove unnecessary param from request generic

* remove unused parameter

* optimize search when quierying SO for migration

* fix wrong type in fixtures

* try shard_doc asc

* add an integration test

* cleanup

* track_total_hits: false to improve perf

* add happy path test for transformDocs action

* remove unused types

* fix wrong typing

* add cleanup phase

* add an integration test for cleanup phase

* add unit-tests for cleanup function

* address comments

* Fix functional test

* set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase

* sourceIndex: Option.some<> for consistency

* Revert "set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase"

This reverts commit a128d7b.

* address comments from Pierre

* fix test

* Revert "fix test"

This reverts commit 97315b6.

* revert min convert version back to 8.0

Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>

* skip id rewriting test

Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:skip Skip the PR/issue when compiling release notes v7.14.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants