Resilient saved object migration algorithm #78413

Merged on Dec 15, 2020 (94 commits)
4765c3e
Initial structure of migration state-action machine
rudolf Sep 24, 2020
1699561
Fix type import
rudolf Sep 24, 2020
915fdcd
Retries with exponential back off
rudolf Sep 25, 2020
90cfda7
Use discriminated union for state type
rudolf Oct 7, 2020
e48d013
Either type for actions
rudolf Oct 7, 2020
c630534
Test exponential retries
rudolf Oct 8, 2020
b85d986
TaskEither types for actions
rudolf Oct 13, 2020
7f7e573
Fetch indices instead of aliases so we can collect all index state in…
rudolf Oct 16, 2020
443a5d0
Log document id if transform fails
rudolf Oct 23, 2020
4da5a84
WIP: Legacy pre-migrations
rudolf Nov 3, 2020
7911ee0
UPDATE_TARGET_MAPPINGS
rudolf Nov 4, 2020
8573d46
WIP OUTDATED_DOCUMENTS_TRANSFORM
rudolf Nov 11, 2020
dca4c5b
Narrow res types depending on control state
rudolf Nov 12, 2020
7515240
OUTDATED_DOCUMENTS_TRANSFORM
rudolf Nov 13, 2020
bed8197
Use .kibana instead of .kibana_current
rudolf Nov 13, 2020
71bbede
rename control states TARGET_DOCUMENTS* -> OUTDATED_DOCUMENTS*
rudolf Nov 16, 2020
56bd467
WIP MARK_VERSION_INDEX_READY
rudolf Nov 17, 2020
e10d01b
Fix and expand INIT -> * transition tests
rudolf Nov 17, 2020
bf27c6d
Add alias/index name helper functions
rudolf Nov 17, 2020
0d7a1a6
Add feature flag for enabling v2 migrations
rudolf Nov 17, 2020
8eca894
split state_action_machine, reindex legacy indices
rudolf Nov 20, 2020
85bc88d
Don't use a scroll search for migrating outdated documents
rudolf Nov 24, 2020
0e672e7
model: test control state progressions
rudolf Dec 1, 2020
c27aa95
Action integration tests
rudolf Dec 4, 2020
0d347c9
Fix existing tests and type errors
rudolf Dec 4, 2020
4f95fd7
snapshot_in_progress_exception can only happen when closing/deleting …
rudolf Dec 4, 2020
e54d9d8
Retry steps up to 10 times
rudolf Dec 4, 2020
18c6700
Update api.md documentation files
rudolf Dec 4, 2020
df4088b
Further actions integration tests
rudolf Dec 4, 2020
6dfab82
Action unit tests
rudolf Dec 4, 2020
c002bc8
Fix actions integration tests
rudolf Dec 5, 2020
776eb1b
Rename actions to be more domain-specific
rudolf Dec 6, 2020
37ea126
Apply suggestions from code review
rudolf Dec 7, 2020
328afd9
Review feedback: polish and flesh out inline comments
rudolf Dec 7, 2020
c6dd07f
Fix unhandled rejections in actions unit tests
rudolf Dec 7, 2020
7c60f8b
model: only delay retryable_es_client_error, reset for other left res…
rudolf Dec 7, 2020
b93f834
Actions unit tests
rudolf Dec 7, 2020
4ca7d4b
More inline comments
rudolf Dec 7, 2020
db5f86d
Actions: Group index settings under 'index' key
rudolf Dec 7, 2020
f70bb34
bulkIndex -> bulkOverwriteTransformedDocuments to be more domain spec…
rudolf Dec 7, 2020
67685f6
state_action_machine tests, fix and add additional tests
rudolf Dec 7, 2020
eeeb1f9
Action integration tests: updateAndPickupMappings, searchForOutdatedD…
rudolf Dec 7, 2020
27e444f
oops: uncomment commented out code
rudolf Dec 7, 2020
da07c75
actions integration tests: rejection for createIndex
rudolf Dec 7, 2020
fdcdc51
update state properties: clearer names, mark all as readonly
rudolf Dec 7, 2020
caf90a0
add state properties currentAlias, versionAlias, legacyIndex and test…
rudolf Dec 7, 2020
0cfc420
Use CONSTANTS for constants :D
rudolf Dec 8, 2020
f3982fb
Actions: Clarify behaviour and impact of acknowledged: false responses
rudolf Dec 8, 2020
5a72d0d
Use consistent vocabulary for action responses
rudolf Dec 8, 2020
c2ba683
KibanaMigrator test for migrationsV2
rudolf Dec 8, 2020
0ca0b93
KibanaMigrator test for FATAL state and action exceptions in v2 migra…
rudolf Dec 8, 2020
5023e02
Fix ts error in test
rudolf Dec 8, 2020
901d4e3
Refactor: split index file up into a file per model, next, types
rudolf Dec 8, 2020
56b90c0
next: use partial application so we don't generate a nextActionMap on…
rudolf Dec 9, 2020
0826813
move logic from index.ts to migrations_state_action_machine.ts and test
rudolf Dec 9, 2020
37ee3f5
Merge branch 'master' into so-migrations
rudolf Dec 9, 2020
7dc62f1
add test
pgayvallet Dec 9, 2020
b76a40b
use `Root` to allow specifying oss mode
pgayvallet Dec 9, 2020
29cb116
Add fix and todo tests for reindexing with preMigrationScript
rudolf Dec 9, 2020
a5c92e2
Dump execution log of state transitions and responses if we hit FATAL
rudolf Dec 9, 2020
8319bb8
add 7.3 xpack tests
pgayvallet Dec 9, 2020
5480e34
add 100k test data
pgayvallet Dec 10, 2020
4f97684
Reindex instead of cloning for migrations
rudolf Dec 10, 2020
3880038
Merge pull request #2 from pgayvallet/so-migrations-add-integration-test
rudolf Dec 10, 2020
ad79056
Skip 100k x-pack integration test
rudolf Dec 10, 2020
cc0eb44
MARK_VERSION_INDEX_READY_CONFLICT for dealing with different versions…
rudolf Dec 11, 2020
b3df941
Track elapsed time
rudolf Dec 11, 2020
25b6519
Fix tests
rudolf Dec 11, 2020
e6d61f7
Merge branch 'master' into so-migrations
rudolf Dec 11, 2020
5d6fe03
Model: make exhaustiveness checks more explicit
rudolf Dec 12, 2020
a7622dc
actions integration tests: add additional tests from CR
rudolf Dec 12, 2020
7486426
migrations_state_action_machine fix flaky test
rudolf Dec 12, 2020
688621c
Fix flaky integration test
rudolf Dec 12, 2020
2f46188
Reserve FATAL termination only for situations which we never can reco…
rudolf Dec 12, 2020
1370f02
Handle incompatible_mapping_exception caused by another instance
rudolf Dec 12, 2020
cd8676a
Cleanup logging
rudolf Dec 12, 2020
4afb191
Fix/stabilize integration tests
rudolf Dec 13, 2020
9d42a67
Add REINDEX_SOURCE_TO_TARGET_VERIFY step
rudolf Dec 13, 2020
8851fea
Merge branch 'master' into so-migrations
rudolf Dec 13, 2020
e115b76
Strip tests archives of */.DS_Store and __MAC_OSX
rudolf Dec 14, 2020
6dfbc0c
Task manager migrations: remove invalid kibana property when converti…
rudolf Dec 14, 2020
55b124e
Add disabled mappings for removed field in map saved object type
rudolf Dec 14, 2020
386a9f7
verifyReindex action: use count API
rudolf Dec 14, 2020
33cb185
REINDEX_BLOCK_* to prevent lost deletes (needs tests)
rudolf Dec 15, 2020
342e597
Split out 100k docs integration test so that it has it's own kibana p…
rudolf Dec 15, 2020
80fed9c
REINDEX_BLOCK_* action tests
rudolf Dec 15, 2020
892ae1c
REINDEX_BLOCK_* model tests
rudolf Dec 15, 2020
d0f2518
Include original error message when migration_state_machine throws
rudolf Dec 15, 2020
286bdfa
Address some CR nits
rudolf Dec 15, 2020
eea4f9e
Fix TS errors
rudolf Dec 15, 2020
3c46888
Fix bugs
rudolf Dec 15, 2020
11dc7cf
Reindex then clone to prevent lost deletes
rudolf Dec 15, 2020
e815df4
Fix tests
rudolf Dec 15, 2020
4a7f2f7
Merge branch 'master' into so-migrations
kibanamachine Dec 15, 2020
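
For readers skimming the commit list: several titles describe a state-action machine whose state is a discriminated union, with retryable errors delayed by exponential back-off and steps retried up to 10 times. A minimal sketch of that shape, assuming a single `INIT` step, follows. Every name here (`State`, `ActionResponse`, `model`, `MAX_DELAY_MS`) and the 64s delay cap are hypothetical and not taken from the PR's code.

```typescript
// Hypothetical sketch: every name below is illustrative, not taken from the PR.

// The state is a discriminated union keyed on `controlState`.
type State =
  | { controlState: 'INIT'; retryCount: number; retryDelay: number }
  | { controlState: 'DONE' }
  | { controlState: 'FATAL'; reason: string };

// An action either succeeds or fails with a tagged, retryable error.
type ActionResponse =
  | { ok: true }
  | { ok: false; type: 'retryable_es_client_error'; message: string };

const MAX_RETRIES = 10; // from the "Retry steps up to 10 times" commit title
const MAX_DELAY_MS = 64000; // assumed cap, not stated in the source

// Pure transition function: maps the current state and the last action
// response to the next state, without performing any I/O itself.
function model(state: State, res: ActionResponse): State {
  switch (state.controlState) {
    case 'DONE':
    case 'FATAL':
      return state; // terminal states never change
    case 'INIT': {
      if (res.ok) return { controlState: 'DONE' };
      const retryCount = state.retryCount + 1;
      if (retryCount > MAX_RETRIES) {
        return {
          controlState: 'FATAL',
          reason: `Gave up after ${MAX_RETRIES} retries: ${res.message}`,
        };
      }
      // Exponential back-off: 2s, 4s, 8s, ... up to the assumed cap.
      const retryDelay = Math.min(1000 * 2 ** retryCount, MAX_DELAY_MS);
      return { controlState: 'INIT', retryCount, retryDelay };
    }
    default: {
      // Exhaustiveness check: fails to compile if a control state is unhandled.
      const unreachable: never = state;
      return unreachable;
    }
  }
}
```

Keeping the transition function pure means control state progressions can be unit tested without an Elasticsearch client; a separate loop would run the action chosen for each state and wait `retryDelay` before retrying.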

This file was deleted.

Original file line number Diff line number Diff line change
@@ -20,5 +20,4 @@ export interface SavedObjectsRawDoc
| [\_primary\_term](./kibana-plugin-core-server.savedobjectsrawdoc._primary_term.md) | <code>number</code> | |
| [\_seq\_no](./kibana-plugin-core-server.savedobjectsrawdoc._seq_no.md) | <code>number</code> | |
| [\_source](./kibana-plugin-core-server.savedobjectsrawdoc._source.md) | <code>SavedObjectsRawDocSource</code> | |
- | [\_type](./kibana-plugin-core-server.savedobjectsrawdoc._type.md) | <code>string</code> | |

38 changes: 25 additions & 13 deletions rfcs/text/0013_saved_object_migrations.md
@@ -214,31 +214,43 @@ Note:
2. If the source is a < v6.5 `.kibana` index or < 7.4 `.kibana_task_manager`
index prepare the legacy index for a migration:
1. Mark the legacy index as read-only and wait for all in-flight operations to drain (requires https://github.com/elastic/elasticsearch/pull/58094). This prevents any further writes from outdated nodes. Assuming this API is similar to the existing `/<index>/_close` API, we expect to receive `"acknowledged" : true` and `"shards_acknowledged" : true`. If all shards don’t acknowledge within the timeout, retry the operation until it succeeds.
- 2. Clone the legacy index into a new index which has writes enabled. Use a fixed index name i.e `.kibana_pre6.5.0_001` or `.kibana_task_manager_pre7.4.0_001`. `POST /.kibana/_clone/.kibana_pre6.5.0_001?wait_for_active_shards=all {"settings": {"index.blocks.write": false}}`. Ignore errors if the clone already exists. Ignore errors if the legacy source doesn't exist.
- 3. Wait for the cloning to complete `GET /_cluster/health/.kibana_pre6.5.0_001?wait_for_status=green&timeout=60s` If cloning doesn’t complete within the 60s timeout, log a warning for visibility and poll again.
- 4. Apply the `convertToAlias` script if defined `POST /.kibana_pre6.5.0_001/_update_by_query?conflicts=proceed {"script": {...}}`. The `convertToAlias` script will have to be idempotent, preferably setting `ctx.op="noop"` on subsequent runs to avoid unecessary writes.
+ 2. Create a new index which will become the source index after the legacy
+ pre-migration is complete. This index should have the same mappings as
+ the legacy index. Use a fixed index name i.e `.kibana_pre6.5.0_001` or
+ `.kibana_task_manager_pre7.4.0_001`. Ignore index already exists errors.
+ 3. Reindex the legacy index into the new source index with the
+ `convertToAlias` script if specified. Use `wait_for_completion: false`
+ to run this as a task. Ignore errors if the legacy source doesn't exist.
+ 4. Wait for the reindex task to complete. If the task doesn’t complete
+ within the 60s timeout, log a warning for visibility and poll again.
+ Ignore errors if the legacy source doesn't exist.
5. Delete the legacy index and replace it with an alias of the same name
```
POST /_aliases
{
"actions" : [
- { "add": { "index": ".kibana_pre6.5.0_001", "alias": ".kibana" } },
{ "remove_index": { "index": ".kibana" } }
+ { "add": { "index": ".kibana_pre6.5.0_001", "alias": ".kibana" } },
]
}
```.
Unlike the delete index API, the `remove_index` action will fail if
- provided with an _alias_. Ignore "The provided expression [.kibana]
- matches an alias, specify the corresponding concrete indices instead."
- or "index_not_found_exception" errors. These actions are applied
- atomically so that other Kibana instances will always see either a
- `.kibana` index or an alias, but never neither.
- 6. Use the cloned `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm.
+ provided with an _alias_. Therefore, if another instance completed this
+ step, the `.kibana` alias won't be added to `.kibana_pre6.5.0_001` a
+ second time. This avoids a situation where `.kibana` could point to both
+ `.kibana_pre6.5.0_001` and `.kibana_7.10.0_001`. These actions are
+ applied atomically so that other Kibana instances will always see either
+ a `.kibana` index or an alias, but never neither.
+
+ Ignore "The provided expression [.kibana] matches an alias, specify the
+ corresponding concrete indices instead." or "index_not_found_exception"
+ errors as this means another instance has already completed this step.
+ 6. Use the reindexed legacy `.kibana_pre6.5.0_001` as the source for the rest of the migration algorithm.
3. If `.kibana` and `.kibana_7.10.0` both exists and are pointing to the same index this version's migration has already been completed.
1. Because the same version can have plugins enabled at any point in time,
- perform the mappings update in step (6) and migrate outdated documents
- with step (7).
- 2. Skip to step (9) to start serving traffic.
+ perform the mappings update in step (7) and migrate outdated documents
+ with step (8).
+ 2. Skip to step (10) to start serving traffic.
4. Fail the migration if:
1. `.kibana` is pointing to an index that belongs to a later version of Kibana .e.g. `.kibana_7.12.0_001`
2. (Only in 8.x) The source index contains documents that belong to an unknown Saved Object type (from a disabled plugin). Log an error explaining that the plugin that created these documents needs to be enabled again or that these objects should be deleted. See section (4.2.1.4).
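
The alias swap in step 2.5 of the revised text can be sketched as a request body builder plus a check for the two errors the RFC says to ignore. This is a minimal illustration, not the PR's implementation: `buildLegacyAliasSwap`, `isAlreadyCompleted`, and the `{ type, reason }` error shape are assumptions made for the example.

```typescript
// Hypothetical helpers illustrating step 2.5; names are not from the PR.

type AliasAction =
  | { remove_index: { index: string } }
  | { add: { index: string; alias: string } };

// Body for `POST /_aliases`. Both actions are applied atomically, so other
// instances see either the legacy index or the alias, never neither.
// `remove_index` comes first: it fails when `legacyIndex` is already an alias,
// which stops the alias from being added a second time.
function buildLegacyAliasSwap(
  legacyIndex: string,
  preMigrationIndex: string
): { actions: AliasAction[] } {
  return {
    actions: [
      { remove_index: { index: legacyIndex } },
      { add: { index: preMigrationIndex, alias: legacyIndex } },
    ],
  };
}

// Assumed error shape: a `type` such as "index_not_found_exception" and a
// human-readable `reason`. Both errors mean another instance finished the step.
function isAlreadyCompleted(err: { type?: string; reason?: string }): boolean {
  return (
    err.type === 'index_not_found_exception' ||
    (err.reason ?? '').includes(
      'matches an alias, specify the corresponding concrete indices instead'
    )
  );
}

const body = buildLegacyAliasSwap('.kibana', '.kibana_pre6.5.0_001');
console.log(JSON.stringify(body, null, 2));
```

Any other error from the request would still need to be handled as a failure of the step.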
2 changes: 1 addition & 1 deletion src/core/public/public.api.md
@@ -7,6 +7,7 @@
import { Action } from 'history';
import { ApiResponse } from '@elastic/elasticsearch/lib/Transport';
import Boom from '@hapi/boom';
+ import { ConfigDeprecationProvider } from '@kbn/config';
import { ConfigPath } from '@kbn/config';
import { EnvironmentMode } from '@kbn/config';
import { EuiBreadcrumb } from '@elastic/eui';
@@ -18,7 +19,6 @@ import { History } from 'history';
import { Href } from 'history';
import { IconType } from '@elastic/eui';
import { KibanaClient } from '@elastic/elasticsearch/api/kibana';
- import { KibanaConfigType } from 'src/core/server/kibana_config';
import { Location } from 'history';
import { LocationDescriptorObject } from 'history';
import { Logger } from '@kbn/logging';
10 changes: 6 additions & 4 deletions src/core/server/elasticsearch/client/mocks.ts
@@ -22,7 +22,9 @@ import type { DeeplyMockedKeys } from '@kbn/utility-types/jest';
import { ElasticsearchClient } from './types';
import { ICustomClusterClient } from './cluster_client';

- const createInternalClientMock = (): DeeplyMockedKeys<Client> => {
+ const createInternalClientMock = (
+ res?: MockedTransportRequestPromise<unknown>
+ ): DeeplyMockedKeys<Client> => {
// we mimic 'reflection' on a concrete instance of the client to generate the mocked functions.
const client = new Client({
node: 'http://localhost',
@@ -59,7 +61,7 @@ const createInternalClientMock = (): DeeplyMockedKeys<Client> => {
.filter(([key]) => !omitted.includes(key))
.forEach(([key, descriptor]) => {
if (typeof descriptor.value === 'function') {
- obj[key] = jest.fn(() => createSuccessTransportRequestPromise({}));
+ obj[key] = jest.fn(() => res ?? createSuccessTransportRequestPromise({}));
} else if (typeof obj[key] === 'object' && obj[key] != null) {
mockify(obj[key], omitted);
}
@@ -95,8 +97,8 @@ const createInternalClientMock = (): DeeplyMockedKeys<Client> => {

export type ElasticsearchClientMock = DeeplyMockedKeys<ElasticsearchClient>;

- const createClientMock = (): ElasticsearchClientMock =>
- (createInternalClientMock() as unknown) as ElasticsearchClientMock;
+ const createClientMock = (res?: MockedTransportRequestPromise<unknown>): ElasticsearchClientMock =>
+ (createInternalClientMock(res) as unknown) as ElasticsearchClientMock;

export interface ScopedClusterClientMock {
asInternalUser: ElasticsearchClientMock;
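
The effect of the optional `res` parameter in this file can be shown with a simplified stand-in that needs neither jest nor the Elasticsearch client. `MockedResponse`, `createSuccessResponse`, and the three methods on the returned object are assumptions made for this sketch; only the `res ?? ...` fallback mirrors the diff.

```typescript
// Stand-ins for the test helpers; these simplified names and shapes are
// assumptions so that the sketch runs without jest or the ES client.
type MockedResponse<T> = Promise<{ body: T; statusCode: number }>;

const createSuccessResponse = <T>(body: T): MockedResponse<T> =>
  Promise.resolve({ body, statusCode: 200 });

// Every mocked method returns `res` when one is supplied, otherwise a fresh
// empty successful response, mirroring
// `res ?? createSuccessTransportRequestPromise({})` in the diff.
function createClientMock(res?: MockedResponse<unknown>) {
  const respond = (): MockedResponse<unknown> => res ?? createSuccessResponse({});
  return {
    search: respond,
    indices: { create: respond, delete: respond },
  };
}

// Usage: one canned response shared by all methods of the mock.
const canned = createSuccessResponse({ acknowledged: true });
const client = createClientMock(canned);
client.indices.create().then(({ body }) => console.log(body));
```

Because the same promise instance is returned from every method, passing a rejected promise makes every call on the mock reject. That is convenient for testing error handling, but it also means per-method responses still need `mockReturnValue`-style overrides on the individual mocks.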
Original file line number Diff line number Diff line change
@@ -312,7 +312,7 @@ function wrapWithTry(
const failedTransform = `${type}:${version}`;
const failedDoc = JSON.stringify(doc);
log.warn(
- `Failed to transform document ${doc}. Transform: ${failedTransform}\nDoc: ${failedDoc}`
+ `Failed to transform document ${doc?.id}. Transform: ${failedTransform}\nDoc: ${failedDoc}`
);
throw error;
}
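
Why this one-line change matters: interpolating an object into a template literal calls its default `toString()`, so the old message could not identify the failing document. The sketch below reproduces the message format from the diff; the `Doc` interface and the `failureMessage` helper are assumptions made for illustration.

```typescript
// Minimal stand-in for a saved object document (an assumption for this sketch).
interface Doc {
  id: string;
  type: string;
}

// Builds the warning text in the same shape as the changed log line.
function failureMessage(doc: Doc | undefined, failedTransform: string): string {
  const failedDoc = JSON.stringify(doc);
  return `Failed to transform document ${doc?.id}. Transform: ${failedTransform}\nDoc: ${failedDoc}`;
}

const doc: Doc = { id: 'dashboard:123', type: 'dashboard' };

// What the old `${doc}` interpolation produced for any plain object.
console.log(String(doc)); // "[object Object]"

// The new message leads with the document id.
console.log(failureMessage(doc, 'dashboard:7.11.0'));
```

The optional chaining in `doc?.id` keeps the log call itself from throwing if the transform fails on an undefined document.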
Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@
* serves as a central blueprint for what migrations will end up doing.
*/

- import { Logger } from 'src/core/server/logging';
+ import { Logger } from '../../../logging';
import { MigrationEsClient } from './migration_es_client';
import { SavedObjectsSerializer } from '../../serialization';
import {
Original file line number Diff line number Diff line change
@@ -16,9 +16,7 @@
* specific language governing permissions and limitations
* under the License.
*/
- import type { PublicMethodsOf } from '@kbn/utility-types';
-
- import { KibanaMigrator, KibanaMigratorStatus } from './kibana_migrator';
+ import { IKibanaMigrator, KibanaMigratorStatus } from './kibana_migrator';
import { buildActiveMappings } from '../core';
const { mergeTypes } = jest.requireActual('./kibana_migrator');
import { SavedObjectsType } from '../../types';
Expand All @@ -45,7 +43,16 @@ const createMigrator = (
types: SavedObjectsType[];
} = { types: defaultSavedObjectTypes }
) => {
- const mockMigrator: jest.Mocked<PublicMethodsOf<KibanaMigrator>> = {
+ const mockMigrator: jest.Mocked<IKibanaMigrator> = {
+ kibanaVersion: '8.0.0-testing',
+ savedObjectsConfig: {
+ batchSize: 100,
+ scrollDuration: '15m',
+ pollInterval: 1500,
+ skip: false,
+ // TODO migrationsV2: remove/deprecate once we release migrations v2
+ enableV2: false,
+ },
runMigrations: jest.fn(),
getActiveMappings: jest.fn(),
migrateDocument: jest.fn(),