Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow for additive mappings update without creating a new version index #149326

Merged
merged 80 commits into from
Mar 1, 2023

Conversation

gsoldevila
Copy link
Contributor

@gsoldevila gsoldevila commented Jan 23, 2023

Fixes #147237

Based on the same principle as #147371, the goal of this PR is to avoid reindexing if possible.
This time, the idea is to check whether the new mappings are still compatible with the ones stored in ES.
To to so, we attempt to update the mappings in place in the existing index, introducing a new CHECK_COMPATIBLE_MAPPINGS step:

  • If the update operation fails, we assume the mappings are NOT compatible, and we continue with the normal reindexing flow.
  • If the update operation succeeds, we assume the mappings ARE compatible, and we skip reindexing, just like #147371 does.

image

@gsoldevila gsoldevila added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc enhancement New value added to drive a business result release_note:skip Skip the PR/issue when compiling release notes backport:skip This commit does not require backporting Feature:Migrations v8.7.0 labels Jan 23, 2023
@gsoldevila gsoldevila requested a review from a team as a code owner January 23, 2023 13:59
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@@ -81,8 +79,6 @@ export { createIndex } from './create_index';

export { checkTargetMappings } from './check_target_mappings';

export { updateTargetMappingsMeta } from './update_target_mappings_meta';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generalised updateTargetMappingsMeta and renamed it as updateMappings.

@@ -45,13 +45,11 @@ export const updateAndPickupMappings = ({
RetryableEsClientError,
'update_mappings_succeeded'
> = () => {
// ._meta property will be updated on a later step
const { _meta, ...mappingsWithoutMeta } = mappings;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be this function's responsibility, it has been moved to next.ts.

@@ -488,7 +489,7 @@ export const model = (currentState: State, resW: ResponseType<AllActionStates>):
// the mappings have changed, but changes might still be compatible
return {
...stateP,
controlState: 'CHECK_UNKNOWN_DOCUMENTS',
controlState: 'CHECK_COMPATIBLE_MAPPINGS',
Copy link
Contributor Author

@gsoldevila gsoldevila Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the root change of the PR; introduction of a new intermediate step that will attempt to update the mappings (excluding the _meta).

@@ -69,6 +71,12 @@ export const nextActionMap = (client: ElasticsearchClient, transformRawDocs: Tra
Actions.fetchIndices({ client, indices: [state.currentAlias, state.versionAlias] }),
WAIT_FOR_YELLOW_SOURCE: (state: WaitForYellowSourceState) =>
Actions.waitForIndexStatus({ client, index: state.sourceIndex.value, status: 'yellow' }),
CHECK_COMPATIBLE_MAPPINGS: (state: CheckCompatibleMappingsState) =>
Actions.updateMappings({
Copy link
Contributor Author

@gsoldevila gsoldevila Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're effectively updating the index mappings earlier in the flow (if they are actually compatible).
With this change, if the mappings are compatible, the OUTDATED_DOCUMENTS _SEARCH + _TRANSFORM will be performed after updating the mappings (not the _meta).

This shouldn't be a problem cause the mappings are compatible, and we're going to perform an updateAndPickupMappings afterwards anyway.

Copy link
Contributor

@jloleysens jloleysens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this out locally and confirmed that the flows described in the diagram are reproducible (tested with same mappings + new version and new compat. mappings and new version).

Great work @gsoldevila!

})
.then(() => Either.right('update_mappings_succeeded' as const))
.catch(catchRetryableEsClientErrors)
.catch(() => Either.left('incompatible_mapping_exception'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my learning (and not part of this PRs changes):

is it possible that ES returns some other kind of error code here that we are mapping to incompatible_mapping_exception. I see the following for an incompatible mapping on a recent snapshot:

{
	"error": {
		"root_cause": [
			{
				"type": "illegal_argument_exception",
				"reason": "mapper [file.Status] cannot be changed from type [keyword] to [long]"
			}
		],
		"type": "illegal_argument_exception",
		"reason": "mapper [file.Status] cannot be changed from type [keyword] to [long]"
	},
	"status": 400
}

I'm not sure if there are others, didn't spot 'em. Might be more defensive to check for the type: 'illegal_argument_exception' and just throw for an unexpected condition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was fixing the CI errors and came to the same conclusion, this will be part of my next push.
TY!

@gsoldevila
Copy link
Contributor Author

gsoldevila commented Jan 24, 2023

The current CI failures are integrations tests that use 7.13 archives to test legacy migrations.

Is it possible that the mappings are still compatible and that we're not creating the v2 style indices? I will try to confirm that.

I propose to leave this check here and only apply the enhancement for >= 7.14 migrations (aka v2 migrator).
UPDATE: I was incorrectly assuming legacy migrations were those <= 7.14, and that the legacyIndex field of the state was an indicator of whether the migration was from legacy index or not.

From code comments, state.legacyIndex is:

The name of the concrete legacy index (if it exists) e.g. `.kibana` for <6.5 or `.kibana_task_manager` for < 7.4

This breaks my theory of the 7.13 archives having a legacy structure. I'll need to dig a bit deeper.

UPDATE: Adding a stateP.legacyIndex === undefined condition to our recent enhancements fixes all the 7.13 integration tests. But they just work cause I effectively disabled the enhancement by checking that property, which is always present 🤦🏼‍♂️ .

Copy link
Contributor

@jloleysens jloleysens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to leave this check here and only apply the enhancement for >= 7.14 migrations (aka v2 migrator). WDYT?

This makes sense to me, I'd like to hear what @rudolf thinks! Do we specifically want this enhancement for users who are < 7.14?

@rudolf
Copy link
Contributor

rudolf commented Jan 25, 2023

It's not important that we optimise the downtime from a legacy upgrade (< 7.3 for task manager, < 6.5.0 for .kibana!). But by the time we're done with the LEGACY_REINDEX_DELETE steps a legacy index looks like a v1/v2 index. That is it has a .kibana alias pointing to it and I don't think we should need to handle it very differently.

I think the reason the tests fail is because when we don't reindex we're unable to apply the excludeOnUpgradeQuery. We would need to have a kind of cleanup step that does a delete by query matching on the inverse of the excludeOnUpgradeQuery.

gsoldevila added a commit that referenced this pull request Feb 27, 2023
In the context of migrations,
#147371 avoids reindexing during
an upgrade, provided that `diffMappings === false`.

This _alternative path_ skips some key steps that are performed before
reindexing:
* `CHECK_UNKNOWN_DOCUMENTS`
* `CALCULATE_EXCLUDE_FILTERS`

These steps enrich a search query that is used during reindexing,
effectively filtering out undesired documents.

If the mappings [match](#147371)
(or they are
[compatible](#149326)) and we _no
longer reindex_, this cleanup operation does not happen, leaving
undesired documents in our system indices.

The goal of this PR is to add an extra step in the state machine
(`CLEANUP_UNKNOWN_AND_EXCLUDED`), which will actively cleanup a system
index if we're going the _skip reindexing_ path.


![image](https://user-images.githubusercontent.com/25349407/216979691-fef40638-f990-4850-bac8-ee3e58330a7f.png)

Fixes #150299

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
gsoldevila added a commit to gsoldevila/kibana that referenced this pull request Feb 27, 2023
In the context of migrations,
elastic#147371 avoids reindexing during
an upgrade, provided that `diffMappings === false`.

This _alternative path_ skips some key steps that are performed before
reindexing:
* `CHECK_UNKNOWN_DOCUMENTS`
* `CALCULATE_EXCLUDE_FILTERS`

These steps enrich a search query that is used during reindexing,
effectively filtering out undesired documents.

If the mappings [match](elastic#147371)
(or they are
[compatible](elastic#149326)) and we _no
longer reindex_, this cleanup operation does not happen, leaving
undesired documents in our system indices.

The goal of this PR is to add an extra step in the state machine
(`CLEANUP_UNKNOWN_AND_EXCLUDED`), which will actively cleanup a system
index if we're going the _skip reindexing_ path.

![image](https://user-images.githubusercontent.com/25349407/216979691-fef40638-f990-4850-bac8-ee3e58330a7f.png)

Fixes elastic#150299

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
(cherry picked from commit 754e868)
Copy link
Contributor

@jloleysens jloleysens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @gsoldevila 👍🏻, this looks pretty comprehensive with the addition of the tests using the Kibana migrator kit. I left a couple of comments, mostly for my own understanding.

I tested the new upgrade locally by:

  1. Introducing a compatible change to file SO (added a new field)
  2. Deleted the existing Kibana .kibana_8.8.0 alias pointing to .kibana_8.8.0_001 to pretend we are doing a version upgrade

Confirmed that I saw the expected logs and Kibana works as expected. Did not do further testing.

Would be great to get a second (or third) pair of eyes before merging. Cheers!

gsoldevila added a commit that referenced this pull request Feb 27, 2023
# Backport

This will backport the following commits from `main` to `8.7`:
- [Introduce CLEANUP_UNKNOWN_AND_EXCLUDED step
(#149931)](#149931)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Gerard
Soldevila","email":"gerard.soldevila@elastic.co"},"sourceCommit":{"committedDate":"2023-02-27T10:43:13Z","message":"Introduce
CLEANUP_UNKNOWN_AND_EXCLUDED step (#149931)\n\nIn the context of
migrations,\r\nhttps://github.com//pull/147371 avoids
reindexing during\r\nan upgrade, provided that `diffMappings ===
false`.\r\n\r\nThis _alternative path_ skips some key steps that are
performed before\r\nreindexing:\r\n* `CHECK_UNKNOWN_DOCUMENTS`\r\n*
`CALCULATE_EXCLUDE_FILTERS`\r\n\r\nThese steps enrich a search query
that is used during reindexing,\r\neffectively filtering out undesired
documents.\r\n\r\nIf the mappings
[match](https://github.com/elastic/kibana/pull/147371)\r\n(or they
are\r\n[compatible](#149326)) and
we _no\r\nlonger reindex_, this cleanup operation does not happen,
leaving\r\nundesired documents in our system indices.\r\n\r\nThe goal of
this PR is to add an extra step in the state
machine\r\n(`CLEANUP_UNKNOWN_AND_EXCLUDED`), which will actively cleanup
a system\r\nindex if we're going the _skip reindexing_
path.\r\n\r\n\r\n![image](https://user-images.githubusercontent.com/25349407/216979691-fef40638-f990-4850-bac8-ee3e58330a7f.png)\r\n\r\nFixes
https://github.com/elastic/kibana/issues/150299\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<42973632+kibanamachine@users.noreply.github.com>","sha":"754e8682d45196e519720693e036017c915c0379","branchLabelMapping":{"^v8.8.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["bug","Team:Core","release_note:skip","Feature:Migrations","backport:prev-minor","v8.7.0","v8.8.0"],"number":149931,"url":"https://github.com/elastic/kibana/pull/149931","mergeCommit":{"message":"Introduce
CLEANUP_UNKNOWN_AND_EXCLUDED step (#149931)\n\nIn the context of
migrations,\r\nhttps://github.com//pull/147371 avoids
reindexing during\r\nan upgrade, provided that `diffMappings ===
false`.\r\n\r\nThis _alternative path_ skips some key steps that are
performed before\r\nreindexing:\r\n* `CHECK_UNKNOWN_DOCUMENTS`\r\n*
`CALCULATE_EXCLUDE_FILTERS`\r\n\r\nThese steps enrich a search query
that is used during reindexing,\r\neffectively filtering out undesired
documents.\r\n\r\nIf the mappings
[match](https://github.com/elastic/kibana/pull/147371)\r\n(or they
are\r\n[compatible](#149326)) and
we _no\r\nlonger reindex_, this cleanup operation does not happen,
leaving\r\nundesired documents in our system indices.\r\n\r\nThe goal of
this PR is to add an extra step in the state
machine\r\n(`CLEANUP_UNKNOWN_AND_EXCLUDED`), which will actively cleanup
a system\r\nindex if we're going the _skip reindexing_
path.\r\n\r\n\r\n![image](https://user-images.githubusercontent.com/25349407/216979691-fef40638-f990-4850-bac8-ee3e58330a7f.png)\r\n\r\nFixes
https://github.com/elastic/kibana/issues/150299\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<42973632+kibanamachine@users.noreply.github.com>","sha":"754e8682d45196e519720693e036017c915c0379"}},"sourceBranch":"main","suggestedTargetBranches":["8.7"],"targetPullRequestStates":[{"branch":"8.7","label":"v8.7.0","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"main","label":"v8.8.0","labelRegex":"^v8.8.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/149931","number":149931,"mergeCommit":{"message":"Introduce
CLEANUP_UNKNOWN_AND_EXCLUDED step (#149931)\n\nIn the context of
migrations,\r\nhttps://github.com//pull/147371 avoids
reindexing during\r\nan upgrade, provided that `diffMappings ===
false`.\r\n\r\nThis _alternative path_ skips some key steps that are
performed before\r\nreindexing:\r\n* `CHECK_UNKNOWN_DOCUMENTS`\r\n*
`CALCULATE_EXCLUDE_FILTERS`\r\n\r\nThese steps enrich a search query
that is used during reindexing,\r\neffectively filtering out undesired
documents.\r\n\r\nIf the mappings
[match](https://github.com/elastic/kibana/pull/147371)\r\n(or they
are\r\n[compatible](#149326)) and
we _no\r\nlonger reindex_, this cleanup operation does not happen,
leaving\r\nundesired documents in our system indices.\r\n\r\nThe goal of
this PR is to add an extra step in the state
machine\r\n(`CLEANUP_UNKNOWN_AND_EXCLUDED`), which will actively cleanup
a system\r\nindex if we're going the _skip reindexing_
path.\r\n\r\n\r\n![image](https://user-images.githubusercontent.com/25349407/216979691-fef40638-f990-4850-bac8-ee3e58330a7f.png)\r\n\r\nFixes
https://github.com/elastic/kibana/issues/150299\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<42973632+kibanamachine@users.noreply.github.com>","sha":"754e8682d45196e519720693e036017c915c0379"}}]}]
BACKPORT-->
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Unknown metric groups

ESLint disabled line counts

id before after diff
securitySolution 428 430 +2

Total ESLint disabled count

id before after diff
securitySolution 506 508 +2

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @gsoldevila

Copy link
Contributor

@dokmic dokmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great job 👍

@gsoldevila gsoldevila merged commit 5dd8742 into elastic:main Mar 1, 2023
bmorelli25 pushed a commit to bmorelli25/kibana that referenced this pull request Mar 10, 2023
In the context of migrations,
elastic#147371 avoids reindexing during
an upgrade, provided that `diffMappings === false`.

This _alternative path_ skips some key steps that are performed before
reindexing:
* `CHECK_UNKNOWN_DOCUMENTS`
* `CALCULATE_EXCLUDE_FILTERS`

These steps enrich a search query that is used during reindexing,
effectively filtering out undesired documents.

If the mappings [match](elastic#147371)
(or they are
[compatible](elastic#149326)) and we _no
longer reindex_, this cleanup operation does not happen, leaving
undesired documents in our system indices.

The goal of this PR is to add an extra step in the state machine
(`CLEANUP_UNKNOWN_AND_EXCLUDED`), which will actively cleanup a system
index if we're going the _skip reindexing_ path.


![image](https://user-images.githubusercontent.com/25349407/216979691-fef40638-f990-4850-bac8-ee3e58330a7f.png)

Fixes elastic#150299

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
bmorelli25 pushed a commit to bmorelli25/kibana that referenced this pull request Mar 10, 2023
…ex (elastic#149326)

Fixes [elastic#147237](elastic#147237)

Based on the same principle as
[elastic#147371](elastic#147371), the goal of
this PR is to **avoid reindexing if possible**.
This time, the idea is to check whether the new mappings are still
compatible with the ones stored in ES.
To to so, we attempt to update the mappings in place in the existing
index, introducing a new `CHECK_COMPATIBLE_MAPPINGS` step:
* If the update operation fails, we assume the mappings are NOT
compatible, and we continue with the normal reindexing flow.
* If the update operation succeeds, we assume the mappings ARE
compatible, and we skip reindexing, just like
[elastic#147371](elastic#147371) does.


![image](https://user-images.githubusercontent.com/25349407/216979882-9fe9f034-b521-4171-b85d-50be6a13e179.png)

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
@rudolf rudolf added the Epic:ScaleMigrations Scale upgrade migrations to millions of saved objects label Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting enhancement New value added to drive a business result Epic:ScaleMigrations Scale upgrade migrations to millions of saved objects Feature:Migrations release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v8.8.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow for additive mappings update without creating a new version index
8 participants