Reduce startup time by skipping update mappings step when possible #145604

gsoldevila · 2022-11-17T17:40:47Z

The goal of this PR is to reduce the startup time of the Kibana server by improving the migration logic.

The migration logic runs on every startup, whether or not the customer is upgrading.
Historically, these steps have been very quick, but we recently learned that some customers have more than one million Saved Objects stored, which makes the overall startup process slow even when there are no migrations to perform.

This PR specifically targets the case where there are no migrations to perform, i.e. a Kibana node is started against an ES cluster that is already up to date with respect to the stack version and the list of plugins.

In this scenario, we aim to skip the UPDATE_TARGET_MAPPINGS step of the migration logic. Internally, this step runs the updateAndPickupMappings method, which turns out to be expensive when the system indices contain a large number of Saved Objects (SO).

I also tested the following scenarios locally:

- Fresh install. The step is not even run, as the .kibana index does not exist yet ✅
- Stack version + list of plugins up to date. Simply restarting Kibana after the fresh install. The step is run and leads to DONE, as the MD5 hashes match those stored in .kibana._mapping._meta ✅
- Faking re-enabling an old plugin. I manually removed one of the MD5 hashes from the stored .kibana._mapping._meta through curl, and then restarted Kibana. The step is run and leads to UPDATE_TARGET_MAPPINGS, as it did before this PR ✅
- Faking updating a plugin. Same as the previous one, but altering an existing MD5 hash stored in the _meta ✅

This is the curl command I used to tamper with the stored _meta:

curl -X PUT "kibana:changeme@localhost:9200/.kibana/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
  "_meta": {
    "migrationMappingPropertyHashes": {
      "references": "7997cf5a56cc02bdc9c93361bde732b0"
    }
  }
}
'
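The comparison exercised by the scenarios above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: the helper name `hashesMatch`, the `dashboard` property, and the rule that every expected hash must be present and identical in the stored `_meta` are assumptions. Only the `migrationMappingPropertyHashes` key and the `references` hash come from the description.

```typescript
// Shape of `_meta.migrationMappingPropertyHashes`: root mapping property -> MD5 hash.
type MappingPropertyHashes = Record<string, string>;

// Hypothetical helper (not from the PR): returns true only when every hash
// expected for the current plugin set is stored with the same value.
function hashesMatch(
  expected: MappingPropertyHashes,
  stored: MappingPropertyHashes | undefined
): boolean {
  if (stored === undefined) {
    return false; // no hashes stored yet, so the mappings must be updated
  }
  return Object.entries(expected).every(([property, hash]) => stored[property] === hash);
}

// `referencesHash` is taken from the curl command; the other hashes are placeholders.
const referencesHash = '7997cf5a56cc02bdc9c93361bde732b0';
const placeholderHash = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa';
const expectedHashes: MappingPropertyHashes = {
  references: referencesHash,
  dashboard: placeholderHash,
};

// Scenario: stack version and plugins up to date.
const upToDate = hashesMatch(expectedHashes, { ...expectedHashes });
// Scenario: a hash was removed from the stored _meta (re-enabling an old plugin).
const hashRemoved = hashesMatch(expectedHashes, { references: referencesHash });
// Scenario: a stored hash was altered (updating a plugin).
const hashAltered = hashesMatch(expectedHashes, {
  references: referencesHash,
  dashboard: 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb',
});
```

In this sketch only the first call reports a match; the two tampered cases report a mismatch, which corresponds to falling back to UPDATE_TARGET_MAPPINGS.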

elasticmachine · 2022-11-17T17:40:50Z

Pinging @elastic/kibana-core (Team:Core)

rudolf

I just did a brief first pass and will take a more detailed look later.

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/model/model.ts

gsoldevila · 2022-11-21T17:30:47Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/model/model.ts

+    } else {
+      throwBadResponse(stateP, res as never);
+    }
+  } else if (stateP.controlState === 'CHECK_VERSION_INDEX_READY_ACTIONS') {


I created this "logic-only" step to avoid duplicating code in model.ts.
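A minimal sketch of what a "logic-only" step could look like, assuming the step only decides between two follow-up states. The `SketchState` shape, the `versionIndexReadyActions` field, and the `MARK_VERSION_INDEX_READY` state name are assumptions made for illustration; they are not confirmed by this conversation.

```typescript
// Simplified, hypothetical state shape; the real migration state has many more fields.
interface SketchState {
  controlState: string;
  // Actions still to run before the version index is ready (assumed field).
  versionIndexReadyActions?: string[];
}

// A "logic-only" step performs no Elasticsearch request: the next control
// state is derived purely from the current state.
function checkVersionIndexReadyActions(state: SketchState): SketchState {
  const pending = state.versionIndexReadyActions ?? [];
  return {
    ...state,
    controlState: pending.length > 0 ? 'MARK_VERSION_INDEX_READY' : 'DONE',
  };
}

// Several earlier steps can transition to this one step instead of each
// repeating the branch above.
const afterSkip = checkVersionIndexReadyActions({
  controlState: 'CHECK_VERSION_INDEX_READY_ACTIONS',
});
const afterUpdate = checkVersionIndexReadyActions({
  controlState: 'CHECK_VERSION_INDEX_READY_ACTIONS',
  versionIndexReadyActions: ['hypothetical_alias_action'],
});
```

The point of the design is that the branching lives in one place, so every path that reaches the end of the migration shares it.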

rudolf

Looking good! We never relied on writing or reading _meta before, so we don't have coverage for it in src/core/server/integration_tests/saved_objects/migrations/actions/actions.test.ts.

I think it would be worth updating those tests to ensure that initAction reads the _meta and that updateTargetMappingsMeta writes it.

rudolf · 2022-11-24T13:59:30Z

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/model/model.ts

      };
    } else {
      throwBadResponse(stateP, res);
    }
+  } else if (stateP.controlState === 'CHECK_TARGET_MAPPINGS') {
+    const res = resW as ResponseType<typeof stateP.controlState>;
+    if (Either.isLeft(res) || !res.right.match) {


A left response would mean something really unexpected happened, such as ES returning an error that we didn't know about before. I'm not sure we should just ignore that and apply the new mappings in this case. In general, we would call throwBadResponse for such a left.
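The suggested handling could look roughly like the sketch below. It uses a minimal stand-in for the Either type rather than the library used in model.ts, and a plain `throw` in place of throwBadResponse. The function name `nextStateAfterCheck`, the error payload shape, and the choice of CHECK_VERSION_INDEX_READY_ACTIONS as the state that follows a match are assumptions.

```typescript
// Minimal stand-in for the Either type used in the diff.
type Left<E> = { _tag: 'Left'; left: E };
type Right<A> = { _tag: 'Right'; right: A };
type Either<E, A> = Left<E> | Right<A>;

type NextState = 'UPDATE_TARGET_MAPPINGS' | 'CHECK_VERSION_INDEX_READY_ACTIONS';

// Hypothetical transition for the CHECK_TARGET_MAPPINGS step.
function nextStateAfterCheck(res: Either<{ type: string }, { match: boolean }>): NextState {
  if (res._tag === 'Left') {
    // Unexpected failure: surface it instead of silently updating the mappings.
    throw new Error(`Unexpected response: ${res.left.type}`);
  }
  // Only a successful response with match === false triggers the update.
  return res.right.match ? 'CHECK_VERSION_INDEX_READY_ACTIONS' : 'UPDATE_TARGET_MAPPINGS';
}

const onMatch = nextStateAfterCheck({ _tag: 'Right', right: { match: true } });
const onMismatch = nextStateAfterCheck({ _tag: 'Right', right: { match: false } });

let leftThrew = false;
try {
  nextStateAfterCheck({ _tag: 'Left', left: { type: 'unknown_es_error' } });
} catch {
  leftThrew = true;
}
```

Compared with the condition in the diff, `Either.isLeft(res) || !res.right.match`, this separates "the check failed" from "the check succeeded and found a mismatch", so only the latter leads to UPDATE_TARGET_MAPPINGS.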

kibana-ci · 2022-11-28T12:39:42Z

💚 Build Succeeded

Buildkite Build
Commit: bf74272

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id	before	after	diff
`@kbn/core-saved-objects-migration-server-internal`	78	79	+1

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id	before	after	diff
`@kbn/core-saved-objects-migration-server-internal`	44	45	+1

Unknown metric groups

API count

id	before	after	diff
`@kbn/core-saved-objects-migration-server-internal`	110	112	+2

ESLint disabled in files

id	before	after	diff
`osquery`	1	2	+1

ESLint disabled line counts

id	before	after	diff
`enterpriseSearch`	19	21	+2
`fleet`	59	65	+6
`osquery`	109	115	+6
`securitySolution`	443	449	+6
total			+20

Total ESLint disabled count

id	before	after	diff
`enterpriseSearch`	20	22	+2
`fleet`	68	74	+6
`osquery`	110	117	+7
`securitySolution`	520	526	+6
total			+21

History

💚 Build #90622 succeeded 4c2eb97
💔 Build #90616 failed 7a54336
💚 Build #90354 succeeded d9a923b
💔 Build #90310 failed 0ec3b76
💔 Build #89973 failed 6b7f12a
💔 Build #89750 failed 14d3c27

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

kibanamachine · 2022-11-28T14:38:59Z

💔 All backports failed

Status	Branch	Result
❌	8.6	Backport failed because of merge conflicts

Manual backport

To create the backport manually run:

node scripts/backport --pr 145604

Questions ?

Please refer to the Backport tool documentation

gsoldevila · 2022-11-29T21:53:41Z

💚 All backports created successfully

Status	Branch	Result
✅	8.6

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

…lastic#145604)

The goal of this PR is to reduce the startup times of Kibana server by improving the migration logic.

Fixes elastic#145743
Related elastic#144035)

The migration logic is run systematically at startup, whether the customers are upgrading or not.
Historically, these steps have been very quick, but we recently found out about some customers that have more than **one million** Saved Objects stored, making the overall startup process slow, even when there are no migrations to perform.

This PR specifically targets the case where there are no migrations to perform, aka a Kibana node is started against an ES cluster that is already up to date wrt stack version and list of plugins.

In this scenario, we aim at skipping the `UPDATE_TARGET_MAPPINGS` step of the migration logic, which internally runs the `updateAndPickupMappings` method, which turns out to be expensive if the system indices contain lots of SO.

I locally tested the following scenarios too:

- **Fresh install.** The step is not even run, as the `.kibana` index did not exist ✅
- **Stack version + list of plugins up to date.** Simply restarting Kibana after the fresh install. The step is run and leads to `DONE`, as the md5 hashes match those stored in `.kibana._mapping._meta` ✅
- **Faking re-enabling an old plugin.** I manually removed one of the MD5 hashes from the stored .kibana._mapping._meta through `curl`, and then restarted Kibana. The step is run and leads to `UPDATE_TARGET_MAPPINGS` as it used to before the PR ✅
- **Faking updating a plugin.** Same as the previous one, but altering an existing md5 stored in the metas. ✅

And that is the curl command used to tamper with the stored _meta:

```bash
curl -X PUT "kibana:changeme@localhost:9200/.kibana/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
  "_meta": {
    "migrationMappingPropertyHashes": {
      "references": "7997cf5a56cc02bdc9c93361bde732b0"
    }
  }
}
'
```

(cherry picked from commit b1e18a0)

# Conflicts:
# packages/core/saved-objects/core-saved-objects-migration-server-internal/src/actions/index.ts

…ble (#145604) (#146637)

# Backport

This will backport the following commits from `main` to `8.6`:
- [Reduce startup time by skipping update mappings step when possible (#145604)](#145604)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Reduce startup time by skipping update mappings step when possible

dc059f4

gsoldevila requested a review from a team as a code owner November 17, 2022 17:40

rudolf reviewed Nov 17, 2022

packages/core/saved-objects/core-saved-objects-migration-server-internal/src/model/model.ts (outdated)

rudolf mentioned this pull request Nov 17, 2022

[SavedObjectsMigrations] migrations.skip: true turns Kibana's status in RED #145558

Closed

Update _mapping._meta on a separate step

14d3c27

gsoldevila commented Nov 21, 2022

gsoldevila added 6 commits November 22, 2022 14:34

Fix types errors, fix existing UTs

6b7f12a

Merge branch 'main' into kbna-7976-reduce-startup-time

705fd76

Add UTs

0ec3b76

Fix incorrect UTs

d9a923b

Add integration tests for the new steps

7be7eda

Merge branch 'main' into kbna-7976-reduce-startup-time

7a54336

gsoldevila requested a review from rudolf November 24, 2022 11:56

Fix incorrect import

4c2eb97

rudolf reviewed Nov 24, 2022

rudolf added the Feature:Migrations label Nov 28, 2022

gsoldevila added 2 commits November 28, 2022 12:40

Address PR comments

b539f6b

Merge branch 'main' into kbna-7976-reduce-startup-time

bf74272

gsoldevila requested a review from rudolf November 28, 2022 11:41

Update state machine documentation

7bcd062

rudolf approved these changes Nov 28, 2022

gsoldevila merged commit b1e18a0 into elastic:main Nov 28, 2022

gsoldevila mentioned this pull request Nov 29, 2022

[8.6] Reduce startup time by skipping update mappings step when possible (#145604) #146637

Merged

kibanamachine added the v8.6.0 label Nov 30, 2022

rudolf mentioned this pull request Dec 12, 2022

Allow for additive mappings update without creating a new version index #147237

Closed

gsoldevila mentioned this pull request Dec 13, 2022

Migration logic doesn't set the right mappings when upgrading to a newer version if the mappings don't change #147450

Closed

rudolf added the Epic:ScaleMigrations Scale upgrade migrations to millions of saved objects label Jun 2, 2023
