Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce startup time by skipping update mappings step when possible #145604

Merged
merged 12 commits into from
Nov 28, 2022

Conversation

gsoldevila
Copy link
Contributor

@gsoldevila gsoldevila commented Nov 17, 2022

The goal of this PR is to reduce the startup times of Kibana server by improving the migration logic.

Fixes #145743
Related #144035)

The migration logic is run systematically at startup, whether the customers are upgrading or not.
Historically, these steps have been very quick, but we recently found out about some customers that have more than one million Saved Objects stored, making the overall startup process slow, even when there are no migrations to perform.

This PR specifically targets the case where there are no migrations to perform, aka a Kibana node is started against an ES cluster that is already up to date wrt stack version and list of plugins.

In this scenario, we aim at skipping the UPDATE_TARGET_MAPPINGS step of the migration logic, which internally runs the updateAndPickupMappings method, which turns out to be expensive if the system indices contain lots of SO.

I locally tested the following scenarios too:

  • Fresh install. The step is not even run, as the .kibana index did not exist ✅
  • Stack version + list of plugins up to date. Simply restarting Kibana after the fresh install. The step is run and leads to DONE, as the md5 hashes match those stored in .kibana._mapping._meta
  • Faking re-enabling an old plugin. I manually removed one of the MD5 hashes from the stored .kibana._mapping._meta through curl, and then restarted Kibana. The step is run and leads to UPDATE_TARGET_MAPPINGS as it used to before the PR ✅
  • Faking updating a plugin. Same as the previous one, but altering an existing md5 stored in the metas. ✅

And that is the curl command used to tamper with the stored _meta:

curl -X PUT "kibana:changeme@localhost:9200/.kibana/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
  "_meta": {
      "migrationMappingPropertyHashes": {
        "references": "7997cf5a56cc02bdc9c93361bde732b0",
      }
  }
}
'

@gsoldevila gsoldevila added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc enhancement New value added to drive a business result release_note:skip Skip the PR/issue when compiling release notes backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) v8.7.0 labels Nov 17, 2022
@gsoldevila gsoldevila requested a review from a team as a code owner November 17, 2022 17:40
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

Copy link
Contributor

@rudolf rudolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did a brief first pass and will take a more detailed look later

} else {
throwBadResponse(stateP, res as never);
}
} else if (stateP.controlState === 'CHECK_VERSION_INDEX_READY_ACTIONS') {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created this "logic-only" step to avoid duplicating code in the model.ts.

@gsoldevila gsoldevila requested a review from rudolf November 24, 2022 11:56
Copy link
Contributor

@rudolf rudolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! We never relied on writing/reading _meta so we don't have coverage for it in src/core/server/integration_tests/saved_objects/migrations/actions/actions.test.ts

I think it'd be worth to update those tests to ensure that initAction reads the _meta and that updateTargetMappingsMeta writes the _meta.

};
} else {
throwBadResponse(stateP, res);
}
} else if (stateP.controlState === 'CHECK_TARGET_MAPPINGS') {
const res = resW as ResponseType<typeof stateP.controlState>;
if (Either.isLeft(res) || !res.right.match) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a left response would mean something really unexpected happened like ES returning an error that we didn't know about before. I'm not sure we should just ignore that and apply new mappings in this case. In general we would call throwBadResponse for such a left.

@gsoldevila gsoldevila requested a review from rudolf November 28, 2022 11:41
@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/core-saved-objects-migration-server-internal 78 79 +1

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id before after diff
@kbn/core-saved-objects-migration-server-internal 44 45 +1
Unknown metric groups

API count

id before after diff
@kbn/core-saved-objects-migration-server-internal 110 112 +2

ESLint disabled in files

id before after diff
osquery 1 2 +1

ESLint disabled line counts

id before after diff
enterpriseSearch 19 21 +2
fleet 59 65 +6
osquery 109 115 +6
securitySolution 443 449 +6
total +20

Total ESLint disabled count

id before after diff
enterpriseSearch 20 22 +2
fleet 68 74 +6
osquery 110 117 +7
securitySolution 520 526 +6
total +21

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@gsoldevila gsoldevila merged commit b1e18a0 into elastic:main Nov 28, 2022
@kibanamachine
Copy link
Contributor

💔 All backports failed

Status Branch Result
8.6 Backport failed because of merge conflicts

Manual backport

To create the backport manually run:

node scripts/backport --pr 145604

Questions ?

Please refer to the Backport tool documentation

@gsoldevila
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
8.6

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

gsoldevila added a commit to gsoldevila/kibana that referenced this pull request Nov 29, 2022
…lastic#145604)

The goal of this PR is to reduce the startup times of Kibana server by
improving the migration logic.

Fixes elastic#145743
Related elastic#144035)

The migration logic is run systematically at startup, whether the
customers are upgrading or not.
Historically, these steps have been very quick, but we recently found
out about some customers that have more than **one million** Saved
Objects stored, making the overall startup process slow, even when there
are no migrations to perform.

This PR specifically targets the case where there are no migrations to
perform, aka a Kibana node is started against an ES cluster that is
already up to date wrt stack version and list of plugins.

In this scenario, we aim at skipping the `UPDATE_TARGET_MAPPINGS` step
of the migration logic, which internally runs the
`updateAndPickupMappings` method, which turns out to be expensive if the
system indices contain lots of SO.

I locally tested the following scenarios too:

- **Fresh install.** The step is not even run, as the `.kibana` index
did not exist ✅
- **Stack version + list of plugins up to date.** Simply restarting
Kibana after the fresh install. The step is run and leads to `DONE`, as
the md5 hashes match those stored in `.kibana._mapping._meta` ✅
- **Faking re-enabling an old plugin.** I manually removed one of the
MD5 hashes from the stored .kibana._mapping._meta through `curl`, and
then restarted Kibana. The step is run and leads to
`UPDATE_TARGET_MAPPINGS` as it used to before the PR ✅
- **Faking updating a plugin.** Same as the previous one, but altering
an existing md5 stored in the metas. ✅

And that is the curl command used to tamper with the stored _meta:
```bash
curl -X PUT "kibana:changeme@localhost:9200/.kibana/_mapping?pretty" -H 'Content-Type: application/json' -d'
{
  "_meta": {
      "migrationMappingPropertyHashes": {
        "references": "7997cf5a56cc02bdc9c93361bde732b0",
      }
  }
}
'
```

(cherry picked from commit b1e18a0)

# Conflicts:
#	packages/core/saved-objects/core-saved-objects-migration-server-internal/src/actions/index.ts
gsoldevila added a commit that referenced this pull request Nov 30, 2022
…ble (#145604) (#146637)

# Backport

This will backport the following commits from `main` to `8.6`:
- [Reduce startup time by skipping update mappings step when possible
(#145604)](#145604)

<!--- Backport version: 8.9.7 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Gerard
Soldevila","email":"gerard.soldevila@elastic.co"},"sourceCommit":{"committedDate":"2022-11-28T14:34:58Z","message":"Reduce
startup time by skipping update mappings step when possible
(#145604)\n\nThe goal of this PR is to reduce the startup times of
Kibana server by\r\nimproving the migration logic.\r\n\r\nFixes
https://github.com/elastic/kibana/issues/145743\r\nRelated
https://github.com/elastic/kibana/issues/144035)\r\n\r\nThe migration
logic is run systematically at startup, whether the\r\ncustomers are
upgrading or not.\r\nHistorically, these steps have been very quick, but
we recently found\r\nout about some customers that have more than **one
million** Saved\r\nObjects stored, making the overall startup process
slow, even when there\r\nare no migrations to perform.\r\n\r\nThis PR
specifically targets the case where there are no migrations
to\r\nperform, aka a Kibana node is started against an ES cluster that
is\r\nalready up to date wrt stack version and list of
plugins.\r\n\r\nIn this scenario, we aim at skipping the
`UPDATE_TARGET_MAPPINGS` step\r\nof the migration logic, which
internally runs the\r\n`updateAndPickupMappings` method, which turns out
to be expensive if the\r\nsystem indices contain lots of
SO.\r\n\r\n\r\nI locally tested the following scenarios too:\r\n\r\n-
**Fresh install.** The step is not even run, as the `.kibana`
index\r\ndid not exist ✅\r\n- **Stack version + list of plugins up to
date.** Simply restarting\r\nKibana after the fresh install. The step is
run and leads to `DONE`, as\r\nthe md5 hashes match those stored in
`.kibana._mapping._meta` ✅\r\n- **Faking re-enabling an old plugin.** I
manually removed one of the\r\nMD5 hashes from the stored
.kibana._mapping._meta through `curl`, and\r\nthen restarted Kibana. The
step is run and leads to\r\n`UPDATE_TARGET_MAPPINGS` as it used to
before the PR ✅\r\n- **Faking updating a plugin.** Same as the previous
one, but altering\r\nan existing md5 stored in the metas. ✅\r\n\r\nAnd
that is the curl command used to tamper with the stored
_meta:\r\n```bash\r\ncurl -X PUT
\"kibana:changeme@localhost:9200/.kibana/_mapping?pretty\" -H
'Content-Type: application/json' -d'\r\n{\r\n \"_meta\": {\r\n
\"migrationMappingPropertyHashes\": {\r\n \"references\":
\"7997cf5a56cc02bdc9c93361bde732b0\",\r\n }\r\n
}\r\n}\r\n'\r\n```","sha":"b1e18a0414ed99456706119d15173b687c6e7366","branchLabelMapping":{"^v8.7.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Team:Core","enhancement","release_note:skip","Feature:Migrations","backport:prev-minor","v8.7.0"],"number":145604,"url":"https://github.com/elastic/kibana/pull/145604","mergeCommit":{"message":"Reduce
startup time by skipping update mappings step when possible
(#145604)\n\nThe goal of this PR is to reduce the startup times of
Kibana server by\r\nimproving the migration logic.\r\n\r\nFixes
https://github.com/elastic/kibana/issues/145743\r\nRelated
https://github.com/elastic/kibana/issues/144035)\r\n\r\nThe migration
logic is run systematically at startup, whether the\r\ncustomers are
upgrading or not.\r\nHistorically, these steps have been very quick, but
we recently found\r\nout about some customers that have more than **one
million** Saved\r\nObjects stored, making the overall startup process
slow, even when there\r\nare no migrations to perform.\r\n\r\nThis PR
specifically targets the case where there are no migrations
to\r\nperform, aka a Kibana node is started against an ES cluster that
is\r\nalready up to date wrt stack version and list of
plugins.\r\n\r\nIn this scenario, we aim at skipping the
`UPDATE_TARGET_MAPPINGS` step\r\nof the migration logic, which
internally runs the\r\n`updateAndPickupMappings` method, which turns out
to be expensive if the\r\nsystem indices contain lots of
SO.\r\n\r\n\r\nI locally tested the following scenarios too:\r\n\r\n-
**Fresh install.** The step is not even run, as the `.kibana`
index\r\ndid not exist ✅\r\n- **Stack version + list of plugins up to
date.** Simply restarting\r\nKibana after the fresh install. The step is
run and leads to `DONE`, as\r\nthe md5 hashes match those stored in
`.kibana._mapping._meta` ✅\r\n- **Faking re-enabling an old plugin.** I
manually removed one of the\r\nMD5 hashes from the stored
.kibana._mapping._meta through `curl`, and\r\nthen restarted Kibana. The
step is run and leads to\r\n`UPDATE_TARGET_MAPPINGS` as it used to
before the PR ✅\r\n- **Faking updating a plugin.** Same as the previous
one, but altering\r\nan existing md5 stored in the metas. ✅\r\n\r\nAnd
that is the curl command used to tamper with the stored
_meta:\r\n```bash\r\ncurl -X PUT
\"kibana:changeme@localhost:9200/.kibana/_mapping?pretty\" -H
'Content-Type: application/json' -d'\r\n{\r\n \"_meta\": {\r\n
\"migrationMappingPropertyHashes\": {\r\n \"references\":
\"7997cf5a56cc02bdc9c93361bde732b0\",\r\n }\r\n
}\r\n}\r\n'\r\n```","sha":"b1e18a0414ed99456706119d15173b687c6e7366"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v8.7.0","labelRegex":"^v8.7.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/145604","number":145604,"mergeCommit":{"message":"Reduce
startup time by skipping update mappings step when possible
(#145604)\n\nThe goal of this PR is to reduce the startup times of
Kibana server by\r\nimproving the migration logic.\r\n\r\nFixes
https://github.com/elastic/kibana/issues/145743\r\nRelated
https://github.com/elastic/kibana/issues/144035)\r\n\r\nThe migration
logic is run systematically at startup, whether the\r\ncustomers are
upgrading or not.\r\nHistorically, these steps have been very quick, but
we recently found\r\nout about some customers that have more than **one
million** Saved\r\nObjects stored, making the overall startup process
slow, even when there\r\nare no migrations to perform.\r\n\r\nThis PR
specifically targets the case where there are no migrations
to\r\nperform, aka a Kibana node is started against an ES cluster that
is\r\nalready up to date wrt stack version and list of
plugins.\r\n\r\nIn this scenario, we aim at skipping the
`UPDATE_TARGET_MAPPINGS` step\r\nof the migration logic, which
internally runs the\r\n`updateAndPickupMappings` method, which turns out
to be expensive if the\r\nsystem indices contain lots of
SO.\r\n\r\n\r\nI locally tested the following scenarios too:\r\n\r\n-
**Fresh install.** The step is not even run, as the `.kibana`
index\r\ndid not exist ✅\r\n- **Stack version + list of plugins up to
date.** Simply restarting\r\nKibana after the fresh install. The step is
run and leads to `DONE`, as\r\nthe md5 hashes match those stored in
`.kibana._mapping._meta` ✅\r\n- **Faking re-enabling an old plugin.** I
manually removed one of the\r\nMD5 hashes from the stored
.kibana._mapping._meta through `curl`, and\r\nthen restarted Kibana. The
step is run and leads to\r\n`UPDATE_TARGET_MAPPINGS` as it used to
before the PR ✅\r\n- **Faking updating a plugin.** Same as the previous
one, but altering\r\nan existing md5 stored in the metas. ✅\r\n\r\nAnd
that is the curl command used to tamper with the stored
_meta:\r\n```bash\r\ncurl -X PUT
\"kibana:changeme@localhost:9200/.kibana/_mapping?pretty\" -H
'Content-Type: application/json' -d'\r\n{\r\n \"_meta\": {\r\n
\"migrationMappingPropertyHashes\": {\r\n \"references\":
\"7997cf5a56cc02bdc9c93361bde732b0\",\r\n }\r\n
}\r\n}\r\n'\r\n```","sha":"b1e18a0414ed99456706119d15173b687c6e7366"}}]}]
BACKPORT-->
@rudolf rudolf added the Epic:ScaleMigrations Scale upgrade migrations to millions of saved objects label Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) enhancement New value added to drive a business result Epic:ScaleMigrations Scale upgrade migrations to millions of saved objects Feature:Migrations release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v8.6.0 v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

reduce startup time by skipping update mappings step
5 participants