-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allocate newly created indices on data_hot tier nodes #61342
Conversation
This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by default when they are created. This does not break existing behavior, as nodes with the `data` role are considered to be part of the hot tier. Users that separate their deployments by using the `data_hot` (and `data_warm`, `data_cold`, `data_frozen`) roles will have their data allocated on the hot tier nodes now by default. This change is a little more complicated than changing the default value for `index.routing.allocation.include._tier` from null to "data_hot". Instead, this adds the ability to have a plugin inject a setting into the builder for a newly created index. This has the benefit of allowing this setting to be visible as part of the settings when retrieving the index, for example: ``` // Create an index PUT /eggplant // Get an index GET /eggplant?flat_settings ``` Returns the default settings now of: ```json { "eggplant" : { "aliases" : { }, "mappings" : { }, "settings" : { "index.creation_date" : "1597855465598", "index.number_of_replicas" : "1", "index.number_of_shards" : "1", "index.provided_name" : "eggplant", "index.routing.allocation.include._tier" : "data_hot", "index.uuid" : "6ySG78s9RWGystRipoBFCA", "index.version.created" : "8000099" } } } ``` After the initial setting of this setting, it can be treated like any other index level setting. This new setting is *not* set on a new index if any of the following is true: - The index is created with an `index.routing.allocation.include._tier` setting - The index is created with an `index.routing.allocation.exclude._tier` setting - The index is created with an `index.routing.allocation.require._tier` setting - The index is created with a null `index.routing.allocation.include._tier` value - The index was created from an existing source metadata (shrink, clone, split, etc) Relates to elastic#60848
Pinging @elastic/es-core-features (:Core/Features/Features) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the pluggable mechanism should not be to hook into when an index is being created, but merely for a plugin to return a list of default index settings that are explicitly applied to an index upon index creation unless overridden by the create index request (template, source index from a shrink/split/clone). This at least reduces the scope of pluggability.
@jasontedor Ryan and I had a conversation and agreed that this sounds like a good solution and a reasonable addition for plugin extensibility. I'm going to work on an implementation that will change it to this behavior |
@elasticmachine run elasticsearch-ci/packaging-sample-windows (build timed out downloading gradle) |
@elasticmachine update branch |
Okay, I've updated this to use the discussed plugin method, and this is ready for review. @rjernst please take a look at the plugin integration point if you're interested It is also worth mentioning that @andreidan and I discussed adding an index-level setting that will opt-out of this behavior in #61377 (comment) but since the setting does not yet exist (to be added either in Andrei's PR on subsequent work), I cannot make use of it for some of our built-in indices (yet!). |
Why do we need a setting for that? Automatic allocation should only occur if no explicit allocation is defined? So if a user wants to opt out, it’s sufficient to define explicit allocation? |
The idea would be that it would be a single unified setting to opt out of
not only the initial hot allocation by default, but also the automatic tier
migration that ILM will do.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the direction and just a few comments inline with code...
server/src/main/java/org/elasticsearch/cluster/metadata/MetadataCreateIndexService.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/MetadataCreateIndexService.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/DataTier.java
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/DataTier.java
Outdated
Show resolved
Hide resolved
@@ -391,6 +391,7 @@ Returns: | |||
"index.creation_date": "1474389951325", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably want to update https://www.elastic.co/guide/en/elasticsearch/reference/7.9/shard-allocation-filtering.html with or soon after this PR. (also https://www.elastic.co/guide/en/elasticsearch/reference/7.9/modules-node.html#data-node could use an update, but is outside the scope of this PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely, I am leaving documentation for a subsequent PR though, in the chance that these end up having a name change. I will make sure they are handled in the documentation PR.
server/src/main/java/org/elasticsearch/index/shard/ExplicitIndexSettingProvider.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/DataTier.java
Show resolved
Hide resolved
...plugin/core/src/test/java/org/elasticsearch/xpack/cluster/routing/allocation/DataTierIT.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/MetadataCreateIndexService.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/cluster/metadata/MetadataCreateIndexService.java
Show resolved
Hide resolved
Thanks @jakelandis, I think I addressed your comments so far |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple comments on the plugin portion
server/src/main/java/org/elasticsearch/index/shard/ExplicitIndexSettingProvider.java
Outdated
Show resolved
Hide resolved
/** | ||
* Returns explicitly set default index {@link Settings} for the given index. | ||
*/ | ||
default Settings getExplicitIndexSettings(String indexName, Settings templateAndRequestSettings) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do without the passed in settings? I had imagined the caller would do the merging/defaulting logic, so it's unclear why we would need the user passed in settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the settings to be able to not set mutually exclusive settings, for example, if a user created an index with:
POST /myindex
{
"settings": {
"index.routing.allocation.require._name": "mynode"
}
}
We need to not set index.routing.allocation.include._tier
automatically, because otherwise we're constraining a new index to a tier when the user specifically wanted it constrained to single node (we check the all index level filtering settings). These settings are the only way we can make explicit default index settings reactive to other index level settings (such as adding an opt-out index level setting in the future)
server/src/main/java/org/elasticsearch/plugins/IndexSettingsProviderPlugin.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/shard/ExplicitIndexSettingProvider.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGMT. I don't have any strong opinion on the naming .. but thanks for comments you added, it helped alot to understand the purpose.
@elasticmachine run elasticsearch-ci/1 (timed out downloading something from archive.ubuntu.com) |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by default when they are created. This does not break existing behavior, as nodes with the `data` role are considered to be part of the hot tier. Users that separate their deployments by using the `data_hot` (and `data_warm`, `data_cold`, `data_frozen`) roles will have their data allocated on the hot tier nodes now by default. This change is a little more complicated than changing the default value for `index.routing.allocation.include._tier` from null to "data_hot". Instead, this adds the ability to have a plugin inject a setting into the builder for a newly created index. This has the benefit of allowing this setting to be visible as part of the settings when retrieving the index, for example: ``` // Create an index PUT /eggplant // Get an index GET /eggplant?flat_settings ``` Returns the default settings now of: ```json { "eggplant" : { "aliases" : { }, "mappings" : { }, "settings" : { "index.creation_date" : "1597855465598", "index.number_of_replicas" : "1", "index.number_of_shards" : "1", "index.provided_name" : "eggplant", "index.routing.allocation.include._tier" : "data_hot", "index.uuid" : "6ySG78s9RWGystRipoBFCA", "index.version.created" : "8000099" } } } ``` After the initial setting of this setting, it can be treated like any other index level setting. This new setting is *not* set on a new index if any of the following is true: - The index is created with an `index.routing.allocation.include.<anything>` setting - The index is created with an `index.routing.allocation.exclude.<anything>` setting - The index is created with an `index.routing.allocation.require.<anything>` setting - The index is created with a null `index.routing.allocation.include._tier` value - The index was created from an existing source metadata (shrink, clone, split, etc) Relates to elastic#60848
…61650) This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by default when they are created. This does not break existing behavior, as nodes with the `data` role are considered to be part of the hot tier. Users that separate their deployments by using the `data_hot` (and `data_warm`, `data_cold`, `data_frozen`) roles will have their data allocated on the hot tier nodes now by default. This change is a little more complicated than changing the default value for `index.routing.allocation.include._tier` from null to "data_hot". Instead, this adds the ability to have a plugin inject a setting into the builder for a newly created index. This has the benefit of allowing this setting to be visible as part of the settings when retrieving the index, for example: ``` // Create an index PUT /eggplant // Get an index GET /eggplant?flat_settings ``` Returns the default settings now of: ```json { "eggplant" : { "aliases" : { }, "mappings" : { }, "settings" : { "index.creation_date" : "1597855465598", "index.number_of_replicas" : "1", "index.number_of_shards" : "1", "index.provided_name" : "eggplant", "index.routing.allocation.include._tier" : "data_hot", "index.uuid" : "6ySG78s9RWGystRipoBFCA", "index.version.created" : "8000099" } } } ``` After the initial setting of this setting, it can be treated like any other index level setting. This new setting is *not* set on a new index if any of the following is true: - The index is created with an `index.routing.allocation.include.<anything>` setting - The index is created with an `index.routing.allocation.exclude.<anything>` setting - The index is created with an `index.routing.allocation.require.<anything>` setting - The index is created with a null `index.routing.allocation.include._tier` value - The index was created from an existing source metadata (shrink, clone, split, etc) Relates to #60848
This commit adds the functionality to allocate newly created indices on nodes in the "hot" tier by
default when they are created.
This does not break existing behavior, as nodes with the
data
role are considered to be part ofthe hot tier. Users that separate their deployments by using the
data_hot
(anddata_warm
,data_cold
,data_frozen
) roles will have their data allocated on the hot tier nodes now bydefault.
This change is a little more complicated than changing the default value for
index.routing.allocation.include._tier
from null to "data_hot". Instead, this adds the ability tohave a plugin inject a setting into the builder for a newly created index. This has the benefit of
allowing this setting to be visible as part of the settings when retrieving the index, for example:
Returns the default settings now of:
After the initial setting of this setting, it can be treated like any other index level setting.
This new setting is not set on a new index if any of the following is true:
index.routing.allocation.include.<anything>
settingindex.routing.allocation.exclude.<anything>
settingindex.routing.allocation.require.<anything>
settingindex.routing.allocation.include._tier
valueRelates to #60848