-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ingest ingest then create index #39607
Ingest ingest then create index #39607
Conversation
Pinging @elastic/es-core-features |
@@ -91,4 +91,4 @@ teardown: | |||
get: | |||
index: test | |||
id: 3 | |||
- match: { found: false } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is removed because with this change the test
index is no longer created since the index creation only happens after ingest (and this ingest drops the index request, thus no index ).
if (IngestService.NOOP_PIPELINE_NAME.equals(defaultPipeline) == false) { | ||
hasIndexRequestsWithPipelines = true; | ||
} | ||
} else if (indexRequest.index() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving the ingest node execution before the create index is the primary change of this PR.
However, to allow the prevent re-introducing #32758, we need to look up the default pipeline from the matched templates which is what this code block does.
@@ -460,7 +462,7 @@ public void testUseDefaultPipelineWithBulkUpsert() throws Exception { | |||
verifyZeroInteractions(transportService); | |||
} | |||
|
|||
public void testCreateIndexBeforeRunPipeline() throws Exception { | |||
public void testDoExecuteCalledTwiceCorrectly() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is valid, but it's name was slightly misleading (it still passed with my change to run pipeline before the create index)
-------------------------------------------------- | ||
// TESTRESPONSE | ||
//// | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these were removed because now that the ingest happens before index creation, if the index doesn't exist yet, and we only drop a document, the index never exists, and thus the response is different.
@jakelandis I checked this out and the approach in the pr looks good to me. |
@martijnvg @original-brownbear thanks for taking a look. I forgot to pull the PR out of draft mode, could I get a review approval or LGTM (assuming this is good to go). |
@jakelandis giving it a thorough review now then, won't take long :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one problem, otherwise LGTM
server/src/main/java/org/elasticsearch/action/bulk/TransportBulkAction.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Prior to this commit (and after 6.5.0), if an ingest node changes the _index in a pipeline, the original target index would be created. For daily indexes this could create an extra, empty index per day. This commit changes the TransportBulkAction to execute the ingest node pipeline before attempting to create the index. This ensures that the only index created is the original or one set by the ingest node pipeline. This was the execution order prior to 6.5.0 (elastic#32786). The execution order was changed in 6.5 to better support default pipelines. Specifically the execution order was changed to be able to read the settings from the index meta data. This commit also includes a change in logic such that if the target index does not exist when ingest node pipeline runs, it will now pull the default pipeline (if one exists) from the settings of the best matched of the index template. Relates elastic#32786 Relates elastic#32758 Closes elastic#36545
Prior to this commit (and after 6.5.0), if an ingest node changes the _index in a pipeline, the original target index would be created. For daily indexes this could create an extra, empty index per day. This commit changes the TransportBulkAction to execute the ingest node pipeline before attempting to create the index. This ensures that the only index created is the original or one set by the ingest node pipeline. This was the execution order prior to 6.5.0 (elastic#32786). The execution order was changed in 6.5 to better support default pipelines. Specifically the execution order was changed to be able to read the settings from the index meta data. This commit also includes a change in logic such that if the target index does not exist when ingest node pipeline runs, it will now pull the default pipeline (if one exists) from the settings of the best matched of the index template. Relates elastic#32786 Relates elastic#32758 Closes elastic#36545
Prior to this commit (and after 6.5.0), if an ingest node changes the _index in a pipeline, the original target index would be created. For daily indexes this could create an extra, empty index per day. This commit changes the TransportBulkAction to execute the ingest node pipeline before attempting to create the index. This ensures that the only index created is the original or one set by the ingest node pipeline. This was the execution order prior to 6.5.0 (elastic#32786). The execution order was changed in 6.5 to better support default pipelines. Specifically the execution order was changed to be able to read the settings from the index meta data. This commit also includes a change in logic such that if the target index does not exist when ingest node pipeline runs, it will now pull the default pipeline (if one exists) from the settings of the best matched of the index template. Relates elastic#32786 Relates elastic#32758 Closes elastic#36545
Prior to this commit (and after 6.5.0), if an ingest node changes the _index in a pipeline, the original target index would be created. For daily indexes this could create an extra, empty index per day. This commit changes the TransportBulkAction to execute the ingest node pipeline before attempting to create the index. This ensures that the only index created is the original or one set by the ingest node pipeline. This was the execution order prior to 6.5.0 (#32786). The execution order was changed in 6.5 to better support default pipelines. Specifically the execution order was changed to be able to read the settings from the index meta data. This commit also includes a change in logic such that if the target index does not exist when ingest node pipeline runs, it will now pull the default pipeline (if one exists) from the settings of the best matched of the index template. Relates #32786 Relates #32758 Closes #36545
Prior to this commit (and after 6.5.0), if an ingest node changes the _index in a pipeline, the original target index would be created. For daily indexes this could create an extra, empty index per day. This commit changes the TransportBulkAction to execute the ingest node pipeline before attempting to create the index. This ensures that the only index created is the original or one set by the ingest node pipeline. This was the execution order prior to 6.5.0 (#32786). The execution order was changed in 6.5 to better support default pipelines. Specifically the execution order was changed to be able to read the settings from the index meta data. This commit also includes a change in logic such that if the target index does not exist when ingest node pipeline runs, it will now pull the default pipeline (if one exists) from the settings of the best matched of the index template. Relates #32786 Relates #32758 Closes #36545
Prior to this commit (and after 6.5.0), if an ingest node changes the _index in a pipeline, the original target index would be created. For daily indexes this could create an extra, empty index per day. This commit changes the TransportBulkAction to execute the ingest node pipeline before attempting to create the index. This ensures that the only index created is the original or one set by the ingest node pipeline. This was the execution order prior to 6.5.0 (#32786). The execution order was changed in 6.5 to better support default pipelines. Specifically the execution order was changed to be able to read the settings from the index meta data. This commit also includes a change in logic such that if the target index does not exist when ingest node pipeline runs, it will now pull the default pipeline (if one exists) from the settings of the best matched of the index template. Relates #32786 Relates #32758 Closes #36545
Execute ingest node pipeline before creating the index
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.
This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the
only index created is the original or one set by the ingest node pipeline.
This was the execution order prior to 6.5.0 (#32786).
The execution order was changed in 6.5 to better support default pipelines.
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such
that if the target index does not exist when ingest node pipeline runs, it
will now pull the default pipeline (if one exists) from the settings of the
best matched of the index template.
Relates #32786
Relates #32758
Closes #36545
There is some history here...
Originally ingest node executed before any creation of indices. We introduced default pipelines and the discovered #32758 (the first doc in an index failed to get run through the pipeline). So we fixed that via #32758 (changes to create the index first). However, that fix introduced #36545 (comment) (note - the title is misleading) where now if the pipeline re-directs the
_index
it still creates the original_index
resulting in an empty index. This is particularly problematic for daily indices that always re-directed, since it would result in an empty index per day. This (re-directing daily indices) is also how upgrading monitoring templates work, and thus this PR is a blocker to #39336.The fix here is to go back to the original order where ingest node executes before creation of indexes, but solve #32758 differently. Instead of creating the index and always reading the index meta data, if the index does not exist we can now do the index template matching to look for the default pipeline.