Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

INGEST: Create Index Before Pipeline Execute #32786

Merged
merged 9 commits into from
Aug 14, 2018

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Aug 10, 2018

* Ensures that indices are created before the default pipeline setting is read to correcly handle the case of an index template containing a default pipeline (without the fix the first document does not get the pipeline applied as explained in elastic#32758)
* closes elastic#32758
@original-brownbear original-brownbear added >bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v7.0.0 v6.5.0 labels Aug 10, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A unit test is preferable, and can fit in TransportBulkActionTests? I don't think there should be an integration test, as the behavior is completely internal to TransportBulkAction.

@original-brownbear
Copy link
Member Author

@rjernst

the problem with a UT (and as a result with leaving the IT out) is that we would basically want to test the the sequence of calling the transport client for setting up the new index, and then only loading the index+pipeline settings (and finally running the pipeline) once that index is up.

That's a lot of new test code that our current org.elasticsearch.action.bulk.TransportBulkActionIngestTests doesn't provide (it just hard uses a hard coded subclass of TransportBulkAction that has the auto create check overriden to false and uses null for the transport client).

... but I guess no time like the present to improve our test infra here :) On it.

@original-brownbear
Copy link
Member Author

@rjernst how about this replacement for the IT:

37d28ca

Implemented the index creation check in the same style used by org.elasticsearch.action.bulk.TransportBulkActionTests.TestTransportBulkAction#createIndex. Def. a little dirty to override the class we test for testing, but how about it? :)

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @original-brownbear. This will work I think for now.

@@ -112,25 +120,42 @@
/** A subclass of the real bulk action to allow skipping real bulk indexing, and marking when it would have happened. */
class TestTransportBulkAction extends TransportBulkAction {
boolean isExecuted = false; // set when the "real" bulk execution happens

boolean needToCheck;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a comment that this makes the return of needToCheck() pluggable

@original-brownbear
Copy link
Member Author

@rjernst thanks, comment added, will merge once green :)

@original-brownbear original-brownbear merged commit 124c1f1 into elastic:master Aug 14, 2018
@original-brownbear original-brownbear deleted the 32758 branch August 14, 2018 09:27
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Aug 14, 2018
…e-types

* elastic/master: (199 commits)
  Watcher: Remove unused hipchat render method (elastic#32211)
  Watcher: Remove extraneous auth classes (elastic#32300)
  Watcher: migrate PagerDuty v1 events API to v2 API (elastic#32285)
  [TEST] Select free port for Minio (elastic#32837)
  MINOR: Remove `IndexTemplateFilter` (elastic#32841)
  Core: Add java time version of rounding classes (elastic#32641)
  Aggregations/HL Rest client fix: missing scores (elastic#32774)
  HLRC: Add Delete License API (elastic#32586)
  INGEST: Create Index Before Pipeline Execute (elastic#32786)
  Fix NOOP bulk updates (elastic#32819)
  Remove client connections from TcpTransport (elastic#31886)
  Increase logging testRetentionPolicyChangeDuringRecovery
  AwaitsFix case-functions.sql-spec
  Mute security-cli tests in FIPS JVM (elastic#32812)
  SCRIPTING: Support BucketAggScript return null (elastic#32811)
  Unmute WildFly tests in FIPS JVM (elastic#32814)
  [TEST] Force a stop to save rollup state before continuing (elastic#32787)
  [test] disable packaging tests for suse boxes
  Mute IndicesRequestIT#testBulk
  [ML][DOCS] Refer to rules feature as custom rules (elastic#32785)
  ...
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Aug 14, 2018
…listeners

* elastic/master:
  Watcher: Remove unused hipchat render method (elastic#32211)
  Watcher: Remove extraneous auth classes (elastic#32300)
  Watcher: migrate PagerDuty v1 events API to v2 API (elastic#32285)
  [TEST] Select free port for Minio (elastic#32837)
  MINOR: Remove `IndexTemplateFilter` (elastic#32841)
  Core: Add java time version of rounding classes (elastic#32641)
  Aggregations/HL Rest client fix: missing scores (elastic#32774)
  HLRC: Add Delete License API (elastic#32586)
  INGEST: Create Index Before Pipeline Execute (elastic#32786)
  Fix NOOP bulk updates (elastic#32819)
  Remove client connections from TcpTransport (elastic#31886)
  Increase logging testRetentionPolicyChangeDuringRecovery
  AwaitsFix case-functions.sql-spec
  Mute security-cli tests in FIPS JVM (elastic#32812)
  SCRIPTING: Support BucketAggScript return null (elastic#32811)
  Unmute WildFly tests in FIPS JVM (elastic#32814)
  [TEST] Force a stop to save rollup state before continuing (elastic#32787)
  [test] disable packaging tests for suse boxes
  Mute IndicesRequestIT#testBulk
  [ML][DOCS] Refer to rules feature as custom rules (elastic#32785)
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Aug 19, 2018
* INGEST: Create Index Before Pipeline Execute

* Ensures that indices are created before the default pipeline setting is read to correcly handle the case of an index template containing a default pipeline (without the fix the first document does not get the pipeline applied as explained in elastic#32758)
* closes elastic#32758
original-brownbear added a commit that referenced this pull request Aug 19, 2018
* INGEST: Create Index Before Pipeline Execute

* Ensures that indices are created before the default pipeline setting is read to correcly handle the case of an index template containing a default pipeline (without the fix the first document does not get the pipeline applied as explained in #32758)
* closes #32758
jasontedor pushed a commit that referenced this pull request Aug 21, 2018
* INGEST: Create Index Before Pipeline Execute

* Ensures that indices are created before the default pipeline setting is read to correcly handle the case of an index template containing a default pipeline (without the fix the first document does not get the pipeline applied as explained in #32758)
* closes #32758
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Aug 24, 2018
* commit '9088d811ce9cff922e6ef1befbeb0f1e0c27016a': (23 commits)
  Generalize remote license checker (elastic#32971)
  Trim translog when safe commit advanced (elastic#32967)
  Fix an inaccuracy in the dynamic templates documentation. (elastic#32890)
  HLREST: AwaitsFix ML Test
  Make Geo Context Mapping Parsing More Strict (elastic#32862)
  Add mzn and dz to unsupported locales (elastic#32957)
  Use settings from the context in BootstrapChecks (elastic#32908)
  Update docs for node specifications (elastic#30468)
  HLRC: Forbid all Elasticsearch logging infra (elastic#32784)
  Fix use of deprecated apis
  Only configure publishing if it's applied externally (elastic#32351)
  Protect ScriptedMetricIT test cases against failures on 0-doc shards (elastic#32959) (elastic#32968)
  Scripted metric aggregations: add deprecation warning and system (elastic#32944)
  Watcher: Properly find next valid date in cron expressions (elastic#32734)
  Fix some small issues in the getting started docs (elastic#30346)
  Set forbidden APIs target compatibility to compiler java version   (elastic#32935)
  [TEST] Add "ne" as an unsupported SimpleKdc locale (elastic#32700)
  MINOR: Remove `IndexTemplateFilter` (elastic#32841) (elastic#32974)
  INGEST: Create Index Before Pipeline Execute (elastic#32786) (elastic#32975)
  NETWORKING: Make RemoteClusterConn. Lazy Resolve DNS (elastic#32764) (elastic#32976)
  ...
@jimczi jimczi removed the v7.0.0 label Feb 7, 2019
jakelandis added a commit that referenced this pull request Mar 6, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates #32786
Relates #32758 
Closes #36545
jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Mar 7, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (elastic#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates elastic#32786
Relates elastic#32758 
Closes elastic#36545
jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Mar 7, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (elastic#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates elastic#32786
Relates elastic#32758 
Closes elastic#36545
jakelandis added a commit to jakelandis/elasticsearch that referenced this pull request Mar 7, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (elastic#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates elastic#32786
Relates elastic#32758 
Closes elastic#36545
jakelandis added a commit that referenced this pull request Mar 7, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates #32786
Relates #32758 
Closes #36545
jakelandis added a commit that referenced this pull request Mar 7, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates #32786
Relates #32758 
Closes #36545
jakelandis added a commit that referenced this pull request Mar 8, 2019
Prior to this commit (and after 6.5.0), if an ingest node changes
the _index in a pipeline, the original target index would be created.
For daily indexes this could create an extra, empty index per day.

This commit changes the TransportBulkAction to execute the ingest node
pipeline before attempting to create the index. This ensures that the 
only index created is the original or one set by the ingest node pipeline. 
This was the execution order prior to 6.5.0 (#32786). 

The execution order was changed in 6.5 to better support default pipelines. 
Specifically the execution order was changed to be able to read the settings
from the index meta data. This commit also includes a change in logic such 
that if the target index does not exist when ingest node pipeline runs, it 
will now pull the default pipeline (if one exists) from the settings of the 
best matched of the index template. 

Relates #32786
Relates #32758 
Closes #36545
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v6.5.0 v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ingest] Default ingest node pipeline via index template fails to apply for first document.
4 participants