UpdateByQueryBasicTests#testMultipleSources fails in CI #27820
Hmm, I wasn't able to reproduce this at the method, class, or project level. I also don't see anything in the build or test cluster logs that indicates a cause. @danielmitterdorfer, do you have any advice for further ruling out a problem with the test?
We do see the following in the log:
So it seems that for some reason these nodes were hanging, but this is all I can see in the log. Maybe it makes sense to increase the log level to get more info in case this test fails again?
Gotcha, I'll increase the test log level since it can't hurt. Since this is the only time this test has failed, I'm inclined to think it may be caused by something else.
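For reference, per-test log verbosity in the Elasticsearch test framework can be raised with the `@TestLogging` annotation. The snippet below is a minimal sketch only: the base class, logger packages, and levels shown are assumptions for illustration, not necessarily what the actual logging commit used.

```java
// Sketch only: raising per-test log verbosity via the test framework's
// @TestLogging annotation. Base class, logger packages, and levels are
// illustrative assumptions, not the exact change that was committed.
import org.elasticsearch.test.ESTestCase;
import org.elasticsearch.test.junit.annotations.TestLogging;

public class UpdateByQueryLoggingSketch extends ESTestCase {

    // Bumps these loggers to DEBUG/TRACE only while this test method runs, so
    // a future CI failure leaves more context in the test cluster logs.
    @TestLogging("org.elasticsearch.index.reindex:DEBUG,org.elasticsearch.action.bulk:TRACE")
    public void testMultipleSources() throws Exception {
        // existing test body unchanged
    }
}
```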
* es/master: (45 commits)
  - Adapt scroll rest test after backport. relates #27842
  - Move early termination based on index sort to TopDocs collector (#27666)
  - Upgrade beats templates that we use for bwc testing. (#27929)
  - ingest: upgraded ingest geoip's geoip2's dependencies.
  - [TEST] logging for update by query test #27820
  - Add elasticsearch-nio jar for base nio classes (#27801)
  - Use full profile on JDK 10 builds
  - Require Gradle 4.3
  - Enable grok processor to support long, double and boolean (#27896)
  - Add unreleased v6.1.2 version
  - TEST: reduce blob size #testExecuteMultipartUpload
  - Check index under the store metadata lock (#27768)
  - Fixes DocStats to not report index size < -1 (#27863)
  - Fixed test to be up to date with the new database files.
  - Upgrade to Lucene 7.2.0. (#27910)
  - Disable TestZenDiscovery in cloud providers integrations test
  - Use `_refresh` to shrink the version map on inactivity (#27918)
  - Make KeyedLock reentrant (#27920)
  - ingest: Upgraded the geolite2 databases.
  - [Test] Fix IndicesClientDocumentationIT (#27899)
  - ...
* es/6.x: (43 commits)
  - ingest: upgraded ingest geoip's geoip2's dependencies.
  - [TEST] logging for update by query test #27820
  - Use full profile on JDK 10 builds
  - Require Gradle 4.3
  - Add unreleased v6.1.2 version
  - TEST: reduce blob size #testExecuteMultipartUpload
  - Check index under the store metadata lock (#27768)
  - Upgrade to Lucene 7.2.0. (#27910)
  - Fixed test to be up to date with the new database files.
  - Use `_refresh` to shrink the version map on inactivity (#27918)
  - Make KeyedLock reentrant (#27920)
  - Fixes DocStats to not report index size < -1 (#27863)
  - Disable TestZenDiscovery in cloud providers integrations test
  - ingest: Upgraded the geolite2 databases.
  - [Issue-27716]: CONTRIBUTING.md IntelliJ configurations settings are confusing. (#27717)
  - [Test] Fix IndicesClientDocumentationIT (#27899)
  - Move uid lock into LiveVersionMap (#27905)
  - Mute testRetentionPolicyChangeDuringRecovery
  - Increase Gradle heap space to 1536m
  - Move GlobalCheckpointTracker and remove SequenceNumbersService (#27837)
  - ...
This looks like the node hanging for an unreasonable amount of time. The failure isn't particularly reindex-based either, and we haven't seen anything like this since. I'm going to close this and we'll reopen if we see something similar.
A recent failure popped up that may be related. The test timed out while attempting to process cluster events for 'put mapping'. I don't see general slowness or node hangs, but the logs are difficult to understand because lines from different outputs are interleaved. Reproduction line:
Link to build logs: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=debian-9/320/console Relevant excerpt:
test failure link: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+intake/850/console (full build log and test-cluster logs)
failure:
So the test fails to delete the index that it uses. Above the failure we see the following in the log:
So I think the test may just be the victim and the root cause is in the test infrastructure; see the sketch after this comment. If you agree, please feel free to reassign.
reproduction line:
reproduces locally: no
failure frequency: this test failed once in CI within the last year
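For context, the cleanup step that appears to be the victim here boils down to a delete-index request through the admin client, which has to wait for a cluster-state update on the master. The sketch below is a hypothetical illustration of that operation (class and method names are made up), not the test framework's actual cleanup code.

```java
// Hypothetical illustration of the cleanup operation that timed out: deleting
// the test's indices through the admin client. Not the framework's real code.
import org.elasticsearch.action.support.master.AcknowledgedResponse;
import org.elasticsearch.client.Client;

public final class TestIndexCleanupSketch {
    private TestIndexCleanupSketch() {}

    // Deletes every index in the test cluster. If the master is busy or a node
    // hangs, this request can time out waiting for the cluster-state update,
    // which is what the failure above looks like.
    public static void wipeAllIndices(Client client) {
        AcknowledgedResponse response =
            client.admin().indices().prepareDelete("_all").get();
        if (response.isAcknowledged() == false) {
            throw new AssertionError("delete of test indices was not acknowledged");
        }
    }
}
```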