[CI] Security Index upgrade failure during FullClusterRestart #82837

Closed
ywangd opened this issue Jan 20, 2022 · 5 comments · Fixed by #84843
Labels: :Core/Infra/Core (Core issues without another label), Team:Core/Infra (Meta label for core/infra team), >test-failure (Triaged test failures from CI)

Comments

@ywangd
Member

ywangd commented Jan 20, 2022

The failing test is FullClusterRestartIT#testApiKeySuperuser.
This happens on my PR CI and is related to my change.

But the underlying cause seems to be a failure to upgrade the security system index. The cluster log shows an NPE (https://gradle-enterprise.elastic.co/s/wgya6tyniu7y2/console-log#L2024) from invoking AliasMetadata#isHidden and comparing its result.
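
For readers unfamiliar with this class of failure: one common way an NPE can arise from "invocation and comparison" of a getter in Java is when the getter returns a boxed Boolean that may be null and the caller compares it with a primitive boolean. The sketch below illustrates only that general mechanism. The Alias class and the helper methods are hypothetical stand-ins, not Elasticsearch code, and treating a null flag as "not hidden" is an assumption made for the example.

```java
import java.util.Objects;

public class NullableHiddenFlag {

    /** Hypothetical stand-in for alias metadata whose hidden flag may be unset (null). */
    static final class Alias {
        private final String name;
        private final Boolean hidden; // null means "never set", e.g. on an older index

        Alias(String name, Boolean hidden) {
            this.name = name;
            this.hidden = hidden;
        }

        Boolean isHidden() {
            return hidden;
        }
    }

    /** Unsafe: '==' against a primitive unboxes the Boolean and throws an NPE when it is null. */
    static boolean unsafeMatches(Alias alias, boolean expected) {
        return alias.isHidden() == expected;
    }

    /** Null-safe: a null flag is treated as false (an assumption of this sketch). */
    static boolean safeMatches(Alias alias, boolean expected) {
        return Boolean.TRUE.equals(alias.isHidden()) == expected;
    }

    /** Null-safe comparison of two flags without unboxing either one. */
    static boolean sameHiddenState(Alias a, Alias b) {
        return Objects.equals(a.isHidden(), b.isHidden());
    }

    public static void main(String[] args) {
        Alias legacy = new Alias(".security", null);
        System.out.println("safe comparison: " + safeMatches(legacy, false));
        try {
            unsafeMatches(legacy, false);
        } catch (NullPointerException e) {
            System.out.println("unsafe comparison threw NullPointerException");
        }
    }
}
```

Whether this is the exact failure mode in the linked log is not confirmed here; the stack trace in the console log is the authority.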

I suspect it has something to do with #79512 and the recent enforcement of 7.last for 8.x upgrades.

Build scan:
https://gradle-enterprise.elastic.co/s/wgya6tyniu7y2/tests/:x-pack:qa:full-cluster-restart:v7.8.1%23upgradedClusterTest/org.elasticsearch.xpack.restart.FullClusterRestartIT/testApiKeySuperuser

Reproduction line:
./gradlew ':x-pack:qa:full-cluster-restart:v7.8.1#upgradedClusterTest' -Dtests.class="org.elasticsearch.xpack.restart.FullClusterRestartIT" -Dtests.method="testApiKeySuperuser" -Dtests.seed=704325F309264EE2 -Dtests.bwc=true -Dtests.locale=fi -Dtests.timezone=America/Boise -Druntime.java=17

Applicable branches:
master

Reproduces locally?:
Yes

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.xpack.restart.FullClusterRestartIT&tests.test=testApiKeySuperuser

Failure excerpt:

org.elasticsearch.client.WarningFailureException: method [GET], host [http://127.0.0.1:42205], URI [.security/_search], status line [HTTP/1.1 200 OK]
{"took":2,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":4,"relation":"eq"},"max_score":1.0,"hits":[{"_index":".security-7","_id":"user-api_key_super_creator","_score":1.0,"_source":{"username":"api_key_super_creator","password":"$2a$10$OsBKVjcqgvziPdsewcOcmufqeJNjplujZ8NYFW6sjL2VXdVfTgpMe","roles":["superuser","monitoring_user"],"full_name":null,"email":null,"metadata":null,"enabled":true,"type":"user"}},{"_index":".security-7","_id":"_mHGdH4B2CIXTYmcz0_9","_score":1.0,"_source":{"doc_type":"api_key","creation_time":1642636693479,"expiration_time":null,"api_key_invalidated":false,"api_key_hash":"{PBKDF2}10000$v4R2re60QvaorlE+Cwl0MRq2uIE0rABYIrmBSMAxs4U=$yDhHqMHGzXtVgb4CPnGAdtKRMxP+nzpP0L4zdT45pPw=","role_descriptors":{},"limited_by_role_descriptors":{"superuser":{"cluster":["all"],"indices":[{"names":["*"],"privileges":["all"],"allow_restricted_indices":true}],"applications":[{"application":"*","privileges":["*"],"resources":["*"]}],"run_as":["*"],"metadata":{"_reserved":true},"type":"role"},"monitoring_user":{"cluster":["cluster:monitor/main","cluster:monitor/xpack/info","cluster:monitor/remote/info"],"indices":[{"names":[".monitoring-*"],"privileges":["read","read_cross_cluster"],"allow_restricted_indices":false}],"applications":[{"application":"kibana-*","privileges":["reserved_monitoring"],"resources":["*"]}],"run_as":[],"metadata":{"_reserved":true},"type":"role"}},"name":"super_legacy_key","version":7080199,"creator":{"principal":"api_key_super_creator","metadata":{},"realm":"default_native","realm_type":"native"}}},{"_index":".security-7","_id":"9aTGdH4B8Erq23pH0QLn","_score":1.0,"_source":{
  "doc_type": "foo"
}},{"_index":".security-7","_id":"AGHGdH4B2CIXTYmc1FBS","_score":1.0,"_source":{"doc_type":"api_key","creation_time":1642636694588,"expiration_time":null,"api_key_invalidated":false,"api_key_hash":"{PBKDF2}10000$Uj+vKfhGVDJ+0dM+YBDyJIsC2hSFGXCqojoMQ6Ens9E=$/8X0Dup+8cwgm1L5d5zNHtUPoDG1fV1WCPhcPu1p4yg=","role_descriptors":{"r":{"cluster":["all"],"indices":[{"names":["*"],"privileges":["all"],"allow_restricted_indices":false}],"applications":[],"run_as":[],"metadata":{},"type":"role"}},"limited_by_role_descriptors":{"_es_test_root":{"cluster":["ALL"],"indices":[{"names":["*"],"privileges":["ALL"],"allow_restricted_indices":true}],"applications":[{"application":"*","privileges":["*"],"resources":["*"]}],"run_as":["*"],"metadata":{},"type":"role"}},"name":"key-1","version":7080199,"creator":{"principal":"test_user","metadata":{},"realm":"default_file","realm_type":"file"}}}]}}

  at __randomizedtesting.SeedInfo.seed([704325F309264EE2:478E48673AE6CF31]:0)
  at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:342)
  at org.elasticsearch.client.RestClient.performRequest(RestClient.java:312)
  at org.elasticsearch.client.RestClient.performRequest(RestClient.java:287)
  at org.elasticsearch.xpack.restart.FullClusterRestartIT.testApiKeySuperuser(FullClusterRestartIT.java:421)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:568)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:44)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:824)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:475)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)
  at java.lang.Thread.run(Thread.java:833)

@ywangd ywangd added :Core/Infra/Core Core issues without another label >test-failure Triaged test failures from CI labels Jan 20, 2022
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Jan 20, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

ywangd added a commit to ywangd/elasticsearch that referenced this issue Jan 20, 2022
@javanna
Member

javanna commented Jan 20, 2022

I've been seeing the same behaviour. What struck me is that a WarningFailureException is being thrown, but no warnings are printed with the exception message. Any reason why this test was not muted?

@ywangd
Member Author

ywangd commented Jan 20, 2022

I muted it in 6b6f06e

@grcevski
Contributor

grcevski commented Mar 9, 2022

I believe William just fixed this here: #84780.

I'm going to revert the workaround, reproduce the failure, and confirm that the fix resolves it.

@grcevski
Contributor

grcevski commented Mar 9, 2022

I have confirmed that the fix in #84780 resolves the problem with the unexpected warning. I'll wait for the fix to be merged and submit a PR to revert the workaround for the test.
