-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flaky TestSmoke failure #1048
Labels
>bug
Something isn't working
Comments
Looks like Kibana is not ready because ES is not ready. ES logs: {"type": "server", "timestamp": "2019-06-11T10:01:43,288+0000", "level": "WARN",
"component": "o.e.c.c.ClusterFormationFailureHelper", "cluster.name": "elasticsearch-sample",
"node.name": "elasticsearch-sample-es-5bfsk74wcn",
"message": "master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and [cluster.initial_master_nodes] is empty on this node: have discovered
[{elasticsearch-sample-es-2d974hzqt9}{TwjUUrubQO-geZA5WKvVXg}{ivV_oUkiQOqSbLS_b0twuQ}{10.20.5.17}{10.20.5.17:9300}{ml.machine_memory=27394752512, ml.max_open_jobs=20, xpack.installed=true}, {elasticsearch-sample-es-9r4mgjd9cw}{ELN1D1uYSmGK1U9fG78bCA}{pHURpUzMSYKmCISaTrTaJw}{10.20.4.16}{10.20.4.16:9300}{ml.machine_memory=27394752512, ml.max_open_jobs=20, xpack.installed=true}]; discovery will continue using [127.0.0.1:9300, 127.0.0.1:9301, 127.0.0.1:9302, 127.0.0.1:9303, 127.0.0.1:9304, 10.20.5.17:9300, 10.20.4.16:9300] from hosts providers and [{elasticsearch-sample-es-5bfsk74wcn}{HDUtj2UxRzeODVWSX2LKJQ}{Lf7dd-5mQLCuKIrNzo2MZA}{10.20.3.18}{10.20.3.18:9300}{ml.machine_memory=27394752512, xpack.installed=true, ml.max_open_jobs=20}] from last-known cluster state; node term 0, last-accepted version 0 in term 0" }
{"type": "server", "timestamp": "2019-06-11T10:01:48,887+0000", "level": "DEBUG",
"component": "o.e.a.a.i.g.TransportGetIndexAction", "cluster.name": "elasticsearch-sample",
"node.name": "elasticsearch-sample-es-5bfsk74wcn", "message": "no known master node, scheduling a retry" }
{"type": "server", "timestamp": "2019-06-11T10:01:48,902+0000", "level": "DEBUG",
"component": "o.e.a.a.i.t.g.TransportGetIndexTemplatesAction", "cluster.name": "elasticsearch-sample",
"node.name": "elasticsearch-sample-es-5bfsk74wcn", "message": "timed out while retrying [indices:admin/template/get] after failure (timeout [30s])" }
{"type": "server", "timestamp": "2019-06-11T10:01:48,903+0000", "level": "WARN",
"component": "r.suppressed", "cluster.name": "elasticsearch-sample",
"node.name": "elasticsearch-sample-es-5bfsk74wcn", "message": "path: /_template/.kibana_task_manager, params: {name=.kibana_task_manager, filter_path=*.version}" ,
"stacktrace": ["org.elasticsearch.discovery.MasterNotDiscoveredException: null",
"at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:259) [elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:322) [elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:249) [elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.cluster.service.ClusterApplierService$NotifyTimeout.run(ClusterApplierService.java:555) [elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681) [elasticsearch-7.1.0.jar:7.1.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",
"at java.lang.Thread.run(Thread.java:835) [?:?]"] }
{"type": "server", "timestamp": "2019-06-11T10:01:51,909+0000", "level": "WARN",
"component": "r.suppressed", "cluster.name": "elasticsearch-sample",
"node.name": "elasticsearch-sample-es-5bfsk74wcn", "message": "path: /.reporting-*/_search, params: {index=.reporting-*}" ,
"stacktrace": ["org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];",
"at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:191) ~[elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedRaiseException(ClusterBlocks.java:177) ~[elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.action.search.TransportSearchAction.executeSearch(TransportSearchAction.java:466) ~[elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.action.search.TransportSearchAction.executeLocalSearch(TransportSearchAction.java:399) ~[elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.action.search.TransportSearchAction.lambda$doExecute$3(TransportSearchAction.java:211) ~[elasticsearch-7.1.0.jar:7.1.0]",
"at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-7.1.0.jar:7.1.0]",
... |
Stranger things:
> k get pods
NAME READY STATUS RESTARTS AGE
elasticsearch-sample-es-6xlrcwbbqs 1/1 Running 0 13m
elasticsearch-sample-es-dl95qqg4vw 1/1 Running 0 13m
elasticsearch-sample-es-nnhrwckh8j 1/1 Running 0 13m
> k get secret $(po 1)-config -o json | jq '.data["elasticsearch.yml"] | @base64d' -r | y2j | jq .cluster,.discovery
{
"initial_master_nodes": [
"elasticsearch-sample-es-6xlrcwbbqs",
"elasticsearch-sample-es-qsxp9qstv4",
"elasticsearch-sample-es-vfng5dmkv2"
],
"name": "elasticsearch-sample"
}
{
"zen": {
"hosts_provider": "file",
"minimum_master_nodes": "2"
}
}
> k get secret $(po 2)-config -o json | jq '.data["elasticsearch.yml"] | @base64d' -r | y2j | jq .cluster,.discovery
{
"name": "elasticsearch-sample"
}
{
"zen": {
"hosts_provider": "file",
"minimum_master_nodes": "2"
}
}
> k get secret $(po 3)-config -o json | jq '.data["elasticsearch.yml"] | @base64d' -r | y2j | jq .cluster,.discovery
{
"name": "elasticsearch-sample"
}
{
"zen": {
"hosts_provider": "file",
"minimum_master_nodes": "2"
} |
Adding loggers shows a problem where PVC may be stolen by an other pod: 2019-06-11T15:20:55.386+0200 INFO version7 ClusterInitialMasterNodesEnforcer
{"changes": ["elasticsearch-sample-es-j94srzpqnj", "elasticsearch-sample-es-m82zrl25h9", "elasticsearch-sample-es-s92pmqg896"], "states": []}
2019-06-11T15:20:55.747+0200 INFO driver Created pod {"name": "elasticsearch-sample-es-j94srzpqnj", "namespace": "e2e"}
2019-06-11T15:20:55.747+0200 INFO pvc Found some orphaned PVC {"name": "elasticsearch-sample-es-j94srzpqnj-elasticsearch-data"}
2019-06-11T15:20:55.814+0200 INFO elasticsearch-controller Recording event {EventType:Normal Reason:Created Message: Created pod elasticsearch-sample-es-j94srzpqnj}
2019-06-11T15:20:55.814+0200 ERROR kubebuilder.controller Reconciler error
{"controller": "elasticsearch-controller", "request": "e2e/elasticsearch-sample", "error": "pods \"elasticsearch-sample-es-j94srzpqnj\" already exists", "errorCauses": [{"error": "pods \"elasticsearch-sample-es-j94srzpqnj\" already exists"}]}
2019-06-11T15:20:55.814+0200 DEBUG kubebuilder.manager.events Normal
{"object": {"kind":"Elasticsearch","namespace":"e2e","name":"elasticsearch-sample","uid":"bd5345e5-8c4b-11e9-9d90-42010a840048","apiVersion":"elasticsearch.k8s.elastic.co/v1alpha1","resourceVersion":"1397385"}, "reason": "Created", "message": " Created pod elasticsearch-sample-es-j94srzpqnj"}
2019-06-11T15:20:57.437+0200 INFO version7 ClusterInitialMasterNodesEnforcer
{"changes": ["elasticsearch-sample-es-h22zvtgdsp", "elasticsearch-sample-es-lxk9b8n75v"], "states": ["elasticsearch-sample-es-j94srzpqnj"]}
2019-06-11T15:20:57.726+0200 INFO driver Created pod {"name": "elasticsearch-sample-es-h22zvtgdsp", "namespace": "e2e"}
2019-06-11T15:20:57.726+0200 INFO pvc Found some orphaned PVC {"name": "elasticsearch-sample-es-h22zvtgdsp-elasticsearch-data"}
2019-06-11T15:20:57.793+0200 INFO elasticsearch-controller Recording event {EventType:Normal Reason:Created Message: Created pod elasticsearch-sample-es-h22zvtgdsp}
2019-06-11T15:20:57.794+0200 ERROR kubebuilder.controller Reconciler error
{"controller": "elasticsearch-controller", "request": "e2e/elasticsearch-sample", "error": "pods \"elasticsearch-sample-es-h22zvtgdsp\" already exists", "errorCauses": [{"error": "pods \"elasticsearch-sample-es-h22zvtgdsp\" already exists"}]}
2019-06-11T15:20:57.794+0200 DEBUG kubebuilder.manager.events Normal
{"object": {"kind":"Elasticsearch","namespace":"e2e","name":"elasticsearch-sample","uid":"bd5345e5-8c4b-11e9-9d90-42010a840048","apiVersion":"elasticsearch.k8s.elastic.co/v1alpha1","resourceVersion":"1397387"}, "reason": "Created", "message": " Created pod elasticsearch-sample-es-h22zvtgdsp"}
2019-06-11T15:20:59.392+0200 INFO version7 ClusterInitialMasterNodesEnforcer
{"changes": ["elasticsearch-sample-es-vrbktpw5x2"], "states": ["elasticsearch-sample-es-h22zvtgdsp", "elasticsearch-sample-es-j94srzpqnj"]}
2019-06-11T15:20:59.704+0200 INFO driver Created pod {"name": "elasticsearch-sample-es-vrbktpw5x2", "namespace": "e2e"}
2019-06-11T15:20:59.705+0200 INFO elasticsearch-controller Recording event {EventType:Normal Reason:Created Message: Created pod elasticsearch-sample-es-vrbktpw5x2}
2019-06-11T15:20:59.705+0200 DEBUG kubebuilder.manager.events Normal
{"object": {"kind":"Elasticsearch","namespace":"e2e","name":"elasticsearch-sample","uid":"bd5345e5-8c4b-11e9-9d90-42010a840048","apiVersion":"elasticsearch.k8s.elastic.co/v1alpha1","resourceVersion":"1397387"}, "reason": "Created", "message": " Created pod elasticsearch-sample-es-vrbktpw5x2"}
2019-06-11T15:21:00.403+0200 INFO version7 ClusterInitialMasterNodesEnforcer
{"changes": [], "states": ["elasticsearch-sample-es-h22zvtgdsp", "elasticsearch-sample-es-j94srzpqnj", "elasticsearch-sample-es-vrbktpw5x2"]} |
Should be fixed by #1052. I'm testing. |
#1052 fixes this issue. |
Closed by #1052. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Sometimes
TestSmoke
fails.https://devops-ci.elastic.co/job/cloud-on-k8s-pr/148/console
https://devops-ci.elastic.co/job/cloud-on-k8s-pr/149/console
The text was updated successfully, but these errors were encountered: