Wait for kubernetes components on soft start #8199

Merged
merged 4 commits into from
May 19, 2020

Conversation

@priyawadhwa priyawadhwa commented May 18, 2020

I noticed that TestFunctional/parallel/ComponentHealth was failing with this error:

Error apiserver status: https://172.17.0.3:8441/healthz returned error 500:
[-]etcd failed: reason withheld

but by the time the post-mortem logs were printed, the etcd container was up and running.

I think this test occasionally fails because apiserver healthz is not yet returning a 200 status when we run the test. We wait for healthz to return 200 on regular start, but not on soft start, which we run in TestFunctional.

With this PR, the --wait flag is respected on soft start, so we can be sure the apiserver is healthy before running this test.

closes #8014

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 18, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: priyawadhwa

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 18, 2020
@priyawadhwa priyawadhwa changed the title Component health wip: TestFunctional/parallel/ComponentHealth May 18, 2020
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 18, 2020
@priyawadhwa
Author

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label May 18, 2020
I noticed that TestFunctional/parallel/ComponentHealth was failing with this error:

```
Error apiserver status: https://172.17.0.3:8441/healthz returned error 500:
[-]etcd failed: reason withheld
```

but by the time post mortem logs were printed the etcd container was up and running.

I think this test occasionally fails because apiserver healthz is not yet returning a 200 status when we run the test. We wait for healthz to return 200 on regular start, but not on soft start, which we run in `TestFunctional`.

This PR adds a retry, which should give the apiserver time to become healthy.
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 19, 2020
@priyawadhwa priyawadhwa changed the title wip: TestFunctional/parallel/ComponentHealth Add retry to TestFunctional/parallel/ComponentHealth to fix flake May 19, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 19, 2020
@codecov-commenter

Codecov Report

Merging #8199 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #8199   +/-   ##
=======================================
  Coverage   34.59%   34.59%           
=======================================
  Files         147      147           
  Lines        9381     9381           
=======================================
  Hits         3245     3245           
  Misses       5739     5739           
  Partials      397      397           

@minikube-pr-bot

kvm2 Driver
Times for minikube: [63.52065798100001 65.256660667 62.629218319]
Average time for minikube: 63.802178989000005

Times for Minikube (PR 8199): [65.670369753 59.47718224399999 64.740622071]
Average time for Minikube (PR 8199): 63.29605802266667

Averages Time Per Log

+--------------------------------+-----------+--------------------+
|              LOG               | MINIKUBE  | MINIKUBE (PR 8199) |
+--------------------------------+-----------+--------------------+
| * minikube v1.10.1 on Debian   |  0.062285 |           0.057019 |
|                           9.11 |           |                    |
| * Using the kvm2 driver based  |  0.024278 |           0.021200 |
| on existing profile            |           |                    |
| * Starting control plane node  |  0.003116 |           0.003248 |
| minikube in cluster minikube   |           |                    |
| * Creating kvm2 VM (CPUs=2,    | 40.796452 |          38.667226 |
| Memory=3700MB, Disk=20000MB)   |           |                    |
| ...                            |           |                    |
| * Preparing Kubernetes v1.18.2 | 21.047492 |          22.683417 |
| on Docker 19.03.8 ...          |           |                    |
| * Verifying Kubernetes         |  1.321206 |           1.450023 |
| components...                  |           |                    |
| * Enabled addons:              |  0.478262 |           0.334046 |
| default-storageclass,          |           |                    |
| storage-provisioner            |           |                    |
| * Done! kubectl is now         |  0.066185 |           0.073361 |
| configured to use "minikube"   |           |                    |
|                                |  0.002903 |           0.006520 |
+--------------------------------+-----------+--------------------+

docker Driver
Times for minikube: [25.417960670000003 26.036046586999998 27.142706086]
Average time for minikube: 26.198904447666667

Times for Minikube (PR 8199): [26.041959337999998 25.815322612 25.46038041]
Average time for Minikube (PR 8199): 25.77255412

Averages Time Per Log

+----------------------------------------+-----------+--------------------+
|                  LOG                   | MINIKUBE  | MINIKUBE (PR 8199) |
+----------------------------------------+-----------+--------------------+
| * minikube v1.10.1 on Debian           |  0.074719 |           0.069940 |
|                                   9.11 |           |                    |
| * Using the docker driver              |  0.002695 |           0.002432 |
| based on existing profile              |           |                    |
| * Starting control plane node          |  0.060281 |           0.056758 |
| minikube in cluster minikube           |           |                    |
| * Creating docker container            |  7.616737 |           7.298782 |
| (CPUs=2, Memory=3700MB) ...            |           |                    |
| * Preparing Kubernetes v1.18.2         |  0.120832 |           0.118743 |
| on Docker 19.03.2 ...                  |           |                    |
|   -                                    | 17.268933 |          17.066447 |
| kubeadm.pod-network-cidr=10.244.0.0/16 |           |                    |
| * Verifying Kubernetes                 |  0.843374 |           1.091650 |
| components...                          |           |                    |
| * Enabled addons:                      |  0.142325 |           0.002539 |
| default-storageclass,                  |           |                    |
| storage-provisioner                    |           |                    |
| * Done! kubectl is now                 |  0.065862 |           0.061913 |
| configured to use "minikube"           |           |                    |
|                                        |  0.003146 |           0.003350 |
+----------------------------------------+-----------+--------------------+

Member

@medyagh medyagh left a comment

I think it is better to solve this in the code rather than in the integration test.
--wait=all should be making sure the apiserver is up!

If it does not, we need a new wait component that waits for everything.

@priyawadhwa
Author

@medyagh --wait=all does wait for the apiserver, but not on soft starts (if the config hasn't changed, we just skip that part of the code). I'll change that.

@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 19, 2020
@priyawadhwa priyawadhwa requested a review from medyagh May 19, 2020 17:33
@priyawadhwa priyawadhwa changed the title Add retry to TestFunctional/parallel/ComponentHealth to fix flake Respect --wait flag on soft start May 19, 2020
@minikube-pr-bot

kvm2 Driver
Times for minikube: [67.59165961800001 68.984006505 68.317118474]
Average time for minikube: 68.29759486566667

Times for Minikube (PR 8199): [68.338790635 69.342338672 65.71761839100002]
Average time for Minikube (PR 8199): 67.799582566

Averages Time Per Log

+--------------------------------+-----------+--------------------+
|              LOG               | MINIKUBE  | MINIKUBE (PR 8199) |
+--------------------------------+-----------+--------------------+
| * minikube v1.10.1 on Debian   |  0.070862 |           0.071307 |
|                           9.11 |           |                    |
| * Using the kvm2 driver based  |  0.025410 |           0.024458 |
| on existing profile            |           |                    |
| * Starting control plane node  |  0.003972 |           0.004008 |
| minikube in cluster minikube   |           |                    |
| * Creating kvm2 VM (CPUs=2,    | 42.097933 |          42.043541 |
| Memory=3700MB, Disk=20000MB)   |           |                    |
| ...                            |           |                    |
| * Preparing Kubernetes v1.18.2 | 23.526247 |          23.273330 |
| on Docker 19.03.8 ...          |           |                    |
| * Verifying Kubernetes         |  1.619988 |           1.505722 |
| components...                  |           |                    |
| * Enabled addons:              |  0.858720 |           0.793911 |
| default-storageclass,          |           |                    |
| storage-provisioner            |           |                    |
| * Done! kubectl is now         |  0.090570 |           0.077100 |
| configured to use "minikube"   |           |                    |
|                                |  0.003893 |           0.006206 |
+--------------------------------+-----------+--------------------+

docker Driver
Times for minikube: [26.954950917999998 28.302639237 30.058807597]
Average time for minikube: 28.438799250666666

Times for Minikube (PR 8199): [28.934048177 28.356889229000004 26.991050653000002]
Average time for Minikube (PR 8199): 28.093996019666665

Averages Time Per Log

+----------------------------------------+-----------+--------------------+
|                  LOG                   | MINIKUBE  | MINIKUBE (PR 8199) |
+----------------------------------------+-----------+--------------------+
| * minikube v1.10.1 on Debian           |  0.081372 |           0.086983 |
|                                   9.11 |           |                    |
| * Using the docker driver              |  0.002899 |           0.003389 |
| based on existing profile              |           |                    |
| * Starting control plane node          |  0.066570 |           0.068263 |
| minikube in cluster minikube           |           |                    |
| * Creating docker container            |  8.201660 |           8.358836 |
| (CPUs=2, Memory=3700MB) ...            |           |                    |
| * Preparing Kubernetes v1.18.2         |  0.160551 |           0.133328 |
| on Docker 19.03.2 ...                  |           |                    |
|   -                                    | 18.832209 |          18.204527 |
| kubeadm.pod-network-cidr=10.244.0.0/16 |           |                    |
| * Verifying Kubernetes                 |  1.011363 |           1.046577 |
| components...                          |           |                    |
| * Enabled addons:                      |  0.003437 |           0.115278 |
| default-storageclass,                  |           |                    |
| storage-provisioner                    |           |                    |
| * Done! kubectl is now                 |  0.073989 |           0.072642 |
| configured to use "minikube"           |           |                    |
|                                        |  0.004749 |           0.004173 |
+----------------------------------------+-----------+--------------------+

@priyawadhwa priyawadhwa changed the title Respect --wait flag on soft start Wait for kubernetes components on soft start May 19, 2020
@priyawadhwa priyawadhwa merged commit e721883 into kubernetes:master May 19, 2020
@priyawadhwa priyawadhwa deleted the component-health branch May 19, 2020 21:37
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

TestFunctional/parallel/ComponentHealth flake on docker
5 participants