CP /healthy should wait for all components to be ready and healthy #1001
Comments
This causes issues as well when adding tests that rely on restarting the CP: we have no way to ensure that we can create a Deployment and the webhook will work. Also, we should distinguish between "healthy" and "ready"; I think what we're talking about here is a readiness check, not a healthiness check.
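To make that distinction concrete, here is a minimal sketch, not Kuma's actual code: the handler names and the `isReady` hook are hypothetical. Liveness answers "is the process up?", readiness answers "can the CP do useful work yet (e.g. serve the webhook)?".

```go
package main

import "net/http"

// healthyHandler answers the liveness question: is the process up at all?
// It returns 200 as soon as the HTTP server itself is serving requests.
func healthyHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// readyHandler answers the readiness question. isReady is a hypothetical
// hook that would consult the state of the CP's components.
func readyHandler(isReady func() bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if isReady() {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	}
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthy", healthyHandler)
	mux.Handle("/ready", readyHandler(func() bool { return true }))
	http.ListenAndServe(":5680", mux)
}
```

With that split, Kubernetes liveness probes could keep pointing at `/healthy` while readiness probes and tests wait on `/ready`.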
Disable the tests for the moment because of kumahq#1001
Signed-off-by: Charly Molter <charly.molter@konghq.com>
Looking at this quickly, it feels like the right approach is to add an `OnReady` hook. The diagnostics server will then be able to change the status code of the ready endpoint accurately, and it also provides some useful info for troubleshooting. The ready endpoint will also need to go unhealthy as soon as the CP starts shutting down.
Would the OnReady only fire once? I would expect that each request to `/ready` would re-check whether the components are still ready.
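A sketch of those per-request semantics, assuming Go 1.19+ for `atomic.Bool` and a hypothetical flag name: instead of latching a one-shot `OnReady` signal, the handler re-evaluates state on every request, which also lets the endpoint return 503 again once shutdown begins.

```go
package main

import (
	"net/http"
	"sync/atomic"
)

// componentReady is a hypothetical flag: a component sets it to true
// once started and back to false as soon as shutdown begins.
var componentReady atomic.Bool

// readyHandler re-evaluates the flag on every request instead of
// remembering a one-shot OnReady event, so readiness can be withdrawn.
func readyHandler(w http.ResponseWriter, r *http.Request) {
	if componentReady.Load() {
		w.WriteHeader(http.StatusOK)
		return
	}
	w.WriteHeader(http.StatusServiceUnavailable)
}

func main() {
	componentReady.Store(true) // normally done by the component itself
	http.HandleFunc("/ready", readyHandler)
	http.ListenAndServe(":5680", nil)
}
```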
* test(kds): Add test for KDS when restarting CP
These tests might be unstable because of: kumahq#1001
Signed-off-by: Charly Molter <charly.molter@konghq.com>
This issue was inactive for 30 days. It will be reviewed in the next triage meeting and might be closed.
This issue was inactive for 90 days. It will be reviewed in the next triage meeting and might be closed.
- Abstract some common HTTP startup code.
- Add a ReadyComponent interface and implement it for api-server, dp-server and mads.
- Use this ReadyComponent to return CP readiness in the `/ready` probe.
Part of kumahq#1001
Signed-off-by: Charly Molter <charly.molter@konghq.com>
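A sketch of what such an interface and its aggregation could look like; only the `ReadyComponent` name comes from the commit above, while the `Ready()` signature and the `readyProbe` helper are assumptions for illustration.

```go
package diagnostics

import "net/http"

// ReadyComponent would be implemented by each long-running component
// (api-server, dp-server, mads, ...). The method name and signature
// here are assumptions, not Kuma's actual API.
type ReadyComponent interface {
	Ready() bool
}

// readyProbe aggregates all registered components: /ready returns 200
// only when every component reports ready, and 503 otherwise.
func readyProbe(components []ReadyComponent) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		for _, c := range components {
			if !c.Ready() {
				w.WriteHeader(http.StatusServiceUnavailable)
				return
			}
		}
		w.WriteHeader(http.StatusOK)
	}
}
```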
Summary
Currently, we expose the `:5680/healthy` endpoint for health checks. The problem is that we spin up many components (API, XDS, SDS, KDS, DNS, etc.) concurrently, yet this health check should only return ready once all components have started and are healthy. We see this problem in the test `app/kuma-cp/cmd/run_test.go:81`: when we spin up the CP and shut it down too soon, the DNS server complains that the server was not started.
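One way to gate the probe on startup completion, as a purely illustrative sketch: Kuma's real component manager is not shown here, and `allStarted` and the component count are made up.

```go
package main

import (
	"net/http"
	"sync"
	"time"
)

// startup counts components (API, XDS, SDS, KDS, DNS, ...) that have
// not finished starting yet.
var startup sync.WaitGroup

// allStarted closes the returned channel once every component has
// called startup.Done(), turning the WaitGroup into a selectable signal.
func allStarted() <-chan struct{} {
	done := make(chan struct{})
	go func() {
		startup.Wait()
		close(done)
	}()
	return done
}

func main() {
	const numComponents = 5
	startup.Add(numComponents)
	for i := 0; i < numComponents; i++ {
		go func() {
			time.Sleep(100 * time.Millisecond) // stand-in for real startup work
			startup.Done()
		}()
	}

	started := allStarted()
	http.HandleFunc("/healthy", func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-started:
			w.WriteHeader(http.StatusOK)
		default:
			// At least one component is still starting.
			w.WriteHeader(http.StatusServiceUnavailable)
		}
	})
	http.ListenAndServe(":5680", nil)
}
```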