Readiness check #50187
Pinging @elastic/es-core-infra (:Core/Infra/Core)
My preference would be a separate API. I think it's a feature that …
I think we need to have connected to/formed a cluster, and async initialization, like the security/watcher services, also needs to have completed once they get cluster state. Additionally, IMO there is no need for an API. We can connect to and form a cluster with only the transport port bound, and then only bind HTTP once this is complete. This would allow users/tests to wait on the HTTP port being bound. I started experimenting with this last year to improve how integ tests wait for ES to be ready, and would be happy to pick this back up. Last I remember, there were some issues in security, but I think those may now be solved with the transport client being removed from master.
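A minimal sketch of what that wait could look like from a test harness, assuming the node's HTTP address is localhost:9200 (host, port, and timeout are illustrative, not part of any existing API):

```python
import socket
import time

def wait_for_http_port(host="localhost", port=9200, timeout=120.0):
    """Poll until the node's HTTP port accepts TCP connections.

    If Elasticsearch only binds HTTP after it has joined a cluster,
    a successful connection doubles as a readiness signal.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(1.0)  # port not bound yet; retry
    raise TimeoutError(f"{host}:{port} was not bound within {timeout}s")
```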
I wonder how the gateway settings such as …
The main issue I see with this is that you can't get any diagnostic output from the cluster through the APIs as to why the cluster is not forming or why some of the components did not initialize.
In the context of Kubernetes, we need to distinguish a liveness check from a readiness check. A liveness check is used to determine when to restart a container because it's sick. A readiness check is used to determine when a container is ready to receive requests. The distinction is important because we don't necessarily want to restart a container just because it's temporarily partitioned from the cluster. A failing liveness check is used to restart the container; a failing readiness check is used to stop routing traffic. In this context, I don't think we need to think about …; rather, we have dedicated checks for each use case. In the context of a Docker …
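A rough sketch of that separation, using _cluster/health as a stand-in readiness signal (the endpoints and success criteria here are illustrative assumptions, not an agreed-upon contract):

```python
import socket
import urllib.error
import urllib.request

def liveness(host="localhost", port=9200):
    # Liveness: is the process alive and listening at all?
    # A failure here would trigger a container restart.
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False

def readiness(host="localhost", port=9200):
    # Readiness: has the node joined a functioning cluster?
    # A failure here should only stop routing traffic, not restart the node.
    # _cluster/health is used as a stand-in; the exact endpoint and semantics
    # are an assumption and part of what this issue needs to decide.
    try:
        with urllib.request.urlopen(
            f"http://{host}:{port}/_cluster/health", timeout=2.0
        ) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

The design choice this reflects is that a temporary partition from the cluster should fail readiness (stop traffic) without failing liveness (no restart).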
A related ECK user question: https://discuss.elastic.co/t/does-elastic-have-a-healthcheck-endpoint-that-does-not-require-username-and-password/217090
The default AWS EKS LoadBalancer implementation has its own healthcheck logic. It can be configured with an HTTP port and path, but not with authentication details. I would have expected it to be bound to the Pod readiness instead, as this is usually what happens with Kubernetes Services. The same holds true for the default GCP Ingress. We may want to simplify how a user can set up unauthenticated access to the healthcheck endpoint?
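One hypothetical workaround, assuming a 401 from a secured node still proves the HTTP layer is up, would be for the external health check to accept an unauthenticated 401 as healthy. This is a sketch of that idea, not existing Elasticsearch behaviour:

```python
import urllib.error
import urllib.request

def unauthenticated_probe(url="http://localhost:9200/"):
    """Health check for load balancers that cannot send credentials.

    Treats 200 (security disabled) and 401 (security enabled, node up)
    as healthy; connection errors and other responses as unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=2.0) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as e:
        return e.code == 401  # node is up, it just rejected the anonymous request
    except (urllib.error.URLError, OSError):
        return False
```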
It is a common need to check whether a node has started up. Common approaches today include calling existing APIs, like
GET /
and checking the HTTP response code. However, @jasontedor noted that the fact that none of our APIs is designed with this goal in mind means that we might break this use-case inadvertently, which wouldn't be the case if we had a dedicated API.

We'd need to first understand what exact semantics are needed for a readiness check. For instance, do we only need to check whether the node has started up, or do we also need to know whether it has formed a cluster that has an elected master?

Then, should we make it its own dedicated API, or should we recommend using an existing API for this, like
GET /
? If the latter, then we will need to document it.
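For reference, the "call an existing API and check the response code" approach described above looks roughly like this sketch (host, credentials, and the choice of GET / are illustrative assumptions):

```python
import base64
import urllib.error
import urllib.request

def node_is_up(host="localhost", port=9200, user=None, password=None):
    # The common approach today: hit an existing API such as GET /
    # and treat a 2xx response code as "the node has started".
    req = urllib.request.Request(f"http://{host}:{port}/")
    if user is not None:
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        req.add_header("Authorization", f"Basic {token}")
    try:
        with urllib.request.urlopen(req, timeout=2.0) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False
```

Note that this only tells us the node answered an HTTP request; it says nothing about whether the node has joined a cluster with an elected master, which is exactly the semantic question raised above.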