Liveness/Readiness probes #189

Open
Mrkazik99 opened this issue Jul 19, 2024 · 1 comment

@Mrkazik99
Collaborator

Description:

Readiness probe: Checks all dependencies of each component (S3, the database, and others). It should be as lightweight as possible.

Liveness probe: This should be independent of third-party service outages, although some connection errors might be taken into account. For example, if the database is not accepting connections, this probe should not be affected; but if the pod cannot resolve a hostname, restarting the component might help.

TL;DR: a failing liveness probe restarts the application, whereas a failing readiness probe only removes the pod from the Service; with a correct PDB + HA setup, another replica takes over the traffic immediately.
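
The liveness rule described above could be sketched roughly as follows. This is only an illustration, not an agreed design: `livenessHandler`, `lookupFunc`, `resolveOK`, `resolveFail`, `probe` and the host name `db.internal` are all hypothetical names. The handler only checks that a hostname resolves and never dials the dependency, so a database that refuses connections does not fail the probe.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// lookupFunc resolves a hostname. net.DefaultResolver.LookupHost has this
// shape and could be passed in a real deployment (assumption: DNS resolution
// is the only restart-worthy check we want here).
type lookupFunc func(ctx context.Context, host string) ([]string, error)

// livenessHandler fails only for problems a restart might fix (the pod cannot
// resolve a hostname). It never connects to the dependency itself.
func livenessHandler(lookup lookupFunc, host string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), time.Second)
		defer cancel()
		if _, err := lookup(ctx, host); err != nil {
			http.Error(w, "cannot resolve "+host, http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
}

// resolveOK and resolveFail are stand-in resolvers for demonstration only.
func resolveOK(_ context.Context, _ string) ([]string, error) {
	return []string{"10.0.0.1"}, nil
}

func resolveFail(_ context.Context, host string) ([]string, error) {
	return nil, errors.New("no such host: " + host)
}

// probe calls a handler in-process and returns the HTTP status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/healthz", nil))
	return rec.Code
}

func main() {
	fmt.Println("resolvable host:  ", probe(livenessHandler(resolveOK, "db.internal")))
	fmt.Println("unresolvable host:", probe(livenessHandler(resolveFail, "db.internal")))
}
```

Injecting the resolver keeps the handler testable without real DNS; whether hostname resolution is the right liveness signal is still open for discussion.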

We also need an additional endpoint that gives details behind the readiness probe (or it can be merged into the probe if that won't affect the probe's performance) and reports the status of third-party connections. For example, for conversion it would tell us about a problem with the S3 connection or with Redis.
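
A possible shape for such a "detailed probe", as a rough sketch only: the JSON layout, the names `detail`, `buildDetail`, `detailHandler` and `sampleStatuses`, and the idea of feeding it a `map[string]error` are assumptions, not something decided in this issue.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// detail is a hypothetical response body for the detailed endpoint.
type detail struct {
	Ready        bool              `json:"ready"`
	Dependencies map[string]string `json:"dependencies"`
}

// buildDetail turns per-dependency errors into a report.
// A nil error means the dependency is healthy.
func buildDetail(statuses map[string]error) detail {
	d := detail{Ready: true, Dependencies: make(map[string]string, len(statuses))}
	for name, err := range statuses {
		if err != nil {
			d.Ready = false
			d.Dependencies[name] = err.Error()
			continue
		}
		d.Dependencies[name] = "ok"
	}
	return d
}

// detailHandler serves the report as JSON. snapshot supplies the current
// statuses; where they come from is left open here.
func detailHandler(snapshot func() map[string]error) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(buildDetail(snapshot()))
	})
}

// sampleStatuses is made-up data for the demonstration below.
func sampleStatuses() map[string]error {
	return map[string]error{
		"s3":    errors.New("connection refused"),
		"redis": nil,
	}
}

func main() {
	body, err := json.Marshal(buildDetail(sampleStatuses()))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Keeping this separate from the readiness handler means the probe itself can stay a cheap status-code check, while the detailed report carries the error text.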


Use Case:

Mainly, but not only, K8s deployments of the application.


Benefits:

  1. Easier debugging in k8s; with this "detailed probe", debugging in Docker might be easier as well
  2. We will be k8s native
  3. A first step toward monitoring: this "detailed probe" will be a good endpoint to scrape metrics from with Prometheus

Possible Implementation (if applicable):

--


Additional Information:

--


@Mrkazik99 Mrkazik99 added the enhancement New feature or request label Jul 19, 2024
@dsonck92 dsonck92 self-assigned this Jul 19, 2024
@dsonck92
Collaborator

Proposed implementation:

  • Keep a repository of statuses inside the app, defaulting to "not ready", globally accessible
  • Let various components post updates they know about to the repository
  • Repository implements the ready handler

Having a repository that receives updates posted by other systems allows the probe to be very efficient, albeit slightly lagging behind the actual situation. If we make the repository a map/struct of errors, then components can post their failures for feedback using Go's error handling. A nil would be considered "good to go".
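
A minimal sketch of what this could look like, assuming a mutex-guarded map of errors. All names here (`StatusRepository`, `NewStatusRepository`, `Report`, `Ready`, `errNotReported`, `probe`) are hypothetical and only meant to make the proposal concrete:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// errNotReported is the initial status of every registered component,
// so the repository defaults to "not ready".
var errNotReported = errors.New("no status reported yet")

// StatusRepository holds the last known status of each component.
// A nil error means "good to go".
type StatusRepository struct {
	mu       sync.RWMutex
	statuses map[string]error
}

// NewStatusRepository registers the given components as not ready.
func NewStatusRepository(components ...string) *StatusRepository {
	r := &StatusRepository{statuses: make(map[string]error, len(components))}
	for _, c := range components {
		r.statuses[c] = errNotReported
	}
	return r
}

// Report lets a component post its latest status.
func (r *StatusRepository) Report(component string, err error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.statuses[component] = err
}

// Ready is true only when every registered component reported nil.
func (r *StatusRepository) Ready() bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	for _, err := range r.statuses {
		if err != nil {
			return false
		}
	}
	return true
}

// ServeHTTP makes the repository itself the ready handler.
func (r *StatusRepository) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if !r.Ready() {
		http.Error(w, "not ready", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// probe calls a handler in-process and returns the HTTP status code.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code
}

func main() {
	repo := NewStatusRepository("s3", "db")
	fmt.Println("before any report:", probe(repo))
	repo.Report("s3", nil)
	repo.Report("db", nil)
	fmt.Println("all components ok:", probe(repo))
	repo.Report("db", errors.New("connection refused"))
	fmt.Println("db failing:       ", probe(repo))
}
```

The probe only reads the map under a read lock, so it stays cheap regardless of how slow the dependencies are. Note that in this sketch a repository with no registered components counts as ready, and the repository is passed around as a pointer; a package-level variable would make it globally accessible as proposed.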

@bouassaba bouassaba added this to the v3.0 milestone Jul 22, 2024
@bouassaba bouassaba removed this from the v3.0 milestone Sep 13, 2024
3 participants