Liveness/Readiness probes #189

Mrkazik99 · 2024-07-19T17:45:46Z

Description:

Readiness probe: checks all dependencies of each component (S3, the database, and so on). It should be as lightweight as possible.

Liveness probe: should be independent of third-party service outages, although some connection errors might be taken into account. For example, if the database is not accepting connections, this probe should not be affected; but if the pod cannot resolve a hostname, restarting the component might help.

TL;DR: a failing liveness probe restarts the application, whereas a failing readiness probe only disconnects the pod from the Service; with a correct PDB and HA setup, the remaining replicas keep handling traffic.

We also need an additional endpoint that details the readiness result (or it can be merged into the readiness probe if that does not affect the probe's performance). It should report the status of third-party connections; for the conversion component, for example, it would tell us about a problem with the S3 connection or with Redis.

Use Case:

Mainly Kubernetes deployments of the application, but not only those.

Benefits:

Easier debugging in Kubernetes; with this "detailed probe", debugging under Docker might become easier as well
We will be Kubernetes-native
A first step toward monitoring: this "detailed probe" will be a good endpoint for Prometheus to scrape metrics from

Possible Implementation (if applicable):

--

Additional Information:

--

dsonck92 · 2024-07-19T18:00:58Z

Proposed implementation:

Keep a globally accessible repository of statuses inside the app, defaulting to "not ready"
Let the various components post the status updates they know about to the repository
The repository implements the ready handler

Having a repository that receives updates posted by other systems allows the probe to be very efficient, albeit slightly lagging behind the actual situation. If we make the repository a map/struct of errors, components can post their failures as feedback using Go's error framework. A nil would be considered "good to go".

Mrkazik99 added the enhancement label Jul 19, 2024

dsonck92 self-assigned this Jul 19, 2024

bouassaba added this to the v3.0 milestone Jul 22, 2024

loboda4450 mentioned this issue Sep 2, 2024

Initial version of Cloud Console API #296

Merged

bouassaba removed this from the v3.0 milestone Sep 13, 2024
