
Replace operator-framework/operator-sdk/pkg/ready #612

Merged — 1 commit merged into shipwright-io:master on Feb 24, 2021

Conversation

@HeavyWombat (Contributor) commented Feb 22, 2021

Changes

As part of the decoupling from the operator-sdk code base, there is the need
to introduce a replacement for their convenience ready package.

Add new ready package within build that has a comparable usage style.

Add code to make the ready filename configurable.

Change default ready filename to remove operator reference.

This addresses part of #584

See operator-framework/operator-sdk#3476 for details regarding the recommendation to create your own ready helper package if required.

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Release notes block has been filled in, or marked NONE

See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.

Release Notes

action required: The default filename to indicate readiness in the pod was renamed from /tmp/operator-sdk-ready to /tmp/shipwright-build-ready. If you use custom deployment mechanisms, please have a look at the respective readiness probe settings to adjust for the change.
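For custom deployments, the adjusted probe could look roughly like the snippet below. This is an illustrative fragment only; verify the command, container, and timing values against your actual deployment manifest.

```yaml
# Hypothetical readiness probe after the rename; the timings shown
# are placeholders, not values taken from the shipwright manifests.
readinessProbe:
  exec:
    command:
    - stat
    - /tmp/shipwright-build-ready
  initialDelaySeconds: 5
  periodSeconds: 10
```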

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2021
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 24, 2021
@qu1queee (Contributor) left a comment

Thanks @HeavyWombat , nice.
/lgtm
/approve

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Feb 24, 2021
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qu1queee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 24, 2021
@SaschaSchwarze0 (Member)

The branch misses Improve test stability #614 which is causing some tests to fail. But, there is a green integration and e2e test in at least one K8s version. Therefore manually merging.

@SaschaSchwarze0 SaschaSchwarze0 merged commit 18dbb5a into shipwright-io:master Feb 24, 2021
@HeavyWombat HeavyWombat deleted the remove_sdk_deps_03 branch February 24, 2021 12:47
@imjasonh (Contributor)

I stumbled across this new ready package while looking into something else, and I wonder if we need it at all?

It seems to only write the file then delete it when main exits, and seemingly nothing else uses it to determine readiness.

Even if it did, I wonder if we need a readinessProbe at all. Readiness probes are intended to tell a Service that traffic should be routed to the Pod, but in our case, AFAIK, there are no Services that read the status of the readiness probe.

@HeavyWombat (Contributor, Author)

> I stumbled across this new ready package while looking into something else, and I wonder if we need it at all?
>
> It seems to only write the file then delete it when main exits, and seemingly nothing else uses it to determine readiness.
>
> Even if it did, I wonder if we need a readinessProbe at all. Readiness probes are intended to tell a Service that traffic should be routed to the Pod, but in our case, AFAIK, there are no Services that read the status of the readiness probe.

Good question. At the time, I just decided to take the operator SDK based ready package implementation mostly as-is into a package of our own. It could be that this was actually also the recommended migration action from their documentation.

I agree that it does not do very much for us. Correct me if I am wrong, but we could get by with just a liveness probe configured against the metrics endpoint. If I am not totally wrong here, we even discussed something like that before. @SaschaSchwarze0 Do you recall that?

@SaschaSchwarze0 (Member) commented Apr 14, 2021

> @SaschaSchwarze0 Do you recall that?

Yes, Jason is right, readiness checks only make sense for Services, so that Kube can rule out pods that currently cannot accept traffic.

For the liveness check (which uses the same logic), the file-based approach is imo reasonable.

For the readiness check, we discussed making it depend on the pod having leadership. This can use the same file-based approach, just that the file would only get written once the instance became leader. Then one can have a Service for the metrics endpoint, and requests would only go to this leader.

@imjasonh (Contributor)

> For the liveness check (which uses the same logic), the file-based approach is imo reasonable.

I don't know if I agree, unless there's some code to delete the file when the container isn't live anymore. IMO having a liveness check that hits an HTTP endpoint would be able to more easily/accurately determine whether the container is running and healthy than signaling that with the presence/absence of a file.

Tekton does this with a simple HTTP handler that always returns 200 OK (source, tektoncd/pipeline#3489). In practice I'm not sure even that's necessary, since it's much more likely the container will exit rather than get into a state where it's running but unable to serve that HTTP handler.

@SaschaSchwarze0 (Member)

@imjasonh yep, agree, you are bringing up questions I also had. So far, the issues we have seen where the controller was no longer in good shape were always when it either had a panic, or when it had connectivity issues with the kube-api-server and could not renew the lease. In case of a panic, the process died and Kube would restart the pod. In case of the lease-renewal problem, I assume that it also terminates, but I am not sure.
