Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use certwatcher to support mounting cert-manager certificates. #134

Merged
merged 9 commits into from
Jan 22, 2022

Conversation

colinhoglund
Copy link
Contributor

Issue #:

Closes #94

Description of changes:

Mounting TLS certs from k8s Secret managed by https://cert-manager.io/ is a common approach for k8s webhooks.

Changes:

  • Updates the "out-of-cluster" configuration to to use certwatcher for watching and updating the configured TLS cert/key.
  • Moves signal handling out of the ShutdownOnTerm function and into main so that the context can be sent to both certwatcher and the http.Server.
  • Updates k8s dependencies in order to use the certwatcher package, which moves some of the types in pkg/cert/request.go from certificates/v1beta1 to certificates/v1.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@colinhoglund colinhoglund requested a review from a team as a code owner January 20, 2022 01:36
SignerName: certificates.LegacyUnknownSignerName,
// Hard coding this since LegacyUnknownSignerName is no longer available in certificates/v1
// https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/#kubernetes-signers.
SignerName: "kubernetes.io/legacy-unknown",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not just unavailable, it's not allowed: https://github.com/kubernetes/kubernetes/blob/48da959dbff18bfef6e801bd8c8ab3c88b7a7650/pkg/apis/certificates/validation/validation.go#L202

I think we will need to deprecate in-cluster flag + print a clear error + exit gracefully if the kubernetes server version is 1.22 AND in-cluster is true. Instead of trying and failing to create invalid CSR. However, doing all that is out of scope of your PR. So I would like to accept the other changes and drop these changes to pkg/cert/request.go, if possible.

Copy link
Contributor Author

@colinhoglund colinhoglund Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback @wongma7, and it totally makes sense to separate these. The only reason I changed pkg/cert/request.go is because of dependencies. certwatcher is only exported in a newer version of the controller-runtime, which resulted in an upgrade of k8s.io/client-go to v0.23.0, which is turn changed the underlying Config type for KeyUsage to certficates/v1.

I'm not entirely sure if there is a way to avoid these dependency updates for now, but would be open to any suggesstions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thx for the explanation. Then I agree with merging this part as well, I don't see a way to avoid it either.

Copy link
Member

@wongma7 wongma7 Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for self and others interested in the in-cluster=true case:

Given that this PR basically solves the in-cluster=false case since now webhook actually reload certs when they are renewed...

My plan for in-cluster=true is...

  1. Currently, in-cluster=true works for k8s <=1.21 but doesn't for >1.21. We have discussed this at length and there seems to be no way to make it work in >1.21* (without mis/abusing the CSR api) so we need to deprecate it ASAP (as 1.22 has been out for months now).
  2. Upon merging this, in-cluster=true won't work for ANY k8s version. Users who rely on it will need to make sure they use an older revision when building pod-identity-webhook.
  3. Immediately/soon after merging this, someone (me/colleague/contributor) should deprecate in-cluster=true with a note in README and major version release so it is clear & obvious that it won't work for ANY k8s version

Copy link
Contributor Author

@colinhoglund colinhoglund Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2. Upon merging this, in-cluster=true won't work for ANY k8s version. Users who rely on it will need to make sure they use an older revision when building pod-identity-webhook.

It probably makes sense to tag a release before merging this since the latest release is a bit old. Then once the followup work is done, a new release can be cut that introduces certwatcher and deprecates the --in-cluster flag at the same time.

@wongma7
Copy link
Member

wongma7 commented Jan 20, 2022

Thank you very much for this PR. I have one comment about the v1 CSR change . I think we will need to handle that separately by deprecating the in-cluster=true flag altogether because even if you set legacy-unknown it will be rejected for v1 CSRs. But open to discussion about that.

wongma7
wongma7 previously approved these changes Jan 20, 2022
@wongma7
Copy link
Member

wongma7 commented Jan 20, 2022

I dont actually have permission in this repo so pinging others for review @nckturner etc. thanks !!

@nckturner
Copy link
Contributor

@prasita123 was working on testing these changes with a cert-manager deployment, can you update the results of your test?

@prasita123
Copy link
Contributor

tested the changes from this PR along with updates to

  • deploy/deployment-base.yaml
    • setting in-cluster=false
    • updates to volumeMounts paths
    • add a cert-manager deployment
    • a cert-manager.io/v1 ClusterIssuer
    • a cert-manager.io/v1 Certificate
  • deploy/mutatingwebhook.yaml
    • add annotations to replace caBundle

Works as expected, cert-rotation was observed.

@nckturner
Copy link
Contributor

Failing dependency when go test is run:

 Error: ../../../go/pkg/mod/sigs.k8s.io/json@v0.0.0-20211020170558-c049b76a60c6/internal/golang/encoding/json/encode.go:1249:12: sf.IsExported undefined (type reflect.StructField has no field or method IsExported)
Error: ../../../go/pkg/mod/sigs.k8s.io/json@v0.0.0-20211020170558-c049b76a60c6/internal/golang/encoding/json/encode.go:1255:18: sf.IsExported undefined (type reflect.StructField has no field or method IsExported)

@nckturner
Copy link
Contributor

@colinhoglund can you take a look at the dependency failure? Is it the go version we are running the tests with?

@colinhoglund
Copy link
Contributor Author

Thanks @nckturner, it looks like a dependency is using a feature of reflect added in 1.17: https://pkg.go.dev/reflect#StructField.IsExported

I've updated to go 1.17 and run go mod vendor and go mod tidy. Tests seem to be passing for me locally, but probably need to trigger a new run.

nckturner
nckturner previously approved these changes Jan 21, 2022
@nckturner
Copy link
Contributor

One more:

 Error: vendor/k8s.io/client-go/discovery/discovery_client.go:32:2: cannot find package "." in:
	/home/runner/work/amazon-eks-pod-identity-webhook/amazon-eks-pod-identity-webhook/vendor/github.com/googleapis/gnostic/openapiv2
Error: Process completed with exit code 1.

@colinhoglund
Copy link
Contributor Author

One more:

 Error: vendor/k8s.io/client-go/discovery/discovery_client.go:32:2: cannot find package "." in:
	/home/runner/work/amazon-eks-pod-identity-webhook/amazon-eks-pod-identity-webhook/vendor/github.com/googleapis/gnostic/openapiv2
Error: Process completed with exit code 1.

Looks like this is a case-sensitivity issue that I did not pick up locally (using darwin). I'm not entirely sure how to address it, but I'll take a look.

@colinhoglund
Copy link
Contributor Author

One more:

 Error: vendor/k8s.io/client-go/discovery/discovery_client.go:32:2: cannot find package "." in:
	/home/runner/work/amazon-eks-pod-identity-webhook/amazon-eks-pod-identity-webhook/vendor/github.com/googleapis/gnostic/openapiv2
Error: Process completed with exit code 1.

Looks like this is a case-sensitivity issue that I did not pick up locally (using darwin). I'm not entirely sure how to address it, but I'll take a look.

It looks like I was able to get it to recognize the case change by committing rm -rf vendor/github.com/googleapis/gnostic and then rerunning and committing go mod vendor. 🤞

@nckturner
Copy link
Contributor

Awesome. This looks good, I think we just need to update the README with the relevant information about this being somewhat breaking. Also cut a release.. which I can get started on. Do you want to add a summary to the README @colinhoglund?

@wongma7
Copy link
Member

wongma7 commented Jan 22, 2022

imo this doesn't need readme update, this is fixing a bug. anyway the readme is very 'in-cluster=true' focused and i think fixing that is out of scope of this PR

@nckturner
Copy link
Contributor

I mean I think it needs one, since in-cluster=true is essentially deprecated now. But we can merge that separately.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support cert-manager based deployments
4 participants