This repository has been archived by the owner on Jun 26, 2023. It is now read-only.

HNC: webhooks broken in K8s 1.19 due to Golang 1.15 #1100

Closed
adrianludwin opened this issue Sep 11, 2020 · 19 comments
@adrianludwin
Contributor

cc @mbtamuli. See #1096 (comment).

/assign @yiqigao217

{"level":"info","ts":1599798100.9179494,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-hnc-x-k8s-io-v1alpha2-hierarchyconfigurations"}
{"level":"info","ts":1599798100.9179764,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-objects"}
{"level":"info","ts":1599798100.9179943,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-hnc-x-k8s-io-v1alpha2-hncconfigurations"}
{"level":"info","ts":1599798100.9180129,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-hnc-x-k8s-io-v1alpha2-subnamespaceanchors"}
{"level":"info","ts":1599798100.9180274,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/validate-v1-namespace"}
{"level":"info","ts":1599798100.9180343,"logger":"controller-runtime.builder","msg":"skip registering a mutating webhook, admission.Defaulter interface is not implemented","GVK":"hnc.x-k8s.io/v1alpha2, Kind=HNCConfiguration"}
{"level":"info","ts":1599798100.91805,"logger":"controller-runtime.builder","msg":"skip registering a validating webhook, admission.Validator interface is not implemented","GVK":"hnc.x-k8s.io/v1alpha2, Kind=HNCConfiguration"}
{"level":"info","ts":1599798100.9180915,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/convert"}
{"level":"info","ts":1599798100.9180963,"logger":"controller-runtime.builder","msg":"conversion webhook enabled","object":{"name":""}}
{"level":"info","ts":1599798100.9181077,"logger":"controller-runtime.builder","msg":"skip registering a mutating webhook, admission.Defaulter interface is not implemented","GVK":"hnc.x-k8s.io/v1alpha2, Kind=HierarchyConfiguration"}
{"level":"info","ts":1599798100.918112,"logger":"controller-runtime.builder","msg":"skip registering a validating webhook, admission.Validator interface is not implemented","GVK":"hnc.x-k8s.io/v1alpha2, Kind=HierarchyConfiguration"}
{"level":"info","ts":1599798100.9181347,"logger":"controller-runtime.builder","msg":"conversion webhook enabled","object":{"name":""}}
{"level":"info","ts":1599798100.91815,"logger":"controller-runtime.builder","msg":"skip registering a mutating webhook, admission.Defaulter interface is not implemented","GVK":"hnc.x-k8s.io/v1alpha2, Kind=SubnamespaceAnchor"}
{"level":"info","ts":1599798100.9181566,"logger":"controller-runtime.builder","msg":"skip registering a validating webhook, admission.Validator interface is not implemented","GVK":"hnc.x-k8s.io/v1alpha2, Kind=SubnamespaceAnchor"}
{"level":"info","ts":1599798100.9181774,"logger":"controller-runtime.builder","msg":"conversion webhook enabled","object":{"name":""}}

@adrianludwin adrianludwin added this to the hnc-v0.6 milestone Sep 11, 2020
@mbtamuli

I tried deploying the latest version using

HNC_VERSION=v0.5.2
kubectl apply -f https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-${HNC_VERSION}/hnc-manager.yaml

Even then, the logs from the manager container show some errors. Logs:
manager.log

@mbtamuli

mbtamuli commented Sep 16, 2020

Could it be a version issue of Kind or Kubernetes?

I'm using

$ kind version
kind v0.9.0 go1.15.2 linux/amd64

$ kubectl get nodes
NAME                 STATUS   ROLES    AGE     VERSION
kind-control-plane   Ready    master   6m40s   v1.19.1
kind-worker          Ready    <none>   6m4s    v1.19.1
kind-worker2         Ready    <none>   6m4s    v1.19.1

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.7", GitCommit:"bfb38f707bc4a8edfcd73472ec3d96b500b8b781", GitTreeState:"archive", BuildDate:"2020-09-12T13:44:53Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.1", GitCommit:"206bcadf021e76c27513500ca24182692aabd17e", GitTreeState:"clean", BuildDate:"2020-09-14T07:30:52Z", GoVersion:"go1.15", Compiler:"gc", Platform:"linux/amd64"}

$ go version
go version go1.15.2 linux/amd64

@adrianludwin
Contributor Author

Ah, all those "skip registering a mutating webhook" messages are harmless. They're all about defaults that are set on the API types themselves, but we don't use that feature. Our webhooks are registered explicitly and aren't affected by these messages. (This is all detailed controller-runtime stuff.)

I think this is the same as open-policy-agent/gatekeeper#811, and we need to port that fix here. That is: HNC is currently broken on K8s 1.19, because 1.19 is built with Go 1.15, which changes how certificates are handled. Updating the comment to better describe this.
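For readers unfamiliar with the Go 1.15 change: its release notes state that treating an X.509 certificate's CommonName as a host name when no Subject Alternative Names are present is disabled by default. Assuming that is the change at issue here (the thread does not spell it out), the sketch below shows the effect using only the Go standard library. The helper `selfSigned` and the service host name are hypothetical illustrations, not HNC's actual certificate code.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSigned (hypothetical helper) creates and parses a self-signed
// certificate with the given CommonName and an optional list of DNS SANs.
func selfSigned(commonName string, dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		DNSNames:     dnsNames, // empty means no SAN extension is emitted
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	// The template doubles as its own parent, which makes it self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Assumed, illustrative webhook service DNS name.
	const host = "hnc-webhook-service.hnc-system.svc"

	cnOnly, err := selfSigned(host, nil)
	if err != nil {
		panic(err)
	}
	withSAN, err := selfSigned(host, []string{host})
	if err != nil {
		panic(err)
	}

	// On Go 1.15 and later the CN-only certificate fails hostname
	// verification, while the certificate carrying a DNS SAN passes.
	fmt.Println("CN only: ", cnOnly.VerifyHostname(host))
	fmt.Println("with SAN:", withSAN.VerifyHostname(host))
}
```

If this is indeed the cause, a webhook serving certificate generated with only a CommonName would be rejected by an API server built with Go 1.15, and adding the service DNS name as a SAN would be the usual remedy.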

/assign @adrianludwin

@adrianludwin adrianludwin changed the title HNC: webhooks not starting in some cases HNC: webhooks broken in K8s 1.19 due to Golang 1.15 Sep 17, 2020
@adrianludwin adrianludwin modified the milestones: hnc-v0.6, hnc-v0.5.3 Sep 17, 2020
@adrianludwin
Contributor Author

We should fix this in the 0.5 branch. I'll do it.

@mbtamuli

That is: HNC is currently broken on K8s 1.19, because 1.19 is built with Go 1.15, which changes how certificates are handled. Updating the comment to better describe this.

Do you mean it should work with k8s <1.19?

@adrianludwin
Contributor Author

adrianludwin commented Sep 17, 2020 via email

@mbtamuli

mbtamuli commented Sep 18, 2020

Hey @adrianludwin,

I was trying to run the tests. I always get errors, no matter which Kubernetes version I use.

I've tested with GKE v1.15, v1.16 and v1.17, the versions you mentioned in Testing Signoff.

Here are the logs
test-e2e-v115.log
test-e2e-v116.log
test-e2e-v117.log


Am I running the e2e tests wrong? I was getting errors even with HNC_REPAIR not set.

I did the following -

HNC_VERSION=v0.5.2
HNC_PLATFORM=linux_amd64
curl -L https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-${HNC_VERSION}/kubectl-hns_${HNC_PLATFORM} -o ./kubectl-hns
chmod +x kubectl-hns
sudo mv kubectl-hns /usr/local/bin/

gcloud container clusters create hnc-test-v117 --preemptible --num-nodes 1 --cluster-version 1.17 --release-channel rapid
HNC_VERSION=v0.5.2
kubectl apply -f https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-${HNC_VERSION}/hnc-manager.yaml
HNC_REPAIR=true make test-e2e 2>&1 | tee test-e2e-v117.log

gcloud container clusters create hnc-test-v116 --preemptible --num-nodes 1 --cluster-version 1.16 --release-channel regular
HNC_VERSION=v0.5.2
kubectl apply -f https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-${HNC_VERSION}/hnc-manager.yaml
HNC_REPAIR=true make test-e2e 2>&1 | tee test-e2e-v116.log

gcloud container clusters create hnc-test-v115 --preemptible --num-nodes 1 --cluster-version 1.15 --release-channel stable
HNC_VERSION=v0.5.2
kubectl apply -f https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-${HNC_VERSION}/hnc-manager.yaml
HNC_REPAIR=true make test-e2e 2>&1 | tee test-e2e-v115.log

@adrianludwin
Contributor Author

@mbtamuli - I just discovered that we're broken at HEAD. Did you try this on v0.5.2? See here for details. We hadn't noticed this because we were testing on clusters that previously had HNC installed.

If you're not testing on HEAD, can you please attach the logs from HNC itself, and not just the tests? You can get the logs via kubectl logs or make deploy-watch.

@adrianludwin
Contributor Author

I'm actively working on this, but I likely will only be able to fix this on Monday.

@mbtamuli

mbtamuli commented Sep 18, 2020

Did you try this on v0.5.2?

Yes, I did. I updated the previous comment with the deployment steps. I just followed the deployment steps mentioned in the release.

If you're not testing on HEAD, can you please attach the logs from HNC itself, and not just the tests? You can get the logs via kubectl logs or make deploy-watch.

Will get the logs as well. Also, I'm working on testing on KIND. I'm not able to get any version combination running locally on KIND.


Off topic:
This is WIP.

I'm trying to add a GitHub Actions workflow to run e2e tests on KIND. It will run on multiple Kubernetes versions at the same time. Right now, as I have not been able to get any version running on KIND, the workflow is incomplete.
Example: https://github.com/mbtamuli/multi-tenancy/actions/runs/259409193
Source: https://github.com/mbtamuli/multi-tenancy/blob/add_kind_github_actions/.github/workflows/create-cluster.yml

@adrianludwin
Contributor Author

Oh, you need to set HNC_REPAIR to https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-${HNC_VERSION}/hnc-manager.yaml (with the env var filled in). The test runs kubectl apply -f ${HNC_REPAIR} after it (deliberately) breaks the HNC installation.

If you leave the variable unset, it will skip some of the tests but it will run enough other things to be confident that you've at least installed it correctly.

Sorry, I should document that env var better.
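As a rough illustration of the behaviour described above, the following Go sketch builds the release manifest URL and decides whether a repair step would run. It is not the actual HNC e2e test code: `repairManifestURL` and `repairCommand` are hypothetical names, and the program only prints the `kubectl` command rather than executing it.

```go
package main

import (
	"fmt"
	"os"
)

// repairManifestURL (hypothetical helper) fills the HNC version, such as
// "v0.5.2", into the release manifest URL quoted in this thread.
func repairManifestURL(version string) string {
	return fmt.Sprintf(
		"https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-%s/hnc-manager.yaml",
		version)
}

// repairCommand (hypothetical helper) returns the kubectl invocation that
// would reinstall HNC from the given manifest. ok is false when the value
// is empty, which models HNC_REPAIR being unset.
func repairCommand(repair string) (args []string, ok bool) {
	if repair == "" {
		return nil, false
	}
	return []string{"kubectl", "apply", "-f", repair}, true
}

func main() {
	if args, ok := repairCommand(os.Getenv("HNC_REPAIR")); ok {
		fmt.Println("repair step would run:", args)
		return
	}
	fmt.Println("HNC_REPAIR is unset; tests that need a repair step would be skipped")
	fmt.Println("example value:", repairManifestURL("v0.5.2"))
}
```

The point of the sketch is that `HNC_REPAIR` is expected to hold a manifest location that `kubectl apply -f` can consume, not a boolean such as `true`.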

@adrianludwin
Contributor Author

I don't know how to run Kind on GitHub Actions. We use Prow and we have some pointers on how to run Kind in Prow, so we'll likely be setting that up soon-ish.

@mbtamuli

Oh, you need to set HNC_REPAIR to https://github.com/kubernetes-sigs/multi-tenancy/releases/download/hnc-${HNC_VERSION}/hnc-manager.yaml (with the env var filled in). The test runs kubectl apply -f ${HNC_REPAIR} after it (deliberately) breaks the HNC installation.

Ah, I should've taken a better look at the Makefile or the existing e2e tests. Will try setting HNC_REPAIR to the manifest path.


We use Prow and we have some pointers on how to run Kind in Prow, so we'll likely be setting that up soon-ish.

Can you mention one version of HNC (or commit) that you've tried with one version of KIND and one version of Kubernetes, where the e2e tests finished successfully?
For example, you've tested successfully with
HNC - 0.5.2 AND Kind - 0.9.0 AND Kubernetes - 1.16

@mbtamuli

Hey @adrianludwin, I got these logs. I tested these on GKE 1.17

gcloud container clusters create hnc-test-v117 --preemptible --num-nodes 1 --cluster-version 1.17 --release-channel rapid

HNC - v.0.5.2

HNC - v.0.5.1

@adrianludwin
Contributor Author

I think I saw the same thing recently and it was because I had the wrong version of kubectl-hns. Can you confirm you have the v0.5.x version of kubectl-hns on your path?

@mbtamuli

Hey, I did try with the correct version of kubectl-hns. This time a few of the tests passed.

FAIL! -- 20 Passed | 8 Failed | 0 Pending | 0 Skipped

HNC - v.0.5.1

@adrianludwin
Contributor Author

adrianludwin commented Sep 22, 2020 via email

@adrianludwin
Contributor Author

Fixed in v0.5 by #1125, and will be fixed in master by #1128. Closing because it's fixed in v0.5 (we should release v0.5.3 shortly).

/close

@k8s-ci-robot
Contributor

@adrianludwin: Closing this issue.

In response to this:

Fixed in v0.5 by #1125, and will be fixed in master by #1128. Closing because it's fixed in v0.5 (we should release v0.5.3 shortly).

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
