Switch to prow for e2e testing. Fix intermittent test failures. #122
Conversation
@munnerz PR needs rebase
/test e2e
Thanks @munnerz
The tests are failing, as you say, because `minikube start` is failing.
And I expect the tests to fail on 1.7 anyway, because we're passing exactly the same `--requestheader-` arguments to the Navigator API server as in #116 (comment), where we saw RBAC access control failures such as:
`Error from server (Forbidden): elasticsearchclusters.navigator.jetstack.io is forbidden: User "system:anonymous" cannot list elasticsearchclusters.navigator.jetstack.io in the namespace "default"`
even for a user with a cluster-admin role binding.
Unless the kubeadm bootstrapper does something differently. If so, can you add a comment or two explaining what's different?
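For reference, the values change later in this PR binds the `system:unauthenticated` group to a cluster role; a sketch of what the full binding might look like (the metadata name here is hypothetical, not taken from the chart):

```yaml
# Illustrative ClusterRoleBinding granting a role to unauthenticated requests,
# matching the subjects fragment shown further down in this PR. Name and
# roleRef are assumptions for illustration only.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: e2e-unauthenticated-access
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: Group
  name: system:unauthenticated
  apiGroup: rbac.authorization.k8s.io
```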
@@ -87,7 +87,7 @@ items:
     name: "{{ template "fullname" . }}:controller"
     rules:
     - apiGroups: ["navigator.jetstack.io"]
-     resources: ["elasticsearchclusters", "pilots"]
+     resources: ["elasticsearchclusters", "pilots", "elasticsearchclusters/status", "pilots/status"]
❓ Does a controller ever need to modify `pilots/status`?
It needs to delegate permission to modify/update `pilots/status` to each Pilot.
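A sketch of what such a delegated rule might look like (the verbs listed are assumptions for illustration, not taken from this PR):

```yaml
# Hypothetical rule allowing a Pilot to read and update its status subresource.
- apiGroups: ["navigator.jetstack.io"]
  resources: ["pilots/status"]
  verbs: ["get", "update", "patch"]
```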
Got it.
`hack/install-e2e-dependencies.sh` (Outdated)
-    --extra-config=apiserver.Authorization.Mode=RBAC
+    --extra-config=apiserver.Authorization.Mode=RBAC \
+    --bootstrapper=kubeadm; then
+  sudo journalctl -xu kubelet.service
This is failing on Travis because the VM doesn't have `journalctl`.
Use `minikube logs` instead?
Or does kubeadm put the logs somewhere where `minikube logs` can't find them?
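One way to cope with both environments (a sketch, not part of this PR: it only checks tool availability, picking `journalctl` on systemd hosts and `minikube logs` elsewhere):

```shell
# Sketch: choose a kubelet log command based on what's installed. On hosts
# without systemd (e.g. Travis's Ubuntu 14.04 VMs) fall back to minikube logs.
kubelet_log_cmd() {
  if command -v journalctl >/dev/null 2>&1; then
    echo "journalctl -xu kubelet.service"
  else
    echo "minikube logs"
  fi
}
```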
This is still WIP - we can't use the kubeadm bootstrapper with `--vm-driver=none`, as kubeadm requires systemctl, but Travis doesn't support anything newer than Ubuntu 14.04.
I'm working on getting rid of Travis in favour of minikube-in-go on our own test-infra at the moment.
`hack/testdata/values-v1.7.0.yaml` (Outdated)
resources:
  requests:
    cpu: 50m
    memory: 64Mi
`helm install` says `-f, --values valueFiles   specify values in a YAML file (can specify multiple) (default [])`, so perhaps we could put only the `apiserver.extraArgs` in here?
Yep we could - I'm inclined to keep this nice and explicit though, so it's easy to understand what's being deployed in the e2e tests.
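As a sketch of the layered alternative, a version-specific overlay file could carry only the API server arguments, with everything else coming from shared defaults. The key names and the flag shown are assumptions for illustration, not taken from the chart:

```yaml
# Hypothetical minimal overlay values file containing only apiserver.extraArgs;
# the flag is illustrative only.
apiserver:
  extraArgs:
    - --authorization-mode=RBAC
```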
Still TODO:
RBAC is enabled by default with the kubeadm bootstrapper on both 1.7 and 1.8.
/test e2e
@wallrj this appears to be passing pretty consistently now. The only time I've seen failures is when the k8s 1.7 and 1.8 tests run in parallel. I think that's because our build server is maxing out (only 2 cores allocated right now), meaning the kubeadm start timeout is being reached. We'll be increasing the size of our build infrastructure accordingly. I haven't seen the failures very often, however. I've also made the
Thanks @munnerz
Looks good. And amazing that you got all this stuff set up in such a short time!
The build logs seem to have disappeared
- https://prow.build-infra.jetstack.net/log?job=navigator-e2e-v1-7&id=11
Log not found: no such job found: navigator-e2e-v1-7 (id: 11)
- https://prow.build-infra.jetstack.net/log?job=navigator-e2e-v1-8&id=9
The Travis build now seems to be running a bare `make` command, which is a bit pointless.
I've left a few additional comments and questions below.
Please address and merge.
@@ -58,6 +64,7 @@ verify: .hack_verify go_verify
 DOCKER_BUILD_TARGETS = $(addprefix docker_build_, $(CMDS))
 $(DOCKER_BUILD_TARGETS):
 	$(eval DOCKER_BUILD_CMD := $(subst docker_build_,,$@))
+	eval $$(minikube docker-env --profile $$HOSTNAME --shell sh); \
❓ Is it necessary to specify a profile here? The default is `minikube`, which should work on a new e2e VM, right?
It means that I have to remember to use `$HOSTNAME` as the profile when I launch minikube locally.
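One way to keep both workflows working (a sketch; the `MINIKUBE_PROFILE` variable is an assumption, not part of this PR) is to default the profile rather than hard-code `$HOSTNAME`:

```shell
# Hypothetical: CI can export MINIKUBE_PROFILE="$HOSTNAME", while local runs
# fall back to minikube's default profile name.
MINIKUBE_PROFILE="${MINIKUBE_PROFILE:-minikube}"
docker_env_cmd="minikube docker-env --profile ${MINIKUBE_PROFILE} --shell sh"
echo "${docker_env_cmd}"
```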
# - --requestheader-client-ca-file=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
# - --requestheader-username-headers=X-Remote-User
# - --requestheader-group-headers=X-Remote-Group
# - --requestheader-extra-headers-prefix=X-Remote-Extra - --proxy-client-cert-file="${CERT_DIR}/client-auth-proxy.crt"
This line needs to be split. But do it in a followup branch if you like.
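Split out, the fused line would presumably read as two entries (assuming the conventional `X-Remote-Extra-` prefix used for Kubernetes API aggregation; the trailing hyphen is an assumption):

```yaml
# - --requestheader-extra-headers-prefix=X-Remote-Extra-
- --proxy-client-cert-file="${CERT_DIR}/client-auth-proxy.crt"
```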
chmod +x minikube
sudo mv minikube /usr/local/bin/

docker run -v /usr/local/bin:/hostbin quay.io/jetstack/ubuntu-nsenter cp /nsenter /hostbin/nsenter
❓ I see all this stuff gets installed here: https://github.com/jetstack/test-infra/blob/master/images/minikube-in-go/Dockerfile
Great, that should speed things up and save some network traffic.
- How easy is it to launch a KVM VM from inside a Docker container?
- How do you ensure that we're using the correct version of `kubectl`, if it is installed there but `minikube start --kubernetes-version...` is specified here?
The KVM VM is launched in a container by passing the libvirt socket through into the container. Anything that works with libvirt should work okay within a container too.
The `KUBERNETES_VERSION` environment variable is set as part of the minikube-in-go docker image, which helps ensure we use the correct version of kubectl with kubernetes 😄
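A sketch of how a single pinned version can drive both tools (the variable name comes from the comment above; the default value and command wiring are illustrative):

```shell
# KUBERNETES_VERSION is baked into the minikube-in-go image; reusing it for
# minikube start keeps kubectl and the cluster on the same version.
KUBERNETES_VERSION="${KUBERNETES_VERSION:-v1.7.0}"
start_cmd="minikube start --kubernetes-version=${KUBERNETES_VERSION}"
echo "${start_cmd}"
```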
  apiGroup: rbac.authorization.k8s.io
- kind: Group
  name: system:unauthenticated
  apiGroup: rbac.authorization.k8s.io
👍
/retest
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: munnerz The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
/test all [submit-queue is verifying that this PR is safe to merge]
/test all
Automatic merge from submit-queue.
What this PR does / why we need it:
Attempts to fix requestheader & RBAC related issues with e2e tests
Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes #TBC
Special notes for your reviewer:
This may not pass due to issues with kubernetes actually starting still
Release note:
/assign
/cc @wallrj