Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derive kubelet serving certificate CSR template from node status addresses #65594

Merged
merged 3 commits into from
Jul 12, 2018

Conversation

liggitt
Copy link
Member

@liggitt liggitt commented Jun 28, 2018

xref kubernetes/enhancements#267
fixes #55633

Builds on #65587

  • Makes the cloud provider authoritative when recording node status addresses
  • Makes the node status addresses authoritative for the kube-apiserver determining how to speak to a kubelet (stops paying attention to the hostname label when determining how to reach a kubelet, which was only done to support kubelets < 1.5)
  • Updates kubelet certificate rotation to be driven from node status
    • Avoids needing to compute node addresses a second time, and differently, in order to request serving certificates.
    • Allows the kubelet to react to changes in its status addresses by updating its serving certificate
    • Allows the kubelet to be driven by external cloud providers recording node addresses on the node status

test procedure:

# setup
export FEATURE_GATES=RotateKubeletServerCertificate=true
export KUBELET_FLAGS="--rotate-server-certificates=true --cloud-provider=external"

# cleanup from previous runs
sudo rm -fr /var/lib/kubelet/pki/

# startup
hack/local-up-cluster.sh

# wait for a node to register, verify it didn't set addresses
kubectl get nodes 
kubectl get node/127.0.0.1 -o jsonpath={.status.addresses}

# verify the kubelet server isn't available, and that it didn't populate a serving certificate
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
ls -la /var/lib/kubelet/pki

# set an address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
  -H "Content-Type: application/merge-patch+json" \
  --data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"}]}}'

# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...

# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname, but NOT the IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki

# set an hostname and IP address on the node
curl -X PATCH http://localhost:8080/api/v1/nodes/127.0.0.1/status \
  -H "Content-Type: application/merge-patch+json" \
  --data '{"status":{"addresses":[{"type":"Hostname","address":"localhost"},{"type":"InternalIP","address":"127.0.0.1"}]}}'

# verify a csr was submitted with the right SAN, and approve it
kubectl describe csr
kubectl certificate approve csr-...

# verify the kubelet connection uses a cert that is properly signed and valid for the specified hostname AND IP
curl --cacert _output/certs/server-ca.crt -v https://localhost:10250/pods
curl --cacert _output/certs/server-ca.crt -v https://127.0.0.1:10250/pods
ls -la /var/lib/kubelet/pki
* kubelets that specify `--cloud-provider` now only report addresses in Node status as determined by the cloud provider (unless `--hostname-override` is used to force reporting of the specified hostname)
* kubelet serving certificate rotation now reacts to changes in reported node addresses, and will request certificates for addresses set by an external cloud provider

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 28, 2018
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 28, 2018
@liggitt
Copy link
Member Author

liggitt commented Jun 28, 2018

@kubernetes/sig-node-pr-reviews @kubernetes/sig-auth-pr-reviews
/assign @mikedanese @awly

}
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt since we are using reflect to verify a deep equal later in the code, we should sort the dns and IP addresses just in case they get returned in a different order at some point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, done

@liggitt
Copy link
Member Author

liggitt commented Jun 28, 2018

/test pull-kubernetes-e2e-gke

@liggitt
Copy link
Member Author

liggitt commented Jun 28, 2018

/retest

@rphillips
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 28, 2018
@liggitt
Copy link
Member Author

liggitt commented Jun 28, 2018

/test pull-kubernetes-e2e-gke

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 28, 2018

switch address.Type {
case v1.NodeHostName:
if ip := net.ParseIP(address.Address); ip != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm. Is there any scenario where the node hostname can like an IP, but is not actually an IP and would not be appropriate to interpret as an IP?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any scenario where the node hostname can like an IP, but is not actually an IP and would not be appropriate to interpret as an IP?

No? A hostname that parses as an IP is an IP... SNI wouldn't treat it as a DNS name, cert verification would require an IP SAN (c.f. https://golang.org/src/crypto/x509/verify.go?s=28297:28349#L914)

wait.PollImmediateInfinite(time.Second, func() (bool, error) {
lastTemplate = m.getTemplate()
if lastTemplate == nil {
glog.Infof("Waiting for certificate template")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not v(2)? Because it hangs? If so, might want a slightly longer message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, wanted visibility

@liggitt
Copy link
Member Author

liggitt commented Jul 4, 2018

/retest

2 similar comments
@liggitt
Copy link
Member Author

liggitt commented Jul 4, 2018

/retest

@liggitt
Copy link
Member Author

liggitt commented Jul 5, 2018

/retest

Copy link
Contributor

@awly awly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but @mikedanese should take another look

m := manager{
certSigningRequestClient: config.CertificateSigningRequestClient,
template: config.Template,
getTemplate: getTemplate,
dynamicTemplate: dynamicTemplate,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be config.GetTemplate != nil and local dynamicTemplate var can be removed

Copy link
Member Author

@liggitt liggitt Jul 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config.GetTemplate and config.Template are mutually exclusive. dynamicTemplate is used to determine whether we need to monitor getTemplate() for changes

edit: misread, can update

@mikedanese
Copy link
Member

This fixed GKE alpha cluster creation.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 9, 2018
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 10, 2018
@k8s-ci-robot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@liggitt
Copy link
Member Author

liggitt commented Jul 10, 2018

fixed #65594 (comment), rebased, no other changes, relabeling
/assign @smarterclayton
for approval of change to pkg/util/node/node.go (dropped use of the hostname label when computing node connection URLs... was for <1.5 compatibility)

@liggitt liggitt added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 10, 2018
@liggitt
Copy link
Member Author

liggitt commented Jul 11, 2018

/retest

@smarterclayton
Copy link
Contributor

/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, mikedanese, rphillips, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 11, 2018
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 65052, 65594). If you want to cherry-pick this change to another branch, please follow the instructions here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Master Kubelet TLS verification broken due to external cloud provider node initialization order
8 participants