Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: CSIStorageCapacity #21634

Merged
merged 1 commit into from
Jul 16, 2020

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Jun 10, 2020

This is the documentation for CSIStorageCapacity, which was merged for 1.19 today:

The PR used to contain also documentation for GenericEphemeralVolume, but I took that out because documentation wasn't as closely tied as I first thought and the featured hasn't been merged yet.

Therefore this PR is now ready for review and (eventually) merging.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 10, 2020
@k8s-ci-robot k8s-ci-robot added the language/en Issues or PRs related to English language label Jun 10, 2020
@k8sio-netlify-preview-bot
Copy link
Collaborator

k8sio-netlify-preview-bot commented Jun 10, 2020

Deploy preview for kubernetes-io-vnext-staging processing.

Building with commit 19b8f84

https://app.netlify.com/sites/kubernetes-io-vnext-staging/deploys/5f0ff929e708250008a533a4

@k8s-ci-robot k8s-ci-robot added the sig/docs Categorizes an issue or PR as relevant to SIG Docs. label Jun 10, 2020
@savitharaghunathan
Copy link
Member

/milestone 1.19

@k8s-ci-robot k8s-ci-robot added this to the 1.19 milestone Jun 10, 2020
@savitharaghunathan
Copy link
Member

/assign

Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some early feedback

content/en/docs/concepts/storage/ephemeral-volumes.md Outdated Show resolved Hide resolved
content/en/docs/concepts/storage/volumes.md Outdated Show resolved Hide resolved
content/en/docs/concepts/storage/ephemeral-volumes.md Outdated Show resolved Hide resolved
@pohly pohly force-pushed the kubernetes-1-19-features branch from 4fd0b37 to f188ed2 Compare July 9, 2020 09:56
@pohly pohly changed the title WIP: storage: storage capacity, generic ephemeral volumes storage: CSIStorageCapacity Jul 9, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 9, 2020
@pohly
Copy link
Contributor Author

pohly commented Jul 9, 2020

Pinging:

@pohly pohly force-pushed the kubernetes-1-19-features branch from f188ed2 to a76ae73 Compare July 9, 2020 10:16
- pohly
title: Ephemeral Volumes
content_type: concept
weight: 20
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to check the weight....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and remove this file! It's supposed to be merged via #22438

@pohly pohly force-pushed the kubernetes-1-19-features branch from a76ae73 to 247d9ac Compare July 13, 2020 12:12
Copy link
Contributor

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a quick look. I wonder if this actually belongs inside https://kubernetes.io/docs/concepts/scheduling-eviction/ (with lots of inbound signposting from the storage concepts section).

@pohly what do you think?

I will also try to find time to take a longer look at this PR.

@sftim
Copy link
Contributor

sftim commented Jul 13, 2020

/sig storage

@k8s-ci-robot k8s-ci-robot added the sig/storage Categorizes an issue or PR as relevant to SIG Storage. label Jul 13, 2020
@pohly
Copy link
Contributor Author

pohly commented Jul 13, 2020

I had a quick look. I wonder if this actually belongs inside https://kubernetes.io/docs/concepts/scheduling-eviction/ (with lots of inbound signposting from the storage concepts section).

@pohly what do you think?

I'm undecided. We could have used storage topology support as guiding example, but I couldn't find any documentation for that. Also related are node-specific volume limits (https://kubernetes.io/docs/concepts/storage/storage-limits/), which is documented under storage although it is a scheduler feature.

Following that example and because the implementation is owned by SIG-Storage, I'd prefer to keep storage capacity under concepts/storage. Just my 2 cents, I'm also fine with moving it.

@sftim
Copy link
Contributor

sftim commented Jul 13, 2020

OK, makes sense. Please consider signposting to this from any relevant pages about scheduling!

referenced in a Pod via a `PersistentVolumeClaim` object.
A `csi` volume can be used in a pod in three different ways:
- through a reference to a [`persistentVolumeClaim`](#persistentvolumeclaim)
- with a [generic ephemeral volume](/docs/concepts/storage/ephemeral-volumes/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a chunk that should be in generic ephemeral volume PR. And generic ephemeral volumes are not limited to CSI, in-tree volumes can be used too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, right. Removed.

Comment on lines 19 to 20
This page describes how Kubernetes keeps track of storage capacity and
how the scheduler uses that information to schedule pods.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I miss some "why" - why does the scheduler need to use the capacity information? To provision volumes on nodes (or topology segment) that actually have some free space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added ".... scheduler uses that information to schedule pods onto nodes
that have access to enough storage capacity for the remaining missing
volumes. Without storage capacity tracking, it is random whether the selected
node can run the Pod and multiple scheduling retries may be needed."

Comment on lines 83 to 84
includes the node. Without storage capacity tracking, nodes are picked
without this check.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please describe consequence of missing this check - like pods can be scheduled to nodes that do not have any free space for dynamic provisioning of a new volume, resulting in the pod not running.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After extending the introduction, this became a bit redundant. I just removed the "Without..." part here entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

@pohly pohly force-pushed the kubernetes-1-19-features branch from 247d9ac to 7953775 Compare July 13, 2020 14:26
@pohly pohly mentioned this pull request Jul 14, 2020
how the scheduler uses that information to schedule Pods onto nodes
that have access to enough storage capacity for the remaining missing
volumes. Without storage capacity tracking, it is random whether the
selected node has enough storage for the Pod and multiple scheduling retries may be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it is random? Could this be reworded.
Without storage capacity tracking, the way a node is selected ...?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

Without storage capacity tracking, the scheduler may choose a node that doesn't have enough capacity to provision a volume.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

another volume. Manual intervention is necessary to recover from this,
for example by increasing capacity or deleting the volume that was
already created. [Further
work](https://github.com/kubernetes/enhancements/pull/1703) is needed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to list a link to a pull request? This link is not stable.
Could you list this issue in the release notes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the KEP hasn't even be merged as provisional yet, all we have is this link to the proposal.

I'll make sure to list it in the release notes.


- For more information on the design, see the
[Storage Capacity Constraints for Pod Scheduling KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1472-storage-capacity-tracking/README.md).
- For more information on further development of this feature, see the [enhancement tracking issue #1472](https://github.com/kubernetes/enhancements/issues/1472).
Copy link
Contributor

@kbhawkey kbhawkey Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a concept page, linking to the enhancement tracking issue does not add much.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I beg to differ here. For alpha features, one key question for users is when the feature will graduate or how it will change during future development. The enhancement issue is where they can find answers to those questions.

I can remove it in a follow-up PR if this argument is convincing, but let's merge this PR first, okay?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

needed.

Tracking storage capacity is supported for [{{< glossary_tooltip
text="Container Storage Interface" term_id="csi" >}} (CSI) drivers](https://kubernetes-csi.github.io/docs/drivers.html) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Did the link work out as expected? The tooltip is working.
I clicked on CSI and was forwarded to https://deploy-preview-21634--kubernetes-io-vnext-staging.netlify.app/docs/concepts/storage/volumes/#csi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it looks like the tooltip overrides the link destination. I've dropped the link. We can add it back under "enabling" once that table has information about drivers that support the feature (assuming that we consider it worth calling out there).

## API

There are two API extensions for this feature:
- [CSIStorageCapacity](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#csistoragecapacity-v1alpha1-storage-k8s-io) objects:
Copy link
Contributor

@kbhawkey kbhawkey Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume these resource links will work once v1.19 is released.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.

@kbhawkey
Copy link
Contributor

Thanks @pohly . Noted a few questions. Generally, looks good.

@annajung
Copy link
Contributor

Hi @pohly 👋 1.19 docs shadow here, a friendly reminder that docs deadline is tomorrow. Pinging you to address the review suggestions above so we can get approvals from both tech/docs to merged in! thank you

@pohly
Copy link
Contributor Author

pohly commented Jul 15, 2020

@msau42, @kbhawkey: I pushed an update, please take another look.

@pohly pohly force-pushed the kubernetes-1-19-features branch from 05e9530 to 7f06adc Compare July 15, 2020 16:57
@msau42
Copy link
Member

msau42 commented Jul 15, 2020

/lgtm
from tech

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2020
@kbhawkey
Copy link
Contributor

@pohly , Would you rebase? I am looking again.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 16, 2020
This is the initial documentation for one new feature:
- kubernetes/enhancements#1472
Co-authored-by: Tim Bannister <tim@scalefactory.com>
@pohly pohly force-pushed the kubernetes-1-19-features branch from 7f06adc to 19b8f84 Compare July 16, 2020 06:52
@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 16, 2020
@pohly
Copy link
Contributor Author

pohly commented Jul 16, 2020

@kbhawkey: rebased.

@kbhawkey
Copy link
Contributor

@pohly , Thanks!
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 16, 2020
@kbhawkey
Copy link
Contributor

@savitharaghunathan , Ready to merge? Thanks!

Copy link
Member

@savitharaghunathan savitharaghunathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: savitharaghunathan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 16, 2020
@k8s-ci-robot k8s-ci-robot merged commit c728544 into kubernetes:dev-1.19 Jul 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants