storage: CSIStorageCapacity
This is the initial documentation for one new feature:
- kubernetes/enhancements#1472
pohly committed Jul 9, 2020
1 parent 38a5d01 commit f188ed2
Showing 2 changed files with 110 additions and 0 deletions.
108 changes: 108 additions & 0 deletions content/en/docs/concepts/storage/storage-capacity.md
@@ -0,0 +1,108 @@
---
reviewers:
- jsafrane
- saad-ali
- msau42
- xing-yang
- pohly
title: Storage Capacity
content_type: concept
---

<!-- overview -->

Storage capacity is limited and may vary depending on the node on
which a pod runs: network-attached storage might not be accessible by
all nodes, or storage is local to a node to begin with.

This page describes how Kubernetes keeps track of storage capacity and
how the scheduler uses that information to schedule pods.

<!-- body -->


## Enabling the feature

Storage capacity tracking is an *alpha feature* and is only active when
the `CSIStorageCapacity` feature gate is enabled. To quickly check
whether a Kubernetes cluster supports the feature, list
`CSIStorageCapacity` objects with:
```shell
kubectl get csistoragecapacities --all-namespaces
```

If supported, the response is either a list of objects or:
```
No resources found
```

If not supported, this error is printed instead:
```
error: the server doesn't have a resource type "csistoragecapacities"
```

In addition to enabling the feature in the cluster, a CSI driver
deployment also has to support it. Refer to the driver's
documentation for details. Without this support, the driver publishes
no storage capacity information, and the scheduler schedules pods
with volumes provided by that driver without taking capacity into
account.

## API

There are two API extensions for this feature:
- [`CSIStorageCapacity` objects](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#csistoragecapacity-v1alpha1-storage-k8s-io): these
  are produced by a CSI driver in the namespace where the driver is
  installed. Each object contains capacity information for one storage
  class and defines which nodes have access to that storage.
- [The `CSIDriverSpec.StorageCapacity` field](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#csidriverspec-v1-storage-k8s-io): when
  set to `true`, the Kubernetes scheduler will consider storage
  capacity for volumes that use the CSI driver.

## Scheduling

Storage capacity information is used by the Kubernetes scheduler if:
- the `CSIStorageCapacity` feature gate is enabled,
- a pod uses a volume that has not been created yet,
- that volume uses a storage class which references a CSI driver and
  uses [`WaitForFirstConsumer` volume binding
  mode](/docs/concepts/storage/storage-classes/#volume-binding-mode),
  and
- the `CSIDriver` object for the driver has `StorageCapacity` set to
  `true`.
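
The conditions above can be read as a single predicate that must hold
for each volume. The following Python sketch is illustrative only: the
`VolumeInfo` type, its field names, and the function name are
hypothetical and are not part of the Kubernetes API or the scheduler
code.

```python
from dataclasses import dataclass


@dataclass
class VolumeInfo:
    """Hypothetical summary of one volume used by a pod (not a Kubernetes type)."""
    already_created: bool          # True if the volume already exists
    uses_csi_driver: bool          # the storage class references a CSI driver
    binding_mode: str              # "WaitForFirstConsumer" or "Immediate"
    driver_storage_capacity: bool  # the CSIDriver object has StorageCapacity set to true


def scheduler_checks_capacity(feature_gate_enabled: bool, vol: VolumeInfo) -> bool:
    """Return True only if every condition from the list above holds."""
    return (
        feature_gate_enabled
        and not vol.already_created
        and vol.uses_csi_driver
        and vol.binding_mode == "WaitForFirstConsumer"
        and vol.driver_storage_capacity
    )
```

If any one of the conditions fails, for example because the volume uses
`Immediate` binding, the predicate is false and capacity information is
not consulted for that volume.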

In that case, the scheduler only considers nodes for the pod which
have enough storage available to them. This check is currently very
simplistic and only compares the size of the volume against the
capacity listed in `CSIStorageCapacity` objects with a topology that
includes the node. Without storage capacity tracking, nodes are picked
without this check.
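
As a rough illustration of that comparison, the sketch below filters
nodes by capacity. It is a simplified model under stated assumptions,
not the scheduler's implementation: the `Capacity` class is a
hypothetical stand-in for a `CSIStorageCapacity` object, topology is
reduced to exact label matches, and sizes are plain byte counts.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Capacity:
    """Simplified, hypothetical stand-in for a CSIStorageCapacity object."""
    storage_class: str
    topology: Dict[str, str]  # labels a node must carry (simplified label selector)
    capacity_bytes: int


def topology_includes(cap: Capacity, node_labels: Dict[str, str]) -> bool:
    """A node is in the topology if it carries every label listed in it."""
    return all(node_labels.get(key) == value for key, value in cap.topology.items())


def node_has_capacity(
    node_labels: Dict[str, str],
    storage_class: str,
    size_bytes: int,
    capacities: List[Capacity],
) -> bool:
    """Compare the volume size against each capacity object that covers the node."""
    return any(
        cap.storage_class == storage_class
        and topology_includes(cap, node_labels)
        and cap.capacity_bytes >= size_bytes
        for cap in capacities
    )
```

A node passes the filter as soon as one capacity object for the right
storage class covers it and reports at least the requested size.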

For volumes with `Immediate` volume binding mode, the storage driver
decides where to create the volume, independently of pods that will
use the volume. The scheduler then schedules pods onto nodes where the
volume is available after the volume has been created.

For [CSI ephemeral volumes](/docs/concepts/storage/volumes/#csi),
scheduling always happens without considering storage capacity. This
is based on the assumption that this volume type is only used by
special CSI drivers which are local to a node and do not need
significant resources there.

## Rescheduling

When a node has been selected for a pod with `WaitForFirstConsumer`
volumes, that decision is still tentative. Next, the CSI storage driver
is asked to create the volume, with a hint that the volume is supposed
to be available on the selected node.

Because Kubernetes might have chosen a node based on outdated
capacity information, it is possible that the volume cannot actually
be created. The node selection is then reset and the Kubernetes
scheduler tries again to find a node for the pod.
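
The select, create, and retry cycle can be sketched as follows. This is
a deliberately simplified model: all names are hypothetical,
`try_create_volume` stands in for the request to the CSI driver, and
the sketch moves on to the next candidate after a failure, whereas the
real scheduler runs a full scheduling attempt again.

```python
from typing import Callable, Dict, Iterable, Optional


def schedule_with_retry(
    candidate_nodes: Iterable[str],
    tracked_capacity: Dict[str, int],
    size: int,
    try_create_volume: Callable[[str, int], bool],
) -> Optional[str]:
    """Pick a node whose tracked capacity looks sufficient; retry if creation fails."""
    for node in candidate_nodes:
        if tracked_capacity.get(node, 0) < size:
            continue  # filtered out by the (possibly outdated) capacity information
        if try_create_volume(node, size):
            return node  # volume created, so the node selection becomes final
        # Creation failed: the tentative selection is reset and another node is tried.
    return None
```

Here `tracked_capacity` plays the role of the published capacity
information, which may lag behind what the driver can really provide.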

## {{% heading "whatsnext" %}}

- For more information on the design, see the
[Storage Capacity Constraints for Pod Scheduling KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1472-storage-capacity-tracking/README.md).
- For more information on further development of this feature, see the [enhancement tracking issue #1472](https://github.com/kubernetes/enhancements/issues/1472).

@@ -76,6 +76,7 @@ different Kubernetes components.
| `CSIMigrationGCEComplete` | `false` | Alpha | 1.17 | |
| `CSIMigrationOpenStack` | `false` | Alpha | 1.14 | |
| `CSIMigrationOpenStackComplete` | `false` | Alpha | 1.17 | |
| `CSIStorageCapacity` | `false` | Alpha | 1.19 | |
| `ConfigurableFSGroupPolicy` | `false` | Alpha | 1.18 | |
| `CustomCPUCFSQuotaPeriod` | `false` | Alpha | 1.12 | |
| `CustomResourceDefaulting` | `false` | Alpha| 1.15 | 1.15 |
@@ -388,6 +389,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `CSIPersistentVolume`: Enable discovering and mounting volumes provisioned through a
[CSI (Container Storage Interface)](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/container-storage-interface.md)
compatible volume plugin.
Check the [`csi` volume type](/docs/concepts/storage/volumes/#csi) documentation for more details.
- `CSIStorageCapacity`: Enables CSI drivers to publish storage capacity information and the Kubernetes scheduler to use that information when scheduling pods. See [Storage Capacity](/docs/concepts/storage/storage-capacity/).
- `CustomCPUCFSQuotaPeriod`: Enable nodes to change CPUCFSQuotaPeriod.
- `CustomPodDNS`: Enable customizing the DNS settings for a Pod using its `dnsConfig` property.