diff --git a/content/en/docs/concepts/storage/ephemeral-volumes.md b/content/en/docs/concepts/storage/ephemeral-volumes.md new file mode 100644 index 0000000000000..86b4082f9ada2 --- /dev/null +++ b/content/en/docs/concepts/storage/ephemeral-volumes.md @@ -0,0 +1,254 @@ +--- +reviewers: +- jsafrane +- saad-ali +- msau42 +- xing-yang +- pohly +title: Ephemeral Volumes +content_type: concept +weight: 20 +--- + + + +This document describes the current state of _ephemeral volumes_ in Kubernetes. Familiarity with [volumes](/docs/concepts/storage/volumes/) is suggested. + + + +## Introduction + +Some applications need additional storage but don't care whether that +data is stored persistently across restarts. For example, caching +services are often limited by memory size and can move infrequently +used data into storage that is slower than memory with little impact +on overall performance. + +Other applications expect some read-only input data to be present in +files, like configuration data or secret keys. + +_Ephemeral volumes_ are designed for these use cases. Because volumes +are created anew for each pod, pods can be stopped and restarted +without being limited to where some persistent volume is available. + +Ephemeral volumes are specified _inline_ in the pod spec, which +simplifies application deployment and management.
+ +Kubernetes supports several different kinds of ephemeral volumes for +different purposes: +- [emptyDir](/docs/concepts/storage/volumes/#emptydir): a directory on the root disk or + a tmpfs +- [configMap](/docs/concepts/storage/volumes/#configmap), + [downwardAPI](/docs/concepts/storage/volumes/#downwardapi), + [secret](/docs/concepts/storage/volumes/#secret): inject different + kinds of Kubernetes data into a pod +- [CSI ephemeral + volumes](/docs/concepts/storage/volumes/#csi-ephemeral-volumes): + similar to the previous volume kinds, but provided by special [CSI + drivers](https://github.com/container-storage-interface/spec/blob/master/spec.md) + which specifically [support this feature](https://kubernetes-csi.github.io/docs/drivers.html) +- _generic ephemeral volumes_ (described [below](#generic-ephemeral-volumes)): + can be provided by all storage drivers that also support persistent volumes + +`emptyDir`, `configMap`, `downwardAPI`, `secret` are provided as +[local ephemeral +storage](/docs/concepts/configuration/manage-resources-containers/#local-ephemeral-storage). +They are managed by kubelet on each node. + +CSI ephemeral volumes *must* be provided by third-party CSI storage +drivers. Generic ephemeral volumes *can* be provided by third-party +CSI storage drivers, but also by any other storage driver that +supports dynamic provisioning. These drivers can offer functionality +that Kubernetes itself does not support, for example storage with +different performance characteristics than the root disk that is +managed by kubelet, or injecting different data. + +### CSI ephemeral volumes + +{{< feature-state for_k8s_version="v1.16" state="beta" >}} + +This feature requires the CSIInlineVolume feature gate to be enabled. It +is enabled by default starting with Kubernetes 1.16. + +CSI ephemeral volumes are only supported by a subset of CSI +drivers. Please see [this +list](https://kubernetes-csi.github.io/docs/drivers.html).
+ +Conceptually, CSI ephemeral volumes are similar to `configMap`, +`downwardAPI` and `secret`: they are managed locally on each node and +get created together with other local resources after a pod has been +scheduled onto a node. At this stage, Kubernetes can no longer +reschedule the pod, so volume creation has to be unlikely to fail; +otherwise pod startup gets stuck. In particular, [storage capacity +aware pod scheduling](/docs/concepts/storage/storage-capacity/) is *not* +supported for these volumes. They are currently also not covered by +the storage resource usage limits of a pod, because that is something +that kubelet can only enforce for storage that it manages itself. + + +Example: + +```yaml +kind: Pod +apiVersion: v1 +metadata: + name: my-csi-app +spec: + containers: + - name: my-frontend + image: busybox + volumeMounts: + - mountPath: "/data" + name: my-csi-inline-vol + command: [ "sleep", "1000000" ] + volumes: + - name: my-csi-inline-vol + csi: + driver: inline.storage.kubernetes.io + volumeAttributes: + foo: bar +``` + +The `volumeAttributes` determine what volume is prepared by the +driver. These attributes are specific to each driver and not +standardized. See the documentation of each CSI driver for further +instructions. + +Cluster administrators can control which CSI drivers can be used in a +pod via the [Pod Security +Policy](/docs/concepts/policy/pod-security-policy/) with the +[`PodSecurityPolicySpec.allowedCSIDrivers` field](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#podsecuritypolicyspec-v1beta1-policy). + +### Generic ephemeral volumes + +{{< feature-state for_k8s_version="v1.19" state="alpha" >}} + +This feature requires the GenericEphemeralVolume feature gate to be +enabled. Because this is an alpha feature, it is disabled by default. + +Generic ephemeral volumes are similar to `emptyDir` volumes, just more +flexible: +- Storage can be local or network-attached.
+- Volumes can have a fixed size that pods are not able to exceed. +- Volumes may have some initial data, depending on the driver and + parameters. +- All of the normal volume operations + ([snapshotting](/docs/concepts/storage/volume-snapshots/), + [cloning](/docs/concepts/storage/volume-pvc-datasource/), + [resizing](/docs/concepts/storage/persistent-volumes/#expanding-persistent-volumes-claims), + [storage capacity tracking](/docs/concepts/storage/storage-capacity/), etc.) + are supported. + +Example: + +```yaml +kind: Pod +apiVersion: v1 +metadata: + name: my-app +spec: + containers: + - name: my-frontend + image: busybox + volumeMounts: + - mountPath: "/scratch" + name: scratch-volume + command: [ "sleep", "1000000" ] + volumes: + - name: scratch-volume + ephemeral: + volumeClaimTemplate: + metadata: + labels: + type: my-frontend-volume + spec: + accessModes: [ "ReadWriteOnce" ] + storageClassName: "scratch-storage-class" + resources: + requests: + storage: 1Gi +``` + +### Lifecycle and PersistentVolumeClaim + +The key design idea is that the [parameters for a +PersistentVolumeClaim](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#ephemeralvolumesource-v1alpha1-core) +are allowed inside a volume source of the pod. Labels, annotations and +the full PersistentVolumeClaimSpec are supported. When such a pod gets +created, a new controller then creates an actual PersistentVolumeClaim +object in the same namespace as the pod. + +That triggers volume binding and/or provisioning, either immediately if +the storage class uses immediate volume binding or when the pod is +tentatively scheduled onto a node (`WaitForFirstConsumer` volume +binding mode). The latter is recommended for generic ephemeral volumes +because then the pod scheduler is free to choose a suitable node for +the pod. With immediate binding, it is forced to use a node that has +access to the volume once it is available. + +These additional PVCs are owned by the pod.
When the pod gets deleted, +the Kubernetes garbage collector deletes the PVC, which then usually +triggers deletion of the volume because the default reclaim policy of +storage classes is to delete volumes. If for some reason an ephemeral +volume is not meant to be deleted, a storage class with `Retain` as +its reclaim policy can be used. + +Once these PVCs exist, they can be used like any other PVC. In +particular, they can be referenced as a data source in volume cloning or +snapshotting. The PVC object also holds the current status of the +volume. + +### PVC Naming + +Naming of the additional PVCs is currently deterministic: the name is +a combination of pod name and volume name, with a hyphen (`-`) in the +middle. In the example above, the PVC name will be +`my-app-scratch-volume`. This deterministic naming makes it easier to +interact with the PVC because one does not have to search for it once +the pod name and volume name are known. + +However, it also introduces a potential conflict between different +pods (a pod "pod-a" with volume "scratch" and another pod with name +"pod" and volume "a-scratch" both end up with the same PVC name +"pod-a-scratch") and between pods and manually created PVCs. + +Such conflicts are detected: a PVC is only used for an ephemeral +volume if it was created for the pod. This check is based on the +ownership relationship. An existing PVC is not overwritten or +modified. But this does not resolve the conflict because without the +right PVC, the pod cannot start. + +Therefore care must be taken when naming pods and volumes inside the +same namespace such that these conflicts cannot occur. + +### Security + +Enabling the GenericEphemeralVolume feature allows users to create +PVCs indirectly if they can create pods, even if they do not have +permission to create them directly. Cluster administrators must be +aware of this.
If this does not fit their security model, they have +two choices: +- Explicitly disable the feature through the feature gate, to avoid + being surprised when some future Kubernetes version enables it + by default. +- Use a [Pod Security + Policy](/docs/concepts/policy/pod-security-policy/) where the + `volumes` list does not contain the `ephemeral` volume type. + +The normal namespace quota for PVCs in a namespace still applies, so +even if users are allowed to use this new mechanism, they cannot use +it to circumvent other policies. + +## {{% heading "whatsnext" %}} + +### CSI ephemeral volumes + +- For more information on the design, see the [Ephemeral Inline CSI + volumes KEP](https://github.com/kubernetes/enhancements/blob/ad6021b3d61a49040a3f835e12c8bb5424db2bbb/keps/sig-storage/20190122-csi-inline-volumes.md). +- For more information on further development of this feature, see the [enhancement tracking issue #596](https://github.com/kubernetes/enhancements/issues/596). + +### Generic ephemeral volumes + +- For more information on the design, see the +[Generic ephemeral inline volumes KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1698-generic-ephemeral-volumes/README.md). +- For more information on further development of this feature, see the [enhancement tracking issue #1698](https://github.com/kubernetes/enhancements/issues/1698). diff --git a/content/en/docs/concepts/storage/volumes.md b/content/en/docs/concepts/storage/volumes.md index b7bcb370ad23e..ee6ce297aa564 100644 --- a/content/en/docs/concepts/storage/volumes.md +++ b/content/en/docs/concepts/storage/volumes.md @@ -1291,8 +1291,11 @@ Once a CSI compatible volume driver is deployed on a Kubernetes cluster, users may use the `csi` volume type to attach, mount, etc. the volumes exposed by the CSI driver. -The `csi` volume type does not support direct reference from Pod and may only be -referenced in a Pod via a `PersistentVolumeClaim` object. 
+A `csi` volume can be used in a pod in three different ways: +- through a reference to a [`persistentVolumeClaim`](#persistentvolumeclaim) +- with a [generic ephemeral volume](/docs/concepts/storage/ephemeral-volumes/#generic-ephemeral-volumes) +- with a [CSI ephemeral volume](/docs/concepts/storage/ephemeral-volumes/#csi-ephemeral-volumes) if the driver + supports that The following fields are available to storage administrators to configure a CSI persistent volume: @@ -1355,37 +1358,9 @@ as usual, without any CSI specific changes. {{< feature-state for_k8s_version="v1.16" state="beta" >}} -This feature allows CSI volumes to be directly embedded in the Pod specification instead of a PersistentVolume. Volumes specified in this way are ephemeral and do not persist across Pod restarts. +This feature allows CSI volumes to be directly embedded in the Pod specification instead of a PersistentVolume. Volumes specified in this way are ephemeral and do not persist across Pod restarts. See [the ephemeral volume page](/docs/concepts/storage/ephemeral-volumes/#csi-ephemeral-volumes) for more information. -Example: - -```yaml -kind: Pod -apiVersion: v1 -metadata: - name: my-csi-app -spec: - containers: - - name: my-frontend - image: busybox - volumeMounts: - - mountPath: "/data" - name: my-csi-inline-vol - command: [ "sleep", "1000000" ] - volumes: - - name: my-csi-inline-vol - csi: - driver: inline.storage.kubernetes.io - volumeAttributes: - foo: bar -``` - -This feature requires CSIInlineVolume feature gate to be enabled. It -is enabled by default starting with Kubernetes 1.16. - -CSI ephemeral volumes are only supported by a subset of CSI drivers. Please see the list of CSI drivers [here](https://kubernetes-csi.github.io/docs/drivers.html).
- -# Developer resources +#### Developer resources For more information on how to develop a CSI driver, refer to the [kubernetes-csi documentation](https://kubernetes-csi.github.io/docs/) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 5422fb92025e8..3bfa6bc6aa33a 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -101,6 +101,7 @@ different Kubernetes components. | `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | | | `EvenPodsSpread` | `false` | Alpha | 1.16 | 1.17 | | `EvenPodsSpread` | `true` | Beta | 1.18 | | +| `GenericEphemeralVolume` | `false` | Alpha | 1.19 | | | `HPAScaleToZero` | `false` | Alpha | 1.16 | | | `HugePageStorageMediumSize` | `false` | Alpha | 1.18 | 1.18 | | `HugePageStorageMediumSize` | `true` | Beta | 1.19 | | @@ -431,6 +432,7 @@ Each feature gate is designed for enabling/disabling a specific feature: use EndpointSlices as the primary data source instead of Endpoints, enabling scalability and performance improvements. See [Enabling Endpoint Slices](/docs/tasks/administer-cluster/enabling-endpointslices/). - `GCERegionalPersistentDisk`: Enable the regional PD feature on GCE. +- `GenericEphemeralVolume`: Enables ephemeral, inline volumes that support all features of normal volumes (can be provided by third-party storage vendors, storage capacity tracking, restore from snapshot, etc.). See [Ephemeral Volumes](/docs/concepts/storage/ephemeral-volumes/). - `HugePages`: Enable the allocation and consumption of pre-allocated [huge pages](/docs/tasks/manage-hugepages/scheduling-hugepages/). - `HugePageStorageMediumSize`: Enable support for multiple sizes pre-allocated [huge pages](/docs/tasks/manage-hugepages/scheduling-hugepages/). 
- `HyperVContainer`: Enable [Hyper-V isolation](https://docs.microsoft.com/en-us/virtualization/windowscontainers/manage-containers/hyperv-container) for Windows containers.