From 4bd8edc635a76c6dc5b006a073382da83339f9c7 Mon Sep 17 00:00:00 2001 From: Patrick Ohly Date: Fri, 13 Dec 2019 11:02:14 +0100 Subject: [PATCH 1/3] csi inline volume size: initial draft --- .../20191213-csi-inline-volume-size.md | 175 ++++++++++++++++++ 1 file changed, 175 insertions(+) create mode 100644 keps/sig-storage/20191213-csi-inline-volume-size.md diff --git a/keps/sig-storage/20191213-csi-inline-volume-size.md b/keps/sig-storage/20191213-csi-inline-volume-size.md new file mode 100644 index 00000000000..f407460e9a0 --- /dev/null +++ b/keps/sig-storage/20191213-csi-inline-volume-size.md @@ -0,0 +1,175 @@ +--- +title: Size Parameter for CSI Ephemeral Inline Volumes +authors: + - "@pohly" +owning-sig: sig-storage +reviewers: + - TBD +approvers: + - TBD +editor: "@pohly" +creation-date: 2019-12-13 +last-updated: 2019-12-13 +status: provisional +see-also: + - "https://github.com/kubernetes/enhancements/pull/1353" +--- + +# Size Parameter for CSI Ephemeral Inline Volumes + +## Table of Contents + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories](#user-stories) + - [Ephemeral local volumes](#ephemeral-local-volumes) + - [Capacity-aware scheduling of pods](#capacity-aware-scheduling-of-pods) + - [fsSize field](#fssize-field) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) + + +## Release Signoff Checklist + +- [ ] kubernetes/enhancements issue in release milestone, which links + to KEP (this should be a link to the KEP location in + kubernetes/enhancements, not the initial KEP PR) +- [ ] KEP approvers have set the KEP status to `implementable` +- [ ] Design details are appropriately documented +- [ ] Test plan is in place, giving consideration to SIG Architecture and SIG Testing input +- [ ] Graduation criteria is in place +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in + [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation e.g., additional design documents, + links to mailing list discussions/SIG meetings, relevant + PRs/issues, release notes + +## Summary + +The [CSIVolumeSource API in Kubernetes +1.16](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#csivolumesource-v1-core) +for [CSI ephemeral inline +volumes](https://kubernetes-csi.github.io/docs/ephemeral-local-volumes.html) +currently supports only two standardized parameters for the volume: +- `readOnly` +- `fsType` + +All other parameters for the volume must be passed as +`volumeAttributes` with key/value pairs that are driver-specific and +thus opaque to Kubernetes. This API was chosen because it directly +maps to parameters supported by the underlying +[NodePublishVolumeRequest](https://github.com/container-storage-interface/spec/blob/4731db0e0bc53238b93850f43ab05d9355df0fd9/lib/go/csi/csi.pb.go#L3604-L3652). + +One common questions that developers and users have is how the size of +the volume can be specified. The proposal is to introduce a +standardized `fsSize` parameter as answer to that and gradually +replace driver-specific parameters that may have been used before. + +## Motivation + +### Goals + +- Add a `CSIVolumeSource.fsSize` field. +- Pass it to drivers as additional entry in `NodePublishVolumeRequest.VolumeContext` + if the driver has opted into getting additional fields there. + +### Non-Goals + +- Validate whether a CSI driver really supports this field. This is + consistent with `readOnly` and `fsType` which also may be silently + ignored. + +## Proposal + +### User Stories + +#### Ephemeral local volumes + +Local volumes are useful as scratch +space. [PMEM-CSI](https://github.com/intel/pmem-csi) and +[TopoLVM](https://github.com/cybozu-go/topolvm/blob/master/README.md) +are two examples for CSI drivers which dynamically create volumes of a +fixed capacity and thus need a size parameter. + +### Capacity-aware scheduling of pods + +Scheduling pods with ephemeral inline volumes onto nodes [with +sufficient free storage +capacity](https://github.com/kubernetes/enhancements/pull/1353) +depends (among other information, like free capacity) on knowing the +size of the volumes. If the size is only specified via some +vendor-specific parameter, it's not available to the Kubernetes +scheduler. + +### fsSize field + +A new field `fsSize` of type `*Quantity` in +[CSIVolumeSource](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#csivolumesource-v1-core) +needs to be added, alongside the existing `fsType`. It must be a +pointer to distinguish between no size and zero size selected. + +The new field is under the `CSIInlineVolumeSize` feature +gate. + +While `fsType` can (and does get) mapped to +`NodePublishVolumeRequest.volume_capability`, for `fsSize` we need a +different approach for passing it to the CSI driver because there is +no pre-defined field for it in `NodePublishVolumeRequest`. We can +extend the [pod info on +mount](https://kubernetes-csi.github.io/docs/pod-info.html) feature: +if (and only if) the driver enables that, then a new +`csi.storage.k8s.io/size` entry in +`NodePublishVolumeRequest.publish_context` is set to the string +representation of the size quantity. An unset size is passed as empty +string. + +This has to be optional because CSI drivers written for 1.16 might do +strict validation of the `publish_context` content and reject volumes +with unknown fields. If the driver enables pod info, then new fields +in the `csi.storage.k8s.io` namespace are explicitly allowed. + +Using that new `fsSize` field must be optional. If a CSI driver +already accepts a size specification via some driver-specific +parameter, then specifying the size that way must continue to +work. But if the `fsSize` field is set, a CSI driver should use that +and treat it as an error when both `fsSize` and the driver-specific +parameter are set. + +Setting `fsSize` for a CSI driver which ignores the field is not an +error. This is similar to setting `fsType` which also may get ignored +silently. + +### Risks and Mitigations + +TBD + +## Design Details + +### Test Plan + +TBD + +### Upgrade / Downgrade Strategy + +TBD + +### Version Skew Strategy + +TBD + +## Implementation History + +## Drawbacks + +Why should this KEP _not_ be implemented - TBD. From f5cfb2d62a9b2c416c8796958f01308fa67871dc Mon Sep 17 00:00:00 2001 From: Patrick Ohly Date: Fri, 20 Dec 2019 17:58:45 +0100 Subject: [PATCH 2/3] inline volume size: publish_context -> volume_context This was found by Satoru Takeuchi and also fixed by him in https://github.com/kubernetes-csi/docs/pull/249. --- keps/sig-storage/20191213-csi-inline-volume-size.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/keps/sig-storage/20191213-csi-inline-volume-size.md b/keps/sig-storage/20191213-csi-inline-volume-size.md index f407460e9a0..7a2ba93b632 100644 --- a/keps/sig-storage/20191213-csi-inline-volume-size.md +++ b/keps/sig-storage/20191213-csi-inline-volume-size.md @@ -130,12 +130,12 @@ extend the [pod info on mount](https://kubernetes-csi.github.io/docs/pod-info.html) feature: if (and only if) the driver enables that, then a new `csi.storage.k8s.io/size` entry in -`NodePublishVolumeRequest.publish_context` is set to the string +`NodePublishVolumeRequest.volume_context` is set to the string representation of the size quantity. An unset size is passed as empty string. This has to be optional because CSI drivers written for 1.16 might do -strict validation of the `publish_context` content and reject volumes +strict validation of the `volume_context` content and reject volumes with unknown fields. If the driver enables pod info, then new fields in the `csi.storage.k8s.io` namespace are explicitly allowed. From 1a557c6de8d62f4d61b4f1e77226f2403de8b450 Mon Sep 17 00:00:00 2001 From: Patrick Ohly Date: Fri, 17 Jan 2020 11:13:57 +0100 Subject: [PATCH 3/3] inline volume size: move to implementable This adds the missing details (test plan, implementation history). Also clarifies what exactly the string representation of the size is when passed to the CSI driver. --- .../20191213-csi-inline-volume-size.md | 56 ++++++++++++------- 1 file changed, 36 insertions(+), 20 deletions(-) diff --git a/keps/sig-storage/20191213-csi-inline-volume-size.md b/keps/sig-storage/20191213-csi-inline-volume-size.md index 7a2ba93b632..04b32ba419f 100644 --- a/keps/sig-storage/20191213-csi-inline-volume-size.md +++ b/keps/sig-storage/20191213-csi-inline-volume-size.md @@ -4,13 +4,13 @@ authors: - "@pohly" owning-sig: sig-storage reviewers: - - TBD + - "@saad-ali" approvers: - - TBD + - "@saad-ali" editor: "@pohly" creation-date: 2019-12-13 last-updated: 2019-12-13 -status: provisional +status: implementable see-also: - "https://github.com/kubernetes/enhancements/pull/1353" --- @@ -30,13 +30,13 @@ see-also: - [Ephemeral local volumes](#ephemeral-local-volumes) - [Capacity-aware scheduling of pods](#capacity-aware-scheduling-of-pods) - [fsSize field](#fssize-field) - - [Risks and Mitigations](#risks-and-mitigations) - [Design Details](#design-details) - [Test Plan](#test-plan) + - [Graduation Criteria](#graduation-criteria) + - [Alpha -> Beta Graduation](#alpha---beta-graduation) + - [Beta -> GA Graduation](#beta---ga-graduation) - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) - - [Version Skew Strategy](#version-skew-strategy) - [Implementation History](#implementation-history) -- [Drawbacks](#drawbacks) ## Release Signoff Checklist @@ -131,8 +131,9 @@ mount](https://kubernetes-csi.github.io/docs/pod-info.html) feature: if (and only if) the driver enables that, then a new `csi.storage.k8s.io/size` entry in `NodePublishVolumeRequest.volume_context` is set to the string -representation of the size quantity. An unset size is passed as empty -string. +representation of the size quantity as decimal number, i.e. CSI drivers +do not need to import Kubernetes code to parse it. +An unset size is passed as empty string. This has to be optional because CSI drivers written for 1.16 might do strict validation of the `volume_context` content and reject volumes @@ -150,26 +151,41 @@ Setting `fsSize` for a CSI driver which ignores the field is not an error. This is similar to setting `fsType` which also may get ignored silently. -### Risks and Mitigations - -TBD - ## Design Details ### Test Plan -TBD +The [mount +tests](https://github.com/kubernetes/kubernetes/blob/05b6c32886cf522550928de30287806d53e1b293/pkg/volume/csi/csi_mounter_test.go#L97) +will be extended to cover different scenarios: +- no `fsSize` set +- `fsSize` set +- both of these cases with and without pod info enabled -### Upgrade / Downgrade Strategy +### Graduation Criteria -TBD +##### Alpha -> Beta Graduation -### Version Skew Strategy +- Gather feedback from developers +- Full implementation of `fsSize` to `csi.storage.k8s.io/size` conversion +- Tests are in Testgrid and linked in KEP -TBD +##### Beta -> GA Graduation -## Implementation History +- 3 CSI drivers which use the field +- Allowing time for feedback (two releases in beta) -## Drawbacks +### Upgrade / Downgrade Strategy + +Upgrading with an existing pod spec that uses a vendor-specific size +parameter is handled because the CSI driver must continue to check for +that parameter. Downgrading in this scenario also works. + +When actually using the new `fsSize` parameter, downgrading or +disabling `CSIInlineVolumeSize` will result in not passing the +`csi.storage.k8s.io/size` parameter. A CSI driver can treat this as an +error instead of falling back to some default volume size. + +## Implementation History -Why should this KEP _not_ be implemented - TBD. +- Kubernetes 1.18: alpha (tentative)