diff --git a/keps/prod-readiness/sig-storage/3636.yaml b/keps/prod-readiness/sig-storage/3636.yaml new file mode 100644 index 00000000000..004bae34537 --- /dev/null +++ b/keps/prod-readiness/sig-storage/3636.yaml @@ -0,0 +1,5 @@ +kep-number: 3636 +alpha: + approver: "@deads2k" +beta: + approver: "@deads2k" diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/README.md b/keps/sig-windows/3636-windows-csi-host-process-pods/README.md new file mode 100644 index 00000000000..5471a671efb --- /dev/null +++ b/keps/sig-windows/3636-windows-csi-host-process-pods/README.md @@ -0,0 +1,1248 @@ + +# KEP-3636: CSI Drivers in Windows as HostProcess Pods + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) + - [Glossary](#glossary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [Notes/Constraints/Caveats](#notesconstraintscaveats) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Prerequisite: Make CSI Proxy an embedded library without a server component](#prerequisite-make-csi-proxy-an-embedded-library-without-a-server-component) + - [Implementation idea 1: Update the conversion layer to use the server code gRPC](#implementation-idea-1-update-the-conversion-layer-to-use-the-server-code-grpc) + - [Implementation idea 2: Update the CSI Drivers to use the server code directly (preferred)](#implementation-idea-2-update-the-csi-drivers-to-use-the-server-code-directly-preferred) + - [Implementation idea 3: Convert CSI Proxy to a Library of Functions](#implementation-idea-3-convert-csi-proxy-to-a-library-of-functions) + - [Comparison Matrix](#comparison-matrix) + - [Maintenance of the new model and existing client/server model of CSI Proxy](#maintenance-of-the-new-model-and-existing-clientserver-model-of-csi-proxy) + - [Security analysis](#security-analysis) + - [Test Plan](#test-plan) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+
+- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
+- [ ] (R) KEP approvers have approved the KEP status as `implementable`
+- [ ] (R) Design details are appropriately documented
+- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
+  - [ ] e2e Tests for all Beta API Operations (endpoints)
+  - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+  - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
+- [ ] (R) Graduation criteria is in place
+  - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
+- [ ] (R) Production readiness review completed
+- [ ] (R) Production readiness review approved
+- [ ] "Implementation History" section is up-to-date for milestone
+- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
+- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+
+[kubernetes.io]: https://kubernetes.io/
+[kubernetes/enhancements]: https://git.k8s.io/enhancements
+[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
+[kubernetes/website]: https://git.k8s.io/website
+
+## Summary
+
+CSI enables third-party storage providers to write and deploy plugins without the need
+to alter the core Kubernetes codebase.
+
+A CSI Driver in Kubernetes has two main components: a controller plugin which runs in
+the control plane and a node plugin which runs on every node.
+
+The node component of a CSI Driver requires direct access to the host for making block devices and/or filesystems
+available to the kubelet; CSI Drivers use the [mkfs(8)](https://man7.org/linux/man-pages/man8/mkfs.8.html) and
+[mount(8)](https://man7.org/linux/man-pages/man8/mount.8.html) commands to format and mount filesystems.
+CSI Drivers running on Windows nodes can't execute the equivalent Windows commands because Windows containers
+couldn't run privileged operations. To work around this issue, a proxy binary called [CSI Proxy was
+introduced](https://kubernetes.io/blog/2020/04/03/kubernetes-1-18-feature-windows-csi-support-alpha/) to which CSI
+Drivers relay the execution of privileged storage operations. CSI Drivers connect to the gRPC API that CSI Proxy
+exposes through named pipes on the host and invoke its services, which run privileged PowerShell commands to format
+and mount filesystems on behalf of the CSI Driver. [CSI Proxy
+became GA in Kubernetes 1.22](https://kubernetes.io/blog/2021/08/09/csi-windows-support-with-csi-proxy-reaches-ga/).
+
+At around the same time, SIG Windows introduced [HostProcess containers](https://kubernetes.io/blog/2021/08/16/windows-hostprocess-containers/).
+This feature enables running a container as a process on the host (hence the name).
+With it, CSI Drivers can directly perform the same privileged storage operations
+that CSI Proxy performs today. This KEP explains the process to transition CSI Drivers to become HostProcess containers.
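+
+To make the current model concrete, the snippet below shows roughly how the Windows node DaemonSet of a CSI Driver
+mounts the CSI Proxy named pipes today; the DaemonSet name, image, and the set of API Groups mounted are illustrative,
+while the pipe paths follow the `\\.\pipe\csi-proxy-<group>-<version>` naming convention used by CSI Proxy.
+
+```yaml
+# Sketch of the client/server model: the driver container reaches CSI Proxy
+# through named pipes exposed on the host and mounted as hostPath volumes.
+kind: DaemonSet
+apiVersion: apps/v1
+metadata:
+  name: csi-driver-windows            # illustrative name
+spec:
+  selector:
+    matchLabels:
+      app: csi-driver-windows
+  template:
+    metadata:
+      labels:
+        app: csi-driver-windows
+    spec:
+      nodeSelector:
+        kubernetes.io/os: windows
+      containers:
+        - name: csi-driver
+          image: example.com/csi-driver-windows:latest   # illustrative image
+          volumeMounts:
+            - name: csi-proxy-volume-v1
+              mountPath: \\.\pipe\csi-proxy-volume-v1
+            - name: csi-proxy-filesystem-v1
+              mountPath: \\.\pipe\csi-proxy-filesystem-v1
+      volumes:
+        - name: csi-proxy-volume-v1
+          hostPath:
+            path: \\.\pipe\csi-proxy-volume-v1
+            type: ""
+        - name: csi-proxy-filesystem-v1
+          hostPath:
+            path: \\.\pipe\csi-proxy-filesystem-v1
+            type: ""
+```
+
+With HostProcess containers this named-pipe plumbing goes away, since the driver process runs directly on the host
+and can execute the privileged operations itself.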
+
+### Glossary
+
+Reference for a few terms used throughout this document:
+
+* API Group - A grouping of APIs in CSI Proxy by their purpose. For example, the Volume API Group has API methods related to volume interaction. [There are 4 API Groups (Disk, Filesystem, Volume, SMB) in v1 status and 2 in v1alpha status](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/README.md#feature-status).
+* API Version - An API Group can have multiple versions, e.g. v1alpha1, v1beta1, v1, etc. [As of GA there are 4 API Groups (Disk, Filesystem, Volume, SMB) in v1 status and 2 API Groups (iSCSI, System) in v1alpha status](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/README.md#feature-status).
+* Conversion layer - Generated Go code in CSI Proxy that transforms client versioned requests to server "version agnostic" requests.
+* CSI Proxy client - The library that CSI Drivers and addons use to connect to the CSI Proxy server.
+* CSI Proxy server - The CSI Proxy binary running on the host node.
+
+## Motivation
+
+CSI Proxy enabled running CSI on Windows nodes using a client/server model. The server is the CSI Proxy binary
+running as a Windows service on the node, and the client is the CSI Driver, which communicates with CSI Proxy on
+every CSI call handled by the node plugin. While this model has worked fine, it has a few drawbacks:
+
+- Different deployment model than Linux - On Linux, privileged containers are used to perform the privileged storage
+  operations (format/mount). However, Windows containers aren't privileged. To work around the problem, the CSI Driver is run as a non-privileged container,
+  and privileged operations are relayed to CSI Proxy. In deployment manifests, the Windows component needs an
+  additional section to mount the named pipes exposed by CSI Proxy as a hostPath.
+- Additional component in the host to maintain - The cluster administrator needs to install and run CSI Proxy
+  during the node bootstrap. The cluster administrator also needs to think about the CSI Proxy upgrade workflow in addition to
+  upgrading the CSI Driver.
+- Difficult releases of bugfixes & features - After a bugfix, we create a new version of CSI Proxy to be
+  redeployed in the cluster. After a feature is merged, in addition to redeploying a new version of CSI Proxy,
+  the CSI Driver needs to be updated to a new version of the CSI Proxy client and connect to the new version of the named pipes.
+  This workflow is not as simple as the Linux counterpart, which only needs to update Go dependencies.
+- Multiple API versions to maintain - As part of the original design of CSI Proxy it was decided to create a new
+  protobuf version whenever there were breaking changes (like updates to the protobuf services & messages). This led
+  to having multiple versions of the API (v1alphaX, v1betaX, v1). In addition, to add a new feature we'd need
+  to create a new API version, e.g. v2alpha1 ([see this PR as an example of adding methods to the Volume API Group](https://github.com/kubernetes-csi/csi-proxy/pull/186)).
+
+In 1.22, SIG Windows introduced [HostProcess containers](https://kubernetes.io/blog/2021/08/16/windows-hostprocess-containers/)
+as an alternative way to run containers. HostProcess containers run directly on the host
+and behave like a regular process.
+
+Using HostProcess containers enables CSI Drivers to perform the privileged storage operations
+directly.
Most of the drawbacks of the client/server model are no longer present in the new model.
+
+### Goals
+
+- Identify the pros/cons of the different ways to transition CSI Drivers to become HostProcess containers - This
+  includes changes in dependent components like CSI Proxy, as well as defining the changes in the CSI Drivers.
+- Identify the security implications of running CSI Drivers as HostProcess containers - Like their Linux counterpart,
+  HostProcess containers need to have security policies in place limiting the scenarios in which they are enabled.
+  We provide an analysis of the security implications in this KEP.
+
+### Non-Goals
+
+- Improve the performance of CSI Drivers on Windows - There should be an improvement in performance by
+  removing the communication between the CSI Driver and CSI Proxy (the protobuf serialization/deserialization,
+  the gRPC call through named pipes). However, this improvement might not be noticeable, as most of the latency
+  comes from doing the format/mount operations through PowerShell commands, which is outside the scope of this change.
+- Define security implementation details - A goal is to understand the security implications of enabling HostProcess
+  containers. We aim to provide guidelines but not implementation details about the components that need to be installed
+  in the cluster.
+
+## Proposal
+
+As part of the transition of CSI Drivers to HostProcess containers, we would like to:
+
+- Refactor the CSI Proxy codebase to become a Go library, replacing the current client/server model.
+- Define guidelines for the transition of CSI Drivers to HostProcess containers, including changes in the Go code,
+  deployment, and security considerations.
+
+### Notes/Constraints/Caveats
+
+HostProcess containers run as processes on the host. One of the differences with a privileged Linux container
+is that there's no filesystem isolation. This means that enabling HostProcess containers should be done for
+system components only. This point will be expanded on in the detailed design.
+
+### Risks and Mitigations
+
+Security implications of HostProcess containers will be reviewed by the SIG Windows team and the SIG Storage team
+initially.
+
+One risk of enabling the HostProcess containers feature is not having enough security policies in the cluster
+for workloads. If workloads can be deployed as HostProcess containers, or if there's an escalation that allows
+non-privileged pods to become HostProcess containers, then those workloads have complete access to the host filesystem.
+This allows access to the tokens in `/var/lib/kubelet` as well as the volumes of other pods inside `/var/lib/kubelet/`.
+
+## Design Details
+
+The following paragraphs summarize the architecture of CSI Proxy, how CSI Drivers use it, the purpose of the conversion
+layer of CSI Proxy that enables backward compatibility with previous API Versions, and a description of the files
+generated by the conversion layer used in CSI Proxy.
+
+CSI Proxy has a client/server design with two main components:
+
+* a binary that runs in the host (the CSI Proxy server). This binary can execute privileged storage operations on the
+  host. Once configured to run as a Windows service, it creates named pipes on startup for all the versions of the API
+  Groups defined in the codebase.
+* client Go libraries that CSI Drivers and Addons import to connect to the CSI Proxy server. The methods and objects
+  available in the library are defined with [protobuf](https://github.com/kubernetes-csi/csi-proxy#feature-status). On
+  startup, the CSI Driver initializes a client for each version of the API Groups required, which will connect and issue
+  requests through gRPC to their pre-configured named pipes on the host.
+
+CSI Driver implementers can write a Windows-specific implementation of the node component of the CSI Driver. In the
+implementation, a CSI Driver will make use of the imported CSI Proxy client libraries to issue privileged storage
+operations. Assuming that a volume was created and attached to a node by the controller component of the CSI Driver,
+the kubelet makes the following CSI calls to the CSI Driver (a sketch of how a driver maps these calls to CSI Proxy
+requests follows the lists below).
+
+**Volume set up**
+
+* NodeStageVolume - Create a Windows volume, format it to NTFS, and create a partition access path in the node (global mount).
+* NodePublishVolume - Create a symlink from the kubelet Pod-PVC path to the global path (pod mount).
+
+**Volume tear down**
+
+* NodeUnpublishVolume - Remove the symlink created above.
+* NodeUnstageVolume - Remove the partition access path.
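+
+As a rough sketch of how a Windows node plugin maps these calls onto CSI Proxy today, the code below implements the
+format-if-needed and mount steps of NodeStageVolume with the v1 Volume API group. The `nodeService` type and
+`stageVolume` helper are illustrative and not taken from a specific driver, while the client packages and request
+types mirror the v1 client libraries referenced later in this document.
+
+```go
+package node
+
+import (
+	"context"
+
+	volumeapi "github.com/kubernetes-csi/csi-proxy/client/api/volume/v1"
+	volumeclient "github.com/kubernetes-csi/csi-proxy/client/groups/volume/v1"
+)
+
+// nodeService is an illustrative holder for the CSI Proxy clients a driver keeps around.
+type nodeService struct {
+	volumeClient *volumeclient.Client
+}
+
+func newNodeService() (*nodeService, error) {
+	// NewClient connects to the pre-configured named pipe of the v1 Volume API group.
+	c, err := volumeclient.NewClient()
+	if err != nil {
+		return nil, err
+	}
+	return &nodeService{volumeClient: c}, nil
+}
+
+// stageVolume formats the volume (if needed) and exposes it at the staging path (global mount).
+// Discovering the Windows volume on the attached disk is omitted from this sketch.
+func (n *nodeService) stageVolume(ctx context.Context, volumeID, stagingPath string) error {
+	formatted, err := n.volumeClient.IsVolumeFormatted(ctx,
+		&volumeapi.IsVolumeFormattedRequest{VolumeId: volumeID})
+	if err != nil {
+		return err
+	}
+	if !formatted.Formatted {
+		if _, err := n.volumeClient.FormatVolume(ctx,
+			&volumeapi.FormatVolumeRequest{VolumeId: volumeID}); err != nil {
+			return err
+		}
+	}
+	_, err = n.volumeClient.MountVolume(ctx,
+		&volumeapi.MountVolumeRequest{VolumeId: volumeID, TargetPath: stagingPath})
+	return err
+}
+```
+
+NodeUnstageVolume reverses the mount with the corresponding unmount call, and NodePublishVolume/NodeUnpublishVolume
+manage the symlink between the kubelet pod path and the staging path.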
+
+CSI Proxy is designed to be backwards compatible, and a single binary running on the Windows node can serve requests from
+multiple CSI Proxy client versions. We're able to do this because the CSI Proxy binary creates named
+pipes on startup for all the versions available in every API Group (e.g. the Volume, Disk, Filesystem, SMB groups). In addition,
+there's a conversion layer in the CSI Proxy binary that transforms client version-specific requests into server "version
+agnostic" requests, which are then processed by the CSI Proxy binary. The following diagram shows the conversion process
+(from the [CSI Proxy development docs](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/docs/DEVELOPMENT.md)):
+
+![CSI Proxy client/server model](./csi-proxy-client-server.jpg)
+
+Understanding the conversion layer will help in the transition to HostProcess containers, as most of the code that the
+clients use to communicate with the CSI Proxy server is generated. The conversion layer's objective is to generate Go code
+that maps versioned client requests to version-agnostic server requests. It does so by analyzing the generated `api.pb.go`
+files (generated through `protoc` from the protobuf files) for each version of the API Groups and generating multiple
+files for different purposes (taking the Volume API Group as an example):
+
+* [\<version\>/server_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/server/volume/impl/v1beta3/server_generated.go)
+  - The gRPC server implementation of the methods of a versioned API Group. Each method receives a versioned request and
+  expects a versioned response.
The code generated follows this pattern: + +``` +func v1Foo(v1Request v1FooRequest) v1FooResponse { + + // convert versioned request to server request (version agnostic) + fooRequest = convertV1FooRequestToFooRequest(v1Request) + + // process request (server handler) + fooResponse = server.Foo(fooRequest) + + // convert server response (version agnostic) to versioned response + v1Response = convertFooResponseToV1FooResponse(fooResponse) + + return v1Response +} +``` + + +* [types_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/server/volume/impl/types_generated.go) + The idea is to collect all the methods available across all the versions of an API Group so that the server has a + corresponding implementation for it. The generator reads all the methods found across the + `volume//api.pb.go` files and generates an interface with all the methods found that the server must + implement, in the example above the server interface will have the `Foo` method +* [\/conversion_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/server/volume/impl/v1/conversion_generated.go) + The generated implementation of the conversion functions shown above (e.g. `convertV1FooRequestToFooRequest`, + `convertFooResponseToV1FooResponse`). In some cases, it's possible that the conversion code generator generates a nested + data structure that's not built correctly. There's an additional file with overrides for the functions that were + generated incorrectly. +* Client [\/\/client_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/client/groups/volume/v1/client_generated.go) + Generated in the client libraries to be used by users of the CSI Proxy client. It creates proxy methods corresponding + to the `api.pb.go` methods of the versioned API Group. This file defines the logic to create a connection to the + corresponding named pipe, creating a gRPC client out of it and storing it for later usage. As a result, the proxy + methods don't need a reference to the gRPC client. + + +### Prerequisite: Make CSI Proxy an embedded library without a server component + +If we configure the Windows node component of a CSI Driver/Addon to be a Windows HostProcess pod, then it'll be able to +use the same powershell commands that we use in the server code of CSI Proxy. The idea is to use the server code of CSI +Proxy as a library in CSI Drivers/Addons. With this, we also remove the server component. + +As described in the [Windows HostProcess Pod](https://kubernetes.io/docs/tasks/configure-pod-container/create-hostprocess-pod/) +guide, we'd need to configure the PodSpec of node component of the CSI Driver/Addon that runs in Windows nodes with: + + +```yaml +spec: + securityContext: + windowsOptions: + hostProcess: true + runAsUserName: "NT AUTHORITY\\SYSTEM" +``` + + +### Implementation idea 1: Update the conversion layer to use the server code gRPC + +Modify the implementation of [\/\/client_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/client/groups/volume/v1/client_generated.go) +so that it calls the server implementation directly (which should be part of the imported go module). 
The current
+implementation uses `w.client`, which is the gRPC client:
+
+```go
+func (w *Client) GetVolumeStats(
+	context context.Context,
+	request *v1.GetVolumeStatsRequest,
+	opts ...grpc.CallOption
+) (*v1.GetVolumeStatsResponse, error) {
+	return w.client.GetVolumeStats(context, request, opts...)
+}
+```
+
+The new implementation should use the server code instead. In the server code, `volumeserver` is the version-agnostic
+server that's wrapped by every versioned server `volumeservervX`. E.g.,
+
+```go
+import v1 "github.com/kubernetes-csi/csi-proxy/client/api/volume/v1"
+import volumeserver "github.com/kubernetes-csi/csi-proxy/pkg/server/volume"
+import volumeserverv1 "github.com/kubernetes-csi/csi-proxy/pkg/server/volume/v1"
+
+// initialize all the versioned volume servers i.e. do what cmd/csi-proxy does but on the client
+serverImpl := volumeserver.NewServer()
+
+// shim that would need to be auto generated for every version
+serverv1 := volumeserverv1.NewVersionedServer(serverImpl)
+
+// client still calls the conversion handler code
+func (w *Client) GetVolumeStats(
+	context context.Context,
+	request *v1.GetVolumeStatsRequest
+) (*v1.GetVolumeStatsResponse, error) {
+	return serverv1.GetVolumeStats(context, request)
+}
+```
+
+![csi-proxy-reuse-client-server-pod](./csi-proxy-reuse-client-server-pod.jpg)
+
+Pros:
+
+* We get to reuse the protobuf code.
+* We would still support the client/server model, as this is a new method that clients would use.
+* We only need to change the client import paths to use the alternative version that doesn't connect to the server with
+  gRPC, which minimizes the changes necessary in the client code.
+
+Cons:
+
+* New APIs would need to be added to the protobuf file, and we would need to run the code generation tool again, with
+  the rule of not modifying already released API Groups. This means that we would also need to create another API Group
+  version for a new API.
+* We still have two distinct concepts of version: the Go module version and the API version. Given that we want to use
+  CSI Proxy as a library, it makes sense to use the Go module version as the source of truth and implement a single API
+  version in each Go module version.
+
+### Implementation idea 2: Update the CSI Drivers to use the server code directly (preferred)
+
+Modify the client code to use the server API handlers directly, which would in turn call the server implementation. This
+means that the concept of an "API version" is also removed from the codebase; instead, the clients import and use
+the internal server structs (request and response objects).
+
+Currently, the GCE PD CSI driver uses the v1 Filesystem API group as follows:
+
+```go
+// note the API version in the imports
+import fsapi "github.com/kubernetes-csi/csi-proxy/client/api/filesystem/v1"
+import fsclient "github.com/kubernetes-csi/csi-proxy/client/groups/filesystem/v1"
+
+func NewCSIProxyMounterV1() (*CSIProxyMounterV1, error) {
+	fsClient, err := fsclient.NewClient()
+	if err != nil {
+		return nil, err
+	}
+	return &CSIProxyMounterV1{
+		FsClient: fsClient,
+	}, nil
+}
+
+// ExistsPath - Checks if a path exists. Unlike util ExistsPath, this call does not perform follow link.
+func (mounter *CSIProxyMounterV1) PathExists(path string) (bool, error) { + isExistsResponse, err := mounter.FsClient.PathExists(context.Background(), + &fsapi.PathExistsRequest{ + Path: mount.NormalizeWindowsPath(path), + }) + if err != nil { + return false, err + } + return isExistsResponse.Exists, err +} + +// usage +csiProxyV1, _ := NewCSIProxyMounterV1() +csiProxyV1.PathExists(path) +``` + + +Internally the `PathExists` call is in the file [\/\/client_generated.go](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/client/groups/volume/v1/client_generated.go) +described above, which performs the execution through gRPC. In the proposal we'd need to use the server implementation +instead: + +```go +// note that there is no version in the import +import fsserver "github.com/kubernetes-csi/csi-proxy/pkg/server/filesystem" +import fsserverimpl "github.com/kubernetes-csi/csi-proxy/pkg/server/filesystem/impl" +import fsapi "github.com/kubernetes-csi/csi-proxy/pkg/os/filesystem" + +// no need to initialize a gRPC client, however the server handler impl is initialized instead +// no need for a versioned client + +func NewCSIProxyMounter() (*CSIProxyMounter, error) { + fsServer, err := fsserver.NewServer(fsapi.New()) + if err != nil { + return nil, err + } + return &CSIProxyMounter{ + FsServer: fsServer, + }, nil +} + +// ExistsPath - Checks if a path exists. Unlike util ExistsPath, this call does not perform follow link. +func (mounter *CSIProxyMounter) PathExists(path string) (bool, error) { + isExistsResponse, err := mounter.FsServer.PathExists(context.Background(), + &fsserverimpl.PathExistsRequest{ + Path: mount.NormalizeWindowsPath(path), + }, + // 3rd arg is the version, remove the version here too! + ) + if err != nil { + return false, err + } + return isExistsResponse.Exists, err +} + +// usage +csiProxy, _ := NewCSIProxyMounter() +csiProxy.PathExists(path) +``` + +![csi-proxy-library](./csi-proxy-library.jpg) + +Pros: + +* We remove the concept of API Version & the conversion layer and instead consider the go mod version as the API + version. This is how other libraries like [k8s.io/mount-utils](https://github.com/kubernetes/mount-utils) work. + * Version dependent server validation in the API handler layer is removed. + * Legacy structs for older API versions are removed. +* New APIs are easier to add. Only the server handler & impl code is modified, so there’s no need for the code + generation tool anymore. + +Cons: + +* The client goes through a bigger diff. Every occurrence of a call to a CSI Proxy method needs to be modified to use + the server handler & impl code, but this penalty is paid only once. + * Legacy interface implementations for the v1beta API in the CSI Drivers are removed. +* As we no longer use protobuf to define the API and use internal structs instead, we'd need to update the API docs to + be directly generated from source code (including the comments around server handler methods and internal server + structs). + +It is worth noting that at this point, the notion of a server is no longer valid, as CSI Proxy has become a +library. We can take this opportunity to reorganize the packages by + +1. Moving `/pkg/server/` and `/pkg/server//impl` to `/pkg/` +2. 
Moving `/pkg/os/<api group>` to `/pkg/<api group>/api`
+
+The new structure looks like:
+
+```
+pkg
+├── disk
+│   ├── api
+│   │   ├── api.go
+│   │   └── types.go
+│   ├── disk.go
+│   └── types.go
+├── iscsi
+│   ├── api
+│   │   ├── api.go
+│   │   └── types.go
+│   ├── iscsi.go
+│   └── types.go
+```
+
+There are also three minor details we can take care of while we’re migrating:
+
+1. The two structs under `pkg/shared/disk/types.go` are only ever referenced by `pkg/os/disk`, so they can be safely added
+   to `pkg/disk/api/types.go`.
+2. The FS server receives `workingDirs` as an input, in addition to the OS API. It’s only used to sandbox the directories
+   that CSI Proxy is allowed to operate on. Now that this control is part of the CSI Driver, we can safely remove it.
+3. `pkg/os/filesystem` is no longer necessary, as the implementation just calls out to the Go standard library `os`
+   package. We can deprecate it in release notes and remove it in a future release.
+
+### Implementation idea 3: Convert CSI Proxy to a Library of Functions
+
+With the new changes, CSI Proxy is effectively just a library of Go functions mapping to Windows commands. The notion of
+servers and clients is no longer relevant, so it makes sense to restructure the package into a library of functions,
+with each API Group’s interfacing functions and types provided under `pkg/<api group>` (right now, these files sit at
+`pkg/server/<api group>/server.go` and `pkg/server/<api group>/impl/types.go`). The OS-facing API at `/pkg/os` is kept
+as is, and the corresponding OS API struct is initialized globally inside each `pkg/<api group>` (to allow for substituting
+it during testing). All other code can be safely deleted.
+
+```go
+// there is now only one single import
+import fs "github.com/kubernetes-csi/csi-proxy/pkg/fs"
+
+// there is no longer a need to initialize a server
+func NewCSIProxyMounter() *CSIProxyMounter {
+	return &CSIProxyMounter{}
+}
+
+// ExistsPath - Checks if a path exists. Unlike util ExistsPath, this call does not perform follow link.
+func (*CSIProxyMounter) PathExists(path string) (bool, error) {
+	// both mounter.FsServer and fsserverimpl are changed to just fs
+	isExistsResponse, err := fs.PathExists(context.Background(),
+		&fs.PathExistsRequest{
+			Path: mount.NormalizeWindowsPath(path),
+		},
+	)
+	if err != nil {
+		return false, err
+	}
+	return isExistsResponse.Exists, err
+}
+
+// usage
+csiProxy := NewCSIProxyMounter()
+csiProxy.PathExists(path)
+
+// at test time
+fs.UseAPI(mockAPI)
+// run tests…
+fs.ResetAPI()
+```
+
+This is the most invasive option of all three. Specifically, we combine the two imports into one and move to a pure
+function paradigm. However, the method implementation sees very minimal changes, requiring only import path updates.
+
+Pros:
+
+* Like implementation idea 2, we switch to a single notion of version via Go modules.
+* The pure function paradigm more accurately reflects the nature of the new design, which simplifies how clients use the
+  library.
+* Like implementation idea 2, new APIs are easier to add by moving away from code generation.
+
+Cons:
+
+* There is now an implicit dependency on the OS API package-level variable. Testing can still be done by substituting the
+  variable with a mock implementation at test time (see the sketch after this list).
+* More work (2 imports -> 1, remove server initialization, replace function call and request type package names) needs
+  to be done by clients to adapt to the new change, though it’s not that much more than implementation idea 2. Again,
+  the price is only paid once.
+* Like implementation idea 2, we also need to transition our API doc generation to generate from Go source.
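+
+To make the package-level OS API dependency concrete, a sketch of how one of these packages could be laid out is shown
+below; the `API` interface, `hostAPI` struct, and request/response types are assumptions made for illustration and are
+not the final package layout.
+
+```go
+// Package fs sketch: pure functions backed by a swappable package-level OS API.
+package fs
+
+import "context"
+
+// API is the OS-facing surface consumed by the package functions (assumed shape).
+type API interface {
+	PathExists(path string) (bool, error)
+}
+
+// hostAPI is the real implementation that would call into Windows.
+type hostAPI struct{}
+
+func (hostAPI) PathExists(path string) (bool, error) {
+	// the real implementation calls into the OS; omitted in this sketch
+	return false, nil
+}
+
+// api is the implicit package-level dependency mentioned in the cons above.
+var api API = hostAPI{}
+
+// UseAPI substitutes the OS API, e.g. with a mock, at test time.
+func UseAPI(a API) { api = a }
+
+// ResetAPI restores the real OS-backed implementation.
+func ResetAPI() { api = hostAPI{} }
+
+type PathExistsRequest struct{ Path string }
+
+type PathExistsResponse struct{ Exists bool }
+
+// PathExists is one of the pure functions exposed by the API Group package.
+func PathExists(ctx context.Context, req *PathExistsRequest) (*PathExistsResponse, error) {
+	exists, err := api.PathExists(req.Path)
+	if err != nil {
+		return nil, err
+	}
+	return &PathExistsResponse{Exists: exists}, nil
+}
+```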
+
+### Comparison Matrix
+
+| |Idea 1: Update the conversion layer to use the server code gRPC|Idea 2: Update the CSI Drivers to use the server code directly (preferred)|Idea 3: Convert CSI Proxy to a Library of Functions|
+| --- |--- |--- |--- |
+| Adoption cost|Minimal (only changing imports)|Considerable (imports and API calls)|Considerable (imports, API calls, and initialization)|
+| Future development|Still need code generation and protobuf|Directly add methods to Go code, but leaves legacy notion of “server”|Directly add functions to Go code. Codebase cleaned up|
+| Versioning|Both Go mod version and API version are maintained|Go mod version only|Go mod version only|
+| Testing|Current tests should still work.|Current tests should still work.|OS API mocking needs to be subbed in, as we have an implicit dependency|
+| Support for legacy client/server model|Still supported|Not supported|Not supported|
+
+### Maintenance of the new model and existing client/server model of CSI Proxy
+
+The `library-development` branch will be used for the development of this model. `master` will have the existing
+client/server model. We plan to create alpha tags on the `library-development` branch and use them in CSI Drivers.
+Once integrated, we will create a v2 tag and make `library-development` the new default. `master` will point
+to the new implementation, whereas the legacy code is maintained on the `v1.x` branch.
+
+`v1.x` will still be open for urgent bug fixes, but new features should be developed in the v2 codebase.
+
+### Security analysis
+
+- Install the Pod Security Admission controller and use Pod Security Standards.
+  - Embrace the least privilege principle, quoting [Enforcing Pod Security Standards | Kubernetes](https://kubernetes.io/docs/setup/best-practices/enforcing-pod-security-standards/#embrace-the-principle-of-least-privilege):
+    - Namespaces that lack any configuration at all should be considered significant gaps in your cluster security model.
+      We recommend taking the time to analyze the types of workloads occurring in each namespace, and by referencing the Pod Security Standards,
+      decide on an appropriate level for each of them. Unlabeled namespaces should only indicate that they've yet to be evaluated.
+    - Namespaces allowing privileged workloads should establish and enforce appropriate access controls.
+    - For workloads running in those permissive namespaces, maintain documentation about their unique security requirements.
+      If at all possible, consider how those requirements could be further constrained.
+  - In namespaces without privileged workloads:
+    - Follow the guidelines in https://kubernetes.io/docs/tasks/configure-pod-container/enforce-standards-namespace-labels/#applying-to-a-single-namespace,
+      for example, add the following labels to a namespace:
+
+      ```plain
+      kubectl label --overwrite ns my-existing-namespace \
+        pod-security.kubernetes.io/enforce=restricted \
+        pod-security.kubernetes.io/enforce-version=v1.25
+      ```
+
+    - Both the baseline and restricted [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) disallow the creation of HostProcess pods.
+- Create a Windows user with limited permissions to create files under the kubelet-controlled path `C:\var\lib\kubelet`.
+
+### Test Plan
+
+#### Unit tests
+
+For CSI Proxy we already have unit tests inside `pkg/<api group>`. These tests are run on presubmit for every PR.
+A sketch of what a unit test could look like against the refactored library follows the examples below.
+
+Examples:
+
+- [volume tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/volume/volume_test.go)
+- [filesystem tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/pkg/filesystem/filesystem_test.go)
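+
+For illustration only, a unit test against the refactored library could swap the OS API for a mock as described in
+implementation idea 3; the `fs` package surface used here is the assumed one from the sketch in that section, not an
+existing CSI Proxy API.
+
+```go
+package fs_test
+
+import (
+	"context"
+	"testing"
+
+	fs "github.com/kubernetes-csi/csi-proxy/pkg/fs"
+)
+
+// fakeAPI implements the assumed fs.API interface without touching the OS.
+type fakeAPI struct{ existing map[string]bool }
+
+func (f fakeAPI) PathExists(path string) (bool, error) { return f.existing[path], nil }
+
+func TestPathExists(t *testing.T) {
+	fs.UseAPI(fakeAPI{existing: map[string]bool{`C:\staging\volume1`: true}})
+	defer fs.ResetAPI()
+
+	resp, err := fs.PathExists(context.Background(), &fs.PathExistsRequest{Path: `C:\staging\volume1`})
+	if err != nil {
+		t.Fatal(err)
+	}
+	if !resp.Exists {
+		t.Fatal("expected path to exist")
+	}
+}
+```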
+
+#### Integration tests
+
+For CSI Proxy, we already have integration tests inside `integrationtests`. These tests are run on presubmit for every PR.
+
+Examples:
+
+- [volume integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/volume_test.go)
+- [filesystem integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/filesystem_test.go)
+- [iscsi integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/iscsi_test.go)
+- [system integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/system_test.go)
+- [smb integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/smb_test.go)
+- [disk integration tests](https://github.com/kubernetes-csi/csi-proxy/blob/c0c6293490fd8aec269685bb4089be56d69921b1/integrationtests/disk_test.go)
+
+#### e2e tests
+
+OSS storage e2e tests run out of tree. We plan to migrate at least one CSI Driver to use the CSI Proxy library and verify that the existing e2e tests pass.
+
+### Graduation Criteria
+
+Most of the code used by CSI Drivers through CSI Proxy is already GA. This KEP defines a new mechanism to run
+the same code that the CSI Driver currently executes through CSI Proxy directly inside the CSI Driver.
+
+### Upgrade / Downgrade Strategy
+
+The following is a list of items that need to happen in different components of CSI in Windows for CSI Drivers to become HostProcess containers:
+
+**CSI Proxy**
+
+- Start a development branch for the upcoming work (`library-development`).
+- Refactor the filesystem, disk, volume, system, iSCSI, and SMB API Groups out of the current client/server model.
+- Remove the client/server code from the codebase.
+- Update the unit and integration tests to work with the refactored code.
+- Run the integration tests in a HostProcess container.
+- Update the README and DEVELOPMENT docs.
+- Once the above items are completed, we can create an alpha tag on the `library-development` branch to import into CSI Drivers.
+
+**CSI Driver**
+
+- Update the CSI Proxy library to the alpha v2 tag from the `library-development` branch.
+- Update the codebase imports to use the server implementation directly instead of the client library.
+- Update the CSI Driver deployment manifest with the HostProcess container fields in the `PodSpec`.
+- Run the e2e tests.
+
+### Version Skew Strategy
+
+Previously, CSI Proxy had a different release cycle than the CSI Driver, where each binary had its own version and
+supported different CSI Proxy clients. Once CSI Proxy becomes a library, the version will be managed by the Go module
+version instead (similar to kubernetes/mount-utils).
+
+## Production Readiness Review Questionnaire
+
+### Feature Enablement and Rollback
+
+###### How can this feature be enabled / disabled in a live cluster?
+ + + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? (Do not assume `Dynamic Kubelet Config` feature is enabled). + +###### Does enabling the feature change any default behavior? + + + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + + + +###### What happens if we reenable the feature if it was previously rolled back? + +###### Are there any tests for feature enablement/disablement? + + + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + + + +###### Will enabling / using this feature result in introducing new API types? + + + +###### Will enabling / using this feature result in any new calls to the cloud provider? + + + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + + + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? 
+ +## Implementation History + + + +## Drawbacks + + + +## Alternatives + + + +## Infrastructure Needed (Optional) + + diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-client-server.jpg b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-client-server.jpg new file mode 100644 index 00000000000..656deb3868e Binary files /dev/null and b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-client-server.jpg differ diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-library.jpg b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-library.jpg new file mode 100644 index 00000000000..9a708170e98 Binary files /dev/null and b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-library.jpg differ diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-reuse-client-server-pod.jpg b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-reuse-client-server-pod.jpg new file mode 100644 index 00000000000..a7ed48292ea Binary files /dev/null and b/keps/sig-windows/3636-windows-csi-host-process-pods/csi-proxy-reuse-client-server-pod.jpg differ diff --git a/keps/sig-windows/3636-windows-csi-host-process-pods/kep.yaml b/keps/sig-windows/3636-windows-csi-host-process-pods/kep.yaml new file mode 100644 index 00000000000..a47abe3e120 --- /dev/null +++ b/keps/sig-windows/3636-windows-csi-host-process-pods/kep.yaml @@ -0,0 +1,41 @@ +title: CSI Drivers in Windows as HostProcess Pods +kep-number: 3636 +authors: + - "@mauriciopoppe" +owning-sig: sig-storage +participating-sigs: + - sig-windows +status: implementable +creation-date: 2022-10-23 +reviewers: + - "@msau42" + - "@ddebroy" +approvers: + - "@msau42" + +see-also: + - "/keps/sig-windows/1122-windows-csi-support" +replaces: + - "/keps/sig-windows/1122-windows-csi-support" + +# The target maturity stage in the current dev cycle for this KEP. +stage: alpha + +# The most recent milestone for which work toward delivery of this KEP has been +# done. This can be the current (upcoming) milestone, if it is being actively +# worked on. +latest-milestone: "v1.26" + +# The milestone at which this feature was, or is targeted to be, at each stage. +milestone: + alpha: "v1.26" + beta: "v1.27" + stable: "v1.28" + +# The following PRR answers are required at alpha release +# List the feature gate name and the components for which it must be enabled +feature-gates: [] +disable-supported: true + +# The following PRR answers are required at beta release +metrics: []