KEP-3314: CSI Changed Block Tracking #4082

ihcsim · 2023-06-12T21:27:13Z

One-line PR description: KEP for CSI changed block tracking (CBT)

Issue link: CSI Differential Snapshot for Block Volumes #3314

Other comments: Supersedes KEP-3314: Changed Block Tracking With CSI VolumeSnapshotDelta #3367.

Signed-off-by: Ivan Sim <ivan.sim@dell.com>

Filled in the Proposal, Caveats and Risks. Put in the CSI spec in the Details section.

Clarified the proposal.

Initial structure. Filled in the Proposal, Caveats and Risks. Put in the CSI spec in the Details section.

linux-foundation-easycla · 2023-06-29T16:55:29Z

The committers listed above are authorized under a signed CLA.

✅ login: carlbraganza (320991b, 4dfcd49, 6b2130e, 2dee59e, 1d7062f, 679d5d3, 8319039, ffe8562, 5e50923, 6e71521, fdb94b1, 66eb80c, ec44e47, 46e6d93, 93f8f61, 20645b9, cd0f4c1, b687ecb, 753dbd2, 690f801, bf2d9f4, 3efbfbc, a42d41e, ba1e4c3, 3df9a5b, 5558e47, 0dc9264, 78b5204, a4671c5, d085a74, 2bfe89f, 35ff183, 4c66f27, 36afbfd, ce7d567, ffb7813, 7b5403e, 56b8f8d, 9702d1d, 0ae8dc7, deffd7f, e58bb3d, 8ffa4d3, c59ce97, 64163d4, 19a88ba, c048605, 45d2aee)
✅ login: PrasadG193 / name: Prasad Ghangal (db9715a, 0842a7a, 32d8982, a100f35)
✅ login: ihcsim / name: Ivan Sim (b9d8121, 8a3c2f8, 7ff9422, ba1dc52, 3e3d030, aba16ff, bbabfef)

stefanha · 2024-04-29T15:32:24Z

keps/sig-storage/3314-csi-changed-block-tracking/README.md

+
+The `BlockMetadataType` enumeration specifies the style used: `FIXED_LENGTH` or `VARIABLE_LENGTH`.
+When the *block-based* style (`FIXED_LENGTH`) is used it is up to the SP plugin to define the
+block size.


How will backup clients make use of FIXED_LENGTH/VARIABLE_LENGTH?

A backup client would, in general, have to map the "view" of the block device returned by the API to its own view of the block device. For example, a client could have its own concept of block size and would have to map, say, a VMware extent map, to a range of blocks; conversely, it may have to map a range of AWS EBS blocks to an extent map.

The purpose of FIXED_LENGTH/VARIABLE_LENGTH is still not clear to me.

The API returns an ascending, non-overlapping sequence of BlockMetadata elements. Backup clients may accumulate adjacent BlockMetadata elements (i.e. convert from blocks to extents) and then align offsets/lengths to the backup client's internal block size, but this does not require BlockMetadataType.
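A minimal sketch of the accumulate-and-align step described above, assuming each `BlockMetadata` element can be reduced to an `(offset, length)` pair in bytes. The tuple representation, the function names, and the 8192-byte client block size are illustrative and are not taken from the KEP or the CSI spec. Note that nothing here consults `BlockMetadataType`.

```python
from typing import Iterable, List, Tuple

# (offset, length) in bytes; an illustrative stand-in for a BlockMetadata element.
Extent = Tuple[int, int]


def merge_adjacent(blocks: Iterable[Extent]) -> List[Extent]:
    """Coalesce an ascending, non-overlapping sequence into maximal extents."""
    merged: List[Extent] = []
    for offset, length in blocks:
        if merged and merged[-1][0] + merged[-1][1] == offset:
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


def align_outward(extents: Iterable[Extent], client_block: int) -> List[Extent]:
    """Round each extent out to client_block boundaries, merging any that now touch."""
    aligned: List[Extent] = []
    for offset, length in extents:
        start = (offset // client_block) * client_block
        end = -(-(offset + length) // client_block) * client_block  # ceiling
        if aligned and start <= aligned[-1][0] + aligned[-1][1]:
            prev_start, prev_length = aligned[-1]
            aligned[-1] = (prev_start, max(prev_start + prev_length, end) - prev_start)
        else:
            aligned.append((start, end - start))
    return aligned


# Hypothetical stream of changed blocks reported by an SP.
blocks = [(0, 4096), (4096, 4096), (12288, 4096), (65536, 512)]
extents = merge_adjacent(blocks)
aligned = align_outward(extents, 8192)
```

With this input the first two elements fold into one extent, and aligning to 8192 bytes pulls the third element into the same client block range.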

Can you give a concrete example of how a backup client will use BlockMetadataType to clarify?

When we discussed this in the CSI WG, the argument was thus:

If the SP uses FIXED_LENGTH, it's a promise that lengths will be constant for the given pair of snapshots, which may unlock certain optimizations on the backup software side that wouldn't be possible with VARIABLE_LENGTH.

If the SP uses VARIABLE_LENGTH, it remains free to return whatever lengths it wants -- possibly folding consecutive changed blocks and thereby compressing the metadata -- and the backup software has to cope with it.

If these details aren't clear from the CSI spec, then it's a failure of spec writing, and we should amend the spec to be more clear about the exact semantics of that field.

@bswartz While the spec covers the effect of BlockMetadataType on the Length field, it's still unclear to me how a backup client would use this information.

The one example I can think of, making decisions about backup data layout (e.g. block size and the backup client's internal metadata parameters), doesn't seem to be well-served by BlockMetadataType:

BlockMetadataType only applies to the current API call. It might even change between successive calls on the same snapshot pair, and it could certainly change between different snapshot pairs from the same volume.

Is a backup client that uses this information assuming that BlockMetadataType and the block size do not change? If yes, then either backup clients cannot use BlockMetadataType for this purpose or the API design needs adjustment.

An API that captures permanent attributes like block size would be something like GetVolumeMetadataHints(volume_id) -> VolumeMetadataHints where the spec guarantees the stability of the hints across snapshots of the volume.

Or maybe I'm still missing how backup clients could use BlockMetadataType?

Recall that these new snapshot metadata APIs are stream-oriented, so "one API call" is actually the whole stream of metadata for a given pair of snapshots. If the SP commits to always using a Length of 4096 (for example), then the backup software can safely allocate a fixed-size bitmap where each bit represents 4k of change for the whole snapshot.
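A rough sketch of that fixed-size bitmap, assuming every element in the stream has exactly the committed length and a block-aligned offset. The alignment guarantee, the `(offset, length)` tuple representation, and the function name are assumptions made for illustration, not statements from the spec.

```python
from typing import Iterable, Tuple


def build_change_bitmap(volume_size: int, block_size: int,
                        blocks: Iterable[Tuple[int, int]]) -> bytearray:
    """Return a bitmap with one bit per block_size bytes of the volume."""
    nbits = -(-volume_size // block_size)  # ceiling division
    bitmap = bytearray((nbits + 7) // 8)
    for offset, length in blocks:
        # Assumed FIXED_LENGTH contract: constant length, aligned offsets.
        if length != block_size or offset % block_size != 0:
            raise ValueError(f"unexpected block ({offset}, {length})")
        bit = offset // block_size
        bitmap[bit // 8] |= 1 << (bit % 8)
    return bitmap


# Hypothetical 1 MiB volume with three changed 4 KiB blocks.
bitmap = build_change_bitmap(
    volume_size=1 << 20,
    block_size=4096,
    blocks=[(0, 4096), (8192, 4096), (1044480, 4096)],
)
changed_blocks = sum(bin(byte).count("1") for byte in bitmap)
```

The bitmap can be sized before the first element arrives, which is the optimization the fixed length makes possible.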

If an SP returns variable-length blocks, then the backup software has to be prepared to use something other than a bitmap to represent the delta, or be prepared to guess an optimal bitmap granularity and adjust delta blocks to fit the bitmap, possibly wasting small amounts of space by backing up unchanged data.

FIXED_LENGTH is expensive over the wire because adjacent BlockMetadata elements cannot be sent merged. It would be more efficient for the SP to include an optional block size hint field in the response and allow the Length field to vary instead of having BlockMetadataType.

Good thing the API is alpha and we can evolve it. I agree this isn't the most efficient transport mechanism, but it's dramatically more efficient than what's being done in the absence of this feature. The designers were optimizing for simplicity rather than efficiency, which is the right place to focus in an alpha version of anything.
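To make the "wasted space" point concrete, here is a small sketch that fits variable-length extents onto a bitmap with a guessed granularity and counts how many unchanged bytes end up covered. It assumes non-overlapping extents with positive lengths that lie within the bitmap; the names and numbers are illustrative only.

```python
from typing import Iterable, List, Tuple


def mark_extents(extents: Iterable[Tuple[int, int]], granularity: int,
                 nbits: int) -> List[bool]:
    """Set every bit whose granularity-sized range intersects a changed extent."""
    bits = [False] * nbits
    for offset, length in extents:
        first = offset // granularity
        last = (offset + length - 1) // granularity
        for i in range(first, last + 1):
            bits[i] = True
    return bits


def overcopy_bytes(extents: List[Tuple[int, int]], granularity: int,
                   nbits: int) -> int:
    """Bytes that would be backed up despite not being reported as changed."""
    covered = sum(mark_extents(extents, granularity, nbits)) * granularity
    changed = sum(length for _, length in extents)
    return covered - changed


# Hypothetical 16 KiB volume tracked at a guessed 4 KiB granularity.
extents = [(512, 1024), (6144, 4096)]
bits = mark_extents(extents, granularity=4096, nbits=4)
waste = overcopy_bytes(extents, granularity=4096, nbits=4)
```

Here 5120 changed bytes mark three of the four bits, so 7168 unchanged bytes would be copied along with them.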

Thanks for considering this feedback!

stefanha · 2024-04-29T15:37:31Z

keps/sig-storage/3314-csi-changed-block-tracking/README.md

+  repeated BlockMetadata block_metadata = 3;
+}
+
+message GetMetadataDeltaRequest {


Is there a difference between GetMetadataAllocatedRequest and GetMetadataDeltaRequest aside from the snapshot_id vs base_snapshot_id/target_snapshot_id fields?

If there is no difference, maybe make base_snapshot_id optional (like the CSI spec's Snapshot group_snapshot_id field) and unify the two requests to reduce duplication. When base_snapshot_id is empty then allocated data is reported. When base_snapshot_id is non-empty then the delta is reported.
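A sketch of the dispatch rule this suggestion implies, using a hypothetical unified request type. `GetMetadataRequest` and `select_mode` are invented here for illustration and are not part of the KEP or the CSI spec.

```python
from dataclasses import dataclass


@dataclass
class GetMetadataRequest:
    """Hypothetical unified request; base_snapshot_id is optional."""
    target_snapshot_id: str
    base_snapshot_id: str = ""  # empty means "report allocated blocks"


def select_mode(req: GetMetadataRequest) -> str:
    """Pick the reporting mode from the presence of base_snapshot_id."""
    if not req.target_snapshot_id:
        raise ValueError("target_snapshot_id is required")
    return "delta" if req.base_snapshot_id else "allocated"


allocated_mode = select_mode(GetMetadataRequest(target_snapshot_id="snap-2"))
delta_mode = select_mode(
    GetMetadataRequest(target_snapshot_id="snap-2", base_snapshot_id="snap-1")
)
```

The trade-off is that a single field silently changes the meaning of the response, which is one reason a reader might prefer two explicitly named methods.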

> Is there a difference between GetMetadataAllocatedRequest and GetMetadataDeltaRequest aside from the snapshot_id vs base_snapshot_id/target_snapshot_id fields?

No, not really. We had considered your suggestion and decided to explicitly document separate methods for clarity.

keps/sig-storage/3314-csi-changed-block-tracking/kep.yaml

msau42 · 2024-05-22T23:52:20Z

keps/sig-storage/3314-csi-changed-block-tracking/README.md

+```
+cbt.storage.k8s.io/driver: NAME_OF_THE_CSI_DRIVER
+```
+The presence of this label allows a backup application to efficiently locate


Can you have multiple CRs for the same driver?

Instead of using labels, can the name of the object == the name of the driver, which is what we require for the CSIDriver object: https://kubernetes.io/docs/reference/kubernetes-api/config-and-storage-resources/csi-driver-v1/

Or should we put it in the CSIDriver object itself?

Yeah, I don't see why we can't use the object name instead of labels. We explored using the CSIDriver object to store this information, but were told that it is mainly used for kubelet/storage-related specs, not things like HTTP endpoints.

It would be ambiguous to find multiple CRs for a given driver; which one should the application pick?

I agree the object could be named for the driver but the spec doesn't mandate that - it would totally depend on the driver installer. The reason is that the spec does not require these CRs to be in any particular namespace; the app finds them via a label search.

I think there were discussions on modifying the CSIDriver object in the WG and the concept was shot down.

Which spec doesn't mandate the name of the CR? We could define this requirement just as we did for the CSIDriver object.

Sounds reasonable. I'll add something on the name of the CR.

I had forgotten to remove the CR label requirement - fixed now!

msau42 · 2024-05-23T00:00:50Z

keps/sig-storage/3314-csi-changed-block-tracking/README.md

+existing tests to make this code solid enough prior to committing the changes necessary
+to implement this enhancement.
+
+##### Prerequisite testing updates


We need to make sure that our standard k8s-csi logging methods are not logging any tokens/secrets.

I agree with the sentiment, but should we be stating that in the spec?

No, I don't think this needs to be explicit in the spec. It's just an implementation requirement for our common libraries/sidecars, as this has been a source of CVEs in the past.

Signed-off-by: Ivan Sim <ihcsim@gmail.com>

xing-yang · 2024-05-30T15:47:50Z

/assign @liggitt
for review from SIG-Auth.

soltysh · 2024-06-05T18:21:56Z

keps/prod-readiness/sig-storage/3314.yaml

@@ -0,0 +1,3 @@
+kep-number: 3314
+alpha:
+  approver: "@johnbelamaric"


I've been looking at this before as a shadow, and John is overbooked. So let's switch to me for PRR approval on this one.

soltysh · 2024-06-07T11:18:03Z

/label tide/merge-method-squash

soltysh

The PRR is mostly good, I'll wait for approval from sig-storage and sig-auth.

soltysh · 2024-06-07T11:29:25Z

keps/sig-storage/3314-csi-changed-block-tracking/README.md

+  - Components depending on the feature gate:
+- [x] Other
+  - Describe the mechanism:
+The new components will be implemented as part of the out-of-tree CSI framework.


This still holds. It's not blocking, but it would be nice to answer this and the questions around monitoring and troubleshooting.

msau42 · 2024-06-12T15:53:19Z

keps/sig-storage/3314-csi-changed-block-tracking/README.md

+to Kubernetes backup applications
+by creating a [SnapshotMetadataService CR](#snapshot-metadata-service-custom-resource)
+that contains the service's TCP endpoint address, CA certificate and
+an audience string needed for token authentication.


Let's follow up offline on this, but we may want to enforce some syntax/restrictions on the allowed audience strings. The primary goal is to prevent reusing some reserved audience for kube-apiserver. cc @liggitt

msau42 · 2024-06-12T15:56:09Z

/lgtm
/approve

xing-yang · 2024-06-12T16:00:56Z

@soltysh could you provide approval for PRR? Thanks.

soltysh

/approve
for PRR

k8s-ci-robot · 2024-06-12T16:31:34Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ihcsim, msau42, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/prod-readiness/OWNERS~~ [soltysh]
~~keps/sig-storage/OWNERS~~ [msau42]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Add draft of CSI CBT KEP

8a3c2f8

Signed-off-by: Ivan Sim <ivan.sim@dell.com>

k8s-ci-robot added the cncf-cla: yes label Jun 12, 2023

k8s-ci-robot requested review from saad-ali and xing-yang June 12, 2023 21:27

k8s-ci-robot added the kind/kep, sig/storage, and size/XL labels Jun 12, 2023

Update KEP status

b9d8121

Signed-off-by: Ivan Sim <ivan.sim@dell.com>

ihcsim mentioned this pull request Jun 12, 2023

KEP-3314: Changed Block Tracking With CSI VolumeSnapshotDelta #3367

Closed

xing-yang mentioned this pull request Jun 12, 2023

CSI Differential Snapshot for Block Volumes #3314

Open

4 tasks

ihcsim mentioned this pull request Apr 23, 2023

Add VolumeSnapshotDelta CSI RPCs container-storage-interface/spec#522

Closed

xing-yang mentioned this pull request Jun 21, 2023

[Suggestion] Requesting a security representative for the Kubernetes Data Protection WG cncf/tag-security#1034

Closed

carlbraganza added 10 commits June 25, 2023 00:12

Initial structure.

ba1e4c3

Filled in the Proposal, Caveats and Risks. Put in the CSI spec in the Details section.

Removed distracting links to common K8s definitions.

fdb94b1

Clarified the proposal.

More caveats. Better grammar.

7b5403e

Use "snapshot access session".

4dfcd49

addressed most of the feedback in the PR.

5e50923

Updated role figure.

93f8f61

More refinements.

66eb80c

Session figure. Renamed figure files.

3efbfbc

Fix background of session figure.

3df9a5b

Merge pull request #1 from ihcsim/carl-proposal-caveats-risks

ffe8562

Initial structure. Filled in the Proposal, Caveats and Risks. Put in the CSI spec in the Details section.

Updated figures and roles.

2dee59e

k8s-ci-robot added the cncf-cla: yes label and removed the cncf-cla: no label Jun 29, 2023

stefanha reviewed Apr 29, 2024


msau42 reviewed May 23, 2024


ihcsim and others added 2 commits May 24, 2024 10:28

Add sig-auth as participating sigs

7ff9422

Signed-off-by: Ivan Sim <ihcsim@gmail.com>

Require that the CR be named for the driver.

35ff183

k8s-ci-robot assigned liggitt May 30, 2024

Removed the label requirement for the CR.

ffb7813

soltysh reviewed Jun 5, 2024


k8s-ci-robot added the do-not-merge/invalid-commit-message label Jun 5, 2024

Replaced johnbelamaric with soltysh for PRR approver.

0dc9264

carlbraganza force-pushed the csi-cbt-kep branch from acbf444 to 0dc9264 Compare June 5, 2024 18:37

k8s-ci-robot removed the do-not-merge/invalid-commit-message label Jun 5, 2024

Bump up milestone to v1.31

a100f35

k8s-ci-robot added the tide/merge-method-squash label Jun 7, 2024

soltysh reviewed Jun 7, 2024


Change KEP status to implementable

32d8982

msau42 reviewed Jun 12, 2024


k8s-ci-robot added the lgtm label Jun 12, 2024

soltysh approved these changes Jun 12, 2024


k8s-ci-robot added the approved label Jun 12, 2024

k8s-ci-robot merged commit cf708c9 into kubernetes:master Jun 12, 2024
4 checks passed

k8s-ci-robot added this to the v1.31 milestone Jun 12, 2024

PrasadG193 mentioned this pull request Jun 14, 2024

REQUEST: New membership for PrasadG193 kubernetes/org#5019

Closed

11 tasks

carlbraganza mentioned this pull request Jun 17, 2024

REQUEST: New membership for carlbraganza kubernetes/org#5028

Closed

11 tasks

PrasadG193 mentioned this pull request Jun 27, 2024

[WIP] Docs for external-snapshot-metadata sidecar container kubernetes-csi/docs#601

Closed


stefanha Apr 29, 2024

carlbraganza Apr 30, 2024

stefanha Apr 30, 2024

bswartz Apr 30, 2024

stefanha Apr 30, 2024

bswartz Apr 30, 2024

stefanha Apr 30, 2024

bswartz Apr 30, 2024

stefanha May 1, 2024

stefanha Apr 29, 2024

carlbraganza Apr 30, 2024

msau42 May 22, 2024

ihcsim May 24, 2024

carlbraganza May 24, 2024

msau42 May 24, 2024

carlbraganza May 28, 2024

carlbraganza May 30, 2024

msau42 May 23, 2024

carlbraganza May 24, 2024

msau42 May 24, 2024

xing-yang commented May 30, 2024

soltysh Jun 5, 2024

carlbraganza Jun 5, 2024

soltysh commented Jun 7, 2024

soltysh left a comment

soltysh Jun 7, 2024

msau42 Jun 12, 2024

msau42 commented Jun 12, 2024

xing-yang commented Jun 12, 2024

soltysh left a comment

k8s-ci-robot commented Jun 12, 2024
