storage: support MVCC range tombstones in MVCCExportToSST #82873

Merged: 1 commit, Jun 24, 2022

@erikgrinaker erikgrinaker commented Jun 14, 2022

This patch adds support for exporting MVCC range tombstones in
MVCCExportToSST(), and by extension in the KV Export method.
This will only happen after two version gates are enabled:

  • EnablePebbleFormatVersionRangeKeys: begins emitting SSTs in
    Pebblev2 format, which supports Pebble range keys.

  • MVCCRangeTombstones: allows writing MVCC range tombstones via KV.

MVCC range tombstones are emitted in the same way as point tombstones:
all tombstones if ExportAllRevisions is enabled, or the latest visible
tombstone if StartTS is given.
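
The emission rule above can be sketched as a small filter over the versions of one key. This is only an illustration, not the code in `MVCCExportToSST()`: the `version` type and `exportVersions` function are hypothetical, timestamps are plain integers, and it assumes that without `ExportAllRevisions` a tombstone is only emitted when a `StartTS` is given.

```go
package main

import "fmt"

// version is a hypothetical, simplified MVCC version: an integer timestamp
// plus a flag saying whether the version is a tombstone (point or range).
type version struct {
	ts        int
	tombstone bool
}

// exportVersions returns the versions to emit for one key. versions must be
// sorted newest first, and startTS == 0 stands for "no StartTS given".
// Only versions with startTS < ts <= endTS are considered.
func exportVersions(versions []version, startTS, endTS int, allRevisions bool) []version {
	var out []version
	for _, v := range versions {
		if v.ts > endTS || v.ts <= startTS {
			continue
		}
		if allRevisions {
			// ExportAllRevisions: every version in the window, tombstones included.
			out = append(out, v)
			continue
		}
		// Otherwise only the latest visible version counts. Assumption: a
		// tombstone is emitted only when a StartTS was given.
		if !v.tombstone || startTS > 0 {
			out = append(out, v)
		}
		break
	}
	return out
}

func main() {
	versions := []version{{5, true}, {3, false}, {1, false}}
	fmt.Println(exportVersions(versions, 0, 10, true))  // all revisions
	fmt.Println(exportVersions(versions, 2, 10, false)) // StartTS given
	fmt.Println(exportVersions(versions, 0, 10, false)) // no StartTS
}
```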

MVCC range tombstones are truncated to the SST bounds. For example, when
exporting the span a-f, any range tombstone wider than the span is
truncated to [a-f). If the export hits a limit, e.g. at c, then any
MVCC range tombstones in the returned SST are truncated to [a-c).
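
As a rough illustration of the truncation described above, the following sketch clamps a half-open span to the export bounds. The `span` type and `truncate` helper are hypothetical and use plain string keys; they are not the types used in the patch.

```go
package main

import "fmt"

// span is a hypothetical half-open key span [start, end) over string keys.
type span struct{ start, end string }

// truncate clamps s to bounds and reports whether anything remains.
func truncate(s, bounds span) (span, bool) {
	if s.start < bounds.start {
		s.start = bounds.start
	}
	if s.end > bounds.end {
		s.end = bounds.end
	}
	return s, s.start < s.end
}

func main() {
	tombstone := span{"a", "z"}

	// Exporting the span a-f: the tombstone is clamped to [a-f).
	full, _ := truncate(tombstone, span{"a", "f"})
	fmt.Printf("[%s-%s)\n", full.start, full.end) // prints "[a-f)"

	// The export hit a limit at c: the SST only covers [a-c).
	limited, _ := truncate(tombstone, span{"a", "c"})
	fmt.Printf("[%s-%s)\n", limited.start, limited.end) // prints "[a-c)"
}
```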

If StopMidKey is enabled, two subsequent exports may contain
overlapping MVCC range tombstones. For example, given the range
tombstone [a-f)@5, if we return a resume key at c@3, the response
will contain a truncated MVCC range tombstone [a-c\0)@5 that covers
the point keys at c. Resuming from c@3 then returns the MVCC range
tombstone [c-f)@5, which overlaps the MVCC range tombstone in the
previous response over the interval [c-c\0)@5.
AddSSTable will allow this overlap during ingestion once it supports
MVCC range tombstones.
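
The overlap in the StopMidKey example can be reproduced with a short sketch. The `span` type and the `next`, `splitAtResume` and `overlap` helpers are hypothetical and ignore timestamps; the sketch also assumes the tombstone covers the resume key.

```go
package main

import "fmt"

// span is a hypothetical half-open key span [start, end) over string keys.
type span struct{ start, end string }

// next returns the immediate successor of key, written c\0 in the text above.
func next(key string) string { return key + "\x00" }

// splitAtResume returns the tombstone bounds in the first response and in the
// resumed export. With midKey set (StopMidKey), the first response must still
// cover the versions of resumeKey it already emitted, so it ends at
// next(resumeKey) rather than at resumeKey.
func splitAtResume(tomb span, resumeKey string, midKey bool) (first, second span) {
	firstEnd := resumeKey
	if midKey {
		firstEnd = next(resumeKey)
	}
	return span{tomb.start, firstEnd}, span{resumeKey, tomb.end}
}

// overlap returns the intersection of a and b, if it is non-empty.
func overlap(a, b span) (span, bool) {
	o := a
	if b.start > o.start {
		o.start = b.start
	}
	if b.end < o.end {
		o.end = b.end
	}
	return o, o.start < o.end
}

func main() {
	first, second := splitAtResume(span{"a", "f"}, "c", true)
	o, ok := overlap(first, second)
	fmt.Printf("first=[%q-%q) second=[%q-%q) overlap=[%q-%q) %v\n",
		first.start, first.end, second.start, second.end, o.start, o.end, ok)
}
```

Without StopMidKey (`midKey` false), the first span ends at `c` and the two responses do not overlap.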

Resolves #71398.

Release note: None

@erikgrinaker erikgrinaker self-assigned this Jun 14, 2022
@cockroach-teamcity

This change is Reviewable

@erikgrinaker erikgrinaker requested a review from a team June 14, 2022 11:30
@erikgrinaker erikgrinaker force-pushed the mvcc-range-tombstones-export branch 2 times, most recently from 73e464f to 364ef7d Compare June 18, 2022 09:51
@erikgrinaker erikgrinaker marked this pull request as ready for review June 18, 2022 09:52
@erikgrinaker erikgrinaker requested review from a team as code owners June 18, 2022 09:52
@erikgrinaker

This should be ready for review now. Sorry about having to drag in #82799.


@itsbilal itsbilal left a comment

:lgtm:, last commit looks solid!

Reviewed 5 of 6 files at r1, 2 of 8 files at r2, 3 of 6 files at r3, 1 of 1 files at r4, 10 of 11 files at r5, 4 of 4 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aliher1911, @dt, and @erikgrinaker)


pkg/storage/mvcc.go line 4280 at r6 (raw file):

	// doesn't end up being any further point keys covered by it and we go on to
	// flush them as-is at their normal end key. We need to make sure we have
	// enough MaxSize budget to flush then in all of these cases.

nit: them* ?


pkg/storage/testdata/mvcc_histories/export line 3 at r6 (raw file):

# Tests MVCC export.
#
# Sets up the following dataset, where x is tombstone, o-o is range tombstone, [] is intent.

nit: MVCC range tombstone* just to make it clear this isn't a Pebble range tombstone.

@erikgrinaker

erikgrinaker commented Jun 24, 2022

This comes with a moderate performance penalty, mostly due to enabling range keys in Pebble and the overhead of HasPointAndRange. This is a known issue that will be addressed later in #83049; merging this for now to unblock higher-level work.

name                                                                        old time/op  new time/op  delta
MVCCExportToSST/numKeys=64/numRevisions=1/exportAllRevisions=false-24       25.2µs ± 2%  25.5µs ± 2%  +1.17%  (p=0.015 n=10+10)
MVCCExportToSST/numKeys=64/numRevisions=1/exportAllRevisions=true-24        25.1µs ± 1%  25.4µs ± 2%  +1.03%  (p=0.007 n=9+10)
MVCCExportToSST/numKeys=64/numRevisions=10/exportAllRevisions=false-24      25.5µs ± 2%  26.9µs ± 2%  +5.49%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=64/numRevisions=10/exportAllRevisions=true-24       25.2µs ± 1%  27.0µs ± 2%  +7.31%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=64/numRevisions=100/exportAllRevisions=false-24     25.3µs ± 2%  26.9µs ± 2%  +6.47%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=64/numRevisions=100/exportAllRevisions=true-24      25.1µs ± 2%  26.5µs ± 1%  +5.55%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=512/numRevisions=1/exportAllRevisions=false-24      25.1µs ± 1%  25.3µs ± 1%    ~     (p=0.095 n=10+9)
MVCCExportToSST/numKeys=512/numRevisions=1/exportAllRevisions=true-24       25.2µs ± 2%  25.4µs ± 2%    ~     (p=0.211 n=10+9)
MVCCExportToSST/numKeys=512/numRevisions=10/exportAllRevisions=false-24     25.3µs ± 1%  26.4µs ± 1%  +4.60%  (p=0.000 n=8+10)
MVCCExportToSST/numKeys=512/numRevisions=10/exportAllRevisions=true-24      25.2µs ± 2%  26.1µs ± 2%  +3.47%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=512/numRevisions=100/exportAllRevisions=false-24    25.2µs ± 1%  26.0µs ± 1%  +3.04%  (p=0.000 n=9+10)
MVCCExportToSST/numKeys=512/numRevisions=100/exportAllRevisions=true-24     25.3µs ± 1%  26.3µs ± 2%  +4.03%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=1024/numRevisions=1/exportAllRevisions=false-24     25.2µs ± 2%  25.3µs ± 1%    ~     (p=0.424 n=10+10)
MVCCExportToSST/numKeys=1024/numRevisions=1/exportAllRevisions=true-24      25.2µs ± 2%  25.4µs ± 1%    ~     (p=0.079 n=10+9)
MVCCExportToSST/numKeys=1024/numRevisions=10/exportAllRevisions=false-24    25.3µs ± 1%  26.1µs ± 1%  +3.21%  (p=0.000 n=9+9)
MVCCExportToSST/numKeys=1024/numRevisions=10/exportAllRevisions=true-24     25.2µs ± 1%  26.2µs ± 2%  +3.98%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=1024/numRevisions=100/exportAllRevisions=false-24   25.5µs ± 1%  26.3µs ± 2%  +3.47%  (p=0.000 n=9+10)
MVCCExportToSST/numKeys=1024/numRevisions=100/exportAllRevisions=true-24    25.4µs ± 2%  25.9µs ± 0%  +2.29%  (p=0.000 n=10+8)
MVCCExportToSST/numKeys=8192/numRevisions=1/exportAllRevisions=false-24     25.0µs ± 1%  25.1µs ± 1%    ~     (p=0.780 n=9+10)
MVCCExportToSST/numKeys=8192/numRevisions=1/exportAllRevisions=true-24      25.0µs ± 1%  25.1µs ± 2%    ~     (p=0.404 n=10+10)
MVCCExportToSST/numKeys=8192/numRevisions=10/exportAllRevisions=false-24    25.5µs ± 1%  26.6µs ± 4%  +4.37%  (p=0.000 n=10+9)
MVCCExportToSST/numKeys=8192/numRevisions=10/exportAllRevisions=true-24     25.3µs ± 2%  26.3µs ± 2%  +3.70%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=8192/numRevisions=100/exportAllRevisions=false-24   22.4µs ± 4%  24.3µs ± 9%  +8.41%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=8192/numRevisions=100/exportAllRevisions=true-24    22.6µs ± 1%  24.3µs ± 3%  +7.51%  (p=0.000 n=9+10)
MVCCExportToSST/numKeys=65536/numRevisions=1/exportAllRevisions=false-24    25.1µs ± 1%  25.6µs ± 2%  +1.98%  (p=0.003 n=10+10)
MVCCExportToSST/numKeys=65536/numRevisions=1/exportAllRevisions=true-24     25.1µs ± 2%  25.5µs ± 1%  +1.41%  (p=0.001 n=10+10)
MVCCExportToSST/numKeys=65536/numRevisions=10/exportAllRevisions=false-24   22.8µs ± 2%  24.4µs ± 2%  +6.63%  (p=0.000 n=9+10)
MVCCExportToSST/numKeys=65536/numRevisions=10/exportAllRevisions=true-24    22.6µs ± 3%  24.4µs ± 3%  +7.80%  (p=0.000 n=10+10)
MVCCExportToSST/numKeys=65536/numRevisions=100/exportAllRevisions=false-24  24.5µs ±11%  25.6µs ± 9%    ~     (p=0.123 n=10+10)
MVCCExportToSST/numKeys=65536/numRevisions=100/exportAllRevisions=true-24   24.9µs ± 5%  26.0µs ± 2%  +4.67%  (p=0.000 n=10+9)

TFTR!

bors r=itsbilal

@erikgrinaker

bors r=itsbilal

@craig

craig bot commented Jun 24, 2022

Build succeeded:

Successfully merging this pull request may close these issues.

kvserver: support MVCC range tombstones in Export