-
Notifications
You must be signed in to change notification settings - Fork 456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sstable: add (*sstable.Writer).RangeKey{Set,Unset,Delete} methods #1445
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 3 unresolved discussions
internal/rangekey/rangekey.go, line 73 at r1 (raw file):
// SuffixValues, when encoded for a RangeKeySet. It may be used to construct a // buffer of the appropriate size before encoding. func EncodedSetSuffixValuesLen(suffixValues []SuffixValue) int {
I found myself doing a lot of gymnastics with just the suffix values (i.e. the value without the end key encoded). I split these functions out.
sstable/writer.go, line 398 at r1 (raw file):
// not required to be fragmented. func (w *Writer) RangeKeySet(start, end, suffix, value []byte) error { // FIXME(travers): Should the API allow for adding multiple (suffix, value)
Open question: is this sufficient? Or should we support allowing the caller to pass in multiple suffix / value pairs. Same for RangeKeyUnset
.
sstable/writer.go, line 401 at r1 (raw file):
// pairs at the same time? startKey := base.MakeInternalKey(start, 0, base.InternalKeyKindRangeKeySet) return w.addRangeKeySpan(startKey, end, suffix, value)
Open question: there's some subtlety in tracking the lifetimes of the byte slices that are passed to these methods. As we only perform a copy
when the fragmented span is coalesced and finalized, it's possible that the caller could alter the underlying data before we add to the block. Do we want to preemptively copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Reviewed 2 of 6 files at r1.
Reviewable status: 2 of 6 files reviewed, 11 unresolved discussions (waiting on @nicktrav and @sumeerbhola)
internal/rangekey/rangekey.go, line 180 at r1 (raw file):
} // EncodeUnsetSuffixValues encodes a slice of SuffixValues for a RangeKeyUnset
nit: s/slice of SuffixValues/slice of suffixes/
internal/rangekey/rangekey.go, line 182 at r1 (raw file):
// EncodeUnsetSuffixValues encodes a slice of SuffixValues for a RangeKeyUnset // into dst. The length of dst must be greater than or equal to // EncodedSetSuffixValuesLen. EncodeSetSuffixValues returns the number of bytes
nit: s/EncodeSetSuffixValues/EncodeUnsetSuffixes/
internal/rangekey/rangekey.go, line 185 at r1 (raw file):
// written, which should always equal the EncodedSetValueLen with the same // arguments. func EncodeUnsetSuffixValues(dst []byte, suffixes [][]byte) int {
s/EncodeUnsetSuffixValues/EncodeUnsetSuffixes/
sstable/writer.go, line 398 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Open question: is this sufficient? Or should we support allowing the caller to pass in multiple suffix / value pairs. Same for
RangeKeyUnset
.
I think this is sufficient. 👍
sstable/writer.go, line 401 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
Open question: there's some subtlety in tracking the lifetimes of the byte slices that are passed to these methods. As we only perform a
copy
when the fragmented span is coalesced and finalized, it's possible that the caller could alter the underlying data before we add to the block. Do we want to preemptively copy?
I vote preemptively copy to err towards guarding against misuse, since we expect range keys to be rare in CockroachDB. We can walk that decision back if we need to during performance testing.
We could share a bytes.Buffer
for all range key buffered data (including the encoded suffix-values tuples) to reduce the allocations..
sstable/writer.go, line 451 at r1 (raw file):
case base.InternalKeyKindRangeKeyUnset: suffixes := [][]byte{suffix} buf := make([]byte, rangekey.EncodedUnsetSuffixValuesLen(suffixes))
I think it might be worthwhile to use a shared bytes.Buffer
on the Writer
for these encoded Span.Value
s to reduce allocations.
sstable/writer.go, line 468 at r1 (raw file):
// only flush when we've moved past the previous key. if w.compare(span.Start.UserKey, w.prevRangeKeyUserKey) > 0 { w.fragmenter.TruncateAndFlushTo(w.prevRangeKeyUserKey)
Is this necessary? I think Fragmenter.Add
will flush if necessary. I think we can Add
each key span and call Finish
when finishing the sstable, and that should be enough to ensure everything is flushed.
sstable/writer.go, line 496 at r1 (raw file):
} func (w *Writer) addCoalescedSpan(s rangekey.CoalescedSpan) {
Can you add a comment here stating that this method is used as the emit function of the Coalescer
?
sstable/testdata/writer_range_keys, line 1 at r1 (raw file):
# NOTE: The operations SET, UNSET, and DEL in this test file are aliases for
These test cases are great (and so easy to read!)
sstable/testdata/writer_range_keys, line 85 at r1 (raw file):
d.RANGEKEYSET.0: e [(@9=v9)] d.RANGEKEYDEL.0: e f.RANGEKEYSET.0: i [(@4=v4),(@5=v5),(@7=v7),(@8=v8),(@9=v9)]
nit: The ordering of these suffix-values is opposite, because the test uses base.DefaultComparer
which compares suffixes as byte slices. I think it'd be worthwhile to use testkeys.Comparer
and reverse the order here to avoid confusion.
41e32ac
to
2bf7d93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dismissed @jbowens from a discussion.
Reviewable status: 1 of 6 files reviewed, 4 unresolved discussions (waiting on @jbowens and @sumeerbhola)
internal/rangekey/rangekey.go, line 185 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
s/EncodeUnsetSuffixValues/EncodeUnsetSuffixes/
Done.
sstable/writer.go, line 398 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I think this is sufficient. 👍
👍
sstable/writer.go, line 401 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I vote preemptively copy to err towards guarding against misuse, since we expect range keys to be rare in CockroachDB. We can walk that decision back if we need to during performance testing.
We could share a
bytes.Buffer
for all range key buffered data (including the encoded suffix-values tuples) to reduce the allocations..
Done. I added (*Writer).tempRangeKeyBuf
, which is similar to how we do things over in the block property collector.
sstable/writer.go, line 451 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I think it might be worthwhile to use a shared
bytes.Buffer
on theWriter
for these encodedSpan.Value
s to reduce allocations.
Same as above. Will re-use the same mechanism for buffering.
sstable/writer.go, line 468 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Is this necessary? I think
Fragmenter.Add
will flush if necessary. I think we canAdd
each key span and callFinish
when finishing the sstable, and that should be enough to ensure everything is flushed.
Removed.
sstable/writer.go, line 496 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Can you add a comment here stating that this method is used as the emit function of the
Coalescer
?
Done.
sstable/testdata/writer_range_keys, line 1 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
These test cases are great (and so easy to read!)
👍
sstable/testdata/writer_range_keys, line 85 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: The ordering of these suffix-values is opposite, because the test uses
base.DefaultComparer
which compares suffixes as byte slices. I think it'd be worthwhile to usetestkeys.Comparer
and reverse the order here to avoid confusion.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 5 files at r2.
Reviewable status: 5 of 6 files reviewed, 7 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)
sstable/writer.go, line 442 at r2 (raw file):
// caller to avoid re-use of buffers until the Writer is closed. endCopy := w.tempRangeKeyBuf(len(end)) copy(endCopy, end)
where are we copying start
? I am assuming that like the other slices it can be mutated by the caller when this method returns.
sstable/writer.go, line 497 at r2 (raw file):
var unsets [][]byte for _, item := range s.Items { sv := rangekey.SuffixValue{Suffix: item.Suffix}
nit: initializing sv
here adds a level of indirection that is unnecessary for the if-block.
how about
if item.Unset {
unsets = append(unsets, item.Suffix)
} else {
sets = append(sets, rangekey.SuffixValue{Suffix: item.Suffix, Value: item.Value)
}
sstable/writer_rangekey_test.go, line 50 at r2 (raw file):
start, end := startEndSplit[0], startEndSplit[1] switch kind {
Consider scrambling the bytes in the byte slices passed as parameters after the return from the method. We've had hard to track down bugs in the past.
2bf7d93
to
8fc3e51
Compare
Users of `sstable.Writer` other than Pebble itself (e.g. Cockroach) require a means of adding range keys to an sstable that is slightly more flexible than the existing `AddRangeKey` method in the way in which key spans can be added to the table. Specifically, `AddRangeKey` requires that spans be sorted (start / end key ascending, sequence number and key kind descending), in addition to fragmented and coalesced (i.e. multiple suffixes within the same span are coalesced into a single value). Add the `RangeKey{Set,Unset,Delete}` methods on the `sstable.Writer` that allow adding spans in order of start key. These methods have no requirement that the spans be fragmented or coalesced. Instead, the implementation handles the fragmenting of spans as keys are added. A `keyspan.Fragmenter` emits completed spans into a `rangekey.coalescer`, which aggregates the fragments for the same span into a `rangekey.CoalescedSpan`, which can then be used to write the individual `RangeKey{Set,Unset,Delete}` key / value pairs into the rang key block in the sstable. The new `RangeKey{Set,Unset,Delete}` methods have similar semantics to the existing `Set,Delete,DeleteRange,Merge` methods that write the key / value pairs at sequence number zero. In the range key case, these new methods are intended to be used by external callers when preparing sstables for ingestion. Add a data-driven test specific to the new range key methods. These tests are separate to the existing data driven test cases for the `AddRangeKey` method, which has stricter requirements on how keys are added (i.e. already ordered, fragmented and coalesced). Rename `AddInternalRangeKey` to `AddRangeKey` to better match the existing naming of `(*sstable.Writer).Add`. Add new utility methods to the `rangekey` package to aid in encoding range key values. Related to cockroachdb#1339.
8fc3e51
to
f2b06ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 6 files reviewed, 6 unresolved discussions (waiting on @jbowens and @sumeerbhola)
sstable/writer.go, line 442 at r2 (raw file):
Previously, sumeerbhola wrote…
where are we copying
start
? I am assuming that like the other slices it can be mutated by the caller when this method returns.
Added.
sstable/writer.go, line 497 at r2 (raw file):
Previously, sumeerbhola wrote…
nit: initializing
sv
here adds a level of indirection that is unnecessary for the if-block.
how aboutif item.Unset { unsets = append(unsets, item.Suffix) } else { sets = append(sets, rangekey.SuffixValue{Suffix: item.Suffix, Value: item.Value) }
Done.
sstable/writer_rangekey_test.go, line 50 at r2 (raw file):
Previously, sumeerbhola wrote…
Consider scrambling the bytes in the byte slices passed as parameters after the return from the method. We've had hard to track down bugs in the past.
Good point. Done.
cc: @erikgrinaker - FYI. Feel free to take this for a spin. There might be some subtleties we're not aware of that you may be able point out. |
TFTRs! |
Thanks @nicktrav, will do! |
Users of
sstable.Writer
other than Pebble itself (e.g. Cockroach)require a means of adding range keys to an sstable that is slightly more
flexible than the existing
AddRangeKey
method in the way in which keyspans can be added to the table.
Specifically,
AddRangeKey
requires that spans be sorted (start / endkey ascending, sequence number and key kind descending), in addition to
fragmented and coalesced (i.e. multiple suffixes within the same span
are coalesced into a single value).
Add the
RangeKey{Set,Unset,Delete}
methods on thesstable.Writer
that allow adding spans in order of start key. These methods have no
requirement that the spans be fragmented or coalesced. Instead, the
implementation handles the fragmenting of spans as keys are added. A
keyspan.Fragmenter
emits completed spans into arangekey.coalescer
,which aggregates the fragments for the same span into a
rangekey.CoalescedSpan
, which can then be used to write the individualRangeKey{Set,Unset,Delete}
key / value pairs into the rang key blockin the sstable.
The new
RangeKey{Set,Unset,Delete}
methods have similar semantics tothe existing
Set,Delete,DeleteRange,Merge
methods that write the key /value pairs at sequence number zero. In the range key case, these new
methods are intended to be used by external callers when preparing
sstables for ingestion.
Add a data-driven test specific to the new range key methods. These
tests are separate to the existing data driven test cases for the
AddRangeKey
method, which has stricter requirements on how keys areadded (i.e. already ordered, fragmented and coalesced).
Rename
AddInternalRangeKey
toAddRangeKey
to better match theexisting naming of
(*sstable.Writer).Add
.Add new utility methods to the
rangekey
package to aid in encodingrange key values.
Related to #1339.
Some open questions: