-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CBG-3853: Implementation of range sequence entry for skipped sequence slice #6764
Conversation
… slice + alterations of tests to add test cases for range entries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks fine, a few minor suggestions to help readability and to ensure we have complete test coverage.
db/skipped_sequence_test.go
Outdated
name: "list_full_range_items", | ||
inputList: []uint64{5, 15, 25, 35, 45}, | ||
expectedStartSeqList: []uint64{5, 15, 25, 28, 35, 45}, | ||
expectedEndSeqList: []uint64{10, 20, 26, 30, 40, 50}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about making these test definitions a bit more readable, so that it's more immediately obvious what the input ranges are and what the expected results are? In particular the fact that we're creating ranges of [input, input+5] isn't really obvious, and it's also hard to eyeball what the expected end ranges are.
I was thinking of something like:
input: [][]uint64{{5,10}, {15,20}, {25,30}, {35,40}, {45,50}}
expected: [][]uint64{{5,10}, {15,20}, {25,26}, {28,30}, {35,40}, {45,50}
With that approach you could use {25} for cases where we expect a single sequence, and {25,25} where we expect a range of length 1. You could also include the range len=2 and range len=3 cases in this test a bit more easily.
It might be personal preference, if you disagree you can ignore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, want to ensure we've got coverage for all permutations of removals, including:
- length 1 range
- length 2 ranges
- length 3 ranges, for removal of first/second/third (I think test below just removes the middle element)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, have made changes to have input and expected be 2-D array of uint64[][].
Added test cases to the test TestRemoveSeqFromSkipped
for length 1 range seq removal, and length 2 seq removal. Have extended test TestRemoveSeqFromThreeSequenceRange
to include a test for removing the start/last seq on a three sequence range.
…in code based off comments
db/skipped_sequence.go
Outdated
// if x == startSeq set startSeq+1, if x == lastSeq set lastSeq-1 | ||
if rangeElem.getStartSeq() == x { | ||
rangeElem.setStartSeq(x + 1) | ||
return nil | ||
} | ||
if rangeElem.getLastSeq() == x { | ||
rangeElem.setLastSeq(x - 1) | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If numSequences
is 2 for these cases, does that mean this range becomes a single sequence without changing the type?
Does that impact anything the next time it's run through code that does isRange()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the original approach was if the above code lines makes a range of 1, we were going to delete this and insert a new single entry in its place. Adam suggested we could avoid extra allocations by just altering the range to be one sequence. It seems to work well!
With regards to it running through isRange()
, it would still return true for being a range entry but I think this has no functional impact. Where isRange()
is used in binary search function, it will still find the sequence if the range is length 1. Also for removeSeq
we have a check for 1 sequence (above this code) that will just delete the entry. The only other way this is used is for adding new entries in PushSkippedSequenceEntry
which I feel doesn't have functional impact.
db/skipped_sequence.go
Outdated
var _ SkippedSequenceListEntry = &SingleSkippedSequence{} | ||
var _ SkippedSequenceListEntry = &SkippedSequenceRange{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious what the overhead of the interface will be vs. a single concrete type that can handle single sequences and ranges. ( https://syslog.ravelin.com/go-interfaces-but-at-what-cost-961e0f58a07b )
Most of the difference in behaviour is already in isRange
checks, and it'd save all of the no-op method implementations.
Having said that, this would be closer to a micro-optimisation than the planned refactor of skipped sequences, so I'm OK waiting for all of that work to be done first, since the refactor to change the interface to a single type wouldn't be too intrusive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this idea, and am fine doing it in a later optimization. However, it does make me wonder whether anything that's a no-op belongs on the interface. (setStartSeq/setLastSeq/extendRange). Looking at the code, this goes back to the fact that we're using isRange instead of a type switch. Was this intentional for performance reasons?
…for search function
Overall looks good, but I'm also wondering about the slightly non-standard use of interfaces here (with the isRange/no-ops). |
* CBG-3852: Implementation of single sequence entry for skipped sequence slice (#6747) * CBG-3852: implementation of single sequence entry for skipped sequence slice * minor updates to make setLastSeq no-op for single values * updates for review round * remove not needed comment * updates based off review * CBG-3853: Implementation of range sequence entry for skipped sequence slice (#6764) * CBG-3854: Integration of the skipped sequence slice into the cache (#6789) * CBG-3854: Integeration of the skipped sequence slice into the cache * remove old chnages not needed anymore + gh actions failures * updates based of review. Move stat update to updateStats and alter soem naming. Test updates to reflect stat chnages too. * remove incorrect incrementing of num current sequences inside PushSkippedSequenceEntry * Fix race condition in updateStats() * updates off review * tidy CleanSkippedSequenceQueue * stats update * CBG-3855: Sequence range removal support (#6801) * CBG-3855: support for seqeuence range removal * add comment to function + remove repeated code * refactor error handling, simplify stat calculation update * CBG-3850: Optimise releaseUnusedSequenceRange (#6831) * CBG-3850: Optimise releaseUnusedSequenceRange function + support for ranges in pending queue * updates to add _pushRangeToPending and _removeSubsetOfRangeFromSkipped to process duplicates sequences between unused rnage and pending/skipped lists * fix race condition in test * fix for incorrect logic for removing skipped range * updates based off review * updates to fix incorrect code + add extra test coverage of changes * change error name * CBG-3945: Fix for panic in _clip method on skipped sequence slice (#6842) * fix test to use old api
CBG-3853
Pre-review checklist
fmt.Print
,log.Print
, ...)base.UD(docID)
,base.MD(dbName)
)docs/api