Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(storage): Introduces a split implementation based on split_key to replace the previous move implementation. #18594

Merged
merged 29 commits into from
Sep 29, 2024

Conversation

Li0k
Copy link
Contributor

@Li0k Li0k commented Sep 18, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Previously, we implemented the concept of compaction groups for hummock to achieve write/configuration isolation and to create new compaction groups via the table-level move function.

Since the sst key range generated by the move function is not contiguous, this makes it difficult to re-combine the resulting independent compaction groups (currently not supported). Because of this concern, we set the move threshold very high, fearing that the resulting compaciton group will not be able to be merged again, thus creating a large number of compaction groups.

This concern is contrary to our expectation that in scenarios with high write throughput (backfill, etc.), we would like to split the compaction to achieve parallel compaction, increase the overall compaction throughput, and make full use of the compactor node's scale capability.

Here are the changes involved in this PR for review.

  1. based on table_id level split function, compatible with the original Split interface.
  2. fix key_range/table_id/sst_size after sst split.
  3. Use (table_id, vnode = 0, epoch = MAX_REPRESENTABLE) as split key for splitting.

I'll summarise the changes to the file, for your code reivew:

  1. src/meta/src/hummock/manager/compaction/compaction_group_schedule.rs introduce helper functions that facilitate splitting, refactor the split interface based on the new implementatio
  2. src/storage/hummock_sdk/src/compaction_group/hummock_version_ext.rs reimplemented split logic based on split_key.
  3. src/storage/hummock_sdk/src/compaction_group/mod.rs provides a wrapper for the split_key based sst / level split implementation.

There may be more code to change in this PR, but the logic is more homogeneous, thank you very much for your review!

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the efforts!


// `partition_vnode_count` only works inside a table, to avoid a lot of slicing sst, we only enable it in groups with high throughput and only one table.
// The target `table_ids` might be split to an existing group, so we need to try to update its config
if table_ids.len() == 1 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this logic for all CG with single table? If yes, only checking the provided table_ids is not enough. Consider the following example:

original cg table_ids = [1, 2, 3]
split table_ids = [2]

resulting cg1 table_ids = [1]
resulting cg2 table_ids = [2]
resulting cg3 table_ids = [3]

All the resulting CGs need this logic. I wonder whether we should do this inside split_compaction_group_impl

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another issue if we do it outside of split_compaction_group_impl: the compaction group txn will not be committed within the same transaction of version delta, which may cause in consistency when error happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not required that all single table cgs must be partitioned, we just need to enable it for high throughput cgs, we could even consider deprecating this feature, but for the current PR we choose to keep it. Let me think about how to implement it atomically.

// split 2
// See the example in L603 and the split rule in `split_compaction_group_impl`.
let result_vec = self
.split_compaction_group_impl(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to make sure split1 and split2 happens atomically in the same txn? If not, what happen if split1 succeeds but split2 fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, from my point of view it doesn't need to be.

If split 1 succeeds and split 2 fails (split condition/network), we just wait for the next split to be triggered, if necessary.

let split_key = if group_construct.split_key.is_some() {
Some(Bytes::from(group_construct.split_key.clone().unwrap()))
} else {
None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this branch reachable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reachable when construct a new compaction group

@Li0k
Copy link
Contributor Author

Li0k commented Sep 26, 2024

PTAL @zwang28 @hzxa21


let (key_range_l, key_range_r) = {
let key_range = &origin_sst_info.key_range;
let l = KeyRange {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the context of split_sst, the relative ordering of sst.left,sst.right and split_key cannot be assumed.
Maybe assert l.left < l.right and r.left<=r.right, or handle it properly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. split_key = sst.left = sst.right (inclusive).
We don't want any resulted sst.left > sst.right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, refactor split_sst and add related UT

Copy link
Contributor

@zwang28 zwang28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please check whether the PR description is accurate. For example, the split key now uses MAX instead of 0.

SstSplitType::Right => {
assert_eq!(0, pos);
insert_table_infos.extend_from_slice(&level.table_infos[pos..]); // the sst at pos has been split to the right
level.table_infos = level.table_infos[0..pos].to_vec();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits: can be level.table_infos.clear()

@Li0k Li0k enabled auto-merge September 29, 2024 09:50
@Li0k Li0k added this pull request to the merge queue Sep 29, 2024
Merged via the queue into main with commit 83e3c4b Sep 29, 2024
31 of 33 checks passed
@Li0k Li0k deleted the li0k/storage_split_group_v2 branch September 29, 2024 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants