-
Notifications
You must be signed in to change notification settings - Fork 591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(storage): support commit multi epoch for new compaction group #17749
Conversation
…nto li0k/storage_commit_multi_epoch
@@ -161,16 +188,19 @@ impl HummockManager { | |||
self.env.notification_manager(), | |||
&self.metrics, | |||
); | |||
let mut compaction_group_manager = | |||
compaction_group_manager_guard.start_compaction_groups_txn(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we only acquire the compaction_group_manager
lock only when batch_commit_for_new_cg
is not empty in L222?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved.
The original BTreeMapTransaction
cannot fulfill this, because it holds the mutation reference to the RwLockGuard, and they cannot be created and stored together, and otherwise there will be the problem of self-referencing.
In this PR, I change to let BTreeMapTransaction
support holding any pointer type that impl DerefMut<BTreeMap>
, so that the ownership of RwLockGuard
can be held by BTreeMapTransaction
and there won't be self-referencing. We further introduce a new struct DerefMutForward
, which forward &mut CompactionGroupManager
to &mut BTreeMap<...>
. This is required because RwLockGuard
wraps CompactionGroupManager
, but we actually need impl DerefMut<BTreeMap<...>>
.
|
||
batch_commit_info.insert(new_compaction_group_id, epoch_to_ssts); | ||
} | ||
let start_sst_id = next_sstable_object_id(&self.env, new_id_count).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to reserve new sst ids for the new compaction group, because its parent group will be StaticCompactionGroupId::NewCompactionGroup so that the reserved new sst ids won't never used anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this commit_epoch
, we will insert data to the newly created compaction together in a single version delta, so we may need to reserve some sst ids for the newly added data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The newly added SSTs in commit_epoch
already have their deterministic SST ids.
We only need reserve SST ids in advance during splitting compaction group, where branched SSTs are assigned ids in the reserved range.
See
pub fn init_with_parent_group( |
pub fn split_sst(sst_info: &mut SstableInfo, new_sst_id: &mut u64) -> SstableInfo { |
@Li0k Could you please confirm this again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it. I just check the usage of new_sst_start_id
, and indeed there is no need to set it when we create an empty new compaction group. Have changed to avoid passing this new_sst_start_id
value along the way.
group_id: compaction_group_id, | ||
parent_group_id: StaticCompactionGroupId::NewCompactionGroup | ||
as CompactionGroupId, | ||
new_sst_start_id: start_sst_id, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This specific new_sst_start_id will never be used, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here we just fill in a field required to create new compaction group. What is this field used for? If it's not used for newly created compaction group, I can change to fill a trivial value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this field used for
group_id: compaction_group_id, | ||
parent_group_id: StaticCompactionGroupId::NewCompactionGroup | ||
as CompactionGroupId, | ||
new_sst_start_id: 0, // No need to set it when `NewCompactionGroup` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you help to add an assertion in either init_with_parent_group or split_sst, to guarantee new_sst_start_id is 0 IFF NewCompactionGroup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a warning in non-debug context and an assertion in debug context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
For #17735.
In #17735 when we support partial checkpoint based backfill, the upstream won't wait for barrier collection of the creating downstream. Meanwhile, the creating downstream won't commit any data before completed and merged to upstream. Instead, the sst info of the creating downstream will be stored in memory. When completed and merged to upstream, all data of the creating downstream will be added together in a single
commit_epoch
. We will add the data to a new compaction group, and each epoch will still have a individual sub-level, with epoch as the sub-level id.For example, let's say we start creating a new mview at epoch1. While the creating downstream is consuming the upstream mview table data at the snapshot of epoch1, the upstream will continue working and consuming data from subsequent epoch 2, 3, 4 and so on.
When the downstream finish consuming the snapshot, it will continue consuming the epoch 2, 3, 4... data from the L0 log store. The data of the creating downstream won't be committed to hummock version until it catches up with the upstream, let's at epoch 6. When it catches up, we will complete it and merge it to the upstream graph. In epoch 6, the barrier will be injected and collected together from both upstream and downstream, and therefore the data of upstream and downstream will be mixed together. When we do
commit_epoch
at epoch6, we will first create a new compaction group, and add the data of the downstream mview epoch by epoch (epoch 1, 2, 3, 4, 5). Since data in epoch6 are synced together, the ssts of epoch6 will be split, some added to the original groups, and some aded to the new group.Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.