feat: new table stream graph #12240
Conversation
dml_node = match kind {
    PrimaryKeyKind::UserDefinedPrimaryKey | PrimaryKeyKind::RowIdAsPrimaryKey => {
        RequiredDist::hash_shard(pk_column_indices)
            .enforce_if_not_satisfies(dml_node, &Order::any())?
    }
    PrimaryKeyKind::AppendOnly => StreamExchange::new_no_shuffle(dml_node).into(),
};
The dist key changes in the planner tests are because we now generate the exchange here, whereas previously we added it in the materialize node, via:

risingwave/src/frontend/src/optimizer/mod.rs
Line 482 in aa5e798
let table_required_dist = {
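For background, here is a minimal self-contained sketch of the enforce-if-not-satisfied idea with toy types (`Dist`, `Node`, and `enforce_hash_shard` are my own names, not RisingWave's API): a required hash-shard distribution inserts an exchange only when the input's current distribution does not already satisfy it, which is why the exchange now shows up on the DML side of the plan.

```rust
// Toy sketch of "enforce if not satisfied"; all names are assumptions.
#[derive(Clone, PartialEq, Debug)]
enum Dist {
    SomeShard,
    HashShard(Vec<usize>), // hash-distributed on these column indices
}

struct Node {
    name: &'static str,
    dist: Dist,
}

// Insert an Exchange only when the input does not already satisfy
// the required hash-shard distribution.
fn enforce_hash_shard(node: Node, keys: Vec<usize>) -> Node {
    let required = Dist::HashShard(keys);
    if node.dist == required {
        node
    } else {
        Node { name: "Exchange", dist: required }
    }
}

fn main() {
    let dml = Node { name: "Dml", dist: Dist::SomeShard };
    let planned = enforce_hash_shard(dml, vec![0]);
    println!("{} -> {:?}", planned.name, planned.dist); // Exchange -> HashShard([0])
}
```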
I used the following steps to test its compatibility with the current main branch, and it seems to have no problems: run on main, then restart to the current branch.
Codecov Report
Attention:
Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #12240     +/-   ##
==========================================
+ Coverage   67.86%    67.87%   +0.01%
==========================================
  Files        1526      1526
  Lines      260047    260156     +109
==========================================
+ Hits       176468    176588     +120
+ Misses      83579     83568      -11
Please ping me once this PR is ready for review. :)
This appears to be ready for review now 🥵
src/frontend/src/optimizer/mod.rs (outdated)
StreamUnion::new(Union {
    all: true,
    inputs: vec![external_source_node, dml_node],
I’m not sure which id needs to be specified as the source here. In the case of ‘with connector’, we have two sources.
The plans look great!
@@ -119,9 +119,10 @@ impl VirtualNode {
     if let Ok(idx) = keys.iter().exactly_one()
         && let ArrayImpl::Serial(serial_array) = &**data_chunk.column_at(*idx)
     {
+        let fallback_row_id = Serial::from(rand::random::<i64>());
Why do we need this? And also for compute_row?
In the scenario where there is no user-defined primary key, the user's insert operation, upon entering through the DML node, first undergoes a shuffle before entering the union, and only after that reaches the row id generator node. Therefore, at the time of the shuffle, the row id column is still empty. This was previously handled by using unwrap_or_default, but that would shuffle all the traffic to vnode 0, resulting in severe traffic skew. Therefore, we've adopted the random method to spread it out a bit; specifically, each chunk randomizes one vnode.
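A self-contained sketch of that behavior under stated assumptions (a toy 256-vnode space, plain i64 row ids, and the `rand` crate as a dependency; not the actual VirtualNode code): every row whose row id is still NULL shares one fallback id drawn once per chunk, so the chunk lands on a single random vnode instead of piling up on vnode 0.

```rust
// Assumed constant; the real vnode count lives elsewhere in the codebase.
const VNODE_COUNT: u64 = 256;

fn compute_vnodes(row_ids: &[Option<i64>]) -> Vec<u64> {
    // Drawn once per chunk: all NULL row ids map to the same random vnode,
    // keeping the chunk together while still spreading load across chunks.
    let fallback: i64 = rand::random();
    row_ids
        .iter()
        .map(|id| (id.unwrap_or(fallback) as u64) % VNODE_COUNT)
        .collect()
}

fn main() {
    let vnodes = compute_vnodes(&[None, Some(7), None]);
    println!("{vnodes:?}"); // e.g. [42, 7, 42]: the two NULL rows share one vnode
}
```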
Indeed, we need to add support for compute_row; I'm going to make some modifications. On second thought, it seems like it might not be necessary. 🤔
This looks too hacky to me. 🥵 It could be surprising that this is not a pure function. Since the motivation here is to avoid data skew, I guess using the hash code of the row is much better, as it at least guarantees stability.

Can we add another RowIdGen here as well? We just need to ensure that there's no overlap with the latter one, which can be done by allocating a flag of one bit for this information, I guess.
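A sketch of the one-bit idea under an assumed row id layout (this is not RisingWave's actual row id format): each generator owns one value of a reserved bit, so the two id spaces can never overlap.

```rust
// Toy sketch: partition the sequence space between two generators by fixing
// one reserved bit. Using bit 62 here is an assumption for illustration.
const FLAG_BIT: i64 = 1 << 62;

fn row_id(second_generator: bool, seq: i64) -> i64 {
    debug_assert!((seq & FLAG_BIT) == 0, "sequence must not use the flag bit");
    if second_generator { seq | FLAG_BIT } else { seq }
}

fn main() {
    // Ids from the two generators can never collide: the flag bit differs.
    assert_ne!(row_id(false, 42), row_id(true, 42));
}
```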
Makes sense, let’s use a hash of the entire row as a fallback when there’s no row id. 👍
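A minimal sketch of that fallback using the crc32fast crate, whose Hasher implements std::hash::Hasher (the actual code uses a CRC32-based Crc32FastBuilder; the function name and vnode count here are mine): hashing the whole row is a pure function, so the same row always maps to the same vnode.

```rust
use std::hash::{Hash, Hasher};

// Hash the entire row to pick a vnode; stable across retries,
// unlike a random fallback. `vnode_count` is caller-provided.
fn fallback_vnode<R: Hash>(row: &R, vnode_count: u64) -> u64 {
    let mut hasher = crc32fast::Hasher::new();
    row.hash(&mut hasher);
    hasher.finish() % vnode_count
}

fn main() {
    let row = (1i32, "hello"); // a toy two-column row
    let v = fallback_vnode(&row, 256);
    assert_eq!(v, fallback_vnode(&row, 256)); // deterministic
    println!("vnode = {v}");
}
```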
> Can we add another RowIdGen here as well? We just need to ensure that there's no overlapping with the latter one, which can be done by allocating a flag of one bit for this information I guess.

A little strange to me, because we cannot know which data source will produce more records.
@@ -37,7 +37,6 @@ impl StreamUnion {
     pub fn new(logical: generic::Union<PlanRef>) -> Self {
         let inputs = &logical.inputs;
         let dist = inputs[0].distribution().clone();
-        assert!(inputs.iter().all(|input| *input.distribution() == dist));
Any violation of this assertion? 👀
I'm wondering what the distribution will be in this case. 🤔 SomeShard?
Yeah, it's okay here. I'll add the assert back in.

It doesn't seem to work. When executing the following statement, one side is SomeShard and the other side is HashShard.
CREATE TABLE msrc (v INT) WITH (
connector = 'datagen',
fields.v.kind = 'sequence',
fields.v.start = '1',
fields.v.end = '10',
datagen.rows.per.second='15',
datagen.split.num = '1'
) FORMAT PLAIN ENCODE JSON;
input StreamExchange [no_shuffle] { dist: SomeShard } -> [SomeShard]
input StreamExchange { dist: HashShard(_row_id) } -> [1]
SomeShard and HashShard should be compatible here, resulting in SomeShard. I've handled it, and an assert is still performed for typical unions.
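A toy sketch of that compatibility rule (the enum and function names are mine, not the optimizer's): identical input distributions are still asserted for typical unions, while a mix of SomeShard and HashShard degrades the union's output to SomeShard.

```rust
#[derive(Clone, PartialEq, Debug)]
enum Dist {
    SomeShard,
    HashShard(Vec<usize>),
}

fn union_dist(inputs: &[Dist]) -> Dist {
    let first = &inputs[0];
    if inputs.iter().all(|d| d == first) {
        // Typical union: all inputs agree on the distribution.
        first.clone()
    } else {
        // Mixed SomeShard / HashShard inputs: both are sharded somehow,
        // so the union can only promise SomeShard.
        assert!(inputs
            .iter()
            .all(|d| matches!(d, Dist::SomeShard | Dist::HashShard(_))));
        Dist::SomeShard
    }
}

fn main() {
    let d = union_dist(&[Dist::SomeShard, Dist::HashShard(vec![0])]);
    assert_eq!(d, Dist::SomeShard);
}
```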
I guess it's time to utilize the …
Could you please elaborate more on the background and motivation? Also, it may be better to have a more descriptive PR title.
if project.is_empty() {
    return project.row().hash(Crc32FastBuilder).into();
}
Passing the full indices here would be better 🤔
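A sketch of the suggestion under my own toy signature (not the actual helper): rather than special-casing an empty projection inside the hash function, the caller passes the full set of column indices so the hash always covers the whole row.

```rust
use std::hash::{Hash, Hasher};

// Hash the columns selected by `indices`; crc32fast is assumed as the hasher.
fn hash_by_indices(row: &[i64], indices: &[usize]) -> u64 {
    let mut hasher = crc32fast::Hasher::new();
    for &i in indices {
        row[i].hash(&mut hasher);
    }
    hasher.finish()
}

fn main() {
    let row = [10i64, 20, 30];
    // The caller passes all indices instead of an empty list.
    let all: Vec<usize> = (0..row.len()).collect();
    println!("vnode = {}", hash_by_indices(&row, &all) % 256);
}
```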
 }

 impl StreamRowIdGen {
     pub fn new(input: PlanRef, row_id_index: usize) -> Self {
-        let distribution = if input.append_only() {
+        if input.append_only() {
I guess we cannot check this in the new function; instead, passing the flag is needed. DmlExecutor is non-append-only.
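A toy sketch of the fix (the types here are placeholders; the PR later adds a `new_with_dist` constructor): the caller supplies the distribution explicitly instead of `new` deriving it from the input's append-only flag, since the input coming from the DML executor is not append-only even when the table is.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Dist {
    SomeShard,
    HashShard(Vec<usize>),
}

struct StreamRowIdGenNode {
    row_id_index: usize,
    dist: Dist,
}

impl StreamRowIdGenNode {
    // The caller decides the distribution; no append-only check in here.
    fn new_with_dist(row_id_index: usize, dist: Dist) -> Self {
        Self { row_id_index, dist }
    }
}

fn main() {
    // DML path: explicitly hash-shard on the row id column.
    let node = StreamRowIdGenNode::new_with_dist(0, Dist::HashShard(vec![0]));
    println!("row_id_index = {}, dist = {:?}", node.row_id_index, node.dist);
}
```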
For chunks where the row id is none, hashing each row individually can break up the chunk and create a time difference in the calculation of data within the same chunk, leading to unexpected results (like DISTINCT ON without ORDER BY). For many of our end-to-end tests, this might produce random results. Some potential solutions we can consider are: …
That's true. 🥵 But why does the random vnode approach not encounter this problem? Did we use the same vnode for all rows in a single chunk?
Generally LGTM!

BTW, I realize that …
Oh, I think the possible reason could be that many of our tests are run on initial tables, which haven't been scaled or undergone vnode redistribution. So according to our current implementation, when the parallelism of the entire graph is the same, the vnode allocation is also sequential, like (1,2,3), (4,5,6), etc. In the previous implementation, after the DML comes in, each row id generator would add a row id according to its own allocated vnodes. Hence, the row ids assigned to an incoming data chunk are also (1,2,3,1,2,3,1,2,3), and after being passed through the serial type, they are distributed to the following (1,2,3) operators. This keeps these rows together. But if we change it to calculate the hash based on the entire row, it would lead to a very scattered distribution of hashes, so they would be randomly assigned to the subsequent operators. The data in our e2e tests is quite limited, so we didn't encounter this problem before. 🥵
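A toy illustration of that point (the vnode ownership and counts are assumptions): a generator that stamps only its own vnodes round-robin keeps a chunk's rows on a small, contiguous set of downstream actors, whereas a whole-row hash would scatter them across all vnodes.

```rust
fn main() {
    // Vnodes owned by one row id generator under sequential allocation.
    let my_vnodes = [1u64, 2, 3];
    // Round-robin stamping: rows of a chunk cycle through only these vnodes.
    let stamped: Vec<u64> = (0..9).map(|i| my_vnodes[i % my_vnodes.len()]).collect();
    println!("{stamped:?}"); // [1, 2, 3, 1, 2, 3, 1, 2, 3]: rows stay together

    // A whole-row hash would instead spread these rows over all 256 vnodes,
    // which is what scatters a chunk across many downstream operators.
}
```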
That's interesting. If I understand you correctly, records fewer than …
Retested this PR's compatibility with the current main branch: run on the main branch, then restart to the current branch.
Cool! Is it possible to add it to the CI compatibility test? cc @kwannoel
#13403 will require testing against the 4 latest versions. That will be a prerequisite to test this PR. What additional things need to be tested? I saw some … If it's just normal DML, DQL, DDL, the backwards-compat test already uses nexmark and TPC-H queries for testing.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
WIP, some details require further discussion.

with external source:
- no user defined primary key
- append only
- user defined primary key

without external source (the result of removing the source fragment, even though it might seem strange):
- no user defined primary key
- append only
- user defined primary key
Checklist
./risedev check (or alias, ./risedev c)