fix(batch): schedule single task for singleton table scan #5907

Merged
8 commits merged into main from bz/minor-refactor-batch-scan-scheduling
Oct 19, 2022

@BugenZhao BugenZhao commented Oct 18, 2022

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

For a singleton table, we make the distribution of the table scan Single instead of UpstreamHashShard(vec![]).

Per discussion with @liurenjie1024: since the root stage is always executed on the frontend, we insert an additional Single exchange if necessary to ensure the table scan runs on the compute node, just like what we do for DMLs.
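The two changes above can be illustrated with a minimal sketch. All types and function names here (`Distribution`, `Plan`, `scan_distribution`, `ensure_scan_off_root`) are hypothetical, simplified stand-ins and not the actual planner API; the sketch also assumes that a singleton table is one with an empty distribution key.

```rust
/// Hypothetical, simplified stand-ins for the planner's types.
#[derive(Debug, Clone, PartialEq)]
enum Distribution {
    /// All data lives in a single place.
    Single,
    /// Data is sharded by hashing the given upstream column indices.
    UpstreamHashShard(Vec<usize>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String, dist: Distribution },
    HashAgg { input: Box<Plan> },
    Exchange { dist: Distribution, input: Box<Plan> },
}

/// Assumption: an empty distribution key marks a singleton table,
/// which gets `Single` rather than `UpstreamHashShard(vec![])`.
fn scan_distribution(dist_key: &[usize]) -> Distribution {
    if dist_key.is_empty() {
        Distribution::Single
    } else {
        Distribution::UpstreamHashShard(dist_key.to_vec())
    }
}

/// True if a scan is reachable from the root without crossing an
/// exchange, i.e. the scan would run in the root stage.
fn scan_in_root_stage(plan: &Plan) -> bool {
    match plan {
        Plan::Scan { .. } => true,
        Plan::HashAgg { input } => scan_in_root_stage(input),
        Plan::Exchange { .. } => false,
    }
}

/// Wrap the plan in a `Single` exchange only when the scan would
/// otherwise end up in the root stage.
fn ensure_scan_off_root(plan: Plan) -> Plan {
    if scan_in_root_stage(&plan) {
        Plan::Exchange { dist: Distribution::Single, input: Box::new(plan) }
    } else {
        plan
    }
}
```

In this sketch the wrapper is idempotent: a plan whose root is already an exchange is returned unchanged, which matches the "if necessary" wording above.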

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@github-actions github-actions bot added the type/fix Bug fix label Oct 18, 2022
@BugenZhao BugenZhao marked this pull request as ready for review October 18, 2022 13:28
codecov bot commented Oct 18, 2022

Codecov Report

Merging #5907 (dc10e0b) into main (938f427) will decrease coverage by 0.02%.
The diff coverage is 81.81%.

@@            Coverage Diff             @@
##             main    #5907      +/-   ##
==========================================
- Coverage   74.89%   74.86%   -0.03%     
==========================================
  Files         924      924              
  Lines      147014   146913     -101     
==========================================
- Hits       110104   109992     -112     
- Misses      36910    36921      +11     
Flag Coverage Δ
rust 74.86% <81.81%> (-0.03%) ⬇️

Impacted Files                                         Coverage Δ
src/frontend/src/handler/query.rs                      20.49% <0.00%> (ø)
...ntend/src/optimizer/plan_node/stream_table_scan.rs  96.93% <ø> (ø)
src/frontend/src/scheduler/error.rs                    8.33% <ø> (ø)
src/frontend/src/scheduler/local.rs                    0.00% <0.00%> (ø)
src/frontend/src/scheduler/distributed/stage.rs        21.55% <33.33%> (ø)
src/frontend/src/scheduler/plan_fragmenter.rs          76.57% <74.24%> (-8.04%) ⬇️
src/frontend/src/optimizer/mod.rs                      96.70% <100.00%> (+0.61%) ⬆️
...frontend/src/optimizer/plan_node/batch_seq_scan.rs  93.86% <100.00%> (-0.36%) ⬇️
src/frontend/src/optimizer/plan_visitor.rs             90.69% <100.00%> (+3.60%) ⬆️
src/frontend/src/scheduler/distributed/query.rs        74.81% <100.00%> (+0.91%) ⬆️
... and 4 more

@xxchan xxchan left a comment

LGTM

@liurenjie1024 liurenjie1024 left a comment

LGTM

mergify bot commented Oct 19, 2022

Hey @BugenZhao, this pull request failed to merge and has been dequeued from the merge train. If you believe your PR failed in the merge train because of a flaky test, requeue it by clicking "Update branch" or pushing an empty commit with git commit --allow-empty -m "rerun" && git push.

Comment on lines +165 to +168
BatchExchange { order: [], dist: Single }
└─BatchProject { exprs: [max(s.v)] }
  └─BatchHashAgg { group_key: [s.k], aggs: [max(s.v)] }
    └─BatchScan { table: s, columns: [s.k, s.v], distribution: Single }

@BowenXiao1999 BowenXiao1999 Oct 19, 2022

Take this plan for example: the root fragment's task is always executed on the frontend (FE), both before and after this PR. The only difference is that the root fragment is a singleton exchange in one case and the hash agg in the other.

What's the benefit of this change? 🤔 Is it to unify with DML?

Contributor

It's for this bug: #4164
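
The difference discussed in this thread can be sketched by cutting the plan into stages at each exchange. This is a rough illustration only: `Node`, `split_stages`, and `collect` are hypothetical names, not the real fragmenter, and the sketch assumes every exchange starts a new stage for its input.

```rust
/// Hypothetical, simplified plan node; not the real planner types.
#[derive(Debug)]
enum Node {
    Exchange(Box<Node>),
    Op(&'static str, Option<Box<Node>>),
}

/// Split a plan into stages, cutting at every exchange.
/// `stages[0]` is the root stage; later stages feed an exchange above them.
fn split_stages(root: &Node) -> Vec<Vec<&'static str>> {
    let mut stages = vec![Vec::new()];
    collect(root, 0, &mut stages);
    stages
}

fn collect(node: &Node, stage: usize, stages: &mut Vec<Vec<&'static str>>) {
    match node {
        Node::Exchange(input) => {
            // The exchange stays in the current stage;
            // its input opens a new stage.
            stages[stage].push("BatchExchange");
            stages.push(Vec::new());
            let child = stages.len() - 1;
            collect(input, child, stages);
        }
        Node::Op(name, input) => {
            stages[stage].push(*name);
            if let Some(input) = input {
                collect(input, stage, stages);
            }
        }
    }
}
```

Under these assumptions, the plan quoted above yields a root stage containing only the exchange, with the project, hash agg, and scan in a second stage. Without the exchange, all three operators, including the scan, would sit in the root stage.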

@mergify mergify bot merged commit 3287ad7 into main Oct 19, 2022
@mergify mergify bot deleted the bz/minor-refactor-batch-scan-scheduling branch October 19, 2022 04:09
Development

Successfully merging this pull request may close these issues.

batch: singleton seq scan on local mode