feat: enable shared source in session variable by default, and add cluster-level config to disable #18749
LGTM.
Shall we cherry-pick to release 2.1?
/// If false, the shared source will be disabled,
/// even if session variable set.
/// If true, it's decided by session variable `streaming_use_shared_source` (default true)
pub enable_shared_source: bool,
This doesn't follow the standard pattern of a session/global variable. In particular, the "cluster-level config" should be `SystemParams` instead of `RwConfig`. However, as the configuration is not user-facing but only for us, it's acceptable to me.
> standard pattern of a session/global variable
FYI, a session variable can actually also be changed globally via `ALTER SYSTEM`.
I just added this extra gate in case it's needed.
Yes, it's planned.
Will merge this after changing benchmark pipeline https://linear.app/risingwave-labs/issue/PERF-163
…stem variable to disable

The config is similar to `streaming_use_arrangement_backfill` (session) and `stream_enable_arrangement_backfill` (system)

BTW one problem found: Currently session variables have different styles:
- with rw prefix: `rw_streaming_enable_delta_join`, `rw_batch_enable_sort_agg`, `rw_enable_share_plan`
- without rw prefix: `streaming_use_arrangement_backfill`, `batch_enable_distributed_dml`

Signed-off-by: xxchan <xxchan22f@gmail.com>
…uster-level config to disable (#18749)

Signed-off-by: xxchan <xxchan22f@gmail.com>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
User interface
- `streaming_use_shared_source` (session variable): controls whether to use the shared source feature.
- `stream_enable_shared_source` (`[streaming.developer]` in `risingwave.toml`): setting it to `false` completely disables this feature in the cluster.
What's changed
The config is similar to `streaming_use_arrangement_backfill` (session) and `stream_enable_arrangement_backfill` (cluster).
For more discussions see https://risingwave-labs.slack.com/archives/C06AL46R10S/p1729051215981299
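The interaction between the two switches (the cluster-level config acts as a gate, and only when it is `true` does the session variable decide) can be sketched roughly as below. This is a minimal illustration, not the code from this PR: the struct names `StreamingDeveloperConfig` and `SessionConfig` and the function `use_shared_source` are hypothetical stand-ins, and only the two setting names and their `true` defaults come from the description.

```rust
/// Hypothetical stand-in for the `[streaming.developer]` section of `risingwave.toml`.
#[derive(Debug, Clone, Copy)]
struct StreamingDeveloperConfig {
    /// Cluster-level gate, described in this PR as `stream_enable_shared_source`.
    enable_shared_source: bool,
}

impl Default for StreamingDeveloperConfig {
    fn default() -> Self {
        // The PR description states that the config defaults to `true`.
        Self { enable_shared_source: true }
    }
}

/// Hypothetical stand-in for the per-session settings.
#[derive(Debug, Clone, Copy)]
struct SessionConfig {
    /// Session variable `streaming_use_shared_source`.
    streaming_use_shared_source: bool,
}

impl Default for SessionConfig {
    fn default() -> Self {
        // The PR enables the session variable by default.
        Self { streaming_use_shared_source: true }
    }
}

/// Shared source is used only when the cluster gate is open AND the session opts in.
/// If the cluster gate is `false`, the session variable is ignored.
fn use_shared_source(cluster: &StreamingDeveloperConfig, session: &SessionConfig) -> bool {
    cluster.enable_shared_source && session.streaming_use_shared_source
}
```

With both defaults in place the function returns `true`; turning off either the cluster gate or the session variable makes it return `false`.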
- Rename `enable_shared_source` to `streaming_use_shared_source`, to be consistent with `streaming_use_arrangement_backfill`.
- Default: `true`.
- `stream_enable_shared_source` in `[streaming.developer]` in `risingwave.toml`, defaults to `true`. When this is `false`, the feature is disabled in the cluster (to re-enable it, update the toml and restart). When it's `true`, the session variable is respected.
- TODO: cloud, similar to https://github.com/risingwavelabs/risingwave-cloud/pull/7664/
  We will not have special handling on cloud.
- `ALTER SOURCE` needs to use a non-shared source, as altering a shared source is not supported yet.
- `rate_limit_source_kafka.slt`: altering `source_rate_limit` does not affect source backfill.
Benchmarks
As discussed with @lmatz, we will first disable shared source in the daily benchmark pipeline. After fixing things like metrics collection, we will add a new weekly benchmark for shared source.
We will need to tweak the benchmark pipelines after enabling shared source, otherwise they may fail.
Nexmark
Nexmark first creates a Kafka topic and fills it with data, then builds an MV. As a result, the entire benchmark process consists of backfill.
We can see that backfill performance is indeed worse than source performance.
Longevity
It didn't fail, but the performance behavior will look different.
We have `enable_create_mv_nexmark_table = true` by default. So the 3 source executors (bid/auction/person are MVs) become 1 source executor + 3 backfill executors.
Previously the 3 sources had individual consumption progress, but now they share consumption progress. This is like the "unified topic" in the Nexmark benchmark.
enable_create_mv_nexmark_table = false
Checklist
- `./risedev check` (or alias, `./risedev c`)
Documentation
Release note
Need release note & doc for the whole shared source feature: https://www.notion.so/risingwave-labs/Doc-for-Shared-Source-10ef0d69cb768078b2a5e05bfa5d4807