Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dev): include CLI args in config file and allow overwrite #7523

Closed
wants to merge 20 commits into from

Conversation

Gun9niR
Copy link
Contributor

@Gun9niR Gun9niR commented Jan 24, 2023

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

We use overwrite instead of override because the latter is a keyword in rust.

In RFC risingwavelabs/rfcs#34, we decided that CLI args should be a subset of the config file containing user-facing options.

I realize that it's difficult to put the CLI args to existing sections of risingwave.toml for the following two reasons:

  • Configs such as host, prometheus_listen_addr's default values vary from worker types, so they cannot be shared in the server section.
  • Configs such as async_stack_trace, health_check_listener_addr are much related to the worker type, so they don't fit well into the categorization of storage, streaming or batch which is based on modules (in terms of code).

So I used a new section for each worker type, i.e. meta (reusing the existing one), compute_ndoe, compactor, frontend and frontend. To distinguish them from the existing sections, we can say that each newly added sections are categorized by worker types (physically deployed), while the old ones are categorized by modules (in terms of code). The former applies to exactly one worker type, while the latter may apply to multiple.

The CLI args that each worker type accepts remain the same. They will overwrite a subset of the entries in the corresponding section (e.g. args of compactor overwrites [compactor] section)

Checklist

  • I have written necessary rustdoc comments
    - [ ] I have added necessary unit tests and integration tests
    - [ ] I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features).
  • All checks passed in ./risedev check (or alias, ./risedev c)

Documentation

If your pull request contains user-facing changes, please specify the types of the changes, and create a release note. Otherwise, please feel free to remove this section.

Types of user-facing changes

Please keep the types that apply to your changes, and remove those that do not apply.

  • Installation and deployment
  • RisingWave cluster configuration changes

Release note

Please create a release note for your changes. In the release note, focus on the impact on users, and mention the environment or conditions where the impact may occur.

Refer to a related PR or issue link (optional)

@Gun9niR Gun9niR marked this pull request as draft January 24, 2023 17:03
@Gun9niR Gun9niR mentioned this pull request Jan 25, 2023
9 tasks
@Gun9niR Gun9niR requested review from BugenZhao, xxchan and fuyufjh and removed request for BugenZhao January 25, 2023 05:35
@Gun9niR Gun9niR marked this pull request as ready for review January 25, 2023 05:36
xxchan

This comment was marked as resolved.

@codecov

This comment was marked as resolved.

@fuyufjh fuyufjh mentioned this pull request Jan 26, 2023
Copy link
Member

@fuyufjh fuyufjh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the rest

src/meta/src/lib.rs Show resolved Hide resolved
src/compute/src/lib.rs Outdated Show resolved Hide resolved
#[clap(long)]
pub port: Option<u16>,
pub meta_addr: Option<String>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should meta_addr be a shared config instead of being under the [frontend] section?

Copy link
Contributor Author

@Gun9niR Gun9niR Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's for backward compatibility 🤣 This PR does not implement system param.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it won't hurt compatibility:

  • The CLI arg is not changed
  • The [frontend] section is newly added. I think eric means whether to put the variable in a different section. I also think that would be better. But it is also reasonable if you want to keep this PR straighforward and refactor it later.

Copy link
Contributor Author

@Gun9niR Gun9niR Jan 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The limitation of sharing clap and serde in the same struct is that the CLI args can only be the subset of one XxxConfig. If we, say, put meta_addr in [server], then CLI should both overwrite FrontendConfig and ServerConfig. I guess in that case we would still need to implement a proc macro on our own?

[frontend]
listen_addr = "127.0.0.1:4566"

[compute_node]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about [compute]? Only for consistency of style 🤣

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or change others to [frontend_node], [compactor_node]...? 🤪

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks misleading. What does it mean if we have multiple frontends or compute nodes?

Copy link
Contributor Author

@Gun9niR Gun9niR Jan 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They can read from different files? Or just overwrite with CLI.

Copy link
Member

@BugenZhao BugenZhao Jan 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're suggesting different tomls for different physical nodes, then we should also naturally make toml node-type-specific. That is, if we should have to specify different config for every single frontend, why don't we specify different config for all frontends, where other sections like streaming will be redundant?

Copy link
Member

@BugenZhao BugenZhao Jan 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still confusing to me. 🥺 Users may think that the "frontend address" must be the same for the one in the single and shared toml, with the one the frontend launches with (might be overridden in frontend CLI), though we know that the address in toml is never read by other components in the cluster. The essential reason is that, this is not a cluster-level config.

Copy link
Member

@fuyufjh fuyufjh Jan 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either looks good to me.

single and shared toml

This might not be an enforced rule in my mind. In principle, the process-level config should be different for each node, because there are local files and nothing enforces them to be identical.

What we did was to make these users happy, for example, by allowing the node-level arguments configurable via CLI args, so that they can use a shared config file and different CLI args. But this is only a sugar.

Saying that there is another group of users, who would like to use a toml library to generate a full config for each node respectively, I think this is also fine. Actually, MySQL/PG users have to do so, considering that MySQL/PG can be clustered to be Paxos HA group and they must listen on different IPs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer single toml because there has to be some configs in the toml file but not in CLI that can vary from nodes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tomls under config directory are used by RiseDev, which does not follow the way of "another group of users". So what about just leaving the addresses fields unset, as we always specify the addresses with CLI args? Using a comprehensive config file is also acceptable from this PR, but it's not a way we, as RiseDev users, to use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about just leaving the addresses fields unset, as we always specify the addresses with CLI args

Since using multiple files won't be a problem, I think these fields can still appear in the file 👀

we should also naturally make toml node-type-specific. That is, if we should have to specify different config for every single frontend, why don't we specify different config for all frontends, where other sections like streaming will be redundant?

We can certainly consider different schema for each node type. But let's not introduce too much change in this PR.

@Gun9niR

This comment was marked as resolved.

@xxchan

This comment was marked as resolved.

@Gun9niR

This comment was marked as outdated.

@xxchan
Copy link
Member

xxchan commented Jan 27, 2023

I will take a look soon. BTW, did you take a look at crates for layered configuration like https://github.com/Xuanwo/serfig and https://github.com/mehcode/config-rs? 🤔

@Gun9niR

This comment was marked as resolved.

@xxchan xxchan added the user-facing-changes Contains changes that are visible to users label Jan 27, 2023
@Gun9niR
Copy link
Contributor Author

Gun9niR commented Jan 27, 2023

BTW, how do we tell if an CLI arg should accept env variable?

Copy link
Member

@xxchan xxchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I browsed the code and I thought I understood your intention.

It looks good to use the proc macro to implement OverwriteConfig (its impl not reviewed yet though). However, I still think OverwriteConfig is unnecessary at all.

neither of them supports overwriting a subset of the fields from CLI args

I guess it's not the case. Correct me if I'm wrong. e.g.,

struct Config {
    #[clap(...)]
    #[serde(...)]
    pub a: String,
    #[serde(...)]
    pub b: String,
}

In this case only a can be override in CLI.

Also ping @Xuanwo if you would like offer some help :p


BTW, how do we tell if an CLI arg should accept env variable?

I guess it's generally OK to add env vars for all args with few exceptions? 🤔


Some other notes:

  • The section [meta] seems to be both a "worker type" like [compute_note] and a functional component like [streaming]. Should we split it? 🤔
  • Could you please write a shorter and intuitive summary without too much tech decisions for the PR description (or release note)? e.g., "add [compute_node] section in config file, which corresponds to the CLI args of compute-node

src/tests/compaction_test/src/lib.rs Outdated Show resolved Hide resolved
src/compute/src/lib.rs Outdated Show resolved Hide resolved
src/compute/src/lib.rs Outdated Show resolved Hide resolved
src/frontend/src/lib.rs Outdated Show resolved Hide resolved
#[clap(long)]
pub port: Option<u16>,
pub meta_addr: Option<String>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it won't hurt compatibility:

  • The CLI arg is not changed
  • The [frontend] section is newly added. I think eric means whether to put the variable in a different section. I also think that would be better. But it is also reasonable if you want to keep this PR straighforward and refactor it later.

src/frontend/src/lib.rs Show resolved Hide resolved
[frontend]
listen_addr = "127.0.0.1:4566"

[compute_node]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or change others to [frontend_node], [compactor_node]...? 🤪

@Gun9niR
Copy link
Contributor Author

Gun9niR commented Jan 27, 2023

Right, I was so stupid to neglect the skip attribute in clap so I thought it was necessary to define a separate struct for overwrite 😅. Using clap and serfig I was able to create two XxxConfig instances, one from cli, one from the file, and merge them together. And the docs should be naturally preserved.

But we still need the OverwriteConfig trait to specify which field in RwConfig to overwrite.

@Gun9niR Gun9niR force-pushed the zhidong/system-param branch from 7d20bdf to 470287c Compare January 27, 2023 17:56
@Xuanwo
Copy link
Contributor

Xuanwo commented Jan 27, 2023

It looks good to use the proc macro to implement OverwriteConfig (its impl not reviewed yet though).

Does OverwriteConfig mean args overwrites env, and env overwrite file? If so, serfig already implemented this:

really usage from databend.

impl Config {
  pub fn load(with_args: bool) -> Result<Self> {
      // Parse arg first because we should respect `--config-file`.
      let mut arg_conf = Self::default();
  
      if with_args {
          arg_conf = Self::parse();
      }
  
      let mut builder: serfig::Builder<Self> = serfig::Builder::default();
  
      // Load from config file first.
      {
          let config_file = if !arg_conf.config_file.is_empty() {
              arg_conf.config_file.clone()
          } else if let Ok(path) = env::var("CONFIG_FILE") {
              path
          } else {
              "".to_string()
          };
  
          if !config_file.is_empty() {
              builder = builder.collect(from_file(Toml, &config_file));
          }
      }
  
      // Then, load from env.
      builder = builder.collect(from_env());
  
      // Finally, load from args.
      if with_args {
          builder = builder.collect(from_self(arg_conf));
      }
  
      Ok(builder.build()?)
  }
}

src/compute/src/lib.rs Outdated Show resolved Hide resolved
@Xuanwo
Copy link
Contributor

Xuanwo commented Jan 28, 2023

Here are some notes to use serfig: serfig will use Default to decide which value to take. So it's required to ensure all instances have the same default value.

https://github.com/Xuanwo/serde-env can also help simplify the work of loading envs.

For any bugs/misbehavior that could happen, please rise an issue, I'm willing to provide help.


use crate::server::compactor_serve;

pub fn start(opts: CompactorConfig) -> Pin<Box<dyn Future<Output = ()> + Send>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When will this opts be used? 👀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • To get the path of the config file
  • To overwrite the config read from the file

Comment on lines +60 to +64
pub fn start(opts: MetaConfig) -> Pin<Box<dyn Future<Output = ()> + Send>> {
// WARNING: don't change the function signature. Making it `async fn` will cause
// slow compile in release mode.
Box::pin(async move {
let config = load_config(&opts.config_path);
let config = load_config(&opts.config_path.clone(), Some(opts));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is opts only used for CLI overwrite here? If so, it'll be better to give it a more specific name to show that this should not be used as the final config. Same for other occurrences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load_config takes ownership of opts, so it should not be usable after the invocation 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it's possible that someone may want to clone it 😅 Maybe some more comments?

src/frontend/src/session.rs Show resolved Hide resolved
src/compute/src/lib.rs Outdated Show resolved Hide resolved
@BugenZhao
Copy link
Member

BugenZhao commented Jan 28, 2023 via email

@Gun9niR
Copy link
Contributor Author

Gun9niR commented Jan 30, 2023

As discussed offline, adding new sections will make the config file more confusing and complicated, and some items are not suitable to put into any existing sections. So what we actually need is just an overwriting framework. For the sake of clarity, I will move the implementation to another PR.

@Gun9niR Gun9niR closed this Jan 30, 2023
@Gun9niR Gun9niR deleted the zhidong/system-param branch February 2, 2023 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature user-facing-changes Contains changes that are visible to users
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants