Support command chaining #2222

Open
fkrauthan opened this issue Nov 24, 2020 · 47 comments
Labels
A-parsing Area: Parser's logic and needs it changed somehow. C-enhancement Category: Raise on the bar on expectations S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing

Comments

@fkrauthan

Describe your use case

I would like to rewrite an existing internal Python tool in Rust. The Python tool uses click with its command chaining feature (https://click.palletsprojects.com/en/7.x/commands/#multi-command-chaining), which, like some bigger CLI tools (e.g. gradle or setup.py), lets you execute multiple commands in one go. In our case we have a simple build and deploy tool, and people often call it either via ./tool build start or via ./tool start logs etc. to trigger different functionality in one go: build builds our system, start starts the code locally, and logs tails the logs.

Describe the solution you'd like

A simple switch that allows me to enable command chaining. I am OK with this preventing me from using nested subcommands, so that I can only use one level of commands. Examples of implementations are:

@epage epage added C-enhancement Category: Raise on the bar on expectations A-parsing Area: Parser's logic and needs it changed somehow. and removed T: new feature labels Dec 8, 2021
@epage epage added the S-waiting-on-design Status: Waiting on user-facing design to be resolved before implementing label Dec 13, 2021
@epage
Member

epage commented Dec 13, 2021

Interesting idea!

It'd help if someone explored the nuances of how click worked so we can build on their shoulders. What limitations does it have? How does it impact various UX features? What could be improved?

I'm assuming any App that has this enabled would disallow sub-subcommands because that can be ambiguous with chaining. We do this with checks in src/build/app/debug_asserts.rs

We'd need to decide on how to expose this in ArgMatches.

  • Switch ArgMatches::subcommand* to returning iterators of active subcommands
  • Despite chained subcommands existing at the same level within the App hierarchy, recursively nest them in ArgMatches as if they were arbitrary depth sub-subcommands
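To make the two options concrete, here is a minimal std-only sketch. The `Matches` struct and `iter_chain` helper are hypothetical stand-ins, not clap's real `ArgMatches` API: the chain is stored in the recursively nested form (second option), and an iterator over the active subcommands (first option) is layered on top of it.

```rust
/// Hypothetical stand-in for clap's `ArgMatches`; not the real type.
#[derive(Debug, Clone, PartialEq)]
struct Matches {
    name: String,
    args: Vec<String>,
    /// Nested form: the next chained command is stored as if it were a
    /// sub-subcommand of this one.
    next: Option<Box<Matches>>,
}

/// Iterator form: walk the nested storage and yield the chained commands
/// in the order they were given.
fn iter_chain(first: &Matches) -> impl Iterator<Item = &Matches> {
    std::iter::successors(Some(first), |m| m.next.as_deref())
}

fn main() {
    // `tool build start --detach`, stored in the nested form.
    let start = Matches {
        name: "start".into(),
        args: vec!["--detach".into()],
        next: None,
    };
    let build = Matches {
        name: "build".into(),
        args: vec![],
        next: Some(Box::new(start)),
    };

    for m in iter_chain(&build) {
        println!("{} {:?}", m.name, m.args);
    }
}
```

Under these assumptions the two shapes carry the same information; the choice is mostly about which one the public API exposes.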

@0xForerunner

Copying over an example from another thread. This is a demonstration of how this could look using derive. Ideally this should be implemented for any type that is IntoIterator<Item = T>.

#[derive(Parser)]
#[clap(rename_all = "snake_case")]
pub enum MyCli {
    GetPrices {
        #[command(subcommand)]
        assets: Vec<AssetInfo>,
    },
}

#[derive(Subcommand)]
#[clap(rename_all = "snake_case")]
pub enum AssetInfo {
    Token {
        /// <string>, "atom3h6lk23h6"
        contract_addr: String
    },
    NativeToken {
        /// <string>, "uatom"
        denom: String
    },
}

This is a much simpler case than I have in reality, but useful for examining how this could work. An example run of the program could look like:

my-cli get_prices token atom3h6lk23h6 native_token uatom 

When these example args are parsed it would result in:

MyCli::GetPrices {
    assets: vec![
        AssetInfo::Token { contract_addr: "atom3h6lk23h6".to_string() },
        AssetInfo::NativeToken { denom: "uatom".to_string() },
    ],
}
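For illustration only, the mapping from that argument list to the `Vec<AssetInfo>` can be sketched without clap. `parse_assets` below is a hypothetical hand-rolled helper, and it assumes every subcommand takes exactly one positional value, which is what makes the chain unambiguous here.

```rust
#[derive(Debug, PartialEq)]
enum AssetInfo {
    Token { contract_addr: String },
    NativeToken { denom: String },
}

/// Hand-rolled stand-in for the proposed derive behaviour: each subcommand
/// name consumes exactly one positional value, then the chain continues.
fn parse_assets(args: &[&str]) -> Result<Vec<AssetInfo>, String> {
    let mut assets = Vec::new();
    let mut it = args.iter();
    while let Some(&name) = it.next() {
        let value = it
            .next()
            .ok_or_else(|| format!("missing value for `{name}`"))?
            .to_string();
        assets.push(match name {
            "token" => AssetInfo::Token { contract_addr: value },
            "native_token" => AssetInfo::NativeToken { denom: value },
            other => return Err(format!("unknown subcommand `{other}`")),
        });
    }
    Ok(assets)
}

fn main() {
    // Everything after `my-cli get_prices` on the example command line.
    let parsed = parse_assets(&["token", "atom3h6lk23h6", "native_token", "uatom"]);
    println!("{parsed:?}");
}
```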

@0xForerunner

0xForerunner commented Oct 26, 2022

(e.g. do we stop chaining when entering a sub-sub-command or do we back out to the parent chained command?)

Why would we need to stop chaining when entering a subcommand?

@epage
Member

epage commented Oct 26, 2022

Allowing both chaining and sub-subcommands has the potential for ambiguity (in practice and from the user's perspective) and implementation complexity.

What will help is someone exploring what precedent there is in other implementations of chaining, exploring the various cases for how to avoid ambiguity, and exploring the use cases for why someone would need each kind of behavior we would implement for this.

@0xForerunner

It should be possible to have some check to prevent ambiguities but yes I see the problem you're referring to. As far as use case goes for me it's as simple as trying to force clap to work as well as possible with any arbitrary enum structure using its derive functionality.

@abey79

abey79 commented Feb 23, 2023

It'd help if someone explored the nuances of how click worked so we can build on their shoulders. What limitations does it have? How does it impact various UX features? What could be improved?

FWIW, here is the relevant paragraph in Click's documentation. The limitations are:

  • no nested commands
  • you can't have arguments with an arbitrary number of values (except for the last command)
  • the order --options argument is enforced (e.g. you can't do mycli cmd1 arg1 --opt1 val cmd2 [...])

@brownjohnf

I found this issue via #4424 and wanted to add my use-case. I'm not sure how this would work implementation-wise, but it'd be very useful:

I'm building a fairly complex CLI that supports a dynamic number of duplicate configs. It's proxying traffic via different protocols, and adding filters on top of them. So for example, I might want to listen on tcp and udp, and parse tcp messages as strings but the udp messages as protobufs, and log the raw bytes first (just random simple examples). I can think of two logical approaches, and would be content if either were supported by clap; I'd just use whichever it supported.

The first is find-style layered/ordered flags:

$ mycli --listen tcp://localhost:1234 --filter str --filter log --listen udp://localhost:2345 --filter log --filter protobuf

The second is more what this issue (and #4424) is discussing, if I understand correctly:

$ mycli listen --uri tcp://localhost:1234 --filter str --filter log listen --uri udp://localhost:2345 --filter log --filter protobuf

I personally find the first pattern more familiar because of prior exposure to things like the find command, but the second is probably more intuitive, and has the advantage of supporting more robust built-in argument validation/parsing using clap derive attributes (it would be easier to express things like combinatoric flag constraints per subcommand). An in-the-wild example of ordered find arguments is running gofmt on all *.go files in the current directory recursively, while pruning out any files in ./vendor:

gofmt -l $(find . -path ./vendor -prune -o -name '*.go' -print)

I've considered solving this short-term by requiring that you pass a number of --filter arguments equal to --uri arguments, but this starts breaking down with variable number of filters, or other arguments that should only apply to one of the listeners. It also requires doing a lot of custom parsing if you want to group multiple arguments (e.g. make --filter a Vec<Vec<String>> and accept multiple --filter flags, each with comma-delimited strings that get parsed into the nested Vec).
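A rough sketch of that workaround, using only the standard library. `parse_filter_groups` is a hypothetical helper, and pairing the i-th group with the i-th --uri purely by position is an assumption of the sketch, which is exactly the fragile part.

```rust
/// Collect every `--filter a,b` occurrence into one inner Vec per flag.
/// The caller is expected to pair group `i` with the `i`-th `--uri` by
/// position, so nothing ties a filter list to its listener structurally.
fn parse_filter_groups(args: &[&str]) -> Vec<Vec<String>> {
    let mut groups = Vec::new();
    let mut it = args.iter();
    while let Some(&arg) = it.next() {
        if arg == "--filter" {
            if let Some(list) = it.next() {
                groups.push(list.split(',').map(str::to_string).collect());
            }
        }
    }
    groups
}

fn main() {
    let groups = parse_filter_groups(&[
        "--uri", "tcp://localhost:1234", "--filter", "str,log",
        "--uri", "udp://localhost:2345", "--filter", "log,protobuf",
    ]);
    println!("{groups:?}");
}
```

If one listener has no filters at all, the positions shift and the pairing silently goes wrong, which is the breakdown described above.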

If someone knows a way to achieve this already in clap, I'd love to hear it, and otherwise would love to see support get added. Clap has been such a fabulous and pleasant asset when working with rust CLIs.

@epage

This comment was marked as off-topic.

@brownjohnf

This comment was marked as off-topic.

@abey79

This comment was marked as off-topic.

@brownjohnf

brownjohnf commented Mar 28, 2023 via email

@epage
Member

epage commented Mar 29, 2023

For this to move forward, we need

  • A summary of the expected behavior
    • Ideally also with a comparison to prior art
  • A proposal that includes
    • How to enable it
    • How it should show up in help
    • How to update [ArgMatches] to access the chained commands
    • How to expose this in clap_derive

@abey79

abey79 commented Mar 29, 2023

OK, here is my take.

Coming from Click, this paragraph describes pretty well what I would expect. For background, I created and maintain a Click-based CLI pipeline called vpype. I think it's an excellent example of how a chained multi-command pipeline CLI should behave (but I'm slightly biased).

Note that I make this proposal with a very minimal knowledge of Clap's internals and a rather thin experience of actually using the library. So it may very well make little sense. In exchange, you get a fresh/candid feedback on one person's expectations for a CLI framework "with all of the bells and whistles". ;)

Here is how I'd transform the builder tutorial to use command chaining (I marked my changes with //NEW):

use std::path::PathBuf;

use clap::{arg, command, value_parser, ArgAction, Command};

fn main() {
    let matches = command!() // requires `cargo` feature
        .chain(true) //NEW: enable chaining of sub commands
        .arg(arg!([name] "Optional name to operate on"))
        .arg(
            arg!(
                -c --config <FILE> "Sets a custom config file"
            )
                // We don't have syntax yet for optional options, so manually calling `required`
                .required(false)
                .value_parser(value_parser!(PathBuf)),
        )
        .arg(arg!(
            -d --debug ... "Turn debugging information on"
        ))
        .subcommand(
            Command::new("test")
                .about("does testing things")
                .arg(arg!(-l --list "lists test values").action(ArgAction::SetTrue)),
        )
        .subcommand( //NEW: let's add another command for example purposes
            Command::new("config")
                .about("configures next test(s)")
                .arg(arg!(-s --set <PARAM> <VALUE> "sets a config value")),
        )
        .get_matches();

    // You can check the value provided by positional arguments, or option arguments
    if let Some(name) = matches.get_one::<String>("name") {
        println!("Value for name: {}", name);
    }

    if let Some(config_path) = matches.get_one::<PathBuf>("config") {
        println!("Value for config: {}", config_path.display());
    }

    // You can see how many times a particular flag or argument occurred
    // Note, only flags can have multiple occurrences
    match matches
        .get_one::<u8>("debug")
        .expect("Count's are defaulted")
    {
        0 => println!("Debug mode is off"),
        1 => println!("Debug mode is kind of on"),
        2 => println!("Debug mode is on"),
        _ => println!("Don't be crazy"),
    }

    //NEW: iterate over sub-commands
    for subcmd in matches.subcommand_iter() {
        match subcmd.name() {
            "test" => {
                if subcmd.get_flag("list") {
                    println!("Printing testing lists...");
                } else {
                    println!("Not printing testing lists...");
                }
            }
            "config" => {
                println!("Configuring next test(s)...");
            }
            _ => unreachable!(),
        }
    }

    // Continued program logic goes here...
}

This would yield the following help string (directly inspired from Click):

$ 01_quick --help
A simple to use, efficient, and full-featured Command Line Argument Parser

Usage: 01_quick[EXE] [OPTIONS] [name] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

Commands:
  test          does testing things
  config        configures next test(s)
  help          Print this message or the help of the given subcommand(s)

Arguments:
  [name]  Optional name to operate on

Options:
  -c, --config <FILE>  Sets a custom config file
  -d, --debug...       Turn debugging information on
  -h, --help           Print help
  -V, --version        Print version

And here is how I'd adapt the derive tutorial:

use std::path::PathBuf;

use clap::{Parser, Subcommand};

#[derive(Parser)]
#[command(author, version, about, long_about = None, chain = true)] //NEW
struct Cli {
    /// Optional name to operate on
    name: Option<String>,

    /// Sets a custom config file
    #[arg(short, long, value_name = "FILE")]
    config: Option<PathBuf>,

    /// Turn debugging information on
    #[arg(short, long, action = clap::ArgAction::Count)]
    debug: u8,

    #[command(subcommand)]
    command: Option<Commands>,
}

#[derive(Subcommand)]
enum Commands {
    /// does testing things
    Test {
        /// lists test values
        #[arg(short, long)]
        list: bool,
    },

    /// configures next test(s)
    Config { //NEW
        /// sets a config value
        #[arg(short, long)]
        set: Option<(String, String)>,
    },
}

fn main() {
    let cli = Cli::parse();

    // You can check the value provided by positional arguments, or option arguments
    if let Some(name) = cli.name.as_deref() {
        println!("Value for name: {name}");
    }

    if let Some(config_path) = cli.config.as_deref() {
        println!("Value for config: {}", config_path.display());
    }

    // You can see how many times a particular flag or argument occurred
    // Note, only flags can have multiple occurrences
    match cli.debug {
        0 => println!("Debug mode is off"),
        1 => println!("Debug mode is kind of on"),
        2 => println!("Debug mode is on"),
        _ => println!("Don't be crazy"),
    }

    //NEW: iterate over sub-commands
    for subcmd in cli.command_iter() {
        match subcmd {
            Commands::Test { list } => {
                if *list {
                    println!("Printing testing lists...");
                } else {
                    println!("Not printing testing lists...");
                }
            }
            Commands::Config { set } => {
                if let Some((param, value)) = set {
                    println!("Setting config value for {param} to {value}");
                }
            }
        }
    }

    // Continued program logic goes here...
}

In most cases, some dispatch mechanism (Command -> actual code that implements it) would be needed. This could be done statically with little boilerplate using a few macros (like I did in the project mentioned in my previous comment*), or dynamically with some code, for example if the CLI must support some kind of plug-in extensions. I'm not sure if even a minimalist implementation of either is in the scope of Clap though—it could certainly be factored in a separate crate/project.

[(*) BTW, I'm slightly surprised it was marked as off-topic, as it provides a (somewhat) working of command (well... argument) chaining, which is a work-around for the present issue. Maybe I wasn't sufficiently clear about this in that comment?]

@epage
Member

epage commented Mar 29, 2023

[(*) BTW, I'm slightly surprised it was marked as off-topic, as it provides a (somewhat) working of command (well... argument) chaining, which is a work-around for the present issue. Maybe I wasn't sufficiently clear about this in that comment?]

Oh, that whole conversation was useful but since it ran its course, I was wanting to make it easier to find the relevant parts of the conversation for command chaining and to not get too off topic.

I also feel like "argument chaining" is only a workaround in some limited cases, and the comment I did leave expanded will likely lead people to expand the follow-ups.

@epage
Member

epage commented Mar 29, 2023

command: Option<Commands>,
...
for subcmd in cli.command_iter() {

I assume that should be a Vec, rather than an Option and that it would be iterated with &cli.commands

@abey79

abey79 commented Mar 29, 2023

command: Option<Commands>,
...
for subcmd in cli.command_iter() {

I assume that should be a Vec, rather than an Option and that it would be iterated with &cli.commands

Actually, since Commands was defined as an enum, I extended the enum instead. It could certainly be a Vec of (single?) Command as well—I'm not sure of the implications.

On the other hand, cli.command_iter() would iterate over the commands that were actually passed on the CLI. For example, with 01_quick test config -s param 10 test, cli.command_iter() would yield 3 command instances, two of which would be the Commands::Test variant.

@epage
Member

epage commented Mar 29, 2023

Coming from Click, this paragraph describes pretty well what I would expect. For background, I created and maintain a Click-based CLI pipeline called vpype. I think it's an excellent example of how a chained multi-command pipeline CLI should behave (but I'm slightly biased).

Could we be more explicit, rather than linking out to other documentation?

If nothing else, it keeps the conversation in one place so we can make sure we are satisfying all of the requirements.

For example

When using multi command chaining you can only have one command (the last) use nargs=-1 on an argument

I assume this means that these commands effectively end the chain.

  • Is there a reason to not do this for subcommands as well?
  • Could we loosen this restriction since clap can handle variable length arguments and subcommands?

It is also not possible to nest multi commands below chained multicommands.

We enforce invariants as debug_asserts, and we need to call out in the design section that we will have a debug assert for this case.

Could we make this like the nargs=-1 case and say that nested subcommands terminate the chain? I haven't thought through what this does to the parser to either recurse or backtrack

@epage
Member

epage commented Mar 29, 2023

Actually, since Commands was defined as an enum, I extended the enum instead. It could certainly be a Vec of (single?) Command as well—I'm not sure of the implications.

Additional variants are just additional subcommands at the same level. We still need a way to store (and access) multiple instances of the enum

@abey79

abey79 commented Mar 29, 2023

Additional variants are just additional subcommands at the same level. We still need a way to store (and access) multiple instances of the enum

If I understand you correctly (which I'm not entirely sure of), you mean that the enum in the example represents "level 1" sub-commands, which are by definition exclusive in the current (chaining-less) model, so adding variants and expecting that multiple instances exist is in contradiction with the current implementation.

I guess it's both a technical question (on which I can't comment much) and an API question. On the API side, I think I would be happier with Vec<Command> rather than a single Commands enum, because the dispatch mechanism might be easier to tack onto the Commands with some custom trait, but I'm pretty sure we can make do with either approach.

What's for sure is that enforcing a single level of subcommands when chain=true is fair. Click has this restriction and command chaining with greater-than-1-level command hierarchies would be rather confusing from a UX point of view IMHO.

Edit: I'll try to answer your other message later tonight.

@epage
Member

epage commented Mar 29, 2023

If I understand you correctly (which I'm not entirely sure of), you mean that the enum in the example represents "level 1" sub-commands, which are by definition exclusive in the current (chaining-less) model, so adding variants and expecting that multiple instances exist is in contradiction with the current implementation.

Not quite.

Take

prog test

and that would map to

Cli {
    name: None,
    config: None,
    debug: 0,
    command: Some(Commands::Test {
        list: false,
    })
}

With that said, what would the following map to?

prog test test test config
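One possible answer, sketched without clap: a field of type Option<Commands> can hold at most one of those four commands, while a Vec<Commands> can hold all of them in order. The `parse_chain` helper and the simplified `Commands` enum below are hypothetical and only illustrate the storage question.

```rust
#[derive(Debug, PartialEq)]
enum Commands {
    Test { list: bool },
    /// Simplified for the sketch: the `--set` option is left out.
    Config,
}

/// Hypothetical hand-rolled parser: every command name starts a new entry,
/// and `--list` applies to the most recent `test`.
fn parse_chain(args: &[&str]) -> Result<Vec<Commands>, String> {
    let mut commands: Vec<Commands> = Vec::new();
    for &arg in args {
        match arg {
            "test" => commands.push(Commands::Test { list: false }),
            "config" => commands.push(Commands::Config),
            "-l" | "--list" => match commands.last_mut() {
                Some(Commands::Test { list }) => *list = true,
                _ => return Err(format!("`{arg}` is only valid after `test`")),
            },
            other => return Err(format!("unexpected argument `{other}`")),
        }
    }
    Ok(commands)
}

fn main() {
    // `prog test test test config` keeps all four entries, in order.
    println!("{:?}", parse_chain(&["test", "test", "test", "config"]));
}
```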

@epage
Member

epage commented Mar 30, 2023

I was looking at the parser which reminded me of some of the complexities involved, including

  • Should prog chain0 --help list all the chained commands like prog --help?
  • Should both levels of usage show all the chained commands?
  • The parser for a chained command will need access to the parent command (for command flags, knowing whether an arg is a command or value) which will run into multiple borrows of a mutable value (prog owns chain0, chain0 will need to be mutable for the lazy building, prog and chain0 will both need to be accessible when parsing)

I'm starting to suspect that the cheapest / easiest way to implement this will be for the parent command to copy its subcommands to the command being parsed.

  • It would be easy to support subcommands under chained subcommands because we just check to see if subcommands are present and, if not, we then copy

This would cause the ArgMatches to be recursive and the derive would have to flatten it out.

  • Maybe we could have the chained command, in the parser, flatten it out?
    • Would need to limit this to the root of the chain or else this would be a lot of moves, allocations, deallocations
    • Or we could make the caller responsible for deciding what to do with the subcommand's ArgMatcher and pass a &mut Vec<ArgMatches> down. In the no-subcommand case, nothing should be allocated. In the subcommand case, we just have a single-element Vec which we would need to switch to anyways. In the chained-subcommand case, each layer keeps appending until we get to the root. Most likely this will lead to an inverted list and we'd need to reverse the list.
  • In general, I'm leaning towards the recursive approach. It's the simplest, has no overhead for unchained subcommands (the more common case), and the builder API doesn't need to be as ergonomic as that is what the derive API is for
    • The main complication is the derive API knowing when to do this (it'll have to capture the state of any chaining) and knowing when to stop if we support unchained subcommands within a chained subcommand.
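The "pass a &mut Vec down and reverse at the end" variant from the list above can be sketched as follows. This is not clap's parser: `Matches`, `COMMANDS`, `parse_chained` and `parse` are hypothetical, and the sketch assumes each layer pushes its own matches only after the layer below it has returned, which is what produces the inverted list.

```rust
/// Hypothetical, simplified stand-in for one command's `ArgMatches`.
#[derive(Debug, PartialEq)]
struct Matches {
    name: String,
    args: Vec<String>,
}

/// Names of the chained commands known to the parent.
const COMMANDS: [&str; 2] = ["test", "config"];

/// Parse the command at the head of `tokens`, recurse into the rest of the
/// chain, and only then push this command's matches into `out`.
fn parse_chained(tokens: &[&str], out: &mut Vec<Matches>) {
    let Some((name, rest)) = tokens.split_first() else {
        return;
    };
    // This command owns every token up to the next known command name.
    let end = rest
        .iter()
        .position(|t| COMMANDS.contains(t))
        .unwrap_or(rest.len());
    let matches = Matches {
        name: name.to_string(),
        args: rest[..end].iter().map(|s| s.to_string()).collect(),
    };
    parse_chained(&rest[end..], out); // deeper layers append first
    out.push(matches);
}

fn parse(tokens: &[&str]) -> Vec<Matches> {
    let mut out = Vec::new();
    parse_chained(tokens, &mut out);
    out.reverse(); // the list was built innermost-first
    out
}

fn main() {
    for m in parse(&["test", "-l", "config", "-s", "k", "v", "test"]) {
        println!("{} {:?}", m.name, m.args);
    }
}
```

Nothing is allocated for an empty token list, and a single command ends up as a one-element Vec, matching the cases described in the list.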

Except even that comes with its own challenges

  • We'll also need to copy a hand-maintained list of Command settings that are relevant to subcommands
    • This will become more complex when we add support for plugins because then we'll need to know if a plugin should be copied into the subcommand, or even a subset of a plugin
  • Subcommands under chained subcommands would get complicated if we move forward with deferred initialization of Command because we can't tell if subcommands are present until we build it, but we shouldn't build it until after we've copied
  • This doesn't work for any call that wants to just do a prog.build() as that shouldn't recurse infinitely
    • Only doing it one level will work for documentation purposes (clap_mangen)
    • completions will need to do special processing in general and would likely get confused seeing the commands propagated one level
      • For dynamic completions, we could offer a Command::build_subcommand(self, name: &str) -> Result<Command>

@abey79

abey79 commented Mar 30, 2023

  • Should prog chain0 --help list all the chained commands like prog --help?
  • Should both levels of usage show all the chained commands?

The way Click does it is cli --help shows the global options and lists the available commands, and cli cmd --help shows the usage of this specific command. It does not support out-of-the-box the cli help and cli help cmd semantics, and it would probably require some knowledge of the internals to emulate this behaviour.

Real world example (abbreviated):

$ vpype --help
Usage: vpype [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...

  Execute the sequence of commands passed in argument.

Options:
  --version           Show the version and exit.
  -h, --help          Show this message and exit.
  -v, --verbose
  -c, --config PATH   Load an additional config file.

Commands:

  Metadata:
    alpha         Set the opacity of one or more layers.
    color         Set the color for one or more layers.
    name          Set the name for one or more layers.
    pens          Apply a pen configuration.
    penwidth      Set the pen width for one or more layers.

  Primitives:
    arc           Generate lines approximating a circular arc.
    circle        Generate lines approximating a circle.
    ellipse       Generate lines approximating an ellipse.
    line          Generate a single line.
    rect          Generate a rectangle, with optional rounded angles.

  Input:
    read          Extract geometries from an SVG file.
    script        Call an external python script to generate geometries.

  Transforms:
    rotate        Rotate the geometries (clockwise positive).
    scale         Scale the geometries by a factor.
    scaleto       Scale the geometries to given dimensions.
    skew          Skew the geometries.
    translate     Translate the geometries.

$ vpype scale --help
Usage: vpype scale [OPTIONS] SCALE...

  Scale the geometries by the provided factors

Options:
  -l, --layer LID         Target layer(s).
  -o, --origin LENGTH...  Use a specific origin.
  --help                  Show this message and exit.

Some additional spec/remarks come to mind.

The --help is a short circuit argument with special handling by Click, and this behaviour is actually very useful when building a complex pipeline. This sort of scenario where the CLI building is interrupted to check the help and easily resumed by using the shell's "repeat prev command" behaviour is common in my experience:

  • vpype read input.svg layout -m 1cm -
  • What's that arg again?
  • vpype read input.svg layout -m 1cm --help
  • (reads help of layout command) Ah, yes!
  • vpype read input.svg layout -m 1cm -v middle a4 write output.svg

For this to work, nothing must happen besides showing help when --help is present, regardless of the existence of other, fully-formed sub-commands.
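A small sketch of that short-circuit rule, assuming a fixed set of command names. `COMMANDS`, `Outcome` and `plan` are hypothetical names, not part of Click or clap. The whole argument list is scanned before anything runs, and the first --help wins.

```rust
/// Command names assumed for the sketch.
const COMMANDS: [&str; 3] = ["read", "layout", "write"];

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Show help for this command (top-level help when `None`); run nothing.
    Help(Option<String>),
    /// No help flag present: run these commands in order.
    Run(Vec<String>),
}

/// Decide what to do before executing any command in the chain.
fn plan(args: &[&str]) -> Outcome {
    let mut current: Option<String> = None;
    let mut chain = Vec::new();
    for &arg in args {
        if arg == "--help" {
            // Earlier, fully-formed commands are discarded on purpose.
            return Outcome::Help(current);
        }
        if COMMANDS.contains(&arg) {
            current = Some(arg.to_string());
            chain.push(arg.to_string());
        }
    }
    Outcome::Run(chain)
}

fn main() {
    println!("{:?}", plan(&["read", "input.svg", "layout", "-m", "1cm", "--help"]));
    println!("{:?}", plan(&["read", "input.svg", "layout", "-m", "1cm", "write", "output.svg"]));
}
```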

Another aspect is sub-command grouping in the top-level help, as shown above. All of these commands are "sister" sub-commands of the top-level CLI, with no nesting or anything. Yet they appear thematically grouped. Click doesn't support it out-of-the-box and a Stack Overflow recipe is needed. It's simple enough, but it would be nice to have built-in support.

Finally, in my previous message I mentioned variable length arguments. To clarify, it would be really nice to have both variable length options and positional arguments. The scale command I used here for illustration is one such example: I'd like it to support both scale 0.5 and scale 0.5 1.0 for homogeneous and non-homogeneous 2D scaling. This is currently not possible with Click.

@brownjohnf

brownjohnf commented Mar 30, 2023 via email

@brownjohnf

brownjohnf commented Mar 30, 2023 via email

@epage
Member

epage commented Mar 30, 2023

For this to work, nothing must happen besides showing help when --help is present, regardless of the existence of other, fully-formed sub-commands.

Clap's help behaves that way too.

Another aspect is sub-command grouping in the top-level help, as shown above. All of these commands are "sister" sub-commands of the top-level CLI, with no nesting or anything. Yet they appear thematically grouped. Click doesn't support it out-of-the-box and a Stack Overflow recipe is needed. It's simple enough, but it would be nice to have built-in support.

See

@abey79

abey79 commented Mar 30, 2023

@brownjohnf I agree with your (hot) take and we may certainly lift that "no nesting of chained sub-command" rule. I've included it here because of my understanding that it simplified implementation on Click's side, but I understand things might be entirely different on Clap's side.

@epage
Member

epage commented Mar 30, 2023

I'd loosen this up a bit. A sub-command should eagerly parse all
arguments until failure, at which the parser should pop up the stack and
resume parsing the parent (sub)command until failure, etc.

In thinking on chained subcommands, I realized that one challenge is that clap is smarter than that allows. At least for optional positionals and variable length positionals, clap checks whether a subcommand name exists among them and will then treat it as a subcommand. At least for variable-length options, clap has a setting to check for subcommand names among them. This requires subcommands to know every subsubcommand name and alias, without getting into subcommand short flags and long flags.
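To illustrate why a chained command would need to know its sibling names, here is a std-only sketch of the check described above. `take_values` is a hypothetical helper, not clap's implementation: a variable-length positional keeps consuming tokens until one of them matches a known subcommand name or alias.

```rust
/// Collect values for a variable-length positional, stopping at the first
/// token that matches a known subcommand name or alias. Returns the values
/// and the remaining tokens (starting at the subcommand, if any).
fn take_values<'a>(
    tokens: &'a [&'a str],
    subcommands: &[&str],
) -> (&'a [&'a str], &'a [&'a str]) {
    let end = tokens
        .iter()
        .position(|t| subcommands.contains(t))
        .unwrap_or(tokens.len());
    tokens.split_at(end)
}

fn main() {
    // Tokens following a hypothetical chained `scale` command.
    let tokens = ["0.5", "1.0", "rotate", "45"];

    // With the sibling names known, `scale` stops before `rotate`.
    let (values, rest) = take_values(&tokens, &["scale", "rotate"]);
    println!("values = {values:?}, rest = {rest:?}");

    // Without them, `rotate` and `45` are swallowed as values.
    let (values, rest) = take_values(&tokens, &[]);
    println!("values = {values:?}, rest = {rest:?}");
}
```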

This presents an implementation challenge because

  • clap's behavior should be consistent
  • even if we accept chained commands being less than ideal now, it could be disruptive to change its behavior in the future

So we either need a solution that can have consistency out of the gate or we need to be very restrictive and disallow any case that requires knowing about chained commands so we can just pop back up and continue processing. I'm assuming that would be too restrictive for this to be feasible: we'd be giving users a taste of something that won't work in a lot of their cases and it'd be better to not have the feature in that case.

@brownjohnf

brownjohnf commented Mar 30, 2023 via email

@epage
Member

epage commented Mar 30, 2023

Since we have no multi-subcommand support today, what about having a
config option to enable it that disables the "smart" detection of
subcommands after variable length args?

This isn't an acceptable option

  • Being a runtime setting means we'd be bloating the code and API for everyone for this less common feature
  • We try to limit what gets a feature flag because those don't have an ideal workflow for end users
  • As mentioned in my previous post, consistency is important. You shouldn't be required to lose out on clap features to be able to use other features.

@brownjohnf

brownjohnf commented Mar 30, 2023 via email

@epage
Member

epage commented Mar 31, 2023

Something I realized is command chaining would get us a form of "argument chaining" for free because users could specify the start of an argument chain using Command::long_flag, making the chained subcommand act like an argument. To clean up --help, they can do something like cmd.subcommand_value_name("OP").subcommand_help_heading("Operation")

@muja

muja commented May 6, 2023

I also have this use case. I am writing a tool for mass editing docx files. docx files are zip files with lots of XMLs and once unpacked and parsed, it is very desirable to perform multiple operations on the contents instead of spreading it out into multiple processes which would waste a lot of computation power.

The CLI should look something like this: ./docx-edit ~/Documents/WordFiles --output=/tmp/wordout replace --replace-pairs-file=ReplacePairs.xlsx append --paragraph-break append --text="© MyCompany 2015-2023"
Here the subcommands are replace and append.

I would like to be able to use a Vec<Subcommand> in my Parser struct for ease of use. I also don't need nested subcommands for my use case.

@frol

frol commented May 6, 2023

I would like to also share my use case, where I need subcommand chaining with nested subcommands. I need to construct a complex structure from the command line, here is an extract:

struct Transaction {
    signer_account_id: String,
    actions: Vec<Action>,
}

enum Action {
    GrantAccess {
        account_id: String,
        permissions: Permissions,
    },
    Transfer {
        receiver_account_id: String,
        amount: u64,
    },
}

enum Permissions {
    FullAccess,
    LimitedAccess { allowed_actions: Vec<AllowedActionKind> },
}

enum AllowedActionKind {
    GrantAccess,
    Transfer,
}

Here is what I want the CLI to look like:

./cli construct-transaction // <- Top-level Commands
        "frol@github" // <- `signer_account_id: String`
        add-action // <- Either `add-action` or some "next" command, see `submit` below
            grant-access // <- Either `grant-access` or `transfer`
                "not-frol@github" // <- `account_id: String`
                limited-access // <- Either `full-access` without parameters or `limited-access` with allowed actions
                    transfer
                    grant-access
        add-action
            transfer // <- Either `grant-access` or `transfer`
                "not-frol@github" // <- `receiver_account_id: String`
                "123" // <- `amount: u64`
        add-action
            transfer // <- Either `grant-access` or `transfer`
                "not-frol2@github" // <- `receiver_account_id: String`
                "124" // <- `amount: u64`
        submit // <- Stop adding actions. Next subcommands should also be able to receive arguments and use nested subcommands
            https://github.com // <- Positional argument to `submit`

You may argue that nobody will write such a long command, and I totally agree, but in my case near-cli-rs has interactive mode which extends clap (there is interactive-clap helper), and guides users through the available options with prompts, and then prints the final command to be reused later.

Currently, I copy-paste the code (see add_action_* folders), but after 5 levels of nesting with 6 "actions" on each level there are too many variants ($6^5=7776$) and clap tries to walk through all of them, so it is not only ugly but also not scalable beyond a certain point.

The pattern I use there is basically:

struct ConstructTransaction {
    #[clap(subcommand)]
    next: AddActionOrSubmit_1,
}

And then there are copies of:

enum AddActionOrSubmit_1 {
    AddAction(AddAction),
    Submit(super::Submit),
}

struct AddAction {
    #[clap(subcommand)]
    action: Action,
}

enum Action {
    GrantPermissions(GrantPermissionsAction),
    Transfer(TransferAction),
}

...

struct TransferAction {
    receiver_account_id: String,
    amount: u64,
    #[clap(subcommand)]
    next: AddActionOrSubmit_2, // <- This is where "recursion" happens unless I manually unroll the code, and this is where it explodes when I add 5+ levels or unrolled copies
}

@epage
Member

epage commented May 6, 2023

clap tries to walk through all of them, so it is not only ugly but also not scalable beyond a certain point.

Unsure which aspect doesn't scale for you but #4792 will help clap scale with a lot of subcommands with a lot of arguments. For v4 it might have to be opt-in for derive users but then on by default for v5.

@alexpovel

While searching for this issue/feature, the first thing I actually found was this thread, where a user has an issue identical to mine (and everyone else's in this thread...):

Does anyone know how to achieve the effect of parsing subcommands multiple times?

I'd like to have a data processing tool where multiple operations can be performed on a stream of data/events/... and it should be possible to specify the steps via command line arguments like so:

magical_command capture --source=/dev/abcd filter --kind=large-ones normalise convert --to=jsonl

I reckon that's more modest in comparison to other use cases in this thread. It'd be plenty for my uses anyway. Using a workaround, they solved it as:

I resorted to collecting std::env::args() into a Vec<String>, iterated over .split("--"), provided the zeroth argument (program name) at index 0 for each split and fed those to MagicalArgs::from_iter. It is bad in the sense that it absolutely requires the presence of -- between subcommands.

In hindsight, it is a rather obvious workaround that would have saved me half a day of frustration. Retains all the nice stuff of structopt and scales until the limits of Linux command lines are reached. If only clap/structopt could parse partially and yield the remainder instead of failing with a hardcoded unknown argument error...

Could someone perhaps shed some light on this workaround: is it a viable one, while the present issue is in progress? Or is there a better approach? The find and git examples come close, but not quite.

@muja

muja commented Sep 14, 2023

I've actually independently resorted to the same workaround, the only difference being that I've used the word and as separator instead of (the apparently more common) --. I think it reads more nicely, but of course it has the downside that it implicitly rules out and as a file name argument etc. Otherwise there are no downsides, except that it's ~5-10 lines of setup in main.

So the command would look like this:

magical_command capture --source=/dev/abcd and filter --kind=large-ones and normalise and convert --to=jsonl
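
The separator-based workaround described in the last two comments can be sketched with the standard library alone. This is only an illustration under assumptions: the function name split_chain is hypothetical, and the hand-off to an actual parser (for example a derive type's parse_from) is left as a comment rather than shown.

```rust
/// Split a full argument list (program name first) into one argument list per
/// chained command, cutting at every occurrence of `separator`.
/// Each returned list starts with the program name again, so it can be handed
/// to an ordinary single-command parser.
fn split_chain(args: &[String], separator: &str) -> Vec<Vec<String>> {
    let (program, rest) = match args.split_first() {
        Some(parts) => parts,
        None => return Vec::new(),
    };
    rest.split(|arg| arg.as_str() == separator)
        // Ignore empty segments, e.g. from a doubled or trailing separator.
        .filter(|segment| !segment.is_empty())
        .map(|segment| {
            let mut argv = Vec::with_capacity(segment.len() + 1);
            argv.push(program.clone());
            argv.extend_from_slice(segment);
            argv
        })
        .collect()
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    for argv in split_chain(&args, "and") {
        // A real tool would pass `argv` to its parser at this point.
        println!("{argv:?}");
    }
}
```

As noted above, the separator word can then no longer be used as a plain argument value, whichever word is chosen.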

@epage
Member

epage commented Nov 11, 2023

Trying to re-summarize click

API

On the parent, set chain = True. The built-in dispatcher will automatically call the appropriate functions

@click.group(chain=True)
def cli():
    pass


@cli.command('sdist')
def sdist():
    click.echo('sdist called')


@cli.command('bdist_wheel')
def bdist_wheel():
    click.echo('bdist_wheel called')

How it looks in help

Parent command: TBD

Chained command: TBD

Behavior

  • Cannot chain another subcommand after using one with variable number of arguments
  • Cannot nest subcommands under chained subcommands
  • Options must come before arguments

@epage
Member

epage commented Nov 11, 2023

Trying to re-summarize and polish the proposal

API

Setting it on the parent command is most likely the natural thing to do.

Parent commands will need to know it for the help. When parsing child commands, we'll need to know it's set on the parent. This can be handled in our parsing bookkeeping.

Most likely this would end up looking like:

impl Command {
    pub fn chain_subcommands(self, yes: bool) -> Self;
}

Likely the easiest way to implement this in ArgMatches will be to recurse but that will be harder for users, especially clap_derive users.

Most likely this would end up looking like:

impl ArgMatches {
    pub fn subcommands(&self) -> impl Iterator<Item=(&str, &ArgMatches)>;
    pub fn remove_subcommands(&mut self) -> impl Iterator<Item=(Str, ArgMatches)>;
    pub fn subcommands_present(&self) -> bool;
}

We'd likely keep the singular forms for ease of working with like we do for arguments.

We could probably deprecate ArgMatches::subcommand_name and ArgMatches::subcommand_matches

For derive users, I would expect this to translate to:

#[derive(Parser)]
struct Cli {
    #[command(subcommand)]
    #[command(subcommand_required = true, arg_required_else_help = true)]
    subcommand: Vec<Command>,
}

enum Command {
    New,
    Build,
    Clean,
}

We'll consider the subcommand optional (zero-element Vec) and will require users to manually require it. This will also match how we deal with positionals.

We can key off of the use of Vec and infer chain_subcommands(true).

How it looks in help

Parent command:

No different aside from the usage.

Usage: code[EXE] [OPTIONS] <COMMAND> [OPTIONS] [<COMMAND> [OPTIONS]]...

I'm thinking that if flatten_help(true) is set (see #5206), then we'll show the flattened usage, like normal, with [<COMMAND> ...] at the end of each line. This would require knowing when they are chained. We could pass that through as a flag but it would be helpful for chained command help.

Chained command:

Either

  • Only usage is affected
    • We'd need a way to know we are chained
  • We show the chained subcommands as if they were our own subcommand
    • This requires being aware of a lot more state

Behavior

Conceptually, the easiest is: when evaluating a potential subcommand, chain up.

  • Anywhere a subcommand will work, a chained subcommand will work
    • Preserving existing use of names, long flags, and short flags
    • Preserving the existing precedence between arguments and subcommands
  • We support arbitrary backtracing

In most cases, nested subcommands under a chained subcommand could be confusing and we'd likely want to steer people away from doing that. However, using short/long flags for chained subcommands to create argument groups within chained subcommands would be a legitimate use case. That said, this isn't the highest priority: if it isn't in the initial release but can be added later, we won't block this feature on it.

Guarantees

  • Repeating a subcommand is allowed
  • Order of invocation is maintained
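
To make the "chain up" behavior and the two guarantees concrete, here is a minimal standard-library sketch. It is not clap's implementation and ignores most of what a real parser does: the names Invocation and group_chain and the flags --verbose and --release are hypothetical, and any token equal to a known subcommand name is assumed to start a new invocation (so an option value that happens to match a subcommand name would be misread).

```rust
/// One chained invocation: the subcommand name and the raw arguments after it.
#[derive(Debug, PartialEq)]
struct Invocation {
    name: String,
    args: Vec<String>,
}

/// Group `tokens` (without the program name) into chained invocations.
/// A token equal to one of `subcommands` starts a new invocation; any other
/// token is attached to the invocation currently being built. Tokens seen
/// before the first subcommand are returned separately as parent arguments.
fn group_chain(tokens: &[&str], subcommands: &[&str]) -> (Vec<String>, Vec<Invocation>) {
    let mut parent_args = Vec::new();
    let mut chain: Vec<Invocation> = Vec::new();
    for &token in tokens {
        if subcommands.contains(&token) {
            // Repeats are allowed: every match simply appends a new entry.
            chain.push(Invocation { name: token.to_string(), args: Vec::new() });
        } else if let Some(current) = chain.last_mut() {
            current.args.push(token.to_string());
        } else {
            parent_args.push(token.to_string());
        }
    }
    (parent_args, chain)
}

fn main() {
    let tokens = ["--verbose", "build", "--release", "clean", "build"];
    let (parent_args, chain) = group_chain(&tokens, &["new", "build", "clean"]);
    println!("parent: {parent_args:?}");
    // Invocations come back in command-line order.
    for invocation in &chain {
        println!("{} {:?}", invocation.name, invocation.args);
    }
}
```

Because the result is a Vec built in a single pass, order of invocation is preserved and the repeated build shows up as a second entry, which is the shape the proposed Vec<Command> field would expose to derive users.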

@vallentin

Many years later, I'm coming back to this issue. After finally wanting to convert all my hacky old structopt code into clap. However, chaining adjacent commands was always the feature I needed, so clap still has to wait.

For the CLIs I'm working on, chaining commands is the thing I need. I fully understand that there are many edge cases, as mentioned in this thread. However, the CLIs I'm working on don't run into them. So now that I still can't use clap, I've looked through other major command-line parsers.

The ones I tested that didn't support multiple adjacent commands are clap, argh, and gumdrop.

However, bpaf does support multiple adjacent commands! A minimal example using bpaf looks like this:

use bpaf::Bpaf;

#[derive(Bpaf, Clone, Debug)]
#[bpaf(options)]
pub struct Options {
    #[bpaf(external, many)]
    cmd: Vec<Cmd>,
}

#[derive(Bpaf, Clone, Debug)]
pub enum Cmd {
    #[bpaf(command, adjacent)]
    Foo { a: bool },
    #[bpaf(command, adjacent)]
    Bar { b: bool },
    #[bpaf(command, adjacent)]
    Baz { c: bool },
}

fn main() {
    let opt = options().run();
    println!("{:#?}", opt);
}

Which then when executing:

[program] foo -a bar bar -b baz -c baz

Outputs the following:

Options {
    cmd: [
        Foo {
            a: true,
        },
        Bar {
            b: false,
        },
        Bar {
            b: true,
        },
        Baz {
            c: true,
        },
        Baz {
            c: false,
        },
    ],
}

That being said, I only have mere minutes of experience using bpaf. So I don't know the real differences between e.g. clap and bpaf. All I can say is, I previously had multiple projects, with semi-complex CLIs using structopt and clap, and all of them easily mapped into bpaf, with no issues so far. But again, take it with a grain of salt.

I'm not advocating the use of bpaf instead of clap. I personally wanted to use clap. However, supporting multiple adjacent commands was the feature I needed. So I have to go with bpaf.

@epage
Member

epage commented Feb 26, 2024

After finally wanting to convert all my hacky old structopt code into clap. However, chaining adjacent commands was always the feature I needed, so clap still has to wait.

I'm not aware of structopt supporting this, so I'm confused how this would be a blocker for moving off of it.

@vallentin

@epage I'm confused why that's the take away. Anyways, I initially preferred structopt because of its derive macro, obviously today that has changed. However, as I said my solution was hacky. In short, instead of using from_args(), I would instead use from_iter() and manually split the args beforehand based on a delimiter. Again, like I said, hacky solution. But it allowed me to somewhat simulate having multiple chained adjacent commands.

@epage
Member

epage commented Feb 27, 2024

Thanks for the clarification. That wasn't meant to be a "take away"; it was a leap that what was said didn't explain on its own, so I wanted to better understand.

@vallentin

@epage if you're curious about the use case, I have various tools that operate on data or allow performing multiple actions sequentially.

[program] [--dry-run] \
    [cmd] [cmd-options...] \
    [cmd] [cmd-options...] \
    [cmd] [cmd-options...]

Actual example for rendering 2 images:

draw \
    size 100 100 \
    fill  0  0 50 50 "#FF0000" \
    fill 50  0 50 50 "#00FF00" \
    fill  0 50 50 50 "#0000FF" \
    fill 50 50 50 50 "#FFFFFF" \
    export --output "image.png" \
    fill 25 25 50 50 "#FFFF00" \
    export --output "image2.png"

In reality, I could create config formats instead. However, many times I'm executing a single command, so having to create a config is tedious, e.g.:

draw --input "image.png"  --output "image-inverted.png" \
    invert  

Additionally, both supporting a CLI and config doubles the code. So many times, I end up with a CLI and then create a .sh file when I need a "config".

@abey79

abey79 commented Feb 27, 2024

I can second this exact use case. I'm considering a rust rewrite of vpype, which is very similar to @vallentin's example (albeit targeted at vector graphics for the plotter-based art niche).

Example:

vpype  read input.svg  layout --landscape --fit-to-margin 1.5cm a4  linesort --tolerance 0.1mm  write output.svg

darsto added a commit to darsto/cabal_rgs that referenced this issue Apr 6, 2024
Previously all the params had to be unique between
services because clap didn't yet support what we want:
clap-rs/clap#2222

It still doesn't, but we can hack our way by parsing
the args string manually, and passing different parts to
different sub-parsers.

$ cargo run -- --help
Cabal Online Replacement Services

Usage: cabal-mgr [OPTIONS]
          [-s|--service <SERVICE1> <SERVICE1_OPTIONS>]
          [-s|--service <SERVICE2...>]

Services:
  crypto-mgr  RockAndRoll equivalent
  event-mgr   EventMgr equivalent
  proxy

Options:
  -r, --resources-dir <RESOURCES_DIR>
          [default: .]

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version

Examples:
  ./cabal_mgr -s crypto-mgr --service event-mgr

@ahirner

ahirner commented Dec 6, 2024

I think many cases can be solved with .defer() quite generically.

Setup:

use clap::{Args, FromArgMatches, Parser, Subcommand};

#[derive(Debug, Parser)]
struct Cli {
    #[clap(long)]
    top: String,
    #[command(subcommand)]
    cmd: Option<Cmd>,
}

#[derive(Debug, Subcommand)]
#[command(subcommand_precedence_over_arg = true)]
enum Cmd {
    Foo(ReClap<FooOpts, Self>),
    Bar(ReClap<BatOpts, Self>),
    Baz(ReClap<BatOpts, Self>),
}

#[derive(Debug, Args)]
struct FooOpts {
    #[clap(long, short('a'))]
    ayy: Vec<usize>,
    #[clap(long)]
    hey: Option<String>,
}

/// Bar/Baz Docstring.
#[derive(Debug, Args)]
struct BatOpts {
    #[clap(long)]
    ok: bool,
}


fn main() {
    // Alternatively, parse a fixed argument list while experimenting:
    // let cli = Cli::parse_from("main --top 8 baz --ok foo -a 1 -a 2 bar --ok baz".split(' '));
    let cli = Cli::parse();

    println!("top: {}", &cli.top);
    let mut next = cli.cmd;
    while let Some(cmd) = next {
        println!("{cmd:?}");
        // could use enum_dispatch
        next = match cmd {
            Cmd::Foo(rec) => (rec.next).map(|d| *d),
            Cmd::Bar(rec) => (rec.next).map(|d| *d),
            Cmd::Baz(rec) => (rec.next).map(|d| *d),
        }
    }
}

Examples:

$ cargo r -- --top 8 baz --ok foo -a 1 -a 2 bar --ok baz
top: 8
Baz(ReClap { inner: BatOpts { ok: true }, next: Some(Foo(ReClap { inner: FooOpts { ayy: [1, 2], hey: None }, next: Some(Bar(ReClap { inner: BatOpts { ok: true }, next: Some(Baz(ReClap { inner: BatOpts { ok: false }, next: None })) })) })) })
Foo(ReClap { inner: FooOpts { ayy: [1, 2], hey: None }, next: Some(Bar(ReClap { inner: BatOpts { ok: true }, next: Some(Baz(ReClap { inner: BatOpts { ok: false }, next: None })) })) })
Bar(ReClap { inner: BatOpts { ok: true }, next: Some(Baz(ReClap { inner: BatOpts { ok: false }, next: None })) })
Baz(ReClap { inner: BatOpts { ok: false }, next: None })

$ cargo r -- --top 8 baz --ok foo --help
Usage: clap-ext-play baz foo [OPTIONS] [COMMAND]

Commands:
  foo
  bar  Bar/Baz Docstring
  baz  Bar/Baz Docstring

Options:
      --ayy <AYY>
      --hey <HEY>
  -h, --help       Print help

Implementation:

/// `[Args]` wrapper to match `T` variants recursively in `U`.
#[derive(Debug, Clone)]
pub struct ReClap<T, U>
where
    T: Args,
    U: Subcommand,
{
    /// Specific Variant.
    pub inner: T,
    /// Enum containing `Self<T>` variants, in other words possible follow-up commands.
    pub next: Option<Box<U>>,
}

impl<T, U> Args for ReClap<T, U>
where
    T: Args,
    U: Subcommand,
{
    fn augment_args(cmd: clap::Command) -> clap::Command {
        T::augment_args(cmd).defer(|cmd| U::augment_subcommands(cmd.disable_help_subcommand(true)))
    }
    fn augment_args_for_update(_cmd: clap::Command) -> clap::Command {
        unimplemented!()
    }
}

impl<T, U> FromArgMatches for ReClap<T, U>
where
    T: Args,
    U: Subcommand,
{
    fn from_arg_matches(matches: &clap::ArgMatches) -> Result<Self, clap::Error> {
        //dbg!(&matches);
        // this doesn't match subcommands but a first match
        let inner = T::from_arg_matches(matches)?;
        let next = if let Some((_name, _sub)) = matches.subcommand() {
            //dbg!(name, sub);
            // most weirdly, Subcommand skips into the matched
            // .subcommand, hence we need to pass outer matches
            // (which in the average case should only match enumerated T)
            Some(U::from_arg_matches(matches)?)
            // we are done, since sub-sub commands are matched in U::
        } else {
            None
        };
        Ok(Self { inner, next: next.map(Box::new) })
    }
    fn update_from_arg_matches(&mut self, _matches: &clap::ArgMatches) -> Result<(), clap::Error> {
        unimplemented!()
    }
}
...

Playground.

I see no practical length limit of the command chain, although there are quirks for sure.
I'm curious how it could solve anybody's problem.
