Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[modular] User-driven validation #3008

Open
2 tasks done
epage opened this issue Nov 9, 2021 · 22 comments
Open
2 tasks done

[modular] User-driven validation #3008

epage opened this issue Nov 9, 2021 · 22 comments
Labels
A-validators Area: ArgMatches validation logi C-enhancement Category: Raise on the bar on expectations S-waiting-on-mentor Status: Needs elaboration on the details before doing a 'Call for participation'

Comments

@epage
Copy link
Member

epage commented Nov 9, 2021

Please complete the following tasks

  • I have searched the discussions
  • I have searched the existing issues

Clap Version

master

Describe your use case

Describe the solution you'd like

Define one or more App Validation trait

  • Validate: Takes a built App and ArgMatches as an input, reports errors
  • Requires: Takes a built App, ArgMatches, and args as input, reports which args are required
  • (maybe) Default value: Takes a built App, ArgMatches, and args as input, reports default values for them

With maybe a hint(&self, app: &App) -> Option<String> (see #1695 and #3321)

  • Should it also accept a matches: Option<&ArgMatches> for smart hints?

Steps

  1. Refactor all requires, conflicts, etc APIs to be implemented in terms of this
  2. Expose the trait and structs to the user in a clap::validate
  3. Deprecate the old Arg functions so now only those using this API pay the cost due to dead code elimination

Alternatives, if applicable

No response

Additional Context

See also #2832

Related issues

Risks are usability and performance. The most basic APIs, like required, we probably need to keep baked in. Its the special inter-arg interactions that we should be focusing on.

@epage epage added T: new feature A-validators Area: ArgMatches validation logi labels Nov 9, 2021
@epage epage added C-enhancement Category: Raise on the bar on expectations and removed T: new feature labels Dec 8, 2021
@epage epage added the S-waiting-on-mentor Status: Needs elaboration on the details before doing a 'Call for participation' label Dec 13, 2021
@ctrlcctrlv
Copy link

I'm not quite understanding what the Validate trait will look like. Is this a project-level trait that will be implemented on a certain Fn / FnMut, meaning users don't need to worry about it, or will it be provided by a clap_validate crate, and users need to include that crate and its trait to get App::validator_all?

@epage
Copy link
Member Author

epage commented Jan 10, 2022

My general expectation for traits in this issue and related ones (e.g. #2683) is they would live in clap (we'd need access to them) and we will have an impl for Fn signatures where they make sense for convinience. We'd work to duplicate clap's validation functionality in structs that implement these traits. These would most likely live in clap, relying on the compiler deleting dead code (this will be easier for a compiler to detect than the current under-used validation code).

All of this would start out behind an unstable-* feature flag as we work to polish it up. Once we felt good with them, we'd remove the feature flag and deprecate the old functionality. On the next major release, we'd then remove the old functionality, pushing people to use the trait-based version.

This is all contingent on the design working out. We might find it just isn't worth completing this.

@ctrlcctrlv
Copy link

Wait, why would clap need access to the trait?

The trait could just fit into clap's builder pattern for App, providing a function for App which modifies App.

Then you could call App::error in the Validator trait impl. clap wouldn't need to know the validator exists so long as App has the right struct members pub or fn's to get them.

@epage
Copy link
Member Author

epage commented Jan 13, 2022

The proposed validator trait would be called during parse so we have the context of the specific subcommand being processed.

@ctrlcctrlv
Copy link

Forgive me for not knowing the internals as well as you, but couldn't the Validator trait, instead of being called on App, be called on ArgMatches instead?

Perhaps as app.get_matches().validate().

@epage
Copy link
Member Author

epage commented Jan 13, 2022

It needs access to the relevant App for any extra error reporting it wants to do, like

  • Checking for additional context for whether validation has failed or not
  • Showing usage
  • Showing help if we made this replace ArgsRequiredElseHelp'

@ctrlcctrlv
Copy link

I see. Honestly, as much as I don't like it, perhaps you do have a strong case for pushing App::error on downstream users, and we're just going to have to learn to live with it. ;-)

@epage
Copy link
Member Author

epage commented Jan 13, 2022

The primary motivation behind this is to see if we can make the cost of validation paid by those who use it rather than everyone (ie clap's existing validation logic like conflicts, special require rules, etc). A secondary motivation is to simplify the APIs for the more complex validation APIs in clap today. All other validation use cases would just be a bonus since we do have solutions like App::error. This might work and be merged; it might fail and we reject this issue.

@ctrlcctrlv
Copy link

ctrlcctrlv commented Jan 13, 2022

Hopefully you can avoid my horrible situation of the past week: adding an API (MFEK/glifparser.rlib@6aaf880), realizing it's broken and trying to fix it (MFEK/glifparser.rlib@d9aa702)…twice (MFEK/glifparser.rlib@9b06a94). Maybe it has no edge cases now, we shall see. Popular software can't afford such…unprofessionalism, I'm aware. :)

@epage
Copy link
Member Author

epage commented Jan 13, 2022

This would most likely be implemented via

  1. Refactor the existing implementation to use a pub(crate) trait (dog fooding)
  2. Expose it as a pub trait behind a feature flag with a stablization process (e.g. Stabilise AppSettings::Multicall Tracking Issue #2861) (collect public feedback)
  3. Stablize the feature (point of no return ... until next major version)
  4. On next major version, remove parts of API not needed anymore

ctrlcctrlv added a commit to MFEK/stroke that referenced this issue Feb 2, 2022
Although I once thought that it's likely we'd be able to go back to
master `clap`, MFEK might have to maintain its fork indefinitely.

`clap`'s maintainer @epage unfortunately wrote in clap-rs/clap#3008:

>  We might find it just isn't worth completing this.
@kraktus
Copy link

kraktus commented May 6, 2022

What about relying on serde for custom validation instead? clap would serialise it, and users who would write custom deserialisation, helped with the helper methods of serde. That would a be more modular than clap implementing its own trait and visitor

@epage
Copy link
Member Author

epage commented May 6, 2022

I worry serde would too much complexity for most cases since a lot of the cases people are interested in validating would require hand-writing serde traits which can be burdensome while just providing a Vaidator trait in clap that works on an ArgMatches would be relatively trivial to implement. No visitors needed.

@ctrlcctrlv
Copy link

Not necessarily.

With serde_value, you get a Value type that represents any Serde "object". These can be coerced into the user's structs because they also implement Deserialize.

I think @kraktus' idea is a good middle ground.

@epage
Copy link
Member Author

epage commented May 6, 2022

With serde_value, you get a Value type that represents any Serde "object". These can be coerced into the user's structs because they also implement Deserialize.

I'm not seeing how that addressed the concern I raised. I'm assuming there is context missing that I could make guesses at but I would prefer to understand what you are getting at then understanding my assumptions :)

The other problem with serde is it assumes Paths are UTF-8 so naive handling of paths will be done incorrectly while clap is wanting to ensure paths are default handled as a bag of bytes rather than utf-8

@ctrlcctrlv
Copy link

I was thinking that you'd have Deserialize on the whole ArgMatches. It's already normal for me to have an Args struct. If ArgMatches could become a serde::Value then it could become also an Args, right?

@epage
Copy link
Member Author

epage commented May 6, 2022

How does users writing custom validation fit into that picture? If the user just wants to do a field-level validator, they can use the field attributes to provide their own custom deserializer. If they want to validate the entire struct, which is what this issue is focusing on, then they would need to hand write the entire Deserialize trait which is not pretty to do by hand.

Compare that to just passing in a value that implements Validator as described earlier. That is a large burden for the user to do one vs the other and so doesn't seem a viable way of solving this problem. We also have a custom derive that offers a lot of domain-specific value that can't be had with clap, which lowers the value add of going down this route for custom validation.

That said, it would be interesting for people to support serde support for ArgMatches but I think that would best start out as an external crate so it gives people room to experiment and try out ideas more to see what works or doesn't before pulling it into clap. This will require a newtype due to the orphan rules. Some more features will be needed from ArgMatches which I would like to provide but they are blocked on some v4 changes because of difficulties with our internal Id type.

@ctrlcctrlv
Copy link

ctrlcctrlv commented May 6, 2022

My problem was I was only considering my patch #3029 and not that you wanted to make every case work.

To refresh memories, my patch was focused on validating a single struct field that contains multiple values.

So I could use a Newtype and impl FromStr.

But you're right this won't let you validate the whole struct.

@ctrlcctrlv
Copy link

Erm, or I could impl From<Vec<…>> for Newtype.

Then when I call values_of / values_of_os...I could quickly make the returned Vec into my Newtype.

I hadn't considered the other cases, my bad, I forgot this issue was not just about #3029

@Easyoakland
Copy link

Where is work being done on this? Is there more discussion that must be had first?

@epage
Copy link
Member Author

epage commented Jun 25, 2023

For myself, I have tasks on other projects that I feel are higher priority on this. Someone else could do some exploratory work on this but the challenge is this is very open-ended work and someone working on it would need to keep us in the loop on decisions to make sure there is buy-in on the direction being taken.

@Easyoakland
Copy link

It seems like what this is proposing is a validator trait on functions which can take a validator to make new validator. Then use Combinatory_logic to define all the validators. How is this proposed approach different from copying bpaf?

@pacak
Copy link

pacak commented Jun 30, 2023

How is this proposed approach different from copying bpaf?

In bpaf you would compose parsers instead of validators, see this article for a general idea: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

Disclaimer, I'm not an expert in clap, but I can try to give some examples according some hopefully idiomatic existing code.

cargo-add needs user to specify the source, so it defines https://github.com/rust-lang/cargo/blob/5b377cece0e0dd0af686cf53ce4637d5d85c2a10/src/bin/cargo/commands/add.rs#L85 a bunch of arguments with some validation - "tag" requires "git", "path" conflicts with "git", etc.

Values later extracted into variables https://github.com/rust-lang/cargo/blob/5b377cece0e0dd0af686cf53ce4637d5d85c2a10/src/bin/cargo/commands/add.rs#L226

And consumed with https://github.com/rust-lang/cargo/blob/5b377cece0e0dd0af686cf53ce4637d5d85c2a10/src/bin/cargo/commands/add.rs#L259

Biggest problem with that is that user code (parse_dependencies) relies on validated invariants, not parsed ones.

With bpaf you would have those values constructed either by derive or construct! macro.

struct Options {
   ... 
   source: Input,
   ...
}

enum Ref {
    Branch { branch: String },
    Tag { tag: String },
     Rev { rev: String },
}

enum Input {
    Crate { name: String },
    File { path: PathBuf },
    Git { uri: String, ref: Option<Ref>  },
}

With parsers and validators operating on strictly typed Rust values.

For hypothetical input validator (let's say against crate blacklist) clap version would have to deal with the same steps - getting fields, assuming some invariants been vitiated earlier (unreachable!("clap should ensure we have some source selected");) and validating them. With bpaf validator would get a val: &Input or val: &Ref depending on where it is placed.

Now say you want to add one extra input type, let's call it "interactive picker".

With clap you modify parser definition and then you have to go over all the places where ArgMatches are used and update them, otherwise things will explode on runtime with unreachable!("clap should ensure we have some source selected");

With bpaf compiler will tell you all the places where match val needs to be updated to handle one more branch.

This becomes a problem when argument parser or parts of it are shared across multiple binaries, but even within a single crate it is easier to work with the right type

enum OutputFormat {
    Intel,
    Att,
    Llvm,
    LlvmInput,
    Mir,
    Wasm,
    McaIntel,
    McaAtt,
}

rather than trying to emulate it using product types:

struct OutputFormat {
    intel: bool,
    att: bool,
    llvm: bool,
    ...
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-validators Area: ArgMatches validation logi C-enhancement Category: Raise on the bar on expectations S-waiting-on-mentor Status: Needs elaboration on the details before doing a 'Call for participation'
Projects
None yet
Development

No branches or pull requests

5 participants