Regex matching. #85

Mike-Neto · 2018-01-25T20:56:46Z

Looking for feedback on this PR. Fixes #81.
Added functionality to test against regex::Regex.

A fuzzy regex match must match the regex at least once to pass.
A non-fuzzy regex match takes a second param that specifies the number of matches it needs to find.

Also leaves room to implement predicate matching.

Added two new functions to the API:

pub fn matches(mut self, output: regex::Regex) -> Assert
pub fn matches_ntimes(mut self, output: regex::Regex, nmatches: u32) -> Assert

You can see an example of the API in use in my own crate's branch: https://github.com/Mike-Neto/img_diff/tree/assert-cli-regex

Tests are not verified, as I'm on Windows and 7 tests are failing on master (only those 7 keep failing).
Clippy and Rustfmt have not been run yet, as I'm just looking for feedback on the implementation itself.

I have created tests for any new feature, or added regression tests for bugfixes.
cargo test succeeds
Clippy is happy: cargo +nightly clippy succeeds
Rustfmt is happy: cargo +nightly fmt succeeds

epage · 2018-01-25T22:18:44Z

RE Your API.

What is the use case for matches_ntimes?
If we're exposing regex to the user, do we bother with distinguishing between fuzzy and non-fuzzy regex?

RE Conflicts

I'm torn. We're at a crossroads with a major refactor of the code / API (#74 or #75), and it'd be nice to avoid conflicts with those PRs because they were big. On the other hand, I don't want to discourage contributions or completely hold up assert_cli while we figure out what is going on.

Mike-Neto · 2018-01-25T22:54:50Z

> What is the use case for matches_ntimes?

I use it in the tool I linked. The way it works is, it outputs the diff values to the console and I check that they are there, simple enough.
For a single file I know the expected output, so something like .is("Dssim(0)") is appropriate. But on Rust nightly the output is Dssim(0.0); this is where regex matching comes in, allowing me to target both outputs.
That covers a single file. For multiple files I could match their output directly, but the order in which they are compared is not deterministic, so the best I can do is check that, for example, I have two Dssim(0) and a single Dssim(0.234234).
Otherwise I'm only validating that I have at least one Dssim(0) and at least one Dssim(0.234234), which can be misleading because it does not check that the second Dssim(0) comparison succeeded.
So its goal is to provide a way to match a single pattern that may occur more than once and be sure that it occurs the expected number of times.
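
The counting check described above can be sketched roughly as follows. This is a minimal illustration, assuming a literal substring stands in for the regex so that only the standard library is needed; `count_occurrences` and `occurs_ntimes` are hypothetical names, not part of assert_cli. With the regex crate, the count would presumably come from `Regex::find_iter(output).count()` instead.

```rust
/// Count non-overlapping occurrences of a literal `needle` in `output`.
/// Assumption: a literal substring stands in for the regex here.
fn count_occurrences(output: &str, needle: &str) -> usize {
    output.matches(needle).count()
}

/// Hypothetical check in the spirit of `matches_ntimes`: pass only when the
/// pattern occurs exactly `expected` times, regardless of line order.
fn occurs_ntimes(output: &str, needle: &str, expected: usize) -> bool {
    count_occurrences(output, needle) == expected
}
```

Because only the count is compared, the check gives the same answer whichever order the files are diffed in, which is the property the multi-file case needs.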

> If we're exposing regex to the user, do we bother with distinguishing between fuzzy and non-fuzzy regex?

Not sure what you mean here. Can you elaborate?

RE Conflicts:
I understand your concerns, but this is a feature that is more or less ready and can be added as a non-breaking change. So if those big changes are not scheduled to be merged soon, we might as well push this new feature out first and then rework it into the new code.

epage · 2018-01-25T23:16:57Z

> I use it in the tool I linked. The way it works is, it outputs the diff values to the console and I check that they are there, simple enough.

Then GitHub's search failed me. Could you provide a link?

> > If we're exposing regex to the user, do we bother with distinguishing between fuzzy and non-fuzzy regex?
>
> Not sure what you mean here. Can you elaborate?

https://github.com/killercup/assert_cli/pull/85/files#diff-b8e1279b7e534c886db53e49d60c14a5R495

In that, one specified fuzzy as true and another as false.

(sigh, the refactor really cleans up this code)

Mike-Neto · 2018-01-25T23:42:03Z

> Then GitHub's search failed me. Could you provide a link?

https://github.com/Mike-Neto/img_diff/blob/256c0a75bc1ee1077f4278b320ed0e27ce6e7d5e/src/main.rs#L221

> https://github.com/killercup/assert_cli/pull/85/files#diff-b8e1279b7e534c886db53e49d60c14a5R495
>
> In that, one specified fuzzy as true and another as false.

Ohh, I was just following the "pattern" that was there in the first place: contains is fuzzy because it isn't as specific as is (it can match many different outputs, as long as they contain the input string). contains is analogous to my matches, and matches_ntimes is more like is; at least that was the interpretation I got from that.

Can you point me to the branch that is the most "recent"? By that I mean the one you want this feature implemented in; maybe I can make a PR for that branch instead and rework my code around it.

Also, what do you think about taking the pattern as a String instead of a regex::Regex, so the calling code doesn't have to instantiate the regex itself?

epage · 2018-02-02T14:36:44Z

Sorry for the delay; I got caught up in other projects.

> Ohh, I was just following the "pattern" that was there in the first place: contains is fuzzy because it isn't as specific as is (it can match many different outputs, as long as they contain the input string). contains is analogous to my matches, and matches_ntimes is more like is; at least that was the interpretation I got from that.

fuzzy is basically the predecessor to your ExpectType.

> Can you point me to the branch that is the most "recent"? By that I mean the one you want this feature implemented in; maybe I can make a PR for that branch instead and rework my code around it.

#74 has both a refactor and an API change. I'm thinking of splitting these up so it's easier to get changes in while we worry about the API. I could probably have that done by end of day tomorrow. Would you want to adjust your work to be on top of that?

> Also, what do you think about taking the pattern as a String instead of a regex::Regex, so the calling code doesn't have to instantiate the regex itself?

That can be handy. On the other hand, I was playing with the idea of having the contains and is functions be smart and take both strings and regexes. In the string case, it would behave as it does today rather than interpreting the string as a regex.

I had this idea before I looked at the docs. I assumed the regex crate had something like Python's API, with distinct search and match. So we could either emulate that behavior or have distinct names for regex matching (and implicitly convert strings to regex).

Thoughts?

Mike-Neto · 2018-02-02T19:01:41Z

> #74 has both a refactor and an API change. I'm thinking of splitting these up so it's easier to get changes in while we worry about the API. I could probably have that done by end of day tomorrow. Would you want to adjust your work to be on top of that?

Yes, that way we can get these features in for current users and only introduce the breaking API changes later.

Regarding your last point, I think different functions are the best option, as they document intent and will also use strings as params to allow for simple refactoring.
My reason for keeping them in separate functions is the handling of (): types like Option print as Some(VALUE), which will break .contains and .is in cases where we use regex internally (how could we even guess which one the user was trying to use?).
Regarding the search and match behaviors, I agree with implementing both, as they are useful. However, .search will also be able to take an (optional) param for the number of matches expected, similar to my current .matches_ntimes.

Looking forward to implementing against the new code base :)

epage · 2018-02-04T03:16:48Z

Feel free to provide feedback on #87

epage

FYI: We'd prefer commit histories to be cleaned up before we merge them. Don't worry about it now, but once this is fully ready and review comments are done, we'd appreciate it if you could clean them up.

Keep in mind that GitHub doesn't always send notifications for forced pushes, so you'll need to let us know when it's pushed so we can go in and merge.

epage · 2018-02-06T02:39:53Z

src/output.rs

+    fn verify(&self, got: &[u8]) -> Result<()> {
+        let conversion = String::from_utf8_lossy(got);
+        let got = conversion.as_ref();
+        if self.times == 0 {


What are your thoughts on using a sentinel value (0) compared to using Option?

Mike-Neto · 2018-02-06

This is a good suggestion. It's just that some bad habits take a while to leave :)

epage · 2018-02-06

I'm not asking to point out bad habits; sometimes the line is blurry. In another project, I have one behavior when a Vec is empty and another when it has items. I'm not using an Option for it because it doesn't feel like it'd jibe quite right with the API.
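
For comparison, here is a rough sketch of what the Option variant of this check might look like. It is not the PR's code: `MatchCount` is a hypothetical type, the error type is simplified to `String` instead of the crate's error-chain `Result`, and a literal substring stands in for the regex so the sketch needs only the standard library.

```rust
/// Hypothetical predicate: `times: None` means "at least one match",
/// `Some(n)` means "exactly n matches". No sentinel value is needed.
struct MatchCount {
    needle: String,
    times: Option<usize>,
}

impl MatchCount {
    /// Simplified stand-in for the `verify` method in the diff above.
    fn verify(&self, got: &[u8]) -> Result<(), String> {
        let text = String::from_utf8_lossy(got);
        // Assumption: literal substring matching replaces the regex search.
        let found = text.matches(self.needle.as_str()).count();
        match self.times {
            None if found > 0 => Ok(()),
            None => Err(format!("expected at least one match of {:?}", self.needle)),
            Some(n) if found == n => Ok(()),
            Some(n) => Err(format!(
                "expected {} matches of {:?}, found {}",
                n, self.needle, found
            )),
        }
    }
}
```

One side effect of this shape is that `Some(0)` becomes expressible as "must not match at all", which a 0 sentinel cannot represent.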

epage · 2018-02-06T02:45:53Z

src/output.rs

@@ -386,6 +463,16 @@ mod errors {
                description("Output predicate failed")
                display("{}\noutput=```{}```", msg, got)
            }
+/* Adding a single error more makes this break, using the bottom one temporarily


In what way does this break?

Granted, at some point we should probably move beyond error-chain

Mike-Neto · 2018-02-06

Out of stack space; this is a known bug in error-chain. I'm going to look up some docs for this problem.

epage · 2018-02-06T02:49:49Z

src/assert.rs

+    ///     .stdout().matches("[0-9]{2}")
+    ///     .unwrap();
+    /// ```
+    pub fn matches(mut self, output: String) -> Assert {


Should we accept a regex?

Should we accept a byte slice and a byte regex?

If we decide on yes, it doesn't mean you have to do them (I don't want the bar for contributions to be perfection), but we need to at least create issues for them and keep them in mind with the design of the API / implementation.

Mike-Neto · 2018-02-06

I think yes, at least a regex; a byte regex not so much, IMHO. However, I see no problem in setting us up for later by implementing an abstraction layer here.

Mike-Neto · 2018-02-06

Also, the current usage code is more like .matches(String::from("[0-9]")). As such, I will probably use str instead of String (just borrow).

epage · 2018-02-06

For str vs String, you can always make it accept both.
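
One common way to accept both is a generic `Into<String>` bound. The sketch below is only an illustration of that idea: `PatternSpec` is a hypothetical stand-in, and the real `matches` in this PR is a method on `Assert` with a different signature.

```rust
/// Hypothetical holder for the pattern source; not an assert_cli type.
struct PatternSpec {
    source: String,
}

/// `Into<String>` lets callers pass either `&str` or `String`, so the
/// call site needs no `String::from(...)` boilerplate.
fn matches<S: Into<String>>(pattern: S) -> PatternSpec {
    PatternSpec {
        source: pattern.into(),
    }
}
```

With this bound, `matches("[0-9]{2}")` and `matches(String::from("[0-9]{2}"))` both compile, and an owned `String` is moved in without an extra allocation.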

epage · 2018-02-06T02:55:51Z

src/output.rs

+    ///     .stdout().matches_ntimes("[0-9]{1}", 2)
+    ///     .unwrap();
+    /// ```
+    pub fn matches_ntimes(output: String, nmatches: u32) -> Self {


(longer-term brainstorming, feel free to ignore me)

Ideally, in the long run I'd like to do this through a more builder-like approach .matches("pattern").times(10). contains could also implement this. It'd provide a nice way to keep the upfront API "small". We're already talking about doing this for other features.

The interesting challenge is deciding how to implement that.

The nasty "unsafe" option is for Output to have a .times() function. I say "unsafe" because the call is meaningless in some cases (like .is).

Another option is to move away from having people interact with Output and instead have people construct the predicates directly and we'll do an Into<ContentPredicate>. This will end up more like killercup's proposed API.

Mike-Neto · 2018-02-06

This can be very useful but is out of scope for this PR. For now, matches_ntimes(V, N) is a good enough solution; we can open an improvement issue to later refactor and abstract it into times(N).
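
To make the second option above more concrete, here is a hedged sketch of predicates that are constructed directly and converted with `Into<ContentPredicate>`. Only the name `ContentPredicate` comes from the discussion; `Matches`, `Is`, `matches`, `is`, and `check` are hypothetical, and a literal substring stands in for the regex so the sketch needs only the standard library.

```rust
/// Hypothetical predicate built by `matches(...)`. Only this type has
/// `.times()`, so the call cannot appear where it would be meaningless.
struct Matches {
    needle: String,
    times: Option<usize>,
}

fn matches(needle: &str) -> Matches {
    Matches { needle: needle.to_string(), times: None }
}

impl Matches {
    /// Require exactly `n` occurrences instead of "at least one".
    fn times(mut self, n: usize) -> Self {
        self.times = Some(n);
        self
    }
}

/// Hypothetical exact-equality predicate, analogous to `.is`; no `.times()`.
struct Is {
    expected: String,
}

fn is(expected: &str) -> Is {
    Is { expected: expected.to_string() }
}

enum ContentPredicate {
    Is(Is),
    Matches(Matches),
}

impl From<Is> for ContentPredicate {
    fn from(p: Is) -> Self {
        ContentPredicate::Is(p)
    }
}

impl From<Matches> for ContentPredicate {
    fn from(p: Matches) -> Self {
        ContentPredicate::Matches(p)
    }
}

impl ContentPredicate {
    fn eval(&self, output: &str) -> bool {
        match self {
            ContentPredicate::Is(p) => output == p.expected.as_str(),
            ContentPredicate::Matches(p) => {
                // Assumption: substring counting replaces regex matching.
                let found = output.matches(p.needle.as_str()).count();
                match p.times {
                    Some(n) => found == n,
                    None => found > 0,
                }
            }
        }
    }
}

/// Hypothetical entry point that accepts anything convertible to a predicate.
fn check<P: Into<ContentPredicate>>(output: &str, predicate: P) -> bool {
    predicate.into().eval(output)
}
```

In this shape `check(out, matches("ab").times(2))` compiles while `is("ab").times(2)` does not, which avoids the "unsafe" case where `.times()` lives on `Output` itself.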

epage · 2018-04-07T16:07:36Z

FYI: with #98 we are switching to generic predicates. In assert-rs/predicates-rs#18 I'm adding regex to the generic predicates. It doesn't support repetition counts, but I have noted that in assert-rs/predicates-rs#12.

epage · 2018-05-29T12:33:29Z

Addressed in https://github.com/assert-rs/assert_cmd

TheMikeNeto added 3 commits January 25, 2018 00:11

Implemented regex matching. Gonna test this before refactoring.

6406d80

fuzzy and nonfuzzy regex matches.

e70bdac

fixed fuzzy matching.

e695095

Mike-Neto mentioned this pull request Jan 25, 2018

Regex matching in OutputAssertionBuilder #81

Closed

epage mentioned this pull request Feb 3, 2018

Refactor output #87

Merged

= added 2 commits February 4, 2018 20:21

merge.

3b2197a

Implemented Regex parsing, 1st pass.

0d589c2

epage reviewed Feb 6, 2018

epage mentioned this pull request Apr 7, 2018

Take inspiration from hamcrest's predicates assert-rs/predicates-rs#12

Closed

epage mentioned this pull request Apr 16, 2018

Support repetition counts in regex predicate assert-rs/predicates-rs#27

Closed

epage closed this May 29, 2018
