Skip to content
This repository has been archived by the owner on Dec 29, 2021. It is now read-only.

Provide option to not trim output strings #77

Closed
behnam opened this issue Nov 3, 2017 · 13 comments
Closed

Provide option to not trim output strings #77

behnam opened this issue Nov 3, 2017 · 13 comments

Comments

@behnam
Copy link

behnam commented Nov 3, 2017

https://github.com/killercup/assert_cli/blob/3bc97874a188423fcdc3252c5d677b0fb048530e/src/output.rs#L34

Writing an echo-like tool, I want to be able to test the output for detailed behavior on trailing newline characters. Unfortunately, it's not possible now because of the trimming done on both strings.

It would be great to have an option to disable the trimming, as a config or an alternate to is(), like is_exactly().

@behnam
Copy link
Author

behnam commented Nov 3, 2017

By the way, I'm not curious if this way of trimming behaves the same under Windows as it does on Unix-like systems.

@epage
Copy link
Collaborator

epage commented Nov 3, 2017

Thanks for the feedback. While trimming is a help to most people, I can see how some cases need exact handling.

@killercup
Copy link
Collaborator

By the way, I'm not curious if this way of trimming behaves the same under Windows as it does on Unix-like systems.

You are not curious? Then you probably don't want to read this: (:trollface:)

Returns a string slice with leading and trailing whitespace removed.

'Whitespace' is defined according to the terms of the Unicode Derived Core Property White_Space.

https://doc.rust-lang.org/1.21.0/std/primitive.str.html#method.trim

@behnam
Copy link
Author

behnam commented Nov 3, 2017

Huhu, @killercup! That was supposed to be a "now".

Yeah, I think I wasn't clear about that comment. I meant the "\n" part of that line, not the trim() part. Does "\n" become two chars in Windows? (I haven't used Rust in Windows, that's why I'm asking.)

@killercup
Copy link
Collaborator

killercup commented Nov 3, 2017

Hi Behnam! 👋 I have no way to test this on Windows but \r\n is stripped on playpen.

Do you feel like digging into libstd and friends? 'Cause I feel like it. Let's look at the code! This leads me here which in turn brings me to this and finally to this impl of trim_matches. The last three steps are not even necessary, I just wanted to make sure there is nothing platform specific there. The most interesting this is

self.trim_matches(|c: char| c.is_whitespace())

So, what is is_whitespace? So glad you ask! It's this simple fn. So every thing from 0x09 to 0x0d is whitespace—and there we finally have it: 0xa is LF (\n), 0x0d is CR (\r)! Let's check to make sure… yep. That's it.

(Sorry, you probably knew all that. But I kinda wanted to write it down cause I like this kind of thing 😅)

@behnam
Copy link
Author

behnam commented Nov 3, 2017

@killercup, nice writing, but for the newline matters, I'm not talking about Rust's trim() method anymore, but instead how difference is used here to match output that may have Unix-like newlines or Windows newlines.

I looked at the documentation for Changeset::new(), and looks like the separator argument is only used for interpreting the diff, if there's any, but it doesn't affect the matching.

And I don't see any newline conversion being done outside of matches_exact() or other functions.

So, if I write an echo command and test it like this:

    Assert::main_binary()
        .with_args(&["Hello"])
        .stdout().is("Hello") //.
        .stdout().contains("Hello\n") //.
        .unwrap();

it will be passing on Unix, but will fail on Windows.

So, if the goal is to be able to support cross-platform matching for multiline outputs either a newline-less pattern should be supported (like .has_lines(&["First line.", "Second line."])), or different newline sequences need to be treated equally when checking the diff.

What do you think, @killercup? Is this a goal to support any matching string (via is() or contains()) that contains a newline char?

@killercup
Copy link
Collaborator

Oh, I see, thanks for clarifying, @behnam!

I see two options here: We just note in the docs that you need to take care of platform-specific line endings, or our matchers deal with that automatically. I'm in favor of the latter, because I don't want to deal with that kind of stuff as a user.

but instead how difference is used here to match output that may have Unix-like newlines or Windows newlines.

Adding this to the difference crate is most likely the best way to do it (even if just to avoid the str.lines().map(str::trim).collect() stuff) and they have an issue that kinda goes into that direction: johannhof/difference.rs#10. (It's called "Edit distance assert should treat all whitespace the same" but Unix and Windows line endings have a different number of line ending characters, so I'm not sure it actually covers it.)

(Btw, I have the feeling that most people use .contains() and only ever check for single line content, but this is completely not backe by facts and I would actually have some feedback on whether that is true!)

@behnam
Copy link
Author

behnam commented Nov 3, 2017

I have the feeling that most people use .contains()

Right. Most people don't need multi-line matching. But for some applications it's necessary.

Adding this to the difference crate is most likely the best way to do it

Agreed. Almost any diff library/command I know has options for ignoring newline differences, and possibly more generally, number of whitespaces (one or more). For CLI testing, I need the former, but think that some applications can use of the latter, as well.

/cc @johannhof

For now, I think I'll have to use #[cfg] to get my tests pass in Windows.

@epage
Copy link
Collaborator

epage commented Nov 4, 2017

My proposed API for this.

Output is our user-facing predicate type once #74 or #75 goes in. We could add

  • trim(bool)
    • trims value passed into verify_str
    • default value: true
  • normalize_newline(bool)
    • for the value passed into verify_str, convert CR+LF to LF
    • default value: true?

So the API would look like

Assert::main_binary()
    .stdout(contains("Music\nMagic").trim(false).normalize_newline(false))
    .unwrap()

@behnam
Copy link
Author

behnam commented Nov 4, 2017

I like the API, @epage. Just a quick reminder that at the moment, contains() doesn't trim(), but is() does. I suppose the new API will work for both, and would have the same default values, as well.

@epage
Copy link
Collaborator

epage commented Nov 4, 2017

Correct. One benefit of changing to this is it would create consistency across different predicates.

@behnam
Copy link
Author

behnam commented Nov 6, 2017

FYI, I just learned something about Rust on Windows: writeln!() and println!() explicitly say:

On all platforms, the newline is the LINE FEED character (\n/U+000A) alone (no additional CARRIAGE RETURN (\r/U+000D).

From:

So, the example I provided before, it actually does work fine in Windows. But, because my implementation depends on writeln!(), and if I fix that part (to get it work as expected on Windows), then the test will fail.

This could be the reason difference.rs doesn't deal with CR already.

@epage
Copy link
Collaborator

epage commented May 29, 2018

The responsibility for this is moving to assert_cmd and predicate-rs.

assert_cli is going to morph into a wider-focused tool, including things like assert_fs. See #41

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants