Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WrapIterImpl: add support for inputs with line breaks #123

Merged
merged 5 commits into from
Feb 20, 2018

Conversation

axelchalon
Copy link
Contributor

Closes #122
does the trick but it needs reviewing!

@mgeisler
Copy link
Owner

Hi Alex, thanks for the PR! I'm about to go to the US for 10 days so please don't be afraid if I don't get around to comment on their before February :-)

My first thought was that this is something the caller could handle by splitting the input string on newlines and then mapping fill over each line. It'll of course be less efficient, so maybe this is why you want it in line?

@mgeisler
Copy link
Owner

By the way, the build failure seems to be caused by a breaking change in a dependency - not because of your changes. I'll have to sort that out...

@axelchalon
Copy link
Contributor Author

Hi Martin! Sure, no worries!

Yes, you're right that we could split the input first and then call fill on each line; however, I think it'd be more efficient if this was taken care of inside of fill.

Also, currently, calling fill on an input with line breaks yields unexpected results (i.e. output has line breaks at what seems at first like random positions, #122), so even if we don't decide that fill preserve line breaks, then I imagine they should probably be at least stripped out so that we have a consistent result.

Have a nice trip!

@mgeisler
Copy link
Owner

mgeisler commented Feb 5, 2018

Hi Alex, I'm back again :-)

I think you're right: the current behavior is confusing... I had actually expected fill to convert \n into a space since the splitting is done on whitespace, not just on . However, as you noticed, the \n is kept in the output and so weird line breaks show up!

I've considered adding a refill method (see #60) which would try to be "intelligent" and detect paragraphs in the input text. I would expect such a function to require a blank line between paragraphs -- similarly to how Markdown separates paragraphs and how people have been formatting plaintext emails for decades. The reason: it would allow you to wrap the text nicely in your source file, while still have the text adapt to the user's terminal on display.

Do you think we should put that kind of logic into fill in all cases? That is, turn a single \n into a and turn \n\n+ into what your patch treats like a single paragraph break?

Perhaps it's better to keep fill simple and just fix the annoying line breaks... a more fancy re-wrapping refill function could then be added on top. I would like a refill function to handle things like bullet lists too (more or less the same way Emacs handles these things out of the box).

Copy link
Owner

@mgeisler mgeisler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how complex this function is already, so I tried to see if we could avoid adding more code than absolutely necessary here.

Please give it a try and also correct the other minor things. Thanks!

src/lib.rs Outdated

#[test]
fn multiline() {
assert_eq!(fill("1 3 5 7\n1 3 5 7",11),"1 3 5 7\n1 3 5 7");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you run rustfmt on the file? I believe it would insert a space after the ,. Thanks!

src/lib.rs Outdated

self.start = self.split + self.split_len;
self.line_width += wrapper.subsequent_indent.width();
self.line_width -= self.line_width_at_split;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is redundant -- in total, self.line_width becomes wrapper.subsequent_indent.width() since you set it to self.line_width_at_split above. So you can remove this line and simplify the previous line to a direct assignment:

self.line_width = wrapper.subsequent_indent.width();

src/lib.rs Outdated
fn multiline() {
assert_eq!(fill("1 3 5 7\n1 3 5 7",11),"1 3 5 7\n1 3 5 7");
assert_eq!(fill("1 3 5 7\n1 3 5 7",5),"1 3 5\n7\n1 3 5\n7");
assert_eq!(fill("1 3 5 7\n1 3 5 7",6),"1 3 5\n7\n1 3 5\n7");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little hard for me to figure out which corner-case each test is testing :-) Could you perhaps add multiple smaller tests if there is a common theme?

Here, I think a third line with

        assert_eq!(fill("1 3 5 7\n1 3 5 7", 7), "1 3 5 7\n1 3 5 7"); 

could be interesting -- it would demonstrate that the code doesn't accidentally add an extra \n.

src/lib.rs Outdated
assert_eq!(fill("test\nabcdefghi\n",11),"test\nabcdefghi\n");
assert_eq!(fill("test\nabcdefghi\n\n",11),"test\nabcdefghi\n\n");
assert_eq!(fill("test\nabcdefghijklmnopq\n\n",11),"test\nabcdefghijk\nlmnopq\n\n");
assert_eq!(fill("test\nabcdefghijklmnopq abcdefghijk\n\n",11),"test\nabcdefghijk\nlmnopq\nabcdefghijk\n\n");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length of "abcdefghijklmnopq" is 17, so I'm not actually sure if there is a relation between that number and the width of 11?


// If this is not the final line, return the current line. Otherwise,
// we will return the line with its line break after exiting the loop
if self.split + self.split_len < self.source.len() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't actually try, but would it be possible to combine the logic here with the existing logic that kicks in when the current line it too long? Maybe by changing the conditions slightly:

if ch != '\n' && is_whitespace(ch) {
    // ...
} else if ch == `\n` || self.line_width + char_width > wrapper.width {
    // ...
}

No idea if it works! :-D

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm maybe but I can't really seem to get it working this way ^^ feel free to commit if you find a way

@mgeisler mgeisler changed the title [fill] Add support for inputs with line breaks WrapIterImpl: add support for inputs with line breaks Feb 5, 2018
@axelchalon
Copy link
Contributor Author

Thanks for the feedback! I'm a bit busy at the moment but I'll get back to you in a bit

@mgeisler
Copy link
Owner

Hi @axelchalon, no stress at all! Let me know if you think my suggestions are annoying or unhelpful, in which case I'll be happy to do them myself. Otherwise just take your time to iterate on this.

I should have fixed the brokenness of master by bumping the minimum version of Rust we test against. If you rebase your branch and force push, you should see tests passing.

@axelchalon
Copy link
Contributor Author

Hey Martin! It's all good, no worries :)

I would expect such a function to require a blank line between paragraphs [...] The reason: it would allow you to wrap the text nicely in your source file, while still have the text adapt to the user's terminal on display.

Hmm, Rust already supports escaping line breaks and subsequent leading whitespace with \ at the end of a line. Also, I would expect such issues to be handled by the language itself, and I feel that libraries shouldn't make perhaps less intuitive decisions to make the language more easy-to-use? I guess we can count on the rust dev team to keep working on this?
I understand where you're coming from, but I think the library should stick to its primary purpose; I see the library more as a util library for quick text reformatting / simple text manipulation, not so much as a pretty printer / library for markdown-ey parsing of human-formatted input. As such I would expect the library to execute minimal tasks and for its functions to do a single thing (and do it well! :D). I would expect the fill function to simply wrap the input (and keep the line breaks). For example if I want to wrap the following string (the help message of a CLI option):

Set the operating mode. MODE can be one of:
    last - Uses the last-used mode, active if none;
    active - Parity continuously syncs the chain;
    passive - Parity syncs initially, then sleeps and wakes regularly to resync;
    dark - Parity syncs only when the RPC is active;
    offline - Parity doesn't sync.",

It certainly wouldn't feel intuitive for me to have to separate each line with an extra blank line. I would expect the function to simply do what it's supposed to do: wrap the lines that are too long. (Of course, we could say that this function could handle lists as well, but then again it would overreach its simple purpose / intended function.)

@mgeisler
Copy link
Owner

I understand where you're coming from, but I think the library should stick to its primary purpose; I see the library more as a util library for quick text reformatting / simple text manipulation, not so much as a pretty printer / library for markdown-ey parsing of human-formatted input.

Very good point! More elaborate wrapping logic should at the very least be put into a different function then.

@mgeisler mgeisler merged commit 157c1e7 into mgeisler:master Feb 20, 2018
@mgeisler
Copy link
Owner

Thanks a lot, I've merged the branch and will make a release in one of the next days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants