Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

match #34

Closed
nrc opened this issue Oct 25, 2016 · 33 comments
Closed

match #34

nrc opened this issue Oct 25, 2016 · 33 comments

Comments

@nrc
Copy link
Member

nrc commented Oct 25, 2016

basics

match $discr {
    $arm1
    $arm2
    ...
}

Spaces and newlines as shown. In particular, prefer not to line-break the discriminant if possible; each match arm must have its own line; arms are block indented.

simple arms

Short arm:

$pat if $expr => $expr,

Prefer the above form if the arm fits on one line.

Avoid single line blocks:

$pat if $expr => { $expr }   // No

The last match arm should have a comma if the body is not a block:

match $expr {
    $expr => foo,
    $expr => bar,
}

Block body:

$pat if $expr => {
    ...
}

Note no comma if using a block, prefer not to line-break pattern or if clause, block indentation of arm body.

Empty blocks:

$pat if $expr => {}

Note {} not (), no comma, no space in the block.

Question - multi-line single expressions

Should a block be required?

E.g.,

match foo {
    bar => 42,
    baz => a_very_long_expr(dsfsdfsdfa,
                            dsafdfa,
                            dsfsfdsfdsfsd),
}

// vs

match foo {
    bar => 42,
    baz => {
        a_very_long_expr(dsfsdfsdfa,
                         dsafdfa,
                         dsfsfdsfdsfsd)
    }
}

Rustfmt currently does the latter, I found that it sometimes makes it easier to read, helps prevent rightward drift, and gets around the slightly unfortunate issue of expressions with blocks needing a comma:

match foo {
    bar => 42,
    baz => if qux {
        ...
    },  // comma needed here :-(
}

I'm no longer 100% convinced this is worth the extra vertical space however. We might want to leave this question unanswered for a while since the benefit in preventing rightward drift would not be so important if we pick a style which other wise prevents it (e.g., by using two space indent, or using more block indentation):

match foo {
  bar => 42,
  baz => a_very_long_expr(dsfsdfsdfa,
    dsafdfa,
    dsfsfdsfdsfsd),
}

Breaking the arm

Prefer breaking before the body:

$pat =>
    $expr,

or

$pat => {
    ...
}

Then prefer breaking before the if (if present, see below).

Then prefer breaking the pattern at |s (if there are multiple sub-patterns, see below). If breaking the pattern, the rest of the arm should only be broken as necessary. E.g., the following are ok:

pat1 |
pat2 |
pat3 if $expr => {
    ...
}
pat1 |
pat2 |
pat3 => $expr,
pat1 |
pat2 |
pat3 if $expr => $expr,

Finally, break each pattern if necessary.

Breaking before if

If necessary to break before the if, it should be block indented, a block is required for the arm body, and the { of the block should be on a newline. This helps prevent the if clause be distinguished from the arm body. Never break after the if:

$pat
    if $expr =>
{
    ...
}

If $expr does not fit on one line, break it according to the rules for expressions.

large patterns

If using a compound pattern (pat1 | pat2), prefer to put the whole pattern on one line. If it does not fit, put each sub-pattern on a new line, with the line terminated by |:

pat1 |
pat2 |
pat3 => {
    ...
}

If any sub-pattern does not on the line, break it according to the rules for patterns.

Question - blank lines between arms

These are sometimes useful for indicating groups of arms, I propose Rustfmt leaves single blank lines where the user writes them and coalesces multiple blank lines into one. E.g.,

match $expr {
    $expr => foo,

    $expr => bar,
}

Question - consistency between arms

Should we allow a mix of block arms and single expression arms? Or should we use blocks for every arm if any arm requires it? The second alternative is more consistent, but wastes some space. It also looks odd in the relatively common case where there is a single large arm and many small arms.

Question - exhaustiveness

Should we offer any guidance about using _ in match arms? I feel there are strong idioms for using this or not depending on desired backwards compatibility. However, it may be beyond the remit of the style guide/process.

@steveklabnik
Copy link
Member

Note no comma if using a block,

I pretty much always use commas, I've found this behavior of rustfmt confusing.

Note {} not (), no comma, no space in the block.

I know I use () not {} here, since () is going to be the type/value of the return of {} anyway, so it feels more clear.

@joshtriplett
Copy link
Member

Regarding the comment about "no single-line blocks", I think single-line blocks are fine for short cases like unsafe { u.x }, per the discussion on #21. Though that effectively turns into an expression, and the block lives inside the expression rather than directly in the match, so the expression/block rules apply. Worth thinking about though.

Also, I don't know what rustfmt currently does, but in the same spirit as writing

let x = long_expr
        + long_expr_2;

(with the + at the start of the second line), I prefer to write:

match expr {
    pattern1
    | pattern2 => ...
}

@nrc
Copy link
Member Author

nrc commented Oct 25, 2016

I pretty much always use commas, I've found this behavior of rustfmt confusing.

To expand on my reasoning - I find block + comma noisey and prefer to avoid this kind of duplication where possible (by analogy to e.g., parens in if expressions and braces around an expression in match arms).

I know I use () not {} here, since () is going to be the type/value of the return of {} anyway, so it feels more clear.

My analogy here is that for a function we don't usually write -> () for a void return type, nor do we write

fn foo() {
    ()
}

So, it feels weird to explicitly write the () in a match.

I also think that the intuition of an empty match arm is "do nothing" (i.e., execute an empty block), rather than "reduce to a value which is void", in particular because it is rare to assign the result of a match with void type into a value.

I also argue by analogy to block expressions in statement position - you wouldn't write if { ... };.

Regarding the comment about "no single-line blocks" ...

Yeah, this is specifically about the arm body and whether that should be a block. I don't consider this applying to either if { ... } or unsafe { ... }, etc. I don't think you would ever need a regular block with a single expression, since there is no scoping to be affected.

Also, I don't know what rustfmt currently does, but in the same spirit as writing

Hmm, this might get into a wider discussion so it might be worth peeling off. My opinion is that I prefer binops to be on the previous line rather than the next one. I don't feel too strongly about that, but I do feel strongly that commas should follow that behaviour, and I kind of want | (and by extension all binops) to follow commas. My reasoning here is that this is the behviour of punctuation in written English, and violating this rule makes the code look more alien and slower to parse. I believe the main argument against is better diffs and possibly better scanability.

@joshtriplett
Copy link
Member

joshtriplett commented Oct 25, 2016

I agree completely regarding commas, and semicolons; those operators should fade into the background as simple terminators, though.

Regarding other binops (and I do think we should split this discussion off): I find that putting a binop at the end of the previous line makes it easier to get lost, and harder to construct the tree in your head, especially when you break an expression in multiple places. If I look at something like this:

if really_complicated_expression
   || another_complicated_expression
   || yet_another_complicated_expression

(where the complicated expressions themselves have piles of operators and calls in them), having the || at the start of the next line makes the tree of operations more obvious. This has nothing to do with diffs; I consider it a standalone readability issue. I don't want those operators to fade into the background the way commas do; I want them in the foreground.

Typically, a comma or semicolon has importance because of its presence or absence, but for most binary operators, I need to know which operator (such as && versus ||).

@steveklabnik
Copy link
Member

I find block + comma noisey and prefer to avoid this kind of duplication where possible

I find the lack of consistency more noisy than the extra character. Every time I use JSON, it trips me up that I can't include the trailing comma. There's also the usual diff arguments, etc.

I also think that the intuition of an empty match arm is "do nothing" (i.e., execute an empty block), rather than "reduce to a value which is void", in particular because it is rare to assign the result of a match with void type into a value.

This might be the root of it, yeah, I think I do think this way.

I also argue by analogy to block expressions in statement position - you wouldn't write if { ... };.

I do actually write this a lot; it took a lot of habit-breaking to get out of it. I'd probably keep it in, but this isn't an issue about that 😉

@Alex-PK
Copy link

Alex-PK commented Oct 25, 2016

There is probably a dedicated RFC, but honestly, I don't like this syntax:

match foo {
  bar => 42,
  baz => a_very_long_expr(dsfsdfsdfa,
    dsafdfa,
    dsfsfdsfdsfsd),
}

I much prefer:

match foo {
  bar => 42,
  baz => a_very_long_expr(
    dsfsdfsdfa,
    dsafdfa,
    dsfsfdsfdsfsd
  ),
}

I think it's visually clearer they are all parameters to the function, and where the list ends. I don't really mind the vertical spacing. Sometimes I even leave blank lines between conditions to make them stand out more.

I really dislike the alignment in this:

match foo {
    bar => 42,
    baz => a_very_long_expr(dsfsdfsdfa,
                            dsafdfa,
                            dsfsfdsfdsfsd),
}

@nrc
Copy link
Member Author

nrc commented Oct 25, 2016

There is probably a dedicated RFC

Yes, there will be. For the match arm formatting I don't think it matters which of those we choose, but the difference between block and visual indenting might affect our decision.

@azerupi
Copy link

azerupi commented Oct 25, 2016

Note no comma if using a block,

I pretty much always use commas, I've found this behavior of rustfmt confusing.

I too, prefer the use of commas even for arm blocks. I like the consistency and that you can visually differentiate the closing match arm from any other closing block. The comma is one of the things I use to help distinguish between arms, especially when the arms are relatively long. It would be a shame to loose that IMHO.

@nrc
Copy link
Member Author

nrc commented Oct 26, 2016

Some discussion from the Rustfmt repo: rust-lang/rustfmt#1129 and rust-lang/rustfmt#490

@jethrogb
Copy link

I often have match arms like this:

match x {
    long_const1 | long_const2 | long_const3 | long_const4 | long_const5
    | long_const6 | long_const7 | long_const8 | long_const9
    | long_const10 | long_const11 | long_const12 | long_const13
    => { ... }
}

Putting all the individual constants on a single line as proposed above takes up too much space IMO.

@chriskrycho
Copy link

@jethrogb To me, that seems like a place to extract that into a function to check those conditions, rather than requiring the formatter to support that style.

@jethrogb
Copy link

@chriskrycho I have an (error) enum with 100 variants, whatever I do, at the end there will be a match that looks like that.

@chriskrycho
Copy link

chriskrycho commented Nov 17, 2016

I don't want to further derail this thread, but I'd be inclined to write a function that did something like this, instead of that long hairy match, if you need something to handle a bunch of different cases:

fn is_error_blah(&variant: MyError) -> bool {
  let allowed_variants = [MyError:A, MyError:B, MyError:C, ...];
  allowed_variants.into_iter().any(|v| v == variant)
}

But there will inevitably be some long lists in a match, so… ¯\_(ツ)_/¯ I see your point.

@nrc
Copy link
Member Author

nrc commented Nov 29, 2016

@steveklabnik

There's also the usual diff arguments, etc.

I don't think there is a diff argument in this case - with a block you don't need a comma in any case, so adding or removing an arm has the same size diff with or without the extra comma.

@nrc
Copy link
Member Author

nrc commented Nov 29, 2016

I have an (error) enum with 100 variants, whatever I do, at the end there will be a match that looks like that.

My opinion is that there will always be edge cases where mechanical formatting does not work. Either you put such functions in a module by themselves and treat them like generated code, or you mark them with #[rustfmt_skip]. I don't think that we should let such edge cases dictate style for more common code.

@nrc
Copy link
Member Author

nrc commented Nov 29, 2016

I think this is ready for FCP. There is maybe a little more room for fine-tuning, but there hasn't been much movement recently.

@nrc nrc removed the has-PR label May 29, 2017
@seanmonstar
Copy link

I'd like to toss my opinion in about "multi-line single expressions":

The issue with turning all multi-line single expressions into a block in the arm means that a certain pattern becomes noisier:

Preferred

let enc = match subject {
    cond1 => Encoder {
        field1: 5,
        field2: cond1,
        // ...
    },
    cond2 => Encoder {
        field1: 11,
        field2: cond2,
    },
    // ... etc
}

Current Rustfmt, worse (imo)

let enc = match subject {
    cond1 => {
        Encoder {
            field1: 5,
            field2: cond1,
            // ...
        }
    }
    cond2 => {
        Encoder {
            field1: 11,
            field2: cond2,
        },
    }
    // ... etc
}

@nrc
Copy link
Member Author

nrc commented Jun 1, 2017

@seanmonstar this is covered here - #61 (comment) - and I think we are in agreement

@nielsle
Copy link

nielsle commented Jul 11, 2017

If one of the match patterns is formatted as a block, then rustfmt could prefer to format the remaining patterns as blocks. The following code doesn't look pretty

let bar  = match foo {
     Foo::X(v) => { this_expression_is_not_long_enough_to_be_formatted_as_a_block(v) },
     Foo::Y(v)  => { 
         this_expression_is_longer_so_it_is_long_enough_be_formatted_as_a_block(v)  
     },
}

This idea would break some diffs, but it could make the code look prettier - But perhaps it is better to wait and see if this idea is really needed.

@joshtriplett
Copy link
Member

@nielsle I think it's fine to have some match RHSes formatted on one line and some not. I regularly see matches with several trivial cases and then a couple of complex ones, and I'd rather not see the trivial cases pushed to multiple lines just because of a single complex case. (In addition to, as you pointed out, causing cascading changes in diffs.)

@djc
Copy link

djc commented Jul 20, 2017

I would like to argue against treating block arms and non-block arms differently in the use of commas. To me, a block is just the special case of an expression that's longer, or, more stringently, an expression which is more conveniently expressed by using a few statements. I also find the inconsistency jarring -- are commas required/preferred after block arms or not? Fragmenting this further does not make a whole lot of sense to me.

@nrc
Copy link
Member Author

nrc commented Jul 21, 2017

I wonder if we want to do something to handle cases like this: rust-lang/rustfmt#1807 (that is many small variants in a pattern)? It does seem bad to put each on their own line like this. It is not uncommon to see matches like this (but I don't think it is the common case).

@strega-nil
Copy link

So, during the discussion with the team, we came up with a grammar that, we think, encapsulates the cases where we want "mixed-line" formatting:

[single-line, ntp]:
    - single token
    - `&[single-line, ntp]`

[single-line]:
    - `[single-line, ntp]`
    - unary tuple constructor `([single-line, ntp])`
    - `&[single-line]`

Any patterns which are exclusively variants of [single-line] (i.e., Some(x) | Some(y)), where they differ only in the single token that they include (for example, Some(x) | None would be multi-line - I'm not sure of this rule, now that I say it with the example) would not be broken with one-pattern-per-line, as is the normal rule; it'd be broken with the line-width rule.

@Florob
Copy link

Florob commented Aug 18, 2017

Is there any particular rational for breaking multiple sub-patterns after the |?
I personally prefer breaking before the | by far.

The currently suggested style means my eyes have to dart back and forth between beginning and end of lines with varying length to find the last relevant pattern. It also makes missing that multiple patterns are part of the same match arm easier.

Having the | first thing on the line means I can just scan down the front of the lines to find the last relevant pattern. It is also an immediately visible indicator that the previous pattern was also relevant to the arm.

@strega-nil
Copy link

For the right hand side of match arms, I think that a rule like

If an arm is long enough to be broken across multiple lines, it should use block syntax.

would be good.

@strega-nil
Copy link

(I'd also argue for blocks everywhere, because I dislike non-block RHSes, but that's likely to be unpopular)

@Alex-PK
Copy link

Alex-PK commented Aug 18, 2017

@Florob if you mean something like this:

match &*ext.to_lowercase() {
    "bmp" | "gif" | "ico" | "jpe" | "jpeg" | "jpg" | "png" | "svg" | "tif" | "tiff"
    | "webp" | "xcf" | "psd" | "ai" 
    => FileType::Image,

    "3g2" | "3gp" | "avi" | "divx" | "flv" | "mov" | "mp4" | "mp4v" | "mpa" | "mpe"
    | "mpeg" | "ogv" | "qt" | "webm" | "wmv"
    => FileType::Video,

... I agree.
Putting them each on a single line is cumbersome and hinders visibility.

Keeping the operator on the left helps a lot in understanding what's going on, with a block too:

match &*ext.to_lowercase() {
    "bmp" | "gif" | "ico" | "jpe" | "jpeg" | "jpg" | "png" | "svg" | "tif" | "tiff"
    | "webp" | "xcf" | "psd" | "ai" 
    => {
        FileType::Image
    },

    "3g2" | "3gp" | "avi" | "divx" | "flv" | "mov" | "mp4" | "mp4v" | "mpa" | "mpe"
    | "mpeg" | "ogv" | "qt" | "webm" | "wmv"
    => {
        FileType::Video
    },

(Yes, I also prefer to have a comma after the blocks :P )

@Florob
Copy link

Florob commented Aug 18, 2017

@Alex-PK That is probably another case where it makes sense. My thinking was more along the lines of:

match … {
    Var(ref inner) |
    IAmAVariant(ref inner) |
    Pat(ref inner) |
    PataPattern(_, _, ref inner) => …
    NextPattern => …
}

vs.

match … {
    Var(ref inner)
    | IAmAVariant(ref inner)
    | Pat(ref inner)
    | PataPattern(_, ref inner, _) => …
    NextPattern => …
}

I find the later much either to scan through.

@Alex-PK
Copy link

Alex-PK commented Aug 18, 2017

Yes, I agree.

As I said, I would also put the arrow on a new line too, as it's even easier to read/find the proper action performed.

@joshtriplett
Copy link
Member

@Florob I didn't realize we broke after the |; I would prefer before, as well.

(Closely related: while I don't think we want rustfmt introducing rustc version compatibility issues by using leading |, we should make sure rustfmt can handle it when rustc can.)

@Florob
Copy link

Florob commented Aug 18, 2017

@joshtriplett I am rather confused by your parenthetical. rustc handles leading | just fine and has done so since before 1.0 as far as I know. Since Rust is not whitespace sensitive I'd be surprised to learn this isn't the case.

@joshtriplett
Copy link
Member

@ricvelozo
Copy link

Wildcard patterns must to be the last anyways, so should be no comma in this case?

match f(t) {
    y if y > 400.0 || y.is_nan() => println!("Too large"),
    y => println!("{}", y) // no comma here
}

match f(t) {
    y if y > 400.0 || y.is_nan() => println!("Too large"),
    _ => println!("wildcard") // no comma here
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests