Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make bool an enum #348

Open
aturon opened this issue Oct 2, 2014 · 38 comments
Open

Make bool an enum #348

aturon opened this issue Oct 2, 2014 · 38 comments
Labels
A-enum Enum related proposals & ideas A-primitive Primitive types related proposals & ideas postponed RFCs that have been postponed and may be revisited at a later time. T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@aturon
Copy link
Member

aturon commented Oct 2, 2014

Currently, bool is not an enum, but there's no strong reason for that state of affairs. Making it an enum would increase overall consistency in the language.

This is a backwards-compatible change (depending on the specifics) and has been postponed till after 1.0.

See #330.

@aturon aturon added the postponed RFCs that have been postponed and may be revisited at a later time. label Oct 2, 2014
withoutboats pushed a commit to withoutboats/rfcs that referenced this issue Jan 15, 2017
@petrochenkov petrochenkov added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jan 19, 2018
@Centril
Copy link
Contributor

Centril commented Aug 23, 2018

Whatever happened to this? =P

(is it still backwards compatible?)

@Centril Centril added A-enum Enum related proposals & ideas A-primitive Primitive types related proposals & ideas labels Nov 26, 2018
@SimonSapin
Copy link
Contributor

We document:

https://doc.rust-lang.org/1.34.0/std/primitive.bool.html

If you cast a bool into an integer, true will be 1 and false will be 0.

Casting here likely refers to the as operator.

However as far as I can tell we currently make no promise about the memory representation of bool (other than size_of being 1 byte). We may want to keep it that way, for #954.

If bool is a primitive rather an enum, we can (more easily) special-case its casting behavior in the language.

@oberien
Copy link

oberien commented May 20, 2019

Wouldn't the casting behaviour and guaranteed size_of still be possible with repr(u8)? (even though that might be a problem with #954)

#[repr(u8)]
enum bool {
    false = 0,
    true = 1,
}

@gorilskij
Copy link

The problem with representing bool as an enum is that it's either backwards-compatible (bool, true, false) or it conforms to the CamelCase naming convention for enums defined in RFC #430 (Bool, True, False), not both.

@RustyYato
Copy link

@gorilskij I don't see how this is a big issue. Obviously we have to preserve true and false and be inconsistent with CamelCase
How is this that bad though?

@gorilskij
Copy link

gorilskij commented Jul 11, 2019

@KrishnaSannasi I would tend to agree, both for compatibility and because bool is a basic enough type to warrant a lower case name. It's not bad, just unpleasantly inconsistent, especially in a language as relatively polished as Rust. It's not nice to have a builtin type that, were you to write it yourself, would raise a compiler warning.

There is another, more serious point. As proven by the fact that false is 0 and true is 1, bool is more than just a simple 2-variant enum, it has a u1 kind of numeric feeling to it, logical operations have a numerical operation feeling to them as well as evidenced by some of the notation (&& ~= *, || ~= +, ...) not to mention the fact that all those operations as well as the if/else construct depend on the bool type, it would be possible to do logic without all of that but with how fundamentally useful it is, I think it (bool and friends) deserves it's special status in Rust as in the vast majority of other languages.

As an example to illustrate what I mean, Haskell implements Bool as a true 2-variant type and defines the infix operators (&&, ||, ...) in the standard way to go along with it (note that logical not is called not, not ! to maintain consistency because it's a prefix operator, though Haskell does still bend the rules for prefix - but that's beside the point). This, I believe, is nice and idealistic but not suited to a more practical language like Rust.

TL;DR I think making bool an enum changes nothing from the implementation perspective and goes half-way in a direction I'm not sure we should be going from the purity perspective.

@RustyYato
Copy link

RustyYato commented Jul 11, 2019

@gorilskij

There is another, more serious point. As proven by the fact that false is 0 and true is 1, bool is more than just a simple 2-variant enum, it has a u1 kind of numeric feeling to it

I would argue that a u1 is just a 2-variant enum, and that any numeric properties it has are just semantics for us, but don't really need to be encoded into it's representation.

Also a bool can be represented as @oberien showed earlier,

#[repr(u8)]
enum bool {
    false = 0,
    true = 1
}

and then special casing && and ||. The other operators (like !) can be implemented normally.

Note that there is movement to make && and || overloadable like all of the other operators. If that lands, then we can implement && and || normally for bool as well.

One more advantage to this is that it reduces the number of keywords in Rust, because bool, true and false no longer have to be keywords.

edit: bool is not a keyword, my bad

@gorilskij
Copy link

Perhaps you're right. I still think that in an ideal world it would be Bool, True and False, maybe that's something to consider for a future edition. What about if/else though? If bool is made to look like any other 2-variant enum won't it seem strange that it alone can be used in those statements? Opening up if/else to any 2-variant enum seems even more weird and magical.

@RustyYato
Copy link

What about if/else though? If bool is made to look like any other 2-variant enum won't it seem strange that it alone can be used in those statements? Opening up if/else to any 2-variant enum seems even more weird and magical.

Well, if and else are kinda special. We could define that

if $cond {
    $block
}

gets desugared to

match $cond {
    bool::true => $block,
    bool::false => (),
}

And similarly for if $cond { $block_t } else { $block_f } we could desugar to a match, this way it is still clear what is happening.

I still think that in an ideal world it would be Bool, True and False, maybe that's something to consider for a future edition

I don't think an edition can do this, it seems like to big of a breaking change.

@comex
Copy link

comex commented Jul 11, 2019

Alternately, if $cond could be desugared to if let bool::true = $cond, which is more consistent and also works for match guards (although if let match guards are not actually implemented yet).

edit: Actually, that's a bad idea because it conflicts with the proposal to make let an expression.

@gorilskij
Copy link

I understand that if and if/else can be treated as pure sugar but they are nonetheless (together with while which, as far as I can see, can't be desugared without recursion) very fundamental constructs in the language and very fundamentally depend on the bool type. Making the type an ordinary enum seems somehow dirty as it would lose it's intrinsic specialness yet retain a lot of special treatment, making it, as far as I can tell, the only enum "allowed" (without compiler complaint) to have a lowercase name seems like more of the same. I like reducing keywords and removing special behavior as much as anyone but I think this proposal does that too superficially.

@CryZe
Copy link

CryZe commented Jul 13, 2019

if, while, for and co. are already desugared to loop + match.

@SimonSapin
Copy link
Contributor

“Language items” are not dirty, they are often necessary. For example for loops need to know about the IntoIterator and Iterator traits, as well as the Option enum.

I think there is no real blocker to making bool an enum. We can make it work. But I also don’t see much benefits, so it may not be worth the bother.

@gorilskij
Copy link

gorilskij commented Jul 13, 2019

But I also don’t see much benefits, so it may not be worth the bother.

I don't see any benefits other than "it looks cleaner" which I'd argue it doesn't due to, at least, the naming convention inconsistency.

@varkor
Copy link
Member

varkor commented Jul 13, 2019

No-one is suggesting we change the name. Using an enum instead of a bool would simplify some special-casing in the compiler, which would be an advantage.

@gorilskij
Copy link

I was talking about the issue purely from a user's point of view, I have no knowledge of compiler internals (or experience in this area for that matter). Just out of curiosity, how would switching to an enum representation help? It seems to me that a primitive type would be easier to write special cases for.

(btw, if this is not the place for such discussion, tell me, I'll move it somewhere else)

@RustyYato
Copy link

RustyYato commented Jul 14, 2019

@gorilskij as @varkor said it would reduce special casing the compiler, so it would make the compiler simpler. Special casing only makes things more complex, not the other way around. For some things the complexity is justified, but with bool we could easily make it less special and thus simplify the compiler. This makes it easier to maintain and contribute to.

@gorilskij
Copy link

Oh, sorry, I misread it as "it would make it simpler to implement special casing." Thanks for clarifying.

@jhpratt
Copy link
Member

jhpratt commented Sep 13, 2019

@varkor What's the way forward on this? It seems like it's a good idea, given that it would reduce the special-casing in the compiler.

@varkor
Copy link
Member

varkor commented Sep 16, 2019

@jhpratt: there was a brief discussion on Discord about it a while ago. You'd probably want to talk to @oli-obk or @eddyb about it if you wanted to pursue it, who both had some ideas about it.

@jhpratt
Copy link
Member

jhpratt commented Dec 11, 2019

After a slight bit of discussion on Reddit, a couple things came up. Would the true and false keywords need to be removed? Intuitively, I don't think it would be possible to implement an enum bool { false, true } unless they weren't keywords (as otherwise raw identifiers would be necessary). Also, would removing keywords be backwards compatible?

I'll (finally) reach out on Discord to them to see if they have any ideas.

@varkor
Copy link
Member

varkor commented Dec 14, 2019

We can keep the keywords and have them refer to the variants of the enum. There might be some diagnostics changes to work out, but I don't think it'd be a problem.

@oli-obk
Copy link
Contributor

oli-obk commented Dec 14, 2019

I think it's mostly compiler-internal work that is blocking this. I tried it once and it's not super simple to do. Loads of code expects to know that something is a bool or make something a bool, and for this we right now have a very convenient way (tcx.types.bool). With this PR that would have to change to tcx.type_of(tcx.require_lang_item(lang_items::BoolEnum)). While we can totally abstract this to tcx.mk_bool(), it's still not the zero cost thing we had before and I think there were other inconveniences, too. So maybe there could be a preliminary PR that just removes tcx.types.bool in preparation of this change, so we can judge some of the fallout.

@petrochenkov
Copy link
Contributor

I have doubts that this "reduce special casing the compiler", bool is indeed special and built-in for the compiler in many senses.
I suspect it would be much easier to offload e.g. ! to the library (as an empty enum) than bool.

@burdges

This comment has been minimized.

@oli-obk

This comment has been minimized.

@SimonSapin
Copy link
Contributor

I think that 2014 or earlier would have been a good time for a change like this.

At this point doing it without breaking things would take non-trivial care while the benefits as far as I can tell are entirely theoretical. It would feel nice to have fewer special cases in the language, but is there any concrete benefit to users?

I think that formally deciding not to make this change (as opposed to it being merely postponed) should be an option on the table for @rust-lang/lang.

@Centril
Copy link
Contributor

Centril commented Dec 15, 2019

I would personally like to avoid considering whether bool should become a library type right now. I would also like to like to avoid making a formal decision that it should never become a library type as well. I'm personally happy with indecision for the foreseeable future. :)

@jhpratt
Copy link
Member

jhpratt commented Dec 15, 2019

@SimonSapin What's the harm in leaving this open, in case someone wanted to at least try it to see the admittedly uncertain benefits? That's what I was going to do, but @Centril said it would likely conflict with other ongoing work.

@SimonSapin
Copy link
Contributor

it would likely conflict with other ongoing work

It sounds like you answered your own question?

@SimonSapin
Copy link
Contributor

If this were a formal RFC submitted now, what should the Motivation section contain?

From the template:

Why are we doing this? What use cases does it support? What is the expected outcome?

@jhpratt
Copy link
Member

jhpratt commented Dec 15, 2019

The conflict with ongoing work would be temporary (iirc he said if-let chains), as the bits it would touch would eventually be completed. My interpretation of your original comment was a permanent (or at least indefinite) declination of the (pseudo-)RFC.

@varkor
Copy link
Member

varkor commented Dec 17, 2019

I think that formally deciding not to make this change (as opposed to it being merely postponed) should be an option on the table for @rust-lang/lang.

This is a compiler implementation detail and doesn't affect the language. The only possible user-facing change this could have is changing diagnostics, but these are easily special-cased anyway. While I don't think this ought to be a priority, if someone wants to try changing it to see whether the consequences for code clarity are advantageous, there's no reason to dissuade them (other than their time perhaps being better placed with more useful features).

@igotfr
Copy link

igotfr commented Apr 30, 2022

why not use just 1 bit for bool like ziglang?
https://ziglang.org/documentation/0.9.1/#packed-struct
"bool fields use exactly 1 bit."

@shepmaster
Copy link
Member

shepmaster commented May 1, 2022

use just 1 bit for bool

If that were the case, then you wouldn't be able to safely take a reference to the boolean. Note that the section you are referring to is titled packed struct.

@igotfr
Copy link

igotfr commented May 1, 2022

use just 1 bit for bool

If that were the case, then you wouldn't be able to safely take a reference to the boolean. Note that the section you are referring to is titled packed struct.

https://ziglang.org/documentation/0.9.1/#boolToInt
@boolToInt

@boolToInt(value: bool) u1

Converts true to @as(u1, 1) and false to @as(u1, 0).

If the value is known at compile-time, the return type is comptime_int instead of u1.

@shepmaster
Copy link
Member

If you are responding to me, it’s not clear what the argument/point of the response is.

You asked why a Boolean can’t (universally?) be represented as a single bit. The answer is that you can’t take a reference to a bit (memory addresses only have byte granularity) and values in Rust can generally be referenced.

You’ll need to clarify what point your response is trying to make.

@Kixiron
Copy link
Member

Kixiron commented May 1, 2022

For what it's worth rust actually does use the i1 LLVM type to represent booleans. Bools only taking one byte within structs is a packed structs issue, so not relevant to this issue

Reinfy added a commit to Reinfy/Pair-Extraordinaire that referenced this issue Aug 8, 2022
I have presented too much Rust fandom in the past.
Time for some hate!

(I still love Rust, but I need an archive for the things I am sad about it)

(Title is inspired by http://phpsadness.com/)

(Some of these are unpopular opinions, quite many people think differently)

## Standard library
### Primitives
#### Primitives are not really primitive
[Why is `bool` not an `enum Bool { False, True }`?](rust-lang/rfcs#348)
[Why is `str` not a `struct Str([u8]);`?](rust-lang/rfcs#2692)

#### Conversion between integers is not explicit
`as` conversions do not explicitly state whether and what is lost.
For example, `u8 as u16` is perfectly fine (expand size),
but `u16 as u8` is a truncation,
`u16 as i16` might overflow,
`u16 as f32` is perfectly fine (lossless),
`u32 as f32` is lossy,
`usize as u32` and `u64 as usize` are lossy depending on the platform...

On the other hand, while we can use `.try_into()` on the numbers,
it is unclear what the impacts are,
and handling the error is more challenging if panic is not desired.
It is difficult to find (if it even exists) the *named* safe conversion method
from the standard library documentation.

To solve this problem, I published the [xias](https://docs.rs/xias) crate,
which exposes a blanket impl on primitive types to facilitate explicitly explained conversions.

### Collections
#### `.len()` and `.is_empty()`
Clippy [recommends](https://rust-lang.github.io/rust-clippy/master/index.html#len_without_is_empty)
that types implementing `.len()` should also implement `.is_empty()`.
This much sounds like `.clone()` and `.clone_from()`, which should use a default trait impl.
Why wasn't `.len()` a trait anyway?

## General syntax
### Operators
#### The `!` operator
```rs
if !matches!(value, Enum::Variant) {
    panic!("Do you think value matches Enum::Variant or not here, \
        if you didn't know Rust?");
}
```

An alternative is to use `condition.not()`, but that requires `use std::ops::Not` everywhere.

What about `condition == false`? [Thanks so much clippy.](https://rust-lang.github.io/rust-clippy/master/index.html#bool_comparison)

#### `.await`
No, don't get me wrong, I think `future.await` is much better than `await future`.
Not a fan of cargo cult.

But `future.await` looks as if it is a field,
especially if the syntax highlighter is not aware of it
(e.g. in outdated syntax highlighting libraries).
Why can't we make it *look like* a postfix macro,
like `future.await!` or `future.await!()`?

### Pattern matching
#### `let` has mixed use in assignment and conditions
If you didn't know Rust, you're probably confused why we use `=` instead of `==` here

```rs
if let Enum::Variant = value {
    // ...
}
```

This is getting even more confusing with the let...else syntax

```rs
let Ok(socket) = Socket::bind(addr) else {
    anyhow::bail!("Cannot bind socket to {addr}");
};
```

Guess what, `expr else { ... }` itself is not a condition. You actually group it as `let pat = expr` `else` `{...}` instead.

#### Ambiguous new bindings
```rs
pub fn validate(input_byte: u8) -> u8 {
    const FORBIDDEN: u8 = 42;
    
    match input_byte {
        FORBIDDEN => panic!("input byte cannot be {FORBIDDEN}"),
        value => return value,
    }
}

fn main() {
    dbg!(validate(41));
    dbg!(validate(42));
}
```

```
[src/main.rs:11] validate(41) = 41
thread 'main' panicked at 'input byte cannot be 42', src/main.rs:5:22
```

`FORBIDDEN` is a constant but `value` is a new binding? how is this even supposed to work?

#### Use of `|` in patterns
```rs
fn convert_bitmask(input: u8) -> Option<bool> {
    match x {
        Some(1 | 2 | 4) => Some(true),
        Some(..8) => Some(false),
        _ => None,
    }
}
```

Didn't learn Rust before? Guess what the above means:

- "returns true if x is 1 or 2 or 4"
- "returns true if x is 7 (1 bitwise-OR 2 bitwise-OR 4)"

#### `_` and `_var` mean different things
You may have noticed that you can suppress the unused variable warning with both `let _ =` and `let _var =`.
Golang prefers the former (the latter doesn't work in Go), but they actually mean different things in Rust.

`_var` is just a normal binding name (local variable name in the case of `let _var =`),
similar to `var`, except it suppresses some warnings about unused identifiers.

`_` is the pattern that matches everything and binds the value to nothing.
Since there is no binding, the value is dropped immediately.

This causes a significant difference when it comes to RAII:

```rs
{
    let _guard = mutex.lock();
    println!("mutex is still locked here");
}
println!("mutex is unlocked here because `_guard` is dropped");
```

```rs
{
    let _ = mutex.lock();
    println!("mutex is already unlocked here because `_` drops immediately");
}
```

Fortunately, this is usually not a problem, because it is rare to acquire an unused RAII object
(although it is still used in some rare occasions).

## Testing
### Difficulty to unit test procedural macros
Procedural macros are typically tested in the form of integration tests,
where the user writes example code to use the procedural macro
and check if it compiles correctly.
However, this means the whole program is tested together,
and breaking any single part would cause all tests to fail.
This is particularly troublesome when errors caused by macro output are hard to debug.

It doesn't make sense to unit test particular functions that generate `TokenStream` output,
because that would imply writing test cases that repeat every token in the generated output,
and modifying or even reformatting (e.g. adding trailing commas) any part of the output
would require updating all unit tests.
We want to unit test the logic to ensure that
each unit contributes what it is expected to contribute,
not to unit test the exact shape of the generated code.

Another option is to have `#[cfg(test)] #[proc_macro*] pub fn` macros that expose the internal functions,
but this suffers from two problems.
Firstly, this involves additional code to parse an additional TokenStream to serve as the input arguments,
and sometimes inapplicable when the macro generates a special, incomplete part of the syntax tree
like match arms, struct fields, trait bounds, other non-item non-statement ASTs,
or a combination of multiple ASTs (e.g. multiple non-exhaustive match arms).
Secondly, this only works when the integration test is located in the same package as the procedural macro.
This is often not the case when the procedural macro references some code in another closely coupled crate,
e.g. `derive@serde_derive::Deserialize` depends on `trait@serde::Deserialize`.
This can be solved by setting a dependent as a dev-dependency,
but such a solution would not be scalable.

### Inflexible panic tests
Unit tests currently allow `#[should_panic]` to assert that a unit test should panic.
`#[should_panic(expected = "substring")]` additionally asserts that the panic message must contain `substring`.
This is not ideal for two reasons.
Firstly, substrings may not be safe as some parts of the panic message remain untested.
Secondly, the panic message is an implementation detail of the panic but not the subject to be tested.
The goal is to test that a certain code panics for a known reason, not to test what it shows to the user.
For example, `panic!("Cannot insert entry into string map because {:?} is not one of String or &'static str", ty)`,
where `ty: std::any::TypeId`, cannot be tested properly,
because we cannot make sure that both `insert entry into the string map`
and `is not one of String or &'static str` appear in the panic message
since `TypeId as fmt::Debug` has inconsistent output
(which is perfectly fine because it should not be relied on).
This also means we cannot test which `TypeId` the panic involves
(although a crate like this should probably attempt to take the `any::type_name` for slightly more consistent output).

Coauthored-by: xqwtxon <xqwtxon@hotmail.com>
Reinfy added a commit to Reinfy/Pair-Extraordinaire that referenced this issue Aug 8, 2022
This wiki records some of my unpopular opinions.

Since this wiki is expected to be somewhat unorganized, there is no index page. See the sidebar if you want to continue browsing.

I have presented too much Rust fandom in the past.
Time for some hate!

(I still love Rust, but I need an archive for the things I am sad about it)

(Title is inspired by http://phpsadness.com/)

(Some of these are unpopular opinions, quite many people think differently)

## Standard library
### Primitives
#### Primitives are not really primitive
[Why is `bool` not an `enum Bool { False, True }`?](rust-lang/rfcs#348)
[Why is `str` not a `struct Str([u8]);`?](rust-lang/rfcs#2692)

#### Conversion between integers is not explicit
`as` conversions do not explicitly state whether and what is lost.
For example, `u8 as u16` is perfectly fine (expand size),
but `u16 as u8` is a truncation,
`u16 as i16` might overflow,
`u16 as f32` is perfectly fine (lossless),
`u32 as f32` is lossy,
`usize as u32` and `u64 as usize` are lossy depending on the platform...

On the other hand, while we can use `.try_into()` on the numbers,
it is unclear what the impacts are,
and handling the error is more challenging if panic is not desired.
It is difficult to find (if it even exists) the *named* safe conversion method
from the standard library documentation.

To solve this problem, I published the [xias](https://docs.rs/xias) crate,
which exposes a blanket impl on primitive types to facilitate explicitly explained conversions.

### Collections
#### `.len()` and `.is_empty()`
Clippy [recommends](https://rust-lang.github.io/rust-clippy/master/index.html#len_without_is_empty)
that types implementing `.len()` should also implement `.is_empty()`.
This much sounds like `.clone()` and `.clone_from()`, which should use a default trait impl.
Why wasn't `.len()` a trait anyway?

## General syntax
### Operators
#### The `!` operator
```rs
if !matches!(value, Enum::Variant) {
    panic!("Do you think value matches Enum::Variant or not here, \
        if you didn't know Rust?");
}
```

An alternative is to use `condition.not()`, but that requires `use std::ops::Not` everywhere.

What about `condition == false`? [Thanks so much clippy.](https://rust-lang.github.io/rust-clippy/master/index.html#bool_comparison)

#### `.await`
No, don't get me wrong, I think `future.await` is much better than `await future`.
Not a fan of cargo cult.

But `future.await` looks as if it is a field,
especially if the syntax highlighter is not aware of it
(e.g. in outdated syntax highlighting libraries).
Why can't we make it *look like* a postfix macro,
like `future.await!` or `future.await!()`?

### Pattern matching
#### `let` has mixed use in assignment and conditions
If you didn't know Rust, you're probably confused why we use `=` instead of `==` here

```rs
if let Enum::Variant = value {
    // ...
}
```

This is getting even more confusing with the let...else syntax

```rs
let Ok(socket) = Socket::bind(addr) else {
    anyhow::bail!("Cannot bind socket to {addr}");
};
```

Guess what, `expr else { ... }` itself is not a condition. You actually group it as `let pat = expr` `else` `{...}` instead.

#### Ambiguous new bindings
```rs
pub fn validate(input_byte: u8) -> u8 {
    const FORBIDDEN: u8 = 42;
    
    match input_byte {
        FORBIDDEN => panic!("input byte cannot be {FORBIDDEN}"),
        value => return value,
    }
}

fn main() {
    dbg!(validate(41));
    dbg!(validate(42));
}
```

```
[src/main.rs:11] validate(41) = 41
thread 'main' panicked at 'input byte cannot be 42', src/main.rs:5:22
```

`FORBIDDEN` is a constant but `value` is a new binding? how is this even supposed to work?

#### Use of `|` in patterns
```rs
fn convert_bitmask(input: u8) -> Option<bool> {
    match x {
        Some(1 | 2 | 4) => Some(true),
        Some(..8) => Some(false),
        _ => None,
    }
}
```

Didn't learn Rust before? Guess what the above means:

- "returns true if x is 1 or 2 or 4"
- "returns true if x is 7 (1 bitwise-OR 2 bitwise-OR 4)"

#### `_` and `_var` mean different things
You may have noticed that you can suppress the unused variable warning with both `let _ =` and `let _var =`.
Golang prefers the former (the latter doesn't work in Go), but they actually mean different things in Rust.

`_var` is just a normal binding name (local variable name in the case of `let _var =`),
similar to `var`, except it suppresses some warnings about unused identifiers.

`_` is the pattern that matches everything and binds the value to nothing.
Since there is no binding, the value is dropped immediately.

This causes a significant difference when it comes to RAII:

```rs
{
    let _guard = mutex.lock();
    println!("mutex is still locked here");
}
println!("mutex is unlocked here because `_guard` is dropped");
```

```rs
{
    let _ = mutex.lock();
    println!("mutex is already unlocked here because `_` drops immediately");
}
```

Fortunately, this is usually not a problem, because it is rare to acquire an unused RAII object
(although it is still used in some rare occasions).

## Testing
### Difficulty to unit test procedural macros
Procedural macros are typically tested in the form of integration tests,
where the user writes example code to use the procedural macro
and check if it compiles correctly.
However, this means the whole program is tested together,
and breaking any single part would cause all tests to fail.
This is particularly troublesome when errors caused by macro output are hard to debug.

It doesn't make sense to unit test particular functions that generate `TokenStream` output,
because that would imply writing test cases that repeat every token in the generated output,
and modifying or even reformatting (e.g. adding trailing commas) any part of the output
would require updating all unit tests.
We want to unit test the logic to ensure that
each unit contributes what it is expected to contribute,
not to unit test the exact shape of the generated code.

Another option is to have `#[cfg(test)] #[proc_macro*] pub fn` macros that expose the internal functions,
but this suffers from two problems.
Firstly, this involves additional code to parse an additional TokenStream to serve as the input arguments,
and sometimes inapplicable when the macro generates a special, incomplete part of the syntax tree
like match arms, struct fields, trait bounds, other non-item non-statement ASTs,
or a combination of multiple ASTs (e.g. multiple non-exhaustive match arms).
Secondly, this only works when the integration test is located in the same package as the procedural macro.
This is often not the case when the procedural macro references some code in another closely coupled crate,
e.g. `derive@serde_derive::Deserialize` depends on `trait@serde::Deserialize`.
This can be solved by setting a dependent as a dev-dependency,
but such a solution would not be scalable.

### Inflexible panic tests
Unit tests currently allow `#[should_panic]` to assert that a unit test should panic.
`#[should_panic(expected = "substring")]` additionally asserts that the panic message must contain `substring`.
This is not ideal for two reasons.
Firstly, substrings may not be safe as some parts of the panic message remain untested.
Secondly, the panic message is an implementation detail of the panic but not the subject to be tested.
The goal is to test that a certain code panics for a known reason, not to test what it shows to the user.
For example, `panic!("Cannot insert entry into string map because {:?} is not one of String or &'static str", ty)`,
where `ty: std::any::TypeId`, cannot be tested properly,
because we cannot make sure that both `insert entry into the string map`
and `is not one of String or &'static str` appear in the panic message
since `TypeId as fmt::Debug` has inconsistent output
(which is perfectly fine because it should not be relied on).
This also means we cannot test which `TypeId` the panic involves
(although a crate like this should probably attempt to take the `any::type_name` for slightly more consistent output).

Co-authored-by: xqwtxon <xqwtxon@hotmail.com>
Reinfy added a commit to Reinfy/Pair-Extraordinaire that referenced this issue Aug 8, 2022
This wiki records some of my unpopular opinions.

Since this wiki is expected to be somewhat unorganized, there is no index page. See the sidebar if you want to continue browsing.
I have presented too much Rust fandom in the past.
Time for some hate!

(I still love Rust, but I need an archive for the things I am sad about it)

(Title is inspired by http://phpsadness.com/)

(Some of these are unpopular opinions, quite many people think differently)

## Standard library
### Primitives
#### Primitives are not really primitive
[Why is `bool` not an `enum Bool { False, True }`?](rust-lang/rfcs#348)
[Why is `str` not a `struct Str([u8]);`?](rust-lang/rfcs#2692)

#### Conversion between integers is not explicit
`as` conversions do not explicitly state whether and what is lost.
For example, `u8 as u16` is perfectly fine (expand size),
but `u16 as u8` is a truncation,
`u16 as i16` might overflow,
`u16 as f32` is perfectly fine (lossless),
`u32 as f32` is lossy,
`usize as u32` and `u64 as usize` are lossy depending on the platform...

On the other hand, while we can use `.try_into()` on the numbers,
it is unclear what the impacts are,
and handling the error is more challenging if panic is not desired.
It is difficult to find (if it even exists) the *named* safe conversion method
from the standard library documentation.

To solve this problem, I published the [xias](https://docs.rs/xias) crate,
which exposes a blanket impl on primitive types to facilitate explicitly explained conversions.

### Collections
#### `.len()` and `.is_empty()`
Clippy [recommends](https://rust-lang.github.io/rust-clippy/master/index.html#len_without_is_empty)
that types implementing `.len()` should also implement `.is_empty()`.
This much sounds like `.clone()` and `.clone_from()`, which should use a default trait impl.
Why wasn't `.len()` a trait anyway?

## General syntax
### Operators
#### The `!` operator
```rs
if !matches!(value, Enum::Variant) {
    panic!("Do you think value matches Enum::Variant or not here, \
        if you didn't know Rust?");
}
```

An alternative is to use `condition.not()`, but that requires `use std::ops::Not` everywhere.

What about `condition == false`? [Thanks so much clippy.](https://rust-lang.github.io/rust-clippy/master/index.html#bool_comparison)

#### `.await`
No, don't get me wrong, I think `future.await` is much better than `await future`.
Not a fan of cargo cult.

But `future.await` looks as if it is a field,
especially if the syntax highlighter is not aware of it
(e.g. in outdated syntax highlighting libraries).
Why can't we make it *look like* a postfix macro,
like `future.await!` or `future.await!()`?

### Pattern matching
#### `let` has mixed use in assignment and conditions
If you didn't know Rust, you're probably confused why we use `=` instead of `==` here

```rs
if let Enum::Variant = value {
    // ...
}
```

This is getting even more confusing with the let...else syntax

```rs
let Ok(socket) = Socket::bind(addr) else {
    anyhow::bail!("Cannot bind socket to {addr}");
};
```

Guess what, `expr else { ... }` itself is not a condition. You actually group it as `let pat = expr` `else` `{...}` instead.

#### Ambiguous new bindings
```rs
pub fn validate(input_byte: u8) -> u8 {
    const FORBIDDEN: u8 = 42;
    
    match input_byte {
        FORBIDDEN => panic!("input byte cannot be {FORBIDDEN}"),
        value => return value,
    }
}

fn main() {
    dbg!(validate(41));
    dbg!(validate(42));
}
```

```
[src/main.rs:11] validate(41) = 41
thread 'main' panicked at 'input byte cannot be 42', src/main.rs:5:22
```

`FORBIDDEN` is a constant but `value` is a new binding? how is this even supposed to work?

#### Use of `|` in patterns
```rs
fn convert_bitmask(input: u8) -> Option<bool> {
    match x {
        Some(1 | 2 | 4) => Some(true),
        Some(..8) => Some(false),
        _ => None,
    }
}
```

Didn't learn Rust before? Guess what the above means:

- "returns true if x is 1 or 2 or 4"
- "returns true if x is 7 (1 bitwise-OR 2 bitwise-OR 4)"

#### `_` and `_var` mean different things
You may have noticed that you can suppress the unused variable warning with both `let _ =` and `let _var =`.
Golang prefers the former (the latter doesn't work in Go), but they actually mean different things in Rust.

`_var` is just a normal binding name (local variable name in the case of `let _var =`),
similar to `var`, except it suppresses some warnings about unused identifiers.

`_` is the pattern that matches everything and binds the value to nothing.
Since there is no binding, the value is dropped immediately.

This causes a significant difference when it comes to RAII:

```rs
{
    let _guard = mutex.lock();
    println!("mutex is still locked here");
}
println!("mutex is unlocked here because `_guard` is dropped");
```

```rs
{
    let _ = mutex.lock();
    println!("mutex is already unlocked here because `_` drops immediately");
}
```

Fortunately, this is usually not a problem, because it is rare to acquire an unused RAII object
(although it is still used in some rare occasions).

## Testing
### Difficulty to unit test procedural macros
Procedural macros are typically tested in the form of integration tests,
where the user writes example code to use the procedural macro
and check if it compiles correctly.
However, this means the whole program is tested together,
and breaking any single part would cause all tests to fail.
This is particularly troublesome when errors caused by macro output are hard to debug.

It doesn't make sense to unit test particular functions that generate `TokenStream` output,
because that would imply writing test cases that repeat every token in the generated output,
and modifying or even reformatting (e.g. adding trailing commas) any part of the output
would require updating all unit tests.
We want to unit test the logic to ensure that
each unit contributes what it is expected to contribute,
not to unit test the exact shape of the generated code.

Another option is to have `#[cfg(test)] #[proc_macro*] pub fn` macros that expose the internal functions,
but this suffers from two problems.
Firstly, this involves additional code to parse an additional TokenStream to serve as the input arguments,
and sometimes inapplicable when the macro generates a special, incomplete part of the syntax tree
like match arms, struct fields, trait bounds, other non-item non-statement ASTs,
or a combination of multiple ASTs (e.g. multiple non-exhaustive match arms).
Secondly, this only works when the integration test is located in the same package as the procedural macro.
This is often not the case when the procedural macro references some code in another closely coupled crate,
e.g. `derive@serde_derive::Deserialize` depends on `trait@serde::Deserialize`.
This can be solved by setting a dependent as a dev-dependency,
but such a solution would not be scalable.

### Inflexible panic tests
Unit tests currently allow `#[should_panic]` to assert that a unit test should panic.
`#[should_panic(expected = "substring")]` additionally asserts that the panic message must contain `substring`.
This is not ideal for two reasons.
Firstly, substrings may not be safe as some parts of the panic message remain untested.
Secondly, the panic message is an implementation detail of the panic but not the subject to be tested.
The goal is to test that a certain code panics for a known reason, not to test what it shows to the user.
For example, `panic!("Cannot insert entry into string map because {:?} is not one of String or &'static str", ty)`,
where `ty: std::any::TypeId`, cannot be tested properly,
because we cannot make sure that both `insert entry into the string map`
and `is not one of String or &'static str` appear in the panic message
since `TypeId as fmt::Debug` has inconsistent output
(which is perfectly fine because it should not be relied on).
This also means we cannot test which `TypeId` the panic involves
(although a crate like this should probably attempt to take the `any::type_name` for slightly more consistent output).

Co-authored-by: xqwtxon <xqwtxon@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-enum Enum related proposals & ideas A-primitive Primitive types related proposals & ideas postponed RFCs that have been postponed and may be revisited at a later time. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests