Allow using for<'a> syntax when declaring closures #3216

Merged
merged 3 commits into from
May 24, 2022
Changes from 2 commits
279 changes: 279 additions & 0 deletions text/0000-closure-lifetime-binder.md
@@ -0,0 +1,279 @@
- Feature Name: `closure_lifetime_binder`
- Start Date: 2022-01-06
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)


This RFC went through a pre-RFC phase at https://internals.rust-lang.org/t/pre-rfc-allow-for-a-syntax-with-closures-for-explicit-higher-ranked-lifetimes/15888

# Summary

Allow explicitly specifying lifetimes on closures via `for<'a> |arg: &'a u8| { ... }`. This will always result in a higher-ranked closure which can accept *any* lifetime (as in `fn bar<'a>(val: &'a u8) {}`). Closures defined without the `for<'a>` syntax retain their current behavior: lifetimes will be inferred as either some local region (via an inference variable), or a higher-ranked lifetime.

@mheiber mheiber Feb 12, 2022

Isn't such a closure just rank-1? I think the RFC and the feature great, but for docs "higher-rank" might be misleading.

If I understand correctly, for<'a> |arg: &'a u8| { ... } is the closure version of fn<'a>(arg: &'a u8) {}, which is rank 1.

When such a closure is passed to a function fn takes_closure(f: impl Fn(&u8)) then takes_closure (iuc) has a higher-ranked lifetime, but the closure itself does not.

Member Author

I've been using 'higher-ranked' to mean 'has a for<'a> binder', in the same way that for<'a> Type: Trait {} is called a 'higher-ranked trait'.

As far as I know, 'higher-ranked' in the context of Rust is taken to mean 'at least rank 1', instead of 'at least rank 2' (though I could be wrong about this).

@mheiber mheiber Feb 12, 2022

@Aaron1011 what is a higher-ranked trait? (sorry, I'm relatively new to Rust).

For higher-ranked trait bounds, I see two complete examples in the reference, and in each of these examples call_on_ref_zero looks higher-rank to me:

https://doc.rust-lang.org/stable/reference/trait-bounds.html?highlight=higher-rank#higher-ranked-trait-bounds

fn call_on_ref_zero<F>(f: F) where for<'a> F: Fn(&'a i32) {
    let zero = 0;
    f(&zero);
}

and

fn call_on_ref_zero<F>(f: F) where F: for<'a> Fn(&'a i32) {
    let zero = 0;
    f(&zero);
}

I'm using "higher-rank" in a way that I think is consistent with the higher-ranked trait bounds docs. There is a type variable not in prenex position. This is the sense in Wikipedia that, according to Types and Programming Languages, goes back to Leivant, 1983, I think in section 6.2

Contributor

Closures whose parameters have forall are called "higher-rank" because, if they were converted to a fn type, that fn type would involve a binder.

@mheiber mheiber Feb 19, 2022

fn id<T>(t: T) -> T { t } has a binder. Would you consider it higher-ranked?

(asking to try to understand better)

Comment:

@mheiber I think fn id<T>(t: T) -> T { t } in your example is not considered even rank 1 in Rust, because it's not a first-class value. Functions in Rust are monomorphised at compile time, so the polymorphism is not preserved. (If it were, it would be rank 1.) So each "type application" of your example would be rank 0, and the original polymorphic version simply wouldn't exist.

However, for<'a> is a special case, because lifetimes are equivalent at runtime anyway, so they don't need monomorphisation. That means that the function as a value is actually rank 1 polymorphic.

Comment:

I acknowledge that it may be legitimate to go and try to add a foot-note to disambiguate this, but I also reckon that, imho, it is not that useful to fully forgo that terminology, since for<'lifetime…> quantification only really appears when, at some point, a higher-ranked API is involved, which thus makes for<'lifetime…> quantifications be either:

  • directly involved in a higher-order signature, as in your call_on_ref_zero example;
  • express that a type itself is compatible with a higher-order API.

So maybe a change along the following lines would strike the right balance:

Suggested change
Allow explicitly specifying lifetimes on closures via `for<'a> |arg: &'a u8| { ... }`. This will always result in a higher-ranked closure which can accept *any* lifetime (as in `fn bar<'a>(val: &'a u8) {}`). Closures defined without the `for<'a>` syntax retain their current behavior: lifetimes will be inferred as either some local region (via an inference variable), or a higher-ranked lifetime.
Allow explicitly specifying lifetimes on closures via `for<'a> |arg: &'a u8| { ... }`. This will always result in a higher-ranked[^higher_ranked] closure which can accept *any* lifetime (as in `fn bar<'a>(val: &'a u8) {}`). Closures defined without the `for<'a>` syntax retain their current behavior: lifetimes will be inferred as either some local region (via an inference variable), or a higher-ranked lifetime[^higher_ranked].
[^higher_ranked]: technically this is a misnomer: the closures themselves are not higher-ranked, but rather, rank-1, as in _simply generic over a lifetime_. But that makes them **compatible with higher-ranked APIs**, that is, APIs that expect, themselves, a generic-over-a-lifetime callback. So these `for<'lifetime…>` callback signatures can be labelled as _higher-ranked-compatible_. Moreover, since `for<'lifetime>` quantification is only seen in such cases, then such quantification, in and of itself, ends up dubbed "higher-order" as a shorthand, and by extension, so are the closures featuring it, as well as the lifetime parameters so introduced.

@mheiber mheiber Mar 28, 2022

@danielhenrymantilla thanks for suggesting this change, but, my two cents is that it doesn't seem like an improvement to add additional concepts "higher-ranked API" and "compatibility with higher-ranked APIs."

You wrote:

"compatible with higher-ranked APIs, that is, APIs that expect, themselves, a generic-over-a-lifetime callback."

With this terminology, is takes_any a "higher-ranked API" that f is compatible with?

fn main() {
    let f = for <'_>|| 3;
    takes_any(f);
}

fn takes_any(_t: impl std::any::Any) {}

For closures that are not higher-ranked, "higher-ranked" doesn't seem like helpful terminology imo.

I'm not attached to it, but an example of an alternative is "closures with lifetime parameters".

@golddranks golddranks Mar 28, 2022

@mheiber Oops, I had already started replying to you, because GitHub sent me an e-mail where you mentioned me, but I guess you deleted it? Let me just share this, because I already made the effort to check whether Haskell is different from Rust here, and it turns out it is:

Rust doesn't consider polymorphic types as first-class. They are "proper" types that can have values only after applying the type. This Rust playground demonstrates this; a, which is supposed to be a variable of type of function id, gets initiated with a &str, and doesn't accept an integer after that. (The same happens when you replace a with a closure: let a = |t| t;)
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0791da7525a893979bc8d97ed78e4f8b

Instead, in Haskell, one can do this: (pardon my Haskell, def. not an expert) let id = \t -> t in let a = id in (a "hello", a 4) which works fine. (Apparently https://www.tryhaskell.org doesn't support sharing code examples, but copy-pasting from here works.)

You were absolutely correct in that monomorphisation is just an implementation strategy, but clearly Rust has let it also affect the semantics of the language. To be sure, I think it would be feasible to support "proper" rank-1 types for other than lifetime parameters; this seems feasible for types that don't carry the data of the polymorphic parameter with them, so you can monomorphise at usage site, depending on the usage. The prominent example is having generic closures: e.g. for<D: Debug> |d: D| println!("{:?}", d); I think this isn't considered a priority at the moment, but more like "a nice to have some day".

Also, I think that "closures with lifetime parameters" is a more helpful term indeed than "higher ranked", which is jargony, unclear and possibly plain out wrong here.

Comment:

I disagree with "closures with lifetime parameters" terminology, at least, when phrased like that, since it is waaaay too loose of a terminology to help anybody —the horribly unreadable-for-beginners error "one type is more general than the other" stems from this kind of loose "there is a generic lifetime param somewhere" terminology, that then needs to say "whoops, your generic lifetime param is not general enough". Maybe we could add the "for-quantified" nuance to these "lifetime parameters", I don't know, what I do know is that something needs to mention that in:

struct Foo<'a>(*mut Self);

impl<'a> Foo<'a> {
    fn call (_: &'a ()) {}
}

struct Bar(*mut Self);

impl Bar {
    fn call<'a> (_: &'a ()) {}
}

both Foo::call and Bar::call are "(stateless-)closures with a lifetime parameter, 'a", and yet only Bar::call is for-quantified / generically callable (whereas Foo::call is a generic [Foo's] callable).

In other words, there has to be a specific terminology for late-bound lifetimes (there we go, another term; see footnote 1), to distinguish them from generics from an outer scope. Moreover, HRTB / Higher-Rank Trait Bounds is a phrasing that is already part of the language (from 1.3.0 to now).

So "closure that meets/implements a Higher-Rank Trait bound" is official Rust terminology, I think we are way past changing that. Granted, the fact that we then say "higher-rank closure" as a shorthand may be confusing for the more type-theory rigorous people (at least based on this very thread), hence the suggestion of a footnote to soothe that aspect.

Footnotes

  1. although there is no user-facing official documentation about early-bound vs. late-bound generic lifetimes, so as of now, this would be more confusing than anything else. But that aspect could be independently improved by the official docs, and then the RFC could replace the "higher-rank" terminology with "with late-bound generic lifetimes"? 🤷


# Motivation

There are several open issues around closure lifetimes (https://github.com/rust-lang/rust/issues/91966 and https://github.com/rust-lang/rust/issues/41078), all of which stem from type inference incorrectly choosing either a higher-ranked lifetime, or a local lifetime.

This can be illustrated in the following cases:

1. We infer a higher-ranked region (`for<'a> fn(&'a u8)`) when we really want some specific (local) region. This occurs in the following code:

```rust
fn main() {
    let mut fields: Vec<&str> = Vec::new();
    let pusher = |a: &str| fields.push(a);
}
```

which gives the error:

```
error[E0521]: borrowed data escapes outside of closure
 --> src/main.rs:3:28
  |
2 |     let mut fields: Vec<&str> = Vec::new();
  |         ---------- `fields` declared here, outside of the closure body
3 |     let pusher = |a: &str| fields.push(a);
  |                   -        ^^^^^^^^^^^^^^ `a` escapes the closure body here
  |                   |
  |                   `a` is a reference that is only valid in the closure body

The issue is that `Vec<&str>` is not higher-ranked, so we can only push an `&'0 str` for some specific lifetime `'0`. The `pusher` closure signature requires that it accept *any* lifetime, which leads to a compiler error.
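
As a hedged illustration of the non-higher-ranked behavior described above, the sketch below leaves the parameter type to inference, so the closure takes a `&'0 str` for one specific region and the push is accepted. The function name `collect_fields` and the pushed strings are made up for this example and do not come from the RFC.

```rust
// Sketch: leaving the parameter un-annotated lets inference pick a single
// specific region for `a`, instead of a higher-ranked one.
// `collect_fields` is a hypothetical helper, not part of the RFC.
fn collect_fields() -> Vec<&'static str> {
    let mut fields: Vec<&str> = Vec::new();
    {
        // No `&str` annotation here: `a` is inferred as `&'0 str`.
        let mut pusher = |a| fields.push(a);
        pusher("alpha");
        pusher("beta");
    } // `pusher` and its mutable borrow of `fields` end here.
    fields
}
```

Calling `collect_fields()` from `main` should yield both strings in push order. The trade-off is that the un-annotated form gives up the documentation value of a written-out parameter type.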

2. We infer some specific region when we really want a higher-ranked region. This occurs in the following code:

```rust
use std::cell::Cell;

fn main() {
    let static_cell: Cell<&'static u8> = Cell::new(&25);
    let closure = |s| {};
    closure(static_cell);
    let val = 30;
    let short_cell: Cell<&u8> = Cell::new(&val);
    closure(short_cell);
}
```

Comment:

FWIW, there is no need for invariance to show the problems of an inferred / non-higher-order lifetime parameter:

fn main ()
{
    let closure = |s| {
        let _: &'_ i32 = s; // type-annotations outside the param list don't help.
    };
    {
        let local = 42;
        closure(&local);
    }
    {
        let local = 42;
        closure(&local);
    }
}

fails as well.

or even shorter:

let closure = |_| ();
closure(&i32::default());
closure(&i32::default());


The above code uses `Cell` to force invariance, since otherwise, region subtyping will make this example work even without a higher-ranked region. The above code produces the following error:

```
error[E0597]: `val` does not live long enough
  --> src/main.rs:8:43
   |
4  |     let static_cell: Cell<&'static u8> = Cell::new(&25);
   |                      ----------------- type annotation requires that `val` is borrowed for `'static`
...
8  |     let short_cell: Cell<&u8> = Cell::new(&val);
   |                                           ^^^^ borrowed value does not live long enough
9  |     closure(short_cell);
10 | }
   | - `val` dropped here while still borrowed

Here, the closure gets inferred to `|s: Cell<&'static u8>|`, so it cannot accept a `Cell<&'0 u8>` for some shorter lifetime `'0`. What we really want is `for<'a> |s: Cell<&'a u8>|`, so that the closure can accept both `Cell`s.
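
For comparison, here is a minimal sketch of how the same two calls can be made to work without the proposed syntax, assuming the behavior noted elsewhere in this thread that an annotated parameter with an elided lifetime is usually inferred as higher-ranked. The function `read_both` and the closure body that reads the cell are illustrative additions, not part of the RFC.

```rust
use std::cell::Cell;

// Sketch: annotating the parameter as `Cell<&u8>` (elided lifetime) makes
// the closure accept a `Cell<&'a u8>` for any `'a`, despite invariance.
// `read_both` is a hypothetical name used only for this example.
fn read_both() -> (u8, u8) {
    let static_cell: Cell<&'static u8> = Cell::new(&25);
    let closure = |s: Cell<&u8>| -> u8 { *s.get() };
    let from_static = closure(static_cell);

    let val = 30;
    let short_cell: Cell<&u8> = Cell::new(&val);
    let from_local = closure(short_cell);

    (from_static, from_local)
}
```

This relies on inference picking the higher-ranked option, which is exactly the kind of implicit decision the `for<'a>` syntax would make explicit.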

It might be possible to create an 'ideal' closure lifetime inference algorithm, which always correctly decides between either a higher-ranked lifetime, or some local lifetime. Even if we were to implement this, however, the behavior of closure lifetimes would likely remain opaque to the majority of users. By allowing users to explicitly 'desugar' a closure, we can make it easier to teach how closures work. Users can also take advantage of the `for<>` syntax to explicitly indicate that a particular closure is higher-ranked - just as they can explicitly provide a type annotation for the parameters and return type - to improve the readability of their code.

Contributor

(Side note, I do believe this ideal closure inference is possible-- I hope to float a design soon that enables exactly this to the nascent types team.)


Additionally, the Rust compiler currently accepts the following trait impls (and may eventually do so without any warnings):

```rust
trait Trait {}
impl<T> Trait for fn(&T) { }
impl<T> Trait for fn(T) { }
```

See https://github.com/rust-lang/rust/pull/72493#issuecomment-633307151

These impls are accepted because `for<'a> fn(&'a T)` and `fn(T)` are distinct types. While this does not *directly* apply to closures, closures *can* be cast to function pointers, which will have a different impl of `Trait` apply depending on whether they contain a higher-ranked lifetime parameter. Thus, the closure lifetimes inferred by the compiler can end up influencing what code is executed at runtime (provided that the user inserts the necessary cast to the correct function pointer type). While this is definitely an unusual case, it highlights the subtlety of lifetimes. Allowing greater control over how closure lifetimes are determined will allow users to better understand and control the behavior of their code in unusual situations like this one.

# Guide-level explanation

When writing a closure, you will often take advantage of type inference to avoid the need to explicitly specify types. For example:

```rust
fn func(_: impl Fn(&i32) -> &i32) {}

fn main() {
    func(|arg| { arg });
}
```

Here, the type of `arg` will be inferred to `&i32`, and the return type will also be `&i32`. We can write this explicitly:

```rust
fn func(_: impl Fn(&i32) -> &i32) {}

fn main() {
    func(|arg: &i32| -> &i32 { arg });
}
```

Notice that we've *elided* the lifetime in `&i32`. When a lifetime is written this way, Rust will infer its value based on how it's used.

In this case, our closure needs to be able to accept an `&i32` with *any* lifetime. This is because our closure needs to implement `Fn(&i32) -> &i32` - this is syntactic sugar for `for<'a> Fn(&'a i32) -> &'a i32`.

We can make this explicit by writing our closure in the following way:

```rust
fn func(_: impl Fn(&i32) -> &i32) {}

fn main() {
    func(for<'a> |arg: &'a i32| -> &'a i32 { arg });
}
```

This indicates to both the compiler and the user that this closure can accept an `&i32` with *any* lifetime, and returns an `&i32` with the same lifetime.
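
To make the desugaring concrete, the following sketch (which avoids the proposed closure syntax) shows that the elided bound and the explicit `for<'a>` bound accept the same closure, and that the closure is called with references of two different lifetimes. The names `call_sugared`, `call_desugared`, and `demo` are hypothetical and exist only for this example.

```rust
// Sketch: `Fn(&i32) -> &i32` with elided lifetimes...
fn call_sugared(f: impl Fn(&i32) -> &i32, value: &i32) -> i32 {
    *f(value)
}

// ...and the same bound with the lifetime binder written out.
fn call_desugared<F>(f: F, value: &i32) -> i32
where
    F: for<'a> Fn(&'a i32) -> &'a i32,
{
    *f(value)
}

// Hypothetical driver: the same closure shape works with a longer-lived
// and a shorter-lived reference, since it must accept any lifetime.
fn demo() -> (i32, i32) {
    let outer = 1;
    let a = call_sugared(|arg| arg, &outer);
    let b = {
        let inner = 2;
        call_desugared(|arg| arg, &inner)
    };
    (a, b)
}
```

Here the closure signature is deduced from the bound on the callee, which is why no annotation is needed at the call site.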

However, there are cases where a closure *cannot* accept any lifetime - it can only accept some particular lifetime. Consider the following code:

```rust
fn main() {
    let mut values: Vec<&bool> = Vec::new();
    let first = true;
    values.push(&first);

    let mut closure = |value| values.push(value);
    let second = false;
    closure(&second);
}
```

In this code, `closure` takes in an `&bool`, and pushes it to `values`. However, `closure` *cannot* accept an `&bool` with *any* lifetime - it can only work with some specific lifetime. To see this, consider this slight modification of the program:

```rust
fn main() {
    let mut values: Vec<&bool> = Vec::new();
    let first = true;
    values.push(&first);

    let mut closure = |value| values.push(value);
    { // This new scope was added
        let second = false;
        closure(&second);
    } // The scope ends here, causing `second` to be dropped
    println!("Values: {:?}", values);
}
```

This program fails to compile:

```
error[E0597]: `second` does not live long enough
  --> src/main.rs:9:17
   |
9  |         closure(&second);
   |                 ^^^^^^^ borrowed value does not live long enough
10 |     }
   |     - `second` dropped here while still borrowed
11 |     println!("Values: {:?}", values);
   |                              ------ borrow later used here

This is because `closure` can only accept an `&bool` with a lifetime that lives at least as long as `values`. If this code were to compile (that is, if `closure` could accept a `&bool` with the shorter lifetime associated with `&second`), then `values` would end up containing a reference to the freed stack variable `second`.

Since `closure` cannot accept *any* lifetime, it cannot be written as `for<'a> |value: &'a bool| values.push(value)`. It's natural to ask - how *can* we write down an explicit lifetime for `value: &bool`?

Unfortunately, Rust does not currently allow the signature of such a closure to be written explicitly. Instead, you must rely on type inference to choose the correct lifetime for you.
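
Although the closure's own signature cannot be written down, a function that *receives* such a closure can name the single lifetime involved. The sketch below is an illustration under that assumption: `push_with` and `collect_flags` are hypothetical names, and the bound `FnMut(&'v bool)` is deliberately *not* higher-ranked.

```rust
// Sketch: `F` only has to accept `&'v bool` for the one lifetime `'v`
// chosen by the caller, not for every possible lifetime.
fn push_with<'v, F>(mut f: F, value: &'v bool)
where
    F: FnMut(&'v bool),
{
    f(value);
}

// Hypothetical driver mirroring the `values` example above.
fn collect_flags() -> Vec<bool> {
    let first = true;
    let second = false;
    let mut values: Vec<&bool> = Vec::new();
    values.push(&first);
    // The closure's parameter type is deduced from the `FnMut(&'v bool)` bound.
    push_with(|value| values.push(value), &second);
    values.iter().map(|flag| **flag).collect()
}
```

A closure that stores its argument, like this one, would not satisfy a `for<'a> FnMut(&'a bool)` bound, which is the distinction the surrounding text describes.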

Comment:

Could you do something like

'a: {

    let mut values: Vec<&'a bool> = Vec::new();
    let first = true;
    values.push(&first);
    let mut closure = |value: &'a bool| values.push(value);
    // ...
}

Comment:

On nightly anyway

Member Author

I think it would be very surprising to have 'a create both a block label and a lifetime - currently, it's always either one or the other.

Additionally, this would require the closure to be declared inside a new block, which could force the user to refactor their code to avoid temporaries being dropped too early.

Comment:

> I think it would be very surprising to have 'a create both a block label and a lifetime - currently, it's always either one or the other.

I actually think it's surprising that 'a isn't a lifetime in that snippet, especially since the label uses such similar syntax to a lifetime.

It's also surprising to me that if you have

let x: i64 = 0;
let y: &i64 = &x;

you can't actually write the type of y as &'X i64 because the lifetime 'X isn't nameable.

But that's probably getting a little off topic.

Member Author

> I actually think it's surprising that 'a isn't a lifetime in that snippet, especially since the label uses such similar syntax to a lifetime.

For that reason, I think the current labelled block syntax is a bad idea - it suggests a connection where none exists. But as you said, this is getting off-topic.

I've written this RFC to avoid constraining our options for the syntax of non-higher-ranked closures, so we don't have to come to a decision now.

Contributor

The original reason to use 'a as the labeled block was precisely so that it could become a lifetime. That said, I've been having some "second thoughts" about the formulation of lifetimes vs origins (see my Rust Belt Rust talk for more details), so I'd be reluctant to move forward on that point right now.


# Reference-level explanation

Contributor

I am wondering if it makes sense to use this RFC as an opportunity to better document how the inference works today. Perhaps it would be better to do that as PRs on the Rust reference.


We now allow closures to be written with a `for<'a .. 'z>` prefix, where `'a .. 'z` is a comma-separated sequence of zero or more lifetimes. The syntax is parsed identically to the `for<'a .. 'z>` in the function pointer type `for<'a .. 'z> fn(&'a u8, &'b u8) -> &'a u8`.
This can be used with or without the `move` keyword:

`for<'a .. 'z> |arg1, arg2, ..., argN| { ... }`
`for<'a .. 'z> move |arg1, arg2, ..., argN| { ... }`

When this syntax is used, any lifetimes specified with the `for<>` binder are always treated as higher-ranked, regardless of any other hints we discover during type inference. That is, a closure of the form `for<'a, 'b> |first: &'a u8, second: &'b bool| -> &'b bool` will have a compiler-generated impl of the form:

```rust
impl<'a, 'b> FnOnce(&'a u8, &'b bool) -> &'b bool for [closure type] { ... }
```
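
Stable Rust cannot spell out the generated impl, but as a loose analogy, a non-capturing closure can be coerced to a function pointer type that carries the same `for<'a, 'b>` binder. The sketch below assumes that coercion and uses a hypothetical function `second_of`; it is meant to illustrate the shape of the signature, not how the compiler implements the feature.

```rust
// Sketch: the function pointer type carries the same binder and signature
// as the closure in the text. `second_of` is a hypothetical name.
fn second_of(first: u8, second: bool) -> bool {
    // The closure signature is taken from the annotated pointer type.
    let f: for<'a, 'b> fn(&'a u8, &'b bool) -> &'b bool = |_first, second| second;

    let a = first;
    let out = {
        // `b` lives in a shorter scope than `a`; both calls type-check
        // because `'a` and `'b` are chosen independently per call.
        let b = second;
        *f(&a, &b)
    };
    out
}
```

Because the two lifetimes are bound separately, the returned reference is tied only to the second argument.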

Using this syntax requires that the closure signature be fully specified, without any elided lifetimes or implicit type inference variables. For example, all of the following closures do **not** compile:

```rust
for<'a> |elided: &u8, specified: &'a bool| -> () {}; // Compiler error: lifetime in &u8 not specified
for<'b> || {}; // Compiler error: return type not specified
for<'c> |elided_type| -> &'c bool { elided_type }; // Compiler error: type of `elided_type` not specified
for<> || {}; // Compiler error: return type not specified
```

This restriction allows us to avoid specifying how elided lifetimes should be treated inside a closure with an explicit `for<>`. We may decide to lift this restriction in the future.

Member

This seems relatively easy to design up front (i.e., follow the rules for lifetime-generic functions with elided lifetimes). Unless there are difficult questions to be answered, it seems better to discuss this at the design stage than to kick it down the road and add another rough edge.

@danielhenrymantilla danielhenrymantilla Jan 10, 2022

> follow the rules for lifetime-generic functions with elided lifetimes

Well, the issue then would be that, while such an approach would favor fully higher-order signatures, we'd also have the question of the hybrid ones.

let ft = 42_u8;
// The outer lifetime parameter in `elem` can be higher-order, but not the inner one.
let f = for<'s> |x: &'s str, y: &'s str, elem: &mut &u8| -> &'s str {
    *elem = &ft;
    if x.len() > y.len() { x } else { y }
};

So I don't personally think it is that easy; there is currently no way to favor some use cases without hindering others. So the best thing, right now, would be to "equally hinder all the ambiguous ones", by conservatively denying them, and see what is done afterwards.

FWIW, some kind of designator for 'in-ferred lifetimes, such as 'in, or, say, '? could be added, I think:

let ft = 42_u8;
let f = for<'s> |x: &'s str, y: &'s str, elem: &mut &'? u8| -> &'s str {
    *elem = &ft;
    if x.len() > y.len() { x } else { y }
};
  • (Or '_?). And '* / '_* for a disambiguated higher-order elided lifetime parameter?

But I also agree that some of these "left for the future" questions can end up taking years to be resolved, for something that doesn't warrant that much thinking, just because the primitive feature already lifted most of the usability pressure off it (I'm, for instance, thinking of turbofish not having been usable for APIT functions).


Additionally, this syntax is currently incompatible with async closures:

```rust
for<'a> async |arg: &'a u8| -> () {}; // Compiler error: `for<>` syntax cannot be used with async closures
for<'a> async move |arg: &'a u8| -> () {}; // Compiler error: `for<>` syntax cannot be used with async closures
```

This restriction may be lifted in the future, but the interactions between this feature and the `async` desugaring will need to be considered.

# Drawbacks

This slightly increases the complexity of the language and the compiler implementation. However, the syntax introduced (`for<'a>`) can already be used in both trait bounds and function pointer types, so we are not introducing any new concepts in the language.

Previously, we only allowed the `for<>` syntax in a 'type' position: function pointers (`for<'a> fn(&'a u8)`) and higher-ranked trait bounds (`where for<'a> T: MyTrait<'a>`). This RFC requires supporting the `for<>` syntax in an 'expression' position as well (`for<'a> |arg: &'a u8| { ... }`). While there should be no ambiguity in parsing, crates that handle parsing Rust syntax (e.g. `syn`) will need to be updated to support this.

Aaron1011 marked this conversation as resolved.

Comment:

Minor typo: thw -> the


In its initial form, this feature may be of limited usefulness - it can only be used with closures that have all higher-ranked lifetimes, prevents type elision from being used, and does not provide a way of explicitly indicating *non*-higher-ranked lifetimes. However, this proposal has been explicitly designed to be forwards-compatible with such additions. It represents a small, (hopefully) uncontroversial step towards better control over closure signatures.

# Rationale and alternatives

* We could use a syntax other than `for<>` for binding lifetimes - however, this syntax is already used, and has the same meaning here as it does in the other positions where it is allowed.
* We could allow mixing elided and explicit lifetimes in a closure signature - for example, `for<'a> |first: &'a u8, second: &bool|`. However, this would force us to commit to one of two options for the interpretation of `second: &bool`:

1. The lifetime in `&bool` continues to be inferred as it would be without the `for<'a>`, and may or may not end up being higher-ranked.
2. The lifetime in `&bool` is always *non*-higher-ranked (we create a region inference variable). This would allow for solving the closure inference problem in the opposite direction (a region is inferred to be higher-ranked when it really shouldn't be).

Contributor

FWIW, I think the second alternative is preferable, since it would also be consistent with my suggestion rust-lang/rust#42868 (comment) for supporting for<...> parameters on function items.

Member Author

The main issue with this approach is that it would require us to either:

  1. Make a breaking change to closure inference, since |val: &i32| is (usually) higher-ranked
  2. Change the behavior of &T (with an elided lifetime) depending on whether or not a for<> binder is present, which would be inconsistent with function pointers (for<'a> fn(&'a u8, &u8) is equivalent to for<'a, 'b> fn(&'a u8, &'b u8))

Comment:

In the #42868 suggestion, elided function argument lifetimes are still late-bound, like they are today, correct? If there was a way to indicate non-higher-ranked lifetimes on closures, that would open up a third alternative: Make elided lifetimes in the closure argument list higher-ranked, like function args today. (Also discussed in the pre-rfc thread. This RFC is future-compatible with that route as far as I can tell.)

@danielhenrymantilla danielhenrymantilla Jan 7, 2022

> (usually) higher-ranked

Do you have an example where it's not always the case? EDIT: found one: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=04bd8fac30f256406e9d7efc82f8448a


These options are mutually exclusive. By banning this ambiguous case altogether, we can allow users to begin experimenting with the (limited) `for<>` closure syntax, and later reach a decision about how (or not) to explicitly indicate non-higher-ranked regions.

Contributor

There is a third alternative, which would be to treat the signature exactly how we would treat it if it appeared in a fn. In that case, the lifetime would be higher-ranked. Can you add that, @Aaron1011?


* We could try to design a 'perfect' or 'ideal' closure region inference algorithm that always correctly chooses between a higher-ranked and non-higher-ranked region, eliminating the need for users to explicitly specify their choice. Even if this is possible and easy to implement, there's still value in allowing closures to be explicitly desugared for teaching purposes. Currently, function definitions, function pointers, and higher-ranked trait bounds (e.g. `Fn(&u8)`) can all have their lifetimes (mostly) manually desugared - however, closures do not support this.
* We could do nothing, and accept the status quo for closure region inference. Given the number of users that have run into issues in practice, this would mean keeping a fairly significant wart in the Rust language.

# Prior Art

I previously discussed this topic in Zulip: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Explicit.20closure.20lifetimes

The `for<>` syntax is used with function pointers (`for<'a> fn(&'a u8)`) and higher-ranked trait bounds (`fn bar<T>() where for<'a> T: Foo<'a> {}`)

I'm not aware of any languages that have anything analogous to Rust's distinction between higher-ranked and non-higher-ranked lifetimes, let alone an interaction with closure/lambda type inference.

# Unresolved questions

None at this time

# Future possibilities

We could allow a lifetime to be explicitly indicated to be *non*-higher-ranked. The `'_` lifetime could be given special meaning in closures - for example, `for<'a> |first: &'a u8, second: &'_ bool| {}` could be used to indicate a closure that takes in a `&u8` with any lifetime, and an `&bool` with some specific lifetime. However, we already accept `|second: &'_ bool| {}` as a closure, so this would require changing the behavior of `&'_` when a `for<>` binder is present.

## Appendix: Late-bound regions, early-bound regions, and region variables


There are three 'kinds' of lifetimes we need to consider for closures:

1. Late-bound lifetimes (also referred to as higher-ranked lifetimes). These lifetimes
can be written in function pointers using the `for<>` syntax (e.g. `for<'a> fn(&'a u8) -> &'a u8`).
When a lifetime is used in a function argument without any other 'restrictions' (see below), then the corresponding function pointer type will have a late-bound lifetime. For example, the function `fn bar<'a>(val: &'a u8) {}` can be cast to the function pointer type `for<'a> fn(&'a u8)`
2. Early-bound lifetimes. A lifetime becomes early-bound when it is 'constrained' in some way that prevents us from writing down the necessary bounds with a `for<>` binder. For example, the function `fn foo<'a>(val: &'a u8) where &'a u8: MyTrait<'a> {}` will have an early-bound lifetime `'a`, since we cannot write a function pointer type with a 'higher-ranked bound' like `for<'a> fn(&'a u8) where &'a u8: MyTrait<'a>`.
3. Region variables. This corresponds to some particular region in the enclosing function body, and cannot be explicitly named by the user. This exact region is inferred by the compiler based on the closure usage. For example:

```rust
fn main() {
    let mut values: Vec<&bool> = Vec::new();
    let first = true;
    values.push(&first);

    let mut closure = |value| values.push(value);
    let second = false;
    closure(&second);
}
```


Here, the closure stored in variable `closure` takes in an argument of type `&'0 bool`, where `'0` is some region variable. The closure *cannot* accept a `&bool` with *any* lifetime - only lifetimes that live at least as long as `'0`.

This RFC is only concerned with higher-ranked (late-bound) lifetimes and region variables.
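
As a small illustration of the late-bound case from item 1, the sketch below coerces a function item to a `for<'a>` function pointer and calls it with references of different lifetimes. Unlike the `bar` in the text, this variant returns the pointed-to value so the result can be observed; `call_twice` is a hypothetical name.

```rust
// Variation on the `bar` from the text: `'a` appears only in an argument
// with no extra constraints, so it is late-bound.
fn bar<'a>(val: &'a u8) -> u8 {
    *val
}

// Hypothetical driver: one pointer value serves both lifetimes.
fn call_twice() -> (u8, u8) {
    let ptr: for<'a> fn(&'a u8) -> u8 = bar;
    let long_lived = 1;
    let a = ptr(&long_lived);
    let b = {
        let short_lived = 2;
        ptr(&short_lived)
    };
    (a, b)
}
```

The lifetime is chosen afresh at each call through `ptr`, which is what "late-bound" refers to.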

See https://rustc-dev-guide.rust-lang.org/early-late-bound.html#early-and-late-bound-variables and https://rust-lang.github.io/rfcs/0387-higher-ranked-trait-bounds.html#distinguishing-early-vs-late-bound-lifetimes-in-impls for more discussion about early-bound vs late-bound regions.