Unbox almost all the closures #19467

japaric · 2014-12-02T21:17:31Z

This PR moves almost all our current uses of closures, both in public API and internal uses, to the new "unboxed" closures system.

In most cases, downstream code that only uses closures will continue to work as is. The reason is that the || {} syntax can be inferred either as a boxed or an "unboxed" closure according to the context. For example, the following code will continue to work:

some_option.map(|x| x.transform_with(upvar))

It will be silently upgraded to an "unboxed" closure.

In some other cases, it may be necessary to "annotate" which Fn* trait the closure implements:

// Change this
|x| { /* body */}
// to either of these
|: x| { /* body */}  // closure implements the FnOnce trait
|&mut : x| { /* body */}  // FnMut
|&: x| { /* body */}  // Fn

This mainly occurs when the closure is assigned to a variable first, and then passed to a function/method.

let closure = |: x| x.transform_with(upvar);
some_option.map(closure)

(It's very likely that in the future, an improved inference engine will make this annotation unnecessary)
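For readers on a current toolchain, here is a minimal sketch of the "assign first, pass later" case. It assumes modern stable Rust, where the `|:`, `|&mut:` and `|&:` annotations no longer exist and the compiler infers the `Fn*` traits from the closure body. The helper name `add_offset` is hypothetical and only serves the illustration.

```rust
/// Hypothetical helper: builds a closure, stores it in a variable,
/// then passes it to `Option::map`.
fn add_offset(value: Option<i32>, offset: i32) -> Option<i32> {
    // No sigil is written here: which Fn* traits the closure implements
    // is inferred from how its body uses the captured `offset`.
    let closure = |x: i32| x + offset;
    value.map(closure)
}
```

Calling `add_offset(Some(1), 10)` yields `Some(11)`.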

Other cases that require annotation are closures that implement some trait via a blanket impl, for example:

std::finally::Finally
regex::Replacer
std::str::CharEq

string.trim_left_chars(|c: char| c.is_whitespace())
//~^ ERROR: the trait `Fn<(char,), bool>` is not implemented for the type `|char| -> bool`
string.trim_left_chars(|&: c: char| c.is_whitespace())  // OK
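The blanket-impl situation can be sketched as follows. This is an illustration in modern stable Rust, not the code from this PR: `CharMatcher` and `trim_start_by` are hypothetical stand-ins for a trait like `std::str::CharEq` and a method like `trim_left_chars`.

```rust
/// Hypothetical stand-in for a trait such as `std::str::CharEq`.
trait CharMatcher {
    fn matches(&mut self, c: char) -> bool;
}

/// Blanket impl: every closure (or fn) callable as `FnMut(char) -> bool`
/// is a `CharMatcher`.
impl<F: FnMut(char) -> bool> CharMatcher for F {
    fn matches(&mut self, c: char) -> bool {
        self(c)
    }
}

/// Hypothetical analogue of `trim_left_chars`: note that the bound is
/// `CharMatcher`, not an `Fn*` trait.
fn trim_start_by<M: CharMatcher>(s: &str, mut m: M) -> &str {
    let mut rest = s;
    while let Some(c) = rest.chars().next() {
        if !m.matches(c) {
            break;
        }
        // Skip the matched character (it may be more than one byte long).
        rest = &rest[c.len_utf8()..];
    }
    rest
}
```

A call looks like `trim_start_by("  abc", |c: char| c.is_whitespace())`. The parameter type is written out because the expected type at the call site is only "some `CharMatcher`", which gives the closure no signature to be inferred from.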

Finally, all implementations of traits that contain boxed closures in the arguments of their methods are now broken and will need to be updated to use unboxed closures. These are the main affected traits:

serialize::Decoder
serialize::DecoderHelpers
serialize::Encoder
serialize::EncoderHelpers
rustrt::ToCStr

For example, change this:

// libserialize/json.rs
impl<'a> Encoder<io::IoError> for Encoder<'a> {
    fn emit_enum(&mut self,
                 _name: &str,
                 f: |&mut Encoder<'a>| -> EncodeResult) -> EncodeResult {
        f(self)
    }
}

to:

// libserialize/json.rs
impl<'a> Encoder<io::IoError> for Encoder<'a> {
    fn emit_enum<F>(&mut self, _name: &str, f: F) -> EncodeResult where
        F: FnOnce(&mut Encoder<'a>) -> EncodeResult
    {
        f(self)
    }
}
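The same migration pattern, reduced to a self-contained sketch in modern stable Rust. The trait `Emit` and the type `TextEncoder` are hypothetical and much smaller than the real `serialize::Encoder`; only the shape of the generic method with an `FnOnce` bound is the point.

```rust
/// Hypothetical, minimal stand-in for an encoder trait.
trait Emit {
    fn emit_enum<F>(&mut self, name: &str, f: F) -> Result<(), String>
    where
        F: FnOnce(&mut Self) -> Result<(), String>;
}

/// Hypothetical encoder that appends text to a buffer.
struct TextEncoder {
    out: String,
}

impl Emit for TextEncoder {
    fn emit_enum<F>(&mut self, name: &str, f: F) -> Result<(), String>
    where
        F: FnOnce(&mut Self) -> Result<(), String>,
    {
        self.out.push_str(name);
        self.out.push('(');
        // The closure is called exactly once, so `FnOnce` is enough.
        f(self)?;
        self.out.push(')');
        Ok(())
    }
}

/// Encodes one enum and returns the produced text.
fn encode_demo() -> String {
    let mut enc = TextEncoder { out: String::new() };
    enc.emit_enum("Shape", |e| {
        e.out.push_str("Circle");
        Ok(())
    })
    .unwrap();
    enc.out
}
```

Implementors have to repeat the generic signature, which is why every downstream `impl` of such a trait breaks when the trait switches from a boxed closure argument to a type parameter.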

[breaking-change]

How the `Fn*` bound has been selected

I've chosen the bounds to make the functions/structs as "generic as possible", i.e. to let them allow the maximum amount of input.

An F: FnOnce bound accepts the three kinds of closures: |:|, |&mut:| and |&:|.
An F: FnMut bound only accepts "non-consuming" closures: |&mut:| and |&:|.
An F: Fn bound only accepts the "immutable environment" closures: |&:|.
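The three rules above can be checked with a small sketch. It is written in modern stable Rust, where the closure kind is inferred rather than annotated, and all function names are hypothetical.

```rust
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

fn call_twice_mut<F: FnMut() -> u32>(mut f: F) -> u32 {
    f();
    f()
}

fn call_twice<F: Fn() -> u32>(f: F) -> u32 {
    f() + f()
}

fn bounds_demo() -> (String, u32, u32, u32) {
    let name = String::from("upvar");
    let mut counter = 0;
    let base = 20;

    // Moves `name` out when called: implements only FnOnce.
    let consuming = move || name;
    // Mutates `counter`: implements FnMut (and FnOnce), but not Fn.
    let mutating = || {
        counter += 1;
        counter
    };
    // Only reads `base`: implements Fn, FnMut and FnOnce.
    let reading = || base + 1;

    let a = call_once(consuming);
    let b = call_twice_mut(mutating);
    let c = call_twice(&reading);
    // The "immutable environment" closure is also accepted by the FnMut bound.
    let d = call_twice_mut(&reading);
    (a, b, c, d)
}
```

Passing `mutating` to `call_twice`, or `consuming` to either of the "twice" functions, is rejected by the compiler, which matches the list above.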

This means that the FnOnce bound has been used whenever possible; where FnOnce couldn't be used, the FnMut bound was used instead. The Fn bound was never used in the whole repository.

The FnMut bound was the most used, because it resembles the semantics of the current boxed closures: the closure can modify its environment, and the closure may be called several times.

The FnOnce bound allows new semantics: you can move out the upvars when the closure is called. This can be effectively paired with the move || {} syntax to transfer ownership from the environment to the closure caller.
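As an illustration of moving an upvar out through the call, here is a minimal sketch in modern stable Rust. It relies on `Option::unwrap_or_else`, which today takes an `F: FnOnce() -> T`; the wrapper `name_or_default` is a hypothetical name.

```rust
/// Returns `name` if present, otherwise hands out `default` itself.
fn name_or_default(name: Option<String>, default: String) -> String {
    // `move` transfers ownership of `default` into the closure. Because the
    // bound is FnOnce, the closure may give that String away when called,
    // so no clone is needed.
    name.unwrap_or_else(move || default)
}
```

With an `FnMut` bound the closure would have to keep `default` alive for a possible second call, and returning it by value would not compile.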

In the case of trait methods, it is hard to select the "right" bound since we can't control how the trait may be implemented by downstream users. In these cases, I have selected the bound based on how we use these traits in the repository. For this reason the selected bounds may not be ideal, and may require tweaking before stabilization.

r? @aturon

Gankra · 2014-12-02T21:48:24Z

😍

alexcrichton · 2014-12-03T08:11:27Z

@japaric this looks amazing, awesome work!

aturon · 2014-12-03T21:06:44Z

@japaric

First, thanks as always for your amazing work moving the libraries forward with these new language features!

I talked with @alexcrichton and @nikomatsakis a bit about our strategy, and we had a few thoughts:

Functions consuming closures should essentially always take them generically (as you're doing)
Functions yielding closures should, in general, be using a newtype to hide the implementation choice. In particular, the various pub type definitions for iterators like Values should all be changed to be newtypes hiding their implementation and providing their own Iterator impl. (This is a workaround until we have a better mechanism for doing this hiding.) This newtype work is something we're doing as part of API stabilization, but in many cases we haven't gotten to it yet. Please convert cases like this (Keys, Values, etc) to newtypes.
The use of internal free fns to have a nameable type as you're using in a few places here shouldn't be necessary: we should be able to provide an implementation of Fn(X) -> Y for Box<Fn(X) -> Y>, for example, and so you should be able to just write .map(box |(k, _)| k). (As with your current implementation, this will involve dynamic dispatch, which is bad, but there's essentially no way around that with our current plans for what's available on the Stable channel).

So, I'd suggest adding an implementation of the Fn traits for boxed Fn values and using box rather than manually creating local bare fns -- which should be less work anyway :-)
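For reference, this is roughly what the suggestion looks like in modern stable Rust, where `Box<dyn Fn(A) -> B>` does implement `Fn(A) -> B` and `Box::new` replaces the `box` keyword. The function `keys_boxed` and its tuple layout are hypothetical, chosen only to mirror the `.map(box |(k, _)| k)` idea.

```rust
/// Collects the keys of `(key, value)` pairs through a boxed closure.
fn keys_boxed(pairs: &[(String, u32)]) -> Vec<String> {
    // The trait-object type is spelled out on the binding; without it,
    // `Box::new(closure)` keeps the closure's concrete, unnameable type.
    let project: Box<dyn Fn(&(String, u32)) -> String> =
        Box::new(|pair: &(String, u32)| pair.0.clone());

    // The box itself is callable, so it can be handed to `map` directly.
    // Each call goes through dynamic dispatch.
    pairs.iter().map(project).collect()
}
```

The benefit is a nameable type (`Box<dyn Fn(..) -> ..>`) that can appear in a struct field or a type alias; the cost is an allocation and an indirect call.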

Please let me know if any of that is unclear.

japaric · 2014-12-03T23:56:58Z

Functions yielding closures should, in general, be using a newtype to hide the implementation choice.

Could this be done in a follow-up PR? This is quite a bit of work: not only must we make the wrapper implement Iterator, but also DoubleEndedIterator, RandomAccessIterator, ExactSizeIterator, Clone, and/or any other trait that the type alias "transparently" implements today.
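To make the amount of forwarding concrete, here is a sketch of such a newtype in modern stable Rust. `Keys` here is a hypothetical wrapper over a slice of pairs, not the real collection iterator; it uses a bare `fn` pointer to get a nameable `Map` type and then re-implements each iterator trait by delegation.

```rust
use std::iter::Map;
use std::slice::Iter;

/// Hypothetical newtype: the `Map<..>` inside is an implementation detail.
pub struct Keys<'a, K, V> {
    inner: Map<Iter<'a, (K, V)>, fn(&'a (K, V)) -> &'a K>,
}

/// Bare function used instead of a closure so the field type can be named.
fn first<K, V>(pair: &(K, V)) -> &K {
    &pair.0
}

impl<'a, K, V> Keys<'a, K, V> {
    pub fn new(pairs: &'a [(K, V)]) -> Self {
        let project: fn(&'a (K, V)) -> &'a K = first;
        Keys { inner: pairs.iter().map(project) }
    }
}

// Every trait the alias used to provide "for free" is now forwarded by hand.
impl<'a, K, V> Iterator for Keys<'a, K, V> {
    type Item = &'a K;

    fn next(&mut self) -> Option<&'a K> {
        self.inner.next()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()
    }
}

impl<'a, K, V> DoubleEndedIterator for Keys<'a, K, V> {
    fn next_back(&mut self) -> Option<&'a K> {
        self.inner.next_back()
    }
}

impl<'a, K, V> ExactSizeIterator for Keys<'a, K, V> {}
```

Each additional trait (for example `Clone`) needs one more forwarding impl, which is the extra work being discussed.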

you should be able to just write .map(box |(k, _)| k)

Which implementation should be in the standard library:

impl FnOnce<Args, Results> for Box<F> where F: FnOnce<Args, Results> { .. }, or
impl FnOnce<Args, Results> for Box<FnOnce<Args, Results> + 'static> { .. }

If the latter, then won't we need to write .map(box |(k, _)| k as Box<FnOnce(_) -> _ + 'static>) instead? I don't think that box || {} gets automatically coerced to a trait object.

Anyhow, due to #19032/#18835, I can't implement either right now. Can we move forward with the bare functions? Or should we block on #19032 getting fixed?

aturon · 2014-12-04T06:16:20Z

@japaric

Yes, I think it's fine to move forward with this as-is; I didn't realize that the impls were currently blocked. The transition to bare fns is relatively rare and can be cleaned up later. The newtype wrappers for various Iterator types will fall out of stabilization regardless.

nikomatsakis · 2014-12-04T14:24:43Z

@aturon @japaric I think the most likely fix to #19032 is to convert the Fn traits to use associated types. We should start investigating this. Hurrah for multiplying risk!

nikomatsakis · 2014-12-04T14:55:02Z

(Or at least, the most likely workaround to #19032 for the short term)

apasel422 · 2014-12-05T20:23:08Z

src/libcore/iter.rs

@@ -163,7 +163,7 @@ pub trait IteratorExt<A>: Iterator<A> {
    /// ```
    #[inline]
    #[unstable = "waiting for unboxed closures"]
-    fn map<'r, B>(self, f: |A|: 'r -> B) -> Map<'r, A, B, Self> {
+    fn map<'r, B, F: FnMut(A) -> B>(self, f: F) -> Map<A, B, Self, F> {


Should this still have 'r as a lifetime parameter?

No, it shouldn't. I'll remove it. Thanks for pointing it out!
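Why the `'r` parameter becomes unnecessary can be seen in a small sketch. It is written in modern stable Rust with hypothetical names (`MyMap`, `my_map`): once the closure is an ordinary type parameter `F`, any borrows it holds are part of `F` itself, so the adapter needs no separate lifetime.

```rust
/// Hypothetical adapter that stores the closure by value as `F`.
struct MyMap<I, F> {
    iter: I,
    f: F,
}

impl<B, I: Iterator, F: FnMut(I::Item) -> B> Iterator for MyMap<I, F> {
    type Item = B;

    fn next(&mut self) -> Option<B> {
        // `&mut F` is itself callable when `F: FnMut`, so the stored
        // closure can be reused on every call to `next`.
        self.iter.next().map(&mut self.f)
    }
}

/// Free-function version of `map`, with no lifetime parameter.
fn my_map<B, I: Iterator, F: FnMut(I::Item) -> B>(iter: I, f: F) -> MyMap<I, F> {
    MyMap { iter, f }
}
```

For example, `my_map(1..4, |x| x * 2)` yields 2, 4, 6, and a closure that mutably borrows a local accumulator works as well, with the borrow tracked through `F`.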

japaric · 2014-12-06T16:13:32Z

I'm deferring unboxing TrieMap methods to a later PR, because of the "recursion limit during monomorphization" issue (#19596)

alexcrichton · 2014-12-06T22:57:37Z

src/libcore/slice.rs


    /// Returns an iterator over subslices separated by elements that match
    /// `pred` limited to splitting at most `n` times. This starts at the end of
    /// the slice and works backwards.  The matched element is not contained in
    /// the subslices.
    #[unstable = "waiting on unboxed closures, iterator type name conventions"]
-    fn rsplitn_mut<'a>(&'a mut self,  n: uint, pred: |&T|: 'a -> bool) -> SplitsN<MutSplits<'a, T>>;
+    fn rsplitn_mut<'a, P>(&'a mut self,  n: uint, pred: P) -> SplitsN<MutSplits<'a, T, P>> where
+        P: FnMut(&T) -> bool;


Stylistically I've personally been pushing towards something like:

fn foo(...) where ... { }

I think this came up on a previous PR though, @aturon do you know if we've settled on a convention for this just yet?

Not really :-)

In fact, we don't currently have a way to settle on these kinds of conventions -- the ones in the Guidelines retain a very informal status and haven't gone through an RFC.

FWIW, I personally prefer @japaric's style here. But ultimately, we're going to need to sit down and design a full style guide -- probably between a release candidate and the 1.0 final.

japaric · 2014-12-10T13:30:36Z

@alexcrichton @aturon This is pretty much ready to go. There are a few closures that still need to be unboxed, but most of them are blocked on #19596 or #18835. I'll do one final pass marking them with FIXMEs. I've also updated this PR message, be sure to check it out.

aturon · 2014-12-11T05:43:56Z

@japaric I'm speechless; this is amazing. I can't thank you enough for the quality and quantity of work you've been doing on the libraries. I will review this PR ASAP.

aturon · 2014-12-12T23:25:20Z

src/libcore/str.rs

    /// let v: Vec<&str> = "Mary had a little lamb".split(' ').collect();
    /// assert_eq!(v, vec!["Mary", "had", "a", "little", "lamb"]);
    ///
-    /// let v: Vec<&str> = "abc1def2ghi".split(|c: char| c.is_numeric()).collect();
+    /// let v: Vec<&str> = "abc1def2ghi".split(|&: c: char| c.is_numeric()).collect();


I'm surprised this is needed. I tried a similar test case on nightly without the &: and it seemed to go through, but perhaps I'm missing something?

I'm surprised this is needed.

I removed the impl CharEq for |char| -> bool (note: that's a boxed closure).

I tried a similar test case on nightly without the &: and it seemed to go through

Are you sure it wasn't using a boxed closure instead of an unboxed one?

Note that the split method takes a CharEq implementor (not an Fn* implementor). And last time I checked, the || {} syntax only gets interpreted as an unboxed closure if it's used right where an Fn* implementor is expected [1]. But it has been a few days, perhaps that changed already?

[1] For example:

fn f<F>(_: F) where F: Fn() {}

fn main() {
    f(|| {});  // OK

    let closure = || {};  // Interpreted as a boxed closure at this point
    f(closure);  // ERROR: type `||` doesn't implement the `Fn()` trait
}

Ahha -- I just wasn't paying quite close enough attention: I was accidentally looking at the Vec split method. Thanks.

aturon · 2014-12-12T23:59:06Z

OK, I've made it all the way through the PR. Wonderful, epic work -- I can't wait to land this.

I'm basically happy to r+ as is (after a rebase), but I'm curious whether all uses of explicit closure notation (|&: ...| and move) are needed.

This also leaves me somewhat curious what our long term conventions around boxing will be. For example, this change adds type parameters to a fair number of structs that we would probably be equally happy to just put boxed closures in. But I think this is a question we can resolve over time, and for now getting off the old-style closures is the main thing -- we can always box up the new-style ones over time. Most public APIs should stay unboxed in any case, I think.

japaric · 2014-12-13T02:43:59Z

I'm curious whether all uses of explicit closure notation (|&: ...| and move) are needed.

Well, in most cases I added the explicit notation to avoid compiler errors, but I may have over annotated in some places.

Also, IIRC, @nikomatsakis mentioned (in his blog post, I think) that he wanted/planned to improve the inference of the "generic" || {} syntax to not require the explicit move/&: notation. But I'm not sure how hard it would be to implement that (i.e. if it can be done for 1.0). I do expect that the following code will work as soon as we drop the boxed closures:

fn f<F>(_: F) where F: Fn() {}

fn main() {
    let closure = || {};   // currently, this defaults to a boxed closure
    f(closure);  // currently, errors with `Fn` trait is not implemented for `||` type
}

This is one of the main cases for explicit closure annotation (&:).

If we can improve the || {} inference, then I think we can remove most of the explicit annotations.

This also leaves me somewhat curious what our long term conventions around boxing will be.

So, one "advantage" of storing a bare function (fn()) or an unboxed closure instead of a boxed closure (Box<Fn()>) in a struct like Bytes, an alias of Map, is that in the former case the struct can implement Copy. Though not everyone likes implicitly copyable structs, especially iterators.
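A sketch of that difference, assuming modern stable Rust (where closures that capture only `Copy` data are themselves `Copy`). `Mapper`, `upper` and `copy_demo` are hypothetical names.

```rust
/// Hypothetical struct that stores its callable by value.
/// The derive makes it Copy whenever `F` is Copy.
#[derive(Clone, Copy)]
struct Mapper<F> {
    f: F,
}

impl<F: Fn(u8) -> u8> Mapper<F> {
    fn apply(&self, b: u8) -> u8 {
        (self.f)(b)
    }
}

fn upper(b: u8) -> u8 {
    b.to_ascii_uppercase()
}

fn copy_demo() -> (u8, u8, u8) {
    // Bare function pointer: always Copy.
    let with_fn = Mapper { f: upper as fn(u8) -> u8 };
    let fn_copy = with_fn; // a copy, `with_fn` stays usable

    // Unboxed closure capturing a `u8` by value: Copy as well.
    let shift = 1u8;
    let with_closure = Mapper { f: move |b: u8| b + shift };
    let closure_copy = with_closure; // also a copy

    (
        with_fn.apply(b'a'),
        fn_copy.apply(b'b'),
        with_closure.apply(1) + closure_copy.apply(1),
    )
}
```

A `Mapper<Box<dyn Fn(u8) -> u8>>` would be neither `Copy` nor `Clone`, since the box owns a heap allocation of unknown concrete type.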

There are also some places where we simply cannot use boxed closures, like libcore, since there is no Box there.

I'll rebase this ASAP.

(BTW, I also wanted to add #[deriving(Clone, Copy)] to several structs that now contain unboxed closures (this would fix issues like #12677), but this patch is already too big. So, I'll do that in another PR.)

japaric · 2014-12-13T17:53:47Z

@aturon rebased


japaric · 2014-12-17T02:00:28Z

The binary size of the Rust distribution (bin + lib folders) went up 3% after this change. Here's a table with per file binary size change, and here's the raw data.

cc @aturon @brson @huonw

huonw · 2014-12-17T02:09:37Z

Sounds like my concern was unfounded! Thanks for investigating.

japaric force-pushed the uc branch from cdca581 to b1f5e8f Compare December 4, 2014 00:29

japaric force-pushed the uc branch 2 times, most recently from 8d67ee8 to 35f43c7 Compare December 5, 2014 14:11

apasel422 reviewed Dec 5, 2014

apasel422 mentioned this pull request Dec 5, 2014

libcore: Fix Sized bounds on overloaded function traits. #19573

Closed

japaric force-pushed the uc branch 3 times, most recently from 0dba68c to 506b9e0 Compare December 6, 2014 15:53

alexcrichton reviewed Dec 6, 2014

japaric force-pushed the uc branch 3 times, most recently from 88e9afb to 1454c76 Compare December 10, 2014 02:51

japaric changed the title ~~[WIP] Unbox all the closures~~ Unbox almost all the closures Dec 10, 2014

japaric force-pushed the uc branch from 1454c76 to c6e6935 Compare December 10, 2014 13:04

aturon reviewed Dec 12, 2014

japaric force-pushed the uc branch from c6e6935 to 008a4ef Compare December 13, 2014 17:53

Jorge Aparicio added 20 commits December 13, 2014 17:03

libcollections: use unboxed closures

879ebce

libregex_macros: use unboxed closures

b44b5da

librustrt: use unboxed closures

be53d61

libstd: use unboxed closures

cdbb3ca

Fix benches

2160427

libsyntax: use unboxed closures

0dac05d

librustc: fix fallout

d3d707c

librustc_back: use unboxed closures

451eef5

librustc_trans: fix fallout

3739a24

librustc_llvm: use unboxed closures

933e7b4

librustc: use unboxed closures

1195708

librustc_typeck: fix fallout

46272c1

librustc_trans: fix fallout

0d4d8b9

librustc_trans: use unboxed closures

0676c3b

librustdoc: use unboxed closures

888f249

librustc_typeck: use unboxed closures

521a6e6

librustc_driver: use unboxed closures

015c0fc

libtest: use unboxed closures

745225d

Remove some unnecessary move keywords

6f28816

libstd: add missing imports

db8300c

japaric force-pushed the uc branch from 4f71c42 to db8300c Compare December 13, 2014 22:05

librustc_borrowck: add #![feature(unboxed_closures)]

b8e0b81

bors closed this Dec 14, 2014

bors merged commit b8e0b81 into rust-lang:master Dec 14, 2014

Ogeon mentioned this pull request Dec 15, 2014

Rust update: Unboxed closures and removal of proc() chris-morgan/rust-http#188

Merged

japaric deleted the uc branch December 16, 2014 02:09

