RFC: Add pub fn identity<T>(x: T) -> T { x } to core::convert #2306

Merged: 7 commits, Aug 19, 2018
204 changes: 204 additions & 0 deletions text/0000-convert-id.md
@@ -0,0 +1,204 @@
- Feature Name: convert_identity
- Start Date: 2018-01-19
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Adds an identity function `pub fn identity<T>(x: T) -> T { x }` as
`core::convert::identity`. The function is also re-exported to
`std::convert::identity` as well as the prelude of
both libcore and libstd.

# Motivation
[motivation]: #motivation

## The identity function is useful

While it might seem strange to have a function that just returns back the input,
there are some cases where the function is useful.

### Using `identity` to do nothing among a collection of mappers

When you have a collection of mapping functions, such as the map below, and
want to dispatch to them, you sometimes need an entry that leaves the input
untransformed. The identity function fills that role.

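```rust
use std::collections::HashMap;

// Let's assume that this and other functions do something non-trivial.
fn do_interesting_stuff(x: u32) -> u32 { .. }
fn other_stuff(x: u32) -> u32 { .. }

// A dispatch-map of mapping functions. The explicit function-pointer value
// type lets the distinct function items coerce to one common type.
let mut map: HashMap<&str, fn(u32) -> u32> = HashMap::new();
map.insert("foo", do_interesting_stuff);
map.insert("bar", other_stuff);
map.insert("baz", identity);
```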
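The dispatch idea can also be sketched as a small runnable program. The function bodies and the `dispatch` helper below are hypothetical stand-ins and are not part of the RFC; the sketch also imports `identity` explicitly rather than relying on the prelude.

```rust
use std::collections::HashMap;
use std::convert::identity;

// Hypothetical stand-ins for the "interesting" functions in the text.
fn double(x: u32) -> u32 { x * 2 }
fn increment(x: u32) -> u32 { x + 1 }

// Look up a mapper by name and apply it; unknown names yield `None`.
fn dispatch(name: &str, input: u32) -> Option<u32> {
    // The function-pointer value type gives all entries one common type.
    let mut map: HashMap<&str, fn(u32) -> u32> = HashMap::new();
    map.insert("foo", double);
    map.insert("bar", increment);
    map.insert("baz", identity); // the "do nothing" entry
    map.get(name).map(|f| f(input))
}

fn main() {
    assert_eq!(dispatch("foo", 21), Some(42));
    assert_eq!(dispatch("bar", 21), Some(22));
    assert_eq!(dispatch("baz", 21), Some(21));
    assert_eq!(dispatch("missing", 21), None);
}
```

Rebuilding the map on every call keeps the sketch short; real code would likely build it once and reuse it.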

### Using `identity` as a no-op function in a conditional

This reasoning also applies to simpler yes/no dispatch as below:

```rust
let do_stuff = if condition { some_manipulation } else { identity };

// do more interesting stuff in between..

do_stuff(42);
```

### Using `identity` to concatenate an iterator of iterators

In the example below, we use the identity function to perform a monadic join
on iterators. In other words, we concatenate an iterator of iterators into a
single iterator:

```rust
let vec_vec = vec![vec![1, 3, 4], vec![5, 6]];
let iter_iter = vec_vec.into_iter().map(Vec::into_iter);
let concatenated = iter_iter.flat_map(identity).collect::<Vec<_>>();
assert_eq!(vec![1, 3, 4, 5, 6], concatenated);
```

### Using `identity` to keep the `Some` variants of an iterator of `Option<T>`

We can keep only the `Some` values, unwrapped, by writing `iter.filter_map(identity)`.

```rust
let iter = vec![Some(1), None, Some(3)].into_iter();
let filtered = iter.filter_map(identity).collect::<Vec<_>>();
assert_eq!(vec![1, 3], filtered);
```

### To be clear that you intended to use an identity conversion

If you instead use a closure as in `|x| x` when you need an
identity conversion, it is less clear that this was intentional.
With `identity`, this intent becomes clearer.

## The `drop` function as a precedent

The `drop` function in `core::mem` is defined as `pub fn drop<T>(_x: T) { }`.
The same effect can be achieved by writing `{ _x; }`. This sets a precedent
that such trivial functions are considered useful and worth including in the
standard library even though they can easily be written in a user's crate.
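
A minimal sketch of that equivalence, assuming a hypothetical `Noisy` type whose `Drop` impl bumps a counter. Neither `Noisy` nor `drops_observed` comes from the RFC; they exist only to make the drops observable.

```rust
use std::cell::Cell;

// Hypothetical helper type: increments the shared counter when dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns the counter value after `drop(a)` and after `{ b; }`.
#[allow(path_statements)]
fn drops_observed() -> (u32, u32) {
    let counter = Cell::new(0);
    let a = Noisy(&counter);
    let b = Noisy(&counter);

    drop(a); // the library function moves `a` in and drops it
    let after_drop = counter.get();

    { b; } // a bare path statement also moves `b` and drops it
    let after_block = counter.get();

    (after_drop, after_block)
}

fn main() {
    assert_eq!(drops_observed(), (1, 2));
}
```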

## Avoiding repetition in user crates

Here are a few examples of the identity function being defined and used:

+ https://docs.rs/functils/0.0.2/functils/fn.identity.html
+ https://docs.rs/tool/0.2.0/tool/fn.id.html
+ https://github.com/hephex/api/blob/ef67b209cd88d0af40af10b4a9f3e0e61a5924da/src/lib.rs

There are more examples besides these. To reduce duplication, the function
should be provided in the standard library as the one common place where it
is defined.
Member

I don't think this argument is persuasive for functions. There's a case for things like traits and structs having a common definition, since they're nominal and thus the commonality can improve interoperability (not that that place needs to be std even there). But that's not the case for functions, where it doesn't matter if you used the `functils` or `tool` one.


## Precedent from other languages

Other languages include an identity function in their standard libraries,
among them:

+ [Haskell](http://hackage.haskell.org/package/base-4.10.1.0/docs/Prelude.html#v:id), which also exports this to the prelude.
+ [Scala](https://www.scala-lang.org/api/current/scala/Predef$.html#identity[A](x:A):A), which also exports this to the prelude.
+ [Java](https://docs.oracle.com/javase/8/docs/api/java/util/function/Function.html#identity--), which is a widely used language.
+ [Idris](https://www.idris-lang.org/docs/1.0/prelude_doc/docs/Prelude.Basics.html), which also exports this to the prelude.
+ [Ruby](http://ruby-doc.org/core-2.5.0/Object.html#method-i-itself), which exports it to what amounts to the top type.
+ [Racket](http://docs.racket-lang.org/reference/values.html)
+ [Julia](https://docs.julialang.org/en/release-0.4/stdlib/base/#Base.identity)
+ [R](https://stat.ethz.ch/R-manual/R-devel/library/base/html/identity.html)
+ [F#](https://msdn.microsoft.com/en-us/visualfsharpdocs/conceptual/operators.id%5B%27t%5D-function-%5Bfsharp%5D)
+ [Clojure](https://clojuredocs.org/clojure.core/identity)
+ [Agda](http://www.cse.chalmers.se/~nad/repos/lib/src/Function.agda)
+ [Elm](http://package.elm-lang.org/packages/elm-lang/core/latest/Basics#identity)

## The case for inclusion in the prelude

Let's compare the effort required, assuming that each character
typed has a uniform cost with respect to effort.

```rust
use std::convert::identity; iter.filter_map(identity)

fn identity<T>(x: T) -> T { x } iter.filter_map(identity)

iter.filter_map(::std::convert::identity)

iter.filter_map(identity)
```

Comparing the length of these lines, we see that there's not much difference in
length when defining the function yourself or when importing or using an absolute
path. But the prelude-using variant is considerably shorter. To encourage the
use of the function, exporting to the prelude is therefore a good idea.
Member

> But the prelude-using variant is considerably shorter.

This is true for literally every possible library addition, so I don't think it's a good argument.


In addition, there's an argument to be made from similarity to other items in
`core::convert`, as well as `drop`, all of which are in the prelude. This is
especially relevant in the case of `drop`, which is also a trivial function.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

An identity function is a mapping of one type onto itself such that the output
is the same as the input. In other words, a function `identity : T -> T` for
some type `T` defined as `identity(x) = x`. This RFC adds such a function for
all `Sized` types in Rust into libcore at the module `core::convert` and
defines it as:

```rust
pub fn identity<T>(x: T) -> T { x }
```

This function is also re-exported to `std::convert::identity` as well as
the prelude of both libcore and libstd.

It is important to note that the input `x` passed to the function is
moved since Rust uses move semantics by default.
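
The move can be illustrated with a short sketch. The helper names `same_buffer_after_identity` and `copy_stays_usable` are hypothetical and chosen only for this example.

```rust
use std::convert::identity;

// Returns true when the `String` that comes out of `identity` still owns
// the same heap buffer that went in, i.e. the value was moved, not cloned.
fn same_buffer_after_identity(s: String) -> bool {
    let before = s.as_ptr();
    let moved = identity(s);
    // `s` has been moved; using it here would be a compile error.
    moved.as_ptr() == before
}

// `Copy` types are copied rather than moved, so the original stays usable.
fn copy_stays_usable(n: u32) -> u32 {
    let m = identity(n);
    n + m
}

fn main() {
    assert!(same_buffer_after_identity(String::from("hello")));
    assert_eq!(copy_stays_usable(21), 42);
}
```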

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

An identity function defined as `pub fn identity<T>(x: T) -> T { x }` exists as
`core::convert::identity`. The function is also re-exported as
`std::convert::identity` and in the prelude of both libcore and libstd.

Note that the identity function is not always equivalent to a closure
such as `|x| x` since the closure may coerce `x` into a different type
while the identity function never changes the type.
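
One way to see the difference is an unsizing coercion. The sketch below is an illustration under assumed names (`via_closure`, `via_identity`), and it annotates the closure's types explicitly so that the coercion is visible; it is not taken from the RFC.

```rust
use std::convert::identity;

// The closure body is a coercion site: the boxed array is unsized into a
// boxed slice on the way out, so the input and output types differ.
fn via_closure(array: Box<[i32; 3]>) -> Box<[i32]> {
    let to_slice = |x: Box<[i32; 3]>| -> Box<[i32]> { x };
    to_slice(array)
}

// `identity` has the shape `fn(T) -> T`, so its output type is always
// exactly its input type.
fn via_identity(array: Box<[i32; 3]>) -> Box<[i32; 3]> {
    identity(array)
}

fn main() {
    let slice = via_closure(Box::new([1, 2, 3]));
    assert_eq!(slice.len(), 3);

    let array = via_identity(Box::new([1, 2, 3]));
    assert_eq!(*array, [1, 2, 3]);
}
```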

# Drawbacks
[drawbacks]: #drawbacks

It is already possible to do this in user code by:

+ using an identity closure: `|x| x`.
Member

What specific advantages do you see of identity over |x| x? identity is longer, and I think its meaning would be less obvious to most users, as they'd then have to go looking for the definition of a function called identity. I suppose there are potential codegen advantages, but those are small at best.

Member

A previous version of the RFC used id, but it was changed because id is ambiguous and for those who haven't heard of identity functions very non obvious (it's easy to guess what it means if you see "identity", but not "id")

I assert that length is not the defining factor of ergonomics. identity is still easier to read and type than the closure, but I agree that the closure is easier to grok for those who haven't seen this function.

Member

> identity is still easier to read and type

I guess that's what I'm disputing, but it's totally a matter of opinion. I'd expect people familiar with Rust's syntax to quickly grok |x| x, while identity could require looking around for a function called identity.

Contributor

I can't speak for everyone, but I always double take when I see |x| x and the way I grok it is by going "oh right it's just the identity function"

Contributor Author (@Centril, Jan 19, 2018)

@cramertj

> [..] I think its meaning would be less obvious to most users [..]

Which users are you referring to here: current Rust users, or programmers in general? I believe you are correct in either case, but I have no evidence to offer that your assertion is correct, which is less than satisfactory.

> What specific advantages do you see of identity over |x| x? identity is longer [..]

Now that the length "advantage" (if we believe it is that) of id has been removed with the renaming to identity, I can only offer these thoughts:

  • A lot of functional languages and others have this function, so functional programmers may expect the function to exist while they may be new to the language and don't understand what |x| x means without reading about it. Consistency with those (quite many) languages is an argument on its own.

  • For those who don't already understand identity (I think functional programmers will), learning it is a one-time cost. The same can be said of closure syntax or any other identifier in the standard library: you have to learn it, and that too is a one-time setup cost. If you previously didn't know about identity, reading about it can also be a useful learning experience.

  • I believe that using identity is more clearly showing that the identity-conversion was intentional compared to |x| x.

Member

@Centril Thanks for explaining! Those arguments seem reasonable to me. I personally still think that identity is less clear, but I'll leave it to others to voice their opinions one way or another, and I won't object to the feature if general consensus is that it would be beneficial.

+ writing the `identity` function as defined in the RFC yourself.

These are contrasted with the [motivation] for including the function
in the standard library.

# Rationale and alternatives
[alternatives]: #alternatives

The rationale for including this in `convert` and not `mem` is that the
former generally deals with conversions, and "identity conversion" is a
commonly used phrase. Meanwhile, `mem` does not relate to `identity` other
than that both deal with move semantics. Therefore, `convert` is the better
choice. Including it in `mem` is still an alternative, but as explained, it
isn't fitting.

The rationale for including this in the prelude has been previously
explained in the [motivation] section. Leaving it out of the prelude is an
alternative. If the function is not in the prelude, however, its utility is
so low that it may be better not to add the function at all.

Naming the function `id` instead of `identity` is a possibility.
This name is, however, ambiguous with *"identifier"* and less clear,
which is why `identity` was chosen.

# Unresolved questions
[unresolved]: #unresolved-questions

There are no unresolved questions.