Proposal: Deprecate then remove function piping #20331

Closed
3 tasks
ararslan opened this issue Jan 30, 2017 · 37 comments
Labels
deprecation This change introduces or involves a deprecation design Design of APIs or of the language itself julep Julia Enhancement Proposal

Comments

@ararslan
Member

Proposal

Deprecate the current use of |> as a function pipe. That is, the syntax x |> f would be deprecated in favor of the normal call syntax f(x). After the deprecation period, Base.:(|>) would be undefined.
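For reference, the pipe under discussion amounts to a one-line function that applies its right operand to its left. A minimal sketch, using a hypothetical name `pipe` (not part of Base) so that nothing in Base is redefined:

```julia
# Hypothetical stand-in for Base's |>: apply f to x.
pipe(x, f) = f(x)

x = [1, 2, 3]

# All three spellings produce the same value.
a = sum(x)        # normal call syntax
b = x |> sum      # infix pipe from Base
c = pipe(x, sum)  # the stand-in defined above
```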

This change was initially suggested by tkelman in #16985 (comment).

There has been a lot of contentious debate over various syntaxes for function piping (in particular, see #5571), with arguments for mimicking a variety of languages. That discussion has been had ad nauseam and I do not wish to rehash it. That is NOT the purpose of this proposal.

Rationale

A number of well thought out, well maintained packages have implemented macros that provide convenient piping syntax for a variety of use cases, both general and specific. Examples include Lazy.jl, FunctionalData.jl, Pipe.jl, and ChainMap.jl, among others.

StefanKarpinski and andyferris gave us arbitrary function composition in #17155, which can serve a similar purpose in many situations.

As tkelman similarly argued in #5571, the function pipeline in Base is backwards from the familiar call syntax; having both in the Base language is essentially endorsing the use of 2 disparate syntaxes to achieve the same goal. While there are often multiple ways to write the same thing using solutions in Base, typically the solutions at least adhere to a similar mental model. In this case, the syntaxes employ literally opposite mental models.

Function pipelines violate the principle of least surprise by applying the action after the object. That is, if you read sum(x) you know immediately when you see sum() that you're going to add up the values in the argument. When you see x |> sum, you see x, then all of a sudden you're adding up its values. Few if any other Base solutions put the action at the end, which makes piping the odd one out.

Piping does indeed have precedent in other languages, e.g. Hadley Wickham's %>% in R (which is not part of base R), and sometimes that style/flow makes sense. However, in the interest of consistency within Base Julia, I propose that we defer the responsibility for providing piping syntax to packages, which can redefine |> or provide convenience macros as they see fit.

Action Items

Should this proposal be accepted, the action items would be:

  • Remove uses of the syntax within Base, if any exist
  • Provide a formal deprecation for Base.:(|>) in either 0.6 or 1.0
  • Remove it in a subsequent release
@ararslan ararslan added deprecation This change introduces or involves a deprecation design Design of APIs or of the language itself julep Julia Enhancement Proposal labels Jan 30, 2017
@jiahao
Member

jiahao commented Jan 30, 2017

Function piping provides a postfix syntax for function calling, which is convenient at the REPL for interactive data generation and further visualization/summarization.

A use case that I have seen many people type is

julia> somecomplicatedthingproducingarray
...

<ARROW UP>

julia> somecomplicatedthingproducingarray |> summarize

where the summarize function is something like a plot or histogram

@ararslan
Member Author

@jiahao I'm not arguing that it's not useful, but rather that we should be consistent within Base and let packages provide things like this.

@tkelman
Contributor

tkelman commented Jan 30, 2017

there's also ans for repl usage

@ajkeller34
Contributor

ajkeller34 commented Jan 31, 2017

In this proposal would |> still be parsed as an infix operator?

@tkelman
Contributor

tkelman commented Jan 31, 2017

@ajkeller34: definitely, packages would be free to do whatever they want with it (though they'd have to play nicely with each other in terms of type piracy and coexistence), without as much of a constraint of being semantically compatible with the old base definition.

Remove uses of the syntax within Base, if any exist

Here's a now-very-outdated attempt I made to do this: tkelman@212727c
At the time, most of the uses in base were pretty trivial. A few of the tests' uses of "pipe this thing to this anonymous function" are maybe nicer with piping, but since most of those were reusing the same anonymous function multiple times it would probably be worth giving it a name and calling it like a normal function at that point.

@bramtayl
Contributor

bramtayl commented Jan 31, 2017

In case anyone is curious, I have ChainRecursive.jl out now. I'll put an announcement on discourse about the disintegration of ChainMap.jl and its various children once it's complete.

@shashi
Contributor

shashi commented Jan 31, 2017

Let me offer some resistance here since I have some vested interest in, and a particular liking for, what |> makes possible.

I second @jiahao that |> is very useful when you want to quickly try things out in the REPL. Further, I find it also useful when your argument is too big or merits some poise (yes, I said that). In the case of the linked example, it is in fact better to have the argument be more prominent than the function being called. sum(x) is too simple an example, and should indeed be written as sum(x). In Escher.jl all functions that add properties to elements have a curried method. This dovetails so well with |> (that was planned, it also works great with map) and it's a joy to be able to try things out at the end of the line and see the UI update immediately. I don't have to find my way to the beginning of the expression and faff around. For use with Escher at least, the suggested alternative is to assign big expressions to variables with made-up names like padded_box_contents_aligned_right_tomato_background (or worse, box34) and then call a function on them. As opposed to the beautifully reading <big UI expression> |> aligncontents(right) |> pad(1em) |> fillcolor("tomato").
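The curried-method pattern described above can be sketched as follows. This is a toy illustration, not Escher's actual implementation: the `Element` type and the bodies of `pad` and `fillcolor` are assumptions made up for the example.

```julia
# Hypothetical toy element: just a list of property strings.
struct Element
    props::Vector{String}
end

# Two-argument methods do the actual work...
pad(el::Element, amount) = Element(vcat(el.props, "pad=$amount"))
fillcolor(el::Element, color) = Element(vcat(el.props, "fill=$color"))

# ...and one-argument "curried" methods return a function awaiting the element.
pad(amount) = el -> pad(el, amount)
fillcolor(color) = el -> fillcolor(el, color)

box = Element(["box"])

# Each curried call yields a single-argument function, which is what |> expects.
styled = box |> pad(1) |> fillcolor("tomato")
```

Because each curried call returns a one-argument function, the same functions can also be passed to map over a collection of elements.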

I know that after this I can define |> inside Escher and I probably will, but it will kill my brain to see WARNING: using Escher.|> in module YourPackage conflicts with an existing identifier. Packages will almost definitely give different meanings to this, which to me is very alarming!

StefanKarpinski and andyferris gave us arbitrary function composition in #17155, which can serve a similar purpose in many situations.

The alternative to box |> fill("orange") |> pad(2em) would be (pad(2em) ∘ fill("orange"))(box) as opposed to box |> pad(2em) ∘ fill("orange")? These two seem orthogonal.
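To make the comparison concrete, here is a small sketch of the three spellings. The `tag` helper is hypothetical and only records the order in which steps run. Note that composition applies its right-hand function first, so the order of the steps is reversed relative to the pipeline.

```julia
# Hypothetical curried helper that appends its name to a history vector.
tag(name) = history -> vcat(history, name)

start = String[]

# Pipeline: steps run left to right.
piped = start |> tag("fill") |> tag("pad")

# Composition: the rightmost function runs first, so the order is reversed.
composed = (tag("pad") ∘ tag("fill"))(start)

# The two can also be combined: pipe the value into a composed function.
mixed = start |> (tag("pad") ∘ tag("fill"))
```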

@tkelman
Contributor

tkelman commented Jan 31, 2017

Escher's use of closures as objects seems to me like it's defining a DSL just for the sake of using this syntax (which has serious limitations for anything that isn't single-input, single-output), where it would likely be better-served, and more generalizable, if it used one of the multiple available chaining macros.

Removing Base's definition of this would allow people who like this syntax to do more interesting things with it.

@ararslan
Member Author

@shashi I understand your points, but you would be able to get the same behavior using one of the packages I cited in the issue, would you not? As an example, in your Escher example, you could use FunctionalData to do @p vbox(<really big thing>) | pad(2em) or Lazy to do @> vbox(...) pad(2em).

@shashi
Contributor

shashi commented Jan 31, 2017

Removing Base's definition of this would allow people who like this syntax to do more interesting things with it.

Except it will not be usable, since the only safe way to use it then would be Escher.|>(...) or Lazy.|>(...).

@kmsquire
Member

Hypothetically, how would one use |> as an infix operator if you're using two different packages that both define and export it, assuming it's not defined in Base?

@ararslan
Member Author

@kmsquire It depends on the use case. |> would still be parsed as an infix operator just as it is now, it just wouldn't have a value in Base. If you use it in a macro, it doesn't matter how any particular package defines it, since it simply becomes the first argument in a call expression.

Take for example <|, which is parsed as an infix operator but does not have a value. Even though it's undefined, we still have

julia> dump(:(a <| b))
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol <|
    2: Symbol a
    3: Symbol b
  typ: Any

Packages can define and export methods for Base.:(<|) that mean different things, just as one can do with +.

But the packages that provide nice function piping do so in macros, I assume for precisely this reason.
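As a rough sketch of why a macro does not depend on what |> means at run time: the macro receives the parsed expression and can rewrite it before anything is evaluated. The names `rewrite` and `@mypipe` below are hypothetical and do not correspond to any of the packages mentioned in this thread.

```julia
# The parser produces a call expression whose first argument is the operator symbol.
ex = :(a |> f)

# Hypothetical helper: turn every `x |> f` in an expression into `f(x)`.
rewrite(ex) = ex                      # leaves symbols, literals, etc. untouched
function rewrite(ex::Expr)
    if ex.head == :call && length(ex.args) == 3 && ex.args[1] == Symbol("|>")
        return Expr(:call, rewrite(ex.args[3]), rewrite(ex.args[2]))
    end
    return Expr(ex.head, map(rewrite, ex.args)...)
end

# Hypothetical macro: the rewrite happens at expansion time, so the
# run-time definition of |> is never consulted.
macro mypipe(ex)
    esc(rewrite(ex))
end

result = @mypipe([1, 2, 3] |> sum |> sqrt)   # expands to sqrt(sum([1, 2, 3]))
```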

@bramtayl
Contributor

bramtayl commented Jan 31, 2017

FWIW, no chaining package would need to make use of |> during evaluation because during chaining everything gets zipped up into one expression. I'd imagine if packages do go defining |> it will be precisely the definition in base. Although they should probably be using a chaining macro instead. See DataFramesMeta for a good example of how to build an interface that works well with chaining.

@MikeInnes
Member

If we're deprecating this, should we also deprecate * for string concatenation? That has similar issues, as it's redundant with string(a, b) and violates the principle of least surprise given that a and b aren't numbers.

More generally, we should probably deprecate all infix notation, as it's confusing to have multiple calling conventions like *(a, b) vs a * b – we can trim our current 3 disparate syntaxes down to one and get total consistency. To avoid ugliness we might consider moving the function call inside the parens, and perhaps getting rid of the redundant commas, as well.

@shashi
Contributor

shashi commented Jan 31, 2017

|> would still be parsed as an infix operator just as it is now, it just wouldn't have a value in Base. If you use it in a macro, it doesn't matter how any particular package defines it

Still not sure why we need to remove it from Base.

@bramtayl makes a good point:

I'd imagine if packages do go defining |> it will be precisely the definition in base.

And still the only way to use more than one package which define this is to not use it infix.

@ararslan
Member Author

I don't see why removing the definition in Base is required for |> to be used inside macros.

It isn't. My point is that |> can be used inside of macros regardless of the situation in Base. The same goes for any operator that parses appropriately. The point of the proposal is to make Base self-consistent in terms of function calls, then piping behavior can be achieved through packages. Whether the packages use |> in particular doesn't matter; they could just as well use <| or literally any other infix operator.

@shashi
Contributor

shashi commented Jan 31, 2017

@ararslan right, that was not what I meant to ask, I updated my comment right after, sorry.

Anyway, I don't quite get the "Base self-consistent in terms of function calls" sentiment. Seems like this will only make it harder to use |> in a non-macro context. I personally believe |> is a worthwhile thing to learn about for a newbie, despite it being surprising. It at least saves effort at the REPL. It's quite fun to realize later that |> is a function just like any other infix function, which reinforces the lesson that functions are just values.

@lobingera

Maybe just something formal: Please discuss/decide deprecations and/or syntax changes at the beginning of a release cycle, not at the end. Currently all the main developers and package maintainers are spending time and energy on finishing 0.6, and they just might have no time to think of another (good) idea.

@cormullion
Contributor

"I'm not arguing that it's not useful, but rather that we should be consistent"

Sometimes usefulness beats consistency? I wasn't aware of the inconsistency, but I have found the |> syntax useful. If it's removed I won't feel I've gained anything tangible.

@evanfields

An explanation for my thumbs-down vote, if I may:

Much of what's currently in Base could happen in packages instead. Should we move dictionaries to a package? Maybe list operations like sort and shuffle? Collections operations, etc.? I'm sure there have been long and detailed discussions concerning what should and shouldn't be included in Base, but I presume there are three reasons some functionality might be included in base:

  1. That functionality is necessary to enable other features in base.
  2. That functionality is an essential part of the language, many Julia programmers and packages will use it, and therefore it's desirable to have a single implementation/syntax that everyone agrees on, rather than the fragmentation of lots of people rolling their own.
  3. Including that functionality in base makes "raw" Julia more pleasant to use or makes it feel more full featured, which helps with language evangelism and adoption.

Something like sum probably hits all 3 points, and I'd argue that function piping hits the second and third point:

In both the initial (well-written) proposal and the discussion in this thread, a common theme is the existence of several packages providing pipe-like functionality through macros: Lazy.jl, Pipe.jl, ChainMap.jl, etc. The existence of multiple packages strongly suggests that many people in the community find piping a useful and desirable feature, and these packages' presence in this discussion thread suggests that many folks here understand and support the use of piping.

Given that piping is a common and popular feature in the Julia community and in other languages, that even in this discussion people seem to agree it has many uses, especially at the REPL (where Julia shines), and that there's already fragmentation in the Julia ecosystem, my read is not that it should be removed from Base, but rather that the piping syntax available in Base should be enhanced so that there's less need for fragmentation. Different packages offering different ways of e.g. plotting seems okay; different popular packages offering different ways of applying functions seems pretty scary.

I further argue that removing piping from Base but leaving the infix operator around is rather surprising: in Julia you can't define your own infix operators, but there is an unused infix operator |> hanging around that you can define as you please? If that's good functionality, why not give us a solid 10 or 20 infix operators to define as we please?

Lastly, I believe it's natural to keep piping exactly because it is different from other function application. It's a feature, not a bug, that it's different from other conventions of applying functions; this difference is what lets it shine in some use cases. And there are other cases where (hand-waving a bit) the noun comes before the verb, and many of these are exactly syntactic sugar in cases where raw function application is unwieldy. Off the top of my head, assignment x = 5 is putting the noun (symbol x) before the verb (bind to a value). Likewise for accessing fields of types t.a instead of getfield. And most profoundly, array indexing z[5] reads like "from z take the 5th element" and is generally more natural than getindex(z, 5).
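The noun-first forms mentioned above are sugar over ordinary function calls, which a short sketch can show. The `Point` type and the values are hypothetical and only serve the illustration.

```julia
# Hypothetical type with a single field named `a`.
struct Point
    a::Int
end

t = Point(7)
z = [10, 20, 30, 40, 50]

# Field access: noun-first sugar versus the plain function call.
field_sugar = t.a
field_call  = getfield(t, :a)

# Indexing: noun-first sugar versus the plain function call.
index_sugar = z[5]
index_call  = getindex(z, 5)
```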

@martinholters
Member

If that's good functionality, why not give us a solid 10 or 20 infix operators to define as we please?

There's probably more than that if one includes all unicode ones in addition to the unclaimed ASCII ones like <|, ++, ...

@alanedelman
Contributor

Not reading whole thread -- but just wanted to say that I love being able to pipe. I would vote useful over consistency any day.

@ajkeller34
Contributor

I have a very mild preference to keep it, but don't really care so long as it remains an infix operator. I feel like I probably wouldn't use function piping if it entailed importing a package, which tells me that I don't value it very much.

That being said, I don't think this "principle of least surprise" argument is compelling, as it makes some presumptions about a diverse user base. To native speakers of subject-object-verb languages, I suppose most of Julia's syntax violates the principle of least surprise, and function piping is rather comfortable...

@ararslan
Member Author

Not reading whole thread

😕

I love being able to pipe

Again, I'm not arguing that one should not be able to pipe, but rather that the functionality could easily be had instead in one of the several existing piping packages. Removing the Base pipe allows for packages to more easily define their own piping semantics without having to adhere to or remain consistent with whatever Base provides.

in Julia you can't define your own infix operators

That's not true; anything that parses as an infix operator can be defined or redefined. As martinholters pointed out, <| and ++ are similarly available, among others.
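A minimal sketch of that point, assuming a Julia version in which `++` and `<|` parse as infix operators and have no definition in Base. The meanings given to them here are purely illustrative.

```julia
# Neither operator is defined in Base, so defining them here extends nothing
# that Base owns. Both definitions are illustrative choices.
a ++ b = vcat(a, b)   # concatenation
f <| x = f(x)         # "backwards pipe": apply f to x

joined = [1, 2] ++ [3]
total  = sum <| [1, 2, 3]
```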

@JeffBezanson
Member

I'm kind of neutral on this one, but I will second the sentiment that |> being backwards from normal function call syntax is the whole point of it. Even the biggest fans of piping are not asking (AFAIK) for e.g. sin <| x because that really is redundant with sin(x). |> is for those cases where it's easier on the eyes and/or brain to think of data flowing left to right without lots of parentheses.
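A small sketch of the contrast being described, with made-up data: the nested form is read from the inside out, while the piped form lists the steps in the order they run.

```julia
v = [3.0, -4.0]

# Inside-out: read from the innermost call outward.
nested = sqrt(sum(map(abs2, v)))

# Left to right: the data comes first, then each step in order.
# A step that needs extra arguments is wrapped in an anonymous function.
piped = v |> (x -> map(abs2, x)) |> sum |> sqrt
```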

@StefanKarpinski
Member

I'd like for |> to be more powerful, e.g. allow x |> f(_) + 2g(_) |> h etc. and for it not to just be an operator. Every time anyone defines x |> f to mean something besides f(x) it really trips me up because the whole point of the operator as we've used it is that it's a different-order call syntax. Since we can overload call I can't see a good reason for having x |> f mean something else.
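With the pipe as it stands, the placeholder form above can be approximated by spelling out the anonymous function by hand. This is a sketch of the equivalent, not of any proposed syntax; `f`, `g`, and `h` are hypothetical stand-ins.

```julia
# Hypothetical functions standing in for f, g, and h.
f(y) = y + 1
g(y) = 10y
h(y) = y / 2

x = 3

# What `x |> f(_) + 2g(_) |> h` would mean, written with an explicit lambda.
result = x |> (y -> f(y) + 2g(y)) |> h
```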

@ararslan
Member Author

@StefanKarpinski More powerful pipes can already be obtained using macros. See for example Pipe.jl, which provides exactly the syntax you're describing. As long as |> is an operator (I personally don't see |> as being worth a special case), macros can use any piping delimiter that parses infix, even if it isn't a :call. As an example, one could similarly use @~ to pipe (at least as of this writing). That level of flexibility is one of the advantages of using macros in Julia.

@JeffBezanson
Member

We could add the functionality of Pipe.jl to the language, and then you'd have it without needing to write @pipe.

The main reason to deprecate |> would be if we want to reclaim the syntax for some other purpose that people like much better.

@ararslan
Member Author

I guess I'm trying to argue that piping doesn't need to be part of the language, it can (and already does) live in a package.

@JeffBezanson
Member

But if there's nothing else we want |> for, I see little harm in leaving its (trivial) definition alone.

@ararslan
Member Author

I don't believe there are currently any proposals to repurpose |> in Base. My argument for not defining it in Base is that it gives us more consistency without loss of functionality.

@tkelman
Contributor

tkelman commented Feb 1, 2017

Would any "more powerful piping" proposals or package implementations be made simpler by not having this existing definition to worry about or work around?

@PallHaraldsson
Contributor

PallHaraldsson commented Feb 2, 2017

@ararslan "That's not true; anything that parses as an infix operator can be defined or redefined."

From the manual, "&& and || operators": they are parsed but can't be redefined (it's a good thing). I believe they are the only exceptions.

The so-called "logical operators" && and || are infix. "Operator" (in the sense of a unary or binary relation) is IMHO the incorrect term for them, as they aren't operators in the same way as the bitwise & and |, which do allow overloading (something I'm not sure is a good choice).

@ararslan
Member Author

ararslan commented Feb 2, 2017

@PallHaraldsson Those are control flow, not operators in the same sense as &, |, +, etc.

Let's try to stay on topic here if possible, please.

@JeffBezanson
Member

@tkelman That's a good point. I suspect we can make future piping syntax backwards-compatible though. For example, if _ is reserved then |> can have a special meaning when its arguments contain _, and otherwise do the same thing it does now.

There's another issue: to make |> work for your object, do you define |> or the "function call operator" (i.e. adding methods to it)? It might be cleaner if |> were built-in syntax for function call, to ensure f(x) and x |> f are always the same.
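The first option, adding a method to the call operator, can be sketched like this. The `Line` type is hypothetical. Because the existing pipe simply calls its right operand, a callable object works on the right of |> with no further definitions.

```julia
# Hypothetical callable object: the line y = slope * x + offset.
struct Line
    slope::Float64
    offset::Float64
end

# Adding a method to the call operator makes Line instances callable.
(l::Line)(x) = l.slope * x + l.offset

line = Line(2.0, 1.0)

called = line(3.0)
# No method of |> is defined for Line; the pipe just calls `line`.
piped = 3.0 |> line
```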

@ararslan
Member Author

ararslan commented Feb 3, 2017

The consensus here is very clearly against, so I'll go ahead and close the issue. I appreciate the discussion, everyone.

@ararslan ararslan closed this as completed Feb 3, 2017
@EconometricsBySimulation
Contributor

I know this issue is closed. Just wanted to say "thank you" for keeping the operator.
