Make .a syntactic sugar for i->i.a #22710

davidanthoff · 2017-07-08T11:09:27Z

I think this has been suggested in various places before (i.e. I deserve no credit for this idea), but I couldn't find an issue for it, so here it is.

The motivation for something like this is the @group .. into statement in Query.jl. With those one often gets an array of named tuples, and a super typical next step is to run some aggregation function over one specific field of the named tuple. Say A is an array of named tuples, then I might want to write something like mean(map(i->i.b,A)) to take the mean of column b.
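For concreteness, here is a minimal sketch of the verbose form described above. The array `A` and its field names are made up for illustration, and `using Statistics` assumes Julia 1.x, where `mean` lives in that standard library.

```julia
using Statistics  # `mean` is provided by the Statistics standard library on Julia 1.x

# Illustrative data: an array of named tuples, as a grouping step might produce.
A = [(a = 1, b = 2.0), (a = 2, b = 4.0), (a = 3, b = 6.0)]

# Extract column `b` with an explicit anonymous function, then aggregate.
col_b = map(i -> i.b, A)
mean_b = mean(col_b)
```

The `i -> i.b` part is the boilerplate that the proposals in this issue try to shorten.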

Idea 1 would be to simply make A..b syntactic sugar for map(i->i.b,A). The aggregation expression would then be written as mean(A..b).

Idea 2 is based on an observation by @JeffBezanson in #21875:

Some languages use .a as short for x -> x.a, which is kind of nice.

Which is probably somehow related to this issue, but I'm not entirely sure.

I think idea 2a might be something like .b.(A) instead of A..b? I'm not sure; I'm mostly putting this out here for discussion. The aggregation would then be written as mean(.b.(A)). I find that a bit confusing, though.

Maybe idea 2b could be to still have .b mean i->i.b, and then make sure that all aggregation functions like mean etc. take an anonymous function as their first argument, so that one could always write these aggregations as say mean(.b, A).
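As a sketch of what idea 2b would abbreviate: the proposed `mean(.b, A)` would stand for the function-first form below, written here with an explicit anonymous function. The data is illustrative, and only reducers known to accept a function as first argument (`mean`, `sum`, `maximum`) are used.

```julia
using Statistics  # for `mean`

A = [(a = 1, b = 2.0), (a = 2, b = 4.0)]

# Function-first reducers: the mapping function is applied to each element
# before the reduction, without materializing an intermediate array.
mean_b = mean(i -> i.b, A)
sum_b  = sum(i -> i.b, A)
max_b  = maximum(i -> i.b, A)
```

Under the proposal, each `i -> i.b` above would be spelled `.b`.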

queryverse/Query.jl#121 currently implements A..b within queries, but I'm a bit hesitant to add too much special syntax in Query.jl, especially around things where we might end up with some other solution in Base.

UPDATE: It seems pretty clear that idea 1 is not a good one, so I changed the title of this issue to refer to idea 2b, which seems the most plausible one.


ararslan · 2017-07-08T17:30:56Z

.. is widely used in math packages to mean an interval, so this would be quite breaking for packages. I also find the syntax .b.(A) quite odd. An abbreviated syntax for this kind of map already exists as getfield.(A, :b), which is equivalent to broadcast(i->i.b, A).
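A small sketch comparing the spellings mentioned here, on illustrative data. The Symbol `:b` is treated as a scalar by broadcasting, so `getfield.(A, :b)` applies `getfield(x, :b)` to each element.

```julia
A = [(a = 1, b = 2.0), (a = 2, b = 4.0)]

via_getfield  = getfield.(A, :b)           # dot-call form
via_broadcast = broadcast(i -> i.b, A)     # explicit broadcast
via_map       = map(i -> i.b, A)           # explicit map
```

All three produce the same column of values; whether they infer equally well is the question taken up in the replies.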

davidanthoff · 2017-07-08T17:43:32Z

.. is widely used in math packages to mean an interval, so this would be quite breaking for packages.

Ah, that wouldn't be good. Just out of curiosity, what is an example package like that?

An abbreviated syntax for this kind of map already exists as getfield.(A, :b)

That doesn't seem type stable, whereas both a broadcast and map version are type stable. It also is a tad too verbose for my taste.

Given the .. conflict with other packages, I think my current preference would be idea 2b in that case.

ararslan · 2017-07-08T17:51:43Z

what is an example package like that?

IntervalSets

That doesn't seem type stable

I'm confused, why is getfield.(A, :b) not type stable but map(i->i.b, A) is? The former lowers to the same code as broadcast(i->i.b, A).

I think my current preference would be idea 2b

Of those proposed I do prefer 2b as well, though I'm still not really a fan of it. i->i.b, while more verbose, is IMO clearer than .b, since we use prefix . for dot-broadcasted infix operators. Explicitly providing the i in i.b makes it clear that it's a getfield rather than a broadcasted operator of some kind.

davidanthoff · 2017-07-08T17:59:36Z

I'm confused, why is getfield.(A, :b) not type stable

I have no idea, I just looked at the output from @code_warntype for all three variants, and the getfield. version was the one that looked type-unstable.

JeffBezanson · 2017-07-08T18:01:15Z

Agree that we should keep .. as an operator for intervals, and it's also useful for range queries. I'm fine with the syntax .a for x->x.a though.

When you look at code_warntype for getfield.(A, :b), it applies typeof to all the arguments first, so you'll see code for type Symbol as the final argument. But at a particular call site the constant :b will be taken into account.

TotalVerb · 2017-07-08T19:40:31Z

This is a little sketchy to me. It's sort of introducing a global namespace of field names. What kind of accessor is .name, and what kinds of properties do you expect of this operation? I don't think you can really say, and so these things can't be used in generic code.

JeffBezanson · 2017-07-08T20:12:11Z

We already have a global namespace of field names, as does every other object-oriented language. In any case, those issues apply equally to a.b and getfield(a, :b); .b is just syntax for the same thing.

ararslan · 2017-07-08T20:32:01Z

We already have a global namespace of field names

Those are called without a leading . though.

It still seems really weird and confusing to me to be omitting the object from which you're getting the field. What's wrong with i->i.b?

malmaud · 2017-07-09T03:44:12Z

-1 from me. . can already be a daunting, seemingly-magical concept to newcomers because of the broadcast lowering. The last thing we need is for it to have more magical properties.

davidanthoff · 2017-07-10T12:18:05Z

What's wrong with i->i.b?

For my Query.jl use case it is just too verbose (e.g. see this comment).

. can already be a daunting, seemingly-magical concept to newcomers because of the broadcast lowering. The last thing we need is for it to have more magical properties.

I hear you, that worries me too. I'm not particularly wedded to this syntax, but so far I couldn't think of anything better, and (at least from my perspective) the benefits of having something for this use-case outweigh the costs, even if we end up using the .b notation.

stevengj · 2017-07-10T16:44:50Z

If we adopt @JeffBezanson's suggestion for dot overloading, then Field{:b}(x) could be defined as x.b.

In my mind, the main use for this is for things like map and broadcast/dot calls. For example: map(Field{:b}, x) or sqrt.(Field{:b}.(foo.(x))). Or, in @davidanthoff's example, @select {g.key.metric, m = myfun(Field{:score}.(g), Field{:track_id}.(g)) }.

Field{:b} is reasonably terse while remaining fairly readable and explicit. (And if it is not terse enough, we could use dot overloading to make this equivalent to Field.b.)
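A minimal sketch of how such a `Field` type could be written, assuming property access through `getproperty`. The definition below is an illustration of the suggestion, not an existing Base type, and the data is made up.

```julia
# Hypothetical type: the field name is carried as a type parameter.
struct Field{name} end

# Calling `Field{:b}(x)` returns `x.b` instead of constructing a `Field`.
Field{name}(x) where {name} = getproperty(x, name)

A = [(a = 1, b = 4.0), (a = 2, b = 9.0)]

col_b = map(Field{:b}, A)        # Field{:b} is callable, so it works with map
roots = sqrt.(Field{:b}.(A))     # and with nested dot calls
```

Because `Field{:b}` is a type, the field name is available to dispatch and to the compiler without relying on constant propagation.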

(Is there a problem that dot overloading doesn't solve? 😉 )

stevengj · 2017-07-10T19:24:06Z

Another possibility would be to use $.b as sugar for x -> x.b and $[i] as sugar for x -> x[i], but $ is pretty overloaded already.

Or _.b and _[i], since we're already turning _ into a quasi-magical placeholder symbol (#9343)?

malmaud · 2017-07-10T19:34:32Z

I definitely feel your pain about verboseness though, @davidanthoff. Perhaps we just have to resort to having a macro that goes in front of a query rather than relying on changes to Julia syntax though.

JeffBezanson · 2017-07-10T19:42:55Z

resort to having a macro that goes in front of a query rather than relying on changes to Julia syntax

I think it's highly valuable to try to think of generally-usable syntax that makes macros less necessary.

malmaud · 2017-07-10T19:46:50Z

OK sure, I am all for bending Julia's syntax to be more accommodating to data analysis :) I was just trying to be sensitive to the valid complaints that Julia syntax should not become the symbol soup of Mathematica etc.

StefanKarpinski · 2017-07-10T20:44:26Z

I would still like to have a terse function syntax based on _ so that _[i] and _.b work as @stevengj mentions above, but it's not a feature we need for 1.0 and since _ is already disallowed as an r-value, we're in the clear to give it some new meaning in the future.

stevengj · 2017-07-10T21:43:50Z

Basically, _ could become an implicit single-argument currying syntax when used as an r-value. f(_, y) would be sugar for x -> f(x, y), and _.b and _[i] would just be special cases of this for getfield and getindex. People have also suggested using ~ for this. (See also #5571 and #554.)
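To make the proposed desugaring concrete, here are the explicit anonymous functions that each underscore form would stand for. The function `f` and the sample values are hypothetical; the underscore forms themselves are only a proposal in this thread, so they appear in comments.

```julia
f(x, y) = x + 10y   # hypothetical two-argument function
y = 2

curried = x -> f(x, y)   # what `f(_, y)` would mean
get_b   = x -> x.b       # what `_.b` would mean
get_2nd = x -> x[2]      # what `_[i]` would mean, with i = 2
```

In each case the placeholder becomes the single argument of the function wrapped around the enclosing call.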

JeffBezanson · 2017-07-10T21:47:27Z

_.b is definitely an appealing option here. The syntax rule could be that the anonymous function contains the single function call directly containing the _. (Similar to how T{<:S} puts where outside one set of curly braces.)

yurivish · 2017-07-10T21:54:01Z

Here's a previous discussion with a bunch of good examples to check against: #5571 (comment)

stevengj · 2017-07-10T21:57:55Z

Note that @davidanthoff can already use the _.b syntax in Query.jl, since it parses just fine.

davidanthoff · 2017-07-22T04:32:03Z

I really like the _.b idea, and especially that I can use it now :)

For my use-case it does kind of rely on reducer functions having a combined map-reduce method that accepts a map function as an argument. Currently many reducer functions don't have such a method. Over in #20402 @StefanKarpinski has one item "Reducers APIs. Make sure reducers have consistent behaviors – all take a map function before reduction; congruent dimension arguments, etc." I guess if that happens for 1.0 all is good and we would have a pretty elegant solution for the Query.jl use-case (and many others). Thanks all for the great ideas :)

bramtayl · 2017-09-09T09:20:06Z

How about .b being Field{:b}? Then .b.a would be broadcast(Field{:b}, a).

bramtayl · 2017-09-09T10:07:25Z

Eh maybe inconsistent if .field is essentially a function with special suffix syntax. Still, plain old .field would be very useful, because once the compiler knows field is a type parameter, not a value, all sorts of operations can be shifted from run time to compile time.

davidanthoff · 2017-09-10T16:12:00Z

I thought a bit more about this, and I think I could actually solve the original issue in Query.jl that motivated this issue in a much more elegant way if we had dot overloading a la #1974. So from my point of view we could close this issue and just add one more cheer for #1974.

Essentially, I could then extend the Grouping container that holds results from a @group operation in Query.jl so that g.a would extract column a from the group g if g happens to be a collection of NamedTuples. That would be much more consistent with some future table type where df.a would extract a column from a table type, something that would also be enabled by #1974.
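A rough sketch of that idea, assuming dot overloading via `Base.getproperty`. The `Grouping` type below is a simplified stand-in written for illustration, not Query.jl's actual implementation, and its field names are assumptions.

```julia
# Simplified stand-in for a grouping container (not Query.jl's real type).
struct Grouping{T}
    key::String
    rows::Vector{T}
end

function Base.getproperty(g::Grouping, name::Symbol)
    # The container's own fields stay reachable as usual.
    name in fieldnames(Grouping) && return getfield(g, name)
    # Any other name is interpreted as a column of the grouped rows.
    return map(row -> getproperty(row, name), getfield(g, :rows))
end

g = Grouping("rock", [(score = 1.0, track_id = 7), (score = 3.0, track_id = 9)])

scores = g.score      # column extracted from the named tuples
ids    = g.track_id
```

One design wrinkle in this sketch: a column that happens to be called `key` or `rows` would be shadowed by the container's own fields.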

davidanthoff · 2017-12-22T17:23:08Z

I'm going to close this issue because I can essentially solve this in a really good way for Query.jl with the new dot-overloading.

Having said that, one crazy idea might be to add such a dot-overloaded method to any AbstractArray. A modest version would be for any AbstractArray that holds named tuples; the radical option would be for just any AbstractArray. In that world, if a is an AbstractArray, a.b would always end up extracting a collection of the b properties of the individual elements of a.

JeffBezanson · 2017-12-22T18:26:58Z

Wouldn't that be implicit vectorization of the kind we've moved away from?

davidanthoff · 2017-12-22T18:33:11Z

Hm, I'm not sure? It would unify the user API for arrays-of-struct and struct-of-array containers in the table world. I assume DataFrame at some point will get df.a as a shortcut for df[:a], and then a DataFrame and an array of named tuples would both provide x.a as a way to get the a column. I'm not sure that is good, but it could be done ;) I guess another question is what else a.b could mean...

But in any case, clearly not 1.0 stuff.

JeffBezanson · 2017-12-22T18:47:00Z

To me, a table-like thing is semantically always an array or collection of structs. It might be stored as a struct of arrays, but should have the same API. So for example map(i->i.a, table) can be O(1) and non-copying if the table is stored as a struct of arrays. That needs better syntax, but you get the idea.
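One way to picture this is the sketch below: a table stored as a struct of arrays that still presents rows as named tuples, with a specialized `map` that returns the stored column when the accessor is recognizable. `ColumnTable` is a hypothetical type, and using `Base.Fix2(getproperty, name)` as the recognizable accessor is an assumption of this sketch, since an arbitrary closure like `i->i.a` cannot be inspected by dispatch.

```julia
# Hypothetical table type: stored column-wise, exposed row-wise.
struct ColumnTable{C<:NamedTuple}
    columns::C
end

Base.length(t::ColumnTable) = length(first(t.columns))
# Row i is assembled on demand as a named tuple.
Base.getindex(t::ColumnTable, i::Integer) = map(col -> col[i], t.columns)
Base.iterate(t::ColumnTable, i = 1) = i > length(t) ? nothing : (t[i], i + 1)

# Specialization: a property accessor returns the stored column directly,
# which is O(1) and does not copy.
Base.map(f::Base.Fix2{typeof(getproperty),Symbol}, t::ColumnTable) = t.columns[f.x]

t = ColumnTable((a = [1, 2, 3], b = [2.0, 4.0, 6.0]))

row2   = t[2]
col_a  = map(Base.Fix2(getproperty, :a), t)   # fast path, same array as stored
copied = map(i -> i.a, t)                     # generic path, iterates rows
```

The generic path still works for any function, so the API is the same either way; only the cost differs.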

bramtayl · 2018-05-03T16:15:21Z

It would be convenient to make . syntax available to package authors, lowered to something like

.variable => dot(:variable), where dot isn't defined in Base. I'd like to be able to use dot to create custom keys. Currently the syntax to get symbols into the type domain is somewhat ugly; alternatives like Dot(:variable) and dot"variable" are definitely not as pretty as .variable.

mbauman · 2018-05-03T16:22:48Z

Why do you need it in the type domain? IPO on 0.7 will propagate symbols as constants to any inlined functions. From there you can lift them to the type domain yourself if you really wish.

bramtayl · 2018-05-03T16:32:39Z

Yeah, I've worked pretty heavily trying to get constant propagation to work. Even with the changes in 0.7 constant propagation is finicky. It's disabled during recursion, and it doesn't work through slurps, making lispy tuple programming very difficult (though not impossible with the judicious use of @pure). Unless constant propagation becomes a semantic guarantee, it's a lot more reliable just to keep everything in the type domain as early as possible. Making dot available to package authors could potentially satisfy both my needs and David's.

bramtayl · 2018-05-03T17:04:07Z

See for example https://discourse.julialang.org/t/is-this-pure/8050/6

stevengj · 2018-05-03T18:02:05Z

@bramtayl, in #24990, _.variable already gets lowered to a Fix2{typeof(getproperty),...} object that you could dispatch on.
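As an illustration of dispatching on such an object, the sketch below builds the `Fix2` by hand with `Base.Fix2(getproperty, :variable)` rather than through the underscore syntax, which is only proposed in #24990. The helper `accessed_name` is a hypothetical name.

```julia
# Built manually here; under #24990, `_.variable` would produce an equivalent object.
accessor = Base.Fix2(getproperty, :variable)

# A method that only applies to property accessors and recovers the field name.
accessed_name(f::Base.Fix2{typeof(getproperty),Symbol}) = f.x

value = accessor((variable = 42, other = 0))   # behaves like x -> x.variable
name  = accessed_name(accessor)
```

From `name`, a package could lift the symbol into the type domain itself if it needs to.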

mbauman · 2018-05-03T18:06:06Z

And #26826 seeks to address constant propagation through varargs.

bramtayl · 2018-05-03T18:11:28Z

Another issue is that constant propagation doesn't survive keyword arguments and named tuples.

bramtayl · 2018-05-03T18:16:12Z

Oh and here's another option: define a custom type with overloaded dots, something like

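struct Key{s} end  # assumed definition: Key is not defined in the original comment

struct K end

# The overload has to extend Base.getproperty to take effect
@inline Base.getproperty(k::K, s::Symbol) = Key{s}()

const k = K()

k.a  # evaluates to Key{:a}()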

ararslan added the design (Design of APIs or of the language itself) and speculative (Whether the change will be implemented is speculative) labels Jul 8, 2017

davidanthoff changed the title ~~Make A..b syntactic sugar for map(i->i.b, A) (or some alternative design)~~ Make .a syntactic sugar for i->i.a Jul 8, 2017

stevengj mentioned this issue Jul 27, 2017

deprecate multiple-underscore-only variable names #22982

Closed

stevengj mentioned this issue Dec 8, 2017

RFC: curry underscore arguments to create anonymous functions #24990

Open

5 tasks

davidanthoff closed this as completed Dec 22, 2017

rapus95 mentioned this issue Feb 20, 2023

headless anonymous function (->) syntax #38713

Open

chrstphrbrns mentioned this issue Jun 15, 2023

any possible to vectorize getproperty as arr..name? #50180

Closed
