Make keys(::Tuple) return a Tuple. #24897

Closed
wants to merge 2 commits into from

andyferris
Member

This seems the most sensible choice to me, so that keys((1, 2.0, :c)) === (1, 2, 3).

I will find this useful where I tend to use things like map on keys and it's really, really nice to preserve the properties of the container when you do this. Mapping keys of a tuple might also be a nicer replacement for some particular usages of ntuple. (Use map(f, keys(t)) instead of ntuple(f, Val(length(t)))).

Relatedly, I've been thinking about generalizing getindex for getting multiple elements out of dictionaries (#24019) and other containers including tuples, and it seems the easiest definition goes like:

getindices(A, inds) = map(i -> A[i], inds)

where I feel you also want something like this:

t::Tuple
map(i -> t[i], keys(t)) === t

which is provided by this PR.
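
A minimal sketch of the definition and the round-trip property above. `getindices` is the definition quoted in this comment; `tuplekeys` is a hypothetical helper (not part of Base) standing in for what this PR proposes `keys(::Tuple)` should return.

```julia
# Definition quoted in the comment above.
getindices(A, inds) = map(i -> A[i], inds)

# Hypothetical helper, not in Base: tuple-valued keys for a tuple.
tuplekeys(t::Tuple) = ntuple(identity, length(t))

t = (1, 2.0, :c)

tuplekeys(t)                                  # (1, 2, 3)
getindices(t, tuplekeys(t)) === t             # true: indexing with all keys rebuilds t
getindices(t, (3, 1))                         # (:c, 1)
getindices(Dict(:a => 1, :b => 2), (:a, :b))  # (1, 2)
getindices([10, 20, 30], [1, 3])              # [10, 30]
```

Because `map` over a tuple returns a tuple, the shape of the result follows the shape of `inds`, which is what makes the round trip hold.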

@StefanKarpinski
Member

Wouldn't you want keys((1.0, "two", 3)) to return a range object 1:3?

@andyferris
Member Author

That is the current behavior - personally I don't find that output type particularly useful since it hides the type of where the keys came from (one assumes an array or something), their length is no longer statically known, etc.

@andyferris
Member Author

I guess one could have a "static" version of OneTo to describe the keys of a tuple?
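
One way such a type might look, as a rough sketch only: `StaticOneTo` is a hypothetical name, it does not exist in Base, and it implements just enough of the iteration and indexing interfaces to illustrate the idea of carrying the length in the type.

```julia
# Hypothetical type: the range 1:N with N stored as a type parameter.
struct StaticOneTo{N} end

Base.length(::StaticOneTo{N}) where {N} = N
Base.eltype(::Type{<:StaticOneTo}) = Int

# Iterate 1, 2, ..., N.
Base.iterate(::StaticOneTo{N}, i::Int = 1) where {N} = i > N ? nothing : (i, i + 1)

# Indexing returns the index itself, after a bounds check.
Base.getindex(::StaticOneTo{N}, i::Integer) where {N} =
    1 <= i <= N ? Int(i) : throw(BoundsError())

# The keys of a key container are the container itself.
Base.keys(r::StaticOneTo) = r

r = StaticOneTo{3}()
collect(r)   # [1, 2, 3]
(r...,)      # (1, 2, 3)
r[2]         # 2
```

Unlike `Base.OneTo(3)` and `Base.OneTo(5)`, the values `StaticOneTo{3}()` and `StaticOneTo{5}()` have different types.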

@vtjnash
Member

vtjnash commented Dec 4, 2017

> type of where the keys came from (one assumes an array or something)

This is definitely a feature. Getting the same answer from multiple places is always preferable.

> their length is no longer statically known

This is false. The same information is contained in OneTo() (or 1:N). It is merely accessible to a varying degree via different passes at various times. With 1:N, the information should become visible via optimization passes such as loop-idiom detection. With (1...N), the information might be propagated by MemSSA or constant-propagation.

@andyferris
Member Author

andyferris commented Dec 4, 2017

Hmm... so the types of situations I'm considering are when you have some data in some indexable container a, and you want to do operations like:

map(f1, a)
map(f2, keys(a))
map(k -> f3(k, a[k]), keys(a)) # could also be using `pairs`
map(f4, pairs(a))
map(f5, values(a))

It would be nice if you tended to get the same type of container back.

The idea is to build a coherent interface that works across indexable containers in Julia - specifically AbstractArray, Associative and Tuple (EDIT: and NamedTuple).
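
For illustration, a small check of which container type comes back today for a vector versus a tuple. This assumes a Julia 1.x release, where `keys(::Tuple)` still returns `Base.OneTo`; `double` is just a placeholder function.

```julia
a = [10, 20, 30]
t = (10, 20, 30)
double(x) = 2x

# Vector: mapping values or keys both give a Vector.
map(double, a) isa Vector         # true
map(double, keys(a)) isa Vector   # true

# Tuple: mapping values gives a Tuple, but mapping keys gives a Vector,
# because keys(t) is the range Base.OneTo(3).
map(double, t) isa Tuple          # true
map(double, keys(t)) isa Vector   # true
```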

@StefanKarpinski
Member

@andyferris: their length is no longer statically known
@vtjnash: This is false.

Allow me to translate: their length is no longer part of their type. Which is true.
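
A short illustration of the distinction, using only Base types:

```julia
t3 = (1, 2, 3)
t5 = (1, 2, 3, 4, 5)

# A tuple's length is part of its type.
typeof(t3) === NTuple{3, Int}                    # true
typeof(t3) === typeof(t5)                        # false

# A OneTo range stores its length as a runtime field, so the types match.
typeof(Base.OneTo(3)) === typeof(Base.OneTo(5))  # true
length(Base.OneTo(3))                            # 3
```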

@strickek
Contributor

strickek commented Dec 5, 2017

Another question: What will keys for named tuples return?

@andyferris
Member Author

To fulfill what I said above about map, it’s good if the container type for k = keys(a) is similar to a. If the output container k is “similar” to indexable container a then k would also be indexable. Its values would obviously be the keys of a, but also keys(k) needs to be the same as keys(a) if the second and third map commands above are going to preserve the keys of a.

In this picture you end up with key containers which are idempotent under indexing (k[i] == i). We already have this property for vectors (and offset arrays).

For named tuples I’d return a similar container (named tuple) with the same keys and with the keys as values. Thus I propose:

keys((a=1, b=2.0)) === (a = :a, b = :b)
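
A sketch of how the proposed result could be built and what it would satisfy. `namedkeys` is a hypothetical helper, not part of Base; in Julia 1.x `keys` of a named tuple returns a plain tuple of symbols.

```julia
nt = (a = 1, b = 2.0)

keys(nt) === (:a, :b)             # true: current Base behavior

# Hypothetical helper: a named tuple whose values are its own keys.
namedkeys(nt::NamedTuple) = NamedTuple{keys(nt)}(keys(nt))

k = namedkeys(nt)
k === (a = :a, b = :b)            # true
k[:a] === :a                      # true: idempotent under indexing
keys(k) === keys(nt)              # true: same keys as the original
map(key -> nt[key], k) === nt     # true: map over a named tuple keeps the names

# Vectors already have the same property.
vk = keys([10, 20, 30])
vk[2] == 2                        # true
```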

I get that the answers are sometimes a bit surprising, but I feel the logic is sound and being able to map indexable containers with the same semantics as we have for arrays (similar output types, preserves keys) is super powerful. The AbstractArray interface is beautifully constructed, easy to use, and powerful; however, we never really constructed a more widely applicable Indexable interface until v0.7, with keys applying to all indexable containers. I say we propagate the little gems of beauty about AbstractArray that make it fun and easy to use. But I’m just one Julia user, and am (honestly) interested in how others think an indexable interface might play out.

@yurivish
Contributor

yurivish commented Dec 6, 2017

@andyferris I've been following your developments along these lines with great interest. Are there any examples of the expressive improvements enabled by a consistent indexing scheme that you could point me to? There's been a lot of discussion in various issues and it's sometimes been a little bit difficult to follow from the sidelines, so it's quite possible that I just missed the right thread.

As a side note, Mathematica has a Dataset type with a lot of powerful functionality packed into a set of indexing primitives: http://reference.wolfram.com/language/ref/Dataset.html

The precise mechanics there are sometimes rather arcane (some functions get applied on the way down to filter or select, and others get applied on the way up, to aggregate) but it feels like this is playing in the same space as your recent push towards a consistent basic interface.

@andyferris
Member Author

@yurivish Thanks for following along. It seems like Mathematica Datasets are basically tables / dataframes.

I'll try to get back with some more broadly motivating examples, but the kinds of things I've been trying to address in the last while are highlighted in my new prototype Indexing.jl, in particular the usage of keys here and the broken test here. This PR addresses the former for tuples.

@andyferris added the collections (Data structures holding multiple items, e.g. sets), design (Design of APIs or of the language itself), and triage (This should be discussed on a triage call) labels on Dec 14, 2017
@andyferris
Member Author

Could I ask triage to please consider whether this is desired for v1.0 - in particular the broader issue of having the various incantations of map I mention in #24897 (comment) return the same container type with the same keys, or not. Thanks!

(My position is that this is the most useful, convenient and consistent way of mapping/iterating collections of data for "data science" type applications, and having such APIs be consistent in their semantics across arrays, dictionaries, and (named) tuples would be a great boon).

@JeffBezanson
Member

I would hold off on this until/unless it becomes clear that we need it. In particular it's not clear whether this applies to dictionaries.

@StefanKarpinski
Member

See my comment on #25013 (comment).

@StefanKarpinski removed the triage (This should be discussed on a triage call) label on Dec 14, 2017
@vtjnash closed this on Apr 3, 2021
@vtjnash deleted the ajf/tuplekeys branch on April 3, 2021 04:37
Labels: collections (Data structures holding multiple items, e.g. sets), design (Design of APIs or of the language itself)
6 participants