rename indices? #23434

Closed
JeffBezanson opened this issue Aug 24, 2017 · 15 comments
Labels
arrays, deprecation, design, needs decision

Comments

@JeffBezanson
Member

I propose renaming this to axes, since it basically returns a description of each axis of an array. The current name makes me think it gives you, well, the indices (which is what keys does).
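The distinction can be sketched as follows, assuming Julia 1.0 or later, where this function is named `axes`. The array `A` and the variable names are only illustrative.

```julia
A = rand(3, 4)

# One range per dimension: a description of each axis of A.
ax = axes(A)

# The collection of everything A can be indexed with.
ks = keys(A)

println(ax)          # a tuple with one range per dimension
println(length(ks))  # one key per element of A
```

Here `ax` holds two ranges, while `ks` holds one index for every element.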

@JeffBezanson added the arrays, needs decision, deprecation, and design labels on Aug 24, 2017
@stevengj
Member

There is also eachindex ... personally, I find axes more confusing than indices. We don't use the "axis" terminology anywhere else in Julia as far as I can tell.

@JeffBezanson
Member Author

eachindex also doesn't tell you the indices of an array --- it gives you something useful for fast iteration, which can be different from the actual indices.
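A small sketch of that difference, assuming Julia 1.0 or later. The view `V` is just an illustrative example of an array without fast linear indexing.

```julia
A = rand(3, 4)
V = view(A, 1:2, :)    # rows 1:2 of every column, not contiguous in memory

fast_A = eachindex(A)  # for a plain Array: a linear range over all elements
fast_V = eachindex(V)  # for this view: an iterator over Cartesian indices

println(typeof(fast_A))
println(typeof(fast_V))
```

The same call yields different kinds of iterators depending on which style of iteration is efficient for the array.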

@JeffBezanson
Member Author

However indices(A, i) gives you the indices for dimension i, so that's good.

@timholy
Member

timholy commented Aug 24, 2017

I like the idea of switching our terminology to "axes" to refer to what we sometimes call "dimensions." The problem with dimensions is that it both means "number of dimensions" (as in "n-dimensional") and "the dimensions of this room" (aka, size). "axes" seems to clearly imply the former and not the latter. For example, I'd propose that most methods that take an argument called dims or region should probably rename that variable axes.

With AxisArrays, @mbauman has played with making indices return the information that is currently returned by AxisArrays.axes. This makes sense because both return a tuple that is essentially a "broadcast-worthy" description of the complete set of Cartesian indices for the array. However axes returns these in physical units (e.g., mm if you've assigned such units to the axes of the array) whereas indices returns these in computer science units (aka, integers). Both turn out to have their uses, so AFAICT attempts to unify axes and indices have proved problematic. For Base, the advantage of indices is that it somehow seems to imply integer units to me.

I wonder if the distinction you're working towards is the difference between the set of all indices, an iterator to generate all indices, and the "basis vectors" for constructing (via reshape and broadcasting) all possible indices. Given that arrays have rectangular/Cartesian indexing, all valid indices can be constructed from the "basis vectors". Maybe call it indexvectors?
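The "basis vectors" idea can be sketched like this, assuming Julia 1.0 or later, where the per-dimension ranges come from `axes`. The names `rows`, `cols`, and `all_indices` are illustrative only.

```julia
A = rand(3, 4)
ax = axes(A)

rows = collect(ax[1])                 # 3-element column vector of row indices
cols = reshape(collect(ax[2]), 1, :)  # 1×4 row of column indices

# Broadcasting the column against the row builds every (i, j) pair.
all_indices = tuple.(rows, cols)      # 3×4 matrix of index tuples

println(size(all_indices))
```

Every Cartesian index of `A` appears exactly once in `all_indices`, built only from the two per-dimension ranges.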

@Jutho
Contributor

Jutho commented Oct 4, 2017

There might be some more inconsistencies in how properties of arrays are named. I can see that size, in the singular, refers both to an individual size(A,i) and to the collection size(A), but I always find this confusing when, on the next line, I probe strides as stride(A,i) or strides(A).

On a more philosophical level, I would say that the current nomenclature for array properties is very much oriented towards arrays representing data on a discretised grid in space(time), i.e. the rank of the tensor is referred to as ndims being typically 2 (plane), 3 (space) or 4 (spacetime), and indices(A,i) is used for the possible indices (the range of values) that the ith dimension can take, and the length of that range is size(A,i), again referring to the geometrical interpretation of the array.
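For reference, the properties mentioned above can be compared side by side in a short sketch. It assumes Julia 1.0 or later, so `axes(A, i)` stands in for what this thread calls `indices(A, i)`.

```julia
A = rand(3, 4)

println(ndims(A))       # number of dimensions (the "rank")
println(size(A))        # all sizes at once, singular name
println(size(A, 2))     # size along one dimension, same singular name
println(strides(A))     # all strides at once, plural name
println(stride(A, 2))   # stride along one dimension, singular name

# The length of the index range along a dimension equals the size along it.
range_2 = axes(A, 2)
println(length(range_2) == size(A, 2))
```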

That use of indices and dimensions is exactly opposite to how tensors are often described and used in physics. A tensor has a number of indices (the rank of the tensor), which are abstract objects (not the actual values they take), and the ith index of a tensor has a certain dimension associated to a vector space in which it lives, which can be spacetime. For example, tensors in classical physics or general relativity can have a number of indices whose dimension is typically 3 (for space) or 4 (for spacetime), but the number of indices itself has nothing to do with space(time) dimensionality. size and ndims are therefore awkward descriptions of these properties. Similarly, in quantum physics, the indices of a tensor take values in some Hilbert space with a certain dimension, and neither those dimensions nor the number of indices have any relation to space(time) dimensionality.

All of this just as a side note, as I am not actually voting to change the current convention.

@cormullion
Contributor

cormullion commented Oct 4, 2017

Would existing users of an axes() function have to rename their versions? (It might be quite common in the plotting/graphics world as well as the array world...)
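One way such coexistence could look is sketched below, assuming Julia 1.0 or later, where Base exports `axes`. The module `HypotheticalPlots` and its `axes` method are invented for illustration and do not refer to any real package.

```julia
module HypotheticalPlots
# This `axes` is a separate function owned by the module;
# it does not add methods to Base.axes.
axes(title::String) = "axes for plot: " * title
end

A = rand(2, 3)

plot_axes  = HypotheticalPlots.axes("demo")  # the package's own function
array_axes = Base.axes(A)                    # the array function from Base

println(plot_axes)
println(array_axes)
```

With qualified names both functions stay usable; unqualified calls may need attention if a package also exports its own `axes`.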

@andyferris
Member

I made a prototype PR of changing this name at #25057 - it's pretty easy to choose a new name on that branch, as desired.

I'll cross-post a discussion I made there:

I feel that the indices of an indexable container would probably be the collection of things you can index with, which is currently what the keys function does. [...]

Going further, I feel there is a bit of discord between "keys" and "indices", since we have keys and haskey as doing discovery of things we can do getindex and setindex! with. I don't see the advantage on forcing users to learn and use multiple words for the same concept. We could consider harmonizing like so:

| Current | "Index" terminology | "Key" terminology |
|---|---|---|
| getindex | getindex | getkey |
| setindex! | setindex! | setkey! |
| keys | indices or index | keys |
| haskey | hasindex | haskey |

Personally, I'd prefer the "index" version.
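The overlap behind the "Current" column can be seen in a short sketch using only functions that exist in Julia 1.0 or later. The renamed variants (`hasindex`, `setkey!`, and so on) are proposals from this comment and are not used here.

```julia
d = Dict("a" => 1, "b" => 2)
v = [10, 20, 30]

# `keys` discovers what each container can be indexed with.
dict_keys  = sort(collect(keys(d)))
array_keys = collect(keys(v))

# Membership checks: `haskey` for the Dict, a bounds check for the Vector.
dict_has  = haskey(d, "a")
array_has = checkbounds(Bool, v, 2)

# `getindex` is what the bracket syntax calls for both containers.
println(getindex(d, "a") == d["a"])
println(getindex(v, 2) == v[2])
```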

@StefanKarpinski
Member

I prefer the "index" terminology as well – "index" sounds right for both arrays and dicts (although it's a bit non-standard for dicts), whereas "key" sounds quite wrong for arrays.

@StefanKarpinski
Member

Has the name dims been considered? It matches ndims and seems like a more intuitive name for what this function returns: a description of the dimensions of the argument. It would be a little weird that dims(A, i) is plural, but it seems ok to me:

julia> dims(rand(3, 4))
(Base.OneTo(3), Base.OneTo(4))

julia> dims(rand(3, 4), 1)
Base.OneTo(3)

julia> ndims(rand(3, 4))
2
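The session above shows a proposed name, not an existing function. A minimal sketch of how it could be tried out, assuming Julia 1.0 or later and defining a hypothetical `dims` that simply forwards to `axes`:

```julia
# Hypothetical: `dims` is not provided by Base; it is defined here
# only to mirror the proposal in this comment.
dims(A::AbstractArray) = axes(A)
dims(A::AbstractArray, i::Integer) = axes(A, i)

A = rand(3, 4)

println(dims(A))      # one range per dimension
println(dims(A, 1))   # the range for the first dimension
println(length(dims(A)) == ndims(A))
```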

@Jutho
Contributor

Jutho commented Dec 13, 2017

With strides, it's also strides(A) and stride(A,n) (plural and singular). However, dim sounds like dim(A,n) should just be size(A,n), not the corresponding range, but that's maybe my difference in background, as explained above: #23434 (comment)

@andyferris
Member

I had thought of dimranges since it returns a range for each dimension.

However maybe dimindices / dimkeys would be better, since it’s the indices (keys) for each dimension.

@StefanKarpinski
Member

FWIW, I prefer axes to dimindices, but I prefer dims to both.

@andyferris
Member

Just looked into this - dims is really commonly used in Base as a variable name, containing a tuple of integers (the result of size). We'd want to do a bit of a nomenclature switch around if we go with dims here.
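The existing usage pattern looks roughly like the sketch below. The helper name `same_shape_zeros` is hypothetical and exists only to show `dims` as a local variable holding the result of `size`.

```julia
# Hypothetical helper: `dims` here is a tuple of integers, not a tuple of ranges.
function same_shape_zeros(A::AbstractArray)
    dims = size(A)
    return zeros(dims), dims
end

A = rand(3, 4)
B, dims_of_A = same_shape_zeros(A)

println(dims_of_A)
println(size(B))
```

A function called `dims` that returned ranges would sit awkwardly next to variables of the same name that hold sizes.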

@waldyrious
Contributor

> Has the name dims been considered?

Wasn't this precisely what Tim Holy's comment above addressed? Or am I missing something?

@andyferris
Member

Closed via #25057
