Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Reshape shorthand for squashing first dimensions instead of last #27474

Open
tbole opened this issue Jun 7, 2018 · 25 comments
Labels
arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation

Comments

@tbole
Copy link
Contributor

tbole commented Jun 7, 2018

Currently reshape(a, Val(N)) with N>0 is a useful shorthand returning an N-dimensional array by squashing the trailing dimensions of a. However, there is no such shorthand for the first dimensions. For that I propose to make reshape accept negative values of N.

a = rand(3,4,5)
reshape(a, Val(2))  # -> 3×20 Array{Float64,2}
reshape(a, Val(-2)) # -> 12×5 Array{Float64,2}
reshape(a, Val(-1)) == reshape(a, Val(1)) == a[:] # -> true
@StefanKarpinski StefanKarpinski added the triage This should be discussed on a triage call label Jun 7, 2018
@mbauman
Copy link
Member

mbauman commented Jun 7, 2018

I think the best way to express this is with the : placeholder:

reshape(a, (3, :)) # -> 3×20 Array{Float64,2}
reshape(a, (:, 5)) # -> 12×5 Array{Float64,2}
reshape(a, :)

This Val API is a little strange in the first place — it changes the interpretations of the argument from a list of dimension lengths to a number of dimensions. Adding an extra negative interpretation here would only add to that strangeness.

@StefanKarpinski
Copy link
Member

I would propose deleting reshape with Val entirely.

@mbauman
Copy link
Member

mbauman commented Jun 7, 2018

Yes I would agree. I initially was the one who introduced this Val API — and I did so because I needed a way to express reshaping to a certain number of dimensions to support partial linear indexing (omitting indices over non-singleton dimensions). That's why it's asymmetric in terms of the trailing/leading dimensions; that asymmetry used to also live in our indexing behaviors. That's no longer the case.

@JeffBezanson JeffBezanson added this to the 0.7 milestone Jun 7, 2018
@JeffBezanson JeffBezanson added deprecation This change introduces or involves a deprecation and removed triage This should be discussed on a triage call labels Jun 7, 2018
@tbole
Copy link
Contributor Author

tbole commented Jun 7, 2018

I agree the Val API isn't particularly nice, but as a shorthand it's quite useful, especially for arrays with a large number of dimensions. The alternative to Val for arbitrary N-dimensional arrays would be something like

reshape(a, size(a)[1:x]..., :)
reshape(a, :, size(a)[x:end]...)

which feels way more cluttered than

reshape(a, Val(x))
reshape(a, Val(-x))

If the Val API is deprecated I'd introduce some specialized function for this operation, similar to squeeze, which is also just a shorthand for one particular reshape operation. This could either be something like

squeezefirstdims(a, ndim); squeezelastdims(a, ndim)
squeezedims(a, [+/-]ndim)

@JeffBezanson
Copy link
Member

I should also point out that using negative numbers for special behaviors is something we've tried to avoid.

@JeffBezanson
Copy link
Member

I'm looking at deprecating this, and it's true that replacing this is not so easy. For example when growing the number of dimensions you need to pass 1s, but to reduce the number of dimensions you pass a :. Maybe we should just give it a different name, e.g. reshapetrailing and reshapeleading?

@mbauman
Copy link
Member

mbauman commented Jun 13, 2018

We had talked about using convert(AbstractArray{<:Any, N}, A) as a potential replacement in the past. Not sure how well that'd play with all the other convert methods, though.

@JeffBezanson
Copy link
Member

At most, I would use convert for the case of adding trailing singleton dimensions.

@tbole
Copy link
Contributor Author

tbole commented Jun 13, 2018

convert sounds rather misplaced for this task. While reshaping often involves type conversion I don't think it is the natural choice. reshapetrailing and reshapeleading sound better.
The current behavior is that reshape(a, Val(nd)) will add singleton dimensions for nd>ndims(a) and compress trailing dimensions for nd < ndims(a). Alternatively one could try to go with a single function and introduce a forward/backward switch and eventually a start offset for more flexibility

reshapedims(a, nd, startdim, forward) = ...
reshapedims(a, nd) = reshapedims(a, nd, 1, true)   # == reshape(a, Val(nd))

@StefanKarpinski
Copy link
Member

It occurs to me that this operation is closely related to squeeze but where the squeezed dimensions don't need to be singletons.

@JeffBezanson
Copy link
Member

I don't think we have time to design and implement a replacement for this for 0.7. Hopefully we can add a non-breaking new feature for this any time, then eventually deprecate the existing function.

@JeffBezanson JeffBezanson added the triage This should be discussed on a triage call label Jun 26, 2018
@StefanKarpinski
Copy link
Member

Sorry, this is probably not the outcome you were hoping for when you requested a new feature here, @tbole — a better API will be forthcoming in a future Julia version!

@tbole
Copy link
Contributor Author

tbole commented Jun 26, 2018

I'm mostly using reshape to automatically map whole data structures of multi-dimensional arrays to various layouts, so my CFD solver always has the optimal layout for a specific task at hand. Despite the limitations I'm quite content with the current Val API, thus having it around till 2.0 is a good thing for me. @JeffBezanson is right, improvements to this feature can be added any time.

@timholy
Copy link
Member

timholy commented Jun 26, 2018

A[☠️, ♻️^N] to kill dimensions from the front (reducing to an N-dimensional matrix) and A[♻️^N, ☠️] to kill from the back. Where

struct Die end
const ☠️ = Die()
struct Recycle{N} end
const ♻️ = Recycle{1}()
Base.:*(::Recycle{M}, ::Recycle{N}) where {M,N} = Recycle{M+N}()
@inline Base.literal_pow(::typeof(^), ::Recycle{1}, ::Type{Val{N}}) where N = Recycle{N}()
export ☠️, ♻️

@mbauman
Copy link
Member

mbauman commented Jun 26, 2018

Seriously, expressing this as an indexing is very clever and actually works without much consternation:

julia> Base.to_indices(A, inds, I::Tuple{Tuple{Vararg{Colon,N}}, Vararg{Any}}) where N = (vec(CartesianIndices(inds[1:N])), Base.to_indices(A, inds[N+1:end], Base.tail(I))...)

julia> A = rand(3,4,5);

julia> A[:, (:, :)]
3×20 Array{Float64,2}:
 0.232048  0.113046  0.0152497  0.279078  0.385592   0.835976  0.148665  0.266044  0.870706  0.800947  0.0904307  0.750843  0.341379  0.989266  0.786485  0.812475  0.556002  0.6635    0.50563   0.509057
 0.172248  0.511823  0.438507   0.511153  0.0461281  0.562187  0.629468  0.849421  0.453065  0.086566  0.174001   0.737693  0.655136  0.417266  0.712693  0.843132  0.177883  0.166547  0.123498  0.167775
 0.136909  0.147522  0.915731   0.624489  0.146087   0.993909  0.25996   0.227684  0.317993  0.285055  0.249758   0.519837  0.579373  0.874992  0.739405  0.45147   0.849409  0.410294  0.433313  0.904717

julia> A[(:, :), :]
12×5 Array{Float64,2}:
 0.232048   0.385592   0.870706   0.341379  0.556002
 0.172248   0.0461281  0.453065   0.655136  0.177883
 0.136909   0.146087   0.317993   0.579373  0.849409
 0.113046   0.835976   0.800947   0.989266  0.6635
 0.511823   0.562187   0.086566   0.417266  0.166547
 0.147522   0.993909   0.285055   0.874992  0.410294
 0.0152497  0.148665   0.0904307  0.786485  0.50563
 0.438507   0.629468   0.174001   0.712693  0.123498
 0.915731   0.25996    0.249758   0.739405  0.433313
 0.279078   0.266044   0.750843   0.812475  0.509057
 0.511153   0.849421   0.737693   0.843132  0.167775
 0.624489   0.227684   0.519837   0.45147   0.904717

A little bit more is required here to do the inds[1:N] and inds[N+1:end] type-stably, but that's not hard.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Jun 26, 2018

What about these syntaxes: A[:, ...], A[..., :] and even A[:, ..., :]? Talk about dot-oriented programming though 😂

@tbole
Copy link
Contributor Author

tbole commented Jun 26, 2018

@StefanKarpinski @mbauman This seems to work pretty well and feels a lot more simple than using reshape. The syntax sounds great for reducing dimensions. Would it be worthwhile to have something like A[(),:,:,:] or A[:,(),:,:] for adding singleton dimensions at arbitrary positions, since we can already append them by A[:,:,:,:]?

@StefanKarpinski
Copy link
Member

Although A[a:b, ...] could also be a syntax to slice in the first dimension but keep all the others. @mbauman, I guess the thinking behind A[:, (:, :)] is that the (:, :) a single dimensional slice into multiple dimensions?

@timholy
Copy link
Member

timholy commented Jun 26, 2018

While the above was definitely tongue-in-cheek, one other point I want to make sure isn't being lost: for programming for generic dimensionality, A[ntuple(i->:, N), :] isn't bad but A[♻️^N, ☠️] has certain attractions.

@mbauman
Copy link
Member

mbauman commented Jun 27, 2018

Would it be worthwhile to have something like A[(),:,:,:] or A[:,(),:,:] for adding singleton dimensions at arbitrary positions

We have that! It's a [CartesianIndex(())]. It falls out of our existing rules quite nicely: CartesianIndex(()) has no indices (and therefore selects no dimensions), but a vector of one is one-dimensional so it adds a dimension.

We had talked about using ellipses for "filling out" the rest of an indexing expression with :s to preserve the trailing shape. This behavior is like "gathering them up" into one resulting dimension.

Whether we use a Tuple for this or some new structure, we can generalize it such that it not only accepts : but any indexing expression at all — each provided index would simply re-index into the original array's axes to compute the result.

@StefanKarpinski
Copy link
Member

You are truly the array master, @mbauman.

@JeffBezanson
Copy link
Member

Indexing with (:, :) is a really clever idea. Can all reshape calls be replaced with indexing?

@mbauman
Copy link
Member

mbauman commented Jun 27, 2018

Can all reshape calls be replaced with indexing?

To a certain extent, yes. reshape(A, (m, n, o)) == A[LinearIndices((m, n, o))], but indexing is more permissive in that it'll allow new dimensions that don't fully fill the original array. There are also a whole slew of optimizations on ReshapedArray that we'd need to move over to SubArray{<:Any,<:Any,<:Any,Tuple{LinearIndices}} (or maybe could be generalized to Tuple{AbstractArray{<:Integer}}).

@tbole
Copy link
Contributor Author

tbole commented Jun 27, 2018

There are also a whole slew of optimizations on ReshapedArray that we'd need to move over to SubArray{<:Any,<:Any,<:Any,Tuple{LinearIndices}} (or maybe could be generalized to Tuple{AbstractArray{<:Integer}}).

In fact the better performance of reshaped arrays over cartesian indexing one of the main reason I picked this approach, especially when combined with @inbounds and views.

@StefanKarpinski
Copy link
Member

This now seems like just a feature addition so doesn't need to block the release.

@JeffBezanson JeffBezanson removed this from the 0.7 milestone Jun 28, 2018
@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Jun 28, 2018
@nsajko nsajko added the arrays [a, r, r, a, y, s] label Aug 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation
Projects
None yet
Development

No branches or pull requests

6 participants