Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keyword indexing fails for 0-dimensional DimArray #423

Closed
sethaxen opened this issue Nov 18, 2022 · 8 comments · Fixed by #656
Closed

keyword indexing fails for 0-dimensional DimArray #423

sethaxen opened this issue Nov 18, 2022 · 8 comments · Fixed by #656

Comments

@sethaxen
Copy link
Collaborator

julia> using DimensionalData

julia> x = DimArray(fill(3), ());

julia> y = DimArray(randn(2, 3), (Dim{:a}(1:2), Dim{:b}(1:3)));

julia> y[c=1]  # for a >0-dim array, indexing a missing dim just raises a warning
┌ Warning: (Dim{:c},) dims were not found in object
└ @ DimensionalData.Dimensions ~/.julia/packages/DimensionalData/K9D4P/src/Dimensions/primitives.jl:645
2×3 DimArray{Float64,2} with dimensions: 
  Dim{:a} Sampled{Int64} 1:2 ForwardOrdered Regular Points,
  Dim{:b} Sampled{Int64} 1:3 ForwardOrdered Regular Points
     1          2         3
 1   0.295756   1.10136   0.855181
 2  -0.963257  -0.4447   -0.95428

julia> x[c=1] # but for 0-dim we StackOverflow
ERROR: StackOverflowError:
Stacktrace:
 [1] dims2indices(x::Tuple{}, I::Tuple{Dim{:c, Int64}}) (repeats 79984 times)
   @ DimensionalData.Dimensions ~/.julia/packages/DimensionalData/K9D4P/src/Dimensions/indexing.jl:34
@sethaxen
Copy link
Collaborator Author

I thought what was missing was the following method

julia> @inline Dimensions.dims2indices(::Tuple{}, I) = ()

But this has an unintended result. Indexing by a missing keyword then behaves like x[]:

julia> x[c=1]
3

julia> ds = DimStack((; x, y));

julia> ds[c=1]
┌ Warning: (Dim{:c},) dims were not found in object
└ @ DimensionalData.Dimensions ~/.julia/packages/DimensionalData/K9D4P/src/Dimensions/primitives.jl:645
(x = 3, y = [-1.5090420829521751 -0.629637266507813 -1.1542533262114874; 1.78624724297272 -1.341032743849202 0.2629446089245767])

I would expect x[c=1] to return x as a 0-dimensional array with a warning, and similarly I would expect indexing with ds to return the dataset unchanged.

@rafaqz
Copy link
Owner

rafaqz commented Nov 18, 2022

It is pretty weird I agree. But I would also have defined your missing method, expecting the result you got.

We fill in missing dimensions with : in dimensional indexing so we can be concise. But here it means calling getindex(x) after splatting in the empty tuple.

So I guess you are asking to special case this to not call getindex at all for zero dim arrays? Maybe only with DimStack?

That also feels pretty wierd to me, getindex call on a zero dim array always returns a scalar (unless you add extra colons). But honestly I don't know what we should do here and if it should be different to Base, thats probably why the bug is still here.

view actually does what you want here by the way, as Base doesn't unwrap with view(x). That could be a way to deal with this.

Edit: another way to explain why I expect your result is that getindex always creates a new array for all stack layers, and it still gets called on each layer even when there are no matching dims - just with colons for all dimensions.

This keeps to to our mental model for arrays - that altering an object after getindex does not change the original.

Maybe what we want for DimStack is the exact opposite of Base.dotview: an indexing method that always assigns a new array, but never unwraps zero dims - it copies them instead.

@sethaxen
Copy link
Collaborator Author

Perhaps I have the wrong mental model of DimStack, but I would expect that getindex with just keyword arguments would always return a DimStack. It's strange to me that if I put a 0-dimensional array in the DimStack, then I suddenly don't get a DimStack output anymore when I use keyword arguments to getindex. The result isn't even a valid Table.

view actually does what you want here by the way, as Base doesn't unwrap with view(x). That could be a way to deal with this.

That's good to know and could let me work around this. I should probably be viewing more than getindexing anyways.

@rafaqz
Copy link
Owner

rafaqz commented Nov 19, 2022

It was hard to work out the behaviour when you index arrays with different dimensions all at the same time - because getindex returns a scalar if you index with all Int, but an array with Colon or a range. So if we index with Int for all the dimensions of the smallest array we end up with a mix of scalars and arrays.

But from this discussion I think using regular getindex with DimStack was actually a mistake - we should instead define a noscalargetindex method I described in my last comment - that always returns an array.

Then we always have a DimStack of DimArray that can become a DimTable. Even when all of the arrays have no dimensions after indexing.

Would that be better?

@sethaxen
Copy link
Collaborator Author

But from this discussion I think using regular getindex with DimStack was actually a mistake - we should instead define a noscalargetindex method I described in my last comment - that always returns an array.

If such a method existed that would indeed be useful. But IIUC, this could be implemented as

noscalargetindex(s::AbstractDimStack; kwargs...) = copy(view(s; kwargs...))

so maybe it's better to just document that users who want this behavior should use this approach instead of having a special method?

@sethaxen
Copy link
Collaborator Author

Ah wait, it seems view is not guaranteed to return an AbstractDimStack?

julia> x = DimArray(randn(2), :a)
2-element DimArray{Float64,1} with dimensions: Dim{:a}
 1  -0.753079
 2  -0.502682

julia> ds = DimStack((; x))
DimStack with dimensions: Dim{:a}
and 1 layer:
  :x Float64 dims: Dim{:a} (2)


julia> view(ds; a=1)
(x = fill(-0.753079230064672),)

@rafaqz
Copy link
Owner

rafaqz commented Nov 19, 2022

That's a problem here rather than in Base so easy fix. Maybe view has the same check as getindex to not rebuild, that we just need to speialise.

@sethaxen
Copy link
Collaborator Author

That's a problem here rather than in Base so easy fix. Maybe view has the same check as getindex to not rebuild, that we just need to speialise.

Ah, it seems to be due to this inconsistency:

julia> x = DimArray(randn(2), :a);

julia> y = DimArray(randn(2, 3), (:b, :c));

julia> view(x; a=1) # a `DimArray` is not returned
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
-1.1244365106893395

julia> view(x, 1)  # what the above dispatches to
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
-1.1244365106893395

julia> z = DimArray(fill(randn()), ()); # also a problem for 0-dim arrays

julia> view(z) # fine with no indices
0-dimensional DimArray{Float64,0}: 

0.477376

julia> view(z, 1, 1)  # or 2
0-dimensional DimArray{Float64,0}: 

0.477376

julia> view(z, 1) # but not a DimArray with 1 index
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
0.4773758383436243

julia> view(y; b=1, c=1)  # if we index all dimensions, a `DimArray` is still returned
0-dimensional DimArray{Float64,0}: and reference dimensions: Dim{:b}, Dim{:c}

0.364411

julia> view(y, 1, 1)  # what the above dispatches to
0-dimensional DimArray{Float64,0}: and reference dimensions: Dim{:b}, Dim{:c}

0.364411

The issues seems to be this overload,

@propagate_inbounds Base.$f(A::AbstractDimArray, i::Integer) = Base.$f(parent(A), i)

For more than one integer index, there's a view-specific overload:

if f == :view
@eval @propagate_inbounds function Base.$f(A::AbstractDimArray, i1::StandardIndices, i2::StandardIndices, I::StandardIndices...)
I = to_indices(A, (i1, i2, I...))
x = Base.$f(parent(A), I...)
rebuildsliced(Base.$f, A, x, I)
end
else

I can open a PR fixing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants