Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to Diskarrays 0.4 #247

Merged
merged 5 commits into from
Oct 24, 2024

Conversation

meggart
Copy link
Contributor

@meggart meggart commented Feb 16, 2024

This is a placeholder PR to update the DiskArray interface implementation once DiskArrays 0.4 is released.

@Alexander-Barth
Copy link
Owner

Thanks a lot Fabian for your help!

@rafaqz
Copy link
Contributor

rafaqz commented Mar 3, 2024

Can we re-run CI and see if this passes now?

…sError rather than a generic NCDatasets.NetCDFError
@Alexander-Barth
Copy link
Owner

Alexander-Barth commented Mar 4, 2024

There was a failure when accessing a variable out-of-bounds. Previously test was just checking for an NetCDFError now it is also checking for a BoundsError. The test code is updated here:

332c846

Currently, we have a failing test for an array of strings and strided writes

using Test
using NCDatasets

sz = (4,5)
filename = tempname()

ds = NCDataset(filename,"c")
# define the dimension "lon" and "lat"
ds.dim["lon"] = sz[1]
ds.dim["lat"] = sz[2]

T = String
data = [Char(i+64) * Char(j+64) for i = 1:sz[1], j = 1:sz[2]]
v = defVar(ds,"var-$T",T,("lon","lat"))
v[:,:] = data[:,:] # ok

# stridded write and read
v[1:2:end,1:2:end] = data[1:2:end,1:2:end]
#    @test all(v[1:2:end,1:2:end] .== data[1:2:end,1:2:end])
close(ds)

We do not have a failure for non-strided writes. The data given to NCDatasets.writeblock! seems to contain #undef

("writeblock!", data) = ("writeblock!", ["AA" #undef "AC" #undef "AE"; #undef #undef #undef #undef #undef; "CA" #undef "CC" #undef "CE"])

Do you have any ideas?

@rafaqz
Copy link
Contributor

rafaqz commented Mar 20, 2024

@meggart ?

@asinghvi17
Copy link

Bump, this would be pretty useful to have nowadays since I'm looking at Zarr + raster ecosystem improvements

@Alexander-Barth
Copy link
Owner

I am currently hitting this error with DiskArrays 0.4.4 and NCDatasets 2d0e7cb:

  Test threw exception
  Expression: isempty(ncv[Int[]])
  MethodError: reducing over an empty collection is not allowed; consider supplying `init` to the reducer
  Stacktrace:
    [1] reduce_empty(op::Base.MappingRF{Base.ExtremaMap{typeof(identity)}, typeof(Base._extrema_rf)}, ::Type{Int64})           
      @ Base ./reduce.jl:361
    [2] reduce_empty_iter
      @ ./reduce.jl:384 [inlined]
    [3] mapreduce_empty_iter(f::Function, op::Function, itr::Vector{Int64}, ItrEltype::Base.HasEltype)                                                                         
      @ Base ./reduce.jl:380
    [4] _mapreduce
      @ ./reduce.jl:432 [inlined]
    [5] _mapreduce_dim
      @ ./reducedim.jl:365 [inlined]
    [6] mapreduce
      @ ./reducedim.jl:357 [inlined]
    [7] _extrema
      @ ./reducedim.jl:1015 [inlined]
    [8] _extrema
      @ ./reducedim.jl:1014 [inlined]
    [9] extrema
      @ ./reducedim.jl:1010 [inlined]
   [10] span(v::Vector{Int64})
      @ DiskArrays ~/.julia/packages/DiskArrays/6JA8Z/src/batchgetindex.jl:70
   [11] #is_sparse_index#13
      @ ~/.julia/packages/DiskArrays/6JA8Z/src/batchgetindex.jl:84 [inlined]
   [12] is_sparse_index
      @ ~/.julia/packages/DiskArrays/6JA8Z/src/batchgetindex.jl:83 [inlined]
   [13] need_batch_index
      @ ~/.julia/packages/DiskArrays/6JA8Z/src/diskarray.jl:72 [inlined]
   [14] _need_batch
      @ ~/.julia/packages/DiskArrays/6JA8Z/src/diskarray.jl:57 [inlined]
   [15] need_batch
      @ ~/.julia/packages/DiskArrays/6JA8Z/src/diskarray.jl:54 [inlined]
   [16] getindex_disk!(out::Nothing, a::NCDatasets.Variable{Int64, 1, NCDataset{Nothing, Missing}}, i::Vector{Int64})                              
      @ DiskArrays ~/.julia/packages/DiskArrays/6JA8Z/src/diskarray.jl:249
   [17] getindex_disk
      @ ~/.julia/packages/DiskArrays/6JA8Z/src/diskarray.jl:210 [inlined]
   [18] getindex
      @ ~/.julia/packages/DiskArrays/6JA8Z/src/diskarray.jl:348 [inlined]
   [19] getindex(v::CommonDataModel.CFVariable{Int64, 1, NCDatasets.Variable{Int64, 1, NCDataset{Nothing, Missing}}, CommonDataModel.Attributes{NCDatasets.Variable{Int64, 1, NCDataset{Nothing, Missing}}}, @NamedTuple{fillvalue::Nothing, missing_values::Tuple{}, scale_factor::Nothing, add_offset::Nothing, calendar::Nothing, time_origin::Nothing, time_factor::Nothing, maskingvalue::Missing}}, indexes::Vector{Int64})
      @ CommonDataModel ~/.julia/packages/CommonDataModel/G3moc/src/cfvariable.jl:445
   [20] macro expansion
      @ ~/.julia/juliaup/julia-1.10.4+0.x64.linux.gnu/share/julia/stdlib/v1.10/Test/src/
Test.jl:669 [inlined]                                                                  
   [21] top-level scope
      @ ~/.julia/dev/NCDatasets/test/test_variable.jl:508
ERROR: LoadError: There was an error during testing
in expression starting at /home/abarth/.julia/dev/NCDatasets/test/test_variable.jl:275

Reported also here:
JuliaIO/DiskArrays.jl#186

@meggart
Copy link
Contributor Author

meggart commented Oct 21, 2024

Ok, I tested this again with DiskArrays 0.4.5. It looks like tests are passing now after these changes:

  • Test on lts instead of 1.6. DiskArrays needs at least julia 1.9 these days
  • changing 2 thrown error types from DimensionMismatch to NetCDFError

@meggart meggart changed the title WIP Update to Diskarrays 0.4 Update to Diskarrays 0.4 Oct 21, 2024
@meggart
Copy link
Contributor Author

meggart commented Oct 21, 2024

I just removed the WIP from the PR title. I think all issues with the DiskArrays update should be addressed.

@meggart
Copy link
Contributor Author

meggart commented Oct 21, 2024

I have no idea about the failing appveyor test. Which platform is this testing?

@Alexander-Barth
Copy link
Owner

Alexander-Barth commented Oct 22, 2024

thanks a lot Fabian looking into this! The failing appveyor is on Windows 32-bit (the error also occurs on linux 32-bit). Given the stack-trace, it might be related to Int = Int32 on this platform:

NCDatasets: Error During Test at C:\projects\ncdatasets-jl\test\runtests.jl:10
  Got exception outside of a @test
  LoadError: MethodError: no method matching DiskArrays.DiskIndex(::Tuple{Int64}, ::Tuple{Int64}, ::Tuple{Colon}, ::Tuple{Colon}, ::Tuple{Base.OneTo{Int64}})
  
  Closest candidates are:
    DiskArrays.DiskIndex(!Matched::Tuple{Vararg{Int32, N}}, !Matched::Tuple{Vararg{Int32, M}}, ::A, ::B, ::C) where {N, M, A<:Tuple, B<:Tuple, C<:Tuple}
     @ DiskArrays C:\Users\appveyor\.julia\packages\DiskArrays\xuWUZ\src\diskarray.jl:78
  
  Stacktrace:
    [1] process_index(i::Base.OneTo{Int64}, cs::Tuple{DiskArrays.RegularChunks}, ::DiskArrays.NoBatch{DiskArrays.CanStepRange})
      @ DiskArrays C:\Users\appveyor\.julia\packages\DiskArrays\xuWUZ\src\diskarray.jl:129
    [2] _resolve_indices(cs::Tuple{DiskArrays.RegularChunks}, i::Tuple{Base.OneTo{Int64}}, indices_pre::DiskArrays.DiskIndex{0, 0, Tuple{}, Tuple{}, Tuple{}}, nb::DiskArrays.NoBatch{DiskArrays.CanStepRange})
      @ DiskArrays C:\Users\appveyor\.julia\packages\DiskArrays\xuWUZ\src\diskarray.jl:96
    [3] resolve_indices(a::NCDatasets.Variable{Float64, 1, NCDataset{Nothing, Missing}}, i::Tuple{Base.OneTo{Int64}}, batch_strategy::DiskArrays.NoBatch{DiskArrays.CanStepRange})
      @ DiskArrays C:\Users\appveyor\.julia\packages\DiskArrays\xuWUZ\src\diskarray.jl:45
    [4] setindex_disk_nobatch!(a::NCDatasets.Variable{Float64, 1, NCDataset{Nothing, Missing}}, v::Vector{Float64}, i::Tuple{Base.OneTo{Int64}})
      @ DiskArrays C:\Users\appveyor\.julia\packages\DiskArrays\xuWUZ\src\diskarray.jl:310
    [5] setindex_disk!(a::NCDatasets.Variable{Float64, 1, NCDataset{Nothing, Missing}}, v::Vector{Float64}, i::Base.OneTo{Int64})
      @ DiskArrays C:\Users\appveyor\.julia\packages\DiskArrays\xuWUZ\src\diskarray.jl:349
    [6] setindex!(a::NCDatasets.Variable{Float64, 1, NCDataset{Nothing, Missing}}, v::Vector{Float64}, i::Base.OneTo{Int64})
      @ NCDatasets C:\Users\appveyor\.julia\packages\DiskArrays\xuWUZ\src\diskarray.jl:383
[...]

As far I understand the problem, in the test case the array is indexed with an index of the type Base.OneTo{Int64} (on this 32-bit platform). The length of such a range is Int64:

https://github.com/meggart/DiskArrays.jl/blob/main/src/diskarray.jl#L129

But the type DiskIndex only accepts Int:

https://github.com/meggart/DiskArrays.jl/blob/main/src/diskarray.jl#L77-L83

(Maybe a similar error can also be triggered on 64-bit platforms using non-Int64 indices, but I am not sure)

When I define the following method, then the test passes:

import DiskArrays: DiskIndex
DiskIndex(output_size::NTuple{N,<:Integer},temparray_size::NTuple{M,<:Integer}, 
    output_indices::Tuple,temparray_indices::Tuple,data_indices::Tuple) where {N, M} = 
          DiskIndex(Int.(output_size),Int.(temparray_size),output_indices,temparray_indices,data_indices)

Can this be added to DiskArrays? (I can make a PR :-) )

@meggart
Copy link
Contributor Author

meggart commented Oct 22, 2024

When I define the following method, then the test passes:

import DiskArrays: DiskIndex
DiskIndex(output_size::NTuple{N,<:Integer},temparray_size::NTuple{M,<:Integer},
output_indices::Tuple,temparray_indices::Tuple,data_indices::Tuple) where {N, M} =
DiskIndex(Int.(output_size),Int.(temparray_size),output_indices,temparray_indices,data_indices)

Can this be added to DiskArrays? (I can make a PR :-) )

Yes, this would be great to have in DiskArrays!

@Alexander-Barth
Copy link
Owner

great @meggart ! All tests now also work on 32-bit julia on linux. When you have the time, can you make a new release of DiskArrays, so I can merge this PR?

@meggart
Copy link
Contributor Author

meggart commented Oct 24, 2024

A new version of DiskArrays is tagged, so you can try to rerun CI

@Alexander-Barth
Copy link
Owner

Great, all test do pass!

@Alexander-Barth Alexander-Barth merged commit 857c394 into Alexander-Barth:master Oct 24, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants