Data without embedded projection throws exception when reprojected on load() #224

Closed
omad opened this issue May 1, 2017 · 3 comments


@omad
Member

omad commented May 1, 2017

When I run

from datacube import Datacube

dc = Datacube()

query = {
    'crs': 'EPSG:3577',
    'time': ('1987-10-01', '1990-10-01'),
    'x': (349388.9787330463, 358497.9246628304),
    'y': (-2379960.5883129314, -2375926.544118764),
    'output_crs': 'EPSG:3577',
    'resolution': (-25, 25),
    'resampling': 'cubic',
    'product': 'bom_rainfall_grids',
}

dc.load(**query)

Expect
I expect to receive a reprojected DataArray.

Instead
I get the following error:

Traceback (most recent call last):
  File "/g/data/v10/public/modules/agdc-py3-env/20170327/envs/agdc/lib/python3.5/site-packages/rasterio/windows.py", line 279, in evaluate
    r, c = window
ValueError: too many values to unpack (expected 2)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/g/data/v10/public/modules/agdc-py3-env/20170327/envs/agdc/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-8-991b42ee536c>", line 15, in <module>
    dc.load(**query)
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/api/core.py", line 322, in load
    fuse_func=fuse_func, dask_chunks=dask_chunks)
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/api/core.py", line 491, in load_data
    geobox, measurements, data_func)
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/api/core.py", line 428, in create_storage
    data = data_func(measurement)
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/api/core.py", line 484, in data_func
    skip_broken_datasets=skip_broken_datasets)
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/api/core.py", line 550, in _fuse_measurement
    skip_broken_datasets=skip_broken_datasets)
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/storage/storage.py", line 199, in reproject_and_fuse
    read_from_source(sources[0], destination, dst_transform, dst_nodata, dst_projection, resampling)
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/storage/storage.py", line 155, in read_from_source
    NUM_THREADS=OPTIONS['reproject_threads'])
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/storage/storage.py", line 333, in reproject
    source = self.read(self.source)  # TODO: read only the part the we care about
  File "/g/data/v10/public/modules/agdc-py3/1.3.0/lib/python3.5/site-packages/datacube/storage/storage.py", line 330, in read
    return self.source.ds.read(indexes=self.source.bidx, window=window, out_shape=out_shape)
  File "rasterio/_io.pyx", line 207, in rasterio._io.DatasetReaderBase.read (rasterio/_io.c:5103)
  File "/g/data/v10/public/modules/agdc-py3-env/20170327/envs/agdc/lib/python3.5/site-packages/rasterio/windows.py", line 283, in evaluate
    raise ValueError("invalid window structure; expecting ints"
ValueError: invalid window structure; expecting ints((row_start, row_stop), (col_start, col_stop))

It looks like somewhere along the way the read window is being replaced with a rasterio.Band object.
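
To illustrate why a Band in place of the window would produce exactly this pair of errors, here is a minimal sketch. It is not rasterio's code: `check_window` is a hypothetical validator written for this example, and `Band` is a local stand-in that assumes rasterio.Band is a four-field namedtuple.

```python
from collections import namedtuple

# Local stand-in for rasterio.Band (assumed to be a 4-field namedtuple).
Band = namedtuple("Band", ["ds", "bidx", "dtype", "shape"])


def check_window(window):
    """Hypothetical validator: accept only ((row_start, row_stop), (col_start, col_stop))."""
    try:
        # A 4-field Band fails on this line: too many values to unpack (expected 2).
        rows, cols = window
        (row_start, row_stop), (col_start, col_stop) = rows, cols
    except (TypeError, ValueError):
        # Raising inside the except block chains the two errors,
        # as in the traceback above.
        raise ValueError(
            "invalid window structure; expecting ints"
            "((row_start, row_stop), (col_start, col_stop))"
        )
    return (row_start, row_stop), (col_start, col_stop)


# A well-formed window passes through unchanged.
print(check_window(((0, 10), (0, 20))))

# A Band passed where the window belongs is rejected.
try:
    check_window(Band(ds=None, bidx=1, dtype="float32", shape=(10, 20)))
except ValueError as err:
    print("rejected:", err)
```

If that assumption holds, the first ValueError comes from the tuple unpacking and the second from the validator's own error handling, which matches the "During handling of the above exception" chain.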

@simonaoliver
Member

simonaoliver commented May 1, 2017

That's one we don't have a test for. This seems to work though:

albers_grid = dc.load(product='dsm1sv10', x=(149.07, 149.17), y=(-35.25, -35.35),
                       output_crs='EPSG:3577', resolution=(-25,25))

@omad
Member Author

omad commented May 1, 2017

Thanks for that test Simon. Glad to know it's not as widespread a problem as I first suspected.

Also, my bad for not including version/runtime information. I can confirm that it fails with the latest stable release and module available on the NCI.

$ module load agdc-py3-prod
$ datacube --version
Data Cube, version 1.3.2
$ which datacube
/g/data/v10/public/modules/agdc-py3/1.3.2/bin/datacube
>>> import datacube
>>> datacube.__version__
'1.3.2'
>>> datacube.__path__
['/g/data/v10/public/modules/agdc-py3/1.3.2/lib/python3.6/site-packages/datacube']

@andrewdhicks
Contributor

It looks like this only occurs when reading formats for which rasterio can't read the projection, so the CRS has to be taken from the product definition instead.
In this case, the format is a NetCDF file without CF-compliant geospatial metadata.
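
A rough sketch of the fallback described above, for anyone following along. This is not datacube's implementation: `resolve_crs` and its parameters are hypothetical names, and the CRS strings are placeholder values.

```python
def resolve_crs(embedded_crs, product_crs):
    """Prefer the CRS stored in the file; otherwise use the product definition's.

    Hypothetical helper for illustration only, not part of datacube.
    """
    if embedded_crs:
        return embedded_crs
    if product_crs:
        return product_crs
    raise ValueError("no CRS in the file and none in the product definition")


# File with an embedded projection: the embedded value wins.
print(resolve_crs("EPSG:3577", "EPSG:4326"))

# File without geospatial metadata: fall back to the product definition.
print(resolve_crs(None, "EPSG:4326"))
```

The failing case in this issue corresponds to the second call, where nothing usable is embedded in the file.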

@jeremyh jeremyh changed the title from "Unable to reproject data on load()" to "Data without embedded projection throws exception when reprojected on load()" May 2, 2017
andrewdhicks added a commit that referenced this issue May 12, 2017
@andrewdhicks andrewdhicks mentioned this issue May 12, 2017
@omad omad closed this as completed in #231 May 12, 2017
omad pushed a commit that referenced this issue May 12, 2017
omad pushed a commit that referenced this issue May 12, 2017