ENH: Add xarray entrypoint #197

Closed
snowman2 opened this issue Dec 23, 2020 · 9 comments · Fixed by #281
Labels
proposal Idea for a new feature.

Comments

@snowman2
Member

Relevant issues:

@alexamici
Contributor

alexamici commented Apr 5, 2021

I'm willing to contribute a PR with the plugin, but I'd really prefer xr.open_dataset to map raster bands to data_vars not to a band coordinate, at least by default.

@snowman2 would you evaluate such a PR?

@snowman2
Member Author

snowman2 commented Apr 5, 2021

> I'm willing to contribute a PR with the plugin, but I'd really prefer xr.open_dataset to map raster bands to data_vars not to a band coordinate, at least by default.

I would check whether rioxarray.open_rasterio returned a DataArray or a Dataset, as it can return either one depending on the input data: https://corteva.github.io/rioxarray/stable/examples/convert_to_raster.html
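A minimal sketch of that check, using in-memory data so it runs without a raster file. The helper name `as_dataset` and its default variable name are hypothetical and are not taken from rioxarray or from the PR:

```python
import numpy as np
import xarray as xr


def as_dataset(raster, name="band_data"):
    """Return `raster` as an xr.Dataset (hypothetical helper).

    `raster` stands in for whatever rioxarray.open_rasterio returned.
    """
    if isinstance(raster, xr.Dataset):
        # Already a Dataset: pass it through unchanged.
        return raster
    if isinstance(raster, xr.DataArray):
        # Wrap the single array in a Dataset under the given variable name.
        return raster.to_dataset(name=name)
    raise TypeError(f"unsupported type: {type(raster).__name__}")


# Stand-in for a 3-band raster opened as a DataArray.
array = xr.DataArray(
    np.zeros((3, 2, 2), dtype="float32"),
    dims=("band", "y", "x"),
    coords={"band": [1, 2, 3]},
)
wrapped = as_dataset(array)
unchanged = as_dataset(wrapped)
```

Here `wrapped` holds the array as a single data variable, and a Dataset input is returned as-is.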

@snowman2
Member Author

snowman2 commented Apr 5, 2021

> I'd really prefer xr.open_dataset to map raster bands to data_vars not to a band coordinate, at least by default.

I have been on the fence about this, mainly for the sake of consistency for the user. It would be good to get opinions from others who might use this interface.

As an alternative, when converting to a dataset, you could set the name of the DataArray to band_data. This makes selecting the band consistent with the current interface of open_rasterio.

For example:

>>> import xarray as xr
>>> xds = xr.open_dataset("myfile.jp2", engine="gdal")
>>> xds.band_data.sel(band=1)

@snowman2
Member Author

snowman2 commented Apr 5, 2021

> I'd really prefer xr.open_dataset to map raster bands to data_vars not to a band coordinate, at least by default.

Another reason for doing it the way you suggest is that it allows adding band-specific metadata to the attrs of each data_var.
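To illustrate the idea, here is a rough sketch with in-memory data. The variable naming scheme (`band_1`, `band_2`, ...) and the `long_name` values are assumptions made up for the example, not an agreed convention:

```python
import numpy as np
import xarray as xr

# Stand-in for a 3-band raster with a `band` dimension.
band_data = xr.DataArray(
    np.arange(12, dtype="float32").reshape(3, 2, 2),
    dims=("band", "y", "x"),
    coords={"band": [1, 2, 3]},
    name="band_data",
)

# Hypothetical per-band metadata.
band_names = {1: "red", 2: "green", 3: "blue"}

# One data variable per band, each carrying its own attrs.
by_band = xr.Dataset(
    {
        f"band_{number}": band_data.sel(band=number, drop=True).assign_attrs(
            long_name=band_names[number]
        )
        for number in band_data["band"].values.tolist()
    }
)
```

Each variable in `by_band` is a 2D `(y, x)` array, so metadata such as `long_name` can differ from band to band.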

@snowman2
Member Author

snowman2 commented Apr 5, 2021

Thanks for getting started on this @alexamici 👍

@snowman2
Member Author

snowman2 commented Apr 5, 2021

> I'd really prefer xr.open_dataset to map raster bands to data_vars not to a band coordinate, at least by default.

After giving it some more thought, I don't think we should go this route, because GDAL also supports netCDF, HDF, GRIB, etc. In those scenarios, each data variable still needs a mechanism to select the band (which could be named time). For consistency across the board, I think we should go with something more like: #197 (comment)
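A small sketch of the consistency argument, built from in-memory data. The variable names `temperature` and `pressure` are made up for illustration; only `band_data` comes from this thread:

```python
import numpy as np
import xarray as xr

# GeoTIFF-like case: one variable with a `band` dimension.
geotiff_like = xr.Dataset(
    {"band_data": (("band", "y", "x"), np.zeros((3, 2, 2)))},
    coords={"band": [1, 2, 3]},
)

# netCDF-like case: several variables, each with its own leading
# dimension (here named `time`) that plays the role of the band.
netcdf_like = xr.Dataset(
    {
        "temperature": (("time", "y", "x"), np.ones((4, 2, 2))),
        "pressure": (("time", "y", "x"), np.full((4, 2, 2), 2.0)),
    },
    coords={"time": [0, 1, 2, 3]},
)

# In both cases a 2D slice is selected the same way: by label
# along the leading dimension of a data variable.
first_band = geotiff_like["band_data"].sel(band=1)
first_step = netcdf_like["temperature"].sel(time=0)
```

If bands were mapped to data_vars instead, the first case would need a different access pattern from the second.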

@alexamici
Contributor

I implemented the check for xr.Dataset and reverted the mapping of bands to data_vars (but I still would prefer it).

Current implementation returns:

>>> ds = xr.open_dataset("myRGB.tif")
>>> ds
<xarray.Dataset>
Dimensions:      (band: 3, x: 19087, y: 3932)
Coordinates:
  * band         (band) int64 1 2 3
  * y            (y) float64 ...
  * x            (x) float64 ...
Data variables:
    spatial_ref  int64 ...
    band_data    (band, y, x) float32 ...

>>> ds.band_data
<xarray.DataArray 'band_data' (band: 3, y: 3932, x: 19087)>
[225150252 values with dtype=float32]
Coordinates:
  * band     (band) int64 1 2 3
  * y        (y) float64 ...
  * x        (x) float64 ...
Attributes:
    scale_factor:  1.0
    add_offset:    0.0
    grid_mapping:  spatial_ref

@alexamici
Contributor

I think opening bands as variables or as a dimension needs to be an option.

For example, with bands as variables, I can easily add xarray mask_and_scale support independently for every variable. This is a use case I'm interested in.
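A rough sketch of what that could look like, assuming each band is already a separate variable that carries CF-style `scale_factor`, `add_offset`, and `_FillValue` attrs. The data and attribute values are made up, and `xr.decode_cf` is used here only as one possible way to apply them:

```python
import numpy as np
import xarray as xr

# Two "bands" stored as separate variables, each with its own
# packing attributes (values are invented for the example).
raw = xr.Dataset(
    {
        "band_1": (
            ("y", "x"),
            np.array([[100, 200], [-9999, 400]], dtype="int16"),
            {"scale_factor": 0.1, "_FillValue": np.int16(-9999)},
        ),
        "band_2": (
            ("y", "x"),
            np.array([[1, 2], [3, 4]], dtype="int16"),
            {"scale_factor": 2.0, "add_offset": 10.0},
        ),
    }
)

# decode_cf applies masking and scaling per variable, using the
# attributes attached to each one.
decoded = xr.decode_cf(raw, mask_and_scale=True)
```

With a single `band_data` variable there is only one set of attrs, so per-band scale and fill values cannot be expressed this way.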

@snowman2
Member Author

snowman2 commented Apr 5, 2021

> For example using bands as variables I can easily add xarray mask_and_scale support independently for every variable. This is a use case I'm interested in.

It sounds like rioxarray.open_rasterio would need to be modified to support this in the backend entrypoint. Would you mind opening a new issue to discuss your use case further?


2 participants