XArrayInterface improvements: dimension units and labels #2431

Merged
merged 4 commits into from
Mar 15, 2018
78 changes: 77 additions & 1 deletion examples/user_guide/08-Gridded_Datasets.ipynb
@@ -292,6 +292,69 @@
"heatmap + heatmap.table()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Working with xarray data types\n",
"As demonstrated previously, `Dataset` comes with support for the `xarray` library, which offers a powerful way to work with multi-dimensional, regularly spaced data. In this example, we'll load an example dataset, turn it into a HoloViews `Dataset` and visualize it. First, let's have a look at the xarray dataset's contents:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"xr_ds = xr.tutorial.load_dataset(\"air_temperature\")\n",
"xr_ds"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is trivial to turn this xarray Dataset into a Holoviews `Dataset` (the same also works for DataArray):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hv_ds = hv.Dataset(xr_ds)[:, :, \"2013-01-01\"]\n",
"print(hv_ds)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We have used the usual slice notation in order to select one single day in the rather large dataset. Finally, let's visualize the dataset by converting it to a `HoloMap` of `Images` using the `to()` method. We need to specify which of the dataset's key dimensions will be consumed by the images (in this case \"lat\" and \"lon\"), where the remaing key dimensions will be associated with the HoloMap (here: \"time\"). We'll use the slice notation again to clip the longitude."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%opts Image [colorbar=True]\n",
"%%output size=200\n",
"hv_ds.to(hv.Image, kdims=[\"lon\", \"lat\"], dynamic=False)[:, 220:320, :]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, we have explicitly specified the default behaviour `dynamic=False`, which returns a HoloMap. Note, that this approach immediately converts all available data to images, which will take up a lot of RAM for large datasets. For these situations, use `dynamic=True` to generate a [DynamicMap](./06-Live_Data.ipynb) instead. Additionally, [xarray features dask support](http://xarray.pydata.org/en/stable/dask.html), which is helpful when dealing with large amounts of data.\n",
"\n",
"Additional examples of visualizing xarrays in the context of geographical data can be found in the GeoViews documentation: [Gridded Datasets I](http://geo.holoviews.org/Gridded_Datasets_I.html) and\n",
"[Gridded Datasets II](http://geo.holoviews.org/Gridded_Datasets_II.html). These guides also contain useful information on the interaction between xarray data structures and HoloViews Datasets in general."
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -399,9 +462,22 @@
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"pygments_lexer": "ipython3"
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
12 changes: 11 additions & 1 deletion holoviews/core/data/xarray.py
@@ -80,6 +80,7 @@ def init(cls, eltype, data, kdims, vdims):
cls)
vdims = [vdim]
data = data.to_dataset(name=vdim.name)

if not isinstance(data, xr.Dataset):
if kdims is None:
kdims = kdim_param.default
@@ -120,8 +121,9 @@ def init(cls, eltype, data, kdims, vdims):
for c in data.coords:
if c not in kdims and set(data[c].dims) == set(virtual_dims):
kdims.append(c)
vdims = [vd if isinstance(vd, Dimension) else Dimension(vd) for vd in vdims]
kdims = [kd if isinstance(kd, Dimension) else Dimension(kd) for kd in kdims]

kdims = [d if isinstance(d, Dimension) else Dimension(d) for d in kdims]
not_found = []
for d in kdims:
if not any(d.name == k or (isinstance(v, xr.DataArray) and d.name in v.dims)
@@ -133,6 +135,14 @@ def init(cls, eltype, data, kdims, vdims):
raise DataError("xarray Dataset must define coordinates "
"for all defined kdims, %s coordinates not found."
% not_found, cls)

        # retrieve units and labels from Dataset:
        for d in kdims + vdims:
            d.unit = data[d.name].attrs.get('units')
            label = data[d.name].attrs.get('long_name')
            if label is not None:
                d.label = label

        return data, {'kdims': kdims, 'vdims': vdims}, {}
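
Reviewer note: the metadata loop added in this hunk can be sketched outside HoloViews. The following is a minimal illustration, not the library's code. `Dim` is a hypothetical stand-in for `Dimension` (assuming the label defaults to the name), and `apply_attrs` and the plain `attrs_by_name` dict, which stands in for `data[d.name].attrs`, are names invented here. It assumes, as the diff suggests, that `units` is always assigned while `long_name` only replaces the label when it is present.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Dim:
    """Hypothetical stand-in for a HoloViews Dimension (name, label, unit only)."""
    name: str
    label: Optional[str] = None
    unit: Optional[str] = None

    def __post_init__(self):
        # Assumption: like Dimension, the label falls back to the name.
        if self.label is None:
            self.label = self.name


def apply_attrs(dims: List[Dim], attrs_by_name: Dict[str, dict]) -> List[Dim]:
    """Mirror the loop in the diff: copy 'units' and 'long_name' onto each dim."""
    for d in dims:
        attrs = attrs_by_name[d.name]   # stands in for data[d.name].attrs
        d.unit = attrs.get('units')     # assigned unconditionally, may be None
        label = attrs.get('long_name')
        if label is not None:           # label only replaced when present
            d.label = label
    return dims


# Same metadata as in test_xarray_dataset_names_and_units
attrs = {
    'x_dim': {'units': 'x_unit'},
    'y_dim': {'long_name': 'y axis long name'},
    'data_name': {'long_name': 'data long name', 'units': 'array_unit'},
}
x, y, z = apply_attrs([Dim('x_dim'), Dim('y_dim'), Dim('data_name')], attrs)
print(x, y, z, sep='\n')
```

One consequence visible in the sketch: because the unit is assigned unconditionally, a unit set on a dimension beforehand is replaced by `None` when the data carries no `units` attribute. If the real loop behaves the same way, that asymmetry with the label handling may be worth a comment or a guard.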


41 changes: 41 additions & 0 deletions tests/core/data/testdataset.py
@@ -1572,6 +1572,47 @@ def test_xarray_dataset_with_scalar_dim_canonicalize(self):
expected = np.array([[0, 1], [2, 3], [4, 5]])
self.assertEqual(canonical, expected)

    def test_xarray_dataset_names_and_units(self):
        import xarray as xr
        xs = [0.1, 0.2, 0.3]
        ys = [0, 1]
        zs = np.array([[0, 1], [2, 3], [4, 5]])
        da = xr.DataArray(zs, coords=[('x_dim', xs), ('y_dim', ys)], name="data_name", dims=['y_dim', 'x_dim'])
        da.attrs['long_name'] = "data long name"
        da.attrs['units'] = "array_unit"
        da.x_dim.attrs['units'] = "x_unit"
        da.y_dim.attrs['long_name'] = "y axis long name"
        dataset = Dataset(da)
        self.assertEqual(dataset.get_dimension("x_dim"), Dimension("x_dim", unit="x_unit"))
        self.assertEqual(dataset.get_dimension("y_dim"), Dimension("y_dim", label="y axis long name"))
        self.assertEqual(dataset.get_dimension("data_name"),
                         Dimension("data_name", label="data long name", unit="array_unit"))

    def test_xarray_dataset_dataarray_vs_dataset(self):
        import xarray as xr
        xs = [0.1, 0.2, 0.3]
        ys = [0, 1]
        zs = np.array([[0, 1], [2, 3], [4, 5]])
        da = xr.DataArray(zs, coords=[('x_dim', xs), ('y_dim', ys)], name="data_name", dims=['y_dim', 'x_dim'])
        da.attrs['long_name'] = "data long name"
        da.attrs['units'] = "array_unit"
        da.x_dim.attrs['units'] = "x_unit"
        da.y_dim.attrs['long_name'] = "y axis long name"
        ds = da.to_dataset()
        dataset_from_da = Dataset(da)
        dataset_from_ds = Dataset(ds)
        self.assertEqual(dataset_from_da, dataset_from_ds)
        # same with reversed names:
        da_rev = xr.DataArray(zs, coords=[('x_dim', xs), ('y_dim', ys)], name="data_name", dims=['x_dim', 'y_dim'])
        da_rev.attrs['long_name'] = "data long name"
        da_rev.attrs['units'] = "array_unit"
        da_rev.x_dim.attrs['units'] = "x_unit"
        da_rev.y_dim.attrs['long_name'] = "y axis long name"
        ds_rev = da_rev.to_dataset()
        dataset_from_da_rev = Dataset(da_rev)
        dataset_from_ds_rev = Dataset(ds_rev)
        self.assertEqual(dataset_from_da_rev, dataset_from_ds_rev)

def test_dataset_array_init_hm(self):
"Tests support for arrays (homogeneous)"
raise SkipTest("Not supported")