Add set_xindex and drop_indexes methods #6971

Merged
merged 29 commits into from
Sep 28, 2022
Changes from 25 commits
Commits
3f6f637
temporary API to set custom indexes
benbovy Jul 16, 2022
bf30d54
add the temporary index API to DataArray
keewis Jul 16, 2022
9de9c46
add options argument to Index.from_variables()
benbovy Jul 17, 2022
aa403a4
fix mypy
benbovy Jul 17, 2022
210a59a
remove temporary API warning
benbovy Aug 31, 2022
d8c3985
add the Index class in Xarray's root namespace
benbovy Aug 31, 2022
c4afabf
improve set_xindex docstrings and add to api.rst
benbovy Aug 31, 2022
fe723ce
remove temp comments
benbovy Aug 31, 2022
a48c853
special case for pandas multi-index dim coord
benbovy Aug 31, 2022
01de6bd
add tests for set_xindex
benbovy Aug 31, 2022
201bd05
error message tweaks
benbovy Aug 31, 2022
41c896f
set_xindex with 1 coord: avoid reodering coords
benbovy Aug 31, 2022
1ec5ca6
mypy fixes
benbovy Aug 31, 2022
a6caa7a
add Dataset and DataArray drop_indexes methods
benbovy Aug 31, 2022
bb07d5a
improve assert_no_index_corrupted error msg
benbovy Aug 31, 2022
ec2f8fc
drop_indexes: add tests
benbovy Aug 31, 2022
f9601b9
add drop_indexes to api.rst
benbovy Aug 31, 2022
1a555bc
improve docstrings of legacy methods
benbovy Aug 31, 2022
0b7d582
add what's new entry
benbovy Aug 31, 2022
3ab0bc9
try using correct typing w/o mypy complaining
benbovy Sep 1, 2022
9e75f95
make index_cls arg optional
benbovy Sep 7, 2022
00c2711
docstrings fixes and tweaks
benbovy Sep 23, 2022
cb67612
make Index.from_variables options arg keyword only
benbovy Sep 23, 2022
af67168
Merge branch 'main' into add-set-xindex-and-drop-indexes
benbovy Sep 23, 2022
2cd0aa8
improve set_xindex invalid coordinates error msg
benbovy Sep 23, 2022
61d6e28
add xarray.indexes namespace
benbovy Sep 27, 2022
ec08d73
Merge branch 'main' into add-set-xindex-and-drop-indexes
benbovy Sep 27, 2022
20dbf5a
Merge branch 'main' into add-set-xindex-and-drop-indexes
benbovy Sep 27, 2022
b598447
type tweaks
benbovy Sep 27, 2022
5 changes: 5 additions & 0 deletions doc/api.rst
@@ -107,6 +107,7 @@ Dataset contents
Dataset.swap_dims
Dataset.expand_dims
Dataset.drop_vars
Dataset.drop_indexes
Dataset.drop_duplicates
Dataset.drop_dims
Dataset.set_coords
@@ -146,6 +147,7 @@ Indexing
Dataset.reindex_like
Dataset.set_index
Dataset.reset_index
Dataset.set_xindex
Dataset.reorder_levels
Dataset.query

@@ -298,6 +300,7 @@ DataArray contents
DataArray.swap_dims
DataArray.expand_dims
DataArray.drop_vars
DataArray.drop_indexes
DataArray.drop_duplicates
DataArray.reset_coords
DataArray.copy
@@ -330,6 +333,7 @@ Indexing
DataArray.reindex_like
DataArray.set_index
DataArray.reset_index
DataArray.set_xindex
DataArray.reorder_levels
DataArray.query

@@ -1080,6 +1084,7 @@ Advanced API
Variable
IndexVariable
as_variable
Index
Context
register_dataset_accessor
register_dataarray_accessor
5 changes: 5 additions & 0 deletions doc/whats-new.rst
@@ -21,6 +21,11 @@ v2022.07.0 (unreleased)

New Features
~~~~~~~~~~~~

- Add :py:meth:`Dataset.set_xindex` and :py:meth:`Dataset.drop_indexes` and
their DataArray counterparts for setting and dropping pandas or custom indexes
given a set of arbitrary coordinates. (:pull:`6971`)
By `Benoît Bovy <https://github.com/benbovy>`_ and `Justus Magin <https://github.com/keewis>`_.
- Enable taking the mean of dask-backed :py:class:`cftime.datetime` arrays
(:pull:`6556`, :pull:`6940`). By `Deepak Cherian
<https://github.com/dcherian>`_ and `Spencer Clark
2 changes: 2 additions & 0 deletions xarray/__init__.py
@@ -30,6 +30,7 @@
from .core.dataarray import DataArray
from .core.dataset import Dataset
from .core.extensions import register_dataarray_accessor, register_dataset_accessor
from .core.indexes import Index
from .core.merge import Context, MergeError, merge
from .core.options import get_options, set_options
from .core.parallel import map_blocks
@@ -99,6 +100,7 @@
"Coordinate",
"DataArray",
"Dataset",
"Index",
"IndexVariable",
"Variable",
# Exceptions
68 changes: 68 additions & 0 deletions xarray/core/dataarray.py
@@ -2170,6 +2170,11 @@ def set_index(
"""Set DataArray (multi-)indexes using one or more existing
coordinates.

This legacy method is limited to pandas (multi-)indexes and
1-dimensional "dimension" coordinates. See
:py:meth:`~DataArray.set_xindex` for setting a pandas or a custom
Xarray-compatible index from one or more arbitrary coordinates.

Parameters
----------
indexes : {dim: index, ...}
@@ -2214,6 +2219,7 @@ def set_index(
See Also
--------
DataArray.reset_index
DataArray.set_xindex
"""
ds = self._to_temp_dataset().set_index(indexes, append=append, **indexes_kwargs)
return self._from_temp_dataset(ds)
@@ -2227,6 +2233,12 @@ def reset_index(
) -> DataArray:
"""Reset the specified index(es) or multi-index level(s).

This legacy method is specific to pandas (multi-)indexes and
1-dimensional "dimension" coordinates. See the more generic
:py:meth:`~DataArray.drop_indexes` and :py:meth:`~DataArray.set_xindex`
methods to respectively drop and set pandas or custom indexes for
arbitrary coordinates.

Parameters
----------
dims_or_levels : Hashable or sequence of Hashable
@@ -2245,10 +2257,41 @@ def reset_index(
See Also
--------
DataArray.set_index
DataArray.set_xindex
DataArray.drop_indexes
"""
ds = self._to_temp_dataset().reset_index(dims_or_levels, drop=drop)
return self._from_temp_dataset(ds)

def set_xindex(
self: T_DataArray,
coord_names: Hashable | Sequence[Hashable],
index_cls: type[Index] | None = None,
**options,
) -> T_DataArray:
"""Set a new, Xarray-compatible index from one or more existing
coordinate(s).

Parameters
----------
coord_names : hashable or sequence of hashable
Name(s) of the coordinate(s) used to build the index.
If several names are given, their order matters.
index_cls : subclass of :class:`~xarray.Index`, optional
The type of index to create. By default, try setting
a pandas (multi-)index from the supplied coordinates.
**options
Options passed to the index constructor.

Returns
-------
obj : DataArray
Another dataarray, with this dataarray's data and with a new index.

"""
ds = self._to_temp_dataset().set_xindex(coord_names, index_cls, **options)
return self._from_temp_dataset(ds)
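
The `set_xindex` docstring above can be illustrated with a minimal usage sketch. It assumes an xarray release that already includes this pull request; the data and the `station` coordinate name are hypothetical and chosen only for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical example data: three values along "x", labelled by a
# non-dimension coordinate "station" that starts out without an index.
da = xr.DataArray(
    np.array([10.0, 20.0, 30.0]),
    dims="x",
    coords={"station": ("x", ["a", "b", "c"])},
)
has_index_before = "station" in da.xindexes

# With index_cls omitted, set_xindex tries to build a pandas index
# from the given coordinate and returns a new DataArray.
indexed = da.set_xindex("station")
has_index_after = "station" in indexed.xindexes

# Label-based selection can now use the non-dimension coordinate.
picked = indexed.sel(station="b")
```

The original object is left untouched, which matches the "Returns" section of the docstring: the call produces another DataArray carrying the new index.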

def reorder_levels(
self: T_DataArray,
dim_order: Mapping[Any, Sequence[int | Hashable]] | None = None,
@@ -2559,6 +2602,31 @@ def drop_vars(
ds = self._to_temp_dataset().drop_vars(names, errors=errors)
return self._from_temp_dataset(ds)

def drop_indexes(
self: T_DataArray,
coord_names: Hashable | Iterable[Hashable],
*,
errors: ErrorOptions = "raise",
) -> T_DataArray:
"""Drop the indexes assigned to the given coordinates.

Parameters
----------
coord_names : hashable or iterable of hashable
Name(s) of the coordinate(s) for which to drop the index.
errors : {"raise", "ignore"}, default: "raise"
If 'raise', raise a ValueError if any of the given coordinates
has no index or is not among this DataArray's coordinates.
If 'ignore', no error is raised.

Returns
-------
dropped : DataArray
A new dataarray with dropped indexes.
"""
ds = self._to_temp_dataset().drop_indexes(coord_names, errors=errors)
return self._from_temp_dataset(ds)
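
As a hedged illustration of the `errors` parameter described above, the following sketch drops the default index of a dimension coordinate and then tries to drop an index from a coordinate that has none. It assumes an xarray release that includes this pull request; the coordinate names `x` and `label` are hypothetical.

```python
import numpy as np
import xarray as xr

# Hypothetical example data: "x" is a dimension coordinate (indexed by
# default), "label" is a non-dimension coordinate without an index.
da = xr.DataArray(
    np.arange(3),
    dims="x",
    coords={"x": [10, 20, 30], "label": ("x", ["a", "b", "c"])},
)

# Dropping the index keeps the coordinate itself but removes its index.
dropped = da.drop_indexes("x")
x_still_coord = "x" in dropped.coords
x_still_indexed = "x" in dropped.xindexes

# "label" has no index, so the default errors="raise" raises ValueError.
try:
    da.drop_indexes("label")
    raised = False
except ValueError:
    raised = True

# With errors="ignore" the same call succeeds and existing indexes remain.
unchanged = da.drop_indexes("label", errors="ignore")
```

Unlike `drop_vars`, this only removes the index: the coordinate values stay available on the returned object.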

def drop(
self: T_DataArray,
labels: Mapping[Any, Any] | None = None,