-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add "errors" keyword argument to drop() and drop_dims() (#2994) #3028
Conversation
Adds an errors keyword to Dataset.drop(), Dataset.drop_dims(), and DataArray.drop() (GH2994). Consistent with pandas, the value can be either "raise" or "ignore"
This looks great, thank you! One design question: should we stick with |
I'm ambivalent. I prefer the |
I have to say as someone who is probably an average user, inconsistencies between related projects, like xarray/pandas or matplotlib/seaborn, drive me nuts. But I also agree that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine to stick with what pandas uses for consistency here. I think missing
is a better name, but it's not that much better. I think most users could still guess correctly what errors='ignore'
means.
xarray/core/dataarray.py
Outdated
@@ -1461,7 +1461,7 @@ def transpose(self, *dims, transpose_coords=None) -> 'DataArray': | |||
def T(self) -> 'DataArray': | |||
return self.transpose() | |||
|
|||
def drop(self, labels, dim=None): | |||
def drop(self, labels, dim=None, errors='raise'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this new argument require using a keyword argument:
def drop(self, labels, dim=None, errors='raise'): | |
def drop(self, labels, dim=None, *, errors='raise'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I'll add this to the methods for both Dataset
and DataArray
, since they take the same arguments.
Does it make sense to also add to Dataset.drop_dims()
? It is similar but takes no other keywords:
def drop_dims(self, drop_dims, errors='raise'):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's make that keyword argument only, too.
OK, I'm going to go ahead and merge after tests pass! |
Sounds great. Thanks for making this a smooth and well documented process for new contributors |
* master: (31 commits) Add quantile method to GroupBy (pydata#2828) rolling_exp (nee ewm) (pydata#2650) Ensure explicitly indexed arrays are preserved (pydata#3027) add back dask-dev tests (pydata#3025) ENH: keepdims=True for xarray reductions (pydata#3033) Revert cmap fix (pydata#3038) Add "errors" keyword argument to drop() and drop_dims() (pydata#2994) (pydata#3028) More consistency checks (pydata#2859) Check types in travis (pydata#3024) Update issue templates (pydata#3019) Add pytest markers to avoid warnings (pydata#3023) Feature/merge errormsg (pydata#2971) More support for missing_value. (pydata#2973) Use flake8 rather than pycodestyle (pydata#3010) Pandas labels deprecation (pydata#3016) Pytest capture uses match, not message (pydata#3011) dask-dev tests to allowed failures in travis (pydata#3014) Fix 'to_masked_array' computing dask arrays twice (pydata#3006) str accessor (pydata#2991) fix safe_cast_to_index (pydata#3001) ...
whats-new.rst
for all changes andapi.rst
for new APIThis addresses #2994 by adding an "errors" keyword argument to
Dataset.drop()
,Dataset.drop_dims()
, andDataArray.drop()
.I stuck with pandas' convention of using either
errors='raise'
, now the default that maintains previous behavior by raising an error if any passed label is not found in the dataset/array, orerrors='ignore'
in which case any missing labels are silently ignored.This seems like a pretty straightforward change; mainly it is just skipping checks for missing labels when
errors == 'ignore'
and passing the errors keyword over to the pandas method when usingindex.drop()
. Hopefully there are no subtleties that I've missed.I added documentation to the appropriate methods, although I have been struggling to build the docs locally and am unsure if they look right.
Also this is my first attempt to contribute to any project, so suggestions and feedback are welcome.