Add "errors" keyword argument to drop() and drop_dims() (#2994) #3028

andrew-c-ross · 2019-06-17T14:34:19Z

Closes xr.Dataset.drop #2994
Tests added
Fully documented, including whats-new.rst for all changes and api.rst for new API

This addresses #2994 by adding an "errors" keyword argument to Dataset.drop(), Dataset.drop_dims(), and DataArray.drop().

I stuck with pandas' convention: errors='raise' is now the default and maintains the previous behavior by raising an error if any passed label is not found in the dataset/array, while errors='ignore' silently ignores any missing labels.
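As a rough illustration of the two modes, here is a minimal pure-Python sketch. The helper `drop_labels` is hypothetical and is not xarray's implementation; it only mimics the raise/ignore semantics on a plain list of labels.

```python
def drop_labels(existing, labels, *, errors="raise"):
    """Return `existing` without `labels` (hypothetical helper, not xarray code)."""
    if errors not in ("raise", "ignore"):
        raise ValueError('errors must be either "raise" or "ignore"')
    missing = [label for label in labels if label not in existing]
    if missing and errors == "raise":
        # Default mode: any label that is not present is an error.
        raise KeyError(f"labels not found: {missing}")
    # In "ignore" mode, missing labels are simply skipped.
    to_drop = set(labels)
    return [label for label in existing if label not in to_drop]


names = ["temperature", "pressure", "humidity"]

# "wind" is not present, but errors="ignore" lets the call succeed.
kept = drop_labels(names, ["pressure", "wind"], errors="ignore")

# With the default errors="raise", the same missing label is an error.
try:
    drop_labels(names, ["wind"])
    raised = False
except KeyError:
    raised = True
```

The default keeps existing callers safe, since a typo in a label still fails loudly unless the caller opts in to 'ignore'.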

This seems like a pretty straightforward change; mainly it is just skipping checks for missing labels when errors == 'ignore' and passing the errors keyword over to the pandas method when using index.drop(). Hopefully there are no subtleties that I've missed.
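The pandas behavior being delegated to can be seen directly on a `pandas.Index`. This is only a sketch of the underlying pandas call, assuming pandas is installed; it does not reproduce xarray's internal code, and the index values are made up for illustration.

```python
import pandas as pd

index = pd.Index(["a", "b", "c"])

# errors="ignore": the missing label "z" is skipped and "b" is dropped.
remaining = list(index.drop(["b", "z"], errors="ignore"))

# errors="raise" (the pandas default): a missing label raises.
# The exception type has differed between pandas versions, so catch both.
try:
    index.drop(["z"], errors="raise")
    raised = False
except (KeyError, ValueError):
    raised = True
```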

I added documentation to the appropriate methods, although I have been struggling to build the docs locally and am unsure if they look right.

Also this is my first attempt to contribute to any project, so suggestions and feedback are welcome.

Adds an errors keyword to Dataset.drop(), Dataset.drop_dims(), and DataArray.drop() (GH2994). Consistent with pandas, the value can be either "raise" or "ignore"

shoyer · 2019-06-18T15:33:50Z

This looks great, thank you!

One design question: should we stick with errors='ignore' like pandas, or would it be better to use a more self-descriptive argument name like missing='ignore'/missing='raise'?

max-sixty · 2019-06-18T17:49:03Z

> One design question: should we stick with errors='ignore' like pandas, or would it be better to use a more self-descriptive argument name like missing='ignore'/missing='raise'?

I'm ambivalent. I prefer missing=, but I'm more 'in it' than the average user, who would probably prefer consistency. If people / @shoyer vote that missing= is materially better, let's change it.

andrew-c-ross · 2019-06-19T13:08:41Z

I have to say, as someone who is probably an average user, that inconsistencies between related projects, like xarray/pandas or matplotlib/seaborn, drive me nuts. But I also agree that missing= is more descriptive, and if we were starting from scratch I would totally support that. So I will defer to the maintainers here.

shoyer

I think it's fine to stick with what pandas uses for consistency here. I think missing is a better name, but it's not that much better. I think most users could still guess correctly what errors='ignore' means.

shoyer · 2019-06-20T13:15:16Z

xarray/core/dataarray.py

@@ -1461,7 +1461,7 @@ def transpose(self, *dims, transpose_coords=None) -> 'DataArray':
    def T(self) -> 'DataArray':
        return self.transpose()

-    def drop(self, labels, dim=None):
+    def drop(self, labels, dim=None, errors='raise'):


Let's make this new argument require using a keyword argument:

Suggested change

-    def drop(self, labels, dim=None, errors='raise'):
+    def drop(self, labels, dim=None, *, errors='raise'):

Good point. I'll add this to the methods for both Dataset and DataArray, since they take the same arguments.

Does it make sense to also add to Dataset.drop_dims()? It is similar but takes no other keywords:
def drop_dims(self, drop_dims, errors='raise'):

Yes, let's make that keyword argument only, too.
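For readers unfamiliar with the bare `*` in a signature, the following sketch shows its effect. The function is a stand-in that copies the argument list of the reviewed signature (minus `self`) and just returns its arguments; it is not the xarray method.

```python
def drop(labels, dim=None, *, errors="raise"):
    """Stand-in with the reviewed argument list; returns what it received."""
    return labels, dim, errors


# Passing errors by keyword works as expected.
ok = drop(["a"], "x", errors="ignore")

# Passing it positionally is rejected, because everything after the
# bare * must be given by name.
try:
    drop(["a"], "x", "ignore")
    rejected = False
except TypeError:
    rejected = True
```

Making the argument keyword-only keeps call sites readable and leaves room to add further positional parameters later without breaking callers.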

shoyer · 2019-06-20T15:24:19Z

OK, I'm going to go ahead and merge after tests pass!

andrew-c-ross · 2019-06-20T15:46:59Z

Sounds great. Thanks for making this a smooth and well-documented process for new contributors.

* master: (31 commits)
  Add quantile method to GroupBy (pydata#2828)
  rolling_exp (nee ewm) (pydata#2650)
  Ensure explicitly indexed arrays are preserved (pydata#3027)
  add back dask-dev tests (pydata#3025)
  ENH: keepdims=True for xarray reductions (pydata#3033)
  Revert cmap fix (pydata#3038)
  Add "errors" keyword argument to drop() and drop_dims() (pydata#2994) (pydata#3028)
  More consistency checks (pydata#2859)
  Check types in travis (pydata#3024)
  Update issue templates (pydata#3019)
  Add pytest markers to avoid warnings (pydata#3023)
  Feature/merge errormsg (pydata#2971)
  More support for missing_value. (pydata#2973)
  Use flake8 rather than pycodestyle (pydata#3010)
  Pandas labels deprecation (pydata#3016)
  Pytest capture uses match, not message (pydata#3011)
  dask-dev tests to allowed failures in travis (pydata#3014)
  Fix 'to_masked_array' computing dask arrays twice (pydata#3006)
  str accessor (pydata#2991)
  fix safe_cast_to_index (pydata#3001)
  ...

andrew-c-ross added 2 commits June 17, 2019 09:56

Add "errors" keyword argument (GH2994)

aadd0b2

Adds an errors keyword to Dataset.drop(), Dataset.drop_dims(), and DataArray.drop() (GH2994). Consistent with pandas, the value can be either "raise" or "ignore"

Fix quotes

8b441b8

andrew-c-ross changed the title ~~Add "errors" keyword argument to drop() and drop_dims() #2994~~ Add "errors" keyword argument to drop() and drop_dims() (#2994) Jun 17, 2019

andrew-c-ross added 2 commits June 17, 2019 10:52

Different pandas versions raise different errors

9a651e4

Error messages also vary

cecfba3

andrew-c-ross marked this pull request as ready for review June 17, 2019 17:12

Correct doc for DataArray.drop; array, not dataset

5a2f915

shoyer reviewed Jun 20, 2019

Require errors argument to be passed with a keyword

51a22fa

shoyer merged commit 9c0bbf7 into pydata:master Jun 20, 2019
