Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

where should keep_attrs be set in groupby, resample, weighted etc.? #4513

Closed
mathause opened this issue Oct 15, 2020 · 2 comments
Closed

where should keep_attrs be set in groupby, resample, weighted etc.? #4513

mathause opened this issue Oct 15, 2020 · 2 comments
Labels
topic-metadata Relating to the handling of metadata (i.e. attrs and encoding)

Comments

@mathause
Copy link
Collaborator

mathause commented Oct 15, 2020

I really should not open this can of worms but per #4450 (comment):

I'm always confused about whether ds.groupby(..., keep_attrs=True).mean() or ds.groupby(...).mean(keep_attrs=True) is correct. (similarly for rolling, coarsen etc.)

Also as I try to fix the keep_attr behavior in #4510 it would be good to know where they should go. So I tried to figure out how this is currently handled and found the following:

ds.xxx(keep_attrs=True).yyy()

  • all fixed

ds.xxx().yyy(keep_attrs=True)

So the working consensus seems to be to to ds.xxx().yyy(keep_attrs=True) - any comments on that?

(Edit: looking at this it is only half as bad, "only" coarsen, rolling (#4510), and rolling_exp would need to be fixed.)

Detailed analysis

import xarray as xr
ds = xr.tutorial.open_dataset("air_temperature")
da = ds.air

coarsen

ds.coarsen(time=2, keep_attrs=True).mean() # keeps global attributes
ds.coarsen(time=2).mean(keep_attrs=True) # keeps DataArray attributes
ds.coarsen(time=2, keep_attrs=True).mean(keep_attrs=True) # keeps both

da.coarsen(time=2).mean(keep_attrs=True) # error
da.coarsen(time=2, keep_attrs=True).mean() # keeps DataArray attributes

groupby

ds.groupby("time.month").mean(keep_attrs=True) # keeps both
da.groupby("time.month").mean(keep_attrs=True) # keeps DataArray attributes

ds.groupby("time.month", keep_attrs=True).mean() # error
da.groupby("time.month", keep_attrs=True).mean() # error

groupby_bins

ds.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps both
da.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps DataArray attrs

ds.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors
da.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors

resample

ds.resample(time="A").mean(keep_attrs=True) # keeps both
da.resample(time="A").mean(keep_attrs=True) # keeps DataArray attributes

ds.resample(time="A", keep_attrs=False).mean() # ignored
da.resample(time="A", keep_attrs=False).mean() # ignored

rolling

ds.rolling(time=2).mean(keep_attrs=True) # keeps both
da.rolling(time=2).mean(keep_attrs=True) # keeps DataArray attributes

ds.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps both
da.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps DataArray attributes

see #4510

rolling_exp

ds.rolling_exp(time=5, keep_attrs=True).mean() # ignored
da.rolling_exp(time=5, keep_attrs=True).mean() # ignored

ds.rolling_exp(time=5).mean(keep_attrs=True) # keeps both
da.rolling_exp(time=5).mean(keep_attrs=True) # keeps DataArray attributes

weighted

ds.weighted(ds.lat).mean(keep_attrs=True) # keeps both
da.weighted(ds.lat).mean(keep_attrs=True) # keeps DataArray attrs

edit: moved rolling after #4510, moved rolling_exp after #4592 and coarsen after #5227

@dcherian
Copy link
Contributor

So the working consensus seems to be to to ds.xxx().yyy(keep_attrs=True) - any comments on that?

This is nice because it matches the da.mean(..., keep_attrs=True) syntax.

I think we should at least warn for da.groupby(..., keep_attrs=True) saying that keep_attrs is ignored and should be provided to the applied function.

@dcherian dcherian added the topic-metadata Relating to the handling of metadata (i.e. attrs and encoding) label Oct 15, 2020
@mathause mathause mentioned this issue May 6, 2021
4 tasks
@mathause
Copy link
Collaborator Author

I think this can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-metadata Relating to the handling of metadata (i.e. attrs and encoding)
Projects
None yet
Development

No branches or pull requests

2 participants