where should keep_attrs be set in groupby, resample, weighted etc.? #4513

mathause · 2020-10-15T09:36:43Z

I really should not open this can of worms but per #4450 (comment):

I'm always confused about whether ds.groupby(..., keep_attrs=True).mean() or ds.groupby(...).mean(keep_attrs=True) is correct. (similarly for rolling, coarsen etc.)

Also, as I try to fix the keep_attrs behavior in #4510, it would be good to know where the argument should go. So I tried to figure out how this is currently handled and found the following:

ds.xxx(keep_attrs=True).yyy()

all fixed

ds.xxx().yyy(keep_attrs=True)

coarsen (fixed in coarsen: better keep_attrs #5227)
groupby
groupby_bins
resample
rolling (adjusted in rolling keep_attrs & default True #4510)
rolling_exp (fixed in rolling_exp: keep_attrs and typing #4592)
weighted

So the working consensus seems to be to use ds.xxx().yyy(keep_attrs=True) - any comments on that?

(Edit: looking at this it is only half as bad, "only" coarsen, rolling (#4510), and rolling_exp would need to be fixed.)
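As a rough illustration of the pattern summarized above, here is a minimal sketch in which keep_attrs is passed to the reduction rather than to the grouping or windowing call. The small in-memory array and its "units" attribute are assumptions made up for the example (they are not the air_temperature dataset), and the behavior shown is what the PRs referenced in this issue aim for, so older xarray versions may differ.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data (an assumption; not the air_temperature dataset).
time = pd.date_range("2000-01-30", periods=4, freq="D")
da = xr.DataArray(
    np.arange(4.0),
    dims="time",
    coords={"time": time},
    name="air",
    attrs={"units": "K"},
)

# keep_attrs is given to the reduction, mirroring da.mean(keep_attrs=True).
plain = da.mean(keep_attrs=True)
grouped = da.groupby("time.month").mean(keep_attrs=True)
rolled = da.rolling(time=2).mean(keep_attrs=True)

for result in (plain, grouped, rolled):
    print(result.attrs)
```

All three calls take the argument in the same position, which is the main appeal of this convention.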

Detailed analysis

import numpy as np  # needed for np.arange in the groupby_bins examples
import xarray as xr
ds = xr.tutorial.open_dataset("air_temperature")
da = ds.air

coarsen

ds.coarsen(time=2, keep_attrs=True).mean() # keeps global attributes
ds.coarsen(time=2).mean(keep_attrs=True) # keeps DataArray attributes
ds.coarsen(time=2, keep_attrs=True).mean(keep_attrs=True) # keeps both

da.coarsen(time=2).mean(keep_attrs=True) # error
da.coarsen(time=2, keep_attrs=True).mean() # keeps DataArray attributes

groupby

ds.groupby("time.month").mean(keep_attrs=True) # keeps both
da.groupby("time.month").mean(keep_attrs=True) # keeps DataArray attributes

ds.groupby("time.month", keep_attrs=True).mean() # error
da.groupby("time.month", keep_attrs=True).mean() # error

groupby_bins

ds.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps both
da.groupby_bins(ds.lat, np.arange(0, 90, 10)).mean(keep_attrs=True) # keeps DataArray attrs

ds.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors
da.groupby_bins(ds.lat, np.arange(0, 90, 10), keep_attrs=True) # errors

resample

ds.resample(time="A").mean(keep_attrs=True) # keeps both
da.resample(time="A").mean(keep_attrs=True) # keeps DataArray attributes

ds.resample(time="A", keep_attrs=False).mean() # ignored
da.resample(time="A", keep_attrs=False).mean() # ignored

rolling

ds.rolling(time=2).mean(keep_attrs=True) # keeps both
da.rolling(time=2).mean(keep_attrs=True) # keeps DataArray attributes

ds.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps both
da.rolling(time=2, keep_attrs=True).mean() # DeprecationWarning; keeps DataArray attributes

see #4510

rolling_exp

ds.rolling_exp(time=5, keep_attrs=True).mean() # ignored
da.rolling_exp(time=5, keep_attrs=True).mean() # ignored

ds.rolling_exp(time=5).mean(keep_attrs=True) # keeps both
da.rolling_exp(time=5).mean(keep_attrs=True) # keeps DataArray attributes

weighted

ds.weighted(ds.lat).mean(keep_attrs=True) # keeps both
da.weighted(ds.lat).mean(keep_attrs=True) # keeps DataArray attrs

edit: moved rolling after #4510, moved rolling_exp after #4592 and coarsen after #5227


dcherian · 2020-10-15T16:05:37Z

So the working consensus seems to be to use ds.xxx().yyy(keep_attrs=True) - any comments on that?

This is nice because it matches the da.mean(..., keep_attrs=True) syntax.

I think we should at least warn for da.groupby(..., keep_attrs=True) saying that keep_attrs is ignored and should be provided to the applied function.
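One way such a warning could look is sketched below with toy stand-ins. ToyGroupBy and toy_groupby are hypothetical names invented for this example and are not xarray's classes or its actual implementation; the sketch only shows the idea of accepting the argument, warning that it is ignored, and pointing the user to the applied function.

```python
import warnings


class ToyGroupBy:
    """Hypothetical stand-in for a groupby-like object (not xarray's class)."""

    def __init__(self, values, attrs):
        self.values = values
        self.attrs = attrs

    def mean(self, keep_attrs=False):
        # The reduction is the only place where keep_attrs has an effect.
        result = sum(self.values) / len(self.values)
        return result, (dict(self.attrs) if keep_attrs else {})


def toy_groupby(values, attrs, keep_attrs=None):
    """Hypothetical constructor that warns when keep_attrs is passed here."""
    if keep_attrs is not None:
        warnings.warn(
            "Passing keep_attrs to toy_groupby has no effect; pass it to the "
            "applied function instead, e.g. toy_groupby(...).mean(keep_attrs=True).",
            UserWarning,
            stacklevel=2,
        )
    return ToyGroupBy(values, attrs)


# Usage: the misplaced argument warns and is ignored; the reduction honors it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    grouped = toy_groupby([1.0, 2.0, 3.0], {"units": "K"}, keep_attrs=True)

ignored = grouped.mean()
kept = grouped.mean(keep_attrs=True)
```

Warning instead of raising keeps existing user code running while still pointing at the right place for the argument.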

mathause · 2023-11-10T16:58:35Z

I think this can be closed.

mathause mentioned this issue Oct 15, 2020: rolling keep_attrs & default True #4510 (merged)

dcherian added the topic-metadata label Oct 15, 2020

mathause mentioned this issue Nov 18, 2020: rolling_exp: keep_attrs and typing #4592 (merged)

mathause mentioned this issue Apr 28, 2021: coarsen: better keep_attrs #5227 (merged)

mathause mentioned this issue May 6, 2021: Warn ignored keep attrs #5265 (merged)

mathause closed this as completed Nov 10, 2023
