Encoding lost upon concatenation #1297

Closed
gerritholl opened this issue Mar 6, 2017 · 6 comments
Labels
stale topic-metadata Relating to the handling of metadata (i.e. attrs and encoding)

Comments

@gerritholl
Contributor

When using xarray.concat, attributes are retained, but encoding information is not. I believe encoding information should be retained, at least optionally.

In [64]: da = xarray.DataArray([1, 2, 3, 2, 1])

In [65]: da.attrs.update(foo="bar")

In [66]: da.encoding.update(complevel=5)

In [67]: da2 = xarray.concat((da, da), dim="new")

In [68]: print(da2.attrs)
OrderedDict([('foo', 'bar')])

In [69]: print(da2.encoding)
{}
@gerritholl
Contributor Author

This is more serious when concatenating datasets, because the encoding is then lost for every data array the dataset contains.
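
A possible interim workaround for the dataset case, as a minimal sketch: concatenate as usual, then copy the encoding back from the first input. The helper name `concat_keep_encoding` is hypothetical and not part of xarray, and the sketch assumes that the first dataset's encoding is still appropriate for the concatenated result.

```python
import xarray as xr


def concat_keep_encoding(datasets, dim):
    """Concatenate datasets, then copy encoding from the first one.

    Hypothetical helper, not part of xarray. Assumes the first
    dataset's encoding is still valid for the combined result.
    """
    datasets = list(datasets)
    result = xr.concat(datasets, dim=dim)
    first = datasets[0]
    # Dataset-level encoding.
    result.encoding.update(first.encoding)
    # Per-variable encoding, covering data variables and coordinates.
    for name, var in first.variables.items():
        if name in result.variables:
            result.variables[name].encoding.update(var.encoding)
    return result


ds = xr.Dataset({"t": ("x", [1, 2, 3, 2, 1])})
ds["t"].encoding.update(complevel=5)
combined = concat_keep_encoding([ds, ds], dim="new")
print(combined["t"].encoding)
```

Copying from the first input is only one possible policy; inputs with conflicting encodings would need a more careful merge.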

@shoyer
Member

shoyer commented Mar 8, 2017

I guess I'm not opposed to this per se, but preserving encoding in operations is even harder than attrs because it's possible for encoding to no longer be valid. So currently most xarray operations simply delete encoding. That said, I can see some reasons for keeping it when concatenating (e.g., because concat is a fundamental part of open_mfdataset).
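
To illustrate how encoding can become invalid, here is a minimal sketch on plain dicts. Entries such as chunk sizes describe the original array's shape, so they no longer fit once concatenation adds a dimension. The helper `portable_encoding` and the exact key list are assumptions made for this illustration, not xarray's behaviour.

```python
# Keys assumed, for illustration only, to go stale once the array's
# shape or origin changes (for example after concatenation).
STALE_AFTER_CONCAT_KEYS = {"chunksizes", "original_shape", "source"}


def portable_encoding(encoding):
    """Return a copy of `encoding` without the entries listed above."""
    return {
        key: value
        for key, value in encoding.items()
        if key not in STALE_AFTER_CONCAT_KEYS
    }


enc = {"complevel": 5, "zlib": True, "chunksizes": (5,), "original_shape": (5,)}
safe = portable_encoding(enc)
print(safe)
```

Compression settings such as `complevel` do not depend on the shape, which is why they are the kind of entry one might reasonably want to survive a concat.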

@gerritholl
Contributor Author

Mine retains it always upon concatenation, but if you prefer we could add an argument keep_encoding in analogy with keep_attrs. In that case we'd want to add it wherever we have keep_attrs.
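
For illustration only, the opt-in variant could look roughly like the sketch below. The wrapper `concat_dataarrays` and its `keep_encoding` flag are hypothetical; `xarray.concat` itself is called without any such argument.

```python
import xarray as xr


def concat_dataarrays(arrays, dim, keep_encoding=False):
    """Concatenate DataArrays; optionally carry over the first one's encoding.

    `keep_encoding` is a hypothetical flag modelled on `keep_attrs`,
    not an xarray parameter.
    """
    arrays = list(arrays)
    result = xr.concat(arrays, dim=dim)
    if keep_encoding:
        # Copy so the result does not share a dict with the input.
        result.encoding = dict(arrays[0].encoding)
    else:
        result.encoding = {}
    return result


da = xr.DataArray([1, 2, 3, 2, 1])
da.encoding.update(complevel=5)
kept = concat_dataarrays((da, da), dim="new", keep_encoding=True)
dropped = concat_dataarrays((da, da), dim="new")
print(kept.encoding, dropped.encoding)
```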

@shoyer
Member

shoyer commented Mar 8, 2017

> Mine retains it always upon concatenation, but if you prefer we could add an argument keep_encoding in analogy with keep_attrs. In that case we'd want to add it wherever we have keep_attrs.

No, I'm happy with your fix here. I'd rather keep encoding as hidden away from the user facing API as possible, because it's only relevant to a fraction of users (those reading and writing netCDF files).

olemke pushed a commit to atmtools/typhon that referenced this issue Jun 29, 2017
	* datasets/dataset.py(DatasetDeque.move):

	- Workaround for xarray bug causing encoding to be lost upon
	concatenation pydata/xarray#1297


git-svn-id: https://arts.mi.uni-hamburg.de/svn/rt/typhon/trunk@10360 aaf1aab0-4228-0410-ad68-8dceda47f409
@stale

stale bot commented Feb 6, 2019

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically

@stale stale bot added the stale label Feb 6, 2019
@jhamman
Member

jhamman commented Feb 6, 2019

closed via #1297
