DataArrays should display their coordinates in the natural order #712

Open
anntzer opened this issue Jan 8, 2016 · 13 comments

Comments

@anntzer
Contributor

anntzer commented Jan 8, 2016

Consider

from collections import *
import numpy as np
from xray import *

d1 = DataArray(np.empty((2, 2)), coords=OrderedDict([("foo", [0, 1]), ("bar", [0, 1])]))
d2 = DataArray(np.empty((2, 2)), coords=OrderedDict([("bar", [0, 1]), ("foo", [0, 1])]))

ds = Dataset({"d1": d1, "d2": d2})

print(ds.d1)
print(ds.d2)

This outputs

<xray.DataArray 'd1' (foo: 2, bar: 2)>
array([[  6.91516848e-310,   1.64244654e-316],
       [  6.91516881e-310,   6.91516881e-310]])
Coordinates:
  * foo      (foo) int64 0 1
  * bar      (bar) int64 0 1
<xray.DataArray 'd2' (bar: 2, foo: 2)>
array([[  1.59987863e-316,   6.91516883e-310],
       [  6.91515690e-310,   2.12670320e-316]])
Coordinates:
  * foo      (foo) int64 0 1
  * bar      (bar) int64 0 1

I understand that internally both DataArrays use the same coords object and thus the same coords order, but it would be helpful if, when printing d2 by itself, the coordinates were printed in the natural order ("bar", "foo"). In particular, when working interactively, the list of coordinates at the end of the repr is the easiest thing to spot, and thus the most helpful guide for how to format the call to array.loc[...].
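
As a rough illustration of why the displayed order matters, here is a sketch written against the current xarray API (the coordinate values 10 and 20 are made up for this example and are not from the report). Positional `.loc` indexing follows the dimension order from the summary line, whereas indexing by name does not depend on any ordering.

```python
import numpy as np
import xarray as xr

# Hypothetical array with dims (bar, foo); the values are chosen only
# so that the two axes are easy to tell apart.
d2 = xr.DataArray(
    np.arange(4).reshape(2, 2),
    coords={"bar": [10, 20], "foo": [0, 1]},
    dims=["bar", "foo"],
)

# Positional .loc uses the dimension order (bar first, then foo).
by_position = d2.loc[20, 0].item()

# Name-based lookups are independent of dimension or display order.
by_name_loc = d2.loc[dict(foo=0, bar=20)].item()
by_name_sel = d2.sel(foo=0, bar=20).item()

print(by_position, by_name_loc, by_name_sel)
```

When the Coordinates listing disagrees with the dimension order, the name-based forms avoid having to guess which one positional `.loc` follows.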

@shoyer
Member

shoyer commented Jan 8, 2016

I think this may have been fixed by the recent rewrite of DataArray internals. On master, I have:

In [2]: d1
Out[2]:
<xray.DataArray (foo: 2, bar: 2)>
array([[  0.00000000e+000,   0.00000000e+000],
       [  2.15725662e-314,   2.15893204e-314]])
Coordinates:
  * foo      (foo) int64 0 1
  * bar      (bar) int64 0 1

In [3]: d2
Out[3]:
<xray.DataArray (bar: 2, foo: 2)>
array([[  0.00000000e+000,   0.00000000e+000],
       [  2.15906985e-314,   2.14458868e-314]])
Coordinates:
  * bar      (bar) int64 0 1
  * foo      (foo) int64 0 1

@anntzer
Contributor Author

anntzer commented Jan 8, 2016

Awesome, thanks. Any plans for a release soon?
Feel free to close the issue.

@shoyer
Member

shoyer commented Jan 8, 2016

Yes, within the next week, hopefully.

@shoyer
Member

shoyer commented Jan 26, 2016

This should be fixed in v0.7.0... please reopen if it resurfaces.

@shoyer shoyer closed this as completed Jan 26, 2016
@anntzer
Contributor Author

anntzer commented Apr 18, 2016

Requesting a reopen: this issue is present again in 0.7.2.

@shoyer shoyer reopened this Apr 18, 2016
@shoyer
Member

shoyer commented Apr 18, 2016

OK, I didn't read your first post carefully last time. Your complaint was about the order of coordinates in ds.d1 and ds.d2, not the original DataArrays. So this is a more subtle issue than I thought.

We could add some sort of ad-hoc adjustment to the order in which we display coordinates, but I'm reluctant because it's not obvious to me what the "correct" order would be. For example, you can supply the coords argument directly as a mapping with any arbitrary order when constructing a DataArray.

I suppose one principled choice would be to always display coordinates corresponding to dimensions first in lists of coordinates, and to always display them in the same order as the dimensions. If we do this, it should be consistent between DataArray and Dataset.
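
The ordering rule proposed above can be sketched in plain Python. This is only an illustration, not xarray's implementation: the helper name `ordered_coord_names` and the extra non-dimension coordinate `"label"` are hypothetical.

```python
from typing import Hashable, Iterable, List, Sequence


def ordered_coord_names(
    dims: Sequence[Hashable], coord_names: Iterable[Hashable]
) -> List[Hashable]:
    """Return coordinate names in a display order derived from dims.

    Coordinates named after a dimension come first, in dimension order;
    all other coordinates follow in their original order.
    """
    coord_names = list(coord_names)
    present = set(coord_names)
    dim_set = set(dims)
    dimension_coords = [d for d in dims if d in present]
    other_coords = [name for name in coord_names if name not in dim_set]
    return dimension_coords + other_coords


# Both arrays share one coords mapping stored in the order foo, bar, label.
shared_order = ["foo", "bar", "label"]
order_d1 = ordered_coord_names(("foo", "bar"), shared_order)
order_d2 = ordered_coord_names(("bar", "foo"), shared_order)
print(order_d1)
print(order_d2)
```

With this rule, an array with dims (bar, foo) lists bar before foo even when the shared mapping stores foo first, and the same function could serve both DataArray and Dataset reprs.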

@stale

stale bot commented Jan 28, 2019

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.
If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically.

@stale stale bot added the stale label Jan 28, 2019
@anntzer
Contributor Author

anntzer commented Jan 28, 2019

The issue is still relevant.

For the record, the repro code is now (e.g.)

In [4]: from collections import *
   ...: import numpy as np
   ...: from xarray import *
   ...:
   ...: d1 = DataArray(np.empty((2, 2)), coords=OrderedDict([("foo", [0, 1]), ("bar", [0, 1])]), dims=["foo", "bar"])
   ...: d2 = DataArray(np.empty((2, 2)), coords=OrderedDict([("bar", [0, 1]), ("foo", [0, 1])]), dims=["bar", "foo"])
   ...:
   ...: ds = Dataset({"d1": d1, "d2": d2})
   ...:
   ...: print(ds.d1)
   ...: print(ds.d2)
<xarray.DataArray 'd1' (foo: 2, bar: 2)>
array([[4.665651e-310, 0.000000e+000],
       [4.940656e-324,           nan]])
Coordinates:
  * foo      (foo) int64 0 1
  * bar      (bar) int64 0 1
<xarray.DataArray 'd2' (bar: 2, foo: 2)>
array([[4.66565e-310, 0.00000e+000],
       [4.94066e-324,          nan]])
Coordinates:
  * foo      (foo) int64 0 1
  * bar      (bar) int64 0 1

@stale stale bot removed the stale label Jan 28, 2019
@jhamman
Member

jhamman commented Jan 28, 2019

@anntzer - would you be interested in working on this?

@anntzer
Contributor Author

anntzer commented Jan 28, 2019

I don't know anything about the internals of xarray, and to be honest I rarely use it anymore.
The issue remains valid (which is why I posted the reply above), but it's not going to be the end of the world if you close it as wontfix.

@keewis
Collaborator

keewis commented Nov 6, 2020

What should we do about this? We did touch on the subject in #4409, but decided to keep the order in which the coordinates were passed rather than sorting by dimension (or alphabetically). I think there's a lot of confusion about the difference between the order of dimensions in the summary line of DataArray objects and the order in the coordinates section.

A fix for #4515 might make sorting by dimension order much more important.

@dcherian
Contributor

dcherian commented Nov 6, 2020

#4515 is consistent with this comment up above:

display coordinates corresponding to dimensions first in lists of coordinates, and to always display them in the same order as dimensions.

@keewis
Collaborator

keewis commented Nov 6, 2020

True, it seems I didn't read this issue carefully enough.
