[BUG] xr.concat inverts coordinates order #4072

Closed
clausmichele opened this issue May 18, 2020 · 5 comments · Fixed by #4409 or #4419


@clausmichele
Contributor

Following up on issue #3969:
merging two datasets using xr.concat inverts the order of the coordinates.

MCVE Code Sample

import numpy as np
import xarray as xr

x = np.arange(0,10)
y = np.arange(0,10)
time = [0,1]
data = np.zeros((10,10), dtype=bool)
dataArray1 = xr.DataArray([data], coords={'time': [time[0]], 'y': y, 'x': x},
                             dims=['time', 'y', 'x'])
dataArray2 = xr.DataArray([data], coords={'time': [time[1]], 'y': y, 'x': x},
                             dims=['time', 'y', 'x'])
dataArray1 = dataArray1.to_dataset(name='data')
dataArray2 = dataArray2.to_dataset(name='data')

print(dataArray1)
print(xr.concat([dataArray1,dataArray2], dim='time'))

Current Output

<xarray.Dataset>
Dimensions:  (time: 1, x: 10, y: 10)
Coordinates:
  * time     (time) int64 0
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    data     (time, y, x) bool False False False False ... False False False
<xarray.Dataset>
Dimensions:  (time: 2, x: 10, y: 10)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9  ##Inverted x and y
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
  * time     (time) int64 0 1
Data variables:
    data     (time, y, x) bool False False False False ... False False False

Expected Output

<xarray.Dataset>
Dimensions:  (time: 1, x: 10, y: 10)
Coordinates:
  * time     (time) int64 0
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    data     (time, y, x) bool False False False False ... False False False
<xarray.Dataset>
Dimensions:  (time: 2, x: 10, y: 10)
Coordinates:
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
  * time     (time) int64 0 1
Data variables:
    data     (time, y, x) bool False False False False ... False False False

Problem Description

The concat function should not invert the coordinates but maintain the original order.

Versions

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 (default, May 7 2019, 14:58:50) [GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-88-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.15.1
pandas: 1.0.3
numpy: 1.18.2
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.2.0
cartopy: None
seaborn: None
numbagg: None
setuptools: 46.1.3
pip: 9.0.1
conda: None
pytest: None
IPython: 7.13.0
sphinx: 2.4.3

@keewis
Collaborator

keewis commented May 18, 2020

Don't use the Coordinates section to check the order. Dataset objects may have multiple data variables (each with a possibly different order of dimensions), so the repr has to pick some order for displaying the coordinates.

What you want to compare is the order of dimensions referenced by a variable:

<xarray.Dataset>
Dimensions:  (time: 1, x: 10, y: 10)
Coordinates:
  * time     (time) int64 0
  * y        (y) int64 0 1 2 3 4 5 6 7 8 9
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    data     (time, y, x) bool False False False False ... False False False
             ^^^^^^^^^^^^

and you will notice that the order does not change.
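As a minimal sketch of that check (the names `ds1`, `ds2`, `combined` and the helper `make_dataset` are made up for illustration; the data mirrors the MCVE above), the dimension order of the variable can be compared directly instead of reading it off the repr:

```python
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(10)
data = np.zeros((1, 10, 10), dtype=bool)


def make_dataset(t):
    # Hypothetical helper: one time step of the MCVE data as a Dataset.
    arr = xr.DataArray(
        data,
        coords={"time": [t], "y": y, "x": x},
        dims=["time", "y", "x"],
    )
    return arr.to_dataset(name="data")


ds1 = make_dataset(0)
ds2 = make_dataset(1)
combined = xr.concat([ds1, ds2], dim="time")

# The order that matters for the data is the variable's own dims tuple,
# not the order in which the Coordinates section happens to be printed.
print(ds1["data"].dims)
print(combined["data"].dims)
```

Both tuples should be the same, which is the sense in which the order "does not change"; the order in which the Coordinates section is listed is only a display detail.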

To make this less confusing, maybe we should always sort the coordinates when displaying them? I'm guessing that right now they are displayed in the order they were added to the coordinates.

@kmuehlbauer
Contributor

This rings a bell! It has been discussed in #2811. I did extensive checks back then, but didn't come up with a PR.

My workaround or solution is outlined in this comment. The code might have changed since then!

@clausmichele
Contributor Author

@keewis yes, you are right, but a consistent ordering would still be less confusing, perhaps including the Dimensions field as well.
Right now Dimensions shows x first and then y, even though the data itself has (y, x) ordering.

@keewis
Collaborator

keewis commented Sep 18, 2020

Sorry, I forgot to remove the "closes" item in #4409. This will be fixed by #4419.

@keewis keewis reopened this Sep 18, 2020
@keewis
Collaborator

keewis commented Sep 19, 2020

closed by #4419

@keewis keewis closed this as completed Sep 19, 2020