Replace SortedKeysDict with dict #4753

max-sixty · 2021-01-03T07:02:39Z

Tests added
Passes isort . && black . && mypy . && flake8
User visible changes (including notable bug fixes) are documented in whats-new.rst

Inspired by @mathause's research in #4571 (comment)

Note that one test is removed: https://github.com/pydata/xarray/compare/master...max-sixty:sorted-keys?expand=1#diff-2db7f6624707083db4aaab1b62eb11352200ec9d3ac1055de84877912226d7b5L560

While I'm keen on simplifying the data structures, this may have some unforeseen consequences, and at least deserves some thought re how we handle differently ordered dims.

Currently this PR retains the sorting in reprs.

max-sixty · 2021-01-06T16:59:17Z

As discussed in dev meeting, this makes sense to pursue. We should try and remove the repr sorting if we can manage the test breaks.

pep8speaks · 2021-01-23T21:21:00Z

Hello @max-sixty! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-05-16 03:24:39 UTC

max-sixty · 2021-05-12T21:02:25Z

xarray/core/combine.py

    Data variables:
        temperature    (y, x) float64 10.98 14.3 12.06 nan ... nan 18.89 10.44 8.293
        precipitation  (y, x) float64 0.4376 0.8918 0.9637 ... 0.5684 0.01879 0.6176

+    # FIXME
    >>> xr.combine_by_coords([x3, x1], join="override")


Is the existing behavior here a bug?

    In [16]: x1.dims
    Out[16]: Frozen({'y': 2, 'x': 3})

    In [17]: x3.dims
    Out[17]: Frozen({'y': 2, 'x': 3})

    In [18]: xr.combine_by_coords([x3, x1], join="override")
    Out[18]:
    <xarray.Dataset>
    Dimensions:        (y: 4, x: 3)
    Coordinates:
      * x              (x) int64 10 20 30
      * y              (y) int64 0 1 2 3
    Data variables:
        temperature    (y, x) float64 10.98 14.3 12.06 10.9 ... 18.89 10.44 8.293
        precipitation  (y, x) float64 0.4376 0.8918 0.9637 ... 0.5684 0.01879 0.6176

and override is documented as:

"override": if indexes are of same size, rewrite indexes to be
those of the first object with that dimension. Indexes for the same
dimension must have the same size in all objects.

So should the result y dim have length 4?

This is changing behavior because the order of dims is changing.

Checking the repr of x1 and x2 reveals that both x and y are valid concat dimensions (given join="override") and it just chooses the first one (arbitrarily). I think this PR did not break it but just revealed a problem (also #4824). This should be fixed in another PR.

Thanks for that reference. I agree.

Is this a different bug from #4824?

max-sixty · 2021-05-12T21:04:19Z

xarray/core/dataset.py

+        ...         lat=(["x", "y"], lat),
+        ...         time=pd.date_range("2014-09-06", periods=3),
+        ...         reference_time=pd.Timestamp("2014-09-05"),
+        ...      ),


@pydata/xarray do we have a preference on how we show constructors? I find the dict representation much easier to read, especially where we're passing tuples of lists etc.

I updated this with vim, could probably do the rest of them with some regex if people agree (different PR)

I don't really have a preference, but keep in mind that with the dict constructor you're restricted to Python identifiers, while the dictionary literal takes any hashable (not sure how much of a restriction that would be, though).

Yes, agree. As you suggest though — these examples almost exclusively use strings.
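For reference, a minimal sketch of the trade-off discussed above, using plain Python only. The coordinate names and values are illustrative placeholders, not taken from the xarray docstrings.

```python
# Keyword form: every key must be spelled as a Python identifier
# and always ends up as a str.
coords_kw = dict(
    lat=(["x", "y"], [[1, 2], [3, 4]]),
    time=[0, 1],
)

# Literal form: any hashable key is accepted, including keys that
# cannot be written as keyword arguments.
coords_literal = {
    "lat": (["x", "y"], [[1, 2], [3, 4]]),
    "time": [0, 1],
    "reference time": 0,       # contains a space, not an identifier
    ("a", 1): "tuple key",     # not a string at all
}

# For identifier-like string keys, both spellings build equal dicts.
shared = {k: coords_literal[k] for k in ("lat", "time")}
print(coords_kw == shared)
```

Since the docstring examples almost exclusively use identifier-like strings, the keyword form loses nothing there.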

max-sixty · 2021-05-12T21:12:22Z

This ended up being more work than expected, but I think it is ready for review. There's one issue that may be an existing bug (but may be my misunderstanding).

mathause

No strong opinion on the dicts. Looks good overall.

I think combine_by_coords is brittle in case of ambiguous concat dims as in the given example. Before this change the concat dim depended on the name of the dimensions (i.e. their alphabetical ordering), now it depends on the insertion order. I am not sure what's worse. We should absolutely fix that (also #4824).
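To illustrate the ordering point, here is a small sketch in plain Python. It is not the actual `combine_by_coords` logic: the `dims` mapping and both helper functions are hypothetical stand-ins that only show how "take the first dimension" gives different answers under sorted iteration and insertion-order iteration.

```python
# Hypothetical stand-in for a Dataset's dims mapping; sizes are illustrative.
dims = {"y": 2, "x": 3}


def first_dim_sorted(d):
    """First key when iterating in sorted (alphabetical) order."""
    return sorted(d)[0]


def first_dim_insertion(d):
    """First key when iterating a plain dict, i.e. insertion order."""
    return next(iter(d))


print(first_dim_sorted(dims))     # x
print(first_dim_insertion(dims))  # y
```

Either rule is arbitrary when more than one dimension is a valid candidate, which is why the ambiguity itself is the thing to fix.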

mathause · 2021-05-12T22:58:38Z

xarray/core/combine.py

    Data variables:
        temperature    (y, x) float64 10.98 14.3 12.06 nan ... nan 18.89 10.44 8.293
        precipitation  (y, x) float64 0.4376 0.8918 0.9637 ... 0.5684 0.01879 0.6176

+    # FIXME
    >>> xr.combine_by_coords([x3, x1], join="override")


Checking the repr of x1 and x2 reveals that both x and y are valid concat dimensions (given join="override") and it just chooses the first one (arbitrarily). I think this PR did not break it but just revealed a problem (also #4824). This should be fixed in another PR.

max-sixty · 2021-05-13T00:48:25Z

Ready to go!

TomNicholas · 2021-05-13T15:21:44Z

I think combine_by_coords is brittle in case of ambiguous concat dims as in the given example

@mathause if this is a new bug then can you make an issue for it please?

max-sixty · 2021-05-13T15:50:55Z

I think combine_by_coords is brittle in case of ambiguous concat dims as in the given example

@mathause if this is a new bug then can you make an issue for it please?

Isn't this #4824?

keewis

This is a good way to work around not being able to sort keys of different types.
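A short sketch of the underlying limitation, in plain Python with made-up keys: sorting a mapping whose keys have mixed types fails, while insertion-order iteration needs no comparisons between keys.

```python
# Illustrative keys of three different hashable types.
mixed = {"time": 3, 0: 2, ("lat", "lon"): 4}

# Sorting requires comparing keys with each other, which Python 3
# refuses to do across unrelated types such as int and str.
try:
    sorted(mixed)
    sort_failed = False
except TypeError as err:
    sort_failed = True
    print("cannot sort:", err)

# A plain dict just yields keys in the order they were inserted.
print(list(mixed))
```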

I worry about the dimension sizes which are part of the first line of the Dataset repr, though: we've had lots of questions about this because people thought that it would mean the same as in a DataArray's repr (the order there has a meaning). Arguably, that's because the difference between them is not big enough. This seems to have become less frequent since the introduction of the HTML reprs, so maybe it is also less important?

max-sixty · 2021-05-16T23:27:43Z

I worry about the dimension sizes which are part of the first line of the Dataset repr, though: we've had lots of questions about this because people thought that it would mean the same as in a DataArray's repr (the order there has a meaning). Arguably, that's because the difference between them is not big enough. This seems to have become less frequent since the introduction of the HTML reprs, so maybe it is also less important?

I agree the current situation isn't ideal (though I'm not sure there's a better one), but this also doesn't make it much worse: the existing approach doesn't represent a useful ordering, and neither does this. And if we choose to make it mean something in the future, then this is a necessary step towards that...

shoyer

This looks great to me, thanks!

I wonder if we should hold off on merging until after the 0.18.1 release -- or maybe just move to a date based versioning system / release cycle like dask.

max-sixty added 4 commits January 1, 2021 20:20

Remove SortedKeysDict

fdf72c8

Merge branch 'master' into sorted-keys

650bb53

Retain sorted reprs

6110bf1

Update tests

ef55555

Merge branch 'master' into sorted-keys

1d5f4fd

max-sixty added 3 commits January 23, 2021 13:23

_

0625586

Merge branch 'master' into sorted-keys

4f8be16

Merge branch 'master' into sorted-keys

1727377

# Conflicts: # xarray/core/utils.py

max-sixty force-pushed the sorted-keys branch from fdccc51 to 1727377 Compare May 12, 2021 17:09

max-sixty added 5 commits May 12, 2021 10:29

Removing sorting in repr

ed50311

whatsnew

2a99687

Reset snapshot tests

6b97895

Fix unstable ordering

d984cb8

Existing bug?

137157f

max-sixty commented May 12, 2021

effb64e

dcherian reviewed May 12, 2021

xarray/core/combine.py

mathause reviewed May 12, 2021

Overwrite doctest test with combine_by_coords bug

1c2cb33

max-sixty and others added 2 commits May 13, 2021 11:08

Revert adding ovrride to join docstring

8946771

Merge branch 'master' into sorted-keys

2cac794

mathause mentioned this pull request May 14, 2021

combine_by_coords can succed when it shouldn't #4824

Open

Merge branch 'master' into sorted-keys

d0949ca

max-sixty added the plan to merge Final call for comments label May 15, 2021

keewis reviewed May 16, 2021

doc/whats-new.rst

Ordered not sorted!

a8cec8c

shoyer approved these changes May 17, 2021

max-sixty merged commit 5a602cd into pydata:master May 19, 2021

max-sixty deleted the sorted-keys branch May 19, 2021 19:30

max-sixty mentioned this pull request May 19, 2021

0.18.1 patch release? #5298

Closed

9 tasks

tomwhite mentioned this pull request May 25, 2021

Upstream failing due to Xarray dims ordering change sgkit-dev/sgkit#583

Closed

tomwhite added a commit to tomwhite/sgkit that referenced this pull request May 31, 2021

Ignore failing doctests due to pydata/xarray#4753

dbadf94

tomwhite added a commit to sgkit-dev/sgkit that referenced this pull request Jun 4, 2021

Ignore failing doctests due to pydata/xarray#4753

70a91ba

keewis mentioned this pull request Aug 2, 2021

TypeError: Expected label or tuple of labels since switching to 0.19.0 #5651

Closed

tomwhite added a commit to tomwhite/sgkit that referenced this pull request Sep 29, 2022

Remove ignored doctests now Xarray>0.18.2 has been released (with pydata/xarray#4753)

be67823

tomwhite mentioned this pull request Sep 29, 2022

Remove ignored doctests now Xarray>0.18.2 has been released (with htt… sgkit-dev/sgkit#914

Merged

tomwhite added a commit to tomwhite/sgkit that referenced this pull request Sep 29, 2022

Remove ignored doctests now Xarray>0.18.2 has been released (with pydata/xarray#4753)

e1cdb43

tomwhite added a commit to tomwhite/sgkit that referenced this pull request Oct 7, 2022

Remove ignored doctests now Xarray>0.18.2 has been released (with pydata/xarray#4753)

29b1665

mergify bot pushed a commit to sgkit-dev/sgkit that referenced this pull request Oct 13, 2022

Remove ignored doctests now Xarray>0.18.2 has been released (with pydata/xarray#4753)

c7be753

shoyer mentioned this pull request Oct 21, 2024

Dataset.to_dataframe() dimension order is not alphabetically sorted by default #9653

Closed

5 tasks

mgunyho mentioned this pull request Oct 22, 2024

Update to_dataframe doc to match current behavior #9662

Merged
