Change an `==` to an `is`. Fix tests so that this won't happen again. #2648

WeatherGod · 2019-01-04T17:36:36Z

Closes #2647 (getting a "truth value of an array" error when supplying my own concat_dim)
Tests added

Also re-affirms #1988.

Closes pydata#2647 and re-affirms pydata#1988.

WeatherGod · 2019-01-04T17:39:38Z

xarray/backends/api.py

@@ -606,7 +606,7 @@ def open_mfdataset(paths, chunks=None, concat_dim=_CONCAT_DIM_DEFAULT,
    # Coerce 1D input into ND to maintain backwards-compatible API until API
    # for N-D combine decided
    # (see https://github.com/pydata/xarray/pull/2553/#issuecomment-445892746)
-    if concat_dim is None or concat_dim == _CONCAT_DIM_DEFAULT:
+    if concat_dim is None or concat_dim is _CONCAT_DIM_DEFAULT:


Are we sure we want this to be an is test? Are we sure that the keyword argument will default to that string object?

Yes, we use that as the default value here.

Possibly a better choice would be to use something like ReprObject('<inferred>') for the sentinel object, e.g., as done here:

xarray/xarray/core/dtypes.py

Line 8 in 69086b3

NA = utils.ReprObject('<NA>')

I guess what I am getting at is a matter of user-friendliness. A user looking up the call signature of this method will see the default as the string (and not an object), but, they'll never be able to explicitly force concat dimension inference, because passing the string themselves will not work (it'll be a different object).

shoyer · 2019-01-04T21:47:32Z

Yes, it would be better to use the custom ReprObject.


WeatherGod · 2019-01-04T22:00:10Z

OK, so we use the `ReprObject` for the default, then test whether concat_dim is of type `ReprObject`, and then test its equivalence?

shoyer · 2019-01-04T22:03:51Z

OK, so we use the `ReprObject` for the default, then test whether concat_dim is of type `ReprObject`, and then test its equivalence?

No, just check identity with the exact ReprObject used as the default value.

This is just a slightly more readable version of the common idiom of using `object()` as a default value, e.g., as shown here
http://effbot.org/zone/default-values.htm#what-to-do-instead

WeatherGod · 2019-01-04T22:12:44Z

Is the following statement True or False: "The user should be allowed to explicitly declare that they want the concatenation dimension to be inferred by passing a keyword argument". If this is True, then you need to test equivalence. If it is False, then there is nothing more I need to do for the PR, as changing this to use a ReprObject is orthogonal to these changes.

shoyer · 2019-01-04T22:51:24Z

Is the following statement True or False: "The user should be allowed to explicitly declare that they want the concatenation dimension to be inferred by passing a keyword argument".

This statement is False, but it looks like we have a related bug: we really should be importing _CONCAT_DIM_DEFAULT from xarray.core.combine, because we use identity comparison in that module. Right now this only works by accident, because identical string literals happen to refer to the same object in CPython.

WeatherGod · 2019-01-05T04:18:50Z

I had completely forgotten about that little quirk of CPython. I try to ignore implementation details like that. Heck, I still don't fully trust dictionaries to be ordered!

I removed the WIP. We can deal with the concat dim default object separately, including turning it into a ReprObject (not exactly sure what the advantage of it is over just using the string, but, meh).

shoyer · 2019-01-05T04:37:10Z

I just pushed a commit to your branch that should fix the string identity issue (by not defining the constant multiple times).

shoyer · 2019-01-05T06:46:52Z

Thanks


Change an == to an is. Fix tests so that this won't happen again.

e1bb375

Closes pydata#2647 and re-affirms pydata#1988.

WeatherGod changed the title ~~Change an == to an is. Fix tests so that this won't happen again.~~ WIP: Change an == to an is. Fix tests so that this won't happen again. Jan 4, 2019


WeatherGod changed the title ~~WIP: Change an == to an is. Fix tests so that this won't happen again.~~ Change an == to an is. Fix tests so that this won't happen again. Jan 5, 2019

Reuse the same _CONCAT_DIM_DEFAULT object

e138228

shoyer merged commit ee44478 into pydata:master Jan 5, 2019

shoyer mentioned this pull request Jan 6, 2019

Remove py2 compat #2645

Merged

3 tasks

TomNicholas mentioned this pull request Jan 6, 2019

API for N-dimensional combine #2616

Merged

15 tasks

WeatherGod deleted the fix_2647 branch January 7, 2019 15:42
