Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switching to Quantity-in-DataArray/NEP 18 Support over DataArray-with-units-attribute #1325

Merged
merged 12 commits into from
Apr 14, 2020
Merged

Switching to Quantity-in-DataArray/NEP 18 Support over DataArray-with-units-attribute #1325

merged 12 commits into from
Apr 14, 2020

Conversation

jthielen
Copy link
Collaborator

@jthielen jthielen commented Feb 24, 2020

Description Of Changes

As discussed in #1304, this PR moves forward with Quantity-in-a-DataArray as the data structure preferred over a DataArray-with-units-attribute. Support for the units attribute still needs to stick around though in order to continue to support units on coordinates (coordinates tied to indexes don't support Quantities).

Since bumping Pint to at least 0.10.1 was required for NEP 18 support, this also allowed a few things in units.py to be cleaned up

  • don't need try/except for preprocessors support
  • can use built-in alias definition
  • can rely on NEP 18 instead of custom wrappers for diff/at_least1d/at_least2d
  • don't need fallback for matplotlib support

Similarly, the strftime monkey patch in xarray.py was removed since minimum xarray was bumped to 0.13.0. I didn't go through yet and check for any pre-numpy-v1.16 patches that could be removed.

I have this marked as a draft for now, since I want to go through the tests again to ensure proper coverage for the added functionality, and think about what is best documented here or in #1304. I think documentation is best to go in #1304.

Once this PR is good to go, #1304 will be redone based on this.

Checklist

  • Tests added
  • Fully documented

@jthielen jthielen marked this pull request as ready for review March 5, 2020 16:25
@jthielen jthielen requested a review from dopplershift as a code owner March 5, 2020 16:25
@kgoebber
Copy link
Collaborator

I've been working with this PR today and was able to squash some failures and errors, but not all of them using the current xarray 0.15.1

One of the issues that arises is that coordinates and attributes must be assigned with da.assign_coords() and da.assign_attrs(), respectively.
Updates needed to unit_array lines 150-161

    @unit_array.setter
    def unit_array(self, values):
        """Set data values using a `pint.Quantity`."""
        if not isinstance(values, units.Quantity):
            values = units.Quantity(values, 'dimensionless')
        if isinstance(self._data_array.data, units.Quantity):
            self._data_array.data = values
        else:
            try:
                self._data_array.data = values.magnitude
            except ValueError:
                self._data_array = self._data_array.assign_coords(coords={str(self._data_array.name): values.magnitude})
            self._data_array = self._data_array.assign_attrs({'units': str(values.units)})

and a slight re-write within _rebuild_coords lines 673-679

if crs is not None:
     new_coord_var = coord_var.copy()
     height = crs['perspective_point_height']
     scaled_vals = new_coord_var * (height * units.meters)
     new_coord_var = scaled_vals.metpy.convert_units('meter')
     new_coord_var = new_coord_var.assign_attrs({'units': 'meter'})
     yield coord_name, new_coord_var

There are still three other test failures that I can seem to figure out the object oriented code well enough to quash. Two failures come from not correctly finding the x/y location because of not actually converting the coordinate values to meters, but the unit does get updated. The final failure is related to a data array being set with no units and not getting that attribute fully attached to the data array. I can see that the code gets the unit attached, but for some reason doesn't come back to the main program during the test.

Anyway, I had some time and am ignoring lots of other work. That's what I've found so far. If I were more confident in the git abilities I would submit a commit to this PR, but I'm not, so hence the long reply.

@jthielen
Copy link
Collaborator Author

@kgoebber Unless I'm misunderstanding something, I don't think your proposed changes quite fix everything, as lines like:

self._data_array = self._data_array.assign_attrs({'units': str(values.units)})

fail to update the original DataArray inplace, which is what we need on the unit_array setter. For example, this causes this test to fail:

def test_unit_array_setter_without_units(test_ds_generic):
    """Test setting a unit array when data is not a quantity."""
    test_ds_generic['test'].metpy.unit_array = np.ones((1, 3, 3, 5, 5))
    np.testing.assert_array_equal(test_ds_generic['test'].data, np.ones((1, 3, 3, 5, 5)))
    assert test_ds_generic['test'].attrs['units'] == 'dimensionless'

The remaining issues you're seeing with it not actually converting the coordinate values to meters also seem to be due to this failure to update in-place.

I'll try digging into this a bit more, but now with our inability to modify coords in-place, I'm afraid much of our accessor functionality may need to be reworked on a more fundamental level.

@kgoebber
Copy link
Collaborator

kgoebber commented Mar 31, 2020 via email

@jthielen
Copy link
Collaborator Author

jthielen commented Mar 31, 2020

My current hunch is that we should do away with the unit_array setter entirely, and instead use direct assignment for data that isn't a dimension coordinate, and then a unit-wrapped assign_coords (as with sel and loc) for coordinates. convert_units would likewise need to be split into coord and non-coord versions. I'll see if I can get something working with that.

Also, to make things more logically consistent, it may also a good idea to go through and not have our accessor act inplace where we can avoid it? Otherwise, after this change, it seems like it'd be around half-and-half.

@dopplershift
Copy link
Member

Also, to make things more logically consistent, it may also a good idea to go through and not have our accessor act inplace where we can avoid it? Otherwise, after this change, it seems like it'd be around half-and-half.

UGH. It does seem like xarray (and pandas) have moved away from the in-place API's, so yes, we probably should do the same.

@jthielen jthielen requested a review from dcamron as a code owner March 31, 2020 21:12
@jthielen
Copy link
Collaborator Author

Based on running the test suite locally, I've gotten the xarray 0.15.1 issue resolved, with all tests passing except the AEA mismatch here: #1344 (comment).

This includes updating all the unit handling components of the accessor to no longer operate in-place, but it doesn't modify the coordinate type handling to no longer operate in-place. Can that be left to a follow-up PR?

@dopplershift
Copy link
Member

That sounds fine to me.

@jthielen
Copy link
Collaborator Author

Also, I shouldn't have been so quick to say "all tests passing," as I forgot about the image tests. Not sure which if any of those have to do with this PR or not.

@dopplershift
Copy link
Member

I'll take care of the image tests. Let's just get xarray fixed for now. Is this PR actually finishing the xarray work we were needing, or is there more left to do?

@jthielen
Copy link
Collaborator Author

I'll take care of the image tests. Let's just get xarray fixed for now. Is this PR actually finishing the xarray work we were needing, or is there more left to do?

Sounds good. This finishes the transition from DataArray-with-units-attribute to Quantity-in-xarray, and comes with the fix for unit operations not modifying in-place (thus working with xarray 0.15.1). What remains for xarray work (in the bigger picture for 1.0) is:

@@ -89,7 +89,7 @@ def test_aea_minimal():

def test_aea_single_std_parallel():
"""Test albers equal area with one standard parallel."""
attrs = {'grid_mapping_name': 'albers_conical_equal_area', 'standard_parallel': 25}
attrs = {'grid_mapping_name': 'albers_conical_equal_area', 'standard_parallel': 20}
Copy link
Contributor

@rsignell-usgs rsignell-usgs Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, yes, the defaults for albers are 20,50. https://scitools.org.uk/cartopy/docs/latest/crs/projections.html#albersequalarea

@jthielen
Copy link
Collaborator Author

jthielen commented Apr 7, 2020

@dopplershift I have the xarray 0.15.1 fix from here backported to the v0.12.0 tag (see https://github.com/Unidata/MetPy/compare/v0.12.0...jthielen:0-12-patch-xarray-0-15-1?expand=1). Would you be able to create a v0.12.x branch from the v0.12.0 tag so I can submit a PR for that? pytest is green for me locally (all latest versions, with image tests excluded).

Also, do you think you would get the chance to review this PR sometime soon so I could more confidently move forward with redoing #1304?

@jthielen jthielen mentioned this pull request Apr 8, 2020
16 tasks
@jthielen
Copy link
Collaborator Author

@dopplershift @dcamron Any updates on getting a 0.12.x branch created for the xarray 0.15.1 patch, or reviewing this PR? Thanks!

@dopplershift
Copy link
Member

0.12.x branch pushed.

Copy link
Member

@dopplershift dopplershift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, huge 👍 from me. This looks good and I'm happy with the changes/capabilities we're getting to.

tests/test_xarray.py Outdated Show resolved Hide resolved
@dopplershift dopplershift added Area: Units Pertains to unit information Area: Xarray Pertains to xarray integration Type: API Change Changes to how existing functionality works Type: Enhancement Enhancement to existing functionality labels Apr 14, 2020
@dopplershift dopplershift added this to the 1.0 milestone Apr 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Units Pertains to unit information Area: Xarray Pertains to xarray integration Type: API Change Changes to how existing functionality works Type: Enhancement Enhancement to existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants