enable internal plotting with cftime datetime #2665

Merged
merged 17 commits into from
Feb 8, 2019

jbusecke
Contributor

@jbusecke jbusecke commented Jan 10, 2019

This PR is meant to restore the internal plotting capabilities for objects with cftime.datetime dimensions.
Based mostly on the discussions in #2164

@jbusecke
Contributor Author

jbusecke commented Jan 10, 2019

I have been working along the lines of a short example. This works for time series data.

import xarray as xr
import numpy as np
%matplotlib inline

# Create a simple line dataarray with cftime
time = xr.cftime_range(start='2000', periods=6, freq='2MS', calendar='noleap')
data = np.random.rand(len(time))
da = xr.DataArray(data, coords=[('time', time)])
da.plot()

[image: resulting line plot]

For pcolormesh plots this still fails.

# Create a simple 2D dataarray with cftime
time = xr.cftime_range(start='2000', periods=6, freq='2MS', calendar='noleap')
data2 = np.random.rand(len(time), 4)
da2 = xr.DataArray(data2, coords=[('time', time), ('other', range(4))])
da2.plot()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
      3 data2 = np.random.rand(len(time), 4)
      4 da2 = xr.DataArray(data2, coords=[('time', time), ('other', range(4))])
----> 5 da2.plot()

~/Work/CODE/PYTHON/xarray/xarray/plot/plot.py in __call__(self, **kwargs)
    585
    586     def __call__(self, **kwargs):
--> 587         return plot(self._da, **kwargs)
    588
    589     @functools.wraps(hist)

~/Work/CODE/PYTHON/xarray/xarray/plot/plot.py in plot(darray, row, col, col_wrap, ax, hue, rtol, subplot_kws, **kwargs)
    220     kwargs['ax'] = ax
    221
--> 222     return plotfunc(darray, **kwargs)
    223
    224

~/Work/CODE/PYTHON/xarray/xarray/plot/plot.py in newplotfunc(darray, x, y, figsize, size, aspect, ax, row, col, col_wrap, xincrease, yincrease, add_colorbar, add_labels, vmin, vmax, cmap, center, robust, extend, levels, infer_intervals, colors, subplot_kws, cbar_ax, cbar_kwargs, xscale, yscale, xticks, yticks, xlim, ylim, norm, **kwargs)
    887                                  vmax=cmap_params['vmax'],
    888                                  norm=cmap_params['norm'],
--> 889                                  **kwargs)
    890
    891         # Label the plot with metadata

~/Work/CODE/PYTHON/xarray/xarray/plot/plot.py in pcolormesh(x, y, z, ax, infer_intervals, **kwargs)
   1135                 (np.shape(y)[0] == np.shape(z)[0])):
   1136             if len(y.shape) == 1:
-> 1137                 y = _infer_interval_breaks(y, check_monotonic=True)
   1138             else:
   1139                 # we have to infer the intervals on both axes

~/Work/CODE/PYTHON/xarray/xarray/plot/plot.py in _infer_interval_breaks(coord, axis, check_monotonic)
   1085     coord = np.asarray(coord)
   1086
-> 1087     if check_monotonic and not _is_monotonic(coord, axis=axis):
   1088         raise ValueError("The input coordinate is not sorted in increasing "
   1089                          "order along axis %d. This can lead to unexpected "

~/Work/CODE/PYTHON/xarray/xarray/plot/plot.py in _is_monotonic(coord, axis)
   1069         n = coord.shape[axis]
   1070         delta_pos = (coord.take(np.arange(1, n), axis=axis) >=
-> 1071                      coord.take(np.arange(0, n - 1), axis=axis))
   1072         delta_neg = (coord.take(np.arange(1, n), axis=axis) <=
   1073                      coord.take(np.arange(0, n - 1), axis=axis))

TypeError: '>=' not supported between instances of 'CalendarDateTime' and 'CalendarDateTime'

Perhaps @spencerkclark has an idea how to deal with differencing cftime.datetime objects?

'arrays indexed by cftime.datetime objects requires the '
'optional `nc-time-axis` package '
'(https://github.com/SciTools/nc-time-axis).'
)
Contributor Author

The old version of contains_cftime_datetimes was never triggered for my example. Not sure if this was a bug or if I am missing something here.

Member

Yes, thanks, checking the dimensions would definitely be an improvement 👍. We probably still want to check that the data does not contain cftime.datetime objects too.

Member

Thinking about this a little more, technically, I think you could still pass a time dimension filled with cftime dates to the row or col argument of plot, and things would still work without nc-time-axis (because you would not be plotting cftime dates in that case; they would just be in the plot titles).

For that reason, it might be slightly cleaner to move this check to _ensure_plottable, because that only gets called on things that xarray tries to plot. Something like:

def _ensure_plottable(*args):
    """
    Raise exception if there is anything in args that can't be plotted on an
    axis by matplotlib.
    """
    numpy_types = [np.floating, np.integer, np.timedelta64, np.datetime64]
    other_types = [datetime]

    try:
        import cftime
        cftime_datetime = [cftime.datetime]
    except ImportError:
        cftime_datetime = []
    other_types = other_types + cftime_datetime

    for x in args:
        if not (_valid_numpy_subdtype(np.array(x), numpy_types)
                or _valid_other_type(np.array(x), other_types)):
            raise TypeError('Plotting requires coordinates to be numeric '
                            'or dates of type np.datetime64, '
                            'datetime.datetime, or cftime.datetime or '
                            'pd.Interval.')
        if (_valid_other_type(np.array(x), cftime_datetime)
            and not nc_axis_available):
            raise ImportError('Plotting of arrays of cftime.datetime '
                              'objects or arrays indexed by cftime.datetime '
                              'objects requires the optional `nc-time-axis` '
                              'package.')
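
The snippet above relies on an `nc_axis_available` flag that is not defined in the comment itself. A minimal sketch of how such a flag could be set at import time, assuming the package is importable as `nc_time_axis` (an illustration only, not necessarily what this PR ends up doing):

```python
# Optional dependency guard: plotting cftime dates needs nc-time-axis,
# but the rest of the plotting module should still import without it.
try:
    import nc_time_axis  # noqa: F401
    nc_axis_available = True
except ImportError:
    # Assumption: a missing package simply disables cftime plotting.
    nc_axis_available = False
```

With the flag defined at module level, the check inside `_ensure_plottable` only raises when cftime dates are actually being plotted and the package is missing.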

Contributor Author

I am super sorry @spencerkclark. I misread your example and then had to submit a paper. I am working on it now.

Member

All good @jbusecke -- I appreciate your help with this.

@jbusecke
Contributor Author

One of the more general questions I had was if we should expose the conversion using nc-time-axis in the public API.
That way users could easily plot the data in matplotlib, e.g.:

da_new = da.convert_cftime()
plt.plot(da_new.time, da_new)

Just an idea...

@spencerkclark
Member

Note I have a PR open in nc-time-axis, which enables plotting of cftime.datetime objects directly (without having to convert to CalendarDateTime objects). This would make this easier in xarray and elsewhere.

That said, I'm not sure if/when it will be merged, so it probably makes sense to go forward with this approach for now.

@jhamman
Member

jhamman commented Jan 10, 2019

@spencerkclark - I pinged the met-office folks about your PR. Hopefully that gets merged.

@jbusecke
Contributor Author

Oh shoot, I now remember seeing this.
If this will be implemented soon I guess the PR can be discarded.
Any chance you would have a quick solution for the pcolormesh plot error (second example in the PR) @spencerkclark?

@spencerkclark
Member

spencerkclark commented Jan 11, 2019

I pinged the met-office folks about your PR. Hopefully that gets merged.

I appreciate it @jhamman; we'll see what happens there.

Oh shoot, I now remember seeing this. If this will be implemented soon I guess the PR can be discarded.

Or this PR could be amended :). We'd still need to make some changes to xarray along the lines of what you've started on here for the optional import of nc-time-axis, addition of cftime.datetime as a plottable type, and updates to the error messages.

Any chance you would have a quick solution for the pcolormesh plot error (second example in the PR)

CalendarDateTime objects are limited in the operations they support, e.g. >= is not supported, which is used in _infer_interval_breaks, which by default is called in xarray's pcolormesh. So this is one place in xarray where being able to use true cftime.datetime objects would really help. Otherwise you'd need to either wait to convert to CalendarDateTime until just before you passed data to a matplotlib function in xarray, or hack _infer_interval_breaks to make it compatible with input arrays of CalendarDateTime objects by converting them to cftime.datetime and back.

If you're in a pinch, I think things would work here if you passed infer_intervals=False as an argument to plot:

da2.plot(infer_intervals=False)

though in general infer_intervals is used for a reason (see #781 (comment)).
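
To make the limitation concrete, here is a minimal NumPy-only sketch of why the monotonicity check fails. `OpaqueDate` is a hypothetical stand-in for a date wrapper that defines no ordering (it is not the real CalendarDateTime class), and `is_monotonic` is a simplified 1D version of the check shown in the traceback, not xarray's actual function:

```python
from datetime import datetime

import numpy as np


class OpaqueDate:
    """Hypothetical date wrapper that defines no ordering operators."""

    def __init__(self, value):
        self.value = value


def is_monotonic(coord):
    """Simplified 1D monotonicity check based on elementwise >= and <=."""
    coord = np.asarray(coord)
    n = coord.shape[0]
    delta_pos = coord.take(np.arange(1, n)) >= coord.take(np.arange(0, n - 1))
    delta_neg = coord.take(np.arange(1, n)) <= coord.take(np.arange(0, n - 1))
    return bool(np.all(delta_pos) or np.all(delta_neg))


dates = [datetime(2000, month, 1) for month in (1, 3, 5)]
wrapped = np.array([OpaqueDate(d) for d in dates], dtype=object)

# Comparing wrapper objects elementwise raises, as in the traceback.
try:
    is_monotonic(wrapped)
    wrapped_error = None
except TypeError as err:
    wrapped_error = err

# Unwrapping to objects that do support ordering makes the check work.
unwrapped = np.array([w.value for w in wrapped], dtype=object)
unwrapped_ok = is_monotonic(unwrapped)
```

This is the same idea as converting back to an orderable date type before calling `_infer_interval_breaks`, or delaying the conversion to the wrapper type until the data is handed to matplotlib.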

@@ -102,6 +111,26 @@ def _line_facetgrid(darray, row=None, col=None, hue=None,
return g.map_dataarray_line(hue=hue, **kwargs)


def _convert_cftime_data(values):
Contributor

I would move these to utils.py because they might come in handy for a Dataset plotting API.

Contributor Author

Do you still think that this could be of use? Otherwise I'll go ahead and remove it.

@jbusecke
Contributor Author

Is there still interest in this PR? Or did the upstream changes move ahead?
I am finding myself explaining workarounds for this to students in the department, so maybe my time would be better invested in getting this fix to the full community?

But obviously if things are going to be fixed upstream soon, I would devote time to other projects.
Thoughts?

@dcherian
Contributor

Looks like upstream hasn't moved. Maybe @jhamman and @spencerkclark can re-ping for a review there?

I am +0.5 on moving forward with an xarray workaround. It seems easy to remove once upstream makes all the required changes.

@spencerkclark
Member

I agree @dcherian; I just pinged the PR again, but if there is no activity there by this time next week, I think we should probably move forward here.

@jbusecke
Contributor Author

jbusecke commented Jan 24, 2019 via email

@spencerkclark
Member

@jbusecke SciTools/nc-time-axis#42 has been merged, and a new release has been made (it's already available on conda-forge)! It would be great if you could update this PR to use this latest version of nc-time-axis -- you should no longer need to convert dates to CalendarDateTime objects.

Thanks again @lbdreyer for your help.

@jbusecke
Contributor Author

Cool. I'll give it a shot right now.

@jbusecke jbusecke force-pushed the jbusecke_cftime_plotting branch from 4c93272 to bbe4da2 Compare January 25, 2019 23:06
@pep8speaks

pep8speaks commented Jan 25, 2019

Hello @jbusecke! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 07, 2019 at 13:16 Hours UTC

@jbusecke
Contributor Author

OK, so the plotting now works with both time series and 2D data, as follows:

import xarray as xr
import numpy as np
%matplotlib inline

# Create a simple line dataarray with cftime
time = xr.cftime_range(start='2000', periods=4, freq='1H', calendar='noleap')
data = np.random.rand(len(time))
da = xr.DataArray(data, coords=[('time', time)])
da.plot()

[image: resulting line plot]

# Check with 2d data
time = xr.cftime_range(start='2000', periods=6, freq='2MS', calendar='noleap')
data2 = np.random.rand(len(time), 4)
da2 = xr.DataArray(data2, coords=[('time', time), ('other', range(4))])
da2.plot()

[image: resulting 2D plot]

@jbusecke
Contributor Author

jbusecke commented Jan 25, 2019

I have quickly looked into the testing and found an oddity that might be important if nc-time-axis is not installed.

So in the definition of plot in plot.py I have changed

if contains_cftime_datetimes(darray):

to

if any([contains_cftime_datetimes(darray[dim]) for dim in darray.dims]):

Because if I understand correctly, the previous statement only checks the dtype of the actual data, not the dimensions. Is this appropriate or am I misunderstanding the syntax? In my example above it doesn't matter, because this only spits out an error message when nc-time-axis is not available.

@jbusecke
Contributor Author

jbusecke commented Jan 26, 2019

Great idea to simplify @spencerkclark. Thanks.
Regarding the tests. I have removed the following:

@requires_cftime
def test_plot_cftime_coordinate_error():
    cftime = _import_cftime()
    time = cftime.num2date(np.arange(5), units='days since 0001-01-01',
                           calendar='noleap')
    data = DataArray(np.arange(5), coords=[time], dims=['time'])
    with raises_regex(TypeError,
                      'requires coordinates to be numeric or dates'):
        data.plot()


@requires_cftime
def test_plot_cftime_data_error():
    cftime = _import_cftime()
    data = cftime.num2date(np.arange(5), units='days since 0001-01-01',
                           calendar='noleap')
    data = DataArray(data, coords=[np.arange(5)], dims=['x'])
    with raises_regex(NotImplementedError, 'cftime.datetime'):
        data.plot()

And the test suite passes locally.

But I assume I'll have to add another test dataset with a cftime.datetime time axis, which then gets dragged through all the plotting tests? Where would I have to put that?

Many thanks for all the help

@spencerkclark
Member

I think it would make sense to follow this example, but use cftime.datetime objects instead. You might want to add a test for a 2D plot just to be sure.

class TestDatetimePlot(PlotTestCase):
    @pytest.fixture(autouse=True)
    def setUp(self):
        '''
        Create a DataArray with a time-axis that contains datetime objects.
        '''
        month = np.arange(1, 13, 1)
        data = np.sin(2 * np.pi * month / 12.0)
        darray = DataArray(data, dims=['time'])
        darray.coords['time'] = np.array([datetime(2017, m, 1) for m in month])
        self.darray = darray

    def test_datetime_line_plot(self):
        # test if line plot raises no Exception
        self.darray.plot.line()

Note you'll also need to add nc-time-axis to some CI environments so that things run in some Travis/AppVeyor builds, probably best in:

  • ci/requirements-py37.yml
  • ci/requirements-py37-windows.yml

and add some decorators to skip the tests if the needed packages (cftime and nc-time-axis) are not installed. @requires_cftime already exists, but I think you'll have to write your own @requires_nc_time_axis decorator, which you can do, e.g., here:

has_cftime, requires_cftime = _importorskip('cftime')

Maybe by leaving nc-time-axis out of the py36 test environment (which has cftime) you can use it to test the error message if nc-time-axis is not installed?
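
For reference, the kind of skip decorator described above could be sketched as follows. This is a standard-library analogue built on `unittest.skipUnless`; xarray's own `_importorskip` is pytest-based, so the `importorskip` helper and the `TestNeedsMissing` class here are hypothetical and only show the pattern:

```python
import importlib
import unittest


def importorskip(modname):
    """Return (has_module, skip_decorator) for an optional dependency."""
    try:
        importlib.import_module(modname)
        has = True
    except ImportError:
        has = False
    return has, unittest.skipUnless(has, "requires {}".format(modname))


# Assumption: the nc-time-axis package is importable as `nc_time_axis`.
has_nc_time_axis, requires_nc_time_axis = importorskip("nc_time_axis")

# A module name that should not exist, to show the skip path.
has_missing, requires_missing = importorskip("surely_not_an_installed_module_xyz")


@requires_missing
class TestNeedsMissing(unittest.TestCase):
    def test_never_runs(self):
        raise AssertionError("should be skipped")
```

Tests decorated with `requires_nc_time_axis` would then run only in CI environments where the package is installed, while an environment without it can be used to exercise the error message instead.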

@jbusecke jbusecke force-pushed the jbusecke_cftime_plotting branch from 2812ba1 to 0479829 Compare February 5, 2019 18:09
@jbusecke
Contributor Author

jbusecke commented Feb 5, 2019

OK, I think I have most of the things covered. All tests pass for me locally. What should I add to whats-new.rst? I thought of something like this under Enhancements (or would this be considered a bug fix?):
Internal plotting now supports cftime.datetime objects as time axis (@spencerkclark, @jbusecke)

@jbusecke
Contributor Author

jbusecke commented Feb 5, 2019

I think I have addressed all the above remarks (Many thanks for the thorough review and tips). Waiting for the CI again.

Member

@spencerkclark spencerkclark left a comment

One more thing before I forget -- at the bottom of doc/time-series.rst there is a note that says that built-in plotting with cftime datetime coordinate axes is not supported:

While much of the time series functionality that is possible for standard dates has been implemented for dates from non-standard calendars, there are still some remaining important features that have yet to be implemented, for example:

  • Built-in plotting of data with cftime.datetime coordinate axes (GH2164).

Go ahead and delete the quoted portion above. I believe the rest of the note is worth leaving to describe to_datetimeindex() in case folks are interested in it (xref: this StackOverflow question).

spencerkclark and others added 4 commits February 5, 2019 23:16
Co-Authored-By: jbusecke <jb3210@columbia.edu>
Co-Authored-By: jbusecke <jb3210@columbia.edu>
Co-Authored-By: jbusecke <jb3210@columbia.edu>
@jbusecke
Contributor Author

jbusecke commented Feb 6, 2019

Seems like the travis builds all pass, wohoo. Please let me know if anything else is needed.

Member

@spencerkclark spencerkclark left a comment

Thanks @jbusecke -- I have a few more minor cleanup suggestions, but once those are addressed I think this should be ready to go.

self.darray_2d.plot.contour()


# @requires_nc_time_axis
Member

Reminder to delete this commented-out code.


    def test_cfdatetime_line_plot(self):
        # test if line plot raises no Exception
        self.darray.plot.line()
Member

nit: I think you could use the 2D DataArray for the line plot test instead of creating a separate DataArray for the 1D case (i.e. just use isel to select a single point along the 'x' dimension before calling plot).

spencerkclark and others added 3 commits February 6, 2019 10:26
Co-Authored-By: jbusecke <jb3210@columbia.edu>
Co-Authored-By: jbusecke <jb3210@columbia.edu>
@jbusecke
Contributor Author

jbusecke commented Feb 6, 2019

Thanks. I updated the PR accordingly.

Member

@spencerkclark spencerkclark left a comment

Another documentation note, sorry -- it occurred to me it would be helpful to add a reference to nc-time-axis under the "For plotting" heading in the "Optional dependencies" section of the installation page (doc/installing.rst). Could you add that as well?

Create a DataArray with a time-axis that contains cftime.datetime
objects.
'''
# case for 1d array
Member

Suggested change
# case for 1d array

@jbusecke
Contributor Author

jbusecke commented Feb 7, 2019

Awesome. Just added the line. Let me know if you think it is appropriate.

Member

@spencerkclark spencerkclark left a comment

Thanks @jbusecke, this looks good to me too!

@jbusecke
Contributor Author

jbusecke commented Feb 7, 2019

Is there anything else that I need to do at this point? Sorry for the xarray noob question...

@dcherian dcherian changed the title from "WIP: enable internal plotting with cftime datetime" to "enable internal plotting with cftime datetime" Feb 8, 2019
@dcherian dcherian merged commit 8a1a8a1 into pydata:master Feb 8, 2019
@dcherian
Contributor

dcherian commented Feb 8, 2019

thanks @jbusecke

@jhamman
Member

jhamman commented Feb 8, 2019

Thanks for this @jbusecke! Really excited to have this feature in Xarray now.

dcherian pushed a commit to yohai/xarray that referenced this pull request Feb 14, 2019
* master:
  typo in whats_new (pydata#2763)
  Update computation.py to use Python 3 function signatures (pydata#2756)
  add h5netcdf+dask tests (pydata#2737)
  Fix name loss when masking (pydata#2749)
  fix datetime_to_numeric and Variable._to_numeric (pydata#2668)
  Fix mypy errors (pydata#2753)
  enable internal plotting with cftime datetime (pydata#2665)
  remove references to cyordereddict (pydata#2750)
  BUG: Pass kwargs to the FileManager for pynio engine (pydata#2380) (pydata#2732)
  reintroduce pynio/rasterio/iris to py36 test env (pydata#2738)
  Fix CRS being WKT instead of PROJ.4 (pydata#2715)
  Refactor (part of) dataset.py to use explicit indexes (pydata#2696)