Add x,y kwargs for plot.line(). #1926

Merged (6 commits) on Mar 5, 2018

Conversation

@dcherian (Contributor) commented Feb 20, 2018

Description

plot.line now supports both 1D and 2D DataArrays as input. I've changed some variable names to make code clearer:

  1. set xplt, yplt to be values that are passed to ax.plot()
  2. xlabel, ylabel are axes labels
  3. xdim, ydim are dimension names

Example

This code

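import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

z = np.arange(10)  # assumed values; z was not defined in the original snippet
da = xr.DataArray(np.cos(z), dims=['z'], coords=[z], name='f')

# Every (x, y) combination to try; None means "not specified".
xy = [[None, None],
      [None, 'f'],
      [None, 'z'],
      ['f', None],
      ['z', None],
      ['z', 'f'],
      ['f', 'z']]

f, ax = plt.subplots(2, 4)

for aa, (x, y) in enumerate(xy):
    da.plot(x=x, y=y, ax=ax.flat[aa])
    ax.flat[aa].set_title('x=' + str(x) + ' | ' + 'y=' + str(y))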

yields

image

Feedback requested

Should I refactor out the kwarg checking?

        _ensure_plottable(x)
    else:
        if x is not None and x is not darray.name:
            raise ValueError('Cannot make a line plot with x=%r, y=%r and hue=%r'
Contributor

E501 line too long (85 > 79 characters)

@fmaussion (Member) left a comment

Thanks @dcherian ! I have a couple of comments and a general question: do we want to add the control via variable name? I think it makes things more confusing for little added value (names have no control on other plots such as 2D plots).

Otherwise this looks quite good!

doc/plotting.rst Outdated
@@ -197,6 +197,16 @@ It is required to explicitly specify either
Thus, we could have made the previous plot by specifying ``hue='lat'`` instead of ``x='time'``.
If required, the automatic legend can be turned off using ``add_legend=False``.

Data along x-axis
Member

"Rotated line plots" as title? Just a suggestion

Contributor Author

Rotation makes me think of rotating the lines by an angle. How about "Coordinate along y-axis"?

doc/plotting.rst Outdated
Data along x-axis
~~~~~~~~~~~~~~~~~

It is also possible to make line plots such that the data are on the x-axis and a co-ordinate is on the y-axis. This can be done by specifying the ``x`` and ``y`` keyword arguments.
Member

"co-ordinate" -> coordinate

@@ -97,6 +97,8 @@ Enhancements
encoding/decoding of datetimes with non-standard calendars without the
netCDF4 dependency (:issue:`1084`).
By `Joe Hamman <https://github.com/jhamman>`_.
- :py:func:`~plot.line()` learned to make plots with data on x-axis if so specified.
Member

add (:issue:`575`)

1D and 2D DataArrays: Coordinate for x axis.
y : string, optional
1D DataArray: Can be coordinate name or DataArray.name
2D DataArray: Coordinate for y axis.
Member

I'm not sure this is really helping. I'd prefer to stick to the simple Coordinate for x/y axis.

if x is not None and y is not None and x == y:
    raise ValueError('Cannot make a plot with x=%r and y=%r' % (x, y))

if (x is None and y is None) or x == dim or y is darray.name:
Member

y is darray.name -> y == darray.name
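
The change matters because `is` tests object identity while `==` tests value, so two equal names can fail an `is` check. A minimal sketch in plain Python; the variable names and the `matches_name` helper are illustrative and not taken from the PR:

```python
# `==` compares values; `is` compares object identity.
name = 'temp'

# Build an equal string at runtime. CPython normally creates a new object
# here, so an identity check against `name` is not something to rely on.
user_arg = ''.join(['te', 'mp'])

values_equal = (user_arg == name)  # True: same characters


def matches_name(arg, name):
    """Return True when a keyword argument refers to the array's name."""
    return arg == name
```

Whether `user_arg is name` happens to be true depends on interpreter details such as string interning, which is why the equality test is the safer choice.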

    xlabel = dim
    ylabel = darray.name

elif y == dim or x is darray.name:
Member

here too

@dcherian (Contributor, Author) commented Feb 20, 2018

@fmaussion when I added the hue argument, @shoyer suggested that we have a "fully explicit way to make these plots".
In this case this would mean
da.plot(x='temp', y='time', hue='lat')
which seems a lot more readable than
da.plot(y='coordinate', hue='lat')

It would also mean that x,y are different from hue (for hue we do specify dimension name).

@shoyer Thoughts?

EDIT:

@fmaussion Do you mean da.plot(y='time', x=None, hue='lat')?

@fmaussion (Member)

> Do you mean da.plot(y='time', x=None, hue='lat')

Yes, I meant that there should be one and only one way to specify which axis should do what: my understanding of your current implementation is that there are two ways to reach the same goal: either by (i) specifying the name of the variable to the axis you want to plot it onto, or (ii) by specifying the name of the coordinate you want to use for the axis. Since (ii) is the default and only option for 1D and 2D plots until now, I just wondered if (i) is very necessary (or maybe I missed something).

@shoyer (Member) commented Feb 21, 2018

My main thought is that this API would feel much more natural on a Dataset object, alongside a .plot.scatter() method.

That said, I suppose this could still be useful and I don't think it's harmful to expand the API here. It does feel a little strange that if you had a DataArray with non-dimension coordinates you could make a plot without including any of the DataArray values, e.g., xr.DataArray(..., dims=['x'], coords={'x': ..., 'y': ('x', ...)}, name='f').plot.line(x='x', y='y').

@dcherian (Contributor, Author) commented Feb 21, 2018

> either by (i) specifying the name of the variable to the axis you want to plot it onto, or (ii) by specifying the name of the coordinate you want to use for the axis. Since (ii) is the default and only option for 1D and 2D plots until now, I just wondered if (i) is very necessary.

> It does feel a little strange that if you had a DataArray with non-dimension coordinates you could make a plot without including any of the DataArray values

Good points. I'll change it so that only one of x and y can be specified.

> My main thought is that this API would feel much more natural on a Dataset object, alongside a .plot.scatter() method.

It's very useful in oceanography (and meteorology), where we want to plot physical quantities like temperature against depth; it is much more natural to see depth on the y-axis than on the x-axis. This is the original use case @rabernat mentioned in #575.

The Dataset API is a more involved undertaking that I'm not confident of executing at this point.

add_legend = kwargs.pop('add_legend', True)

ax = get_axis(figsize, size, aspect, ax)

error_msg = 'must be either None or %r' % (' or '.join(darray.dims))

if x not in [None, *darray.dims]:
Contributor

E999 SyntaxError: invalid syntax

add_legend = kwargs.pop('add_legend', True)

ax = get_axis(figsize, size, aspect, ax)

error_msg = 'must be either None or one of %r' % list(darray.dims)

if x not in [None, *darray.dims]:
Contributor

E999 SyntaxError: invalid syntax

Contributor Author

I don't know what's wrong here. It runs locally; the tests pass and flake8 goes through plot.py without errors. Is it a python2 vs python3 thing?
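
For context: unpacking inside a list display, as in `[None, *darray.dims]`, was introduced by PEP 448 in Python 3.5, so a linter or interpreter running Python 2 would report it as a syntax error, which fits the E999 message. A sketch of an equivalent membership test that parses on both versions; `dims` stands in for `darray.dims` and `is_valid_axis_arg` is a hypothetical helper, not code from the PR:

```python
dims = ('time', 'lat')  # stand-in for darray.dims


def is_valid_axis_arg(arg, dims):
    """Return True if arg is None or names one of the dimensions."""
    # Concatenating tuples avoids the `[None, *dims]` unpacking syntax.
    return arg in (None,) + tuple(dims)
```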

@shoyer (Member) commented Feb 21, 2018 via email

error_msg = ('must be either None or one of ({0:s})'
             .format(', '.join(['\''+dd+'\'' for dd in darray.dims])))

if x not in [None, *darray.dims]:
Contributor

E999 SyntaxError: invalid syntax


for aa, (x, y) in enumerate(xy):
    da.plot(x=x, y=y, ax=ax.flat[aa])
    ax.flat[aa].set_title('x=' + str(x) + ' | '+'y='+str(y))
Contributor

E226 missing whitespace around arithmetic operator

@dcherian (Contributor, Author)

Updated and rebased. Now these are the only options.

image

@fmaussion (Member)

Thanks! In view of @shoyer's comment, we could revisit the API for datasets at a later stage.

add_legend = kwargs.pop('add_legend', True)

ax = get_axis(figsize, size, aspect, ax)

error_msg = ('must be either None or one of ({0:s})'
             .format(', '.join(['\'' + dd + '\'' for dd in darray.dims])))
Member

just use repr(dd) here instead of '\'' + dd + '\''
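
To illustrate the suggestion: for plain `str` dimension names that contain no quote characters, `repr` produces the same single-quoted text as the manual concatenation. A small sketch, with `dims` as a stand-in for `darray.dims`:

```python
dims = ('time', 'lat')  # stand-in for darray.dims

# Manual quoting, as in the diff under review.
manual = ', '.join(['\'' + dd + '\'' for dd in dims])

# The suggested alternative.
with_repr = ', '.join([repr(dd) for dd in dims])

error_msg = 'must be either None or one of ({0:s})'.format(with_repr)
```

The two can differ when a name is not a plain `str` (for example a Python 2 `unicode` object, whose repr carries a `u` prefix) or when it contains quotes.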

    raise ValueError('y ' + error_msg)

if x is not None and y is not None:
    raise ValueError('You cannot specify both x and y kwargs.')
Member

This should probably be qualified by "for line plots"
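
Put together, the checks discussed in this thread could look roughly like the sketch below. `check_line_xy` is a hypothetical standalone helper written for illustration, with `dims` standing in for `darray.dims`; it is not the code that was merged.

```python
def check_line_xy(x, y, dims):
    """Hypothetical helper mirroring the kwarg checks discussed above."""
    allowed = (None,) + tuple(dims)
    error_msg = 'must be either None or one of %r' % (list(dims),)

    # Each of x and y must be None or the name of a dimension.
    if x not in allowed:
        raise ValueError('x ' + error_msg)
    if y not in allowed:
        raise ValueError('y ' + error_msg)

    # Only one of the two may be given; the other axis shows the data.
    if x is not None and y is not None:
        raise ValueError('You cannot specify both x and y kwargs '
                         'for line plots.')
```

Under these assumptions, `check_line_xy(None, 'z', ('z',))` passes silently, while `check_line_xy('z', 'z', ('z',))` raises the "both x and y" error.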

dcherian added 2 commits February 28, 2018 19:59
Supports both 1D and 2D DataArrays as input.

Change variable names to make code clearer:
   1. set xplt, yplt to be values that are passed to ax.plot()
   2. xlabel, ylabel are axes labels
   3. xdim, ydim are dimension names
@dcherian (Contributor, Author) commented Mar 1, 2018

Done and rebased on master.

Coordinate for which you want multiple lines plotted
(2D DataArrays only).
x, y : string, optional
Coordinates for x, y axis. Only one of these may be specified.
Member

This should probably say something like "The other coordinate plots values from the DataArray on which this plot method is called."
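
A rough sketch of that rule for the 1D case, written in plain Python rather than against xarray. `resolve_line_axes` is a hypothetical function, the depth and temperature values are made up, and the names `xplt`, `yplt`, `xlabel`, `ylabel` follow the PR description:

```python
def resolve_line_axes(values, coord, dim, name, x=None, y=None):
    """Hypothetical 1D sketch: return (xplt, yplt, xlabel, ylabel)."""
    if y == dim:
        # Coordinate on the y-axis, DataArray values on the x-axis.
        return values, coord, name, dim
    # Default (x is None or x == dim): coordinate on x, values on y.
    return coord, values, dim, name


depth = [0, 10, 20]        # made-up coordinate values
temp = [20.0, 18.5, 12.0]  # made-up data values

xplt, yplt, xlabel, ylabel = resolve_line_axes(
    temp, depth, dim='depth', name='temp', y='depth')
```

With `y='depth'` the coordinate ends up on the y-axis and the data on the x-axis, which is the temperature-against-depth layout mentioned earlier in the conversation.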


for aa, (x, y) in enumerate(xy):
    da.plot(x=x, y=y, ax=ax.flat[aa])
    ax.flat[aa].set_title('x=' + str(x) + ' | ' + 'y=' + str(y))
Member

Is there any point to this line?

Contributor Author

No. I copied the test over from my debugging plots. I'll remove it (I assume you mean the set_title bit).

[None, 'z'],
['z', None]]

f, ax = plt.subplots(2, 4)
Member

You only have two cases now, so no need for 2x4 subplots.

@shoyer (Member) commented Mar 3, 2018

@fmaussion any further concerns here? This looks good to me.

@fmaussion fmaussion merged commit 4983f1f into pydata:master Mar 5, 2018
@fmaussion (Member)

Thanks @dcherian ! This is a nice feature

@dcherian dcherian deleted the line-specify-xy branch May 10, 2018 05:12
Development

Successfully merging this pull request may close these issues.

1D line plot with data on the x axis
4 participants