Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in datetime handling in plots #6102

Closed
jklymak opened this issue Dec 22, 2021 · 10 comments · Fixed by #6064 or #6109
Closed

Regression in datetime handling in plots #6102

jklymak opened this issue Dec 22, 2021 · 10 comments · Fixed by #6064 or #6109

Comments

@jklymak
Copy link
Contributor

jklymak commented Dec 22, 2021

#5794 (ea28861) introduced a regression in whether or not pandas datetime converters are loaded or Matplotlib's. This leads to basic Matplotlib-native plotting failing matplotlib/matplotlib#22023 Previously matplotlib's converters were loaded, now pandas are being loaded, despite the downstream user not ever using xarray's plotting utilities.

test code

import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

import matplotlib.units as munits
print(munits.registry)
ds = xr.Dataset({"time": [np.datetime64('2000-01-01'), np.datetime64('2000-01-02')],
                     "sir": [0, 1]})

fig, ax = plt.subplots()
# crashes:
ax.scatter(ds['time'], ds['sir'])
plt.show()

Previously:

{... 
<class 'numpy.datetime64'>: <matplotlib.dates._SwitchableDateConverter object at 0x106434ac0>, 
<class 'datetime.date'>: <matplotlib.dates._SwitchableDateConverter object at 0x106434ac0>, 
<class 'datetime.datetime'>: <matplotlib.dates._SwitchableDateConverter object at 0x106434ac0>}

Now:

{... <class 'numpy.datetime64'>: <pandas.plotting._matplotlib.converter.DatetimeConverter object at 0x17f288160>, 
<class 'datetime.date'>: <pandas.plotting._matplotlib.converter.DatetimeConverter object at 0x17f182250>, 
<class 'datetime.datetime'>: <pandas.plotting._matplotlib.converter.DatetimeConverter object at 0x17f1821c0>, 
<class 'pandas._libs.tslibs.timestamps.Timestamp'>: <pandas.plotting._matplotlib.converter.DatetimeConverter object at 0x17f16ff40>, 
<class 'pandas._libs.tslibs.period.Period'>: <pandas.plotting._matplotlib.converter.PeriodConverter object at 0x17f16ffa0>, 
<class 'datetime.time'>: <pandas.plotting._matplotlib.converter.TimeConverter object at 0x17f288130>}

As you can see, the pandas converters have been loaded without any use of pandas nor xarray plotting utilities.

Suggestion

Of course if xarray plotting is loaded, you should use and register what date converters you would like (I'd suggest matplotlib.dates.ConciseConverter, but your mileage may vary). But I think if the user is just trying to use xarray to load a data set, they should not have decisions made for them about the converter (or any other plotting functions), and to prevent confusion they should get the default matplotlib converter since it handles datetime64 just fine.

I think it could also be argued that this is a pandas issue, in that just importing pandas should not automatically register their converters unless their plotting is used. ping @TomAugspurger because I thought that was the plan, but apparently things changed. And it indeed appears their converter has a bug in it for matplotlib scatter.

Thanks!

@Illviljan
Copy link
Contributor

The registration happens here, added in #1669 in 2017:

def register_pandas_datetime_converter_if_needed():
# based on https://github.com/pandas-dev/pandas/pull/17710
global _registered
if not _registered:
pd.plotting.register_matplotlib_converters()
_registered = True

Is the registration even needed anymore I wonder? Matplotlib started supporting datetime64 in 2018: https://github.com/matplotlib/matplotlib/releases/tag/v2.2.0

@jklymak
Copy link
Contributor Author

jklymak commented Dec 23, 2021

Hi @Illviljan that is correct. However after #5794 xarray is more aggressively making the pandas choice for the user.

I'll play with it a bit to see if just removing your explicit registration fixes the problem. However changing the datetime converter would be a breaking change (to your plotting) that I'm not sure you want.

This is a tricky problem that I'm not sure matplotlib has handled properly (full disclosure, I'm on the mpl dev team and usually handle datetime issues, though I didn't design our units registry). Having a registry that users can change is very flexible. However when downstream libraries like xarray or pandas affect user plotting just by importing the package, it leads to considerable confusion as users don't necessarily know this has happened or how to get back to the Matplotlib default. Particularly if they are not using the package's plotting utilities, but just the other features and/or data types (for instance I love xarray and use it all the time in my data analysis, but I rarely use the plotting convenience functions)

@Illviljan
Copy link
Contributor

There's a pr for reverting, #6064, for a separate reason that we probably should merge soon.

I simply don't understand what that registration does, it might well be just a quick fix in a time when mpl didn't support datetimes and it might be good idea to simplify and remove it in a future pr.

@dcherian
Copy link
Contributor

So is #6109 enough to permanently fix this?

@dcherian dcherian reopened this Dec 29, 2021
@jklymak
Copy link
Contributor Author

jklymak commented Dec 29, 2021

As I'm away from a computer for a few days so can't double check, but I did bisect the problem to the pr that was reverted.

However you could keep this open for a more fulsome discussion of date handling and whether xarray wants to use the pandas or matplotlib converters. I would actually be pretty happy if pandas also just used matplotlibs converters - we already jump through some hoops to make data frames work.

@dcherian
Copy link
Contributor

whether xarray wants to use the pandas or matplotlib converters.

IMO matplotlib is fine

This code is from before matplotlib supported datetime64.

@jklymak
Copy link
Contributor Author

jklymak commented Jan 1, 2022

I checked the code from above, and it has the Matplotlib unit handlers rather than the pandas

@jklymak
Copy link
Contributor Author

jklymak commented Jan 1, 2022

BTW, maybe you could/should add a test for this behaviour?

@Illviljan
Copy link
Contributor

Do you have an easy test in mind?

@jklymak
Copy link
Contributor Author

jklymak commented Jan 1, 2022

Sure see #6128

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants