-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: move plotting funcs to pd.plotting #13579
Conversation
Current coverage is 85.68% (diff: 69.35%)@@ master #13579 diff @@
==========================================
Files 145 154 +9
Lines 51419 51464 +45
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 44057 44098 +41
- Misses 7362 7366 +4
Partials 0 0
|
@sinhrks Generally sounds like a good plan! |
+1 overall. @sinhrks will this break old code? Does |
try: | ||
import matplotlib as mpl | ||
return (str(mpl.__version__) <= LooseVersion('1.2.1') and | ||
str(mpl.__version__)[0] != '0') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to add these checks for '0'
when matplotlib's versioneer was broken for 1.5. I believe it's fixed now. Do you want to remove them in here, or a separate PR?
477827a
to
e184485
Compare
@sinhrks Can you rebase this? Further it is mainly docs that is still needed? |
@sinhrks as usual, certainly can include in 0.19.0, just trying to clear the decks. |
874bc5a
to
b2eebed
Compare
@sinhrks can you update |
@sinhrks do you time to update this? |
@sinhrks Do you have time to update this? I still think this is a useful refactor. I can also do a rebase to get it in. |
Once updated to the latest master. Let me add release note later. |
subplot.format_coord = lambda t, y: ( | ||
"t = {0} y = {1:8f}".format(Period(ordinal=int(t), freq=freq), y)) | ||
|
||
pylab.draw_if_interactive() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason you left this file? (for some sort of compat)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or I see, you may have left all of the pandas/tseries namespace as is (and just moved things), ok then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
And sorry, I merged a PR on the converters. I had better waited to merge that one until this is merged, as you now have to rebase again (for the converters.py file)
Some other things:
- I would personally move
tests/plotting
also into this new module (soplotting/tests
) - Should we raise a deprecation warning on the old namespaces? So we can remove them sometime.
@@ -0,0 +1,20 @@ | |||
""" | |||
Plotting api |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is a bit the style in other pandas modules as well, but can't we just put this in __init__.py
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is cleaner here actually. And it is only imported explicity (unlike __init__
which is implicity imported)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean with implicitly imported?
The difference with the other modules where we use this pattern, is that, eg in pandas.io
, pandas.io.api
are those functions that are imported top-level (imported in pandas/__init__.py
), not what makes up the pandas.io
namespace itself. Here, we don't have a collection of functions (put in api.py
) which we want to have top level, but we just have the pandas.io
namespace. So IMO it makes more sense and is cleaner to have it just in the init file here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what I mean is, unless you explicity reference these imports, like from pandas.plotting import .....)
you would by-default actually import anything.
thoughI guess for back-compat this is needed.
IOW, the imports for things like matplotlib
would only be done when the pandas.plotting namespace is used (IOW, you try to plot something), so the import is done lazily later, rather than on startup now (to register converters and such).
# flake8: noqa | ||
|
||
try: # mpl optional | ||
from pandas.plotting import converter as conv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave out the conv
(now both conv
and converter
in pandas.plotting namespace)
andrews_curves, bootstrap_plot, | ||
parallel_coordinates, lag_plot, | ||
autocorrelation_plot) | ||
from pandas.plotting.plotting import (boxplot, scatter_plot, grouped_hist, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the most important :-), but the plotting.plotting
looks a bit silly. Maybe we can call this file base
or core
to make it more clear?
I am also not sure if grouped_hist
, hist_frame
, hist_series
should be exposed in the public pandas.plotting
namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments on the deprecation.
Wondering, should we deprecate the top-level pd.scatter_matrix
as well? (pointing to pd.plotting.scatter_matrix
)
def outer(t=t): | ||
|
||
def wrapper(*args, **kwargs): | ||
warnings.warn("pandas.tools.plotting{t} is deprecated. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plotting{t}
-> plotting.{t}
warnings.warn("pandas.tools.plotting{t} is deprecated. " | ||
"import from the " | ||
"pandas.plotting.{t} instead".format(t=t), | ||
UserWarning, stacklevel=3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this a FutureWarning instead of UserWarning (I think FutureWarning is OK here, as most of those functions will be used by users directly)
def wrapper(*args, **kwargs): | ||
warnings.warn("pandas.tools.plotting{t} is deprecated. " | ||
"import from the " | ||
"pandas.plotting.{t} instead".format(t=t), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the "from the" can be left out.
I would do either:
"pandas.tools.plotting.scatter_matrix is deprecated, import pandas.plotting.scatter_matrix instead"
or
"pandas.tools.plotting.scatter_matrix is deprecated, import from pandas.plotting instead"
Thx, updated based on the comments.
|
top namespace ``pandas.plotting``. All the public plotting functions should be available | ||
from ``pandas.plotting``. | ||
|
||
Also, ``scatter_matrix`` function imported directly under ``pandas`` namespace is also deprecated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas.scatter_matrix
Deprecate .plotting | ||
^^^^^^^^^^^^^^^^^^^ | ||
|
||
``pandas.tools.plotting`` module has been deprecated, moving directory under the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pandas.tools.plotting
module has been deprecated, in favor of the top level namespace pandas.plotting
Previous script: | ||
|
||
.. code-block:: python | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add import pandas as pd
from pandas.tools.plotting import scatter_matrix, plot_params | ||
|
||
# deprecate tools.plotting, and scatter_matrix on the top namespace | ||
import pandas.tools.plotting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can actually deprecate pandas.tools.plotting
with pandas.util.deprecate_module
import pandas.tools.plotting | ||
from pandas.plotting import plot_params | ||
# do not import deprecate to top namespace | ||
scatter_matrix = pandas.util.decorators.deprecate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason NOT to remove plot_parms
as well from top-level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on moving plot_params as well
really just some minor comments. ok with merging, then fixing after if it is easier (one test failure on locale on travis, not sure why that is). |
parallel_coordinates, lag_plot, | ||
autocorrelation_plot) | ||
from pandas.plotting.core import (boxplot, scatter_plot, grouped_hist, | ||
hist_frame, hist_series) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I mentioned this before, but is there use in exposing grouped_hist
, hist_frame
, hist_series
as public functions?
There are also not documented I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation argument is actually also true for boxplot
and scatter_plot
, and for both the functionality is also available in DataFrame.plot.scatter or DataFrame.boxplot/plot.box
@sinhrks seems really close...can you rebase |
I rebased this one, but pushed to a new PR: #16005 (this PR predates from when you can push to PR's branch) |
@@ -823,7 +823,7 @@ before plotting. | |||
Plotting Tools | |||
-------------- | |||
|
|||
These functions can be imported from ``pandas.tools.plotting`` | |||
These functions can be imported from ``pandas.plotting`` | |||
and take a :class:`Series` or :class:`DataFrame` as an argument. | |||
|
|||
.. _visualization.scatter_matrix: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from pandas.tools.plotting import scatter_matrix
from matplotlib import cm
feature_names = ['mass', 'width', 'height', 'color_score']
X = fruits[feature_names]
y = fruits['fruit_label']
cmap = cm.get_cmap('gnuplot')
scatter = pd.scatter_matrix(X, c = y, marker = 'o', s=40, hist_kwds={'bins':15}, figsize=(9,9), cmap = cmap)
plt.suptitle('Scatter-matrix for each input variable')
plt.savefig('fruits_scatter_matrix')
Rebased in #16005
git diff upstream/master | flake8 --diff
cleanup
tools/plotting
asplotting
subpackage. Because the differences are being huge, would like to split 3 part if base direction is OK:1. Move plotting related tests under(#13621)pandas/tests/plotting
splitting toSeries
,DataFrame
, etc. (0.18.2)2. Create
pd.plotting
and move related files (0.20.0)3. Split / refactor
pd.plotting
to use small utility functions / property control classes. Add tests. (0.20.0)