DEPR: Panel deprecated #15601

Closed
wants to merge 15 commits into from
20 changes: 15 additions & 5 deletions doc/source/computation.rst
Expand Up @@ -505,13 +505,18 @@ two ``Series`` or any combination of ``DataFrame/Series`` or
- ``DataFrame/DataFrame``: by default compute the statistic for matching column
names, returning a DataFrame. If the keyword argument ``pairwise=True`` is
passed then computes the statistic for each pair of columns, returning a
``Panel`` whose ``items`` are the dates in question (see :ref:`the next section
``MultiIndexed DataFrame`` whose ``index`` are the dates in question (see :ref:`the next section
Review comment (Member):
I would remove the backticks here ('MultiIndexed DataFrame' is not an object)

Reply (Contributor Author):
done

<stats.moments.corr_pairwise>`).

For example:

.. ipython:: python

df = pd.DataFrame(np.random.randn(1000, 4),
index=pd.date_range('1/1/2000', periods=1000),
columns=['A', 'B', 'C', 'D'])
df = df.cumsum()

df2 = df[:20]
df2.rolling(window=5).corr(df2['B'])
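
The difference between the default behaviour and ``pairwise=True`` for two ``DataFrame`` inputs can be sketched as follows. This is a minimal illustration, assuming a pandas version in which pairwise results are returned as a MultiIndexed DataFrame; the names ``left``, ``right``, ``matched`` and ``pairwise`` are made up for the example and do not appear in the documentation being changed.

```python
import numpy as np
import pandas as pd

# Two small frames sharing the same index and column names (illustrative data).
rng = np.random.default_rng(0)
idx = pd.date_range("2000-01-01", periods=30)
left = pd.DataFrame(rng.standard_normal((30, 2)), index=idx, columns=["A", "B"])
right = pd.DataFrame(rng.standard_normal((30, 2)), index=idx, columns=["A", "B"])

# Default: the statistic is computed only for matching column names,
# so the result has one row per date and one column per shared name.
matched = left.rolling(window=5).corr(right)

# pairwise=True: the statistic is computed for every pair of columns,
# so each date gets its own small matrix, stacked under a 2-level row index.
pairwise = left.rolling(window=5).corr(right, pairwise=True)

print(matched.shape)
print(pairwise.shape)
print(pairwise.index.nlevels)
```

The first windows contain ``NaN`` because fewer than ``window`` observations are available there.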

Expand All @@ -520,11 +525,16 @@ For example:
Computing rolling pairwise covariances and correlations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. warning::

Prior to version 0.20.0 if ``pairwise=True`` was passed, a ``Panel`` would be returned.
Review comment (Member):
Not only in the case of pairwise I think? (every rolling corr returned a Panel, eg df.rolling(12).corr() returns a Panel)

Reply (Member):
What confused me here is that you don't need to pass pairwise=True, as it is the default when doing a rolling corr with self.
But is it correct that it is only with pairwise? So I would maybe first just say something like "when pairwise rolling correlation is calculated, a Panel ..." and only after that say something like "(the default when no other frame is provided or when pairwise=True)"

Reply (Contributor Author):
yes, this is implicit

This will now return a 2-level MultiIndexed DataFrame, see the whatsnew :ref:`here <whatsnew_0200.api_breaking.rolling_pairwise>`.

In financial data analysis and other fields it's common to compute covariance
and correlation matrices for a collection of time series. Often one is also
interested in moving-window covariance and correlation matrices. This can be
done by passing the ``pairwise`` keyword argument, which in the case of
``DataFrame`` inputs will yield a ``Panel`` whose ``items`` are the dates in
``DataFrame`` inputs will yield a MultiIndexed ``DataFrame`` whose ``index`` are the dates in
question. In the case of a single DataFrame argument the ``pairwise`` argument
can even be omitted:

Expand All @@ -539,12 +549,12 @@ can even be omitted:
.. ipython:: python

covs = df[['B','C','D']].rolling(window=50).cov(df[['A','B','C']], pairwise=True)
covs[df.index[-50]]
covs.loc['2002-09-22':]

.. ipython:: python

correls = df.rolling(window=50).corr()
correls[df.index[-50]]
correls.loc['2002-09-22':]

You can efficiently retrieve the time series of correlations between two
columns using ``.loc`` indexing:
Expand All @@ -557,7 +567,7 @@ columns using ``.loc`` indexing:
.. ipython:: python

@savefig rolling_corr_pairwise_ex.png
correls.loc[:, 'A', 'C'].plot()
correls.loc[:, ('A', 'C')].plot()
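
As a hedged alternative to tuple-based ``.loc`` indexing, the time series of correlations between two columns can also be pulled out of the MultiIndexed result with ``xs`` or ``unstack``. The sketch below assumes a pandas version in which ``rolling(...).corr()`` returns a frame whose row index is (date, column name); the variable names ``a_vs_c`` and ``a_vs_c_alt`` are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative data: 200 daily observations of four columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 4)),
                  index=pd.date_range("2000-01-01", periods=200),
                  columns=["A", "B", "C", "D"])

# Rows are indexed by (date, column name); columns are the column names.
correls = df.rolling(window=50).corr()

# Option 1: take the 'C' row of every per-date matrix, then its 'A' column.
a_vs_c = correls.xs("C", level=1)["A"]

# Option 2: move the inner row level into the columns and select the pair.
a_vs_c_alt = correls.unstack(level=1)[("A", "C")]

print(a_vs_c.tail())
```

Both options yield one value per date, so the result can be plotted directly with ``.plot()``.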

.. _stats.aggregate:

Expand Down
55 changes: 55 additions & 0 deletions doc/source/dsintro.rst
Expand Up @@ -763,6 +763,11 @@ completion mechanism so they can be tab-completed:
Panel
-----

.. warning::

In 0.20.0, ``Panel`` is deprecated and will be removed in
a future version. See the section :ref:`Deprecate Panel <dsintro.deprecate_panel>`.

Panel is a somewhat less-used, but still important container for 3-dimensional
data. The term `panel data <http://en.wikipedia.org/wiki/Panel_data>`__ is
derived from econometrics and is partially responsible for the name pandas:
Expand All @@ -783,6 +788,7 @@ From 3D ndarray with optional axis labels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. ipython:: python
:okwarning:

wp = pd.Panel(np.random.randn(2, 5, 4), items=['Item1', 'Item2'],
major_axis=pd.date_range('1/1/2000', periods=5),
Expand All @@ -794,6 +800,7 @@ From dict of DataFrame objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. ipython:: python
:okwarning:

data = {'Item1' : pd.DataFrame(np.random.randn(4, 3)),
'Item2' : pd.DataFrame(np.random.randn(4, 2))}
Expand All @@ -816,6 +823,7 @@ dictionary of DataFrames as above, and the following named parameters:
For example, compare to the construction above:

.. ipython:: python
:okwarning:

pd.Panel.from_dict(data, orient='minor')

Expand All @@ -824,6 +832,7 @@ DataFrame objects with mixed-type columns, all of the data will get upcasted to
``dtype=object`` unless you pass ``orient='minor'``:

.. ipython:: python
:okwarning:

df = pd.DataFrame({'a': ['foo', 'bar', 'baz'],
'b': np.random.randn(3)})
Expand Down Expand Up @@ -851,6 +860,7 @@ This method was introduced in v0.7 to replace ``LongPanel.to_long``, and convert
a DataFrame with a two-level index to a Panel.

.. ipython:: python
:okwarning:

midx = pd.MultiIndex(levels=[['one', 'two'], ['x','y']], labels=[[1,1,0,0],[1,0,1,0]])
df = pd.DataFrame({'A' : [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=midx)
Expand Down Expand Up @@ -880,6 +890,7 @@ A Panel can be rearranged using its ``transpose`` method (which does not make a
copy by default unless the data are heterogeneous):

.. ipython:: python
:okwarning:

wp.transpose(2, 0, 1)

Expand Down Expand Up @@ -909,6 +920,7 @@ Squeezing
Another way to change the dimensionality of an object is to ``squeeze`` a 1-len object, similar to ``wp['Item1']``

.. ipython:: python
:okwarning:

wp.reindex(items=['Item1']).squeeze()
wp.reindex(items=['Item1'], minor=['B']).squeeze()
Expand All @@ -923,12 +935,55 @@ for more on this. To convert a Panel to a DataFrame, use the ``to_frame``
method:

.. ipython:: python
:okwarning:

panel = pd.Panel(np.random.randn(3, 5, 4), items=['one', 'two', 'three'],
major_axis=pd.date_range('1/1/2000', periods=5),
minor_axis=['a', 'b', 'c', 'd'])
panel.to_frame()
Review comment (Member):
All those example above will need a :okwarning:

Reply (Contributor Author):
actually had to add about 10 :>



.. _dsintro.deprecate_panel:

Deprecate Panel
---------------

Over the last few years, pandas has increased in both breadth and depth, with new features,
datatype support, and manipulation routines. As a result, supporting efficient indexing and functional
routines for ``Series``, ``DataFrame`` and ``Panel`` has contributed to an increasingly fragmented and
difficult-to-understand codebase.

The 3-D structure of a ``Panel`` is much less common in many types of data analysis
than the 1-D structure of a ``Series`` or the 2-D structure of a ``DataFrame``. Going forward, it makes sense for
pandas to focus on these two structures exclusively.

Often, a MultiIndexed ``DataFrame`` is all that is needed to work with higher-dimensional data.

In addition, the ``xarray`` package was built from the ground up specifically to
support the multi-dimensional analysis that is one of ``Panel``'s main use cases.
`Here is a link to the xarray panel-transition documentation <http://xarray.pydata.org/en/stable/pandas.html#panel-transition>`__.

.. ipython:: python
:okwarning:

p = tm.makePanel()
p

Convert to a MultiIndex DataFrame

.. ipython:: python
:okwarning:

p.to_frame()

Alternatively, one can convert to an xarray ``DataArray``.

.. ipython:: python

p.to_xarray()

You can see the full documentation for the `xarray package <http://xarray.pydata.org/en/stable/>`__.
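
For readers on a pandas version where ``Panel`` is no longer available, the MultiIndex approach mentioned above can be sketched without constructing a ``Panel`` at all. This is an illustration under stated assumptions, not the layout produced by ``Panel.to_frame()``: here the items are placed in the outer row level via ``pd.concat``, and the names ``frames``, ``stacked``, ``item1`` and ``by_date`` are hypothetical.

```python
import numpy as np
import pandas as pd

# Two 2-D "slices" that would previously have been items of a Panel.
rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-01", periods=5)
frames = {
    "Item1": pd.DataFrame(rng.standard_normal((5, 4)), index=dates, columns=list("ABCD")),
    "Item2": pd.DataFrame(rng.standard_normal((5, 4)), index=dates, columns=list("ABCD")),
}

# Stack the slices into one DataFrame with a 2-level (item, date) row index.
stacked = pd.concat(frames, names=["item", "date"])

# Recover one item (a 2-D slice along the first axis) ...
item1 = stacked.xs("Item1", level="item")

# ... or a cross-section along the second axis (all items on one date).
by_date = stacked.xs(dates[0], level="date")

print(stacked.shape)
print(by_date)
```

Selecting along either level with ``xs`` plays the role that item and major-axis selection played on a ``Panel``.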

.. _dsintro.panelnd:
.. _dsintro.panel4d:

Expand Down
75 changes: 75 additions & 0 deletions doc/source/whatsnew/v0.20.0.txt
Expand Up @@ -10,13 +10,16 @@ users upgrade to this version.
Highlights include:

- The ``.ix`` indexer has been deprecated, see :ref:`here <whatsnew_0200.api_breaking.deprecate_ix>`
- ``Panel`` has been deprecated, see :ref:`here <whatsnew_0200.api_breaking.deprecate_panel>`
- Improved user API when accessing levels in ``.groupby()``, see :ref:`here <whatsnew_0200.enhancements.groupby_access>`
- Improved support for UInt64 dtypes, see :ref:`here <whatsnew_0200.enhancements.uint64_support>`
- A new orient for JSON serialization, ``orient='table'``, that uses the Table Schema spec, see :ref:`here <whatsnew_0200.enhancements.table_schema>`
- Window Binary Corr/Cov operations return a MultiIndexed ``DataFrame`` rather than a ``Panel``, as ``Panel`` is now deprecated, see :ref:`here <whatsnew_0200.api_breaking.rolling_pairwise>`
- Support for S3 handling now uses ``s3fs``, see :ref:`here <whatsnew_0200.api_breaking.s3>`
- Google BigQuery support now uses the ``pandas-gbq`` library, see :ref:`here <whatsnew_0200.api_breaking.gbq>`
- Switched the test framework to use `pytest <http://doc.pytest.org/en/latest>`__ (:issue:`13097`)


Check the :ref:`API Changes <whatsnew_0200.api_breaking>` and :ref:`deprecations <whatsnew_0200.deprecations>` before updating.

.. contents:: What's new in v0.20.0
Expand Down Expand Up @@ -425,6 +428,33 @@ Using ``.iloc``. Here we will get the location of the 'A' column, then use *posi
df.iloc[[0, 2], df.columns.get_loc('A')]


.. _whatsnew_0200.api_breaking.deprecate_panel:

Deprecate Panel
^^^^^^^^^^^^^^^

``Panel`` is deprecated and will be removed in a future version. The recommended way to represent 3-D data is
with a ``MultiIndex`` on a ``DataFrame`` via the :meth:`~Panel.to_frame` method, or with the `xarray package <http://xarray.pydata.org/en/stable/>`__. Pandas
provides a :meth:`~Panel.to_xarray` method to automate this conversion. See the documentation :ref:`Deprecate Panel <dsintro.deprecate_panel>` (:issue:`13563`).

.. ipython:: python
:okwarning:

p = tm.makePanel()
p

Convert to a MultiIndex DataFrame

.. ipython:: python

p.to_frame()

Convert to an xarray DataArray

.. ipython:: python

p.to_xarray()

.. _whatsnew.api_breaking.io_compat:

Possible incompat for HDF5 formats for pandas < 0.13.0
Expand Down Expand Up @@ -836,6 +866,51 @@ New Behavior:

df.groupby('A').agg([np.mean, np.std, np.min, np.max])

.. _whatsnew_0200.api_breaking.rolling_pairwise:

Window Binary Corr/Cov operations return a MultiIndex DataFrame
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A binary window operation, like ``.corr()`` or ``.cov()``, when operating on a ``.rolling(..)``, ``.expanding(..)``, or ``.ewm(..)`` object,
will now return a 2-level MultiIndexed ``DataFrame`` rather than a ``Panel``, as ``Panel`` is now deprecated,
see :ref:`here <whatsnew_0200.api_breaking.deprecate_panel>`. These are equivalent in function,
but MultiIndexed ``DataFrame`` objects enjoy more support in pandas.
See the section on :ref:`Windowed Binary Operations <stats.moments.binary>` for more information. (:issue:`15677`)

.. ipython:: python

np.random.seed(1234)
df = pd.DataFrame(np.random.rand(100, 2),
columns=pd.Index(['A', 'B'], name='bar'),
index=pd.date_range('20160101',
periods=100, freq='D', name='foo'))
df.tail()

Old Behavior:

.. code-block:: ipython

In [2]: df.rolling(12).corr()
Out[2]:
<class 'pandas.core.panel.Panel'>
Dimensions: 100 (items) x 2 (major_axis) x 2 (minor_axis)
Items axis: 2016-01-01 00:00:00 to 2016-04-09 00:00:00
Major_axis axis: A to B
Minor_axis axis: A to B

New Behavior:

.. ipython:: python

res = df.rolling(12).corr()
res.tail()

Retrieving a correlation matrix for a cross-section

.. ipython:: python

df.rolling(12).corr().loc['2016-04-07']
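
The structure of the new return value can be checked with a short sketch. It reuses the ``df`` construction from the example above and assumes a pandas version with the new behaviour; selecting the cross-section with ``xs`` and an explicit ``Timestamp`` is a choice made here for clarity, and ``matrix`` is an illustrative name.

```python
import numpy as np
import pandas as pd

np.random.seed(1234)
df = pd.DataFrame(np.random.rand(100, 2),
                  columns=pd.Index(["A", "B"], name="bar"),
                  index=pd.date_range("20160101", periods=100, freq="D", name="foo"))

# One 2x2 correlation matrix per date, stacked under a 2-level row index.
res = df.rolling(12).corr()

# Cross-section for a single date: a plain 2x2 correlation matrix.
matrix = res.xs(pd.Timestamp("2016-04-07"), level=0)

print(res.index.nlevels, len(res))
print(matrix)
```

Each per-date matrix is symmetric with ones on the diagonal, as expected for a correlation matrix; dates with fewer than 12 prior observations contain ``NaN``.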

.. _whatsnew_0200.api_breaking.hdfstore_where:

HDFStore where string comparison
Expand Down
16 changes: 13 additions & 3 deletions pandas/core/panel.py
Expand Up @@ -4,10 +4,8 @@
# pylint: disable=E1103,W0231,W0212,W0621
from __future__ import division

import warnings

import numpy as np

import warnings
from pandas.types.cast import (infer_dtype_from_scalar,
maybe_cast_item)
from pandas.types.common import (is_integer, is_list_like,
Expand Down Expand Up @@ -132,6 +130,18 @@ def _constructor(self):

def __init__(self, data=None, items=None, major_axis=None, minor_axis=None,
copy=False, dtype=None):
# deprecation GH13563
warnings.warn("\nPanel is deprecated and will be removed in a "
"future version.\nThe recommended way to represent "
"these types of 3-dimensional data is with a "
"MultiIndex on a DataFrame, via the "
"Panel.to_frame() method\n"
"Alternatively, you can use the xarray package "
"http://xarray.pydata.org/en/stable/.\n"
"Pandas provides a `.to_xarray()` method to help "
"automate this conversion.\n",
DeprecationWarning, stacklevel=3)

self._init_data(data=data, items=items, major_axis=major_axis,
minor_axis=minor_axis, copy=copy, dtype=dtype)
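
The constructor-level deprecation pattern used in this hunk can be sketched in isolation. ``LegacyContainer`` below is a hypothetical stand-in, not part of pandas, and the sketch uses ``stacklevel=2`` (pointing at the direct caller of the constructor) rather than the ``stacklevel=3`` chosen in the diff. It also shows one way to capture the warning, which is roughly what a test for the deprecation would need to do.

```python
import warnings


class LegacyContainer:
    """Hypothetical deprecated class used only to illustrate the pattern."""

    def __init__(self, data=None):
        # Emit the warning at construction time; stacklevel=2 attributes it
        # to the code that called LegacyContainer(...), not to __init__.
        warnings.warn("LegacyContainer is deprecated and will be removed "
                      "in a future version.",
                      DeprecationWarning, stacklevel=2)
        self.data = data


# Capture warnings explicitly; DeprecationWarning is often hidden by default.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    LegacyContainer([1, 2, 3])

messages = [str(w.message) for w in caught
            if issubclass(w.category, DeprecationWarning)]
print(messages)
```

In documentation builds, the same warning is what makes the ``:okwarning:`` option necessary on every ``ipython`` block that constructs the deprecated object.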

Expand Down