Remove SparseSeries and SparseDataFrame #28425

Merged
merged 36 commits into from
Sep 18, 2019
Changes from 26 commits
acf5f2f
CLN: Remove sparse
TomAugspurger Sep 12, 2019
5418dd5
round 2
TomAugspurger Sep 12, 2019
f61b5e3
Round 3
TomAugspurger Sep 13, 2019
238db69
round 4
TomAugspurger Sep 13, 2019
7448795
remove hdf
TomAugspurger Sep 13, 2019
f285272
some more
TomAugspurger Sep 13, 2019
b6fb1aa
cleanup
TomAugspurger Sep 13, 2019
c476b21
note
TomAugspurger Sep 13, 2019
fc34fe8
fixups
TomAugspurger Sep 13, 2019
3cc4765
pickle changes
TomAugspurger Sep 13, 2019
766a2f2
pickle compat
TomAugspurger Sep 13, 2019
dd51140
skip feather
TomAugspurger Sep 13, 2019
129e89e
fixups
TomAugspurger Sep 13, 2019
5b711c6
Merge remote-tracking branch 'upstream/master' into remove-sparse
TomAugspurger Sep 13, 2019
075bfd2
cleanups
TomAugspurger Sep 13, 2019
2b58e53
black
TomAugspurger Sep 13, 2019
413347f
to_sparse docs
TomAugspurger Sep 13, 2019
d5828e3
doc note
TomAugspurger Sep 13, 2019
047773e
rm sparse frame
TomAugspurger Sep 13, 2019
2d8d195
rm sparse series
TomAugspurger Sep 13, 2019
fa508c1
docs
TomAugspurger Sep 13, 2019
9b61370
doc
TomAugspurger Sep 13, 2019
0c530ae
remove new pickle
TomAugspurger Sep 13, 2019
5d55a49
Update v0.25.0.rst
TomAugspurger Sep 13, 2019
58b848a
Update v0.25.0.rst
TomAugspurger Sep 13, 2019
7a7e2d3
Merge remote-tracking branch 'upstream/master' into remove-sparse
TomAugspurger Sep 16, 2019
f1afc8f
added new legacy pickle files
TomAugspurger Sep 16, 2019
7742e36
Merge remote-tracking branch 'upstream/master' into remove-sparse
TomAugspurger Sep 17, 2019
008931a
shim
TomAugspurger Sep 17, 2019
04bf466
shim
TomAugspurger Sep 17, 2019
a4a21ae
revert io changes
TomAugspurger Sep 17, 2019
a8b0d65
warning for sparse
TomAugspurger Sep 17, 2019
77b7da3
Fixup typing
TomAugspurger Sep 17, 2019
d265ba9
format
TomAugspurger Sep 17, 2019
c2a9514
fixup typing
TomAugspurger Sep 17, 2019
0c02b2a
0.24.0 todo
TomAugspurger Sep 17, 2019
2 changes: 0 additions & 2 deletions doc/redirects.csv
@@ -503,7 +503,6 @@ generated/pandas.DataFrame.to_parquet,../reference/api/pandas.DataFrame.to_parqu
generated/pandas.DataFrame.to_period,../reference/api/pandas.DataFrame.to_period
generated/pandas.DataFrame.to_pickle,../reference/api/pandas.DataFrame.to_pickle
generated/pandas.DataFrame.to_records,../reference/api/pandas.DataFrame.to_records
generated/pandas.DataFrame.to_sparse,../reference/api/pandas.DataFrame.to_sparse
generated/pandas.DataFrame.to_sql,../reference/api/pandas.DataFrame.to_sql
generated/pandas.DataFrame.to_stata,../reference/api/pandas.DataFrame.to_stata
generated/pandas.DataFrame.to_string,../reference/api/pandas.DataFrame.to_string
@@ -1432,7 +1431,6 @@ generated/pandas.Series.to_msgpack,../reference/api/pandas.Series.to_msgpack
generated/pandas.Series.to_numpy,../reference/api/pandas.Series.to_numpy
generated/pandas.Series.to_period,../reference/api/pandas.Series.to_period
generated/pandas.Series.to_pickle,../reference/api/pandas.Series.to_pickle
generated/pandas.Series.to_sparse,../reference/api/pandas.Series.to_sparse
generated/pandas.Series.to_sql,../reference/api/pandas.Series.to_sql
generated/pandas.Series.to_string,../reference/api/pandas.Series.to_string
generated/pandas.Series.to_timestamp,../reference/api/pandas.Series.to_timestamp
8 changes: 0 additions & 8 deletions doc/source/reference/frame.rst
@@ -356,15 +356,7 @@ Serialization / IO / conversion
DataFrame.to_msgpack
DataFrame.to_gbq
DataFrame.to_records
DataFrame.to_sparse
DataFrame.to_dense
DataFrame.to_string
DataFrame.to_clipboard
DataFrame.style

Sparse
~~~~~~
.. autosummary::
:toctree: api/

SparseDataFrame.to_coo
11 changes: 0 additions & 11 deletions doc/source/reference/series.rst
@@ -576,18 +576,7 @@ Serialization / IO / conversion
Series.to_sql
Series.to_msgpack
Series.to_json
Series.to_sparse
Series.to_dense
Series.to_string
Series.to_clipboard
Series.to_latex


Sparse
------

.. autosummary::
:toctree: api/

SparseSeries.to_coo
SparseSeries.from_coo
37 changes: 26 additions & 11 deletions doc/source/user_guide/io.rst
@@ -4658,26 +4658,41 @@ See the `Full Documentation <https://github.com/wesm/feather>`__.

Write to a feather file.

.. ipython:: python
:okwarning:
.. TODO(Arrow 0.15): remove change these back to .. ipython blocks.

df.to_feather('example.feather')
.. code-block:: python

Read from a feather file.
>>> df.to_feather('example.feather')

.. ipython:: python
:okwarning:
Read from a feather file.

result = pd.read_feather('example.feather')
result
.. code-block:: python

# we preserve dtypes
result.dtypes
>>> result = pd.read_feather('example.feather')
>>> result
a b c d e f g h i
0 a 1 3 4.0 True a 2013-01-01 2013-01-01 00:00:00-05:00 2013-01-01 00:00:00.000000000
1 b 2 4 5.0 False b 2013-01-02 2013-01-02 00:00:00-05:00 2013-01-01 00:00:00.000000001
2 c 3 5 6.0 True c 2013-01-03 2013-01-03 00:00:00-05:00 2013-01-01 00:00:00.000000002

>>> # we preserve dtypes
>>> result.dtypes
a object
b int64
c uint8
d float64
e bool
f category
g datetime64[ns]
h datetime64[ns, US/Eastern]
i datetime64[ns]
dtype: object

.. ipython:: python
:suppress:

os.remove('example.feather')
if os.path.exists("example.feather"):
os.remove('example.feather')


.. _io.parquet:
20 changes: 5 additions & 15 deletions doc/source/user_guide/sparse.rst
@@ -6,12 +6,6 @@
Sparse data structures
**********************

.. note::

``SparseSeries`` and ``SparseDataFrame`` have been deprecated. Their purpose
is served equally well by a :class:`Series` or :class:`DataFrame` with
sparse values. See :ref:`sparse.migration` for tips on migrating.

Pandas provides data structures for efficiently storing sparse data.
These are not necessarily sparse in the typical "mostly 0". Rather, you can view these
objects as being "compressed" where any data matching a specific value (``NaN`` / missing value, though any value
@@ -168,6 +162,11 @@ the correct dense result.
Migrating
---------

.. note::

``SparseSeries`` and ``SparseDataFrame`` were removed in pandas 1.0.0. This migration
guide is present to aid in migrating from previous versions.

In older versions of pandas, the ``SparseSeries`` and ``SparseDataFrame`` classes (documented below)
were the preferred way to work with sparse data. With the advent of extension arrays, these subclasses
are no longer needed. Their purpose is better served by using a regular Series or DataFrame with
@@ -366,12 +365,3 @@ row and columns coordinates of the matrix. Note that this will consume a signifi

ss_dense = pd.Series.sparse.from_coo(A, dense_index=True)
ss_dense


.. _sparse.subclasses:

Sparse subclasses
-----------------

The :class:`SparseSeries` and :class:`SparseDataFrame` classes are deprecated. Visit their
API pages for usage.
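
The sparse.rst text above describes sparse storage as "compressed" storage that omits one designated fill value, and points readers to a regular `Series` with sparse values. A minimal sketch of that idea follows, assuming pandas 1.0 or later; the variable names are illustrative and do not come from the PR.

```python
import pandas as pd

# Only values that differ from fill_value are physically stored.
arr = pd.arrays.SparseArray([0.0, 0.0, 1.0, 0.0, 2.0], fill_value=0.0)

# A regular Series holding sparse values; no SparseSeries subclass involved.
s = pd.Series(arr)

print(type(s).__name__)    # the container class
print(s.dtype)             # a sparse extension dtype
print(s.sparse.sp_values)  # the values that are actually stored
print(s.sparse.density)    # fraction of entries that are stored
```

Sparse-specific operations live on the `.sparse` accessor, so the object remains an ordinary `Series` everywhere else.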
6 changes: 2 additions & 4 deletions doc/source/whatsnew/v0.16.0.rst
@@ -91,8 +91,7 @@ Interaction with scipy.sparse

Added :meth:`SparseSeries.to_coo` and :meth:`SparseSeries.from_coo` methods (:issue:`8048`) for converting to and from ``scipy.sparse.coo_matrix`` instances (see :ref:`here <sparse.scipysparse>`). For example, given a SparseSeries with MultiIndex we can convert to a `scipy.sparse.coo_matrix` by specifying the row and column labels as index levels:

.. ipython:: python
:okwarning:
.. code-block:: python
Review discussion on this line:

Member: Could be done as a follow-up, and it is probably something easy for the community to contribute, but ideally, since we are converting these from ipython to python directives, we should add the original output (applicable in a few places below).

Contributor Author: Certainly wouldn't object, but I think that's fairly low-value since, e.g., the 0.16 version of the docs at https://pandas.pydata.org/pandas-docs/version/0.16.0/whatsnew.html will have the correct rendering.

Member: Yeah, I don't think you should do it here as there's already enough; it's just a nice-to-have that a first-time contributor could pick up. I can open an issue for it later.

s = pd.Series([3.0, np.nan, 1.0, 3.0, np.nan, np.nan])
s.index = pd.MultiIndex.from_tuples([(1, 2, 'a', 0),
@@ -121,8 +120,7 @@ Added :meth:`SparseSeries.to_coo` and :meth:`SparseSeries.from_coo` methods (:is
The from_coo method is a convenience method for creating a ``SparseSeries``
from a ``scipy.sparse.coo_matrix``:

.. ipython:: python
:okwarning:
.. code-block:: python

from scipy import sparse
A = sparse.coo_matrix(([3.0, 1.0, 2.0], ([1, 0, 0], [0, 2, 3])),
6 changes: 2 additions & 4 deletions doc/source/whatsnew/v0.18.1.rst
@@ -393,8 +393,7 @@ used in the ``pandas`` implementation (:issue:`12644`, :issue:`12638`, :issue:`1

An example of this signature augmentation is illustrated below:

.. ipython:: python
:okwarning:
.. code-block:: python

sp = pd.SparseDataFrame([1, 2, 3])
sp
@@ -409,8 +408,7 @@ Previous behaviour:

New behaviour:

.. ipython:: python
:okwarning:
.. code-block:: python

np.cumsum(sp, axis=0)

6 changes: 2 additions & 4 deletions doc/source/whatsnew/v0.19.0.rst
@@ -1235,8 +1235,7 @@ Operators now preserve dtypes

- Sparse data structure now can preserve ``dtype`` after arithmetic ops (:issue:`13848`)

.. ipython:: python
:okwarning:
.. code-block:: python

s = pd.SparseSeries([0, 2, 0, 1], fill_value=0, dtype=np.int64)
s.dtype
@@ -1245,8 +1244,7 @@ Operators now preserve dtypes

- Sparse data structure now support ``astype`` to convert internal ``dtype`` (:issue:`13900`)

.. ipython:: python
:okwarning:
.. code-block:: python

s = pd.SparseSeries([1., 0., 2., 0.], fill_value=0)
s
5 changes: 2 additions & 3 deletions doc/source/whatsnew/v0.20.0.rst
@@ -338,8 +338,7 @@ See the :ref:`documentation <sparse.scipysparse>` for more information. (:issue:

All sparse formats are supported, but matrices that are not in :mod:`COOrdinate <scipy.sparse>` format will be converted, copying data as needed.

.. ipython:: python
:okwarning:
.. code-block:: python

from scipy.sparse import csr_matrix
arr = np.random.random(size=(1000, 5))
@@ -351,7 +350,7 @@ All sparse formats are supported, but matrices that are not in :mod:`COOrdinate

To convert a ``SparseDataFrame`` back to sparse SciPy matrix in COO format, you can use:

.. ipython:: python
.. code-block:: python

sdf.to_coo()

3 changes: 1 addition & 2 deletions doc/source/whatsnew/v0.25.0.rst
@@ -902,8 +902,7 @@ by a ``Series`` or ``DataFrame`` with sparse values.

**Previous way**

.. ipython:: python
:okwarning:
.. code-block:: python

df = pd.SparseDataFrame({"A": [0, 0, 1, 2]})
df.dtypes
9 changes: 9 additions & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -82,8 +82,17 @@ Deprecations

.. _whatsnew_1000.prior_deprecations:


Removed SparseSeries and SparseDataFrame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``SparseSeries`` and ``SparseDataFrame`` have been removed (:issue:`28425`).
We recommend using a ``Series`` or ``DataFrame`` with sparse values instead.
See :ref:`sparse.migration` for help with migrating existing code.

Removal of prior version deprecations/changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Removed the previously deprecated :meth:`Series.get_value`, :meth:`Series.set_value`, :meth:`DataFrame.get_value`, :meth:`DataFrame.set_value` (:issue:`17739`)
- Changed the the default value of `inplace` in :meth:`DataFrame.set_index` and :meth:`Series.set_axis`. It now defaults to False (:issue:`27600`)
- :meth:`pandas.Series.str.cat` now defaults to aligning ``others``, using ``join='left'`` (:issue:`27611`)
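
The v1.0.0 note above recommends a `DataFrame` with sparse values in place of `SparseDataFrame`. One way this could look is sketched below, assuming pandas 1.0 or later; the column name and data mirror the v0.25.0 "Previous way" example, and the conversion via `astype` is one option among several rather than the only migration path.

```python
import pandas as pd

# Dense data in which half of the entries equal the fill value (0 here).
dense = pd.DataFrame({"A": [0, 0, 1, 2]})

# Convert every column to a sparse dtype; the result is still a plain DataFrame.
sparse_df = dense.astype(pd.SparseDtype("int64", fill_value=0))

print(type(sparse_df).__name__)
print(sparse_df.dtypes)
print(sparse_df.sparse.density)

# Round trip back to dense storage through the .sparse accessor.
round_trip = sparse_df.sparse.to_dense()
print(round_trip["A"].tolist())
```

Because the container stays a regular `DataFrame`, existing code that only needs the data, not the subclass, should keep working after the conversion.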
7 changes: 1 addition & 6 deletions pandas/__init__.py
@@ -114,12 +114,7 @@
DataFrame,
)

from pandas.core.sparse.api import (
SparseArray,
SparseDataFrame,
SparseSeries,
SparseDtype,
)
from pandas.core.sparse.api import SparseArray, SparseDtype

from pandas.tseries.api import infer_freq
from pandas.tseries import offsets
5 changes: 1 addition & 4 deletions pandas/_typing.py
@@ -12,13 +12,10 @@
from pandas.core.dtypes.dtypes import ExtensionDtype # noqa: F401
from pandas.core.indexes.base import Index # noqa: F401
from pandas.core.series import Series # noqa: F401
from pandas.core.sparse.series import SparseSeries # noqa: F401
from pandas.core.generic import NDFrame # noqa: F401


AnyArrayLike = TypeVar(
"AnyArrayLike", "ExtensionArray", "Index", "Series", "SparseSeries", np.ndarray
)
AnyArrayLike = TypeVar("AnyArrayLike", "ExtensionArray", "Index", "Series", np.ndarray)
ArrayLike = TypeVar("ArrayLike", "ExtensionArray", np.ndarray)
DatetimeLikeScalar = TypeVar("DatetimeLikeScalar", "Period", "Timestamp", "Timedelta")
Dtype = Union[str, np.dtype, "ExtensionDtype"]
31 changes: 28 additions & 3 deletions pandas/compat/pickle_compat.py
@@ -5,6 +5,7 @@
import copy
import pickle as pkl
import sys
from typing import Any

from pandas import Index

@@ -54,6 +55,22 @@ def load_reduce(self):
raise


class _LoadSparseSeries:
    # To load a SparseSeries as a Series[Sparse]
    def __new__(cls) -> Any:
        from pandas import Series

        return Series()


class _LoadSparseFrame:
    # To load a SparseDataFrame as a DataFrame[Sparse]
    def __new__(cls) -> Any:
        from pandas import DataFrame

        return DataFrame()


# If classes are moved, provide compat here.
_class_locations_map = {
("pandas.core.sparse.array", "SparseArray"): ("pandas.core.arrays", "SparseArray"),
@@ -101,12 +118,12 @@ def load_reduce(self):
"SparseArray",
),
("pandas.sparse.series", "SparseSeries"): (
"pandas.core.sparse.series",
"SparseSeries",
"pandas.compat.pickle_compat",
"_LoadSparseSeries",
),
("pandas.sparse.frame", "SparseDataFrame"): (
"pandas.core.sparse.frame",
"SparseDataFrame",
"_LoadSparseFrame",
),
("pandas.indexes.base", "_new_Index"): ("pandas.core.indexes.base", "_new_Index"),
("pandas.indexes.base", "Index"): ("pandas.core.indexes.base", "Index"),
@@ -139,6 +156,14 @@ def load_reduce(self):
"pandas.core.indexes.numeric",
"Float64Index",
),
("pandas.core.sparse.series", "SparseSeries"): (
"pandas.compat.pickle_compat",
"_LoadSparseSeries",
),
("pandas.core.sparse.frame", "SparseDataFrame"): (
"pandas.compat.pickle_compat",
"_LoadSparseFrame",
),
}
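
For readers unfamiliar with how a relocation table like `_class_locations_map` is consumed, here is a minimal standard-library sketch. It assumes the table is consulted inside an `Unpickler.find_class` override; the names `CLASS_LOCATIONS`, `CompatUnpickler`, and `compat_loads` are hypothetical, and `OrderedDict` merely plays the role of a class that has been "removed" and should load as a plain `dict`.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical relocation table in the spirit of _class_locations_map:
# (old module, old name) -> (new module, new name).
CLASS_LOCATIONS = {("collections", "OrderedDict"): ("builtins", "dict")}


class CompatUnpickler(pickle.Unpickler):
    """Unpickler that redirects class lookups through CLASS_LOCATIONS."""

    def find_class(self, module, name):
        # Fall back to the original location when no redirect is registered.
        module, name = CLASS_LOCATIONS.get((module, name), (module, name))
        return super().find_class(module, name)


def compat_loads(data):
    """Load pickled bytes, applying the class redirects above."""
    return CompatUnpickler(io.BytesIO(data)).load()


# The payload references collections.OrderedDict by name.
payload = pickle.dumps(OrderedDict([("a", 1), ("b", 2)]))
restored = compat_loads(payload)
print(type(restored).__name__, restored)
```

The pickle stream stores only the module and class name, so redirecting the lookup is enough to make old payloads construct the replacement type.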

