Subclassed reshape clean #15655

delgadom · 2017-03-11T20:21:12Z

closes Preserve subclass family on reshape operations #15563
tests added / passed
passes git diff upstream/master | flake8 --diff
whatsnew entry

This PR enables reshape operations on subclassed DataFrame and Series objects to preserve subclass families through the use of _constructor* properties.

See discussion on PR #15564. Thanks for the help @jreback.

This PR has a cleaner implementation of subclassed unstack operations by modifying the _Unstacker initializer to allow an optional constructor argument.

Additionally, this PR implements tests for a wider set of reshape operations. It now covers:

DataFrame.stack(), DataFrame.unstack(), DataFrame.pivot(), and Series.unstack() for containers with Index and MultiIndex indices and/or columns
pd.melt()
pd.wide_to_long()

Finally, the pandas internals docs have been edited for clarity and additional examples have been added to showcase subclassed reshape and math operations.
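
For readers unfamiliar with the `_constructor*` mechanism mentioned above, here is a minimal sketch of a subclass family. The class names follow the `SubclassedSeries`/`SubclassedDataFrame` naming used in the internals docs; the sample data is made up for illustration. Slicing is expected to return the subclass; whether a given reshape call such as `melt` does so depends on the pandas version, so the last line prints the type rather than assuming it.

```python
import pandas as pd


class SubclassedSeries(pd.Series):
    @property
    def _constructor(self):
        # Used when an operation returns an object of the same dimension.
        return SubclassedSeries

    @property
    def _constructor_expanddim(self):
        # Used when an operation returns a 2-D object from a Series.
        return SubclassedDataFrame


class SubclassedDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return SubclassedDataFrame

    @property
    def _constructor_sliced(self):
        # Used when an operation returns a 1-D object from a DataFrame.
        return SubclassedSeries


# Illustrative data only.
df = SubclassedDataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

column = df["A"]    # 1-D slice, built via _constructor_sliced
subset = df[["A"]]  # 2-D slice, built via _constructor
print(type(column).__name__, type(subset).__name__)

# A reshape operation; the resulting type depends on the pandas version.
print(type(df.melt()).__name__)
```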

delgadom · 2017-03-11T20:30:46Z

Oops just saw #13563/#15601. Want me to remove SubclassedPanel from the subclassing docs or is that something for another PR once panels are officially on their way out?

jreback · 2017-03-11T20:44:33Z

yeah don't add anything about panel

delgadom · 2017-03-11T20:55:11Z

I left the Panel column in the default _constructor* property table but removed references to panels from the examples

codecov-io · 2017-03-11T22:29:08Z

Codecov Report

Merging #15655 into master will increase coverage by 0.63%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #15655      +/-   ##
==========================================
+ Coverage   90.38%   91.01%   +0.63%     
==========================================
  Files         161      143      -18     
  Lines       50916    49383    -1533     
==========================================
- Hits        46019    44947    -1072     
+ Misses       4897     4436     -461

Flag	Coverage Δ
#multiple	`?`
#single	`?`

Impacted Files	Coverage Δ
pandas/core/reshape.py	`99.27% <100%> (ø)`
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/tools/plotting.py	`71.79% <0%> (-10.03%)`	⬇️
pandas/util/_tester.py	`35.29% <0%> (-3.6%)`	⬇️
pandas/types/concat.py	`98.06% <0%> (-1.94%)`	⬇️
pandas/conftest.py	`94.11% <0%> (-1.72%)`	⬇️
pandas/compat/pickle_compat.py	`68.29% <0%> (-1.22%)`	⬇️
pandas/tools/hashing.py	`99.02% <0%> (-0.98%)`	⬇️
pandas/io/packers.py	`87.57% <0%> (-0.94%)`	⬇️
pandas/core/base.py	`95.51% <0%> (-0.67%)`	⬇️
... and 189 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ea0f25...dc3b07e. Read the comment docs.
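
The totals in the report above can be checked by hand. The sketch below recomputes the two overall percentages from the hit and line counts, assuming Codecov truncates (rather than rounds) to two decimals, which is what the printed figures are consistent with. The helper name `coverage_pct` is made up for this example.

```python
import math


def coverage_pct(hits, lines):
    """Covered-line percentage, truncated (not rounded) to two decimals."""
    return math.floor(hits / lines * 10000) / 100


# Figures copied from the coverage diff in the report.
master = {"hits": 46019, "misses": 4897, "lines": 50916}
branch = {"hits": 44947, "misses": 4436, "lines": 49383}

# Hits plus misses should account for every line in each report.
for report in (master, branch):
    assert report["hits"] + report["misses"] == report["lines"]

master_pct = coverage_pct(master["hits"], master["lines"])
branch_pct = coverage_pct(branch["hits"], branch["lines"])
print(master_pct, branch_pct)
```

Overall coverage goes up even though many individual files show a drop. The two sides also differ in file count (161 vs 143) and line count, which is consistent with the reports having been computed over different sets of files; that reading is an inference, not something the report states.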

delgadom · 2017-03-11T23:15:46Z

Any thoughts as to why coverage would have decreased when the diff was 100% covered and I only added (didn't modify) tests? I don't like all that red.

jreback · 2017-03-12T15:22:43Z

doc/source/internals.rst

@@ -152,44 +152,110 @@ Below example shows how to define ``SubclassedSeries`` and ``SubclassedDataFrame
       def _constructor_sliced(self):
           return SubclassedSeries

+
+Overriding constructor properties allows subclass families to be preserved across slice and reshape operations:
+
 .. code-block:: python



Actually, I am puzzled why these are code-blocks at all; they should all simply be ipython blocks. That is much simpler and will show an error if something is wrong. Can you change them?

jreback · 2017-03-12T15:23:38Z

doc/source/whatsnew/v0.20.0.txt

@@ -308,6 +308,9 @@ Other enhancements
 - ``pd.types.concat.union_categoricals`` gained the ``ignore_ordered`` argument to allow ignoring the ordered attribute of unioned categoricals (:issue:`13410`). See the :ref:`categorical union docs <categorical.union>` for more information.
 - ``pandas.io.json.json_normalize()`` with an empty ``list`` will return an empty ``DataFrame`` (:issue:`15534`)
 - ``pd.DataFrame.to_latex`` and ``pd.DataFrame.to_string`` now allow optional header aliases. (:issue:`15536`)
+- Reshape operations now preserve subclass family. This includes ``DataFrame``


reshape operations on Series/DataFrame will now preserve subclasses. The following operations ......

jreback · 2017-03-12T15:23:54Z

pandas/core/reshape.py

@@ -37,9 +37,29 @@ class _Unstacker(object):

    Parameters
    ----------
+    values : ndarray
+        Values of DataFrame to "Unstack"
+


don't put blank lines in-between parameters

jreback · 2017-03-12T15:24:22Z

pandas/core/reshape.py

+        Values of DataFrame to "Unstack"
+
+    index : object
+        Pandas ``Index`` or ``MultiIndex``


index : Index is all that is needed here. MultiIndex is a valid Index.

jreback · 2017-03-12T15:25:20Z

pandas/core/reshape.py

    level : int or str, default last level
        Level to "unstack". Accepts a name for the level.

+    value_columns : object, optional


value_columns: Index, optional
A MultiIndex must be specified to unstack a DataFrame

jreback · 2017-03-12T15:26:22Z

pandas/core/reshape.py

@@ -69,7 +89,8 @@ class _Unstacker(object):
    """

    def __init__(self, values, index, level=-1, value_columns=None,
-                 fill_value=None):


So what I would do instead is simply change values -> obj, where obj is a Series/DataFrame. Then you don't need a constructor argument at all, and if needed you can always use .values. Though to be honest, the use of .values should be changed completely, since it coerces all dtypes to a common one.
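
The dtype coercion mentioned in this comment can be seen in a small sketch. The column names and data are made up for illustration; the point is only that `.values` returns a single NumPy array, so columns of different dtypes are converted to one common dtype.

```python
import pandas as pd

# Illustrative frame with an integer, a float and a string column.
mixed = pd.DataFrame({"n": [1, 2], "x": [0.5, 1.5], "s": ["a", "b"]})

# Each column keeps its own dtype inside the DataFrame.
print(mixed.dtypes)

# Integer and float columns are upcast to a common float dtype.
numeric = mixed[["n", "x"]]
print(numeric.values.dtype)  # float64

# Adding the string column forces everything to the generic object dtype.
print(mixed.values.dtype)  # object
```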

jreback · 2017-03-12T15:26:49Z

pandas/core/reshape.py

@@ -452,7 +475,8 @@ def unstack(obj, level, fill_value=None):
            return obj.T.stack(dropna=False)
    else:
        unstacker = _Unstacker(obj.values, obj.index, level=level,


e.g. it makes much more sense to simply pass in obj

jreback · 2017-03-12T15:28:09Z

pandas/core/reshape.py

@@ -920,7 +945,7 @@ def lreshape(data, groups, dropna=True, label=None):
        if not mask.all():
            mdata = dict((k, v[mask]) for k, v in compat.iteritems(mdata))

-    return DataFrame(mdata, columns=id_cols + pivot_cols)


the way to really do this nicely is to add two properties to _Unstacker

    def _constructor(self):
        return self.obj._constructor

    def _constructor_sliced(self):
        return self.obj._constructor_sliced
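
A pandas-free sketch of the delegation pattern suggested here, under the assumption that the helper receives the whole object rather than its values. All names (`ToySeries`, `ToyFrame`, `MySeries`, `MyFrame`, `ToyUnstacker`, `flatten`) are hypothetical stand-ins, not the real `_Unstacker` implementation.

```python
class ToySeries:
    """Toy 1-D container; stands in for pandas.Series."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def _constructor(self):
        return ToySeries


class ToyFrame:
    """Toy 2-D container; stands in for pandas.DataFrame."""

    def __init__(self, columns):
        self.columns = {name: list(vals) for name, vals in columns.items()}

    @property
    def _constructor(self):
        return ToyFrame

    @property
    def _constructor_sliced(self):
        return ToySeries


class MySeries(ToySeries):
    @property
    def _constructor(self):
        return MySeries


class MyFrame(ToyFrame):
    @property
    def _constructor(self):
        return MyFrame

    @property
    def _constructor_sliced(self):
        return MySeries


class ToyUnstacker:
    """Hypothetical helper that keeps a reference to the input object."""

    def __init__(self, obj):
        self.obj = obj

    @property
    def _constructor(self):
        # Delegate to the input so subclasses are preserved.
        return self.obj._constructor

    @property
    def _constructor_sliced(self):
        return self.obj._constructor_sliced

    def flatten(self):
        # Concatenate all columns into one 1-D result, built with the
        # constructor borrowed from the input object.
        flat = [v for vals in self.obj.columns.values() for v in vals]
        return self._constructor_sliced(flat)


plain = ToyUnstacker(ToyFrame({"a": [1, 2], "b": [3, 4]})).flatten()
custom = ToyUnstacker(MyFrame({"a": [1, 2], "b": [3, 4]})).flatten()
print(type(plain).__name__, type(custom).__name__)
```

Because the helper never names a concrete class, the same code path serves both the base containers and any subclass family, with no extra constructor argument.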

jreback · 2017-04-03T15:14:14Z

can you rebase / update

delgadom · 2017-04-18T23:01:16Z

Will do. I was traveling for the last few weeks but I'll update this when I find a bit of time. Thanks!

jreback · 2017-07-26T23:57:28Z

needs a rebase / update. pls comment if you would like to continue.

jreback requested changes Mar 12, 2017

jreback added the API Design and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Mar 12, 2017

delgadom added 20 commits May 13, 2017 14:14

add subclassed stack/unstack/pivot tests

642bfce

add melt test

0295604

use _constructor* properties to create Series and DataFrame objects to preserve subclasses

60a2cfd

document _Unstacker

d65cff5

fix bug in wide_to_long_test, add GH issue numbers

9d1cf63

add whatsnew entry

3efb82f

flake8 cleanup

200c752

fix bug in existing docs internals.rst:220 {A, [ --> {A: [

5e480c6

clarify language and add subclassed reshape and math examples to doc/source/internals.rst

4f3319c

additional clarification in doc/source/internals.rst

8af21c1

remove references to Panel from doc/source/internals.rst subclassing examples

027f36a

change code-block to ipython directives in doc/source/internals.rst

8a61374

change from python to ipython code blocks in docs

6715a25

reformat docstrings

f751a85

add subclassed stack/unstack/pivot tests

ca85796

add melt test

b0bc8f4

use _constructor* properties to create Series and DataFrame objects to preserve subclasses

246a464

document _Unstacker

1c672a9

fix bug in wide_to_long_test, add GH issue numbers

eff151e

flake8 cleanup

16dae8e

delgadom added 5 commits May 13, 2017 14:25

fix bug in existing docs internals.rst:220 {A, [ --> {A: [

66b2e42

clarify language and add subclassed reshape and math examples to doc/source/internals.rst

7641812

additional clarification in doc/source/internals.rst

be66ce0

remove references to Panel from doc/source/internals.rst subclassing examples

d27034d

merge conflicts

dc3b07e

jreback closed this Jul 26, 2017

This was referenced Dec 20, 2017

stack() does not work with subclassed DataFrame #18859

Closed

BUG: Stack/unstack do not return subclassed objects (GH15563) #18929

Merged
