Subclassed reshape clean #15655
yeah don't add anything about panel
I left the
Codecov Report
@@ Coverage Diff @@
## master #15655 +/- ##
==========================================
+ Coverage 90.38% 91.01% +0.63%
==========================================
Files 161 143 -18
Lines 50916 49383 -1533
==========================================
- Hits 46019 44947 -1072
+ Misses 4897 4436 -461
Any thoughts as to why coverage would have decreased when the diff was 100% covered and I only added (didn't modify) tests? I don't like all that red.
doc/source/internals.rst
Outdated
@@ -152,44 +152,110 @@ Below example shows how to define ``SubclassedSeries`` and ``SubclassedDataFrame
    def _constructor_sliced(self):
        return SubclassedSeries

Overriding constructor properties allows subclass families to be preserved across slice and reshape operations:

.. code-block:: python
actually I am puzzled why these are code-blocks at all, they should all simply be ipython-blocks. much more simple and will show an error if something is wrong. can you change.
doc/source/whatsnew/v0.20.0.txt
Outdated
@@ -308,6 +308,9 @@ Other enhancements
- ``pd.types.concat.union_categoricals`` gained the ``ignore_ordered`` argument to allow ignoring the ordered attribute of unioned categoricals (:issue:`13410`). See the :ref:`categorical union docs <categorical.union>` for more information.
- ``pandas.io.json.json_normalize()`` with an empty ``list`` will return an empty ``DataFrame`` (:issue:`15534`)
- ``pd.DataFrame.to_latex`` and ``pd.DataFrame.to_string`` now allow optional header aliases. (:issue:`15536`)
- Reshape operations now preserve subclass family. This includes ``DataFrame``
reshape operations on Series/DataFrame will now preserve subclasses. The following operations ......
pandas/core/reshape.py
Outdated
@@ -37,9 +37,29 @@ class _Unstacker(object):

    Parameters
    ----------
    values : ndarray
        Values of DataFrame to "Unstack"
don't put blank lines in-between parameters
pandas/core/reshape.py
Outdated
        Values of DataFrame to "Unstack"

    index : object
        Pandas ``Index`` or ``MultiIndex``
index : Index
is all that is needed here. MultiIndex is a valid Index.
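The point about MultiIndex can be checked directly. A minimal sketch, assuming only the public pandas API; the variable names and data are illustrative and not taken from the PR:

```python
import pandas as pd

# A flat index and a two-level index built from tuples (illustrative data).
flat = pd.Index(["a", "b", "c"])
multi = pd.MultiIndex.from_tuples([("x", 1), ("x", 2), ("y", 1)])

# MultiIndex is a subclass of Index, so both objects pass the same check.
is_flat_index = isinstance(flat, pd.Index)
is_multi_index = isinstance(multi, pd.Index)
is_subclass = issubclass(pd.MultiIndex, pd.Index)
```

Since every MultiIndex is an Index, documenting the parameter type as Index covers both cases.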
pandas/core/reshape.py
Outdated
    level : int or str, default last level
        Level to "unstack". Accepts a name for the level.

    value_columns : object, optional
value_columns: Index, optional
A MultiIndex must be specified to unstack a DataFrame
pandas/core/reshape.py
Outdated
@@ -69,7 +89,8 @@ class _Unstacker(object):
    """

    def __init__(self, values, index, level=-1, value_columns=None,
                 fill_value=None):
so what I would do instead of this, is simply change values -> obj, where obj is a Series/DataFrame. Then you don't need a constructor argument at all, and if needed you can always use .values. Though to be honest this should be completely changed: .values coerces all dtypes to a common one.
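The dtype coercion mentioned here is easy to reproduce. A small sketch, assuming a frame with mixed column types; the column names and data are made up for illustration:

```python
import pandas as pd

# Illustrative frame with integer, float and string columns.
mixed = pd.DataFrame({"ints": [1, 2], "floats": [0.5, 1.5], "text": ["a", "b"]})

# Each column keeps its own dtype inside the DataFrame.
per_column = mixed.dtypes

# .values returns a single ndarray, so pandas must pick one common dtype.
numeric_common = mixed[["ints", "floats"]].values.dtype  # ints upcast with floats
all_common = mixed.values.dtype  # mixing in strings falls back to object
```

This is why passing obj rather than obj.values keeps more information available to the reshaping code.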
pandas/core/reshape.py
Outdated
@@ -452,7 +475,8 @@ def unstack(obj, level, fill_value=None):
            return obj.T.stack(dropna=False)
    else:
        unstacker = _Unstacker(obj.values, obj.index, level=level,
e.g. it makes much more sense to simply pass in obj
pandas/core/reshape.py
Outdated
@@ -920,7 +945,7 @@ def lreshape(data, groups, dropna=True, label=None):
        if not mask.all():
            mdata = dict((k, v[mask]) for k, v in compat.iteritems(mdata))

    return DataFrame(mdata, columns=id_cols + pivot_cols)
the way to really do this nicely is to add two properties to _Unstacker

    @property
    def _constructor(self):
        return self.obj._constructor

    @property
    def _constructor_sliced(self):
        return self.obj._constructor_sliced
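The suggestion above can be sketched outside of pandas internals. In the following, `ReshapeHelper` and its two methods are hypothetical stand-ins written for illustration; they are not the real `_Unstacker` and do not reproduce its logic. The sketch only shows how forwarding the constructor properties from the wrapped object lets a helper build results of the caller's subclass:

```python
import pandas as pd


class SubclassedSeries(pd.Series):
    @property
    def _constructor(self):
        return SubclassedSeries

    @property
    def _constructor_expanddim(self):
        return SubclassedDataFrame


class SubclassedDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return SubclassedDataFrame

    @property
    def _constructor_sliced(self):
        return SubclassedSeries


class ReshapeHelper:
    """Hypothetical helper that wraps a DataFrame (not pandas' _Unstacker)."""

    def __init__(self, obj):
        self.obj = obj

    @property
    def _constructor(self):
        # Forward to the wrapped object so subclasses are respected.
        return self.obj._constructor

    @property
    def _constructor_sliced(self):
        return self.obj._constructor_sliced

    def transposed(self):
        # Build a 2-D result through the forwarded constructor.
        return self._constructor(
            self.obj.values.T, index=self.obj.columns, columns=self.obj.index
        )

    def first_row(self):
        # Build a 1-D result through the forwarded sliced constructor.
        return self._constructor_sliced(self.obj.values[0], index=self.obj.columns)


helper = ReshapeHelper(SubclassedDataFrame({"A": [1, 2], "B": [3, 4]}))
frame_result = helper.transposed()
series_result = helper.first_row()
```

With this shape the helper needs no separate constructor argument, because the wrapped object already knows which classes to use.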
can you rebase / update
Will do. I was traveling for the last few weeks but I'll update this when I find a bit of time. Thanks!
…o preserve subclasses
…source/internals.rst
needs a rebase / update. pls comment if you would like to continue.
git diff upstream/master | flake8 --diff
This PR enables reshape operations on subclassed DataFrame and Series objects to preserve subclass families through the use of _constructor* properties.

See discussion on PR #15564. Thanks for the help @jreback.

This PR has a cleaner implementation of subclassed unstack operations by modifying the _Unstacker initializer to allow an optional constructor argument.

Additionally, this PR implements tests for a wider set of reshape operations. It now covers:

- DataFrame.stack(), DataFrame.unstack(), DataFrame.pivot(), and Series.unstack() for containers with Index and MultiIndex indices and/or columns
- pd.melt()
- pd.wide_to_long()

Finally, the pandas internals docs have been edited for clarity and additional examples have been added to showcase subclassed reshape and math operations.
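For readers unfamiliar with the behavior described here, the following is a minimal sketch of what preserving the subclass family means in practice. It assumes a pandas version in which stack and unstack honor the _constructor* properties; the class names follow the internals docs, and the data is illustrative:

```python
import pandas as pd


class SubclassedSeries(pd.Series):
    @property
    def _constructor(self):
        return SubclassedSeries

    @property
    def _constructor_expanddim(self):
        # Used when a Series operation produces a DataFrame (e.g. unstack).
        return SubclassedDataFrame


class SubclassedDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return SubclassedDataFrame

    @property
    def _constructor_sliced(self):
        # Used when a DataFrame operation produces a Series (e.g. stack).
        return SubclassedSeries


df = SubclassedDataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

column = df["A"]                # slicing yields the Series subclass
stacked = df.stack()            # DataFrame -> Series reshape
unstacked = stacked.unstack()   # Series -> DataFrame reshape
```

The round trip stays inside the SubclassedSeries / SubclassedDataFrame family instead of falling back to the base pandas classes.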