Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Error with ambiguous groupby strings #22415

Merged
merged 1 commit into from
Aug 22, 2018

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented Aug 19, 2018

Title is self-explanatory.

xref #14432.

@gfyoung gfyoung added Groupby Deprecate Functionality to remove in pandas labels Aug 19, 2018
@gfyoung gfyoung added this to the 0.24.0 milestone Aug 19, 2018
@codecov
Copy link

codecov bot commented Aug 19, 2018

Codecov Report

Merging #22415 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22415      +/-   ##
==========================================
- Coverage   92.05%   92.04%   -0.01%     
==========================================
  Files         169      169              
  Lines       50713    50708       -5     
==========================================
- Hits        46683    46675       -8     
- Misses       4030     4033       +3
Flag Coverage Δ
#multiple 90.45% <100%> (-0.01%) ⬇️
#single 42.25% <20%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/reshape/merge.py 94.15% <100%> (-0.01%) ⬇️
pandas/core/generic.py 96.44% <100%> (-0.01%) ⬇️
pandas/core/frame.py 97.24% <100%> (-0.01%) ⬇️
pandas/core/groupby/grouper.py 98.16% <100%> (-0.01%) ⬇️
pandas/core/reshape/pivot.py 96.55% <0%> (-0.63%) ⬇️
pandas/util/testing.py 85.75% <0%> (-0.11%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8bb2cc1...5374ba4. Read the comment docs.

# descending order
expected = df.sort_index(level='idx', ascending=False)
assert_frame_equal(result, expected)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these sufficiently covered as testing that they raise?

Copy link
Member Author

@gfyoung gfyoung Aug 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raise, but we already have coverage.

[pd.Grouper(key='inner')],
[pd.Grouper(level='inner')]
),
(['B', 'inner'], # Column and index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe just change to have these check for raising? or too duplicative?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too duplicative --> it raises.

columns=['YEAR', 'MONTH'],
values=['DAYS', 'SALARY'],
aggfunc={'DAYS': 'mean', 'SALARY': 'sum'},
margins=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these raise now?

Copy link
Member Author

@gfyoung gfyoung Aug 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we already have coverage for raising.

# Construct right_df with an index level named 'outer'
right_df = df2.set_index('outer')

# Construct expected result.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these raise now?

Copy link
Member Author

@gfyoung gfyoung Aug 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we already have coverage for raising.

@gfyoung
Copy link
Member Author

gfyoung commented Aug 22, 2018

@jreback : Any further thoughts on this PR? I answered all of your questions.

@jreback
Copy link
Contributor

jreback commented Aug 22, 2018

lgtm. @jorisvandenbossche @TomAugspurger

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Aug 22, 2018

Running dask's test suite on this branch quick. I suspect this will break some things (because dask has unhandled warnings).

@jreback
Copy link
Contributor

jreback commented Aug 22, 2018

@TomAugspurger if might make sense to expand the test_dask test that we have (which basically just tests imports).

@TomAugspurger
Copy link
Contributor

TomAugspurger commented Aug 22, 2018 via email

@jreback jreback merged commit 25e6a21 into pandas-dev:master Aug 22, 2018
@jreback
Copy link
Contributor

jreback commented Aug 22, 2018

thanks @gfyoung

issued and the column takes precedence. This will result in an ambiguity error
in a future version.
If a string matches both a column name and an index level name, a
``ValueError`` will be raised.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should no longer be in a versionadded directive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we drop ..versionadded directives from docstrings?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a docstring, but yes, we do drop them from time to time (but not that often).
However, in this case, what is stated is now not correct anymore. There was no error added in 0.20. So maybe we can also change the 0.20 to 0.24 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might just refer to drop it, since it really isn't an add either. Happy to debate semantics 🙂

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handled in #22503.

@gfyoung gfyoung deleted the groupby-ambiguous-error branch August 25, 2018 06:07
gfyoung added a commit to forking-repos/pandas that referenced this pull request Aug 25, 2018
jreback pushed a commit that referenced this pull request Aug 28, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Groupby
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants