-
-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecating Series.argmin and Series.argmax (#16830) #16955
Conversation
Do note that that PR will not be merged for some time because we need at least one major release of time before we change behavior like that. |
Also, you should add tests to make sure that these warnings get issued (we should have tests already for |
2784c3f
to
dfd9d06
Compare
Done
Yea I figured that would be the case. If it's okay with you I'll still put up the PR so that the code is there and ready to be rebased when the time comes. |
That works, though it will be up to you primarily to keep it rebased and merge-able in the interim. We can perhaps make reference to the PR in our deprecations tracker. |
result = np.argmin(Series(data)) | ||
assert result == np.argmin(data) | ||
|
||
with pytest.warns(FutureWarning): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use tm.assert_produces_warning(FutureWarning)
. We don't use pytest
warning context manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
dfd9d06
to
2b7ebbc
Compare
Codecov Report
@@ Coverage Diff @@
## master #16955 +/- ##
=========================================
Coverage ? 90.99%
=========================================
Files ? 161
Lines ? 49288
Branches ? 0
=========================================
Hits ? 44849
Misses ? 4439
Partials ? 0
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #16955 +/- ##
==========================================
- Coverage 91.25% 91.24% -0.02%
==========================================
Files 163 163
Lines 49808 49810 +2
==========================================
- Hits 45454 45447 -7
- Misses 4354 4363 +9
Continue to review full report at Codecov.
|
looks like #16964 covers this. |
@lphk92 hmm. So the question is to we just to a public change that may break code, or deprecate it for a cycle, and then the change which may break code. In the past what we have done is introduced a new method, e.g. was the case with But here we want to use Any thoughts @TomAugspurger @jorisvandenbossche @gfyoung |
I think merge this for 0.21 and #16964 for 0.22 |
@jreback : The PR description above outlines @lphk92 's plan of action. Deprecate for this version and modify in the subsequent one. I agree with this and therefore am onboard with @TomAugspurger here. |
Is there going to be a 0.22? I thought we were going straight to 1.0. |
Not sure yet why Appveyor isn't catching the warning, especially the first time around. Also, this error seems persistent because it occurred twice now. My first suspicion is that there are other places where |
One thing I didn't really appreciate yesterday, is this will show a warning even for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. ping on green.
result = np.argmin(Series(data)) | ||
assert result == np.argmin(data) | ||
|
||
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the issue deprecation issue as a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this as a separate function (these deprecation tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since all functions in this test will cause the deprecation warning, if I break them into a separate test, what should be left in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing much, maybe even rename the test a bit to reflect what you are testing
test_numpy_argmin_deprecated or somesuch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gfyoung I thought that was what I was doing by adding the comments stating that the deprecation warning was also occurring in np.argmax. Could you clarify where you would like a comment added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What he means is just reference the issue number beneath the function definition e.g. "see gh-16830"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh gotcha, thanks for the clarification. I will add that in the next commit.
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -116,6 +116,8 @@ Other API Changes | |||
Deprecations | |||
~~~~~~~~~~~~ | |||
- :func:`read_excel()` has deprecated ``sheetname`` in favor of ``sheet_name`` for consistency with ``.to_excel()`` (:issue:`10559`). | |||
- :method:`Series.argmax` has been deprecated in favor of :method:`Series.idxmax` (:issue:`16830`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use :func:
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback Note that in 'theory', :method:
is more correct here (as idxmax is a method, not a function), but in practice both work, so it is not that important
data = np.random.randint(0, 11, size=10) | ||
result = np.argmax(Series(data)) | ||
assert result == np.argmax(data) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
|
||
with tm.assert_produces_warning(FutureWarning): | ||
# argmin is aliased to idxmin | ||
Series(data).argmin() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a test on the result as well (and same for argmin)
Just to make sure this doesn't get lost, there's an issue with this current approach, as In [1]: import pandas as pd; import numpy as np
s = ^[[A
In [2]: s = pd.Series([1, 2])
In [3]: s
Out[3]:
0 1
1 2
dtype: int64
In [4]: np.argmin(s)
/Users/taugspurger/Envs/pandas-dev/lib/python3.6/site-packages/numpy/core/fromnumeric.py:57: FutureWarning: argmin is deprecated. Use idxmin instead
return getattr(obj, method)(*args, **kwds)
Out[4]: 0 These functions do take diff --git a/pandas/core/series.py b/pandas/core/series.py
index 5294031be..cfda05486 100644
--- a/pandas/core/series.py
+++ b/pandas/core/series.py
@@ -1287,6 +1287,8 @@ class Series(base.IndexOpsMixin, strings.StringAccessorMixin,
DataFrame.idxmax
numpy.ndarray.argmax
"""
+ if '__pandas_deprecated__' in kwargs:
+ warnings.warn(FutureWarning, 'message')
skipna = nv.validate_argmax_with_skipna(skipna, args, kwargs)
i = nanops.nanargmax(_values_from_object(self), skipna=skipna)
if i == -1:
@@ -1294,7 +1296,10 @@ class Series(base.IndexOpsMixin, strings.StringAccessorMixin,
return self.index[i]
# ndarray compat
- argmin = deprecate('argmin', idxmin)
+ def argmin(self, axis=None, skipna=True, *args, **kwargs):
+ return self.idxmin(axis=None, skipna=skipna, __pandas__deprecated=True,
+ *args, **kwargs)
+
argmax = deprecate('argmax', idxmax)
def round(self, decimals=0, *args, **kwargs): @lphk92 currently this fails in |
And then your test should assert that |
I don't see a problem with this. I would merge this PR as is. |
We don't want to warn when the user is correctly using argmin. If you prefer we could only warn once, with an option to always disable the warning.
… On Jul 16, 2017, at 11:24, Jeff Reback ***@***.***> wrote:
@TomAugspurger
Just to make sure this doesn't get lost, there's an issue with this current approach, as np.argmin(arr), etc. will dispatch to arr.argmin, which warns immediately.
I don't see a problem with this. I would merge this PR as is.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
I agree with @TomAugspurger that we should avoid this. However, the warning occurs because Perhaps we should update the message to say something about |
My last diff showed how we could cause np.argmin to behave differently than series.argmin. Does it feel too hacky? I don't like it in general, but am ok with it for a single major release.
… On Jul 16, 2017, at 12:27, gfyoung ***@***.***> wrote:
We don't want to warn when the user is correctly using argmin. If you prefer we could only warn once, with an option to always disable the warning.
I agree with @TomAugspurger that we should avoid this. However, the warning occurs because np.argmin calls the argmin attribute of the object if that exists. Thus, the only way to avoid it is to remove the argmin method completely...
Perhaps we should update the message to say something about argmin being called from numpy is okay. That's the only thing that comes to mind ATM.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. not sure why this is failing. pls rebase.
doc/source/whatsnew/v0.21.0.txt
Outdated
- ``pd.options.html.border`` has been deprecated in favor of ``pd.options.display.html.border`` (:issue:`15793`). | ||
|
||
- :func:`SeriesGroupBy.nth` has deprecated ``True`` in favor of ``'all'`` for its kwarg ``dropna`` (:issue:`11038`). | ||
|
||
Series.argmax and Series.argmin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a ref to this subsection. make the title something like
Series.argmin/max
are deprecated.
doc/source/whatsnew/v0.21.0.txt
Outdated
we've deprecated the current behavior of :func:`Series.argmax` and | ||
:func:`Series.argmin`. Using either of these will emit a ``FutureWarning``. | ||
|
||
If you were using ``Series.argmin`` and ``Series.argmax``, please switch to using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove these last 2 paragraphs. keep it short and sweet.
OK, something very strange was going on with the tests. I could only reproduce with python 2.7 when running the sparse tests first:
Running the There may be some warnings to cleanup to. Looking into those now. |
Still failed, not sure why. I'm ok with skipping those tests python 2 since
|
data = np.random.randint(0, 11, size=10) | ||
result = np.argmin(Series(data)) | ||
assert result == np.argmin(data) | ||
@pytest.mark.skipif(PY2, reason="Buggy assertion on warning") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usually having to do this means that you are NOT catching the actual assertion somewhere else. IOW there is another place where it IS showing the warning, but its not being caught.
If you look at the travis log, there are quite some places in other tests that raise this warning, and which probably should be fixed |
In series/test_operators.py, there are |
Ok, the tets are (finally) all green. We used argmax internally in https://github.com/pandas-dev/pandas/pull/16955/files#diff-db91df05b120c5eb39d90899b54b5321R348 when we wanted idxmax. I found one more warning from a test where I failed to switch to idxmax. Merging later today if it's green. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Just two minor comments
# until the implemention of Series.argmin is corrected. | ||
result = np.argmin(s) | ||
|
||
assert result == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you asserting here directly to 0, and below to the result of expected = s.idxmin()
. Both can use the same expected value? (I see in the recent commits you changed this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just changed them all to compare directly to scalars. I think that's preferable.
# See gh-16830 | ||
data = np.arange(10) | ||
|
||
s = Series(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this was like this before, but it would actually be good to use eg index=np.arange(1, 11)
, just to have a distinction between positional and label based argmin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you changed it correctly. It's the index of the Series that should be changed, not the actual data
. Because now the tests should fail (the argmin should still be 0 (first location or first index label), while you changed it to assert to 1)
pandas/io/formats/format.py
Outdated
@@ -599,7 +599,7 @@ def to_string(self): | |||
else: # max_cols == 0. Try to fit frame to terminal | |||
text = self.adj.adjoin(1, *strcols).split('\n') | |||
row_lens = Series(text).apply(len) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this just be
max_len = Series(text).str.len().max()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests seem to pass locally with this.
minor comment, otherwise lgtm. |
Whoops, fixed.
…On Tue, Sep 26, 2017 at 9:19 AM, Joris Van den Bossche < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pandas/tests/series/test_analytics.py
<#16955 (comment)>:
> @@ -1242,16 +1242,32 @@ def test_idxmin(self):
result = s.idxmin()
assert result == 1
- def test_numpy_argmin(self):
- # argmin is aliased to idxmin
- data = np.random.randint(0, 11, size=10)
- result = np.argmin(Series(data))
- assert result == np.argmin(data)
+ def test_numpy_argmin_deprecated(self):
+ # See gh-16830
+ data = np.arange(10)
+
+ s = Series(data)
I don't think you changed it correctly. It's the index of the Series that
should be changed, not the actual data. Because now the tests should fail
(the argmin should still be 0 (first location or first index label), while
you changed it to assert to 1)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#16955 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHItpk13wWbx2s6kXtQ27_UBCozv5Mks5smQgLgaJpZM4OZG7M>
.
|
All green, and no new warnings in the summary. Merging. Thanks @lphk92! |
…s-dev#16955) * Deprecating Series.argmin and Series.argmax (pandas-dev#16830) Added statements about correcting behavior in future commit Add reference to github ticket Fixing placement of github comment Made test code more explicit Fixing unrelated tests that are also throwing warnings Updating whatsnew to give more detail about deprecation Fixing whatsnew and breaking out tests to catch warnings Additional comments and more concise whatsnew Updating deprecate decorator to support custom message DOC: Update docstrings, depr message, and whatsnew * Added debug prints * Try splitting the filters * Reword whatsnew * Change sparse series test * Skip on py2 * Change to idxmin * Remove py2 skips * Catch more warnings * Final switch to idxmax * Consistent tests, refactor to_string * Fixed tests
…s-dev#16955) * Deprecating Series.argmin and Series.argmax (pandas-dev#16830) Added statements about correcting behavior in future commit Add reference to github ticket Fixing placement of github comment Made test code more explicit Fixing unrelated tests that are also throwing warnings Updating whatsnew to give more detail about deprecation Fixing whatsnew and breaking out tests to catch warnings Additional comments and more concise whatsnew Updating deprecate decorator to support custom message DOC: Update docstrings, depr message, and whatsnew * Added debug prints * Try splitting the filters * Reword whatsnew * Change sparse series test * Skip on py2 * Change to idxmin * Remove py2 skips * Catch more warnings * Final switch to idxmax * Consistent tests, refactor to_string * Fixed tests
git diff upstream/master -u -- "*.py" | flake8 --diff
This is one step toward closing issue #16830. After this, I will post another pull request containing the code to implement the expected behavior of argmax and argmin for both Series and DataFrame.