Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecating Series.argmin and Series.argmax (#16830) #16955

Merged
merged 13 commits into from
Sep 27, 2017
22 changes: 22 additions & 0 deletions doc/source/whatsnew/v0.21.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -487,11 +487,33 @@ Other API Changes

Deprecations
~~~~~~~~~~~~

- :func:`read_excel()` has deprecated ``sheetname`` in favor of ``sheet_name`` for consistency with ``.to_excel()`` (:issue:`10559`).
- ``pd.options.html.border`` has been deprecated in favor of ``pd.options.display.html.border`` (:issue:`15793`).
- :func:`SeriesGroupBy.nth` has deprecated ``True`` in favor of ``'all'`` for its kwarg ``dropna`` (:issue:`11038`).
- :func:`DataFrame.as_blocks` is deprecated, as this is exposing the internal implementation (:issue:`17302`)

.. _whatsnew_0210.deprecations.argmin_min

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be in a separate sub-section

Series.argmax and Series.argmin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a ref to this subsection. make the title something like

Series.argmin/max are deprecated.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

- The behavior of :func:`Series.argmax` has been deprecated in favor of :func:`Series.idxmax` (:issue:`16830`)
- The behavior of :func:`Series.argmin` has been deprecated in favor of :func:`Series.idxmin` (:issue:`16830`)

For compatibility with NumPy arrays, ``pd.Series`` implements ``argmax`` and
``argmin``. Since pandas 0.13.0, ``argmax`` has been an alias for
:meth:`pandas.Series.idxmax`, and ``argmin`` has been an alias for
:meth:`pandas.Series.idxmin`. They return the *label* of the maximum or minimum,
rather than the *position*.

We've deprecated the current behavior of ``Series.argmax`` and
``Series.argmin``. Using either of these will emit a ``FutureWarning``. Use
:meth:`Series.idxmax` if you want the label of the maximum. Use
``Series.values.argmax()`` if you want the position of the maximum. Likewise for
the minimum. In a future release ``Series.argmax`` and ``Series.argmin`` will
return the position of the maximum or minimum.

.. _whatsnew_0210.prior_deprecations:

Removal of prior version deprecations/changes
Expand Down
29 changes: 22 additions & 7 deletions pandas/core/series.py
Original file line number Diff line number Diff line change
Expand Up @@ -69,7 +69,8 @@
import pandas.core.common as com
import pandas.core.nanops as nanops
import pandas.io.formats.format as fmt
from pandas.util._decorators import Appender, deprecate_kwarg, Substitution
from pandas.util._decorators import (
Appender, deprecate, deprecate_kwarg, Substitution)
from pandas.util._validators import validate_bool_kwarg

from pandas._libs import index as libindex, tslib as libts, lib, iNaT
Expand Down Expand Up @@ -1272,7 +1273,7 @@ def duplicated(self, keep='first'):

def idxmin(self, axis=None, skipna=True, *args, **kwargs):
"""
Index of first occurrence of minimum of values.
Index *label* of the first occurrence of minimum of values.

Parameters
----------
Expand All @@ -1285,7 +1286,9 @@ def idxmin(self, axis=None, skipna=True, *args, **kwargs):

Notes
-----
This method is the Series version of ``ndarray.argmin``.
This method is the Series version of ``ndarray.argmin``. This method
returns the label of the minimum, while ``ndarray.argmin`` returns
the position. To get the position, use ``series.values.argmin()``.

See Also
--------
Expand All @@ -1300,7 +1303,7 @@ def idxmin(self, axis=None, skipna=True, *args, **kwargs):

def idxmax(self, axis=None, skipna=True, *args, **kwargs):
"""
Index of first occurrence of maximum of values.
Index *label* of the first occurrence of maximum of values.

Parameters
----------
Expand All @@ -1313,7 +1316,9 @@ def idxmax(self, axis=None, skipna=True, *args, **kwargs):

Notes
-----
This method is the Series version of ``ndarray.argmax``.
This method is the Series version of ``ndarray.argmax``. This method
returns the label of the maximum, while ``ndarray.argmax`` returns
the position. To get the position, use ``series.values.argmax()``.

See Also
--------
Expand All @@ -1327,8 +1332,18 @@ def idxmax(self, axis=None, skipna=True, *args, **kwargs):
return self.index[i]

# ndarray compat
argmin = idxmin
argmax = idxmax
argmin = deprecate('argmin', idxmin,
msg="'argmin' is deprecated. Use 'idxmin' instead. "
"The behavior of 'argmin' will be corrected to "
"return the positional minimum in the future. "
"Use 'series.values.argmin' to get the position of "
"the minimum now.")
argmax = deprecate('argmax', idxmax,
msg="'argmax' is deprecated. Use 'idxmax' instead. "
"The behavior of 'argmax' will be corrected to "
"return the positional maximum in the future. "
"Use 'series.values.argmax' to get the position of "
"the maximum now.")

def round(self, decimals=0, *args, **kwargs):
"""
Expand Down
4 changes: 1 addition & 3 deletions pandas/io/formats/format.py
Original file line number Diff line number Diff line change
Expand Up @@ -598,9 +598,7 @@ def to_string(self):
text = self._join_multiline(*strcols)
else: # max_cols == 0. Try to fit frame to terminal
text = self.adj.adjoin(1, *strcols).split('\n')
row_lens = Series(text).apply(len)
max_len_col_ix = np.argmax(row_lens)
max_len = row_lens[max_len_col_ix]
max_len = Series(text).str.len().max()
headers = [ele[0] for ele in strcols]
# Size of last col determines dot col size. See
# `self._to_str_columns
Expand Down
60 changes: 44 additions & 16 deletions pandas/tests/series/test_analytics.py
Original file line number Diff line number Diff line change
Expand Up @@ -1242,16 +1242,31 @@ def test_idxmin(self):
result = s.idxmin()
assert result == 1

def test_numpy_argmin(self):
# argmin is aliased to idxmin
data = np.random.randint(0, 11, size=10)
result = np.argmin(Series(data))
assert result == np.argmin(data)
def test_numpy_argmin_deprecated(self):
# See gh-16830
data = np.arange(1, 11)

s = Series(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was like this before, but it would actually be good to use eg index=np.arange(1, 11), just to have a distinction between positional and label based argmin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you changed it correctly. It's the index of the Series that should be changed, not the actual data. Because now the tests should fail (the argmin should still be 0 (first location or first index label), while you changed it to assert to 1)

with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue deprecation issue as a comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this as a separate function (these deprecation tests)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all functions in this test will cause the deprecation warning, if I break them into a separate test, what should be left in here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing much, maybe even rename the test a bit to reflect what you are testing

test_numpy_argmin_deprecated or somesuch

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue deprecation issue as a comment

@lphk92 : you should also address this comment from @jreback

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gfyoung I thought that was what I was doing by adding the comments stating that the deprecation warning was also occurring in np.argmax. Could you clarify where you would like a comment added?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What he means is just reference the issue number beneath the function definition e.g. "see gh-16830"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh gotcha, thanks for the clarification. I will add that in the next commit.

# The deprecation of Series.argmin also causes a deprecation
# warning when calling np.argmin. This behavior is temporary
# until the implemention of Series.argmin is corrected.
result = np.argmin(s)

assert result == 1

with tm.assert_produces_warning(FutureWarning):
# argmin is aliased to idxmin
result = s.argmin()

assert result == 1

if not _np_version_under1p10:
msg = "the 'out' parameter is not supported"
tm.assert_raises_regex(ValueError, msg, np.argmin,
Series(data), out=data)
with tm.assert_produces_warning(FutureWarning,
check_stacklevel=False):
msg = "the 'out' parameter is not supported"
tm.assert_raises_regex(ValueError, msg, np.argmin,
s, out=data)

def test_idxmax(self):
# test idxmax
Expand Down Expand Up @@ -1297,17 +1312,30 @@ def test_idxmax(self):
result = s.idxmin()
assert result == 1.1

def test_numpy_argmax(self):
def test_numpy_argmax_deprecated(self):
# See gh-16830
data = np.arange(1, 11)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

s = Series(data)
with tm.assert_produces_warning(FutureWarning, check_stacklevel=False):
# The deprecation of Series.argmax also causes a deprecation
# warning when calling np.argmax. This behavior is temporary
# until the implemention of Series.argmax is corrected.
result = np.argmax(s)
assert result == 10

with tm.assert_produces_warning(FutureWarning):
# argmax is aliased to idxmax
result = s.argmax()

# argmax is aliased to idxmax
data = np.random.randint(0, 11, size=10)
result = np.argmax(Series(data))
assert result == np.argmax(data)
assert result == 10

if not _np_version_under1p10:
msg = "the 'out' parameter is not supported"
tm.assert_raises_regex(ValueError, msg, np.argmax,
Series(data), out=data)
with tm.assert_produces_warning(FutureWarning,
check_stacklevel=False):
msg = "the 'out' parameter is not supported"
tm.assert_raises_regex(ValueError, msg, np.argmax,
s, out=data)

def test_ptp(self):
N = 1000
Expand Down
2 changes: 1 addition & 1 deletion pandas/tests/series/test_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -345,7 +345,7 @@ def test_ndarray_compat(self):
index=date_range('1/1/2000', periods=1000))

def f(x):
return x[x.argmax()]
return x[x.idxmax()]

result = tsdf.apply(f)
expected = tsdf.max()
Expand Down
28 changes: 14 additions & 14 deletions pandas/tests/series/test_operators.py
Original file line number Diff line number Diff line change
Expand Up @@ -1872,33 +1872,33 @@ def test_op_duplicate_index(self):
),
]
)
def test_assert_argminmax_raises(self, test_input, error_type):
def test_assert_idxminmax_raises(self, test_input, error_type):
"""
Cases where ``Series.argmax`` and related should raise an exception
"""
with pytest.raises(error_type):
test_input.argmin()
test_input.idxmin()
with pytest.raises(error_type):
test_input.argmin(skipna=False)
test_input.idxmin(skipna=False)
with pytest.raises(error_type):
test_input.argmax()
test_input.idxmax()
with pytest.raises(error_type):
test_input.argmax(skipna=False)
test_input.idxmax(skipna=False)

def test_argminmax_with_inf(self):
def test_idxminmax_with_inf(self):
# For numeric data with NA and Inf (GH #13595)
s = pd.Series([0, -np.inf, np.inf, np.nan])

assert s.argmin() == 1
assert np.isnan(s.argmin(skipna=False))
assert s.idxmin() == 1
assert np.isnan(s.idxmin(skipna=False))

assert s.argmax() == 2
assert np.isnan(s.argmax(skipna=False))
assert s.idxmax() == 2
assert np.isnan(s.idxmax(skipna=False))

# Using old-style behavior that treats floating point nan, -inf, and
# +inf as missing
with pd.option_context('mode.use_inf_as_na', True):
assert s.argmin() == 0
assert np.isnan(s.argmin(skipna=False))
assert s.argmax() == 0
np.isnan(s.argmax(skipna=False))
assert s.idxmin() == 0
assert np.isnan(s.idxmin(skipna=False))
assert s.idxmax() == 0
np.isnan(s.idxmax(skipna=False))
16 changes: 15 additions & 1 deletion pandas/tests/sparse/test_series.py
Original file line number Diff line number Diff line change
Expand Up @@ -1379,11 +1379,25 @@ def test_numpy_func_call(self):
# numpy passes in 'axis=None' or `axis=-1'
funcs = ['sum', 'cumsum', 'var', 'mean',
'prod', 'cumprod', 'std', 'argsort',
'argmin', 'argmax', 'min', 'max']
'min', 'max']
for func in funcs:
for series in ('bseries', 'zbseries'):
getattr(np, func)(getattr(self, series))

def test_deprecated_numpy_func_call(self):
# NOTE: These should be add to the 'test_numpy_func_call' test above
# once the behavior of argmin/argmax is corrected.
funcs = ['argmin', 'argmax']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment that we need to include them again in the test above when behaviour of argmin/argmax is changes? (to make sure we don't just delete the test when the deprecation is removed)

for func in funcs:
for series in ('bseries', 'zbseries'):
with tm.assert_produces_warning(FutureWarning,
check_stacklevel=False):
getattr(np, func)(getattr(self, series))

with tm.assert_produces_warning(FutureWarning,
check_stacklevel=False):
getattr(getattr(self, series), func)()


@pytest.mark.parametrize(
'datetime_type', (np.datetime64,
Expand Down
8 changes: 5 additions & 3 deletions pandas/util/_decorators.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@


def deprecate(name, alternative, alt_name=None, klass=None,
stacklevel=2):
stacklevel=2, msg=None):
"""
Return a new function that emits a deprecation warning on use.
Expand All @@ -21,14 +21,16 @@ def deprecate(name, alternative, alt_name=None, klass=None,
Name to use in preference of alternative.__name__
klass : Warning, default FutureWarning
stacklevel : int, default 2
msg : str
The message to display in the warning.
Default is '{name} is deprecated. Use {alt_name} instead.'
"""

alt_name = alt_name or alternative.__name__
klass = klass or FutureWarning
msg = msg or "{} is deprecated. Use {} instead".format(name, alt_name)

def wrapper(*args, **kwargs):
msg = "{name} is deprecated. Use {alt_name} instead".format(
name=name, alt_name=alt_name)
warnings.warn(msg, klass, stacklevel=stacklevel)
return alternative(*args, **kwargs)
return wrapper
Expand Down