DEPR: deprecate .get_value and .set_value for Series, DataFrame, Panel & Sparse #17739

jreback · 2017-10-02T12:30:58Z

jreback · 2017-10-02T16:27:58Z

jschendel · 2017-10-02T17:18:16Z

pandas/core/sparse/frame.py

@@ -460,7 +497,35 @@ def set_value(self, index, col, value, takeable=False):
        -------
        frame : DataFrame
        """
-        dense = self.to_dense().set_value(index, col, value, takeable=takeable)
+        warnings.warn("get_value is deprecated and will be removed "


Should this one be "set_value is deprecated..." instead of "get_value"?

thanks...i fixed the whole doc-string duplication mess as well.

jorisvandenbossche · 2017-10-02T19:15:54Z

I am not fully convinced this is needed. As @wesm said in the issue, they were originally meant as fast access methods, and although performance has degraded over time, they are still considerably faster as any alternative.

I agree in the large majority of the use case you should not (or can avoid) having to set / get individual values, but why not keep them for those few cases you cannot otherwise?

jreback · 2017-10-02T19:18:33Z

well we have more than 1 way to do things (12 ways in fact). This is just a bad API design. Not to mention that these barely have any tests / attention, and they break in lots of cases (as they don't handle validation / dtypes propertly).

this is a no-brainer.

jreback · 2017-10-02T19:18:49Z

.iat/.at are properly supported and pretty close in perf

codecov · 2017-10-02T23:23:34Z

Codecov Report

Merging #17739 into master will decrease coverage by 0.01%.
The diff coverage is 96.55%.

@@            Coverage Diff             @@
##           master   #17739      +/-   ##
==========================================
- Coverage   91.24%   91.22%   -0.02%     
==========================================
  Files         163      163              
  Lines       49918    49957      +39     
==========================================
+ Hits        45546    45574      +28     
- Misses       4372     4383      +11

Flag	Coverage Δ
#multiple	`89.03% <96.55%> (ø)`	⬆️
#single	`40.25% <39.65%> (-0.06%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/sparse/frame.py	`94.78% <100%> (+0.09%)`	⬆️
pandas/core/frame.py	`97.74% <100%> (-0.09%)`	⬇️
pandas/core/indexing.py	`93% <100%> (ø)`	⬆️
pandas/core/sparse/series.py	`95.26% <100%> (+0.11%)`	⬆️
pandas/core/panel.py	`96.94% <100%> (+0.03%)`	⬆️
pandas/core/series.py	`94.89% <77.77%> (-0.16%)`	⬇️
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 37860a5...ec7b086. Read the comment docs.

jreback · 2017-10-03T15:01:46Z

any final comments?

jschendel · 2017-10-03T17:17:32Z

doc/source/whatsnew/v0.21.0.txt

@@ -666,6 +666,7 @@ Deprecations
 - ``cdate_range`` has been deprecated in favor of :func:`bdate_range`, which has gained ``weekmask`` and ``holidays`` parameters for building custom frequency date ranges. See the :ref:`documentation <timeseries.custom-freq-ranges>` for more details (:issue:`17596`)
 - passing ``categories`` or ``ordered`` kwargs to :func:`Series.astype` is deprecated, in favor of passing a :ref:`CategoricalDtype <whatsnew_0210.enhancements.categorical_dtype>` (:issue:`17636`)
 - Passing a non-existant column in ``.to_excel(..., columns=)`` is deprecated and will raise a ``KeyError`` in the future (:issue:`17295`)
+- ``.get_value`` and ``.set_value`` on ``Series``, ``DataFrame``, ``Panel``, ``SparseSeries``, and ``SparseDataFrame`` are deprecated in favor of using ``.iat[]`` or ``.at[]`` accessors


Should there be an issue number here?

Unrelated, but if you're going to make a change, there's also a typo on the line above this one that could be fixed: "non-existant" -> "non-existent". Can submit a fix myself if there's no need to add an issue number and changes don't need to be made.

thanks, both fixed

jorisvandenbossche · 2017-10-04T11:17:32Z

I already stated my reservations about the need for this.
So ideally it would be good to hear some more opinions from other devs (because if most are OK with this, it's fine for me too, I don't care about it too much)

TomAugspurger · 2017-10-04T11:45:16Z

I feel the same as Joris, and will defer to others on this.

wesm · 2017-10-04T13:59:38Z

As we have already capitulated long ago on caring about microperformance (https://mail.python.org/pipermail/pandas-dev/2016-March/000499.html) I don't have strong feelings. If it does not cause us a maintenance hardship (maybe it does...), and the performance is better, why not keep them?

jreback · 2017-10-04T14:01:06Z

perf is almost exactly the same as .iat/.at; they are in fact wrappers around the internal versions of .set_value/.get_value. but absolutely this is an API/maintenance burden. too much surface area.

jorisvandenbossche · 2017-10-04T18:11:40Z

Since they are indeed rather thin wrappers around the internal set/get_value, I don't think they are actually that much of a maintenance burden. But it is true it is an API 'burden' (they add to the long list of available functions that users have to cope with).
So it's a bit a trade-off. But they are still a factor of around 3x faster as .at. So given that I would tend to keep them (although I never use them myself).

jreback · 2017-10-04T18:18:14Z

you are completely missing the point.

They duplicate APIs. They are not tested in the least and they are very buggy. The entire point of .iat/.at is that they follow a common path and ARE tested with all dtypes.

We are not going to keep around buggy code that has duplicate API's. This such a maintance burden. We need to actively be removing code, not preserving untested, non-documented duplicative API's.

I personally would rather have correct code.

wesm · 2017-10-04T19:54:29Z

I wouldn't say he / I are missing the point but we have a slightly different perspective =) I am +0 with removing the APIs in the interest of simplification and maintainability.

jorisvandenbossche · 2017-10-04T20:15:35Z

If you read my previous post, I think you can see that I am not completely missing the point. I agree that it is duplicate API. But there is still a trade-off between duplicate api vs speed / breaking people's code.
But I do not agree they are not tested. They are tested by the many indexing tests that use them under the hood.

But to the point: by further looking at it, I am certainly fine with the idea of deprecating those functions. As I would also like to clean up the API, and iat/at are I think nicer.
But, at/iat are actually also only very small wrappers around get_value/set_value. They also do almost as little data validation as get_value/set_value (eg the bug report about PeriodIndex with get_value (#15268), you have exactly the same using at). And I think in general this limited validation is actually the feature of those functions/methods (although the fact it cannot handle a Period on a PeriodIndex could be seen as a bug for at).
And the little bit extra validation they do (the reason they are slower) is IMO not that useful. So I would like to hear if you would be OK with making at/iat a bit faster (closer to get_value/set_value).

Eg, if you look at the at implementation for getting, it does the following validation:

check that it is a tuple -> otherwise raise ValueError
on integer axis, check that key is integer -> otherwise ValueError
on non-integer axis, check that key is not integer -> otherwise raise ValueError

But in all those case, when not doing this check, you would get a KeyError otherwise with the get_value that is used under the hood. But those checks make up for a big part of the reason that at is 3x as slow as get_value.
Are those checks necessary? They provide a little bit a more informative error message (compared to just KeyError). And dropping the checks would be an API change from ValueError to KeyError. But IMO they are not really important, and I would be in favor to drop them, making at closer in speed to get_value.

Would you be with such a change?

(and to repeat from above, I am OK with the idea of removing get_value/set_value)

jorisvandenbossche

Minor doc comment, for the rest no code comments!

jreback · 2017-10-05T00:25:19Z

Eg, if you look at the at implementation for getting, it does the following validation:

check that it is a tuple -> otherwise raise ValueError
on integer axis, check that key is integer -> otherwise ValueError
on non-integer axis, check that key is not integer -> otherwise raise ValueError
But in all those case, when not doing this check, you would get a KeyError otherwise with the get_value that is used under the hood. But those checks make up for a big part of the reason that at is 3x as slow as get_value.
Are those checks necessary? They provide a little bit a more informative error message (compared to just KeyError). And dropping the checks would be an API change from ValueError to KeyError. But IMO they are not really important, and I would be in favor to drop them, making at closer in speed to get_value.

Would you be with such a change?

I guess my overarching point is that we have a framework that we can/should test .iat/at more. So for sure testing is +1. We already have some scalar fast paths that least to these routines, but some of those changes are certainly possible/welcome. Obviously not averse to speeding things up.

Things done in the unified indexing context are always welcome.

didn't see any doc comments left, can you point.

jorisvandenbossche

Oops, forgot to push the button. Here is the minor doc comment.

jorisvandenbossche · 2017-10-05T06:57:03Z

pandas/core/frame.py

@@ -1918,6 +1919,8 @@ def get_value(self, index, col, takeable=False):
        """
        Quickly retrieve single value at passed column and index

+        .. deprecated:: 0.21.0
+


Can you add here the alternative like in the deprecations message? (the "use '.at[]' or '.iat[]' instead")

(and the same for the other docstrings)

…l, SparseSeries, SparseDataFrame closes pandas-dev#15269

…l, SparseSeries, SparseDataFrame (pandas-dev#17739) closes pandas-dev#15269

jreback added Deprecate Functionality to remove in pandas Indexing Related to indexing on series/frames, not to indexes themselves labels Oct 2, 2017

jreback added this to the 0.21.0 milestone Oct 2, 2017

jsexauer mentioned this pull request Oct 2, 2017

DEPR: Clean up list of deprecations from prior versions #6581

Closed

1 task

jschendel reviewed Oct 2, 2017

View reviewed changes

jreback force-pushed the get_value branch from 21d3bb0 to 4bf8a21 Compare October 2, 2017 19:02

jreback force-pushed the get_value branch from 4bf8a21 to 095256d Compare October 3, 2017 11:11

jschendel reviewed Oct 3, 2017

View reviewed changes

jreback force-pushed the get_value branch from 095256d to 7dd12ee Compare October 3, 2017 20:27

jorisvandenbossche reviewed Oct 4, 2017

View reviewed changes

jorisvandenbossche reviewed Oct 5, 2017

View reviewed changes

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Pane…

ec7b086

…l, SparseSeries, SparseDataFrame closes pandas-dev#15269

jreback force-pushed the get_value branch from 7dd12ee to ec7b086 Compare October 5, 2017 10:23

jreback merged commit 3fae8dd into pandas-dev:master Oct 5, 2017

ghost pushed a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Pane…

897867f

…l, SparseSeries, SparseDataFrame (pandas-dev#17739) closes pandas-dev#15269

alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Pane…

ecc067c

…l, SparseSeries, SparseDataFrame (pandas-dev#17739) closes pandas-dev#15269

No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Pane…

db0d19b

…l, SparseSeries, SparseDataFrame (pandas-dev#17739) closes pandas-dev#15269

jreback mentioned this pull request Jul 26, 2019

DEPR: deprecations log for removed issues #13777

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Panel & Sparse #17739

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Panel & Sparse #17739

jreback commented Oct 2, 2017

jreback commented Oct 2, 2017

jschendel Oct 2, 2017

jreback Oct 2, 2017

jorisvandenbossche commented Oct 2, 2017

jreback commented Oct 2, 2017

jreback commented Oct 2, 2017

codecov bot commented Oct 2, 2017 •

edited

Loading

jreback commented Oct 3, 2017

jschendel Oct 3, 2017

jreback Oct 3, 2017

jorisvandenbossche commented Oct 4, 2017

TomAugspurger commented Oct 4, 2017

wesm commented Oct 4, 2017

jreback commented Oct 4, 2017

jorisvandenbossche commented Oct 4, 2017

jreback commented Oct 4, 2017 •

edited

Loading

wesm commented Oct 4, 2017

jorisvandenbossche commented Oct 4, 2017

jorisvandenbossche left a comment

jreback commented Oct 5, 2017

jorisvandenbossche left a comment

jorisvandenbossche Oct 5, 2017

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Panel & Sparse #17739

DEPR: deprecate .get_value and .set_value for Series, DataFrame, Panel & Sparse #17739

Conversation

jreback commented Oct 2, 2017

jreback commented Oct 2, 2017

jschendel Oct 2, 2017

Choose a reason for hiding this comment

jreback Oct 2, 2017

Choose a reason for hiding this comment

jorisvandenbossche commented Oct 2, 2017

jreback commented Oct 2, 2017

jreback commented Oct 2, 2017

codecov bot commented Oct 2, 2017 • edited Loading

Codecov Report

jreback commented Oct 3, 2017

jschendel Oct 3, 2017

Choose a reason for hiding this comment

jreback Oct 3, 2017

Choose a reason for hiding this comment

jorisvandenbossche commented Oct 4, 2017

TomAugspurger commented Oct 4, 2017

wesm commented Oct 4, 2017

jreback commented Oct 4, 2017

jorisvandenbossche commented Oct 4, 2017

jreback commented Oct 4, 2017 • edited Loading

wesm commented Oct 4, 2017

jorisvandenbossche commented Oct 4, 2017

jorisvandenbossche left a comment

Choose a reason for hiding this comment

jreback commented Oct 5, 2017

jorisvandenbossche left a comment

Choose a reason for hiding this comment

jorisvandenbossche Oct 5, 2017

Choose a reason for hiding this comment

codecov bot commented Oct 2, 2017 •

edited

Loading

jreback commented Oct 4, 2017 •

edited

Loading