DEPR: deprecate .get_value and .set_value for Series, DataFrame, Panel & Sparse #17739
pandas/core/sparse/frame.py
Outdated
@@ -460,7 +497,35 @@ def set_value(self, index, col, value, takeable=False):
-------
frame : DataFrame
"""
dense = self.to_dense().set_value(index, col, value, takeable=takeable)
warnings.warn("get_value is deprecated and will be removed "
Should this one be "set_value is deprecated..." instead of "get_value"?
thanks...i fixed the whole doc-string duplication mess as well.
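For readers following the thread, here is a minimal, standard-library-only sketch of the deprecation-wrapper pattern under review. `TinyFrame`, `at`, and `set_at` are hypothetical stand-ins (pandas uses the `.at[]` / `.iat[]` indexers), and the exact message wording and the choice of `FutureWarning` are assumptions, not the code in this PR.

```python
import warnings


class TinyFrame:
    """Hypothetical stand-in for a frame; stores {column: {index: value}}."""

    def __init__(self, data):
        self._data = {col: dict(rows) for col, rows in data.items()}

    def at(self, index, col):
        # Preferred accessor in this sketch.
        return self._data[col][index]

    def set_at(self, index, col, value):
        # Preferred setter in this sketch.
        self._data[col][index] = value

    def get_value(self, index, col):
        # Deprecated wrapper: warn, then delegate to the preferred accessor.
        warnings.warn(
            "get_value is deprecated and will be removed in a future release. "
            "Please use .at[] or .iat[] accessors instead",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return self.at(index, col)

    def set_value(self, index, col, value):
        # The message must name set_value here; copying the get_value
        # message verbatim is the slip flagged in the review comment.
        warnings.warn(
            "set_value is deprecated and will be removed in a future release. "
            "Please use .at[] or .iat[] accessors instead",
            FutureWarning,
            stacklevel=2,
        )
        self.set_at(index, col, value)
        return self
```

Because both wrappers only warn and delegate, a test that records warnings and checks the start of each message would catch this kind of copy-paste error.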
I am not fully convinced this is needed. As @wesm said in the issue, they were originally meant as fast access methods, and although performance has degraded over time, they are still considerably faster than any alternative. I agree that in the large majority of use cases you should not have to (or can avoid having to) set / get individual values, but why not keep them for the few cases where you cannot avoid it?
Well, we have more than 1 way to do things (12 ways in fact). This is just bad API design. Not to mention that these barely have any tests / attention, and they break in lots of cases (as they don't handle validation / dtypes properly). This is a no-brainer.
Codecov Report
@@ Coverage Diff @@
## master #17739 +/- ##
==========================================
- Coverage 91.24% 91.22% -0.02%
==========================================
Files 163 163
Lines 49918 49957 +39
==========================================
+ Hits 45546 45574 +28
- Misses 4372 4383 +11
any final comments?
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -666,6 +666,7 @@ Deprecations
- ``cdate_range`` has been deprecated in favor of :func:`bdate_range`, which has gained ``weekmask`` and ``holidays`` parameters for building custom frequency date ranges. See the :ref:`documentation <timeseries.custom-freq-ranges>` for more details (:issue:`17596`)
- passing ``categories`` or ``ordered`` kwargs to :func:`Series.astype` is deprecated, in favor of passing a :ref:`CategoricalDtype <whatsnew_0210.enhancements.categorical_dtype>` (:issue:`17636`)
- Passing a non-existant column in ``.to_excel(..., columns=)`` is deprecated and will raise a ``KeyError`` in the future (:issue:`17295`)
- ``.get_value`` and ``.set_value`` on ``Series``, ``DataFrame``, ``Panel``, ``SparseSeries``, and ``SparseDataFrame`` are deprecated in favor of using ``.iat[]`` or ``.at[]`` accessors
Should there be an issue number here?
Unrelated, but if you're going to make a change, there's also a typo on the line above this one that could be fixed: "non-existant" -> "non-existent". Can submit a fix myself if there's no need to add an issue number and changes don't need to be made.
thanks, both fixed
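As an illustration of the migration this whatsnew entry describes, here is a small sketch of scalar access with `.at[]` (label-based) and `.iat[]` (position-based). The frame contents are made up, and the comments mapping each line to an old `get_value` / `set_value` call are my reading of the `takeable` parameter, not text from the PR.

```python
import pandas as pd

# Made-up example data.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]}, index=["x", "y", "z"])

# Label-based scalar read; stands in for df.get_value("y", "a").
label_value = df.at["y", "a"]

# Position-based scalar read; stands in for df.get_value(1, 0, takeable=True).
position_value = df.iat[1, 0]

# Label-based scalar write; stands in for df.set_value("z", "b", 9.5).
df.at["z", "b"] = 9.5

# Position-based scalar write; stands in for df.set_value(0, 1, 7.5, takeable=True).
df.iat[0, 1] = 7.5
```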
I already stated my reservations about the need for this.
I feel the same as Joris, and will defer to others on this.
As we have already capitulated long ago on caring about microperformance (https://mail.python.org/pipermail/pandas-dev/2016-March/000499.html) I don't have strong feelings. If it does not cause us a maintenance hardship (maybe it does...), and the performance is better, why not keep them?
perf is almost exactly the same as
Since they are indeed rather thin wrappers around the internal set/get_value, I don't think they are actually that much of a maintenance burden. But it is true it is an API 'burden' (they add to the long list of available functions that users have to cope with).
You are completely missing the point. They duplicate APIs. They are not tested in the least and they are very buggy. The entire point of .iat/.at is that they follow a common path and ARE tested with all dtypes. We are not going to keep around buggy code that has duplicate APIs. This is such a maintenance burden. We need to actively be removing code, not preserving untested, undocumented, duplicative APIs. I personally would rather have correct code.
I wouldn't say he / I are missing the point but we have a slightly different perspective =) I am +0 with removing the APIs in the interest of simplification and maintainability.
If you read my previous post, I think you can see that I am not completely missing the point. I agree that it is duplicate API. But there is still a trade-off between duplicate api vs speed / breaking people's code. But to the point: by further looking at it, I am certainly fine with the idea of deprecating those functions. As I would also like to clean up the API, and iat/at are I think nicer. Eg, if you look at the
But in all those cases, when not doing this check, you would get a KeyError otherwise with the Would you be with such a change? (and to repeat from above, I am OK with the idea of removing
Minor doc comment, for the rest no code comments!
I guess my overarching point is that we have a framework that we can/should test. Things done in the unified indexing context are always welcome. I didn't see any doc comments left, can you point me to them?
Oops, forgot to push the button. Here is the minor doc comment.
@@ -1918,6 +1919,8 @@ def get_value(self, index, col, takeable=False):
"""
Quickly retrieve single value at passed column and index

.. deprecated:: 0.21.0
Can you add here the alternative like in the deprecations message? (the "use '.at[]' or '.iat[]' instead")
(and the same for the other docstrings)
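To make the request concrete, here is one way the docstring could carry the alternative inside the Sphinx `deprecated` directive. This is a sketch on a hypothetical free function with an assumed wording, not the docstring that was merged.

```python
def get_value(index, col, takeable=False):
    """
    Quickly retrieve single value at passed column and index

    .. deprecated:: 0.21.0
        Use ``.at[]`` or ``.iat[]`` accessors instead.
    """
    # Hypothetical placeholder body; only the docstring matters in this sketch.
    raise NotImplementedError("docstring-only sketch")
```

The indented line under the directive is rendered as the explanation of the deprecation, so the rendered API docs and the runtime warning point users to the same replacement.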
…l, SparseSeries, SparseDataFrame closes pandas-dev#15269
…l, SparseSeries, SparseDataFrame (pandas-dev#17739) closes pandas-dev#15269
closes #15269