Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Series ndarray properties (strides, data, base, itemsize, flags) #20721

Merged

Conversation

jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche added the Deprecate Functionality to remove in pandas label Apr 17, 2018
@jorisvandenbossche jorisvandenbossche added this to the 0.23.0 milestone Apr 17, 2018
@@ -737,11 +737,17 @@ def item(self):
@property
def data(self):
""" return the data pointer of the underlying data """
warnings.warn("Series/Index.data is deprecated and will be "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to make the warning messages a bit more dynamic with something like:

"{obj}.data is deprecated...".format(obj=type(self).__name__)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thank you (somehow I was the docstring mindset thinking we needed to adapt this upon class definition, but of course this can be done dynamically on run time :-))

@pep8speaks
Copy link

pep8speaks commented Apr 18, 2018

Hello @jorisvandenbossche! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on April 24, 2018 at 12:42 Hours UTC

@codecov
Copy link

codecov bot commented Apr 18, 2018

Codecov Report

Merging #20721 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20721      +/-   ##
==========================================
- Coverage   91.85%   91.84%   -0.01%     
==========================================
  Files         153      153              
  Lines       49308    49316       +8     
==========================================
+ Hits        45290    45296       +6     
- Misses       4018     4020       +2
Flag Coverage Δ
#multiple 90.24% <100%> (-0.01%) ⬇️
#single 41.89% <25%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/algorithms.py 94.37% <100%> (ø) ⬆️
pandas/core/internals.py 95.58% <100%> (ø) ⬆️
pandas/core/base.py 96.83% <100%> (+0.03%) ⬆️
pandas/core/indexes/category.py 97.03% <0%> (-0.27%) ⬇️
pandas/core/arrays/categorical.py 95.64% <0%> (-0.14%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0ae7e90...feb7caf. Read the comment docs.

@jorisvandenbossche jorisvandenbossche force-pushed the depr-series-ndarray-attrs branch from f10dd3f to 11182b1 Compare April 18, 2018 08:44
@@ -328,7 +328,7 @@ def test_series_agg_multi_pure_python():
'F': np.random.randn(11)})

def bad(x):
assert (len(x.base) > 0)
assert (len(x.values.base) > 0)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure what the purpose of this assert actually is (was introduced in 71e9046)

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls run all tests and see if any warnings comes up

""" return the base object if the memory of the underlying data is
shared
"""
# override deprecated property in IndexOpsMixin, as we still need
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your explanation doesn't make sense, we are deprecating these no? you need to change the way its accessed in the code to remove the warnings.

Copy link
Member Author

@jorisvandenbossche jorisvandenbossche Apr 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanation perfectly makes sense, but you don't like the way I solved it I suppose (I agree it is not the nicest, but I was thinking that this would only be temporarily until DatetimeTZBlock is an ExtensionBlock).

To fix the usage itself, .base is used in two placed:

  • Block.is_view: this I can override in DatetimeTzBlock to check self.values.values.base
  • concatenate_join_units:
    if copy and concat_values.base is not None:
    Not fully sure about this one what to do here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my point is this method should not exist (as you are ddeprecateding), and would rather have you fix the usage

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said, I agree with that. Do you have a suggestion for the second case? (inside concatenate_join_units) In #20745 I need to touch the same code, and checked if the values have a base attribute, but that also feels hacky

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, see my comment there, only check base if its an ndarray

@@ -1491,6 +1491,8 @@ def take_nd(arr, indexer, axis=0, out=None, fill_value=np.nan, mask_info=None,
if is_sparse(arr):
arr = arr.get_values()

arr = np.array(arr, copy=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.asarray slightly more idiomatic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes (for some reason I wanted the copy=False keyword, but for asarray that is of course just the default behaviour)

""" return the base object if the memory of the underlying data is
shared
"""
# override deprecated property in IndexOpsMixin, as we still need
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my point is this method should not exist (as you are ddeprecateding), and would rather have you fix the usage

@jreback
Copy link
Contributor

jreback commented Apr 21, 2018

can you update

@jorisvandenbossche
Copy link
Member Author

Now that #20745 is merged, updated this.

@TomAugspurger
Copy link
Contributor

@jorisvandenbossche a couple new warnings I think: https://travis-ci.org/pandas-dev/pandas/jobs/369977719#L2762

pandas/tests/indexes/datetimes/test_datetimelike.py::TestDatetimeIndex::()::test_copy_name
  /Users/travis/build/pandas-dev/pandas/pandas/core/indexes/base.py:551: FutureWarning: DatetimeIndex.base is deprecated and will be removed in a future version
    orig = orig if orig.base is None else orig.base
pandas/tests/internals/test_internals.py::TestBlockManager::()::test_copy
  /Users/travis/build/pandas-dev/pandas/pandas/tests/internals/test_internals.py:469: FutureWarning: DatetimeIndex.base is deprecated and will be removed in a future version
    assert cp_blk.values.base is blk.values.base
  /Users/travis/build/pandas-dev/pandas/pandas/tests/internals/test_internals.py:477: FutureWarning: DatetimeIndex.base is deprecated and will be removed in a future version
    if cp_blk.values.base is not None and blk.values.base is not None:
  /Users/travis/build/pandas-dev/pandas/pandas/tests/internals/test_internals.py:478: FutureWarning: DatetimeIndex.base is deprecated and will be removed in a future version
    assert cp_blk.values.base is not blk.values.base

Can you see if they're easy to fix?

@jreback
Copy link
Contributor

jreback commented Apr 24, 2018

lgtm, other than @TomAugspurger comments.

@jreback
Copy link
Contributor

jreback commented Apr 24, 2018

lgtm. modulo any warnings fixes. note that I just pushed a CI fix for lint (on other PR)

@@ -548,6 +548,9 @@ def _deepcopy_if_needed(self, orig, copy=False):
"""
if copy:
# Retrieve the "base objects", i.e. the original memory allocations
if not isinstance(orig, np.ndarray):
# orig is a DatetimeIndex
orig = orig.values
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback This special case is because we sometimes construct DatetimeIndex objects where the _data attribute still is a DatetimeIndex in itself.
To me this seems like something we should not do? (it doesn't hurt, but there is also no reason to have it I think, so better always store the ndarray to be consistent?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should never do that
how did this come up?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just opened an issue about it: #20810
It came up from fixing the warnings in the tests. The above case, where orig is not always a ndarray (hence my special case if statement), triggered it, because orig always comes from a _data attribute, and so I expected it to always be a ndarray.

@TomAugspurger TomAugspurger merged commit 21e884f into pandas-dev:master Apr 25, 2018
@TomAugspurger
Copy link
Contributor

Thanks @jorisvandenbossche

@naazneen
Copy link

I was doing scipy.fftpack.fft over a series and it showed error as
Attriute error: Series object has no attribute flags.

What should I use instead? And how to overcome it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deprecate Series.strides, .base, .data, .itemsize, .flags (numpy) attributes ?
6 participants