Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: deprecate fastpath keyword in Index constructors #23110

Merged
merged 4 commits into from
Oct 18, 2018

Conversation

jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche commented Oct 12, 2018

From all that looking at Index constructors ...

Apparently it was even hardly used internally for the Index classes (except a little bit in RangeIndex).

Part of #20110

@pep8speaks
Copy link

Hello @jorisvandenbossche! Thanks for submitting the PR.

@jorisvandenbossche jorisvandenbossche added this to the 0.24.0 milestone Oct 12, 2018
@jorisvandenbossche jorisvandenbossche added the Deprecate Functionality to remove in pandas label Oct 12, 2018
@codecov
Copy link

codecov bot commented Oct 12, 2018

Codecov Report

Merging #23110 into master will increase coverage by 0.01%.
The diff coverage is 96.29%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #23110      +/-   ##
==========================================
+ Coverage    92.2%   92.21%   +0.01%     
==========================================
  Files         169      169              
  Lines       50888    50895       +7     
==========================================
+ Hits        46919    46935      +16     
+ Misses       3969     3960       -9
Flag Coverage Δ
#multiple 90.63% <96.29%> (+0.01%) ⬆️
#single 42.23% <40.74%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/period.py 93.24% <0%> (-0.03%) ⬇️
pandas/core/indexes/category.py 97.56% <100%> (+0.29%) ⬆️
pandas/core/indexes/numeric.py 97.44% <100%> (+0.03%) ⬆️
pandas/core/indexes/base.py 96.6% <100%> (+0.05%) ⬆️
pandas/core/indexes/range.py 95.77% <100%> (+0.03%) ⬆️
pandas/core/arrays/datetimelike.py 95.34% <0%> (-0.15%) ⬇️
pandas/core/indexes/datetimes.py 95.63% <0%> (-0.02%) ⬇️
pandas/io/formats/csvs.py 98.21% <0%> (ø) ⬆️
pandas/core/arrays/datetimes.py 96.94% <0%> (+0.02%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12a0dc4...4575847. Read the comment docs.

@@ -306,7 +306,7 @@ def __contains__(self, key):

@cache_readonly
def _int64index(self):
return Int64Index(self.asi8, name=self.name, fastpath=True)
return Int64Index._simple_new(self.asi8, name=self.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the performance implication of using _simple_new vs just Int64Index(...)? If its sufficiently small, we should be using the public constructors

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relatively it is a big difference:

In [8]: values = np.arange(100000)

In [9]: %timeit pd.Index(values)
31 µs ± 140 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [10]: %timeit pd.Index._simple_new(values)
1.62 µs ± 26.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

but if that will be significant in actual code, I don't know.

But since we are using that in many places, I leave changing _simple_new calls to the public constructor for another PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel this leads a bit further, but given that the public constructors (__init__) of Index classes are very complex (something we will not be able to change I think), I think that we need to see _simple_new as kind of a "internal public" method within the pandas code base.

@@ -174,7 +179,7 @@ def _data(self):

@cache_readonly
def _int64index(self):
return Int64Index(self._data, name=self.name, fastpath=True)
return Int64Index._simple_new(self._data, name=self.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shallow_copy? ditto below

Copy link
Member Author

@jorisvandenbossche jorisvandenbossche Oct 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this case, since it is not self that is calling it, so you need to pass the self.name anyway

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One case below is shallow_copy, but the other can indeed be changed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And actually, _shallow_copy has not the ability to specify (override) the name (which is not really following how _shallow_copy works for other Index classes), so can also not use it there.

Might be a bit buggy in _shallow_copy (it is also not used at all anywhere in the RangeIndex implementation, although maybe in inherited methods ..).
But I want to keep it here just about deprecating the fastpath keyword, so let's leave that for another issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are u changing these constructors at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not changing anything. fastpath=True means exactly the _simple_new what I replaced it with.
This PR does not try to do a clean-up of usage of _simple_new by replacing it with the default constructor, it only deprecates the fastpath argument.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@jbrockmendel
Copy link
Member

General +1, couple of comments. Is there a tracker issue for deprecations-to-eventually-remove?

@jreback
Copy link
Contributor

jreback commented Oct 12, 2018

yes
search for the label of Deprecation then sort by most comments

@jorisvandenbossche
Copy link
Member Author

Added it to the list in #6581

@jorisvandenbossche
Copy link
Member Author

@jbrockmendel Thanks for the review!

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would prefer simply to use the actual constructor rather than _simple_new if we can

@@ -174,7 +179,7 @@ def _data(self):

@cache_readonly
def _int64index(self):
return Int64Index(self._data, name=self.name, fastpath=True)
return Int64Index._simple_new(self._data, name=self.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are u changing these constructors at all?

@jreback
Copy link
Contributor

jreback commented Oct 18, 2018

@jorisvandenbossche can you update

@jorisvandenbossche
Copy link
Member Author

What do I need to update?

@jreback
Copy link
Contributor

jreback commented Oct 18, 2018

why are u changing these constructors at all?

comment above

@jorisvandenbossche
Copy link
Member Author

Answer above: #23110 (comment)

@jreback jreback merged commit 795b771 into pandas-dev:master Oct 18, 2018
@jreback
Copy link
Contributor

jreback commented Oct 18, 2018

thanks @jorisvandenbossche

@jorisvandenbossche
Copy link
Member Author

thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants