Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add dtype parameter to IntervalIndex constructors and deprecate from_intervals #19339

Merged
merged 3 commits into from
Jan 25, 2018

Conversation

jschendel
Copy link
Member

Summary:

  • Added support for a dtype parameter to all IntervalIndex constructors
    • Allows users to override the inferred dtype
  • Deprecated IntervalIndex.from_intervals
    • Still added dtype parameter, since it's just a pass through to IntervalIndex
    • Removed usage and references to IntervalIndex.from_intervals throughout the codebase
  • Split construction tests off into interval/test_construction.py
    • Created a base class for common tests, and subclasses for each constructor
    • Was previously written in a more flat style, where each constructor was explicitly called
    • Expanded the tests to hit some previously untested behavior

@@ -1617,7 +1617,6 @@ IntervalIndex Components
IntervalIndex.from_arrays
IntervalIndex.from_tuples
IntervalIndex.from_breaks
IntervalIndex.from_intervals
Copy link
Member Author

@jschendel jschendel Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be removed from api.rst now, or should I wait until from_intervals is actually deleted from the codebase and not just deprecated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is fine

@codecov
Copy link

codecov bot commented Jan 22, 2018

Codecov Report

Merging #19339 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19339      +/-   ##
==========================================
- Coverage   91.61%    91.6%   -0.02%     
==========================================
  Files         150      150              
  Lines       48677    48697      +20     
==========================================
+ Hits        44595    44607      +12     
- Misses       4082     4090       +8
Flag Coverage Δ
#multiple 89.97% <100%> (-0.02%) ⬇️
#single 41.72% <26.31%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/category.py 97.51% <100%> (ø) ⬆️
pandas/core/reshape/tile.py 90.25% <100%> (ø) ⬆️
pandas/core/indexes/interval.py 92.4% <100%> (-0.02%) ⬇️
pandas/core/indexes/base.py 96.46% <100%> (ø) ⬆️
pandas/util/testing.py 83.83% <0%> (-0.62%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7202635...da2eac2. Read the comment docs.

@jreback jreback added Deprecate Functionality to remove in pandas Interval Interval data type labels Jan 22, 2018
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments. testing looks great!

@@ -1617,7 +1617,6 @@ IntervalIndex Components
IntervalIndex.from_arrays
IntervalIndex.from_tuples
IntervalIndex.from_breaks
IntervalIndex.from_intervals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is fine

@@ -200,7 +200,8 @@ def __new__(cls, data=None, dtype=None, copy=False, name=None,
# interval
if is_interval_dtype(data) or is_interval_dtype(dtype):
from .interval import IntervalIndex
return IntervalIndex(data, dtype=dtype, name=name, copy=copy)
return IntervalIndex(data, dtype=dtype, name=name, copy=copy,
**kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually need kwargs to be passed or is that for consistentcy with others?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed to pass closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you pass that directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done by doing closed = kwargs.get('closed', None), then passing that into the constructor. Not sure if you meant doing it that way, or by adding closed=None to the Index constructor.

@@ -151,6 +152,8 @@ class IntervalIndex(IntervalMixin, Index):
Name to be stored in the index.
copy : boolean, default False
Copy the meta-data
dtype : dtype or None, default None
If None, dtype will be inferred
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a version added

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -417,6 +427,8 @@ def from_breaks(cls, breaks, closed='right', name=None, copy=False):
Name to be stored in the index.
copy : boolean, default False
copy the data
dtype : dtype or None, default None
If None, dtype will be inferred
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vesionadded

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -458,6 +469,8 @@ def from_arrays(cls, left, right, closed='right', name=None, copy=False):
Name to be stored in the index.
copy : boolean, default False
copy the data
dtype : dtype or None, default None
If None, dtype will be inferred
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -545,6 +559,8 @@ def from_tuples(cls, data, closed='right', name=None, copy=False):
Name to be stored in the index.
copy : boolean, default False
by-default copy the data, this is compat only and ignored
dtype : dtype or None, default None
If None, dtype will be inferred
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

lhs, rhs = d
except ValueError:
msg = ('IntervalIndex.from_tuples requires tuples of '
'length 2, got {tpl}').format(tpl=d)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we testing both of these paths?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in TestFromTuples.test_constructor_errors: testing tuples of length 1 and 3, as well as non-tuple input.

except TypeError:
msg = ('IntervalIndex.from_tuples received an invalid '
'item, {tpl}').format(tpl=d)
raise TypeError(msg)
left.append(lhs)
right.append(rhs)

# TODO
# if we have nulls and we previous had *only*
# integer data, then we have changed the dtype
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment is moot? (this was from me originally).....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed and removed. Was wondering about that comment but previously left it as-is in case there was something I missed; guess not.

@jschendel jschendel force-pushed the ii-constructor-dtype branch 2 times, most recently from 7b56db9 to 213b8eb Compare January 23, 2018 02:12
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very minor comments. ping on green.

left = _ensure_index(left, copy=copy)
right = _ensure_index(right, copy=copy)

if dtype is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a 1-liner here, noting that this must be an interval dtype

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -571,15 +595,21 @@ def from_tuples(cls, data, closed='right', name=None, copy=False):
if isna(d):
lhs = rhs = np.nan
else:
lhs, rhs = d
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a 1-liner here showing the format of th expected input (list-of-tuples)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jschendel jschendel force-pushed the ii-constructor-dtype branch from 213b8eb to da2eac2 Compare January 24, 2018 08:01
@jschendel
Copy link
Member Author

@jreback : green

@jreback jreback merged commit 250574e into pandas-dev:master Jan 25, 2018
@jreback
Copy link
Contributor

jreback commented Jan 25, 2018

nice @jschendel keep em coming!

@jschendel jschendel deleted the ii-constructor-dtype branch January 25, 2018 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Interval Interval data type
Projects
None yet
2 participants