API: add "level=" argument to MultiIndex.unique() #17897

Merged

Conversation

toobaz
Member

@toobaz toobaz commented Oct 16, 2017

@@ -896,20 +896,24 @@ def _get_level_values(self, level):
Parameters
----------
level : int level
unique : bool
Contributor

you need to add a version added tag

Member Author

(done)

Contributor

need a versionadded tag

Member Author

I added it, and then removed it on @jorisvandenbossche 's suggestion because this is an internal method.

Contributor

add the default value in the doc string

Contributor

u can remove the versionadded tag here

Member Author

toobaz commented Oct 16, 2017

From #17881, @jreback's comments:

> I am not sure unique is the right word here.

I'm not a fan either...

> .get_level_values(level, used=False), though I am not sure I like this either.

... but I think this would be pretty confusing ("used" as in "this label appears in the index" vs. "used" as in "this label is used multiple times")

Member Author

toobaz commented Oct 16, 2017

Alternatives:

  • MultiIndex.get_level_values(idx, drop_duplicates=False)
  • MultiIndex.get_level_values(idx, duplicated=True)

Contributor

jreback commented Oct 16, 2017

how about remove_unused_levels=False, so then remove_unused_levels is obvious

Member Author

toobaz commented Oct 16, 2017

> how about remove_unused_levels=False, so then remove_unused_levels is obvious

Do you mean "levels" or "labels"? Anyway, I'm not following you: get_level_values() always removes unused labels... the peculiarity of this is that (used) labels appear only once.

codecov bot commented Oct 16, 2017

Codecov Report

Merging #17897 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #17897      +/-   ##
==========================================
- Coverage   91.23%   91.22%   -0.02%     
==========================================
  Files         163      163              
  Lines       50105    50107       +2     
==========================================
- Hits        45715    45708       -7     
- Misses       4390     4399       +9
Flag         Coverage Δ
#multiple    89.03% <100%> (ø) ⬆️
#single      40.31% <75%> (-0.06%) ⬇️

Impacted Files                  Coverage Δ
pandas/core/indexes/multi.py    96.4% <100%> (ø) ⬆️
pandas/io/gbq.py                25% <0%> (-58.34%) ⬇️
pandas/core/frame.py            97.75% <0%> (-0.1%) ⬇️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5bf7f9a...8a69543.

codecov bot commented Oct 16, 2017

Codecov Report

Merging #17897 into master will decrease coverage by 0.04%.
The diff coverage is 95.23%.

@@            Coverage Diff             @@
##           master   #17897      +/-   ##
==========================================
- Coverage   91.38%   91.34%   -0.05%     
==========================================
  Files         164      164              
  Lines       49797    49812      +15     
==========================================
- Hits        45508    45501       -7     
- Misses       4289     4311      +22
Flag         Coverage Δ
#multiple    89.14% <95.23%> (-0.03%) ⬇️
#single      39.55% <61.9%> (-0.06%) ⬇️

Impacted Files                     Coverage Δ
pandas/core/indexes/base.py        96.42% <100%> (ø) ⬆️
pandas/core/indexes/multi.py       96.4% <100%> (+0.02%) ⬆️
pandas/core/indexes/category.py    97.2% <75%> (-0.26%) ⬇️
pandas/io/gbq.py                   25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py      63.44% <0%> (-1.82%) ⬇️
pandas/core/frame.py               97.8% <0%> (-0.1%) ⬇️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a172ff9...feb65ed.

@@ -792,6 +792,7 @@ Other API Changes
- Pandas no longer registers matplotlib converters on import. The converters
will be registered and used when the first plot is draw (:issue:`17710`)
- Setting on a column with a scalar value and 0-len index now raises a ``ValueError`` (:issue:`16823`)
- :func:`MultiIndex.get_level_values` now supports the `unique` argument (:issue:`17896`)
Member

Should unique have double ticks here, ``unique``?

Member Author

toobaz commented Oct 16, 2017

Answering @jorisvandenbossche from here:

> I agree it would be nice to have a clean way to get those unique values, but IMO it does not belong in get_level_values. That method returns the actual values of the Index level, with a length equal to the length of the Index, and IMO we should stick to that contract. Having such a keyword would completely alter the return type of this method.
>
> (not directly a good idea for alternative though)

I'm open to alternatives, but let me summarize my rationale for working on get_level_values():

  • the difference between mi.levels ("implementation detail") and mi.get_level_values() (the "effective" content of the index level) has always confused users, but at least it finally reduces to remembering the difference between the two. In this sense, the set of "effective" unique values appearing in the level clearly pertains to the second.
  • I've always considered the get_level_values() name itself misleading - it reminds me of something like value_counts(), not of something like mi.levels[idx] or df[col], which get entire levels/columns. If I had to redesign the API from scratch, I would probably call get_level_values() precisely the version with unique=True (and get_level() the version with unique=False)
  • in general, I would like to avoid adding new methods to an API which already has too many
  • (whatever choice we make for the API, the work will be done by the internal _get_level_values, but this is an implementation detail)

All this said, an alternative which I would find elegant enough would be to add an argument level to MultiIndex.unique(). It could even accept a list, which could be useful. The only disadvantage I see would be that it would deviate from the signature of Index.unique(), but after all this already happens for MultiIndex.drop(level=).

Member Author

toobaz commented Oct 17, 2017

Two additional elements in favor of get_level_values():

  • we might want to add an argument sort= which switches between the order the elements appear in mi.levels[level] and the order the elements appear in mi (the implementation is trivial). This would be specific to MultiIndexes (sorting might be undefined for a generic Index) and hence a bit out of place in MultiIndex.unique(). Vice-versa, it would be trivial (albeit of limited utility) to support it also in conjunction with unique=False.
  • the standard way as of now to (inefficiently) obtain the same result is precisely mi.get_level_values(level).unique()

@jorisvandenbossche
Member

OK, trying to clarify my rationale for objecting to adding it to get_level_values.

So indeed we have to distinguish the levels (the "unique labels" for each level) and the get_level_values (the "effective values" inside one level, as a full Index). And I agree that the naming is not ideal (additionally with labels not being the labels but the integer codes ..).

> In this sense, the set of "effective" unique values appearing in the level clearly pertains to the second.

You could indeed see it that way (unique "effective values"), but I would rather see it as the used "unique labels". If you look at it like that, it clearly pertains to the first ..

In that sense, suppose we had a method like .get_level(i) (which would be equivalent to .levels[i]), I think it would have belonged in that method: something like .get_level(i, remove_unused=True).
But of course, we don't have that method, and I agree we should avoid adding new methods ..

> All this said, an alternative which I would find elegant enough would be to add an argument level to MultiIndex.unique()

This actually sounds like a nice alternative.
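
To make the distinction between `levels` and `get_level_values` discussed above concrete, here is a minimal sketch. It assumes a recent pandas, where the integer codes the thread calls "labels" are exposed as `.codes`; the index contents are made up for illustration and do not come from the PR.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the PR).
mi = pd.MultiIndex.from_product(
    [["a", "b", "c"], [1, 2]], names=["first", "second"]
)
sub = mi[2:]  # every row labelled "a" is dropped

# The stored level keeps "a" even though no row uses it any more.
stored = sub.levels[0]

# The effective values: one entry per row of the index.
effective = sub.get_level_values(0)

# Integer positions into sub.levels[0] for each row.
codes = sub.codes[0]

print(list(stored))     # ['a', 'b', 'c']
print(list(effective))  # ['b', 'b', 'c', 'c']
print(list(codes))      # [1, 1, 2, 2]
```

The "used unique labels" that the thread is after are the values in `effective` with duplicates removed, which here no longer include "a".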

Member

jorisvandenbossche commented Oct 17, 2017

> the standard way as of now to (inefficiently) obtain the same result is precisely mi.get_level_values(level).unique()

or mi.remove_unused_levels().levels[i] (which will also be more efficient)
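
The two existing idioms mentioned so far can be compared side by side. This is only a sketch on made-up data; it checks that both return the same set of values and deliberately makes no claim about their ordering, which is not guaranteed to match.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the PR).
mi = pd.MultiIndex.from_arrays(
    [["b", "a", "b", "c"], [1, 1, 2, 2]], names=["first", "second"]
)
sub = mi[:3]  # "c" stays in sub.levels[0] but is no longer used

# Idiom 1: materialise the full level, then deduplicate it.
via_values = sub.get_level_values("first").unique()

# Idiom 2: rebuild the levels without unused entries, then read one.
via_levels = sub.remove_unused_levels().levels[0]

# Same set of values either way; compare sorted to ignore ordering.
print(sorted(via_values), sorted(via_levels))
```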

Member Author

toobaz commented Oct 17, 2017

> or mi.remove_unused_levels().levels[i] (which will also be more efficient)

Good point. Still, it's recent and hence probably not very well known. And in principle less efficient for short indexes with multiple long levels.

Anyway, I'm OK with changing this PR to MultiIndex.unique(level=): @jreback what do you think?

Contributor

jreback commented Oct 17, 2017

> Anyway, I'm OK with changing this PR to MultiIndex.unique(level=): @jreback what do you think?

seems reasonable.

alternatively, .get_levels() is available, sure I don't like expanding the API but it is explicit.

@toobaz toobaz force-pushed the get_level_values_unique branch 4 times, most recently from e5a4635 to 3617e2a Compare October 17, 2017 11:57
@toobaz
Member Author

toobaz commented Oct 17, 2017

> alternatively, .get_levels() is available, sure I don't like expanding the API but it is explicit.

I'm afraid it would be confusing (I would love to swap its name with get_level_values(), but it's too late), so I went for unique(level=).
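
For reference, a short sketch of how the `unique(level=)` form chosen here can be called, assuming a pandas release that includes this change. The index below is illustrative only.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the PR).
mi = pd.MultiIndex.from_arrays(
    [["foo", "foo", "bar", "bar"], [1, 2, 1, 2]],
    names=["first", "second"],
)

# A level can be selected by position...
by_position = mi.unique(level=0)

# ...or by name.
by_name = mi.unique(level="second")

# With the default level=None the unique tuples come back as a MultiIndex.
whole = mi.unique()

print(by_position)
print(by_name)
print(whole)
```

The single-level results are flat `Index` objects that carry the level name, while the default call keeps the `MultiIndex` type.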

@@ -945,6 +951,34 @@ def get_level_values(self, level):
values = self._get_level_values(level)
return values

def unique(self, level=None):
Contributor

@jreback jreback Oct 17, 2017

I would add the level=None kw to Index.unique as well (for compat). It should raise if it's not None.

Member Author

OK, would you mention it in the (Index) docs?
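
A sketch of what the compatibility keyword looks like on a flat `Index`, assuming a pandas release that includes this change. The accepted values (0, the index name, `None`) follow the snippet discussed in this PR; the exception type for an invalid level has differed between the draft (`ValueError`) and later pandas code, so the example catches several types instead of asserting one. The name `no_such_level` is just a placeholder.

```python
import pandas as pd

idx = pd.Index([2, 3, 2, 1], name="my_index")

# On a flat Index, position 0, the index name and None all refer
# to the only level there is, so the results are identical.
results = [idx.unique(level=level) for level in (0, "my_index", None)]

# Anything else is rejected. The exact exception type is treated
# as version-dependent here (an assumption), hence the broad except.
rejected = False
try:
    idx.unique(level="no_such_level")
except (ValueError, KeyError, IndexError):
    rejected = True

print([list(r) for r in results], rejected)
```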

        expected = Index(['foo', 'bar', 'baz', 'qux'],
                         name='first')
        tm.assert_index_equal(result, expected)
        assert result.name == 'first'
Member

this is already done by assert_index_equal

arrays = [['a', 'b', 'b'], [2, np.nan, 2]]
index = pd.MultiIndex.from_arrays(arrays)
values = index.unique(level=1)
expected = np.array([2], dtype=np.int64)
Member

For a normal index, the NaN is included in the uniques.
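
The point about NaN on a flat index can be checked with a small sketch. The `MultiIndex` call reuses the arrays from the test under review; its result is only printed, not asserted, because whether NaN appears in the per-level uniques may depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Flat index: NaN counts as one of the unique values.
flat = pd.Index([2, np.nan, 2])
flat_unique = flat.unique()
print(flat_unique)

# Same data as a MultiIndex level (arrays taken from the test above).
arrays = [["a", "b", "b"], [2, np.nan, 2]]
mi = pd.MultiIndex.from_arrays(arrays)
level_unique = mi.unique(level=1)
print(level_unique)  # inspect rather than assume: may vary by version
```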

index = pd.MultiIndex.from_arrays(arrays)
values = index.unique(level=1)
expected = np.array([2], dtype=np.int64)
tm.assert_numpy_array_equal(values.values, expected)
Member

Why are you testing here the .values and not comparing with an Index object ?

unique : bool
if True, drop duplicated values

.. versionadded:: 0.21.0
Member

this is not needed for an internal method IMO

Member

ah, I see that actually @jreback asked to add it above. But still, I don't think this should be added for internal private functions, as internal code should never have to care about the pandas version


Parameters
----------
level : int, optional, defaults None
Member

Should you be able to both specify int or level name ?

Member Author

toobaz commented Oct 30, 2017

The first xfail is unrelated to my patch to the code, and just reflects the fact that I replaced tm.assert_numpy_array_equal with tm.assert_index_equal, as suggested by @jorisvandenbossche

Contributor

jreback commented Oct 31, 2017

can you rebase and I will take a look.

-    def unique(self):
+    def unique(self, level=None):
+        if level not in {0, self.name, None}:
+            raise ValueError("Level {} not found".format(level))
Contributor

test for this?

Member Author

They are here

Contributor

is the doc string for unique updated?

Member Author

No, because you never replied to my question

Contributor

not sure what the question is

@@ -1484,6 +1484,24 @@ def test_get_level_values(self):
result = index_with_name.get_level_values('a')
tm.assert_index_equal(result, index_with_name)

    def test_unique(self):
        idx = pd.Index([2, 3, 2, 1], name='my_index')
Contributor

can you add the issue number

    def test_unique(self):
        idx = pd.Index([2, 3, 2, 1], name='my_index')
        expected = pd.Index([2, 3, 1], name='my_index')
        for level in 0, 'my_index', None:
Contributor

can you integrate with the other tests of unique in indexes/common.py and/or tests/test_base.py

@@ -2269,6 +2271,20 @@ def test_unique(self):
exp = pd.MultiIndex.from_arrays([['a'], ['a']])
tm.assert_index_equal(res, exp)

# GH #17896 - with level= argument
Contributor

try with named level, try on an already unique level as well. ideally even more corner cases.

Member Author

Added here

Contributor

make this a separate test and parametrize

@@ -100,7 +100,7 @@ Indexing

- Bug in :func:`PeriodIndex.truncate` which raises ``TypeError`` when ``PeriodIndex`` is monotonic (:issue:`17717`)
- Bug in ``DataFrame.groupby`` where key as tuple in a ``MultiIndex`` were interpreted as a list of keys (:issue:`17979`)
-
- :func:`MultiIndex.unique` now supports the ``level=`` argument (:issue:`17896`)
Contributor

move to other enhancements

@@ -329,6 +329,25 @@ def test_duplicates(self, indices):
assert not idx.is_unique
assert idx.has_duplicates

def test_unique(self):
# GH 17896
Contributor

really like to use the indices fixture here to test all Index types

Member Author

@toobaz toobaz Nov 11, 2017

It assigns no name to the indexes... shall I change it so it does?

Contributor

you can try that

tm.assert_index_equal(result, expected)

# With already unique level
mi = pd.MultiIndex.from_arrays([[1, 3, 2, 4], [1, 1, 1, 2]],
Contributor

test with each level of the mi

tm.assert_index_equal(result, expected)

@pytest.mark.xfail(reason='GH 17924 (returns Int64Index with float data)')
def test_unique_with_nans(self):
Contributor

remove this test

@toobaz toobaz force-pushed the get_level_values_unique branch 3 times, most recently from 103d192 to d5f4324 Compare November 12, 2017 15:13
@toobaz
Member Author

toobaz commented Nov 12, 2017

The second commit changes the behavior of Index(., dtype='category').unique() to not change the order of categories (to the order of appearance). I know this is inconsistent with both Categorical.unique() and Series(., dtype='category').unique(), but I would rather change the latter two, because I don't see the point of unique() changing the categories at all.

If instead the current behavior is preferred, that commit can be dropped, and I will just need to exclude the category case from this test - because drop_duplicates and unique currently give different results.

Contributor

jreback commented Nov 12, 2017

> but I would rather change the latter two, because I don't see the point of unique() changing the categories at all.

how did this come up? why do you think that the consistency should diverge here?

Member Author

toobaz commented Nov 12, 2017

> how did this come up?

My new test compares drop_duplicates with unique, which differ for category, precisely because the categories change. But my argument is clearly not "let's save the test", it is "why should an operation which clearly doesn't change the set of distinct values affect the categories"?

> why do you think that the consistency should diverge here?

I'm not in favor of breaking consistency (between Index, Categorical and Series), I would rather change them all.
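
The comparison behind this exchange can be reproduced with a sketch like the following, on made-up data. The distinct values agree between `unique()` and `drop_duplicates()`; what happens to the categories (kept, reordered or trimmed) has differed across pandas versions, so the categories are only printed for inspection.

```python
import pandas as pd

# Illustrative data (an assumption): category "c" is never used.
ci = pd.CategoricalIndex(["b", "a", "b"], categories=["a", "b", "c"])

uniq = ci.unique()
dedup = ci.drop_duplicates()

# The distinct values come back in order of appearance either way.
print(list(uniq), list(dedup))

# The categories are the part under discussion; inspect them
# rather than assuming a particular outcome for your version.
print(uniq.categories.tolist(), dedup.categories.tolist())
```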

Contributor

jreback commented Nov 12, 2017

@jorisvandenbossche

@toobaz toobaz changed the title API: add "unique=" argument to MultiIndex.get_level_values() API: add "level=" argument to MultiIndex.unique() Nov 13, 2017
@toobaz
Member Author

toobaz commented Nov 14, 2017

The behavior of Categorical.unique() and friends (Series, Index) is probably worth discussing separately and fixing all together, so I created #18291, and changed the problematic test in this PR so it temporarily skips any CategoricalIndex.

Member Author

toobaz commented Nov 15, 2017

( @jreback : ping)

@jorisvandenbossche
Member

Will take a look at this tomorrow!

level : int or str, optional, default None
only return values from specified level (for MultiIndex)

.. versionadded:: 0.21.0
Contributor

0.22.0

with tm.assert_raises_regex(ValueError, msg):
indices.unique(level=level)

def test_unique_na(self, indices):
Contributor

you are not using indices here, so don't add it as an argument

Member

@jorisvandenbossche jorisvandenbossche left a comment

Looks good in general!
Added a bunch of minor comments

@@ -24,6 +24,7 @@ Other Enhancements

- Better support for :func:`Dataframe.style.to_excel` output with the ``xlsxwriter`` engine. (:issue:`16149`)
- :func:`pandas.tseries.frequencies.to_offset` now accepts leading '+' signs e.g. '+1h'. (:issue:`18171`)
- :func:`MultiIndex.unique` now supports the ``level=`` argument (:issue:`17896`)
Member

Can you add briefly what it does, something like ".. to get the unique values of a single index level" ?

@@ -361,7 +361,9 @@ def is_monotonic_decreasing(self):
return Index(self.codes).is_monotonic_decreasing

@Appender(base._shared_docs['unique'] % _index_doc_kwargs)
Member

Should this be the shared docs of 'index_unique' ?

Series.unique
""")

@Appender(base._shared_docs['index_unique'] % _index_doc_kwargs)
Member

I don't think the _index_doc_kwargs are doing something here ?

Member Author

@toobaz toobaz Nov 18, 2017

No, you're right. But it was already the case, so I thought it was a choice of consistency ("you don't need to change this line if you change the docs making the interpolation meaningful"). Shall I remove it?

@@ -3757,8 +3757,32 @@ def drop(self, labels, errors='raise'):
indexer = indexer[~mask]
return self.delete(indexer)

@Appender(base._shared_docs['unique'] % _index_doc_kwargs)
def unique(self):
base._shared_docs['index_unique'] = (
Member

I think this can be in _index_shared_docs instead of base._shared_docs ? (because base are the ones shared with series)

Parameters
----------
level : int or str, optional, default None
only return values from specified level (for MultiIndex)
Member

Start sentence with capital letter

@@ -943,6 +947,15 @@ def get_level_values(self, level):
values = self._get_level_values(level)
return values

@Appender(base._shared_docs['index_unique'] % _index_doc_kwargs)
Member

base._shared_docs -> index_shared_docs

    @Appender(base._shared_docs['index_unique'] % _index_doc_kwargs)
    def unique(self, level=None):
        if level not in {0, self.name, None}:
            raise ValueError("Level {} not found".format(level))
Member

I think we have _get_level_number for this?

Member

Or rather _validate_index_level which is used in _get_level_number

@@ -963,19 +963,21 @@ def test_get_level_values(self):
exp = CategoricalIndex([1, 2, 3, 1, 2, 3])
tm.assert_index_equal(index.get_level_values(1), exp)

def test_get_level_values_na(self):
@pytest.mark.xfail(reason='GH 17924 (returns Int64Index with float data)')
Member

Why do we xfail this? (I mean, why don't we keep asserting for now that it is float)

Member Author

The assertions compare a correctly built index (float64 data in Float64Index) with an invalid index (float64 data in Int64Index, due to #17924 ). Maybe I'm missing something, but it's not obvious to me what we would assert - plus, I see it as an added value to xfail for a buggy behavior.

Member

OK

@toobaz toobaz force-pushed the get_level_values_unique branch 3 times, most recently from fbf9eff to 337e942 Compare November 18, 2017 10:03
@jreback
Contributor

jreback commented Nov 19, 2017

needs a rebase

Member

@jorisvandenbossche jorisvandenbossche left a comment

Last minor comments, otherwise looks good!

    @Appender(_index_shared_docs['index_unique'] % _index_doc_kwargs)
    def unique(self, level=None):
        if level not in {0, self.name, None}:
            raise ValueError("Level {} not found".format(level))
Member

Use here _validate_index_level as well?

Member Author

toobaz commented Nov 20, 2017

ping

@jorisvandenbossche jorisvandenbossche merged commit 3b05a60 into pandas-dev:master Nov 20, 2017
@jorisvandenbossche
Member

Thanks!

@toobaz toobaz deleted the get_level_values_unique branch November 20, 2017 08:11
Development

Successfully merging this pull request may close these issues.

API: support "unique=True" in MultiIndex.get_level_values()
4 participants