DEPR: deprecate .ix in favor of .loc/.iloc #15113

Closed

jreback
Contributor

@jreback jreback commented Jan 12, 2017

closes #14218
closes #15116

This shows a pretty big deprecation message, though it's instructive.

In [1]: df = pd.DataFrame({'A': [1, 2, 3],
   ...:                        'B': [4, 5, 6]},
   ...:                       index=list('abc'))
   ...: df
   ...: 
Out[1]: 
   A  B
a  1  4
b  2  5
c  3  6

In [2]: df.ix[[0, 2], 'A']
/Users/jreback/miniconda3/envs/pandas/bin/ipython:1: FutureWarning: 
.ix is deprecated. Please use
.loc for label based indexing or
.iloc for positional indexing

You can do multi-axes indexing like the following
>>> df = pd.DataFrame({'A': [1, 2, 3],
                       'B': [4, 5, 6]},
                      index=list('abc'))
>>> df
   A  B
a  1  4
b  2  5
c  3  6

From prior versions
>>> df.ix[[0, 2], 'A']
a    1
c    3
Name: A, dtype: int64

Using .loc
>>> df.loc[df.index[[0, 2]], 'A']
a    1
c    3
Name: A, dtype: int64

Using .iloc
>>> df.iloc[[0, 2], df.columns.get_loc('A')]
a    1
c    3
Name: A, dtype: int64
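
For reference, the two replacements from the message can be run side by side. This is a minimal sketch on the same toy frame; the names `via_loc` and `via_iloc` are only illustrative and do not appear in the PR.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]},
                  index=list('abc'))

# Label-based: turn the row positions into row labels, then use .loc
via_loc = df.loc[df.index[[0, 2]], 'A']

# Position-based: turn the column label into a position, then use .iloc
via_iloc = df.iloc[[0, 2], df.columns.get_loc('A')]

# Both spellings select rows 'a' and 'c' of column 'A'
print(via_loc.equals(via_iloc))
```

Which form reads better probably depends on whether the surrounding code thinks in labels or in positions.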

@jreback jreback added Deprecate Functionality to remove in pandas Indexing Related to indexing on series/frames, not to indexes themselves labels Jan 12, 2017
@jreback jreback added this to the 0.20.0 milestone Jan 12, 2017
@jreback
Contributor Author

jreback commented Jan 12, 2017

This is the big kahuna! wasn't that difficult, mostly did a search-and-replace of .ix -> .loc then fixed the errors.

@jreback
Contributor Author

jreback commented Jan 12, 2017

@wesm
Member

wesm commented Jan 12, 2017

very nice!

@codecov-io

codecov-io commented Jan 12, 2017

Current coverage is 85.53% (diff: 97.36%)

Merging #15113 into master will decrease coverage by 0.01%

@@             master     #15113   diff @@
==========================================
  Files           145        145          
  Lines         51352      51361     +9   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits          43932      43932          
- Misses         7420       7429     +9   
  Partials          0          0          

Powered by Codecov. Last update 362e78d...0bf5cb5

@zygmuntz

Or maybe df.iloc[[0, 2]].A

@max-sixty
Contributor

Awesome!

That's a long message. Is it worth having a shorter message including a link to the docs?

@jreback
Contributor Author

jreback commented Jan 12, 2017

i changed the message to have a link
could skip the example if people think it's too long

@shoyer
Member

shoyer commented Jan 12, 2017

This is great!

Deprecation warnings tend to turn up downstream in application code, where they are often read by users for whom they are not relevant. This is a necessary evil, but to reduce their burden it's best to keep them short. I would stick to a few lines with a link, e.g.,

.ix is deprecated. Please use .loc for label based indexing or .iloc for
positional indexing. For more details, see:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate_ix
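
A short message like this could be emitted roughly as follows. This is only a sketch: `IX_MESSAGE` and `warn_ix_deprecated` are hypothetical names, not the actual pandas implementation, and the warning category is left as a parameter because the choice of category is a separate question.

```python
import warnings

# Hypothetical constant holding the short message suggested above.
IX_MESSAGE = (
    ".ix is deprecated. Please use .loc for label based indexing or "
    ".iloc for\npositional indexing. For more details, see:\n"
    "http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate_ix"
)


def warn_ix_deprecated(category=FutureWarning, stacklevel=2):
    """Hypothetical helper: emit the short deprecation message.

    stacklevel=2 attributes the warning to the caller of this function
    rather than to the helper itself.
    """
    warnings.warn(IX_MESSAGE, category, stacklevel=stacklevel)


# Capture the warning to inspect it instead of printing it to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_ix_deprecated()

print(caught[0].category.__name__)
print(caught[0].message)
```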

@jorisvandenbossche
Member

For that reason I would favor a 'softer' deprecation: first use a DeprecationWarning (these should still be visible by default in interactive usage in IPython), and only in a second phase (one or a few releases later) switch to a more visible FutureWarning.
.ix is used massively, and users will otherwise be flooded with warnings when using library code depending on pandas (which I personally find very annoying).

jreback added a commit to jreback/pandas that referenced this pull request Jan 12, 2017
handle iterator
handle NamedTuple
.loc returns scalar selection dtypes correctly, closes pandas-dev#11617

xref pandas-dev#15113
Member

@jorisvandenbossche jorisvandenbossche left a comment

Great work! (I didn't check all the ix -> iloc/loc substitutions in detail :-))

We will need to do a clean-up of the docs as well (usage of ix in the docs besides the actual indexing docs)

My main concern is that the alternative presented here in case you want to use iloc is not fully complete:

In [18]: df = pd.DataFrame(np.random.randn(5,5), columns=list('ABCDE'))

In [19]: df
Out[19]: 
          A         B         C         D         E
0  0.870482 -0.388952 -0.972597 -0.843245  0.903255
1 -0.483238  1.130196 -0.105157 -0.700333 -0.291880
2 -0.541109 -1.916656  1.039409 -0.678030 -0.995090
3  0.375849  0.313649 -0.621017 -1.517242 -0.888986
4  0.144481  0.155721  0.719531  0.959571  2.066996

It works fine as long as you want a single label:

In [21]: df.iloc[0, df.columns.get_loc('A')]
Out[21]: 0.87048249788122523

but not anymore for multiple labels:

In [22]: df.ix[0, ['A', 'B']]
Out[22]: 
A    0.870482
B   -0.388952
Name: 0, dtype: float64

In [23]: df.iloc[0, df.columns.get_loc(['A', 'B'])]
...
TypeError: '['A', 'B']' is an invalid key

Then you can use get_indexer:

In [24]: df.iloc[0, df.columns.get_indexer(['A', 'B'])]
Out[24]: 
A    0.870482
B   -0.388952
Name: 0, dtype: float64

but this does not work on a single value:

In [25]: df.iloc[0, df.columns.get_indexer('A')]
...
TypeError: Index(...) must be called with a collection of some kind, 'A' was passed

No matter that this is the intended behaviour of get_loc/get_indexer, it makes it again more complex for the user.
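
One way to paper over the scalar/list split would be a small wrapper. The sketch below is an assumption, not something proposed in this PR: `column_positions` is a hypothetical helper, it assumes unique column labels, and it relies on `pandas.api.types.is_list_like` to choose between `get_loc` and `get_indexer`.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_list_like


def column_positions(columns, labels):
    """Hypothetical helper: map one label or a list of labels to positions.

    Assumes `columns` has unique labels. Note that get_indexer marks
    labels it cannot find with -1 instead of raising.
    """
    if is_list_like(labels):
        return columns.get_indexer(labels)
    return columns.get_loc(labels)


df = pd.DataFrame(np.arange(25).reshape(5, 5), columns=list('ABCDE'))

# A single label goes through get_loc and yields a scalar
single = df.iloc[0, column_positions(df.columns, 'A')]

# A list of labels goes through get_indexer and yields a Series
several = df.iloc[0, column_positions(df.columns, ['A', 'B'])]

print(single)
print(several)
```

This keeps `.iloc` usable for both cases, at the cost of one more function for the user to know about.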


This deliberate decision was made to prevent ambiguities and subtle bugs (many
users reported finding bugs when the API change was made to stop "falling back"
on position-based indexing).
Member

The above section is still valid for [] on a Series, so it should not be removed, I think

Contributor Author

done


.. warning::

Startin in 0.20.0, the ``.ix`` indexer is deprecated, in favor of the more strict ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. To wit, ``.ix`` can decide to index *positionally* OR via *labels*. This has caused quite a bit of user confusion over the years.
Member

startin -> starting

Contributor Author

done

Deprecate .ix
^^^^^^^^^^^^^

The ``.ix`` indexer is deprecated, in favor of the more strict ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. To wit, ``.ix`` can decide to index *positionally* OR via *labels*. This has caused quite a bit of user confusion over the years. The full indexing documentation are :ref:`here <indexing>`. (:issue:`14218`)
Member

"user confusion" -> not only user confusion :-) I still can't predict what it will do in all cases!

Contributor Author

not sure what you want to change here

self.assertEqual(result, 1)

# TODO: this doesn't work, should it?
# result = df.loc[IndexType("foo", "bar")]["A"]
# self.assertEqual(result, 1)
Member

probably yes, but seems quite exotic

Member

I see that you did already handle this in the other PR!

Contributor Author

yea I have to rebase this.

self.frame.ix[('bar', 'two'), 'B'] = 5
self.assertEqual(self.frame.ix[('bar', 'two'), 'B'], 5)
self.frame.loc[('bar', 'two'), 'B'] = 5
self.assertEqual(self.frame.loc[('bar', 'two'), 'B'], 5)
Member

Should we keep some of the ix tests here for MultiIndexing? (I see you kept none in this file, but maybe MultiIndex is also tested somewhere else?)

Contributor Author

i added a few back

@jreback
Contributor Author

jreback commented Jan 12, 2017

No matter that this is the intended behaviour of get_loc/get_indexer, it makes it again more complex for the user.

That's just an alternative, I think it is much more clear to use direct indexing and use .loc

thus

df.loc[df.index[[0, 2]], 'A'] or whatever is pretty much a direct translation. I will relegate .get_loc/.get_indexer I think to a smaller section (as that really isn't the recommended way I think).
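
Applied to the single-label and multi-label cases raised in the review, the direct `.loc` translation would look something like this. It is a sketch on a toy frame; the string index `'vwxyz'` is an assumption chosen so that positions and labels are visibly different.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(25).reshape(5, 5),
                  columns=list('ABCDE'),
                  index=list('vwxyz'))

# df.index[...] converts row positions to row labels, so the same
# spelling works for one column label or for several.
one = df.loc[df.index[0], 'A']                 # was: df.ix[0, 'A']
many = df.loc[df.index[0], ['A', 'B']]         # was: df.ix[0, ['A', 'B']]
rows = df.loc[df.index[[0, 2]], ['A', 'B']]    # was: df.ix[[0, 2], ['A', 'B']]

print(one)
print(many)
print(rows)
```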

jreback added a commit that referenced this pull request Jan 12, 2017
handle iterator
handle NamedTuple
.loc returns scalar selection dtypes correctly, closes #11617

xref #15113

Author: Jeff Reback <jeff@reback.net>

Closes #15120 from jreback/indexing and squashes the following commits:

801c8d9 [Jeff Reback] BUG: indexing changes to .loc for compat to .ix for several situations
@jreback
Contributor Author

jreback commented Jan 12, 2017

ok I updated all of the docs that I could as well.

@jreback jreback force-pushed the ix branch 2 times, most recently from eaf57e2 to 0bf5cb5 Compare January 17, 2017 18:42
@jreback
Contributor Author

jreback commented Jan 17, 2017

@jorisvandenbossche do you feel strongly about this being a DeprecationWarning rather than a FutureWarning? I don't think we do that for anything else IIRC, though this is more visible than anything else we have done.

@jorisvandenbossche
Member

@jorisvandenbossche do you feel strongly about this being a DeprecationWarning rather than a FutureWarning? I don't think we do that for anything else IIRC, though this is more visible than anything else we have done.

I am personally in favor of using a DeprecationWarning, yes. But would love to hear some other opinions.

The reason I am in favor is that, if we use FutureWarning, many users will see the warnings (from external libraries they use) without knowing where they come from or what they can do about them. Of course you can specifically silence them, but that is not very friendly to novice users.
And remember that, even when using DeprecationWarning, those are visible in an interactive IPython session (if the stacklevel is set correctly, but that is the case). So you still get warned for direct interactive usage.

Using DeprecationWarning now gives package authors some time to clean up first (assuming, of course, that they turn on deprecation warnings in their tests; e.g. seaborn and statsmodels both have uses of .ix). After some time, we can still turn the DeprecationWarning into a more visible FutureWarning.

We do this currently like this for the deprecations in core.common, but as I was the one in favor there as well and changed this, that is not really a reference :-)
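
The two sides of this trade-off can be sketched with the standard `warnings` module. `legacy_call` is a hypothetical stand-in for library code that still uses a deprecated API; nothing here is taken from pandas itself.

```python
import warnings


def legacy_call():
    # Hypothetical library function that still relies on a deprecated API.
    warnings.warn(".ix is deprecated", DeprecationWarning, stacklevel=2)
    return 1


# Package authors: escalate DeprecationWarning to an error in the test
# suite so that remaining uses are found early.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        legacy_call()
        raised = False
    except DeprecationWarning:
        raised = True

# End users: explicitly silence the category; nothing is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore", DeprecationWarning)
    value = legacy_call()

print(raised, value, len(caught))
```

The same filters accept `FutureWarning`, so a later switch of category would only change which of these lines downstream code needs.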

@jreback
Contributor Author

jreback commented Jan 18, 2017

ok, I will change this to DeprecationWarning; also will create an issue to change this for the next version of pandas.

@TomAugspurger
Contributor

Also +1 for DeprecationWarning then FutureWarning in the next release.

I started to work on removing .ix from statsmodels. 300ish uses from a quick grep IIRC.

@jreback jreback closed this in 99afdd9 Jan 18, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
handle iterator
handle NamedTuple
.loc returns scalar selection dtypes correctly, closes pandas-dev#11617

xref pandas-dev#15113

Author: Jeff Reback <jeff@reback.net>

Closes pandas-dev#15120 from jreback/indexing and squashes the following commits:

801c8d9 [Jeff Reback] BUG: indexing changes to .loc for compat to .ix for several situations
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
closes pandas-dev#14218
closes pandas-dev#15116

Author: Jeff Reback <jeff@reback.net>

Closes pandas-dev#15113 from jreback/ix and squashes the following commits:

1544f50 [Jeff Reback] DEPR: deprecate .ix in favor of .loc/.iloc
@topper-123 topper-123 mentioned this pull request Oct 10, 2019