DEPR: deprecate .ix in favor of .loc/.iloc
jreback committed Jan 18, 2017
1 parent 362e78d commit 1544f50
Showing 90 changed files with 1,657 additions and 1,399 deletions.
23 changes: 6 additions & 17 deletions doc/source/advanced.rst
@@ -230,7 +230,7 @@ of tuples:
Advanced indexing with hierarchical index
-----------------------------------------

Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc/.ix`` is a
Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc`` is a
bit challenging, but we've made every effort to do so. for example the
following works as you would expect:

@@ -258,7 +258,7 @@ Passing a list of labels or tuples works similar to reindexing:

.. ipython:: python
df.ix[[('bar', 'two'), ('qux', 'one')]]
df.loc[[('bar', 'two'), ('qux', 'one')]]
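Illustration (not part of the diff): a minimal sketch of the list-of-tuples selection that this hunk moves from ``.ix`` to ``.loc``. The frame built here is an assumption made for the example; the ``df`` used in the docs is defined outside the lines shown.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level index; the real docs build their own frame.
index = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']],
    names=['first', 'second'])
df = pd.DataFrame(np.arange(16).reshape(8, 2), index=index, columns=['A', 'B'])

# Each tuple names one complete (first, second) row label.
picked = df.loc[[('bar', 'two'), ('qux', 'one')]]
print(picked)
```

Rows come back in the order the tuples were listed, which is why the docs compare this to reindexing.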
.. _advanced.mi_slicers:

@@ -604,7 +604,7 @@ intended to work on boolean indices and may return unexpected results.
ser = pd.Series(np.random.randn(10))
ser.take([False, False, True, True])
ser.ix[[0, 1]]
ser.iloc[[0, 1]]
Finally, as a small note on performance, because the ``take`` method handles
a narrower range of inputs, it can offer performance that is a good deal
@@ -620,7 +620,7 @@ faster than fancy indexing.
timeit arr.take(indexer, axis=0)

ser = pd.Series(arr[:, 0])
timeit ser.ix[indexer]
timeit ser.iloc[indexer]
timeit ser.take(indexer)

.. _indexing.index_types:
@@ -661,7 +661,7 @@ Setting the index, will create create a ``CategoricalIndex``
df2 = df.set_index('B')
df2.index
Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
Indexing with ``__getitem__/.iloc/.loc`` works similarly to an ``Index`` with duplicates.
The indexers MUST be in the category or the operation will raise.

.. ipython:: python
@@ -759,14 +759,12 @@ same.
sf = pd.Series(range(5), index=indexf)
sf
Scalar selection for ``[],.ix,.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
Scalar selection for ``[],.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
.. ipython:: python
sf[3]
sf[3.0]
sf.ix[3]
sf.ix[3.0]
sf.loc[3]
sf.loc[3.0]
@@ -783,7 +781,6 @@ Slicing is ALWAYS on the values of the index, for ``[],ix,loc`` and ALWAYS posit
.. ipython:: python
sf[2:4]
sf.ix[2:4]
sf.loc[2:4]
sf.iloc[2:4]
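Illustration (not part of the diff): a small sketch of the label-versus-position distinction these hunks describe for a float index. The index values below are an assumption; ``indexf`` is defined outside the lines shown.

```python
import pandas as pd

# Assumed float index; the docs define ``indexf`` elsewhere.
sf = pd.Series(range(5), index=[1.5, 2.0, 3.0, 4.5, 5.0])

by_label_int = sf.loc[3]      # the integer 3 matches the float label 3.0
by_label_float = sf.loc[3.0]  # same element

label_slice = sf.loc[2:4]     # label-based: every label between 2 and 4
pos_slice = sf.iloc[2:4]      # positional: positions 2 and 3, end excluded

print(label_slice.index.tolist(), pos_slice.index.tolist())
```

With ``.ix`` gone, the choice between the two behaviours is always spelled out by the accessor rather than inferred from the index type.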
@@ -813,14 +810,6 @@ In non-float indexes, slicing using floats will raise a ``TypeError``
In [3]: pd.Series(range(5)).iloc[3.0]
TypeError: cannot do positional indexing on <class 'pandas.indexes.range.RangeIndex'> with these indexers [3.0] of <type 'float'>
Further the treatment of ``.ix`` with a float indexer on a non-float index, will be label based, and thus coerce the index.
.. ipython:: python
s2 = pd.Series([1, 2, 3], index=list('abc'))
s2
s2.ix[1.0] = 10
s2
Here is a typical use-case for using this type of indexing. Imagine that you have a somewhat
irregular timedelta-like indexing scheme, but the data is recorded as floats. This could for
9 changes: 3 additions & 6 deletions doc/source/api.rst
@@ -268,13 +268,12 @@ Indexing, iteration
Series.get
Series.at
Series.iat
Series.ix
Series.loc
Series.iloc
Series.__iter__
Series.iteritems

For more information on ``.at``, ``.iat``, ``.ix``, ``.loc``, and
For more information on ``.at``, ``.iat``, ``.loc``, and
``.iloc``, see the :ref:`indexing documentation <indexing>`.

Binary operator functions
@@ -774,7 +773,6 @@ Indexing, iteration
DataFrame.head
DataFrame.at
DataFrame.iat
DataFrame.ix
DataFrame.loc
DataFrame.iloc
DataFrame.insert
@@ -791,7 +789,7 @@ Indexing, iteration
DataFrame.mask
DataFrame.query

For more information on ``.at``, ``.iat``, ``.ix``, ``.loc``, and
For more information on ``.at``, ``.iat``, ``.loc``, and
``.iloc``, see the :ref:`indexing documentation <indexing>`.


@@ -1090,7 +1088,6 @@ Indexing, iteration, slicing

Panel.at
Panel.iat
Panel.ix
Panel.loc
Panel.iloc
Panel.__iter__
@@ -1100,7 +1097,7 @@ Indexing, iteration, slicing
Panel.major_xs
Panel.minor_xs

For more information on ``.at``, ``.iat``, ``.ix``, ``.loc``, and
For more information on ``.at``, ``.iat``, ``.loc``, and
``.iloc``, see the :ref:`indexing documentation <indexing>`.

Binary operator functions
6 changes: 3 additions & 3 deletions doc/source/basics.rst
@@ -145,7 +145,7 @@ either match on the *index* or *columns* via the **axis** keyword:
'two' : pd.Series(np.random.randn(4), index=['a', 'b', 'c', 'd']),
'three' : pd.Series(np.random.randn(3), index=['b', 'c', 'd'])})
df
row = df.ix[1]
row = df.iloc[1]
column = df['two']
df.sub(row, axis='columns')
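Illustration (not part of the diff): a sketch of why ``df.iloc[1]`` is the explicit spelling here. The frame has string row labels, so an integer can only mean a position. The ``'one'`` column is an assumption, since only ``'two'`` and ``'three'`` appear in the lines shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'one': pd.Series(np.random.randn(3), index=['a', 'b', 'c']),  # assumed column
    'two': pd.Series(np.random.randn(4), index=['a', 'b', 'c', 'd']),
    'three': pd.Series(np.random.randn(3), index=['b', 'c', 'd']),
})

row = df.iloc[1]        # second row by position
same_row = df.loc['b']  # the same row, selected by its label

# Subtract the row from every row, aligning on column labels.
diff = df.sub(row, axis='columns')
print(diff)
```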
@@ -556,7 +556,7 @@ course):
series[::2] = np.nan
series.describe()
frame = pd.DataFrame(np.random.randn(1000, 5), columns=['a', 'b', 'c', 'd', 'e'])
frame.ix[::2] = np.nan
frame.iloc[::2] = np.nan
frame.describe()
You can select specific percentiles to include in the output:
@@ -1081,7 +1081,7 @@ objects either on the DataFrame's index or columns using the ``axis`` argument:

.. ipython:: python
df.align(df2.ix[0], axis=1)
df.align(df2.iloc[0], axis=1)
.. _basics.reindex_fill:

3 changes: 1 addition & 2 deletions doc/source/categorical.rst
@@ -482,7 +482,7 @@ Pivot tables:
Data munging
------------

The optimized pandas data access methods ``.loc``, ``.iloc``, ``.ix`` ``.at``, and ``.iat``,
The optimized pandas data access methods ``.loc``, ``.iloc``, ``.at``, and ``.iat``,
work as normal. The only difference is the return type (for getting) and
that only values already in `categories` can be assigned.

@@ -501,7 +501,6 @@ the ``category`` dtype is preserved.
df.iloc[2:4,:]
df.iloc[2:4,:].dtypes
df.loc["h":"j","cats"]
df.ix["h":"j",0:1]
df[df["cats"] == "b"]
An example where the category type is not preserved is if you take one single row: the
10 changes: 5 additions & 5 deletions doc/source/computation.rst
@@ -84,8 +84,8 @@ in order to have a valid result.
.. ipython:: python
frame = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])
frame.ix[:5, 'a'] = np.nan
frame.ix[5:10, 'b'] = np.nan
frame.loc[frame.index[:5], 'a'] = np.nan
frame.loc[frame.index[5:10], 'b'] = np.nan
frame.cov()
@@ -120,7 +120,7 @@ All of these are currently computed using pairwise complete observations.
.. ipython:: python
frame = pd.DataFrame(np.random.randn(1000, 5), columns=['a', 'b', 'c', 'd', 'e'])
frame.ix[::2] = np.nan
frame.iloc[::2] = np.nan
# Series with Series
frame['a'].corr(frame['b'])
@@ -137,8 +137,8 @@ Like ``cov``, ``corr`` also supports the optional ``min_periods`` keyword:
.. ipython:: python
frame = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])
frame.ix[:5, 'a'] = np.nan
frame.ix[5:10, 'b'] = np.nan
frame.loc[frame.index[:5], 'a'] = np.nan
frame.loc[frame.index[5:10], 'b'] = np.nan
frame.corr()
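Illustration (not part of the diff): a sketch of the mixed positional-row, label-column pattern these hunks adopt. ``frame.index[:5]`` turns positions into labels so that ``.loc`` can be used throughout. The second frame and the ``get_loc`` variant are assumptions added for comparison.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])

# Translate row positions into labels, then index purely by label.
frame.loc[frame.index[:5], 'a'] = np.nan    # positions 0-4
frame.loc[frame.index[5:10], 'b'] = np.nan  # positions 5-9

# Alternative sketch: stay positional and look up the column's position.
other = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])
other.iloc[:5, other.columns.get_loc('a')] = np.nan

print(frame['a'].isna().sum(), frame['b'].isna().sum())
```

One thing to watch: on a default integer index the old ``frame.ix[:5, 'a']`` was label-based and included label 5, so it blanked six rows, while the replacement blanks five. That does not matter for the ``cov``/``corr`` demonstration, but it may matter when porting other code.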
21 changes: 9 additions & 12 deletions doc/source/cookbook.rst
@@ -66,19 +66,19 @@ An if-then on one column

.. ipython:: python
df.ix[df.AAA >= 5,'BBB'] = -1; df
df.loc[df.AAA >= 5,'BBB'] = -1; df
An if-then with assignment to 2 columns:

.. ipython:: python
df.ix[df.AAA >= 5,['BBB','CCC']] = 555; df
df.loc[df.AAA >= 5,['BBB','CCC']] = 555; df
Add another line with different logic, to do the -else

.. ipython:: python
df.ix[df.AAA < 5,['BBB','CCC']] = 2000; df
df.loc[df.AAA < 5,['BBB','CCC']] = 2000; df
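Illustration (not part of the diff): the three if-then(-else) assignments above, run end to end with ``.loc`` and a named boolean mask. The starting frame is assumed to match the cookbook data that appears later in this file's diff.

```python
import pandas as pd

# Assumed starting data, taken from the cookbook frame shown further down.
df = pd.DataFrame({'AAA': [4, 5, 6, 7],
                   'BBB': [10, 20, 30, 40],
                   'CCC': [100, 50, -30, -50]})

mask = df['AAA'] >= 5
df.loc[mask, 'BBB'] = -1               # "then" on one column
df.loc[mask, ['BBB', 'CCC']] = 555     # "then" on two columns
df.loc[~mask, ['BBB', 'CCC']] = 2000   # "else" branch via the inverted mask

print(df)
```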
Or use pandas where after you've set up a mask

@@ -149,7 +149,7 @@ Building Criteria
{'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
aValue = 43.0
df.ix[(df.CCC-aValue).abs().argsort()]
df.loc[(df.CCC-aValue).abs().argsort()]
`Dynamically reduce a list of criteria using a binary operators
<http://stackoverflow.com/questions/21058254/pandas-boolean-operation-in-a-python-list/21058331>`__
@@ -217,9 +217,9 @@ There are 2 explicit slicing methods, with a third general case
df.loc['bar':'kar'] #Label
#Generic
df.ix[0:3] #Same as .iloc[0:3]
df.ix['bar':'kar'] #Same as .loc['bar':'kar']
# Generic
df.iloc[0:3]
df.loc['bar':'kar']
Ambiguity arises when an index consists of integers with a non-zero start or non-unit increment.
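Illustration (not part of the diff): a sketch of the ambiguity just described, using an integer index that starts at 1. The data is assumed to be the same cookbook dictionary shown earlier in this file's diff.

```python
import pandas as pd

data = {'AAA': [4, 5, 6, 7], 'BBB': [10, 20, 30, 40], 'CCC': [100, 50, -30, -50]}
df2 = pd.DataFrame(data=data, index=[1, 2, 3, 4])  # index starts at 1, not 0

by_position = df2.iloc[1:3]  # positions 1 and 2, end excluded
by_label = df2.loc[1:3]      # labels 1 through 3, end included

print(by_position.index.tolist(), by_label.index.tolist())
```

The same slice ``1:3`` selects different rows depending on the accessor, which is the ambiguity ``.ix`` used to resolve implicitly.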

@@ -231,9 +231,6 @@ Ambiguity arises when an index consists of integers with a non-zero start or non
df2.loc[1:3] #Label-oriented
df2.ix[1:3] #General, will mimic loc (label-oriented)
df2.ix[0:3] #General, will mimic iloc (position-oriented), as loc[0:3] would raise a KeyError
`Using inverse operator (~) to take the complement of a mask
<http://stackoverflow.com/questions/14986510/picking-out-elements-based-on-complement-of-indices-in-python-pandas>`__

@@ -440,7 +437,7 @@ Fill forward a reversed timeseries
.. ipython:: python
df = pd.DataFrame(np.random.randn(6,1), index=pd.date_range('2013-08-01', periods=6, freq='B'), columns=list('A'))
df.ix[3,'A'] = np.nan
df.loc[df.index[3], 'A'] = np.nan
df
df.reindex(df.index[::-1]).ffill()
@@ -545,7 +542,7 @@ Unlike agg, apply's callable is passed a sub-DataFrame which gives you access to
agg_n_sort_order = code_groups[['data']].transform(sum).sort_values(by='data')
sorted_df = df.ix[agg_n_sort_order.index]
sorted_df = df.loc[agg_n_sort_order.index]
sorted_df
60 changes: 4 additions & 56 deletions doc/source/gotchas.rst
@@ -221,7 +221,7 @@ Label-based indexing with integer axis labels is a thorny topic. It has been
discussed heavily on mailing lists and among various members of the scientific
Python community. In pandas, our general viewpoint is that labels matter more
than integer locations. Therefore, with an integer axis index *only*
label-based indexing is possible with the standard tools like ``.ix``. The
label-based indexing is possible with the standard tools like ``.loc``. The
following code will generate exceptions:

.. code-block:: python
@@ -230,7 +230,7 @@ following code will generate exceptions:
s[-1]
df = pd.DataFrame(np.random.randn(5, 4))
df
df.ix[-2:]
df.loc[-2:]
This deliberate decision was made to prevent ambiguities and subtle bugs (many
users reported finding bugs when the API change was made to stop "falling back"
@@ -305,15 +305,15 @@ index can be somewhat complicated. For example, the following does not work:

::

s.ix['c':'e'+1]
s.loc['c':'e'+1]

A very common use case is to limit a time series to start and end at two
specific dates. To enable this, we made the design design to make label-based
slicing include both endpoints:

.. ipython:: python
s.ix['c':'e']
s.loc['c':'e']
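Illustration (not part of the diff): a sketch contrasting the endpoint-inclusive label slice with an ordinary positional slice. The series is an assumption; ``s`` is defined outside the lines shown.

```python
import numpy as np
import pandas as pd

# Assumed series with single-letter labels.
s = pd.Series(np.arange(6), index=list('abcdef'))

label_slice = s.loc['c':'e']  # label-based: both 'c' and 'e' are included
pos_slice = s.iloc[2:4]       # positional: standard Python end-exclusive slice

print(label_slice.index.tolist(), pos_slice.index.tolist())
```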
This is most definitely a "practicality beats purity" sort of thing, but it is
something to watch out for if you expect label-based slicing to behave exactly
@@ -322,58 +322,6 @@ in the way that standard Python integer slicing works.
Miscellaneous indexing gotchas
------------------------------

Reindex versus ix gotchas
~~~~~~~~~~~~~~~~~~~~~~~~~

Many users will find themselves using the ``ix`` indexing capabilities as a
concise means of selecting data from a pandas object:

.. ipython:: python
df = pd.DataFrame(np.random.randn(6, 4), columns=['one', 'two', 'three', 'four'],
index=list('abcdef'))
df
df.ix[['b', 'c', 'e']]
This is, of course, completely equivalent *in this case* to using the
``reindex`` method:

.. ipython:: python
df.reindex(['b', 'c', 'e'])
Some might conclude that ``ix`` and ``reindex`` are 100% equivalent based on
this. This is indeed true **except in the case of integer indexing**. For
example, the above operation could alternately have been expressed as:

.. ipython:: python
df.ix[[1, 2, 4]]
If you pass ``[1, 2, 4]`` to ``reindex`` you will get another thing entirely:

.. ipython:: python
df.reindex([1, 2, 4])
So it's important to remember that ``reindex`` is **strict label indexing
only**. This can lead to some potentially surprising results in pathological
cases where an index contains, say, both integers and strings:

.. ipython:: python
s = pd.Series([1, 2, 3], index=['a', 0, 1])
s
s.ix[[0, 1]]
s.reindex([0, 1])
Because the index in this case does not contain solely integers, ``ix`` falls
back on integer indexing. By contrast, ``reindex`` only looks for the values
passed in the index, thus finding the integers ``0`` and ``1``. While it would
be possible to insert some logic to check whether a passed sequence is all
contained in the index, that logic would exact a very high cost in large data
sets.

Reindex potentially changes underlying Series dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
