DEPR: deprecate .ix in favor of .loc/.iloc

closes pandas-dev#14218 closes pandas-dev#15116
AnkurDedania · Jan 18, 2017 · 1544f50
1 parent 362e78d
commit 1544f50

Showing 90 changed files with 1,657 additions and 1,399 deletions.
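
The replacements in the diff below follow a few recurring patterns: label lookups move to ``.loc``, positional lookups move to ``.iloc``, and mixed "rows by position, column by label" access is routed through ``frame.index[...]``. The following is a minimal sketch of those patterns, not part of the commit itself; the frame, its string index, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-integer index, so labels and positions differ.
frame = pd.DataFrame(np.zeros((6, 3)),
                     columns=['a', 'b', 'c'],
                     index=list('uvwxyz'))

# Rows by position, column by label: turn the positions into labels first,
# as the diff does with ``frame.loc[frame.index[:5], 'a']``.
frame.loc[frame.index[:2], 'a'] = np.nan

# The same kind of selection the other way round: turn the column label
# into a position and stay purely positional.
frame.iloc[2:4, frame.columns.get_loc('b')] = np.nan

nan_a = frame['a'].isna().sum()
nan_b = frame['b'].isna().sum()

# Label slices include both endpoints; positional slices exclude the stop.
by_label = frame.loc['u':'w']
by_position = frame.iloc[0:2]
```

When porting code, it is worth checking slice endpoints: on a default integer index ``.ix[:5]`` sliced by label and so included row ``5``, while ``frame.index[:5]`` is positional and stops before it.
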
diff --git a/doc/source/advanced.rst b/doc/source/advanced.rst
@@ -230,7 +230,7 @@ of tuples:
 Advanced indexing with hierarchical index
 -----------------------------------------
 
-Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc/.ix`` is a
+Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc`` is a
 bit challenging, but we've made every effort to do so. for example the
 following works as you would expect:
 
@@ -258,7 +258,7 @@ Passing a list of labels or tuples works similar to reindexing:
 
 .. ipython:: python
 
-   df.ix[[('bar', 'two'), ('qux', 'one')]]
+   df.loc[[('bar', 'two'), ('qux', 'one')]]
 
 .. _advanced.mi_slicers:
 
@@ -604,7 +604,7 @@ intended to work on boolean indices and may return unexpected results.
 
    ser = pd.Series(np.random.randn(10))
    ser.take([False, False, True, True])
-   ser.ix[[0, 1]]
+   ser.iloc[[0, 1]]
 
 Finally, as a small note on performance, because the ``take`` method handles
 a narrower range of inputs, it can offer performance that is a good deal
@@ -620,7 +620,7 @@ faster than fancy indexing.
    timeit arr.take(indexer, axis=0)
 
    ser = pd.Series(arr[:, 0])
-   timeit ser.ix[indexer]
+   timeit ser.iloc[indexer]
    timeit ser.take(indexer)
 
 .. _indexing.index_types:
@@ -661,7 +661,7 @@ Setting the index, will create create a ``CategoricalIndex``
    df2 = df.set_index('B')
    df2.index
 
-Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
+Indexing with ``__getitem__/.iloc/.loc`` works similarly to an ``Index`` with duplicates.
 The indexers MUST be in the category or the operation will raise.
 
 .. ipython:: python
@@ -759,14 +759,12 @@ same.
    sf = pd.Series(range(5), index=indexf)
    sf
 
-Scalar selection for ``[],.ix,.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
+Scalar selection for ``[],.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
 
 .. ipython:: python
 
    sf[3]
    sf[3.0]
-   sf.ix[3]
-   sf.ix[3.0]
    sf.loc[3]
    sf.loc[3.0]
 
@@ -783,7 +781,6 @@ Slicing is ALWAYS on the values of the index, for ``[],ix,loc`` and ALWAYS posit
 .. ipython:: python
 
    sf[2:4]
-   sf.ix[2:4]
    sf.loc[2:4]
    sf.iloc[2:4]
 
@@ -813,14 +810,6 @@ In non-float indexes, slicing using floats will raise a ``TypeError``
       In [3]: pd.Series(range(5)).iloc[3.0]
       TypeError: cannot do positional indexing on <class 'pandas.indexes.range.RangeIndex'> with these indexers [3.0] of <type 'float'>
 
-   Further the treatment of ``.ix`` with a float indexer on a non-float index, will be label based, and thus coerce the index.
-
-   .. ipython:: python
-
-      s2 = pd.Series([1, 2, 3], index=list('abc'))
-      s2
-      s2.ix[1.0] = 10
-      s2
 
 Here is a typical use-case for using this type of indexing. Imagine that you have a somewhat
 irregular timedelta-like indexing scheme, but the data is recorded as floats. This could for

diff --git a/doc/source/api.rst b/doc/source/api.rst
@@ -268,13 +268,12 @@ Indexing, iteration
    Series.get
    Series.at
    Series.iat
-   Series.ix
    Series.loc
    Series.iloc
    Series.__iter__
    Series.iteritems
 
-For more information on ``.at``, ``.iat``, ``.ix``, ``.loc``, and
+For more information on ``.at``, ``.iat``, ``.loc``, and
 ``.iloc``,  see the :ref:`indexing documentation <indexing>`.
 
 Binary operator functions
@@ -774,7 +773,6 @@ Indexing, iteration
    DataFrame.head
    DataFrame.at
    DataFrame.iat
-   DataFrame.ix
    DataFrame.loc
    DataFrame.iloc
    DataFrame.insert
@@ -791,7 +789,7 @@ Indexing, iteration
    DataFrame.mask
    DataFrame.query
 
-For more information on ``.at``, ``.iat``, ``.ix``, ``.loc``, and
+For more information on ``.at``, ``.iat``, ``.loc``, and
 ``.iloc``,  see the :ref:`indexing documentation <indexing>`.
 
 
@@ -1090,7 +1088,6 @@ Indexing, iteration, slicing
 
    Panel.at
    Panel.iat
-   Panel.ix
    Panel.loc
    Panel.iloc
    Panel.__iter__
@@ -1100,7 +1097,7 @@ Indexing, iteration, slicing
    Panel.major_xs
    Panel.minor_xs
 
-For more information on ``.at``, ``.iat``, ``.ix``, ``.loc``, and
+For more information on ``.at``, ``.iat``, ``.loc``, and
 ``.iloc``,  see the :ref:`indexing documentation <indexing>`.
 
 Binary operator functions

diff --git a/doc/source/basics.rst b/doc/source/basics.rst
@@ -145,7 +145,7 @@ either match on the *index* or *columns* via the **axis** keyword:
                       'two' : pd.Series(np.random.randn(4), index=['a', 'b', 'c', 'd']),
                       'three' : pd.Series(np.random.randn(3), index=['b', 'c', 'd'])})
    df
-   row = df.ix[1]
+   row = df.iloc[1]
    column = df['two']
 
    df.sub(row, axis='columns')
@@ -556,7 +556,7 @@ course):
     series[::2] = np.nan
     series.describe()
     frame = pd.DataFrame(np.random.randn(1000, 5), columns=['a', 'b', 'c', 'd', 'e'])
-    frame.ix[::2] = np.nan
+    frame.iloc[::2] = np.nan
     frame.describe()
 
 You can select specific percentiles to include in the output:
@@ -1081,7 +1081,7 @@ objects either on the DataFrame's index or columns using the ``axis`` argument:
 
 .. ipython:: python
 
-   df.align(df2.ix[0], axis=1)
+   df.align(df2.iloc[0], axis=1)
 
 .. _basics.reindex_fill:
 

diff --git a/doc/source/categorical.rst b/doc/source/categorical.rst
@@ -482,7 +482,7 @@ Pivot tables:
 Data munging
 ------------
 
-The optimized pandas data access methods  ``.loc``, ``.iloc``, ``.ix`` ``.at``, and ``.iat``,
+The optimized pandas data access methods  ``.loc``, ``.iloc``, ``.at``, and ``.iat``,
 work as normal. The only difference is the return type (for getting) and
 that only values already in `categories` can be assigned.
 
@@ -501,7 +501,6 @@ the ``category`` dtype is preserved.
     df.iloc[2:4,:]
     df.iloc[2:4,:].dtypes
     df.loc["h":"j","cats"]
-    df.ix["h":"j",0:1]
     df[df["cats"] == "b"]
 
 An example where the category type is not preserved is if you take one single row: the

diff --git a/doc/source/computation.rst b/doc/source/computation.rst
@@ -84,8 +84,8 @@ in order to have a valid result.
 .. ipython:: python
 
    frame = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])
-   frame.ix[:5, 'a'] = np.nan
-   frame.ix[5:10, 'b'] = np.nan
+   frame.loc[frame.index[:5], 'a'] = np.nan
+   frame.loc[frame.index[5:10], 'b'] = np.nan
 
    frame.cov()
 
@@ -120,7 +120,7 @@ All of these are currently computed using pairwise complete observations.
 .. ipython:: python
 
    frame = pd.DataFrame(np.random.randn(1000, 5), columns=['a', 'b', 'c', 'd', 'e'])
-   frame.ix[::2] = np.nan
+   frame.iloc[::2] = np.nan
 
    # Series with Series
    frame['a'].corr(frame['b'])
@@ -137,8 +137,8 @@ Like ``cov``, ``corr`` also supports the optional ``min_periods`` keyword:
 .. ipython:: python
 
    frame = pd.DataFrame(np.random.randn(20, 3), columns=['a', 'b', 'c'])
-   frame.ix[:5, 'a'] = np.nan
-   frame.ix[5:10, 'b'] = np.nan
+   frame.loc[frame.index[:5], 'a'] = np.nan
+   frame.loc[frame.index[5:10], 'b'] = np.nan
 
    frame.corr()
 

diff --git a/doc/source/cookbook.rst b/doc/source/cookbook.rst
@@ -66,19 +66,19 @@ An if-then on one column
 
 .. ipython:: python
 
-   df.ix[df.AAA >= 5,'BBB'] = -1; df
+   df.loc[df.AAA >= 5,'BBB'] = -1; df
 
 An if-then with assignment to 2 columns:
 
 .. ipython:: python
 
-   df.ix[df.AAA >= 5,['BBB','CCC']] = 555; df
+   df.loc[df.AAA >= 5,['BBB','CCC']] = 555; df
 
 Add another line with different logic, to do the -else
 
 .. ipython:: python
 
-   df.ix[df.AAA < 5,['BBB','CCC']] = 2000; df
+   df.loc[df.AAA < 5,['BBB','CCC']] = 2000; df
 
 Or use pandas where after you've set up a mask
 
@@ -149,7 +149,7 @@ Building Criteria
         {'AAA' : [4,5,6,7], 'BBB' : [10,20,30,40],'CCC' : [100,50,-30,-50]}); df
 
    aValue = 43.0
-   df.ix[(df.CCC-aValue).abs().argsort()]
+   df.loc[(df.CCC-aValue).abs().argsort()]
 
 `Dynamically reduce a list of criteria using a binary operators
 <http://stackoverflow.com/questions/21058254/pandas-boolean-operation-in-a-python-list/21058331>`__
@@ -217,9 +217,9 @@ There are 2 explicit slicing methods, with a third general case
 
    df.loc['bar':'kar'] #Label
 
-   #Generic
-   df.ix[0:3] #Same as .iloc[0:3]
-   df.ix['bar':'kar'] #Same as .loc['bar':'kar']
+   # Generic
+   df.iloc[0:3]
+   df.loc['bar':'kar']
 
 Ambiguity arises when an index consists of integers with a non-zero start or non-unit increment.
 
@@ -231,9 +231,6 @@ Ambiguity arises when an index consists of integers with a non-zero start or non
 
    df2.loc[1:3] #Label-oriented
 
-   df2.ix[1:3] #General, will mimic loc (label-oriented)
-   df2.ix[0:3] #General, will mimic iloc (position-oriented), as loc[0:3] would raise a KeyError
-
 `Using inverse operator (~) to take the complement of a mask
 <http://stackoverflow.com/questions/14986510/picking-out-elements-based-on-complement-of-indices-in-python-pandas>`__
 
@@ -440,7 +437,7 @@ Fill forward a reversed timeseries
 .. ipython:: python
 
    df = pd.DataFrame(np.random.randn(6,1), index=pd.date_range('2013-08-01', periods=6, freq='B'), columns=list('A'))
-   df.ix[3,'A'] = np.nan
+   df.loc[df.index[3], 'A'] = np.nan
    df
    df.reindex(df.index[::-1]).ffill()
 
@@ -545,7 +542,7 @@ Unlike agg, apply's callable is passed a sub-DataFrame which gives you access to
 
    agg_n_sort_order = code_groups[['data']].transform(sum).sort_values(by='data')
 
-   sorted_df = df.ix[agg_n_sort_order.index]
+   sorted_df = df.loc[agg_n_sort_order.index]
 
    sorted_df
 

diff --git a/doc/source/gotchas.rst b/doc/source/gotchas.rst
@@ -221,7 +221,7 @@ Label-based indexing with integer axis labels is a thorny topic. It has been
 discussed heavily on mailing lists and among various members of the scientific
 Python community. In pandas, our general viewpoint is that labels matter more
 than integer locations. Therefore, with an integer axis index *only*
-label-based indexing is possible with the standard tools like ``.ix``. The
+label-based indexing is possible with the standard tools like ``.loc``. The
 following code will generate exceptions:
 
 .. code-block:: python
@@ -230,7 +230,7 @@ following code will generate exceptions:
    s[-1]
    df = pd.DataFrame(np.random.randn(5, 4))
    df
-   df.ix[-2:]
+   df.loc[-2:]
 
 This deliberate decision was made to prevent ambiguities and subtle bugs (many
 users reported finding bugs when the API change was made to stop "falling back"
@@ -305,15 +305,15 @@ index can be somewhat complicated. For example, the following does not work:
 
 ::
 
-    s.ix['c':'e'+1]
+    s.loc['c':'e'+1]
 
 A very common use case is to limit a time series to start and end at two
 specific dates. To enable this, we made the design design to make label-based
 slicing include both endpoints:
 
 .. ipython:: python
 
-    s.ix['c':'e']
+    s.loc['c':'e']
 
 This is most definitely a "practicality beats purity" sort of thing, but it is
 something to watch out for if you expect label-based slicing to behave exactly
@@ -322,58 +322,6 @@ in the way that standard Python integer slicing works.
 Miscellaneous indexing gotchas
 ------------------------------
 
-Reindex versus ix gotchas
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Many users will find themselves using the ``ix`` indexing capabilities as a
-concise means of selecting data from a pandas object:
-
-.. ipython:: python
-
-   df = pd.DataFrame(np.random.randn(6, 4), columns=['one', 'two', 'three', 'four'],
-                     index=list('abcdef'))
-   df
-   df.ix[['b', 'c', 'e']]
-
-This is, of course, completely equivalent *in this case* to using the
-``reindex`` method:
-
-.. ipython:: python
-
-   df.reindex(['b', 'c', 'e'])
-
-Some might conclude that ``ix`` and ``reindex`` are 100% equivalent based on
-this. This is indeed true **except in the case of integer indexing**. For
-example, the above operation could alternately have been expressed as:
-
-.. ipython:: python
-
-   df.ix[[1, 2, 4]]
-
-If you pass ``[1, 2, 4]`` to ``reindex`` you will get another thing entirely:
-
-.. ipython:: python
-
-   df.reindex([1, 2, 4])
-
-So it's important to remember that ``reindex`` is **strict label indexing
-only**. This can lead to some potentially surprising results in pathological
-cases where an index contains, say, both integers and strings:
-
-.. ipython:: python
-
-   s = pd.Series([1, 2, 3], index=['a', 0, 1])
-   s
-   s.ix[[0, 1]]
-   s.reindex([0, 1])
-
-Because the index in this case does not contain solely integers, ``ix`` falls
-back on integer indexing. By contrast, ``reindex`` only looks for the values
-passed in the index, thus finding the integers ``0`` and ``1``. While it would
-be possible to insert some logic to check whether a passed sequence is all
-contained in the index, that logic would exact a very high cost in large data
-sets.
-
 Reindex potentially changes underlying Series dtype
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~