DEPR: deprecate .ix in favor of .loc/.iloc
closes #14218
jreback committed Jan 12, 2017
1 parent 0fe491d commit c462504
Showing 78 changed files with 1,590 additions and 1,290 deletions.
23 changes: 6 additions & 17 deletions doc/source/advanced.rst
@@ -230,7 +230,7 @@ of tuples:
Advanced indexing with hierarchical index
-----------------------------------------

Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc/.ix`` is a
Syntactically integrating ``MultiIndex`` in advanced indexing with ``.loc`` is a
bit challenging, but we've made every effort to do so. For example, the
following works as you would expect:

@@ -258,7 +258,7 @@ Passing a list of labels or tuples works similar to reindexing:

.. ipython:: python
df.ix[[('bar', 'two'), ('qux', 'one')]]
df.loc[[('bar', 'two'), ('qux', 'one')]]
.. _advanced.mi_slicers:

@@ -604,7 +604,7 @@ intended to work on boolean indices and may return unexpected results.
ser = pd.Series(np.random.randn(10))
ser.take([False, False, True, True])
ser.ix[[0, 1]]
ser.iloc[[0, 1]]
Finally, as a small note on performance, because the ``take`` method handles
a narrower range of inputs, it can offer performance that is a good deal
@@ -620,7 +620,7 @@ faster than fancy indexing.
timeit arr.take(indexer, axis=0)

ser = pd.Series(arr[:, 0])
timeit ser.ix[indexer]
timeit ser.iloc[indexer]
timeit ser.take(indexer)

.. _indexing.index_types:
@@ -661,7 +661,7 @@ Setting the index will create a ``CategoricalIndex``
df2 = df.set_index('B')
df2.index
Indexing with ``__getitem__/.iloc/.loc/.ix`` works similarly to an ``Index`` with duplicates.
Indexing with ``__getitem__/.iloc/.loc`` works similarly to an ``Index`` with duplicates.
The indexers MUST be in the category or the operation will raise.

.. ipython:: python
@@ -759,14 +759,12 @@ same.
sf = pd.Series(range(5), index=indexf)
sf
Scalar selection for ``[],.ix,.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
Scalar selection for ``[],.loc`` will always be label based. An integer will match an equal float index (e.g. ``3`` is equivalent to ``3.0``)
.. ipython:: python
sf[3]
sf[3.0]
sf.ix[3]
sf.ix[3.0]
sf.loc[3]
sf.loc[3.0]
@@ -783,7 +781,6 @@ Slicing is ALWAYS on the values of the index, for ``[],ix,loc`` and ALWAYS posit
.. ipython:: python
sf[2:4]
sf.ix[2:4]
sf.loc[2:4]
sf.iloc[2:4]
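The label-versus-position distinction in this hunk can be sketched as follows. This is a minimal illustration, assuming a float-valued index similar in spirit to the ``sf`` series in the docs; the exact index values are chosen here for the example and are not taken from the diff.

```python
import pandas as pd

# Float-valued index (values assumed for illustration).
sf = pd.Series(range(5), index=[1.5, 2.0, 3.0, 4.5, 5.0])

# .loc slices on index *values*; both endpoints are included.
label_slice = sf.loc[2:4]

# .iloc slices on *positions*; the stop position is excluded.
positional_slice = sf.iloc[2:4]

print(label_slice.index.tolist())       # labels that fall between 2 and 4
print(positional_slice.index.tolist())  # labels at positions 2 and 3
```

The same slice ``2:4`` therefore selects different rows depending on the indexer, which is why the documentation spells out the rule for each one.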
@@ -813,14 +810,6 @@ In non-float indexes, slicing using floats will raise a ``TypeError``
In [3]: pd.Series(range(5)).iloc[3.0]
TypeError: cannot do positional indexing on <class 'pandas.indexes.range.RangeIndex'> with these indexers [3.0] of <type 'float'>
Further the treatment of ``.ix`` with a float indexer on a non-float index, will be label based, and thus coerce the index.
.. ipython:: python
s2 = pd.Series([1, 2, 3], index=list('abc'))
s2
s2.ix[1.0] = 10
s2
Here is a typical use-case for using this type of indexing. Imagine that you have a somewhat
irregular timedelta-like indexing scheme, but the data is recorded as floats. This could for
69 changes: 51 additions & 18 deletions doc/source/indexing.rst
@@ -61,6 +61,8 @@ See the :ref:`MultiIndex / Advanced Indexing <advanced>` for ``MultiIndex`` and

See the :ref:`cookbook<cookbook.selection>` for some advanced strategies

.. _indexing.choice:

Different Choices for Indexing
------------------------------

@@ -104,24 +106,13 @@ of multi-axis indexing.

See more at :ref:`Selection by Position <indexing.integer>`

- ``.ix`` supports mixed integer and label based access. It is primarily label
based, but will fall back to integer positional access unless the corresponding
axis is of integer type. ``.ix`` is the most general and will
support any of the inputs in ``.loc`` and ``.iloc``. ``.ix`` also supports floating point
label schemes. ``.ix`` is exceptionally useful when dealing with mixed positional
and label based hierarchical indexes.

However, when an axis is integer based, ONLY
label based access and not positional access is supported.
Thus, in such cases, it's usually better to be explicit and use ``.iloc`` or ``.loc``.

See more at :ref:`Advanced Indexing <advanced>` and :ref:`Advanced
Hierarchical <advanced.advanced_hierarchical>`.

- ``.loc``, ``.iloc``, ``.ix`` and also ``[]`` indexing can accept a ``callable`` as indexer. See more at :ref:`Selection By Callable <indexing.callable>`.
- ``.loc``, ``.iloc``, and also ``[]`` indexing can accept a ``callable`` as indexer. See more at :ref:`Selection By Callable <indexing.callable>`.

Getting values from an object with multi-axes selection uses the following
notation (using ``.loc`` as an example, but applies to ``.iloc`` and ``.ix`` as
notation (using ``.loc`` as an example, but applies to ``.iloc`` as
well). Any of the axes accessors may be the null slice ``:``. Axes left out of
the specification are assumed to be ``:``. (e.g. ``p.loc['a']`` is equiv to
``p.loc['a', :, :]``)
@@ -135,6 +126,48 @@ the specification are assumed to be ``:``. (e.g. ``p.loc['a']`` is equiv to
DataFrame; ``df.loc[row_indexer,column_indexer]``
Panel; ``p.loc[item_indexer,major_indexer,minor_indexer]``
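The rule that omitted axes default to the null slice ``:`` can be checked with a small sketch. It uses a two-axis ``DataFrame`` rather than the three-axis ``Panel`` from the text, and the frame contents are made up for illustration.

```python
import pandas as pd

# Small illustrative frame (contents are an assumption, not from the diff).
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=list('abc'))

implicit = df.loc['a']       # column axis left out
explicit = df.loc['a', :]    # column axis spelled out as the null slice

# Both forms select the same row as a Series.
same = implicit.equals(explicit)
print(same)
```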

.. _indexing.deprecate_ix:

IX Indexer is Deprecated
------------------------

.. warning::

Starting in 0.20.0, the ``.ix`` indexer is deprecated in favor of the stricter ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. To wit, ``.ix`` can decide to index *positionally* OR via *labels*. This has caused quite a bit of user confusion over the years.


The recommended methods of indexing are:

- ``.loc`` if you want to *label* index
- ``.iloc`` if you want to *positionally* index.

.. ipython:: python
dfd = pd.DataFrame({'A': [1, 2, 3],
'B': [4, 5, 6]},
index=list('abc'))
dfd
Previous Behavior, where you wish to get the 0th and the 2nd elements from the index in the 'A' column.

.. code-block:: ipython
In [3]: dfd.ix[[0, 2], 'A']
Out[3]:
a 1
c 3
Name: A, dtype: int64
Using ``.loc``. Here we will select the appropriate indexes from the index, then use *label* indexing.

.. ipython:: python
dfd.loc[dfd.index[[0, 2]], 'A']
Using ``.iloc``. Here we will get the location of the 'A' column, then use *positional* indexing to select things.

.. ipython:: python
dfd.iloc[[0, 2], dfd.columns.get_loc('A')]
.. _indexing.basics:

Basics
@@ -193,7 +226,7 @@ columns.

.. warning::

pandas aligns all AXES when setting ``Series`` and ``DataFrame`` from ``.loc``, ``.iloc`` and ``.ix``.
pandas aligns all AXES when setting ``Series`` and ``DataFrame`` from ``.loc`` and ``.iloc``.

This will **not** modify ``df`` because the column alignment is before value assignment.

@@ -526,7 +559,7 @@ Selection By Callable

.. versionadded:: 0.18.1

``.loc``, ``.iloc``, ``.ix`` and also ``[]`` indexing can accept a ``callable`` as indexer.
``.loc``, ``.iloc``, and also ``[]`` indexing can accept a ``callable`` as indexer.
The ``callable`` must be a function with one argument (the calling Series, DataFrame or Panel) and that returns valid output for indexing.
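
As a minimal sketch of callable indexers with the three remaining accessors, the lambdas below each take the calling object and return something valid for that indexer. The frame and the lambdas are illustrative assumptions, not part of the commit.

```python
import pandas as pd

# Illustrative frame (contents are an assumption).
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=list('abc'))

# .loc: the callable returns a boolean mask aligned with the rows.
rows_gt_one = df.loc[lambda d: d['A'] > 1]

# .iloc: the callable returns integer positions for the column axis.
first_column = df.iloc[:, lambda d: [0]]

# []: the callable returns a column label.
column_a = df[lambda d: 'A']

print(rows_gt_one.index.tolist())
print(first_column.columns.tolist())
print(column_a.tolist())
```

Callables are mostly useful in method chains, where the intermediate object has no name to refer to.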

.. ipython:: python
@@ -641,7 +674,7 @@ Setting With Enlargement

.. versionadded:: 0.13

The ``.loc/.ix/[]`` operations can perform enlargement when setting a non-existent key for that axis.
The ``.loc/[]`` operations can perform enlargement when setting a non-existent key for that axis.

In the ``Series`` case this is effectively an appending operation
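
A short sketch of enlargement through ``.loc`` and ``[]`` follows. The objects and labels are made up for illustration; the point is only that assigning to a key that is not yet on the axis adds it.

```python
import pandas as pd

# Series case: label 5 is not in the index, so the assignment appends it.
s = pd.Series([1, 2, 3])
s.loc[5] = 5

# DataFrame case: a new row label via .loc and a new column via [].
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['x', 'y'])
df.loc['z'] = [5, 6]
df['C'] = df['A'] + df['B']

print(s.index.tolist())
print(df.index.tolist())
print(df.columns.tolist())
```

``.iloc`` is absent from the list in the text because it is purely positional and cannot create a position that does not exist.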

@@ -906,7 +939,7 @@ without creating a copy:

Furthermore, ``where`` aligns the input boolean condition (ndarray or DataFrame),
such that partial selection with setting is possible. This is analogous to
partial setting via ``.ix`` (but on the contents rather than the axis labels)
partial setting via ``.loc`` (but on the contents rather than the axis labels)

.. ipython:: python
@@ -1716,7 +1749,7 @@ A chained assignment can also crop up in setting in a mixed dtype frame.

.. note::

These setting rules apply to all of ``.loc/.iloc/.ix``
These setting rules apply to all of ``.loc/.iloc``

This is the correct access method

48 changes: 48 additions & 0 deletions doc/source/whatsnew/v0.20.0.txt
@@ -10,6 +10,7 @@ users upgrade to this version.
Highlights include:

- Building pandas for development now requires ``cython >= 0.23`` (:issue:`14831`)
- The ``.ix`` indexer has been deprecated; see :ref:`here <whatsnew.api_breaking.deprecate_ix>`

Check the :ref:`API Changes <whatsnew_0200.api_breaking>` and :ref:`deprecations <whatsnew_0200.deprecations>` before updating.

@@ -122,6 +123,53 @@ Other enhancements
Backwards incompatible API changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


.. _whatsnew.api_breaking.deprecate_ix:

Deprecate .ix
^^^^^^^^^^^^^

The ``.ix`` indexer is deprecated in favor of the stricter ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. To wit, ``.ix`` can decide to index *positionally* OR via *labels*. This has caused quite a bit of user confusion over the years. The full indexing documentation is :ref:`here <indexing>`. (:issue:`14218`)


The recommended methods of indexing are:

- ``.loc`` if you want to *label* index
- ``.iloc`` if you want to *positionally* index.

Using ``.ix`` will now show a deprecation warning with a mini-example of how to convert code.

.. ipython:: python

df = pd.DataFrame({'A': [1, 2, 3],
'B': [4, 5, 6]},
index=list('abc'))

df

Previous Behavior, where you wish to get the 0th and the 2nd elements from the index in the 'A' column.

.. code-block:: ipython

In [3]: df.ix[[0, 2], 'A']
Out[3]:
a 1
c 3
Name: A, dtype: int64

Using ``.loc``. Here we will select the appropriate indexes from the index, then use *label* indexing.

.. ipython:: python

df.loc[df.index[[0, 2]], 'A']

Using ``.iloc``. Here we will get the location of the 'A' column, then use *positional* indexing to select things.

.. ipython:: python

df.iloc[[0, 2], df.columns.get_loc('A')]
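
The ambiguity that motivates the deprecation is easiest to see on an integer-labelled index, where the same key can mean a label or a position. The sketch below uses a made-up series whose labels are deliberately out of order; it only exercises ``.loc`` and ``.iloc``, which state the intent explicitly.

```python
import pandas as pd

# Integer labels that do not match positions (values chosen for illustration).
s = pd.Series(['x', 'y', 'z'], index=[2, 0, 1])

by_label = s.loc[0]       # the element whose label is 0
by_position = s.iloc[0]   # the element at position 0

print(by_label, by_position)
```

With ``.loc`` and ``.iloc`` the reader does not need to know the index dtype to tell which of the two lookups is intended.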


.. _whatsnew.api_breaking.index_map

Map on Index types now return other Index types
14 changes: 7 additions & 7 deletions pandas/core/frame.py
@@ -1961,7 +1961,7 @@ def _ixs(self, i, axis=0):
if isinstance(i, slice):
# need to return view
lab_slice = slice(label[0], label[-1])
return self.ix[:, lab_slice]
return self.loc[:, lab_slice]
else:
if isinstance(label, Index):
return self.take(i, axis=1, convert=True)
@@ -2056,7 +2056,7 @@ def _getitem_array(self, key):
indexer = key.nonzero()[0]
return self.take(indexer, axis=0, convert=False)
else:
indexer = self.ix._convert_to_indexer(key, axis=1)
indexer = self.loc._convert_to_indexer(key, axis=1)
return self.take(indexer, axis=1, convert=True)

def _getitem_multilevel(self, key):
@@ -2389,7 +2389,7 @@ def __setitem__(self, key, value):

def _setitem_slice(self, key, value):
self._check_setitem_copy()
self.ix._setitem_with_indexer(key, value)
self.loc._setitem_with_indexer(key, value)

def _setitem_array(self, key, value):
# also raises Exception if object array with NA values
@@ -2400,17 +2400,17 @@ def _setitem_array(self, key, value):
key = check_bool_indexer(self.index, key)
indexer = key.nonzero()[0]
self._check_setitem_copy()
self.ix._setitem_with_indexer(indexer, value)
self.loc._setitem_with_indexer(indexer, value)
else:
if isinstance(value, DataFrame):
if len(value.columns) != len(key):
raise ValueError('Columns must be same length as key')
for k1, k2 in zip(key, value.columns):
self[k1] = value[k2]
else:
indexer = self.ix._convert_to_indexer(key, axis=1)
indexer = self.loc._convert_to_indexer(key, axis=1)
self._check_setitem_copy()
self.ix._setitem_with_indexer((slice(None), indexer), value)
self.loc._setitem_with_indexer((slice(None), indexer), value)

def _setitem_frame(self, key, value):
# support boolean setting with DataFrame input, e.g.
@@ -4403,7 +4403,7 @@ def append(self, other, ignore_index=False, verify_integrity=False):
elif isinstance(other, list) and not isinstance(other[0], DataFrame):
other = DataFrame(other)
if (self.columns.get_indexer(other.columns) >= 0).all():
other = other.ix[:, self.columns]
other = other.loc[:, self.columns]

from pandas.tools.merge import concat
if isinstance(other, (list, tuple)):
17 changes: 6 additions & 11 deletions pandas/core/generic.py
@@ -1809,18 +1809,12 @@ def xs(self, key, axis=0, level=None, drop_level=True):
loc, new_ax = labels.get_loc_level(key, level=level,
drop_level=drop_level)

# convert to a label indexer if needed
if isinstance(loc, slice):
lev_num = labels._get_level_number(level)
if labels.levels[lev_num].inferred_type == 'integer':
loc = labels[loc]

# create the tuple of the indexer
indexer = [slice(None)] * self.ndim
indexer[axis] = loc
indexer = tuple(indexer)

result = self.ix[indexer]
result = self.iloc[indexer]
setattr(result, result._get_axis_name(axis), new_ax)
return result

@@ -1983,7 +1977,7 @@ def drop(self, labels, axis=0, level=None, inplace=False, errors='raise'):
slicer = [slice(None)] * self.ndim
slicer[self._get_axis_number(axis_name)] = indexer

result = self.ix[tuple(slicer)]
result = self.loc[tuple(slicer)]

if inplace:
self._update_inplace(result)
@@ -4332,8 +4326,9 @@ def first(self, offset):
if not offset.isAnchored() and hasattr(offset, '_inc'):
if end_date in self.index:
end = self.index.searchsorted(end_date, side='left')
return self.iloc[:end]

return self.ix[:end]
return self.loc[:end]

def last(self, offset):
"""
@@ -4364,7 +4359,7 @@ def last(self, offset):

start_date = start = self.index[-1] - offset
start = self.index.searchsorted(start_date, side='right')
return self.ix[start:]
return self.iloc[start:]

def rank(self, axis=0, method='average', numeric_only=None,
na_option='keep', ascending=True, pct=False):
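
The ``first``/``last`` hunks above switch to ``.iloc`` wherever the slice bound comes from ``searchsorted``, which returns an integer position rather than a label. A minimal sketch of that pattern follows; it assumes a daily ``DatetimeIndex`` and uses ``pd.Timedelta`` in place of the offset object, so it illustrates the idea rather than reproducing the library code.

```python
import pandas as pd

# Daily index and data made up for illustration.
idx = pd.date_range('2017-01-01', periods=5, freq='D')
s = pd.Series(range(5), index=idx)

# Date two days before the last entry (Timedelta stands in for the offset).
start_date = idx[-1] - pd.Timedelta(days=2)

# searchsorted returns a *position*, so the slice must be positional.
start = idx.searchsorted(start_date, side='right')
tail = s.iloc[start:]

print(start)
print(tail.tolist())
```

Passing ``start`` to ``.loc`` instead would ask for a label lookup with an integer on a datetime index, which is the kind of guess ``.ix`` used to make implicitly.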
@@ -5078,7 +5073,7 @@ def truncate(self, before=None, after=None, axis=None, copy=True):

slicer = [slice(None, None)] * self._AXIS_LEN
slicer[axis] = slice(before, after)
result = self.ix[tuple(slicer)]
result = self.loc[tuple(slicer)]

if isinstance(ax, MultiIndex):
setattr(result, self._get_axis_name(axis),
2 changes: 1 addition & 1 deletion pandas/core/groupby.py
@@ -4103,7 +4103,7 @@ def _chop(self, sdata, slice_obj):
if self.axis == 0:
return sdata.iloc[slice_obj]
else:
return sdata._slice(slice_obj, axis=1) # ix[:, slice_obj]
return sdata._slice(slice_obj, axis=1) # .loc[:, slice_obj]


class NDFrameSplitter(DataSplitter):