API: multi-line, not inplace eval
PEP8 compliance for test_eval and eval.py
chris-b1 committed Jan 3, 2016
1 parent 13f659f commit 3465696
Showing 5 changed files with 316 additions and 50 deletions.
62 changes: 56 additions & 6 deletions doc/source/enhancingperf.rst
@@ -570,18 +570,51 @@ prefix the name of the :class:`~pandas.DataFrame` to the column(s) you're
interested in evaluating.

In addition, you can perform assignment of columns within an expression.
This allows for *formulaic evaluation*. Only a single assignment is permitted.
The assignment target can be a new column name or an existing column name, and
it must be a valid Python identifier.
This allows for *formulaic evaluation*. The assignment target can be a
new column name or an existing column name, and it must be a valid Python
identifier.

.. versionadded:: 0.18.0

The ``inplace`` keyword determines whether this assignment will be performed
on the original ``DataFrame`` or return a copy with the new column.

.. warning::

For backwards compatibility, ``inplace`` defaults to ``True`` if not
specified. This will change in a future version of pandas. If your
code depends on an inplace assignment, you should update it to explicitly
set ``inplace=True``.
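
The transitional default described in the warning can be illustrated with a small stand-alone sketch. It is not pandas code: ``assign_column`` is a hypothetical helper operating on a plain ``dict``, and it only mirrors the pattern of using ``None`` as a sentinel so that callers who did not pass ``inplace`` receive a ``FutureWarning``.

```python
import warnings


def assign_column(target, name, value, inplace=None):
    """Hypothetical helper: assign ``value`` under ``name`` in a dict.

    ``inplace=None`` means "not specified"; it falls back to True and
    emits a FutureWarning, mimicking the transitional behaviour.
    """
    if inplace is None:
        warnings.warn(
            "assignment currently defaults to operating inplace; "
            "pass inplace=True explicitly to avoid this warning.",
            FutureWarning, stacklevel=2)
        inplace = True

    # mutate the caller's object, or work on a shallow copy
    result = target if inplace else dict(target)
    result[name] = value

    # inplace calls return nothing; otherwise hand back the copy
    return None if inplace else result
```

Passing ``inplace`` explicitly (either value) takes the silent path, which is the migration the warning asks for.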

.. ipython:: python
df = pd.DataFrame(dict(a=range(5), b=range(5, 10)))
df.eval('c = a + b')
df.eval('d = a + b + c')
df.eval('a = 1')
df.eval('c = a + b', inplace=True)
df.eval('d = a + b + c', inplace=True)
df.eval('a = 1', inplace=True)
df
When ``inplace`` is set to ``False``, a copy of the ``DataFrame`` with the
new or modified columns is returned and the original frame is unchanged.

.. ipython:: python
df
df.eval('e = a - c', inplace=False)
df
.. versionadded:: 0.18.0

As a convenience, multiple assignments can be performed by using a
multi-line string.

.. ipython:: python
df.eval("""
c = a + b
d = a + b + c
a = 1""", inplace=False)
The equivalent in standard Python would be

.. ipython:: python
@@ -592,6 +625,23 @@ The equivalent in standard Python would be
df['a'] = 1
df
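
As a rough illustration of how a multi-line string becomes a sequence of statements, the sketch below applies the same filter used in ``pandas/computation/eval.py`` in this commit (``splitlines`` followed by dropping empty strings). The function name ``split_statements`` is hypothetical and exists only for this example.

```python
def split_statements(expr):
    """Split a multi-line expression into its non-empty lines.

    Mirrors ``[e for e in expr.splitlines() if e != '']``; note that a
    line containing only whitespace is *not* dropped by this filter.
    """
    return [line for line in expr.splitlines() if line != ""]


# the leading newline after the opening quotes yields an empty first
# line, which the filter removes
statements = split_statements("""
c = a + b
d = a + b + c
a = 1""")

multi_line = len(statements) > 1
```

Each remaining line is then evaluated in order, so ``d`` can refer to the ``c`` assigned on the line before it.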
.. versionadded:: 0.18.0

The ``query`` method gained the ``inplace`` keyword which determines
whether the query modifies the original frame.

.. ipython:: python
df = pd.DataFrame(dict(a=range(5), b=range(5, 10)))
df.query('a > 2')
df.query('a > 2', inplace=True)
df
.. warning::

Unlike with ``eval``, the default value for ``inplace`` for ``query``
is ``False``. This is consistent with prior versions of pandas.
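
The difference between the two ``query`` modes can be sketched without pandas. The helper ``query_rows`` below is hypothetical and works on a list of dicts; it only illustrates the contract described above, where the default returns a filtered copy and ``inplace=True`` modifies the original and returns ``None``.

```python
def query_rows(rows, predicate, inplace=False):
    """Hypothetical toy analogue of a query with an ``inplace`` flag."""
    kept = [row for row in rows if predicate(row)]
    if inplace:
        # overwrite the contents of the caller's list
        rows[:] = kept
        return None
    return kept


rows = [{"a": a, "b": b} for a, b in zip(range(5), range(5, 10))]

# default: ``rows`` is left untouched and a new list is returned
filtered = query_rows(rows, lambda row: row["a"] > 2)

# inplace: ``rows`` itself is filtered and nothing is returned
query_rows(rows, lambda row: row["a"] > 2, inplace=True)
```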

Local Variables
~~~~~~~~~~~~~~~

47 changes: 46 additions & 1 deletion doc/source/whatsnew/v0.18.0.txt
@@ -295,15 +295,60 @@ Other API Changes

- ``.memory_usage`` now includes values in the index, as does memory_usage in ``.info`` (:issue:`11597`)

Changes to eval
^^^^^^^^^^^^^^^

In prior versions, new column assignments in an ``eval`` expression resulted
in an inplace change to the ``DataFrame``. (:issue:`9297`)

.. ipython:: python

df = pd.DataFrame({'a': np.linspace(0, 10, 5), 'b': range(5)})
df.eval('c = a + b')
df

In version 0.18.0, a new ``inplace`` keyword was added to choose whether the
assignment should be done inplace or return a copy.

.. ipython:: python

df
df.eval('d = c - b', inplace=False)
df
df.eval('d = c - b', inplace=True)
df

.. warning::

For backwards compatibility, ``inplace`` defaults to ``True`` if not specified.
This will change in a future version of pandas. If your code depends on an
inplace assignment, you should update it to explicitly set ``inplace=True``.

The ``inplace`` keyword parameter was also added to the ``query`` method.

.. ipython:: python

df.query('a > 5')
df.query('a > 5', inplace=True)
df

.. warning::

Note that the default value for ``inplace`` in a ``query``
is ``False``, which is consistent with prior versions.

``eval`` has also been updated to allow multi-line expressions for multiple
assignments. These expressions will be evaluated one at a time in order. Only
assignments are valid for multi-line expressions.

.. ipython:: python

df
df.eval("""
e = d + a
f = e - 22
g = f / 2.0""", inplace=True)
df
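
The evaluation order and copy semantics described above can be sketched with a toy evaluator over a plain ``dict``. This is not the pandas implementation: ``eval_assignments`` is a hypothetical helper that uses the standard ``ast`` module, strips whitespace around each line, and supports only simple ``name = expression`` statements. It shows the three behaviours in question: lines run one at a time in order, a copy is made only before the first assignment when ``inplace=False``, and a non-assignment line in a multi-line expression raises ``ValueError``.

```python
import ast


def eval_assignments(target, expr, inplace):
    """Hypothetical toy version of multi-line assignment evaluation."""
    lines = [ln.strip() for ln in expr.splitlines() if ln.strip()]
    multi_line = len(lines) > 1
    result = target
    first_assignment = True

    for line in lines:
        node = ast.parse(line).body[0]
        is_assign = (isinstance(node, ast.Assign)
                     and len(node.targets) == 1
                     and isinstance(node.targets[0], ast.Name))
        if not is_assign:
            if multi_line:
                raise ValueError("multi-line expressions are only valid "
                                 "if all expressions contain an assignment")
            # a single plain expression just yields its value
            return eval(line, {"__builtins__": {}}, dict(result))

        # when returning a copy, copy only before the first assignment
        if not inplace and first_assignment:
            result = dict(target)
        first_assignment = False

        # evaluate the right-hand side against the current columns, so
        # later lines can see names assigned by earlier ones
        code = compile(ast.Expression(body=node.value), "<expr>", "eval")
        result[node.targets[0].id] = eval(
            code, {"__builtins__": {}}, dict(result))

    return None if inplace else result


frame = {"a": 2, "b": 5}
copy = eval_assignments(frame, """
c = a + b
d = a + b + c
a = 1""", inplace=False)
```

Here ``frame`` is left as it was and ``copy`` carries the three assignments; calling the helper with ``inplace=True`` would instead mutate ``frame`` and return ``None``.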

.. _whatsnew_0180.deprecations:

@@ -410,7 +455,7 @@ Bug Fixes
- Bug in ``pd.read_clipboard`` and ``pd.to_clipboard`` functions not supporting Unicode; upgrade included ``pyperclip`` to v1.5.15 (:issue:`9263`)



- Bug in ``DataFrame.query`` containing an assignment (:issue:`8664`)



109 changes: 83 additions & 26 deletions pandas/computation/eval.py
@@ -3,11 +3,12 @@
"""Top level ``eval`` module.
"""

import warnings
import tokenize
from pandas.core import common as com
from pandas.computation.expr import Expr, _parsers, tokenize_string
from pandas.computation.scope import _ensure_scope
from pandas.compat import DeepChainMap, builtins
from pandas.compat import string_types
from pandas.computation.engines import _engines
from distutils.version import LooseVersion

@@ -138,7 +139,7 @@ def _check_for_locals(expr, stack_level, parser):

def eval(expr, parser='pandas', engine='numexpr', truediv=True,
local_dict=None, global_dict=None, resolvers=(), level=0,
target=None):
target=None, inplace=None):
"""Evaluate a Python expression as a string using various backends.
The following arithmetic operations are supported: ``+``, ``-``, ``*``,
@@ -196,6 +197,13 @@ def eval(expr, parser='pandas', engine='numexpr', truediv=True,
scope. Most users will **not** need to change this parameter.
target : a target object for assignment, optional, default is None
essentially this is a passed in resolver
inplace : bool, default True
If expression mutates, whether to modify object inplace or return
copy with mutation.
WARNING: inplace=None currently falls back to True, but
in a future version, will default to False. Use inplace=True
explicitly rather than relying on the default.
Returns
-------
@@ -214,29 +222,78 @@ def eval(expr, parser='pandas', engine='numexpr', truediv=True,
pandas.DataFrame.query
pandas.DataFrame.eval
"""
    expr = _convert_expression(expr)
    _check_engine(engine)
    _check_parser(parser)
    _check_resolvers(resolvers)
    _check_for_locals(expr, level, parser)

    # get our (possibly passed-in) scope
    level += 1
    env = _ensure_scope(level, global_dict=global_dict,
                        local_dict=local_dict, resolvers=resolvers,
                        target=target)

    parsed_expr = Expr(expr, engine=engine, parser=parser, env=env,
                       truediv=truediv)

    # construct the engine and evaluate the parsed expression
    eng = _engines[engine]
    eng_inst = eng(parsed_expr)
    ret = eng_inst.evaluate()

    # assign if needed
    if env.target is not None and parsed_expr.assigner is not None:
        env.target[parsed_expr.assigner] = ret
        return None
    if isinstance(expr, string_types):
        exprs = [e for e in expr.splitlines() if e != '']
    else:
        exprs = [expr]
    multi_line = len(exprs) > 1

    if multi_line and target is None:
        raise ValueError("multi-line expressions are only valid in the "
                         "context of data, use DataFrame.eval")

    first_expr = True
    for expr in exprs:
        expr = _convert_expression(expr)
        _check_engine(engine)
        _check_parser(parser)
        _check_resolvers(resolvers)
        _check_for_locals(expr, level, parser)

        # get our (possibly passed-in) scope
        level += 1
        env = _ensure_scope(level, global_dict=global_dict,
                            local_dict=local_dict, resolvers=resolvers,
                            target=target)

        parsed_expr = Expr(expr, engine=engine, parser=parser, env=env,
                           truediv=truediv)

        # construct the engine and evaluate the parsed expression
        eng = _engines[engine]
        eng_inst = eng(parsed_expr)
        ret = eng_inst.evaluate()

        if parsed_expr.assigner is None and multi_line:
            raise ValueError("Multi-line expressions are only valid"
                             " if all expressions contain an assignment")

        # assign if needed
        if env.target is not None and parsed_expr.assigner is not None:
            if inplace is None:
                warnings.warn(
                    "eval expressions containing an assignment currently "
                    "default to operating inplace.\nThis will change in "
                    "a future version of pandas, use inplace=True to "
                    "avoid this warning.",
                    FutureWarning, stacklevel=3)
                inplace = True

            # if returning a copy, copy only on the first assignment
            if not inplace and first_expr:
                target = env.target.copy()
            else:
                target = env.target

            target[parsed_expr.assigner] = ret

            if not resolvers:
                resolvers = ({parsed_expr.assigner: ret},)
            else:
                # existing resolver needs to be updated to handle
                # case of mutating existing column in copy
                for resolver in resolvers:
                    if parsed_expr.assigner in resolver:
                        resolver[parsed_expr.assigner] = ret
                        break
                else:
                    resolvers += ({parsed_expr.assigner: ret},)

            ret = None
            first_expr = False

    if not inplace and inplace is not None:
        return target

    return ret