pd.eval() discards imaginary part in division "/" #21374
Comments
@fillipe-gsm : How odd! Investigation and patch are welcome! |
@gfyoung @fillipe-gsm maybe the problem lies in the usage of It also mentions that " If val is real, the type of val is used for the output. If val has complex elements, the returned type is float. " |
@gfyoung @uds5501

    class Div(BinOp):
        """Div operator to special case casting.

        Parameters
        ----------
        lhs, rhs : Term or Op
            The Terms or Ops in the ``/`` expression.
        truediv : bool
            Whether or not to use true division. With Python 3 this happens
            regardless of the value of ``truediv``.
        """

        def __init__(self, lhs, rhs, truediv, *args, **kwargs):
            super(Div, self).__init__('/', lhs, rhs, *args, **kwargs)

            if not isnumeric(lhs.return_type) or not isnumeric(rhs.return_type):
                raise TypeError("unsupported operand type(s) for {0}:"
                                " '{1}' and '{2}'".format(self.op,
                                                          lhs.return_type,
                                                          rhs.return_type))

            if truediv or PY3:
                # do not upcast float32s to float64 un-necessarily
                acceptable_dtypes = [np.float32, np.float_]
                _cast_inplace(com.flatten(self), acceptable_dtypes, np.float_)

It seems that this class is simply not ready to handle complex numbers given the instruction

    data = {"a": [1 + 2j], "b": [1 + 1j]}
    df = pd.DataFrame(data = data)
    df.eval("a/b")
    0    (1.5+0.5j)
    dtype: complex128

What do you guys think? |
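To illustrate why a float-only cast would lose the imaginary part, here is a minimal NumPy sketch. It assumes that `_cast_inplace` behaves roughly like calling `astype(np.float64)` on every operand whose dtype is not in `acceptable_dtypes`; that is an assumption about the helper, not something confirmed in this thread. The variable names are made up for illustration.

```python
import warnings

import numpy as np

a = np.array([1 + 2j])
b = np.array([1 + 1j])

# Assumed stand-in for the cast: complex128 is not in the accepted float
# dtypes, so both operands are converted to float64 before dividing.
with warnings.catch_warnings():
    # NumPy warns that casting complex to real discards the imaginary part.
    warnings.simplefilter("ignore")
    a_cast = a.astype(np.float64)
    b_cast = b.astype(np.float64)

lossy = a_cast / b_cast  # division on the real parts only
exact = a / b            # complex division without any cast

print(lossy)
print(exact)
```

If the assumption holds, this would explain why only "/" is affected: the cast lives in the division operator class and no other operator applies it.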
@fillipe-gsm : That seems reasonable to me. A PR for this would be great! |
This is quite an old issue, but it has not been solved, and it looks similar to issues I encountered while working on hgrecco/pint-pandas#137 and #58748:
This makes me question the use of
A specific test was introduced at the time. It is now here: pandas/pandas/tests/computation/test_eval.py Line 761 in 3b48b17
I tried running this test with the use of _cast_inplace removed, and the test ran successfully. Changes in numpy or in the expression backend seem to have made _cast_inplace unnecessary. When it is removed, complex and ExtensionArray operands are computed properly. @jreback @jennolsen84 do you have any opinion on the matter?
Since pandas/pandas/core/computation/ops.py Line 515 in 3b48b17
|
note that test - expected = Series([0.25, 0.40, 0.50])
+ expected = Series(pd.array([0.25, 0.40, 0.50])) |
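For readers comparing the two `expected` lines in the diff above, the sketch below shows how the two constructions differ in dtype. It is only an illustration of the constructors themselves, assuming a pandas version that provides the nullable `Float64` dtype; it does not claim to reproduce the test in question.

```python
import pandas as pd

values = [0.25, 0.40, 0.50]

plain = pd.Series(values)             # backed by a NumPy float64 array
masked = pd.Series(pd.array(values))  # backed by a pandas extension array

print(plain.dtype)
print(masked.dtype)
```

The values are the same in both cases; only the backing array type changes, which matters when a test compares dtypes strictly.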
I worked a bit on this and I am facing an issue when updating |
…olve pandas-dev#2137) * remove core.computation.ops.Div resolves pandas-dev#21374 pandas-dev#58748 * need to preserve order * updating tests * (update whatsnew -- no whatsnew for 2.2.x and 2.3 yet) * solve mypy issue * fixing pytests * better than cast * adding specific test (* Update pandas/tests/frame/test_query_eval.py // Not backported, fails on 2.2) Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> * Update pandas/tests/computation/test_eval.py Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> --------- Co-authored-by: Laurent Mutricy <laurent.mutricy@ekium.eu> Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
* remove core.computation.ops.Div resolves pandas-dev#21374 pandas-dev#58748 * need to preserve order * updating tests * update whatsnew * solve mypy issue * fixing pytests * better than cast * adding specific test * Update pandas/tests/frame/test_query_eval.py Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> * Update pandas/tests/computation/test_eval.py Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> --------- Co-authored-by: Laurent Mutricy <laurent.mutricy@ekium.eu> Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com>
#59535) * remove core.computation.ops.Div resolves #21374 #58748 * need to preserve order * updating tests * (update whatsnew -- no whatsnew for 2.2.x and 2.3 yet) * solve mypy issue * fixing pytests * better than cast * adding specific test (* Update pandas/tests/frame/test_query_eval.py // Not backported, fails on 2.2) * Update pandas/tests/computation/test_eval.py --------- Co-authored-by: Laurent Mutricy <laurent.mutricy@ekium.eu> Co-authored-by: Matthew Roeschke <10647082+mroeschke@users.noreply.github.com> Co-authored-by: Thomas Li <47963215+lithomas1@users.noreply.github.com>
Code Sample
Problem description
The output was coerced to float, discarding the imaginary part. This also happens when the result is assigned to another existing column:
It happens even when the operation is performed in place:
Expected Output
The expected output is
The problem seems to happen only with the "/" operator. In fact, the correct result can be obtained by replacing the division with a multiplication and a negative exponent:
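A minimal sketch of that workaround follows. The frame reuses the values posted in the comments of this thread (`a = 1 + 2j`, `b = 1 + 1j`), which may differ from the original code sample; the variable names `workaround` and `reference` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1 + 2j], "b": [1 + 1j]})

# Workaround: multiply by b raised to the power -1 instead of dividing.
workaround = df.eval("a * b**(-1)")

# Reference value computed outside of eval with ordinary Series division.
reference = df["a"] / df["b"]

print(workaround)
print(reference)
```

Both results should keep a complex dtype, since neither expression goes through the "/" operator inside `eval`.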
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.16.12-300.fc28.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8
pandas: 0.23.0
pytest: None
pip: 9.0.3
setuptools: 39.2.0
Cython: None
numpy: 1.14.4
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: 0.4.1
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.1
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: None
lxml: 4.1.1
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None