TypeError using DataFrame.mask #93

Closed
sanzoghenzo opened this issue Aug 31, 2021 · 3 comments

@sanzoghenzo

Hi there, as always thanks for your hard work!

I'm running into a problem while trying to use the DataFrame.mask method on a DataFrame that contains a PintArray.

Here's my code:

import pandas as pd
import pint_pandas

pint_pandas.show_versions()

df = pd.DataFrame({
    "distance": pd.Series([1, 2, 3, 4], dtype="pint[m]"),
    "numbers1": pd.Series([1, 2, 2, 3], dtype="int64"),
    "numbers2": pd.Series([1, 1, 2, 3], dtype="int64"),
})

mask = df["numbers2"] > 1

df.mask(mask)

It returns:

{'numpy': '1.19.4', 'pandas': '1.2.0', 'pint': '0.17', 'pint_pandas': '0.2'}


Traceback (most recent call last):
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pint\quantity.py", line 1851, in __getitem__
    return type(self)(self._magnitude[key], self._units)
TypeError: 'float' object is not subscriptable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/a.ghensi/AppData/Roaming/JetBrains/PyCharm2021.1/scratches/scratch_1.py", line 29, in <module>
    main()
  File "C:/Users/a.ghensi/AppData/Roaming/JetBrains/PyCharm2021.1/scratches/scratch_1.py", line 23, in main
    df2 = df.mask(mask)
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pandas\core\generic.py", line 9317, in mask
    errors=errors,
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pandas\core\generic.py", line 9280, in where
    cond, other, inplace, axis, level, errors=errors, try_cast=try_cast
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pandas\core\generic.py", line 9135, in _where
    axis=block_axis,
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pandas\core\internals\managers.py", line 564, in where
    axis=axis,
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pandas\core\internals\managers.py", line 427, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pandas\core\internals\blocks.py", line 2001, in where
    set_other = other[icond]
  File "C:\tools\miniconda3\envs\dprock\lib\site-packages\pint\quantity.py", line 1857, in __getitem__
    "supports indexing".format(self._magnitude)
TypeError: Neither Quantity object nor its magnitude (nan)supports indexing

The error stems from pandas' ExtensionBlock.where method:

  • First, if the replacement value (nan in this case) is a scalar and isna is true for it, pandas swaps it for dtype.na_value (Quantity(np.nan, "meters") in this case).
  • Then it checks again whether the value is a scalar; if it is not, pandas treats it as indexable and indexes it with the mask. This raises the error because pd.api.types.is_scalar(Quantity(np.nan, "meters")) returns False.
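
For anyone following along, the two steps above can be sketched roughly as below. This is a simplified paraphrase of the logic, not pandas' actual code: FakeQuantity and pick_replacement are hypothetical names, and FakeQuantity only stands in for pint's Quantity so the snippet runs without pint installed.

```python
import numpy as np
import pandas as pd


class FakeQuantity:
    """Hypothetical stand-in for pint's Quantity: a magnitude plus a unit string."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units

    def __getitem__(self, key):
        # Same failure mode as in the traceback: a float magnitude cannot be indexed.
        return FakeQuantity(self.magnitude[key], self.units)


def pick_replacement(other, na_value, cond):
    """Simplified paraphrase of the two checks (assumption, not the real pandas code)."""
    if pd.api.types.is_scalar(other) and pd.isna(other):
        other = na_value              # step 1: swap NaN for the dtype's NA value
    if pd.api.types.is_scalar(other):
        return other                  # step 2a: a scalar is used as-is
    return other[~cond]               # step 2b: anything else is indexed with the mask


na_value = FakeQuantity(np.nan, "meter")
cond = np.array([True, True, False, False])

print(pd.api.types.is_scalar(np.nan))    # True
print(pd.api.types.is_scalar(na_value))  # False

try:
    pick_replacement(np.nan, na_value, cond)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

Because the NA value is not recognised as a scalar, the second check falls through to indexing, and indexing a bare float magnitude is what raises the TypeError.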

Is this something that can be fixed here, or is it a pandas issue?

@sanzoghenzo
Author

I should have searched the pandas issues first!

@MichaelTiemannOSC
Collaborator

I'm getting hit by this, too! df.where does not behave well when it passes rows with NaNs. I see the above link to pandas...we'll see what happens.

bors bot added a commit that referenced this issue Oct 27, 2022
133: Support for Pandas 1.5/1.6 r=andrewgsavage a=andrewgsavage

- [x] Closes #128 #125 #117 #113 #93 #26 
- [x] Executed ``pre-commit run --all-files`` with no errors
- [x] The change is fully covered by automated unit tests
- [x] Documented in docs/ as appropriate
- [x] Added an entry to the CHANGES file

Probably closes #120 

This is running locally, with hgrecco/pint#1596. Quite a few additional tests passing!
This follows the change to is_list_like in pandas, which allows constructors and setitem to work.

Co-authored-by: Andrew <andrewgsavage@gmail.com>
@andrewgsavage
Collaborator

this works now

	distance	numbers1	numbers2
0	1.0	1.0	1.0
1	2.0	2.0	1.0
2	nan	NaN	NaN
3	nan	NaN	NaN
