BUG: Regression in getitem for SparseArray with greater comparison #45110

Closed
3 tasks done
phofl opened this issue Dec 29, 2021 · 9 comments
Labels
Indexing (related to indexing on series/frames, not to indexes themselves), Regression (functionality that used to work in a prior pandas version), Sparse (sparse data type)

@phofl
Member

phofl commented Dec 29, 2021

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

#44955 broke some things with greater-than comparisons. We should fix them before 1.4 is released, or revert that change.

import numpy as np
import pandas as pd
s = pd.arrays.SparseArray([1, 2, 3, 4, np.nan, np.nan], fill_value=np.nan)
s[s > 2]

Issue Description

This returns

[1.0, 2.0, 3.0, 4.0]
Fill: nan
IntIndex
Indices: array([0, 1, 2, 3], dtype=int32)

Expected Behavior

Before the regression, this returned

[3.0, 4.0]
Fill: nan
IntIndex
Indices: array([0, 1], dtype=int32)

which was correct.
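
One way to confirm which behavior a given pandas build has is to compare the sparse result against the same boolean indexing done on a dense NumPy array. This is only a sketch of such a check, not code from the pandas test suite; the variable names `dense`, `expected`, `result`, and `matches` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

s = pd.arrays.SparseArray([1, 2, 3, 4, np.nan, np.nan], fill_value=np.nan)

# Dense reference: NaN > 2 evaluates to False, so only 3.0 and 4.0 are kept.
dense = np.asarray(s, dtype="float64")
expected = dense[dense > 2]

# Sparse path under discussion: a boolean SparseArray mask passed to __getitem__.
result = np.asarray(s[s > 2], dtype="float64")

# True on builds without the regression, False on affected builds.
matches = np.array_equal(result, expected)
print("sparse:", result, "dense:", expected, "match:", matches)
```

On a build with the regression described above, `result` would hold all four stored values and `matches` would be `False`.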

Installed Versions

INSTALLED VERSIONS

commit : a51cd10
python : 3.8.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-43-generic
Version : #47~20.04.2-Ubuntu SMP Mon Dec 13 11:06:56 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.4.0.dev0+1516.ga51cd1053e
numpy : 1.21.5
pytz : 2021.1
dateutil : 2.8.2
pip : 21.3.1
setuptools : 59.8.0
Cython : 0.29.24
pytest : 6.2.5
hypothesis : 6.23.1
sphinx : 4.2.0
blosc : None
feather : None
xlsxwriter : 3.0.1
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.28.0
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : 2021.11.0
fastparquet : 0.7.1
gcsfs : 2021.05.0
matplotlib : 3.4.3
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 1.0.1
pyxlsb : None
s3fs : 2021.11.0
scipy : 1.7.2
sqlalchemy : 1.4.25
tables : 3.6.1
tabulate : 0.8.9
xarray : 0.18.0
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1
zstandard : None

@phofl phofl added the Bug, Needs Triage, Blocker for rc, Indexing, Regression, and Sparse labels and removed the Bug and Needs Triage labels Dec 29, 2021
@phofl phofl added this to the 1.4 milestone Dec 29, 2021
@phofl
Member Author

phofl commented Dec 29, 2021

cc @bdrum

@bdrum
Contributor

bdrum commented Dec 29, 2021

Hi @phofl

Was there a test for this?

I ask because I opened issue #44956, which is about comparison operators giving wrong results.

The behavior you're seeing is exactly that.

As you can see in this screenshot:
image
I've fixed it, but only for scalars. Other comparisons still give wrong results, but I'm working on them.

I will add your case to the tests as well.

@phofl
Member Author

phofl commented Dec 29, 2021

No, there is no test right now.

Yes, recalculating the indices was the previous bug, but getitem worked. getitem is now broken for these comparisons.

This does not work on the newest master commit. Where did you test that?
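
Since there is no test for this case yet, a regression check could look roughly like the following. It is a sketch under assumptions: the helper name `check_mask_getitem` and the choice of operators are hypothetical and are not taken from the pandas test suite or from the eventual fix. It compares boolean-mask indexing on the SparseArray from the report with the same indexing on a dense array.

```python
import operator

import numpy as np
import pandas as pd


def check_mask_getitem(op, threshold=2):
    """Return True if sparse and dense boolean indexing agree for `op`."""
    values = [1, 2, 3, 4, np.nan, np.nan]
    sparse = pd.arrays.SparseArray(values, fill_value=np.nan)
    dense = np.array(values, dtype="float64")

    # Dense baseline: comparisons against NaN are False, so NaNs are dropped.
    expected = dense[op(dense, threshold)]
    # Sparse path: the comparison yields a boolean SparseArray used as the mask.
    result = np.asarray(sparse[op(sparse, threshold)], dtype="float64")
    return np.array_equal(result, expected)


# Hypothetical operator selection: the ordering comparisons with a scalar.
outcomes = {
    op.__name__: check_mask_getitem(op)
    for op in (operator.gt, operator.ge, operator.lt, operator.le)
}
print(outcomes)
```

In a real test suite this loop would more naturally be a parametrized test; the dictionary form is used here only to keep the sketch free of a test-framework dependency.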

@bdrum
Contributor

bdrum commented Dec 29, 2021

I meant that my screenshot was from my current dev branch. I haven't prepared a PR yet, because I want to fix array comparisons as described in the issue.
OK, what is the release date of 1.4? Hopefully I will fix it tomorrow or the day after.

@phofl
Member Author

phofl commented Dec 29, 2021

See #41957

@jreback jreback removed the Blocker for rc (blocking issue or pull request for release candidate) label Dec 31, 2021
@jreback
Contributor

jreback commented Dec 31, 2021

We should try to fix this for 1.4, but it shouldn't block the rc.

@phofl
Member Author

phofl commented Dec 31, 2021

Sgtm, but if we don't get this in for the actual release, we should revert the commit that caused it before releasing 1.4.

@jreback
Contributor

jreback commented Dec 31, 2021

sounds good (either way is fine)

@bdrum
Contributor

bdrum commented Jan 9, 2022

@jreback, @phofl this could also be closed. It was fixed and tested by #45125.
