BUG: Internal and external indices on axis 0 do not match doing df._to_pandas after pd.read_csv #5150

Closed
3 tasks done
YarShev opened this issue Oct 23, 2022 · 0 comments · Fixed by #5151
Labels
bug 🦗: Something isn't working
P0: Highest priority tasks requiring immediate fix

Comments

Collaborator

YarShev commented Oct 23, 2022

Modin version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest released version of Modin.

  • I have confirmed this bug exists on the main branch of Modin. (In order to do this you can follow this guide.)

Reproducible Example

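import numpy as np
import pandas
import modin.pandas as pd

# Write a 64x64 frame of random integers to CSV without its index.
pandas_df = pandas.DataFrame(np.random.randint(0, 100, size=(2**6, 2**6)))
pandas_df.to_csv("foo.csv", index=False)
# Read it back with Modin, then convert the result to pandas.
modin_df = pd.read_csv("foo.csv", index_col=False)
pandas_df2 = modin_df._to_pandas()  # raises the internal error shown under Error Logs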

Issue Description

The internal and external indices on axis 0 do not match when calling df._to_pandas() after pd.read_csv. In this case, the indices should be synchronized after read_csv.

Expected Behavior

Modin should read the data correctly with the provided parameters so that subsequent operations also work correctly.
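
As a reference point for the behavior described above, the same round trip in stock pandas can be sketched as follows. This is an illustration rather than part of the original report: it assumes an in-memory buffer in place of foo.csv, and the names `buffer` and `expected` are hypothetical. It only shows what pandas itself returns for `index_col=False`, which is presumably what the Modin result should match after `_to_pandas()`.

```python
import io

import numpy as np
import pandas

# Same shape as the reproducer: a 64 x 64 frame of random integers.
pandas_df = pandas.DataFrame(np.random.randint(0, 100, size=(2**6, 2**6)))

# Round-trip through CSV in memory instead of writing foo.csv to disk.
buffer = io.StringIO()
pandas_df.to_csv(buffer, index=False)
buffer.seek(0)

# With index_col=False, pandas does not use any column as the index,
# so the result carries a default RangeIndex over the rows.
expected = pandas.read_csv(buffer, index_col=False)

assert isinstance(expected.index, pandas.RangeIndex)
assert len(expected.index) == len(pandas_df)
assert (expected.to_numpy() == pandas_df.to_numpy()).all()
```

Comparing `modin_df._to_pandas()` against a frame built this way is one possible check that the index survived the read intact.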

Error Logs

Exception: Internal Error. Please visit https://github.com/modin-project/modin/issues to file an issue with the traceback and the command that caused this error. If you can't file a GitHub issue, please email bug_reports@modin.org.
Internal and external indices on axis 0 do not match.

Installed Versions

INSTALLED VERSIONS

commit : 6cc441a
python : 3.9.13.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.102.1-microsoft-standard-WSL2
Version : #1 SMP Wed Mar 2 00:30:59 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

Modin dependencies

modin : 0.16.0+21.g6cc441a4
ray : 2.0.0
dask : 2022.9.1
distributed : 2022.9.1
hdk : None

pandas dependencies

pandas : 1.5.1
numpy : 1.23.3
pytz : 2022.2.1
dateutil : 2.8.2
setuptools : 65.4.0
pip : 22.2.2
Cython : None
pytest : 7.1.3
hypothesis : None
sphinx : 5.2.2
blosc : None
feather : 0.4.1
xlsxwriter : None
lxml.etree : 4.8.0
html5lib : None
pymysql : None
psycopg2 : 2.8.6
jinja2 : 3.1.2
IPython : 8.5.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli :
fastparquet : 0.8.3
fsspec : 2022.8.2
gcsfs : None
matplotlib : 2.2.5
numba : None
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : 0.17.8
pyarrow : 9.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2022.8.2
scipy : 1.9.1
snappy : None
sqlalchemy : 1.4.41
tables : 3.7.0
tabulate : None
xarray : 2022.6.0
xlrd : 2.0.1
xlwt : None
zstandard : None
tzdata : None

YarShev added the bug 🦗 (Something isn't working), Triage 🩹 (Issues that need triage), and P0 (Highest priority tasks requiring immediate fix) labels and removed the Triage 🩹 (Issues that need triage) label on Oct 23, 2022
YarShev added a commit to YarShev/modin that referenced this issue Oct 23, 2022
… is False

Signed-off-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com>
dchigarev pushed a commit that referenced this issue Oct 25, 2022
)

Signed-off-by: Igoshev, Iaroslav <iaroslav.igoshev@intel.com>