Saving a dataset with .to_json() fails with a ValueError since the latest pandas release (2.1.0)
In their latest release we have:
Improved error handling when using DataFrame.to_json() with incompatible index and orient arguments (GH 52143)
i.e. an error is now raised for invalid combinations of index and orient.
This means that unfortunately the custom logic at this line might sometimes lead to contradictions:
datasets/src/datasets/io/json.py
Line 96 in 029227a
For example, in the default case orient=records leads to index=True, which now raises a ValueError.
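To make the contradiction concrete, here is a minimal sketch that does not import pandas or datasets. `check_index_orient` is only a stand-in for the new pandas validation, built from the error message quoted in this issue. `old_style_kwargs` is an assumed approximation of the default logic described above, and `safer_kwargs` is a hypothetical alternative, not the maintainers' fix.

```python
# Orients that accept index=True, copied from the pandas error message in this issue.
ORIENTS_ALLOWING_INDEX_TRUE = {"split", "table", "index", "columns"}


def check_index_orient(orient, index):
    """Simplified stand-in for the pandas >= 2.1.0 check (not the pandas source)."""
    if index is True and orient not in ORIENTS_ALLOWING_INDEX_TRUE:
        raise ValueError(
            "'index=True' is only valid when 'orient' is "
            "'split', 'table', 'index', or 'columns'."
        )


def old_style_kwargs(orient="records"):
    """Assumption: an index value is always derived, True unless orient is split/table."""
    return {"orient": orient, "index": orient not in ("split", "table")}


def safer_kwargs(orient="records", index=None):
    """Hypothetical alternative: forward index only when the caller set it explicitly."""
    kwargs = {"orient": orient}
    if index is not None:
        kwargs["index"] = index
    return kwargs


def is_accepted(kwargs):
    """Return True if the stand-in check accepts this orient/index combination."""
    try:
        check_index_orient(kwargs["orient"], kwargs.get("index"))
    except ValueError:
        return False
    return True


print(is_accepted(old_style_kwargs()))  # False: records + index=True is rejected
print(is_accepted(safer_kwargs()))      # True: index is left unset
```

The point of the second helper is that leaving `index` unset lets pandas apply its own default, so only an explicit, incompatible choice by the caller can trigger the error.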
import datasets

if __name__ == '__main__':
    dataset = datasets.Dataset.from_dict({"A": [1, 2, 3], "B": [4, 5, 6]})
    dataset.to_json("dataset.json")
>>> ValueError: 'index=True' is only valid when 'orient' is 'split', 'table', 'index', or 'columns'.
The dataset is successfully saved as .json
python >= 3.9
pandas >= 2.1.0
Thanks for reporting. We are investigating it.
This issue is caused by the latest pandas release, 2.1.0 (released yesterday, Aug 30).
See: https://github.com/huggingface/datasets/actions/runs/6035484010/job/16375932085?pr=6198
People using previous releases of datasets should pin pandas in their local environment:
python -m pip install 'pandas<2.1.0'
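Before pinning, it can help to check whether the local environment is affected at all. The sketch below uses only the standard library. `is_affected` and `installed_pandas_version` are hypothetical helper names. The sketch also assumes that any pandas version at or above 2.1 triggers the error, which only holds for datasets releases that still contain the logic described in this issue.

```python
import re
from importlib import metadata


def is_affected(pandas_version):
    """Return True if the version string is 2.1 or newer (assumed affected range)."""
    match = re.match(r"(\d+)\.(\d+)", pandas_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {pandas_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (2, 1)


def installed_pandas_version():
    """Return the installed pandas version string, or None if pandas is missing."""
    try:
        return metadata.version("pandas")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_pandas_version()
    if version is None:
        print("pandas is not installed")
    elif is_affected(version):
        print(f"pandas {version}: consider pinning with 'pandas<2.1.0'")
    else:
        print(f"pandas {version}: not affected by this issue")
```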
albertvillanova