-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Switch engine=cudf
to the new JSON
reader
#12509
[REVIEW] Switch engine=cudf
to the new JSON
reader
#12509
Conversation
Codecov ReportBase: 86.58% // Head: 85.70% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## branch-23.02 #12509 +/- ##
================================================
- Coverage 86.58% 85.70% -0.88%
================================================
Files 155 155
Lines 24368 24870 +502
================================================
+ Hits 21098 21316 +218
- Misses 3270 3554 +284
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
JSON
reader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
The only suggestion is to remove all engine='cudf'
and engine="cudf"
:)
python/cudf/cudf/io/json.py
Outdated
# TODO: Deprecated in 23.02, please | ||
# give some time until `cudf_legacy` | ||
# support can be removed completely. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"give some time until
cudf_legacy
support can be removed completely"
Will our one-release deprecation policy be enough for this, or do you think a special exception is warranted? If more time is needed, let's try to indicate the release in which support should be removed in the error message or perhaps a code comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we definitely want to give this a special exception. I don't think we have a specific release decided yet to remove it completely. I updated the comment here.
cc: @GregoryKimball @vuule too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll pick the removal release based on 23.02 feedback. In general, we'll remove the old reader as soon as there are no user issues specific to the new one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds reasonable! I just wanted to make sure we discussed this topic before merging. 👍
python/cudf/cudf/io/json.py
Outdated
raise ValueError( | ||
"engine='cudf_experimental' support has been removed, " | ||
"use `engine='cudf'`" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a hard error or a deprecation warning that replaces the value with "cudf"
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it was experimental, I feel we have the flexibility to make this an error, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's justifiable -- but we'll want to delete this error at some point later. Therefore, it's the same amount of work for us as developers to deprecate it as to force an error. "Add warning, delete warning later" vs. "add error, delete error later."
No strong feelings here - resolve as you see fit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, once conversations are resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Looking forward to future removal of engine='cudf'
s :)
Which also means the removal of |
JSON
readerengine=cudf
to the new JSON
reader
Co-authored-by: Bradley Dice <bdice@bradleydice.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Looking forward to add more orients, and remove engine parameter entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @galipremsagar - Looks good on the dask side!
/merge |
🚀 |
Description
Fixes: #12470
This PR:
cudf
engine in json reader to the map to thenewest
JSON reader. Introduces thecudf_legacy
engine to map to theold
JSON reader._fsspec_data_transfer
&compression
that is required for the switch, these failures are already caught by tests.Note: When
engine='auto'
, andline=False
, thepandas
json reader will be used. To override the selection, we passengine='cudf'
.Dependent on :
Checklist