[FEA] Support allowUnquotedControlChars for JsonToStructs and ScanJson #4612

Open
Tracked by #9
GaryShen2008 opened this issue Jan 24, 2022 · 2 comments
Labels
task Work required that improves the product but is not user facing

Comments

@GaryShen2008
Collaborator

GaryShen2008 commented Jan 24, 2022

From the Spark JSON options:

allowUnquotedControlChars: Allows JSON Strings to contain unquoted control characters (ASCII characters with value less than 32, including tab and line feed characters) or not.

This option is off by default in Spark, but CUDF accepts unquoted control characters by default. We need a way to turn that behavior off in CUDF; otherwise we will return parsed values for inputs where Spark returns nulls.
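To make the mismatch concrete, here is a minimal sketch of the two behaviors using Python's standard `json` module as a stand-in. This is only an analogy: it does not call Spark or CUDF. Python's `strict` flag happens to control the same thing (whether raw control characters with codes 0-31 are accepted inside strings), so `strict=True` plays the role of Spark's default and `strict=False` plays the role of the current CUDF default. The helper `parse_or_none` and its parameter name are hypothetical and exist only for this illustration.

```python
import json

# A JSON document whose string value contains a raw, unescaped tab (code point 9).
RAW_TAB_DOC = '{"a": "x\ty"}'
# The same value with the tab written as the JSON escape sequence \t.
ESCAPED_TAB_DOC = '{"a": "x\\ty"}'


def parse_or_none(text, allow_unquoted_control_chars=False):
    """Hypothetical helper: parse JSON, returning None for invalid input.

    Returning None mimics a reader that produces null for a bad record.
    """
    try:
        # strict=True rejects raw control characters inside strings.
        return json.loads(text, strict=not allow_unquoted_control_chars)
    except json.JSONDecodeError:
        return None


# Analogous to Spark's default (option off): the raw tab makes the record invalid.
strict_result = parse_or_none(RAW_TAB_DOC)

# Analogous to the current CUDF default (option on): the raw tab is accepted.
lenient_result = parse_or_none(RAW_TAB_DOC, allow_unquoted_control_chars=True)

# A properly escaped tab is valid either way.
escaped_result = parse_or_none(ESCAPED_TAB_DOC)
```

In this sketch `strict_result` is `None` while `lenient_result` is a parsed object, which is the same shape of disagreement described above: one reader yields null and the other yields data for the same input.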

GaryShen2008 added the "feature request" and "? - Needs Triage" labels on Jan 24, 2022
sameerz added the "task" label and removed the "feature request" and "? - Needs Triage" labels on Jan 25, 2022
@revans2
Collaborator

revans2 commented Feb 21, 2024

allowUnquotedControlChars is what CUDF supports by default already. We need a way to disable it because that is what Spark has by default.

revans2 changed the title from "[FEA]JSON reader: support unquoted field name and control chars" to "[FEA] Support allowUnquotedControlChars for JsonToStructs and ScanJson" on Mar 13, 2024
@revans2
Collaborator

revans2 commented Mar 13, 2024

This depends on rapidsai/cudf#15222. It may be broken up into smaller pieces in the future, and there are a lot of things that could be validated.
