Fallback to CPU when encoding is not supported for JSON reader #4622
Signed-off-by: Bobby Wang <wbo4958@gmail.com>
build
This is a little late to be doing this for 22.02. But it is a simple bug fix.
@@ -138,6 +140,11 @@ object GpuJsonScan {
      meta.willNotWorkOnGpu("GpuJsonScan only supports \"\\n\" as a line separator")
    }

    parsedOptions.encoding.foreach(enc =>
      if (enc != StandardCharsets.UTF_8.name() && enc != StandardCharsets.US_ASCII.name()) {
        meta.willNotWorkOnGpu("GpuJsonScan only supports UTF8 or US-ASCII encoded data")
Could we have the error message include the encoding that is causing us to fall back? This will make it easier for the user to debug.
Thanks for the comment. I'd like to address this in a follow-up targeting 22.04.
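For reference, a minimal sketch of how the fallback reason could name the offending encoding, as requested above. This is not the code from the PR or the follow-up: `EncodingMessageSketch`, `fallbackReason`, and the exact message wording are hypothetical names chosen for illustration. It keeps the same exact-string comparison against `StandardCharsets.UTF_8.name()` and `StandardCharsets.US_ASCII.name()` that the diff uses.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper object, not part of the plugin.
object EncodingMessageSketch {
  // Encodings accepted by the check in this PR ("UTF-8" and "US-ASCII").
  private val supported: Set[String] =
    Set(StandardCharsets.UTF_8.name(), StandardCharsets.US_ASCII.name())

  // Returns a fallback reason that includes the encoding name,
  // or None when no encoding is set or the encoding is supported.
  def fallbackReason(encoding: Option[String]): Option[String] =
    encoding.filterNot(supported.contains).map { enc =>
      s"GpuJsonScan only supports UTF8 or US-ASCII encoded data, but found encoding: $enc"
    }
}
```

Assuming `meta.willNotWorkOnGpu` accepts a single `String`, as it does in the diff, the call site might then read `EncodingMessageSketch.fallbackReason(parsedOptions.encoding).foreach(reason => meta.willNotWorkOnGpu(reason))`.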
@@ -138,6 +140,11 @@ object GpuJsonScan {
      meta.willNotWorkOnGpu("GpuJsonScan only supports \"\\n\" as a line separator")
    }

    parsedOptions.encoding.foreach(enc =>
nit: could we use exists here instead, since this is an Option? Something like:
if (parsedOptions.encoding.exists(enc =>
    enc != StandardCharsets.UTF_8.name() && enc != StandardCharsets.US_ASCII.name())) {
Thanks for the comment. I'd like to address this in a follow-up targeting 22.04.
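For reference, a small self-contained sketch of the `exists` form suggested in this thread. `EncodingCheckSketch` and `isUnsupported` are hypothetical names used only for illustration. `Option.exists` returns `false` for `None`, so an unset encoding never triggers the fallback, which matches the behaviour of the `foreach` version in the diff.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical wrapper, not part of the plugin.
object EncodingCheckSketch {
  // True only when an encoding is set and it is neither UTF-8 nor US-ASCII.
  def isUnsupported(encoding: Option[String]): Boolean =
    encoding.exists(enc =>
      enc != StandardCharsets.UTF_8.name() && enc != StandardCharsets.US_ASCII.name())
}
```

The predicate keeps the condition in one expression, so the `if` at the call site only has to decide whether to call `meta.willNotWorkOnGpu`.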
Fallback to CPU when encoding is not UTF-8 or US-ASCII