[FEA] JSON reader: ignores Java/C++ style comment #10265
Comments
This is off by default in Spark, so it would be nice to support it, but it is not a blocker.
Do the comments only occur in between field names, values, and symbols? Does Does a block comment ever contain
To be clear, this is not a super high priority for us. No one has requested it yet, but it is a part of Spark. This would be a feature that we could turn on or off. It corresponds to a feature of the Jackson JSON parser, which is the parser Spark uses; Spark just exposes some configs for it. Some of the tests for it are at
Yes, that appears to be correct.
results in an error. Comment parsing within a quoted string is skipped.
results in
There are some bugs with this in Spark. For now I would say yes: a new line (\n) or the end of the buffer/file. We can standardize on that. The reality is a little more complicated, but only for features that I don't think we would support or need to be bug-for-bug compatible with Spark on.
From reading the Jackson parser code, it appears to be doing exactly that. If it sees a '/' when it is looking to skip white space, a colon, or the end, then it goes into comment skipping mode. A '/' by itself is an error, as is a '/' followed by anything other than another '/' or a '*'.
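The rules discussed in this thread can be sketched in a few lines of Python. This is only an illustration, not the Jackson or cuDF implementation: the function name strip_json_comments is hypothetical, and it works as a pre-pass that removes comments before handing the text to a standard JSON parser, rather than skipping them inside the tokenizer as Jackson does. It assumes a line comment ends at \n or the end of the buffer, that comment markers inside quoted strings are left alone, and that a '/' not followed by '/' or '*' is an error.

```python
import json


def strip_json_comments(text: str) -> str:
    """Remove // and /* */ comments that appear outside quoted strings."""
    out = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in ('"', "'"):
            # Copy a quoted string verbatim; comment markers inside it are data.
            quote = ch
            j = i + 1
            while j < n and text[j] != quote:
                j += 2 if text[j] == "\\" else 1  # step over escaped characters
            if j >= n:
                raise ValueError("unterminated string")
            out.append(text[i:j + 1])
            i = j + 1
        elif ch == "/":
            nxt = text[i + 1] if i + 1 < n else ""
            if nxt == "/":
                # Line comment: runs to '\n' or the end of the buffer.
                end = text.find("\n", i + 2)
                i = n if end == -1 else end  # the newline itself is kept
            elif nxt == "*":
                # Block comment: runs to the first '*/'.
                end = text.find("*/", i + 2)
                if end == -1:
                    raise ValueError("unterminated block comment")
                out.append(" ")  # keep the tokens on either side separated
                i = end + 2
            else:
                # A '/' on its own, or followed by anything else, is an error.
                raise ValueError(f"unexpected '/' at offset {i}")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    doc = '{"name": /* hello */ "Reynold Xin"}'
    print(json.loads(strip_json_comments(doc)))  # {'name': 'Reynold Xin'}
```

The sketch accepts single-quoted strings only so that it can step over them safely; parsing them would still need the equivalent of a separate single-quote option, which json.loads does not have.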
I am not sure what you mean by single line comments? Do you mean something like
Yes
This is part of FEA of NVIDIA/spark-rapids#9
We have a JSON file
Spark can parse it when enabling allowComments and multiLine
or
{'name': /* hello */ 'Reynold Xin'}
Spark can parse it when enabling allowComments
We expect a configuration option, allowComments, to control this behavior.