-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable regular expressions containing \s
and \S
#5089
Conversation
Signed-off-by: sperlingxx <lovedreamf@gmail.com>
build |
It looks like there may be a version mismatch or something.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks fine to me, but I want to be sure that the test is good enough to cover the corner cases we expect. @sperlingxx did you check the CUDF implementation to be sure that they match? @andygrove Are there other characters that you would like to see in there? Especially in we want to catch possible cudf regressions.
Could you also update |
I couldn't find documentation for what cuDF considers whitespace but through experimentation in the fuzz tests, I found that cuDF appears to consider test("whitespace boundaries - replace") {
val patterns = Seq("\\s", "\\S")
val inputs = Seq("\u001eTEST")
assertCpuGpuMatchesRegexpReplace(patterns, inputs)
} This fails with |
Rather than pass |
Yes, I changed the parser to transpile |
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I don't think we need to change the Spark2 code but I don't think it hurts either.
Closes #4528
Signed-off-by: sperlingxx lovedreamf@gmail.com
Enable regular expressions which contain
\s
and\S
and provide corresponding tests.