[FEA] Enable regular expressions by default #4509
Labels: epic, feature request
Is your feature request related to a problem? Please describe.
Regular expression support is currently disabled by default because of many known compatibility issues, which are documented in the compatibility guide. This epic tracks the work required to address these issues and enable the feature by default.
Completed
- `$` and `\r` consistent with CPU #4170
- `[^a]` has different behavior between CPU and GPU for multiline strings #4229
- (`\d+`, `\d*`, `\d?`) #4467
- `\` can produce incorrect results #4559
- `\D` and `\W` match newline in Spark but not in cuDF #4475
High Priority
- `$` and string anchors `\z` and `\Z` in regexp_replace #4425
- `?` and `*` with regexp_replace #4468
- `\Z` in regular expressions #4532
- `$` in regular expressions #4533
Medium Priority
- `\s` and `\S` #4528
- `\b` and `\B` in regular expressions #4517
- `\b` and `\B` with StringSplit for regular expressions #5478
- REGEXP_SUBSTR #5973
Low Priority
- `\200` to `377` #4746
- `.` doesn't fully exclude all unicode line terminator characters #5415
- `[\0177]` #4862
- `0x7f` #4866
- `[\x7f]` #4865
Describe the solution you'd like
Support regular expression functions and expressions by default, with 100% compatibility with Spark.
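For reference, Spark on CPU evaluates regular expressions with `java.util.regex`, so that is the behavior the GPU implementation needs to match. Below is a minimal sketch of a few of the cases tracked above, written in plain Java with no Spark or cuDF dependency. The class name `RegexReference` and the helper `found` are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration; not part of Spark or the plugin.
class RegexReference {

    // Returns true if the pattern matches anywhere in the input (Matcher.find semantics).
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // `$` (without MULTILINE) matches at end of input and also just
        // before a final line terminator.
        System.out.println(found("a$", "a\n"));      // true

        // `\Z` behaves like `$` here: end of input, ignoring a final terminator.
        System.out.println(found("a\\Z", "a\n"));    // true

        // `\z` matches only at the very end of the input.
        System.out.println(found("a\\z", "a\n"));    // false

        // `\D` and `\W` are negated classes, so they match a newline.
        System.out.println(found("\\D", "\n"));      // true
        System.out.println(found("\\W", "\n"));      // true

        // `.` (without DOTALL) excludes Unicode line terminators as well as
        // \n and \r, for example U+2028 and U+0085.
        System.out.println(found(".", "\u2028"));    // false
        System.out.println(found(".", "\u0085"));    // false
    }
}
```

Running the same patterns through the GPU code path and comparing against these results is one way to spot-check compatibility for an individual issue.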
Describe alternatives you've considered
None
Additional context
None