Fix beginning fixups of captures after Regex span support #66713
The Regex span support changed the scanning infrastructure to always be based on spans. That means that when a string is passed in by the caller, internally we still operate on it as a span. That also means we can take advantage of slicing, and if the caller has specified via a beginning/length set of arguments that we should only process a substring, we can just slice to that substring. That, however, then means that all offsets computed by the scanning implementation are 0-based rather than beginning-based. The span change included a fixup for the overall match position, but not for the position of each individual capture, and that then meant that captures were providing the wrong values. We unfortunately didn't have any tests for validating groups that also involved non-0 beginnings with string inputs.
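The offset problem described above can be illustrated with a small language-neutral sketch. This is Python rather than the .NET implementation, and `match_in_range` is a hypothetical helper, not part of any Regex API. It slices the input to the requested window, matches against the slice (so every offset is 0-based), and then adds `beginning` back to the overall match and to each capture group, which is the step the description says was missing for captures.

```python
import re


def match_in_range(pattern, text, beginning, length):
    """Hypothetical helper: match inside text[beginning:beginning+length]
    and report spans relative to the full `text`, not the slice."""
    # Slicing means the matcher only ever sees 0-based offsets.
    window = text[beginning:beginning + length]
    m = re.search(pattern, window)
    if m is None:
        return None

    spans = []
    # Index 0 is the overall match; 1..n are the capture groups.
    for i in range(m.re.groups + 1):
        start, end = m.span(i)
        if start == -1:
            # Group did not participate in the match.
            spans.append(None)
        else:
            # Fix up every span, not just the overall match.
            spans.append((start + beginning, end + beginning))
    return spans


text = "xx key=value"
spans = match_in_range(r"(\w+)=(\w+)", text, 3, 9)
# Each fixed-up span indexes correctly into the original string.
print([text[s:e] for s, e in spans])
```

If the fixup were applied only to index 0, the group spans would still point into the slice, so indexing the original string with them would return the wrong text.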
Thank you for fixing, sorry that I missed this from the beginning.
Fixes #66697