Regex Capture gives IndexOutOfRangeException on (?:){93}
#62094
Comments
Tagging subscribers to this area: @eerhardt, @dotnet/area-system-text-regularexpressions
Issue Details
Regex regex = new("@(a*)+?");
MatchCollection mc = regex.Matches("@");
foreach (Match m in mc)
{
    Console.WriteLine($"{m.Index} .. {m.Length + m.Index - 1}");
    Console.WriteLine(m.ToString());
}
On other regex engines, like PCRE, Nim, Python, JavaScript, and Golang, you would get
On .NET Framework this displays
On .NET 6 it gives
@stephentoub first bug from running Nim tests.
So there seem to be distinct bugs
Looks like a regression in .NET 5, which also exhibits this. .NET Core 3.1 outputs 1..1 like .NET Framework does.
This is actually an issue in .NET Framework 4.8 and .NET Core 3.1 as well. The change in .NET 5 is that it recognizes that the outer loop is actually atomic (because there's nothing after it to backtrack into it). So it effectively changes the expression to instead be:
Try that expression with .NET Framework, and you similarly get 1..3 and an out of range exception.
(Locally I also have the final changes that switch all compiler / source generated expressions over to the new simpler code generation (except for anything that uses RightToLeft), and with that, this correctly outputs 0..1 and no exception, so it'll end up really only being an issue for the interpreter.)
When I fuzzed I believe I only used IsMatch. Next time we should enumerate the matches also.
I just tested with Preview 7 bits and it seems like all 4 engines (including Interpreted) are correctly printing:
0 .. 1
@
Not sure exactly which change fixed this as it was logged some time ago, but closing as completed.
@joperezr might be worth adding as a test case if we don't have one.
Good point, I'll reopen this to track that.
lol, that explains why after adding it I was seeing it running elsewhere in testResults.xml. I'll fix that condition instead then.
So I tried re-enabling all 3 of those, but it looks like one is still problematic: Lines 405 to 409 in f7d7b55
This is still throwing IndexOutOfRangeException when trying to call TrackPush(). It looks like with such a large number of loops (93 in that test case) we fail when trying to walk back through the runtrack stack. The exception is thrown on this line: Line 82 in 63692cd
since at some point runtrackpos is 0 so this is trying to access index
@joperezr, can this be closed again now? Or if it's tracking a disabled test, should it be moved out of 7.0 and labeled as disabled test?
It is tracking a disabled test, so I'll label and move as suggested.
(?:){93}
I would like to provide a comment for reference.
I inserted the following into Regex.MultipleMatches.Tests.Matches_TestData() for testing.
for (int i = 0; i < 100; i++)
{
    yield return new object[]
    {
        engine, @"(?:){" + i + @"}", "x", RegexOptions.None, new[]
        {
            new CaptureData("", 0, 0),
            new CaptureData("", 1, 0)
        }
    };
}
This update fixes the IndexOutOfRangeException in RegexInterpreter by enhancing the `TrackPush` and `TrackPush2` methods. The adjustment involves checking the runtrack position before decrementing it, ensuring that it doesn't become negative, which was the root cause of the exception. This prevents potential out-of-range errors when handling large numbers of repetitions in regular expressions. Fix dotnet#62094
The size of RegexRunner.runtrack is set to runtrackcount * 8 (runtrackcount == _code.TrackCount).
It appears that the index (runtrackpos) becomes negative and causes an exception for patterns from @"(?:){26}" through @"(?:){99}" because the runtrack size of 7 * 8 (== 56) is not sufficient.
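The failure mode and the guard described above can be illustrated with a small sketch. This is not the RegexInterpreter code: `TrackStack`, `PushUnchecked`, `Push`, and `Grow` are hypothetical names, and the doubling growth policy is an assumption. It only models a stack that fills an array from the high end downward, as runtrack and runtrackpos are described here, and shows why checking the position before decrementing keeps the index from going negative.

```csharp
using System;

// Hypothetical stand-in for a downward-growing backtracking stack.
// Not the real RegexRunner; names and growth policy are assumptions.
public sealed class TrackStack
{
    private int[] _track; // storage, filled from the high end downward
    private int _pos;     // index of the newest entry; equals _track.Length when empty

    public TrackStack(int initialSize)
    {
        _track = new int[initialSize];
        _pos = initialSize;
    }

    public int Count => _track.Length - _pos;

    // Unguarded push: once _pos is 0, the decrement yields -1 and the
    // array access throws IndexOutOfRangeException.
    public void PushUnchecked(int value) => _track[--_pos] = value;

    // Guarded push: make room before decrementing, so _pos never goes negative.
    public void Push(int value)
    {
        if (_pos == 0)
        {
            Grow();
        }
        _track[--_pos] = value;
    }

    public int Pop() => _track[_pos++];

    private void Grow()
    {
        int[] bigger = new int[Math.Max(_track.Length * 2, 1)];
        // Live entries occupy [_pos, Length); keep them at the high end of the new array.
        int used = _track.Length - _pos;
        Array.Copy(_track, _pos, bigger, bigger.Length - used, used);
        _pos = bigger.Length - used;
        _track = bigger;
    }
}
```

With an initial size of 56 (the 7 * 8 figure above), the 57th call to `PushUnchecked` throws, while `Push` grows the array and keeps going.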
* Prevent IndexOutOfRangeException in RegexInterpreter

  This update fixes the IndexOutOfRangeException in RegexInterpreter by enhancing the `TrackPush` and `TrackPush2` methods. The adjustment involves checking the runtrack position before decrementing it, ensuring that it doesn't become negative, which was the root cause of the exception. This prevents potential out-of-range errors when handling large numbers of repetitions in regular expressions.

  Fix #62094

* Changed to call EnsureStorage() unconditionally.

  If EnsureStorage() is called unconditionally, the array will be expanded, so the position will never become negative. When the conditions inside EnsureStorage() are true, it might be necessary to expand the array, regardless of the comparison between newpos and codepos.

  https://github.com/dotnet/runtime/blob/6ebc8bd86dbc780b2a2a7daf3ab6020f9104f09e/src/libraries/System.Text.RegularExpressions/tests/FunctionalTests/Regex.MultipleMatches.Tests.cs#L461-L469

  Before the change, in this test case, EnsureStorage() is not called because newpos == codepos == 6 from the first time until an exception occurs.

  Fix #62049
On other regex engines, like PCRE, Nim, Python, JavaScript, and Golang, you would get
0 .. 1
On .NET Framework this displays
1 .. 1
On .NET 6 it gives
@stephentoub first bug from running Nim tests.