ArrayIndexOutOfBoundsException occasionally thrown in TokenizerThread #840

Open
FlorianKroiss opened this issue Dec 9, 2024 · 0 comments

@FlorianKroiss

In my case, this happens quite often when comparing two XML files.
Steps to reproduce (intermittent):

  1. Create test.xml with example content from here
  2. Copy file to test2.xml
  3. Select both files.
  4. Right-click and choose Compare With > Each Other.
  5. A variation of the following exception is raised:
java.lang.ArrayIndexOutOfBoundsException: Byte index 28 is out of range 0..25 of SingleByteString[string="    <genre>Horror</genre>
"]
	at org.eclipse.tm4e.core.internal.oniguruma.OnigString.throwOutOfBoundsException(OnigString.java:170)
	at org.eclipse.tm4e.core.internal.oniguruma.OnigString$SingleByteString.getCharIndexOfByte(OnigString.java:144)
	at org.eclipse.tm4e.core.internal.oniguruma.OnigScannerMatch.captureIndicesOfMatch(OnigScannerMatch.java:46)
	at org.eclipse.tm4e.core.internal.oniguruma.OnigScannerMatch.<init>(OnigScannerMatch.java:34)
	at org.eclipse.tm4e.core.internal.oniguruma.OnigScanner.findNextMatch(OnigScanner.java:39)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.matchRule(LineTokenizer.java:317)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.matchRuleOrInjections(LineTokenizer.java:329)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.scanNext(LineTokenizer.java:140)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.scan(LineTokenizer.java:131)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.tokenizeString(LineTokenizer.java:564)
	at org.eclipse.tm4e.core.internal.grammar.Grammar._tokenize(Grammar.java:342)
	at org.eclipse.tm4e.core.internal.grammar.Grammar.tokenizeLine(Grammar.java:259)
	at org.eclipse.tm4e.core.model.TMTokenizationSupport.tokenize(TMTokenizationSupport.java:81)
	at org.eclipse.tm4e.core.model.TMModel$TokenizerThread.revalidateTokens(TMModel.java:250)
	at org.eclipse.tm4e.core.model.TMModel$TokenizerThread.run(TMModel.java:173)

This does not happen every time. When it does, the affected line seems to differ each time, for example:

ERROR 2024-12-09 10:13:48.592 [tm4e.TokenizerThread] org.eclipse.tm4e.core.model.TMModel - java.lang.ArrayIndexOutOfBoundsException: Byte index 28 is out of range 0..23 of SingleByteString[string="    <price>4.95</price>
"]
ERROR 2024-12-09 10:37:46.569 [tm4e.TokenizerThread] org.eclipse.tm4e.core.model.TMModel - java.lang.ArrayIndexOutOfBoundsException: Byte index 34 is out of range 0..32 of SingleByteString[string="      <title>Lover Birds</title>
"]
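
The messages above can be cross-checked with a small sketch. It assumes each quoted line is indented with spaces exactly as shown and ends in a single "\n", and that the printed range is 0..(byte length - 1). The class and method names are hypothetical and are not part of tm4e. Under those assumptions, each reported byte index lies past the end of the line it is reported against. That might hint that the offset was computed for a different line, but nothing in this report confirms it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not tm4e code: recomputes the ranges printed in the log messages.
class ByteRangeCheck {

    /** Number of bytes the line occupies when encoded as UTF-8. */
    static int utf8Length(String line) {
        return line.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if byteIndex lies within 0..(byte length - 1), the range the messages print. */
    static boolean inReportedRange(String line, int byteIndex) {
        return byteIndex >= 0 && byteIndex <= utf8Length(line) - 1;
    }

    public static void main(String[] args) {
        // Lines and byte indices copied from the three exception messages in this issue.
        String[] lines = {
            "    <genre>Horror</genre>\n",
            "    <price>4.95</price>\n",
            "      <title>Lover Birds</title>\n"
        };
        int[] reportedIndex = {28, 28, 34};

        for (int i = 0; i < lines.length; i++) {
            System.out.printf("byteLength=%d reportedIndex=%d inRange=%b%n",
                    utf8Length(lines[i]), reportedIndex[i],
                    inReportedRange(lines[i], reportedIndex[i]));
        }
    }
}
```

Running `main` prints one line per log entry, which makes it easy to compare the computed byte length with the upper bound in each message.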
@sebthom added the bug label Dec 10, 2024