Unsupported regex in md-math-block.tmLanguage.json #525

Closed
jensli opened this issue May 3, 2023 · 10 comments

@jensli
Contributor

jensli commented May 3, 2023

This regex in md-math-block.tmLanguage.json:

"begin": "(?<=^\\s*)(\\${2})(?![^$]*\\${2})",

...is not supported by Joni.

The result is that when the line gets selected in Preferences > TextMate > Grammars, the following exception is logged:

SEVERE: org.eclipse.tm4e.core.TMException: Parsing regex pattern "(?<=^\s*)(\${2})(?![^$]*\${2})" failed with org.joni.exception.SyntaxException: invalid pattern in look-behind

Stack trace:

	at org.eclipse.tm4e.core.internal.oniguruma.OnigRegExp.<init>(OnigRegExp.java:60)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:720)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575)
	at java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
	at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616)
	at java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622)
	at java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627)
	at org.eclipse.tm4e.core.internal.oniguruma.OnigSearcher.<init>(OnigSearcher.java:33)
	at org.eclipse.tm4e.core.internal.oniguruma.OnigScanner.<init>(OnigScanner.java:32)
	at org.eclipse.tm4e.core.internal.rule.CompiledRule.<init>(CompiledRule.java:37)
	at org.eclipse.tm4e.core.internal.rule.RegExpSourceList.compile(RegExpSourceList.java:77)
	at org.eclipse.tm4e.core.internal.rule.RegExpSourceList.compileAG(RegExpSourceList.java:84)
	at org.eclipse.tm4e.core.internal.rule.IncludeOnlyRule.compileAG(IncludeOnlyRule.java:57)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.matchRule(LineTokenizer.java:314)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.matchRuleOrInjections(LineTokenizer.java:328)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.scanNext(LineTokenizer.java:137)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.scan(LineTokenizer.java:128)
	at org.eclipse.tm4e.core.internal.grammar.LineTokenizer.tokenizeString(LineTokenizer.java:564)
	at org.eclipse.tm4e.core.internal.grammar.Grammar._tokenize(Grammar.java:341)
	at org.eclipse.tm4e.core.internal.grammar.Grammar.tokenizeLine(Grammar.java:258)
	at org.eclipse.tm4e.core.model.TMTokenization.tokenize(TMTokenization.java:72)
	at org.eclipse.tm4e.core.model.TMModel$TokenizerThread.updateTokensOfLine(TMModel.java:169)
	at org.eclipse.tm4e.core.model.TMModel$TokenizerThread.lambda$0(TMModel.java:125)
	at org.eclipse.tm4e.core.model.TMModel.buildAndEmitEvent(TMModel.java:269)
	at org.eclipse.tm4e.core.model.TMModel$TokenizerThread.revalidateTokens(TMModel.java:121)
	at org.eclipse.tm4e.core.model.TMModel$TokenizerThread.run(TMModel.java:101)
Caused by: org.joni.exception.SyntaxException: invalid pattern in look-behind
	at org.joni.ScannerSupport.newSyntaxException(ScannerSupport.java:163)
	at org.joni.Analyser.setupLookBehind(Analyser.java:1404)
	at org.joni.Analyser.setupTree(Analyser.java:2004)
	at org.joni.Analyser.setupTree(Analyser.java:1825)
	at org.joni.Analyser.compile(Analyser.java:110)
	at org.joni.Regex.<init>(Regex.java:155)
	at org.joni.Regex.<init>(Regex.java:134)
	at org.eclipse.tm4e.core.internal.oniguruma.OnigRegExp.<init>(OnigRegExp.java:58)
	... 28 more
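
For illustration, here is a minimal sketch of one possible rewrite that avoids the look-behind altogether. It is an assumption, not the upstream fix: the `(?<=^\s*)` prefix becomes a plain `^\s*`, so the leading whitespace is part of the overall match while group 1 is still the `$$`. The sketch uses `java.util.regex` only to show the matching behaviour; the class and method names are hypothetical, and whether the changed match extent is acceptable for the grammar's scopes would still need checking against Joni and the grammar itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MathBlockBegin {
    // Hypothetical rewrite: the look-behind (?<=^\s*) is replaced by a plain ^\s* prefix.
    // The leading whitespace is now consumed by the match; group 1 is still the "$$".
    static final Pattern BEGIN = Pattern.compile("^\\s*(\\${2})(?![^$]*\\${2})");

    // Returns the offset of the opening "$$" in the line, or -1 if the line does not open a block.
    static int openingOffset(String line) {
        Matcher m = BEGIN.matcher(line);
        return m.find() ? m.start(1) : -1;
    }

    public static void main(String[] args) {
        System.out.println(openingOffset("  $$"));         // prints 2
        System.out.println(openingOffset("$$ x = 1 $$"));  // prints -1 (closed on the same line)
        System.out.println(openingOffset("text $$"));      // prints -1 (not at line start)
    }
}
```

The trade-off is that the match now starts at the beginning of the line rather than at the `$$`, so any scope applied to the whole begin match would also cover the indentation.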
@jensli
Contributor Author

jensli commented May 3, 2023

The solution is probably to remove the "Markdown Math" grammars in plugin.xml:

<!-- markdown-math -->
<extension point="org.eclipse.tm4e.registry.grammars">
<grammar scopeName="lngpck.text.html.markdown.math" path="markdown-math/md-math.tmLanguage.json"/>
<scopeNameContentTypeBinding scopeName="lngpck.text.html.markdown.math" contentTypeId="lng.markdown-math"/>
<grammar scopeName="lngpck.markdown.math.block" path="markdown-math/md-math-block.tmLanguage.json"/>
<grammar scopeName="lngpck.markdown.math.inline" path="markdown-math/md-math-inline.tmLanguage.json"/>
</extension>

@jensli
Contributor Author

jensli commented May 3, 2023

Also, TMModel should probably log the whole exception, not just the message:

} catch (final Exception ex) {
LOGGER.log(ERROR, ex.toString());
return UpdateTokensOfLineResult.UPDATE_FAILED;
}
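
A minimal sketch of what logging the whole exception could look like, assuming `LOGGER` is a `java.lang.System.Logger` (the snippet above suggests this, but it is not confirmed here). The helper class and method names are hypothetical; in `TMModel` the call would simply sit inside the existing catch block.

```java
import java.lang.System.Logger;
import java.lang.System.Logger.Level;

class TokenizeFailureLogging {
    // Hypothetical helper standing in for the body of the catch block.
    static void logFailure(Logger logger, Exception ex) {
        // Logger.log(Level, String, Throwable) hands the throwable itself to the backend,
        // so the stack trace and the "Caused by" chain can be recorded, not just the message.
        logger.log(Level.ERROR, ex.toString(), ex);
    }

    public static void main(String[] args) {
        Logger logger = System.getLogger("tm4e.sketch"); // logger name is an assumption
        Exception cause = new IllegalStateException("invalid pattern in look-behind");
        logFailure(logger, new RuntimeException("Parsing regex pattern failed", cause));
    }
}
```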

@jensli
Contributor Author

jensli commented May 3, 2023

Also, I think tm4e should handle unsupported regexes and other problems with the grammar files more gracefully. Instead of just logging the error, it should display an error dialog that explains the problem.

@mickaelistria
Contributor

all the proposals would be welcome as pull requests ;)

@sebthom
Member

sebthom commented May 5, 2023

@jensli Maybe this should be reported in the upstream repo instead: https://github.com/microsoft/vscode/blob/main/extensions/markdown-math/syntaxes/md-math-block.tmLanguage.json#L16

If joni does not support this, then this is probably not a valid Oniguruma style regex, which is required by textmate syntax. https://macromates.com/manual/en/regular_expressions#syntax_oniguruma

@jensli
Contributor Author

jensli commented May 5, 2023

@sebthom Good idea. I will check.

Also: I plan to provide a PR disabling the invalid grammar.

Also: I'll have a look and try to improve error handling in general. Since all loading is lazy, it is hard to find a good place to report problems. Also, the problem with this grammar is re-encountered and reported many times a second in the loop by the update thread. I'll try to do something about that.
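
One way to stop the same failure from being reported on every pass might be to remember which patterns have already been reported. The following is only a sketch under that assumption; `OncePerPatternReporter` and `shouldReport` are hypothetical names, not existing tm4e API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class OncePerPatternReporter {
    // Patterns that have already been reported; the set is safe to share between threads.
    private final Set<String> reported = ConcurrentHashMap.newKeySet();

    // Returns true only the first time a given pattern is seen, so the caller
    // can log (or show a dialog) once instead of on every tokenizer pass.
    boolean shouldReport(String regexSource) {
        return reported.add(regexSource);
    }

    public static void main(String[] args) {
        OncePerPatternReporter reporter = new OncePerPatternReporter();
        String bad = "(?<=^\\s*)(\\${2})(?![^$]*\\${2})";
        for (int i = 0; i < 3; i++) {
            if (reporter.shouldReport(bad)) {
                System.err.println("Unsupported regex: " + bad); // reached once only
            }
        }
    }
}
```

Keying on the regex source keeps the sketch simple; keying on grammar scope name plus pattern would be an alternative if the same pattern can appear in several grammars.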

@angelozerr
Contributor

If joni supports it, it could be interesting to see if vscode supports it too. As tm4e is a port of vscode-textmate, perhaps we should upgrade it again.

@sebthom
Member

sebthom commented May 5, 2023

@angelozerr we are using the latest joni release and are in sync with latest vscode-textmate changes.

@jensli
Contributor Author

jensli commented May 5, 2023

Oniguruma also rejects the regex, with exactly the same error message:

(?<=^\\s*)(\\${2})(?![^$]*\\${2})
=> jdoodle.rb:1: invalid pattern in look-behind: /(?<=^\\s*)(\\${2})(?![^$]*\\${2})/

Tested at jdoodle.com using Ruby 3.0.2.

I will file a bug upstream.

Does anyone know which regex implementation Visual Studio Code uses?

@sebthom
Member

sebthom commented May 5, 2023

jensli added a commit to jensli/tm4e that referenced this issue May 6, 2023
jensli added a commit to jensli/tm4e that referenced this issue May 6, 2023
jensli added a commit to jensli/tm4e that referenced this issue May 6, 2023
jensli added a commit to jensli/tm4e that referenced this issue May 7, 2023
@sebthom sebthom closed this as completed May 25, 2023