Failing to parse some sublime-syntax files #156
Comments
Yah the fact that syntect panics on invalid regexes is unfortunate, and should probably be fixed for 3.0 when we next make breaking API changes. It may require a lot of bubbling though. As for the specific errors, I wonder if we're using an older version of Oniguruma than Sublime or a newer one. Because Sublime should be parsing these using the same regex parser we are.
Regarding the first problem, "Invalid pattern in look-behind": The regex But, I would recommend that you switch to "newlines" mode. To do that, do the following changes:
Some background: The regexes in Sublime Text are all written assuming that the lines that they match on include the trailing
@trishume: With regard to panicking, can we try to compile the regexes when loading from YAML (just to check them)? Then loading would fail, instead of the failure only showing up later when the regexes are used for parsing. I think that would reduce the bubbling a bit.
@robinst yah that would help. But the trick there is regexes with capture interpolations. There I guess you have to compile them with some placeholder such that if that compiles then any validly escaped interpolation will compile.
Yeah, replacing them with placeholders sounds good.
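A rough sketch of the placeholder idea, using only the Rust standard library. It is not syntect's actual implementation: the function name `substitute_backrefs` and the choice of placeholder are hypothetical, and the sketch assumes that every backslash followed by digits is a capture interpolation. The returned string would then be handed to the regex engine at load time, and a compile error reported instead of a later panic.

```rust
/// Replace capture interpolations (`\1`, `\2`, ...) in a regex source string
/// with a fixed placeholder, so the pattern can be compile-checked on load.
/// Hypothetical helper for illustration; not part of syntect's API.
fn substitute_backrefs(pattern: &str, placeholder: &str) -> String {
    let mut out = String::with_capacity(pattern.len());
    let mut chars = pattern.chars().peekable();

    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.peek().copied() {
            // Backslash followed by a digit: assumed to be an interpolation.
            Some(d) if d.is_ascii_digit() => {
                // Consume the whole run of digits (`\12` is one reference).
                while chars.peek().map_or(false, |d| d.is_ascii_digit()) {
                    chars.next();
                }
                out.push_str(placeholder);
            }
            // Any other escape (including `\\`) is copied through unchanged,
            // so an escaped backslash followed by a digit is left alone.
            Some(next) => {
                out.push('\\');
                out.push(next);
                chars.next();
            }
            // A lone trailing backslash is kept as-is; the regex compiler
            // can report it as an error.
            None => out.push('\\'),
        }
    }
    out
}
```

For the look-behind from this issue, `substitute_backrefs(r"(?<=^</\1>$\n)", "placeholder")` yields `(?<=^</placeholder>$\n)`, which can be compiled as a check. A placeholder made only of word characters needs no escaping, but whether it is representative enough for every validly escaped interpolation is an open question.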
Oh, that's why I had problems finding that regular expression initially! I didn't remember that "no-newlines" option, thank you very much for pointing this out (and also for the detailed instructions - I will try this!).
Two things to (possibly) be aware of
The idea is that it would only impact loading from YAML files, which I'd recommend doing infrequently anyway. After that, you can serialize the syntax definition (using bincode) as a cache. Later you load the cached version, which does not run the checks.
You'll probably find that the default Markdown syntax definition is actually much better than all the old third party ones these days, and will likely have none of these compatibility problems. From a quick glance at the readme of Markdown Extended, the only feature you'd not get is the YAML "front matter" (and only the official SublimeHQ syntaxes would be highlighted in fenced code blocks).
For single lines without newlines, the end result is the same. The advantage of `$` is that it can be used in lookbehinds. `\z` in a lookbehind results in a regex compilation error, so rewriting would need to be more complicated. See #156.
Created a PR so that the regex
@keith-hall Thank you very much for the feedback. I tried the default Markdown syntax definition but some of the highlighting is done with background colors (which I don't want to use in the terminal / my application) and I didn't get the syntax highlighting in code blocks to work(?). With #157 merged, I think this can be closed. My second problem ("Target of repeat operator is invalid") is already addressed somewhere else and the
I'm currently using `syntect` with a custom `SyntaxSet` which is compiled from several sources.

I have experienced a few problems with some syntax files due to errors in the regex parsing (unfortunately resulting in a `panic!`, see #98).

I'm not sure if there is something that can be done about this here, but I thought it would be good to share (will #34 change something in this respect?).
Invalid pattern in look-behind
I downloaded an advanced Markdown syntax definition from https://github.com/jonschlinkert/sublime-markdown-extended and it results in a

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error(-122, invalid pattern in look-behind)', libcore/result.rs:945:5

when using it with an input like this:

I tracked the error down to this section and this regular expression in particular: `(?<=^</\1>$\n)`. I was able to fix the parsing error by removing the trailing `\n` in the parens, but I'm not sure if this is the right way to go.

Target of repeat operator is invalid
The current version of the Javascript.sublime-syntax in https://github.com/sublimehq/Packages fails with `target of repeat operator is invalid` on any JavaScript file, for example:

I tracked the error down to this section and the `dollar_identifier` regex in particular. The last part of this regex is `{{identifier_break}}+`, which expands to `(?!...)+`. The `+` here is superfluous and I have opened a pull request to remove it, but it would be great if `syntect` would not panic in such a case.