How to define begin and end patterns which span multiple lines? #41
begin
and end
not match newlines?
I also tried running |
- newlines don't work in begin/end patterns, so there doesn't seem to be a way to match the intent: "start highlighting only when a line with `graphql request` is followed by `"""` on the next line, and end when the next set of `"""` is found". Ref: microsoft/vscode-textmate#41
@alexandrudima friendly ping |
Regular expressions in TextMate grammars will not match across multiple lines. This is a fundamental limitation of the format. The only way to match content across multiple lines is to chain |
@freebroccolo is correct. Each line (with |
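To make the limitation concrete, here is a minimal Python sketch. It is not the TextMate engine: it uses Python's `re` instead of Oniguruma, and it assumes the tokenizer hands each line to the regex separately with its trailing newline attached. The helper `lines_matching` and the sample text are made up for illustration.

```python
import re
from typing import List


def lines_matching(pattern: str, text: str) -> List[str]:
    """Return the lines on which `pattern` matches when scanned one line at a time."""
    regex = re.compile(pattern)
    # keepends=True keeps each line's trailing "\n" (an assumption about the tokenizer)
    return [line for line in text.splitlines(keepends=True) if regex.search(line)]


SOURCE = 'Given I make a graphql request\n  """\n  mutation { }\n  """\n'

# Needs text from the *next* line after the newline.
MULTI_LINE = r'graphql request\n\s+"""'
# Only looks at the current line.
SINGLE_LINE = r'graphql request\s*$'

# Against the whole text at once, the multi-line pattern does match.
matches_whole_text = re.search(MULTI_LINE, SOURCE) is not None
# Line by line, nothing follows the "\n", so it can never match.
multi_hits = lines_matching(MULTI_LINE, SOURCE)
# The single-line pattern is unaffected.
single_hits = lines_matching(SINGLE_LINE, SOURCE)

print(matches_whole_text, multi_hits, single_hits)
```

Under these assumptions, a pattern may consume the newline at the end of the current line, but anything it requires after that newline is simply not in the input.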
Do you happen to know what does the prefix |
I think @aeschli would know. |
The |
@aeschli I have a follow-up question on defining grammars to match across multiple lines. I've created a grammar for the gherkin syntax, which would match and highlight graphql syntax defined like so:
Given I make a graphql request
"""
mutation {
UserCreate(input: {...}) {
clientMutationId
}
}
"""
Then I expect the response
"""
{
"data": {...},
"errors": [...],
}
"""
The syntax definition should apply on matching the following conditions:
The syntax definition I tried was like this:
{
  "injectionSelector": "L:text -comment",
  "patterns": [
    {
      "begin": "graphql request\\s*$", // STAGE_1
      "patterns": [
        {
          "begin": "^\\s*(\"\"\")$", // STAGE_2
          "beginCaptures": {
            "1": { "name": "string.quoted.double.graphql.begin" }
          },
          "end": "^\\s*(\"\"\")$",
          "endCaptures": {
            "1": { "name": "string.quoted.double.graphql.end" }
          },
          "patterns": [
            { "include": "source.graphql" }
          ]
        }
      ]
    }
  ]
  ...
}
It highlights the graphql syntax correctly, but then the highlighter doesn't revert back to gherkin syntax after encountering the closing |
@kumarharsh Sorry, I'm no expert either, but I also believe that has to do with the missing end rule. |
I have seen that, but still can't make any sense of how to go about it. Since as soon as I define the
Given I make a graphql request // begin matches this
""" // while would end here(?)
mutation {
Guess I'll just drop this attempt here, as I feel the TextMate grammar is handicapped by default. |
- remove support for syntax starting with "graphql request" followed by a new line with `"""`. Seems impossible (microsoft/vscode-textmate#41 (comment)).
@kumarharsh For situations like this you have to make use of oniguruma lookarounds. Here is a modification of your original example using a lookbehind:
Alternatively you can use a lookahead (although I find this is usually a worse choice):
One pattern you will want to use to match multi-line syntactic constructs accurately is as follows. This is what I was referring to earlier about chaining "begin"/"end" matches. You might want to adapt your example to this style depending on how many stages there will be after the first quoted block.
|
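The snippets from the comment above are not preserved here, so the following is only a rough Python sketch of the chaining idea, not the commenter's grammar and not how the TextMate engine is implemented. Each stage looks at a single line, and the state carried between lines decides which stage may match next. The state names, the `scope_lines` helper, and the sample input are all hypothetical.

```python
import re
from typing import List, Tuple

REQUEST = re.compile(r'graphql request\s*$')  # stage 1: the introducing line
FENCE = re.compile(r'^\s*"""\s*$')            # stage 2 begin / end: a line with only """


def scope_lines(lines: List[str]) -> List[Tuple[str, str]]:
    """Label each line, carrying state across lines instead of matching across them."""
    state = "gherkin"
    scoped = []
    for line in lines:
        if state == "expect-fence" and not FENCE.match(line):
            # The chain is broken: the line after the request is not a fence.
            state = "gherkin"
        if state == "expect-fence":
            scoped.append((line, "fence"))
            state = "graphql"
        elif state == "graphql":
            if FENCE.match(line):
                scoped.append((line, "fence"))
                state = "gherkin"  # closing fence: revert to the host language
            else:
                scoped.append((line, "graphql"))
        else:
            scoped.append((line, "gherkin"))
            if REQUEST.search(line):
                state = "expect-fence"
    return scoped


FEATURE = [
    'Given I make a graphql request',
    '"""',
    'mutation { }',
    '"""',
    'Then I expect the response',
    '"""',
    '{ }',
    '"""',
]
labels = [scope for _, scope in scope_lines(FEATURE)]
print(labels)
```

In this sketch the second quoted block stays in the host scope because no `graphql request` line precedes it, which is the intent described earlier in the thread.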
Thanks a lot @freebroccolo. I was under the impression that lookbehinds were not supported by JS/TS. Didn't think of the second way though. The third example is great. I had misconstrued 'chaining' to mean 'nesting' 🤦♂️ |
Yeah, the regexp engine used for handling TextMate grammars is usually oniguruma so you have a lot more flexibility than you do with JS regexps. |
To clarify: for this pattern to work, it must be able to identify confidently, from the first line, that it applies? E.g. there is no way to do a markdown header where the current line could be a paragraph or could be an h1 or h2, and we can't know which until we see the next line. I attempted it, expecting that if the
$ git diff --cached
diff --git a/markdown.tmLanguage.base.yaml b/markdown.tmLanguage.base.yaml
index 15df966..bf78da0 100644
--- a/markdown.tmLanguage.base.yaml
+++ b/markdown.tmLanguage.base.yaml
@@ -3,12 +3,13 @@ keyEquivalent: ^~M
name: Markdown
patterns:
- {include: '#frontMatter'}
+- {include: '#heading-atx'}
+- {include: '#heading-setext'}
- {include: '#block'}
repository:
block:
patterns:
- {include: '#separator'}
- - {include: '#heading'}
- {include: '#blockquote'}
- {include: '#lists'}
- {include: '#fenced_code_block'}
@@ -22,6 +23,8 @@ repository:
'2': {name: punctuation.definition.quote.begin.markdown}
name: markup.quote.markdown
patterns:
+ - {include: '#heading-atx'}
+ - {include: '#heading-setext'}
- {include: '#block'}
while: (^|\G)\s*(>) ?
{{languageDefinitions}}
@@ -38,7 +41,7 @@ repository:
endCaptures:
'3': {name: punctuation.definition.markdown}
name: markup.fenced_code.block.markdown
- heading:
+ heading-atx:
match: (?:^|\G)[ ]{0,3}(#{1,6}\s+(.*?)(\s+#{1,6})?\s*)$
captures:
'1':
@@ -83,9 +86,12 @@ repository:
patterns:
- {include: '#inline'}
heading-setext:
- patterns:
- - {match: '^(={3,})(?=[ \t]*$\n?)', name: markup.heading.setext.1.markdown}
- - {match: '^(-{3,})(?=[ \t]*$\n?)', name: markup.heading.setext.2.markdown}
+ name: 'heading.1.markdown'
+ begin: (?:^|\G)(\w[^\n]*)$\n
+ beginCaptures: {'1': {name: entity.name.section.markdown}}
+ end: \G^(={3,})[ \t]*$\n?
+ endCaptures: {'1': {name: markup.heading.setext.1.markdown}}
+ patterns: [{match: '(?<=^={3,}[ \t]*$\n?)\G'}]
html:
patterns:
- begin: (^|\G)\s*(<!--)
@@ -154,7 +160,6 @@ repository:
patterns:
- {include: '#inline'}
- {include: text.html.derivative}
- - {include: '#heading-setext'}
while: (^|\G)(?!\s*$|#|[ ]{0,3}([-*_>][ ]{2,}){3,}[ \t]*$\n?|[ ]{0,3}[*+->]|[
]{0,3}[0-9]+\.)
lists:
@@ -182,7 +187,6 @@ repository:
patterns:
- {include: '#inline'}
- {include: text.html.derivative}
- - {include: '#heading-setext'}
while: (^|\G)((?=\s*[-=]{3,}\s*$)|[ ]{4,}(?=\S))
raw_block: {begin: '(^|\G)([ ]{4}|\t)', name: markup.raw.block.markdown, while: '(^|\G)([
]{4}|\t)'}
I'm guessing it's impossible? 😞 Specifically, on "this is a paragraph", there is just no way to handle that.
$ cat test/colorize-fixtures/h1.md
nice 1
======
* list 1
* list 2
this is a paragraph
still just a paragraph
# shitty 1
nice 2
------
## shitty 2
$ jq < test/colorize-results/h1_md.json 'map({c, t})[]' -c
{"c":"nice 1","t":"text.html.markdown heading.1.markdown entity.name.section.markdown"}
{"c":"======","t":"text.html.markdown heading.1.markdown markup.heading.setext.1.markdown"}
{"c":"*","t":"text.html.markdown markup.list.unnumbered.markdown punctuation.definition.list.begin.markdown"}
{"c":" ","t":"text.html.markdown markup.list.unnumbered.markdown"}
{"c":"list 1","t":"text.html.markdown markup.list.unnumbered.markdown meta.paragraph.markdown"}
{"c":"*","t":"text.html.markdown markup.list.unnumbered.markdown punctuation.definition.list.begin.markdown"}
{"c":" ","t":"text.html.markdown markup.list.unnumbered.markdown"}
{"c":"list 2","t":"text.html.markdown markup.list.unnumbered.markdown meta.paragraph.markdown"}
{"c":"this is a paragraph","t":"text.html.markdown heading.1.markdown entity.name.section.markdown"}
{"c":"still just a paragraph","t":"text.html.markdown heading.1.markdown"}
{"c":"# shitty 1","t":"text.html.markdown heading.1.markdown"}
{"c":"nice 2","t":"text.html.markdown heading.1.markdown"}
{"c":"------","t":"text.html.markdown heading.1.markdown"}
{"c":"## shitty 2","t":"text.html.markdown heading.1.markdown"} |
Yes, and that is actually a very succinct way of stating the fundamental limitation of the TextMate engine (not VS Code's implementation). Tree-sitter (used by Atom) was created precisely to solve this limitation. |
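For contrast, here is a small Python sketch of what the setext case needs: one line of lookahead. This is a deliberately simplified classifier (it ignores lists, blockquotes, and the rest of Markdown), and the function name and labels are invented for illustration. The point is only that the decision for a line depends on `nxt`, which a TextMate rule never gets to see.

```python
import re
from typing import List

UNDERLINE = re.compile(r'^(=+|-+)[ \t]*$')  # simplified setext underline


def classify_with_lookahead(lines: List[str]) -> List[str]:
    """Label lines as heading / underline / paragraph by peeking at the next line."""
    labels: List[str] = []
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if UNDERLINE.match(line) and i > 0 and labels[i - 1] == "heading":
            labels.append("underline")
        elif line.strip() and not UNDERLINE.match(line) and UNDERLINE.match(nxt):
            # Only the *next* line tells us this text is a heading.
            labels.append("heading")
        else:
            labels.append("paragraph")
    return labels


SAMPLE = [
    "nice 1",
    "======",
    "this is a paragraph",
    "still just a paragraph",
    "nice 2",
    "------",
]
result = classify_with_lookahead(SAMPLE)
print(result)
```

A line-at-a-time engine has to commit to a scope for "this is a paragraph" before it sees the following line, which is why the attempt above leaves the heading scope open.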
Thanks for confirming 🙏 For any future readers, note that there is apparently an addition to the TextMate grammar, called Semantic Highlighting. I haven't looked into it yet, but it is introduced like this:
|
TextMate grammar cannot match across multiple lines. Use the workaround suggested in this issue: microsoft/vscode-textmate#41
Syntax highlight-level grammar support for: + string/int/float/bool constants + local/global/system variables + function call foundation + trans tag + line comments - multiline comments don't work - refer to microsoft/vscode-textmate#41 (comment)
See microsoft/vscode-textmate#41 (comment) for multi line matching
+ single-line multiline comments are properly highlighted + multiline comments are still broken likely due to textmate's limitation (see microsoft/vscode-textmate#41)
Hi, I'm trying to expand the support for my plugin, graphql-for-vscode, to support gherkin feature files, but I'm stuck while defining the grammar. The `begin` regexp seems to not do any matching when I specify a newline (\n).
Specifically:
With this syntax definition:
I don't get any matches:
but modifying the source text to bring the beginning `"""` to the same line as `graphql request` does the trick:
I tried modifying the begin regexp to be:
"begin": "graphql request\\n\\s+\"\"\"",
but it didn't help - in fact, it stopped highlighting anything within quotes.
I've spent some time browsing other syntaxes in vscode and textmate, but could only find `\n` used in `match` regexp sections, and none yet in `begin` or `end` sections.