This repository has been archived by the owner on Jul 15, 2023. It is now read-only.

Use tree-sitter for syntax coloring #2555

Closed
wants to merge 3 commits into from


georgewfraser

This PR uses tree-sitter to replace VSCode's builtin syntax coloring for Go. Tree-sitter is an incremental parsing framework, developed within GitHub and used by the Atom editor as a replacement for TextMate grammars. It produces full parse trees, but it's efficient and incremental, so the parse tree can be updated on every keystroke.
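To make "incremental" concrete: as I understand it, an incremental parser is told which span of text changed and reuses the rest of the old tree. The sketch below only computes such a change description for a single replacement. The `Edit` field names mirror what I believe tree-sitter's JavaScript bindings expect, and `pointAt` and `describeEdit` are hypothetical helpers written for this illustration, so treat all of it as an assumption rather than as the code in this PR.

```typescript
// Position as (row, column), both zero-based.
interface Point {
  row: number;
  column: number;
}

// Assumed shape of an edit descriptor; check the field names against the
// tree-sitter bindings you actually use.
interface Edit {
  startIndex: number;
  oldEndIndex: number;
  newEndIndex: number;
  startPosition: Point;
  oldEndPosition: Point;
  newEndPosition: Point;
}

// Convert a character offset into a (row, column) position by counting newlines.
function pointAt(text: string, index: number): Point {
  let row = 0;
  let lineStart = 0;
  for (let i = 0; i < index; i++) {
    if (text[i] === "\n") {
      row++;
      lineStart = i + 1;
    }
  }
  return { row, column: index - lineStart };
}

// Describe replacing `deleted` characters at offset `start` with `inserted`.
function describeEdit(
  oldText: string,
  start: number,
  deleted: number,
  inserted: string
): { edit: Edit; newText: string } {
  const newText =
    oldText.slice(0, start) + inserted + oldText.slice(start + deleted);
  const edit: Edit = {
    startIndex: start,
    oldEndIndex: start + deleted,
    newEndIndex: start + inserted.length,
    startPosition: pointAt(oldText, start),
    oldEndPosition: pointAt(oldText, start + deleted),
    newEndPosition: pointAt(newText, start + inserted.length),
  };
  return { edit, newText };
}

// Example: the user replaces the literal `1` with `42`.
const demoOldText = "package main\n\nvar x = 1\n";
const demoChange = describeEdit(demoOldText, 22, 1, "42");
```

With the real bindings, a descriptor like `demoChange.edit` would be applied to the old tree before reparsing the new text against it, which is what keeps per-keystroke updates cheap.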

I originally created an independent extension that does this for several languages, but I think it makes more sense for each language-specific extension to be responsible for syntax coloring, so I've repackaged it as an NPM library.

The basic strategy is:

This strategy allows more accurate coloring of types, and it makes it possible to color based on scope. Notice how top-level vars are blue, but local vars are not:

Screen Shot 2019-06-02 at 10 45 56 AM

This works correctly when locals shadow top-level vars:

shadow mov
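A minimal sketch of how scope-based coloring with shadowing can work, assuming a heavily simplified tree. `SketchNode`, `Block`, and `classifyUses` are hypothetical names invented for this illustration; real tree-sitter nodes look different, and this is not the implementation in this PR. It also assumes a name is declared before it is used, which Go does not require for top-level declarations.

```typescript
// Hypothetical, simplified tree: blocks open a scope, and leaves either
// declare a name or use one.
interface Block {
  kind: "block";
  children: SketchNode[];
}
type SketchNode =
  | Block
  | { kind: "declare"; name: string }
  | { kind: "use"; name: string };

type Color = "topLevel" | "local";

// Walk the tree with a stack of scopes and classify each use of a name.
function classifyUses(file: Block): Color[] {
  const scopes: Set<string>[] = [];
  const result: Color[] = [];

  function visit(node: SketchNode): void {
    switch (node.kind) {
      case "block":
        scopes.push(new Set<string>());
        node.children.forEach((child) => visit(child));
        scopes.pop();
        break;
      case "declare":
        // The root is always a block, so there is at least one open scope.
        scopes[scopes.length - 1].add(node.name);
        break;
      case "use":
        // Search the innermost scope first so locals shadow top-level names.
        for (let depth = scopes.length - 1; depth >= 0; depth--) {
          if (scopes[depth].has(node.name)) {
            result.push(depth === 0 ? "topLevel" : "local");
            return;
          }
        }
        // Names with no visible declaration are skipped in this sketch.
        break;
    }
  }

  visit(file);
  return result;
}

// Roughly models:
//   var x = 1
//   func f() { _ = x; x := 2; _ = x }
//   func g() { _ = x }
const sketchFile: Block = {
  kind: "block",
  children: [
    { kind: "declare", name: "x" },
    {
      kind: "block",
      children: [
        { kind: "use", name: "x" },
        { kind: "declare", name: "x" },
        { kind: "use", name: "x" },
      ],
    },
    { kind: "block", children: [{ kind: "use", name: "x" }] },
  ],
};
const sketchColors = classifyUses(sketchFile);
```

Here the first use in `f` resolves to the top-level `x`, the second resolves to the shadowing local, and the use in `g` resolves to the top-level `x` again. A token-by-token grammar cannot make that distinction, because it has no notion of scope.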

We can also do fancy things like underline mutable vars:

mutable mov

In the future, the tree produced by tree-sitter could be combined with semantic information from Go language server to provide even more accurate coloring, for example coloring constants differently even when they are in other packages.

The performance-critical section is the JavaScript function that walks the visible part of the tree produced by tree-sitter and applies colors. Note that this doesn't need to update every frame, because the setDecorations API is designed to be used asynchronously and VSCode will "patch up" small edits even when the decorations are slightly out-of-date. In practice, coloring takes about 2 ms on both small and large files:

Screen Shot 2019-06-02 at 12 47 11 PM

Large file:

Screen Shot 2019-06-02 at 12 49 00 PM
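One plausible reason the timing is similar for small and large files is that subtrees entirely outside the visible rows can be skipped without descending into them. The sketch below shows that pruning on a hypothetical `RangeNode` shape. The node type strings are what I believe the Go grammar uses, and `collectVisibleTypes` is an illustrative name, not the function in this PR.

```typescript
// Hypothetical minimal node: a type name plus the rows it spans (inclusive).
interface RangeNode {
  type: string;
  startRow: number;
  endRow: number;
  children: RangeNode[];
}

interface WalkResult {
  typeRows: number[]; // rows where a type identifier should be colored
  visited: number; // nodes actually inspected, to show the effect of pruning
}

// Collect type identifiers in the visible rows, skipping subtrees that lie
// entirely above or below the viewport.
function collectVisibleTypes(
  root: RangeNode,
  firstVisible: number,
  lastVisible: number
): WalkResult {
  const typeRows: number[] = [];
  let visited = 0;
  const stack: RangeNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.endRow < firstVisible || node.startRow > lastVisible) {
      continue; // the whole subtree is off-screen
    }
    visited++;
    if (node.type === "type_identifier") {
      typeRows.push(node.startRow);
    }
    // Push children in reverse so they are visited in document order.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push(node.children[i]);
    }
  }
  return { typeRows, visited };
}

// Build a ten-row function whose second row contains one type identifier.
function sketchFunction(startRow: number): RangeNode {
  return {
    type: "function_declaration",
    startRow,
    endRow: startRow + 9,
    children: [
      {
        type: "type_identifier",
        startRow: startRow + 1,
        endRow: startRow + 1,
        children: [],
      },
    ],
  };
}

const sketchSource: RangeNode = {
  type: "source_file",
  startRow: 0,
  endRow: 1009,
  children: [sketchFunction(0), sketchFunction(10), sketchFunction(1000)],
};
const sketchVisible = collectVisibleTypes(sketchSource, 10, 19);
```

With rows 10 to 19 visible, only the root, the middle function, and its type identifier are inspected; the other two functions are rejected at their top node. In an extension, the collected positions would then be turned into ranges and handed to setDecorations.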

If you want to check out the source code of the vscode-tree-sitter npm library, it is published from the npm branch of my repo.

@msftclas

msftclas commented Jun 2, 2019

CLA assistant check
All CLA requirements met.

@georgewfraser
Author

@ramya-rao-a Any thoughts about this? I've been using it for a while myself and it's a big improvement over the TextMate syntax coloring.

@oneslash
Contributor

@ramya-rao-a let's push this please, it is a great improvement for vscode-go

@georgewfraser
Author

@ramya-rao-a You commented on Twitter "I have been sitting on that for a while now, still trying to make up my mind if the Go extension is the right place for it or a separate extension :("

The official CPP extension has replaced VSCode's builtin TextMate-based coloring with a custom syntax colorizer that uses the setDecorations API, similar to this PR. The approach is slightly different: vscode-cpptools uses the actual C++ parser, while this PR uses a tree-sitter parser for Go. The advantage of tree-sitter is that it's incremental, so it's easy to get as-you-type updates to the coloring.

I think it would make sense for vscode-go to also override the builtin syntax coloring. Perhaps @sean-mcmanus has some comment?

@sean-mcmanus

sean-mcmanus commented Oct 3, 2019

@georgewfraser The C/C++ extension has not replaced the built-in TextMate coloring -- we still rely on TextMate for fast syntactic/lexical colorization, but then use the setDecorations to add "semantic" colorization on top of that. However, our approach has several inherent limitations/bugs that require additional APIs from VS Code to fix. Also, early versions did have lexical colorization that replaced the TextMate colors, but the performance using decorations was too poor and led to white space in comments being colored green, etc.

A clone of the Atom TextMate Go repo with a bug fix is at https://github.com/jeff-hykin/better-go-syntax if VS Code wants to switch to using that instead.

@georgewfraser
Author

we still rely on TextMate for fast syntactic/lexical colorization, but then use the setDecorations to add "semantic" colorization on top of that

Thanks for clarifying, @sean-mcmanus. This is actually the same strategy used in this PR: basic tokens are colored using a simplified TextMate grammar, and only tricky things like types are colored using setDecorations.

@isavcic

isavcic commented Feb 3, 2020

Any updates on this PR?

@georgewfraser
Author

There are things I can do to simplify this, and to put it behind an option flag, but I'd like some signal from @ramya-rao-a that she actually intends to merge it before I put in additional effort.

@ramya-rao-a
Contributor

Hello all,

First off, thanks for your patience and my apologies for not replying sooner.

While I do see the merits of tree-sitter, I don't believe this extension is the right place for the tree-sitter based Go grammar anymore.

As most of you already know, this extension does not own or maintain the grammar that powers the syntax highlighting for Go files, which comes out of the box with VS Code. That grammar was previously based on https://github.com/atom/language-go and is now moving towards https://github.com/jeff-hykin/better-go-syntax/, as discussed in microsoft/vscode#82549

Therefore, it makes sense for any grammar changes to come from the same source, or at the very least from a separate extension.

Regardless, as the sole maintainer of this extension, I have neither the expertise, the time, nor the resources to manage this feature if this PR were to be merged. Therefore, at this stage, I strongly believe that the independent extension that @georgewfraser has is the right way to go for now.

An alternative is to get traction on the upstream issue in VS Code microsoft/vscode#50140 which is tracking tree-sitter support in general in VS Code. cc @aeschli

Thanks again for your patience and support.

Closing this PR :(

@ramya-rao-a ramya-rao-a closed this Feb 4, 2020
6 participants