Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for C-Sharp #789

Open
dburriss opened this issue Feb 12, 2022 · 9 comments
Open

Add support for C-Sharp #789

dburriss opened this issue Feb 12, 2022 · 9 comments

Comments

@dburriss
Copy link
Contributor

I imagine we would use tree-sitter-c-sharp.

Since the Java metrics are growing I would like to add C# too. I imagine the implementations should be close to identical.

@marco-c
Copy link
Collaborator

marco-c commented Feb 14, 2022

Good idea!

@dburriss
Copy link
Contributor Author

So folding in lessons learned from #694 I am starting with some research and definitions. Thankfully C# has a fairly clear definition of a statement as well as a comprehensive list of the different types:

csharp statements

A cursory look through seems to indicate, unsurprisingly, that it is much the same as Java. I will raise it if I find any weird difference.

Should I create an issue per metric to track this or link PRs here?

@dburriss
Copy link
Contributor Author

dburriss commented Mar 19, 2022

So bumping into my first challenge here.
On this repo currently tree-sitter version dependency istree-sitter = ">= 0.19, < 0.21".
The source Cargo.toml for tree-sitter-c-sharp is tree-sitter = "0.19".
On crates.io the tree-sitter version is 0.17.

I guess this is the reason I get:

expected struct tree_sitter::Language, found a different struct tree_sitter::Language

when trying to add tree-sitter-c-sharp as a dependency?

I assume the fix here is to ask @dcreager, the owner of the crate, if it can be updated?

@dcreager
Copy link

Just published tree-sitter-c-sharp 0.19.1, which should unblock you if you're using tree-sitter 0.19.

If you're using tree-sitter 0.20, then I think you'll end up getting the same error, until tree-sitter/tree-sitter-c-sharp#219 (which I just opened) is merged. Once that gets a 👍 review I'll merge it and cut a new release of tree-sitter-c-sharp with the more inclusive tree-sitter dependency.

@dburriss
Copy link
Contributor Author

Thanks so much @dcreager 🙏
I'll test it out tomorrow but I think it should work.

@marco-c
Copy link
Collaborator

marco-c commented Mar 19, 2022

Should I create an issue per metric to track this or link PRs here?

Let's just use this one, maybe you can edit the first comment to add a list of metrics with checkmarks.

@dburriss
Copy link
Contributor Author

After adding tree_sitter_c_sharp to the nums crate I get the follwong compile error:

 Compiling enums v0.0.1 (C:\Users\devon\GitHub\rust-code-analysis\enums)
error[E0308]: mismatched types
  --> src\macros.rs:22:31
   |
20 |           pub fn get_language(lang: &Lang) -> Language {
   |                                               -------- expected `tree_sitter::Language` because of return type
21 |               match lang {
22 |                   Lang::Java => tree_sitter_java::language(),
   |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected struct `tree_sitter::Language`, found a different struct `tree_sitter::Language`
   |
  ::: src\languages.rs:3:1
   |
3  | / mk_langs!(
4  | |     // 1) Name for enum
5  | |     // 2) tree-sitter function to call to get a Language
6  | |     (Java, tree_sitter_java),
...  |
15 | |     (Javascript, tree_sitter_javascript)
16 | | );
   | |_- in this macro invocation
   |
   = note: perhaps two different versions of crate `tree_sitter` are being used?
   = note: this error originates in the macro `mk_get_language` (in Nightly builds, run with -Z macro-backtrace for more info)

I am not 100% sure what is going on here. Running cargo lock -d, tree-sitter version look like so:

- tree-sitter 0.19.3
  - cc 1.0.72
  - regex 1.4.6
- tree-sitter 0.20.2
  - cc 1.0.72
  - regex 1.4.6
- tree-sitter-ccomment 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-cpp 0.20.0
  - cc 1.0.72
  - tree-sitter 0.20.2
- tree-sitter-java 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-javascript 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-mozcpp 0.20.1
  - cc 1.0.72
  - tree-sitter 0.19.3
  - tree-sitter-cpp 0.20.0
- tree-sitter-mozjs 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
  - tree-sitter-javascript 0.19.0
- tree-sitter-preproc 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-python 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-rust 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-typescript 0.20.0
  - cc 1.0.72
  - tree-sitter 0.19.3

Then when I add tree-sitter-c-sharp = "0.19.1" to enums/Cargo.toml, the dependencies look like so:

- tree-sitter 0.19.3
  - cc 1.0.72
  - regex 1.4.6
- tree-sitter 0.20.2
  - cc 1.0.72
  - regex 1.4.6
- tree-sitter-c-sharp 0.19.1
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-ccomment 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-cpp 0.20.0
  - cc 1.0.72
  - tree-sitter 0.20.2
- tree-sitter-java 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-javascript 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-mozcpp 0.20.1
  - cc 1.0.72
  - tree-sitter 0.20.2
  - tree-sitter-cpp 0.20.0
- tree-sitter-mozjs 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
  - tree-sitter-javascript 0.19.0
- tree-sitter-preproc 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-python 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-rust 0.19.0
  - cc 1.0.72
  - tree-sitter 0.19.3
- tree-sitter-typescript 0.20.0
  - cc 1.0.72
  - tree-sitter 0.20.2

As you can see tree-sitter-mozcpp and tree-sitter-typescript have been bumped to using tree-sitter 0.20.2 instead of 0.19.3. Interestingly, tree-sitter-cpp was already on 0.20.2 but tree-sitter-mozcpp was not (but now it is).

So my understanding (with very little Rust experience) is that the enum crate is using the higher tree-sitter and then any of the lower ones are returning the wrong type. In the example above it is java but I confirmed this by moving others up to the top of the match. If I try constrain tree-sitter to 0.19 then typescript and cpp return the wrong type.

I actually ran into this issue trying to consume ``rust-code-analysisin another crate and ended up searching for a version of the crate where thetree-sitter` versions aligned.

So @marco-c I am going to need some help on how to resolve this:

  • Why did this work before? It seems to be purely due to when the lock file was generated? Or the lock file was edited manually? I confirmed this by deleting the lock file on ~HEAD and indeed, building throws the same error.
  • What is the alternative here? With my limited knowledge I guess it would be keeping the tree-sitter versions in sync and only upgrading in lock-step?

I see there is a ticket for upgrading tree-sitter to 0.20 #674 so maybe a known thing that this should be in lockstep?

Sorry about the wall of text if this is all know stuff but it helped me attempt to reason through it.

@dburriss
Copy link
Contributor Author

To try confirm the above I downgraded typescript and mozcpp and pinned the tree-sitter in enum to 0.19.3 and the enum project compiles again.

@marco-c
Copy link
Collaborator

marco-c commented Mar 21, 2022

I think we should try to have all grammars use the same tree-sitter version, otherwise we have too many complications. We should either update all to 0.20, or revert all to 0.19.
CC @Luni-4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants