Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to Roslyn typesystem? #896

Open
dgrunwald opened this issue Oct 3, 2017 · 5 comments
Open

Migrate to Roslyn typesystem? #896

dgrunwald opened this issue Oct 3, 2017 · 5 comments
Labels
Decompiler The decompiler engine itself

Comments

@dgrunwald
Copy link
Member

The newdecompiler branch just migrated ILSpy from Cecil (pretty much no type system at all), to the NRefactory type system.
We picked NRefactory.TypeSystem because we knew it, and because all the logic we need is publicly available there (whereas with Roslyn, we'll likely need to fork it and access its internals).
However, NRefactory only supported up to C# 5, and even though we stripped it down to only the portions we need and integrated them into ICSharpCode.Decompiler, it would still be a lot of effort to upgrade the following parts to later C# versions:

  • TypeSystem -- used to represent the C# type of expressions we generate. Most importantly, lacks support for C# 7 tuple types.
  • Conversions -- ILSpy uses this to detect whether an implicit conversion is possible, or whether a cast needs to be inserted.
  • Overload resolution -- ILSpy needs to be able to tell whether a given method call will call the correct overload, or whether we'll need to insert casts. relevant testcases
  • Type inference -- used as part of overload resolution to figure out whether we need to explicitly specify the type arguments for a method call.
  • Member lookup -- used to discover the method group for overload resolution.

The NRefactory API turned out to be great for this, because it allows all of these operations to be performed independently of the syntax tree. The Roslyn APIs seem to be less suitable for the case where we have already-bound subexpressions and want to determine how to generate an invocation expression so that it binds to the correct method: unlike NRefactory where CSharpResolver.ResolveInvocation() takes already-bound arguments as inputs, the equivalent Roslyn Binding.BindInvocationExpression() will itself recursively call BindExpression() on the arguments.
Although this might not be an issue if there's a cache that lets us avoid re-binding the argument over and over again for each parent InvocationExpression we try.

So the big question is: do we keep maintaining our own implementation of a significant portion of the C# spec, or do we fork+patch Roslyn to be usable in ILSpy, and migrate everything over?
If we just went for C# 7 compatibility it's probably not too hard to keep maintaining NRefactory, especially since our own implementation of overload resolution doesn't have to be perfect -- if it emits a false positive compiler error, we'll just emit a few redundant casts.
But at some point we might try to revive the VB decompiler, and we certainly don't want to reimplement a big chunk of the VB spec using the NR typesystem.

@Pilchie
Copy link

Pilchie commented Oct 3, 2017

Without commenting on the difficulty level, I'd love to see a system based on Roslyn. We've been discussing building decompilation into VS, with a plan to do so by levering ILSpy. One of the concerns has been what happens when we release a new language feature that ILSpy doesn't support. Moving ILSpy closer to Roslyn means that it would likely be easier to update. Who knows, maybe Roslyn team members would even contribute updates to support new language features.

@aelij
Copy link

aelij commented Oct 4, 2017

Just a suggestion, if you don’t mind relying on internals, you can use https://github.com/aelij/IgnoresAccessChecksToGenerator

This is what I use to build the RoslynPad.Roslyn assembly.

@dgrunwald
Copy link
Member Author

On the syntax side of things:

  • Roslyn has better options for code formatting than our output logic.
  • Using the Roslyn Binder requires us to use the the Roslyn syntax tree for the candidate expressions.
  • We'd like a mutable AST (for our C# transforms).
  • We'd like to keep using something like NR pattern matching
  • We heavily customize the OutputVisitor/TokenWriter to get syntax+semantic highlighting, hyperlinks, ...

Except for the mutable AST, all of this looks like it could also done in Roslyn, but it's a lot of code churn for little benefit.

@dgrunwald
Copy link
Member Author

I took a look at using the Roslyn type system in the decompiler.

  • The public API is near useless. The decompiler needs similar access to the type system as the binder in roslyn does, and that's all through an internal API (class TypeSymbol). The public interface only has small a fraction of the member we'd need.
    • Some examples of methods the decompiler could use, but are not available in the public API: IsTupleCompatible(), IsReadOnly, IsByRefLikeType -- basically, all the new language features that are a reason to move to the Roslyn TS, are internal-only.
    • What would really make Roslyn useful for us is if we could re-use it's overload resolution implementation. But that requires some modifications to the code (we need to perform overload resolution before we have a syntax tree!). Which means we'd use a modified copy of class OverloadResolution in our code base, but that means we need access to all internals APIs used by overload resolution...
    • ==> The only way we could use Roslyn is by copying the source code and adding an [InternalsVisibleTo] attribute for us.
  • At least in the near term (up to C# 7.3 inclusive), it's faster for us to re-implement the new features in our own type system than to port everything to Roslyn.
  • Porting the decompiler to the Roslyn API takes a significant amount of boring busy-work. Given that ILSpy is a hobby project and the time we spend on it is highly influenced by how interesting the work is, our chances of ever finishing the port to Roslyn are essentially zero.
  • By contrast, implementing the cool new features in our type system is less boring :) Also it's not one huge chunk of work, but multiple small ones (for the various new language features). This does wonders for motivation.

@dgrunwald
Copy link
Member Author

dgrunwald commented May 17, 2020

I guess re-using overload resolution would be possible by feeding something like default(TargetType).MethodCall(default(ArgumentTyp1)) to Roslyn's GetSpeculativeSymbolInfo.
But I'm not sure if it would be possible to migrate to the Roslyn type system. Probably not with acceptable effort. And I certainly don't want to start mixing multiple type systems in the decompiler.

But Roslyn's overload resolution is only nice-to-have: our own implementation was fairly good for C# 5.0 code; and since the newer C# versions are backward-compatible with that, lacking the new features only means we end up sometimes emitting a few unnecessary casts.

Currently the only really compelling reason I see to move to Roslyn is that we'd get to re-use the nullability flow analysis. This would allow us to infer nullable reference types for local variables. And the analysis is complex enough that we can't easily re-implement it.

It's also quite possible (certainly in the short term) that porting relevant code pieces from Roslyn to the decompiler type system is better than porting the whole decompiler to the Roslyn type system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Decompiler The decompiler engine itself
Projects
None yet
Development

No branches or pull requests

4 participants