Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export diagnostics (including unused warnings) to SemanticDB #17835

Merged
merged 30 commits into from
Oct 19, 2023

Conversation

tanishiking
Copy link
Member

@tanishiking tanishiking commented Jun 5, 2023

fix #17535

What this PR does

This PR splits ExtractSemanticDB to PostTyper and PostInlining (as ChechUnused does).

  • The PostTyper phase
    • It extracts SemanticDB information such as symbol definitions, symbol occurrences, type information, and synthetics.
    • This phase does not write the information to a .semanticdb file; instead, it attaches the SemanticDB information to the top-level tree.
    • And write .semanticdb file.
  • The PostInlining phase
    • It extracts diagnostics from ctx.reporter and attaches them to the SemanticDB information extracted in the PostTyper phase.
    • Afterwards, it updates the SemanticDB to a .semanticdb file (if there's warning in the file).
    • We need to run this phase after the CheckUnused.PostInlining phase so that we can extract the warnings generated by "-Wunused".

Also,

  • Reporter now stores warnings in addition to errors
  • Tweaked SemanticDBTest to show the generated diagnostics to metac.expect

Concerns

  • Since we attach the SemanticDB information to the top-level tree, it lives in-memory during the compilation which may leads to more memory usage for compilation
  • Now, Reporter stores all warnings in addition to errors (to convey the warnings across phases), which also may cause some more memory consumption.

@tanishiking tanishiking marked this pull request as ready for review June 5, 2023 08:53
@tanishiking tanishiking changed the title [WIP] Export diagnostics (including unused warnings) to SemanticDB Export diagnostics (including unused warnings) to SemanticDB Jun 5, 2023
@@ -3522,7 +3573,7 @@ local1 => val local right: Int
local2 => val local number1: Int
local3 => var local leftVar: Int
local4 => var local rightVar: Int
local5 => var local number1Var: Int
local5 => val local number1Var: Int
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why they became val 🤔 let me see...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like -Wunused:all do the thing, maybe we can dig in another issue?

unit.tpdTree.putAttachment(_key, extractor.toTextDocument(unit.source))
else
unit.tpdTree.getAttachment(_key) match
case None =>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should say something when attachment is not found?

@jchyb jchyb requested a review from bishabosha June 5, 2023 12:11
@tanishiking
Copy link
Member Author

@bishabosha Could you take a look when you have time? + Is there any ways to checking the memory usage with this PR (so we can check this PR boost the memory usage of compiler) :)

@bishabosha
Copy link
Member

@bishabosha Could you take a look when you have time? + Is there any ways to checking the memory usage with this PR (so we can check this PR boost the memory usage of compiler) :)

if you don't have async profiler there is also https://visualvm.github.io

@tanishiking
Copy link
Member Author

Oh, thanks, never mind, I though by any chance, there's a benchmark or something that tracks memory consumption during the benchmark or something. I'll check using some profilers 👍

@tanishiking
Copy link
Member Author

test performance please

@tanishiking
Copy link
Member Author

tanishiking commented Jun 27, 2023

Now, SemanticDB phases doesn't hold the semanticdb info on memory across the phases (AfterTyper - AfterInlining). Instead, ExtractSemanticDB.AfterInlining updates the SemanticDB on disk written by ExtractSemanticDB.AfterTyper, adding diagnostics.

I believe that the export diagnostics is more like IO-heavy (only if there're warnings) rather than memory-heavy.

It would be nice to test performance using test performance please, can someone run performance test?

FYI @bishabosha

@bishabosha
Copy link
Member

bishabosha commented Jun 28, 2023

do benchmarks exercise the extractSemanticDB phase?

Edit: short answer, no.

longer answer: looking at https://github.com/lampepfl/bench/blob/master/fixture/profiles/default.plan you can see that custom compiler flags are sent to the measure command, so anything not by default included in https://github.com/lampepfl/dotty/blob/main/bench/src/main/scala/Benchmarks.scala will not be activated.

note that zinc phases are also not turned on 🤔

@tanishiking
Copy link
Member Author

@tgodzik This PR changes the internal interface of ExtractSemanticDB, and i realized that Metals depends on it
https://github.com/scalameta/metals/blob/6ccc512d53f231e906b9fb99a74cc774efd64fc9/mtags/src/main/scala-3/scala/meta/internal/pc/SemanticdbTextDocumentProvider.scala#L36-L37
While we can keep Extractor class public as ExtractSemanticDB.Extractor(), Metals will need to some workaround between before-this-change vs after-this-change :/

@tanishiking
Copy link
Member Author

Thanks @bishabosha for checking!
Haven't profiled properly, but when I tested on scalameta/metals I found that it doesn't take so much (It has an overhead of 1-3 ms for compilation units with warnings, and less than 1 ms for compilation units with no warnings)

So we may not need to worry too much about performance (and I can't think of any other way to make it faster).

Now, it's ready for review 👍

@tgodzik
Copy link
Contributor

tgodzik commented Jun 29, 2023

@tgodzik This PR changes the internal interface of ExtractSemanticDB, and i realized that Metals depends on it https://github.com/scalameta/metals/blob/6ccc512d53f231e906b9fb99a74cc774efd64fc9/mtags/src/main/scala-3/scala/meta/internal/pc/SemanticdbTextDocumentProvider.scala#L36-L37 While we can keep Extractor class public as ExtractSemanticDB.Extractor(), Metals will need to some workaround between before-this-change vs after-this-change :/

We can work around that for sure, besides this will be migrating back to the compiler as soon as @rochala PR is merged.

Copy link
Contributor

@tgodzik tgodzik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from my side, reading and writing protobuf should be quick enough.

tanishiking added a commit to tanishiking/scalafix that referenced this pull request Jun 30, 2023
Once this PR scala/scala3#17835
has merged and released, scalafix-organize-imports should be
able to run OrganizeImports.removeUnused based on the
diagnostics information in SemanticDB emit from Scala3 compiler.

In order to make OrganizeImports rule to work with Scala 3,
this commit added a few adjustments to the rule.
@bishabosha
Copy link
Member

bishabosha commented Jul 14, 2023

so the ExtractSemanticDB.PostTyper phase remains the same performance as before, then the ExtractSemanticDB.PostInlining phase adds another 12% to the time taken by the prior PostTyper phase (when I ran on scala3-compiler-bootstrapped project with 579 warnings from adding -Wunused:all)

pretty much the entire cost is parsing and writing the text documents, e.g. rather than converting the warnings, so I think once warnings are converted, the reading and writing could be in parallel.

Copy link
Member

@bishabosha bishabosha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implementation seems fine other than the question of performance

Copy link
Member

@bishabosha bishabosha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one annoying thing I found is that e.g. deprecation warnings are not included - these are emitted at crossVersionChecks which is not so long after where the phase is currently placed

@tanishiking
Copy link
Member Author

tanishiking commented Jul 19, 2023

I think once warnings are converted, the reading and writing could be in parallel

~So, we can use runOn instead of run and process each compilation unit in parallel 👍 ~ -> d2b54c2

one annoying thing I found is that e.g. deprecation warnings are not included - these are emitted at crossVersionChecks which is not so long after where the phase is currently placed

Ok, I move the PostInlining phase to bit more later. -> done 56c5909

tanishiking added a commit to tanishiking/scalafix that referenced this pull request Oct 26, 2023
Once this PR scala/scala3#17835
has merged and released, scalafix-organize-imports should be
able to run OrganizeImports.removeUnused based on the
diagnostics information in SemanticDB emit from Scala3 compiler.

In order to make OrganizeImports rule to work with Scala 3,
this commit added a few adjustments to the rule.
@bjaglin
Copy link

bjaglin commented Dec 19, 2023

Despite being backport:nominated, it looks like this didn't make it to 3.3.2-RC1. I am not familiar with the release process, but is there a reason?

@Kordyjan
Copy link
Contributor

Current release process is described here.
We are now entering the first iteration of the cadence. 3.3.2-RC1 is now on maven central and will be announced as stable around end of January. 3.4.0-RC1 will land tomorrow, and will be made stable around the same time as 3.3.1.

The current backlog of things to asses and backport can be found here. It is expected that there is some delay between things landing on main and shortly after in some Next release and LTS. We are moving slowly and carefully, so there are no regressions in LTS.

Some of the changes might seem obvious to just cherry-pick to the LTS branch, but there might be other changes that those changes implicitly depend on. I've seen situations like this preparing 3.3.2. This is why it is important to analyze and backport things in the same order as they were merged, even if sometimes it means that you will need to wait for needed changes slightly longer.

This particular change will land in 3.3.3 (that should have parity with 3.4.0). It will be available as an RC in late January and as a stable release in early March.

@bjaglin
Copy link

bjaglin commented Dec 19, 2023

Many thanks for the extensive answer @Kordyjan 🙇

@Kordyjan Kordyjan added this to the 3.4.0 milestone Dec 20, 2023
bjaglin pushed a commit to tanishiking/scalafix that referenced this pull request Feb 3, 2024
Once this PR scala/scala3#17835
has merged and released, scalafix-organize-imports should be
able to run OrganizeImports.removeUnused based on the
diagnostics information in SemanticDB emit from Scala3 compiler.

In order to make OrganizeImports rule to work with Scala 3,
this commit added a few adjustments to the rule.
bjaglin pushed a commit to tanishiking/scalafix that referenced this pull request Feb 3, 2024
Once this PR scala/scala3#17835
has merged and released, scalafix-organize-imports should be
able to run OrganizeImports.removeUnused based on the
diagnostics information in SemanticDB emit from Scala3 compiler.

In order to make OrganizeImports rule to work with Scala 3,
this commit added a few adjustments to the rule.
bjaglin pushed a commit to tanishiking/scalafix that referenced this pull request Feb 3, 2024
Once this PR scala/scala3#17835
has merged and released, scalafix-organize-imports should be
able to run OrganizeImports.removeUnused based on the
diagnostics information in SemanticDB emit from Scala3 compiler.

In order to make OrganizeImports rule to work with Scala 3,
this commit added a few adjustments to the rule.
@bjaglin
Copy link

bjaglin commented Feb 22, 2024

This particular change will land in 3.3.3 (that should have parity with 3.4.0). It will be available as an RC in late January and as a stable release in early March.

I understand the release cycle is a bit behind, but does the target of 3.3.3 still hold despite no status update on the backport board (still in the "Needs assessment")?

Context of my question:

  • I am about to cut a release of Scalafix that whitelists Scala 3.4.0 thanks to this PR.
  • I am tempted to also whitelist the upcoming 3.3.3, in order to avoid cutting a new release a few weeks later just to allow it (since there is zero logic on the Scalafix side).
  • Therefore, I am trying to gauge the probability (since I understand we can't be 100% sure anyway) of the backport to happen in 3.3.3.

@tgodzik
Copy link
Contributor

tgodzik commented Feb 23, 2024

I think this should be in 3.3.3, @Kordyjan ?

@bjaglin
Copy link

bjaglin commented Apr 6, 2024

Following-up on this to make sure we don't miss the 3.3.4 opportunity (since I believe there is no technical reason that 3.3.3 didn't have it?).

I still see this PR as "Needs assessment" in the LTS backport board. I've re-read https://github.com/scala/scala3/blob/main/project/RELEASES.md#how-should-things-be-backported, but I am not sure how we can highlight a particular PR as candidate to the maintainer?

@bishabosha
Copy link
Member

@Kordyjan

@bjaglin
Copy link

bjaglin commented May 1, 2024

#20315 🤞

WojciechMazur added a commit that referenced this pull request Jun 22, 2024
bjaglin added a commit to scalacenter/scalafix that referenced this pull request Jun 23, 2024
bjaglin added a commit to scalacenter/scalafix that referenced this pull request Jun 26, 2024
bjaglin added a commit to bjaglin/scalafix that referenced this pull request Jul 29, 2024
bjaglin added a commit to bjaglin/scalafix that referenced this pull request Jul 29, 2024
@WojciechMazur WojciechMazur removed the backport:nominated If we agree to backport this PR, replace this tag with "backport:accepted", otherwise delete it. label Nov 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SemanticDB diagnostics missing
7 participants