Improve the performance of GetModIfaceFromDisk in large repos and delete GetDependencies #2323
One surprising and unintended consequence of this is that
I suppose we'll need to introduce an artificial dependency on A better option would be to handle cycles in the build system (hls-graph), and make them visible in build rules somehow so that we can convert them into diagnostics. Any thoughts @ndmitchell? You have cycle detection in Shake, would it work for hls-graph?
I have done something slightly different that hopefully preserves the current semantics - a synthetic dependency on I have also deleted
There are two cycle detectors in Shake. The first keeps a stack of where you have come from and produces moderately good error messages. If you enhanced that with a stack expressed in terms of import definitions, you would get nice, user-actionable error messages. The second cycle detector is the worse one. When A and B depend on each other and both start simultaneously, each notices that the other is in progress, and then everything halts. Shake gives a really poor error here, but does spot the cycle because it has things to be done and can't make progress. Getting a good error message would require fairly complex graph algorithms. There are some, but they are probably a lot more work than doing a dependency scan in the rules.
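To make the first, stack-based approach concrete, here is a minimal sketch of a dependency scan that carries the chain of imports along the current path and reports it when a module reappears. It is neither Shake nor hls-graph code: `ImportGraph`, `findCycle`, and `renderCycle` are hypothetical names, and the graph is a plain map from module names to their direct imports.

```haskell
import           Data.List       (intercalate)
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import           Data.Maybe      (listToMaybe, mapMaybe)

-- | Hypothetical import graph: module name -> modules it imports directly.
type ImportGraph = Map String [String]

-- | Depth-first scan that carries the stack of modules on the current path
-- (most recent first). When a module is reached that is already on the
-- stack, the chain of imports forming the cycle is returned.
findCycle :: ImportGraph -> String -> Maybe [String]
findCycle graph = go []
  where
    go stack m
      | m `elem` stack = Just (m : reverse (takeWhile (/= m) stack) ++ [m])
      | otherwise =
          listToMaybe (mapMaybe (go (m : stack)) (Map.findWithDefault [] m graph))

-- | Render the chain as a user-facing message (wording is illustrative).
renderCycle :: [String] -> String
renderCycle path = "Cyclic module dependency: " ++ intercalate " imports " path

main :: IO ()
main = do
  let graph = Map.fromList [("A", ["B"]), ("B", ["C"]), ("C", ["A"]), ("D", [])]
  putStrLn (maybe "no cycle" renderCycle (findCycle graph "A"))
  putStrLn (maybe "no cycle" renderCycle (findCycle graph "D"))
```

The sketch keeps no visited set, so shared subgraphs are rescanned. That is fine for illustration, but a real rule would want to cache results per module.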
@jneira any idea of why the func-test testsuite is timing out on Windows w/ ghc 9.0 after >4h? It normally takes 3m
@wz1000 are you able to review this core change?
Looks like a good improvement.
-- To work around this, we coerce to the underlying type
-- To remove this, I plan to upstream the missing Monoid instance
concatFC :: [FinderCache] -> FinderCache
concatFC = unsafeCoerce (mconcat @(Map InstalledModule InstalledFindResult))
This can easily break with new GHC releases. Can we guard this using CPP for GHC versions where this is known to be ok? It can be a compile error for as of yet unreleased GHC versions.
I think we will find out if this breaks with a new GHC release, not sure what's the benefit of preemptive CPP
Moreover, the missing instances will be added hopefully soon - https://gitlab.haskell.org/ghc/ghc/-/merge_requests/6935
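For readers unfamiliar with the trick under discussion, the following sketch shows the same idea on a hypothetical newtype: borrowing the `Monoid` instance of the underlying `Map` to concatenate values of a wrapper type that has no instance of its own. `Cache` and `concatCaches` are made-up names, not GHC or ghcide definitions. Because the constructor is in scope here, the checked `coerce` from `Data.Coerce` is enough; the snippet under review uses `unsafeCoerce` instead, which the compiler does not check and which is why a change in the representation could break it silently.

```haskell
{-# LANGUAGE TypeApplications #-}

import           Data.Coerce     (coerce)
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

-- | Hypothetical cache type: a newtype over Map without a Monoid instance.
newtype Cache = Cache (Map String Int)
  deriving (Eq, Show)

-- | Concatenate caches by reusing the Monoid instance of the underlying Map.
-- 'coerce' is checked by the compiler, so this stops compiling if the
-- representation of 'Cache' changes.
concatCaches :: [Cache] -> Cache
concatCaches = coerce (mconcat @(Map String Int))

main :: IO ()
main = do
  let c1 = Cache (Map.fromList [("A", 1)])
      c2 = Cache (Map.fromList [("A", 2), ("B", 3)])
  -- Map's Monoid is a left-biased union, so the entry from c1 wins for "A".
  print (concatCaches [c1, c2])
```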
In #2296 I managed to enable all tests except the progress ones, though only by pure brute force; since then it has not timed out. I still have to investigate why the progress tests keep causing hangs, although I suspect the eval one.
I see a timeout as well in the CI results for #2296
Yeah, I tried all combinations of ignoring specific test groups several times, and several attempts ended with hangs for Windows and 9.0.1 until https://github.com/haskell/haskell-language-server/runs/4079556307?check_suite_focus=true, which ignores only the progress tests. After that I have not seen another run get stuck on Windows and 9.0.1 in any branch (I just searched the subsequent runs and did not see any, but maybe I missed one).
Oh, you are referring to https://github.com/haskell/haskell-language-server/runs/4088340871?check_suite_focus=true?
Let's see how the current workflow run goes. I will try to disable other test groups if the hang is reproduced (or afterwards, if it happens again in another branch).
The func-test suite did not hang this time: https://github.com/haskell/haskell-language-server/runs/4150189056?check_suite_focus=true
I'm not going to merge this yet since the benchmarks show a perf regression after introducing the synthetic dependency on
While it didn't time out this time around, it timed out for 4/5 previous attempts, so I strongly think that we should disable it (or the offending tests)
With the new build graph statistics outputs in the benchmark suite (#2343) it's easier to see what's causing the regression:
Compared to upstream, this PR HEAD increases the number of rules built on edit by a factor of 10. It also shows the cost (3X build graph edges) of the import cycle tracking rules.
There are three benefits:

1. GetModIfaceFromDisk and GhcSessionDeps no longer depend on the transitive module summaries. This means fewer edges in the build graph = smaller build graph = faster builds
2. Avoid duplicate computations in setting up the GHC session with the dependencies of the module. Previously the total work done was O(NlogN) in the number of transitive dependencies, now it is O(N).
3. Increased sharing of HPT and FinderCache.

Ideally we should also share the module graphs, but the datatype is abstract, doesn't have a monoid instance, and cannot be coerced to something that has. We will need to add the Monoid instance in GHC first.

On the Sigma repo:

- the startup metric goes down by ~34%.
- The edit metric also goes down by 15%.
- Max residency is down by 30% in the edit benchmark.
Surfacing the performance tradeoffs in the core build rules
Perf regression fixed, benchmarks look fine:
What was the fix?
…ete GetDependencies (#2323) * Improve the performance of GetModIfaceFromDisk in large repos There are three benefits: 1. GetModIfaceFromDisk and GhcSessionDeps no longer depend on the transitive module summaries. This means fewer edges in the build graph = smaller build graph = faster builds 2. Avoid duplicate computations in setting up the GHC session with the dependencies of the module. Previously the total work done was O(NlogN) in the number of transitive dependencies, now it is O(N). 3. Increased sharing of HPT and FinderCache. Ideally we should also share the module graphs, but the datatype is abstract, doesn't have a monoid instance, and cannot be coerced to something that has. We will need to add the Monoid instance in GHC first. On the Sigma repo: - the startup metric goes down by ~34%. - The edit metric also goes down by 15%. - Max residency is down by 30% in the edit benchmark. * format importes * clean up * remove stale comment * fix build in GHC 9 * clean up * Unify defintions of ghcSessionDeps * mark test as no longer failing * Prevent duplicate missing module diagnostics * delete GetDependencies * add a test for deeply nested import cycles * Fix build in GHC 9.0 * bump ghcide version * Introduce config options for the main rules Surfacing the performance tradeoffs in the core build rules * Avoid using the Monoid instance (removed in 9.4 ?????) * Fix build with GHC 9 * Fix Eval plugin
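This changes GhcSessionDeps to reuse the GhcSessionDeps of the dependencies, and then GetModIfaceFromDisk to use GhcSessionDeps instead of calling ghcSessionDepsDefinition.

There are three benefits:

1. GetModIfaceFromDisk and GhcSessionDeps no longer depend on the transitive module summaries. This means fewer edges in the build graph = smaller build graph = faster builds
2. Avoid duplicate computations in setting up the GHC session with the dependencies of the module. Previously the total work done was O(NlogN) in the number of transitive dependencies, now it is O(N).
3. Increased sharing of HPT and FinderCache.

Ideally we should also share the module graphs, but the datatype is abstract, doesn't have a Monoid instance, and cannot be coerced to something that has. We will need to add the Monoid instance in GHC first.

On the Sigma repo:

- the startup metric goes down by ~34%.
- The edit metric also goes down by 15%.
- Max residency is down by 30% in the edit benchmark.

Interestingly, the memory usage goes down despite caching more!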
Fixes #2304 (in that it improves the complexity as much as I can think of)
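The core idea of the description, building each module's session from the already-built sessions of its direct dependencies rather than from the full transitive set, can be illustrated with a toy model. This is not ghcide code: `ImportGraph` and `sharedClosures` are hypothetical names, a "session" is reduced to the set of modules it contains, and the graph is assumed to be acyclic (a cyclic graph would make this sketch loop).

```haskell
import qualified Data.Map as Map   -- the lazy Map API, needed for the sharing below
import           Data.Map (Map)
import qualified Data.Set as Set
import           Data.Set (Set)

-- | Hypothetical import graph: module name -> modules it imports directly.
type ImportGraph = Map String [String]

-- | For every module, the set of modules in its "session". Each entry is
-- defined in terms of the entries of its direct imports, which are looked
-- up in the same lazily built table, so every module's result is computed
-- once and then shared by all of its importers.
-- Assumption: the graph is acyclic.
sharedClosures :: ImportGraph -> Map String (Set String)
sharedClosures graph = table
  where
    table = Map.mapWithKey build graph
    build m deps = Set.insert m (Set.unions (map closureOf deps))
    -- Modules missing from the graph are treated as having no imports.
    closureOf d = Map.findWithDefault (Set.singleton d) d table

main :: IO ()
main = do
  let graph = Map.fromList
        [("A", ["B", "C"]), ("B", ["D"]), ("C", ["D"]), ("D", [])]
  -- The entry for "D" is built once and reused by both "B" and "C".
  mapM_ (print . fmap Set.toList) (Map.toList (sharedClosures graph))
```

The sketch only shows the sharing structure; it makes no claim about the exact costs quoted in the description.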