Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Significant performance reduction in dart analyser since 1.23.0 #30006

Closed
ajrcarey opened this issue Jun 24, 2017 · 22 comments
Closed

Significant performance reduction in dart analyser since 1.23.0 #30006

ajrcarey opened this issue Jun 24, 2017 · 22 comments
Assignees
Labels
area-analyzer Use area-analyzer for Dart analyzer issues, including the analysis server and code completion. P2 A bug or feature request we're likely to work on type-performance Issue relates to performance or code size

Comments

@ajrcarey
Copy link

ajrcarey commented Jun 24, 2017

Since dart sdk 1.23.0, analyzer performance has degraded significantly. Analysis during typing prior to 1.23.0 was nearly instantaneous; now simple edits can lock the analyzer for nearly 30 seconds, rendering refactoring, intellisense, autocomplete and syntax highlighting non-functional. First-time analysis (e.g. on opening IntelliJ) is now also much slower than in 1.22.x; in 1.23.x first-time analysis took nearly 60 seconds, although this appears to have improved somewhat in 1.24.x.

It's hard to provide a testable example because, of course, on tiny test snippets, performance remains sufficiently fast that degradation is not apparent. I would be happy to submit log files to demonstrate the problem - I just need direction on how to collect those.

This is on arch linux, fully patched, Intel core i5 with 16 Gb RAM, running off an SSD. Analyzer performance was so fast as to be mostly unnoticeable on 1.22.x and prior. The IDE in use is IntelliJ IDEA 2017.1.4 with Dart plugin 171.4694.29.

@devoncarew
Copy link
Member

devoncarew commented Jun 26, 2017

Thanks for the feedback! We do have the analysis server under benchmark, so are a bit surprised to hear about performance reductions.

I would be happy to submit log files to demonstrate the problem - I just need direction on how to collect those.

Some performance numbers for your projects would be helpful here. Specifically, cold start and warm start numbers would help. If you run pub global activate tuneup, you'll have a tool that can drive the analysis server from the command line. If you run tuneup --dart-sdk <path/to/sdk/version> check, you'll get the startup performance numbers for the analysis server. We cache our analysis on disk for unchanged files, so to get good numbers, you'll need to delete the cache when comparing sdk versions.

I would delete ~/.dartServer/.analysis-driver/, run tuneup w/ a copy of the sdk, and then re-run the same command. That will get you cold start and warm start numbers for that sdk version.

@ajrcarey
Copy link
Author

ajrcarey commented Jun 29, 2017

Sorry for the delay in replying. Thank you for the detailed instructions; here are the results.

Using sdk 1.24.2:

First run of tuneup, after clearing out ~/.dartServer/.analysis-driver/ (i.e. cold start):

15,356 issues found; analyzed 2,083 source files in 351.2s.

(I'm not sure why the number of issues reported is so high; the IDE shows only nine errors and a number of hints and TODOs. Best guess is that some obsolete code excluded from analysis in the IDE is being included in analysis run from the command line.)

Second run of tuneup (i.e. warm start):

15,356 issues found; analyzed 2,083 source files in 3.7s.

For comparison's sake, here are the numbers run against sdk 1.23:

Cold start:

15,949 issues found; analyzed 2,083 source files in 20.1s.

Warm start:

15,949 issues found; analyzed 2,083 source files in 2.9s.

And against sdk 1.22, which predates the performance problem:

15,487 issues found; analyzed 2,198 source files in 30.5s.

This is a typical response; I ran it a few times and the time range varied between about 28 and 31 seconds. I noticed that the ~/.dartServer/.analysis-driver directory did not seem to populated by 1.22, so I'm guessing that was an addition with 1.23. The introduction of 1.23 coincides with the performance degredation I'm observing.

I'm not sure what, if anything, to read into these numbers; at face value, they might seem to suggest that analysis performance in 1.22 was slower than in 1.23. That is definitely not the experience inside the IDE, where a single keystroke can lock the analyzer for upwards of 10 or 15 seconds, or more.

@bwilkerson
Copy link
Member

Best guess is that some obsolete code excluded from analysis in the IDE is being included in analysis run from the command line.

There are at least two ways of excluding code from analysis in the IDE. The only one that works with the command-line analyzer is to use the analysis options file to exclude files.

@bwilkerson
Copy link
Member

Or, rather, it ought to work for the command-line analyzer. I forgot that we currently have a bug (#26212).

@ajrcarey
Copy link
Author

ajrcarey commented Jun 29, 2017

Unless it is likely to make a difference with the analysis server performance, or affect the validity of these benchmarks, I'm not too worried about excluding the code from command line analysis. I was just interested in why the number of issues flagged was so high when the report in the IDE doesn't reflect that.

@ajrcarey
Copy link
Author

ajrcarey commented Jun 29, 2017

The disk seems to be being hit pretty hard during analysis generally, and cold start specifically. Under 1.24.2, 3879 files are created in ~/.dartServer/.dart-analysis, and the disk shows 100% activity for the entire analysis duration. Normally I would run IDE, source code and dart tools all off separate disks, but at the moment code and dart tools are running off the same disk due to #27828. The underlying filesystem is ext4.

(Note to self: check to make sure another process like tracker-miner isn't being stupid and trying to index these files as they're being created by dart-analyzer.)

@kasperl kasperl added the area-analyzer Use area-analyzer for Dart analyzer issues, including the analysis server and code completion. label Jun 30, 2017
@ajrcarey
Copy link
Author

ajrcarey commented Jun 30, 2017

(Reran tuneup under sdk 1.24.2 with tracker disabled. Cold result was 0.5s slower, warm result 0.1s faster - essentially, no significant difference. No apparent difference in disk activity with tracker disabled. Disk utilisation seems to be driven entirely by dart-analyzer, not any other process.)

@devoncarew
Copy link
Member

@ajrcarey, thanks much for putting these numbers together! We'll take a look at why the cold start for 1.24.2 is so much slower.

15,356 is a lot of issues - it may be worth determining how to address those (some config issue?), since that may be skewing the results.

@devoncarew devoncarew added the type-performance Issue relates to performance or code size label Jun 30, 2017
@scheglov
Copy link
Contributor

What do you mean when you say "a single keystroke can lock the analyzer for upwards of 10 or 15 seconds"? I want to understand what means "lock". Is your IDE still responsive? Can you type text and see there characters appear on the screen in timely manner?

I would expect some I/O to happen during cold analysis using Analysis Server (by your IDE) or tuneup - we store parsing results, linked summaries, and errors in the cache on the file system. We also try to read these results from the file system, during both cold and hot analysis. Although reading should be less expansive - we use in-memory cache, so read each cache entry only once. I also use SSD, and I/O is never a problem.

@ajrcarey
Copy link
Author

ajrcarey commented Jul 1, 2017

That's a fair question. The IDE itself remains perfectly responsive to user input. By "lock" I am referring to the analysis server: everything involving dart analysis is unresponsive, including syntax highlighting, autocomplete, error highlighting, reference lookup, and the ability to refactor (in that case the IDE does lock because it spins while waiting for dart-analyzer so it can show the dialog relevant to the refactoring operation requested). So I'd describe the IDE as input responsive but functionally useless because dart analysis is locked. Even something as simple as switching between source file tabs results in an analyzer lock; switching tab and waiting for the source to become syntax highlighted takes about five seconds. Many of these operations used to be so fast as to be unnoticeable; for instance, until 1.23, I was not aware that the analyzer was invoked when switching tabs, but now it's painfully obvious that this is the case.

"Some I/O" is perfectly fine, but I'm seeing here disk pegged to 100% and system load average going from ~0 to ~2.2 whenever the analyzer is invoked, which is basically constantly when the IDE is in use. Once the analyzer is done, disks go to idle and system load reduces back to zero.

Switching the IDE back to SDK 1.22.0 immediately resolves all performance problems. I/O utilisation under 1.22, even during cold start, appears minimal. For instance, during a full re-analysis of the entire source tree with 1.22, the disk is never touched and system load never exceeds 0.70; under 1.23 and 1.24, the disk is pegged at 100% and system load passes 2.2, with the analysis itself taking about five times as long to complete.

@scheglov
Copy link
Contributor

scheglov commented Jul 1, 2017

OK, it seems that something very bad happens with your project and Analysis Server.

If we could reproduce this situation, I'm sure we would be able to fix it.

Alternatively we could start with running Analysis Server with Observatory profiling. See this document how to change dart.server.vm.options. Set it to --observe --max_profile_depth=256 --collect_code=false --code_comments=true --old_gen_growth_rate=2800 and restart the server. The open 127.0.0.1:8181 and see the (server.dart$main) isolate. I think that the cpu profile (table) is most useful now. So, open it, see if there is something interesting and send screenshots, or PDFs, whatever is more convenient for you.

@ajrcarey
Copy link
Author

ajrcarey commented Jul 3, 2017

Sure. Here are two screen grabs.

image

image

@scheglov
Copy link
Contributor

scheglov commented Jul 3, 2017

Thanks.
Unfortunately these screenshots don't tell us much.
There was some work performed, but not too much.

And the fact that CompileUnoptimized has so high percentage implies that this performance information is shortly after restarting Analysis Server, not after performance problems.

Also, there is one, as I thought obvious, comment. Performance analysis is iterative, exploratory process - you sort by the Stack column and look methods with high Stack times, then on their Callees and Callers, trying to understand where these methods spend time, and why they are called.

I wish there were a way to grab full CPU profile information so that I could slice and dice it locally, but unfortunately there isn't. So, you would need to use your judgement to perform this research.

@ajrcarey
Copy link
Author

ajrcarey commented Jul 3, 2017

Alright. Here are some more screen grabs, CPU profile table sorted by stack %, on the same VM after running for about 7 hours without a restart.

image

image

I'm not really sure what I'm meant to be looking for here. My observation is that file I/O seems to be the problem, so I'm looking for functions that suggest some sort of file access. There are a few once we get down to about 15%: LibraryAnalyzer._resolveFile, EvictingFileByteStore.put, FileByteStore.put, and _File.writeAsBytesSync. But I have no idea if these numbers are problematic or not. I am guessing "execution" refers to CPU time, not real time; it would be good to be able to get at the real time somehow, if that's possible.

@scheglov
Copy link
Contributor

scheglov commented Jul 3, 2017

Yes, it seems that number of cache writes, i.e. MemoryCachingByteStore.put (which calls EvictingFileByteStore.put and FileByteStore.put), is quite high. So, it probably indicates that we analyze and store analysis results for a lot of files.

But there are also other surprising results. Using 22% for Object._as, most probably from InterfaceTypeImpl.element is hard to explain (from where is InterfaceTypeImpl.element called?). I cannot explain almost 30% for LibraryAnalyzer._computeVerifyErrors, 21% for LibraryAnalyzer._computeHints and 12% for UsedLocalElements.merge. Is it because of extremely deep class hierarchies? Or because of libraries for lots of parts?

I'm sorry to say so, but while this profile information provides some insides, it turned out that it is not enough for me to identify and solve the problem. Is these a chance for me to get your project to be able to profile and analyze locally? Of course it will not go anywhere from me, I will remove it once I'm done, etc.

@devoncarew
Copy link
Member

Some drive by comments:

I cannot explain almost 30% for LibraryAnalyzer._computeVerifyErrors, 21% for LibraryAnalyzer._computeHints

@ajrcarey, perhaps related to the large number of issues reported?

I wish there were a way to grab full CPU profile information so that I could slice and dice it locally, but unfortunately there isn't.

@a-siva, @B3rn475 - food for thought, we may want a way to save and load performance profiles.

@mraleph
Copy link
Member

mraleph commented Jul 4, 2017

@devoncarew I filed #30067 for saving/restoring CPU profiles. Lets make this happen - otherwise remote performance troubleshooting is way too hard.

@bwilkerson bwilkerson added the P2 A bug or feature request we're likely to work on label Jul 5, 2017
@ajrcarey
Copy link
Author

ajrcarey commented Jul 5, 2017

@devoncarew Unless I'm misunderstanding how the dart-plugin in IntelliJ works, the high number of issues reported by dart-analyzer from the command-line should not be relevant here as those issues are (I believe) mostly spurious; they are likely being generated through analysis of obsolete code that is excluded from analysis in the IDE but not when running dart-analyzer from the command line. The number of errors reported in the IDE is single figures. The data from the VM observatory comes from the IDE dart-analyzer session, not a command-line dart-analyzer session.

@scheglov I have looked at the VM observatory again after another 8-hour-or-so session with IntelliJ on the same code base. Last time, I was working on one specific area of the code; this time, my session was wider-ranging, touching a variety of files across the code base. Object._as and InterfaceTypeImlp.element have both disappeared from the top of the cpu profile list, but the caching functions are still present. My instinct is that the problem lies there, specifically in FileByteStore.put; not because the function is doing anything wrong, necessarily, but just because of the sheer number of times it's being called. (I see from the source that FileByteStore.put is synchronous and does not return until its buffer is flushed to disk; I wonder what the performance impact would be if it were either asynchronous or did not wait for its buffer to flush but returned immediately. There may be reasons, of course, why it can't do those things, but the code as written looks like it might be performance blocking if it's called too frequently.)

image

Yes, it would be possible to share the code for this project, if necessary. I prefer the idea of being able to dump observatory metrics to an export file for remote analysis.

Answering your general questions about the source project: the project is composed of lots of small base libraries that are imported and consumed en masse by classes further up the hierarchy. I would not have thought the depth of the class hierarchy was too extreme. I prefer to compose classes rather than build inheritance hierarchies, so mixins are used widely; if I had to guess at one behaviour present in this code base that might be less likely to appear elsewhere, it would be pervasive use of mixins. (Dart's support for mixins and class composition was one of the main reasons I selected it for this project over Java.)

@scheglov
Copy link
Contributor

scheglov commented Jul 5, 2017

Yes, I think you're right and at least significant part of the problem is flushing.
This would very well explain the I/O load you observe.

We added flushing after performing performance measurement and profiling locally on Mac and Ubuntu, and it did not cause such significant slowdown. But maybe it is more expensive on other OS.

We were trying to solve the problem with truncated files, sometimes we were getting existing, but zero-length files as compilation results. We need to find another solution. I will remove flushing for now.

scheglov added a commit that referenced this issue Jul 5, 2017
It turned out to be too expensive for some users.

R=brianwilkerson@google.com
BUG= #30006

Review-Url: https://codereview.chromium.org/2971063002 .
@scheglov
Copy link
Contributor

scheglov commented Jul 5, 2017

The change to remove flushing has been landed.
Please download a new bleeding edge build in a couple of hours.
I usually download this.

@ajrcarey
Copy link
Author

ajrcarey commented Jul 6, 2017

Yes, that's made a big difference. There's still some disk activity, but it's much reduced - especially during startup / full analysis - and incremental analysis while typing is much snappier. System load never exceeds 0.8. Getting syntax highlighting to pop in on switching tabs takes less than a second now. It's still a tiny bit slower than 1.22, but not by much, and it's much faster than 1.23 and 1.24. Thank you!

(For reference, I am running on arch, linux kernel 4.11.7, with OS and IntelliJ running off SSD and dart and source on a separate HD. I used to run Ubuntu but got annoyed with the upgrade cycle so switched to arch about a year ago. I would expect performance characteristics to be similar to Ubuntu, though.)

@scheglov
Copy link
Contributor

scheglov commented Jul 6, 2017

OK, good to know that it helped.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-analyzer Use area-analyzer for Dart analyzer issues, including the analysis server and code completion. P2 A bug or feature request we're likely to work on type-performance Issue relates to performance or code size
Projects
None yet
Development

No branches or pull requests

6 participants