Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make formatting performance better #3247

Merged
merged 2 commits into from
Jun 4, 2015
Merged

Conversation

heejaechang
Copy link
Contributor

this performance improvement is particularly for devdiv bug # 1089540

this makes the file in the bug to be formatted in several seconds compared to several minutes on my machine.

there were several issues. each one fixed by
#1, use concurrency on gathering operations.
#2, don't use too much time to split work to chunks if that requires more work than actually formatting.
#3, don't blindly set beginning of a file as inseparable start point for certain formatting options.

...

but these don't actually address the most impactful root cause of this perf issues. which is perf issue of GetPrevious/GetNextToken API in compiler.
(#3244)

formatter internally uses GetDescendantTokens to get all tokens at once and cache them which takes less than 1 second for the entire file (2M bytes size) in the bug. and use the cache internally.

but certain part of formatter (Rule Provider) can't use that internal cache, so it has to use the GetPrevious/GetNextToken to move around tokens, which in this particular bug, takes more than 40 seconds on my machine. and that is not even for entire file. (less than 1/12 of tokens)

I opened a bug to compiler team, hopely so that we can get better perf on those APIs.

in this PR, I mitigated the issue either by making more things to run concurrently or by changing logic which requires those APIs.

this performance improvement is particularly for devdiv bug # 1089540

this makes the file in the bug to be formatted in several seconds compared to several minutes on my machine.

there were several issues. each one fixed by

#1, use concurrency on gathering operations.
#2, don't use too much time to split work to chunks if that requires more work than actually formatting.
#3, don't blindly set beginning of a file as inseparable start point for certain formatting options.

...

but these don't actually address the most impactful root cause of this perf issues. which is perf issue of GetPrevious/GetNextToken API in compiler.
(dotnet#3244)

formatter internally uses GetDescendantTokens to get all tokens at once and cache them which takes less than 1 second for the entire file (2M bytes size) in the bug. and use the cache internally.

but certain part of formatter (Rule Provider) can't use that internal cache, so it has to use the GetPrevious/GetNextToken to move around tokens, which in this particular bug, takes more than 40 seconds on my machine. and that is not even for entire file. (less than 1/12 of tokens)

I opened a bug to compiler team, hopely so that we can get better perf on those APIs.

in this PR, I mitigated the issue either by making more things to run concurrently or by changing logic which requires those APIs.
@heejaechang
Copy link
Contributor Author

related to #3229

@heejaechang
Copy link
Contributor Author

@mattwar @jmarolf @dpoeschl @Pilchie @balajikris can you take a look? thank you.

@heejaechang
Copy link
Contributor Author

missed @rchande

@heejaechang
Copy link
Contributor Author

@dotnet-bot retest this please

@heejaechang
Copy link
Contributor Author

retest this please

@Pilchie Pilchie added this to the 1.1 milestone Jun 2, 2015
@mattwar
Copy link
Contributor

mattwar commented Jun 3, 2015

👍

@heejaechang
Copy link
Contributor Author

retest this please

@heejaechang
Copy link
Contributor Author

retest this please

heejaechang added a commit that referenced this pull request Jun 4, 2015
make formatting performance better

...

this performance improvement is particularly for devdiv bug # 1089540

this makes the file in the bug to be formatted in several seconds compared to several minutes on my machine.

there were several issues. each one fixed by

#1, use concurrency on gathering operations.
#2, don't use too much time to split work to chunks if that requires more work than actually formatting.
#3, don't blindly set beginning of a file as inseparable start point for certain formatting options.

...

but these don't actually address the most impactful root cause of this perf issues. which is perf issue of GetPrevious/GetNextToken API in compiler.
(#3244)

formatter internally uses GetDescendantTokens to get all tokens at once and cache them which takes less than 1 second for the entire file (2M bytes size) in the bug. and use the cache internally.

but certain part of formatter (Rule Provider) can't use that internal cache, so it has to use the GetPrevious/GetNextToken to move around tokens, which in this particular bug, takes more than 40 seconds on my machine. and that is not even for entire file. (less than 1/12 of tokens)

I opened a bug to compiler team, hopely so that we can get better perf on those APIs.

in this PR, I mitigated the issue either by making more things to run concurrently or by changing logic which requires those APIs.
@heejaechang heejaechang merged commit 0382e3e into dotnet:master Jun 4, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants