Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve GetPrevious/GetNextToken performance #3244

Closed
heejaechang opened this issue Jun 1, 2015 · 3 comments · May be fixed by #69740
Closed

improve GetPrevious/GetNextToken performance #3244

heejaechang opened this issue Jun 1, 2015 · 3 comments · May be fixed by #69740
Labels
Area-Compilers Feature Request Tenet-Performance Regression in measured performance of the product from goals.
Milestone

Comments

@heejaechang
Copy link
Contributor

heejaechang commented Jun 1, 2015

look devdiv # 1089540 for more detail.

GetNextToken takes 40% of time (of several minutes)

@AdamSpeight2008
Copy link
Contributor

GetNextToken definition in source

@heejaechang heejaechang changed the title imporve GetPrevious/GetNextToken performance improve GetPrevious/GetNextToken performance Jun 2, 2015
heejaechang added a commit to heejaechang/roslyn that referenced this issue Jun 2, 2015
this performance improvement is particularly for devdiv bug # 1089540

this makes the file in the bug to be formatted in several seconds compared to several minutes on my machine.

there were several issues. each one fixed by

#1, use concurrency on gathering operations.
#2, don't use too much time to split work to chunks if that requires more work than actually formatting.
#3, don't blindly set beginning of a file as inseparable start point for certain formatting options.

...

but these don't actually address the most impactful root cause of this perf issues. which is perf issue of GetPrevious/GetNextToken API in compiler.
(dotnet#3244)

formatter internally uses GetDescendantTokens to get all tokens at once and cache them which takes less than 1 second for the entire file (2M bytes size) in the bug. and use the cache internally.

but certain part of formatter (Rule Provider) can't use that internal cache, so it has to use the GetPrevious/GetNextToken to move around tokens, which in this particular bug, takes more than 40 seconds on my machine. and that is not even for entire file. (less than 1/12 of tokens)

I opened a bug to compiler team, hopely so that we can get better perf on those APIs.

in this PR, I mitigated the issue either by making more things to run concurrently or by changing logic which requires those APIs.
@heejaechang
Copy link
Contributor Author

for the same file in the bug, smart indenter on "}" in array initializer also spends most of its time on GetNextToken. (16 seconds in my machine)

@gafter gafter added this to the 1.1 milestone Jun 2, 2015
heejaechang added a commit that referenced this issue Jun 4, 2015
make formatting performance better

...

this performance improvement is particularly for devdiv bug # 1089540

this makes the file in the bug to be formatted in several seconds compared to several minutes on my machine.

there were several issues. each one fixed by

#1, use concurrency on gathering operations.
#2, don't use too much time to split work to chunks if that requires more work than actually formatting.
#3, don't blindly set beginning of a file as inseparable start point for certain formatting options.

...

but these don't actually address the most impactful root cause of this perf issues. which is perf issue of GetPrevious/GetNextToken API in compiler.
(#3244)

formatter internally uses GetDescendantTokens to get all tokens at once and cache them which takes less than 1 second for the entire file (2M bytes size) in the bug. and use the cache internally.

but certain part of formatter (Rule Provider) can't use that internal cache, so it has to use the GetPrevious/GetNextToken to move around tokens, which in this particular bug, takes more than 40 seconds on my machine. and that is not even for entire file. (less than 1/12 of tokens)

I opened a bug to compiler team, hopely so that we can get better perf on those APIs.

in this PR, I mitigated the issue either by making more things to run concurrently or by changing logic which requires those APIs.
@jaredpar jaredpar modified the milestones: 1.2, 1.1 Jul 31, 2015
@jaredpar jaredpar removed this from the 1.2 milestone Nov 23, 2015
@dpoeschl dpoeschl added this to the 1.2 milestone Dec 3, 2015
@jaredpar jaredpar modified the milestones: 1.3, 1.2 Dec 10, 2015
@jaredpar jaredpar modified the milestones: 1.3, 2.0 (Preview) Mar 30, 2016
@gafter gafter modified the milestones: 2.0 (Preview 1), 2.0 (RC) Jun 20, 2016
@jaredpar jaredpar added the Tenet-Performance Regression in measured performance of the product from goals. label Jul 19, 2016
@jaredpar jaredpar modified the milestones: 2.1, 2.0 (RC) Jul 19, 2016
@jaredpar jaredpar modified the milestones: 3.0, 2.1 Feb 17, 2017
@jaredpar jaredpar modified the milestones: 16.0, Unknown Sep 7, 2018
@CyrusNajmabadi
Copy link
Member

Dont' currently have any issue indicating we still have a problem here. So closing out.

@CyrusNajmabadi CyrusNajmabadi closed this as not planned Won't fix, can't repro, duplicate, stale Oct 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Compilers Feature Request Tenet-Performance Regression in measured performance of the product from goals.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants