-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: faster processing of large file #543
Conversation
@sebthom have you some benshmark about the performance improvement? To be honnest with you I dont understand where you have optimized the code. Is it just use of ArrayList instead of LinkedArrayList,? Many thankq for your clarification. |
This commit replaces LinkedList with ArrayList in AbstractModelLines to address #540. AbstractModelLines mostly accesses internal list using random access via List#get(int). This results into UI freezes for files with a lot of lines. There is actually only one usecase where LinkedList is faster than ArrayList: when adding/removing elements at the beginning of a list. Even for insertions and removal within large lists ArrayList is faster because with LinkedList the list must be traversed up to the point where elements are removed/added. |
dc01954
to
19a562e
Compare
@@ -37,7 +37,7 @@ static final class ModelLine { | |||
List<TMToken> tokens = Collections.emptyList(); | |||
} | |||
|
|||
private final List<ModelLine> list = new LinkedList<>(); | |||
private final List<ModelLine> list = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice starting point.
I suggest to go a bit further and try to work based on addAll(int idx, Collection elements), and list.subList().clear();
Two examples:
protected void addLines(final int lineIndex, final int count) {
if (count < 1)
return;
// prepare temporary list;
var newLines = new ArrayList<>(count)
for (int i = 0; i < count; i++) {
newLines.add(new ModelLine());
}
synchronized (list) {
final var firstLine = getOrNull(lineIndex);
list.addAll(lineIndex, newLines);
if (firstLine != null) {
list.get(lineIndex).startState = firstLine.startState;
}
updateLine(lineIndex);
}
}
similarly removeLines
could benefit from list.sublist(lineIndex, lineIndex + count).clear()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise there is potential for optimizations in replaceLines via subList().replaceAll(..)
followed by an addAll or clear depending on the requested changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
took 29 seconds on my machine before this PR and takes 20 seconds with this PR
Could you please provide updated numbers? Because to be honest, 20 seconds is still quite slow.
I shaved of a few more seconds by adding a token caching on the hot path but I don't see more low hanging fruits that will have a noticeable. Because of it's async nature profiling is a bit difficult. Most of the time seems now to be spend in the joni library for TextMate regex searches and in eclipse UI classes doing the rendering. I see one more possible optimization but I am not sure how/if this will result in a noticeable performance improvement #38
380d026
to
94acfc4
Compare
Could you please provide updated numbers? Because to be honest, 20 seconds is still quite slow. |
This PR addresses #540 and additionally improves general rendering performance.
Full colorization of this file https://github.com/json-iterator/test-data/blob/master/large-file.json (26MB, 11k lines) took 29 seconds on my machine before this PR and takes 20 seconds with this PR applied. Also short UI freezes encountered without the PR are gone with the PR applied.