Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: index type hierarchy in java files #6189

Merged
merged 1 commit into from
Mar 28, 2024

Conversation

kasiaMarek
Copy link
Contributor

@kasiaMarek kasiaMarek commented Mar 1, 2024

connected to: #5623

Copy link
Contributor

@tgodzik tgodzik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just one question

else
sourceTopLevels.filter(sym => !isTrivialToplevelSymbol(uri, sym)).toList
else if (isJava) {
sourceTopLevels.toList.headOption
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any idea if the indexing is becoming much slower?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark                      Mode  Cnt  Score   Error  Units
MetalsBench.javaMtagsPackage     ss   10  0.168 ± 0.014   s/op
MetalsBench.toplevelJavaMtags    ss   10  2.843 ± 0.112   s/op

It does get slower but it doesn't seem to be too bad. In practice for e.g. Metals there is no visible slowdown.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok cool, let's merge it then 🚀

Copy link
Contributor Author

@kasiaMarek kasiaMarek Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for e.g. Metals there is no visible slowdown

Actually, I was wrong (I looked at the whole indexed workspace time, which can be very misleading on laptop doing million things in the background).

The actual slowdown:
samples before:

time: indexed library sources in 7.45s
time: indexed library sources in 9.97s
time: indexed library sources in 11s

samples after:

time: indexed library sources in 26s
time: indexed library sources in 22s
time: indexed library sources in 19s

So there is an over 200% slowdown as visible on the CI.

Indexing Java top level doesn't seem to be slower than for Scala but we didn't index Java files almost at all before.

MetalsBench.typeHierarchyIndex     ss   10  0.404 ± 0.008   s/op (for Scala lines: 383135)
~ 1.054 for 1m lines
MetalsBench.toplevelJavaMtags      ss   10  2.843 ± 0.112   s/op (for Java lines: 5170931)
~ 0.54  for 1m lines

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the biggest slowdown I think is because we index JDK multiple times on one machine, which doesn't seem necessary. Should we have a separate database for JDKs then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem reasonable to index JDK just once instead of doing it for every project.

@kasiaMarek kasiaMarek merged commit 0f9d773 into scalameta:main Mar 28, 2024
26 checks passed
@tgodzik
Copy link
Contributor

tgodzik commented Mar 29, 2024

Could you take a look next week if something isn't wrong, the times for some of the CI jobs jumped over 200 %

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants