Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: faster workspace mapping #138

Closed
wants to merge 4 commits into from
Closed

Conversation

thecodrr
Copy link
Contributor

This PR changes the workspace finding algorithm to be around 2x faster by:

  1. globing only once instead of for each pattern
  2. Using ignore in glob for negated patterns

The results on my machine look quite good:

┌─────────┬───────────────┬──────────┬────────────────────┬───────────┬─────────┐
│ (index) │   Task Name   │ ops/sec  │ Average Time (ns)  │  Margin   │ Samples │
├─────────┼───────────────┼──────────┼────────────────────┼───────────┼─────────┤
│    0    │     'old'     │   '97'   │ 10300189.995765686 │ '±58.58%' │   10    │
│    1    │     'new'     │  '195'   │ 5119704.985618591  │ '±6.59%'  │   20    │
│    2    │ 'new virtual' │ '24,096' │  41499.5856304881  │ '±3.61%'  │  2410   │
│    3    │ 'old virtual' │ '54,503' │ 18347.38597894367  │ '±1.64%'  │  5451   │
└─────────┴───────────────┴──────────┴────────────────────┴───────────┴─────────┘

This PR doesn't break any test or introduce any new package.

References

@thecodrr thecodrr requested a review from a team as a code owner December 22, 2023 17:56
lib/index.js Outdated Show resolved Hide resolved
Comment on lines +130 to +137
// we must preserve the order of results according to the given list of
// workspace patterns
const orderedMatches = []
for (const pattern of patterns) {
orderedMatches.push(...matches.filter((m) => {
return minimatch(m, pattern, { partial: true, windowsPathsNoEscape: true })
}))
}
Copy link
Contributor Author

@thecodrr thecodrr Jan 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of this approach because there can be edge cases where the reverse matching fails. We will have to account for every case that is handled by glob. Another problem could be a single pattern matching multiple matches causing unnecessary duplicates. We can sift them out but it seems unnecessary to me.

The only reason we have to do this is due to this line:

matches = matches.sort((a, b) => a.localeCompare(b, 'en'))

which sorts all matches alphabetically. glob maintains the given order of patterns automatically so we take this line out, everything works as intended. However, what's "intended" is vague. For well-defined patterns like docs, smoke-tests, there's no problem but if the user adds a wildcard like workspaces/* then we have to decide on how to order the results.

If we take out the above line, the order for wildcard matches depend on glob and I have found no way to change this behavior.

@wraithgar what do you think?

Copy link
Contributor Author

@thecodrr thecodrr Jan 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, not sorting the results alphabetically gives us an extra 5% performance boost.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I didn't see this notification and it got lost in the avalanche of cli work.

I didn't quite follow what you were meaning about the "reverse matching".

As long as the results from something like workspaces/* comes back sorted, and those results are in the same order as the entry for workspaces/* is in the package.json, we are fine.

As far as I can tell, glob doesn't have a guaranteed sort order so that is why the sort was added here.

Sorry if this doesn't clarify, I would really like to see this land and again apologize that it slipped through the cracks.

@wraithgar wraithgar self-assigned this Jan 8, 2024
@wraithgar wraithgar changed the title 2x faster workspace mapping fix: faster workspace mapping Jan 31, 2024
seenPackagePathnames = new Set()
seen.set(name, seenPackagePathnames)
}
seenPackagePathnames.add(packagePathname)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because seenPackagePathnames is always added to at least once for every name, line 167 (the continue below) is now unreachable:

  for (const [packageName, seenPackagePathnames] of seen) {
    if (seenPackagePathnames.size === 0) {
      continue
    }

I think at if statement can be removed.

@wraithgar
Copy link
Member

wraithgar commented Apr 10, 2024

I have a local copy of this branch that removes the now unreachable if condition, and adds a test for the last line of coverage missing. I will make a new PR for it and we can land that if we're ready. I think this works as-is, even if there may be more we can do in the future.

#143

wraithgar added a commit that referenced this pull request Apr 10, 2024
This is a cleanup of #138 to
get code coverage finished in the tests.

Credit: @thecodrr

---------

Co-authored-by: Abdullah Atta <abdullahatta@streetwriters.co>
@wraithgar wraithgar closed this Apr 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants