-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: faster workspace mapping #138
Conversation
// we must preserve the order of results according to the given list of | ||
// workspace patterns | ||
const orderedMatches = [] | ||
for (const pattern of patterns) { | ||
orderedMatches.push(...matches.filter((m) => { | ||
return minimatch(m, pattern, { partial: true, windowsPathsNoEscape: true }) | ||
})) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a fan of this approach because there can be edge cases where the reverse matching fails. We will have to account for every case that is handled by glob
. Another problem could be a single pattern matching multiple matches
causing unnecessary duplicates. We can sift them out but it seems unnecessary to me.
The only reason we have to do this is due to this line:
matches = matches.sort((a, b) => a.localeCompare(b, 'en'))
which sorts all matches alphabetically. glob
maintains the given order of patterns automatically so we take this line out, everything works as intended. However, what's "intended" is vague. For well-defined patterns like docs
, smoke-tests
, there's no problem but if the user adds a wildcard like workspaces/*
then we have to decide on how to order the results.
If we take out the above line, the order for wildcard matches depend on glob
and I have found no way to change this behavior.
@wraithgar what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, not sorting the results alphabetically gives us an extra 5% performance boost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I didn't see this notification and it got lost in the avalanche of cli work.
I didn't quite follow what you were meaning about the "reverse matching".
As long as the results from something like workspaces/*
comes back sorted, and those results are in the same order as the entry for workspaces/*
is in the package.json, we are fine.
As far as I can tell, glob doesn't have a guaranteed sort order so that is why the sort was added here.
Sorry if this doesn't clarify, I would really like to see this land and again apologize that it slipped through the cracks.
seenPackagePathnames = new Set() | ||
seen.set(name, seenPackagePathnames) | ||
} | ||
seenPackagePathnames.add(packagePathname) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because seenPackagePathnames
is always added to at least once for every name, line 167 (the continue
below) is now unreachable:
for (const [packageName, seenPackagePathnames] of seen) {
if (seenPackagePathnames.size === 0) {
continue
}
I think at if
statement can be removed.
I have a local copy of this branch that removes the now unreachable |
This PR changes the workspace finding algorithm to be around 2x faster by:
glob
ing only once instead of for each patternignore
inglob
for negated patternsThe results on my machine look quite good:
This PR doesn't break any test or introduce any new package.
References