-
Notifications
You must be signed in to change notification settings - Fork 357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't cache canonicalize calls when containingUrl
is available
#2215
Conversation
if (await _canonicalize(importer, url, baseUrl, forImport) | ||
case var result?) { | ||
var key = (url, forImport: forImport); | ||
if (_canonicalizeCache.containsKey(key)) return _canonicalizeCache[key]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can think of an edge case with two importers [importerA, importerB]
:
- Resolve an import
x
, with containing urlu:y
.importerA
accessedcontainingUrl
, but could not resolve it, thenimporterB
resolved it without usingcontainingUrl
thus get cached. - Resolve an import
x
, with containing urlfile://a.scss
.importerA
would have been able to resolve it because now this has a differentcontainingUrl
thatimporterA
can handle, however, because of the previous cache fromimporterB
,importerA
would not even get attempted.
I think this would be unexpected, because importerA
is before importerB
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, _canonicalize()
would return (null, false)
for importerA
because containingUrl
is passed, so cacheable
will be set to false
on line 188 and importerB
's result won't be cached even if it would be cacheable in isolation (since that case
is guarded by when cacheable
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I forgot this is the first iteration that cacheable is based on whether it's passed or not, instead of whether it's used or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic should work the same if it just checks access—we'd still return (null, true)
in the case where importerA
accessed containingUrl
but couldn't resolve the import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this works correctly.
If I read it correctly, any non-cacheable importer will make all importers after it non-cacheable for the current import. And therefore if a FilesystemImporter is low priority it would not be cached and cost can be high? If that’s the case it might still be worth to have per importer cache key instead of single cache key for all importers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have [nonCacheableImporter, loadPathImporter]
, with the implementation in this PR the loadPathImporter
would not be cached, and repeatedly loading the same file from load path can get slower due to repeated I/O calls.
Thus I think it's worth to make the trade-off to use a little bit more memory to cache per importer (and even cache failed canonicalization if cacheable), and a bit more CPU to re-check cache per importer, like the pseudo code below:
for (var importer in importers) {
var key = (url, forImport: forImport, importer: importer);
if (_canonicalizeCache.containsKey(key)) {
var cached = _canonicalizeCache[key];
if (cached != null) return cached;
} else {
var (result, cacheable) = await _canonicalize(importer, url, baseUrl, forImport);
if (cacheable) {
_canonicalizeCache[key] = result;
}
if (result != null) return result;
}
}
return null;
The cost of CPU/memory overhead would likely be lower than the I/O overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's crazy to have a per-importer canonicalize cache in addition to the current whole-load-path cache, although I think we might want to be a bit more sophisticated about only filling it if we run into an uncacheable load in practice. Either way, let's save that for a follow-up after we fix the initial bug.
if (_canonicalizeCache.containsKey(key)) return _canonicalizeCache[key]; | ||
|
||
var cacheable = true; | ||
for (var importer in _importers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I was able to follow the caching logic, but it wasn't very obvious at the beginning. It would be great if the caching strategy was documented, but at the same time the documenting the strategy also feels like documenting an implementation detail so maybe it doesn't have to be documented at all 🤔.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment to explain in more detail
See #2208