Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't cache canonicalize calls when containingUrl is available #2215

Merged
merged 3 commits into from
Apr 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,13 @@
## 1.75.0

* Fix a bug in which stylesheet canonicalization could be cached incorrectly
when custom importers or the Node.js package importer made decisions based on
the URL of the containing stylesheet.

### JS API

* Allow `importer` to be passed without `url` in `StringOptionsWithImporter`.

## 1.74.1

* No user-visible changes.
Expand Down
85 changes: 53 additions & 32 deletions lib/src/async_import_cache.dart
Original file line number Diff line number Diff line change
Expand Up @@ -154,64 +154,85 @@ final class AsyncImportCache {
}

if (baseImporter != null && url.scheme == '') {
var relativeResult = await putIfAbsentAsync(
_relativeCanonicalizeCache,
(
url,
forImport: forImport,
baseImporter: baseImporter,
baseUrl: baseUrl
),
() => _canonicalize(baseImporter, baseUrl?.resolveUri(url) ?? url,
baseUrl, forImport));
var relativeResult = await putIfAbsentAsync(_relativeCanonicalizeCache, (
url,
forImport: forImport,
baseImporter: baseImporter,
baseUrl: baseUrl
), () async {
var (result, cacheable) = await _canonicalize(
baseImporter, baseUrl?.resolveUri(url) ?? url, baseUrl, forImport);
assert(
cacheable,
"Relative loads should always be cacheable because they never "
"provide access to the containing URL.");
return result;
});
if (relativeResult != null) return relativeResult;
}

return await putIfAbsentAsync(
_canonicalizeCache, (url, forImport: forImport), () async {
for (var importer in _importers) {
if (await _canonicalize(importer, url, baseUrl, forImport)
case var result?) {
var key = (url, forImport: forImport);
if (_canonicalizeCache.containsKey(key)) return _canonicalizeCache[key];
Copy link
Contributor

@ntkme ntkme Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of an edge case with two importers [importerA, importerB]:

  1. Resolve an import x, with containing url u:y. importerA accessed containingUrl, but could not resolve it, then importerB resolved it without using containingUrl thus get cached.
  2. Resolve an import x, with containing url file://a.scss. importerA would have been able to resolve it because now this has a different containingUrl that importerA can handle, however, because of the previous cache from importerB, importerA would not even get attempted.

I think this would be unexpected, because importerA is before importerB.

Copy link
Contributor Author

@nex3 nex3 Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, _canonicalize() would return (null, false) for importerA because containingUrl is passed, so cacheable will be set to false on line 188 and importerB's result won't be cached even if it would be cacheable in isolation (since that case is guarded by when cacheable).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot this is the first iteration that cacheable is based on whether it's passed or not, instead of whether it's used or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic should work the same if it just checks access—we'd still return (null, true) in the case where importerA accessed containingUrl but couldn't resolve the import.

Copy link
Contributor

@ntkme ntkme Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works correctly.

If I read it correctly, any non-cacheable importer will make all importers after it non-cacheable for the current import. And therefore if a FilesystemImporter is low priority it would not be cached and cost can be high? If that’s the case it might still be worth to have per importer cache key instead of single cache key for all importers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have [nonCacheableImporter, loadPathImporter], with the implementation in this PR the loadPathImporter would not be cached, and repeatedly loading the same file from load path can get slower due to repeated I/O calls.

Thus I think it's worth to make the trade-off to use a little bit more memory to cache per importer (and even cache failed canonicalization if cacheable), and a bit more CPU to re-check cache per importer, like the pseudo code below:

for (var importer in importers) {
    var key = (url, forImport: forImport, importer: importer);
    if (_canonicalizeCache.containsKey(key)) {
      var cached = _canonicalizeCache[key];
      if (cached != null) return cached;
    } else {
      var (result, cacheable) = await _canonicalize(importer, url, baseUrl, forImport);
      if (cacheable) {
        _canonicalizeCache[key] = result;
      }
      if (result != null) return result;
    }
}
return null;

The cost of CPU/memory overhead would likely be lower than the I/O overhead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's crazy to have a per-importer canonicalize cache in addition to the current whole-load-path cache, although I think we might want to be a bit more sophisticated about only filling it if we run into an uncacheable load in practice. Either way, let's save that for a follow-up after we fix the initial bug.


// Each indivudal call to a `canonicalize()` override may not be cacheable
// (specifically, if it has access to `containingUrl` it's too
// context-sensitive to usefully cache). We want to cache a given URL across
// the _entire_ importer chain, so we use [cacheable] to track whether _all_
// `canonicalize()` calls we've attempted are cacheable. Only if they are do
// we store the result in the cache.
var cacheable = true;
for (var importer in _importers) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I was able to follow the caching logic, but it wasn't very obvious at the beginning. It would be great if the caching strategy was documented, but at the same time the documenting the strategy also feels like documenting an implementation detail so maybe it doesn't have to be documented at all 🤔.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment to explain in more detail

switch (await _canonicalize(importer, url, baseUrl, forImport)) {
case (var result?, true) when cacheable:
_canonicalizeCache[key] = result;
return result;

case (var result?, _):
return result;
}

case (_, false):
cacheable = false;
}
}

return null;
});
if (cacheable) _canonicalizeCache[key] = null;
return null;
}

/// Calls [importer.canonicalize] and prints a deprecation warning if it
/// returns a relative URL.
///
/// If [resolveUrl] is `true`, this resolves [url] relative to [baseUrl]
/// before passing it to [importer].
Future<AsyncCanonicalizeResult?> _canonicalize(
AsyncImporter importer, Uri url, Uri? baseUrl, bool forImport,
{bool resolveUrl = false}) async {
var resolved =
resolveUrl && baseUrl != null ? baseUrl.resolveUri(url) : url;
/// This returns both the result of the call to `canonicalize()` and whether
/// that result is cacheable at all.
Future<(AsyncCanonicalizeResult?, bool cacheable)> _canonicalize(
AsyncImporter importer, Uri url, Uri? baseUrl, bool forImport) async {
var canonicalize = forImport
? () => inImportRule(() => importer.canonicalize(resolved))
: () => importer.canonicalize(resolved);
? () => inImportRule(() => importer.canonicalize(url))
: () => importer.canonicalize(url);

var passContainingUrl = baseUrl != null &&
(url.scheme == '' || await importer.isNonCanonicalScheme(url.scheme));
var result = await withContainingUrl(
passContainingUrl ? baseUrl : null, canonicalize);
if (result == null) return null;

// TODO(sass/dart-sass#2208): Determine whether the containing URL was
// _actually_ accessed rather than assuming it was.
var cacheable = !passContainingUrl || importer is FilesystemImporter;

if (result == null) return (null, cacheable);

if (result.scheme == '') {
_logger.warnForDeprecation(
Deprecation.relativeCanonical,
"Importer $importer canonicalized $resolved to $result.\n"
"Importer $importer canonicalized $url to $result.\n"
"Relative canonical URLs are deprecated and will eventually be "
"disallowed.");
} else if (await importer.isNonCanonicalScheme(result.scheme)) {
throw "Importer $importer canonicalized $resolved to $result, which "
"uses a scheme declared as non-canonical.";
throw "Importer $importer canonicalized $url to $result, which uses a "
"scheme declared as non-canonical.";
}

return (importer, result, originalUrl: resolved);
return ((importer, result, originalUrl: url), cacheable);
}

/// Tries to import [url] using one of this cache's importers.
Expand Down
84 changes: 54 additions & 30 deletions lib/src/import_cache.dart
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
// DO NOT EDIT. This file was generated from async_import_cache.dart.
// See tool/grind/synchronize.dart for details.
//
// Checksum: d157b83599dbc07a80ac6cb5ffdf5dde03b60376
// Checksum: 37dd173d676ec6cf201a25b3cca9ac81d92b1433
//
// ignore_for_file: unused_import

Expand Down Expand Up @@ -154,61 +154,85 @@ final class ImportCache {
}

if (baseImporter != null && url.scheme == '') {
var relativeResult = _relativeCanonicalizeCache.putIfAbsent(
(
url,
forImport: forImport,
baseImporter: baseImporter,
baseUrl: baseUrl
),
() => _canonicalize(baseImporter, baseUrl?.resolveUri(url) ?? url,
baseUrl, forImport));
var relativeResult = _relativeCanonicalizeCache.putIfAbsent((
url,
forImport: forImport,
baseImporter: baseImporter,
baseUrl: baseUrl
), () {
var (result, cacheable) = _canonicalize(
baseImporter, baseUrl?.resolveUri(url) ?? url, baseUrl, forImport);
assert(
cacheable,
"Relative loads should always be cacheable because they never "
"provide access to the containing URL.");
return result;
});
if (relativeResult != null) return relativeResult;
}

return _canonicalizeCache.putIfAbsent((url, forImport: forImport), () {
for (var importer in _importers) {
if (_canonicalize(importer, url, baseUrl, forImport) case var result?) {
var key = (url, forImport: forImport);
if (_canonicalizeCache.containsKey(key)) return _canonicalizeCache[key];

// Each indivudal call to a `canonicalize()` override may not be cacheable
// (specifically, if it has access to `containingUrl` it's too
// context-sensitive to usefully cache). We want to cache a given URL across
// the _entire_ importer chain, so we use [cacheable] to track whether _all_
// `canonicalize()` calls we've attempted are cacheable. Only if they are do
// we store the result in the cache.
var cacheable = true;
for (var importer in _importers) {
switch (_canonicalize(importer, url, baseUrl, forImport)) {
case (var result?, true) when cacheable:
_canonicalizeCache[key] = result;
return result;

case (var result?, _):
return result;
}

case (_, false):
cacheable = false;
}
}

return null;
});
if (cacheable) _canonicalizeCache[key] = null;
return null;
}

/// Calls [importer.canonicalize] and prints a deprecation warning if it
/// returns a relative URL.
///
/// If [resolveUrl] is `true`, this resolves [url] relative to [baseUrl]
/// before passing it to [importer].
CanonicalizeResult? _canonicalize(
Importer importer, Uri url, Uri? baseUrl, bool forImport,
{bool resolveUrl = false}) {
var resolved =
resolveUrl && baseUrl != null ? baseUrl.resolveUri(url) : url;
/// This returns both the result of the call to `canonicalize()` and whether
/// that result is cacheable at all.
(CanonicalizeResult?, bool cacheable) _canonicalize(
Importer importer, Uri url, Uri? baseUrl, bool forImport) {
var canonicalize = forImport
? () => inImportRule(() => importer.canonicalize(resolved))
: () => importer.canonicalize(resolved);
? () => inImportRule(() => importer.canonicalize(url))
: () => importer.canonicalize(url);

var passContainingUrl = baseUrl != null &&
(url.scheme == '' || importer.isNonCanonicalScheme(url.scheme));
var result =
withContainingUrl(passContainingUrl ? baseUrl : null, canonicalize);
if (result == null) return null;

// TODO(sass/dart-sass#2208): Determine whether the containing URL was
// _actually_ accessed rather than assuming it was.
var cacheable = !passContainingUrl || importer is FilesystemImporter;

if (result == null) return (null, cacheable);

if (result.scheme == '') {
_logger.warnForDeprecation(
Deprecation.relativeCanonical,
"Importer $importer canonicalized $resolved to $result.\n"
"Importer $importer canonicalized $url to $result.\n"
"Relative canonical URLs are deprecated and will eventually be "
"disallowed.");
} else if (importer.isNonCanonicalScheme(result.scheme)) {
throw "Importer $importer canonicalized $resolved to $result, which "
"uses a scheme declared as non-canonical.";
throw "Importer $importer canonicalized $url to $result, which uses a "
"scheme declared as non-canonical.";
}

return (importer, result, originalUrl: resolved);
return ((importer, result, originalUrl: url), cacheable);
}

/// Tries to import [url] using one of this cache's importers.
Expand Down
4 changes: 4 additions & 0 deletions pkg/sass_api/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
## 10.2.0

* No user-visible changes.

## 10.1.1

* No user-visible changes.
Expand Down
4 changes: 2 additions & 2 deletions pkg/sass_api/pubspec.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,15 @@ name: sass_api
# Note: Every time we add a new Sass AST node, we need to bump the *major*
# version because it's a breaking change for anyone who's implementing the
# visitor interface(s).
version: 10.1.1
version: 10.2.0
description: Additional APIs for Dart Sass.
homepage: https://github.com/sass/dart-sass

environment:
sdk: ">=3.0.0 <4.0.0"

dependencies:
sass: 1.74.1
sass: 1.75.0

dev_dependencies:
dartdoc: ^6.0.0
Expand Down
2 changes: 1 addition & 1 deletion pubspec.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
name: sass
version: 1.74.1
version: 1.75.0
description: A Sass implementation in Dart.
homepage: https://github.com/sass/dart-sass

Expand Down
Loading