new_audit(cache-headers): detects savings from leveraging caching #3531

Merged: 14 commits, Nov 17, 2017
@@ -140,11 +140,11 @@ class UnusedBytes extends Audit {
rawValue: wastedMs,
score: UnusedBytes.scoreForWastedMs(wastedMs),
extendedInfo: {
- value: {
+ value: Object.assign({
wastedMs,
wastedKb,
results,
- },
+ }, result.extendedInfo),
Member

maybe add a comment that this merges in any extendedInfo provided by the derived audit?

Collaborator Author

done

},
details: tableDetails,
};
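
A minimal sketch of what the `Object.assign` change in the hunk above does. The helper name `buildExtendedInfoValue` and the sample values are illustrative only; `queryStringCount` is borrowed from the cache-headers audit later in this PR. The base audit's values are merged with whatever `extendedInfo` the derived audit returns:

```javascript
// Sketch only: mirrors the merge in the hunk above with made-up values.
function buildExtendedInfoValue(baseValue, derivedExtendedInfo) {
  // Object.assign ignores null/undefined sources, so a derived audit that
  // returns no extendedInfo is fine. On key collisions the later source wins.
  return Object.assign({}, baseValue, derivedExtendedInfo);
}

const baseValue = {wastedMs: 120, wastedKb: 34, results: []};
const merged = buildExtendedInfoValue(baseValue, {queryStringCount: 2});
const unchanged = buildExtendedInfoValue(baseValue, undefined);

console.log(Object.keys(merged));
console.log(Object.keys(unchanged));
```

One consequence worth keeping in mind: because later sources win, a derived audit that happened to return a key such as `wastedMs` in its `extendedInfo` would overwrite the base value.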
226 changes: 226 additions & 0 deletions lighthouse-core/audits/byte-efficiency/cache-headers.js
@@ -0,0 +1,226 @@
/**
* @license Copyright 2017 Google Inc. All Rights Reserved.
* Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
* Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
*/
'use strict';

const assert = require('assert');
const parseCacheControl = require('parse-cache-control');
const ByteEfficiencyAudit = require('./byte-efficiency-audit');
const formatDuration = require('../../report/v2/renderer/util.js').formatDuration;
const WebInspector = require('../../lib/web-inspector');
const URL = require('../../lib/url-shim');

// Ignore assets that have a very high likelihood of a cache hit (threshold is a fraction between 0 and 1)
const IGNORE_THRESHOLD_IN_PERCENT = 0.95;
// Assume a 10% chance of a repeat visit
const PROBABILITY_OF_RETURN_VISIT = 0.1;

class CacheHeaders extends ByteEfficiencyAudit {
/**
* @return {number}
*/
static get PROBABILITY_OF_RETURN_VISIT() {
Member

does this need a getter? (e.g. vs IGNORE_THRESHOLD_IN_PERCENT)

Collaborator Author

makes the tests easier than copy pasting

Member

@brendankenny, Nov 2, 2017

> makes the tests easier than copy pasting

but it's not used in a test? :)

Collaborator Author

is too 😛

const DISCOUNT_MULTIPLIER = CacheHeadersAudit.PROBABILITY_OF_RETURN_VISIT;

return PROBABILITY_OF_RETURN_VISIT;
}

/**
* @return {!AuditMeta}
*/
static get meta() {
return {
category: 'Caching',
name: 'cache-headers',
Member

bikeshedding on name? It's not just 'cache-headers' but also a judgement of them. asset-cache-length? asset-caching-ttl?

Collaborator Author

hm if we go by consistency with the other byte efficiency audits they basically fall into either <noun of thing being detected> or uses-<best practice we're encouraging>

how about...

uncached-assets
low-cache-ttl
uses-caching
uses-cache-headers
uses-long-cache-ttl
?

Member

uses-long-cache-ttl certainly isn't exactly catchy but describes it well :) I like that since it's not just use, it's (if they're used) that they're long

Collaborator Author

done

informative: true,
helpText:
'A well-defined cache policy can speed up repeat visits to your page. ' +
'[Learn more](https://developers.google.com/speed/docs/insights/LeverageBrowserCaching).',
description: 'Leverage browser caching for static assets',
requiredArtifacts: ['devtoolsLogs'],
};
}

/**
* Interpolates the y value at a point x on the line defined by (x0, y0) and (x1, y1)
* @param {number} x0
* @param {number} y0
* @param {number} x1
* @param {number} y1
* @param {number} x
* @return {number}
*/
static linearInterpolation(x0, y0, x1, y1, x) {
const slope = (y1 - y0) / (x1 - x0);
return y0 + (x - x0) * slope;
}
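
A small worked example of the interpolation above, written as a free function so it runs on its own. The control points (12 hours at 0.5, 24 hours at 0.6) are taken from the decile table used later in this file; the 18 hour query point is just an illustration.

```javascript
// Same formula as linearInterpolation above, as a standalone function.
function linearInterpolation(x0, y0, x1, y1, x) {
  const slope = (y1 - y0) / (x1 - x0);
  return y0 + (x - x0) * slope;
}

// Halfway between (12, 0.5) and (24, 0.6), so the result is about 0.55.
const midpoint = linearInterpolation(12, 0.5, 24, 0.6, 18);
// At the control points themselves the line passes through y0 and y1.
const atStart = linearInterpolation(12, 0.5, 24, 0.6, 12);

console.log(midpoint.toFixed(2), atStart.toFixed(2));
```

Note that the formula divides by `x1 - x0`, so callers need to make sure the two control points have distinct x values.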

/**
* Computes the likelihood (0 to 1) that a return visit will fall within the cache lifetime, based on
* Chrome UMA stats; see the note below.
* @param {number} maxAgeInSeconds
* @return {number}
*/
static getCacheHitProbability(maxAgeInSeconds) {
// This array contains the hand-wavy distribution of the age of a resource in hours at the time of
// a cache hit at the 0th, 10th, 20th, 30th, etc. percentiles. It is used to compute `wastedMs` since there
// are clearly diminishing returns to cache duration, i.e. 6 months is not 2x better than 3 months.
// Based on UMA stats for HttpCache.StaleEntry.Validated.Age, see https://www.desmos.com/calculator/7v0qh1nzvh
Member

do we need to look at other cache entry stats too? It seems like this is only stale entries (so biases toward later next visits as non-stale entries would just be loaded and not log here?) and only for assets that qualify for 304 checks

// Example: a max-age of 12 hours already covers ~50% of cases; doubling to 24 hours covers only ~10% more.
const RESOURCE_AGE_IN_HOURS_DECILES = [0, 0.2, 1, 3, 8, 12, 24, 48, 72, 168, 8760, Infinity];
Member

how about adding this guy

console.assert(RESOURCE_AGE_IN_HOURS_DECILES.length === 10, 'deci means 10, yo')

Collaborator Author

done with require('assert')

assert.ok(RESOURCE_AGE_IN_HOURS_DECILES.length === 12, 'deciles 0-10 and 1 for overflow');

const maxAgeInHours = maxAgeInSeconds / 3600;
const upperDecileIndex = RESOURCE_AGE_IN_HOURS_DECILES.findIndex(
decile => decile >= maxAgeInHours
);

// Clip the likelihood between 0 and 1
if (upperDecileIndex === RESOURCE_AGE_IN_HOURS_DECILES.length - 1) return 1;
if (upperDecileIndex === 0) return 0;

// Use the two closest decile points as control points
const upperDecileValue = RESOURCE_AGE_IN_HOURS_DECILES[upperDecileIndex];
const lowerDecileValue = RESOURCE_AGE_IN_HOURS_DECILES[upperDecileIndex - 1];
const upperDecile = upperDecileIndex / 10;
const lowerDecile = (upperDecileIndex - 1) / 10;

// Approximate the real likelihood with linear interpolation
return CacheHeaders.linearInterpolation(
lowerDecileValue,
lowerDecile,
upperDecileValue,
upperDecile,
maxAgeInHours
);
}
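
To make the decile lookup above concrete, here is a standalone sketch of the same logic with a few sample inputs. The function name `cacheHitProbability` and the constant name `DECILES_IN_HOURS` are local to this example; the table values and the clipping rules are copied from the method above.

```javascript
// Decile table copied from getCacheHitProbability: index i is the (i * 10)th percentile.
const DECILES_IN_HOURS = [0, 0.2, 1, 3, 8, 12, 24, 48, 72, 168, 8760, Infinity];

function cacheHitProbability(maxAgeInSeconds) {
  const maxAgeInHours = maxAgeInSeconds / 3600;
  const upper = DECILES_IN_HOURS.findIndex(decile => decile >= maxAgeInHours);

  // Clip: anything past the last finite entry counts as 1, zero or less as 0.
  if (upper === DECILES_IN_HOURS.length - 1) return 1;
  if (upper === 0) return 0;

  // Interpolate linearly between the two surrounding decile points.
  const x0 = DECILES_IN_HOURS[upper - 1];
  const x1 = DECILES_IN_HOURS[upper];
  const y0 = (upper - 1) / 10;
  const y1 = upper / 10;
  return y0 + (maxAgeInHours - x0) * (y1 - y0) / (x1 - x0);
}

const HOUR = 3600;
for (const hours of [0, 12, 18, 24, 2 * 8760]) {
  console.log(`${hours}h -> ${cacheHitProbability(hours * HOUR).toFixed(2)}`);
}
```

With this table, 12 hours maps to roughly 0.5 and 24 hours to roughly 0.6, which is the diminishing-returns behaviour the code comment describes.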

/**
* Computes the user-specified cache lifetime, 0 if explicit no-cache policy is in effect, and null if not
* user-specified. See https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
*
* @param {!Map<string,string>} headers
* @param {!Object} cacheControl Follows the potential settings of cache-control, see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control
* @return {?number}
*/
static computeCacheLifetimeInSeconds(headers, cacheControl) {
if (cacheControl) {
// Cache-Control takes precedence over Expires
if (cacheControl['no-cache'] || cacheControl['no-store']) return 0;
if (Number.isFinite(cacheControl['max-age'])) return Math.max(cacheControl['max-age'], 0);
} else if ((headers.get('pragma') || '').includes('no-cache')) {
// The HTTP/1.0 Pragma header can disable caching if cache-control is not set, see https://tools.ietf.org/html/rfc7234#section-5.4
return 0;
}

if (headers.has('expires')) {
const expires = new Date(headers.get('expires')).getTime();
Member

yay for standards that enable this parser to handle the http date format. \o/

// Invalid expires values MUST be treated as already expired
if (!expires) return 0;
return Math.max(0, Math.ceil((expires - Date.now()) / 1000));
}

return null;
}
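
The branches above are easier to follow with a few inputs. This sketch repeats the method as a standalone function and feeds it hand-written header maps. The `cacheControl` objects are stand-ins for what `parse-cache-control` is assumed to return (a numeric `max-age` and boolean flags), not output captured from the library.

```javascript
// Standalone copy of the logic in computeCacheLifetimeInSeconds above.
function computeCacheLifetimeInSeconds(headers, cacheControl) {
  if (cacheControl) {
    if (cacheControl['no-cache'] || cacheControl['no-store']) return 0;
    if (Number.isFinite(cacheControl['max-age'])) return Math.max(cacheControl['max-age'], 0);
  } else if ((headers.get('pragma') || '').includes('no-cache')) {
    return 0;
  }

  if (headers.has('expires')) {
    const expires = new Date(headers.get('expires')).getTime();
    if (!expires) return 0; // unparseable date (NaN) counts as already expired
    return Math.max(0, Math.ceil((expires - Date.now()) / 1000));
  }

  return null;
}

const cases = {
  maxAge: computeCacheLifetimeInSeconds(new Map(), {'max-age': 3600}),
  noStore: computeCacheLifetimeInSeconds(new Map(), {'no-store': true}),
  pragma: computeCacheLifetimeInSeconds(new Map([['pragma', 'no-cache']]), null),
  badExpires: computeCacheLifetimeInSeconds(new Map([['expires', 'not a date']]), null),
  pastExpires: computeCacheLifetimeInSeconds(
    new Map([['expires', 'Thu, 01 Jan 1970 00:00:10 GMT']]), null),
  unspecified: computeCacheLifetimeInSeconds(new Map(), null),
};

console.log(cases);
```

The distinction between `0` and `null` matters to the caller: `0` means caching was explicitly disabled or has expired, while `null` means the response said nothing about caching at all.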

/**
* Given a network record, returns whether we believe the asset is cacheable, i.e. it was a network
* request that satisfied the conditions:
*
* 1. Has a cacheable status code
* 2. Has a resource type that corresponds to static assets (image, script, stylesheet, etc).
*
* Allowing assets with a query string is debatable, PSI considered them non-cacheable with a similar
* caveat.
*
* TODO: Investigate impact in HTTPArchive, experiment with this policy to see what changes.
*
* @param {!WebInspector.NetworkRequest} record
* @return {boolean}
*/
static isCacheableAsset(record) {
const CACHEABLE_STATUS_CODES = new Set([200, 203, 206]);

const STATIC_RESOURCE_TYPES = new Set([
WebInspector.resourceTypes.Font,
WebInspector.resourceTypes.Image,
WebInspector.resourceTypes.Media,
WebInspector.resourceTypes.Script,
WebInspector.resourceTypes.Stylesheet,
]);

const resourceUrl = record._url;
return (
CACHEABLE_STATUS_CODES.has(record.statusCode) &&
Member

same deal with CACHEABLE_STATUS_CODES and position.

Collaborator Author

done

STATIC_RESOURCE_TYPES.has(record._resourceType) &&
!resourceUrl.includes('data:')
);
}
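
A rough sketch of how the filter above behaves on a few records. The plain objects and the string resource types are hypothetical stand-ins for `WebInspector.NetworkRequest` and `WebInspector.resourceTypes`, which are not available outside Lighthouse, and the sample URLs are made up.

```javascript
// Hypothetical stand-ins for the WebInspector.resourceTypes.* values.
const STATIC_TYPES = new Set(['font', 'image', 'media', 'script', 'stylesheet']);
const CACHEABLE_STATUS_CODES = new Set([200, 203, 206]);

// Same three conditions as isCacheableAsset above, on simplified records.
function isCacheableAsset(record) {
  return (
    CACHEABLE_STATUS_CODES.has(record.statusCode) &&
    STATIC_TYPES.has(record.resourceType) &&
    !record.url.includes('data:')
  );
}

const records = [
  {url: 'https://example.com/app.js', statusCode: 200, resourceType: 'script'},
  {url: 'https://example.com/old.js', statusCode: 301, resourceType: 'script'},
  {url: 'https://example.com/', statusCode: 200, resourceType: 'document'},
  {url: 'data:image/png;base64,AAAA', statusCode: 200, resourceType: 'image'},
];

const verdicts = records.map(isCacheableAsset);
console.log(verdicts);
```

Redirects drop out because their status codes are not in the allowed set, which matches the answer given in the review thread further down.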

/**
* @param {!Artifacts} artifacts
* @return {!AuditResult}
*/
static audit_(artifacts) {
const devtoolsLogs = artifacts.devtoolsLogs[ByteEfficiencyAudit.DEFAULT_PASS];
return artifacts.requestNetworkRecords(devtoolsLogs).then(records => {
const results = [];
let queryStringCount = 0;

for (const record of records) {
if (!CacheHeaders.isCacheableAsset(record)) continue;
Member

are redirects just filtered out in that fn?

Collaborator Author

yup, only 200, 203, 206 allowed


const headers = new Map();
for (const header of record._responseHeaders) {
headers.set(header.name.toLowerCase(), header.value);
}

const cacheControl = parseCacheControl(headers.get('cache-control'));
let cacheLifetimeInSeconds = CacheHeaders.computeCacheLifetimeInSeconds(
headers,
cacheControl
);

// Ignore assets with an explicit no-cache policy
if (cacheLifetimeInSeconds === 0) continue;
// No policy at all (null) is treated as a lifetime of 0 and is still reported
cacheLifetimeInSeconds = cacheLifetimeInSeconds || 0;

let cacheHitProbability = CacheHeaders.getCacheHitProbability(cacheLifetimeInSeconds);
if (cacheHitProbability > IGNORE_THRESHOLD_IN_PERCENT) continue;

const url = URL.elideDataURI(record._url);
const totalBytes = record._transferSize;
const wastedBytes = (1 - cacheHitProbability) * totalBytes * PROBABILITY_OF_RETURN_VISIT;
const cacheLifetimeDisplay = formatDuration(cacheLifetimeInSeconds);
cacheHitProbability = `~${Math.round(cacheHitProbability * 100)}%`;

if (url.includes('?')) queryStringCount++;

results.push({
url,
cacheControl,
cacheLifetimeInSeconds,
cacheLifetimeDisplay,
cacheHitProbability,
totalBytes,
wastedBytes,
});
}

const headings = [
{key: 'url', itemType: 'url', text: 'URL'},
{key: 'totalKb', itemType: 'text', text: 'Size (KB)'},
{key: 'cacheLifetimeDisplay', itemType: 'text', text: 'Cache TTL'},
{key: 'cacheHitProbability', itemType: 'text', text: 'Probability of Cache Hit (%)'},
];

return {
results,
Member

let's sort these.
i guess by like totalKb * cacheMissLikelihood ?

Collaborator Author

@patrickhulce, Oct 19, 2017

It should already be sorted by that thanks to

const results = result.results
.map(item => {
const wastedPercent = 100 * item.wastedBytes / item.totalBytes;
item.wastedKb = this.bytesToKbString(item.wastedBytes);
item.wastedMs = this.bytesToMsString(item.wastedBytes, networkThroughput);
item.totalKb = this.bytesToKbString(item.totalBytes);
item.totalMs = this.bytesToMsString(item.totalBytes, networkThroughput);
item.potentialSavings = this.toSavingsString(item.wastedBytes, wastedPercent);
return item;
})
.sort((itemA, itemB) => itemB.wastedBytes - itemA.wastedBytes);

👍

headings,
extendedInfo: {queryStringCount},
};
});
}
}

module.exports = CacheHeaders;
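
A worked example of the `wastedBytes` arithmetic in `audit_`, as a sketch. The helper name `estimateWastedBytes` and the 100,000 byte asset are made up; the two constants and the formula come from the file above. In the audit itself, assets above the threshold are skipped rather than reported with zero waste.

```javascript
const PROBABILITY_OF_RETURN_VISIT = 0.1; // same value as in the audit
const IGNORE_THRESHOLD = 0.95; // IGNORE_THRESHOLD_IN_PERCENT in the audit

// Expected bytes re-downloaded per page load, discounted by how unlikely
// a return visit is and by how likely the cache entry is still fresh.
function estimateWastedBytes(totalBytes, cacheHitProbability) {
  if (cacheHitProbability > IGNORE_THRESHOLD) return 0; // the audit skips these
  return (1 - cacheHitProbability) * totalBytes * PROBABILITY_OF_RETURN_VISIT;
}

const noPolicy = estimateWastedBytes(100000, 0); // no cache headers at all
const halfDay = estimateWastedBytes(100000, 0.5); // max-age of 12 hours
const longLived = estimateWastedBytes(100000, 0.99); // very long max-age

console.log(Math.round(noPolicy), Math.round(halfDay), Math.round(longLived));
```

So an uncached 100,000 byte asset is charged about 10,000 wasted bytes, and a 12 hour lifetime halves that, which is why the table sorted by `wastedBytes` tends to surface large assets with short or missing lifetimes first.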
2 changes: 2 additions & 0 deletions lighthouse-core/config/default.js
@@ -124,6 +124,7 @@ module.exports = {
'accessibility/valid-lang',
'accessibility/video-caption',
'accessibility/video-description',
'byte-efficiency/cache-headers',
'byte-efficiency/total-byte-weight',
'byte-efficiency/offscreen-images',
'byte-efficiency/uses-webp-images',
@@ -229,6 +230,7 @@ module.exports = {
{id: 'consistently-interactive', weight: 5, group: 'perf-metric'},
{id: 'speed-index-metric', weight: 1, group: 'perf-metric'},
{id: 'estimated-input-latency', weight: 1, group: 'perf-metric'},
{id: 'cache-headers', weight: 0, group: 'perf-hint'},
{id: 'link-blocking-first-paint', weight: 0, group: 'perf-hint'},
{id: 'script-blocking-first-paint', weight: 0, group: 'perf-hint'},
{id: 'uses-responsive-images', weight: 0, group: 'perf-hint'},
30 changes: 30 additions & 0 deletions lighthouse-core/report/v2/renderer/util.js
@@ -83,6 +83,36 @@ class Util {
}
return formatter.format(new Date(date));
}
/**
* Converts a time in seconds into a duration string, e.g. `1 d 2 h 13 m 52 s`
* @param {number} timeInSeconds
* @param {string=} zeroLabel
* @return {string}
*/
static formatDuration(timeInSeconds, zeroLabel = 'None') {
if (timeInSeconds === 0) {
return zeroLabel;
}

const parts = [];
const unitLabels = /** @type {!Object<string, number>} */ ({
d: 60 * 60 * 24,
h: 60 * 60,
m: 60,
s: 1,
});

Object.keys(unitLabels).forEach(label => {
const unit = unitLabels[label];
const numberOfUnits = Math.floor(timeInSeconds / unit);
if (numberOfUnits > 0) {
timeInSeconds -= numberOfUnits * unit;
parts.push(`${numberOfUnits}\xa0${label}`);
}
});

return parts.join(' ');
}
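
A quick usage sketch for `formatDuration`, with the method copied into a standalone function so it runs without the rest of `Util`. The sample inputs are illustrative. Note that each number and its unit label are joined by a non-breaking space (`\xa0`), while the parts themselves are joined by regular spaces.

```javascript
// Standalone copy of Util.formatDuration from the hunk above.
function formatDuration(timeInSeconds, zeroLabel = 'None') {
  if (timeInSeconds === 0) {
    return zeroLabel;
  }

  const parts = [];
  const unitLabels = {d: 60 * 60 * 24, h: 60 * 60, m: 60, s: 1};

  // Object.keys preserves insertion order for these keys: d, h, m, s.
  Object.keys(unitLabels).forEach(label => {
    const unit = unitLabels[label];
    const numberOfUnits = Math.floor(timeInSeconds / unit);
    if (numberOfUnits > 0) {
      timeInSeconds -= numberOfUnits * unit;
      parts.push(`${numberOfUnits}\xa0${label}`);
    }
  });

  return parts.join(' ');
}

const full = formatDuration(94432); // 1 day, 2 hours, 13 minutes, 52 seconds
const oneHour = formatDuration(3600);
const none = formatDuration(0);
const subSecond = formatDuration(0.5);

console.log([full, oneHour, none, JSON.stringify(subSecond)]);
```

Units with a zero count are omitted, and a positive input below one second produces an empty string rather than the zero label, since only an exact `0` takes the early return.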

/**
* @param {!URL} parsedUrl