Performance 2022 Queries #2974

Merged
merged 47 commits into from
Aug 18, 2022
Commits
7b2fba4
Update README.md
konfirmed Jun 15, 2022
8c45c7d
Update README.md
konfirmed Jun 15, 2022
8483099
WIP
konfirmed Jul 18, 2022
27cc716
gaming metric query by count & percent
25prathamesh Jul 24, 2022
9760242
fix linting error gaming metrics sql file
25prathamesh Jul 25, 2022
db11212
add list of all queries
mel-ada Jul 26, 2022
d6d0ed4
update MD syntax
mel-ada Jul 26, 2022
aa41861
update comment on FID / INP Long Task Data
mel-ada Jul 26, 2022
fb23a47
forgot to save - updating FID/INP comment once again
mel-ada Jul 26, 2022
ce71870
correct linter errors
mel-ada Jul 26, 2022
219fd77
Antipatterns - LCP Lazy Loaded
25prathamesh Aug 1, 2022
a7c88fb
lin issue fix for gaming sql file
25prathamesh Aug 2, 2022
143e857
lin issue fix for gaming sql file
25prathamesh Aug 2, 2022
6ba495d
linter error
konfirmed Aug 4, 2022
4b73e2a
linter issue
konfirmed Aug 4, 2022
5163699
Linter issue fixed
konfirmed Aug 4, 2022
47e15cc
changed date from 2021 to 2022
konfirmed Aug 4, 2022
5f9afc7
lazy load LCP element by node & client query fix
25prathamesh Aug 7, 2022
66945ad
lcp responsiveness data query file
25prathamesh Aug 7, 2022
ceec699
linting errors fixed
25prathamesh Aug 8, 2022
72cc52c
linting errors fixed
25prathamesh Aug 8, 2022
439357e
LCP element (preload added)
konfirmed Aug 8, 2022
57c035d
linter error
konfirmed Aug 8, 2022
3227c98
Added fetch priority
konfirmed Aug 8, 2022
13806bd
Copies 2021 queries
siakaramalegos Aug 8, 2022
e81d3a9
Switch to 2022 and add INP
siakaramalegos Aug 8, 2022
e03f0a6
Fix INP label
siakaramalegos Aug 8, 2022
016d51d
update query list
mel-ada Aug 9, 2022
6681032
add note for synthetic filtering metric
mel-ada Aug 9, 2022
bcb9d71
Add LCP element query with fetchpriority
siakaramalegos Aug 9, 2022
7680d67
Merge branch 'peformance-2022-sql' of github.com:HTTPArchive/almanac.…
siakaramalegos Aug 9, 2022
063aea3
Trim trailing white space
siakaramalegos Aug 9, 2022
b94582b
Add sample data tables
siakaramalegos Aug 9, 2022
6f39b87
Point to object instead of array
siakaramalegos Aug 9, 2022
6a00886
Query same table on join
siakaramalegos Aug 9, 2022
e6e0cde
Fix INP thresholds
siakaramalegos Aug 9, 2022
7e3101e
fid disable zoom , ttfb by cms query performance chapter
25prathamesh Aug 14, 2022
d7ed770
readme update
25prathamesh Aug 14, 2022
7e001c4
linting issue fix for prev commit
25prathamesh Aug 14, 2022
3dbb025
linter
rviscomi Aug 15, 2022
5538977
lcp preload
rviscomi Aug 16, 2022
3dbd767
inp, ttfb, hosts
rviscomi Aug 17, 2022
d79c144
linter, lcp resource delay
rviscomi Aug 17, 2022
3bc777f
fix lcp resource delay
rviscomi Aug 17, 2022
1d5ec80
linter
rviscomi Aug 17, 2022
e3846cd
linter
rviscomi Aug 18, 2022
b2ac89c
lcp same-host
rviscomi Aug 18, 2022
30 changes: 30 additions & 0 deletions sql/2022/performance/README.md
@@ -18,3 +18,33 @@
[~google-doc]: https://docs.google.com/document/d/1IKV40fllCZTqeu-R6-73ckjQR9S6jiBfVBBfdcpAMkI/edit?usp=sharing
[~google-sheets]: https://docs.google.com/spreadsheets/d/1TPA_4xRTBB2fQZaBPZHVFvD0ikrR-4sNkfJfUEpjibs/edit?usp=sharing
[~chapter-markdown]: https://github.com/HTTPArchive/almanac.httparchive.org/tree/main/src/content/en/2022/performance.md

## Query List

Taken from the [`Metrics` section](https://docs.google.com/document/d/1IKV40fllCZTqeu-R6-73ckjQR9S6jiBfVBBfdcpAMkI/edit#heading=h.zbvh8yhwkp2i) of the planning doc, which also holds the technical notes on how the queries might be formulated.

## Query notes

### `inp_long_tasks.sql`

This query generates a scatterplot of field-based, page-level p75 INP versus the sum of all lab-based long tasks on the page according to Lighthouse.

Because the lab-based mobile tests throttle the CPU, total long task time is expected to be higher on mobile than on desktop. Comparing the trendlines for desktop and mobile shows that higher INP correlates with more long task time on both.
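
The shape of this computation can be sketched outside BigQuery. The JavaScript below is a minimal illustration, not the query itself: `totalLongTaskTime` and `samplePoints` are hypothetical helper names, the report object is a simplified stand-in for a Lighthouse result, and the `Math.max(1, ...)` guard for small inputs is an assumption of the sketch.

```javascript
// Sum the durations (ms) of all long tasks reported by the Lighthouse
// `long-tasks` audit. `report` is a simplified, hypothetical report object.
function totalLongTaskTime(report) {
  const items = report?.audits?.['long-tasks']?.details?.items ?? [];
  return items.reduce((sum, item) => sum + (item.duration ?? 0), 0);
}

// Thin a list of { inp, longTasks } points down to roughly `target` points by
// ordering on INP and keeping every k-th row, in the spirit of the query's
// MOD(row, FLOOR(n / 1000)) = 0 filter.
function samplePoints(points, target = 1000) {
  const sorted = [...points].sort((a, b) => a.inp - b.inp);
  // Guard against a step of 0 when there are fewer points than `target`.
  const step = Math.max(1, Math.floor(sorted.length / target));
  // Rows are numbered from 1, as with ROW_NUMBER().
  return sorted.filter((_, i) => (i + 1) % step === 0);
}
```

Taking rows at regular intervals along the INP ordering keeps the scatterplot small while preserving the spread of INP values.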

### `lcp_resource_delay.sql`

This query subtracts the lab-based TTFB time from the lab-based LCP element's request start time and generates a distribution over all pages.

The `lcp_elem_stats.startTime` and `lcp_resource.timestamp` values did not seem to correspond to the actual LCP element request start times seen in the WPT results, so the query takes the more expensive route of joining the pages data with the more reliable requests data.
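
As a rough illustration of the arithmetic, the JavaScript sketch below computes the per-page delay and summarizes it as percentiles. The function names, the `ttfbMs` and `lcpRequestStartMs` field names, and the nearest-rank percentile method are all assumptions made for the example; the query may compute its distribution differently.

```javascript
// Resource delay for one page: how long after the first byte of the document
// arrived the request for the LCP resource started. Both inputs are assumed
// to be milliseconds measured from the same navigation start.
function lcpResourceDelay(ttfbMs, lcpRequestStartMs) {
  if (!Number.isFinite(ttfbMs) || !Number.isFinite(lcpRequestStartMs)) {
    return null; // Missing data, or no request for the LCP element.
  }
  return lcpRequestStartMs - ttfbMs;
}

// Nearest-rank percentile (an assumption of this sketch).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Distribution of the delay over all pages that have both timings.
function delayDistribution(pages, percentiles = [10, 25, 50, 75, 90]) {
  const delays = pages
    .map(page => lcpResourceDelay(page.ttfbMs, page.lcpRequestStartMs))
    .filter(delay => delay !== null);
  return Object.fromEntries(
    percentiles.map(p => [`p${p}`, percentile(delays, p)])
  );
}
```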

### `prelcp_domain_sharding.sql`

This query computes the distribution of the number of unique hosts connected to before the lab-based LCP time, broken down by whether the LCP was good, needs improvement (NI), or poor.
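
A minimal JavaScript sketch of the per-page counting step follows. It assumes each request is a `{ url, startMs }` object and treats a host as "connected" once a request to it has started before LCP; both are simplifications for illustration, and the helper names are hypothetical. The bucket boundaries are the standard LCP thresholds of 2.5 s and 4 s.

```javascript
// Count the distinct hostnames requested before the page's LCP time.
function hostsBeforeLcp(requests, lcpMs) {
  const hosts = new Set();
  for (const { url, startMs } of requests) {
    if (startMs < lcpMs) {
      hosts.add(new URL(url).hostname);
    }
  }
  return hosts.size;
}

// Classify an LCP time (ms): good <= 2500, poor > 4000, otherwise NI.
function lcpBucket(lcpMs) {
  if (lcpMs <= 2500) return 'good';
  if (lcpMs <= 4000) return 'ni';
  return 'poor';
}
```

Grouping the per-page host counts by `lcpBucket` and taking percentiles within each group would then give a breakdown of the kind described above.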

### `ttfb_by_rendering.sql`

This query segments pages by whether they use client-side rendering (CSR) or server-side rendering (SSR). There is no high-quality signal for CSR/SSR, so the heuristic used in this query is the ratio of the number of words in the document served in the static HTML to the number of words in the final rendered page. If the number of words increases by 1.5x after rendering, then we consider the page to use CSR.

Using both rendering buckets, we then calculate the median field-based p75 TTFB.

This metric is not expected to be very meaningful given the tenuous definition of CSR.
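
The heuristic can be sketched in JavaScript as follows. This is an illustration under stated assumptions, not the query's code: it reads "increases by 1.5x" as the rendered word count being at least 1.5 times the static HTML word count, and the function and field names (`initialWords`, `renderedWords`, `p75TtfbMs`) are hypothetical.

```javascript
// Classify a page as CSR when rendering grows the word count to at least
// `threshold` times the word count of the static HTML.
function renderingBucket(initialWords, renderedWords, threshold = 1.5) {
  // A page that renders no words at all is not treated as CSR.
  if (!(renderedWords > 0)) return 'SSR';
  return renderedWords >= threshold * initialWords ? 'CSR' : 'SSR';
}

// Median of a list of numbers, or null for an empty list.
function median(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Median field-based p75 TTFB (ms) for each rendering bucket.
function medianTtfbByRendering(pages) {
  const buckets = { CSR: [], SSR: [] };
  for (const page of pages) {
    const bucket = renderingBucket(page.initialWords, page.renderedWords);
    buckets[bucket].push(page.p75TtfbMs);
  }
  return { CSR: median(buckets.CSR), SSR: median(buckets.SSR) };
}
```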
84 changes: 84 additions & 0 deletions sql/2022/performance/correlation_preload_lcp.sql
@@ -0,0 +1,84 @@
CREATE TEMPORARY FUNCTION getResourceHints(payload STRING)
RETURNS STRUCT<preload INT64>
LANGUAGE js AS '''
var hints = ['preload'];
try {
var $ = JSON.parse(payload);
var almanac = JSON.parse($._almanac);
return hints.reduce((results, hint) => {
// Count the link nodes whose rel matches the hint.
// Pages with no matching link (or an unparsable payload) get 0 rather than NULL, so they stay in the aggregation.
results[hint] = almanac['link-nodes'].nodes.filter(link => link.rel.toLowerCase() == hint).length || 0;
return results;
}, {});
} catch (e) {
return hints.reduce((results, hint) => {
results[hint] = 0;
return results;
}, {});
}
''';

CREATE TEMPORARY FUNCTION getGoodCwv(payload STRING)
RETURNS STRUCT<cumulative_layout_shift BOOLEAN, first_contentful_paint BOOLEAN, first_input_delay BOOLEAN, largest_contentful_paint BOOLEAN>
LANGUAGE js AS '''
try {
var $ = JSON.parse(payload);
var crux = $._CrUX;

if (crux) {
return Object.keys(crux.metrics).reduce((acc, n) => ({
...acc,
[n]: crux.metrics[n].histogram[0].density >= 0.75
}), {})
}

return null;
} catch (e) {
return null;
}
''';

SELECT
device,

LEAST(hints.preload, 30) AS preload,

COUNT(0) AS freq,
SUM(COUNT(0)) OVER (PARTITION BY device) AS total,

COUNTIF(CrUX.largest_contentful_paint) AS lcp_good,
COUNTIF(CrUX.largest_contentful_paint IS NOT NULL) AS any_lcp,
COUNTIF(CrUX.largest_contentful_paint) / COUNTIF(CrUX.largest_contentful_paint IS NOT NULL) AS pct_lcp_good,

COUNTIF(CrUX.first_input_delay) AS fid_good,
COUNTIF(CrUX.first_input_delay IS NOT NULL) AS any_fid,
COUNTIF(CrUX.first_input_delay) / COUNTIF(CrUX.first_input_delay IS NOT NULL) AS pct_fid_good,

COUNTIF(CrUX.cumulative_layout_shift) AS cls_good,
COUNTIF(CrUX.cumulative_layout_shift IS NOT NULL) AS any_cls,
COUNTIF(CrUX.cumulative_layout_shift) / COUNTIF(CrUX.cumulative_layout_shift IS NOT NULL) AS pct_cls_good,

COUNTIF(CrUX.first_contentful_paint) AS fcp_good,
COUNTIF(CrUX.first_contentful_paint IS NOT NULL) AS any_fcp,
COUNTIF(CrUX.first_contentful_paint) / COUNTIF(CrUX.first_contentful_paint IS NOT NULL) AS pct_fcp_good,

COUNTIF(CrUX.largest_contentful_paint AND CrUX.first_input_delay IS NOT FALSE AND CrUX.cumulative_layout_shift) AS cwv_good,
COUNTIF(CrUX.largest_contentful_paint IS NOT NULL AND CrUX.cumulative_layout_shift IS NOT NULL) AS eligible_cwv,
COUNTIF(CrUX.largest_contentful_paint AND CrUX.first_input_delay IS NOT FALSE AND CrUX.cumulative_layout_shift) / COUNTIF(CrUX.largest_contentful_paint IS NOT NULL AND CrUX.cumulative_layout_shift IS NOT NULL) AS pct_cwv_good
FROM (
SELECT
_TABLE_SUFFIX AS device,
getResourceHints(payload) AS hints,
getGoodCwv(payload) AS CrUX
FROM
`httparchive.pages.2022_06_01_*`
)
WHERE
CrUX IS NOT NULL
GROUP BY
device,
preload
ORDER BY
device,
preload
25 changes: 25 additions & 0 deletions sql/2022/performance/gaming_metric.sql
@@ -0,0 +1,25 @@
SELECT
_TABLE_SUFFIX AS client,
COUNT(0) AS total,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.detectUA-ChromeLH') = 'true', 1, 0)) AS chrome_lighthouse,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.detectUA-ChromeLH') = 'true', 1, 0)) / COUNT(0) AS chrome_lighthouse_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.detectUA-GTmetrix') = 'true', 1, 0)) AS gtmetrix,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.detectUA-GTmetrix') = 'true', 1, 0)) / COUNT(0) AS gtmetrix_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.detectUA-PageSpeed') = 'true', 1, 0)) AS pagespeed,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.detectUA-PageSpeed') = 'true', 1, 0)) / COUNT(0) AS pagespeed_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.imgAnimationStrict') = 'true', 1, 0)) AS img_animation_strict,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.imgAnimationStrict') = 'true', 1, 0)) / COUNT(0) AS img_animation_strict_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.imgAnimationSoft') = 'true', 1, 0)) AS img_animation_soft,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.imgAnimationSoft') = 'true', 1, 0)) / COUNT(0) AS img_animation_soft_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.fidIframeOverlayStrict') = 'true', 1, 0)) AS fid_iframe_overlay_strict,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.fidIframeOverlayStrict') = 'true', 1, 0)) / COUNT(0) AS fid_iframe_overlay_strict_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.fidIframeOverlaySoft') = 'true', 1, 0)) AS fid_iframe_overlay_soft,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.fidIframeOverlaySoft') = 'true', 1, 0)) / COUNT(0) AS fid_iframe_overlay_soft_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.lcpOverlayStrict') = 'true', 1, 0)) AS lcp_overlay_strict,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.lcpOverlayStrict') = 'true', 1, 0)) / COUNT(0) AS lcp_overlay_strict_per,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.lcpOverlaySoft') = 'true', 1, 0)) AS lcp_overlay_soft,
SUM(IF(JSON_EXTRACT_SCALAR(payload, '$._performance.gaming_metrics.lcpOverlaySoft') = 'true', 1, 0)) / COUNT(0) AS lcp_overlay_soft_per
FROM
`httparchive.pages.2022_07_01_*`
GROUP BY
client
63 changes: 63 additions & 0 deletions sql/2022/performance/inp_long_tasks.sql
@@ -0,0 +1,63 @@
WITH long_tasks AS (
SELECT
_TABLE_SUFFIX AS client,
url AS page,
CAST(JSON_QUERY(item, '$.duration') AS FLOAT64) AS long_task_duration
FROM
`httparchive.lighthouse.2022_06_01_*`,
UNNEST(JSON_QUERY_ARRAY(report, '$.audits.long-tasks.details.items')) AS item
),

per_page AS (
SELECT
client,
page,
SUM(long_task_duration) AS long_tasks
FROM
long_tasks
GROUP BY
client,
page
),

crux_inp AS (
SELECT
_TABLE_SUFFIX AS client,
url AS page,
httparchive.core_web_vitals.GET_CRUX_INP(payload) AS inp
FROM
`httparchive.pages.2022_06_01_*`
),

combined AS (
SELECT
client,
long_tasks,
inp
FROM
per_page
JOIN
crux_inp
USING
(client, page)
),

meta AS (
SELECT
*,
COUNT(0) OVER (PARTITION BY client) AS n,
ROW_NUMBER() OVER (PARTITION BY client ORDER BY inp) AS row
FROM
combined
WHERE
inp IS NOT NULL
)

SELECT
client,
long_tasks,
inp
FROM
meta
WHERE
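# Keep roughly 1,000 evenly spaced rows per client; GREATEST avoids a zero divisor when n < 1000.
MOD(row, GREATEST(1, CAST(FLOOR(n / 1000) AS INT64))) = 0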
133 changes: 133 additions & 0 deletions sql/2022/performance/lcp_element_data.sql
@@ -0,0 +1,133 @@
#standardSQL
# LCP element node details

CREATE TEMP FUNCTION getLoadingAttr(attributes STRING) RETURNS STRING LANGUAGE js AS '''
try {
const data = JSON.parse(attributes);
const loadingAttr = data.find(attr => attr["name"] === "loading")
return loadingAttr.value
} catch (e) {
return "";
}
''';

CREATE TEMP FUNCTION getDecodingAttr(attributes STRING) RETURNS STRING LANGUAGE js AS '''
try {
const data = JSON.parse(attributes);
const decodingAttr = data.find(attr => attr["name"] === "decoding")
return decodingAttr.value
} catch (e) {
return "";
}
''';

CREATE TEMP FUNCTION getLoadingClasses(attributes STRING) RETURNS STRING LANGUAGE js AS '''
try {
const data = JSON.parse(attributes);
const classes = data.find(attr => attr["name"] === "class").value
if (classes.indexOf('lazyload') !== -1) {
return classes
} else {
return ""
}
} catch (e) {
return "";
}
''';

CREATE TEMPORARY FUNCTION getResourceHints(payload STRING)
RETURNS STRUCT<preload BOOLEAN, prefetch BOOLEAN, preconnect BOOLEAN, prerender BOOLEAN, `dns-prefetch` BOOLEAN, `modulepreload` BOOLEAN>
LANGUAGE js AS '''
var hints = ['preload', 'prefetch', 'preconnect', 'prerender', 'dns-prefetch', 'modulepreload'];
try {
var $ = JSON.parse(payload);
var almanac = JSON.parse($._almanac);
return hints.reduce((results, hint) => {
results[hint] = !!almanac['link-nodes'].nodes.find(link => link.rel.toLowerCase() == hint);
return results;
}, {});
} catch (e) {
return hints.reduce((results, hint) => {
results[hint] = false;
return results;
}, {});
}
''';

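# NOTE: getFetchPriority is not referenced by the query below. As written, it
# compares the fetchpriority values (high, low, auto) against each link's rel.
CREATE TEMPORARY FUNCTION getFetchPriority(payload STRING)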
RETURNS STRUCT<high BOOLEAN, low BOOLEAN, auto BOOLEAN>
LANGUAGE js AS '''
var hints = ['high', 'low', 'auto'];
try {
var $ = JSON.parse(payload);
var almanac = JSON.parse($._almanac);
return hints.reduce((results, hint) => {
results[hint] = !!almanac['link-nodes'].nodes.find(link => link.rel.toLowerCase() == hint);
return results;
}, {});
} catch (e) {
return hints.reduce((results, hint) => {
results[hint] = false;
return results;
}, {});
}
''';

WITH
lcp_stats AS (
SELECT
_TABLE_SUFFIX AS client,
url,
JSON_EXTRACT_SCALAR(payload, '$._performance.lcp_elem_stats.nodeName') AS nodeName,
JSON_EXTRACT_SCALAR(payload, '$._performance.lcp_elem_stats.url') AS elementUrl,
CAST(JSON_EXTRACT_SCALAR(payload, '$._performance.lcp_elem_stats.size') AS INT64) AS size,
CAST(JSON_EXTRACT_SCALAR(payload, '$._performance.lcp_elem_stats.loadTime') AS FLOAT64) AS loadTime,
CAST(JSON_EXTRACT_SCALAR(payload, '$._performance.lcp_elem_stats.startTime') AS FLOAT64) AS startTime,
CAST(JSON_EXTRACT_SCALAR(payload, '$._performance.lcp_elem_stats.renderTime') AS FLOAT64) AS renderTime,
JSON_EXTRACT(payload, '$._performance.lcp_elem_stats.attributes') AS attributes,
getLoadingAttr(JSON_EXTRACT(payload, '$._performance.lcp_elem_stats.attributes')) AS loading,
getDecodingAttr(JSON_EXTRACT(payload, '$._performance.lcp_elem_stats.attributes')) AS decoding,
getLoadingClasses(JSON_EXTRACT(payload, '$._performance.lcp_elem_stats.attributes')) AS classWithLazyload,
getResourceHints(payload) AS hints
FROM
`httparchive.pages.2022_06_01_*`
)

SELECT
client,
nodeName,
COUNT(DISTINCT url) AS pages,
ANY_VALUE(total) AS total,
COUNT(DISTINCT url) / ANY_VALUE(total) AS pct,
COUNTIF(elementUrl != '') AS haveImages,
COUNTIF(elementUrl != '') / COUNT(DISTINCT url) AS pct_haveImages,
COUNTIF(loading = 'eager') AS native_eagerload,
COUNTIF(loading = 'lazy') AS native_lazyload,
COUNTIF(classWithLazyload != '') AS lazyload_class,
COUNTIF(classWithLazyload != '' OR loading = 'lazy') AS probably_lazyLoaded,
COUNTIF(classWithLazyload != '' OR loading = 'lazy') / COUNT(DISTINCT url) AS pct_prob_lazyloaded,
COUNTIF(decoding = 'async') AS async_decoding,
COUNTIF(decoding = 'sync') AS sync_decoding,
COUNTIF(decoding = 'auto') AS auto_decoding,
COUNT(0) AS freq,
COUNTIF(hints.preload) AS preload,
COUNTIF(hints.preload) / COUNT(0) AS pct_preload
FROM
lcp_stats
JOIN (
SELECT
_TABLE_SUFFIX AS client,
COUNT(0) AS total
FROM
`httparchive.summary_pages.2022_06_01_*`
GROUP BY
_TABLE_SUFFIX)
USING
(client)
GROUP BY
client,
nodeName
HAVING
pages > 1000
ORDER BY
pct DESC