core(network-analyzer): use arithmetic mean for median #15096

connorjclark · 2023-05-18T19:20:29Z

AFAIK, median is always defined to use the arithmetic mean of the two middle values if the number of samples is even. We weren't doing that. The implications for Lantern accuracy, at least according to our current test database, are minor but positive.

This value is only used to estimate the server response time in computeRTTAndServerResponseTime. The previous behavior biased the estimate slightly towards a faster server response time. For example, in the common case of having two estimates, if TCP was any faster than SSL then we selected the TCP duration and ignored the SSL duration for purposes of response time.
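For illustration, a minimal sketch of a median that averages the two middle values when the sample count is even. The name `medianOfSamples` is hypothetical and this is not the code in network-analyzer.js, only the definition described above.

```javascript
// Hypothetical helper (not Lighthouse's implementation).
// Returns undefined for an empty input.
function medianOfSamples(values) {
  if (values.length === 0) return undefined;

  // Sort a copy numerically so the caller's array is not mutated.
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);

  // Odd count: the single middle value.
  if (sorted.length % 2 === 1) return sorted[mid];

  // Even count: arithmetic mean of the two middle values.
  return (sorted[mid - 1] + sorted[mid]) / 2;
}

console.log(medianOfSamples([30, 10, 20])); // 20
console.log(medianOfSamples([10, 20])); // 15
```

With two estimates, a "pick the lower middle value" rule would return 10 here, which is the bias towards the faster duration mentioned above.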

adamraine · 2023-05-18T19:38:57Z

core/lib/dependency-graph/simulator/network-analyzer.js

+    if (values.length <= 1) {
+      median = values[0];


nit: the 1 case is handled fine by the odd case, and length 0 is guaranteed to be undefined. This makes it more explicit. Reaching into values[0] when it's empty just to get undefined is kinda confusing.

Suggested change

-    if (values.length <= 1) {
-      median = values[0];
+    if (values.length === 0) {
+      median = undefined;

This made TS unhappy, so I threw an error for values.length === 0 instead. That should never happen with real data, but plenty of unit tests fail with it, so I reverted.
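A rough sketch of why the two guards discussed in this thread behave the same at runtime, and why some guard is needed at all. All function names here are hypothetical, and the input is assumed to be already sorted.

```javascript
// Hypothetical general path: odd count takes the middle value,
// even count averages the two middle values.
function middleOfSorted(sorted) {
  const half = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[half]
    : (sorted[half - 1] + sorted[half]) / 2;
}

// Guard as in the diff: relies on [][0] evaluating to undefined.
function medianImplicitGuard(sorted) {
  if (sorted.length <= 1) return sorted[0];
  return middleOfSorted(sorted);
}

// Guard as in the suggestion: the empty case is spelled out,
// and a single value falls through to the odd case.
function medianExplicitGuard(sorted) {
  if (sorted.length === 0) return undefined;
  return middleOfSorted(sorted);
}

// Without any guard, an empty array reaches the even branch and
// computes (undefined + undefined) / 2, which is NaN.
console.log(Number.isNaN(middleOfSorted([]))); // true
```

Both guarded variants return undefined for an empty array and the lone value for a single-element array, so the difference is readability (and, per the comment above, how TS types the result).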

Co-authored-by: Adam Raine <6752989+adamraine@users.noreply.github.com>

connorjclark · 2023-05-18T20:20:46Z

core/test/fixtures/fraggle-rock/reports/sample-flow-result.json

@@ -8766,7 +8766,7 @@
                },
                {
                  "origin": "https://mnl4bjjsnz-dsn.algolia.net",
-                  "serverResponseTime": 0
+                  "serverResponseTime": 263.2025
                }


hmmm seems like a lot..

This came from these estimates:

In _estimateResponseTimeByOrigin's Math.max(ttfb - rtt, 0), ttfb - rtt is somehow negative, so we get zero as an estimate.

The zero estimate came from this record: https://mnl4bjjsnz-dsn.algolia.net/1/indexes/dev_OFFICE_SCENES/query, ttfb 49 (slightly less than the rtt estimate of 49.56)

The second, higher estimate came from: https://mnl4bjjsnz-dsn.algolia.net/1/indexes/dev_OFFICE_SCENES/query, ttfb 575

it's reasonable for query time to be so variable for such a website, so I think taking the average-ish value here (via the median changes in this PR) is good.
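To make the arithmetic concrete, a small sketch using the rounded figures quoted above. The variable names are made up for illustration, and the "lower middle" selection is only an assumption about the old behavior, chosen because it matches the old fixture value of 0.

```javascript
// Rounded figures from the thread above.
const rttEstimate = 49.56;
const ttfbSamples = [49, 575];

// Same clamping as the Math.max(ttfb - rtt, 0) expression quoted above.
const estimates = ttfbSamples.map(ttfb => Math.max(ttfb - rttEstimate, 0));
const sortedEstimates = [...estimates].sort((a, b) => a - b);

// Assumed old behavior: pick the lower of the two middle values.
const lowerMiddle = sortedEstimates[Math.floor((sortedEstimates.length - 1) / 2)];

// New behavior: arithmetic mean of the two middle values.
const meanOfMiddle = (sortedEstimates[0] + sortedEstimates[1]) / 2;

console.log(lowerMiddle); // 0
console.log(meanOfMiddle); // roughly 262.72
```

The result is close to, but not exactly, the 263.2025 in the fixture, presumably because the ttfb values quoted in the thread are rounded.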

adamraine

Seems good

brendankenny · 2023-05-18T21:42:16Z

> median is always defined to use the arithmetic mean of the two middle values if the number of samples is even

Median has a few definitions, depending on how you're using it. If you need a value that's in the dataset, for instance, you have to pick one of the two middle values, you can't take their mean. Either middle value is fine to pick in that case, again depending on your goals.

I think the fundamental issue is trying to take the median of one or two or three numbers, at which point the median isn't robust in any meaningful way. This feels a bit like it should have been using the arithmetic mean in the first place (there could maybe be issues with like one outlier request, and this could function as a pseudo trimmed mean, I guess).
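A quick sketch of that point, with hypothetical helper names and made-up sample values: for two samples the even-count median is the arithmetic mean, and only from three samples on does the median start to discount an outlier.

```javascript
// Hypothetical helpers for comparison only.
function arithmeticMean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

function medianWithEvenMean(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Two samples: the two statistics are identical.
console.log(medianWithEvenMean([0, 500]), arithmeticMean([0, 500])); // 250 250

// Three samples with one outlier: the median ignores it, the mean does not.
console.log(medianWithEvenMean([40, 50, 900]), arithmeticMean([40, 50, 900])); // 50 330
```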

core(network-analyzer): use arithmetic mean for median of even samples

d8af3c7

connorjclark requested a review from a team as a code owner May 18, 2023 19:20

connorjclark requested review from brendankenny and removed request for a team May 18, 2023 19:20

adamraine reviewed May 18, 2023

Update core/lib/dependency-graph/simulator/network-analyzer.js

fb08222

Co-authored-by: Adam Raine <6752989+adamraine@users.noreply.github.com>

vercel bot deployed to Preview May 18, 2023 19:46 View deployment

comments

a5eca34

vercel bot deployed to Preview May 18, 2023 20:07 View deployment

cool

77f273e

connorjclark commented May 18, 2023

vercel bot deployed to Preview May 18, 2023 20:21 View deployment

connorjclark added 2 commits May 18, 2023 13:40

update test snaps

1e537ed

cantdothat

2df98d4

vercel bot deployed to Preview May 18, 2023 20:43 View deployment

jest snapshot update format sux

0f274c6

vercel bot deployed to Preview May 18, 2023 20:51 View deployment

adamraine approved these changes May 18, 2023

Merge remote-tracking branch 'origin/main' into nit-median-avg

e6f6c90

vercel bot deployed to Preview May 18, 2023 21:34 View deployment

connorjclark changed the title ~~core(network-analyzer): use arithmetic mean for median of even samples~~ core(network-analyzer): use arithmetic mean for median May 20, 2023

connorjclark merged commit 9922e4c into main May 20, 2023

connorjclark deleted the nit-median-avg branch May 20, 2023 00:10
