core(lantern): always use flexible network ordering #14612
Conversation
Nice work!
I take it that this statement is limited to metrics, without saying anything about other audits, correct? I'm curious how you assess the changes in audits that went from inflexible to flexible. Since inflexible was the default value, from just looking at the PR diff we can only see the no-op changes.
Yeah, it's limited to the metrics. In theory, if you take LCP of the original graph and a modified graph and the opportunity is the difference, then any improvements in LCP accuracy should improve that estimate. But that's not necessarily true in practice (the dbw failure in this run looks like a good example; I need to compare the before/after to see if we were relying on coincidence before or if there's a real regression).

We don't have any automated method for evaluating opportunity accuracy, but we should consider how we could build one in 2023. We should also update the lantern test data. Somehow it's already been three years since you updated it last.
Ooooh boy that will be a fun exercise! 3 years of software and product churn, and the cloud, what could go wrong?
We can add this for 10.0, just need to update the branch. Sorry it didn't get merged earlier. |
huh we still want this right?
Yeah, it needs some TLC though (conflicts). I wanted to have the lantern traces updated before merging, but an analysis of this PR's effects could be done post hoc, so no need to wait on that.
```diff
@@ -82,7 +82,7 @@ describe('Render blocking resources audit', () => {
   const settings = {throttlingMethod: 'simulate', throttling: mobileSlow4G};
   const computedCache = new Map();
   const result = await RenderBlockingResourcesAudit.audit(artifacts, {settings, computedCache});
-  expect(result.numericValue).toEqual(316);
+  expect(result.numericValue).toEqual(0);
```
TODO: verify this or change test.
`const {nodeTimings} = simulator.simulate(fcpGraph);` in `estimateSavingsWithGraphs` in `render-blocking-resources.js` gives 2884 for `script.js`, whereas before it was 2100. AFAICT no other numbers in this method varied. This is enough to make the "before inline" FCP estimate always be higher than the "after inline" estimate for this trace.

The test does otherwise fail if we comment out the amp-specific handling in `render-blocking-resources.js`, and I think the change in the FCP estimate is to be expected from this PR, so updating to 0 here seems like the right approach.
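For context on the numbers above, the savings computation being discussed can be sketched as follows. This is a minimal, hypothetical stand-in for `estimateSavingsWithGraphs`, not the actual Lighthouse implementation; `estimateSavings` and the toy simulator are invented for illustration:

```javascript
// Hypothetical sketch: simulate FCP on the observed graph and on a graph with
// render-blocking resources inlined, then report the difference as the
// opportunity, floored at zero so a slower "after" estimate yields no savings.
function estimateSavings(simulate, originalGraph, inlinedGraph) {
  const before = simulate(originalGraph).timeInMs;
  const after = simulate(inlinedGraph).timeInMs;
  return Math.max(before - after, 0);
}

// Toy stand-in for the lantern simulator: FCP is just the sum of node durations.
const simulate = graph =>
  ({timeInMs: graph.reduce((sum, node) => sum + node.duration, 0)});

estimateSavings(simulate, [{duration: 2884}], [{duration: 2884}]); // → 0
```

This is why a change in a single node timing (2100 → 2884) can flip the reported savings to 0: the estimate only ever reflects the difference between the two simulated graphs.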
Last remaining item is the smoke failure that only occurs in the bundled test (odd):
Remove `flexibleOrdering` from the simulator and always start requests as soon as they can be started and a connection is available. Part of #14166.

Simplifies simulator code and improves lantern accuracy (with a small change to the TTI and SI coefficients).
(initial image from original PR - see above instead)
The lantern simulator has two modes for connection reuse, controlled by the boolean `flexibleOrdering`:

- With `flexibleOrdering: false`, the default value, any network request that was made on a warm connection (`connectionReused: true`) can only be started once that connection is warm, and the `NetworkNode` is kept in the queue until whatever request was first on that connection in the unthrottled real load has completed in the simulator.
- With `flexibleOrdering: true`, `connectionReused` is not considered when deciding when network requests should start.

Originally the observed connection-reuse ordering was required; `flexibleOrdering` came later, when we explicitly wanted to make requests earlier than they had actually occurred, as if they had been preloaded.

However, the default `flexibleOrdering: false` is a little weird today. It's essentially an additional dependency constraint on the graph, though one not represented through the main dependency mechanism. In an ideal world, if a dependency relationship between two requests is needed for accuracy, we should make it explicit. We're also simulating the whole page load on a machine of a different speed; there's no reason to idle a connection waiting for the request that was observed to be first on it when other requests could be made sooner (that works against the whole point of simulating). That's the ideal world again, though, and we may be missing key simulator features that the inflexible ordering is making up for.

When testing to see what additional work it would take to remove inflexible ordering, I was prepared for accuracy to drop and for additional dependency-graph relationships to be needed to counteract that. However, the lantern tests actually showed small improvements in accuracy in multiple audits (highest in p90 SI), with the exception of a small decrease in accuracy in p50 TTI and p95 SI.
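The difference between the two modes can be sketched as a single queue check. This is a hedged approximation, not the actual simulator code; `canStartRequest` and its arguments are invented for illustration:

```javascript
// Sketch of the simulator's decision for whether a queued network request may
// start. With inflexible ordering (the old default), a request observed on a
// warm connection stays queued until the simulator has warmed that connection;
// with flexible ordering, any available connection is enough.
function canStartRequest(request, connection, flexibleOrdering) {
  if (!connection.available) return false;
  if (flexibleOrdering) return true;
  // Inflexible: honor the connection-reuse ordering observed in the real load.
  return !request.connectionReused || connection.warmed;
}

const conn = {available: true, warmed: false};
canStartRequest({connectionReused: true}, conn, false); // → false: waits for the warm connection
canStartRequest({connectionReused: true}, conn, true);  // → true: reuse ordering is ignored
```

Removing the inflexible branch leaves only the "start when a connection is available" path, which is what this PR does.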
The lantern coefficients for SI and TTI are super old, so I re-computed them from the full 9-run lantern dataset, then scaled back the changes considerably because I'm worried about overfitting to the test data. Looking at the goals laid out in #5120, TTI stays at `optimistic + pessimistic === 1` while favoring the pessimistic estimate slightly more now, and SI moves very slightly toward a total of `1` but, more importantly, can finally use an intercept of `0` (the last of the metrics to do so).

Combined, these two changes make major improvements to SI accuracy. Most of that is due to the coefficient/intercept change, but removing `flexibleOrdering` means we can simplify the simulator code and drop an artificial constraint while maintaining or improving simulator accuracy.
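As a reminder of how those coefficients enter the estimate, each lantern metric blends an optimistic and a pessimistic simulation roughly like this. A sketch only, with illustrative coefficient values rather than the ones shipped:

```javascript
// Each lantern metric runs two simulations and combines them linearly.
// An intercept of 0 with optimistic + pessimistic === 1 (the #5120 goals)
// makes the result a weighted average that always lies between the two
// simulated values.
function lanternEstimate({intercept, optimistic, pessimistic}, optimisticMs, pessimisticMs) {
  return intercept + optimistic * optimisticMs + pessimistic * pessimisticMs;
}

// Illustrative weights only: slightly favor the pessimistic simulation.
lanternEstimate({intercept: 0, optimistic: 0.25, pessimistic: 0.75}, 3000, 5000); // → 4500
```

With a nonzero intercept the estimate can sit outside the simulated range, which is part of why getting SI to an intercept of `0` matters.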