feat(predictive-perf): add network estimation #3187
Conversation
more to come, but flushing early... 🚽 😺
sum += values[key];
});

const rawValue = sum / 4;
magic number. Object.keys(graphs).length
?
done
const DEFAULT_MAXIMUM_CONCURRENT_REQUESTS = 10;
const DEFAULT_RESPONSE_TIME = 30;
const DEFAULT_RTT = 150;
const DEFAULT_THROUGHPUT = 1600 * 1024; // 1.6 Mbps
these two numbers are identical to our throttling presets. should we pull from emulation.js instead?
done
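A sketch of what pulling from the emulation module could look like (the require path and export names here are hypothetical; the actual emulation.js API may be shaped differently):

// Hypothetical sketch only; the real export names in emulation.js are assumed.
const {TYPICAL_MOBILE_THROTTLING_METRICS} = require('../../lib/emulation');
const DEFAULT_RTT = TYPICAL_MOBILE_THROTTLING_METRICS.latency; // 150ms, matching the preset
const DEFAULT_THROUGHPUT = TYPICAL_MOBILE_THROTTLING_METRICS.targetDownloadThroughput; // 1.6 Mbps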
{
  rtt: DEFAULT_RTT,
  throughput: DEFAULT_THROUGHPUT,
  defaultResponseTime: DEFAULT_RESPONSE_TIME,
post-discussion: naming-wise this value is very unclear. defaultServerResponseTime is still ambiguous whether it includes latency or not. (and 1-way or 2-way?)
defaultTTFB
is very clear as a definition, so I'd prefer using that as the argument. That means some adding/subtracting of latency here and there, but i think it's worth it so the terms are clear.
renamed to fallbackTTFB
and added latency subtraction 👍
ok hang on, reopening the discussion :)
i'm gonna push back against this latency subtraction from TTFB deal a bit; I cherrypicked the change to my final fixes branch and it regressed accuracy by 0.3%
given we're significantly under-reporting rtt in most cases, i'm not sure we should be further reducing the download time estimates. how about, as future exploration, we try to do something fancy to identify/separate connection latency from server response time instead (I can imagine trying to determine the built-in connection latency by looking at all the requests, or comparing the TCP handshake time to the TTFB time, etc., but it's all more researchy) and leave it be for now?
proposed comment:
// We'll approximate how much time the server for a connection took to respond after receiving
// the request by computing the minimum TTFB time for requests on that connection.
// TTFB = one way latency + server response time + one way latency
// Even though TTFB is greater than server response time, the RTT is underaccounted for
// by not varying per-server and so the difference roughly evens out.
// TODO(patrickhulce): investigate a way to identify per-server RTT
as discussed offline, i think it'd be great to explore identifying per-server RTT later. this comment sg. let's add it. :)
update: saw the comment is added already. 👍
 * @param {!WebInspector.NetworkRequest} record
 * @return {number}
 */
static getResponseTime(record) {
yes, this value is the definition of TTFB, so let's s/responseTime/ttfb/ :)
const DEFAULT_RTT = 150;
const DEFAULT_THROUGHPUT = 1600 * 1024; // 1.6 Mbps

function groupBy(items, keyFunc) {
I wish we could install this to Object.groupBy
so it's usable everywhere. the named props and array-of-obj dances get old. oh well.
heh yeah...I'm still all for a util lib :)
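For readers following along, a minimal groupBy in this shape might look like the following (a sketch consistent with the return grouped; line visible further down, not the verbatim source):

// Groups items into a Map keyed by keyFunc(item).
function groupBy(items, keyFunc) {
  const grouped = new Map();
  for (const item of items) {
    const key = keyFunc(item);
    const group = grouped.get(key) || [];
    group.push(item);
    grouped.set(key, group);
  }
  return grouped;
}

// Usage matching the connection loop below:
// const recordsByConnection = groupBy(records, record => record.connectionId);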
);

for (const [connectionId, records] of recordsByConnection.entries()) {
  const isSsl = records[0].parsedURL.scheme === 'https';
wss? if that should be included we have SECURE_SCHEMES elsewhere :)
confirm that we don't care about ws in this simulation.
how about
const isTLS = records[0].parsedURL.scheme === 'https'; // wss deliberately excluded
done
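For contrast, the SECURE_SCHEMES variant would have looked something like this (the constant's contents are assumed from the schemes discussed above):

// Not adopted: would also treat wss as TLS, which this simulation deliberately excludes.
const SECURE_SCHEMES = ['https', 'wss'];
const isTLS = SECURE_SCHEMES.includes(records[0].parsedURL.scheme);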
flushing..
Also can I just say... estimator/tcp-connection.js
is one of my favorite files I've seen in a while. Love the research.
return graph.cloneWithRelationships(node => {
  if (node.endTime > fmp) return false;
  if (node.type !== Node.TYPES.NETWORK) return true;
  return node.record.priority() === 'VeryHigh'; // proxy for render-blocking
FWIW this will exclude external scripts (and a few others)
https://docs.google.com/document/d/1bCDuq9H1ih9iNjgzyAL0gpwNFiEP4TZS-YLRp_RuMlc/edit
Maybe it's time we write the isResourceRenderBlocking(resourceType, priority)
method finally?
yup, this is implemented in the next PR :) FWIW in the future PR the remaining optimistic/pessimistic split also excludes render-blocking priorities if they were script-initiated
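A rough sketch of what that helper might look like, using the signature suggested above (the specific rules here are illustrative assumptions, not the follow-up PR's actual logic):

// Hypothetical: stylesheets always block render; scripts block when the browser
// assigned them a high fetch priority (a stand-in for parser-blocking scripts).
function isResourceRenderBlocking(resourceType, priority) {
  if (resourceType === 'stylesheet') return true;
  if (resourceType === 'script') return priority === 'High' || priority === 'VeryHigh';
  return false;
}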
 */
static getOptimisticTTCIGraph(graph) {
  return graph.cloneWithRelationships(node => {
    return node.record._resourceType && node.record._resourceType._name === 'script' ||
with modern devtools this is node.record.resourceType() === Common.resourceTypes.Script
But with ours it'll need some WebInspector.*
... :)
heh, do we want to go with the public accessors here? (this specific condition is also removed in future work so I'll add this there if we want to switch)
this is the same as node.record._resourceType === Common.resourceTypes.Script
(which is fine). The benefit is just that you don't have to reach all the way to _name
ah ok, will re-open in the final version where i've used node.resourceType
then 👍
static getOptimisticTTCIGraph(graph) {
  return graph.cloneWithRelationships(node => {
    return node.record._resourceType && node.record._resourceType._name === 'script' ||
      node.record.priority() === 'High' ||
why does this one use both priorities but optimisticFMP only uses VeryHigh? Would love a comment to explain some of these or link into the relevant part of the doc.
the future version has comments explaining each condition, once more thought started going into them :) but will flesh out more
 */
static getOptimisticTTCIGraph(graph) {
  return graph.cloneWithRelationships(node => {
    return node.record._resourceType && node.record._resourceType._name === 'script' ||
style-wise the if (..) return...
approach used in getOptimisticFMPGraph's predicate is clearer and avoids questions around operator precedence.
this particular condition is removed in the next PR but will update latest 👍
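Applied to the snippet above, the early-return style would read roughly as follows (the conditions past the visible || are cut off in the diff, so this sketch is partial):

return graph.cloneWithRelationships(node => {
  const resourceType = node.record._resourceType;
  if (resourceType && resourceType._name === 'script') return true;
  if (node.record.priority() === 'High') return true;
  // ...remaining conditions from the original expression
  return false;
});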
});

this._networkRecords = records;
return records;
does the return value matter? if it doesn't then might as well ditch records
and push right into this._networkRecords
?
nope it doesn't, done.
  }
} else {
  auxiliaryData.timeElapsed += calculation.timeElapsed;
  auxiliaryData.timeElapsedOvershoot +=
yeah let's rename, especially so this var's length doesn't force wrapping. :)
_updateNetworkCapacity() {
  for (const connection of this._connectionsInUse) {
    connection.setThroughput(this._throughput / this._nodesInProcess.size);
trying this out..
1000KB/sec throughput
10 connections
20 requests in flight
Each connection then gets set to 50 kb/s of availableThroughput. K sgtm. :)
however, thinking forward to CPU nodes in the graph, they shouldn't be included in this .size
should they?
yeah they won't be, I keep track of the number in process by type separately 👍
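The walkthrough above as a worked example (names are illustrative, units as stated):

// 1000 KB/s of total throughput is split across the 20 in-flight requests,
// not the 10 underlying connections:
const throughputKBps = 1000;
const requestsInFlight = 20;
const perConnectionKBps = throughputKBps / requestsInFlight; // 50 KB/s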
let handshakeAndRequest = oneWayLatency;
if (!this._warmed) {
  handshakeAndRequest =
as part of the earlier TTFB comments, can this be just handShake
? (and move one of the OWLs into the TTFB) 🦉
I'm not a huge fan of splitting it since the handshake is in fact concluding in the same trip as the request is being made; adding a comment with // technically the ACK is still happening blah blah
seems a tad more to process than a variable name indicating the two are conjoined IMO. It also slightly complicates the later http2 case where neither are required once the connection has been warmed.
Do you think addressing the other TTFB issues makes this clearer or you'd still like to see the split?
agreed to keep handshakeAndRequest
as a unit as it's more accurate to the reality of TCP. 👍
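Stitching the snippets in this thread together, the cold-connection path reads roughly like this (the _ssl flag name is an assumption; the latency terms come from the quoted diff):

let handshakeAndRequest = oneWayLatency;
if (!this._warmed) {
  handshakeAndRequest =
    // SYN
    oneWayLatency +
    // SYN ACK
    oneWayLatency +
    // ACK + initial request
    oneWayLatency +
    // ClientHello/ServerHello, assuming TLS False Start is enabled
    // (https://istlsfastyet.com/#server-performance).
    (this._ssl ? twoWayLatency : 0);
}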
oneWayLatency +
// ACK + Application Data
oneWayLatency +
// ClientHello/ServerHello with TLS False Start
how about this for a comment
// We assume TLS False Start is enabled. https://istlsfastyet.com/#server-performance
sg
 * @param {number=} maximumTimeToElapse
 * @return {{timeElapsed: number, roundTrips: number, bytesDownloaded: number, congestionWindow: number}}
 */
calculateTimeToDownload(bytesToDownload, timeAlreadyElapsed = 0, maximumTimeToElapse = Infinity) {
naming threw me... simulateDownloadingUntil
is way clearer.
kinda weird that args 1 and 3 are the constraints but not next to each other. should we just move to an object instead? Seems like we usually would at this point. strawman:
simulateDownloadingUntil({maxBytes, maxMs, msElapsed})
done, how about switching to an object param in the "incremental PR" after I can rebase against the name change?
how about switching to an object param in the "incremental PR" after I can rebase against the name change?
sure that's fine.
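The strawman above as a signature sketch (the object param was deferred to the incremental PR; the defaults here are assumed):

/**
 * @param {{maxBytes: number, maxMs: number, msElapsed: number}} options
 * @return {{timeElapsed: number, roundTrips: number, bytesDownloaded: number, congestionWindow: number}}
 */
simulateDownloadingUntil({maxBytes, maxMs = Infinity, msElapsed = 0}) {
  // same body as before, with the two constraints now grouped together
}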
ok, Finished looking at this PR. great stuff. :)
_updateNetworkCapacity() {
  for (const connection of this._connectionsInUse) {
    connection.setThroughput(this._throughput / this._nodesInProcess.size);
setThroughput vs availableThroughput makes me think there may be multiple... like currentThroughput, defaultThroughput, etc... since there's only one, should we just use that same name?
yep done
Force-pushed from 3d35420 to cf44696
Force-pushed from cf44696 to a632442
oneWayLatency +
// ACK + Application Data
oneWayLatency +
// ClientHello/ServerHello assuming TLS Flase Start is enabled (https://istlsfastyet.com/#server-performance).
False
}

let roundTrips = Math.ceil(handshakeAndRequest / twoWayLatency);
const timeToFirstByte = handshakeAndRequest + this._responseTime + oneWayLatency;
_responseTime => _serverLatency
everywhere? :)
done
oneWayLatency +
// SYN ACK
oneWayLatency +
// ACK + Application Data
ACK + initial request
done
I think everything's addressed here, so back to you @paulirish. build failures seem to be
sorry, I'd still like to take a look :D:D
initial comment flush, though this is looking good :)
@@ -31,14 +32,77 @@ class PredictivePerf extends Audit {
}

/**
 * @param {!Node} graph
maybe dependencyGraph
to give a local hint?
done
sum += values[key];
});

const rawValue = sum / Object.keys(values).length;
nit: meanDuration
for clarity?
done
maxDepth--;
}

if (!maxDepth) {
if this happens it seems like a more serious bug would be present?
It means something is wrong with the graph that's been created, see #3187 (comment). I think it's still more useful than having a fatal infinite loop on bad input.
yeah, which I guess is good :) (especially during ongoing development) but it does feel like a check for a properly formed graph at construction time is the right solution (plus we could probably give better error messages for what went wrong with graph construction)
@@ -121,6 +147,11 @@ class Node {
  if (!shouldIncludeNode(originalNode)) return;
  const clonedNode = originalNode.cloneWithoutRelationships();
  idToNodeMap.set(clonedNode.id, clonedNode);
});

rootNode.traverse(originalNode => {
why is splitting it like this necessary?
it's not guaranteed that a node will be visited by traverse
before all of its dependencies; it's just doing a traversal of the edges, not necessarily a topological-sort-ordered traversal
 *
 * @param {!Node} rootNode
 */
static printGraph(rootNode, widthInCharacters = 100) {
not used? just for debugging?
correct
return grouped;
}

class Estimator {
needs some jsdoc description strings :)
done
// Even though TTFB is greater than server response time, the RTT is underaccounted for by
// not varying per-server and so the difference roughly evens out.
// TODO(patrickhulce): investigate a way to identify per-server RTT
let estimatedResponseTime = records.reduce(
Math.min(...records.map(Estimator.getTTFB))
?
done
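For contrast, the two forms (the reduce body is reconstructed from context, not copied from the source):

// Before (assumed shape): fold over the records to find the minimum TTFB.
let estimatedResponseTime = records.reduce(
  (min, record) => Math.min(min, Estimator.getTTFB(record)),
  Infinity
);

// After: the same result in one expression (getTTFB is static, so passing the
// bare function reference is safe).
estimatedResponseTime = Math.min(...records.map(Estimator.getTTFB));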
);
}

if (depth > 10000) {
also seems like a code smell/from an earlier development stage?
While it was also very useful for early debugging with a small value, this is just a sanity check to avoid looping forever and instead fail in the event of illogical network record or trace event relationships (or a bug in graph construction, though at this point that's synonymous with illogical event relationships given the interpretations we're taking).
this is just a sanity check to avoid looping forever and instead fail in the event of illogical network record or trace event relationships
same as above, maybe we can move to an assertValidDepGraph
method in the future
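A sketch of what that construction-time validation could look like (assertValidDepGraph is only proposed above; the getDependents() accessor is assumed):

// Hypothetical: depth-first walk that throws on a cycle, so the simulation
// never needs to rely on a magic depth limit.
static assertValidDepGraph(rootNode) {
  const finished = new Set();
  const inStack = new Set();
  function visit(node) {
    if (inStack.has(node)) throw new Error('Dependency graph contains a cycle');
    if (finished.has(node)) return;
    inStack.add(node);
    for (const dependent of node.getDependents()) visit(dependent);
    inStack.delete(node);
    finished.add(node);
  }
  visit(rootNode);
}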
_initializeAuxiliaryData() {
  this._nodeTiming = new Map();
  this._nodesCompleted = new Set();
  this._nodesInProcess = new Set();
in the same process? if not, may be a bit confusing naming-wise. nodesInProgress
?
hahaha, good point, done
_enqueueNodeIfPossible(node) {
  const dependencies = node.getDependencies();
  if (
    !this._nodesCompleted.has(node) &&
if all dependencies must have been completed to be enqueued, shouldn't it be impossible for a node to already be completed here (since it's being added as a dependent of another node)?
This is cleaned up a little bit in #3162 but the summary here is that it was the only check against duplicate processing of a node. By definition, if a node has been completed, all of its dependencies have been completed, so while we're enqueuing, don't start on a node we've already finished (it really should be a node we've already started, which is fixed later, but it didn't have an impact when only network records were considered).
Once again, this logic is re-arranged in the next PR to be clearer.
By definition, if a node has been completed, all of its dependencies have been completed, so while we're enqueing don't start on a node we've already finished
right, I was trying to say that always checking dependencies.every(dependency => this._nodesCompleted.has(dependency))
before enqueuing should be sufficient, because it means if a dependent of a newly-completed node is being enqueued, it can't have even been in the queue before (let alone completed), because one of its dependencies wasn't completed until just now.
We can wait as it changes, but it feels like it's a // should never happen
situation and it should throw rather than ignoring just in case it does ever happen :)
Oh oh I see my confusion here now :)
In another version, this method was called in a loop over all nodes, not just the dependents of recently completed nodes. Yup, yup in this snapshot it's a 100% superfluous check.
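With the superfluous check dropped, the guard reduces to roughly this (only the condition comes from the discussion; the enqueue action itself is a placeholder):

_enqueueNodeIfPossible(node) {
  const dependencies = node.getDependencies();
  // A dependent of a just-completed node cannot itself be complete yet,
  // so checking its dependencies is sufficient.
  if (!dependencies.every(dependency => this._nodesCompleted.has(dependency))) return;
  // ...enqueue the node for processing
}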
good to go from me!
@@ -177,4 +208,9 @@ class Node {
  }
}

Node.TYPES = {
  UNKNOWN: 'unknown',
remove?
in the name of fast mode #2703
blocked on #2720
split out from #2720 for easier reviewing