
feat(predictive-perf): add network estimation #3187

Merged · 5 commits · Sep 6, 2017

Conversation

@patrickhulce (Collaborator) commented Aug 30, 2017:

in the name of fast mode #2703

blocked on #2720

split out from #2720 for easier reviewing

@paulirish (Member) left a comment:

more to come, but flushing early... 🚽 😺

sum += values[key];
});

const rawValue = sum / 4;
Member:

magic number. Object.keys(graphs).length ?

Collaborator Author:

done
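The fix discussed above (dividing by the actual number of estimates rather than a hard-coded 4) can be sketched like so; the function name is illustrative, not the PR's actual code:

```javascript
// Illustrative sketch: average the per-graph estimates, dividing by the
// real key count instead of a magic number.
function meanDuration(values) {
  let sum = 0;
  Object.keys(values).forEach(key => {
    sum += values[key];
  });
  return sum / Object.keys(values).length;
}
```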

const DEFAULT_MAXIMUM_CONCURRENT_REQUESTS = 10;
const DEFAULT_RESPONSE_TIME = 30;
const DEFAULT_RTT = 150;
const DEFAULT_THROUGHPUT = 1600 * 1024; // 1.6 Mbps
Member:

these two numbers are identical to our throttling presets. should we pull from emulation.js instead?

Collaborator Author:

done

{
rtt: DEFAULT_RTT,
throughput: DEFAULT_THROUGHPUT,
defaultResponseTime: DEFAULT_RESPONSE_TIME,
Member:

post-discussion: naming-wise this value is very unclear. defaultServerResponseTime is still ambiguous as to whether it includes latency or not (and 1-way or 2-way?)

defaultTTFB is very clear as a definition, so I'd prefer using that as the argument. That means some adding/subtracting of latency here and there, but i think it's worth it so the terms are clear.

Collaborator Author:

renamed to fallbackTTFB and added latency subtraction 👍

@patrickhulce (Collaborator Author), Aug 31, 2017:

ok hang on, reopening the discussion :)

i'm gonna push back against this latency-subtraction-from-TTFB deal a bit; I cherry-picked the change to my final fixes branch and it regressed accuracy by 0.3%

given we're significantly under-reporting rtt in most cases, i'm not sure we should be further reducing the download time estimates. how about as future exploration we try to do something fancy to identify/separate connection latency from server response time instead (I can imagine trying to determine the built-in connection latency by looking at all the requests, or comparing the TCP handshake time to the TTFB time, etc., but it's all more researchy) and leave it be for now?

proposed comment:

      // We'll approximate how much time the server for a connection took to respond after receiving
      // the request by computing the minimum TTFB time for requests on that connection.
      // TTFB = one way latency + server response time + one way latency
      // Even though TTFB is greater than server response time, the RTT is underaccounted for by
      // not varying per-server and so the difference roughly evens out.
      // TODO(patrickhulce): investigate a way to identify per-server RTT
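A minimal sketch of that proposed approximation (the function and `getTTFB` accessor names are illustrative, not the PR's actual API):

```javascript
// Sketch of the approximation described above: the server response time for
// a connection is proxied by the minimum TTFB observed on that connection.
// TTFB over-counts by the round-trip latency, but the globally under-reported
// RTT roughly cancels it out.
function estimateResponseTime(records, getTTFB) {
  return Math.min(...records.map(getTTFB));
}
```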

Member:

as discussed offline, i think it'd be great to explore identifying per-server RTT later. this comment sg. let's add it. :)

Member:

update: saw the comment is added already. 👍

* @param {!WebInspector.NetworkRequest} record
* @return {number}
*/
static getResponseTime(record) {
Member:

yes, this value is the definition of TTFB, so let's s/responseTime/ttfb/ :)

const DEFAULT_RTT = 150;
const DEFAULT_THROUGHPUT = 1600 * 1024; // 1.6 Mbps

function groupBy(items, keyFunc) {
Member:

I wish we could install this as Object.groupBy so it's usable everywhere. the named props and array-of-obj dances get old. oh well.

Collaborator Author:

heh yeah...I'm still all for a util lib :)
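For reference, the groupBy helper under discussion can be sketched roughly like this (body illustrative; it mirrors the spirit of the later standardized Map.groupBy):

```javascript
// Rough sketch of a Map-based groupBy: buckets items into arrays keyed by
// keyFunc(item), e.g. grouping network records by connectionId.
function groupBy(items, keyFunc) {
  const grouped = new Map();
  for (const item of items) {
    const key = keyFunc(item);
    const group = grouped.get(key) || [];
    group.push(item);
    grouped.set(key, group);
  }
  return grouped;
}
```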

);

for (const [connectionId, records] of recordsByConnection.entries()) {
const isSsl = records[0].parsedURL.scheme === 'https';
Member:

wss? if that should be included we have SECURE_SCHEMES elsewhere :)

Member:

confirm that we dont care about ws in this simulation.

how about

const isTLS = records[0].parsedURL.scheme === 'https'; // wss deliberately excluded

Collaborator Author:

done

@paulirish (Member) left a comment:

flushing..

Also can I just say... estimator/tcp-connection.js is one of my favorite files I've seen in a while. Love the research.

return graph.cloneWithRelationships(node => {
if (node.endTime > fmp) return false;
if (node.type !== Node.TYPES.NETWORK) return true;
return node.record.priority() === 'VeryHigh'; // proxy for render-blocking
Member:

FWIW this will exclude external scripts (and a few others)

https://docs.google.com/document/d/1bCDuq9H1ih9iNjgzyAL0gpwNFiEP4TZS-YLRp_RuMlc/edit

Maybe it's time we write the isResourceRenderBlocking(resourceType, priority) method finally?

Collaborator Author:

yup this is implemented in next PR :) FWIW in the future PR the remaining optimistic/pessimistic split is also excluding render blocking priorities if they were script initiated

*/
static getOptimisticTTCIGraph(graph) {
return graph.cloneWithRelationships(node => {
return node.record._resourceType && node.record._resourceType._name === 'script' ||
Member:

with modern devtools this is node.record.resourceType() === Common.resourceTypes.Script
But with ours it'll need some WebInspector.*... :)

@patrickhulce (Collaborator Author), Aug 31, 2017:

heh, do we want to go with the public accessors here? (this specific condition is also removed in future work so I'll add this there if we want to switch)

Member:

this is the same as node.record._resourceType === Common.resourceTypes.Script (which is fine). The benefit is just that you don't have to reach all the way to _name

Collaborator Author:

ah ok, will re-open in the final version, where i've used node.resourceType, then 👍

static getOptimisticTTCIGraph(graph) {
return graph.cloneWithRelationships(node => {
return node.record._resourceType && node.record._resourceType._name === 'script' ||
node.record.priority() === 'High' ||
Member:

why does this one use both priorities but optimisticFMP only uses veryhigh? Would love a comment to explain some of these or link into the relevant part of the doc.

Collaborator Author:

future version has comments explaining each condition when more thought started going into them :) but will flesh out more

*/
static getOptimisticTTCIGraph(graph) {
return graph.cloneWithRelationships(node => {
return node.record._resourceType && node.record._resourceType._name === 'script' ||
Member:

style wise the if (..) return... approach used on getOptimisticFMPGraph's predicate is more clear and avoids questions around operator precedence.

Collaborator Author:

this particular condition is removed in next PR but will update latest 👍

});

this._networkRecords = records;
return records;
Member:

does the return value matter? if it doesn't then might as well ditch records and push right into this._networkRecords ?

Collaborator Author:

nope it doesn't, done.

}
} else {
auxiliaryData.timeElapsed += calculation.timeElapsed;
auxiliaryData.timeElapsedOvershoot +=
Member:

yeah lets rename, especially so this var's length doesn't force wrapping. :)


_updateNetworkCapacity() {
for (const connection of this._connectionsInUse) {
connection.setThroughput(this._throughput / this._nodesInProcess.size);
Member:

trying this out..

1000KB/sec throughput
10 connections
20 requests in flight

Each connection then gets set to 50 KB/s of availableThroughput. K sgtm. :)


however, thinking forward to CPU nodes in the graph, they shouldn't be included in this .size should they?

Collaborator Author:

yeah they won't be, I keep track of the number in process by type separately 👍
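The arithmetic in the review comment above can be checked with a tiny sketch (the function name is illustrative):

```javascript
// Illustrative sketch of the capacity split under review: total throughput
// is divided evenly across requests currently in flight.
function perRequestThroughput(totalThroughput, requestsInFlight) {
  return totalThroughput / requestsInFlight;
}
// e.g. 1000 KB/s shared by 20 in-flight requests gives 50 KB/s per request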


let handshakeAndRequest = oneWayLatency;
if (!this._warmed) {
handshakeAndRequest =
Member:

as part of the earlier TTFB comments, can this be just handShake ? (and move one of the OWL's into the TTFB) 🦉

Collaborator Author:

I'm not a huge fan of splitting it, since the handshake is in fact concluding in the same trip as the request is being made; adding a comment with // technically the ACK is still happening blah blah seems a tad more to process than a variable name indicating the two are conjoined, IMO. It also slightly complicates the later http2 case, where neither is required once the connection has been warmed.

Do you think addressing the other TTFB issues makes this clearer or you'd still like to see the split?

Member:

agreed to keep handshakeAndRequest as a unit as it's more accurate to the reality of TCP. 👍
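The handshake accounting being discussed can be sketched roughly as below. This is a hedged illustration assembled from the diff fragments in this thread (three one-way trips for TCP, two more for TLS with False Start); names and structure are illustrative, not the PR's exact code:

```javascript
// Sketch: on a cold connection the initial request rides along with the
// final leg of the handshake, so "handshake" and "request" stay conjoined.
function handshakeAndRequestTime(oneWayLatency, {isTLS = false, warmed = false} = {}) {
  if (warmed) return oneWayLatency; // connection reuse: just send the request
  const tcp = 3 * oneWayLatency; // SYN, SYN-ACK, ACK + initial request
  // ClientHello/ServerHello, assuming TLS False Start is enabled
  const tls = isTLS ? 2 * oneWayLatency : 0;
  return tcp + tls;
}
```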

oneWayLatency +
// ACK + Application Data
oneWayLatency +
// ClientHello/ServerHello with TLS False Start
Member:

how about this for a comment

// We assume TLS False Start is enabled. https://istlsfastyet.com/#server-performance

Collaborator Author:

sg

* @param {number=} maximumTimeToElapse
* @return {{timeElapsed: number, roundTrips: number, bytesDownloaded: number, congestionWindow: number}}
*/
calculateTimeToDownload(bytesToDownload, timeAlreadyElapsed = 0, maximumTimeToElapse = Infinity) {
Member:

naming threw me... simulateDownloadingUntil is way clearer.

kinda weird that args 1 and 3 are the constraints but not next to each other. should we just move to an object instead? Seems like we usually would at this point. strawman:

simulateDownloadingUntil({maxBytes, maxMs, msElapsed})

Collaborator Author:

done, how about switching to an object param in the "incremental PR" after I can rebase against the name change?

Member:

how about switching to an object param in the "incremental PR" after I can rebase against the name change?

sure that's fine.

@paulirish (Member) left a comment:

ok, Finished looking at this PR. great stuff. :)


_updateNetworkCapacity() {
for (const connection of this._connectionsInUse) {
connection.setThroughput(this._throughput / this._nodesInProcess.size);
Member:

setThroughput vs availableThroughput makes me think there may be multiple... like currentThroughput, defaultThroughput, etc... since there's only one, should we just use that same name?

Collaborator Author:

yep done

@patrickhulce patrickhulce force-pushed the predictive_perf__network branch from 3d35420 to cf44696 Compare September 1, 2017 00:03
@patrickhulce patrickhulce force-pushed the predictive_perf__network branch from cf44696 to a632442 Compare September 1, 2017 16:18
oneWayLatency +
// ACK + Application Data
oneWayLatency +
// ClientHello/ServerHello assuming TLS Flase Start is enabled (https://istlsfastyet.com/#server-performance).
Member:

False

}

let roundTrips = Math.ceil(handshakeAndRequest / twoWayLatency);
const timeToFirstByte = handshakeAndRequest + this._responseTime + oneWayLatency;
Member:

_responseTime => _serverLatency

everywhere? :)

Collaborator Author:

done
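The timing fragment above can be restated as a self-contained sketch (illustrative names, after the s/_responseTime/_serverLatency/ rename):

```javascript
// Sketch of the fragment under review: count the round trips consumed by
// the handshake, then TTFB = handshake + server latency + one-way return leg.
function firstByteTiming(handshakeAndRequest, serverLatency, oneWayLatency) {
  const twoWayLatency = 2 * oneWayLatency;
  const roundTrips = Math.ceil(handshakeAndRequest / twoWayLatency);
  const timeToFirstByte = handshakeAndRequest + serverLatency + oneWayLatency;
  return {roundTrips, timeToFirstByte};
}
```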

oneWayLatency +
// SYN ACK
oneWayLatency +
// ACK + Application Data
Member:

ACK + initial request

Collaborator Author:

done

@patrickhulce (Collaborator Author) commented Sep 5, 2017:

I think everything's addressed here, so back to you @paulirish

build failures seem to be the bundlesize step failing due to a GitHub 404

@brendankenny (Member):
sorry, I'd still like to take a look :D:D

@brendankenny (Member) left a comment:

initial comment flush, though this is looking good :)

@@ -31,14 +32,77 @@ class PredictivePerf extends Audit {
}

/**
* @param {!Node} graph
Member:

maybe dependencyGraph to give a local hint?

Collaborator Author:

done

sum += values[key];
});

const rawValue = sum / Object.keys(values).length;
Member:

nit: meanDuration for clarity?

Collaborator Author:

done

maxDepth--;
}

if (!maxDepth) {
Member:

if this happens it seems like a more serious bug would be present?

Collaborator Author:

It means something is wrong with the graph that's been created, see #3187 (comment). I think it's still more useful than a fatal infinite loop on bad input.

Member:

yeah, which I guess is good :) (especially during ongoing development) but it does feel like a check for a properly formed graph at construction time is the right solution (plus we could probably give better error messages for what went wrong with graph construction)

@@ -121,6 +147,11 @@ class Node {
if (!shouldIncludeNode(originalNode)) return;
const clonedNode = originalNode.cloneWithoutRelationships();
idToNodeMap.set(clonedNode.id, clonedNode);
});

rootNode.traverse(originalNode => {
Member:

why is splitting it like this necessary?

Collaborator Author:

it's not guaranteed that a node will be visited by traverse before all of its dependencies; it's just doing a traversal of the edges, not necessarily a topological-sort-ordered traversal
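That two-pass clone can be illustrated with a tiny stand-in graph (TinyNode and cloneGraph are hypothetical stand-ins, not the PR's Node class): clone every included node first, then wire dependencies once all clones are guaranteed to exist.

```javascript
// Minimal stand-in for a dependency-graph node.
class TinyNode {
  constructor(id) {
    this.id = id;
    this.dependencies = [];
  }
  addDependency(node) {
    this.dependencies.push(node);
  }
}

// Two-pass clone: pass 1 clones nodes without relationships; pass 2 wires
// dependencies, which is safe regardless of the visit order in pass 1.
function cloneGraph(nodes, shouldIncludeNode) {
  const idToNodeMap = new Map();
  for (const node of nodes) {
    if (shouldIncludeNode(node)) idToNodeMap.set(node.id, new TinyNode(node.id));
  }
  for (const node of nodes) {
    const clone = idToNodeMap.get(node.id);
    if (!clone) continue;
    for (const dep of node.dependencies) {
      const clonedDep = idToNodeMap.get(dep.id);
      if (clonedDep) clone.addDependency(clonedDep);
    }
  }
  return idToNodeMap;
}
```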

*
* @param {!Node} rootNode
*/
static printGraph(rootNode, widthInCharacters = 100) {
Member:

not used? just for debugging?

Collaborator Author:

correct

return grouped;
}

class Estimator {
Member:

needs some jsdoc description strings :)

Collaborator Author:

done

// Even though TTFB is greater than server response time, the RTT is underaccounted for by
// not varying per-server and so the difference roughly evens out.
// TODO(patrickhulce): investigate a way to identify per-server RTT
let estimatedResponseTime = records.reduce(
Member:

Math.min(...records.map(Estimator.getTTFB))?

Collaborator Author:

done

);
}

if (depth > 10000) {
Member:

also seems like a code smell/from an earlier development stage?

Collaborator Author:

While it was also very useful for early debugging with a small value, this is just a sensibility check to avoid looping forever and instead fail in the event of illogical network record or trace event relationships (or a bug in graph construction, though at this point that's synonymous with illogical event relationships given the interpretations we're taking).

Member:

this is just a sensibility check to avoid looping forever and instead fail in the event of illogical network record or trace event relationships

same as above, maybe we can move to an assertValidDepGraph method in the future

_initializeAuxiliaryData() {
this._nodeTiming = new Map();
this._nodesCompleted = new Set();
this._nodesInProcess = new Set();
Member:

in the same process? if not, may be a bit confusing naming-wise. nodesInProgress?

Collaborator Author:

hahaha, good point, done

_enqueueNodeIfPossible(node) {
const dependencies = node.getDependencies();
if (
!this._nodesCompleted.has(node) &&
Member:

if all dependencies must have been completed to be enqueued, shouldn't it be impossible for a node to already be completed here (since it's being added as a dependent of another node)?

Collaborator Author:

This is cleaned up a little bit in #3162, but the summary here is that it was the only check against duplicate processing of a node. By definition, if a node has been completed, all of its dependencies have been completed, so while we're enqueuing, don't start on a node we've already finished (really it should be a node we've already started, which is fixed later, but it didn't have an impact when only network records were considered).

Once again, this logic is re-arranged in the next PR to be clearer.

Member:

By definition, if a node has been completed, all of its dependencies have been completed, so while we're enqueing don't start on a node we've already finished

right, I was trying to say that always checking dependencies.every(dependency => this._nodesCompleted.has(dependency)) before enqueuing should be sufficient, because it means if a dependent of a newly-completed node is being enqueued, it can't have even been in the queue before (let alone completed), because one of its dependencies wasn't completed until just now.

We can wait as it changes, but it feels like it's a // should never happen situation and it should throw rather than ignoring just in case it does ever happen :)

Collaborator Author:

Oh oh I see my confusion here now :)

In another version, this method was called in a loop over all nodes, not just the dependents of recently completed nodes. Yup, yup in this snapshot it's a 100% superfluous check.
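The readiness check the thread converged on can be sketched as follows (illustrative, standalone form of the predicate, not the PR's method):

```javascript
// A node may be enqueued once every one of its dependencies has completed.
// When called only for dependents of a just-completed node, a separate
// "already completed" check on the node itself is superfluous.
function canEnqueue(node, nodesCompleted) {
  return node.getDependencies().every(dep => nodesCompleted.has(dep));
}
```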

@brendankenny (Member) left a comment:

good to go from me!

@@ -177,4 +208,9 @@ class Node {
}
}

Node.TYPES = {
UNKNOWN: 'unknown',
Member:

remove?
