fix: normalize all times to navStart, remove traceviewer model #2347

patrickhulce · 2017-05-24T00:36:18Z

brings us closer to closing #2235 (still there for stats we can remove separately)

Would like to do some more testing of how this impacts metrics, but the current state is pretty bad with ~10% of sites (just did 100 from alexa top 1000) failing to compute TTCI when it should have been computed (there was a 5s quiet of both network and CPU but due to the inconsistency between traceviewer model and network record timestamps the quiet periods didn't appear to line up in the audit)

NOTE: moving to first navStart is optional and can easily be reverted, I wanted to test this out since it still meets the goal of the tests of capturing the correct FMP it just reports longer times which in some cases is definitely the right thing to do and the failure cases are unknown (reverted)

patrickhulce · 2017-05-24T18:33:28Z

did a plots run on this and it brought TTCI failures down from 20% to 13% so that seems like a win, of the remaining 13%, 8% are timeouts and 5% are all for react-hn cases where there's a random long task that is impossible to predict from other signals (it's on a setinterval/settimeout or something) until we have protocol support for CPU quiet

I'm comfortable opening this up for review :)

brendankenny · 2017-05-24T18:39:47Z

what was the plots population size? Runs per site?

can we move the navstart change in a separate PR?

This reverts commit ea619df.

googlebot · 2017-05-24T18:44:35Z

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that they're okay with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

patrickhulce · 2017-05-24T18:44:53Z

what was the plots population size? Runs per site?

same 100 sites from before single run

can we move the navstart change in a separate PR?

ya reverted

patrickhulce · 2017-05-30T22:49:11Z

🏏 🏏

paulirish · 2017-05-30T23:06:29Z

lighthouse-core/lib/traces/tracing-processor.js

-    const events = tabTrace.processEvents.filter(e => e.ts !== 0);
-    const lastEvent = events[events.length - 1];
+    const navStart = tabTrace.navigationStartEvt.ts;
+    const traceStart = tabTrace.processEvents


should we move these two (traceStart/traceEnd) into tabTrace ? i'm okay either way.

yeah good call

paulirish · 2017-05-30T23:12:53Z

lighthouse-core/lib/traces/tracing-processor.js

+        events: [event],
+      };
+
+      if (currentEvt.end < lastEvt.end) {


can you add this kind of comment?

// note: processEvents is already sorted by event start

paulirish · 2017-05-30T23:17:27Z

lighthouse-core/lib/traces/tracing-processor.js

+      if (currentEvt.end < lastEvt.end) {
+        // The current event is completely inside the previous event, add to previous
+        lastEvt.events.push(event);
+      } else if (currentEvt.start < lastEvt.end) {


this is possible?

we're in a single process, though admittedly not a single thread. since we're only looking at these scheduleable tasks, i dont think they should overlap..

yup, that's one of the reasons the event count was so far off from before

you're saying you have traces where these do overlap? can we look at them?

yeah the test fixtures do fixtures/traces/progressive-app.json

paulirish · 2017-05-30T23:18:31Z

lighthouse-core/lib/traces/tracing-processor.js

+        lastEvt.events.push(event);
+      } else if (currentEvt.start < lastEvt.end) {
+        // The current event overlaps with the previous event, extend the duration and add to previous
+        lastEvt.events.push(event);


are these childEvents being used somewhere?

no but seemed like a bad idea to throw them away

kinda feel like we should throw them away since we have no use for them.

paulirish · 2017-05-30T23:18:59Z

lighthouse-core/test/lib/traces/tracing-processor-test.js

@@ -47,25 +47,6 @@ describe('TracingProcessor lib', () => {
    });
  });

-  it('doesn\'t throw when user_timing events have a colon', () => {


why remove?

because we don't ask tracing processor to do anything anymore, I can bring back though I guess it doesn't hurt anything

wait not nevermind it does hurt something there's no init function at all anymore :)

ah true! :) can move this to user-timing-test.js to make sure we can inject a trace with these colons and everything goes through fine?

brendankenny · 2017-05-31T03:17:10Z

what's the expected difference between this and what trace viewer produces today? I'm seeing some pretty different numbers of events

brendankenny · 2017-05-31T04:22:40Z

lighthouse-core/lib/traces/tracing-processor.js

+        // The current event overlaps with the previous event, extend the duration and add to previous
+        lastEvt.events.push(event);
+        lastEvt.end = currentEvt.end;
+        lastEvt.duration = currentEvt.end - currentEvt.start;


should this be lastEvt.duration = currentEvt.end - lastEvt.start; (or currentEvt.end - currentEvt.start) if extending duration by the end of currentEvt (but keeping the same start)?

heh made the same mistake as in your or :) yeah lastEvt.end - lastEvt.start or currentEvt.end - lastEvt.start

added tests for this func :)

heh made the same mistake as in your or

lol

patrickhulce · 2017-05-31T16:34:49Z

I'm seeing some pretty different numbers of events

how different is "pretty different"? many of the differences I saw originally in Paul's branch were eliminated by handling the overlapping events, there are likely more edge cases we're not handling but this fixes the ~7% of definitely wrong interpretations of the trace I was seeing. Hard to do more comparisons of what's "right" since no ground truth exists

brendankenny · 2017-05-31T17:19:24Z

how different is "pretty different"?

I called getMainThreadTopLevelEventDurations on the same trace using master and the new implementation. Durations should cancel out timing changes due to the navStart shift.

In the new implementation I got +35% number of top level events and some pretty different timings, though some of that may be due to it being difficult to tell which events match up and which are in the new 35% due to there being so many events.

But there were also some missing from the new version, and they could make a significant difference in our metrics. e.g. here's all event durations over 100ms:

master: 105.71, 116.33, 144.134, 191.014, 239.333, 391.718, 505.825, 599.512, 2563.194
#2347:  105.71, 116.33, 144.134, 191.014, 239.333,          505.825,          2563.194

I can dig in more and look at the actual events behind these missing durations to see where they're going.

Hard to do more comparisons of what's "right" since no ground truth exists

This was extracting all of the top level events in the trace (using -Infinity and Infinity as startTime and endTime). I would expect trace viewer to handle that correctly and we could treat that as ground truth, but are there issues with that I'm missing?

paulirish · 2017-05-31T19:07:44Z

lighthouse-core/gather/computed/trace-of-tab.js

@@ -98,7 +98,11 @@ class TraceOfTab extends ComputedArtifact {
      .filter(e => e.pid === startedInPageEvt.pid)
      .sort((event0, event1) => event0.ts - event1.ts);

-    const traceEnd = trace.traceEvents.reduce((max, evt) => {
+    const earliestTraceEvt = trace.traceEvents.reduce((min, evt) => {


let's add this guy:

const mainThreadEvents = processEvents .filter(e => e.tid === startedInPageEvt.tid);

paulirish · 2017-05-31T19:08:19Z

lighthouse-core/gather/computed/trace-of-tab.js

@@ -124,7 +128,9 @@ class TraceOfTab extends ComputedArtifact {
      timings,
      timestamps,
      processEvents,


mainThreadEvents,

paulirish · 2017-05-31T19:09:55Z

lighthouse-core/lib/traces/tracing-processor.js

+    let lastEvt = {start: -Infinity, end: -Infinity};
+    const topLevelEvents = [];
+    // note: processEvents is already sorted by event start
+    for (const event of tabTrace.processEvents) {


let's use mainThreadEvents here, since we don't want ProcessTaskFromWorkQueue events from HTML parser thread, script streamer thread, service worker thread, etc.

brendankenny

alright! this is looking good

brendankenny · 2017-06-02T04:28:49Z

lighthouse-core/audits/estimated-input-latency.js

-    return Promise.all(pending).then(([tabTrace, model]) => {
-      return EstimatedInputLatency.calculate(tabTrace, model, trace);
+    return artifacts.requestTraceOfTab(trace).then(tabTrace => {
+      return EstimatedInputLatency.calculate(tabTrace);


since it's static it would nice if this were able to be .then(EstimatedInputLatency.calculate), though that would require cleaning up our uses of this in here

brendankenny · 2017-06-02T04:32:51Z

lighthouse-core/audits/time-to-interactive.js

@@ -191,7 +190,7 @@ class TTIMetric extends Audit {
      }

      const times = {fmpTiming, visuallyReadyTiming, traceEndTiming};
-      const data = {tabTrace, model, trace};
+      const data = {tabTrace, trace};


looks like only data.tabTrace is being used above, not data.trace, so can maybe drop data in favor of just tabTrace? (also need to update the jsdocs still referring to model)

or we can just kill the audit entirely :)

haha, good point. I'm fine with leaving as is and deleting it next :)

brendankenny · 2017-06-02T04:34:56Z

lighthouse-core/gather/computed/trace-of-tab.js

@@ -38,7 +38,7 @@ class TraceOfTab extends ComputedArtifact {

  /**
   * @param {{traceEvents: !Array}} trace
-   * @return {!{processEvents: !Array<TraceEvent>, startedInPageEvt: TraceEvent, navigationStartEvt: TraceEvent, firstContentfulPaintEvt: TraceEvent, firstMeaningfulPaintEvt: TraceEvent}}
+   * @return {!TraceOfTabArtifact}


TraceOfTabArtifact is really out of date. Mind updating it? :D

brendankenny · 2017-06-02T04:40:34Z

lighthouse-core/gather/computed/trace-of-tab.js

@@ -107,7 +114,7 @@ class TraceOfTab extends ComputedArtifact {
      firstPaint,
      firstContentfulPaint,
      firstMeaningfulPaint,
-      traceEnd,
+      traceEnd: latestTraceEvt,


traceEnd in metrics but latestTraceEvt in the top level. Should latestTraceEvt be traceEndEvt to match how the others pair up (timing name + Evt)?

(and don't feel strongly about it, but traceStartEvt may make more sense than earliestTraceEvt. e.g. new code in tracing-processor already does traceEnd = tabTrace.latestTraceEvt.ts)

I had a very good explanation for why I named them this instead of traceEnd (that these were supposed to be just about the main thread, latestMainThreadEvt earliestMainThreadEvt and distinct from traceEnd) but the rest of the code betrays me. In fact, there are a number of things in here that I strongly feel like I accidentally dropped a stash...

yeah what do you think about using navStart and traceEnd timings instead of all this other nonsense entirely?

brendankenny · 2017-06-02T04:41:57Z

lighthouse-core/lib/traces/tracing-processor.js

-   * @see https://docs.google.com/document/d/1b9slyaB9yho91YTOkAQfpCdULFkZM9LqsipcX3t7He8/preview
-   * @param {!traceviewer.Model} model
-   * @param {{traceEvents: !Array<!Object>}} trace
+   * @see https://docs.google.com/document/d/18gvP-CBA2BiBpi3Rz1I1ISciKGhniTSZ9TY0XCnXS7E/preview


not sure how this snuck into the diff. This is the old doc (which has just a link to the currently linked doc)

yeah probably a bad rebase from the earlier branch, I think paul's been on the hunt for this for a while ;) fixed

brendankenny · 2017-06-02T05:12:25Z

lighthouse-core/lib/traces/tracing-processor.js

    // Range of responsiveness we care about. Default to bounds of model.
-    startTime = startTime === undefined ? model.bounds.min : startTime;
-    endTime = endTime === undefined ? model.bounds.max : endTime;
+    startTime = startTime === undefined ? (traceStart - navStart) / 1000 : startTime;


the - navStart is somewhat surprising here. Maybe add a comment explaining that they need to be shifted too? Or could use tabTrace.timings.traceEnd and tabTrace.timings.traceStart (if it was added to trace-of-tab) to get the already shifted bounds

better question, should we just default these bounds to navstart and traceEnd now instead? negative start time doesn't really make a whole lot of sense and we take it to be the beginning of time everywhere we consume this data

better question, should we just default these bounds to navstart and traceEnd now instead? negative start time doesn't really make a whole lot of sense and we take it to be the beginning of time everywhere we consume this data

ha, sorry I was too slow to respond :)

I think it would be better to still have getMainThreadTopLevelEvents and getMainThreadTopLevelEventDurations return all events, even ones before navStart.

Our current uses don't need anything before then, but we'll hopefully be doing more sophisticated trace analysis in the future (e.g. over multiple navigations, though we'll have several other complications to figure out before then :) and I think having them return all of the main thread/top level events in the trace will be what you would expect from them.

Having a note that the times are relative to navStart (so negative times are just before that) doesn't seem too confusing.

I guess on the other hand we could say something like "use a startTime of -Infinity to get all top level events", but that's pretty awkward

On the other other hand, getting everything after navStart is our typical use case right now, so it's kind of awkward that you have to specify 0 for that.

On the original hand, though, it still makes sense to me that these would return all of the top level events in the main thread of the trace when no timing bounds are specified, so idk :)

getting everything after navStart is our typical use case right now

though that's not completely true. It looks like all of our calls have to deal with events earlier than they care about anyways, so it doesn't seem harmful to default to all events, even before navStart

I gotta disagree that it makes sense to return everything. The entire rest of the LH world completely ignores time before navstart and having this one spot that doesn't do that is what led to total failure on 10% of sites. Having them return everything would require every caller to filter out the negative events or provide a non-default parameter at which point why bother having a default at all?

but they do all filter out negative events or provide their own startTime.

why bother having a default at all?

start of trace/end of trace defaults are no defaults in a way, since it's pretty much just letting the trace you passed in define the bounds.

I guess it comes down to what these methods are for, just TTI-like things or any sort of processing where you need to extract the top level events.

idk, @paulirish, as a user, what would you expect from these methods?

i think getRiskToResponsiveness makes sense to be aware of navstart, whereas getMainThreadTopLevelEventDurations shouldnt. that seems to be the state of things here, so i'm pretty good.

I would appreciate a comment being left about the shifting for navstart. (it seems like the jsdoc comment is incorrect now)

OK, I'm good with what we have here. I'll file an issue after this PR with a proposal to separate concerns slightly but still make sure all our audits can easily do the right thing by default (don't pay attention to events before navStart). This works well for all our current audits, though.

brendankenny · 2017-06-02T05:13:56Z

lighthouse-core/lib/traces/tracing-processor.js

+        start,
+        end,
+        duration: event.dur / 1000,
+        events: [event],


since it'll be just the one event now, drop the array and just have event? (or rawEvent or whatever...also may not really be needed)

just nuked it we don't use it I forgot to remove

brendankenny · 2017-06-02T05:16:15Z

lighthouse-core/test/audits/estimated-input-latency-test.js

@@ -37,7 +37,7 @@ describe('Performance: estimated-input-latency audit', () => {
    const artifacts = generateArtifactsWithTrace({traceEvents: pwaTrace});
    return Audit.audit(artifacts).then(output => {
      assert.equal(output.debugString, undefined);
-      assert.equal(output.rawValue, 16.4);
+      assert.equal(output.rawValue, 16.2);


this and the TTI change, are they just from the navStart shift?

yep, can confirm. the trace for this one now skips the first 15ms and the TTI one skips the first 5ms, adding back this delta yields the old results

brendankenny · 2017-06-02T05:18:37Z

lighthouse-core/test/gather/computed/first-interactive-test.js

@@ -43,7 +44,14 @@ describe('FirstInteractive computed artifact:', () => {

  it('should compute firstInteractive', () => {
    return computedArtifacts.requestFirstInteractive(acceptableTrace).then(output => {
-      assert.equal(Math.round(output.timeInMs), 1587);
+      assert.equal(Math.round(output.timeInMs), 1582);


also just from navStart shift?

brendankenny · 2017-06-02T05:20:25Z

lighthouse-core/test/lib/traces/tracing-processor-test.js

      const durations = ret.durations;

      function getDurationFromIndex(index) {
        return Number(durations[index].toFixed(2));
      }

      assert.equal(durations.filter(dur => isNaN(dur)).length, 0, 'NaN found');
-      assert.equal(durations.length, 652);
+      assert.equal(durations.length, 651);


why the change? Seems like it shouldn't lose any events

brendankenny · 2017-06-06T19:44:23Z

lighthouse-core/test/lib/traces/tracing-processor-test.js

      const durations = ret.durations;

      function getDurationFromIndex(index) {
        return Number(durations[index].toFixed(2));
      }

      assert.equal(durations.filter(dur => isNaN(dur)).length, 0, 'NaN found');
-      assert.equal(durations.length, 652);
+      assert.equal(durations.length, 645);


can you confirm this one as well? Just the new navStart default?

6 of them definitely the 7th is eluding me, trying to track it down

ah found the 7th it's a result of !dur they return an event that has no duration which we'll ignore

brendankenny · 2017-06-06T19:45:14Z

need a more recent rebase? travis is still complaining about javascriptDialog

paulirish · 2017-06-06T19:49:06Z

Ya I didn't rebase this one.

brendankenny

sweet! Test failures are devtools compilation, so lets not let that get in the way of this big step forward

patrickhulce · 2017-06-06T22:07:52Z

🎉 🎉

brendankenny · 2017-06-06T22:11:53Z

with this change, the compiled lighthouse-background.js just dropped by 562KB, around 25%

paulirish and others added 6 commits May 5, 2017 10:11

Remove traceviewer's tracingModel

e791557

Merge branch 'master' into rmtraceviewer

8c2e789

revert user timings changes

15d9424

fix remaining tests

2fb8850

add trace with redirects

25f157c

use first navstart, not last

ea619df

patrickhulce changed the title ~~fix: normalize all times to first navStart, remove traceviewer model~~ fix: normalize all times to first navStart, remove traceviewer model [DON'T MERGE] May 24, 2017

patrickhulce changed the title ~~fix: normalize all times to first navStart, remove traceviewer model [DON'T MERGE]~~ fix: normalize all times to first navStart, remove traceviewer model May 24, 2017

Revert "use first navstart, not last"

f32cccf

This reverts commit ea619df.

patrickhulce changed the title ~~fix: normalize all times to first navStart, remove traceviewer model~~ fix: normalize all times to navStart, remove traceviewer model May 24, 2017

Merge branch 'master' into rmtraceviewer

78036fc

reduce -> for..of (#2399)

ef2eccd

paulirish reviewed May 30, 2017

View reviewed changes

feedback

184537b

brendankenny reviewed May 31, 2017

View reviewed changes

patrickhulce added 2 commits May 31, 2017 09:50

bug fix and test

1a0aacb

add second test

9bfcdee

patrickhulce added 2 commits May 31, 2017 13:45

Merge branch 'master' into rmtraceviewer

5f1a15f

mainthread events

65de1eb

paulirish reviewed May 31, 2017

View reviewed changes

patrickhulce added 2 commits May 31, 2017 14:30

test updates

ba571ab

add back colon traceg

f62b2a4

fixed user timing test

d4c4605

brendankenny requested changes Jun 2, 2017

View reviewed changes

patrickhulce mentioned this pull request Jun 5, 2017

Runs out of memory for complex pages #2424

Closed

patrickhulce added 3 commits June 6, 2017 10:55

use navstart as default

df3c12e

Merge branch 'master' into rmtraceviewer

08797fc

tasks -> events

2bb290b

brendankenny reviewed Jun 6, 2017

View reviewed changes

Merge branch 'master' into rmtraceviewer

71682ce

brendankenny approved these changes Jun 6, 2017

View reviewed changes

patrickhulce merged commit d5f5c52 into master Jun 6, 2017

patrickhulce deleted the rmtraceviewer branch June 6, 2017 22:07

This was referenced Jun 9, 2017

fix(driver): wait for CPU idle #2473

Merged

Can't retrieve Interactive times with Angular HN tastejs/hacker-news-pwas#70

Closed

Remove traceviewer dependency #2235

Closed

brendankenny mentioned this pull request Sep 26, 2017

Migrate TTI and EIL audits to using TraceOfTab #1568

Closed

patrickhulce mentioned this pull request May 17, 2019

Improve handling of redirects, navStart #8984

Open

fix: normalize all times to navStart, remove traceviewer model #2347

fix: normalize all times to navStart, remove traceviewer model #2347

Conversation

patrickhulce commented May 24, 2017 • edited by brendankenny Loading

patrickhulce commented May 24, 2017

brendankenny commented May 24, 2017

googlebot commented May 24, 2017

patrickhulce commented May 24, 2017

patrickhulce commented May 30, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny commented May 31, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

patrickhulce commented May 31, 2017

brendankenny commented May 31, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny Jun 2, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny Jun 2, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny Jun 6, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

brendankenny commented Jun 6, 2017

paulirish commented Jun 6, 2017 via email

brendankenny left a comment

Choose a reason for hiding this comment

patrickhulce commented Jun 6, 2017

brendankenny commented Jun 6, 2017

patrickhulce commented May 24, 2017 •

edited by brendankenny

Loading

brendankenny commented May 31, 2017 •

edited

Loading

brendankenny Jun 2, 2017 •

edited

Loading

brendankenny Jun 2, 2017 •

edited

Loading

brendankenny Jun 6, 2017 •

edited

Loading