Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core: use target type instead of session for oopif #15006

Merged
merged 19 commits into from
May 2, 2023
Merged

Conversation

connorjclark
Copy link
Collaborator

@connorjclark connorjclark commented Apr 21, 2023

ref #14157

This removes the sessionId (too low-level, differs between clients) from the network requests model, and replaces it with source which is set to the target type that sent the Network protocol event.

An oopif boolean getter is added, since currently audits only care about that. It simply checks that it didn't come from an iframe target type, which ought to be the same behavior as the previous sessionId hack. The oopif smoke tests validate this.

This doesn't do anything to make the network records -> devtools log conversion roundtrip w/ network source information. We don't have all the information about targets, so the roundtrip wouldn't be exact anyhow. We'd have to fake sessionIds and targets and their attached events, such that network records -> devtools log -> network records would roundtrip OK. But devtools log -> network records -> devtools log would lose information. This is no different that before with the sessionId hack.

  • source may be too generic of a name, maybe sourceTargetType?

@connorjclark connorjclark requested a review from a team as a code owner April 21, 2023 18:49
@connorjclark connorjclark requested review from brendankenny and removed request for a team April 21, 2023 18:49
@@ -186,29 +186,33 @@ describe('network recorder', function() {
const devtoolsLogs = networkRecordsToDevtoolsLog([
{url: 'http://example.com'},
{url: 'http://iframe.com'},
{url: 'http://other-iframe.com'},
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo this second iframe was distracting

// Set record.source to the Crdp target type that requested the resource,
// based on the what session the network events came from.
{
/** @type {Map<string, 'page'|'iframe'|'worker'>} */
Copy link
Collaborator Author

@connorjclark connorjclark Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: we still do not attach to worker targets, so this will never be worker.

/** @type {Map<string, 'page'|'iframe'|'worker'>} */
const sessionIdToTargetType = new Map();
for (const message of devtoolsLog) {
if (message.method === 'Target.attachedToTarget') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check my understanding...

We won't see a Target.attachedToTarget event for the default session, but there is a fall-back to the page type on L312 anyway. All other sessions should have a Target.attachedToTarget event in the DT log.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat, could you add a comment explaining it?

Copy link
Member

@brendankenny brendankenny Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timespans will miss the Target.attachedToTarget events of subframes and workers too, though, won't they?

Copy link
Member

@brendankenny brendankenny Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one idea: target-manager already keeps info on all the targets it's managing:

const targetWithSession = {
target: target.targetInfo,
cdpSession,
session: newSession,
protocolListener,
};
this._targetIdToTargets.set(targetId, targetWithSession);

Any new listener to protocolevent on the targetManager (like network-monitor/devtools-log) could first get a set of pre-seeded events for any existing tracked targets. New targets during the run will come then come in normally.

That would keep the nice property from this PR that network records are entirely reconstructable from the devtoolslog.

Copy link
Member

@adamraine adamraine Apr 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, there could be more than 1 missing Target.attachedToTarget event in timespan mode. Are you saying we add the preliminary Target.attachedToTarget events in ourselves?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the latest commit what you had in mind?

I still need to write a timespan test that would exercise this

core/lib/network-recorder.js Outdated Show resolved Hide resolved
core/lib/network-recorder.js Outdated Show resolved Hide resolved
@brendankenny
Copy link
Member

One thing that came up in discussing #14157 was splitting up sources between actual page (usually the main frame) and in process iframes. They'll be from the same target, but it's worth differentiating where they're really coming from.

A number of the audits that say they want to not include content from OOPIFs actually mean they don't want content from any iframe. Similarly we should probably be more careful with all-frames vs main-frame-only metrics, where currently we're sometimes including a bunch of stuff in the lantern graph that shares a process but isn't actually in the main frame. We don't have to go update all of those now (obviously we've been living with using OOPIFs as the differentiator for a while now), but having that differentiation would unlock being able to do those updates.

Maybe same session but different frameIds is sufficient to tell them apart? Frame trees are sometimes more complicated in practice, but this should be compatible with adding more outmost/fenced/whatever source types in the future

@brendankenny
Copy link
Member

source may be too generic of a name, maybe sourceTargetType?

Forgot to voice agreement with this before. Seeing it used in context it's now clear it's way too generic :)

sourceTargetType is kind of awkward but conveys everything it needs to.

My only other idea was sessionTargetType to piggyback on knowledge people working with CDP already have about sessions, and it's clear that it kind of pairs up with sessionId

Comment on lines 173 to 175
if (method === 'Target.attachedToTarget') {
this._targetAttachedPayloads.push(payload);
}
Copy link
Member

@brendankenny brendankenny Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking instead of having to hold onto them, we could recreate the target events from this._targetIdToTargets.values(), since all that info is in there, and that means if/when we decide to more actively attach/detach from targets (we don't currently do anything for Target.targetCrashed/Target.targetDestroyed/Target.targetInfoChanged/Target.detachedFromTarget events), this approach would automatically pick up any management work and only emit known targets.

However, that would mean any Target events that come in during the run wouldn't get the same treatment, so to be consistent we'd either need to save all Target.* events here and downstream consumers like network-recorder would have to be able to handle evolving targets, or TargetManager could handle the whole thing, but it probably isn't a good idea to rely on events at that point.

Alternatively we change _getProtocolEventListener above to take a targetType, so payload becomes {method, params, targetType} (or also keep sessionId, but there really wouldn't be any need for it at that point), _findRealRequestAndSetSession becomes _findRealRequestAndSetTargetType, and no other special handling is needed.

Copy link
Collaborator Author

@connorjclark connorjclark Apr 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like re-emitting is simpler and all we need for now, and we could defer changing the interface for target manager / having downstream consumers do more work until we actually need it?

Alternatively we change _getProtocolEventListener above to take a targetType, so payload becomes {method, params, targetType} (or also keep sessionId, but there really wouldn't be any need for it at that point), _findRealRequestAndSetSession becomes _findRealRequestAndSetTargetType, and no other special handling is needed.

This loses the property that the network records can be created from just the devtools log. Or are you suggesting we also save targetType in DevToolsLog / what we write to disk?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or are you suggesting we also save targetType in DevToolsLog / what we write to disk?

yes, if it's in payload here it'll be automatically saved in the devtoolslog (like sessionId is currently)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From chat...

I like the alternative plan, assuming we store DT logs as {method, params, targetType, sessionId} instead of the current {method, params, targetType}. I would prefer maintaining our new definition of protocol events as an extended version of the original.

core/gather/driver/target-manager.js Outdated Show resolved Hide resolved
Copy link
Member

@adamraine adamraine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'm slightly concerned we will want to add more than just the targetType to the DT log (e.g. adding a target id or something) but I don't see it being that hard to change in the future.

core/test/scenarios/api-test-pptr.js Outdated Show resolved Hide resolved
Co-authored-by: Adam Raine <6752989+adamraine@users.noreply.github.com>
@@ -244,6 +244,8 @@ function getRequestWillBeSentEvent(networkRecord, index, normalizedTiming) {
frameId: networkRecord.frameId || `${idBase}.1`,
redirectResponse: networkRecord.redirectResponse,
},
targetType: networkRecord.sessionTargetType,
Copy link
Member

@brendankenny brendankenny Apr 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is required so should have a default? ('page'?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left that out because otherwise NetworkRecorder.recordsFromLogs would need to do something similar (otherwise the roundtrip tests would fail), and it seemed better to not let such defaults bleed into non-test code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to start relying on this property though, and we don't want to manually add sessionTargetType to a bunch of tests every time a computed artifact or whatever starts using it.

Ideally #15005 will eventually get us to a place where test devtoolslogs are all updated. What about going wild and doing targetType: 'sessionTargetType' in networkRecord ? networkRecord.sessionTargetType : 'page'? That would default to including it, except for when constructed from real network records.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going hog wild works, thanks :)

core/gather/driver/target-manager.js Outdated Show resolved Hide resolved
@@ -10,6 +10,8 @@ type CrdpEvents = CrdpMappings.Events;
type CrdpCommands = CrdpMappings.Commands;

declare module Protocol {
type TargetType = 'page' | 'iframe' | 'worker';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we switch iframe to oopif to make it explicit?

Copy link
Collaborator Author

@connorjclark connorjclark Apr 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea to make this target type not just be the Crdp target type. Better not to conflict with existing terminology. If you're insistent on this, we should rename this from target type to something else.

Copy link
Member

@brendankenny brendankenny Apr 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's a good point since we're saving these in the devtoolslog messages, it makes sense to stick to what's coming in over the protocol (even if we are combining a couple of things now).

I guess I'm really thinking about what we expose from NetworkRequest. Both page and iframe are misleading for what audits may be doing (the first is really page/in-process iframes, the second just out-of-process iframes). The distinction is important for audits that only care about main-frame activity (e.g. LH LCP can't be in an iframe, even if the iframe shares a process with the main frame), and being able to easily make the distinction could be very helpful to LR, where I believe all iframes are in-process, regardless of origin.

I think we may want something like 'page'|'ipif'|'oopif'|'worker' for networkRequests? We can make that actual change in a follow up, but worth discussing the design now: would that seem workable to you even though it overlaps but doesn't quite match what would come through on the devtoolslog targetType?

edit: medium kidding about 'ipif', but we could also do just 'page'|'iframe'|'worker' and keep the isOutOfProcessIframe added here for when something cares about the difference in iframe types for whatever reason.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about making the record's sessionTargetType a private member, and expose helpers isFromOutOfProcessIframe, isFromIframe, etc.. as audit uses cases demand?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that could work, I was sticking to the enum because we had discussed it in #14157. It does have the nice property that these are exclusive states, so you know you don't have to check more than one boolean helper

Copy link
Member

@brendankenny brendankenny Apr 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that could work to land this PR though. Make sessionTargetType private (even doing that may be overkill), keep isFromOutOfProcessIframe as is, and we can discuss building on it next week?

*/
setSession(sessionId) {
this.sessionId = sessionId;
}

get oopif() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isOopif to match e.g. isLinkPreload?

Any reason to make it a getter and not a function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly I didn't want to convert some of tests that use raw network objects in this PR.

Copy link
Collaborator Author

@connorjclark connorjclark Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isLinkPreload

Oof, I assumed this was a method, so my previous comment didn't make much sense.

I think having two properties here that need to sync up will make it possible to mess up in a test or elsewhere. Renaming to isOutOfProcessIframe seems good though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants