Proposal: API for exposing resource requests #137

Open
npm1 opened this issue Jan 10, 2018 · 19 comments

@npm1
Contributor

npm1 commented Jan 10, 2018

We would like to expose in-flight resource requests. The following document proposes doing this with an analogue of PerformanceResourceTiming for resource requests that have not yet completed. We would appreciate feedback on the proposal!

https://docs.google.com/document/d/1NS80J5H796-mbEXHr6uWX3ZJMv-sM-jy9BeZM_yPJ94

@nicjansma
Contributor

This would be incredibly helpful for things we're doing in Boomerang such as measuring SPA navigations and for calculating a RUM Time-To-Interactive.

@igrigorik igrigorik added this to the Level 3 milestone Jan 18, 2018
@npm1
Contributor Author

npm1 commented Jan 24, 2018

There's no consensus yet regarding the best way to achieve all of the following points:

  • Consistency with other PerformanceEntry objects. These are not modified nor removed from the Performance Timeline.

  • Avoid confusing developers regarding which of the entries correspond to in-flight resource requests and which correspond to older entries from requests that have been completed.

  • Encourage the use of PerformanceObserver instead of polling, which may involve clearing entries and can cause race conditions if there are multiple pollers.

I've updated the document with the ideas we have so far.
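
To illustrate the polling concern in the last bullet, here is a minimal sketch under stated assumptions. `SharedBuffer`, `FanOutBuffer`, and `InFlightEntry` are hypothetical names invented for this illustration; they model the behavior and are not part of any browser API or of the proposal.

```typescript
// Hypothetical model, not a real browser API: a shared, clearable buffer of
// in-flight entries, loosely analogous to polling a timing buffer and clearing it.
interface InFlightEntry {
  name: string;
  startTime: number;
}

class SharedBuffer {
  private entries: InFlightEntry[] = [];

  add(entry: InFlightEntry): void {
    this.entries.push(entry);
  }

  // Poll: read everything, then clear so the next poll does not re-read it.
  pollAndClear(): InFlightEntry[] {
    const snapshot = this.entries;
    this.entries = [];
    return snapshot;
  }
}

// Observer-style alternative: every subscriber owns its own queue.
class FanOutBuffer {
  private queues: InFlightEntry[][] = [];

  subscribe(): InFlightEntry[] {
    const queue: InFlightEntry[] = [];
    this.queues.push(queue);
    return queue;
  }

  add(entry: InFlightEntry): void {
    for (const queue of this.queues) {
      queue.push(entry);
    }
  }
}

const shared = new SharedBuffer();
shared.add({ name: "/api/a", startTime: 10 });
const pollerA = shared.pollAndClear(); // poller A sees /api/a and clears it
shared.add({ name: "/api/b", startTime: 20 });
const pollerB = shared.pollAndClear(); // poller B never sees /api/a

const fanOut = new FanOutBuffer();
const observerA = fanOut.subscribe();
const observerB = fanOut.subscribe();
fanOut.add({ name: "/api/a", startTime: 10 });
fanOut.add({ name: "/api/b", startTime: 20 });
// observerA and observerB each hold both entries.
```

With two independent pollers, each one ends up with a partial view because the other cleared the shared state; with per-observer queues that interference does not arise.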

@npm1
Contributor Author

npm1 commented Jan 10, 2019

Update: latest proposal was https://docs.google.com/presentation/d/1x6QTUdrXtk0faWT1zOTIPdyoKno3WFwhzB3L-mqNYxY/edit#slide=id.p

There were concerns about it:

  • We might want to do this from Fetch instead.
  • There are use cases where we'd like an observer instead of just polling.

@npm1
Contributor Author

npm1 commented Jan 17, 2019

We had a discussion with Nic and Charles. Here are the use cases they are considering:

  • SPA navigation onload signal: during an SPA nav, developers might want to create an artificial 'onload' signal. They want to know which resources have been started to wait for them, but if none have started after some time then they should abort. This use case requires some nontrivial filtering to avoid counting resources such as 1x1 images, so it might not be possible to satisfy this use case with the initial API. But this API should target eventually being able to handle this use case.

  • Allow computing a network quiet window for a Time-To-Interactive metric calculated on users: this is a simple use case where we'd like to keep count of the number of outstanding requests (maybe with some basic filtering) and update as resources begin or end.

  • Compute an improved onload signal for the webpage. Since onload is such an unreliable signal, developers could force waiting until certain resources requested have finished being fetched before considering the page to have loaded.

@toddreifsteck does this explanation make sense to you? I think this spells out the use cases pretty clearly. The observer pattern is a good fit for all of these use cases, assuming the buffered flag and takeRecords() are supported. It does not seem that these use cases require high priority notifications, so even simple integration with PerformanceObserver would work. Polling is a worse fit for these use cases, especially the TTI computation.

Given that the observer pattern is the best fit, but that it works best with the buffered flag and takeRecords(), I favor adding a new entry type instead of adding a new observer (FetchObserver). Another reason in favor of PerformanceObserver instead of FetchObserver is that these use cases play well with ResourceTiming (in-flight notifies you when a resource request begins, RT notifies you when it has finished). Thoughts @yoavweiss @igrigorik @tdresser Todd?
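
As a rough sketch of how "begin" and "finish" notifications could be paired, the tracker below keeps a map of outstanding requests. Everything here is an assumption for illustration: the `StartedRequest` and `FinishedRequest` shapes, the `InFlightTracker` class, and the choice to match entries by name plus start time are hypothetical, and the proposed in-flight entry type has no agreed name or fields yet.

```typescript
// Hypothetical shapes: the proposal does not define these fields.
interface StartedRequest {
  name: string;
  startTime: number;
}

interface FinishedRequest {
  name: string;
  startTime: number;
  responseEnd: number;
}

class InFlightTracker {
  private pending = new Map<string, StartedRequest>();

  // Assumption: name plus startTime identifies a request uniquely enough
  // to pair its start notification with its finish notification.
  private keyFor(name: string, startTime: number): string {
    return name + "@" + startTime;
  }

  // Would be fed by the proposed in-flight entries (not yet specified).
  onStarted(req: StartedRequest): void {
    this.pending.set(this.keyFor(req.name, req.startTime), req);
  }

  // Would be fed by ResourceTiming entries for completed requests.
  onFinished(req: FinishedRequest): void {
    this.pending.delete(this.keyFor(req.name, req.startTime));
  }

  outstanding(): number {
    return this.pending.size;
  }
}

const tracker = new InFlightTracker();
tracker.onStarted({ name: "/app.js", startTime: 5 });
tracker.onStarted({ name: "/data.json", startTime: 12 });
tracker.onFinished({ name: "/app.js", startTime: 5, responseEnd: 80 });
const stillOutstanding = tracker.outstanding(); // /data.json is still pending
```

In a page, `onFinished` could be driven from an existing PerformanceObserver on completed resources, while `onStarted` depends on whatever the new entry type ends up being.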

@yoavweiss
Contributor

Using PerformanceObserver for this makes sense to me.

@igrigorik
Member

Thanks for unpacking the use cases. Overall, I think this direction makes sense, but a few questions as I'm thinking through it...

(A) Likely outcome is that an analytics script would register two observers, one for completed resources (as they do today) and another for in-flight resources?

  • For SPA nav's, when / how would the script use the observer? E.g. would it rely on an already registered observer and then reconcile recently initiated requests until some end condition, or, would we expect a new observer to be registered? The edge case here is around when/how the analytics script would know that an SPA navigation is happening — nothing stops the app from initiating some requests prior to triggering a history transition, right?

(B) The buffered flag behavior for inflight will be the same as resourcetiming, correct?

  • What would the implementation for TTI look like? Register before or during the onload event with buffered set to true, reconcile the retrieved list against the completed resource list, and then wait for RT notifications for the remaining set?

@npm1
Contributor Author

npm1 commented Jan 28, 2019

A. They could use separate observers or they could observe both types in one observer. I don't think we're solving the problem of figuring out when an SPA nav happens. Rather, we're trying to help calculate an onload signal, assuming we know when the nav happened. @nicjansma can correct me on this.

B. Buffered flag behavior must be the same so that you can keep track of previous resources when needed. You could keep a counter and process entries in timestamp order, since you need to find every point at which the network becomes quiet, not just the first one (TTI also depends on long tasks).
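
The counter-based approach in point B can be sketched as follows. This is only an illustration under assumptions: `RequestInterval`, `QuietWindow`, and `findQuietWindows` are hypothetical names, the thresholds are arbitrary, and the sketch assumes the observation starts at time 0 with nothing in flight and that `maxInFlight` is non-negative.

```typescript
// Hypothetical input shape: one interval per request, with end = Infinity
// for requests that are still in flight.
interface RequestInterval {
  start: number;
  end: number;
}

interface QuietWindow {
  start: number;
  end: number;
}

// Returns every window of at least minQuietMs during which no more than
// maxInFlight requests were outstanding, up to observedUntil.
function findQuietWindows(
  requests: RequestInterval[],
  maxInFlight: number,
  minQuietMs: number,
  observedUntil: number
): QuietWindow[] {
  const events: { time: number; delta: number }[] = [];
  for (const r of requests) {
    events.push({ time: r.start, delta: 1 });
    if (r.end <= observedUntil) {
      events.push({ time: r.end, delta: -1 });
    }
  }
  // Timestamp order; at equal times, process completions before starts.
  events.sort((a, b) => a.time - b.time || a.delta - b.delta);

  const windows: QuietWindow[] = [];
  let inFlight = 0;
  let quietSince: number | null = 0; // assumed quiet at time 0
  for (const e of events) {
    inFlight += e.delta;
    const quiet = inFlight <= maxInFlight;
    if (!quiet && quietSince !== null) {
      if (e.time - quietSince >= minQuietMs) {
        windows.push({ start: quietSince, end: e.time });
      }
      quietSince = null;
    } else if (quiet && quietSince === null) {
      quietSince = e.time;
    }
  }
  if (quietSince !== null && observedUntil - quietSince >= minQuietMs) {
    windows.push({ start: quietSince, end: observedUntil });
  }
  return windows;
}

const quietWindows = findQuietWindows(
  [
    { start: 0, end: 100 },
    { start: 50, end: 300 },
    { start: 1000, end: 1200 },
  ],
  0,
  500,
  2000
);
// quietWindows: [{ start: 300, end: 1000 }, { start: 1200, end: 2000 }]
```

A TTI-style computation would then intersect these windows with long-task data, which is outside this sketch.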

This doc shows the SPA nav and the TTI examples.

@igrigorik
Member

> A. They could use separate observers or they could observe both types in one observer. I don't think we're solving the problem of figuring out when an SPA nav happens. Rather, we're trying to help calculate an onload signal, assuming we know when the nav happened. @nicjansma can correct me on this.

Yep, I'm just trying to understand how an analytics script would use this today. For example, post onload, once events are no longer automatically buffered, it seems that an analytics provider would always have to maintain its own reconciled map of in-flight resources by subscribing to both types. That said, would love to hear from @nicjansma on this one.

> This doc shows the SPA nav and the TTI examples.

Ah, thanks that is really helpful. I think that confirms what I described above. All in all, it works, albeit it does require some non-trivial processing and continuous monitoring.

@npm1
Contributor Author

npm1 commented Jan 29, 2019

Yea, for the SPA use case, in order to avoid having permanent observers we'd need to support the buffered flag to work after onload. It could be something like: buffer 'recent' resource requests, where recent is some defined threshold (e.g., 2 seconds or something like that). That way, once the SPA nav begins, the observers can be registered, and the buffering prevents them from losing potentially important information.

The use cases do require a considerable amount of code. One alternative that comes to mind, but seems too tailored to these use cases, would be to have something like performance.observeResources(start_timestamp, filter_function, threshold, callback) which means: start monitoring resource requests that are being processed during start_timestamp (which can be a future now() or can be 0 if the call is made early enough in the page load), or resource requests that begin after that. Use filter_function to ignore irrelevant resources (the input would be some information about resources). When the number of outstanding resource requests goes below threshold, then execute callback.
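
To make the shape of that alternative more concrete, here is a userland sketch of the same four inputs. It is not an implementation of any real `performance.observeResources`; that method is only an idea in this thread. `ResourceWatcher`, `ResourceInfo`, and the `id` field are hypothetical, and the sketch assumes something else feeds it start and finish notifications.

```typescript
// Hypothetical description of a request; `id` is an assumed unique key.
interface ResourceInfo {
  id: string;
  name: string;
  startTime: number;
}

type ResourceFilter = (info: ResourceInfo) => boolean;

class ResourceWatcher {
  private startTimestamp: number;
  private filter: ResourceFilter;
  private threshold: number;
  private callback: (time: number) => void;
  private outstanding = new Set<string>();
  private done = false;

  constructor(
    startTimestamp: number,
    filter: ResourceFilter,
    threshold: number,
    callback: (time: number) => void
  ) {
    this.startTimestamp = startTimestamp;
    this.filter = filter;
    this.threshold = threshold;
    this.callback = callback;
  }

  requestStarted(info: ResourceInfo): void {
    if (this.done || !this.filter(info)) return; // ignore irrelevant resources
    this.outstanding.add(info.id);
  }

  requestFinished(id: string, endTime: number): void {
    if (this.done || !this.outstanding.delete(id)) return;
    // Requests that ended before startTimestamp were never "being processed
    // during start_timestamp", so they must not trigger the callback.
    if (endTime >= this.startTimestamp && this.outstanding.size < this.threshold) {
      this.done = true;
      this.callback(endTime);
    }
  }
}

const fired: number[] = [];
const watcher = new ResourceWatcher(
  100,
  (info) => !info.name.endsWith("pixel.gif"), // skip tracking pixels
  1,
  (time) => fired.push(time)
);
watcher.requestStarted({ id: "1", name: "/early.js", startTime: 10 });
watcher.requestFinished("1", 40); // ended before startTimestamp: no callback
watcher.requestStarted({ id: "2", name: "/a.js", startTime: 90 });
watcher.requestStarted({ id: "3", name: "/pixel.gif", startTime: 110 });
watcher.requestStarted({ id: "4", name: "/b.json", startTime: 120 });
watcher.requestFinished("2", 150); // one request still outstanding
watcher.requestFinished("4", 200); // count drops below threshold: callback(200)
```

The callback fires once here; whether a real API should fire repeatedly, or time out when nothing starts, is left open by the thread.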

@philipwalton
Member

In other observers, the takeRecords() method doesn't necessarily report the state of the world at that exact time, it just synchronously returns the entries that have been queued for reporting but haven't yet been dispatched (e.g. some reporters wait for idle periods to dispatch).

I know this is true for IntersectionObserver in that calling takeRecords() won't force layout in order to get the most up-to-date intersection information. It'll just flush the entry queue.

Will that also be true for takeRecords() with this API, or will takeRecords() always synchronously report the correct number of in-flight resource requests?

@npm1
Contributor Author

npm1 commented Jan 31, 2019

takeRecords() would work as with the other entries: return the entries that have been queued for the observer callback but for which the callback has not yet occurred. I'm not sure what you mean here... what would be the equivalent of forcing layout in this case? In an implementation, I'd expect the browser to queue the entry as soon as it can once the resource request begins, so I'm not sure I follow what the two options here are.

@philipwalton
Member

> I'm not sure what you mean here... what would be the equivalent of forcing layout in this case?

I'm not sure if there is a relevant equivalent in this case, which is why I was asking. My understanding of layout implementations is that layout data is frequently dirty and only needs to be updated in certain situations -- and takeRecords() was decided to not be one of those situations.

If observer callbacks are queued as soon as fetches happen, then it's not an issue here.

@npm1
Contributor Author

npm1 commented Jan 31, 2019

Yep. Entries should be queued to the PerformanceObserver buffer as soon as they are created (at the same time as they are added to the performance timeline, whenever that happens). What can be delayed is the observer's callback, which receives the entries from this buffer as input and clears the buffer.
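
The queueing behavior described above can be modeled in a few lines. `ObserverQueueModel` is a hypothetical toy class written for this illustration; it mimics the described semantics and is not how any browser implements PerformanceObserver.

```typescript
// Toy model of an observer's entry buffer (hypothetical, for illustration).
class ObserverQueueModel<T> {
  private queue: T[] = [];
  private callback: (entries: T[]) => void;

  constructor(callback: (entries: T[]) => void) {
    this.callback = callback;
  }

  // "Browser" side: an entry is queued as soon as it is created.
  enqueue(entry: T): void {
    this.queue.push(entry);
  }

  // Runs later, at a time the browser chooses: deliver the batch and clear it.
  deliver(): void {
    if (this.queue.length === 0) return;
    const batch = this.queue;
    this.queue = [];
    this.callback(batch);
  }

  // Synchronously returns what is queued but not yet delivered, and clears it.
  takeRecords(): T[] {
    const batch = this.queue;
    this.queue = [];
    return batch;
  }
}

const delivered: string[][] = [];
const model = new ObserverQueueModel<string>((entries) => delivered.push(entries));
model.enqueue("/a.js");
model.enqueue("/b.css");
const taken = model.takeRecords(); // both entries, before any callback ran
model.deliver(); // nothing left to deliver, so the callback is skipped
model.enqueue("/c.png");
model.deliver(); // callback receives only /c.png
```

Under this model, takeRecords() is accurate for in-flight requests exactly to the extent that entries are queued promptly when requests begin.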

@igrigorik
Member

> Yea, for the SPA use case, in order to avoid having permanent observers we'd need to support the buffered flag to work after onload. It could be something like: buffer 'recent' resource requests, where recent is some defined threshold (e.g., 2 seconds or something like that). That way, once the SPA nav begins, the observers can be registered, and the buffering prevents them from losing potentially important information.

Yikes, if this use case also requires adding additional "always queue last N records" behavior to PerformanceObserver, that's a non-trivial addition with interesting implications. Can one get away without it? E.g., if the analytics script registers at onload to listen to all new and completed resources, then it can maintain the necessary state on its own; it just means there are always at least two observers running. Is that an unreasonable pattern to expect? /cc @nicjansma

@npm1
Contributor Author

npm1 commented Feb 1, 2019

Correct, there is no need to receive previous records if the observers are never disconnected. I think this is fine as long as the analytics script is careful to keep only a reasonable number of records instead of storing their full history, which could consume a considerable amount of memory.

@nicjansma
Contributor

Yeah when thinking through how we'd utilize these observers, all roads lead to the simplest operation of just always having two observers running (or a single one handling both events).

If we were to start/stop observers on the fly, some of the tricky scenarios are:

  • With SPAs, we frequently see requests starting just before the SPA's route change event or the history state changing (which is when we activate). SPA frameworks aren't consistent around when they change the URL. Some even fetch the entire page's HTML template first.
  • When we're monitoring other in-page interactions (e.g. clicks), our bubbled click handler may be called after the element's click handler that's triggering some networking activity.

For both of those, it would be nice to know of all outstanding requests that start around and just before the interaction we'd like to monitor (+/- 50ms). Today, we handle this by always listening to all XHRs (proxying the object), keeping track of what's going on and "promoting" an XHR/click/etc to a full SPA navigation if one happens right afterwards.

So that being said, using a PerfObserver for this, it's probably easiest for us to just have a lightweight one running constantly. We'd just be keeping track of the last N seconds or last N requests.
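
A lightweight always-on tracker of that kind might look roughly like the sketch below. This is not Boomerang's code; `RecentRequests`, `SeenRequest`, and the specific limits are hypothetical, and the sketch assumes requests are recorded in start-time order.

```typescript
// Hypothetical record of a request that an always-on observer has seen.
interface SeenRequest {
  name: string;
  startTime: number;
}

class RecentRequests {
  private items: SeenRequest[] = [];
  private maxAgeMs: number;
  private maxCount: number;

  constructor(maxAgeMs: number, maxCount: number) {
    this.maxAgeMs = maxAgeMs;
    this.maxCount = maxCount;
  }

  // Assumes calls arrive in start-time order.
  record(req: SeenRequest): void {
    this.items.push(req);
    // Keep only the last N seconds...
    const cutoff = req.startTime - this.maxAgeMs;
    this.items = this.items.filter((r) => r.startTime >= cutoff);
    // ...and at most the last N requests, so memory stays bounded.
    if (this.items.length > this.maxCount) {
      this.items = this.items.slice(this.items.length - this.maxCount);
    }
  }

  // Requests that started within +/- windowMs of an interaction, i.e. the
  // candidates to "promote" into the interaction being monitored.
  near(interactionTime: number, windowMs: number): SeenRequest[] {
    return this.items.filter(
      (r) => Math.abs(r.startTime - interactionTime) <= windowMs
    );
  }

  size(): number {
    return this.items.length;
  }
}

const recent = new RecentRequests(5000, 3);
recent.record({ name: "/old.js", startTime: 0 });
recent.record({ name: "/mid.js", startTime: 1000 });
recent.record({ name: "/template.html", startTime: 4980 });
recent.record({ name: "/api/route", startTime: 5030 });
// A route change observed at t=5000 picks up requests within 50 ms either side.
const promoted = recent.near(5000, 50).map((r) => r.name);
```

Bounding by both age and count keeps the tracker cheap even on pages that issue many requests.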

@igrigorik
Member

@nicjansma thanks, that makes sense. This also suggests that we shouldn't need any additional acrobatics for additional buffered flags or modes, which addresses my earlier concerns.

@rniwa @toddreifsteck would love to hear your feedback and thoughts on this direction as well.

@rniwa

rniwa commented Feb 13, 2019

I'm still soliciting the feedback internally at Apple.

@yoavweiss
Contributor

Going back to this after many (many) years, there's some significant overlap with whatwg/fetch#607.

It'd make sense to resolve this as part of Fetch, but we need to make sure that non-fetch() use cases are covered.
/cc @jakearchibald @annevk
