Odd format for fetch callbacks #536

jyasskin · 2017-05-02T22:26:23Z

https://fetch.spec.whatwg.org/#fetch-method has:

Run the following in parallel:

Fetch request.

To process response for response, run these substeps: ...

To process response done for response, run these substeps: ...

"process response" and "process response done" are defined as "Tasks that are queued by this standard are annotated as one of: ...", but they actually appear to be customization points within the Fetch algorithm for which the caller can provide an implementation. Assuming that's right, then:

It doesn't make sense to say to run the fetch and its callbacks in parallel. Instead, we should run the fetch in parallel with the callbacks passed in.
The callbacks need their defaults defined for when callers don't pass them in. I think the defaults are to be no-ops?
The callbacks should be defined as customization points, not just annotations.

annevk · 2017-05-03T07:01:01Z

but they actually appear to be customization points within the Fetch algorithm for which the caller can provide an implementation

I don't think that's the case. What makes you think that?

jyasskin · 2017-05-03T14:40:36Z

In https://fetch.spec.whatwg.org/#fetch-method, there's no introduction for response other than the phrase "To process response for", so it's probably a parameter, making that block a function definition.
The uses of "process response", "process response end-of-body", and "process response done" in https://fetch.spec.whatwg.org/#main-fetch say to queue a task to do them, but they don't say what steps to do in them, leaving that up to someone else.
The definitions at https://fetch.spec.whatwg.org/#process-response don't provide steps either, making me think the intended steps are actually the ones provided by #fetch-method.
https://w3c.github.io/ServiceWorker/#cache-addAll and several algorithms in HTML provide other implementations of "process response", and if that's correct, then these implementations must be parameters to Fetch.

annevk · 2017-05-03T14:44:00Z

I guess you're asking for a more explicit link between the call to fetch and the tasks that end up being queued as a result. It seems somewhat reasonable to link them directly in the way you suggest.

They're not customization points however as these steps happen on different threads/processes and are not accessible from each other.

jyasskin · 2017-05-03T14:46:09Z

Ah, you have a more precise definition of "customization point" than I do. I just meant that they appear to be parameters that can be selected by the caller to customize how Fetch works, like other kinds of arguments.

jakearchibald · 2017-09-18T10:44:11Z

I have a somewhat similar problem. In background fetch I make fetches in parallel, but I want to read the response body as it arrives from the network.

The synchronous flag is closest to what I want, but it waits for the response body before returning.

"process response" is second closest, but it wants to queue a task on an event loop. I just want to process the response in prose. I may not have an event loop at this point.

jakearchibald · 2017-09-19T08:43:21Z

I think the callbacks such as "process response" should be algorithmic callbacks, like we did with signal abort.

Add the following process response steps to the ongoing fetch:
1. Then specs could optionally…
2. Queue a fetch task on request to run the following steps:
  1. …

The "ongoing fetch" part is a bit meh. This came up with aborting too. Ideally it should be easier to get a reference to the fetch record, and operations can be called against that.

annevk · 2017-09-19T09:23:35Z

Maybe we should make fetch return a fetch record? And then have "synchronous fetch" wrap that or some such to just return a response?

jakearchibald · 2017-09-19T09:27:54Z

Yeah, I think that would make it much easier to talk about actions relating to a particular fetch.

jakearchibald · 2018-05-04T23:10:00Z

In retrospect, I wish fetching sync didn't wait for the body, and spec authors had to explicitly wait if needed. Is it too late to make a change like this?

annevk · 2018-05-07T06:53:08Z

@jakearchibald you mean we would simply not offer a synchronous mode in Fetch? I think refactoring synchronous callers is reasonable still. I actually think refactoring all callers is still reasonable, as there's not that many and not all have even integrated yet to begin with.

jakearchibald · 2018-05-07T15:44:56Z

you mean we would simply not offer a synchronous mode in Fetch?

I think sync fetching is useful in in-parallel steps, but I'm running into cases where I want to do something like (excuse the hand-waving):

Let response be the result of fetching request.
If response is opaque, abort the fetch and abort these steps.
Stream response into something.

I can't do the above, as step one involves waiting for the whole body, so the aborting happens too late, and the streaming is kinda pointless.

If a sync fetch returned before waiting for the body, it'd work fine. In the cases where APIs needed the whole body (like sync XHR), they could do:

Let response be the result of fetching request.
Wait for response's body.
…

annevk · 2018-05-07T18:07:07Z

Why can't you use the process response callback? In any event, I think if we improve the basics first, as suggested in OP, we can then build better higher-level functionality on top.

jakearchibald · 2018-05-07T18:12:51Z

Why can't you use the process response callback?

It queues a task, and I have no reason to get back onto the main thread. In the background fetch case I may not have an event loop to queue into.

annevk · 2018-05-08T07:18:44Z

I see. I wonder if we can untangle the task queueing somehow so that there's a lower-level hook you can use if you're already in parallel.

jakearchibald · 2018-05-08T14:24:03Z

Yeah, I guess you could remove the task queuing from fetch and add it to all APIs that need to be back on the main thread.

gterzian · 2020-06-06T03:39:36Z

Have you though of defining a parallel queue, and then instead of queuing tasks, queue steps on the parallel queue, and define that those steps would call into callbacks(still "in-parallel"), such as process_response.

So for example the current definition of process_response as used in fetch, could be reworded as:

If locallyAborted is true, terminate these substeps. (note: this happens fully "in-parallel", with locallyAborted being a local variable of the parallel steps).
Queue a task using the networking task-source to:
2.1 /// Add current steps 2 -> 6 here, running inside the queued task.

Whereas in the background-fetch use case, the callbacks simply operate on the "local" data of the in-parallel algorithm that started the fetch, so there is no need to queue a task(so the spec could simply remove the note to this issue, and it would work "as-is" really).

It would be somewhat a weird use of a parallel-queue, although it could make sense if one thinks about it as having parallel steps "register callbacks" with this parallel queue, and then fetch queuing the steps on it that would invoke those callbacks.

Since data like the "request" are passed around those callbacks, and I think multiple specs could register callbacks, we'd have to be somewhat careful not to introduce race condition(for example the "transmitted bytes" member of body is written to by fetch, then read by background fetch in the callback, which is fine and could get interesting if multiple callbacks were to be registered by different specs, and some would for some reason write to the variable as well).

annevk · 2020-06-08T07:46:54Z

It's still not entirely obvious to me what abstraction would work well here. Some thoughts:

It's useful that fetch can be invoked from the main thread as a bunch of state (e.g., a mutable CSP policy) is passed in that doesn't really work from "in parallel" land (because it can be mutated). Perhaps though that is something that needs sorting out so that such state can be copied at the right time and passed across the in parallel boundary with ease.
If we want to support full duplex the model where you go in parallel, fetch, and then wait for the response, the body, and the trailers, doesn't work, as you wouldn't be able to send the request body and request trailers at the same time. (No browser supports full duplex though and even with streams nobody seems interested in supporting it. That is, the full request body would have to be transmitted before you start seeing response bytes.)

@gterzian the reason to use a parallel queue I guess is to ensure that you do not process the response body before the response headers? It might work I suppose, but it would have to be flushed out a bit to ensure the right information is passed around to be able to queue tasks from there and such. (Tasks need a bit more information these days.)

And I guess to support synchronous fetching we could use "Wait" (which we should probably define somewhere) in the main thread that looks for some variable shared with in parallel to flip.

annevk · 2020-06-08T08:33:50Z

Actually the DOM fetch method calls the second one from "in-parallel" steps(see step 9).

To be clear, I think that's problematic, as it means the CSP policy gets enforced from in parallel and it isn't clear at what point it is copied to make that deterministic. (In part this is a separate issue, but I find it useful to think about as it influences what kind of algorithms we should be offering to standards.)

I guess since the callbacks take a set of steps you could indeed give them any kind of variables they might need to take hold of. It does seem like that approach might work quite well.

I'd love it if one or more of @yutakahirano, @jyasskin, @jakearchibald, @domenic, and maybe @jungkees could take a look as well.

(I suspect that for most standards we want to provide a wrapper algorithm that preserves the status quo more or less, whereby you invoke "fetch with tasks" or some such and it continues queuing tasks for the callback steps.)

gterzian · 2020-06-08T08:34:25Z

It's useful that fetch can be invoked from the main thread as a bunch of state (e.g., a mutable CSP policy) is passed in that doesn't really work from "in parallel" land (because it can be mutated). Perhaps though that is something that needs sorting out so that such state can be copied at the right time and passed across the in parallel boundary with ease.

Yes I think the background-fetch provides a nice example of that, since it defines a DOM available method, which when called creates a bunch of "non-DOM" data, like a https://wicg.github.io/background-fetch/#background-fetch-record, and then this data holds enough information to later queue tasks back on the event-loop to communicate various progress.

Also, the DOM method is called from the event-loop, and later the actual fetch is started from parallel steps started by the DOM method(similar to how DOM global fetch calls fetch from parallel steps, see step 9)

@gterzian the reason to use a parallel queue I guess is to ensure that you do not process the response body before the response headers? It might work I suppose, but it would have to be flushed out a bit to ensure the right information is passed around to be able to queue tasks from there and such. (Tasks need a bit more information these days.)

Yes, the ordering of the callbacks is something I hadn't even thought of, and I was mostly thinking about the fact that those callbacks are asynchronous also from the perspective of fetch(currently as tasks are queued, and this would be preserved if steps were enqueued on a parallel queue).

And with regards to "the right information is passed around to be able to queue tasks from there", I think the bgfetch spec again provides a nice example, for example see https://wicg.github.io/background-fetch/#update-background-fetch-instances, which is called into from the process_response callback defined by that spec, and which operate on local "in-parallel data", and then queues tasks back on the relevant event-loop to "continue" the algorithm there.

So I essentially think that by defining the "fetch callbacks" as tasks, they are tied to running on an event-loop, whereas the background-fetch spec is a good example where those are rather callbacks running on other in-parallel steps(which then subsequently queue tasks, based on their own locally owned data).

gterzian · 2020-06-08T08:36:01Z

To be clear, I think that's problematic, as it means the CSP policy gets enforced from in parallel and it isn't clear at what point it is copied to make that deterministic. (In part this is a separate issue, but I find it useful to think about as it influences what kind of algorithms we should be offering to standards.)

Ok, thanks for pointing that out (and sorry I deleted my comment, because I wanted to actually edit it and re-post, but your response beat me to it).

gterzian · 2020-06-14T06:32:39Z

I think this could provide a concrete example of a problem with the current approach around "process response" and similar tasks(if the issue is correct, that is): w3c/ServiceWorker#1517

annevk · 2021-02-08T14:00:33Z

I keep coming back to this as it's such a fundamental issue:

fetch() should not go in parallel itself, that's just wrong: fetch() does not need in parallel as fetch already does so #1163.
In principle most of fetch is designed to be invoked from the main thread as global state is available there. For those wanting to invoke fetch while in parallel, more work needs to be done to adequately capture that global state at the right point in time. I do think we should make that work, to be clear, and policy containers will hopefully help with this.
I do think it makes sense for the fetch algorithm to explicitly take these "callbacks" as parameters. Currently we have five callbacks listed at https://fetch.spec.whatwg.org/#process-request-body. Apart from making these explicit parameters I don't think there is much that needs to change here for main thread usage of fetch.
1. We need to have an easy way to get at the bytes of a response body. (We could just expose a specification-only byte sequence on responses. We could allow specifications to treat ReadableStream of responses for which end-of-file has run as equivalent to a byte sequence (perhaps with some magic conversion algorithm that does the relevant asserts). Thoughts appreciated.)
I don't have a good solution for "in parallel" callers, especially those that want to process a response incrementally. Perhaps a return value is still a good fit here and the API would be that you "wait" for things to become non-null.
We might also want a return value for main thread callers that is equivalent to the above except that it would be filled in from tasks.
The return value could also expose terminate, soundly coupling it to the fetch.

@jakearchibald @domenic @jyasskin @ricea @gterzian thoughts?

jakearchibald · 2021-02-08T14:12:38Z

Where will those callbacks run? In parallel or on the main thread?

annevk · 2021-02-08T14:14:39Z

The callbacks mentioned in 3 would run on the main thread in tasks queued on the networking task source. That's basically what happens today, except that this provides a clearer coupling with the fetch algorithm (which also means it would not have to queue tasks if they are not provided and such).

jakearchibald · 2021-02-08T14:25:15Z

Hm, it's the task queuing that's caused issues for me. Eg step 11 of https://wicg.github.io/background-fetch/#complete-a-record

annevk · 2021-02-08T14:36:02Z

Right, for in parallel callers (which I think Background Fetch is, otherwise main thread should be fine) 4 applies. Though looking at what you wrote down there you might have to go in parallel again for the fetch itself so the overall abort can still come through. In parallel doesn't support anything like callbacks as in JavaScript so you have to build that up yourself.

jakearchibald · 2021-02-08T14:49:39Z

Fair enough, but…

In parallel doesn't support anything like callbacks

I don't see why it couldn't. It kinda feels like parallel queues already do.

annevk · 2021-02-08T15:01:10Z

I'm not sure I understand what the suggestion is. Would the caller supply a parallel queue? Would fetch return one somehow when asked? Wouldn't that still have to run in parallel from the thread dealing with the "abort all flag" to ensure that's not blocked? Is that fundamentally different from waiting for some bits of a response to become non-null or is it just better ergonomics?

jakearchibald · 2021-02-08T15:04:38Z

I mean just allow prose like step 7 of https://wicg.github.io/background-fetch/#complete-a-record, but in a way that doesn't bounce back to the main thread. I think it's fine to go in parallel from parallel steps.

annevk · 2021-02-08T15:16:53Z

That's certainly feasible, but it means that fetch itself doesn't block. That seems fine though and in fact it might be a good idea to remove the synchronous flag and have callers "wait" instead when they need something like that (and synchronous XMLHttpRequest could request the parallel queue style callbacks and wait on them on the main thread). I guess I'll first work on these "callback" parameters and figure out the possible return value with terminate operation later on.

domenic · 2021-02-08T18:36:18Z

Everything in #536 (comment) makes general sense to me. In particular, I'd love to be able to write spec text that's like the following:

Fetch request, using the following steps to [process the response] response:

... stuff with response ...

I think it'd be somewhat tempting to be able to write text like the following too:

Fetch request. When the fetch completes with response, continue onto the following steps:

... stuff with response, but not indented ...

but on balance I think this is inadvisable, as then you can easily get into trouble if you accidentally use "return" or "throw" from those non-idented steps.

I'm thinking how this (indented) pattern would look if we carried it over to https://html.spec.whatwg.org/#fetching-scripts, and I think it would be reasonable. We'd define an explicit callback parameter (e.g. "process the module script") for those, and thread everything through with some extra indenting, instead of using the current "Asynchronously complete this algorithm with X". The most complex it would get would be in https://html.spec.whatwg.org/#fetch-the-descendants-of-a-module-script step 8, which could be formalized nicely with infrastructure like this, at the cost of some complexity. (It could also allow us to explicitly terminate ongoing fetches if one of the fetches fail.)

On an editorial side, for some reason callbacks namedLikeThis seem ickier, so if we made them real optional paramters I'm not sure I'd want to update their names, but that doesn't matter much.

We need to have an easy way to get at the bytes of a response body. (We could just expose a specification-only byte sequence on responses. We could allow specifications to treat ReadableStream of responses for which end-of-file has run as equivalent to a byte sequence (perhaps with some magic conversion algorithm that does the relevant asserts). Thoughts appreciated.)

Exposing a byte sequence seems critical. I think getting the byte sequence should consume the body though, so this would be a byte sequence variant of the spec's current "consume body" algorithm. It could still operate on streams, with relevant asserts. I guess it would need to be called from the main thread, so it'd be called from "process response" hooks?

We might also want a return value for main thread callers that is equivalent to the above except that it would be filled in from tasks.

What would this do besides expose terminate? How would one use it?

annevk · 2021-02-09T16:31:00Z

I haven't made progress on the body question yet.

I don't have real uses for the return value at this point other than exposing terminate, which I think is a plus as it's a shared handle between the fetch algorithm and the caller, which is a much clearer interface than what we have now. (I was thinking we could put other things there too, such as the response or response body as byte sequence, but having worked on this refactor I think I'd like to wait a little bit before adding multiple ways to get at things.)

This makes some rather big changes: * Requests no longer have a synchronous flag. Blocking a thread is now up to the caller. * Fetch therefore no longer returns a response directly. In parallel callers will need to pass in "callbacks" that are either invoked on a parallel queue or on an event loop. * To hold onto these callbacks as well as some other information a fetch params struct is now passed around the various fetch algorithms. This will allow for cleanup around termination and aborting in the future. Potentially some bookkeeping state from request can move there going forward. * There's a dedicated navigate-redirect fetch algorithm for HTML as the HTTP-redirect fetch algorithm now wants a fetch params instance. * Some allowance for aborting early on in fetch was removed as all that is run synchronously with the code that invoked fetch to begin with. Closes #1164. * Algorithms that needed to be adjusted were changed to use the new Infra patterns for parameters. I also tried to improve naming, e.g., makeCORSPreflight rather than CORS-preflight flag. * "process response done" was removed as it seemed redundant with "response response end-of-body"; possible leftover from trailers. * Removed duplicate task queueing at the end of main fetch for request's body. Fixes #536.

This makes some rather big changes: * Requests no longer have a synchronous flag. Blocking a thread is now up to the caller. * Fetch therefore no longer returns a response directly. In parallel callers will need to pass in "callbacks" that are either invoked on a parallel queue or on an event loop. * To hold onto these callbacks as well as some other information a fetch params struct is now passed around the various fetch algorithms. This will allow for cleanup around termination and aborting in the future. Potentially some bookkeeping state from request can move there going forward. * There's a dedicated navigate-redirect fetch algorithm for HTML as the HTTP-redirect fetch algorithm now wants a fetch params instance. This also returns a response which fixes a problem where HTTP-redirect fetch would return a network error that was then ignored. * Some allowance for aborting early on in fetch was removed as all that is run synchronously with the code that invoked fetch to begin with. Closes #1164. * Algorithms that needed to be adjusted were changed to use the new Infra patterns for parameters. I also tried to improve naming, e.g., makeCORSPreflight rather than CORS-preflight flag. * "process response done" was removed as it seemed redundant with "response response end-of-body"; possible leftover from trailers. * Removed duplicate task queuing at the end of main fetch for request's body. Fixes #536.

annevk mentioned this issue May 18, 2017

Aborting fetch #523

Merged

annevk mentioned this issue Jul 18, 2017

Clean up fetch integration for errors whatwg/html#1363

Open

This was referenced Sep 28, 2017

Handing fetch termination w3c/ServiceWorker#1178

Merged

Clarify aborting logic WICG/background-fetch#60

Closed

annevk added the clarification Standard could be clearer label Apr 11, 2018

jakearchibald mentioned this issue May 29, 2018

Clients.get: block on reserved clients. w3c/ServiceWorker#1315

Closed

gterzian mentioned this issue Jun 7, 2020

Improve spec compliance of transmit request body servo/servo#26776

Open

gterzian mentioned this issue Jun 14, 2020

Handle Fetch "process response" callback appears to run on wrong event-loop w3c/ServiceWorker#1517

Open

annevk mentioned this issue Jan 21, 2021

Explainer needed for creating requests #1143

Open

annevk mentioned this issue Feb 8, 2021

Getting all bytes in a body #661

Closed

jakearchibald mentioned this issue Feb 9, 2021

"Return a network error" doesn't seem to go through "process response" #1164

Closed

annevk mentioned this issue Feb 9, 2021

Overhaul how standards integrate with fetch #1165

Merged

annevk closed this as completed in #1165 Feb 12, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Odd format for fetch callbacks #536

Odd format for fetch callbacks #536

jyasskin commented May 2, 2017

annevk commented May 3, 2017

jyasskin commented May 3, 2017

annevk commented May 3, 2017

jyasskin commented May 3, 2017

jakearchibald commented Sep 18, 2017

jakearchibald commented Sep 19, 2017

annevk commented Sep 19, 2017

jakearchibald commented Sep 19, 2017

jakearchibald commented May 4, 2018 •

edited

Loading

annevk commented May 7, 2018

jakearchibald commented May 7, 2018 •

edited

Loading

annevk commented May 7, 2018

jakearchibald commented May 7, 2018

annevk commented May 8, 2018

jakearchibald commented May 8, 2018

gterzian commented Jun 6, 2020 •

edited

Loading

annevk commented Jun 8, 2020

annevk commented Jun 8, 2020

gterzian commented Jun 8, 2020 •

edited

Loading

gterzian commented Jun 8, 2020

gterzian commented Jun 14, 2020

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021 •

edited

Loading

annevk commented Feb 8, 2021

domenic commented Feb 8, 2021

annevk commented Feb 9, 2021

Odd format for fetch callbacks #536

Odd format for fetch callbacks #536

Comments

jyasskin commented May 2, 2017

annevk commented May 3, 2017

jyasskin commented May 3, 2017

annevk commented May 3, 2017

jyasskin commented May 3, 2017

jakearchibald commented Sep 18, 2017

jakearchibald commented Sep 19, 2017

annevk commented Sep 19, 2017

jakearchibald commented Sep 19, 2017

jakearchibald commented May 4, 2018 • edited Loading

annevk commented May 7, 2018

jakearchibald commented May 7, 2018 • edited Loading

annevk commented May 7, 2018

jakearchibald commented May 7, 2018

annevk commented May 8, 2018

jakearchibald commented May 8, 2018

gterzian commented Jun 6, 2020 • edited Loading

annevk commented Jun 8, 2020

annevk commented Jun 8, 2020

gterzian commented Jun 8, 2020 • edited Loading

gterzian commented Jun 8, 2020

gterzian commented Jun 14, 2020

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021

annevk commented Feb 8, 2021

jakearchibald commented Feb 8, 2021 • edited Loading

annevk commented Feb 8, 2021

domenic commented Feb 8, 2021

annevk commented Feb 9, 2021

jakearchibald commented May 4, 2018 •

edited

Loading

jakearchibald commented May 7, 2018 •

edited

Loading

gterzian commented Jun 6, 2020 •

edited

Loading

gterzian commented Jun 8, 2020 •

edited

Loading

jakearchibald commented Feb 8, 2021 •

edited

Loading