diff --git a/proposals/0000-surface-stream-errors-to-http.md b/proposals/0000-surface-stream-errors-to-http.md new file mode 100644 index 0000000..f6f333d --- /dev/null +++ b/proposals/0000-surface-stream-errors-to-http.md @@ -0,0 +1,227 @@ +# Surfacing Stream Errors to HTTP + +## Intro + +> A _short_ explanation of the proposal. + +## Motivation + +Marko already has a way to tell _humans_ that the promise for an `` rejected: the `<@catch>` block. Unfortunately for machines, the original HTTP status code is how they know the error state of a response — and the HTTP headers are already sent by the time an `` errors. + +Say we have Marko render a page’s `` and site banner while the Node server calls a backend API for the content: + +```marko + + + + + + + + + + + + + <@then|response|> + ${response.body} + + <@catch|err|> + Oh no, the content API is down again + + + + + + + +``` + +This is great for the user because their browser gets a head start downloading assets in the `` and displays the site header while the network request to the content API is underway. + +If the `contentFetchRequest` fails for any of the myriad reasons computers are terrible, the page will show an error message instead — but as far as the HTTP layer knows, the response completed successfully. + +No machine-readable indicator that a streamed response contains erroneous content causes some thorny problems: + +- An HTTP cache may store the erroneous content and reuse it, showing users the error for longer than they otherwise would +- Search engines will index the erroneous content, since they received no sign they should try again or discard the response as invalid +- HTTP-level tools (debuggers, monitoring, curl, spiders, etc.) will report the response as successful, even though it wasn’t + +HTTP/1.1’s chunked responses and HTTP/2’s stream error handling each have a standardized way of indicating a dynamic response failed to complete successfully: + +
+
HTTP/1.1 Transfer-Encoding: chunked
+
IETF Draft: HTTP/1.1 Messaging §8 Handling Incomplete Messages
+

If a chunked response doesn’t terminate with the zero-length end chunk, the client must assume that the response was incomplete — which at the very least, means a cache should double-check with the server before reusing the stored incomplete response.

+ +
HTTP/2 Streams
+
RFC 7540 §5.4.2 Stream Error Handling
+

An HTTP/2 stream can signal an application error by sending a RST_STREAM frame with an error code of 0x2 INTERNAL_ERROR.

+
+ +This proposal explores how Marko should surface errors from its components to the HTTP layer, so the response can properly signal an error with one of the above methods. + +## Guide-level explanation + +Developers should be able to tell Marko how important an async error is: + +1. Our example from the Motivation section is as serious as it gets + +2. Small partials in a page may still want to tell proxies and search engines that the response is incomplete, but the user might find value in the rest of the page around the `<@catch>` + +3. Maybe no problem at all for trivial parts of the page, like pulling weather data to change a background image + +> Explain the proposal as if it was already implemented and you are now teaching it to another Marko developer. That generally means: +> +> - Explaining the feature largely in terms of examples. +> - What names and terminology work best for these concepts and why? Is it a continuation of existing Marko patterns, or a wholly new one? +> - Explaining how Marko developers should _think_ about the feature, and how it should impact the way they use Marko. It should explain the impact as concretely as possible. +> - If applicable, describe the differences between teaching this to existing Marko developers and new Marko developers. + +> For implementation-oriented proposals, this section should focus on how contributors should think about the change, and give examples of its concrete impact. For policy proposals, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms. + +> Would the acceptance of this proposal mean the Marko docs must be re-organized or altered? + +No, but a page about this problem may be a good addition. + +## Reference-level explanation + +> This is the technical portion of the proposal. Explain the design in sufficient detail that: +> +> - Its interaction with other features/tools is clear. +> - It is reasonably clear how the feature would be implemented. +> - Corner cases are dissected by example. +> - The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. + +It is out of scope for this mechanism to surface anything other than runtime (`5XX`) errors — mostly because that’s the only semantic we can surface through either version of HTTP. It would be wrong to use this for a 404 page, for example. + +### Trailers + +⚠️ **At-risk:** This section may not work at all, since trailers can only be appended _after_ the zero-length terminator chunk in HTTP/1.1, and presumably after the error frame in HTTP/2. + +We could send error information in HTTP trailers if any clients would benefit. [Some browsers have traditionally discarded almost all trailers other than `Server-Timing`](https://www.fastly.com/blog/supercharging-server-timing-http-trailers), but other browsers, tools like `curl`, search engines, proxy caches, and debuggers may benefit. For example, an updated `Cache-Control`, `Retry-After`, or even `Refresh`ing to an error page at a new URL if the problem was serious enough. + +Even for stricter browsers, error information inside `Server-Timing` could help developers, as many monitoring tools track, expose, and alert information from this header. + +``` +Server-Timing: markoAwaitErr; dur=87.1; desc="${errorInfo}" +``` + +What information should be exposed this way? + +- The `dur` field is standardized as indicating how far along the event happened in the request +- The `desc` field is any string the developer wishes +- Multiple Server-Timing events are allowed with the same name, so Marko could emit it repeatedly for multiple async failures, with `desc` disambiguating what happened and where + +### HTTP/1.1 considerations + +We should only close the TCP socket sans chunk terminator once we’re done rendering the page. Closing a warmed-up HTTP/1.1 connection is wasteful performance-wise, but there’s no other option. + +The spec also says a `Transfer-Encoding` decode error marks the message as incomplete; a chunk with a _negative_ integer length might trigger that behavior? Or anything that isn’t a positive hex integer. + +A chunk that’s displaying the contents of a `<@catch>` error message may also encode error information as [chunk extensions](https://tools.ietf.org/html/rfc7230#section-4.1.1). I wonder if there’s any standards or _de facto_ implementations that leverage these? + +What a headache. At least HTTP/2 has a much more explicit signal for errors during a stream. + +### HTTP/2 considerations + +Developers could signal that even though an error occurred, the request is safe to retry. [The HTTP/2 spec provides two methods to indicate which streamed requests are safe to retry](https://tools.ietf.org/html/rfc7540#section-8.1.4): a `RST_STREAM` frame with an error code of `REFUSED_STREAM`, or a `GOAWAY` frame indicating which stream IDs are safe to retry. + +However, those signals indicate a retry is safe even if the request semantics were not idempotent (like a `POST` request), so they may not be useful by the time the request handling makes it to the Marko layer: + +> A server MUST NOT indicate that a stream has not been processed unless it can guarantee that fact. If frames that are on a stream are passed to the application layer for any stream, then `REFUSED_STREAM` MUST NOT be used for that stream, and a `GOAWAY` frame MUST include a stream identifier that is greater than or equal to the given stream identifier. + +Still, we might provide a way for Marko authors to signal this safe-to-retry information — the application may be allowed to reason about errors within its own layer. + +### SPDY considerations + +Some CDNs, like Akamai, translate backend responses to SPDY when talking to older clients. Luckily, SPDY’s semantics more or less map to HTTP/2’s: see [IETF draft: SPDY Protocol §2.4.2. Stream error handling](https://tools.ietf.org/id/draft-mbelshe-httpbis-spdy-00.txt). + +[The hex code for `INTERNAL_ERROR` might be `0x6` instead](https://www.chromium.org/spdy/spdy-protocol/spdy-protocol-draft2#TOC-RST_STREAM), but that seems easy and obvious enough to expect transformers to handle it. + +### HTTP/… 3 + +[It’s going to happen eventually.](https://quicwg.org/base-drafts/draft-ietf-quic-http.html#errors) + +### Signaling within Node + +Node’s `http` and `http2` modules are probably capable of emitting error state as the standards require, but it’s not 100% clear how to transparently work with them in ways that the greater ecosystem would expect. + +For example, how does Express react to a `http` stream closing without the zero-length terminator? + +Hopefully, Node already does The Right Thing when these modules’ streams produce the protocol-specific error signal, and the ecosystem has had time to discover and handle that occasion. (If not, Node might appreciate a pull request.) More research needed. + +Presumably-helpful parts of the API: + +- https://nodejs.org/api/net.html#net_socket_end_data_encoding_callback +- https://nodejs.org/api/net.html#net_socket_setkeepalive_enable_initialdelay +- https://nodejs.org/api/http.html#http_request_abort +- https://nodejs.org/api/http.html#http_event_clienterror +- `http2session.destroy([error][, code])` +- `http2session.goaway([code[, lastStreamID[, opaqueData]]])` +- https://nodejs.org/api/http2.html#http2_destruction +- `http2stream.close(code[, callback])` +- `response.addTrailers(headers)` + +### HTTP ecosystem considerations + +It should be possible for developers to selectively turn off this error-signaling behavior on a per-request basis, in case of misbehaving proxies or known-bad clients. + +Research needed into how common reverse proxies or load balancers, such as nginx, handle error-signaled HTTP streams: + +- [Real-world examples of how that can happen, including the comment at the very bottom](https://rhodesmill.org/brandon/2013/chunked-wsgi/) +- [Atmosphere long-polling on nginx – chunked encoding error](https://rijulaggarwal.wordpress.com/2018/01/10/atmosphere-long-polling-on-nginx-chunked-encoding-error/) + +In the common case of a backend speaking HTTP/1.1 to a reverse proxy/front-end terminator/CDN/etc. that translates to HTTP/2 for browsers, what can be done? + +This part is probably out of scope for Marko itself, but could be invaluable guidance to help developers figure out what to do with an `error` event on a Marko stream. + +## Migration strategy + +> Is this proposal backwards incompatible? Does this proposal replace an existing feature/pattern? If so, can we safely and automatically migrate existing apps/components to this proposal? How? + +This proposal may change nothing about Marko at all, but instead surface developer advice. + +> If applicable, provide sample error messages, deprecation warnings, or migration guidance. + +## Drawbacks + +So far, Marko is totally agnostic as to the kind of stream it outputs to. You can use it to stream to a file, a UNIX pipe, or something unexpected like an FTP transmission or a chat protocol. + +It’s possible the server runtime might expose a different method to avoid breaking such cases, like `template.streamToHttp()`. + +However, there’s already a lot about Marko’s server rendering that only makes sense delivered over HTTP. For example, if you were rendering to a file to be served later, you’d want to buffer the component init scripts and you’d never want `client-reorder` enabled. + +## Alternatives + +The impact of not doing this would be no change; we would continue exposing Marko authors to the risks mentioned in the Motivation section. + +In the past, browsers other than Internet Explorer accepted `multipart/x-mixed-replace` responses to be loaded in `target="_top"`, and that could potentially be used to swap to a new copy of the response-in-progress but with error-indicating headers, or even a replacement error page. However, modern browsers have quietly dropped support for `multipart/x-mixed-replace` in the top browsing context. + +## Prior art + +How to signal errors during a dynamic HTTP stream is not unique to Marko; the problem is very old. Other server runtimes and frameworks may have paved this cowpath already. + +- [You used to have to delay a bit before closing a connection without the zero-length terminator because of this Chrome bug](https://bugs.chromium.org/p/chromium/issues/detail?id=610126) +- [StackOverflow / Response sent in chunked transfer encoding and indicating errors happening after some data has already been sent](https://stackoverflow.com/questions/17203379/response-sent-in-chunked-transfer-encoding-and-indicating-errors-happening-after) +- Express and other Node HTTP frameworks + - [expressjs/express #2700: Abort a streaming response.](https://github.com/expressjs/express/issues/2700) + - [Handling Errors in Express § When streaming](https://zellwk.com/blog/express-errors/#when-streaming) + - [StackOverflow / Error handling in express while piping stream to response](https://stackoverflow.com/questions/21509233/error-handling-in-express-while-piping-stream-to-response) + - [Express Guide / Error Handling § The default error handler](https://expressjs.com/en/guide/error-handling.html#the-default-error-handler) +- PHP’s output buffers and its `flush()` family of functions + - [StackOverflow / Chrome net::ERR_INCOMPLETE_CHUNKED_ENCODING error](https://stackoverflow.com/questions/29894154/chrome-neterr-incomplete-chunked-encoding-error) (good lord the answers all reveal a lot of ways this can happen) +- [Error pages and chunked encoding – it’s harder than you think](https://techblog.bozho.net/error-pages-and-chunked-encoding-its-harder-than-you-think/) +- Ruby on Rails + - [Rails API / `ActionController::Streaming` § Errors](https://api.rubyonrails.org/classes/ActionController/Streaming.html#module-ActionController::Streaming-label-Errors) + - [rails/…/action_view/base.rb § `:streaming_completion_on_exception`](https://github.com/rails/rails/blob/7729b518f37740a0b2d3bfe8c53ed55dd7548d8b/actionview/lib/action_view/base.rb#L147-L149) +- Perl may be the most interesting of all, as it was widely-used during the migration to HTTP/1.1 — HTTP/1.0 notoriously couldn’t tell the difference between the server finishing a dynamic response or the connection being closed for other reasons. +- [Django basically discourages streaming altogether](https://docs.djangoproject.com/en/3.0/ref/request-response/#django.http.StreamingHttpResponse) +- This is worth its weight in gold, even the comments: [WSGI and truncated chunked response bodies](https://rhodesmill.org/brandon/2013/chunked-wsgi/) +- [HTTP: The Definitive Guide / The Mysteries of Connection Close](https://www.oreilly.com/library/view/http-the-definitive/1565925092/ch04s07.html) + +## Unresolved questions + +1. How should Marko devs indicate an `` error is serious or not? Re-throwing the `Error` inside `<@catch>`? It seems like a bad idea to have errors default to cacheable, instead of the opposite. + +2. Can Marko even write garbage data to the HTTP stream at its layer?