From c9e231c73a92cb35fbca0e6278618a93290421a4 Mon Sep 17 00:00:00 2001 From: Marcin Rataj Date: Thu, 6 Jul 2023 17:46:54 +0200 Subject: [PATCH] ipip-412: refactor on top of ipip-402 moving spec details to trustless-gateway, rebasing on top of ipip-402 --- src/http-gateways/path-gateway.md | 6 +- src/http-gateways/trustless-gateway.md | 172 +++++++++++++++++++++---- src/ipips/ipip-0412.md | 113 ++++------------ 3 files changed, 174 insertions(+), 117 deletions(-) diff --git a/src/http-gateways/path-gateway.md b/src/http-gateways/path-gateway.md index e23061ddb..61aa6b568 100644 --- a/src/http-gateways/path-gateway.md +++ b/src/http-gateways/path-gateway.md @@ -595,11 +595,7 @@ The following response types require an explicit opt-in, can only be requested w - Raw Block (`?format=raw`) - Opaque bytes, see [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw). - CAR (`?format=car`) - - A CAR file or a stream that contains all blocks required to trustlessly verify the requested content path query, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) and :cite[trustless-gateway]. - - **Note:** by default, block order in CAR response is not deterministic, - blocks can be returned in different order, depending on implementation - choices (traversal, speed at which blocks arrive from the network, etc). - An opt-in ordered CAR responses MAY be introduced in a future IPIP. + - A CAR file or a stream that contains all blocks required to trustlessly verify the requested content path query, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) and Section 5 (CAR Responses) at :cite[trustless-gateway]. - TAR (`?format=tar`) - Deserialized UnixFS files and directories as a TAR file or a stream, see :cite[ipip-0288]. 
- IPNS Record diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md index e3ee0fe7f..ab8f42e0e 100644 --- a/src/http-gateways/trustless-gateway.md +++ b/src/http-gateways/trustless-gateway.md @@ -13,6 +13,10 @@ editors: - name: Henrique Dias github: hacdias url: https://hacdias.com/ +xref: + - url + - path-gateway + - ipip-0412 tags: ['httpGateways', 'lowLevelHttpGateways'] order: 1 --- @@ -25,11 +29,12 @@ The minimal implementation means: - response type is always fully verifiable: client can decide between a raw block or a CAR stream - no UnixFS/IPLD deserialization -- for CAR files: - - the behavior is identical to :cite[path-gateway] - for raw blocks: - data is requested by CID, only supported path is `/ipfs/{cid}` - no path traversal or recursive resolution +- for CAR files: + - the pathing behavior is identical to :cite[path-gateway] + # HTTP API @@ -63,13 +68,14 @@ Same as in :cite[path-gateway], but with limited number of supported response ty ### `Accept` (request header) -This HTTP header is required when running in a strict, trustless mode. +A Client SHOULD send this HTTP header to leverage content type negotiation +based on section 12.5.1 of :cite[rfc9110].
Below response types MUST to be supported: - [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) – requests a single, verifiable raw block to be returned Below response types SHOULD to be supported: -- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned, implementations MAY support optional parameters (:cite[ipip-0412]) +- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned, implementations MAY support optional CAR content type parameters (:cite[ipip-0412]) - [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) – requests a verifiable :cite[ipns-record] (multicodec `0x0300`). Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless @@ -175,28 +181,29 @@ For example: `Content-Type: application/vnd.ipld.car; version=1` MUST be returned and set to `attachment` to ensure requested bytes are not rendered by a web browser. -## Response Payload - -### Block Response +# Block Responses (application/vnd.ipld.raw) An opaque bytes matching the requested block CID ([application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw)). The Body hash MUST match the Multihash from the requested CID. -### CAR Response +# CAR Responses (application/vnd.ipld.car) A CAR stream for the requested [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) -content type, path and optional `dag-scope` and `entity-bytes` URL parameters. +content type (with optional `order` and `dups` params), path and optional +`dag-scope` and `entity-bytes` URL parameters. 
+ +The following MUST be implemented when a Gateway supports [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car). -#### CAR version +## CAR version Value returned in [`CarV1Header.version`](https://ipld.io/specs/transport/car/carv1/#header) field MUST match the `version` parameter returned in `Content-Type` header. -#### CAR roots +## CAR roots The behavior associated with the [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field @@ -210,27 +217,148 @@ As of 2023-06-20, the behavior of the `roots` CAR field remains an [unresolved ::: -#### CAR determinism +## CAR `order` (content type parameter) + +The `order` parameter allows clients to specify the desired block order in the +response. It supports the following values: + +- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search) + order, enables streaming responses with minimal memory usage. +- `unk` or missing: Unknown order, which serves as the implicit default when the `order` + parameter is missing. In this case, the client cannot make any assumptions + about the block order: blocks may arrive in a random order or be a result of + a custom DAG traversal algorithm. + +A Gateway SHOULD always return explicit `order` in CAR's `Content-Type` response header. + +A Gateway MAY skip `order` in CAR response if no order was explicitly requested +by the client and the default order is unknown. + +A Client MUST assume implicit `order=unk` when `order` is missing, unknown, or empty. + +## CAR `dups` (content type parameter) + +The `dups` parameter specifies whether duplicate blocks (the same block +occurring multiple times in the requested DAG) will be present in the CAR +response. Useful when a deterministic block order is used. + +It accepts two values: +- `y`: Duplicate blocks MUST be sent every time they occur during the DAG walk. +- `n`: Duplicate blocks MUST be sent only once.
+ +When set to `y`, light clients are able to discard blocks after +reading them, removing the need for caching in-memory or on-disk. + +Setting to `n` allows for more efficient data transfer of certain types of +data, but introduces additional resource cost on the receiving end, as each +block needs to be kept around in case its CID appears again. + +A Client MUST NOT assume any implicit behavior when `dups` is missing. + +If the `dups` parameter is not present in the `Content-Type` header, the +behavior is unspecified, and the CAR response includes an arbitrary list of +blocks. In this unknown state, the client MUST assume `n` as the default, but +also MUST ignore duplicates if they are present. + +A Gateway MUST always return `dups` in `Content-Type` response header +when the duplicate status is known at the time of response. + +A Gateway MAY skip `dups` if it was not present in `Accept` header sent by the +client or if it is not possible to tell the duplicate status. + +:::warning + +The specified parameter does not apply to virtual blocks identified by identity +CIDs. CAR responses MUST never include these virtual blocks. The parameter in +question is meant to control the behavior of non-virtual blocks in the +response. Therefore, it does not have any effect on virtual blocks, and they +should never be included in the CAR response, no matter if present, or what +value is set. + +::: + + +## CAR parameters and determinism -The default CAR header and block order in a CAR response is not specified and is non-deterministic. +The default header and block order in a CAR format is not specified by IPLD specifications. Clients MUST NOT assume that CAR responses are deterministic (byte-for-byte identical) across different gateways. Clients MUST NOT assume that CAR includes CIDs and their blocks in the same order across different gateways.
+Clients MUST assume block order and duplicate status only if `Content-Type` returned with CAR responses includes optional `order` or `dups` parameters, as specified by :cite[ipip-0412]. + +A Gateway SHOULD support some aspects of determinism by implementing content type negotiation and signaling via `Accept` and `Content-Type` headers. + :::issue -In controlled environments, clients MAY choose to rely on undocumented CAR determinism, -subject to the agreement of the following conditions between the client and the -gateway: +In controlled environments, clients MAY choose to rely on implicit and +undocumented CAR determinism, subject to the agreement of the following +conditions between the client and the gateway: - CAR version - content of [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field -- order of blocks -- status of duplicate blocks +- order of blocks (`order` from :cite[ipip-0412]) +- status of duplicate blocks (`dups` from :cite[ipip-0412]) -In the future, there may be an introduction of a convention to indicate aspects -of determinism in CAR responses. Please refer to -[IPIP-412](https://github.com/ipfs/specs/pull/412) for potential developments -in this area. +Note that this is undocumented behavior and MUST NOT be used on public networks. ::: + +### CAR format signaling in Request + +Content type negotiation is based on section 12.5.1 of :cite[rfc9110]. + +Clients MAY indicate their preferred block order by sending an `Accept` header in +the HTTP request.
The `Accept` header format is as follows: + +``` +Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y +``` + +In the future, when more orders or parameters exist, clients will be able to +specify a list of preferences, for example: + +``` +Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5 +``` + +The above example is a list of preferences: the client would prefer to use +the hypothetical `order=foo`; however, if this isn't available it would accept +`order=dfs` with `dups=y` instead (lower priority indicated via `q` parameter, +as noted in :cite[rfc9110]). + +### CAR format signaling in Response + +The Trustless Gateway MUST always respond with a `Content-Type` header that includes +information about all supported and known parameters, even if the client did not +specify them in the request. + +The `Content-Type` header format is as follows: + +``` +Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=n +``` + +Gateway implementations SHOULD decide on the implicit default ordering or +other parameters, and use it in responses when client did not explicitly +specify any matching preference. + +A Gateway MAY choose to implement only some of the parameters and return HTTP +400 Bad Request or 406 Not Acceptable when client requested a response with +unsupported content type variant. + +A Client MUST verify `Content-Type` returned with CAR response before +processing the payload, as a legacy gateway may not support optional content +type parameters like `order` and `dups` and return plain +`application/vnd.ipld.car`. + + +# IPNS Record Responses (application/vnd.ipfs.ipns-record) + +Opaque bytes matching the [Signed IPNS Record](https://specs.ipfs.tech/ipns/ipns-record/#ipns-record) +for the requested [IPNS Name](https://specs.ipfs.tech/ipns/ipns-record/#ipns-name) +returned as [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record).
+ +A Client MUST confirm the record signature matches the `libp2p-key` from the requested IPNS Name. + +A Client MUST [perform additional record verification according to the IPNS specification](https://specs.ipfs.tech/ipns/ipns-record/#record-verification). diff --git a/src/ipips/ipip-0412.md b/src/ipips/ipip-0412.md index c0b20beda..b0e536318 100644 --- a/src/ipips/ipip-0412.md +++ b/src/ipips/ipip-0412.md @@ -61,96 +61,10 @@ The `order` parameter allows the client to indicate its preference for a specific block order in the CAR response, and the `dups` parameter specifies whether duplicate blocks are allowed in the response. -### Signaling in Request +A Client SHOULD send an `Accept` HTTP header to leverage content type negotiation +based on section 12.5.1 of :cite[rfc9110] to get the preferred response type. -Content type negotiation is based on section 12.5.1 of :cite[rfc9110]. - -Clients MAY indicate their preferred block order by sending an `Accept` header in -the HTTP request. The `Accept` header format is as follows: - -``` -Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y -``` - -In the future, when more orders or parameters exist, clients will be able to -specify a list of preferences, for example: - -``` -Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5 -``` - -The above example is a list of preferences, the client would really like to use -the hypothetical `order=foo` however if this isn't available it would accept -`order=dfs` with `dups=y` instead (lower priority indicated via `q` parameter, -as noted in :cite[rfc9110]). - -#### `order` CAR content type parameter - -The `order` parameter allows clients to specify the desired block order in the -response. It supports the following values: - -- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search) - order, enables streaming responses with minimal memory usage.
-- `unk`: Unknown order, which serves as the implicit default when the order - parameter is missing. In this case, the client cannot make any assumptions - about the block order: blocks may arrive in a random order or be a result of - a custom DAG traversal algorithm. - -#### `dups` CAR content type parameter - -The `dups` parameter specifies whether duplicate blocks (the same block -occuring multiple times in the requested DAG) will be present in the CAR -response. Useful when a deterministic block order is used. - -It accepts two values: -- `y`: Duplicate blocks MUST be sent every time they occur during the DAG walk. -- `n`: Duplicate blocks MUST be sent only once. - -When set to `y`, light clients are able to discard blocks after -reading them, removing the need for caching in-memory or on-disk. - -Setting to `n` allows for more efficient data transfer of certain types of data, -but introduces additional resource cost on the receiving end. - -If the `dups` parameter is not present in the `Content-Type` header, the -behavior is unspecified, and the CAR response includes an arbitrary list of -blocks. In this case, the client should assume `n` as the default, but ignore -duplicates if they are present. - -:::warning - -The specified parameter does not apply to virtual blocks identified by identity -CIDs. CAR responses MUST never include these virtual blocks. The parameter in -question is meant to control the behavior of non-virtual blocks in the -response. Therefore, it does not have any effect on virtual blocks, and they -should never be included in the CAR response, no matter if present, or what -value is set. - -::: - - - -### Signaling in Response - -The Trustless Gateway MUST always respond with a `Content-Type` header that includes -information about all supported/known parameters, even if the client did not -specify them in the request. 
- -The `Content-Type` header format is as follows: - -``` -Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=y -``` - - -Gateway implementations are free to decide on the implicit default ordering or -other parameters, and use it in responses when client did not explicitly -specify any matching preference. - -Implementations MAY choose to implement only some of the parameters and return -HTTP 406 Not Acceptable when client requested a response with unsupported one. +More details are in Section 5 (CAR Responses) of :cite[trustless-gateway]. ## Design rationale @@ -245,7 +159,7 @@ Several alternative approaches were considered before arriving at the proposed s saves a few bytes in each round-trip. Also, :cite[rfc6648] advises against use of `X-` and similar constructs in new protocols. -The proposed solution of negotiating the block order through headers si +The proposed solution of negotiating the block order through headers is future-proof, allows for flexibility, interoperability, and customization while maintaining compatibility with existing implementations. @@ -255,11 +169,30 @@ Implementation compliance can be determined by testing the negotiation process between clients and Trustless Gateways using various combinations of `order` and `dups` parameters. +Relevant tests were added to the +[gateway-conformance](https://github.com/ipfs/gateway-conformance) test suite +in [#87](https://github.com/ipfs/gateway-conformance/pull/87). + + + +Below are CIDs, CARs, and a short summary of each fixture. + TODO: 1. a CAR with blocks for a small file in DFS order 2. a CAR with blocks for a small file with one block appearing twice + +Tests for duplicates use a fixture where a directory contains two files that +are the same. If `dups=n`, then there are no duplicates. If `dups=y`, then the +blocks of the file are sent twice, in the order they show up in the DAG. + +The same fixture is used for testing `order=dfs`.
+ + ### Copyright Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
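Reviewer sketch (not part of the patch): the client-side rules that the trustless-gateway hunks add for `order` and `dups` could look roughly like the Python below. The helper names (`build_accept`, `parse_car_content_type`, `effective_car_params`) and the `dups_known` flag are hypothetical and do not come from any IPFS library. The parsing is deliberately simplified and is not a full RFC 9110 media type parser.

```python
# Hypothetical helpers for illustration only; not an official IPFS API.
CAR_TYPE = "application/vnd.ipld.car"


def build_accept(order="dfs", dups="y", version="1"):
    """Build an Accept header value that asks for a specific CAR variant."""
    return f"{CAR_TYPE}; version={version}; order={order}; dups={dups}"


def parse_car_content_type(header):
    """Split a Content-Type value into (media_type, params).

    Simplified: splits on ';' and '=' and does not handle quoted
    strings that themselves contain ';'.
    """
    media_type, *raw_params = [part.strip() for part in header.split(";")]
    params = {}
    for raw in raw_params:
        if "=" in raw:
            key, value = raw.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"')
    return media_type.lower(), params


def effective_car_params(header):
    """Apply the client-side defaults described in the patch text."""
    media_type, params = parse_car_content_type(header)
    if media_type != CAR_TYPE:
        raise ValueError(f"not a CAR response: {media_type!r}")

    # Missing, empty, or unrecognized `order` is treated as `unk`.
    order = params.get("order") or "unk"
    if order not in ("dfs", "unk"):
        order = "unk"

    # Missing `dups`: assume `n`, but the caller should still tolerate
    # duplicate blocks, which is what `dups_known=False` signals here.
    dups = params.get("dups")
    dups_known = dups in ("y", "n")
    if not dups_known:
        dups = "n"

    return {"order": order, "dups": dups, "dups_known": dups_known}
```

A client would send `build_accept()` as the `Accept` header and pass the response's `Content-Type` to `effective_car_params` before reading the payload. A legacy gateway that returns plain `application/vnd.ipld.car` then yields `order="unk"` with `dups_known=False`, so no ordering or deduplication assumptions are made.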