Authoritative Contained Resource Data #352

csarven · 2021-11-17T13:59:12Z

Issues concerning:

The concept #authoritative-information describes an RDF constraint that's analogous to "authoritative metadata" as per https://www.w3.org/2001/tag/doc/mime-respect - the sender's HTTP message is considered to be authoritative.

A server receiving the message can either respect the sender's intentions or respond with a redirect or an error message. For example, in #server-protect-authoritative-resource-data, when a client requests to update a container's description including authoritative data about the contained resources, the server will reject the request.

The concept #authoritative-resource-data specialises #authoritative-information in that it is used in the context of resource descriptions. #authoritative-contained-resource-data is one application of authoritative information in the container description.

RubenVerborgh

Aligns in spirit with what was decided.

protocol.html

kjetilk

I'm just going on about this: dct:modified and stat:mtime aren't defined as the relevant times for the resource, as detailed in the comments. They can't be. They have to be defined in other terms.

protocol.html

NoelDeMartin · 2021-12-05T16:30:20Z

I haven't read all the comments in this thread, so apologies if this has already been addressed. But I just have a couple of comments/questions.

Reading this part is confusing for me:

Servers MUST include resource metadata about contained resources as part of the container description, unless that information is inapplicable to the server.

It says "MUST", but then it says "unless that information is inapplicable to the server". Wouldn't that be an implementation detail of the server, and thus something irrelevant in the protocol description? Also, as an app developer reading this I could be misled into thinking that all servers provide the data, because it says MUST. But as I understand it, that's not true: depending on some implementation details of the server, some containers won't return this data.

Keeping that in mind, why not just use MAY in that sentence?

Also, I'm not sure why there are two values indicating what seems to be the same information: dcterms:modified and stat:mtime. Is it just for compatibility with current implementations? Or is there an important difference between the two?

As an app developer, it was confusing to know which one to use, so I think I just picked one at random. Now it mentions that dcterms:modified has to match the Last-Modified header, so I suppose that's the "canonical" one?

csarven · 2021-12-05T16:48:18Z

@NoelDeMartin Thank you. The discussion on "MUST, unless" happened here: #352 (comment) . "unless" has to do with inapplicability. We may change to "SHOULD" but this is to be worked out in #343 or next release of the spec. Unfortunately MAY is not strong enough for a requirement to enable the feature.

stat:mtime may be deprecated in a later version of the spec ( #352 (comment) ). Indeed rely on dcterms:modified for the foreseeable future.
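To make the correspondence between the Last-Modified header and dcterms:modified mentioned above more concrete, here is a minimal sketch in Python. The function names are hypothetical and not part of the spec or of any server; the sketch only assumes that the header carries a standard HTTP-date and that the triple uses an xsd:dateTime in UTC, as in the examples elsewhere in this thread.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def last_modified_to_xsd_datetime(header_value: str) -> str:
    """Convert an HTTP-date into an xsd:dateTime lexical form in UTC."""
    parsed = parsedate_to_datetime(header_value)
    if parsed.tzinfo is None:
        # HTTP-dates are expressed in GMT; treat a naive result as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def modified_triple(resource_path: str, last_modified_header: str) -> str:
    """Render a dcterms:modified statement (hypothetical helper, Turtle syntax)."""
    value = last_modified_to_xsd_datetime(last_modified_header)
    return f'<{resource_path}> dcterms:modified "{value}"^^xsd:dateTime .'
```

A client could use the same conversion in the other direction of the check: compare the dcterms:modified value it finds in a container listing with the Last-Modified header it gets when it fetches the contained resource itself.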

RubenVerborgh

All good, except for the wrong RFC6570 template, to which I propose a correction.

protocol.html
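For readers unfamiliar with RFC6570, the template that appears in the diff further down this thread is `http://www.w3.org/ns/iana/media-types/{+iana-media-type}#Resource`. The `+` operator selects reserved expansion, which leaves characters such as `/` unencoded, whereas simple expansion (no `+`) would percent-encode them. The following is only an illustrative sketch with hypothetical function names, not a general RFC6570 implementation.

```python
from urllib.parse import quote

TEMPLATE = "http://www.w3.org/ns/iana/media-types/{+iana-media-type}#Resource"

# RFC 6570 reserved characters (gen-delims and sub-delims). Unreserved
# characters are always left untouched by urllib.parse.quote.
RESERVED = ":/?#[]@!$&'()*+,;="


def expand_reserved(media_type: str) -> str:
    """Reserved expansion ({+var}): "/" and "+" in the media type are kept."""
    # Simplification: a literal "%" is always encoded here, while RFC 6570
    # lets existing pct-encoded triplets pass through. Media types do not
    # contain "%", so this does not affect the illustration.
    return TEMPLATE.replace("{+iana-media-type}", quote(media_type, safe=RESERVED))


def expand_simple(media_type: str) -> str:
    """What simple expansion ({var}) would produce instead: "/" becomes %2F."""
    return TEMPLATE.replace("{+iana-media-type}", quote(media_type, safe=""))
```

With `text/turtle` as input, `expand_reserved` keeps the slash in the resulting class URI, while `expand_simple` would turn it into `text%2Fturtle`.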

RubenVerborgh · 2021-12-05T20:01:47Z

Implementer's note: the Community Solid Server supports the text of this PR via configuration: https://github.com/solid/community-server-recipes/blob/feat/spec-352/metadata/config-metadata.json.
Live version for testing at https://drive.verborgh.org/.

Co-authored-by: Ruben Verborgh <ruben@verborgh.org>

kjetilk

I have bad feelings about this, as we are now saying that it is OK for a server to not track the modification of representations of a resource and give the modification times of just a part of it. It seems likely to me that this will cause many problems for caching in the future, and that servers will serve stale data, and that it will result in difficulty for decentralization efforts.

Nevertheless, I acknowledge that it is now an internally consistent structure in line with NSS behavior and that RFC7232 is very liberal on this point, as it says:

The last-modified time would usually be the most recent time that any of those parts were changed.

and, what this does is to say "usually, yes, but not for containers in Solid".

There ought to be a 👃 reaction

acoburn · 2021-12-06T10:21:19Z

protocol.html

+                        <dt about="#contained-resource-metadata-rdf-type" id="contained-resource-metadata-rdf-type" property="skos:prefLabel"><code>rdf:type</code></dt>
+                        <dd about="#contained-resource-metadata-rdf-type" property="skos:definition">A class whose URI is the expansion of the <em>URI Template</em> [<cite><a class="bibref" href="#bib-rfc6570">RFC6570</a></cite>] <code>http://www.w3.org/ns/iana/media-types/{+iana-media-type}#Resource</code>, where <code>iana-media-type</code> corresponds to a value from the IANA Media Types [<cite><a class="bibref" href="#bib-iana-media-types">IANA-MEDIA-TYPES</a></cite>].</dd>
+                        <dt about="#contained-resource-metadata-stat-size" id="contained-resource-metadata-stat-size" property="skos:prefLabel"><code>stat:size</code></dt>
+                        <dd about="#contained-resource-metadata-stat-size" property="skos:definition">A non-negative integer giving the size of the resource in bytes.</dd>


This item in particular should be discussed in the security considerations section

Do you consider stat:size (I assume that's what your comment is referring to - GitHub UI trips me up sometimes) to be more of a concern than the others?

Had this text earlier in the Note #contained-resource-metadata-considerations :

Servers are encouraged to consider omitting authoritative data about a contained resource when an agent is unauthorized to read the contained resource.

Removed it as per Kjetil's suggestion: #352 (comment)

Can revive and put it under #security-considerations but note that it may be equivalent / already captured by:

Servers are strongly discouraged from exposing information beyond the minimum amount necessary to enable a feature.

If you'd like more specific considerations that should be mentioned in there, we can do that as well. Have something in mind?

Provided that a server can choose to omit this data (under the "inapplicability" clause), this is fine as-is.

Knowing the size of a file that one may not otherwise have access to read can be extremely useful data when looking to exploit weaknesses in a server. I would have serious reservations about including that information on a Storage that holds sensitive personal data.

Yes, I believe that since we do know that we cannot have a server check the authz of every child resource, it is inappropriate to leave it as an implementation detail; we'd need to specify a concrete mechanism. I think it would be better for this to be a SHOULD, as per RFC2119, but for now, it seems that it could be left out under the inapplicability clause.
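As an illustration of the advisement quoted earlier in this thread (omitting data about a contained resource when the agent cannot read it), here is a minimal sketch. Note that this advisement was removed from the PR, so nothing below is required by the spec. The `Contained` type, the `can_read` callback and the dictionary layout are all hypothetical; how a server actually checks authorization is out of scope here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Contained:
    path: str
    size: int        # bytes, would be published as stat:size
    modified: str    # xsd:dateTime, would be published as dcterms:modified


def describe_contained(
    resources: List[Contained],
    can_read: Callable[[str], bool],
) -> Dict[str, Dict[str, object]]:
    """Build per-resource metadata for a container listing.

    Every contained resource is still listed (the containment itself is
    kept), but its metadata is left empty when the agent may not read it.
    """
    description: Dict[str, Dict[str, object]] = {}
    for res in resources:
        metadata: Dict[str, object] = {}
        if can_read(res.path):
            metadata["stat:size"] = res.size
            metadata["dcterms:modified"] = res.modified
        description[res.path] = metadata
    return description
```

The cost concern raised above is visible in the sketch: `can_read` is called once per contained resource, which is exactly the per-child authorization check that may be impractical for large containers.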

acoburn · 2021-12-07T18:04:40Z

protocol.html

+                        <dt about="#contained-resource-metadata-stat-size" id="contained-resource-metadata-stat-size" property="skos:prefLabel"><code>stat:size</code></dt>
+                        <dd about="#contained-resource-metadata-stat-size" property="skos:definition">A non-negative integer giving the size of the resource in bytes.</dd>
+                        <dt about="#contained-resource-metadata-dcterms-modified" id="contained-resource-metadata-dcterms-modified" property="skos:prefLabel"><code>dcterms:modified</code></dt>
+                        <dd about="#contained-resource-metadata-dcterms-modified" property="skos:definition">The date and time when the resource was last modified.</dd>


Assuming that a server provides a Last-Modified header and assuming that we do not want to break how caching works for browsers, I do not see how this (dcterms:modified or stat:mtime) can be implemented without causing a cascade of updates to the root of a Storage.

For example, consider a resource at /lvl-1/lvl-2/foo.ttl with the following response headers:

Last-Modified: Sun, 05 Dec 2021 17:45:02 GMT
ETag: "abcd"

The container at /lvl-1/lvl-2/ has the following headers:

Last-Modified: Tue, 07 Dec 2021 12:23:15 GMT
ETag: "1234"

That container would include the following triples:

</lvl-1/lvl-2/>
    dcterms:modified "2021-12-07T12:23:15Z"^^xsd:dateTime ;
    ldp:contains </lvl-1/lvl-2/foo.ttl> .
</lvl-1/lvl-2/foo.ttl>
    dcterms:modified "2021-12-05T17:45:02Z"^^xsd:dateTime .

The container at /lvl-1/ has the following headers:

Last-Modified: Tue, 07 Dec 2021 19:37:22 GMT
ETag: "zyxw"

And the following body:

</lvl-1/>
    dcterms:modified "2021-12-07T19:37:22Z"^^xsd:dateTime ;
    ldp:contains </lvl-1/lvl-2/> .
</lvl-1/lvl-2/>
    dcterms:modified "2021-12-07T12:23:15Z"^^xsd:dateTime .

Now consider that a client adds a triple to /lvl-1/lvl-2/foo.ttl. The result will lead to the Last-Modified being updated on /lvl-1/lvl-2/foo.ttl. The content of that resource has changed, so the ETag header also changes.

Because the Last-Modified header changes, the representation of /lvl-1/lvl-2/ now also changes to reflect the new status of the contained resource. As a consequence, and in order to ensure that clients receive the latest version of /lvl-1/lvl-2/, the Last-Modified and ETag headers of this container resource must change.

Because the Last-Modified header for /lvl-1/lvl-2/ changes, the representation of /lvl-1/ also needs to change, which leads to a further cascade of changes to the root of the storage.
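The cascade described above can be sketched as a small simulation. This is only a model under stated assumptions: the `touch` function, the `cascade` flag and the in-memory dictionary are hypothetical, and the timestamp used for the write is made up for illustration. With `cascade=True` every ancestor container gets a new Last-Modified value, which is the behaviour being questioned; with `cascade=False` only the written resource changes.

```python
from typing import Dict, List, Optional


def parent_of(path: str) -> Optional[str]:
    """Return the parent container path, or None for the storage root."""
    if path == "/":
        return None
    trimmed = path.rstrip("/")
    return trimmed[: trimmed.rfind("/") + 1]


def touch(last_modified: Dict[str, str], path: str, when: str, cascade: bool) -> List[str]:
    """Record a write to `path` and return every path whose timestamp changed."""
    last_modified[path] = when
    changed = [path]
    parent = parent_of(path)
    while cascade and parent is not None:
        # The parent's representation embeds the child's dcterms:modified,
        # so under this model the parent's own Last-Modified changes too.
        last_modified[parent] = when
        changed.append(parent)
        parent = parent_of(parent)
    return changed


store = {
    "/": "2021-12-07T19:37:22Z",
    "/lvl-1/": "2021-12-07T19:37:22Z",
    "/lvl-1/lvl-2/": "2021-12-07T12:23:15Z",
    "/lvl-1/lvl-2/foo.ttl": "2021-12-05T17:45:02Z",
}
```

Calling `touch(store, "/lvl-1/lvl-2/foo.ttl", "2021-12-08T09:00:00Z", cascade=True)` updates four entries, all the way up to `/`, for a single write to one leaf resource.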

There are three ways around this, as I see it:

1. Do not include the dcterms:modified or stat:mtime triples (i.e. ignore this part of the spec)

2. Don't worry about breaking how browser caching works (i.e. ignoring app/client needs)

3. Put this data in a separate (auxiliary) resource that doesn't lead to a cascade of change.

I prefer the third option because it makes it possible to include this data without breaking how browsers interact with HTTP resources, but absent that approach, the only route forward I see (while staying roughly in line with this spec) is to simply not include the modified time of contained resources (again, arguing "inapplicability").

Yes, @acoburn, I have been arguing the same: an auxiliary resource solves all of these problems. It can represent the entirety of the resource (and it makes sense to add data for each representation too if needed), avoids cascading problems, can be subject to separate authorization, can be used to track any changes, and should be at least as easy to implement.

However, we haven't been able to find consensus around that for 0.9, so indeed, the result is that these data may be stale when conditional requests are used. I disagree with the conclusion, but as RFC7232 is fairly vague on this point, I have chosen to accept it for 0.9, sincerely hoping we can revisit for 1.0.

Because the Last-Modified header changes, the representation of /lvl-1/lvl-2/ now also changes to reflect the new status of the contained resource.

While the representation data of /lvl-1/lvl-2/ changes because of the dcterms:modified value of /lvl-1/lvl-2/foo.ttl, the Last-Modified header of /lvl-1/lvl-2/ need not change (nor is it prohibited from changing), as there were no changes to the containment triples (#server-container-last-modified).

A recommendation (or advisement) along the following lines may be necessary:

When the resource metadata of an existing contained resource changes, the server MUST send a weak entity-tag when responding to container’s request URI.
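For context on what a weak entity-tag means in practice, here is a minimal sketch of how weak and strong tags are formed and compared, following the comparison rules of RFC7232 section 2.3.2. The function names are hypothetical and the opaque tag values are arbitrary; this is not taken from any server implementation.

```python
from typing import Tuple


def make_weak(opaque: str) -> str:
    """Format a weak entity-tag, e.g. W/"abcd"."""
    return f'W/"{opaque}"'


def parse_etag(value: str) -> Tuple[bool, str]:
    """Split an entity-tag into (is_weak, quoted opaque tag)."""
    weak = value.startswith("W/")
    opaque = value[2:] if weak else value
    if len(opaque) < 2 or not (opaque.startswith('"') and opaque.endswith('"')):
        raise ValueError(f"not an entity-tag: {value!r}")
    return weak, opaque


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither tag is weak and the opaque tags are identical."""
    weak_a, opaque_a = parse_etag(a)
    weak_b, opaque_b = parse_etag(b)
    return not weak_a and not weak_b and opaque_a == opaque_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque tags are identical, weakness is ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```

Note that the weakness indicator does not by itself signal a change: a client only sees that the container's representation changed if the opaque part of the tag changes as well.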

I want to mention again that the requirements introduced by this PR do not expect a container (/lvl-1/lvl-2/) to include resource metadata about itself (</lvl-1/lvl-2/> dcterms:modified "2021-12-07T12:23:15Z"^^xsd:dateTime .). (I don't care if that's not the point - it easily introduces complications/assumptions to the discussion that we are better off without.) And again, if resource metadata about the container itself is desired - I may have missed the discussion that calls for it - the specification needs to say so, because both the generation and protection of that resource metadata need to apply. On a related note:

When a server plans to include resource metadata about the container in the response of the same container, the server should determine the Last-Modified header value as a regular change to the representation data. (It can send a strong entity-tag as usual.)

This does mean that apps/users will see stale data. From an impl perspective (given the choice), I would rather not provide the data than provide it in a way that confuses users.

I am really trying to be constructive here, but I don't see how this is implementable, given this text

As it stands, the Protocol uses the same requirement levels as in the RFC for Last-Modified and ETag headers. Servers might not include either Last-Modified or ETag headers, and so clients can't absolutely rely on them for being there.

That aside,

https://datatracker.ietf.org/doc/html/rfc7232#section-2.4 and https://datatracker.ietf.org/doc/html/rfc7232#section-6 suggest that headers with entity-tags will have higher precedence.

The server needs to enable the use of conditional requests so that the client can eventually get a 200/304/412 or whatever.

Perhaps I'm missing something here but as mentioned in previous comment, I don't see a staleness issue around entity-tags.

If a client is only making decisions with Last-Modified/If-*-Since (when Last-Modified is provided by the server), then the representation metadata in the response will not help it differentiate between a change to containment triples and any other change to the representation. In this particular scenario, the client has to make a GET request on the container.
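The difference between the two kinds of validators can be sketched from the server side. The function below is a hypothetical, simplified evaluation of a conditional GET: per RFC7232, If-Modified-Since is ignored when If-None-Match is present, and If-None-Match uses weak comparison. Header lookup is case-sensitive here and the list parsing is naive; both are simplifications for illustration only.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping


def opaque_tag(etag: str) -> str:
    """Drop the W/ weakness indicator so tags can be compared weakly."""
    return etag[2:] if etag.startswith("W/") else etag


def evaluate_conditional_get(
    headers: Mapping[str, str],
    current_etag: str,
    current_last_modified: datetime,
) -> int:
    """Return 304 if the client's copy is considered fresh, else 200."""
    if_none_match = headers.get("If-None-Match")
    if if_none_match is not None:
        # Entity-tags take precedence: If-Modified-Since is not consulted.
        # Simplification: splitting on "," assumes no commas inside a tag.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        matched = "*" in candidates or any(
            opaque_tag(tag) == opaque_tag(current_etag) for tag in candidates
        )
        return 304 if matched else 200

    if_modified_since = headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if current_last_modified <= since:
            return 304
    return 200
```

In this model, if a container's entity-tag has changed but its Last-Modified has not, a client sending If-None-Match with the old tag gets a 200 with the new representation, while a client sending only If-Modified-Since gets a 304 and keeps its older copy.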

We can certainly revisit that in 1.0.

For now, I suggest to add the following and go ahead with 0.9:

When the resource metadata of an existing contained resource changes, the server MUST send a weak entity-tag when responding to container’s request URI.

If the entity tag changes from x to y, there is no issue, but that is orthogonal to the weakness indicator

Good. Then in what case would it not change from x to y?

Any time the representation changes, the ETag changes. There is no argument about that. I am only responding to the proposed requirement to use weak ETags. And to be clear, I am not opposed to using weak ETags, but we can't claim that weak ETags will solve this issue.

The weak entity-tag is suggested because a strong entity-tag wouldn't be correct when only the resource metadata changes, all while still allowing the client to make conditional requests. Right at this second, I don't have a strong opinion on adding that requirement because I do think it comes directly from the RFC, but it may be reasonable to mention.

That aside, I acknowledge that there is (always) room to discuss/work this out further, but it doesn't need to be a blocker for 0.9. (We mark a handful of things to be revisited for 1.0, and I don't see why it can't be done here as well.)

justinwb

✅ following discussion and agreements in 12/15 editor meeting

csarven added 4 commits November 17, 2021 14:53

Add dc-terms iana-media-types to References

f07d1ee

Add definition for authoritative-information

de6eb1c

Add dcterms and stat to namespaces

0d10299

Add section on Authoritative Resource Data

2e70f8f

kjetilk requested a review from a team November 17, 2021 16:32

kjetilk added doc: Protocol topic: resource access labels Nov 17, 2021

kjetilk added this to the Release 0.9 milestone Nov 17, 2021

kjetilk linked an issue Nov 17, 2021 that may be closed by this pull request

Specify existing practice for Container data about contents #343

Closed

RubenVerborgh approved these changes Nov 17, 2021

kjetilk requested changes Nov 18, 2021

acoburn reviewed Nov 18, 2021

csarven added 5 commits November 23, 2021 09:47

Add reference to RFC 6570

294245e

Update authoritative resource type. Refer to RFC 6570

5ea32a4

Minor

630861c

Update inapplicable condition and logically-part-of note

c0d8d4c

Minor

91a77cb

kjetilk reviewed Nov 23, 2021

justinwb reviewed Nov 24, 2021

csarven mentioned this pull request Nov 25, 2021

Server Description #355

Open

csarven added 8 commits December 3, 2021 14:25

Update authoritative-resource-data-considerations

f599dd5

Mention date as part of dcterms:modified

4734bbf

Remove advisement on omitting data from unauthorized agents

98d2db4

Add determining server-container-last-modified

4eaf345

Remove contained-resource-state-cascading in favour of server-contain…

f2c8c8b

…er-last-modified

Add dcterms-modified-corresponds-last-modified

30eee02

Use resource-metadata instead of authoritative-information

1727bb8

Minor

64d4841

Add contained-resource-metadata-statements as Collection

0b6db97

csarven requested review from kjetilk, justinwb, acoburn, timbl and RubenVerborgh December 5, 2021 16:48

RubenVerborgh added a commit to CommunitySolidServer/Recipes that referenced this pull request Dec 5, 2021

Implement solid/specification#352

b4b4072

RubenVerborgh suggested changes Dec 5, 2021

csarven and others added 2 commits December 5, 2021 21:48

Update protocol.html

64e3738

Co-authored-by: Ruben Verborgh <ruben@verborgh.org>

Update protocol.html

a1222a2

Co-authored-by: Ruben Verborgh <ruben@verborgh.org>

kjetilk approved these changes Dec 6, 2021

acoburn reviewed Dec 6, 2021

acoburn reviewed Dec 7, 2021

kjetilk mentioned this pull request Dec 15, 2021

Auxiliary resource for container metadata as alternative #362

Open

csarven added 2 commits December 15, 2021 16:20

Constrain server-container-last-modified. Add container-last-modified…

746520d

…-comparison

Minor

c33e386

justinwb approved these changes Dec 15, 2021

csarven requested review from RubenVerborgh and acoburn December 15, 2021 15:23

Merge branch 'main' into feature/authoritative-contained-resource-data

4ebdc4d

timbl approved these changes Dec 15, 2021

Use 'SHOULD, unless' for server-contained-resource-metadata

29b7094

csarven merged commit 1caff4c into main Dec 15, 2021

RubenVerborgh mentioned this pull request Mar 8, 2022

Expose all required fields in container listings for the filesystem backend CommunitySolidServer/CommunitySolidServer#1207

Closed

csarven deleted the feature/authoritative-contained-resource-data branch May 12, 2022 16:35
