Annotation Lists #50
Comments
😍 |
What @tilgovi said :-) |
I would like to understand what all these properties mean in this respect. But, first of all, I would also like to understand what we need in terms of functionalities before adopting a specification created elsewhere… For example, the model in EPUB[1] is way simpler and clearer. Why do we need more than that? (Let us forget about the complexities of the RDF expression of lists; a JSON-LD or Turtle representation thereof makes this complexity hidden anyway.) We could also consider ORE[4] although, I must admit, I do not remember all the details any more, but it could make it simpler. (I admit that using a specification developed in another WG has its advantages, but I would prefer to consider that on technical merit.) Ivan [4] http://www.openarchives.org/ore/1.0/vocabulary
Ivan Herman, W3C |
I'd be curious to know how this compares to any work going on with paging within LDP. The first thing that leapt out to me is that this is the minimum viable paging I am used to seeing from a typical HTTP API: total, count, offset. In this case, it has the addition of next and self for greater navigability by not relying on URL construction using the offset and count. |
It may be overkill for EPUB but it's spot on for servers, IMO. |
I had hoped that the new AS Collection model and LDP Paging would be easily integrated, but some implementation work has turned up various issues, not least that ldp:contains is not an ordered list, even if it looks like one in JSON-LD. Note the disclaimer in 7.2.1 [1] that:
So if you have 100 annotations on a page, in LDP they're all at the same rank. Conversely, as:OrderedCollection allows the use of an RDF List as the object of as:items, thus preserving in-page order. So ... my proposal is to drop support for LDP Paging in protocol, and instead use AS Collections / Pages. |
I'm not deterred by that note. It sounds like the spec authors have done their best to not overly constrain implementations with potentially difficult requirements. To me, a server that has a non-deterministic stability of the sorting when doing pagination is unfortunate but also not entirely unreasonable. In practice, such things are nearly indistinguishable, to clients, from cases where items are being inserted or deleted concurrently. Simply having an offset and a limit does not guarantee that paginating over the whole collection will return each item exactly once. Doing that requires stable collection snapshots that persist for the duration of client sessions and other such complicated stuff. So, to me, the LDP language is just being realistic and avoiding certain burdens of scale that many find untenable. In practice, it's a very reasonable paging behavior. |
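To make the point about offset and limit concrete, here is a minimal sketch in Python. It assumes a hypothetical in-memory collection ordered newest first and a hypothetical `fetch_page` helper; neither comes from LDP or AS2.0. It only illustrates how a concurrent insert can make a client see one item twice and miss another.

```python
def fetch_page(items, offset, limit):
    """Return one page of a list, the way a typical offset/limit API would."""
    return items[offset:offset + limit]


# Hypothetical collection, newest first.
collection = ["anno5", "anno4", "anno3", "anno2", "anno1"]

seen = []
seen += fetch_page(collection, offset=0, limit=2)

# A new annotation is inserted at the head between two client requests.
collection.insert(0, "anno6")

# The client keeps paging with the offsets it computed earlier.
seen += fetch_page(collection, offset=2, limit=2)
seen += fetch_page(collection, offset=4, limit=2)

duplicates = sorted({a for a in seen if seen.count(a) > 1})
missed = sorted(set(collection) - set(seen))
print("seen:", seen)
print("duplicates:", duplicates)
print("missed:", missed)
```

Avoiding this would need something like a stable snapshot per client session, which is the burden the comment above describes.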
No, it's because ldp:contains is the relationship between container and contained-item directly with no rdf:List (or other) involved. So when you ask for the response in turtle, there really is no order at all in the page. |
"In cases where ordering is important, an LDP Paging server ensures that all the members on any single page have the proper sort order with relation to all members on any next and previous pages." So, if there's an ordering the spec requires servers to honor it when paginating. Can we discuss use cases? When is it critical that a specific page return an ordered list? Are there any such cases when the client couldn't determine that order themselves if they have reason to not trust the order returned in the serialization? |
Right, but only at the page level. The items on a single page can be in any order; the guarantee is that they all sort greater than previous pages and less than next pages. So if your page size is 1000 and there's only one page, you're out of luck. As for client side sorting being impossible, how about:
|
I think the last two are compelling. I actually started refuting the first and included a caveat about secret sauce and then I read the rest of your response and found we were thinking the same thing. Still, I'm having a hard time figuring out whether or how to include this. It's attractive to me that the spec could not require ordering, but if there's a way we can recommend a particular way to signal an ordering when it does exist that would be nice. |
I'm thinking at the moment:
So something like:
{
  "@id": "http://example.org/annos/",
  "@type": ["OrderedCollection", "Container"],
  "label": "My Big Collection",
  "totalItems": 42023,
  "contains": ["anno3", "anno2", "anno4", "anno1", "anno5"],
  "first": "http://example.org/annos/?p=0",
  "last": "http://example.org/annos/?p=236"
}

{
  "@id": "http://example.org/annos/?p=0",
  "@type": "OrderedCollectionPage",
  "partOf": "http://example.org/annos/",
  "next": "http://example.org/annos/?p=1",
  "orderedItems": [
    {
      "@id": "http://example.org/annos/anno1",
      "@type": "Annotation",
      "target": "..."
    },
    "..."
  ]
} |
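A client consuming a structure like the one above could rebuild the full ordered list by starting at `first` and following `next` until it is absent. The sketch below is only an illustration under assumptions: the `DOCUMENTS` dictionary is a hypothetical stand-in for HTTP fetches, the three annotations are made up, and `iter_annotations` is not part of any specification.

```python
BASE = "http://example.org/annos/"

# Hypothetical stand-in for the server: URL -> already-parsed JSON document.
DOCUMENTS = {
    BASE: {
        "@id": BASE,
        "@type": ["OrderedCollection", "Container"],
        "totalItems": 3,
        "first": BASE + "?p=0",
        "last": BASE + "?p=1",
    },
    BASE + "?p=0": {
        "@id": BASE + "?p=0",
        "@type": "OrderedCollectionPage",
        "partOf": BASE,
        "next": BASE + "?p=1",
        "orderedItems": [{"@id": BASE + "anno1"}, {"@id": BASE + "anno2"}],
    },
    BASE + "?p=1": {
        "@id": BASE + "?p=1",
        "@type": "OrderedCollectionPage",
        "partOf": BASE,
        "orderedItems": [{"@id": BASE + "anno3"}],
    },
}


def iter_annotations(collection_url, fetch=DOCUMENTS.__getitem__):
    """Yield annotations in collection order by following first/next links."""
    collection = fetch(collection_url)
    page_url = collection.get("first")
    while page_url is not None:
        page = fetch(page_url)
        # In-page order is preserved because orderedItems is a list.
        yield from page.get("orderedItems", [])
        page_url = page.get("next")


ids = [anno["@id"] for anno in iter_annotations(BASE)]
print(ids)
```

The client never constructs a page URL itself; it only follows the links the server provides.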
Tracking: The current protocol and this issue would be affected if w3c/activitystreams#221 is accepted to remove paging from the AS model. |
How based on
{
  "@id": "http://example.org/annos/?p=0",
  "@type": "OrderedCollectionPage",
  "partOf": "http://example.org/annos/",
  "next": "http://example.org/annos/?p=1",
  "orderedItems": [
    {
      "@id": "http://example.org/annos/anno1",
      "@type": "Annotation",
      "target": "..."
    },
    "..."
  ]
}
Client can assert that
I haven't noticed in AS2.0 any rules for inferencing based on as:partOf |
Following up on my question from w3c/activitystreams#221 (comment): if we separate the concern of HTTP access to the dataset, let's say by providing a single file for download and linking to it with void:dataDump, and this way get rid of all the API-specific terms from the LDP namespace used in the Web Annotation examples, does it still need the paging mechanism from AS2.0, even if we can access the whole dataset directly from device memory? |
@elf-pavlik I don't see ordering, collections, or even paging as an API specific feature--and dearly want Collection style stuff spec'd somewhere or other. In my current desired use case, I'm wanting to add "static" annotations to a filesystem (think Jekyll sites on GitHub Pages, for example), and would like to include paged collections of them in "blog order" (newest first) as I would with RSS and Atom Feeds. If the single collection file grows unwieldy, I would (reasonably) want to paginate them and/or break them into related collections (by month, week, etc). Expressing all of that should be possible, is not specific to API usage, and would sure be handy to have defined somewhere. 😄 Right now, it seems that ActivityStreams 2.0 OrderedCollection and OrderedCollectionPage do come the closest to the above use case requirements. Given @tilgovi's scalability points (and the likelihood of one or more annotations appearing in one or more pages when using A fun problem to be sure. 😃 |
@BigBlueHat that sounds like it goes in the direction of https://www.w3.org/Social/track/issues/24. Does each annotation have a logical relationship to the whole big collection, with paging used just to send smaller chunks over the network? Or does each annotation have an important logical relationship to a particular page, not directly to the whole collection, so that even with the whole dataset loaded in memory you still want to preserve this exact page structure? |
@elf-pavlik yeah. It's not at all dissimilar. 😃 You can think of it a bit like "rolling log files." The importance of the page they are on only depends on how you're doing paging. Granted, semantically--in the log file case--the pages are less pages than sub-collections...however, other than directionality between the pages there's already minimal difference between a Collection and a Page (both in general and in the AS2 vocab specifically). Even if the whole annotation collection were loaded into memory, you may still want them paginated for display--even if you're merely paginating on the total length of the collection divided by # of items per page. Does that explain the value of paging in a static site style use case? |
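The "total length divided by number of items per page" case can be sketched in a few lines of Python. The `paginate` helper and the seven annotation names are hypothetical and only illustrate display-style paging over a collection that is already fully in memory.

```python
import math


def paginate(items, per_page):
    """Split a list into consecutive pages of at most per_page items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


# Seven hypothetical annotations, already loaded in memory.
annotations = [f"anno{n}" for n in range(1, 8)]
per_page = 3

pages = paginate(annotations, per_page)
page_count = math.ceil(len(annotations) / per_page)

for number, page in enumerate(pages):
    print(number, page)
```

Here the pages are derived purely from the list and the page size, so they carry no meaning of their own beyond chunking.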
I try to keep a distinction between data model, API and UI. As I understand it, each Annotation has a logical relation with a List of annotations (in this case ordered), and pages have no other purpose than to split this list into small chunks when accessed over the network. I think this question may help clarify it: if we add more annotations to the list, can they move between pages? Or does an annotation have a permanent relationship with a page, which can only change by an intentional operation ("Move this annotation from this page to this page") and, once again, not as a side effect of adding annotations to the whole list? Actually, in such a case one should never insert annotations directly into the list but always interact only with pages, and use the List as a collection of pages, not a collection of annotations! Pages become here collections of annotations. |
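The question of whether annotations can move between pages can be illustrated with a small Python sketch. It assumes pages are derived from a newest-first list with a fixed page size; the `paginate` and `page_of` helpers and the annotation names are hypothetical.

```python
def paginate(items, per_page):
    """Split a list into consecutive pages of at most per_page items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]


def page_of(pages, item):
    """Return the index of the page containing item, or None if absent."""
    for index, page in enumerate(pages):
        if item in page:
            return index
    return None


# Hypothetical list in "blog order" (newest first), two items per page.
newest_first = ["anno4", "anno3", "anno2", "anno1"]
before = paginate(newest_first, 2)

# Adding one annotation to the list shifts everything behind it.
newest_first.insert(0, "anno5")
after = paginate(newest_first, 2)

print("anno3 page before:", page_of(before, "anno3"))
print("anno3 page after:", page_of(after, "anno3"))
```

Under this assumption the page an annotation lands on is a side effect of the list contents. If pages were instead treated as fixed sub-collections, the insert would have to target a specific page rather than the list.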
@elf-pavlik good points, and very good question for re-framing the discussion! It's sort of what I was after when I was referring to Pages "merely" as Collections with some siblings and a position among them--such that "next" and "previous" could be used for finding the related siblings. In LDP paging the relationships are a bit clearer (maybe). In that case, a "page" has a greater than / less than relationship with other pages, but the items within it do not themselves have order--as
@azaroth42 iirc, there was some reason we felt that non-in-page ordering done this way was insufficient and that knowing that position of the annotation within the wider collection--without the provision of additional statements was important...but I don't honestly recall what that was now. Maybe you do? 😄 Schema.org also lacks the notion of "pages" per se, and instead uses |
Dependency on #92. It doesn't make sense to me to have both oa:List and as:OrderedCollection in the model when the structure is identical and the semantics so close as to make no difference. |
Just to double check my interpretation of as:OrderedCollection and as:OrderedCollectionPage, which I understand Web Annotations will use (https://www.w3.org/TR/activitystreams-core/#collections): each page uses as:partOf to reference the collection which got 'broken into pages'. Each page also uses as:items to reference each item/member of the particular OrderedCollectionPage. In that case each item/member DOES NOT have a relationship with the whole OrderedCollection which one can express with a single predicate/property defined in the AS 2.0 Vocabulary (or its owl:inverseOf). Expressing the relationship between the instance of as:PagedCollection and each item/member seems to require combining two properties: (owl:inverseOf) as:partOf and as:items. In such a case, maybe it would make sense to define an owl:propertyChainAxiom. |
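What such a property chain would let a client derive can be sketched without an OWL reasoner. The Python below is an illustration only: the triples are hypothetical, as:items is simplified to point directly at each member (ignoring any rdf:List structure), and `members_of` is not something AS 2.0 defines.

```python
# Hypothetical (subject, predicate, object) triples: one collection, two pages.
TRIPLES = [
    ("page0", "as:partOf", "annos"),
    ("page1", "as:partOf", "annos"),
    ("page0", "as:items", "anno1"),
    ("page0", "as:items", "anno2"),
    ("page1", "as:items", "anno3"),
    ("other-page", "as:partOf", "other-collection"),
    ("other-page", "as:items", "anno9"),
]


def members_of(collection, triples):
    """Compose inverse(as:partOf) with as:items to find a collection's members."""
    # Step 1: inverse of as:partOf, i.e. the pages that belong to the collection.
    pages = {s for s, p, o in triples if p == "as:partOf" and o == collection}
    # Step 2: as:items of those pages.
    return {o for s, p, o in triples if p == "as:items" and s in pages}


members = members_of("annos", TRIPLES)
print(sorted(members))
```

The two steps in `members_of` correspond to the two properties that would be combined in the chain.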
@elf-pavlik Yes. I'm working on writing it up for the annotation model, vocab and protocol this week. And, to avoid multiple possible responses, the consensus was that we would only use the paged model, as annotation collections are more likely to be very large than very small. |
Several downstream systems have a need for lists of annotations, including EPUB [1] and IIIF [2]. For search, we need to have a list of annotations for the result set of applying the query to the set of annotations. Other expressed use cases are user constructed "playlists" of Annotations, curated distribution lists of annotations, and general optimization of annotation retrieval to avoid thousands of HTTP calls for each annotation individually.
As an initial proposal, we could use Activity-Streams's OrderedCollection class [3], which seems to fulfill the (implicit, to be expressed) requirements:
This would be consistent with a (to-be-proposed) use of AS2.0 for notifications about annotation activity.
[1] http://www.idpf.org/epub/oa/#h.48f1o3s9o9hf
[2] http://iiif.io/api/presentation/2.0/#other-content-resources
[3] http://www.w3.org/TR/activitystreams-core/#collections