feat: improve guidance on total counts (#731) (#788)
* feat: improve guidance on total counts (#731)

Signed-off-by: tkrop <tronje.krop@zalando.de>

* feat: improve wording as suggested (#731)

Co-authored-by: Miha Lunar <mlunar@gmail.com>

---------

Signed-off-by: tkrop <tronje.krop@zalando.de>
Co-authored-by: Miha Lunar <mlunar@gmail.com>
Tronje Krop and SmilyOrg authored Nov 21, 2023
1 parent b916b92 commit 6555c13
Showing 3 changed files with 55 additions and 30 deletions.
2 changes: 1 addition & 1 deletion chapters/http-headers.adoc
@@ -43,7 +43,7 @@ components:
Default:
headers:
ETag:
$ref: '#/components/(parameters|headers)/ETag
$ref: '#/components/(parameters|headers)/ETag'
----

*Note:* It is a question of taste whether headers for responses are defined in
76 changes: 47 additions & 29 deletions chapters/pagination.adoc
@@ -6,27 +6,31 @@
== {MUST} support pagination

Access to lists of data items must support pagination to protect the service
against overload as well as to support client side iteration and batch processing
experience. This holds true for all lists that are (potentially) larger than
just a few hundred entries.
against overload as well as to support client side iteration and batch
processing experience. This holds true for all lists that are (potentially)
larger than just a few hundred entries.

There are two well known page iteration techniques:

* **Offset-based pagination**: numeric offset identifies the first page-entry
* **Cursor-based pagination** — aka key-based pagination: a unique key identifies the first page-entry
(see also https://dev.twitter.com/overview/api/cursoring[Twitter API] or
* **Cursor-based pagination** — aka key-based pagination: a unique key
identifies the first page-entry (see also
https://dev.twitter.com/overview/api/cursoring[Twitter API] or
https://developers.facebook.com/docs/graph-api/results[Facebook API])

The technical conception of pagination should also consider user experience (see
https://www.smashingmagazine.com/2016/03/pagination-infinite-scrolling-load-more-buttons/[Pagination Usability Findings In eCommerce]),
for instance, jumping to a specific page is far less used than navigation via {next}/{prev}
page links (see <<161>>). This favors an API design using cursor-based instead of
offset-based pagination -- see <<160>>.
:smashing-pagination: https://www.smashingmagazine.com/2016/03/pagination-infinite-scrolling-load-more-buttons/

The technical conception of pagination should also consider user experience
(see {smashing-pagination}[Pagination Usability Findings In eCommerce]), for
instance, jumping to a specific page is far less used than navigation via
{next}/{prev} page links (see <<161>>). This favors an API design using
cursor-based instead of offset-based pagination -- see <<160>>.

**Note:** To provide a consistent look and feel of pagination patterns,
you must stick to the common query parameter names defined in <<137>>.



[#160]
== {SHOULD} prefer cursor-based pagination, avoid offset-based pagination

@@ -38,10 +42,10 @@ Before choosing cursor-based pagination, consider the following trade-offs:

* Usability/framework support:
** Offset-based pagination is more widely known than cursor-based pagination,
so it has more framework support and is easier to use for API clients
so it has more framework support and is easier to use for API clients.
* Use case - jump to a certain page:
** If jumping to a particular page in a range (e.g., 51 of 100) is really a
required use case, cursor-based navigation is not feasible.
required use case, cursor-based navigation may not be feasible.
* Data changes may lead to anomalies in result pages:
** Offset-based pagination may create duplicates or lead to missing entries
if rows are inserted or deleted between two subsequent paging requests.
@@ -52,23 +56,23 @@ Before choosing cursor-based pagination, consider the following trade-offs:
** Very big data sets, especially if they cannot reside in the main memory of
the database.
** Sharded or NoSQL databases.
* Cursor-based navigation may not work if you need the total count of results.
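
The anomaly named in the trade-offs above can be illustrated with a minimal sketch. It assumes an in-memory list sorted by a unique numeric key; the helper names `page_by_offset` and `page_by_key` are hypothetical and only serve the illustration.

```python
import bisect

def page_by_offset(rows, offset, limit):
    """Offset-based paging: position is a row count from the start."""
    return rows[offset:offset + limit]

def page_by_key(rows, after_key, limit):
    """Key-based paging: position is the last key the client has seen.

    Assumes `rows` is sorted ascending by a unique key.
    """
    start = 0 if after_key is None else bisect.bisect_right(rows, after_key)
    return rows[start:start + limit]

rows = [10, 20, 30, 40]
first_offset_page = page_by_offset(rows, 0, 2)   # [10, 20]
first_key_page = page_by_key(rows, None, 2)      # [10, 20]

# A new row is inserted before the second request is made.
bisect.insort(rows, 5)                           # rows == [5, 10, 20, 30, 40]

# Offset 2 now points at 20 again, so the client sees a duplicate.
second_offset_page = page_by_offset(rows, 2, 2)
# Continuing after the last seen key is unaffected by the insert.
second_key_page = page_by_key(rows, first_key_page[-1], 2)
```

With offset-based paging the second page repeats the entry `20`, while the key-based variant continues with `30` and `40`.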

The {cursor} used for pagination is an opaque pointer to a page, that must
never be *inspected* or *constructed* by clients. It usually encodes (encrypts)
the page position, i.e. the unique identifier of the first or last page element, the
pagination direction, and the applied query filters (or a hash over these) to safely
recreate the collection (see also best practice <<cursor-based-pagination>> below).
the page position, i.e. the unique identifier of the first or last page
element, the pagination direction, and the applied query filters (or a hash
over these) to safely recreate the collection (see also best practice
<<cursor-based-pagination>> below).
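
A minimal sketch of how such an opaque cursor could be built, assuming the page position is a string identifier and the filters are a flat dictionary. The names `encode_cursor`, `decode_cursor`, `SECRET`, and the payload field names are hypothetical. The sketch signs the cursor with an HMAC instead of encrypting it, which protects against construction by clients but does not hide the content.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical server-side secret; a real service would load it from configuration.
SECRET = b"example-secret"

def _filters_hash(filters):
    """Hash the applied query filters in a canonical (sorted) form."""
    canonical = json.dumps(filters, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def encode_cursor(position, direction, filters):
    """Pack position, direction, and a filter hash into a signed token."""
    payload = json.dumps(
        {
            "position": position,
            "direction": direction,
            "query_hash": _filters_hash(filters),
        },
        sort_keys=True,
    ).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(signature + payload).decode()

def decode_cursor(cursor, filters):
    """Verify the signature and that the cursor belongs to the same query."""
    raw = base64.urlsafe_b64decode(cursor.encode())
    signature, payload = raw[:32], raw[32:]  # a SHA-256 digest is 32 bytes
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("cursor was not issued by this service")
    data = json.loads(payload)
    if data["query_hash"] != _filters_hash(filters):
        raise ValueError("cursor does not match the applied query filters")
    return data
```

Only the server calls `decode_cursor`; clients pass the token back unchanged, for example as the value of the cursor query parameter.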


[#248]
== {SHOULD} use pagination response page object

[[pagination-fields]]
For iterating over collections (result sets) we propose to either use cursors
(see <<160>>) or simple hypertext control links (see <<161>>). To implement these
in a consistent way, we have defined a response page object pattern with the
following field semantics:
(see <<160>>) or simple hypertext control links (see <<161>>). To implement
these in a consistent way, we have defined a response page object pattern with
the following field semantics:

* [[self]]{self}: the link or cursor pointing to the same page.
* [[first]]{first}: the link or cursor pointing to the first page.
@@ -136,14 +140,15 @@ ResponsePage:
----

*Note:* While you may support cursors for {next}, {prev}, {first}, {last}, and
{self}, it is best practice to replace these with pagination links -- see <<161>>.
{self}, it is best practice to replace these with pagination links -- see
<<161>>.


[#161]
== {SHOULD} use pagination links

To simplify client design, APIs should support <<165, simplified hypertext controls>>
as standard pagination links where applicable:
To simplify client design, APIs should support <<165, simplified hypertext
controls>> as standard pagination links where applicable:

[source,json]
----
@@ -161,11 +166,24 @@ as standard pagination links where applicable:
}
----

See also <<248>> for details on the pagination fields and page result object.
See also <<248>> for details on the pagination fields and page result object.


[#254]
== {SHOULD} avoid a total result count

In pagination responses you should generally avoid providing a _total result
count_, since calculating it is a costly operation that is usually not required
by clients. Counting the total number of results for complex queries usually
requires a full scan of all involved indexes, as it is difficult to calculate
and cache it in advance. While this is only an implementation detail, it is
important to consider that providing these total counts over the life-span
of a service might become expensive as the data set grows over time.

As clients may integrate against these counts over time alongside data
set growth, removing them will be more difficult than not providing them
in the first place.

*Remark:* You should avoid providing a total count unless there is a clear
need to do so. Very often, there are significant system and performance
implications when supporting full counts. Especially, if the data set grows
and requests become complex queries and filters drive full scans. While this
is an implementation detail relative to the API, it is important to consider
the ability to support serving counts over the life of a service.
If your consumer really requires a total result count in the response, you may
support this requirement via the {Prefer} header adding the directive
`return=total-count` (see also <<181>>).
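
A minimal sketch of how a service might honor this directive, assuming a framework-independent handler that receives the raw `Prefer` header value. The function names and the `total_count` response field are hypothetical; `count_total` stands for whatever (potentially expensive) counting routine the service has. Following RFC 7240, only the first occurrence of a preference is considered, and an applied preference is confirmed via the `Preference-Applied` response header.

```python
from typing import Callable, Dict, List, Optional, Tuple

def parse_prefer(header_value: str) -> Dict[str, str]:
    """Parse a Prefer header into a mapping of preference name to value."""
    prefs: Dict[str, str] = {}
    for part in header_value.split(","):
        # Drop optional parameters that follow a ';'.
        token = part.split(";", 1)[0].strip()
        if not token:
            continue
        name, _, value = token.partition("=")
        # Keep only the first occurrence of each preference.
        prefs.setdefault(name.strip().lower(), value.strip().strip('"').lower())
    return prefs

def wants_total_count(prefer_header: Optional[str]) -> bool:
    """True if the client asked for a total count via return=total-count."""
    if not prefer_header:
        return False
    return parse_prefer(prefer_header).get("return") == "total-count"

def build_page(
    items: List[object],
    prefer_header: Optional[str],
    count_total: Callable[[], int],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Build the response body and headers; count only when requested."""
    body: Dict[str, object] = {"items": items}
    headers: Dict[str, str] = {}
    if wants_total_count(prefer_header):
        body["total_count"] = count_total()  # hypothetical field name
        headers["Preference-Applied"] = "return=total-count"
    return body, headers
```

Because the count is computed lazily, requests without the directive never pay for it, and the service remains free to ignore the preference, for example for very large result sets.
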
7 changes: 7 additions & 0 deletions models/headers-1.0.0.yaml
@@ -107,6 +107,9 @@ If-None-Match:


Prefer:
# Do not import this schema directly, since processing directives are usually
# highly customized. Instead, copy the schema to your API and adjust it to
# your needs.
name: Prefer
in: header
required: false
@@ -123,6 +126,10 @@ Prefer:
return using **204** (No Content) without resource (minimal) or using
**200** or **201** with resource (representation) in the response body on
success.
* **return=<total-count>** is used to suggest the server to return a total
result count in collection requests supporting pagination. Since this
is a costly operation, it should be used with care, and the service may
decide to ignore this request.
* **wait=<delta-seconds>** is used to suggest a maximum time the server has
time to process the request synchronously.
* **handling=<strict|lenient>** is used to suggest the server to be strict