Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gateway directory listings should be paginated #8455

Open
3 tasks done
guseggert opened this issue Sep 22, 2021 · 5 comments
Open
3 tasks done

Gateway directory listings should be paginated #8455

guseggert opened this issue Sep 22, 2021 · 5 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature topic/gateway Topic gateway

Comments

@guseggert
Copy link
Contributor

Checklist

  • My issue is specific & actionable.
  • I am not suggesting a protocol enhancement.
  • I have searched on the issue tracker for my issue.

Description

To render the directory listing page, go-ipfs sequentially fetches blocks for every directory entry. For large directories, this takes a very long time (see e.g. #7588). The gateway should paginate this listing so that there's a reasonable upper bound on the time it takes to return a response, and to allow more even load distribution across gateway fleets. I'd suggest some query args to control page size and offset, with an upper bound of 20 on the page size (this upper bound could be configurable).

@guseggert guseggert added the kind/enhancement A net-new feature or improvement to an existing feature label Sep 22, 2021
@lidel lidel added the topic/gateway Topic gateway label Sep 23, 2021
@lidel
Copy link
Member

lidel commented Sep 23, 2021

This would require a serious refactor of how https://github.com/ipfs/dir-index-html works
(which we already want to do, but is a bigger adventure).

Given that we want to improve IPLD support on gateways (ipfs/in-web-browsers#182),
we should do this type of thing in a generic way that works for all DAG types,
and lazy-load additional Size and Type information when the DAG is unixfs.

I envision replacing unixfs-specific dir-index-html with "IPLD Explorer v2" that shows generic DAG view by default, but has specially-crafted variants for most popular codecs like dag-pb (unixfs) and leverages something like ?format=unixfs-info (#8234) for lazy-loading additional metadata about Size and Type only for items visible on the page.

@guseggert
Copy link
Contributor Author

I agree with the bigger picture, but this also seems relatively low effort and addresses an availability risk. Assuming the "quick fix" is straightforward, I think it makes sense to do both (quick fix now, generic fix later).

We should also add some metrics around this, because from what I can tell, we don't have good visiblity into how much this contributes to aggregate metrics like latency, TTFB, etc.

@Stebalien
Copy link
Member

If we're just trying to make non-sharded directories faster, #8178 is probably a simpler short-term solution.

Eventually, we'll likely need pagination for sharded directories. But we'll need to add the ability to "seek" which will require some design work.

@RangerMauve
Copy link

Some stuff discussed with Lidel that might be useful for consideration:

  • Specify a basic pagination spec which uses query strings
  • ?page=0&limit=100
  • If a gateway thinks there's "too many" entries (I think >100) it can send a 302 redirect pointing to the first "page"
  • Applications aware of the pagination API could then request later pages or larger limits
  • The generated HTML directory listing could have buttons for incrementing the page or changing the limit (basic HTML <form>?)

The pagination should try to use regular traversal and account for whatever ADLs exist at that point including HAMTs.

@lidel
Copy link
Member

lidel commented Jun 24, 2022

I also wrote some notes about an alternative approach, which essentially removes the need for pagination: #9058 and included both in HTML Gateway specs under best practices section (ipfs/specs@9fc9a9c)

lidel added a commit to ipfs/specs that referenced this issue Jun 28, 2022
lidel added a commit to ipfs/specs that referenced this issue Jul 1, 2022
* feat: initial HTTP gateway specs

This adds gateway specs under ./http-gateways directory.

The aim is to document _current_ behavior (implementation in go-ipfs 0.13)
and switch the way we do the gateway work to be specs-driven.

Long term goal is to provide language and implementation agnostic
specification that anyone can use to implement compatible gateways.

* gateway: add Content-Range

* gateway: registerProtocolHandler uri router

* CODEOWNERS: add lidel for ./http-gateways

* gateway: resolving an advanced DNSLink chain

* gateway: only-if-cached HEAD behavior

* gateway: suggestions from reviewers

Co-authored-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
Co-authored-by: Vasco Santos <vasco.santos@moxy.studio>
Co-authored-by: Oli Evans <oli.evans@gmail.com>

* gateway: include CIDv1 node in summary

* gateway: reorder URI router section

As suggested in #283 (comment)

* gateway: add Denylists section

* gateway: switch only-if-cached miss to 412

Rationale: ipfs/kubo#8783 (comment)

* gateway: apply suggestions from review

Co-authored-by: Thibault Meunier <thibmeu@users.noreply.github.com>

* gateway: apply suggestions from Cloudflare

#283 (review)

* gateway: add X-Content-Type-Options

* gateway: simplify dnslink summary

https://github.com/ipfs/specs/pull/283/files#r898709569

* gateway: document 412 Precondition Failed

https://github.com/ipfs/specs/pull/283/files#r898686654

* gateway: link to ipld codecs explainer

https://github.com/ipfs/specs/pull/283/files#r898687052

* gateway: stub about handling traversal errors

https://github.com/ipfs/specs/pull/283/files#r892845860

* gateway: expand HTTP caching considerations

* gateway: editorial fixes

Co-authored-by: Steve Loeppky <stvn@loeppky.com>

* gateway: expand on Host header parsing

https://github.com/ipfs/specs/pull/283/files#r898703765

* gateway: editorial fixes

* gateway: X-Forwarded-Proto and X-Forwarded-Host

* gateway: editorial fixes

* gateway: X-Trace-Id

optional header suggested in:
#283 (comment)

rationale: having specific name as a suggestion of 'best practice' in
the specs will simplify debugging across ecosystem

* gateway: Generated HTML with directory index

Synthesis of ideas from:
ipfs/kubo#8455
and
ipfs/kubo#9058

Co-authored-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
Co-authored-by: Vasco Santos <vasco.santos@moxy.studio>
Co-authored-by: Oli Evans <oli.evans@gmail.com>
Co-authored-by: Thibault Meunier <thibmeu@users.noreply.github.com>
Co-authored-by: Steve Loeppky <stvn@loeppky.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement A net-new feature or improvement to an existing feature topic/gateway Topic gateway
Projects
None yet
Development

No branches or pull requests

4 participants