Serve client API from exported GCS bucket #8133

Open
8 tasks
jonasfj opened this issue Oct 11, 2024 · 1 comment

Comments

@jonasfj
Member

jonasfj commented Oct 11, 2024

We can point GCLB at a prefix within a GCS bucket, so ideally we should be able to export responses for everything matching:

  • GET /api/packages/<package>
  • GET /api/packages/<package>/advisories
  • GET /api/archives/<package>-<version>.tar.gz

Export them into a GCS bucket, under prefixes:

  • /latest/ (all deployments will write to this concurrently)
  • /<runtimeVersion>/ (each deployment will have its own prefix)

Then GCLB can be pointed to /latest/ for URLs matching the above routes.

If a new deployment of pub.dev starts producing garbled output in /latest/, we can update GCLB to point to /<runtimeVersion>/ from the previous deployment. That way we still have a fallback mechanism.
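To make the prefix layout concrete, here is a minimal sketch of how a request path might map to object names in the bucket. It is written in Python purely for illustration; the function name `exported_object_names`, the example runtime version string, and the choice to name objects as "prefix + request path" are all assumptions, not anything that exists in pub.dev today.

```python
def exported_object_names(request_path: str, runtime_version: str) -> list[str]:
    """Return the object names one deployment would write for an API route.

    Assumption: the object name is the prefix followed by the unchanged
    request path, so GCLB only has to rewrite a URL onto one prefix.
    """
    if not request_path.startswith("/api/"):
        raise ValueError(f"not an exported route: {request_path}")
    return [
        f"latest{request_path}",             # written concurrently by all deployments
        f"{runtime_version}{request_path}",  # written only by this deployment
    ]


# Hypothetical package name and runtime version, for illustration only.
names = exported_object_names("/api/packages/retry", "2024.10.08")
```

With this layout, falling back to a previous deployment would only mean changing the prefix GCLB rewrites to; the rest of the object name stays the same.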

To do this we'll need the following:

  • An abstraction around the bucket containing exported API responses.
    • Must be able to GC old /<runtimeVersion>/.
    • Must be able to GC deleted packages.
      (Should probably wait until they are 1 day old, to avoid race conditions between creation/deletion).
  • An updater function that, given a package name, ensures all affected responses are updated.
    • Ideally, this can compute a hash of the contents and avoid unnecessary updates.
  • A backend that runs continuously (similar to how taskBackend works) and scans for packages that have been updated, calling the updater function.
  • Calls to updater function when a new version is published, retracted, moderated, deleted, etc.
  • Tests for all components.
  • Provisioning of new bucket.
  • Mapping of traffic on staging to bucket using GCLB.
  • Mapping of traffic in production to bucket using GCLB.
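
As a rough sketch of how the bucket abstraction and the updater function could fit together, the Python below uses an in-memory dict in place of GCS. Everything named here (`ExportedBucket`, `write_if_changed`, `gc_runtime_versions`, `update_package`, and the `render` callback) is hypothetical, and the 1-day delay for GC of deleted packages is left out.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ExportedBucket:
    """In-memory stand-in for the bucket of exported API responses."""

    objects: dict[str, bytes] = field(default_factory=dict)

    def write_if_changed(self, name: str, body: bytes) -> bool:
        """Store body under name unless identical content is already there."""
        old = self.objects.get(name)
        if old is not None and hashlib.sha256(old).digest() == hashlib.sha256(body).digest():
            return False  # same hash: skip the unnecessary update
        self.objects[name] = body
        return True

    def gc_runtime_versions(self, keep: set[str]) -> list[str]:
        """Delete objects under any /<runtimeVersion>/ prefix not in keep."""
        live = keep | {"latest"}  # never collect the shared prefix
        doomed = sorted(n for n in self.objects if n.split("/", 1)[0] not in live)
        for name in doomed:
            del self.objects[name]
        return doomed


def update_package(
    bucket: ExportedBucket,
    package: str,
    runtime_version: str,
    render: Callable[[str], dict[str, bytes]],
) -> int:
    """Re-export every response for package; return the number of writes.

    render is assumed to map a package name to {request path: response body}.
    """
    writes = 0
    for path, body in render(package).items():
        for prefix in ("latest", runtime_version):
            if bucket.write_if_changed(f"{prefix}{path}", body):
                writes += 1
    return writes
```

Calling `update_package` twice with unchanged output performs no writes the second time, which is the property the hash comparison is meant to provide. A real implementation would probably compare against a hash kept in the object's metadata instead of re-reading the stored body.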

@sigurdm
Contributor

sigurdm commented Oct 11, 2024

Does this mean that API responses can only be updated reliably after all old deployments are gone?
