Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres improvements #782

Closed
3 of 11 tasks
dchoi27 opened this issue Nov 14, 2021 · 5 comments
Closed
3 of 11 tasks

Postgres improvements #782

dchoi27 opened this issue Nov 14, 2021 · 5 comments
Assignees
Labels
kind/enhancement A net-new feature or improvement to an existing feature P2 Medium: Good to have, but can wait until someone steps up topic/tg

Comments

@dchoi27
Copy link
Contributor

dchoi27 commented Nov 14, 2021

We have Eric from Trigram working on our current postgres setup including schema, config, migrations.

Tasks:

  • Repo setup
  • Baseline metrics (heroku dashboard)
  • Baseline benchmarks with local setup mirroring heroku
    • POST /upload, POST /pins, GET /check
  • Test out table partition to enable better scaling
  • Analise postgres config
  • Analise current postgres indexes
  • Cloudflare sync for the new db connectors
  • Setup https://pganalyze.com/ for our dbs
  • Add migration system
  • Split out read to read replica
@dchoi27 dchoi27 added kind/enhancement A net-new feature or improvement to an existing feature P2 Medium: Good to have, but can wait until someone steps up labels Nov 14, 2021
@hugomrdias hugomrdias changed the title Low hanging fruit Postgres config changes Postgres improvements Nov 16, 2021
@hugomrdias hugomrdias pinned this issue Nov 16, 2021
@hugomrdias
Copy link
Contributor

@yocontra can you please review the list above and give some feedback if it makes sense ?

feel free to add remove etc

@JeffLowe JeffLowe unpinned this issue Nov 18, 2021
@yocontra
Copy link
Contributor

yocontra commented Nov 22, 2021

Here's my current TODO list for my first PR of DB changes:

  • Basic analysis of production index usage, heroku stats to figure out where out issues are

  • Create partition schemas for pin, upload, and content tables

    • Ended up going with hash partitioning into 100 tables by CID - this should speed up reads/writes by isolating IO into smaller tables with ~20K rows. You will be able to grow these tables up to 2M rows before you need to revisit and consider more partitions, so this should be acceptable for a long while.
  • Optimize the create_upload function

  • Optimize basic DB configuration flags (specifically when autovacuum should run, table statistics, etc. to keep indexes cleaner than they are now since this is write-heavy workload)

  • Write migrations for production/staging to move data from the existing single tables into the new partitioned model

  • Create benchmark system in-repo and create a report comparing before vs. after the changes

TODO for future PRs:

  • Postgres index analysis + make changes based on unused indexes
  • Move all read queries to the read replica
  • Use a connection pooler like pgbouncer or https://github.com/yandex/odyssey
  • Set up pganalyze (or pgbadger) to get better statistics on the production DB
  • Set up a migration system for production and staging DB changes
    • sqitch suggested by @hugomrdias
    • Also looking at liquibase, and flyway
  • New cloudflare worker DB connection work if needed (factor out PostgREST #829)
  • Facilitate switching DB providers if needed

@yocontra
Copy link
Contributor

yocontra commented Nov 22, 2021

One thing that surfaced is PostgREST isn't able to deal with partitioned tables correctly - after implementing partitioning the query generator blows up (ends up doing a no-op and running no query, with no error reported) doing a basic relational query on the table after the create_upload function succeeds. Issue was already reported and the PR to fix it was merged over the weekend, so I'll go ahead and update to latest and see if it resolves the query generator issue.

UPDATE: Need to wait for a new PostgREST release - fix is not available in any docker images yet, and I don't want to roll our own custom image by pulling binaries out of their GitHub Actions which seems hacky.

@yocontra
Copy link
Contributor

Latest PostgREST (9.0.0) fixes the partition issue so unblocked there. Tests are all passing fine with the new partition changes.

On the DB configuration front, Heroku only lets you set basic flags so most of that will be skipped except what can be tuned in the database itself. If you want to tune the database more (and generally have more control, make it cheaper) I would recommend using a service like Aiven for managed PG. If we were able to set flags, this is what the configuration would mostly look like:

image

@hugomrdias hugomrdias added the Epic label Dec 3, 2021
@dchoi27 dchoi27 added need/update manually applied to any needing verbal update during meetings and removed Epic labels Jan 10, 2022
@dchoi27 dchoi27 added need/update manually applied to any needing verbal update during meetings and removed need/update manually applied to any needing verbal update during meetings labels Jan 24, 2022
@dchoi27 dchoi27 added need/update manually applied to any needing verbal update during meetings and removed need/update manually applied to any needing verbal update during meetings labels Feb 4, 2022
@dchoi27 dchoi27 removed the need/update manually applied to any needing verbal update during meetings label Feb 14, 2022
@JeffLowe
Copy link
Contributor

JeffLowe commented Apr 7, 2022

Closing this as complete for nft.storage. Optimizations are live.
The new Uploads V2 architecture is going to remove the bottleneck on the db so most of the remaining optimizations of the current architecture are not necessary.
Remaining step on this is to port changes over to web3 (captured here web3-storage/web3.storage#615

@JeffLowe JeffLowe closed this as completed Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement A net-new feature or improvement to an existing feature P2 Medium: Good to have, but can wait until someone steps up topic/tg
Projects
None yet
Development

No branches or pull requests

4 participants