BlobStore for retrieving raw blob data from the db #484

Merged: 11 commits, Aug 9, 2023

@sandreae (Member) commented Aug 4, 2023

Caveat: I'm still on the fence about whether we want to store blobs like this (all pieces as their own documents), but I wanted to try out this approach first because a) it fits smoothly into our current documents system and b) it's easy to do, and the API won't really change if we take a different approach to storing blobs anyway. So this is the minimal-changes approach, and we can iterate from here 👍

Introduces BlobStore with one method: get_blob.

Under the hood this:

  • gets the target document by id
  • checks its schema id is blob_v1
  • fetches all related pieces in its pieces field
  • validates that all pieces have been collected
  • combines them into one string
  • validates that the length matches the claimed length on the blob document
  • returns this string
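
The steps above can be sketched roughly as follows. This is a minimal illustration over an in-memory map, not the code from this PR: `Document`, `Fields`, `BlobError`, the `HashMap`-backed `BlobStore`, and the `Ok(None)` return for an unknown id are all hypothetical stand-ins, and the length check assumes the claimed length is counted in bytes.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a materialised document (not the aquadoggo type).
#[derive(Debug, Clone)]
pub struct Document {
    pub schema_id: String,
    pub fields: Fields,
}

/// Hypothetical field layouts for the two kinds of document used here.
#[derive(Debug, Clone)]
pub enum Fields {
    /// A blob document: claimed total length plus the ordered ids of its pieces.
    Blob { length: usize, pieces: Vec<String> },
    /// A single piece carrying part of the blob data.
    Piece { data: String },
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum BlobError {
    NotABlob,
    MissingPieces,
    IncorrectLength { claimed: usize, actual: usize },
}

/// In-memory store, assumed here in place of a real database.
pub struct BlobStore {
    documents: HashMap<String, Document>,
}

impl BlobStore {
    pub fn new(documents: HashMap<String, Document>) -> Self {
        Self { documents }
    }

    pub fn get_blob(&self, id: &str) -> Result<Option<String>, BlobError> {
        // Get the target document by id; an unknown id yields `None`.
        let document = match self.documents.get(id) {
            Some(document) => document,
            None => return Ok(None),
        };

        // Check that its schema id is blob_v1.
        if document.schema_id != "blob_v1" {
            return Err(BlobError::NotABlob);
        }
        let (length, pieces) = match &document.fields {
            Fields::Blob { length, pieces } => (*length, pieces),
            Fields::Piece { .. } => return Err(BlobError::NotABlob),
        };

        // Fetch every piece listed in the pieces field, failing if one is
        // missing, and combine them into one string in order.
        let mut data = String::new();
        for piece_id in pieces {
            match self.documents.get(piece_id).map(|doc| &doc.fields) {
                Some(Fields::Piece { data: chunk }) => data.push_str(chunk),
                _ => return Err(BlobError::MissingPieces),
            }
        }

        // Validate the combined length (in bytes) against the claimed length.
        if data.len() != length {
            return Err(BlobError::IncorrectLength {
                claimed: length,
                actual: data.len(),
            });
        }

        Ok(Some(data))
    }
}
```

With a store holding one `blob_v1` document and its pieces, `store.get_blob("some_id")` would return the concatenated string, while a missing piece or a length mismatch surfaces as an error rather than as truncated data.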

This is done without introducing a new Blob type into p2panda-rs (as was the approach with Schema, our other system schema type). I actually think this works OK for now, but I'm interested in any thoughts on this.

📋 Checklist

  • Add tests that cover your changes
  • Add this PR to the Unreleased section in CHANGELOG.md
  • Link this PR to any issues it closes
  • New files contain a SPDX license header

@codecov bot commented Aug 4, 2023

Codecov Report

Patch coverage: 94.41% and project coverage change: +0.01% 🎉

Comparison is base (d5ba55d) 91.95% compared to head (76f40ec) 91.97%.
Report is 2 commits behind head on development.

Additional details and impacted files
@@               Coverage Diff               @@
##           development     #484      +/-   ##
===============================================
+ Coverage        91.95%   91.97%   +0.01%     
===============================================
  Files              100      101       +1     
  Lines            15914    16110     +196     
===============================================
+ Hits             14634    14817     +183     
- Misses            1280     1293      +13     
Files Changed                             Coverage Δ
aquadoggo/src/db/errors.rs                0.00% <0.00%> (ø)
aquadoggo/src/db/stores/blob.rs           94.87% <94.87%> (ø)
aquadoggo/src/schema/schema_provider.rs   89.93% <100.00%> (ø)

... and 2 files with indirect coverage changes


@sandreae sandreae marked this pull request as ready for review August 8, 2023 14:49
@sandreae sandreae changed the base branch from main to development August 8, 2023 15:00
@sandreae sandreae merged commit 3b38ff0 into development Aug 9, 2023
10 checks passed
@adzialocha adzialocha deleted the blob-store branch August 17, 2023 12:48
sandreae added a commit that referenced this pull request Aug 31, 2023
* Use p2panda-rs version which has blob schema

* `SchemaProvider` now has 4 system schema :-)

* Update proptests

* Implement BlobStore with `get_blob` method

* Clippy happy

* Fix imports

* fmt

* Remove some BlobStoreErrors and error tests

* Update CHANGELOG

* add `get_blob_by_view_id` method

* clippy
adzialocha added a commit that referenced this pull request Sep 8, 2023
* development:
  Make sure `/tmp` directory does not run out of scope before application ends (#557)
  Integrate `Bytes` value (#554)
  Stream blob data in chunks to files to not occupy too much memory (#551)
  Blobs directory configuration (#549)
  Use correct MAX_BLOB_PIECE_LENGTH from p2panda_rs
  Build a byte buffer over paginated pieces when assembling blobs (#547)
  HTTP routes to serve files with correct content type and etag headers (#544)
  Task for automatic garbage collection of unused documents and views (#500)
  Refactor tmp blob dir creation after rebase
  Fix after rebase
  "blob" materializer task (#493)
  Add static file server to `http` service (#483)
  Enable deletion of dangling `document_views` and related `document_view_fields` from db  (#491)
  BlobStore for retrieving raw blob data from the db (#484)