-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storage v2: add image upload+polling, use CIDs, remove metadata #5032
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing work! All looks good to me but I don't have much context on the mediorum stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice. Added a note about not using ReadAll
The other thing we should also do is update postBlob
function to compute CID for incoming blob instead of trusting the multipart filename from peer.
mediorum/server/serve_upload.go
Outdated
file, err := upload.Open() | ||
func computeFileCID(file io.Reader) (string, error) { | ||
builder := cid.V1Builder{} | ||
contents, err := ioutil.ReadAll(file) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two problems with using ReadAll
here:
- it consumes the reader passed in... so calling code needs to
Seek(0,0)
if it expects reader to still work. - it reads whole file into memory, which for a wav could easily mean 100+ MB.
Anyway I think you can solve both by:
- making param ReedSeeker
- removing
ReadAll
and usingCidFromReaderactually that is to parse a cid... looking for the reader friendly thing. - doing a
defer file.Seek(0,0)
next week planning to do the "parallel replicate" change, which will pass around files everywhere so we can have independent readers which will improve this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for not using ReadAll
, something like:
h, err := multihash.SumStream(file, multihash.SHA2_256, -1)
basically modify the utils function, drop utils dep and use multihash directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate all the details! re-tested now after updating CID computation to use SumStream + seek to the beginning and making postBlob verify CIDs instead of trusting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
## Changelog - 2023-04-03 [ac2f3f3] Fix discover-node-selector response (#5043) [Dylan Jeffers] - 2023-04-03 [dfee3cc] SDK: Add getBlockers(), getBlockees(), unfurl(), and getPermissions() (#5041) [Marcus Pasell] - 2023-04-03 [1d8db84] SDK: Snake case DN selector services key (#5039) [Marcus Pasell] - 2023-04-03 [518758a] Storage v2: add image upload+polling, use CIDs, remove metadata (#5032) [Theo Ilie] - 2023-04-03 [0115b13] Bump sdk to v2.0.3-beta.2 [audius-infra]
## Changelog - 2023-04-03 [ac2f3f3] Fix discover-node-selector response (#5043) [Dylan Jeffers] - 2023-04-03 [dfee3cc] SDK: Add getBlockers(), getBlockees(), unfurl(), and getPermissions() (#5041) [Marcus Pasell] - 2023-04-03 [1d8db84] SDK: Snake case DN selector services key (#5039) [Marcus Pasell] - 2023-04-03 [518758a] Storage v2: add image upload+polling, use CIDs, remove metadata (#5032) [Theo Ilie] - 2023-04-03 [0115b13] Bump sdk to v2.0.3-beta.2 [audius-infra]
Description
/uploads/:id
endpoint for transcode/resize successTests
With local stack+client, I signed up, pressed "f" to enable the
storage_v2
feature flag, uploaded a track with cover art, and verified that polling succeeded followed by an EntityManager transaction being relayed with correct metadata (includingtrack_cid
). I also verified that results for jobs now use CID V1:Job results
Metadata created by V2 upload
Metadata created by the same V1 upload
Note: track CIDs are different due to #4631.
Monitoring - How will this change be monitored? Are there sufficient logs / alerts?
This is all gated by the
storage_v2
flag.