
Simplify staging zone readiness to size only #535

Conversation


@neelvirdy neelvirdy commented Nov 17, 2022

  1. Prevents attempting to make deals with content that is too small, by simplifying staging zone readiness to a single check: is it an acceptable deal size?
  2. Staging zones live indefinitely until they become ready
  3. Changes accepted staging zone sizes from 13.4GB - 14.4GB to 3.6GB - 31.6GB

I revisited the tuning of minStagingZoneSizeLimit and maxStagingZoneSizeLimit. The previous range (13.4GB-14.4GB) seemed fairly tight and hard to hit. IMO the range should correlate with the typical piece size range an SP would accept. From some brief Slack searching, it seems like 32GiB should be our absolute max, and some common min piece sizes are 16GB, 1GB, and 256 bytes. I settled on 3.6GB since it will yield consistent deal sizes by matching the individual deal threshold (also 3.6GB), and it strikes a balance between being usable in the FE and not being irrelevantly small for SPs. This way, SPs should see deals from Estuary consistently in the range of 3.6GB-31.6GB, likely skewed towards 3.6GB.
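For illustration, a minimal sketch of the size-only readiness check described above, using the constant names from this description (the exact values and the isReady helper are illustrative assumptions, not the merged code):

```go
// Minimal sketch of size-only readiness, assuming the constant names above.
// Values and the isReady helper are illustrative, not copied from the diff.
package constants

const (
	minStagingZoneSizeLimit = int64(3_600_000_000)  // ~3.6GB, matches the individual deal threshold
	maxStagingZoneSizeLimit = int64(31_600_000_000) // ~31.6GB, just under the 32GiB sector ceiling
)

// isReady: a zone is ready for deal making once its aggregated size falls
// inside the acceptable deal size range; no time-based closing rules apply.
func isReady(curSize int64) bool {
	return curSize >= minStagingZoneSizeLimit && curSize <= maxStagingZoneSizeLimit
}
```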

Associated FE PR since some metadata that is rendered in FE is removed by this PR: application-research/estuary-www#97


en0ma commented Nov 17, 2022

Thank you @neelvirdy for doing this. I have always wanted some form of deterministic behavior for the staging zone, and this PR does just that - awesome!

I have always been of the opinion that the min staging bucket size should match the individual deal threshold size - it keeps things consistent (guarantees the same min deal size across aggregated and non-aggregated content deals), intuitive, and deterministic.

I agree with removing the time-based (closing and keep-alive) rules, as they can easily yield content sizes that would not satisfy the min ask piece size and/or not give SPs a consistent min deal size (I believe Estuary, from the outset, has wanted to give SPs some guarantee that they get almost 3.6GB per deal).

Some suggestions:

  • Make the staging zone maximum size equal to DefaultContentSizeLimit. That way, staging buckets are still aggregated once they pass the minimum bucket size, but there is a wider range to add more content up to DefaultContentSizeLimit, keeping behavior consistent with non-aggregated content.

  • Rename DefaultContentSizeLimit to MaxDealContentSize and IndividualDealThreshold to MinDealContentSize to make things more intuitive.

  • We can remove the need for an item count by switching the wrapper content to a sharded directory; that way, the number of child contents won't be a problem.

  • Lastly, we can use a new database table (staging_zones) to track uploaded contents that are below the minimum deal size, and have a worker monitor and aggregate them as required. This would greatly reduce Estuary's RAM usage and is a good step towards a smooth deployment experience (we would no longer have to rebuild the queue in RAM at boot-up). A rough sketch of such a table follows this list.
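As referenced in the last point, a rough sketch of what a staging_zones model could look like, assuming GORM for the schema (the field names and statuses are guesses for illustration, not an agreed design):

```go
package model

import "gorm.io/gorm"

// StagingZone is a hypothetical schema for the proposed staging_zones table:
// one row per open zone, so zones survive a restart instead of being rebuilt
// in RAM at boot-up. Field names and statuses are illustrative guesses.
type StagingZone struct {
	gorm.Model
	UserID  uint   `gorm:"index"`
	ContID  uint   `gorm:"index"` // wrapper content backing this zone
	MinSize int64  // smallest total size at which the zone may be aggregated
	MaxSize int64  // ceiling the zone must stay under
	CurSize int64  // running total of attached content sizes
	Status  string // e.g. "open", "ready", "aggregating", "done"
}
```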

PS: I am available to work with you on this, let me know if you need anything.

cc: @jimmylee @alvin-reyes

@neelvirdy neelvirdy force-pushed the nvirdy/simplify-staging-zone-readiness branch from d9f46f3 to ce56f6e on November 17, 2022 18:05
@neelvirdy neelvirdy requested a review from jcace November 18, 2022 05:04

neelvirdy commented Nov 18, 2022

Thanks for the thorough review @en0ma! I incorporated most of your suggestions and have successfully uploaded 20k small files (190KB each) on my local Estuary and reached the TransferFinish deal state. The stats on the deal are 19404 files totaling 3.62 GB. The rest went into a new staging bucket.

I also realized that the staging buckets are modeled and auto-recovered in the contents table here. I believe we can avoid having a staging_zones table entirely, and still avoid any reboot issues with staging zone state, by treating every change to the buckets as the following sequence (sketched below):

  • Write the change to the contents DB
  • Lock bucketLk
  • Write the change to the buckets in memory
  • Unlock bucketLk
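A hedged sketch of that write-through sequence, with stand-in types (the ContentManager fields, Content columns, and method name here are illustrative assumptions, not the actual Estuary code):

```go
package main

import (
	"sync"

	"gorm.io/gorm"
)

// Illustrative stand-ins; the real Estuary types differ.
type Content struct {
	ID           uint
	Size         int64
	AggregatedIn uint
}

type ContentManager struct {
	db       *gorm.DB
	bucketLk sync.Mutex
	buckets  map[uint][]Content // in-memory staging zones, keyed by zone ID
}

// addToStagingZone sketches the write-through order described above: persist
// the change to the contents table first, then mirror it into memory under the
// lock, so a reboot can rebuild zone state straight from the database.
func (cm *ContentManager) addToStagingZone(zoneID uint, c Content) error {
	// 1. Write the change to the contents DB (the source of truth).
	if err := cm.db.Model(&Content{}).
		Where("id = ?", c.ID).
		Update("aggregated_in", zoneID).Error; err != nil {
		return err
	}

	// 2. Lock bucketLk; 3. write the change to the buckets in memory; 4. unlock.
	cm.bucketLk.Lock()
	defer cm.bucketLk.Unlock()
	c.AggregatedIn = zoneID
	cm.buckets[zoneID] = append(cm.buckets[zoneID], c)
	return nil
}
```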

Lmk your thoughts! Ideally I'd like to make the bucket persistence changes in a follow-up PR so we can improve the SP experience ASAP.

Will look into some cleanup tweaks tomorrow as well, particularly around constant names and redundancy.

[image: screenshot of the deal stats]


en0ma commented Nov 18, 2022

> Thanks for the thorough review @en0ma! I incorporated most of your suggestions and have successfully uploaded 20k small files (190KB each) on my local Estuary and reached the TransferFinish deal state. The stats on the deal are 19404 files totaling 3.62 GB. The rest went into a new staging bucket.

Great to see that it worked, awesome job!

> I also realized that the staging buckets are modeled and auto-recovered in the contents table here. I believe we can avoid having a staging_zones table entirely, and still avoid any reboot issues with staging zone state, by treating every change to the buckets as the following sequence:
>   • Write the change to the contents DB
>   • Lock bucketLk
>   • Write the change to the buckets in memory
>   • Unlock bucketLk
> Lmk your thoughts! Ideally I'd like to make the bucket persistence changes in a follow-up PR so we can improve the SP experience ASAP.

Let's discuss this in the follow-up PR. In summary: this PR has addressed the SPs' pain; the follow-up PR is to address RAM usage (by not keeping zone metadata in memory, where it can quickly build up beyond our control) and to improve deployment by not rebuilding staging zones at boot-up, which blocks deal processing.


@en0ma en0ma left a comment


LGTM - great job!

@neelvirdy neelvirdy merged commit d04cee0 into application-research:dev Nov 18, 2022
@neelvirdy neelvirdy deleted the nvirdy/simplify-staging-zone-readiness branch November 18, 2022 18:48