
refactor: sharky cleanup and simplification of code #3500

Draft: wants to merge 7 commits into master from the freaky branch
Conversation

@istae (Member) commented Nov 5, 2022

Checklist

  • I have read the coding guide.
  • My change requires a documentation update, and I have done it.
  • I have added tests to cover my changes.
  • I have filled out the description and linked the related issues.

Description

Write operations now follow a very basic round-robin strategy, distributing requests across the shards in circular order.
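A minimal sketch of the round-robin idea described above; the dispatcher type and its names below are illustrative assumptions, not the actual sharky API:

package main

import (
	"fmt"
	"sync/atomic"
)

// writeDispatcher illustrates circular (round-robin) distribution of write
// requests across a fixed number of shards.
type writeDispatcher struct {
	shardCount uint32
	counter    uint32 // incremented once per write request
}

// nextShard returns the shard index for the next write, cycling 0..shardCount-1.
func (d *writeDispatcher) nextShard() uint32 {
	n := atomic.AddUint32(&d.counter, 1) - 1
	return n % d.shardCount
}

func main() {
	d := &writeDispatcher{shardCount: 4}
	for i := 0; i < 8; i++ {
		fmt.Print(d.nextShard(), " ") // 0 1 2 3 0 1 2 3
	}
	fmt.Println()
}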

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Related Issue (Optional)

Screenshots (if appropriate):



@istae istae force-pushed the freaky branch 2 times, most recently from 1f6576a to ec5fe69 on November 11, 2022 15:50
@istae istae changed the title refactor: rewrite sharky refactor: sharky rewrite Nov 27, 2022
@istae istae marked this pull request as ready for review November 27, 2022 15:08
@istae istae requested review from a team, mrekucci and aloknerurkar and removed request for a team November 27, 2022 15:08
Resolved review threads (outdated): pkg/sharky/shard.go, pkg/sharky/store.go, and four threads on pkg/sharky/slots_test.go
Comment on lines 110 to 113
s.wg.Add(1)
go func() {
defer sh.slots.wg.Done()
defer s.wg.Done()
sh.process()
A contributor commented:

Currently a new goroutine is started for each shard instance; would it be possible to avoid these goroutines? Are they really necessary?


All that the sh.process() function does is essentially two operations:

slot := sh.slots.Next()
sh.slots.Use(slot)

Would it be possible to just call these two functions in store.go:136 instead of reading from the channel? Probably one more utility function would be needed to get the shard, but overall the code might be simpler.

The author (@istae) replied:

good idea actually :)
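A rough sketch of the suggested refactor (claiming and marking a slot inline on the write path instead of reading a pre-claimed slot from a per-shard goroutine over a channel), using minimal stand-in types; all names and fields below are assumptions for illustration, not the actual sharky code:

package main

import "fmt"

// slots is a stand-in for the free-slot bookkeeping of a shard.
type slots struct {
	free []uint32
	used map[uint32]bool
}

// Next hands out the next free slot index.
func (s *slots) Next() uint32 {
	n := s.free[0]
	s.free = s.free[1:]
	return n
}

// Use marks a slot as occupied.
func (s *slots) Use(n uint32) { s.used[n] = true }

type shard struct {
	index uint8
	slots *slots
}

// write sketches the inline approach: call Next and Use directly instead of
// receiving the slot from a per-shard goroutine over a channel.
func (sh *shard) write(buf []byte) (uint8, uint32) {
	slot := sh.slots.Next()
	_ = buf // the real code would write buf to the shard file at this slot's offset
	sh.slots.Use(slot)
	return sh.index, slot
}

func main() {
	sh := &shard{index: 0, slots: &slots{free: []uint32{0, 1, 2}, used: map[uint32]bool{}}}
	idx, slot := sh.write([]byte("chunk"))
	fmt.Println(idx, slot) // 0 0
}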

Resolved review thread: pkg/sharky/recovery.go
sl.size = uint32(len(sl.data) * 8)
sl.head = sl.next(0)
return err
sl.data = data
A contributor commented:

Should the statement

sl.data = data

be guarded with a mutex?

The author (@istae) replied:

No, this function is only used during bootup, so no mutex is needed.
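A small illustration of that point, with hypothetical names: as long as the assignment happens once during single-threaded startup, before any goroutines touch the struct, no lock is needed.

package main

import "fmt"

// slots is a hypothetical stand-in for the free-slot bookkeeping discussed above.
type slots struct {
	data []byte // free-slot bitmap
	size uint32 // number of slots tracked by the bitmap
}

// load is called exactly once during bootup, before any goroutines that read
// or write the slots are started, so the plain assignment needs no mutex.
func (sl *slots) load(data []byte) {
	sl.data = data
	sl.size = uint32(len(sl.data) * 8)
}

func main() {
	sl := &slots{}
	sl.load(make([]byte, 16)) // single-threaded startup path
	fmt.Println(sl.size)      // 128
}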

Resolved review threads: three on pkg/sharky/store.go (two outdated)
@vladopajic (Contributor) commented:

This sharky implementation does not give any guarantees about which shard and location will be returned next. As a result, this could lead to 1) probably more cache invalidation (at the HDD/SSD level) and 2) fragmented data. Would it make sense to have sharky always return the first available location in a consistent order? That way, disk IO would probably hit the same sector(s) on disk (probably getting a cache hit).
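A hedged sketch of the "first available location in consistent order" idea: always scan the free-slot bitmap from the start and hand out the lowest free index. The bitmap layout and the bit-means-free convention below are assumptions for illustration only:

package main

import "fmt"

// lowestFreeSlot scans a free-slot bitmap (one bit per slot; a set bit means
// the slot is free in this illustration) and returns the lowest free index.
func lowestFreeSlot(bitmap []byte) (uint32, bool) {
	for i, b := range bitmap {
		if b == 0 {
			continue // no free slots in this byte
		}
		for j := uint(0); j < 8; j++ {
			if b&(1<<j) != 0 {
				return uint32(i)*8 + uint32(j), true
			}
		}
	}
	return 0, false
}

func main() {
	bitmap := []byte{0b00000000, 0b00010100} // slots 10 and 12 are free
	slot, ok := lowestFreeSlot(bitmap)
	fmt.Println(slot, ok) // 10 true
}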

@@ -44,12 +48,11 @@ type Store struct {
// - maxDataSize - positive integer representing the maximum blob size to be stored
func New(basedir fs.FS, shardCnt int, maxDataSize int) (*Store, error) {
A contributor commented:

Is the store ever writing data down to disk (closing and saving the shards)?

The author (@istae) replied:

It asks the shards to ask their slots to save and close.
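A rough, self-contained sketch of that cascade (Store.Close asking each shard to close, and each shard asking its slots to save first); every name below is an assumption, not the actual sharky code:

package main

import (
	"fmt"
	"os"
)

// slots is a stand-in for a shard's free-slot bookkeeping.
type slots struct {
	path string
	data []byte // free-slot bitmap
}

// save persists the free-slot bitmap to disk.
func (sl *slots) save() error { return os.WriteFile(sl.path, sl.data, 0o644) }

// shard is a stand-in for a single shard file plus its slots.
type shard struct {
	file  *os.File
	slots *slots
}

// close asks the shard's slots to save, then closes the shard file.
func (sh *shard) close() error {
	if err := sh.slots.save(); err != nil {
		return err
	}
	return sh.file.Close()
}

// Store asks every shard to close, which in turn saves its slots.
type Store struct{ shards []*shard }

func (s *Store) Close() error {
	for _, sh := range s.shards {
		if err := sh.close(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	f, _ := os.CreateTemp("", "shard-*")
	st := &Store{shards: []*shard{{file: f, slots: &slots{path: f.Name() + ".slots", data: []byte{0xff}}}}}
	fmt.Println(st.Close()) // <nil>
}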

@@ -28,12 +27,17 @@ var (
// - read prioritisation over writing
// - free slots allow write
type Store struct {
A contributor commented:

This file should be tested.

@istae (Member, Author) commented Feb 22, 2023

(quoting @vladopajic's suggestion above that sharky always return the first available location in a consistent order)

The order of writes does not guarantee the order of reads, however. We cannot tell which chunk will be retrieved in the future, so it does not matter which shard we place them in.
That being said, there is some benefit in having shards store only a certain PO (proximity order).
For example, when the node is calculating the reserve sample and is going through the entire reserve, reading the shards in order could be a great optimization.
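A hypothetical sketch of the PO-per-shard idea mentioned above: compute a chunk's proximity order relative to the node's address and map it to a shard index, so the reserve could later be iterated shard by shard. The mapping and all names are assumptions for illustration, not something implemented in this PR:

package main

import (
	"fmt"
	"math/bits"
)

// proximity returns the number of leading bits two addresses share, the usual
// Swarm notion of proximity order (PO), written here only as an illustration.
func proximity(a, b []byte) int {
	for i := range a {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return len(a) * 8
}

// shardForPO is a hypothetical mapping from a chunk's PO to a shard index, so
// that chunks of the same PO land in the same shard and the reserve can later
// be read shard by shard, roughly in PO order.
func shardForPO(po, shardCount int) int {
	if po >= shardCount {
		po = shardCount - 1 // fold deeper proximity orders into the last shard
	}
	return po
}

func main() {
	base := []byte{0b10110000} // the node's address (truncated to one byte here)
	addr := []byte{0b10100000} // a chunk address
	po := proximity(base, addr)
	fmt.Println(po, shardForPO(po, 4)) // 3 3
}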

@istae istae force-pushed the freaky branch 2 times, most recently from 0a9a83e to 2e4f076 on March 10, 2023 19:05
@istae istae requested a review from vladopajic March 13, 2023 15:30
@istae istae changed the title refactor: sharky rewrite refactor(sharky): cleanup of code and code simplification Mar 13, 2023
@istae istae changed the title refactor(sharky): cleanup of code and code simplification refactor: sharky cleanup and simplification of code Mar 13, 2023
@nikipapadatou commented:

Consider this again after the major localstore releases.

@istae istae marked this pull request as draft June 10, 2024 17:55