RFC: Switch to using Inode like datastructures in the blockstore #5528

Open
kevina opened this issue Sep 26, 2018 · 0 comments

kevina commented Sep 26, 2018

Our current blockstore completely lacks support for any sort of metadata. I believe fairly strongly that many problems would become a lot easier if we supported it. One very good way to do this is to introduce the concept of a BlockInfo, which adds a layer of indirection in a manner similar to Unix inodes.

A possible API:

type Blockstore interface {
	Has(mh.Multihash) (bool, error)

	Get(cid.Cid) (*BlockInfo, error)

	// Update atomically updates a block info.  If the origData field
	// of the BlockInfo is not nil, the stored value is retrieved and
	// compared to it: if the values match, the store is updated with
	// the new info in BlockInfo; if they do not match, the store is
	// not updated.  If the origData field is empty and the block
	// already exists, then the block is not updated.  The first
	// value returned is true if the block was updated.  The second
	// value is the new BlockInfo as it exists in the store.  The
	// final value returned is an error.
	Update(val *BlockInfo) (bool, *BlockInfo, error)

	// Overwrite updates the block info with no regard to the existing
	// value.  Use with care.
	Overwrite(val *BlockInfo) error

	// TODO UpdateMany, will lock then first retrieve all keys and
	// compare, after that it will update the store according to the
	// semantic described in Update in a batch update

	Delete(mh.Multihash) error

	// AllKeysChan returns a channel from which
	// the multihashes in the Blockstore can be read. It should respect
	// the given context, closing the channel if it becomes Done.
	AllKeysChan(ctx context.Context) (<-chan mh.Multihash, error)

	// HashOnRead specifies if every read block should be
	// rehashed to make sure it matches its CID.
	HashOnRead(enabled bool)
}

type BlockInfo struct {
	Cid      cid.Cid
	Accessed time.Time // can be an approximate value for performance reasons
	Created  time.Time
	PinCount int // Delete will return an error if PinCount > 0
	Block    BlockLink
	Attr     []BlockAttr
	origData []byte // original data as retrieved from the blockstore
}

type BlockLink interface {
	// Get and construct a block using the provided Cid
	Get(cid.Cid) (Block, error)
	// Size returns the size of the block; it should never error
	Size() int
}

type InlineBlock struct { //implements BlockLink
	data []byte
}

type NormalBlock struct { //implements BlockLink
	size int
	// other fields, if necessary to point to the location of the
	// block
}

type ExternalBlock struct { //implements BlockLink
	url    string
	offset int
	size   int
}

type BlockAttr struct {
	Key   string
	Value interface{}
}
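
To make the compare-and-swap semantics described for Update concrete, here is a minimal in-memory sketch. It is not part of the proposal: `memStore`, `encode`, and the string `Key` (standing in for `cid.Cid`) are hypothetical, and the BlockInfo is trimmed to a single metadata field.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// BlockInfo is a trimmed stand-in for the proposed struct. Key replaces
// cid.Cid so the sketch has no external dependencies (assumption).
type BlockInfo struct {
	Key      string
	PinCount int
	origData []byte // encoded form as last read from the store
}

// encode is a hypothetical serialization used only to compare versions.
func encode(bi *BlockInfo) []byte {
	return []byte(fmt.Sprintf("%s|%d", bi.Key, bi.PinCount))
}

// memStore is a hypothetical in-memory backing store.
type memStore struct {
	mu    sync.Mutex
	infos map[string]BlockInfo
}

func newMemStore() *memStore {
	return &memStore{infos: make(map[string]BlockInfo)}
}

// Get returns a copy of the stored info, with origData set so that the
// copy can later be passed to Update.
func (s *memStore) Get(key string) (*BlockInfo, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, ok := s.infos[key]
	if !ok {
		return nil, false
	}
	return &cur, true
}

// Update writes val only if the caller's view of the store is current:
// origData must match the stored encoding, or be nil for a new block.
func (s *memStore) Update(val *BlockInfo) (bool, *BlockInfo, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	cur, exists := s.infos[val.Key]
	if val.origData == nil && exists {
		return false, &cur, nil // block already exists; not updated
	}
	if val.origData != nil {
		if !exists {
			return false, nil, nil // block vanished since it was read
		}
		if !bytes.Equal(val.origData, cur.origData) {
			return false, &cur, nil // stale view; not updated
		}
	}
	next := *val
	next.origData = encode(val)
	s.infos[val.Key] = next
	return true, &next, nil
}

func main() {
	s := newMemStore()
	s.Update(&BlockInfo{Key: "blockA"}) // insert: origData is nil

	a, _ := s.Get("blockA")
	b, _ := s.Get("blockA")
	a.PinCount++
	okA, _, _ := s.Update(a) // succeeds: a's view is current
	b.PinCount++
	okB, cur, _ := s.Update(b) // rejected: b's view is now stale
	fmt.Println(okA, okB, cur.PinCount)
}
```

A caller whose update is rejected would typically retry using the BlockInfo returned as the second value, which already carries the current origData.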

Advantages to using this API:

  • Maintains pin reference counts to speed up pin operations
  • Allows for maintaining last access time for more intelligent garbage collection
  • Allows for a hybrid storage approach where small blocks are stored in the database while larger blocks are stored via other means, for example using flatfs, for additional flexibility and the ability to take advantage of filesystem-level deduplication
  • Built-in filestore/urlstore support
  • Provides a means to store additional useful attributes with the block itself rather than in a different data structure
  • In general provides a lot more flexibility and should make other things easier down the road

Challenges:

The backing datastore needs to allow for fast retrieval and updating of keys


I agree this will be a major change, but I think it is something worth considering. If no one else wants to attempt this, I will be happy to lead the effort.

(Note: I see this as an actionable item, possibly specific to go-ipfs (at least at first). If there is a better place to file this, please let me know.)

@kevina kevina changed the title RFC: Switch to using Inode like datastructures in the blockstore RFC: Switch to using I-Node like datastructures in the blockstore Sep 29, 2018
@kevina kevina changed the title RFC: Switch to using I-Node like datastructures in the blockstore RFC: Switch to using Inode like datastructures in the blockstore Oct 1, 2018