Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vm: store rkyv-serialized contracts in rocksdb, not borsh #10785

Closed

Conversation

nagisa
Copy link
Collaborator

@nagisa nagisa commented Mar 13, 2024

Borsh is not an appropriate serialization format for a couple of reasons:

  1. It is pretty conservative and defensive about the data it handles -- for instance it will conservatively grow the vector it is deserializing which makes an otherwise O(n) operation an O(n^2) one. This makes sense as a tradeoff with untrusted data where an allocation of $TOOMANYBYTES would crash the program. For us this is not a relevant property and causes a significant slowdown at the vector sizes we're dealing with.
  2. The data is really just a one big vector of bytes that we have read out from the database. Borsh APIs kinda force this vector to be reallocated no matter what, while ideally we'd want just a pointer/slice to said bytes.

rkyv serves these needs pretty well out of the box, although we can also consider some less fancy ways for (de)serialization if we move to the filesystem proper.

@nagisa nagisa requested a review from a team as a code owner March 13, 2024 16:02
@nagisa nagisa requested a review from Longarithm March 13, 2024 16:02
@nagisa nagisa force-pushed the replaces-compiled-contracts-with-rkyv branch from 3e2da90 to f645ef6 Compare March 13, 2024 16:06
Borsh is not an appropriate serialization format for a couple of
reasons:

1. It is pretty conservative and defensive about the data it handles --
   for instance it will conservatively grow the vector it is
   deserializing which makes an otherwise O(n) operation an O(n^2) one.
   This makes sense as a tradeoff with untrusted data where an
   allocation of $TOOMANYBYTES would crash the program. For us this is
   not a relevant property and causes a significant slowdown at the
   vector sizes we're dealing with.
2. The data is really just a one big vector of bytes that we have read
   out from the database. Borsh APIs kinda force this vector to be
   reallocated no matter what, while ideally we'd want just a
   pointer/slice to said bytes.

rkyv serves these needs pretty well out of the box, although we can also
consider some less fancy ways for (de)serialization if we move to the
filesystem proper.
@nagisa nagisa force-pushed the replaces-compiled-contracts-with-rkyv branch from f645ef6 to 9816df3 Compare March 13, 2024 16:13
Copy link
Collaborator

@akhi3030 akhi3030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approving to unblock.

@nagisa
Copy link
Collaborator Author

nagisa commented Mar 15, 2024

#10791 is a better approach.

@nagisa nagisa closed this Mar 15, 2024
@nagisa nagisa deleted the replaces-compiled-contracts-with-rkyv branch March 15, 2024 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants