
vm: store compiled contracts directly in the filesystem #10791

Merged

Conversation


@nagisa nagisa commented Mar 14, 2024

Going through RocksDB involves compression, all of the RocksDB machinery, and so on. That can't be cheap. And since this is a cache, none of those features really benefit the use case.

Storing compiled contracts as plain files also makes them much more straightforward to inspect if the need arises. For example, an estimate of the size distribution is one `ls` away.

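As a sketch of that one-`ls`-away inspection: with one file per compiled contract, ordinary tools answer size questions. A scratch directory stands in for the real cache directory here.

```shell
# Create a stand-in cache directory with two fake "compiled contracts".
dir=$(mktemp -d)
head -c 1024 /dev/zero > "$dir/contract_a"
head -c 4096 /dev/zero > "$dir/contract_b"
ls -lS "$dir"            # list entries, largest file first
du -ak "$dir" | sort -n  # per-file sizes in KiB, ascending
```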

@akhi3030 akhi3030 left a comment


Looks fairly straightforward. Even if we do not end up using it in production, I don't see a big harm in merging this in. Excited to see some benchmarking results.


```rust
/// Cache for compiled contracts code in plain filesystem.
impl CompiledContractCache for FilesystemCompiledContractCache {
    fn put(&self, key: &CryptoHash, value: CompiledContract) -> std::io::Result<()> {
```
Collaborator

Should we add a check for whether the file already exists, making the function effectively a no-op in that case?

Collaborator Author

The RocksDB-based implementation mentions that there is some non-determinism somewhere (though that is probably an outdated statement) and that the intended behaviour is to overwrite in case we end up with different data. I didn't want to poke that particular bear's nest this time around :)
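For illustration, the suggested existence check might look like this hypothetical sketch; `put_if_absent` and its callback shape are made up for the example, and the merged code intentionally overwrites instead:

```rust
use std::path::Path;

/// Hypothetical variant of `put` that becomes a no-op when the cache file
/// already exists. This is NOT the merged behaviour, which deliberately
/// overwrites; it only illustrates the suggestion above.
fn put_if_absent(
    dir: &Path,
    key: &str,
    write: impl FnOnce() -> std::io::Result<()>,
) -> std::io::Result<()> {
    if dir.join(key).exists() {
        // Already cached: skip the write entirely.
        return Ok(());
    }
    write()
}
```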

I haven't pursued mmap and such yet, as it requires non-trivial additional changes; those can be made as part of a follow-up.

Creating the cache requires us to know the right place to create it. This has required me to thread it through pretty much the entire stack, but it has also brought some benefits -- for instance, we won't be doing unnecessary directory-related operations every time a view call is invoked, because the cache is created early on and is reused.

This is also a step in the right direction for the longer term, where we probably want the contract runtime to stand on its own a little more and have the invoker of the transaction runtime be responsible for integrating the components together.
This makes the cloning and lifetime of the cache directory much more
straightforward to reason about.
@nagisa nagisa marked this pull request as ready for review March 15, 2024 13:01
@nagisa nagisa requested a review from a team as a code owner March 15, 2024 13:01
@nagisa nagisa requested a review from wacban March 15, 2024 13:01

nagisa commented Mar 15, 2024

I had to finish this first before going at the VMArtifact cache, because the VMArtifact cache might end up being managed by the `CompiledContractCache` (although most likely not) and I didn't want to get bogged down by being unable to make that choice.

This PR should be ready to land in its current state. I was able to repurpose the previous tests that we had and that were using the RocksDB-based implementation to now utilize this new cache instead, so I didn't really need to spend too much time on that facet, although I would love many more tests for this functionality.

I also kept the read(2) based implementation rather than exploring mmap as that would require an interface change similar to what I've done in #10785. Since the improvement from that effort is not guaranteed, I'm leaving this as a future endeavour (will be filing an issue for this right after this comment.)


@akhi3030 akhi3030 left a comment


Approving to unblock. At a high level this looks OK; a detailed review from @wacban would still be good.


wacban commented Mar 15, 2024

Just a sanity check: how does this handle stopping neard? Does this implementation provide any guarantees that the files are not corrupted in that case? I don't mean SIGKILL, just the regular SIGINT that is typically used.

@akhi3030
Collaborator

Just a sanity check: how does this handle stopping neard? Does this implementation provide any guarantees that the files are not corrupted in that case? I don't mean SIGKILL, just the regular SIGINT that is typically used.

Good question. I think we should be able to handle SIGKILL as well. I guess the answer depends on whether, after a restart, a node can call `get` before `put` -- since we always overwrite in `put`.


nagisa commented Mar 15, 2024

How does this handle stopping neard? Does this implementation provide any guarantees that the files are not corrupted in that case? I don't mean SIGKILL, just the regular SIGINT that is typically used.

Partial writes are not visible to the cache -- the writes are "committed" using the `renameat` syscall, which is atomic, and we only call it after the file is fully written. Though we do not fsync between the write and the rename (should we?), so I'd imagine it is still conceivable for things to go wrong if somebody keeps tripping over their power cable. If an operator keeps doing that, it's going to be on them to `rm -rf ~/.near/data/contracts/` to recover…

Regular signals (including SIGKILL) will at worst leave some `.tmp` files in the `$NEAR_HOME/data/contracts/` directory.
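The write-then-rename commit described above can be sketched as follows. The names are illustrative rather than the actual nearcore code, and `std::fs::rename` maps to `renameat` on Linux:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Sketch of committing a cache entry by atomic rename. The value is fully
/// written to a `.tmp` file first; the rename then publishes it atomically,
/// so readers never observe a partially written entry.
fn put(dir: &Path, key: &str, value: &[u8]) -> std::io::Result<()> {
    let tmp = dir.join(format!("{key}.tmp"));
    let mut file = fs::File::create(&tmp)?;
    file.write_all(value)?;
    // An optional `file.sync_all()?` here would also guard against power
    // loss between the write and the rename, at some performance cost.
    fs::rename(&tmp, dir.join(key))?;
    Ok(())
}
```

A crash before the rename leaves only a stray `.tmp` file behind, matching the behaviour described above.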


nagisa commented Mar 15, 2024

I debugged the integration test failures. Requires some adjustments to TestEnv to allow specifying the cache to use, which I'll do... not today.


@wacban wacban left a comment


LGTM
Mostly just requests for code comments.
You may want to consider adding some metrics and logs to ease future debugging.

Comment on lines +98 to +100
```rust
/// This cache however does not implement any clean-up policies. While it is possible to truncate
/// a file that has been written to the cache before (`put` an empty buffer), the file will remain
/// in place until an operator (or somebody else) removes files at their own discretion.
```
Contributor

I was just going to ask you about this. Why did you decide not to clean up the cache? Do you expect it to grow slowly enough to never actually cause trouble? Do you know what the total size of the cache would be today?

Collaborator Author

This, again, matches what the previous RocksDB-based implementation does (it never cleans up either). You can check the corresponding column in RocksDB for the current numbers; I have no clue how to do that myself.

The size of this directory should end up in the same ballpark in the long term.

Contributor

It's around 10GB. That's not motivating enough for me to try to optimize.


wacban commented Mar 18, 2024

cc @pugachAG for an implementation of a contract cache that we may want to reuse or build on top of for the stateless validation contract cache. This one is for compiled code, but perhaps that's even better?

Previously some of the tests implicitly isolated the cache between
contracts by specifying distinct stores. With the new filesystem cache
the contract caches must be distinct too in some of these tests.

nagisa commented Mar 18, 2024

@wacban It would be great to have a last-pass check on the "teach test env about contract caches" commit. If CI passes (I believe it should), this will be landable.

Thank you for the thorough review!


@wacban wacban left a comment


LGTM (looking at the test env commit)

```rust
    }
}

impl<C: CompiledContractCache> CompiledContractCache for &C {
```
Contributor

Took me a while to notice that & sign there.
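The pattern in question -- a blanket impl of the trait for `&C` -- can be sketched like this (`Cache` and `lookup` are simplified stand-ins, not the real `CompiledContractCache` interface):

```rust
/// Simplified stand-in for the cache trait.
trait Cache {
    fn lookup(&self, key: &str) -> Option<u32>;
}

/// Toy implementation that "caches" the length of the key.
struct LenCache;

impl Cache for LenCache {
    fn lookup(&self, key: &str) -> Option<u32> {
        Some(key.len() as u32)
    }
}

// The blanket impl: a shared reference to any cache is itself a cache,
// so call sites can hand over either an owned value or a `&`-reference.
impl<C: Cache> Cache for &C {
    fn lookup(&self, key: &str) -> Option<u32> {
        (**self).lookup(key)
    }
}

/// Generic consumer: accepts `LenCache` and `&LenCache` alike.
fn use_cache<C: Cache>(cache: C, key: &str) -> Option<u32> {
    cache.lookup(key)
}
```

Without the `&C` impl, passing a reference to a generic bound like `C: Cache` would not compile; with it, both calling conventions work.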


codecov bot commented Mar 18, 2024

Codecov Report

Attention: Patch coverage is 68.75000% with 65 lines in your changes missing coverage. Please review.

Project coverage is 71.63%. Comparing base (272797f) to head (f645277).

Files Patch % Lines
runtime/near-vm-runner/src/cache.rs 70.21% 8 Missing and 6 partials ⚠️
tools/state-viewer/src/commands.rs 16.66% 10 Missing ⚠️
runtime/near-vm-runner/src/logic/mod.rs 62.50% 9 Missing ⚠️
core/store/src/lib.rs 0.00% 4 Missing ⚠️
...ntime/runtime-params-estimator/src/vm_estimator.rs 33.33% 3 Missing and 1 partial ⚠️
tools/fork-network/src/cli.rs 0.00% 4 Missing ⚠️
nearcore/src/lib.rs 25.00% 2 Missing and 1 partial ⚠️
test-utils/store-validator/src/main.rs 0.00% 2 Missing ⚠️
tools/epoch-sync/src/cli.rs 0.00% 2 Missing ⚠️
tools/flat-storage/src/commands.rs 0.00% 2 Missing ⚠️
... and 7 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10791      +/-   ##
==========================================
- Coverage   71.63%   71.63%   -0.01%     
==========================================
  Files         761      761              
  Lines      152915   153067     +152     
  Branches   152915   153067     +152     
==========================================
+ Hits       109546   109643      +97     
- Misses      38399    38438      +39     
- Partials     4970     4986      +16     
Flag Coverage Δ
backward-compatibility 0.24% <0.00%> (-0.01%) ⬇️
db-migration 0.24% <0.00%> (-0.01%) ⬇️
genesis-check 1.42% <0.00%> (-0.01%) ⬇️
integration-tests 36.91% <52.40%> (+0.04%) ⬆️
linux 70.22% <68.75%> (-0.01%) ⬇️
linux-nightly 71.12% <68.75%> (+0.01%) ⬆️
macos 54.77% <54.68%> (+0.02%) ⬆️
pytests 1.64% <0.00%> (-0.01%) ⬇️
sanity-checks 1.43% <0.00%> (-0.01%) ⬇️
unittests 67.35% <62.50%> (-0.02%) ⬇️
upgradability 0.29% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.



nagisa commented Mar 18, 2024

The missing coverage is largely in the generic code that delegates to other methods...

@nagisa nagisa added this pull request to the merge queue Mar 18, 2024
Merged via the queue into near:master with commit 0a18bff Mar 18, 2024
27 of 30 checks passed
@nagisa nagisa deleted the prototypes-filesystem-based-compiled-contract-store branch March 18, 2024 13:14

wacban commented Mar 19, 2024

It's me again :)

Would it make sense to remove that no-longer-used DB column when we switch to the filesystem impl?

Will there be any perf impact when a node switches to this impl? Would it make sense to warm it up or is that not needed?


nagisa commented Mar 19, 2024

Would it make sense to remove that no longer used db column when we switch to the filesystem impl?

Formally the StoreCompiledContractCache still exists, although it is currently dead code. I think it would be fine to remove it, or at least to clear the contents of the column (it's a cache; we can regenerate the data as needed).


Will there be any perf impact when a node switches to this impl? Would it make sense to warm it up or is that not needed?

There will be an impact roughly equivalent to upgrading neard to a version that contains code changes: the node will compile contracts as they are called for the first time after the upgrade. This sort of thing used to happen somewhat frequently in the past without causing notable issues, so in principle I don't believe we need to treat this change specially in any way.

nagisa added a commit to nagisa/nearcore that referenced this pull request Mar 20, 2024
nagisa added a commit to nagisa/nearcore that referenced this pull request Mar 20, 2024
VanBarbascu pushed a commit that referenced this pull request Mar 26, 2024