-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[move] Remove compiled scripts from data cache #14026
Conversation
⏱️ 1h 11m total CI duration on this PR
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #14026 +/- ##
===========================================
- Coverage 70.8% 58.9% -11.9%
===========================================
Files 2324 824 -1500
Lines 459502 198261 -261241
===========================================
- Hits 325634 116938 -208696
+ Misses 133868 81323 -52545 ☔ View full report in Codecov by Sentry. |
// Access to this cache is always under a `RwLock`. | ||
#[derive(Clone)] | ||
pub(crate) struct BinaryCache<K, V> { | ||
// Notice that we are using the HashMap implementation from the hashbrown crate, not the | ||
// one from std, as it allows alternative key representations to be used for lookup, | ||
// making certain optimizations possible. | ||
id_map: hashbrown::HashMap<K, usize>, | ||
binaries: Vec<Arc<V>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many such instances, thanks!
whenever ownership is clear and outlives the uses, we should use references (unless lifetime metaprogramming really gets out of hand, and I think our threshold has been low there)
compiled_script: CompiledScript, | ||
) -> Arc<CompiledScript> { | ||
let compiled_script_to_return = Arc::new(compiled_script); | ||
self.scripts.insert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if we are inserting at a place that already contains a verified script? should that be allowed, no-op, or not expected from caller? If we introduce any logic let's also unit test.
@@ -269,7 +269,15 @@ impl Loader { | |||
sha3_256.update(script_blob); | |||
let hash_value: [u8; 32] = sha3_256.finalize().into(); | |||
|
|||
let script = data_store.load_compiled_script_to_cache(script_blob, hash_value)?; | |||
let mut scripts = self.scripts.write(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could use a read lock, then upgrade to write lock if we need to insert (can find example in scheduler), but I'd imagine it doesn't matter here. However, .entry function might be slightly better here as it would make it clear in this case we are just replacing empty slot with a deserialized/compiled script. See the comment below, otherwise APIs seem more general (for the data-structure, not entry level), e.g. insert_compiled_script which would have to deal with observing an existing entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems interesting if the designer didn't intend it under a lock (and makes sense to not be under a lock) - but also does scripts need to be a DashMap or something more granular than a global RW lock?
another question: probably the below logic, and also similar logic in load_script could be good separate utility methods - will also give more space to e.g. improve implementation with locks, etc.
script_blob, | ||
hash_value, | ||
data_store, | ||
let compiled_script = self.deserialize_script(script_blob)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, would insert deserialized, then re-use the handling for verification (deserialized to verified). Makes me think that these could become the APIs of the script cache- e.g. if we never insert verified directly.
match entry { | ||
ScriptCacheEntry::Verified(script) => script.clone(), | ||
ScriptCacheEntry::Deserialized(_) => { | ||
unreachable!("Script must be verified before it is main function scope is accessed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should "it is" be "its"?
hash_value, | ||
data_store, | ||
let compiled_script = self.deserialize_script(script_blob)?; | ||
self.verify_script(&compiled_script, data_store, module_store)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, the comment on verify_script (previously on deserialize_and_verify) said it should not happen under a lock, but as far as I can see, it happens under scripts write lock? should we give up and re-acquire?
@@ -269,7 +269,15 @@ impl Loader { | |||
sha3_256.update(script_blob); | |||
let hash_value: [u8; 32] = sha3_256.finalize().into(); | |||
|
|||
let script = data_store.load_compiled_script_to_cache(script_blob, hash_value)?; | |||
let mut scripts = self.scripts.write(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems interesting if the designer didn't intend it under a lock (and makes sense to not be under a lock) - but also does scripts need to be a DashMap or something more granular than a global RW lock?
another question: probably the below logic, and also similar logic in load_script could be good separate utility methods - will also give more space to e.g. improve implementation with locks, etc.
Description
The new code cache caches deserialization, which means transaction data cache should not be used for any kind of module loading. This PR changes existing script cache in loader so that deserialized scripts can be cached there directly. It also allows us to cache deserialized scripts across multiple sessions & transactions.
Type of Change
Which Components or Systems Does This Change Impact?
How Has This Been Tested?
Key Areas to Review
Checklist