[move] Loader V2 API integration for MoveVM #14075

georgemitenkov · 2024-07-22T15:39:22Z

Description

Versioning Loader into V1 and V2: this allows easier integration so that we can switch a flag and get the new behaviour.
New loader implementation draft:

LoaderV2 does not store modules or scripts; instead, the corresponding ScriptStorage and ModuleStorage are passed as references to the loader APIs. This allows us to cache things on read.
LoaderV2 still has type caches which keep track of depth formulas, etc., like the old loader: their behaviour does not change on module upgrade, and IMO it is fine to keep them for simpler integration.
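To illustrate the "storage passed by reference, cache on read" idea, here is a minimal sketch. All types below (`Module`, `ModuleStorage`, `CachingStorage`, the string-based `ModuleId`) are simplified stand-ins invented for this example and are not the actual MoveVM definitions.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-ins for the real MoveVM types.
pub type ModuleId = String;

#[derive(Debug)]
pub struct Module {
    pub id: ModuleId,
    pub code: Vec<u8>,
}

/// The storage owns loaded modules; the loader only borrows it.
pub trait ModuleStorage {
    fn fetch_module(&self, id: &ModuleId) -> Option<Arc<Module>>;
}

/// A storage that builds a module from raw bytes on first read and caches it.
pub struct CachingStorage {
    raw: HashMap<ModuleId, Vec<u8>>,
    cache: RefCell<HashMap<ModuleId, Arc<Module>>>,
}

impl CachingStorage {
    pub fn new(raw: HashMap<ModuleId, Vec<u8>>) -> Self {
        Self { raw, cache: RefCell::new(HashMap::new()) }
    }

    pub fn cached_count(&self) -> usize {
        self.cache.borrow().len()
    }
}

impl ModuleStorage for CachingStorage {
    fn fetch_module(&self, id: &ModuleId) -> Option<Arc<Module>> {
        // Serve from the cache when possible.
        let cached = self.cache.borrow().get(id).cloned();
        if let Some(module) = cached {
            return Some(module);
        }
        // Cache miss: build the module and remember it for later reads.
        let code = self.raw.get(id)?.clone();
        let module = Arc::new(Module { id: id.clone(), code });
        self.cache.borrow_mut().insert(id.clone(), Arc::clone(&module));
        Some(module)
    }
}

/// The loader itself holds no modules (type caches would live here).
pub struct LoaderV2;

impl LoaderV2 {
    pub fn load_module(
        &self,
        storage: &impl ModuleStorage,
        id: &ModuleId,
    ) -> Result<Arc<Module>, String> {
        storage
            .fetch_module(id)
            .ok_or_else(|| format!("Module {} doesn't exist", id))
    }
}
```

Loading the same id twice returns the same `Arc`, and the loader stays stateless with respect to code, so a module upgrade only has to touch the storage.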

New data structures:

LoaderV2 also has a V: Verifier type parameter so that we can provide a custom verifier when loading a module.
Loading a module calls fetch_or_create_verified_module. This function actually loads the transitive closure of dependencies, which is an implementation detail of the code cache (not part of this PR).
Because we do not want to leak data structures from Loader into the code cache, the idea is to pass fetch_or_create_verified_module a function f: &dyn Fn(Arc<CompiledModule>) -> PartialVMResult<Module> supplied by the loader. This way the function can capture the struct name cache (which maps struct names to identifiers), native functions, etc.
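A rough sketch of the callback idea, under heavy simplification: `CodeCache`, `Loader`, and the field layouts are hypothetical, `PartialVMResult` is replaced by a `Result` with a `String` error, and the transitive dependency loading mentioned above is omitted. Only the shape of `f` follows the description.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-ins for the real types.
pub type PartialVMResult<T> = Result<T, String>;

pub struct CompiledModule {
    pub name: String,
    pub struct_names: Vec<String>,
}

#[derive(Debug)]
pub struct Module {
    pub name: String,
    pub struct_indices: Vec<usize>,
}

/// Code cache side: it never sees loader-internal data structures.
pub struct CodeCache {
    compiled: HashMap<String, Arc<CompiledModule>>,
    verified: RefCell<HashMap<String, Arc<Module>>>,
}

impl CodeCache {
    pub fn new(compiled: HashMap<String, Arc<CompiledModule>>) -> Self {
        Self { compiled, verified: RefCell::new(HashMap::new()) }
    }

    pub fn fetch_or_create_verified_module(
        &self,
        name: &str,
        f: &dyn Fn(Arc<CompiledModule>) -> PartialVMResult<Module>,
    ) -> PartialVMResult<Arc<Module>> {
        let cached = self.verified.borrow().get(name).cloned();
        if let Some(module) = cached {
            return Ok(module);
        }
        let cm = self
            .compiled
            .get(name)
            .cloned()
            .ok_or_else(|| format!("Module {} doesn't exist", name))?;
        // The loader-provided callback turns a compiled module into a runtime one.
        let module = Arc::new(f(cm)?);
        self.verified
            .borrow_mut()
            .insert(name.to_string(), Arc::clone(&module));
        Ok(module)
    }
}

/// Loader side: owns the struct name index map and captures it in the callback.
pub struct Loader {
    struct_name_index_map: RefCell<HashMap<String, usize>>,
}

impl Loader {
    pub fn new() -> Self {
        Self { struct_name_index_map: RefCell::new(HashMap::new()) }
    }

    pub fn load(&self, cache: &CodeCache, name: &str) -> PartialVMResult<Arc<Module>> {
        let build = |cm: Arc<CompiledModule>| -> PartialVMResult<Module> {
            let mut map = self.struct_name_index_map.borrow_mut();
            // Assign each struct name a stable index, reusing existing ones.
            let struct_indices = cm
                .struct_names
                .iter()
                .map(|s| {
                    let next = map.len();
                    *map.entry(s.clone()).or_insert(next)
                })
                .collect();
            Ok(Module { name: cm.name.clone(), struct_indices })
        };
        cache.fetch_or_create_verified_module(name, &build)
    }
}
```

The code cache only depends on the callback's signature, so the loader is free to change what it captures without touching the cache.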

Alternative design

We can have a context that is shared between the loader and the code cache, something like this:

pub struct LoaderContext<V: Verifier> {
    // Native functions can be also here, but type caches are kept in the loader.
    pub(crate) struct_name_index_map: StructNameIndexMap,
    phantom_data: PhantomData<V>,
}

impl<V: Verifier> LoaderContext<V> {
    pub fn build_module(
        &self,
        cm: Arc<CompiledModule>,
        module_storage: &impl ModuleStorage,
    ) -> PartialVMResult<Module> {
        V::verify_module(cm.as_ref())?;
        let imm_dependencies = cm
            .immediate_dependencies_iter()
            .map(|(addr, name)| module_storage.fetch_verified_module(addr, name))
            .collect::<PartialVMResult<Vec<_>>>()?;
        V::verify_module_with_dependencies(
            cm.as_ref(),
            imm_dependencies.iter().map(|m| m.module()),
        )?;
        // Must be a private constructor, so that one can create module only via the context.
        Module::new_v2(module_storage, &self.struct_name_index_map, cm)
    }
}

Type of Change

Which Components or Systems Does This Change Impact?

How Has This Been Tested?

Key Areas to Review

Checklist

I have read and followed the CONTRIBUTING doc
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I identified and added all stakeholders and component owners affected by this change as reviewers
I tested both happy and unhappy path of the functionality
I have made corresponding changes to the documentation

trunk-io · 2024-07-22T15:39:25Z

⏱️ 1h 59m total CI duration on this PR

Job	Cumulative Duration	Recent Runs
rust-move-tests	15m	🟩
rust-move-tests	15m	🟩
rust-move-unit-coverage	13m	🟩
rust-move-unit-coverage	12m	🟩
rust-move-tests	11m	🟩
rust-move-tests	10m	🟩
rust-move-unit-coverage	8m	🟩
rust-move-unit-coverage	7m	🟩
rust-cargo-deny	7m	🟩 🟩 🟩 🟩
general-lints	7m	🟩 🟩 🟩 🟩
check-dynamic-deps	5m	🟩 🟩 🟩 🟩
file_change_determinator	3m	🟩 🟩 🟩 🟩 🟩 (+11 more)
semgrep/ci	2m	🟩 🟩 🟩 🟩 🟩
permission-check	50s	🟩 🟩 🟩 🟩 🟩 (+11 more)
file_change_determinator	47s	🟩 🟩 🟩 🟩
permission-check	43s	🟩 🟩 🟩 🟩 🟩 (+11 more)
permission-check	42s	🟩 🟩 🟩 🟩 🟩 (+11 more)
permission-check	39s	🟩 🟩 🟩 🟩 🟩 (+11 more)
rust-move-unit-coverage	1s	⬜


codecov · 2024-07-22T15:48:59Z

Codecov Report

Attention: Patch coverage is 50.89216% with 633 lines in your changes missing coverage. Please review.

Project coverage is 59.0%. Comparing base (a13c2f0) to head (dc00eee).

Files	Patch %	Lines
...d_party/move/move-vm/runtime/src/storage/loader.rs	0.0%	244 Missing ⚠️
third_party/move/move-vm/runtime/src/loader/mod.rs	65.2%	164 Missing ⚠️
...rd_party/move/move-vm/runtime/src/storage/dummy.rs	0.0%	69 Missing ⚠️
aptos-move/aptos-vm/src/aptos_vm.rs	0.0%	42 Missing ⚠️
third_party/move/move-vm/runtime/src/session.rs	57.1%	33 Missing ⚠️
third_party/move/move-vm/runtime/src/data_cache.rs	54.8%	14 Missing ⚠️
...move-vm/runtime/src/storage/struct_type_storage.rs	40.0%	12 Missing ⚠️
...tos-move/aptos-vm/src/verifier/event_validation.rs	0.0%	10 Missing ⚠️
...ptos-move/aptos-vm/src/verifier/resource_groups.rs	0.0%	9 Missing ⚠️
...party/move/move-vm/runtime/src/native_functions.rs	69.2%	8 Missing ⚠️
... and 10 more

Additional details and impacted files

@@              Coverage Diff               @@
##           george/main   #14075     +/-   ##
==============================================
- Coverage         59.1%    59.0%   -0.1%     
==============================================
  Files              828      833      +5     
  Lines           201915   202672    +757     
==============================================
+ Hits            119416   119683    +267     
- Misses           82499    82989    +490


gelash

Does it make sense to add script_hash to CompiledScript or Script? It could also be something we build, e.g., in constructors.

third_party/move/move-vm/runtime/src/storage/dummy.rs

third_party/move/move-vm/runtime/src/storage/loader.rs

third_party/move/move-vm/runtime/src/storage/dummy.rs

gelash · 2024-07-26T12:43:27Z

third_party/move/move-vm/runtime/src/data_cache.rs

+
+            let (data, bytes_loaded) = match loader {
+                Loader::V1(_) => {
+                    let maybe_module = module_store.module_at(&ty_tag.module_id());


Matter of style, but as a fan of fancy chaining I have to mention that we can compute the metadata directly with .map_or_else(|| &[], |m| m.module().metadata());
On one hand maybe_module isn't used elsewhere; on the other hand it makes reading a bit clearer. Similarly, match might be more verbose, so up to you.

I don't think we can use a map because then the lifetimes become weird... let's see if I can refactor the code so that it works, mapping or else is nicer indeed.
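For readers wondering where the lifetime trouble comes from, here is a small sketch. It assumes the lookup returns an `Option<Arc<Module>>`; `Module` and `metadata_len` are stand-ins made up for the example, and the real types in data_cache.rs may differ.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the runtime `Module`.
pub struct Module {
    metadata: Vec<u8>,
}

impl Module {
    pub fn new(metadata: Vec<u8>) -> Self {
        Self { metadata }
    }

    pub fn metadata(&self) -> &[u8] {
        &self.metadata
    }
}

/// Returns the number of metadata bytes, or 0 if there is no module.
pub fn metadata_len(maybe_module: Option<Arc<Module>>) -> usize {
    const EMPTY: &[u8] = &[];
    // Calling `maybe_module.map_or(EMPTY, |m| m.metadata())` directly does not
    // compile: the `Arc` is moved into the closure and dropped when the closure
    // returns, so the returned slice would outlive its owner.
    // Borrowing with `as_ref()` leaves the `Arc` inside `maybe_module`, so the
    // slice stays valid for as long as `maybe_module` is alive.
    let metadata: &[u8] = maybe_module.as_ref().map_or(EMPTY, |m| m.metadata());
    metadata.len()
}
```

The borrowed slice cannot escape the scope of `maybe_module`, which is why a shared call site for V1 and V2 gets awkward when one side borrows from a short-lived `Option` and the other from long-lived storage.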

gelash · 2024-07-26T12:44:19Z

third_party/move/move-vm/runtime/src/data_cache.rs

+                    // If we need to process aggregator lifting, we pass type layout to remote.
+                    // Remote, in turn ensures that all aggregator values are lifted if the resolved
+                    // resource comes from storage.
+                    self.remote.get_resource_bytes_with_metadata_and_layout(


looks like we can move these calls in V1 and V2 outside? we just need to get different metadata?

Hm, not really: we have different lifetimes here, because V2 returns a reference which lives as long as the module storage lives, while the V1 reference lives only as long as the optional module lives. That is why it gets this ugly. Will try to refactor the code a bit to make it look nicer.

gelash · 2024-07-26T12:46:27Z

third_party/move/move-vm/runtime/src/interpreter.rs

-                        PartialVMError::new(StatusCode::FUNCTION_RESOLUTION_FAILURE)
-                            .with_message(format!("Module {} doesn't exist", module_name))
-                    })?;
+                // Note(George): V2 loads function directly to avoid this case completely!


Is whatever the code below was guarding against no longer relevant? Then maybe we can add one sentence to sketch that.

Added a sentence, but yes: V1 uses a cache which may be empty, yet there is an assumption that the module must be in the module cache when we load a function. This seems pretty fragile; in V2 we always fetch a module from module storage and cache on read.
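The difference between the two paths can be sketched as follows. The `Loader::V1` / `Loader::V2` variants mirror the diff above; everything else (`LoadError`, `InMemoryStorage`, `load_module_for_function`, the field names) is hypothetical and only illustrates the cache-miss behaviour.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-ins for the real types.
pub struct Module {
    pub name: String,
}

pub trait ModuleStorage {
    fn fetch_module(&self, name: &str) -> Option<Arc<Module>>;
}

pub struct InMemoryStorage(pub HashMap<String, Arc<Module>>);

impl ModuleStorage for InMemoryStorage {
    fn fetch_module(&self, name: &str) -> Option<Arc<Module>> {
        self.0.get(name).cloned()
    }
}

pub struct LoaderV1 {
    // V1 owns a cache that may or may not have been populated yet.
    pub module_cache: HashMap<String, Arc<Module>>,
}

pub struct LoaderV2;

pub enum Loader {
    V1(LoaderV1),
    V2(LoaderV2),
}

#[derive(Debug, PartialEq)]
pub enum LoadError {
    NotInCache(String),
    NotInStorage(String),
}

impl Loader {
    pub fn load_module_for_function(
        &self,
        storage: &impl ModuleStorage,
        name: &str,
    ) -> Result<Arc<Module>, LoadError> {
        match self {
            // V1 relies on the module already being cached; a miss is an error
            // even if the module exists in storage.
            Loader::V1(loader) => loader
                .module_cache
                .get(name)
                .cloned()
                .ok_or_else(|| LoadError::NotInCache(name.to_string())),
            // V2 always goes through storage, so there is no such assumption.
            Loader::V2(_) => storage
                .fetch_module(name)
                .ok_or_else(|| LoadError::NotInStorage(name.to_string())),
        }
    }
}
```

With an empty V1 cache the lookup fails although the module is present in storage, which is the fragile case described above.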

gelash · 2024-07-26T12:47:41Z

third_party/move/move-vm/runtime/src/interpreter.rs

@@ -645,6 +667,15 @@ impl Interpreter {
                .map_err(|err| err.to_partial())
            },
            NativeResult::LoadModule { module_name } => {
+                if let Loader::V2(_) = resolver.loader() {


mhm, why do we need a new check?

Below we call check_dependencies_and_charge_gas. The current implementation uses the annoying allow_module_failure only for the first set of ids, because these dependencies may indeed not exist (e.g., a non-existent entry function), whereas dependencies further down the closure always exist. So we optionally check whether a missing module is allowed to be a LINKER_ERROR or is an invariant violation.

But actually yes, in this implementation I do not think we need this here, good point
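The root-versus-transitive distinction described above can be sketched like this. The function name, the error enum, and the dependency map are assumptions for illustration; gas charging is left out entirely.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

#[derive(Debug, PartialEq)]
pub enum DependencyError {
    /// A root id may legitimately be missing (e.g. a non-existent entry function).
    LinkerError(String),
    /// A dependency of an existing module must exist; otherwise something is broken.
    InvariantViolation(String),
}

/// Walks the transitive closure of `roots`, where `deps` maps each existing
/// module to its immediate dependencies.
pub fn check_dependencies(
    deps: &HashMap<String, Vec<String>>,
    roots: &[&str],
) -> Result<BTreeSet<String>, DependencyError> {
    let mut visited = BTreeSet::new();
    // Each queue entry carries its own `allow_module_failure` flag:
    // true for the caller-provided roots, false for everything below them.
    let mut queue: VecDeque<(String, bool)> =
        roots.iter().map(|r| (r.to_string(), true)).collect();

    while let Some((name, allow_module_failure)) = queue.pop_front() {
        if !visited.insert(name.clone()) {
            continue;
        }
        match deps.get(&name) {
            Some(immediate) => queue.extend(immediate.iter().map(|d| (d.clone(), false))),
            None if allow_module_failure => return Err(DependencyError::LinkerError(name)),
            None => return Err(DependencyError::InvariantViolation(name)),
        }
    }
    Ok(visited)
}
```

A missing root surfaces as a linker error the user can act on, while a missing transitive dependency signals a bug in publishing or storage.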

third_party/move/move-vm/runtime/src/interpreter.rs

third_party/move/move-vm/runtime/src/storage/verifier.rs

third_party/move/move-vm/runtime/src/storage/struct_name_index_map.rs

third_party/move/move-vm/runtime/src/loader/modules.rs

gelash

Thanks a lot for this George, the fact that you keep improving everything around your changes is amazing

gelash · 2024-07-29T18:37:21Z

third_party/move/move-vm/runtime/src/data_cache.rs

@@ -171,7 +172,7 @@ impl<'r> TransactionDataCache<'r> {
        Ok(change_set)
    }

-    pub(crate) fn num_mutated_accounts(&self, sender: &AccountAddress) -> u64 {
+    pub(crate) fn num_mutated_resources(&self, sender: &AccountAddress) -> u64 {


is it resources? so it matches the account and counts mutated entries associated with the account, suppose that makes sense and thanks for fixing! however,

let's also change inner variable names?

total_mutated_accounts is initialized with 1 because there is an account resource? But that seems ugly: what if we call it with some account that wasn't even relevant? It would just return 1. The 'sender' name maybe helps a bit here.

Yes, because it uses data_map in its implementation and not modules. This nicely aligns with the fact that we will not store modules inside anymore👍

To be honest, I am not sure how useful this API is. It is only used for tests, but I know that not everyone is happy with removing public interfaces no matter what😄 But this seems so ugly... I agree, why do we assume the sender is mutated? My guess is that this must be the adapter's responsibility, where we have senders.

Example of a test:

// The resource was published to "account1" and the sender's account
// (TEST_ADDR) is assumed to be mutated as well (e.g., in a subsequent
// transaction epilogue).
assert_eq!(sess.num_mutated_resources(&TEST_ADDR), 2);

But this is just wrong, test assumes that something is mutated....

So I see two ways:

Remove this assumption on sender. Move doesn't care about the sender or the epilogue, so this is clearly off.

Remove this API completely, or better make it #[cfg(feature = "testing")] or something like that.
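To make the first option concrete, here is a sketch of a counter that makes no assumption about the sender. The cache layout, the `dirty` flag, and the `u64` address type are invented for the example and do not reflect the actual TransactionDataCache.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for the real types.
pub type AccountAddress = u64;
pub type StructTag = String;

pub struct ResourceEntry {
    pub value: Vec<u8>,
    pub dirty: bool,
}

#[derive(Default)]
pub struct TransactionDataCache {
    account_map: BTreeMap<AccountAddress, BTreeMap<StructTag, ResourceEntry>>,
}

impl TransactionDataCache {
    /// Records a resource that was only read from storage.
    pub fn load(&mut self, addr: AccountAddress, tag: &str, value: Vec<u8>) {
        self.account_map
            .entry(addr)
            .or_default()
            .insert(tag.to_string(), ResourceEntry { value, dirty: false });
    }

    /// Records a resource that was written during the transaction.
    pub fn write(&mut self, addr: AccountAddress, tag: &str, value: Vec<u8>) {
        self.account_map
            .entry(addr)
            .or_default()
            .insert(tag.to_string(), ResourceEntry { value, dirty: true });
    }

    /// Counts only entries that were actually mutated under `addr`.
    /// There is no implicit "+1" for a sender account.
    pub fn num_mutated_resources(&self, addr: &AccountAddress) -> u64 {
        self.account_map.get(addr).map_or(0, |data_map| {
            data_map.values().filter(|entry| entry.dirty).count() as u64
        })
    }
}
```

An address that never appears in the cache yields 0 here. The second option would additionally hide the method behind a test-only cfg attribute.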

@gelash what do you think?

Let's see if we are missing something; otherwise I'd purge it.

cc @runtian-zhou @vgao1996 who might have some context about this

Ping @runtian-zhou @vgao1996

aptos-move/framework/src/module_metadata.rs

third_party/move/move-vm/runtime/src/storage/struct_name_index_map.rs

third_party/move/move-vm/runtime/src/session.rs

gelash · 2024-07-29T23:10:32Z

third_party/move/move-vm/runtime/src/interpreter.rs

-                    .map_err(|err| self.attach_state_if_invariant_violation(err, &current_frame))?;
+            // Select which resolver to use: if we are executing in the context of the entrypoint
+            // function, we use the one provided by the caller.
+            let resolver = if self.call_stack.is_empty() {


is this new logic? previously it worked through Script / Module distinction, right? let's be careful here, but it does seem we were creating resolvers every time for scripts and now we re-use? can call stack be cleared e.g. on abort, and then have unintended consequences? just asking.
Maybe we can encapsulate the logic somewhere (and add any other safeguards/checks if needed)?

Yes, good point. I rewrote those ugly APIs by storing LoadedFunction everywhere and providing it with a scope. @runtian-zhou I think this looks cleaner, and the V2 loader API fits perfectly. WDYT? See the last commit in this PR (Note: I did not polish certain types, etc., but added this code to give an idea).

Update: this change is refactored into #14185

third_party/move/move-vm/runtime/src/interpreter.rs

third_party/move/move-vm/runtime/src/loader/mod.rs

gelash · 2024-07-29T23:26:25Z

third_party/move/move-vm/runtime/src/loader/mod.rs

+    ) -> PartialVMResult<Self> {
+        let module_id = function
+            .module_id()
+            .ok_or_else(|| {


we could move this inside module_id call, with error saying "Function {} must have a module owner", i.e. not always, but when we call, it should.

Hmmm, yes, but scripts still do not have one, and that is not really an error: there are a few debug use cases where we optionally print things. My take is that the function should not be owned, but LoadedFunction should be, because it is the one that has full context and can actually carry the arced module or script. @runtian-zhou, what do you think?

I have tried that approach, and it requires changing types from Arc to LoadedFunction throughout, and I was not convinced this is better than just special casing for scripts.
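A sketch of what "LoadedFunction carries its owner" could look like. The enum and field names are guesses for illustration and may not match the code that was eventually merged.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the real types.
pub struct Module {
    pub id: String,
}

pub struct Script;

pub struct Function {
    pub name: String,
}

/// The owner keeps the defining module or script alive.
pub enum LoadedFunctionOwner {
    Script(Arc<Script>),
    Module(Arc<Module>),
}

pub struct LoadedFunction {
    pub owner: LoadedFunctionOwner,
    pub function: Arc<Function>,
}

impl LoadedFunction {
    /// Scripts have no module id, so this is optional rather than an error.
    pub fn module_id(&self) -> Option<&str> {
        match &self.owner {
            LoadedFunctionOwner::Module(module) => Some(module.id.as_str()),
            LoadedFunctionOwner::Script(_) => None,
        }
    }

    /// Debug-style name that special-cases scripts.
    pub fn name_as_pretty_string(&self) -> String {
        match self.module_id() {
            Some(id) => format!("{}::{}", id, self.function.name),
            None => format!("script::{}", self.function.name),
        }
    }
}
```

Callers that really need a module id can turn `None` into an error at their own call site, while debug printing keeps working for scripts.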

runtian-zhou

My general take is that we don't need to version the code just for the sake of switching things over. The new loader should be functionally equivalent, so if keeping two versions at the same time makes the code harder to read and maintain, we don't have to keep both. WDYT?

runtian-zhou · 2024-07-31T07:35:46Z

third_party/move/move-vm/runtime/src/session.rs

@@ -60,6 +62,7 @@ impl<'r, 'l> Session<'r, 'l> {
        args: Vec<impl Borrow<[u8]>>,
        gas_meter: &mut impl GasMeter,
        traversal_context: &mut TraversalContext,
+        module_storage: &impl ModuleStorage,


Would it make more sense to move the module_storage into Session? That way we could avoid changing the api. Also from the api standpoint I think it also makes sense, as we shouldn't expect modules loaded within a session to be altered.

Not really, and this is something I do not want to do on purpose, mostly because of init_module and respawned sessions. If we capture it in the Move session, then how do you support init_module? It is not possible unless you respawn the session or stash published modules temporarily inside the session (which is exactly what led to the existing problems).

The APIs can be made nice by introducing a "context" struct which you pass everywhere and which contains the gas meter, the module storage, etc., something similar to Resolver.
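One possible shape for such a context, as a sketch only: `ExecutionContext`, the simplified traits, and the helper types below are hypothetical and exist just to show the storage being supplied per call rather than captured by the session.

```rust
// Hypothetical, simplified traits and types.
pub trait GasMeter {
    fn charge(&mut self, amount: u64) -> Result<(), String>;
}

pub trait ModuleStorage {
    fn contains(&self, name: &str) -> bool;
}

#[derive(Default)]
pub struct TraversalContext {
    pub visited: Vec<String>,
}

/// Bundles everything that is passed to each session call.
pub struct ExecutionContext<'a, G: GasMeter, S: ModuleStorage> {
    pub gas_meter: &'a mut G,
    pub traversal_context: &'a mut TraversalContext,
    pub module_storage: &'a S,
}

pub struct Session;

impl Session {
    pub fn execute_entry_function<G: GasMeter, S: ModuleStorage>(
        &mut self,
        module: &str,
        ctx: &mut ExecutionContext<'_, G, S>,
    ) -> Result<(), String> {
        ctx.gas_meter.charge(1)?;
        if !ctx.module_storage.contains(module) {
            return Err(format!("Module {} doesn't exist", module));
        }
        ctx.traversal_context.visited.push(module.to_string());
        Ok(())
    }
}

// Minimal implementations, used only to exercise the sketch.
pub struct SimpleGasMeter {
    pub balance: u64,
}

impl GasMeter for SimpleGasMeter {
    fn charge(&mut self, amount: u64) -> Result<(), String> {
        if amount > self.balance {
            return Err("out of gas".to_string());
        }
        self.balance -= amount;
        Ok(())
    }
}

pub struct InMemoryStorage(pub Vec<String>);

impl ModuleStorage for InMemoryStorage {
    fn contains(&self, name: &str) -> bool {
        self.0.iter().any(|n| n == name)
    }
}
```

Because the storage arrives with each call, the same session can be handed an updated storage after a publish (for example, before running init_module) without being respawned.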

third_party/move/move-vm/runtime/src/loader/mod.rs

runtian-zhou

Personally I would be hesitant to just land the PR as is because we are using dummy storage everywhere, meaning this V2 logic isn't actually being used and tested for now right? Or did I miss anything?

gelash requested review from gelash, ziaptos, runtian-zhou, vgao1996, zekun000 and igor-aptos July 26, 2024 10:42

gelash reviewed Jul 26, 2024


gelash reviewed Jul 26, 2024


georgemitenkov marked this pull request as ready for review July 26, 2024 15:06

gelash reviewed Jul 26, 2024


gelash reviewed Jul 26, 2024


georgemitenkov changed the title ~~WIP: almost full loader V2 API integration for MoveVM~~ [move] Loader V2 API integration for MoveVM Jul 27, 2024

georgemitenkov force-pushed the george/loader-v2 branch from 1b082d8 to 65eb531 Compare July 27, 2024 17:34

georgemitenkov requested review from davidiw, movekevin and wrwg as code owners July 27, 2024 18:48

georgemitenkov removed request for davidiw, wrwg and movekevin July 27, 2024 22:49

gelash approved these changes Jul 29, 2024


georgemitenkov force-pushed the george/loader-v2 branch 3 times, most recently from f80b102 to 4b3f255 Compare July 30, 2024 14:14

runtian-zhou reviewed Jul 31, 2024


georgemitenkov force-pushed the george/loader-v2 branch 3 times, most recently from 7a9cbfa to 0c56eff Compare July 31, 2024 12:06

georgemitenkov added a commit that referenced this pull request Sep 1, 2024

[move] Full loader integration on MoveVM side (#14075)

7ff6ea8

This was referenced Sep 2, 2024

[loader-v2] Addressing some loader TODOs and performance issues #14495

Merged

[move] Improve & test struct index map #14507

Merged

georgemitenkov added a commit that referenced this pull request Sep 4, 2024

[move] Full loader integration on MoveVM side (#14075)

8ce35c7

georgemitenkov added a commit that referenced this pull request Sep 4, 2024

[move] Full loader integration on MoveVM side (#14075)

4f1e9fb

georgemitenkov added a commit that referenced this pull request Sep 5, 2024

[move] Full loader integration on MoveVM side (#14075)

255a869

georgemitenkov added a commit that referenced this pull request Sep 5, 2024

[move] Full loader integration on MoveVM side (#14075)

ee1f64b

georgemitenkov added a commit that referenced this pull request Sep 15, 2024

[move] Full loader integration on MoveVM side (#14075)

48c70ee

georgemitenkov added a commit that referenced this pull request Sep 24, 2024

[move] Full loader integration on MoveVM side (#14075)

b486628

georgemitenkov mentioned this pull request Oct 1, 2024

[IGNORE] avoid module validations (no global cache) #14827

Closed

22 tasks

georgemitenkov added a commit that referenced this pull request Oct 2, 2024

[move] Full loader integration on MoveVM side (#14075)

6f34ff9

georgemitenkov added a commit that referenced this pull request Oct 5, 2024

[move] Full loader integration on MoveVM side (#14075)

9da65cc

georgemitenkov added a commit that referenced this pull request Oct 7, 2024

[move] Full loader integration on MoveVM side (#14075)

c5265c8

georgemitenkov added a commit that referenced this pull request Oct 8, 2024

[move] Full loader integration on MoveVM side (#14075)

9bfbf7d

georgemitenkov added a commit that referenced this pull request Oct 9, 2024

[move] Full loader integration on MoveVM side (#14075)

43e77ca

georgemitenkov added a commit that referenced this pull request Oct 14, 2024

[move] Full loader integration on MoveVM side (#14075)

ccf340f

georgemitenkov added a commit that referenced this pull request Oct 16, 2024

[move] Full loader integration on MoveVM side (#14075)

6da5365

georgemitenkov added a commit that referenced this pull request Oct 21, 2024

[move] Full loader integration on MoveVM side (#14075)

1f82b66

georgemitenkov added a commit that referenced this pull request Oct 22, 2024

[move] Full loader integration on MoveVM side (#14075)

48a8b77

georgemitenkov added a commit that referenced this pull request Oct 22, 2024

[move] Full loader integration on MoveVM side (#14075)

7e3a576

georgemitenkov added a commit that referenced this pull request Oct 23, 2024

[move] Full loader integration on MoveVM side (#14075)

6cb5f24

georgemitenkov mentioned this pull request Oct 23, 2024

[loader] Remove V1 loader #15070

Open

22 tasks

georgemitenkov added a commit that referenced this pull request Oct 28, 2024

[move] Full loader integration on MoveVM side (#14075)

f5860e5
