Lock mutex in more client methods. #2567

afck · 2024-10-03T16:51:56Z

Motivation

The tests in #2538 fail. A (possibly the) scenario that can cause a processInbox mutation to unexpectedly not produce a block is the following:

1. Chain 1 creates a block B that sends a message to chain 2.
2. The running node service of the owner of chain 2 receives the notification from the validators.
3. The node service's client listener downloads B and starts processing it. It updates the chain state and puts the message in the outbox…
4. Now processInbox is called: It sees that B has already been handled. On the other hand, the inbox is still empty, so it doesn't create a block.
5. …finally the client listener task puts the messages in chain 2's inbox.

Proposal

Use the mutex in the ChainState in more places, to ensure that certain tasks don't overlap.
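
A minimal sketch of the idea, using only the standard library and threads rather than the async client: the listener takes a coarse mutex around both of its steps, so a concurrent inbox check can never observe "block handled, inbox still empty". All names here (`ChainState` fields, `listener`, `process_inbox`, `run_scenario`, `Observation`) are hypothetical stand-ins for illustration, not Linera's actual types or methods.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the per-chain client state (not the real type).
#[derive(Default)]
struct ChainState {
    block_handled: bool,
    inbox: Vec<String>,
}

/// What a `process_inbox`-like call observed.
#[derive(Debug, PartialEq)]
enum Observation {
    NothingYet,       // block B has not been handled yet
    MessageAvailable, // block B handled and its message is in the inbox
    Inconsistent,     // block B handled, but the inbox is still empty
}

/// Listener task: two separate state updates with a pause in between.
fn listener(state: &Mutex<ChainState>, client_mutex: &Mutex<()>, use_lock: bool) {
    // Holding the coarse mutex makes both updates appear as one step.
    let _guard = use_lock.then(|| client_mutex.lock().unwrap());
    state.lock().unwrap().block_handled = true;
    thread::sleep(Duration::from_millis(20));
    state.lock().unwrap().inbox.push("message from chain 1".to_string());
}

/// Inbox check: takes the same coarse mutex before looking at the state.
fn process_inbox(state: &Mutex<ChainState>, client_mutex: &Mutex<()>, use_lock: bool) -> Observation {
    let _guard = use_lock.then(|| client_mutex.lock().unwrap());
    let state = state.lock().unwrap();
    match (state.block_handled, state.inbox.is_empty()) {
        (false, _) => Observation::NothingYet,
        (true, false) => Observation::MessageAvailable,
        (true, true) => Observation::Inconsistent,
    }
}

/// Runs the listener on a thread and checks the inbox while it is working.
fn run_scenario(use_lock: bool) -> Observation {
    let state = Arc::new(Mutex::new(ChainState::default()));
    let client_mutex = Arc::new(Mutex::new(()));
    let (s, m) = (state.clone(), client_mutex.clone());
    let handle = thread::spawn(move || listener(&s, &m, use_lock));
    thread::sleep(Duration::from_millis(5));
    let seen = process_inbox(&state, &client_mutex, use_lock);
    handle.join().unwrap();
    seen
}

fn main() {
    println!("with client mutex: {:?}", run_scenario(true));
}
```

With the mutex, the check either runs before the listener starts or after it has finished. Calling `run_scenario(false)` skips the coarse mutex and will usually, depending on timing, report `Inconsistent`.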

Test Plan

I ran the test_wasm_end_to_end_fungible::storage_service_grpc test locally 50 times successfully, together with the optimization in #2538 and the fix in #2562.

Release Plan

These changes should be backported to the latest devnet branch, then
- be released in a new SDK.
These changes should be backported to the latest testnet branch, then
- be released in a new SDK.

Links

reviewer checklist

ma2bd · 2024-10-03T19:16:45Z

linera-core/src/client/chain_state.rs

@@ -161,7 +161,7 @@ impl ChainState {
        self.pending_blobs.clear();
    }

-    pub fn preparing_block(&self) -> Arc<Mutex<()>> {
-        self.preparing_block.clone()
+    pub fn client_mutex(&self) -> Arc<Mutex<()>> {


ma2bd: Why is this pub?

afck: Because the ChainClient has to use it. I can make it pub(super).

Twey: This whole struct is private to linera_core::client. I'm usually suspicious of pub(super), pub(crate) and friends because, even though they're sometimes necessary (especially for use in macros), they imply non-local knowledge of the structure the code is embedded into. The upshot is that types look different (expose different behaviour) depending on where they're imported, which usually just results in a lot of spurious changes when refactoring, but coupled with conditional compilation could lead to some hard-to-spot compilation breakages.

afck: Yes, I also prefer just using pub in these cases. Happy to revert this in a later PR.

ma2bd: As a rule, public APIs need to be minimized.

afck: I'd argue that they are, since ChainState itself is only visible in the client module and not exported further. So effectively all its methods are pub(super) anyway.
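
The visibility argument can be illustrated with a small sketch. The module layout below is an assumption made for the example (a private `chain_state` submodule inside `client`), and `ChainClient::with_lock` is a hypothetical helper, not part of the PR. It shows that a `pub` method on a type living in a private module is only reachable where the type itself is.

```rust
mod client {
    // Private submodule: nothing in it can be named from outside `client`.
    mod chain_state {
        use std::sync::{Arc, Mutex};

        pub struct ChainState {
            client_mutex: Arc<Mutex<()>>,
        }

        impl ChainState {
            pub fn new() -> Self {
                Self { client_mutex: Arc::new(Mutex::new(())) }
            }

            // Declared `pub`, but effectively limited to the `client` module,
            // because `ChainState` is not exported any further.
            pub fn client_mutex(&self) -> Arc<Mutex<()>> {
                self.client_mutex.clone()
            }
        }
    }

    use chain_state::ChainState;

    pub struct ChainClient {
        state: ChainState,
    }

    impl ChainClient {
        pub fn new() -> Self {
            Self { state: ChainState::new() }
        }

        /// Hypothetical helper: runs `f` while holding the client mutex.
        pub fn with_lock<T>(&self, f: impl FnOnce() -> T) -> T {
            let mutex = self.state.client_mutex();
            let _guard = mutex.lock().unwrap();
            f()
        }
    }
}

fn main() {
    let client = client::ChainClient::new();
    let answer = client.with_lock(|| 42);
    println!("{answer}");
    // The next line would not compile: module `chain_state` is private.
    // let _ = client::chain_state::ChainState::new();
}
```

Under this layout, changing `client_mutex` from `pub` to `pub(super)` does not change what outside code can reach; it only restates a restriction the module structure already enforces.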

ma2bd · 2024-10-03T19:18:05Z

linera-core/src/client/mod.rs

+        let mutex = self.state().client_mutex();
+        let _guard = mutex.lock_owned().await;


ma2bd: Do we need two separate lines when lock_owned is used? (And otherwise, why lock_owned?)

afck: We actually do! self.state() returns a ChainGuard that can't be held across an await point… specifically so we would never keep an entry in the chain states map locked.

And it looks like Rust would drop that guard only after the whole expression, i.e. after the await.
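
The drop-order point can be checked with a std-only sketch. The types below (`ChainGuard`, `FakeMutex`, `Held`) are hypothetical stand-ins that only record events; they are not Linera's `ChainGuard` or the async mutex, and `lock()` stands in for the `.lock_owned().await` step. In Rust, a temporary created inside a statement is dropped at the end of that statement, so chaining the calls keeps the guard alive during the lock step, while splitting them drops it first.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for the guard returned by `self.state()`.
struct ChainGuard {
    log: Log,
}

/// Hypothetical stand-in for the client mutex; `lock` only records an event.
struct FakeMutex {
    log: Log,
}

/// Marker for "the lock is held".
struct Held;

impl ChainGuard {
    fn client_mutex(&self) -> FakeMutex {
        FakeMutex { log: self.log.clone() }
    }
}

impl Drop for ChainGuard {
    fn drop(&mut self) {
        self.log.borrow_mut().push("guard dropped");
    }
}

impl FakeMutex {
    /// Stands in for the `.lock_owned().await` step.
    fn lock(&self) -> Held {
        self.log.borrow_mut().push("lock acquired");
        Held
    }
}

fn state(log: &Log) -> ChainGuard {
    ChainGuard { log: log.clone() }
}

/// One chained statement: the temporary guard outlives the lock step.
fn one_statement() -> Vec<&'static str> {
    let log = Log::default();
    let _held = state(&log).client_mutex().lock();
    let events = log.borrow().to_vec();
    events
}

/// Two statements: the temporary guard is dropped before the lock step.
fn two_statements() -> Vec<&'static str> {
    let log = Log::default();
    let mutex = state(&log).client_mutex();
    let _held = mutex.lock();
    let events = log.borrow().to_vec();
    events
}

fn main() {
    println!("one statement:  {:?}", one_statement());
    println!("two statements: {:?}", two_statements());
}
```

In the async code the same rule means the chained form would keep the guard alive across the await, which is what the two-line version avoids.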

Twey

This mutex is a hack, but I'm happy enough with the hack being extended to cover more cases in anticipation of a larger linera_core::client refactor. I think it's good that at least the locking logic remains within the client this time.

afck requested review from Twey and ma2bd October 3, 2024 16:51

ma2bd reviewed Oct 3, 2024

Lock mutex in more client methods.

69d787d

This was referenced Oct 4, 2024

Revisit ChainState locking. #2569

Open

Replace ApplicationDescription with Blobs #2426

Closed

Make mutating ChainState methods pub(super).

5ffc29d

afck force-pushed the client-mutex branch from fef2b2f to 5ffc29d Compare October 4, 2024 09:36

Twey approved these changes Oct 4, 2024

afck marked this pull request as ready for review October 4, 2024 09:38

afck merged commit 27c53b0 into linera-io:main Oct 4, 2024
5 checks passed

afck deleted the client-mutex branch October 4, 2024 10:33

afck mentioned this pull request Oct 7, 2024

Don't keep the client mutex longer than necessary. #2581

Merged
