
feat: replay logs of different tables in parallel #1492

Merged — 10 commits merged into apache:main on Mar 11, 2024

Conversation

@Lethannn (Contributor) commented Mar 4, 2024

Rationale

Related to #1466

Detailed Changes

Replay logs of different tables in parallel

Test Plan

CI

Cargo.toml Outdated
@@ -107,6 +107,7 @@ cluster = { path = "src/cluster" }
criterion = "0.5"
horaedb-client = "1.0.2"
common_types = { path = "src/common_types" }
dashmap = "5.5.3"
Contributor commented:

I wonder if this is a required dependency for this task?

If HashMap works, I prefer to stick with it first.

@Lethannn (Author) commented:

I'm trying to run `replay_table_log_entries` concurrently, but I ran into an issue with `serial_exec_ctxs`, which is a mutable reference to a HashMap. I had to wrap it in Arc and Mutex, and then every time I grab a mutable reference to a value from the map, it locks the entire map.

DashMap allows concurrent access to different keys. I wonder if there's an approach to make HashMap work in this case.
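
To illustrate what I mean, here is a minimal sketch of the sharded locking DashMap provides (illustrative only, not code from this PR; it assumes the `dashmap` crate added in Cargo.toml above):

```rust
use dashmap::DashMap;

fn main() {
    // DashMap splits its entries across internal shards, each behind
    // its own lock, so mutating one key does not block the whole map.
    let map: DashMap<u32, Vec<&str>> = DashMap::new();
    map.insert(1, Vec::new());
    map.insert(2, Vec::new());

    // `get_mut` locks only the shard that holds key 1; a concurrent
    // `get_mut(&2)` on another thread can usually proceed in parallel.
    if let Some(mut entries) = map.get_mut(&1) {
        entries.push("log entry for table 1");
    }
}
```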

@jiacai2050 (Contributor) commented Mar 8, 2024:

> then every time I grab a mutable reference to a value from the map, it locks the entire map.

I think it's fine to use a plain Mutex here, since it is not the bottleneck; `replay_table_log_entries` is the heaviest task here.

Also, there is a partitioned lock in our codebase that you can use if you want to optimize here:
https://github.com/apache/incubator-horaedb/blob/9f166f3daa9a02ef8af1e733c22f956ab97e7aaf/src/components/partitioned_lock/src/lib.rs#L130
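
The idea behind a partitioned lock, as a rough sketch (names here are illustrative only; the real implementation is in the file linked above):

```rust
use std::{
    collections::{hash_map::DefaultHasher, HashMap},
    hash::{Hash, Hasher},
    sync::Mutex,
};

/// Toy partitioned map: keys hash to one of N shards, so writers of
/// keys in different shards do not contend on the same lock.
struct PartitionedMap<V> {
    shards: Vec<Mutex<HashMap<u64, V>>>,
}

impl<V> PartitionedMap<V> {
    fn new(partitions: usize) -> Self {
        let shards = (0..partitions).map(|_| Mutex::new(HashMap::new())).collect();
        Self { shards }
    }

    // Pick the shard for a key by hashing it.
    fn shard(&self, key: u64) -> &Mutex<HashMap<u64, V>> {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        &self.shards[(hasher.finish() as usize) % self.shards.len()]
    }

    fn insert(&self, key: u64, value: V) {
        self.shard(key).lock().unwrap().insert(key, value);
    }
}

fn main() {
    let map: PartitionedMap<&str> = PartitionedMap::new(16);
    map.insert(1, "ctx for table 1");
    map.insert(2, "ctx for table 2");
}
```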

@Lethannn (Author) commented Mar 8, 2024:

> then every time I grab a mutable reference to a value from the map, it locks the entire map.
>
> I think it's fine to use a plain Mutex here, since it is not the bottleneck; `replay_table_log_entries` is the heaviest task here.
>
> Also, there is a partitioned lock in our codebase that you can use if you want to optimize here:
> https://github.com/apache/incubator-horaedb/blob/9f166f3daa9a02ef8af1e733c22f956ab97e7aaf/src/components/partitioned_lock/src/lib.rs#L130

Awesome! I'm gonna check it out. Thx for the heads up


let alter_failed_tables = HashMap::new();
let alter_failed_tables_ref = Arc::new(Mutex::new(alter_failed_tables));

let mut serial_exec_ctxs_dash_map = DashMap::new();
@jiacai2050 (Contributor) commented Mar 8, 2024:

This map seems unnecessary; what I have in mind is something like this:

modified   src/analytic_engine/src/instance/wal_replayer.rs
@@ -29,6 +29,7 @@ use common_types::{
     schema::{IndexInWriterSchema, Schema},
     table::ShardId,
 };
+use futures::StreamExt;
 use generic_error::BoxError;
 use lazy_static::lazy_static;
 use logger::{debug, error, info, trace, warn};
@@ -374,6 +375,7 @@ impl RegionBasedReplay {
         // TODO: No `group_by` method in `VecDeque`, so implement it manually here...
         Self::split_log_batch_by_table(log_batch, &mut table_batches);
 
+        let mut replay_tasks = Vec::with_capacity(table_batches.len());
         // TODO: Replay logs of different tables in parallel.
         for table_batch in table_batches {
             // Some tables may have failed in previous replay, ignore them.
@@ -384,22 +386,27 @@ impl RegionBasedReplay {
             // Replay all log entries of current table.
             // Some tables may have been moved to other shards or dropped, ignore such logs.
             if let Some(ctx) = serial_exec_ctxs.get_mut(&table_batch.table_id) {
-                let result = replay_table_log_entries(
+                replay_tasks.push(replay_table_log_entries(
                     &context.flusher,
                     context.max_retry_flush_limit,
                     &mut ctx.serial_exec,
                     &ctx.table_data,
                     log_batch.range(table_batch.range),
-                )
-                .await;
+                ));
 
-                // If occur error, mark this table as failed and store the cause.
-                if let Err(e) = result {
-                    failed_tables.insert(table_batch.table_id, e);
-                }
+                // If occur error, mark this table as failed and store the
+                // cause. if let Err(e) = result {
+                //     failed_tables.insert(table_batch.table_id, e);
+                // }
             }
         }
-
+        for ret in futures::stream::iter(replay_tasks)
+            .buffer_unordered(20)
+            .collect::<Vec<_>>()
+            .await
+        {
+            // insert into failed_tables if there are errors
+        }
         Ok(())
     }

But this fails to compile due to the mutable reference:

error[E0499]: cannot borrow `*serial_exec_ctxs` as mutable more than once at a time
   --> src/analytic_engine/src/instance/wal_replayer.rs:388:32
    |
388 |             if let Some(ctx) = serial_exec_ctxs.get_mut(&table_batch.table_id) {
    |                                ^^^^^^^^^^^^^^^^ `*serial_exec_ctxs` was mutably borrowed here in the previous iteration of the loop
...
403 |         for ret in futures::stream::iter(replay_tasks)
    |                                          ------------ first borrow used here, in later iteration of loop


So the first step in this task is to remove those mutable references.

The fix should be easy: just define `serial_exec_ctxs` with the type `Arc<Mutex<HashMap>>`.

@Lethannn (Author) commented:

async fn replay_single_batch(
    context: &ReplayContext,
    log_batch: VecDeque<LogEntry<ReadPayload>>,
    serial_exec_ctxs: Arc<tokio::sync::Mutex<HashMap<TableId, SerialExecContext<'_>>>>,
    failed_tables: &mut FailedTables,
) -> Result<()> {
    let mut table_batches = Vec::new();
    // TODO: No `group_by` method in `VecDeque`, so implement it manually here...
    Self::split_log_batch_by_table(log_batch, &mut table_batches);

    // TODO: Replay logs of different tables in parallel.
    let mut replay_tasks = Vec::with_capacity(table_batches.len());

    for table_batch in table_batches {
        // Some tables may have failed in previous replay, ignore them.
        if failed_tables.contains_key(&table_batch.table_id) {
            continue;
        }

        let serial_exec_ctxs = serial_exec_ctxs.clone();
        replay_tasks.push(async move {
            if let Some(ctx) = serial_exec_ctxs.lock().await.get_mut(&table_batch.table_id) {
                let result = replay_table_log_entries(
                    &context.flusher,
                    context.max_retry_flush_limit,
                    &mut ctx.serial_exec,
                    &ctx.table_data,
                    log_batch.range(table_batch.range),
                )
                .await;
                (table_batch.table_id, result)
            } else {
                (table_batch.table_id, Ok(()))
            }
        });
    }

    for (table_id, ret) in futures::stream::iter(replay_tasks)
        .buffer_unordered(20)
        .collect::<Vec<_>>()
        .await
    {
        // If an error occurred, mark this table as failed and store the cause.
        if let Err(e) = ret {
            failed_tables.insert(table_id, e);
        }
    }

    Ok(())
}

I ran into the same compile failure before. Here is my code; is this what you were expecting? However, my concern is: wouldn't `serial_exec_ctxs.lock().await.get_mut` break concurrency?

@jiacai2050 (Contributor) commented:

> I ran into the same compile failure before.

I pushed my commits to your branch; it compiles OK.

> wouldn't `serial_exec_ctxs.lock().await.get_mut` break concurrency?

Yes, this step will run serially, but we make `replay_table_log_entries` concurrent, which is what we want.
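
A self-contained toy of that pattern (assuming tokio and futures as dependencies; the context here is a cloneable stand-in, unlike the real `SerialExecContext`, which is mutated in place):

```rust
use std::{collections::HashMap, sync::Arc, time::Duration};

use futures::StreamExt;
use tokio::sync::Mutex;

#[tokio::main]
async fn main() {
    let ctxs: Arc<Mutex<HashMap<u32, String>>> = Arc::new(Mutex::new(
        (0..4).map(|id| (id, format!("ctx-{id}"))).collect(),
    ));

    let tasks = (0..4u32).map(|table_id| {
        let ctxs = ctxs.clone();
        async move {
            // The short, serial part: look up the table's context
            // under the lock, then release the guard immediately.
            let ctx = ctxs.lock().await.get(&table_id).cloned();
            // The heavy part (stand-in for `replay_table_log_entries`)
            // runs outside the lock, so tables overlap here.
            tokio::time::sleep(Duration::from_millis(100)).await;
            (table_id, ctx)
        }
    });

    // Drive up to 20 replay futures at once, as in the PR.
    let results: Vec<_> = futures::stream::iter(tasks)
        .buffer_unordered(20)
        .collect()
        .await;
    println!("{results:?}");
}
```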

@jiacai2050 changed the title from "[WIP]fix: Replay logs of different tables in parallel" to "feat: replay logs of different tables in parallel" on Mar 11, 2024
@jiacai2050 (Contributor) commented:

I will merge this PR once CI passes.

@Lethannn If you have more free time, you can measure how much the replay time is reduced in your deployment.

Leave a comment here if you have the numbers.

@jiacai2050 (Contributor) left a comment:

LGTM

@jiacai2050 merged commit 66d7a0d into apache:main on Mar 11, 2024
9 checks passed
zealchen pushed a commit to zealchen/incubator-horaedb that referenced this pull request Apr 9, 2024
## Rationale

Related to apache#1466

## Detailed Changes
Replay logs of different tables in parallel

## Test Plan
CI

---------

Co-authored-by: jiacai2050 <dev@liujiacai.net>