SQLite step serialization overhead #1175

link2xt · 2021-04-16T23:42:10Z

I have added the following profiling to f7775f7 (tag v0.5.2):

diff --git a/sqlx-core/src/sqlite/connection/executor.rs b/sqlx-core/src/sqlite/connection/executor.rs
index 6e09fdc2..f7d97c8e 100644
--- a/sqlx-core/src/sqlite/connection/executor.rs
+++ b/sqlx-core/src/sqlite/connection/executor.rs
@@ -125,7 +125,9 @@ impl<'c> Executor<'c> for &'c mut SqliteConnection {
 
                     // invoke [sqlite3_step] on the dedicated worker thread
                     // this will move us forward one row or finish the statement
+                    let time_start = std::time::SystemTime::now();
                     let s = worker.step(*stmt).await?;
+                    println!("worker: {:.3?}", time_start.elapsed().unwrap());
 
                     match s {
                         Either::Left(changes) => {
diff --git a/sqlx-core/src/sqlite/statement/worker.rs b/sqlx-core/src/sqlite/statement/worker.rs
index 8b1d2299..d1a120fa 100644
--- a/sqlx-core/src/sqlite/statement/worker.rs
+++ b/sqlx-core/src/sqlite/statement/worker.rs
@@ -31,7 +31,9 @@ impl StatementWorker {
             for cmd in rx {
                 match cmd {
                     StatementWorkerCommand::Step { statement, tx } => {
+                        let time_start = std::time::SystemTime::now();
                         let status = unsafe { sqlite3_step(statement.0.as_ptr()) };
+                        println!("sqlite3_step: {:.3?}", time_start.elapsed().unwrap());
 
                         let resp = match status {
                             SQLITE_ROW => Ok(Either::Right(())),

Executing a single statement returning multiple rows, I see a large per-row overhead:

sqlite3_step: 1.497ms
worker: 1.786ms
sqlite3_step: 4.250ms
worker: 4.694ms
sqlite3_step: 69.109µs
worker: 219.427µs
sqlite3_step: 183.521µs
worker: 325.917µs
sqlite3_step: 630.797µs
worker: 812.979µs
sqlite3_step: 25.389µs
worker: 130.274µs
sqlite3_step: 1.283ms
worker: 1.545ms
sqlite3_step: 236.185µs
worker: 426.723µs
sqlite3_step: 1.919ms
worker: 2.194ms
sqlite3_step: 381.938µs
worker: 622.630µs
sqlite3_step: 361.922µs
worker: 527.785µs
sqlite3_step: 49.781µs
worker: 155.005µs
sqlite3_step: 24.133µs
worker: 136.689µs
sqlite3_step: 647.297µs
worker: 812.152µs
sqlite3_step: 353.067µs
worker: 558.957µs
sqlite3_step: 415.232µs
worker: 571.597µs
sqlite3_step: 876.746µs
worker: 1.083ms
sqlite3_step: 42.917µs
worker: 177.336µs
sqlite3_step: 19.772µs
worker: 100.709µs
sqlite3_step: 718.207µs
worker: 924.504µs
sqlite3_step: 24.960µs
worker: 150.134µs

Overhead of about 0.1ms per row just to wait for the kernel to give some time to another thread instead of doing sqlite3_step ourselves is a lot and causes a visible performance degradation compared to rusqilte+r2d2: deltachat/deltachat-core-rust#2353

What is the point of having a separate thread to execute sqlite3_step?

The text was updated successfully, but these errors were encountered:

mehcode · 2021-04-16T23:46:55Z

What is the point of having a separate thread to execute sqlite3_step?

To not block the current thread assuming you are in an async IO environment.

The only reason there is a synchronization block is we are trying to do "zero-copy" SQLite which I think we need to move away from. Rusqlite buffers the SQLite row into a separate data type before handing it to you. If we did this, there would be no real sync overhead.

Additionally, in next there is a blocking runtime for SQLx where it will work using pure sync IO and no async runtime, similar to rusqlite + r2d2.

link2xt · 2021-04-17T00:06:14Z

I have tried moving sqlite3_step call to the executor as follows though:

diff --git a/sqlx-core/src/sqlite/connection/executor.rs b/sqlx-core/src/sqlite/connection/executor.rs
index 6e09fdc2..31dd961e 100644
--- a/sqlx-core/src/sqlite/connection/executor.rs
+++ b/sqlx-core/src/sqlite/connection/executor.rs
@@ -12,7 +12,7 @@ use crate::sqlite::{
 use either::Either;
 use futures_core::future::BoxFuture;
 use futures_core::stream::BoxStream;
-use libsqlite3_sys::sqlite3_last_insert_rowid;
+use libsqlite3_sys::{sqlite3_step, sqlite3_last_insert_rowid, SQLITE_DONE, SQLITE_ROW};
 use std::borrow::Cow;
 use std::sync::Arc;
 
@@ -125,7 +125,16 @@ impl<'c> Executor<'c> for &'c mut SqliteConnection {
 
                     // invoke [sqlite3_step] on the dedicated worker thread
                     // this will move us forward one row or finish the statement
-                    let s = worker.step(*stmt).await?;
+                    let time_start = std::time::SystemTime::now();
+                    let status = unsafe { sqlite3_step(stmt.0.as_ptr()) };
+                    let resp: Result<Either<u64, ()>, Error> = match status {
+                        SQLITE_ROW => Ok(Either::Right(())),
+                        SQLITE_DONE => Ok(Either::Left(stmt.changes())),
+                        _ => Err(stmt.last_error().into()),
+                    };
+                    let s = resp?;
+
+                    //println!("worker: {:.3?}", time_start.elapsed().unwrap());
 
                     match s {
                         Either::Left(changes) => {

It only cuts around ~10ms in total for my statement returning 131 rows, so it is probably not the major cause of latency. I still have time jumping between ~70ms and about ~130ms when I execute the same statement.

Anyway, maybe instead of serializing every step it makes sense to just start the worker and let it run sqlite3_step in the loop and write results in a channel as long as the channel is not full? If you absolutely want not to run sqlite3_step more times than needed (in case not all rows are consumed, as in the case of fetch_optional) this can be achieved by creating a channel with a capacity of 1.

mehcode · 2021-04-17T01:12:48Z

Yep. The channel approach is exactly how I'm thinking it should be.

abonander · 2021-11-23T00:31:15Z

Could you try your benchmark with the changes I have in #1551 to see if there's any improvement?

link2xt changed the title ~~Large SQLite step serialization overhead~~ SQLite step serialization overhead Apr 17, 2021

link2xt mentioned this issue Apr 17, 2021

performance degradation deltachat/deltachat-core-rust#2353

Closed

r10s mentioned this issue Apr 23, 2021

sqlx performance and stability issues deltachat/deltachat-core-rust#2378

Closed

abonander mentioned this issue Nov 13, 2021

Sqlite "Access violation" while simple upsert loop with transaction. #1467

Closed

abonander mentioned this issue Nov 23, 2021

refactor(sqlite): make background thread responsible for all FFI calls #1551

Merged

abonander closed this as completed in #1551 Dec 29, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SQLite step serialization overhead #1175

SQLite step serialization overhead #1175

link2xt commented Apr 16, 2021

mehcode commented Apr 16, 2021

link2xt commented Apr 17, 2021 •

edited

Loading

mehcode commented Apr 17, 2021

abonander commented Nov 23, 2021

SQLite step serialization overhead #1175

SQLite step serialization overhead #1175

Comments

link2xt commented Apr 16, 2021

mehcode commented Apr 16, 2021

link2xt commented Apr 17, 2021 • edited Loading

mehcode commented Apr 17, 2021

abonander commented Nov 23, 2021

link2xt commented Apr 17, 2021 •

edited

Loading