
feat: optimize transaction validation db queries #6196

Merged

Conversation

hansieodendaal
Contributor

@hansieodendaal hansieodendaal commented Mar 8, 2024

Description

Optimized transaction validation SQLite queries to run in batch mode wherever possible, minimizing db operations.

Motivation and Context

See #6195

How Has This Been Tested?

  • Updated unit tests
  • Added a raw query regression test
  • System-level testing

What process can a PR reviewer use to test or verify this change?

Review code and tests

Breaking Changes

  • None
  • Requires data directory on base node to be deleted
  • Requires hard fork
  • Other - Please specify


github-actions bot commented Mar 8, 2024

Test Results (CI)

    3 files    120 suites   38m 37s ⏱️
1 269 tests 1 269 ✅ 0 💤 0 ❌
3 799 runs  3 799 ✅ 0 💤 0 ❌

Results for commit 4524995.

♻️ This comment has been updated with latest results.

@ghpbot-tari-project ghpbot-tari-project added P-acks_required Process - Requires more ACKs or utACKs P-reviews_required Process - Requires a review from a lead maintainer to be merged labels Mar 8, 2024

github-actions bot commented Mar 8, 2024

Test Results (Integration tests)

 2 files  11 suites   24m 32s ⏱️
32 tests 31 ✅ 0 💤 1 ❌
33 runs  32 ✅ 0 💤 1 ❌

For more details on these failures, see this check.

Results for commit 4524995.

♻️ This comment has been updated with latest results.

SWvheerden
SWvheerden previously approved these changes Mar 11, 2024
Collaborator

@SWvheerden SWvheerden left a comment


This looks good; I just don't like the new trait function names, for the reasons stated.

@ghpbot-tari-project ghpbot-tari-project removed the P-reviews_required Process - Requires a review from a lead maintainer to be merged label Mar 11, 2024
@hansieodendaal hansieodendaal marked this pull request as draft March 11, 2024 06:49
@hansieodendaal hansieodendaal force-pushed the ho_optimize_sqlite_queries branch from 13eda9c to 7290c3c Compare March 11, 2024 15:24
@ghpbot-tari-project ghpbot-tari-project added the P-reviews_required Process - Requires a review from a lead maintainer to be merged label Mar 11, 2024
@hansieodendaal hansieodendaal marked this pull request as ready for review March 12, 2024 07:18
SWvheerden
SWvheerden previously approved these changes Mar 12, 2024
@ghpbot-tari-project ghpbot-tari-project removed the P-reviews_required Process - Requires a review from a lead maintainer to be merged label Mar 12, 2024
Copy link
Collaborator

@stringhandler stringhandler left a comment


While this SQL doesn't include a SQL injection vulnerability, it is not using best practices for including parameters in SQL. This should be changed now rather than later.

);

for update in &updates {
query.push_str(&format!(
Collaborator


Use bound parameters instead
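A sketch of how the batched statement could be assembled with `?` placeholders instead of `format!`-ing values into the SQL text. The table and column names follow the snippets in this thread; the actual binding/execution layer (e.g. diesel's `sql_query(..).bind(..)`) is assumed and not shown here:

```rust
// Sketch: build one batched UPDATE whose values are bound, never interpolated.
// Each "?" is later bound to a real value by the query layer, so untrusted
// bytes never become part of the SQL string itself.
fn placeholders(n: usize) -> String {
    vec!["?"; n].join(", ")
}

fn build_mark_validated_sql(hash_count: usize) -> String {
    format!(
        "UPDATE outputs SET last_validation_timestamp = ? WHERE hash IN ({})",
        placeholders(hash_count)
    )
}

fn main() {
    // One round trip for three hashes, with four parameters to bind.
    let sql = build_mark_validated_sql(3);
    println!("{sql}");
}
```

Note that SQLite caps the number of bound parameters per statement (`SQLITE_MAX_VARIABLE_NUMBER`), so very large batches would still need to be chunked.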

.eq::<Option<NaiveDateTime>>(NaiveDateTime::from_timestamp_opt(Utc::now().timestamp(), 0)),))

diesel::update(outputs::table.filter(outputs::hash.eq_any(hashes.iter().map(|hash| hash.to_vec()))))
.set(outputs::last_validation_timestamp.eq(Some(Utc::now().naive_utc())))
Collaborator


Perhaps it's best not to update this if the last_validation_timestamp is newer than the current timestamp.

Contributor Author


I do not follow. How could last_validation_timestamp be newer than the current timestamp?

updates.len()
);

let mut query = String::from(
Collaborator


See above

.execute(&mut conn)
.num_rows_affected_or_not_found(1)?;

let mut query = String::from(
Collaborator


This logic seems different. Why would we insert empty data? Is there a process to correct these amounts?

Contributor Author

@hansieodendaal hansieodendaal Mar 13, 2024


The first part of the query, INSERT INTO outputs (..) VALUES (...) ON CONFLICT (commitment), only identifies the rows that would be affected and does not change anything.
The only values in that list that matter are the ones used in the second part of the query, so all other values can be NULL / 0.

The actual update then follows as

DO UPDATE SET mined_height = excluded.mined_height, mined_in_block = \
    excluded.mined_in_block, status = excluded.status, mined_timestamp = excluded.mined_timestamp, \
    marked_deleted_at_height = NULL, marked_deleted_in_block = NULL, last_validation_timestamp = NULL

That is equivalent to

            .set((
                outputs::mined_height.eq(mined_height as i64),
                outputs::mined_in_block.eq(mined_in_block),
                outputs::status.eq(status),
                outputs::mined_timestamp.eq(timestamp),
                outputs::marked_deleted_at_height.eq::<Option<i64>>(None),
                outputs::marked_deleted_in_block.eq::<Option<Vec<u8>>>(None),
                outputs::last_validation_timestamp.eq::<Option<NaiveDateTime>>(None),

Comment on lines +450 to +459
"INSERT INTO outputs ( commitment, mined_height, mined_in_block, status, mined_timestamp, spending_key, \
value, output_type, maturity, hash, script, input_data, script_private_key, sender_offset_public_key, \
metadata_signature_ephemeral_commitment, metadata_signature_ephemeral_pubkey, metadata_signature_u_a, \
metadata_signature_u_x, metadata_signature_u_y, spending_priority, covenant, encrypted_data, \
minimum_value_promise
Contributor


How often do these values change, and does the ordering here matter?

For maintenance over time, how easy would it be for a developer to come along, remove or add a field, and break the query?

Are these fields managed anywhere as a list in constants or fields of an existing structure we could reference instead of strings in the query?

Contributor Author


That list comprises the field names in diesel::table! { outputs (id) { ... } } that cannot be left empty when inserting a row, plus all fields that need to be updated in the second half of the query if not already included. The frequency of change would coincide with changes to that table.

For future maintenance, I have added the unit test pub async fn test_raw_custom_queries_regression(). It specifically tests the raw queries in set_received_outputs_mined_height_and_statuses and mark_outputs_as_spent.

As an example, executing the full query in DB Browser for SQLite for two rows:

INSERT INTO outputs 
(commitment, mined_height, mined_in_block, status, mined_timestamp, spending_key, value, output_type, maturity, hash, script, input_data, script_private_key, sender_offset_public_key,
metadata_signature_ephemeral_commitment, metadata_signature_ephemeral_pubkey, metadata_signature_u_a, metadata_signature_u_x, metadata_signature_u_y, spending_priority, covenant,
encrypted_data, minimum_value_promise) 
VALUES 
(x'be9a7ce5362db3882fdf138de9840a9329ae48e30c945d60a013758f76ce7239', 2, x'00', 6, '2024-03-01 11:52:54', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
(x'de0f779865d752563f609c07669afff260b946bb2c805b933c071c26e308c232', 2, x'00', 6, '2024-03-01 11:59:04', 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) 
ON CONFLICT (commitment) DO UPDATE SET 
mined_height = excluded.mined_height, mined_in_block = excluded.mined_in_block, status = excluded.status, mined_timestamp = excluded.mined_timestamp, marked_deleted_at_height = NULL, marked_deleted_in_block = NULL, last_validation_timestamp = NULL

results in

Execution finished without errors.
Result: query executed successfully. Took 0ms, 2 rows affected

As a test, let's leave out input_data from the list of columns, as well as the corresponding 0 in both value rows; then we get

Execution finished with errors.
Result: NOT NULL constraint failed: outputs.input_data

@hansieodendaal hansieodendaal force-pushed the ho_optimize_sqlite_queries branch from 7290c3c to 2d6671a Compare March 13, 2024 16:11
@ghpbot-tari-project ghpbot-tari-project added the P-reviews_required Process - Requires a review from a lead maintainer to be merged label Mar 13, 2024
@hansieodendaal hansieodendaal force-pushed the ho_optimize_sqlite_queries branch from 41c7fb5 to 2bf88d6 Compare March 14, 2024 14:01
stringhandler
stringhandler previously approved these changes Mar 18, 2024
SWvheerden
SWvheerden previously approved these changes Mar 18, 2024
Optimized transaction validation SQLite queries to run in batch mode wherever
possible to minimise db operations.
@hansieodendaal hansieodendaal force-pushed the ho_optimize_sqlite_queries branch from 2b4f071 to 4524995 Compare March 19, 2024 16:05
@SWvheerden SWvheerden merged commit 213a885 into tari-project:development Mar 20, 2024
14 of 16 checks passed
@hansieodendaal hansieodendaal deleted the ho_optimize_sqlite_queries branch March 20, 2024 06:31
Labels
P-acks_required Process - Requires more ACKs or utACKs P-reviews_required Process - Requires a review from a lead maintainer to be merged