Maintain sorted operation indexes #438

Merged
adzialocha merged 21 commits into development from maintain-sorted-operation-index
Jul 7, 2023

Conversation

sandreae commented Jul 7, 2023

We want to persist and maintain sorted position indexes for all operations which have been materialized into a document. This index is the operation's position once topological sorting has occurred. It can be understood as the sorted_index of a particular operation, or as the document's "depth" at that point in its history.

This is done by introducing the following:

  • add a sorted_index column to the operations_v1 table
  • update all operation models and data types accordingly
  • introduce an update_operation_index method on the OperationStore
  • in a reduce task, update the sorted_index of all operations in a document
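
The steps listed above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code from this PR: `StoredOperation`, `MemoryStore`, `topological_order` and `reduce_sorted_indexes` are hypothetical names, the `i32` index type is an assumption, and the store holds the operations of a single document only. Only `sorted_index` and `update_operation_index` are names taken from the description.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified operation row (the real table has more columns).
#[derive(Debug, Clone, PartialEq)]
pub struct StoredOperation {
    pub id: String,
    /// Ids of the operations this one directly follows (empty for the first one).
    pub previous: Vec<String>,
    /// `None` until the document has been materialized.
    pub sorted_index: Option<i32>,
}

/// Hypothetical in-memory stand-in for the operation store of ONE document.
#[derive(Default)]
pub struct MemoryStore {
    operations: BTreeMap<String, StoredOperation>,
}

impl MemoryStore {
    /// Insert an operation; its sorted index is not known yet.
    pub fn insert(&mut self, id: &str, previous: &[&str]) {
        let operation = StoredOperation {
            id: id.to_string(),
            previous: previous.iter().map(|p| p.to_string()).collect(),
            sorted_index: None,
        };
        self.operations.insert(id.to_string(), operation);
    }

    /// Set the sorted index of one operation; returns `false` for an unknown id.
    pub fn update_operation_index(&mut self, id: &str, sorted_index: i32) -> bool {
        match self.operations.get_mut(id) {
            Some(operation) => {
                operation.sorted_index = Some(sorted_index);
                true
            }
            None => false,
        }
    }

    pub fn sorted_index(&self, id: &str) -> Option<i32> {
        self.operations.get(id).and_then(|operation| operation.sorted_index)
    }
}

/// Kahn's algorithm with ties broken by id, so the order is deterministic.
/// Returns `None` if a `previous` link is unknown or the graph has a cycle.
pub fn topological_order(ops: &BTreeMap<String, StoredOperation>) -> Option<Vec<String>> {
    let mut in_degree: BTreeMap<&str, usize> = BTreeMap::new();
    let mut children: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for op in ops.values() {
        in_degree.insert(op.id.as_str(), op.previous.len());
        for prev in &op.previous {
            if !ops.contains_key(prev) {
                return None;
            }
            children.entry(prev.as_str()).or_default().push(op.id.as_str());
        }
    }

    // Operations without predecessors are ready to be emitted.
    let mut ready: BTreeSet<&str> = in_degree
        .iter()
        .filter(|(_, degree)| **degree == 0)
        .map(|(id, _)| *id)
        .collect();

    let mut sorted = Vec::with_capacity(ops.len());
    while let Some(id) = ready.pop_first() {
        sorted.push(id.to_string());
        for child in children.get(id).into_iter().flatten() {
            let degree = in_degree.get_mut(child)?;
            *degree -= 1;
            if *degree == 0 {
                ready.insert(*child);
            }
        }
    }
    (sorted.len() == ops.len()).then_some(sorted)
}

/// Sketch of the reduce step: sort, then persist every position as `sorted_index`.
pub fn reduce_sorted_indexes(store: &mut MemoryStore) -> bool {
    let Some(sorted) = topological_order(&store.operations) else {
        return false;
    };
    for (index, id) in sorted.iter().enumerate() {
        store.update_operation_index(id, index as i32);
    }
    true
}
```

Re-running `reduce_sorted_indexes` after new operations arrive simply overwrites every index, which is one reason a sketch like this stays safe to retry. The tie-breaking rule used here (lowest id first) is an arbitrary choice for the example and says nothing about the ordering rule the project actually uses.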

Note, there is a period of time when an operation has been inserted into the store, but its sorted_index is None because materialization has not been performed yet. Because we already have a "retry" system in place to account for a node shutting down unexpectedly before tasks could be completed, we can be confident that all operations will eventually be processed and have their sorted_index set.
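
To illustrate the window described above, here is a small sketch of how calling code might treat rows whose index is still `None`. It is an assumption-laden example, not how the node's retry system works: `OperationRow`, `documents_pending_reduce` and `materialized_in_order` are hypothetical helpers, and the `i32` index type is assumed.

```rust
use std::collections::BTreeSet;

/// Hypothetical minimal row: just enough columns to show the `None` window.
#[derive(Debug, Clone, PartialEq)]
pub struct OperationRow {
    pub id: String,
    pub document_id: String,
    /// `None` while the operation is stored but not yet materialized.
    pub sorted_index: Option<i32>,
}

/// Documents that still contain at least one operation without a sorted index,
/// i.e. candidates for a (re-)run of the reduce step.
pub fn documents_pending_reduce(rows: &[OperationRow]) -> BTreeSet<String> {
    rows.iter()
        .filter(|row| row.sorted_index.is_none())
        .map(|row| row.document_id.clone())
        .collect()
}

/// Operations of one document that already have an index, in sorted order.
/// Rows without an index are skipped rather than guessed at.
pub fn materialized_in_order<'a>(
    rows: &'a [OperationRow],
    document_id: &str,
) -> Vec<&'a OperationRow> {
    let mut materialized: Vec<&OperationRow> = rows
        .iter()
        .filter(|row| row.document_id == document_id && row.sorted_index.is_some())
        .collect();
    materialized.sort_by_key(|row| row.sorted_index);
    materialized
}
```

The point of the sketch is only that readers of `sorted_index` have to handle the `None` case explicitly until the reduce step has run for the document in question.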

📋 Checklist

  • Add tests that cover your changes
  • Add this PR to the Unreleased section in CHANGELOG.md
  • Link this PR to any issues it closes
  • New files contain a SPDX license header

sandreae linked an issue Jul 7, 2023 that may be closed by this pull request

codecov bot commented Jul 7, 2023

Codecov Report

Patch coverage: 93.75% and project coverage change: +0.13 🎉

Comparison is base (ae27ff2) 90.05% compared to head (57813bd) 90.19%.

❗ Current head 57813bd differs from pull request most recent head 2d93eb5. Consider uploading reports for the commit 2d93eb5 to get more accurate results

Additional details and impacted files
@@               Coverage Diff               @@
##           development     #438      +/-   ##
===============================================
+ Coverage        90.05%   90.19%   +0.13%     
===============================================
  Files               87       87              
  Lines             8439     8514      +75     
===============================================
+ Hits              7600     7679      +79     
+ Misses             839      835       -4     
Impacted Files                                Coverage Δ
aquadoggo/src/db/types/operation.rs           86.66% <ø> (ø)
aquadoggo/src/db/stores/operation.rs          89.01% <90.76%> (+4.54%) ⬆️
aquadoggo/src/materializer/tasks/reduce.rs    90.07% <95.91%> (+1.28%) ⬆️
aquadoggo/src/db/models/utils.rs              99.46% <100.00%> (+0.01%) ⬆️


sandreae marked this pull request as ready for review July 7, 2023 07:46
sandreae requested a review from adzialocha July 7, 2023 07:46
sandreae changed the title from "Maintain sorted operation index" to "Maintain sorted operation indexes" Jul 7, 2023
adzialocha commented

Great!!

> Note, there is a period of time when an operation has been inserted into the store, but its sorted_index is None because materialization has not been performed yet. Because we already have a "retry" system in place to account for a node shutting down unexpectedly before tasks could be completed, we can be confident that all operations will eventually be processed and have their sorted_index set.

That's only half working, actually, but I think this PR puts everything in place to help us with it: #441

adzialocha merged commit 17a1096 into development Jul 7, 2023
adzialocha deleted the maintain-sorted-operation-index branch July 7, 2023 12:07
adzialocha added a commit that referenced this pull request Jul 14, 2023
* development: (23 commits)
  Implement `dialer` behaviour (#444)
  Sort expected results in strategy tests
  Update CHANGELOG
  Replicate operations in topo order (#442)
  Maintain sorted operation indexes (#438)
  Use fork of `asynchronous-codec`  (#440)
  Ingest check for duplicate entries (#439)
  Reverse lookup for pinned relations in dependency task (#434)
  Remove unnecessary exact version pinning in Cargo.toml
  Make `TaskInput` an enum and other minor clean ups in materialiser (#429)
  Use `libp2p` `v0.52.0` (#425)
  Fix race condition when check for existing view ids was too early (#420)
  Reduce logging verbosity
  CI: Temporary workaround for Rust compiler bug (#417)
  Fix early document view insertion (#413)
  Handle duplicate document view insertions (#410)
  Decouple p2panda's authentication data types from libp2p's (#408)
  Remove dead_code attribute in lib
  Integrate replication manager with networking stack (#387)
  Implement naive replication protocol (#380)
  ...
Development

Successfully merging this pull request may close these issues.

Maintain sorted document position index for all materialized operations
2 participants