Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Remove Clone from EventInsertable #481

Merged
merged 1 commit into from
Aug 16, 2024
Merged

chore: Remove Clone from EventInsertable #481

merged 1 commit into from
Aug 16, 2024

Conversation

stbrody
Copy link
Collaborator

@stbrody stbrody commented Aug 15, 2024

Builds on #480

@stbrody stbrody self-assigned this Aug 15, 2024
if candidate_events.is_empty() {
}
// Need to reverse the vectors because we popped events off of candidate_events from back to front
deliverable.reverse();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels a bit hacky and probably not the most efficient. The other idea I had was to replace the pop().unwrap() with remove(0), but I was worried that removing from the front of the array repeatedly would be even less efficient than reversing at the end.

It really feels like there should be a better way to move elements out of one vector into two others in an efficient manor...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding @nathanielc to see if you have any ideas for how to do this more efficiently

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VecDeque and push front.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? I have a vector, I want to move the elements from that vector out into two other vectors, deciding which vector to move into based on a property of the values in the input vector. How can a VecDeque help with that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a vector to store events, use a VecDeque. You push front instead of back. You then return an object with a VecDeque.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure I understand. We have three vectors:

  • candidate_events
  • deliverable
  • remaining_candidates

We want to move the values from candidate_events into one of the other two vectors and we want to do it in order? When we are done candidate_events is empty?

In this case a simple for loop should suffice.

for event in candidate_events {
   if ... {deliverable.push(event)} else {remaining_candidates.push(event) }
}

Note since we are not using a reference to candidate_events in the for loop the values are moved out of the vector into the loop.

Aside there is a one liner for this in nightly rust https://doc.rust-lang.org/std/vec/struct.Vec.html#method.extract_if

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! For some reason I thought we couldn't just move objects out of a vector like that without some sort of explicit "removal" operation.

I was worried about what kind of state this would leave the Vec in, but I guess because the for loop didn't take a reference but took the Vec by value it took ownership of it and now it's illegal to make any further references to candidate_events. Very cool!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the order isn't important here until we start appending non-init events to the deliverable list (since they need to be delivered after). I just used a stack/pop because it's fastest on a Vec. We could pass in the two vectors (deliverable, unknown) if we prefer but I originally just kept knowing that an init is deliverable in here. The remaining candidates don't need to be in any particular order as they will be sorted by prev chain separately so no need to reverse them.

Two things I noticed that were of my making.. I allocate an unnecessary vec when we could just have:

if remaining_candidates.is_empty() {
            return Ok(OrderEvents {
                deliverable,
                missing_history: remaining_candidates, // Vec::new(),
            });
        }

And we could do this in one pass if we make deliverable a public field.. it gets to Danny's point about style, originally I wanted to keep that an internal detail but it really doesn't make a lot of sense to use the setter.. so with that change we could have:

    pub async fn try_new(
        pool: &SqlitePool,
        candidate_events: Vec<(EventInsertable, EventMetadata)>,
    ) -> Result<Self> {
        let mut deliverable = Vec::with_capacity(candidate_events.len());
        let mut remaining_candidates = Vec::with_capacity(candidate_events.len());

        let mut new_cids: HashMap<Cid, bool> =
            HashMap::from_iter(candidate_events.into_iter().map(|(mut e, meta)| {
                // all init events are deliverable so we mark them as such before we do anything else
                let cid = e.cid();
                let deliverable = if matches!(meta, EventMetadata::Init { .. }) {
                    e.deliverable = true;
                    deliverable.push((e, meta));
                    true
                } else {
                    remaining_candidates.push((e, meta));
                    false
                };
                (cid, deliverable)
            }));

        if remaining_candidates.is_empty() {
            return Ok(OrderEvents {
                deliverable,
                missing_history: Vec::new(),
            });
        }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a nice cleanup, but I'd like to do that in a separate PR to keep this one more focused.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

service/src/event/order_events.rs Outdated Show resolved Hide resolved
store/src/sql/entities/event.rs Show resolved Hide resolved
@stbrody stbrody marked this pull request as ready for review August 15, 2024 17:17
@stbrody stbrody requested review from dav1do and a team as code owners August 15, 2024 17:17
@stbrody stbrody requested review from smrz2001 and nathanielc and removed request for a team August 15, 2024 17:17
let insertable = get_n_insertable_events(10).await;
let mut init = insertable.first().unwrap().to_owned();
let mut insertable = get_n_insertable_events(10).await;
let mut init = insertable.remove(0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is O(n) when previously it was constant time. Not sure it matters but thought I would point it out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, but this is test code, and we know the vector is only 10 long, so I'm not too worried about it

if candidate_events.is_empty() {
}
// Need to reverse the vectors because we popped events off of candidate_events from back to front
deliverable.reverse();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure I understand. We have three vectors:

  • candidate_events
  • deliverable
  • remaining_candidates

We want to move the values from candidate_events into one of the other two vectors and we want to do it in order? When we are done candidate_events is empty?

In this case a simple for loop should suffice.

for event in candidate_events {
   if ... {deliverable.push(event)} else {remaining_candidates.push(event) }
}

Note since we are not using a reference to candidate_events in the for loop the values are moved out of the vector into the loop.

Aside there is a one liner for this in nightly rust https://doc.rust-lang.org/std/vec/struct.Vec.html#method.extract_if

@@ -181,18 +187,30 @@ mod test {
(after_1, after_2)
}

/// Takes the given events from Recon and turns them into two vectors of insertable events.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are the two vectors different? Should we make a small struct to give them names?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's context at the call site, but as far as this function is concerned it doesn't care, it's just making two vectors

Copy link
Contributor

@dav1do dav1do left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good with this as is, I noticed two things that could be nicer (both originally my mistakes) so you can take them or leave them.

store/src/sql/entities/event.rs Show resolved Hide resolved
if candidate_events.is_empty() {
}
// Need to reverse the vectors because we popped events off of candidate_events from back to front
deliverable.reverse();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the order isn't important here until we start appending non-init events to the deliverable list (since they need to be delivered after). I just used a stack/pop because it's fastest on a Vec. We could pass in the two vectors (deliverable, unknown) if we prefer but I originally just kept knowing that an init is deliverable in here. The remaining candidates don't need to be in any particular order as they will be sorted by prev chain separately so no need to reverse them.

Two things I noticed that were of my making.. I allocate an unnecessary vec when we could just have:

if remaining_candidates.is_empty() {
            return Ok(OrderEvents {
                deliverable,
                missing_history: remaining_candidates, // Vec::new(),
            });
        }

And we could do this in one pass if we make deliverable a public field.. it gets to Danny's point about style, originally I wanted to keep that an internal detail but it really doesn't make a lot of sense to use the setter.. so with that change we could have:

    pub async fn try_new(
        pool: &SqlitePool,
        candidate_events: Vec<(EventInsertable, EventMetadata)>,
    ) -> Result<Self> {
        let mut deliverable = Vec::with_capacity(candidate_events.len());
        let mut remaining_candidates = Vec::with_capacity(candidate_events.len());

        let mut new_cids: HashMap<Cid, bool> =
            HashMap::from_iter(candidate_events.into_iter().map(|(mut e, meta)| {
                // all init events are deliverable so we mark them as such before we do anything else
                let cid = e.cid();
                let deliverable = if matches!(meta, EventMetadata::Init { .. }) {
                    e.deliverable = true;
                    deliverable.push((e, meta));
                    true
                } else {
                    remaining_candidates.push((e, meta));
                    false
                };
                (cid, deliverable)
            }));

        if remaining_candidates.is_empty() {
            return Ok(OrderEvents {
                deliverable,
                missing_history: Vec::new(),
            });
        }

Base automatically changed from flatten_event_insertable to main August 16, 2024 17:10
@stbrody stbrody enabled auto-merge August 16, 2024 17:40
@stbrody stbrody added this pull request to the merge queue Aug 16, 2024
Merged via the queue into main with commit 4a81101 Aug 16, 2024
5 checks passed
@stbrody stbrody deleted the rm_clone branch August 16, 2024 18:19
@smrz2001 smrz2001 mentioned this pull request Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants