Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: api to transform events to conclusion events #508

Merged
merged 8 commits into from
Sep 10, 2024

Conversation

samika98
Copy link
Contributor

@samika98 samika98 commented Aug 29, 2024

API that processes a series of raw events, transforming them into a structured vector of conclusion events. The transformation involves taking a high watermark as input, fetching events from the database based on this marker, and converting these events into the ConclusionEvent type.

Added transform_raw_events_to_conclusion_events function which converts events based on the provided high watermark
Modified event and payload structures
Updated CeramicEventService to include new transformation logic

TODO : Add time event creation logic for test

controller: event.get_controller().unwrap(),
dimensions: vec![], // You may need to populate this if needed
},
// TODO_Discuss: How can this ever have more than one value? prev return a single cid
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prev returns a single cid

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is expected, however in the future we may change the Ceramic event format to allow multiple previous values.

previous: vec![*time_event.prev()],
//TODO_Discuss: How to determine an ever incrementing index.
//Soln_proposed : We have to store this on disk and increment with every event seen ?
//Define purpose of this counter and how it can be used
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

** how it can be useful to the consumer of this api

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The index needs to come from the db query. I believe its the deliverable column on the data. We need to select that column and preserve it.

limit: i64,
include_data: ceramic_api::IncludeEventData,
) -> anyhow::Result<Vec<ConclusionEvent>> {
let raw_events = ceramic_api::EventStore::events_since_highwater_mark(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not want to rely on the API methods here. We should likely move the body of ceramic_api::EventStore::events_since_highwater_mark to a method on the struct directly. Both this method and the EventStore implementation can call it. This way we do not overload the ceramic_api types.

&self,
highwater_mark: i64,
limit: i64,
include_data: ceramic_api::IncludeEventData,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always want the data for the conclusion API so we should not let this be a parameter on the function.

// We map over this to get a vec of conclusion events
fn transform_raw_events_to_conclusion_events(
&self,
raw_event: ceramic_api::EventDataResult,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted, we should not use the ceramic_api types here but instead use the unvalidated::Event directly.

match event {
ceramic_event::unvalidated::Event::Time(time_event) => {
let mut current_event: Event<Ipld> = event;
let mut stream_id = current_event.id();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit this is stream_cid not the stream id

let mut stream_id = current_event.id();
let mut init_event: Option<Event<Ipld>> = None;
// Traverse the chain of events until we find the init event
while let Some(prev_cid) = current_event.prev() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to traverse the chain, the event.id() is the CID of the init event already. I imagine this can look something like:

let stream_cid = current_event.id();
let init_event = self.get_event_by_cid(stream_cid)?;
let init = ConclusionInit::try_from(init_event)?;// TODO implement this

Constructing a ConclusionInit from an init event means finding the init payload if it exists and gathering its metadata.

We should add an init_payload() method to unvalidated::Event that is similar to is_init method but that returns an Option<&init::Payload>. Then try_from becomes

fn try_from(value: unvalidated::Event) -> Result<ConclusionInit> {
    let init_payload = value.init_payload().ok_or_else(|| anyhow!("no init payload found")?;
    Ok(ConclusionInit{ ... })
}


Ok(ConclusionEvent::Time(ConclusionTime {
event_cid: raw_event.id,
init: ConclusionInit {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the above is complete then this just becomes init: init.

previous: vec![*time_event.prev()],
//TODO_Discuss: How to determine an ever incrementing index.
//Soln_proposed : We have to store this on disk and increment with every event seen ?
//Define purpose of this counter and how it can be used
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The index needs to come from the db query. I believe its the deliverable column on the data. We need to select that column and preserve it.

@samika98 samika98 force-pushed the feat/convert-raw-events-to-conclusion-type branch from 599eaab to c04e093 Compare September 3, 2024 19:33
@samika98 samika98 changed the title feat: add basic structure to lead discussion + add multiple todos for… feat: api to transform events to conclusion events Sep 3, 2024
@samika98 samika98 marked this pull request as ready for review September 3, 2024 20:25
@samika98 samika98 requested review from dav1do and a team as code owners September 3, 2024 20:25
@samika98 samika98 requested review from smrz2001 and nathanielc and removed request for a team September 3, 2024 20:25
e
))
})?,
index: delivered as u64, // TODO: Implement proper indexing
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit : remove todo

fn from(value: init::Payload<D>) -> Self {
Self::Unsigned(value)
// TODO : Delete this? then delete all
// Delete event with new
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nathanielc All cannot be delete since rest of the two impls are being used in migration code. Makes sense to keep all 3 then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes lets keep them

Copy link
Collaborator

@nathanielc nathanielc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, a few questions below. The refactor worked out well.

@@ -92,16 +107,17 @@ where
}
}

// TODO : Returns the cid no option, update the comment
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, this todo is complete

@@ -92,16 +107,17 @@ where
}
}

// TODO : Returns the cid no option, update the comment
/// Returns the 'id' field of the event, which is the Cid of the stream's init event.
/// If this event *is* the init event, then it doesn't know its own Cid and returns None.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update this comment to indicate we always return the cid because and init event does know its cid

fn from(value: init::Payload<D>) -> Self {
Self::Unsigned(value)
// TODO : Delete this? then delete all
// Delete event with new
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes lets keep them

@@ -22,11 +25,11 @@ impl ConclusionEvent {
}
}

//TODO : Add Stream Type here
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have an issue to track this, so we can remove this comment for now

let stream_cid = raw_event.id();
let init_event = self.get_event_by_cid(stream_cid).await?;
let init = ConclusionInit::try_from(init_event).unwrap();
let event_cid = *raw_event.id();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect, the event_cid is not the stream_cid (unless its an init event). Does the raw_event know its own cid? That is what we need to use. Maybe we need a new method on unvalidated::Event to get the cid of itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can also pass it as a parameter into this method

limit: i64,
) -> Result<(i64, Vec<(Cid, unvalidated::Event<Ipld>)>)> {
) -> Result<(i64, Vec<(Cid, unvalidated::Event<Ipld>, i64)>)> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this to use a struct instead of a tuple? Its confusing what the different i64s are for. I get it after reading the code, however a struct would make it more clear.

.bind(limit)
.fetch_all(pool.reader())
.await?;

// default to the passed in value if there are no new events to avoid the client going back to 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this comment removed? It still seems valid.

all_blocks.sort_by(|a, b| a.new_highwater_mark.cmp(&b.new_highwater_mark));
let blocks = all_blocks.into_iter().map(|b| b.block).collect();

let blocks = all_blocks.clone().into_iter().map(|b| b.block).collect();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to clone the blocks?

@nathanielc
Copy link
Collaborator

I'd like to see a test that shows we are correctly converting from events to conclusion events

Ok((max_highwater, values))
let result = values
.into_iter()
.zip(all_blocks.iter())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use them here as well, we add the delivered field to all the blocks here

.id()
.expect("prev must exist for non-init events");
let stream_cid = parsed_event.id();
let prev = parsed_event.id();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be prev().. it looks like we introduced a bug here that was released in v0.33 (made latest this week)... fortunately it's the startup task code and has probably never done anything. We don't have long streams currently so things shouldn't be getting picked up at startup, meaning we we're probably okay without doing any clean up, but I'm going to add a ticket to investigate how it got through/improve the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in #516

@samika98 samika98 force-pushed the feat/convert-raw-events-to-conclusion-type branch from efb7972 to 97dd107 Compare September 9, 2024 17:46
@samika98 samika98 added this pull request to the merge queue Sep 10, 2024
Merged via the queue into main with commit 0af708d Sep 10, 2024
6 checks passed
@samika98 samika98 deleted the feat/convert-raw-events-to-conclusion-type branch September 10, 2024 15:41
@smrz2001 smrz2001 mentioned this pull request Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants