Define a programming model/API for Event Sourcing #343
Comments
My current understanding is that Event Sourcing can be easily supported on top of the DataGrain API we have been discussing, which can in turn be done on top of various backplane services. Do you have links to particular samples that use the event sourcing pattern? |
As a (simpleton) developer I'd like to implement an event sourced grain like this: [EventStoreProvider("TableStorage")]
public class MyEventGrain : EventGrain, IMyGrain
{
int balance;
public async Task Add(int value)
{
this.balance += value;
await this.Commit();
}
} When When the grain is activated, the events recorded for that grain are then replayed by calling the methods again in sequence (although the Note that the grain interface is not concerned with how this grain is implemented (whether it uses event sourcing or not). I'm not sure what to do if the grain makes calls to other grains (this sounds like a bad idea). I'm not sure if you would want to support a 'roll up' or 'snapshot' of the state, to cater for grains with a large number of events. Another consideration is whether you would want to allow a grain (or a system) to restore to a previous point in history (i.e. play back all the events up to a certain point). |
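A minimal sketch of the replay-on-activation idea from the comment above, for readers who want something concrete. It is not an Orleans API: `InMemoryJournal` and `BalanceGrainSketch` are hypothetical names, the journal lives in memory instead of a storage provider, and "activation" is modelled by the constructor. The assumption is that `Commit` records the call unless the grain is currently replaying.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory journal; a real provider would write to durable storage.
public sealed class InMemoryJournal
{
    private readonly List<(string Method, int Argument)> entries =
        new List<(string Method, int Argument)>();

    public IReadOnlyList<(string Method, int Argument)> Entries => entries;

    public void Append(string method, int argument) => entries.Add((method, argument));
}

// Hypothetical stand-in for the grain in the comment above.
public sealed class BalanceGrainSketch
{
    private readonly InMemoryJournal journal;
    private bool replaying;

    public int Balance { get; private set; }

    // "Activation": replay every recorded call without journaling it again.
    public BalanceGrainSketch(InMemoryJournal journal)
    {
        this.journal = journal ?? throw new ArgumentNullException(nameof(journal));
        replaying = true;
        foreach (var entry in journal.Entries)
        {
            if (entry.Method == nameof(Add))
            {
                Add(entry.Argument);
            }
        }
        replaying = false;
    }

    public void Add(int value)
    {
        Balance += value;
        Commit(nameof(Add), value);
    }

    // Records the call, except while replaying.
    private void Commit(string method, int argument)
    {
        if (!replaying)
        {
            journal.Append(method, argument);
        }
    }
}
```

Constructing a second `BalanceGrainSketch` over the same journal rebuilds the balance from the recorded calls. Because this replays calls and not their outcomes, any side effect inside `Add` would run again on replay, which is the concern raised later in the thread.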
@sebastianburckhardt We already have at least four different attempts to bring ES to Orleans: by @jkonecki, @yevhen, @ReubenBond and you. The purpose of this issue is to try to come up with a programming model/API that would work for everyone while allowing for pluggable implementations.
@richorama I am offended you said calling other grains is a bad idea :-) This is where it's not clear to me if we should
or
It is also very possible that I'm too confused by all these similar but different stories. |
I think we can agree that at some point, the runtime has to be able to serialize/deserialize The question is if this should be automatic (done by compiler or at runtime using reflection), or under explicit programmer control. Today, the persistence mechanism is not automatic. The programmer explicitly provides a serializable class that defines the grain state. There are advantages and disadvantages to that. If we are doing the same thing to serialize updates, i.e. go with explicit programmer control, the user would write serializable classes to define all the update operations. I show an example below. There is a bit more redundancy in this code than I like, unfortunately. The cool part is that since updates are now first-class, we can support very useful and powerful APIs involving updates and update sequences (e.g. we can query which updates are currently still pending and not yet confirmed, we can get an IEnumerable of all updates, and we can subscribe to a stream of updates). [Serializable]
class MyGrainState
{
public int Balance {get; set;}
}
[Serializable]
class MyAddOperation : IUpdates<MyGrainState> {
public int Amount { get; set; }
public void Update(MyGrainState state) {
state.Balance += Amount;
}
}
class MyGrain : DataGrain<MyGrainState>
{
public async Task Add(int value)
{
await QueueUpdate(new MyAddOperation() { Amount = value });
}
} |
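The following sketch illustrates what "updates are first-class" could look like, under stated assumptions. `IUpdates<TState>`, `MyGrainState` and `MyAddOperation` are taken from the comment above; `UpdateQueueSketch`, `PendingUpdates`, `AllUpdates` and `ConfirmPending` are hypothetical names invented here, and confirmation is a plain method call standing in for a storage round trip.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IUpdates<TState>
{
    void Update(TState state);
}

public sealed class MyGrainState
{
    public int Balance { get; set; }
}

public sealed class MyAddOperation : IUpdates<MyGrainState>
{
    public int Amount { get; set; }

    public void Update(MyGrainState state) => state.Balance += Amount;
}

// Hypothetical queue: updates are objects, so they can be listed and inspected.
public sealed class UpdateQueueSketch<TState> where TState : new()
{
    private readonly List<IUpdates<TState>> confirmed = new List<IUpdates<TState>>();
    private readonly List<IUpdates<TState>> pending = new List<IUpdates<TState>>();

    public TState ConfirmedState { get; } = new TState();

    // Updates that were queued but not yet confirmed.
    public IReadOnlyList<IUpdates<TState>> PendingUpdates => pending;

    // Every update seen so far, confirmed ones first.
    public IEnumerable<IUpdates<TState>> AllUpdates => confirmed.Concat(pending);

    public void QueueUpdate(IUpdates<TState> update)
    {
        pending.Add(update ?? throw new ArgumentNullException(nameof(update)));
    }

    // Stands in for the storage acknowledgement: apply and confirm in order.
    public void ConfirmPending()
    {
        foreach (var update in pending)
        {
            update.Update(ConfirmedState);
            confirmed.Add(update);
        }
        pending.Clear();
    }
}
```

The point of the sketch is only that a serializable update object gives the runtime something to enumerate, replay and stream; how a real provider confirms updates is not specified in this thread.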
Some people were confused by my Event Sourcing implementation. I had a chat with @gregoryyoung, which hopefully cleared up most of that confusion and also helped to refine the implementation in my mind. Framework support for Event Sourcing is a great idea! I am very confident that we can come up with an API which is sound, easy to use, and (hopefully) makes it obvious when it's being abused. Here is one idea: we have command methods which do not directly emit events, event methods which do directly emit events, and query methods which are read-only. Command methods take their arguments, perform required computation on them, collect whatever ambient information is required, and then call Event methods. Command methods are not replayable. public async Task PlaceOrder(Order order, User user)
{
// Gets the taxation rate at the time the order is placed.
var tax = await taxActor.GetApplicableTax(order, user.Address);
// The method awaits above, so it is declared async and awaits the event method too.
await this.OrderPlaced(order, user, tax);
} Event methods perform validation of their arguments, emit the event, and then apply the event. public Task AddedUser(User user)
{
return this.Commit(
validate: () =>
{
if (user == null)
{
throw new ArgumentNullException("user");
}
},
apply: () => this.users.Add(user.Id));
} Of course, if no validation is required, that method becomes very terse: public Task AddedUser(User user)
{
return this.Commit(() => this.users.Add(user.Id));
} Query methods are exactly as they are today. Here is a Query method: public Task<int> GetUserCount()
{
return Task.FromResult(this.users.Count());
} Alright, so those are the kinds of methods which we can have, but what if you need to emit multiple events atomically? @gregoryyoung gives an example of a stock trading platform where:
Events 2 & 3 should be committed atomically. I propose we do this within the scope of a using (var tx = this.CreateEventTransaction())
{
await this.TradeOccurred(buyer, seller, symbol, volume, price);
await this.PlaceOrder(buyer, symbol, remainingVolume, price);
await tx.Commit();
} This is surprisingly easy to implement: each call to Let me know what you think. We will discuss versioning of event handlers as well as event method arguments. Oh, also, we can consider creating Roslyn Code Analyzers to guide users onto the right track (e.g., event methods should have exactly one statement: Commands and events are identical if the command performs no computation and requires no validation (I think this is what made @yevhen decide I had implemented command sourcing). Message sourcing can be implemented also: you Very excited! |
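A minimal sketch of the validate / emit / apply order described for event methods above. This is an assumption about how `Commit` might behave, not the implementation discussed in the thread: `JournaledUsersSketch` is a hypothetical class, the journal is an in-memory list of strings, and the calls are synchronous to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event-method host: Commit validates, journals, then applies.
public sealed class JournaledUsersSketch
{
    private readonly List<string> journal = new List<string>();
    private readonly HashSet<int> users = new HashSet<int>();

    public IReadOnlyList<string> Journal => journal;

    public int UserCount => users.Count;

    // Event method: a single Commit call, as suggested for the analyzer rule.
    public void AddedUser(int userId)
    {
        Commit(
            eventName: $"{nameof(AddedUser)}({userId})",
            validate: () =>
            {
                if (userId <= 0)
                {
                    throw new ArgumentOutOfRangeException(nameof(userId));
                }
            },
            apply: () => users.Add(userId));
    }

    private void Commit(string eventName, Action validate, Action apply)
    {
        validate();             // a rejected event is never recorded
        journal.Add(eventName); // emit the event to the journal
        apply();                // only then mutate in-memory state
    }
}
```

Running validation before the journal write means an invalid event leaves both the journal and the state untouched, which is what makes replay of the journal safe.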
I think I like it a lot. Did not understand the connection between views/projections and Orleans' streaming, but that is orthogonal. I am also super interested in: "consider creating Roslyn Code Analyzers to guide users onto the right track (eg, event methods should have exactly one statement: this.Commit(...) ). " |
Your understanding is correct. I'm trying to keep from discussing everything at once, but basically the event journal could be subscribed to. Subscribers would implement the same interface as the grain (or at least the journalled part of it). That way, the subscriber can handle each of the events in a natural, .NET-native manner. Does that make sense? Does it sound like a good approach? Regarding Roslyn Code Analyzers, I'm referring to this: https://channel9.msdn.com/Events/dotnetConf/2015/NET-Compiler-Platform-Roslyn-Analyzers-and-the-Rise-of-Code-Aware-Libraries Edit: more detail on analyzers: https://msdn.microsoft.com/en-us/magazine/dn879356.aspx |
Read the views thread on akka.persistence. |
@gregoryyoung do you mean this thread? https://groups.google.com/forum/#!searchin/akka-dev/views/akka-dev/dhEQWEeqY40/AOW1PKgFRHAJ |
Got it about looking at the journal as a stream of events. Nice! Thanks @gregoryyoung, will do. That: http://doc.akka.io/docs/akka/snapshot/scala/persistence.html? |
If that's the thread, thank you - we could integrate the notion of stream hooks into our model. We would provide a handler which is executed on each journalled message. That gives users a chance to emit the event to a stream of their choosing - that allows for heterogeneous partitioning, rather than forcing the partition to be per-actor. So I could fork the stream for all user actors, and have security-related events emitted to a special audit log. Or if we are message sourcing, we could copy all queries to a stream for analytics. |
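The hook idea in the comment above could be sketched roughly as follows. All names here (`HookedJournalSketch`, `SecurityEvent`, `AuditRouting`) are hypothetical, and a plain list stands in for whatever stream the user would actually publish to.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event type that should also reach an audit log.
public sealed class SecurityEvent
{
    public SecurityEvent(string description)
    {
        Description = description;
    }

    public string Description { get; }
}

// Hypothetical journal that invokes a user-supplied handler for every entry.
public sealed class HookedJournalSketch
{
    private readonly List<object> entries = new List<object>();
    private readonly Action<object> onJournaled;

    public HookedJournalSketch(Action<object> onJournaled)
    {
        this.onJournaled = onJournaled ?? throw new ArgumentNullException(nameof(onJournaled));
    }

    public int Count => entries.Count;

    public void Append(object journaledEvent)
    {
        entries.Add(journaledEvent);
        // The hook decides which stream, if any, receives the event.
        onJournaled(journaledEvent);
    }
}

public static class AuditRouting
{
    // Builds a hook that forks only security events into the given audit log.
    public static Action<object> ForkSecurityEventsTo(List<SecurityEvent> auditLog)
    {
        return journaledEvent =>
        {
            if (journaledEvent is SecurityEvent security)
            {
                auditLog.Add(security);
            }
        };
    }
}
```

Since the handler chooses the destination, partitioning is up to the user and is not forced to be one stream per actor; users who pass a no-op handler pay almost nothing.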
Akka Persistence on the Query Side |
Alright, got it: https://groups.google.com/forum/#!topic/akka-user/MNDc9cVG1To I'm wary of making users pay for something they might not be using, and generalized query support can be very expensive. A very simple hooking point solves the problem efficiently, but we should carefully consider how we support querying and stream discovery. I think the question of query/discovery is broader than just ES, though: people often ask how to search for grains by some property. |
Very happy to see a topic of Event Sourcing
|
No, it doesn't. That will lead to code bloat in cases where subscribers are only interested in a selection of events, not the whole set. Also, your idea doesn't mesh well with the newly introduced Streams API, which is based on message-passing. Basically, with your idea, we're back to the Observer API. |
I would like to start by stating that I would prefer to implement event sourcing rather than command sourcing, as described by @richorama. Command sourcing can be tricky when it comes to side-effects, like communication with external services, which shouldn't happen during replay. Also, command sourcing fails to capture whether the command succeeded or not, which means it's difficult to subscribe to the stream externally. Event sourcing is defined many times by @gregoryyoung as derivation of the current state from the stream of past events. As I've mentioned during the first meetup, the separation of the grain state and grain itself is very handy here. In my opinion it is the state that is event sourced, not the grain. I see the grain more as a repository of logic. I fully agree with @richorama that the usage of the event-sourced state should be as similar as possible to the current way a grain interacts with its state. Here is @richorama's example rewritten the way I see it: [EventStoreProvider("TableStorage")]
public class MyEventGrain : EventGrain, IMyGrain
{
public async Task Add(int value)
{
// instead of this.State.Balance += value;
this.State.RaiseEvent(new BalanceIncreased(value));
// no change to current api
await this.State.WriteStateAsync();
}
} I believe that the events should be explicitly defined and not auto-generated from the grain's method arguments. Events are the fundamental part of domain modelling and may contain more information than is passed to the method that raises them. For example, a ConfirmationEmailSent event may contain a unique email identifier returned from the EmailSending service invoked from the grain method and not passed as an argument to the method. One benefit of event sourcing is the ability to rehydrate the aggregate as of a certain version, for example for debugging purposes. In order to achieve this in Orleans we would need a way to obtain a reference to the grain by passing a certain version. Right now I don't want to go into the details of whether the version is a number, string or etag. I'm just pointing out that there should be a way to obtain multiple instances of the same grain with different states. I'm guessing this will probably result in the grain reference being extended in order to differentiate between multiple instances of the same grain. Another feature which I believe is required is the ability to publish raised events. A plugin architecture here would be perfect, with providers for EventHub, Table Queues and ServiceBus being the most obvious. |
+1 to @jkonecki I'm not going to support the idea of having implicit generic events auto-generated from method signatures. It's too magical, non-conformist and non-idiomatic. It creates completely unnecessary (accidental) complexity. Other ways exist to deal with the verbosity of declaring events, such as using a custom DSL or simply using the DSL provided by a serialization library (.proto, .bond). |
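To make "rehydrate the aggregate as of a certain version" concrete, here is a small sketch. It assumes the version is simply the count of applied events, which the comment above deliberately leaves open; `BalanceState` and `Rehydrate` are hypothetical names, and `BalanceIncreased` is modelled after the event used in the example above.

```csharp
using System;
using System.Collections.Generic;

// Explicitly defined event, as argued for in the comment above.
public sealed class BalanceIncreased
{
    public BalanceIncreased(int amount)
    {
        Amount = amount;
    }

    public int Amount { get; }
}

// Hypothetical event-sourced state; Version counts applied events (an assumption).
public sealed class BalanceState
{
    public int Balance { get; private set; }

    public int Version { get; private set; }

    public void Apply(BalanceIncreased e)
    {
        Balance += e.Amount;
        Version++;
    }

    // Replays the stream, optionally stopping once the requested version is reached.
    public static BalanceState Rehydrate(IEnumerable<BalanceIncreased> events, int? upToVersion = null)
    {
        if (events == null)
        {
            throw new ArgumentNullException(nameof(events));
        }

        var state = new BalanceState();
        foreach (var e in events)
        {
            if (upToVersion.HasValue && state.Version >= upToVersion.Value)
            {
                break;
            }
            state.Apply(e);
        }
        return state;
    }
}
```

Two calls to `Rehydrate` with different `upToVersion` values yield two independent state objects for the same stream, which is the "multiple instances of the same grain with different states" requirement in miniature.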
I think for that you don't need to go through the framework (runtime). Just create an instance of your (presumably) POCO aggregate-actor and replay its stream on it. Use-case that you have presented should not require runtime. |
Absolutely opposite. I view event sourcing as persistence mechanism. So I'm more on side of @jkonecki and rest of 100500 folks from ES community :) |
See example here https://github.com/yevhen/Orleankka/blob/master/Source/Example.EventSourcing/Domain.cs. I don't record (log) calls to the grain, I record state changes (transitions) as events, which do capture a business intent of a change. See https://github.com/yevhen/Orleankka/blob/master/Source/Example.EventSourcing/Domain.cs#L46 as an example of something very casual - capturing more data than just an input. |
Guys, from most of the comments here, I can deduce that many of you do not understand what Event Sourcing actually is. I see no point discussing it any further until we're on the same page. I don't like playing Chinese whispers. Please, take your time and watch at least the first quarter of https://www.youtube.com/watch?v=whCk1Q87_ZI then I believe we can have a more productive discussion. |
@yevhen I get the feeling you haven't read the thread or the JabbR conversation. You told Greg Young to come and tell me that this is a bad idea and that this is not event sourcing and it's wrong. He and I had a chat, he indicated that yes, this library supports event sourcing. We can also do command and message sourcing. We are not blindly recording calls to grains. We validate the arguments, we have the option to collect environmental information at the time of the event, and to perform computations on whatever data before persisting and applying the event. That is all clearly demonstrated above. This proposal is much more in-tune with the rest of Orleans, in my opinion. Your system is easily implemented using this proposal. You would have a single event handler which takes arguments deriving from some kind of base type with an It's much easier to throw away statically checked types than to keep them. Once you throw them away, they're gone. I've watched Greg's videos, but thanks for the link. This one is good, too. |
I'm uncomfortable using event sourcing terms and I'm somewhat lost with it, but I'll note that, for what I know, this looks like either a form of CEP or a way to persist events (in a general fashion, i.e. without business specific structure?). Taking a cue from the linked Google thread and a question by Prakhyat Mallikarjun, whatever we do, I'd like to have an easy route to business specific queries and storage. I think I see the value as a pattern on how to deal with events as such, especially when CEP-like functionality isn't needed. The problems I'm usually dealing with in the streaming domain are CEP-like queries, how to make them fault-tolerant, how to quickly get aggregates and initialize state, and this when there might be multiple streams to merge. The rest, like sequence identifiers, have always been business specific. For instance, sensor readings are timestamped or have a sequence ID defined by the source. Then there is another timestamp applied when the event is received on the server, and at this point their relative order, from a business perspective, doesn't matter as only timestamps are looked at (and perhaps corrected for known anomalies). Considering events flowing from a metering field, they come through multiple socket ports through multiple servers (applying a consistent sequence number doesn't look sensible), get routed to the same event stream and persisted. Ack for receiving the event is given when it's on a durable store (just after receiving it or when persisted to a db). What I'd like to do, many times, is to calculate aggregates like sums or cumulative sums on the fly and persist these along the data storage and use these to initialize the streams quickly upon recovering from interruptions. There might be some heavier analytics and/or data enrichment done at the data storage, which would be nice to join back to the event stream. 
What I do currently for my little demo for myself is that upon ingesting data I immediately save it on the edge and then pass along for further domain-specific processing and then save that data to domain specific store and then put to transport streams for further consumption. If further processing crashes before some other save point, I'll replay the data either from the initial save location or the domain specific location. This is a kind of a stream, source of which can be seen in either of the two save locations. I can imagine this getting difficult with more downstream consumers and aggregation logic. For instance, if further downstream aggregates are being calculated, I might like to save it to the domain specific store along with the events that defines the aggregate. For a more "application logic", there's a good example on what I've thought about at work for some years. Getting #er blog, functional event-sourcing – compose. In any event, how I see what I'd like to have. Carry on. :) <edit: I might add that currently I save all data explicitly without using the Grain state mechanism. |
" Perhaps metering devices are not the only use case. I have found many threads just like this one. People tend to think of one |
We want an API which supports {Message,Command,Event} Sourcing. We can call it our Journaling API. Here are a few desirable properties of the API:
Supporting statically typed events is easiest if an implementation of the event sourced interface is passed to the subscribe method. That works cleanly, and you can argue for it, but @yevhen rightly points out that many of the interface methods would often go unimplemented. One option to remedy this is to let consumers decompose their interface so that they only need to implement a subset of it - allowing the processing of events from different actor types. Alternatively, they can filter the raw event stream. Alternatively we can see what possibilities Roslyn gives us for checking type assertions at development & compile time. @yevhen's approach more cleanly supports projections, but is verbose in the normal case, where users are logging & applying events. So it's a trade-off. This is basically the object / lexical closure equivalency debate. Orleans as a framework clearly leans towards objects, whereas Akka leans towards closures. Both options have merits, both have downsides, but let's do our best to be consistent. |
It being we are discussing method = message = schema, it's true they are all Why not just make this a choice? The same underlying "framework" code would On a side note on "making types being a hassle" as a form of defining |
Typical Joe doesn't build distributed systems which use event sourcing. He builds CRUD apps using his favorite RDB/ORM combo. As a (simpleton) developer I'd like to first gain an understanding of what event sourcing is, whether I really need it, what are the trade-offs and issues associated with it, before I even try to convince my manager. As a (simpleton) developer, I first need to understand what distributed systems are, whether I need to build one and take on all the risks associated with it. That means, I first need to understand CAP, storage options and associated trade-offs, read some or all papers listed here and learn about distributed scalable architectures alternative to my simpleton RDB\ORM combo, such as REST, EDA, CQRS, ETC. As a (simpleton) developer, before diving into a brave new world of distributed systems programming, it definitely won't hurt, if I first learn crucial design & modeling principles and patterns, like DDD, NoSQL, EIP, ETC. By doing that, I will avoid constantly begging authors of super-powerful distributed actor frameworks to support cross-actor (cross-partition) transactions, because I won't be modeling my actors after each row in my RDB, since at that moment I'll already have a firm understanding of what Aggregate is. As a (simpleton) developer, by doing my homework and not being stupid, I'll get a chance to avoid shooting myself in the foot, by thinking that all of my (scalability, performance, etc) problems could be somehow auto-magically solved by technology/framework/unicorn alone, without requiring me to make any investment in learning about the hard STUFF at all. As a (simpleton) developer, at that moment, without any doubt, I'll be understanding that:
Dear Joe (typical senior assistant of junior .NET developer), |
@sebastianburckhardt I haven't looked at your repo yet so cannot comment on the QueuedGrain API. I would like to try to implement event sourcing in such a way that there is no need to derive from a JournaledGrain base class. That would allow the freedom of deriving the grain from Grain or QueuedGrain regardless of the way the state is persisted. For me event sourcing should be limited to the State and StateProvider only. If that is achievable then all that would be needed for the QueuedGrain API is to understand that the state is event sourced and use existing events instead of update messages to synchronise the state. Event-sourced state requires the definition of the |
How does QueuedGrain not require the definition of similar handlers for update messages, if update messages are also diffs? Magic? |
It does - that's what I'm trying to say: with event sourced state |
I think I now understand what @jkonecki is proposing... this has the potential to be super easy to use. Basically, you write an event sourced grain exactly as previously proposed (using Internally I can translate all of the JournaledGrain state interface (ReadStateAsync/WriteStateAsync/ClearStateAsync/RaiseStateEvent) into QueuedGrain API, without the user needing to know anything about the latter. |
nice, that hint had a proper effect ))
click
you owe me a beer ;) |
I created a gist with my design for event-sourced grains - please comment on it: |
@jkonecki I like it! 👍 |
Looks very functional! Immutable state, transition as left fold - very nice! |
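For readers without access to the gist, "transition as left fold" can be sketched as below. This is an illustration of the general technique only and makes no claim about the gist's actual types: `CounterState` and `CounterTransition` are hypothetical, and events are plain integers to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical immutable state: every transition returns a new instance.
public sealed class CounterState
{
    public CounterState(int value)
    {
        Value = value;
    }

    public int Value { get; }
}

public static class CounterTransition
{
    // Pure transition function: (state, event) -> new state.
    public static CounterState Apply(CounterState state, int delta)
    {
        return new CounterState(state.Value + delta);
    }

    // Current state is the left fold of the transition over the event stream.
    public static CounterState Fold(IEnumerable<int> events)
    {
        return events.Aggregate(new CounterState(0), Apply);
    }
}
```

`Enumerable.Aggregate` with a seed is the left fold here; because `Apply` never mutates its input, any intermediate state can be kept and inspected without affecting later ones.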
You can still cheat by updating the passed state and returning the same |
@jkonecki I suspect for states having multitude of fields it might be better choice. That makes transition function returning a state not very convenient. It's just +1 line for |
@jkonecki honestly, as a long-time OO aficionado, I don't like a design with a separate explicit state class. Looks anti-OO to me. I do prefer to have mutable fields directly in my actors. Having immutable state in a single-threaded actor doesn't give much benefit. Also, I don't like prepending What if instead of passing State we will be passing Grain inside? Then there will be choice, either to modify |
Orleans separates state from grains themselves. Do you want to have them merged into a single class? So your state properties will be declared directly inside the grain? That wasn't a part of my design as it would have a huge impact on the whole framework. In my design for non-ES grains you just mutate the existing state ( I actually like the separation of logic and state in Orleans... |
Sure. Just matter of taste 😄 |
Thanks for posting the example! Overall, looks pretty much like I expected. There is certainly no problem with supporting this exact JournaledGrain interface on top of QueuedGrain. There are some points I think are worth discussing though. Mostly they are about reducing the amount of code users have to write for the expected common case. I would be interested in hearing what your thoughts are.
Subclasses vs. Marker interfaces: I am not sure that as a user, I would prefer to write public class MyJournaledGrain : Grain<MyState>, IGrainJournaledState<MyJournaledState, MyJournaledStateTransition> as opposed to just public class MyJournaledGrain : JournaledGrain<MyJournaledState,MyJournaledStateTransition> A journaled grain is just a specialized version of a grain... that is what subclasses are for. Why make it more complicated than necessary? Subclasses are also better for Intellisense... I had trouble occasionally with finding the extension methods because of missing using clauses.
Separate object for state transitions: Using a separate object for defining the state transitions may be o.k. for cases where you have a strong desire to emphasize the conceptual separation between event and state, but it requires users to write more code than if they put the apply function directly into the events (that is how the QueuedGrain API does it currently). It is also easy to support both mechanisms (users can choose how they want to do it).
Mutable vs. Immutable State: I think it will be more common to use mutable state than immutable state (as @yevhen mentions, because actors already take care of concurrency), thus I would also vote to save that one line of code (note that users can still use immutable data structures inside the grain state, so nothing is really lost). Perhaps we can also support two separate Transition classes, one for each style.
None of these are very important points that we need to spend much discussion effort on, except maybe the first point. 
I think we may want to design some clean extension mechanism for creating specialized grains. Chances are good that we will want to experiment with various grain versions for a while to come. I am currently creating QueuedGrains as a special case, and I think it would be better to have this be done with a general mechanism that also supports JournaledGrains. |
That won't be a problem since extension methods will reside in If we introduce
My existing ES branch actually has |
yes, I think that is right. StatefulGrain can remain internal to the runtime.
No need. QueuedGrain is just a thin adaptor for the underlying replication provider (which has the exact same API), there is no real code inside. JournaledGrain can just call the replication provider directly. |
So are we OK to introduce internal StatefulGrain with common state logic and derive I updated my gist - https://gist.github.com/jkonecki/26b5bec619757e199e2d |
As a newcomer I find it difficult to catch up with this volume of information. Still working on it. :-) But I would like to mention something I was looking at before meeting actors: FT-Corba. It has a mix of journal and snapshots: http://www.omg.org/spec/FT/1.0/ |
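A rough sketch of mixing a journal with snapshots, as raised in the comment above and earlier in the thread for grains with many events. It is not taken from FT-CORBA or from any Orleans provider: `SnapshottingLogSketch` is a hypothetical in-memory class, the state is a single integer balance, and the snapshot interval is an arbitrary parameter.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical journal that stores a snapshot every N events.
public sealed class SnapshottingLogSketch
{
    private readonly List<int> events = new List<int>();
    private readonly int snapshotEvery;
    private int snapshotBalance;
    private int snapshotVersion;

    public SnapshottingLogSketch(int snapshotEvery)
    {
        if (snapshotEvery <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(snapshotEvery));
        }
        this.snapshotEvery = snapshotEvery;
    }

    // Number of events replayed by the most recent Recover call.
    public int ReplayedOnLastRecovery { get; private set; }

    public void Append(int amount)
    {
        events.Add(amount);
        if (events.Count % snapshotEvery == 0)
        {
            // Fold the tail into a new snapshot, then advance the snapshot version.
            snapshotBalance = Recover();
            snapshotVersion = events.Count;
        }
    }

    // Rebuilds the balance from the latest snapshot plus the events after it.
    public int Recover()
    {
        var balance = snapshotBalance;
        ReplayedOnLastRecovery = 0;
        for (var i = snapshotVersion; i < events.Count; i++)
        {
            balance += events[i];
            ReplayedOnLastRecovery++;
        }
        return balance;
    }
}
```

Recovery cost is bounded by the snapshot interval instead of the full history, at the price of extra writes; the full journal is still kept, so replaying to an earlier point in history remains possible.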
#1854 will add a robust solution for this. |
Resolved via #1854. |
It would help to have a general programming model for using Event Sourcing as an approach to managing grain state. Likely that would be an alternative or an extension to the declarative persistence model. The goal is to have a unified model that would allow different implementations (and against different data stores), so that an app can pick and choose which ones to use.