Checkpoint chaincode event listening #362

bestbeforetoday · 2022-01-05T17:42:15Z

As an application developer
I want checkpoint capability for chaincode events
So that I can resume event listening following a connection error or application restart with minimal application code

Using checkpointing, event listening should resume after the last successfully processed event. No events should be missed and no duplicate events should be delivered.

Options for dealing with connection errors are to either:

Transparently retry the connection, without reporting transient failures to the application code (as in the v2.2 Node and Java SDKs); or
Surface connection errors to the application so that it can choose to reconnect by simply restarting event listening using the checkpointer.

For simplicity and to avoid obscuring persistent connection failures, the second approach might be preferred.
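A minimal sketch of what the second approach might look like from the application side. The names `listenWithReconnect` and `runSession` are hypothetical and not part of any client API; the checkpointer is left as an opaque type parameter, and the attempt limit is an assumption added so that a persistent failure still reaches the caller.

```typescript
// Hypothetical application-side helper: restart event listening after an
// error, reusing the same checkpointer so the new session resumes from the
// last checkpointed position.
async function listenWithReconnect<C>(
    checkpointer: C,
    runSession: (checkpointer: C) => Promise<void>,
    maxAttempts: number,
): Promise<void> {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            // runSession opens an eventing session positioned by the
            // checkpointer and processes events until the session ends.
            await runSession(checkpointer);
            return; // Session ended without error.
        } catch (error) {
            // The error was surfaced to the application; it chooses to retry.
            lastError = error;
        }
    }
    // Persistent failures are not hidden: the last error is rethrown.
    throw lastError;
}
```

Because the retry decision lives in application code, the application can equally choose to log the error, back off, or stop listening.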

It might be possible to implement checkpointing as purely a wrapper API around the existing (non-checkpointing) event listening API. The advantage to this approach would be simplicity of the core eventing implementation. The disadvantage would be that the checkpoint implementation could not take advantage of information available only within the eventing implementation, such as visibility of the number of events contained within a given block.

bestbeforetoday · 2022-03-09T18:24:08Z

Integrated vs wrapper implementation

A checkpointing solution tightly integrated into the existing event delivery turned out not to be practical, mainly due to the off-line signing flow. This serialises only the protobuf messages and loses any more transient information, such as a selected checkpointer, leaving no way to automatically recreate it when the request (along with its signature) is reconstructed after off-line signing takes place.

The approach currently favoured is a wrapper around the event iterator or channel.

Automatic checkpoint of events

The pull model used for event delivery (iterator / channel) provides good control of the rate of message delivery and easier management of the eventing session. However, it means that the client API has no way of knowing whether a supplied event was successfully processed by the client application or whether a processing error occurred. The push model used in the legacy SDKs (observer pattern) returns control to the client API after the application event consumer function is invoked, so a non-error return can be safely assumed to mean the event was successfully processed.

One approach to allow events to be automatically checkpointed by the API would be to provide a push (observer pattern) API for event delivery when checkpointing is used, similar to the legacy SDKs and built on top of the pull (iterator / channel) API. This would allow automatic checkpointing but is a lot of additional complexity and API surface to support.

The alternative currently favoured is to require an explicit checkpoint call by the client application on successful event processing.
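As a rough illustration of the explicit approach with the pull model, the sketch below checkpoints an event only after the application handler has completed without error. The `ChaincodeEvent` shape and the `consume` function are assumptions made for this example, not existing API.

```typescript
// Assumed minimal event shape for illustration only.
interface ChaincodeEvent {
    blockNumber: bigint;
    transactionId: string;
}

// Pull events one at a time and checkpoint each one explicitly after the
// handler resolves. If the handler rejects, the loop stops before the
// checkpoint call, so the failed event is not recorded as processed.
async function consume(
    events: AsyncIterable<ChaincodeEvent>,
    handle: (event: ChaincodeEvent) => Promise<void>,
    checkpoint: (blockNumber: bigint, transactionId?: string) => Promise<void>,
): Promise<void> {
    for await (const event of events) {
        await handle(event);
        await checkpoint(event.blockNumber, event.transactionId);
    }
}
```

The application, not the client API, decides when an event counts as processed, which is exactly the information the pull model cannot otherwise provide.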

bestbeforetoday · 2022-03-13T14:01:16Z

To allow resume of eventing with no missed or duplicate events, the checkpointing implementation needs to address two concerns:

Record the last successfully processed event (block number and transaction ID), with this information optionally stored persistently across application runs.
Ensure eventing resumes exactly after the last successfully processed event.

The ChaincodeEvents service already guarantees that events are delivered ordered by block number and position of the emitting transaction within the block so, during an event listening session, no events will be missed or duplicated.

Filtering out previously processed events at the client end presents different challenges depending on the implementation language. A better approach is to prevent the client from receiving previously processed events by adding an optional after_transaction_id field to the ChaincodeEventsRequest protobuf message, and for the Gateway service to only deliver events following that transaction ID. The client only needs to include the correct start block number and last successfully processed transaction ID in the request, and does not need to do any filtering of previously processed events, significantly simplifying the client implementation.

Extending the ChaincodeEventsRequest protobuf message has the added advantage of retaining the after_transaction_id during off-line signing flow.
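The intended delivery rule can be sketched as a pure function over an already ordered list of events. This is only a model of the behaviour described above, not the Gateway implementation; the `eventsAfter` name and the event shape are hypothetical, and delivering the whole start block when the transaction ID is not found in it is an assumption.

```typescript
// Assumed minimal event shape for illustration only.
interface ChaincodeEvent {
    blockNumber: bigint;
    transactionId: string;
    eventName: string;
}

// Model of server-side filtering: given events ordered by block number and
// by transaction position within the block, return only those that follow
// afterTransactionId within the start block, plus all later blocks.
function eventsAfter(
    ordered: ChaincodeEvent[],
    startBlock: bigint,
    afterTransactionId?: string,
): ChaincodeEvent[] {
    const fromStart = ordered.filter((event) => event.blockNumber >= startBlock);
    if (afterTransactionId === undefined) {
        return fromStart;
    }
    // Only the start block is searched for the previously processed transaction.
    const index = fromStart.findIndex(
        (event) => event.blockNumber === startBlock && event.transactionId === afterTransactionId,
    );
    // Assumption: if the transaction is not found (index is -1), nothing is skipped.
    return fromStart.slice(index + 1);
}
```

With this rule applied by the service, the client only supplies two values and never inspects or discards events itself.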

The Checkpointer interface/implementation is only required to store the current block number and, if any transactions within that block have been processed, the last successfully processed transaction ID. There is no need to store all previously processed transaction IDs within a block. So the Checkpointer interface can become:

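interface Checkpointer {
    // Record the current block and, if any, the last successfully processed transaction within it.
    checkpoint(blockNumber: bigint, transactionId?: string): Promise<void>;
    // Block from which to resume; undefined if nothing has been checkpointed.
    getBlockNumber(): bigint | undefined;
    // Last processed transaction in that block; undefined if none has been processed.
    getTransactionId(): string | undefined;
}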
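For illustration, a minimal in-memory implementation with this shape might look like the following. The class name and the `serialize` / `deserialize` helpers are assumptions added to show how the state could be stored across application runs; the block number is written as a string because JSON has no bigint representation.

```typescript
// Hypothetical in-memory checkpointer. It is structurally compatible with
// the Checkpointer interface above without needing to reference it.
class InMemoryCheckpointer {
    private blockNumber?: bigint;
    private transactionId?: string;

    async checkpoint(blockNumber: bigint, transactionId?: string): Promise<void> {
        // Only the latest position is kept; earlier transaction IDs are not needed.
        this.blockNumber = blockNumber;
        this.transactionId = transactionId;
    }

    getBlockNumber(): bigint | undefined {
        return this.blockNumber;
    }

    getTransactionId(): string | undefined {
        return this.transactionId;
    }

    // Assumed helper: produce a string suitable for writing to a file or database.
    serialize(): string {
        return JSON.stringify({
            blockNumber: this.blockNumber?.toString(),
            transactionId: this.transactionId,
        });
    }

    // Assumed helper: restore state written by serialize() in an earlier run.
    static deserialize(json: string): InMemoryCheckpointer {
        const state = JSON.parse(json) as { blockNumber?: string; transactionId?: string };
        const result = new InMemoryCheckpointer();
        if (state.blockNumber !== undefined) {
            result.blockNumber = BigInt(state.blockNumber);
            result.transactionId = state.transactionId;
        }
        return result;
    }
}
```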

Record last processed event

The checkpoint() method must be called after each event is successfully processed. Implementation options include:

Client application explicitly calls checkpoint().
We provide a wrapper iterator or helper method providing a no-args checkpoint() method that checkpoints the last delivered event.
We provide a mechanism to register an event callback, which is invoked for each event and checkpoints that event when the callback successfully completes.

I suggest we start with the first option, as it requires little or no additional implementation in the client API and is still trivial for the client application to use.
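For comparison, the second option (a wrapper with a no-args checkpoint() method) could be sketched roughly as follows. The `CheckpointingEvents` class, the `CheckpointSink` interface and the event shape are hypothetical names used only for this example.

```typescript
// Assumed minimal event shape for illustration only.
interface ChaincodeEvent {
    blockNumber: bigint;
    transactionId: string;
}

// The only part of a checkpointer this wrapper needs.
interface CheckpointSink {
    checkpoint(blockNumber: bigint, transactionId?: string): Promise<void>;
}

// Wraps an event iterable, remembers the last delivered event, and exposes
// a no-args checkpoint() that records that event.
class CheckpointingEvents implements AsyncIterable<ChaincodeEvent> {
    private lastDelivered?: ChaincodeEvent;

    constructor(
        private readonly events: AsyncIterable<ChaincodeEvent>,
        private readonly checkpointer: CheckpointSink,
    ) {}

    async *[Symbol.asyncIterator]() {
        for await (const event of this.events) {
            this.lastDelivered = event;
            yield event;
        }
    }

    async checkpoint(): Promise<void> {
        // Nothing to record until at least one event has been delivered.
        if (this.lastDelivered !== undefined) {
            await this.checkpointer.checkpoint(
                this.lastDelivered.blockNumber,
                this.lastDelivered.transactionId,
            );
        }
    }
}
```

The wrapper saves the application from passing the block number and transaction ID itself, at the cost of extra API surface, which is the trade-off weighed above.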

Specify last processed event on resume

The client application must specify both the start block and (if one exists) last successfully processed transaction ID when resuming eventing after a transient connection failure or in a subsequent application run. Implementation options include:

Add an afterTransactionId chaincode eventing option, in addition to the existing startBlock option. It would be the client application's responsibility to pass both the start block number and previous transaction ID when resuming eventing.
Allow a checkpointer chaincode eventing option, which we would use to obtain the correct start block and previous transaction ID values. It would be the client application's responsibility to pass the checkpointer instance when resuming eventing.

The second option is slightly less work for the client application and requires no extra work in the client API implementation, so I think I would favour that approach.
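Under the second option, the client API would derive the request values from the checkpointer, along these lines. The function name `resumeOptions`, the `defaultStartBlock` parameter and the fallback to it when nothing has been checkpointed are assumptions for this sketch.

```typescript
// Eventing options as discussed above: the existing startBlock plus the
// proposed afterTransactionId.
interface ChaincodeEventsOptions {
    startBlock?: bigint;
    afterTransactionId?: string;
}

// The read side of a checkpointer.
interface CheckpointState {
    getBlockNumber(): bigint | undefined;
    getTransactionId(): string | undefined;
}

// Hypothetical helper: translate checkpointer state into eventing options.
function resumeOptions(
    checkpointer: CheckpointState,
    defaultStartBlock?: bigint,
): ChaincodeEventsOptions {
    const blockNumber = checkpointer.getBlockNumber();
    if (blockNumber === undefined) {
        // Assumption: with nothing checkpointed, use the application's own start block.
        return { startBlock: defaultStartBlock };
    }
    return {
        startBlock: blockNumber,
        afterTransactionId: checkpointer.getTransactionId(),
    };
}
```

The first option amounts to the application performing this same mapping by hand on every resume.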

bestbeforetoday added the client and enhancement labels Jan 5, 2022

denyeart assigned denyeart and sapthasurendran and unassigned denyeart Feb 7, 2022

bestbeforetoday added this to the v1.1 milestone Feb 11, 2022

bestbeforetoday mentioned this issue Mar 4, 2022

Chaincode Event Checkpointing #398

Merged

bestbeforetoday mentioned this issue May 8, 2022

Scenario Test Chaincode Checkpointer #426

Merged

bestbeforetoday closed this as completed May 8, 2022
