consume file streamer output as reliable message queue #13652
Comments
For the "new block event", do you mean the Tendermint one? |
The key is to provide the above guarantee, so the natural implementation should be watching new files:
|
But since the tendermint When something goes wrong with the consensus state machine, for example when an app hash mismatch happens, we probably have to manually roll back the file streamer output and the downstream consumers. |
Since the client checks the length on read, it can just retry the fetch if the data is incomplete. |
I think it's still better if the event can provide that guarantee, so that when things work properly, clients don't have to retry at all. |
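One way to picture the guarantee under discussion (the new block event for a height is emitted only once that block's files are complete) is a small in-process notifier that a long-polling handler could wait on. This is a minimal sketch, not the PR's implementation: `NewBlockNotifier`, `publish` and `wait_for_newer` are hypothetical names, and it assumes the producer calls `publish` strictly after it has finished writing the files.

```python
import threading


class NewBlockNotifier:
    """Tracks the latest published block height and lets callers long-poll for newer ones."""

    def __init__(self, latest=0):
        self._latest = latest
        self._cond = threading.Condition()

    def publish(self, height):
        # Assumption: the caller invokes this only after the data and meta
        # files for `height` have been completely written.
        with self._cond:
            if height > self._latest:
                self._latest = height
                self._cond.notify_all()

    def wait_for_newer(self, after, timeout):
        # Block until a height greater than `after` has been published,
        # or until `timeout` seconds elapse. Returns the latest height,
        # or None on timeout so the caller can simply poll again.
        with self._cond:
            self._cond.wait_for(lambda: self._latest > after, timeout=timeout)
            return self._latest if self._latest > after else None
```

Because the event is published after the write, a client that reads a block only after being notified of its height should not see partial data, which is the "no retry when things work properly" behaviour asked for here.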
With the remote approach, are we making guarantees around delivery and how the response interacts with the node? I'm more in favour of offering only the file system and then writing a small remote command-line tool that is not part of the node. |
It provides offset-based subscription: the client maintains the consumption offset (block height), which gives an at-least-once delivery guarantee.
Yeah, I think the server feature can perfectly well run in a separate process, which may or may not be embedded inside the node. |
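The offset-based, at-least-once consumption described in this comment can be sketched roughly as below. This is an illustration under stated assumptions rather than an actual client: `consume`, `fetch_block`, `handle_block` and `save_offset` are hypothetical names, and how blocks are fetched (local file or HTTP) is left to the caller.

```python
def consume(start_after, latest_height, fetch_block, handle_block, save_offset):
    """Process blocks start_after+1 .. latest_height in order.

    Returns the last height whose offset was saved.
    """
    offset = start_after
    for height in range(start_after + 1, latest_height + 1):
        data = fetch_block(height)
        # If handling raises, the offset is left unchanged, so the same
        # block is delivered again on the next run.
        handle_block(height, data)
        # Commit the offset only after the block was handled successfully.
        save_offset(height)
        offset = height
    return offset
```

Since the offset is saved only after a block has been handled, a crash between the two steps leads to the block being handled a second time after restart. That is what makes this at-least-once rather than exactly-once, so the handler should be idempotent.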
In favor of this approach. How would stopNodeOnErr work here? |
Do you mean |
The filesystem is a really great buffer/caching layer for an event delivery system! But it decouples producer and consumer in a way that makes it really difficult to achieve reliable delivery. First, even if you give up on inotify and poll, it's still super difficult to establish that a state on a local FS is equivalent to a state on a [remote] consumer. The FS is by its nature asynchronous: caching, buffering, all sorts of stuff happen even before you get to the OS interfaces, and that's even before you get to the language adapters! If you need reliable delivery, you more or less have to use direct, synchronous connections between producer and consumer. A filesystem makes that impossible. Happy to say more on the topic if anyone is interested. But, assuming a standard definition of "reliable delivery", the FS is unfortunately not an option. |
If by reliable delivery we mean at-least-once delivery semantics without strict latency guarantees, I don't see why it won't work. Can you show a specific case where it fails? |
If we're moving towards a plugin-based design, shouldn't all consumers be plugin-based? Including a file consumer? |
That is the design that the PR implements. Closing this for now.
Summary
In #13516, we fixed the state listener issues and refactored the file streamer output.
We also realized that the file streamer output can be consumed reliably like a message queue, either locally or remotely, using inotify and equivalents [1] to provide real-time events.
Problem Definition
Proposal
Provide utilities that allow consuming the file streamer output conveniently and reliably.
Local
For local clients, one can just use the fsnotify library [1] to monitor file system events in the file streamer output directory and read the local files.
We should implement some utilities to make it convenient.
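As a rough idea of what such a utility might look like, here is a minimal sketch that scans the output directory for blocks that have both files present. It swaps the fsnotify/inotify watching proposed above for a plain directory scan so that it needs only the standard library; a watcher would trigger the same scan on each file system event. The file names `block-{N}-data` and `block-{N}-meta` are an assumption that mirrors the endpoint paths in this proposal, and `heights_with_both_files` is a hypothetical helper name.

```python
import os
import re

# Assumed naming scheme: block-{N}-data and block-{N}-meta.
_BLOCK_FILE = re.compile(r"^block-(\d+)-(data|meta)$")


def heights_with_both_files(directory, after=0):
    """Return the sorted block heights > `after` that have both a data and a meta file."""
    seen = {}
    for name in os.listdir(directory):
        match = _BLOCK_FILE.match(name)
        if match:
            height = int(match.group(1))
            seen.setdefault(height, set()).add(match.group(2))
    return sorted(
        height
        for height, kinds in seen.items()
        if height > after and kinds == {"data", "meta"}
    )
```

The presence of both files says nothing about whether they have been fully written, so a consumer would still need a completeness check before trusting the contents.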
Remote
For remote clients, we should provide an HTTP server that serves the static files plus a long-polling endpoint for the new block event:
- /block-{N}-data, download the data file of block N
- /block-{N}-meta, download the meta file of block N
- /newblock, a long-polling endpoint that provides the new block event.
When the event for block N is emitted, the endpoints /block-{N}-data and /block-{N}-meta are guaranteed to return complete data.
/block-{N}-data and /block-{N}-meta are allowed to return partial data before the new block event is emitted.
The HTTP service setup is flexible; for example, one can set up nginx to serve the static files and reverse proxy the /newblock endpoint.
Client
The client should record the last block number that it has successfully consumed, process new blocks from there, and subscribe to /newblock event notifications after it has caught up.
File streamer settings
To treat the file streamer output as a reliable message queue, it should be configured as a reliable one:
- stopNodeOnErr=true, for eventual consistency.
- fsync=true, so we don't lose data in the face of a system crash.
Footnotes
[1] https://github.com/fsnotify/fsnotify