For edge nodes of a sync server cluster, simple load balancing is enough to distribute load across cores & machines. But so far, these edge nodes have had to rely on a single core server (and thus a single CPU core) as their upstream, and that core server has had to process updates to all CoValues, making it a bottleneck.
In order to parallelise the core server and start introducing redundancy, we need to shard it by CoValue ID.
To achieve this, we need to implement a `ShardedPeer`, which consists of multiple other `Peer` definitions ("sub-peers"). It should multiplex outgoing messages to the correct sub-peer by using Rendezvous Hashing: picking the sub-peer with the smallest value of `hash(coValueID, peerName)`.
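As a rough illustration, the sub-peer selection could look like the sketch below. This is only a sketch, not the actual implementation: `fnv1a` is an illustrative stand-in for whatever hash function we end up using, and sub-peers are identified by plain string names here.

```ts
// Minimal sketch of rendezvous selection, picking the sub-peer with the
// smallest hash(coValueID, peerName), as described above.
// fnv1a() is a stand-in for whatever hash the real implementation uses.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function pickSubPeer(coValueID: string, peerNames: string[]): string {
  let best: string | undefined;
  let bestScore = Infinity;
  for (const name of peerNames) {
    const score = fnv1a(`${coValueID}/${name}`);
    if (score < bestScore) {
      bestScore = score;
      best = name;
    }
  }
  if (best === undefined) throw new Error("No sub-peers available");
  return best;
}
```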
It should support dynamic changes to the existence/availability of sub-peers.
Incoming messages from all sub-peers should be coalesced to give the appearance of the `ShardedPeer` being a single peer.
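A minimal sketch of how the multiplexing and coalescing could fit together, reusing `pickSubPeer` from above. The `SubPeer` and `SyncMessage` shapes here are simplified assumptions for illustration only, not the actual cojson `Peer` interface.

```ts
// Simplified sketch of the ShardedPeer itself, reusing pickSubPeer() from above.
// SubPeer and SyncMessage are assumed, simplified shapes for illustration.
type SyncMessage = { id: string /* CoValue ID */; [key: string]: unknown };

interface SubPeer {
  name: string;
  send(msg: SyncMessage): void;
  onMessage(handler: (msg: SyncMessage) => void): void;
}

class ShardedPeer {
  private subPeers = new Map<string, SubPeer>();
  private handlers: Array<(msg: SyncMessage) => void> = [];

  // Sub-peers can be added (and removed, below) at any time.
  addSubPeer(peer: SubPeer) {
    this.subPeers.set(peer.name, peer);
    // Coalesce: forward every sub-peer's incoming messages as if they
    // came from this single ShardedPeer.
    peer.onMessage((msg) => this.handlers.forEach((handle) => handle(msg)));
  }

  removeSubPeer(name: string) {
    this.subPeers.delete(name);
  }

  onMessage(handler: (msg: SyncMessage) => void) {
    this.handlers.push(handler);
  }

  // Multiplex: route each outgoing message to the sub-peer that rendezvous
  // hashing picks for the message's CoValue ID.
  send(msg: SyncMessage) {
    const target = pickSubPeer(msg.id, [...this.subPeers.keys()]);
    this.subPeers.get(target)!.send(msg);
  }
}
```

Because routing is a pure function of the CoValue ID and the set of currently known sub-peer names, adding or removing a sub-peer only moves the CoValues whose rendezvous winner actually changes, which is one of the main reasons to prefer Rendezvous Hashing over, say, modulo-based sharding.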
Note that a shard becoming unavailable will cause CoValue unavailability issues, because we only sync each CoValue with a single shard in this initial, naive setup. In this case, the known-state handling of the `ShardedPeer` as implemented in the `SyncManager` also becomes slightly wrong for affected CoValues, because it reflects the state of the now-unavailable shard.
This is, however, already a marked improvement over the single-core-server situation and a first step towards smarter behaviour in the future (such as picking two or three sub-peers for each CoValue ID, which Rendezvous Hashing lets us gracefully upgrade to later).
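For that future multi-replica variant, the only change to the selection logic would be keeping the k smallest scores instead of only the smallest, roughly along these lines (again just a sketch, reusing the illustrative `fnv1a` hash from above):

```ts
// Sketch of the future "two or three peers per CoValue" variant: keep the
// k smallest scores instead of only the smallest.
function pickSubPeers(coValueID: string, peerNames: string[], k: number): string[] {
  return peerNames
    .map((name) => ({ name, score: fnv1a(`${coValueID}/${name}`) }))
    .sort((a, b) => a.score - b.score)
    .slice(0, k)
    .map((entry) => entry.name);
}
```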
Setting up the shard peers correctly, and making sure that the shard peers also know about each other, is left as an exercise for the cluster deployer. This is important because CoValue dependencies (such as value-owned-by-group, account-member-of-group, or group-extended-by-group) need to be resolved correctly, and depended-on CoValues might live on another shard.
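Purely for illustration, the wiring for a three-shard cluster could be described along these lines; the names, addresses, and the idea of deriving each shard's peer list from a shared map are assumptions about the deployment, not part of this proposal.

```ts
// Illustrative wiring for a three-shard core cluster: each shard opens peer
// connections to every other shard, so that CoValue dependencies (e.g. the
// owning Group) that landed on a different shard can still be resolved.
// Names and addresses are placeholders.
const shardAddresses: Record<string, string> = {
  "shard-a": "wss://core-a.example.internal",
  "shard-b": "wss://core-b.example.internal",
  "shard-c": "wss://core-c.example.internal",
};

// For a given shard, list the other shards it should connect to as peers.
function upstreamShardsFor(selfName: string): [string, string][] {
  return Object.entries(shardAddresses).filter(([name]) => name !== selfName);
}

// e.g. upstreamShardsFor("shard-a") -> [["shard-b", "wss://..."], ["shard-c", "wss://..."]]
```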