# Choose signature aggregation dissemination strategy for mainnet #1331
We have been investigating the attestation aggregation and dissemination problem for a while.

### Preliminary scalability considerations

There will be many participant nodes, even if we consider only the part of the overall network active at a given slot, e.g. attesters, intermediate nodes (aggregators, transient nodes) and proposers (of next blocks). Around 300K attesters are expected initially (10M ether staked). This means about 300 attesters per shard/committee. Given 16 committees per slot, that is around 5K nodes. The number of validators may grow in the future: with 1M validators there will be about 1K attesters per committee, i.e. around 15-16K nodes with at least one attester (assuming 100K nodes overall, it will be rare for a node to host more than one attester).

The same issue arises with shard subnets: around 300 validators plus 200 standard nodes are expected to listen to a shard. There are several shards in a subnet, so a subnet is several times larger than a committee. There are up to 16 active subnets in a slot, so that is a lot of nodes too.

Given all of the above, one should be very careful when designing an aggregation/dissemination protocol. We'll look at this in more detail in the following sections.

NB The estimates of 300 validators per shard and 200 standard nodes per shard are based on p2p/issues/1 and p2p/issues/6.
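A back-of-the-envelope sketch of these estimates (Python; the constants come from the assumptions above, not from the spec, and `SHARDS_PER_SUBNET` is an assumed value standing in for "several"):

```python
# Rough scalability estimates, mirroring the numbers in this comment.
TOTAL_ATTESTERS = 300_000        # ~10M ether staked initially
COMMITTEES_PER_SLOT = 16
ATTESTERS_PER_COMMITTEE = 300    # ~300K attesters spread over an epoch

VALIDATORS_PER_SHARD = 300       # from p2p/issues/1
STANDARD_NODES_PER_SHARD = 200   # from p2p/issues/6
SHARDS_PER_SUBNET = 4            # "several" shards per subnet (assumed)

active_attesting_nodes = COMMITTEES_PER_SLOT * ATTESTERS_PER_COMMITTEE
subnet_size = SHARDS_PER_SUBNET * (VALIDATORS_PER_SHARD + STANDARD_NODES_PER_SHARD)

print(f"attesting nodes per slot: ~{active_attesting_nodes}")  # ~4800, i.e. ~5K
print(f"listeners per subnet:     ~{subnet_size}")             # ~2000
```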
### Aggregation and result delivery are separate problems

The network specification expects individual attestations to be disseminated on shard subnet topics and aggregated attestations to be forwarded to the `beacon_attestation` topic. This implies that only a subset of aggregators should send their results to proposers (`beacon_attestation`). Otherwise there is too much traffic, and it is probably simpler to send individual attestations directly (unless the number of aggregators is much smaller than the number of attesters). This holds regardless of which aggregation protocol is employed. E.g. if Handel is employed to aggregate attestations, the aggregating nodes still have to decide who sends the results.

### Partial aggregates may be okay

Since several aggregators have to send their results to proposers, it may be okay not to wait until aggregates become complete or near complete (i.e. include all or almost all individual attestations). Given network or byzantine failures, this is a highly desirable property, since some attestations may be lost for various reasons. It allows the aggregation protocol to stop before a final solution is obtained (which may come too late). However, it raises additional problems (see below).

### Coordinated vs random aggregation

Let's look at the aggregation part in more detail. There are three general kinds of protocols to aggregate data in p2p networks: tree-based (coordinated) protocols, gossip-based (random) protocols, and hybrid approaches.

When node/link failures cannot be ignored, we have only two options: either a gossip or a hybrid approach. A gossip approach has a significant drawback: since partial aggregates are sent in a random way, at some point it becomes difficult or impossible to merge two partial aggregates, because the sets of their attesters overlap, i.e. there are one or more attesters whose attestations are included in both partial aggregates. The problem is caused by the Attestation structure, which uses bitfields to account for attesters. A coordinated approach is required to avoid this, so that nodes communicate in a way that keeps partial aggregates non-overlapping. Organizing nodes in a tree is an ideal choice in a fault-free setup, but in a byzantine context a forest of trees should rather be constructed to mask failures and message omissions.
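A minimal sketch of the merge rule that causes this (Python; `PartialAggregate` is a hypothetical stand-in for the spec's `Attestation` with its aggregation bitfield, and BLS signature aggregation is elided):

```python
from dataclasses import dataclass

@dataclass
class PartialAggregate:
    bits: int         # aggregation bitfield, one bit per committee member
    signature: bytes  # aggregated BLS signature (aggregation elided here)

def can_merge(a: PartialAggregate, b: PartialAggregate) -> bool:
    # Two partial aggregates can be merged only if no attester is counted
    # in both: the bitfield cannot express multiplicity, and the BLS
    # aggregate would double-count the shared signature.
    return (a.bits & b.bits) == 0

def merge(a: PartialAggregate, b: PartialAggregate) -> PartialAggregate:
    assert can_merge(a, b), "overlapping aggregates cannot be merged"
    return PartialAggregate(
        bits=a.bits | b.bits,
        signature=b"<placeholder>",  # real impl: BLS aggregation of both
    )
```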
### Medium-sized partial aggregation

Gossip-like protocols are attractive because they require less coordination and are well matched to a p2p communication graph. It is also beneficial (and may even be required, if the slot duration is about to elapse) to stop the aggregation stage before a final result is reached. The beacon block structure actually allows storing multiple partial attestations of a committee. The main obstacle is the limit of 128 on the total number of Attestations. More importantly, storing too many attestations will bloat a beacon block, which can be a problem for scalability. However, we think the problem can be resolved with a proper block structure and/or smart compression. See here for details.
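To illustrate how a proposer might pack medium-sized partial aggregates under such a limit, here is a hypothetical greedy sketch (Python; it reuses the `PartialAggregate` type above, and the heuristic itself is an illustrative choice, not a spec rule):

```python
MAX_ATTESTATIONS = 128  # limit on Attestations per beacon block

def pack_partial_aggregates(candidates: list[PartialAggregate],
                            cap: int = MAX_ATTESTATIONS) -> list[PartialAggregate]:
    # Greedy heuristic: take the largest partial aggregates first,
    # skipping any that overlap with attesters already covered, so
    # the block stores few, mostly-disjoint partial attestations.
    chosen: list[PartialAggregate] = []
    covered = 0
    for agg in sorted(candidates, key=lambda a: bin(a.bits).count("1"), reverse=True):
        if len(chosen) == cap:
            break
        if (agg.bits & covered) == 0:
            chosen.append(agg)
            covered |= agg.bits
    return chosen
```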
### Handel is a partial solution

Handel is an interesting protocol; however, as follows from the notes above, it is not a complete solution. First, Handel requires pairwise connections between nodes, which does not fit a p2p graph well: instead of direct connections, messages will pass through transient nodes, which means (a) additional delays and (b) opportunities for byzantine attacks. The latter is not Handel-specific, though. Second, after Handel completes (fully or partially), the results still have to be sent to proposers in a reliable fashion, the problem common to all attestation aggregation/dissemination strategies (discussed above). Third, the Handel paper says that Handel is able to aggregate 4K attestations in under 1 second in a UDP setup; however, the Handel developers report it is three times slower when using QUIC. In a p2p graph, where a pairwise connection between nodes has to be emulated by sending messages via transient nodes, there is additional latency. So it is not clear whether Handel is performant enough when implemented in the context of Ethereum 2.0 requirements. Overall, if we follow a coordinated route, Handel seems to be a very good starting point, which should be augmented to resolve the above issues.

### Topic-based Publish-Subscribe pattern seems to be a poor match

As noted above, the network specification expects individual attestations to be delivered to shard subnet topics. However, fully delivering individual attestations to a subnet topic is very resource consuming. Earlier, we estimated that there are 16 committees of 300-1000 senders, and each should send to a subnet topic several times that size, i.e. around a thousand subscribers. This is actually excessive, since individual attestations only have to be delivered to some of the aggregators. The final aggregation is obtained via several rounds of an aggregation protocol. If all individual attestations are propagated to all members of a shard subnet, then there is no need for an aggregation protocol at all, since the attestations can be sent to proposers directly, with less effort (assuming the number of proposers in `beacon_attestation` is much smaller than the number of subscribers to a shard subnet topic).

An aggregation protocol also doesn't match the topic-based publish-subscribe pattern, since aggregators send partial aggregates which grow with each round, so the messages differ. The final stage, where aggregators send their results to proposers, looks like a good match at a high level. However, considering implementation details, the subscribers to the `beacon_attestation` topic are constantly changing. This is a serious problem with topic membership management, which is discussed in the following section.

### Overlay management and Topic discovery

New proposers should subscribe to the topic beforehand to be able to receive results, and later unsubscribe (to keep the set of topic subscribers small). The relevant information about topic membership changes should be propagated to aggregator nodes, so that they know whom to send their results to. As the specification assumes that `beacon_attestation` subscribers are mostly proposers, we assume there won't be many of them -- around tens. From a scalability point of view, there should ideally be one subscriber each slot -- the proposer of the next slot. However, it's safer to assume there will be proposers of some slots before and after the current one. If the topic membership is small and changes rapidly, it will be a problem for gossipsub to maintain the mesh for the topic. Basically, we should assume that a gossipsub router at a node has to query the Topic Discovery service beforehand for the latest topic changes. Moreover, for a validator assigned to attest at a particular slot, it is most important that the topic membership information includes the entry of the next slot's proposer.
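A minimal sketch of the subscription window this implies (Python; the one-slot lookahead and linger margins are illustrative assumptions, not spec values):

```python
LOOKAHEAD_SLOTS = 1  # subscribe this many slots before proposing (assumed)
LINGER_SLOTS = 1     # stay subscribed this long after proposing (assumed)

def should_subscribe(current_slot: int, proposal_slots: set[int]) -> bool:
    # A proposer of slot s stays subscribed to beacon_attestation during
    # [s - LOOKAHEAD_SLOTS, s + LINGER_SLOTS], so aggregates arriving a
    # little early or late still reach it, while keeping membership small.
    return any(
        s - LOOKAHEAD_SLOTS <= current_slot <= s + LINGER_SLOTS
        for s in proposal_slots
    )
```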
### Topic Discovery and BFT

Another critical problem is the byzantine fault tolerance properties of the Topic Discovery service. An adversary can advertise wrong records in the Topic Discovery service, or run a Topic Discovery instance that serves honest nodes wrong records about who the members of the `beacon_attestation` topic are. Honest nodes will then send their attestations in the wrong direction.

Basically, Topic Discovery is built around a Kademlia DHT, and p2p DHTs are known to have problems with BFT. BFT in the context of p2p and DHTs is also discussed here.
---

Good stuff :-) On Handel:
---

Intention is to use the simple approach currently in the network/validator specs. Will revisit this if we run into issues on testnets.
---

On mainnet, the expectation is that unaggregated attestations will be disseminated to a shard-specific topic, then aggregated and forwarded to a beacon-chain-wide topic so that block proposers can propose blocks without having to listen to all shard attestation channels.
In the networking spec, the aggregation strategy is left open, with a few notable alternatives having been discussed in the past (please add any that I missed):
1. A random selection of validators is responsible for packaging attestations and forwarding them to the beacon topic - for example, the first N in the committee.
2. As 1), but the random selection is a probability function instead, where validators roll a local die, with the probability of doing the work increasing as time passes while nobody has passed the attestation along - the function could weight certain validators higher to prevent collisions (see the sketch below).
3. Handel
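A hypothetical sketch of option 2 (Python; the quadratic ramp, the 12-second slot duration, and the seeding scheme are illustrative choices, not from the spec):

```python
import hashlib
import random

def forward_probability(elapsed: float, slot_duration: float) -> float:
    # Probability of doing the aggregation work grows as the slot elapses
    # without anyone forwarding; a quadratic ramp is an arbitrary choice.
    return min(1.0, (elapsed / slot_duration) ** 2)

def should_forward(validator_index: int, slot: int,
                   elapsed: float, slot_duration: float = 12.0) -> bool:
    # Seed the local die per (validator, slot): each validator's roll is
    # fixed for the slot, so its decision flips to "forward" exactly once
    # as the rising probability crosses the roll. Per-validator weights
    # could be mixed into the threshold to prevent collisions.
    seed = hashlib.sha256(f"{validator_index}:{slot}".encode()).digest()
    rng = random.Random(seed)
    return rng.random() < forward_probability(elapsed, slot_duration)
```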