Phase 0 Wire Protocol #692
Comments
Great writeup! A few notes: There have been discussions on whether to use RLP for discovery and SSZ for messages. There has also been some discussion on using QUIC instead of UDP for discovery and TCP for all other peer communication. By using QUIC, we no longer need a 3-way handshake as in TCP, and we get built-in encryption. The main issue is that it hasn't been implemented in all programming languages.
What do we mean by this? Note that especially in cases where the validator set is very small, it is absolutely possible for the latest finalized block to be more than a weak subjectivity period behind the current block.
This feels backward. It's not the peer's latest finalized root that should be checked against the client head, it's the peer's head that should be checked against the client's latest finalized root. As an example of how the current approach is insufficient, consider the case where the client head is H with a latest finalized root R with parent P, and a peer says hello with a chain going through some different child R' of P that doesn't finalize any blocks. The peer's latest finalized root would be some ancestor of P; so this is clearly invalid from the point of view of the client, but it would still be accepted.
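The suggested direction of the check could be sketched like this (Python; `Block`, `Client`, and the helper are hypothetical stand-ins, not spec types): validate the peer's claimed head against the client's own latest finalized root, rather than the other way around.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Block:
    parent_root: Optional[str]

@dataclass
class Client:
    latest_finalized_root: str
    known_blocks: Dict[str, Block] = field(default_factory=dict)

def peer_chain_is_compatible(client: Client, peer_head_root: str) -> bool:
    """Walk back from the peer's claimed head through locally known blocks.
    The peer is compatible iff the walk passes through our latest finalized
    root; if we run out of known blocks first, we can't accept the head yet
    (fork, spam, or missing intermediate blocks we'd have to request)."""
    root = peer_head_root
    while root is not None:
        if root == client.latest_finalized_root:
            return True
        block = client.known_blocks.get(root)
        root = block.parent_root if block is not None else None
    return False
```

In the `R'`-of-`P` example above, a peer whose head descends from `R'` never passes through `R`, so the check fails as desired.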
Is there a need for this particular wire protocol spec to specify where unaggregated attestations get sent?
As the inventor of RLP, I'm inclined to prefer SSZ (or SOS if we end up replacing SSZ with SOS or something similar) 😆 The "simplicity over efficiency" desideratum would imply that we should ideally only use one algorithm per task (in this case the task being "serialization"), even if they have slightly different tradeoffs. I see the overhead of a few extra zero bytes in SSZ being tiny relative to the absolute size of the objects that would be transferred, so the gains wouldn't be large.
The point here is to make clear that you may think you are safe to sync but realize on the wire that you don't have recent enough information to safely sync. I suppose this information can (and maybe should) be known before you even contact peers. An epoch # is an absolute time so you can know the expected head of chain epoch before you sync and know you are in the danger zone without knocking on a peer's door. In this case, maybe this is not a wire protocol note, but a client implementer safety note that goes in another doc.
The peer with the lower finalized epoch (peer-1) cannot immediately tell just by shaking hands with the peer with the higher finalized epoch (peer-2) if peer-1's finalized epoch is in peer-2's chain or not. Peer-2 on the other hand can easily check if peer-1's finalized epoch is in their chain. If not, no use in sending blocks to them because to sync peer-2's blocks, peer-1 would have to revert. I'm not sure if what we currently have in (3) is backward, but it does not sufficiently cover all the cases in which two chains might be out of sync w.r.t. finality.
Not 100% sure. The expectation is that single signer attestations are sent to the shard subnet. Only rarely would a single signer attestation be broadcast to the beacon net as a "best effort" aggregation.
+1
Agree! So do we need a "shard subnet wire protocol" section?
Re-raising this point, but is there any particular reason why Also, if we are going to make topics plural, please can we make |
Nice writeup! This is quite a big document, so maybe opening a PR would allow for easier review (not sure what the best place for this would be though). I'll try giving some feedback anyway:
I'm leaning towards RLP for message encoding. Not because of efficiency, but because with SSZ we at all times need to know the exact message type that we're expecting. RLP is more flexible in that regard which would make it easier to support versioning, multiple possible response message types, and using the same gossip topic for multiple object types. Also, if we go with RLP at the discovery level anyway, then using it at the "content" level would be more consistent (but this is a different debate I guess).
I think whenever we send objects that are defined in the spec, we should put them in the message either as a (list of) SSZ blob(s) (if we go with non-SSZ as message format) or as a container field (for SSZ messages). Right now, it seems like we're redefining some object types (e.g. block header and block body).
Maybe we should have an explicit "Disconnect" message. This would enable peers to do some clean ups and also stop responding to any potential outstanding requests.
We should add a
I think the corresponding request is missing, no? We can probably just use the same format as we do for requesting beacon block headers.
The request contains both slot and root. Why? The slot seems to be redundant so I think removing it is better for simplicity.
I think we should try to keep this as simple as possible for now and don't specify any fast or light sync etc.
I think we should add beacon/shard to the protocol path, i.e.
👍
The "Shard Attestations" section includes both
This is similar to how block hashes are exchanged on the ethereum 1.0 wire protocol. The expectation is that you are only gathering information to "catch up" when initially finding peers and that beyond that you generally stay synced by listening to broadcasts on the wire. If you get a recent block and don't have its parent, you can ask for its parent via the root embedded in the block and walk the chain back until you find a common ancestor. If you've fallen way out of sync, you can reconnect to peers to get a list of roots. I see the potential utility of what you describe but am not sure needing to request the list after initial
👍
Will open up a PR after these initial comments. Thanks!
The current format of
Interesting. In the 1.0 protocol we use it for both distinguishing between entirely different chains (mainnet vs ropsten vs ..) as well as forks within a single chain, and contentious forks become different networks. I see there could be some use in coming up with a succinct format to describe forks independent of base "network". Nick Johnson had an interesting proposal on twitter. Something like this might be worth pursuing.
The
agreed
Seems reasonable. Could also signal a "why" if you had a particular reason
A method to sync state is necessary in the weakly subjective chain. Clients are expected to show up with some recent finalized root and peers are not expected to serve blocks since genesis in perpetuity. The term is more aptly described here as "State Sync" rather than "fast" as its primary use is to provide a recent state to a client. Also note that the state size is bounded to ~600MB even in the worst case and so should actually be fast. Gotta run. I'll get to your other few things in a bit, and will make some edits accordingly.
After a review (I will study/contemplate this later in more depth) of this lovely piece of work, I just want to raise a couple of questions, which most likely arise out of my own lack of understanding:
On wire encoding is it fair to say that for participation in the existing network, RLP is necessary as a software component for the medium term future regardless?
I assumed both finalized epoch and head are just used to select the best peer to connect to and sync from. The justified epoch would just be an additional piece of information to do this a little better. Maybe in practice the head is already enough though, so this might not be necessary.
Yes, in Eth1.0 there's kind of a one to one relationship between networks and forks. But in Eth2.0, we have at least the beacon network and multiple (shard) subnetworks for the same (beacon) chain, and if we count the various topics as one network each we get even more. So I think we should differentiate the terms a bit more cleanly.
I think/hope libp2p handles this for us already so nothing we should worry about.
I strongly think we should agree on some encoding scheme and require it. Otherwise different clients won't be able to talk to each other and the network will cluster by implementation.
I think that's also in libp2p's domain. cc @raulk
This shouldn't be in this specification because it is handled by lower layers of the stack.
From what I understand, what often happens in distributed systems is that stuff gets lost along the way, for a variety of reasons. Thus, what would be useful is an ancestor request, such that when you receive a block or attestation whose parent is unknown, you can quickly recover and establish if this is a good block or not. What you need:
When you receive a block, you need to check that you have a path back to a tail block you know about - if you don't, it's either a block that's not part of the chain (spam?), or you're missing links. To find out, you have to try downloading the missing block headers, using slot time to establish a horizon to tell one situation from the other. This kind of request is very flexible, because it allows you to join the network, start listening to broadcast chatter and discover what the most popular heads are (assuming broadcast network is namespaced / separate for every chain - it's fairly low cost to add a network id to every broadcast, which will greatly help debugging, if nothing else) The approach is nice for a few different reasons:
What you do as a client is to divide blocks into two sets: resolved and unresolved. Resolved blocks are those for which you have established a path back to a finalized state (that you know through other means). Unresolved are all the others:
The request would look something like:
Basically, a start-pointer+length kind of request, but going backwards in time (this is already an optimization over getting blocks one at a time then getting their parents recursively, which also works) - this is incidentally the same as the

This request can return block headers or roots as appropriate (headers seem natural, because you want to use the response to verify that one of the blocks resolves to a known chain state). One problem with the approach is that when you receive an unknown block, you have to use network resources to find out if it's valid or not (the block might be signed by a validator that's not in your validator set, because the latest finalized block you know about does not yet contain it). This can be mitigated by:
The forward-direction request is slightly problematic in that it also needs to decide if it should branch out on all children or simply follow what the responding client considers to be the canonical chain at the time. Ancestor request does not have this issue.
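A minimal responder-side sketch of the backwards ancestor request described above (all names are hypothetical stand-ins, not spec types):

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class BlockHeader:            # minimal stand-in for the spec's header type
    root: str
    parent_root: Optional[str]

@dataclass
class AncestorRequest:        # hypothetical request shape
    start_root: str           # block whose ancestry we want to resolve
    max_headers: int          # cap on response size

def serve_ancestors(store: Dict[str, BlockHeader],
                    req: AncestorRequest) -> List[BlockHeader]:
    """Responder side: follow parent links backwards from start_root,
    returning up to max_headers locally known headers, newest first."""
    out: List[BlockHeader] = []
    root: Optional[str] = req.start_root
    while root is not None and len(out) < req.max_headers:
        header = store.get(root)
        if header is None:
            break  # we don't know this branch any further back
        out.append(header)
        root = header.parent_root
    return out
```

Because the walk only ever follows parent links, there is no branching decision to make - which is the advantage over the forward-direction request noted above.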
what's the use case for skipping slots at uniform intervals?
fwiw we just landed a simple implementation of the application layer requests in this protocol in Nimbus: status-im/nimbus-eth2#117 (the PR description is a bit off, it refers to an earlier attempt that we subsequently rewrote), and will be playing around with it for the initial sync. A detail is that we're running it over devp2p right now for convenience, but should be integrating with a libp2p transport soon:ish. One thing of note is that when you're a proposer, you want to include attestations from as many shards as possible, right? Effectively, this means that if attestation broadcasts are split up by shards, you end up having to listen to all of them anyway, under the naive attestation broadcast protocol where everyone just broadcasts their self-signed attestation.
What's lacking for this is a responsible entity that does the aggregation and broadcasts it here - also, what's an aggregate in this context? 2/3 votes? all? and who does the aggregation.. so many questions :) Pegasys' aggregation looks promising, but until we have that, in the implementation we're broadcasting everything everywhere - obviously, this will not do for 4 million validators!
The expectation is to have some subset of committee broadcast best-effort aggregates picked up from the shard subnet to the beacon (say last N of committee). These validators are already connected to the subnet for creating the attestation and are incentivized to have their attestations included in the beacon chain so are a natural pick. A proposer of a beacon chain block is not expected to sync to a bunch of different subnets to pick up and aggregate attestations. Adding this to honest validator guide shortly.
Agreed. Glad to have it in our back pocket, but want to get some nodes on testnets before we decide to go with a more sophisticated setup. Note that even with a more structured aggregation scheme, we need some sort of expected behavior defining what gets passed to the beacon net.
To distribute the load of requests across peers, and to also be able to provide a "skip" subset to light clients. Still digesting the rest of your post.
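For concreteness, the skip/closest-previous/collapse rule from the spec's Beacon Block Headers section could be sketched like this (`blocks_by_slot` is a hypothetical local store; the sketch reproduces the spec's `[2, 4, 6, 8, 10]` examples):

```python
from typing import Dict, List

def select_headers(blocks_by_slot: Dict[int, object], start_slot: int,
                   end_slot: int, skip_slots: int) -> List[object]:
    """Sample slots at uniform skip_slots intervals, fall back to the
    closest previous block when a slot has no block, and collapse any
    resulting duplicates, per the Beacon Block Headers semantics."""
    step = max(1, skip_slots)
    selected: List[object] = []
    for slot in range(start_slot, end_slot + 1, step):
        s = slot
        while s >= 0 and s not in blocks_by_slot:
            s -= 1  # closest previous block
        if s >= 0:
            block = blocks_by_slot[s]
            if not selected or selected[-1] != block:
                selected.append(block)  # duplicate blocks MUST be collapsed
    return selected
```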
but do they have an interest for others to be rewarded also? minimally no, right? since reward depends on total active validator balance, it's better for me if someone else loses balance? there's the social good argument of course that validators might remember who excluded them and reciprocate etc..
validators lose money if they aren't successfully crosslinking or finalizing. The best thing I can do as an individual is maximize participation of my committee and validators as a whole. If I happen to have some huge majority of validation, my optimal strategy might change to censorship.
How would state sync work in the case of skipped slots? Let's say a node A joins the network so has no saved state locally and will have to perform a

What happens when the slots before finalization were skipped slots in epoch 5? Node A downloads the whole state from its peers and will have the same finalized state as Node B. However now Node A has to catch up to the current head of the network and starts requesting blocks after the latest finalized epoch. Let's say due to network issues slots are skipped before finalization; every block that Node A imports now will point to a parent that the node cannot validate, as the parent is not saved, and that block will end up being thrown away despite being a valid block. This would end up with sync permanently stalling as it would never be able to reach the current head.
@nisdas Skipped slots don't matter, all A needs is blocks and they can still get them even if some slots in between are empty. A would just ask for all blocks from their last known slot to the current slot and B would reply with all non-skipped ones. I guess this may become slightly more inefficient if A tries to request from multiple Bs (as they don't know which blocks have been skipped, so the load won't be distributed equally), but that seems not very significant.
I've included updates to this spec here: #763. Closing this in favor of the PR. |
@jannikluhn The issue isn't requesting skipped slots. It's how we perform
If a new node joins the network they will start sync by requesting the finalized state at slot 64 (epoch 1). Then in order to sync till the current head at slot 94, it will request blocks from (65, 94). So when the node receives the response containing the blocks in this range, it will process the block at slot 66 (since 65 was skipped). But when we run this block through the state transition function, it will be rejected since we do not have the previous chain head, which is actually the block at slot 60. We would also need to save the parent block (slot 60) in state sync in order for sync to work.
Ah, I think I get what you mean now. The state does contain the latest block header though, so this should deal with it, right? |
Ahh, yes that would solve it. This is a bug on our end since we generate the previous block root from our locally saved block instead of using the one kept in state. Thanks!
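The fix discussed above - deriving the expected parent root from the header kept in the synced state rather than from a locally saved block - might look like this (stand-in types; `header_root` is only a placeholder for SSZ `hash_tree_root`):

```python
import hashlib
from dataclasses import dataclass

@dataclass
class BlockHeader:   # minimal stand-in for the state's latest_block_header
    slot: int
    parent_root: bytes
    body_root: bytes

def header_root(h: BlockHeader) -> bytes:
    # Placeholder for SSZ hash_tree_root - a real client would SSZ-merkleize.
    return hashlib.sha256(
        h.slot.to_bytes(8, "little") + h.parent_root + h.body_root).digest()

def can_import_first_block(state_latest_block_header: BlockHeader,
                           block_parent_root: bytes) -> bool:
    """After state sync, check the first imported block against the header
    kept in state - not against a locally saved block, which may not exist
    when the slots just before finalization were skipped."""
    return block_parent_root == header_root(state_latest_block_header)
```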
Follow-on to #593.
This specification describes Ethereum 2.0's networking wire protocol.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Use of `libp2p`

This protocol uses the `libp2p` networking stack. `libp2p` provides a composable wrapper around common networking primitives, including:

Clients MUST be compliant with the corresponding `libp2p` specification whenever `libp2p`-specific protocols are mentioned. This document will link to those specifications when applicable.

Client Identity
Identification
Upon first startup, clients MUST generate an RSA key pair in order to identify the client on the network. The SHA-256 `multihash` of the public key is the client's Peer ID, which is used to look up the client in `libp2p`'s peer book and allows a client's identity to remain constant across network changes.

Addressing
Clients on the Ethereum 2.0 network are identified by `multiaddr`s. `multiaddr`s are self-describing network addresses that include the client's location on the network, the transport protocols it supports, and its peer ID. For example, the human-readable `multiaddr` for a client located at `example.com`, available via TCP on port `8080`, and with peer ID `QmUWmZnpZb6xFryNDeNU7KcJ1Af5oHy7fB9npU67sseEjR` would look like this:

/dns4/example.com/tcp/8080/p2p/QmUWmZnpZb6xFryNDeNU7KcJ1Af5oHy7fB9npU67sseEjR

We refer to the `/dns4/example.com` part as the 'lookup protocol', the `/tcp/8080` part as the 'networking protocol', and the `/p2p/<peer ID>` part as the 'identity protocol'.

Clients MAY use either `dns4` or `ip4` lookup protocols. Clients MUST set the networking protocol to `/tcp` followed by a port of their choosing. It is RECOMMENDED to use the default port of `9000`. Clients MUST set the identity protocol to `/p2p/` followed by their peer ID.

Relevant `libp2p` Specifications

Transport
Clients communicate with one another over a TCP stream. Through that TCP stream, clients receive messages either as a result of a 1-1 RPC request/response between peers or via pubsub broadcasts.
Weak-Subjectivity Period
Some of the message types below depend on a calculated value called the 'weak subjectivity period' to be processed correctly. The weak subjectivity period is a function of the size of the validator set at the last finalized epoch. The goal of the weak-subjectivity period is to define the maximum number of validator set changes a client can tolerate before requiring out-of-band information to resync.
The definition of this function will be added to the 0-beacon-chain specification in the coming days.
Messaging
All ETH 2.0 messages conform to the following structure:
The protocol path is a human-readable prefix that identifies the message's contents. It is compliant with the `libp2p` `multistream` specification. For example, the protocol path for `libp2p`'s internal `ping` message is `/p2p/ping/1.0.0`. All protocol paths include a version for future upgradeability. In practice, client implementors will not have to manually prepend the protocol path since `libp2p` implements this as part of the `libp2p` library.

The compression ID is a single-byte sigil that denotes which compression algorithm is used to compress the message body. Currently, the following compression algorithms are supported:

- `0x00`: no compression
- `0x01`: Snappy compression

We suggest starting with Snappy because of its high throughput (~250MB/s without needing assembler), permissive license, and availability in a variety of different languages.
Finally, the compressed body is the SSZ-encoded message body after being compressed by the algorithm denoted by the compression ID.
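As a sketch, the framing described above - a compression ID byte followed by the (possibly compressed) SSZ body, with the protocol path itself handled by `libp2p` multistream - could look like this. The Snappy branch assumes the third-party `python-snappy` package; only the `0x00` path uses the stdlib:

```python
NO_COMPRESSION = b"\x00"
SNAPPY = b"\x01"

def encode_message(ssz_body: bytes, compression: bytes = NO_COMPRESSION) -> bytes:
    """Prepend the single-byte compression ID to the (possibly
    compressed) SSZ-encoded message body."""
    if compression == NO_COMPRESSION:
        return compression + ssz_body
    if compression == SNAPPY:
        import snappy  # assumed third-party dependency (python-snappy)
        return compression + snappy.compress(ssz_body)
    raise ValueError("unknown compression ID")

def decode_message(wire: bytes) -> bytes:
    """Split off the compression ID and return the decompressed SSZ body."""
    compression, payload = wire[:1], wire[1:]
    if compression == NO_COMPRESSION:
        return payload
    if compression == SNAPPY:
        import snappy  # assumed third-party dependency (python-snappy)
        return snappy.decompress(payload)
    raise ValueError("unknown compression ID")
```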
Relevant Specifications
Messages
The schema of message bodies is notated like this:
SSZ serialization is field-order dependent. Therefore, fields MUST be encoded and decoded according to the order described in this document. The encoded values of each field are concatenated to form the final encoded message body. Embedded structs are serialized as Containers unless otherwise noted.
All ETH 2.0 RPC messages prefix their protocol path with `/eth/serenity`.

Handshake
Hello
Protocol Path:
/eth/serenity/hello/1.0.0
Body:
Clients exchange `hello` messages upon connection, forming a two-phase handshake. The first message the initiating client sends MUST be the `hello` message. In response, the receiving client MUST respond with its own `hello` message.

Clients SHOULD immediately disconnect from one another following the handshake above under the following conditions:

1. If `network_id` belongs to a different chain, since the client definitionally cannot sync with this peer.
2. If the `latest_finalized_epoch` exceeds the weak-subjectivity period, since syncing with this peer would be unsafe.
3. If the `latest_finalized_root` shared by the peer is not in the client's chain at the expected epoch. For example, if Peer 1 in the diagram below has `(root, epoch)` of `(A, 5)` and Peer 2 has `(B, 3)`, Peer 1 would disconnect because it knows that `B` is not the root in their chain at epoch 3.

Once the handshake completes, the client with the higher `latest_finalized_epoch` or `best_slot` (if the clients have equal `latest_finalized_epoch`s) SHOULD send beacon block roots to its counterparty via `beacon_block_roots`.

RPC
These protocols represent RPC-like request/response interactions between two clients. Clients send serialized request objects to streams at the protocol paths described below, and wait for a response. If no response is received within a reasonable amount of time, clients MAY disconnect.
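As a non-normative sketch, the handshake rules above (the three disconnect conditions and which side sends `beacon_block_roots` first) might be implemented like this. `root_at_epoch` is a hypothetical lookup into the local canonical chain, and reading "exceeds the weak-subjectivity period" as an epoch delta is an assumption pending the spec's definition:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Hello:                     # mirrors the hello body described above
    network_id: int
    latest_finalized_root: bytes
    latest_finalized_epoch: int
    best_root: bytes
    best_slot: int

def should_disconnect(local: Hello, remote: Hello,
                      root_at_epoch: Callable[[int], Optional[bytes]],
                      weak_subjectivity_epochs: int) -> bool:
    """Apply the three SHOULD-disconnect conditions from the handshake."""
    if remote.network_id != local.network_id:
        return True   # condition 1: different chain
    if local.latest_finalized_epoch - remote.latest_finalized_epoch > weak_subjectivity_epochs:
        return True   # condition 2: too far apart to sync safely (assumed reading)
    expected = root_at_epoch(remote.latest_finalized_epoch)
    if expected is not None and expected != remote.latest_finalized_root:
        return True   # condition 3: peer's finalized root not in our chain
    return False

def sends_roots_first(local: Hello, remote: Hello) -> bool:
    """The client with the higher latest_finalized_epoch (or best_slot on
    ties) SHOULD send beacon_block_roots first."""
    if local.latest_finalized_epoch != remote.latest_finalized_epoch:
        return local.latest_finalized_epoch > remote.latest_finalized_epoch
    return local.best_slot > remote.best_slot
```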
Beacon Block Roots
Protocol Path:
/eth/serenity/rpc/beacon_block_roots/1.0.0
Body:
Send a list of block roots and slots to the peer.
Beacon Block Headers
Protocol Path:
/eth/serenity/rpc/beacon_block_headers/1.0.0
Request Body:
Response Body:
Requests beacon block headers from the peer starting from `(start_root, start_slot)`. The response MUST contain fewer than `max_headers` headers. `skip_slots` defines the maximum number of slots to skip between blocks. For example, requesting blocks starting at slot `2` with a `skip_slots` value of `2` would return the blocks at slots `[2, 4, 6, 8, 10]`. In cases where a block is undefined for a given slot number, the closest previous block MUST be returned. For example, if slot `4` were undefined in the previous example, the returned array would contain the blocks at `[2, 3, 6, 8, 10]`. If slot `3` were further undefined, the array would contain `[2, 6, 8, 10]` - i.e., duplicate blocks MUST be collapsed.

The `skip_slots` parameter helps facilitate light client sync - for example, in #459 - and allows clients to balance the peers from whom they request headers. Clients could, for instance, request every 10th block from a set of peers where each peer has a different starting block in order to populate block data.

Beacon Block Bodies
Protocol Path:
/eth/serenity/rpc/beacon_block_bodies/1.0.0
Request Body:
Requests the `block_bodies` associated with the provided `block_roots` from the peer. Responses MUST return `block_bodies` in the order provided in the request. If the receiver does not have a particular `block_root`, it MUST return a zero-value `block_body` (i.e., a zero-filled `bytes32`).

Response Body:
For type definitions of the below objects, see the 0-beacon-chain specification.
Beacon Chain State
Note: This section is preliminary, pending the definition of the data structures to be transferred over the wire during fast sync operations.
Protocol Path:
/eth/serenity/rpc/beacon_chain_state/1.0.0
Request Body:
Requests contain the hashes of Merkle tree nodes that, when merkleized, yield the block's `state_root`.

Response Body: TBD
The response will contain the values that, when hashed, yield the hashes inside the request body.
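Since the response format is still TBD, the following is purely illustrative: a recursive node download in which `request_nodes` stands in for the `/eth/serenity/rpc/beacon_chain_state/1.0.0` round trip and every response maps a requested hash to either child hashes or a leaf value:

```python
from typing import Callable, Dict, List, Tuple, Union

# (is_leaf, child hashes or leaf value) - an assumed node shape, not spec'd.
Node = Tuple[bool, Union[List[str], bytes]]

def fetch_state(request_nodes: Callable[[List[str]], Dict[str, Node]],
                state_root: str) -> Dict[str, Node]:
    """Breadth-first walk: ask peers for the preimages of node hashes,
    starting from state_root, until only leaf values remain."""
    store: Dict[str, Node] = {}
    frontier = [state_root]
    while frontier:
        responses = request_nodes(frontier)
        frontier = []
        for h, (is_leaf, payload) in responses.items():
            store[h] = (is_leaf, payload)
            if not is_leaf:
                frontier.extend(payload)  # payload holds the child hashes
    return store
```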
Broadcast
These protocols represent 'topics' that clients can subscribe to via GossipSub.
Beacon Blocks
The response bodies of each topic below map to the response bodies of the Beacon RPC methods above. Note that since broadcasts have no concept of a request, any limitations to the RPC response bodies do not apply to broadcast messages.
Topics:
- `beacon/block_roots`
- `beacon/block_headers`
- `beacon/block_bodies`
Voluntary Exits
Topic:
beacon/exits
Body:
See the 0-beacon-chain spec for the definition of the `VoluntaryExit` type.

Transfers
Topic:
beacon/transfer
Body:
See the 0-beacon-chain spec for the definition of the `Transfer` type.

Clients MUST ignore transfer messages if `transfer.slot < current_slot - GRACE_PERIOD`, where `GRACE_PERIOD` is an integer that represents the number of slots that a remote peer is allowed to drift from `current_slot` in order to take potential network time differences into account.

Shard Attestations
Topics: `shard-{number}`, where `number` is an integer in `[0, SHARD_SUBNET_COUNT)`, and `beacon/attestations`.

The `Attestation` object below includes fully serialized `AttestationData` in its `data` field. See the 0-beacon-chain spec for the definition of the `Attestation` type.

Body:

Only aggregate attestations are broadcast to the `beacon/attestations` topic.

Clients SHOULD NOT send attestations for shards that the recipient is not interested in. Clients receiving uninteresting attestations MAY disconnect from senders.
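A sketch of topic routing for attestations, assuming a simple modulo mapping from shard number to subnet (the mapping itself is an assumption, not specified here):

```python
SHARD_SUBNET_COUNT = 64  # placeholder value; the real constant lives in the beacon spec

def attestation_topics(shard: int, is_aggregate: bool) -> list:
    """Route an attestation to its shard subnet topic. Only aggregate
    attestations additionally go to beacon/attestations, per the rule
    above."""
    topics = ["shard-{}".format(shard % SHARD_SUBNET_COUNT)]
    if is_aggregate:
        topics.append("beacon/attestations")
    return topics
```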
Relevant Specifications
Client Synchronization
When a client joins the network, or has otherwise fallen behind the `latest_finalized_root` or `latest_finalized_epoch`, the client MUST perform a sync in order to catch up with the head of the chain. This specification defines two sync methods:

1. Standard sync, used when the client is near the `latest_finalized_root` or `latest_finalized_epoch`. In a standard sync, clients process per-block state transitions until they reach the head of the chain.
2. Fast sync, used when the client is far behind the `latest_finalized_root` or `latest_finalized_epoch`. In a fast sync, clients use RPC methods to download nodes in the state tree for a given `state_root` via the `/eth/serenity/rpc/beacon_chain_state/1.0.0` endpoint. The basic algorithm is as follows:
   1. Peer 1 has `(C, 1)` and Peer 2 has `(A, 5)`. Peer 1 validates that this new head is within the weak subjectivity period.
   2. Peer 1 adopts `(A, 5)` and sends `/eth/serenity/rpc/beacon_chain_state/1.0.0` requests recursively to its peers in order to build its SSZ BeaconState.

Note that nodes MUST perform a fast sync if they do not have state at their starting finalized root. For example, if Peer 1 in the example above did not have the state at `(C, 1)`, Peer 1 would have to perform a fast sync because it would have no base state to compute transitions from.

Open Questions
Encryption
This specification does not currently define an encrypted transport mechanism because the set of `libp2p`-native encryption libraries is limited. `libp2p` currently supports an encryption scheme called SecIO, which is a variant of TLSv1.3 that uses a peer's public key for authentication rather than a certificate authority. While SecIO is available for public use, it has not been audited and is going to be deprecated when TLSv1.3 ships.

Another potential solution would be to support an encryption scheme such as Noise. The Lightning Network team has successfully deployed Noise in production to secure inter-node communications.
Granularity of Topics
This specification defines granular GossipSub topics - i.e., `beacon/block_headers` vs. simply `beacon`. The goal of using granular topics is to simplify client development by defining a single payload type for each topic. For example, `beacon/block_headers` will only ever contain block headers, so clients know the content type without needing to read the message body itself. This may have drawbacks. For example, having too many topics may hinder peer discovery speed. If this is the case, this specification will be updated to use less granular topics.

Block Structure Changes
The structure of blocks may change due to #649. Changes that affect this specification will be incorporated here once the PR is merged.