
Phase 0 Wire Protocol #692

Closed
mslipper opened this issue Feb 26, 2019 · 23 comments

@mslipper
Contributor

mslipper commented Feb 26, 2019

Follow-on to #593.

This specification describes Ethereum 2.0's networking wire protocol.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Use of libp2p

This protocol uses the libp2p networking stack. libp2p provides a composable wrapper around common networking primitives, including:

  1. Transport.
  2. Encryption.
  3. Stream multiplexing.

Clients MUST be compliant with the corresponding libp2p specification whenever libp2p-specific protocols are mentioned. This document will link to those specifications when applicable.

Client Identity

Identification

Upon first startup, clients MUST generate an RSA key pair in order to identify the client on the network. The SHA-256 multihash of the public key is the client's Peer ID, which is used to look up the client in libp2p's peer book and allows a client's identity to remain constant across network changes.
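For illustration, a minimal Python sketch of this Peer ID derivation, assuming the public key bytes have already been serialized the way libp2p expects (key generation and that serialization are handled by the libp2p library itself; base58 supplies only the conventional human-readable form):

import hashlib

import base58  # pip install base58


def peer_id_from_public_key(serialized_pubkey: bytes) -> str:
    digest = hashlib.sha256(serialized_pubkey).digest()
    # A multihash is <varint hash code><varint length><digest>;
    # 0x12 is the registered code for sha2-256, 0x20 its 32-byte length.
    multihash = bytes([0x12, 0x20]) + digest
    # Peer IDs are conventionally displayed base58btc-encoded ("Qm..." prefix).
    return base58.b58encode(multihash).decode("ascii")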

Addressing

Clients on the Ethereum 2.0 network are identified by multiaddrs. multiaddrs are self-describing network addresses that include the client's location on the network, the transport protocols it supports, and its peer ID. For example, the human-readable multiaddr for a client located at example.com, available via TCP on port 8080, and with peer ID QmUWmZnpZb6xFryNDeNU7KcJ1Af5oHy7fB9npU67sseEjR would look like this:

/dns4/example.com/tcp/8080/p2p/QmUWmZnpZb6xFryNDeNU7KcJ1Af5oHy7fB9npU67sseEjR

We refer to the /dns4/example.com part as the 'lookup protocol', the /tcp/8080 part as the 'networking protocol', and the /p2p/<peer ID> part as the 'identity protocol'.

Clients MAY use either dns4 or ip4 lookup protocols. Clients MUST set the networking protocol to /tcp followed by a port of their choosing. It is RECOMMENDED to use the default port of 9000. Clients MUST set the identity protocol to /p2p/ followed by their peer ID.

Relevant libp2p Specifications

Transport

Clients communicate with one another over a TCP stream. Through that TCP stream, clients receive messages either as a result of a 1-1 RPC request/response between peers or via pubsub broadcasts.

Weak-Subjectivity Period

Some of the message types below depend on a calculated value called the 'weak subjectivity period' to be processed correctly. The weak subjectivity period is a function of the size of the validator set at the last finalized epoch. The goal of the weak-subjectivity period is to define the maximum number of validator set changes a client can tolerate before requiring out-of-band information to resync.

The definition of this function will be added to the 0-beacon-chain specification in the coming days.

Messaging

All ETH 2.0 messages conform to the following structure:

+--------------------------+
|       protocol path      |
+--------------------------+
|      compression ID      |
+--------------------------+
|                          |
|    compressed body       |
|    (SSZ encoded)         |
|                          |
+--------------------------+

The protocol path is a human-readable prefix that identifies the message's contents. It is compliant with the libp2p multistream specification. For example, the protocol path for libp2p's internal ping message is /p2p/ping/1.0.0. All protocol paths include a version for future upgradeability. In practice, client implementors will not have to manually prepend the protocol path, since the libp2p library handles this itself.

The compression ID is a single-byte sigil that denotes which compression algorithm is used to compress the message body. Currently, the following compression algorithms are supported:

  1. ID 0x00: no compression
  2. ID 0x01: Snappy compression

We suggest starting with Snappy because of its high throughput (~250MB/s without needing assembler), permissive license, and availability in a variety of different languages.

Finally, the compressed body is the SSZ-encoded message body after being compressed by the algorithm denoted by the compression ID.
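As a concrete but non-normative sketch, framing and unframing a body in Python with the python-snappy bindings might look like the following; the protocol path is omitted because libp2p's multistream layer negotiates it on the stream before this payload is written:

import snappy  # pip install python-snappy

NO_COMPRESSION = 0x00
SNAPPY_COMPRESSION = 0x01


def frame_body(ssz_body: bytes, compression_id: int = SNAPPY_COMPRESSION) -> bytes:
    # Prepend the one-byte compression ID to the (possibly compressed) body.
    if compression_id == SNAPPY_COMPRESSION:
        ssz_body = snappy.compress(ssz_body)
    return bytes([compression_id]) + ssz_body


def unframe_body(frame: bytes) -> bytes:
    compression_id, body = frame[0], frame[1:]
    if compression_id == SNAPPY_COMPRESSION:
        return snappy.decompress(body)
    if compression_id == NO_COMPRESSION:
        return body
    raise ValueError("unknown compression ID: 0x%02x" % compression_id)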

Relevant Specifications

Messages

The schema of message bodies is notated like this:

(
    field_name_1: type
    field_name_2: type
)

SSZ serialization is field-order dependent. Therefore, fields MUST be encoded and decoded according to the order described in this document. The encoded values of each field are concatenated to form the final encoded message body. Embedded structs are serialized as Containers unless otherwise noted.
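To make the ordering rule concrete, here is a hedged Python sketch of encoding a body consisting only of fixed-size fields (SSZ serializes unsigned integers little-endian at fixed width; variable-length fields, which carry additional length information in full SSZ, are out of scope here). The hello message defined under Handshake below serves as the example:

import struct


def encode_uint8(value: int) -> bytes:
    return struct.pack("<B", value)


def encode_uint64(value: int) -> bytes:
    # SSZ serializes unsigned integers little-endian at fixed width.
    return struct.pack("<Q", value)


def encode_hello(network_id: int, latest_finalized_root: bytes,
                 latest_finalized_epoch: int, best_root: bytes,
                 best_slot: int) -> bytes:
    # Fields are concatenated strictly in the order the schema lists them;
    # bytes32 fields are assumed to already be 32-byte values.
    return b"".join([
        encode_uint8(network_id),
        latest_finalized_root,
        encode_uint64(latest_finalized_epoch),
        best_root,
        encode_uint64(best_slot),
    ])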

All ETH 2.0 RPC messages prefix their protocol path with /eth/serenity.

Handshake

Hello

Protocol Path: /eth/serenity/hello/1.0.0

Body:

(
    network_id: uint8
    latest_finalized_root: bytes32
    latest_finalized_epoch: uint64
    best_root: bytes32
    best_slot: uint64
)

Clients exchange hello messages upon connection, forming a two-phase handshake. The first message the initiating client sends MUST be the hello message. In response, the receiving client MUST respond with its own hello message.

Clients SHOULD immediately disconnect from one another following the handshake above under the following conditions:

  1. If network_id belongs to a different chain, since the client definitionally cannot sync with this peer.
  2. If the distance between the two peers' latest_finalized_epoch values exceeds the weak-subjectivity period, since syncing with this peer would be unsafe.
  3. If the latest_finalized_root shared by the peer is not in the client's chain at the expected epoch. For example, if Peer 1 in the diagram below has (root, epoch) of (A, 5) and Peer 2 has (B, 3), Peer 1 would disconnect because it knows that B is not the root in its chain at epoch 3:
              Root A

              +---+
              |xxx|  +----+ Epoch 5
              +-+-+
                ^
                |
              +-+-+
              |   |  +----+ Epoch 4
              +-+-+
Root B          ^
                |
+---+         +-+-+
|xxx+<---+--->+   |  +----+ Epoch 3
+---+    |    +---+
         |
       +-+-+
       |   |  +-----------+ Epoch 2
       +-+-+
         ^
         |
       +-+-+
       |   |  +-----------+ Epoch 1
       +---+

Once the handshake completes, the client with the higher latest_finalized_epoch or best_slot (if the clients have equal latest_finalized_epochs) SHOULD send beacon block roots to its counterparty via beacon_block_roots.
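A non-normative Python sketch of these handshake rules, using the hello fields above; local_root_at_epoch is a hypothetical callback into the local canonical chain, and the weak-subjectivity bound is passed in as an epoch count since its exact definition is still pending:

from dataclasses import dataclass


@dataclass
class Hello:
    network_id: int
    latest_finalized_root: bytes
    latest_finalized_epoch: int
    best_root: bytes
    best_slot: int


def should_disconnect(local: Hello, remote: Hello,
                      weak_subjectivity_epochs: int,
                      local_root_at_epoch) -> bool:
    if remote.network_id != local.network_id:
        return True  # definitionally cannot sync across chains
    if abs(remote.latest_finalized_epoch
           - local.latest_finalized_epoch) > weak_subjectivity_epochs:
        return True  # syncing would be unsafe
    if remote.latest_finalized_epoch <= local.latest_finalized_epoch:
        # local_root_at_epoch (hypothetical) returns the local canonical
        # root at that epoch, or None if unknown.
        known = local_root_at_epoch(remote.latest_finalized_epoch)
        if known is not None and known != remote.latest_finalized_root:
            return True  # the peer's finalized root is not in our chain
    return False


def local_sends_roots(local: Hello, remote: Hello) -> bool:
    # The client with the higher finalized epoch (best_slot as tie-breaker)
    # offers beacon block roots to its counterparty.
    return ((local.latest_finalized_epoch, local.best_slot)
            > (remote.latest_finalized_epoch, remote.best_slot))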

RPC

These protocols represent RPC-like request/response interactions between two clients. Clients send serialized request objects to streams at the protocol paths described below, and wait for a response. If no response is received within a reasonable amount of time, clients MAY disconnect.

Beacon Block Roots

Protocol Path: /eth/serenity/rpc/beacon_block_roots/1.0.0

Body:

# BlockRootSlot
(
    block_root: HashTreeRoot
    slot: uint64
)

(
    roots: []BlockRootSlot
)

Send a list of block roots and slots to the peer.

Beacon Block Headers

Protocol Path: /eth/serenity/rpc/beacon_block_headers/1.0.0

Request Body

(
    start_root: HashTreeRoot
    start_slot: uint64
    max_headers: uint64
    skip_slots: uint64
)

Response Body:

# Eth1Data
(
    deposit_root: bytes32
    block_hash: bytes32
)

# BlockHeader
(
    slot: uint64
    parent_root: bytes32
    state_root: bytes32
    randao_reveal: bytes96
    eth1_data: Eth1Data
    body_root: HashTreeRoot
    signature: bytes96
)

(
    headers: []BlockHeader
)

Requests beacon block headers from the peer starting from (start_root, start_slot). The response MUST contain fewer than max_headers headers. skip_slots defines the maximum number of slots to skip between blocks. For example, requesting blocks starting at slot 2 with a skip_slots value of 2 would return the blocks at slots [2, 4, 6, 8, 10]. In cases where no block exists at a given slot, the closest previous block MUST be returned. For example, if slot 4 were empty in the previous example, the returned array would contain the blocks at slots [2, 3, 6, 8, 10]. If slot 3 were also empty, the array would contain the blocks at slots [2, 6, 8, 10] - i.e., duplicate blocks MUST be collapsed.
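A hedged responder-side sketch of this selection logic in Python; blocks_by_slot is an assumed mapping from slot to header for the canonical segment rooted at (start_root, start_slot), and the exact fencepost on the max_headers bound is glossed over:

def select_headers(blocks_by_slot: dict, start_slot: int,
                   max_headers: int, skip_slots: int) -> list:
    stride = max(skip_slots, 1)
    headers, seen_slots = [], []
    target = start_slot
    while len(headers) < max_headers:
        # Fall back to the closest previous block when the target slot
        # has no block.
        slot = target
        while slot >= start_slot and slot not in blocks_by_slot:
            slot -= 1
        if slot >= start_slot and slot not in seen_slots:
            seen_slots.append(slot)  # duplicates MUST be collapsed
            headers.append(blocks_by_slot[slot])
        target += stride
        if target > max(blocks_by_slot, default=start_slot):
            break  # no known blocks past this point
    return headers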

The skip_slots parameter helps facilitate light client sync - for example, in #459 - and allows clients to balance the peers from whom they request headers. Clients could, for instance, request every 10th block from a set of peers where each peer has a different starting block in order to populate block data.

Beacon Block Bodies

Protocol Path: /eth/serenity/rpc/beacon_block_bodies/1.0.0

Request Body:

(
    block_roots: []HashTreeRoot
)

Requests the block_bodies associated with the provided block_roots from the peer. Responses MUST return block_roots in the order provided in the request. If the receiver does not have a particular block_root, it MUST return a zero-value block_body (i.e., a zero-filled bytes32).
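A minimal sketch of the responder side, assuming store is a dict-like mapping from block root to serialized body:

ZERO_BODY = b"\x00" * 32  # zero-value placeholder for an unknown root


def respond_block_bodies(store: dict, block_roots: list) -> list:
    # Bodies MUST come back in the order the roots were requested.
    return [store.get(root, ZERO_BODY) for root in block_roots]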

Response Body:

For type definitions of the below objects, see the 0-beacon-chain specification.

# BlockBody
(
    proposer_slashings: []ProposerSlashing
    attester_slashings: []AttesterSlashing
    attestations: []Attestation
    deposits: []Deposit
    voluntary_exits: []VoluntaryExit
    transfers: []Transfer
)

(
    block_bodies: []BlockBody
)

Beacon Chain State

Note: This section is preliminary, pending the definition of the data structures to be transferred over the wire during fast sync operations.

Protocol Path: /eth/serenity/rpc/beacon_chain_state/1.0.0

Request Body:

(
    hashes: []HashTreeRoot
)

Requests contain the hashes of Merkle tree nodes that, when merkleized, yield the block's state_root.

Response Body: TBD

The response will contain the values that, when hashed, yield the hashes inside the request body.
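Although the response format is TBD, the recursive download this implies can be sketched as follows, assuming a binary Merkle tree in which an interior node's value is the 64-byte concatenation of its two children's hashes; request_nodes is a hypothetical RPC callback mapping each requested hash to its preimage:

import hashlib


def fetch_state_tree(request_nodes, state_root: bytes) -> dict:
    tree = {}
    frontier = [state_root]
    while frontier:
        responses = request_nodes(frontier)
        frontier = []
        for node_hash, value in responses.items():
            # Verify each value against the hash it was requested under.
            assert hashlib.sha256(value).digest() == node_hash
            tree[node_hash] = value
            if len(value) == 64:  # interior node: descend into both children
                for child in (value[:32], value[32:]):
                    if child not in tree:
                        frontier.append(child)
    return tree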

Broadcast

These protocols represent 'topics' that clients can subscribe to via GossipSub.

Beacon Blocks

The response bodies of each topic below map to the response bodies of the Beacon RPC methods above. Note that since broadcasts have no concept of a request, any limitations to the RPC response bodies do not apply to broadcast messages.

Topics:

  • beacon/block_roots
  • beacon/block_headers
  • beacon/block_bodies

Voluntary Exits

Topic: beacon/exits

Body:

See the 0-beacon-chain spec for the definition of the VoluntaryExit type.

(
    exit: VoluntaryExit
)

Transfers

Topic: beacon/transfer

Body:

See the 0-beacon-chain spec for the definition of the Transfer type.

(
    transfer: Transfer
)

Clients MUST ignore transfer messages if transfer.slot < current_slot - GRACE_PERIOD, where GRACE_PERIOD is an integer that represents the number of slots that a remote peer is allowed to drift from current_slot in order to take potential network time differences into account.
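In sketch form (the concrete GRACE_PERIOD value below is illustrative, not fixed by this document):

GRACE_PERIOD = 4  # illustrative value only


def accept_transfer(transfer_slot: int, current_slot: int) -> bool:
    # Ignore transfers that lag current_slot by more than GRACE_PERIOD,
    # allowing for clock drift between peers.
    return transfer_slot >= current_slot - GRACE_PERIOD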

Shard Attestations

Topics: shard-{number}, where number is an integer in [0, SHARD_SUBNET_COUNT), and beacon/attestations.

The Attestation object below includes fully serialized AttestationData in its data field. See the 0-beacon-chain spec for the definition of the Attestation type.

Body:

(
    attestations: []Attestation
)

Only aggregate attestations are broadcast to the beacon/attestations topic.

Clients SHOULD NOT send attestations for shards that the recipient is not interested in. Clients receiving uninteresting attestations MAY disconnect from senders.

Relevant Specifications

Client Synchronization

When a client joins the network, or has otherwise fallen behind the latest_finalized_root or latest_finalized_epoch, the client MUST perform a sync in order to catch up with the head of the chain. This specification defines two sync methods:

  1. Standard: Used when clients already have state at latest_finalized_root or latest_finalized_epoch. In a standard sync, clients process per-block state transitions until they reach the head of the chain.
  2. Fast: Used when clients do not have state at latest_finalized_root or latest_finalized_epoch. In a fast sync, clients use RPC methods to download nodes in the state tree for a given state_root via the /eth/serenity/rpc/beacon_chain_state/1.0.0 endpoint. The basic algorithm is as follows:
    1. Peer 1 and Peer 2 connect. Peer 1 has (C, 1) and Peer 2 has (A, 5). Peer 1 validates that this new head is within the weak subjectivity period.
    2. If the head is within the weak subjectivity period, Peer 1 checks the validity of the new chain by verifying that all children point to valid parent roots.
    3. Peer 1 then takes the state root of (A, 5) and sends /eth/serenity/rpc/beacon_chain_state/1.0.0 requests recursively to its peers in order to build its SSZ BeaconState.

Note that nodes MUST perform a fast sync if they do not have state at their starting finalized root. For example, if Peer 1 in the example above did not have the state at (C, 1), Peer 1 would have to perform a fast sync because it would have no base state to compute transitions from.
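The choice between the two methods reduces to a local check, sketched here with a hypothetical have_state_at predicate over the local database:

def choose_sync_method(have_state_at, latest_finalized_root: bytes) -> str:
    if have_state_at(latest_finalized_root):
        # Standard sync: replay per-block state transitions to the head.
        return "standard"
    # Fast sync: no base state to compute transitions from, so download
    # the state tree at the finalized root first.
    return "fast"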

Open Questions

Encryption

This specification does not currently define an encrypted transport mechanism because the set of libp2p-native encryption libraries is limited. libp2p currently supports an encryption scheme called SecIO, which is a variant of TLSv1.3 that uses a peer's public key for authentication rather than a certificate authority. While SecIO is available for public use, it has not been audited and is going to be deprecated when TLSv1.3 ships.

Another potential solution would be to support an encryption scheme such as Noise. The Lightning Network team has successfully deployed Noise in production to secure inter-node communications.

Granularity of Topics

This specification defines granular GossipSub topics - i.e., beacon/block_headers vs. simply beacon. The goal of using granular topics is to simplify client development by defining a single payload type for each topic. For example, beacon/block_headers will only ever contain block headers, so clients know the content type without needing to read the message body itself. This may have drawbacks. For example, having too many topics may hinder peer discovery speed. If this is the case, this specification will be updated to use less granular topics.

Block Structure Changes

The structure of blocks may change due to #649. Changes that affect this specification will be incorporated here once the PR is merged.

@Mikerah
Contributor

Mikerah commented Feb 26, 2019

Great writeup!

A few notes:

There have been discussions on whether to use RLP for discovery and SSZ for messages.

There has also been some discussion on using QUIC instead of UDP for discovery and TCP for all other peer communication. By using QUIC, we no longer need a 3-way handshake as in TCP and we get built-in encryption. The main issue is that it hasn't been implemented in all programming languages.

@vbuterin
Contributor

vbuterin commented Feb 27, 2019

If the distance between the two peers' latest_finalized_epoch values exceeds the weak-subjectivity period, since syncing with this peer would be unsafe.

What do we mean by this? Note that especially in cases where the validator set is very small, it is absolutely possible for the latest finalized block to be more than a weak subjectivity period behind the current block.

  1. If the latest_finalized_root shared by the peer is not in the client's chain at the expected epoch. For example, if Peer 1 in the diagram below has (root, epoch) of (A, 5) and Peer 2 has (B, 3), Peer 1 would disconnect because it knows that B is not the root in its chain at epoch 3:

This feels backward. It's not the peer's latest finalized root that should be checked against the client head, it's the peer's head that should be checked against the client's latest finalized root. As an example of how the current approach is insufficient, consider the case where the client head is H with a latest finalized root R with parent P, and a peer says hello with a chain going through some different child R' of P that doesn't finalize any blocks. The peer's latest finalized root would be some ancestor of P; so this is clearly invalid from the point of view of the client, but it would still be accepted.

Only aggregate attestations are broadcast to the beacon/attestations topic.

Is there a need for this particular wire protocol spec to specify where unaggregated attestations get sent?

There have been discussions on whether to use RLP for discovery and SSZ for messages.

As the inventor of RLP, I'm inclined to prefer SSZ (or SOS if we end up replacing SSZ with SOS or something similar) 😆

The "simplicity over efficiency" desideratum would imply that we should ideally only use one algorithm per task (in this case the task being "serialization"), even if they have slightly different tradeoffs. I see the overhead of a few extra zero bytes in SSZ being tiny relative to the absolute size of the objects that would be transferred, so the gains wouldn't be large.

@djrtwo
Contributor

djrtwo commented Feb 27, 2019

What do we mean by this? Note that especially in cases where the validator set is very small, it is absolutely possible for the latest finalized block to be more than a weak subjectivity period behind the current block.

The point here is to make clear that you may think you are safe to sync but realize on the wire that you don't have recent enough information to safely sync. I suppose this information can (and maybe should) be known before you even contact peers. An epoch # is an absolute time so you can know the expected head of chain epoch before you sync and know you are in the danger zone without knocking on a peer's door. In this case, maybe this is not a wire protocol note, but a client implementer safety note that goes in another doc.

This feels backward

The peer with the lower finalized epoch (peer-1) cannot immediately tell just by shaking hands with the peer with the higher finalized epoch (peer-2) whether peer-1's finalized epoch is in peer-2's chain or not. Peer-2, on the other hand, can easily check whether peer-1's finalized epoch is in its chain. If not, there's no use in sending blocks to them, because to sync peer-2's blocks, peer-1 would have to revert.

I'm not sure if what we currently have in (3) is backward, but it does not sufficiently cover all the cases in which two chains might be out of sync wrt finality.

Is there a need for this particular wire protocol spec to specify where unaggregated attestations get sent?

Not 100% sure. The expectation is that single signer attestations are sent to the shard subnet. Only rarely would a single signer attestation be broadcast to the beacon net as a "best effort" aggregation.

The "simplicity over efficiency" desideratum would imply that we should ideally only use one algorithm per task

+1

@vbuterin
Contributor

Not 100% sure. The expectation is that single signer attestations are sent to the shard subnet. Only rarely would a single signer attestation be broadcast to the beacon net as a "best effort" aggregation.

Agree! So do we need a "shard subnet wire protocol" section?

@CarlBeek
Contributor

CarlBeek commented Feb 27, 2019

Re-raising this point, but is there any particular reason why beacon_block_roots is built in as a part of hello instead of being exposed as its own RPC request? It improves the modularity of hello and passes the onus of getting up to date to the peer that is behind. Under the above proposal, a peer that is behind and conducts several hello exchanges with its peers (perhaps due to being offline temporarily) would receive beacon block roots several times. This obviously represents several redundant messages. Furthermore, in the above protocol, the only means of obtaining beacon block roots is via new hello exchanges.

Also, if we are going to make topics plural, please can we make beacon/transfer plural too.

@jannikluhn
Contributor

jannikluhn commented Feb 27, 2019

Nice writeup! This is quite a big document, so maybe opening a PR would allow for easier review (not sure what the best place for this would be though). I'll try giving some feedback anyway:

SSZ/SOS vs RLP debate

I'm leaning towards RLP for message encoding. Not because of efficiency, but because with SSZ we at all times need to know the exact message type that we're expecting. RLP is more flexible in that regard which would make it easier to support versioning, multiple possible response message types, and using the same gossip topic for multiple object types. Also, if we go with RLP at the discovery level anyway, then using it at the "content" level would be more consistent (but this is a different debate I guess).

Message format

I think whenever we send objects that are defined in the spec, we should put them in the message either as a (list of) SSZ blob(s) (if we go with non-SSZ as message format) or as a container field (for SSZ messages). Right now, it seems like we're redefining some object types (e.g. block header and block body).

Handshake

  • network id: It seems like we don't actually use this to distinguish between networks, but rather between chains. Replacing it with something like "hard fork choices" would make this more explicit: It would contain slot or epoch and block hash pairs for relevant blocks (genesis and all hard forks, or at least the latest one)

  • Shall we add the latest justified head as well? If I'm not making a mistake, the last finalized epoch will be 1.5 epochs or ~10 minutes old on average, even in optimal conditions, and adding the justified epoch should be quite cheap.

  • Shall we rename best to head to follow the terminology of the spec and be more neutral?

Disconnect

Maybe we should have an explicit "Disconnect" message. This would enable peers to do some clean ups and also stop responding to any potential outstanding requests.

Request/Responses

We should add a request_id to each request/response message. This way handling replies gets much easier and also allows for parallel requests.

Beacon Block Roots

I think the corresponding request is missing, no? We can probably just use the same format as we do for requesting beacon block headers.

Beacon Block Headers

The request contains both slot and root. Why? The slot seems to be redundant so I think removing it is better for simplicity.

Scope

I think we should try to keep this as simple as possible for now and not specify any fast or light sync etc.

Protocol paths

I think we should add beacon/shard to the protocol path, i.e.

/eth/serenity/rpc/beacon_block_roots/1.0.0 --> /eth/serenity/beacon/rpc/beacon_block_roots/1.0.0

Agree! So do we need a "shard subnet wire protocol" section?

👍

@djrtwo
Contributor

djrtwo commented Feb 27, 2019

Agree! So do we need a "shard subnet wire protocol" section?

The "Shard Attestations" section includes both shard and beacon topics as valid topics to broadcast on. It just notes that only aggregated attestations are expected to be passed to the beacon topic. Is this what you mean? We can pull these things out so they are more clearly distinct.

@djrtwo
Contributor

djrtwo commented Feb 27, 2019

Re-raising this point, but is there any particular reason why beacon_block_roots is built in as a part of hello instead of being exposed as its own RPC request?

This is similar to how block hashes are exchanged on the ethereum 1.0 wire protocol. The expectation is that you are only gathering information to "catch up" when initially finding peers and that beyond that you generally stay synced by listening to broadcasts on the wire. If you get a recent block and don't have its parent, you can ask for its parent via the root embedded in the block and walk the chain back until you find a common ancestor. If you've fallen way out of sync, you can reconnect to peers to get a list of roots.

I see the potential utility of what you describe but am not sure that requesting the list after the initial hello is generally necessary. I'd like to look into how geth handles this and will ask a couple of 1.0 sync masters about it today.

Also, if we are going to make topics plural, please can we make beacon/transfer plural too.

👍

@djrtwo
Contributor

djrtwo commented Feb 27, 2019

Nice writeup! This is quite a big document, so maybe opening a PR would allow for easier review (not sure what the best place for this would be though). I'll try giving some feedback anyway:

Will open up a PR after these initial comments. Thanks!

Right now, it seems like we're redefining some object types (e.g. block header and block body).

The current format of BlockHeader here is just the BeaconBlock with the body served as a hash root. I hear you on making the formats just easily reference the SSZ objects for clarity. We have a couple of header/body changes coming in #649 (firm separation between header/body) and then will clean up accordingly.

network id: It seems like we don't actually use this to distinguish between networks, but rather between chains.

Interesting. In the 1.0 protocol we use it both for distinguishing between entirely different chains (mainnet vs ropsten vs ..) and for forks within a single chain, and contentious forks become different networks. I see there could be some use in coming up with a succinct format to describe forks independent of the base "network". Nick Johnson had an interesting proposal on twitter. Something like this might be worth pursuing.

If I'm not making a mistake, the last finalized epoch will be 1.5 epochs or ~10 minutes old on average, even in optimal conditions, and adding the justified epoch should be quite cheap.

The latest_finalized_epoch/root is to see if the two chains are irrevocably disjoint and to serve somewhat as a "check-pointing" mechanism. The latest justified is not set in stone. What use case do you see here? Also note that a newly syncing peer would probably just have a latest finalized epoch/root.

Shall we rename best to head to follow the terminology of the spec and be more neutral?

agreed

Maybe we should have an explicit "Disconnect" message. This would enable peers to do some clean ups and also stop responding to any potential outstanding requests.

Seems reasonable. Could also signal a "why" if you had a particular reason

I think we should try to keep this as simple as possible for now and don't specify any fast or light sync etc.

A method to sync state is necessary in a weakly subjective chain. Clients are expected to show up with some recent finalized root, and peers are not expected to serve blocks since genesis in perpetuity. The term is more aptly described here as "State Sync" rather than "fast", as its primary use is to provide a recent state to a client. Also note that the state size is bounded at ~600MB even in the worst case, so it should actually be fast.

gotta run. I'll get to your other few things in a bit, and will make some edits accordingly.

@FrankSzendzielarz
Member

After a review (I will study/contemplate this later in more depth) of this lovely piece of work, I just want to raise a couple of questions, which most likely arise out of my own lack of understanding:

  1. If nodes are discovered using ENRs, the information on the node is already available. This means that clients will be identified by ENR, and whether libp2p multiaddr is used is an implementation detail, not a protocol requirement (right?)
  2. Similarly, the compression type could be omitted from the message, as the default should be the 'highest' compression supported by both the sending node and the recipient.
  3. Encoding (SSZ) could similarly be optional/variable.
  4. Protocol path could be compressed out as the info is in the ENR. A byte/nibble should be sufficient. Why compress the message body and make the protocol path human readable?
  5. Eth-current uses Network ID and Chain ID.

On wire encoding is it fair to say that for participation in the existing network, RLP is necessary as a software component for the medium term future regardless?

@jannikluhn
Contributor

The latest_finalized_epoch/root is to see if the two chains are irrevocably disjoint and to serve somewhat as a "check-pointing" mechanism. The latest justified is not set in stone. What use case do you see here? Also note that a newly syncing peer would probably just have a latest finalized epoch/root.

I assumed both finalized epoch and head are just used to select the best peer to connect to and sync from. The justified epoch would just be an additional piece of information to do this a little better. Maybe in practice the head is already enough though, so this might not be necessary.

Interesting. In the 1.0 protocol we use it both for distinguishing between entirely different chains (mainnet vs ropsten vs ..) and for forks within a single chain, and contentious forks become different networks. I see there could be some use in coming up with a succinct format to describe forks independent of the base "network". Nick Johnson had an interesting proposal on twitter. Something like this might be worth pursuing.

Yes, in Eth1.0 there's kind of a one to one relationship between networks and forks. But in Eth2.0, we have at least the beacon network and multiple (shard) subnetworks for the same (beacon) chain, and if we count the various topics as one network each we get even more. So I think we should differentiate the terms a bit more cleanly.

Similarly, the compression type could be omitted from the message, as the default should be the 'highest' compression supported by both the sending node and the recipient.

I think/hope libp2p handles this for us already so nothing we should worry about.

Encoding (SSZ) could similarly be optional/variable.

I strongly think we should agree on some encoding scheme and require it. Otherwise different clients won't be able to talk to each other and the network will cluster by implementation.

Protocol path could be compressed out as the info is in the ENR. A byte/nibble should be sufficient. Why compress the message body and make the protocol path human readable?

I think that's also in libp2p's domain.

cc @raulk

@fjl

fjl commented Feb 28, 2019

Upon first startup, clients MUST generate an RSA key pair in order to identify the client on the network. The SHA-256 multihash of the public key is the client's Peer ID...

This shouldn't be in this specification because it is handled by lower layers of the stack.
Also, are we really considering to use RSA?

@arnetheduck
Contributor

From what I understand, beacon_block_headers goes forwards in time, starting at a low slot number and building towards higher slot numbers.

What often happens in distributed systems is that stuff gets lost along the way, for a variety of reasons. Thus, what would be useful is an ancestor request, such that when you receive a block or attestation whose parent is unknown, you can quickly recover and establish whether this is a good block or not.

What you need:

  • a finalized block serving as anchor - here you can go back to genesis or whatever block you already know is finalized - tail is a useful name, if the other end is called head.
  • a request to get ancestor blocks - strictly, one at a time is sufficient, but it quickly becomes obvious that it's nice to get a range as well.

When you receive a block, you need to check that you have a path back to a tail block you know about - if you don't, it's either a block that's not part of the chain (spam?), or you're missing links. To find out, you have to try downloading the missing block headers, using slot time to establish a horizon to tell one situation from the other.

This kind of request is very flexible, because it allows you to join the network, start listening to broadcast chatter and discover what the most popular heads are (assuming broadcast network is namespaced / separate for every chain - it's fairly low cost to add a network id to every broadcast, which will greatly help debugging, if nothing else)

The approach is nice for a few different reasons:

  • cheap to implement - all data is already in the blocks, so there's no need for clients to build special indices to support basic functionality
  • same flow for sync and repair - whether you lost a block because connectivity was cut or are joining with a fresh state, the same request is used to fill in the gaps
  • the method is sufficient for a correctly functioning network - on top, you can add certain well-placed optimizations to speed things up (like the hello message to discover heads)
  • you can join the network and start listening to broadcasts - no special requests necessary for sync

What you do as a client is to divide blocks into two sets: resolved and unresolved. Resolved blocks are those for which you have established a path to a finalized state (that you know through other means). Unresolved are all the others:

  • blocks from different chains, malicious blocks, etc. - will never establish a path
  • blocks where you're missing links - will establish a path

The request would look something like:

(
    head_root: HashTreeRoot
    ancestor_slots: uint64
)

Basically, a start-pointer+length kind of request, but going backwards in time (this is already an optimization over getting blocks one at a time then getting their parents recursively, which also works - this is incidentally the same as the git dumb remote sync protocol).

This request can return block headers or roots as is appropriate (headers seem natural, because you want to use the response to verify that one of the blocks resolves to a known chain state)
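As a rough Python sketch of the bookkeeping this implies (all names hypothetical):

def on_block(block, resolved: set, pending: dict, request_ancestors):
    # resolved: roots with a known path back to a finalized tail block.
    # pending: missing parent root -> blocks waiting on that parent.
    # request_ancestors(head_root, ancestor_slots): the request above.
    if block.parent_root in resolved:
        resolved.add(block.root)
        # Resolving a block may unblock children that were waiting on it.
        for child in pending.pop(block.root, []):
            on_block(child, resolved, pending, request_ancestors)
    else:
        pending.setdefault(block.parent_root, []).append(block)
        request_ancestors(block.root, ancestor_slots=16)  # illustrative range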

One problem with the approach is that when you receive an unknown block, you have to use network resources to find out if it's valid or not (the block might be signed by a validator that's not in your validator set, because the latest finalized block you know about does not yet contain it). This can be mitigated by:

  • using heuristics to score blocks and use resources accordingly (is it well-formed? is it signed by a known validator? does slot time make sense? etc)
  • make sure that answers are ordered from low slot number to high, so you can quickly discard bad information - potentially with pagination (ie you have to be prepared to handle a partial response)

The forward-direction request is slightly problematic in that it also needs to decide if it should branch out on all children or simply follow what the responding client considers to be the canonical chain at the time. Ancestor request does not have this issue.

skip_slots: uint64

what's the use case for skipping slots at uniform intervals?

@arnetheduck
Contributor

fwiw we just landed a simple implementation of the application layer requests in this protocol in Nimbus: status-im/nimbus-eth2#117 (the PR description is a bit off, it refers to an earlier attempt that we subsequently rewrote), and will be playing around with it for the initial sync.

A detail is that we're running it over devp2p right now for convenience, but should be integrating with a libp2p transport soon:ish.

One thing of note is that when you're a proposer, you want to include attestations from as many shards as possible, right? Effectively, this means that if attestation broadcasts are split up by shards, you end up having to listen to all of them anyway, under the naive attestation broadcast protocol where everyone just broadcasts their self-signed attestation.

Only aggregate attestations are broadcast to the beacon/attestations topic.

What's lacking for this is a responsible entity that does the aggregation and broadcasts it here - also, what's aggregate in this context? 2/3 votes? all? and who does the aggregation.. so many questions :) Pegasys' aggregation looks promising, but until we have that, in the implementation we're broadcasting everything everywhere - obviously, this will not do for 4 million validators!

@djrtwo
Contributor

djrtwo commented Mar 5, 2019

Effectively, this means that if attestation broadcasts are split up by shards, you end up having to listen to all of them anyway, under the naive attestation broadcast protocol where everyone just broadcasts their self-signed attestation.

The expectation is to have some subset of committee broadcast best-effort aggregates picked up from the shard subnet to the beacon (say last N of committee). These validators are already connected to the subnet for creating the attestation and are incentivized to have their attestations included in the beacon chain so are a natural pick. A proposer of a beacon chain block is not expected to sync to a bunch of different subnets to pick up and aggregate attestations. Adding this to honest validator guide shortly.

Pegasys' aggregation looks promising

Agreed. Glad to have it in our back pocket, but want to get some nodes on testnets before we decide to go with a more sophisticated setup. Note that even with a more structured aggregation scheme, we need some sort of expected behavior defining what gets passed to the beacon net.

what's the use case for skipping slots at uniform intervals?

To distribute the load of requests across peers, and to also be able to provide a "skip" subset to light clients.

Still digesting the rest of your post.

@arnetheduck
Contributor

These validators are already connected to the subnet for creating the attestation and are incentivized to have their attestations included in the beacon chain so are a natural pick.

but do they have an interest for others to be rewarded also? minimally no, right? since reward depends on total active validator balance, it's better for me if someone else loses balance? there's the social good argument of course that validators might remember who excluded them and reciprocate etc..

@djrtwo
Contributor

djrtwo commented Mar 6, 2019

but do they have an interest for others to be rewarded also? minimally no, right? since reward depends on total active validator balance, it's better for me if someone else loses balance? there's the social good argument of course that validators might remember who excluded them and reciprocate etc..

validators lose money if they aren't successfully crosslinking or finalizing. The best thing I can do as an individual is maximize participation of my committee and validators as a whole. If I happen to have some huge majority of validation, my optimal strategy might change to censorship.

@nisdas
Contributor

nisdas commented Mar 11, 2019

How would state sync work in the case of skipped slots? Let's say a node A joins the network, so it has no saved state locally and will have to perform a fast sync. Node B has the latest finalized root/epoch; A is at (RA, 1) and B is at (RB, 5).

What happens when the slots before finalization were skipped slots in epoch 5? Node A downloads the whole state from its peers and will have the same finalized state as Node B. However, now Node A has to catch up to the current head of the network and starts requesting blocks after the latest finalized epoch.

Let's say that, due to network issues, slots are skipped before finalization. Every block that Node A imports now will point to a parent that the node cannot validate, as the parent is not saved, and that block will end up being thrown away despite being a valid block. This would end up with sync permanently stalling, as it would never be able to reach the current head.

@jannikluhn
Contributor

@nisdas Skipped slots don't matter; all A needs is blocks, and they can still get them even if some slots in between are empty. A would just ask for all blocks from their last known slot to the current slot, and B would reply with all non-skipped ones. I guess this may become slightly more inefficient if A tries to request from multiple Bs (as they don't know which blocks have been skipped, so the load won't be distributed equally), but that seems not very significant.

@mslipper
Contributor Author

I've included updates to this spec here: #763. Closing this in favor of the PR.

@nisdas
Contributor

nisdas commented Mar 13, 2019

@jannikluhn The issue isn't requesting skipped slots. It's how we perform fast sync while requesting the beacon state initially. This is what I was thinking of:

 Block Proposed +-----------+ Slot 94 (Current Head)
         ^
         |
 Block Proposed +-----------+ Slot 66
         ^
         |
   Slot Skipped +-----------+ Slot 65
         ^
         |
   Slot Skipped +-----------+ Slot 64 (Epoch Transition & State is finalized)
         ^
         |
   Slot Skipped +-----------+ Slot 63
         ^
         |
   Slot Skipped +-----------+ Slot 62
         ^
         |
   Slot Skipped +-----------+ Slot 61
         ^
         |
 Block Proposed +-----------+ Slot 60

If a new node joins the network, it will start sync by requesting the finalized state at slot 64 (epoch 1). Then, in order to sync to the current head at slot 94, it will request blocks from (65, 94). So when the node receives the response containing the blocks in this range, it will process the block at slot 66 (since 65 was skipped). But when we run this block through the state transition function, it will be rejected, since we do not have the previous chain head, which is actually the block at slot 60. We would also need to save the parent block (slot 60) in state sync in order for sync to work.

@jannikluhn
Contributor

Ah, I think I get what you mean now. The state does contain the latest block header though, so this should deal with it, right?

@nisdas
Contributor

nisdas commented Mar 13, 2019

Ahh, yes, that would solve it. This is a bug on our end, since we generate the previous block root from our locally saved block instead of using the one kept in state. Thanks!
