Add pdd protocol and multistream example as .md files for easier review and upgrade collaboratively #12

Merged: 6 commits, Jun 27, 2015
139 changes: 139 additions & 0 deletions pdd/PDD-multistream.md
@@ -0,0 +1,139 @@
# PDD for multistream (example/use case)

## Protocol

The motivation and base of the protocol can be found here:

- https://github.com/ipfs/specs/blob/wire/protocol/network/wire.md#multistream---self-describing-protocol-stream
- https://github.com/ipfs/specs/blob/wire/protocol/network/wire.md#multistream-selector---self-describing-protocol-stream-selector

The multistream protocol does not cover:

- discovering participants, selecting transports, and establishing connections
- managing the state of the connection

multistream enables several types of streams to be used over a single stream, acting like an intelligent message broker that negotiates which protocol and version will be used. A simplified visual representation:

```
┌ ─ ─ ─ ─ ─ ─ ┌ ─ ─ ─ ─ ─ ─ ┐┌ ─ ┐
 dht-id/1.0.1│ bitswap/1.2.3  ...
└ ─ ─ ─ ─ ─ ─ └ ─ ─ ─ ─ ─ ─ ┘└ ─ ┘
┌────────────────────────────────┐
│       multistream-select       │
└────────────────────────────────┘
┌────────────────────────────────┐
│transport                       │
│      [TCP, UDP, uTP, ...]      │
└────────────────────────────────┘
```

multistream doesn't cover stream multiplexing over the same connection; however, we can achieve this by leveraging functionality offered by SPDY or HTTP/2.
Contributor commented:

So it should probably just be multistream (and not multistream-select in the figure above). Without a stream-multiplexing transport layer, using multistream-select will be difficult (impossible?).

Member commented:

> Without a stream-multiplexing transport layer, using multistream-select will be difficult (impossible?).

no. it will not. easy, as the examples suggest. the stream will just be locked into that protocol until the stream terminates.

Contributor commented:

On Sat, Jun 13, 2015 at 08:43:29PM -0700, Juan Batiz-Benet wrote:

> > Without a stream-multiplexing transport layer, using
> > multistream-select will be difficult (impossible?).
>
> no. it will not. easy, as the examples suggest stream will just be
> locked into that protocol until the stream terminates.

Ah, I was just thinking that multistream-select assumed it would
always be running for stream maintenance. But looking over [1,2], it
looks like ‘multistream’ is just “write a protocol-describing header
line before sending any other data down the stream”. So that's fine.
Then ‘multistream-select’ adds the ‘ls’ listing and ‘na\n’. The
handoff between the two is still a bit murky to me. If the
uniplex-transport conversation is:

← /multistream/1.0.0
← /bird/3.2.1
← hey, how is it going?

Switching to a /bird/ protocol is a multistream-select situation. So
it should probably be:

← /multistream-select/1.0.0
← /bird/3.2.1
← hey, how is it going?

And the listener handlers will be:

  1. Open with a multistream-select driver on the socket.
  2. Receive ‘/multistream-select/1.0.0’. Good, that's the language we
    speak.
  3. Receive ‘/bird/3.2.1’, that's a protocol we support. Pop ourselves
    off the socket (and into a protocol stack?), and attach the
    bird-3.2.1 driver.
  4. Replay the ‘/bird/3.2.1’ message into the bird driver?
  5. Receive ‘hey, how is it going?’ and process with the current bird
    driver.

The sticky bit is (4), which I don't think we actually need. But if
the ‘/bird/3.2.1’ is only sent on the multistream-select “stream”,
then the bird stream itself is just ‘hey, how is it going?’, which
isn't a multistream protocol. That's fine, it doesn't have to be a
multistream protocol, but if the only multistream protocol is
multistream-select, it seems a bit odd ;).

I'm also not sure how, in the uniplexed transport, a subprotocol is
supposed to open a new child protocol. For example, if the logical
stream spawning structure was like:

← /multistream-select/1.0.0
← /bird/3.2.1
← /parrot/1.0.0

If you have a parallel stream that's still in multistream-select, you
can just ask for a new stream with the child protocol there:

← /multistream-select/1.0.0
← /bird/3.2.1
… bird stream …
← /parrot/1.0.0
… new parrot stream …

That means the agent requesting the opening needs to reach back up to
the multistream-select stream, but that's ok. Over uniplexed
transport, it seems like you'd either have to build subprotocol
spawning into your spawning protocol:

← /multistream-select/1.0.0
← /bird/3.2.1
… bird stream …
← /parrot/1.0.0 # handled by bird driver
… new parrot stream …

Or close the parent:

← /multistream-select/1.0.0
← /bird/3.2.1
… bird stream …
← exit # or whatever the bird-driver needs to shut down
← /parrot/1.0.0 # handled by multistream-select driver
… new parrot stream …

And I'm a bit nervous about both of those choices. Maybe that's just
stretching multistream-select over uniplexed transport too far?


```
┌─── ───┐┌─── ───┐┌ ┌─── ───┐┌─── ───┐┌┐
 a/1.0.0  b/1.0.0  │ a/1.0.0  b/1.0.0 ││
└─── ───┘└─── ───┘└ └─── ───┘└─── ───┘ ┘
┌─── ──── ──── ──── ┌─── ──── ──── ──── ┌───
│multistream-select││multistream-select││...│
└ ──── ──── ──── ──┘└ ──── ──── ──── ──┘└ ──┘
┌─── ──── ──── ──── ──── ──── ──── ──── ──── ┌───
│stream multiplexing                        ││...│
│              [HTTP/2, SPDY]               ││   │
└─ ──── ──── ──── ──── ──── ──── ──── ──── ─┘└─ ─┘
┌────────────────────────────────────────────────┐
│               multistream-select               │
└────────────────────────────────────────────────┘
┌────────────────────────────────────────────────┐
│transport                                       │
│              [TCP, UDP, uTP, ...]              │
└────────────────────────────────────────────────┘
```

The multistream messages (such as 'ls', 'na', 'protocol/version') have the following format:

```
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
 ┌───────────────┐┌───────┐│
││varint with    ││message│
 │ message length││       ││
│└───────────────┘└───────┘
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘
```
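The framing in the figure above can be sketched in a few lines. This is a minimal sketch under two assumptions that the text does not state: the length prefix is an unsigned LEB128 varint (the encoding Protocol Buffers uses), and each message ends with a `\n` that is counted in the length. The function names `encodeVarint`, `decodeVarint`, `frame`, and `unframe` are hypothetical and not taken from any multistream implementation.

```javascript
// Sketch only. Assumptions: unsigned LEB128 varint, trailing "\n" counted
// in the length. All function names here are hypothetical.

// Encode a non-negative integer as an unsigned varint.
function encodeVarint(n) {
  const bytes = [];
  do {
    let byte = n % 128;          // low 7 bits
    n = Math.floor(n / 128);
    if (n > 0) byte |= 0x80;     // continuation bit: more bytes follow
    bytes.push(byte);
  } while (n > 0);
  return Buffer.from(bytes);
}

// Decode a varint starting at `offset`; returns the value and bytes consumed.
function decodeVarint(buf, offset = 0) {
  let value = 0;
  let multiplier = 1;
  let pos = offset;
  while (pos < buf.length) {
    const byte = buf[pos];
    pos += 1;
    value += (byte & 0x7f) * multiplier;
    if ((byte & 0x80) === 0) return { value, bytesRead: pos - offset };
    multiplier *= 128;
  }
  throw new Error("truncated varint");
}

// Prefix a message with its varint-encoded length.
function frame(message) {
  const body = Buffer.from(message + "\n", "utf8");
  return Buffer.concat([encodeVarint(body.length), body]);
}

// Split a buffer of concatenated frames back into message strings.
function unframe(buf) {
  const messages = [];
  let pos = 0;
  while (pos < buf.length) {
    const { value: length, bytesRead } = decodeVarint(buf, pos);
    pos += bytesRead;
    if (pos + length > buf.length) throw new Error("truncated message");
    const body = buf.subarray(pos, pos + length).toString("utf8");
    messages.push(body.replace(/\n$/, ""));
    pos += length;
  }
  return messages;
}

console.log(unframe(Buffer.concat([frame("ls"), frame("na")])));
```

Once a protocol has been negotiated, that protocol controls the stream, so a helper like `frame` would only apply to the multistream messages themselves.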
Member commented:

one comment: we don't need this frame for every message. the multistream idea is just to make "multistream protocols" that all begin with the recognizable header, but after that could pipe your own thing. the reason for it is that sometimes protocols already have their own framing layer so dont want to waste too much on that.

(maybe you already mean this -- wasnt getting it from the above)

Contributor commented:

On Fri, Jun 12, 2015 at 04:45:29PM -0700, Juan Batiz-Benet wrote:

> one comment: we don't need this frame for every message. the
> multistream idea is just to make "multistream protocols" that all
> begin with the recognizable header…

That's enough for the test-suite's multistream-select and multistream
drivers (that get messages from the transcript and encode them
correctly for those protocols).

> … but after that could pipe your own thing.

So the test suite will also need protocol drivers for any protocols
it's testing. Each per-protocol driver is responsible for testing both
the syntactic and semantic correctness of the testee's implementation
of that protocol, and the interface for a per-protocol driver has to
accept messages that the test suite is reading from the transcript
(e.g. ‘hey, how is it going?’) and return messages that it receives
from the testee.

Member Author commented:

> one comment: we don't need this frame for every message.
> ... (maybe you already mean this -- wasnt getting it from the above)

Yes, I hope now it's more clear, once a protocol is handshake'd, it is up to that protocol to control the stream and encode however they want.


However, for readability reasons, we will omit the `varint` part in the Compliance Spec.


Other reference material and discussion:
- https://github.com/ipfs/node-ipfs/issues/13#issuecomment-109802818

## Protocol Compliance Tests Spec

Given the protocol spec, an implementation of the multistream-select protocol has to comply with the following scenarios:

#### 1 - push-stream

In a push-stream example (one-way stream), we have two agents:

- 'broadcast' - where messages are emitted
- 'silent' - listener to the messages emitted by the broadcast counterparty

Compliance test 1 (human readable format, without varint):
```
# With a connection established between silent - broadcast
< /multistream/1.0.0 # multistream header
< /bird/3.2.1 # Protocol header
< hey, how is it going? # First protocol message
```
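A test harness needs to turn a human-readable transcript like the one above into structured data before it can replay it. The following is a minimal sketch, assuming that `<` and `>` mark direction, that a full-line `#` is a comment, and that a trailing ` # ...` is an annotation rather than part of the message. `parseTranscript` is a hypothetical helper, not part of any existing tool.

```javascript
// Sketch only: parseTranscript is a hypothetical helper.
// Assumption: " # ..." at the end of a line is an annotation, so a message
// that itself contains " #" would be cut short by this parser.
function parseTranscript(text) {
  const entries = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // blank or comment line
    const direction = line[0];
    if (direction !== "<" && direction !== ">") {
      throw new Error(`unexpected transcript line: ${line}`);
    }
    // Drop the direction marker, then any trailing annotation.
    const message = line.slice(1).replace(/\s+#.*$/, "").trim();
    entries.push({ direction, message });
  }
  return entries;
}

const transcript = [
  "# With a connection established between silent - broadcast",
  "< /multistream/1.0.0 # multistream header",
  "< /bird/3.2.1 # Protocol header",
  "< hey, how is it going? # First protocol message",
].join("\n");

console.log(parseTranscript(transcript));
```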
Member commented:

this is super nice.

Contributor commented:

On Fri, Jun 12, 2015 at 04:45:57PM -0700, Juan Batiz-Benet wrote:

> +# With a connection established between silent - broadcast
> +< /multistream/1.0.0 # multistream header
> +< /bird/3.2.1 # Protocol header
> +< hey, how is it going? # First protocol message

This is the multistream-select view (because of the nested streams),
but for conversations like this where the streams are strictly nested
(e.g., no going back to an ancestor stream after saying something in a
child stream), you could also run this test against a non-multiplexed
(uniplexed?) multistream implementation. Elaborating on the
test-harness changes for the multiplexed transport:

  1. Spin up connection to implementation, attaching the
    multistream-select driver. We probably want a
    ‘/multistream-select/{version}’ handshake (or optimistic
    broadcast?) to make sure we're speaking the same multistream-select
    language (or at least can figure out what went wrong in a
    broadcast-to-silent situation).
  2. Send the ‘/multistream/1.0.0’ message through the
    multistream-select driver.
  3. Get a new multiplexed stream (I'm not sure which agent is
    responsible for creating the new stream) and attach the
    multistream-1.0.0 driver.
  4. Send the ‘/bird/3.2.1’ message through the multistream driver.
  5. Get a new multiplexed stream and attach the bird-3.2.1 driver.
  6. Send the ‘hey, how is it going?’ message through the bird driver.
  7. Teardown streams?

While for the uniplexed transport, we'd have:

  1. Spin up connection to implementation, attaching the
    multistream-select driver. We probably want a
    ‘/multistream-select/{version}’ handshake (or optimistic
    broadcast?) to make sure we're speaking the same multistream-select
    language (or at least can figure out what went wrong in a
    broadcast-to-silent situation).
  2. Send the ‘/multistream/1.0.0’ message through the
    multistream-select driver.
  3. Disconnect the multistream-select driver and attach the
    multistream-1.0.0 driver to the stream.
  4. Send the ‘/bird/3.2.1’ message through the multistream driver.
  5. Disconnect the multistream driver and attach the bird-3.2.1 driver
    to the stream.
  6. Send the ‘hey, how is it going?’ message through the bird driver.
  7. Teardown stream?

The test-suite's transport driver should have an IsMultiplexed() check
so it can decide which of these approaches to take (or so it can error
out if you ask it to run a multiplex-requiring transcript over a
uniplexed transport).


#### 2 - duplex-stream

In a duplex-stream example (interactive conversation), we have two agents:

- 'select' - waits for connections and hosts several protocols from which a client can pick
- 'interactive' - connects to a select agent and queries that agent for a specific protocol

Compliance test 2 (human readable format):
```
# With a connection established between interactive - select
< /multistream/1.0.0
> /multistream/1.0.0
> ls
< ["/dogs/0.1.0","/cats/1.2.11"]
> /mouse/1.1.0
< na
> /dogs/0.1.0
< /dogs/0.1.0
> hey
```
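The 'select' side of this transcript can be modelled as a small state machine. The sketch below assumes that `<` lines are sent by the select agent and `>` lines by the interactive agent, and that the `ls` reply is the JSON-style list shown in the human-readable transcript. `createSelectAgent` and its methods are hypothetical names.

```javascript
// Sketch only: createSelectAgent is hypothetical, and the "ls" reply format
// is copied from the human-readable transcript, not from a wire spec.
function createSelectAgent(protocols) {
  const HEADER = "/multistream/1.0.0";
  let selected = null; // protocol picked by the peer, once negotiated

  return {
    // First message the select agent sends on a new connection.
    greeting() {
      return HEADER;
    },
    // Protocol the stream is locked into, or null while negotiating.
    selected() {
      return selected;
    },
    // Returns the reply to send, or null when nothing should be sent.
    receive(message) {
      if (selected !== null) return null;   // stream now belongs to the protocol
      if (message === HEADER) return null;  // peer's header; ours was the greeting
      if (message === "ls") return JSON.stringify(protocols);
      if (protocols.includes(message)) {
        selected = message;
        return message;                     // echo confirms the selection
      }
      return "na";                          // protocol not available
    },
  };
}

const agent = createSelectAgent(["/dogs/0.1.0", "/cats/1.2.11"]);
console.log("<", agent.greeting());
for (const msg of ["/multistream/1.0.0", "ls", "/mouse/1.1.0", "/dogs/0.1.0", "hey"]) {
  console.log(">", msg);
  const reply = agent.receive(msg);
  if (reply !== null) console.log("<", reply);
}
```

After `/dogs/0.1.0` is confirmed, the sketch stops replying, since from that point the negotiated protocol controls the stream.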
Member commented:

ideally the test could try to fuzz all possible message orderings. e.g.:

# order 1
< /multistream/1.0.0
> /multistream/1.0.0
> ls
< ["/dogs/0.1.0","/cats/1.2.11"]
> /mouse/1.1.0
< na
> /dogs/0.1.0
< /dogs/0.1.0
> hey

# order 2
> /multistream/1.0.0
< /multistream/1.0.0
> ls
< ["/dogs/0.1.0","/cats/1.2.11"]
> /mouse/1.1.0
< na
> /dogs/0.1.0
< /dogs/0.1.0
> hey

# order 3
> /multistream/1.0.0
> ls
< /multistream/1.0.0
< ["/dogs/0.1.0","/cats/1.2.11"]
> /mouse/1.1.0
< na
> /dogs/0.1.0
< /dogs/0.1.0
> hey

... and so on down ...

# another (this one is weird from an "interactivity" perspective
# but --for correctness-- it should be tested to ensure things behave
# as expected).
> /multistream/1.0.0
> ls
> /mouse/1.1.0
> /dogs/0.1.0
> hey
< /multistream/1.0.0
< ["/dogs/0.1.0","/cats/1.2.11"]
< na
< /dogs/0.1.0

not sure if this is worth worrying about or not

Contributor commented:

On Thu, Jun 11, 2015 at 08:55:04PM -0700, Juan Batiz-Benet wrote:

> ideally the test could try to fuzz all possible message orderings…

How do you know what the possible message orderings are without a
protocol spec to tell you?


## Wire out

Since this protocol is not fully plaintext, we have to capture the messages/packets transmitted by one of the agents to make sure we get the transformations right. Development is therefore driven by what goes on the wire, which is defined by the Protocol (PDD ftw!).

With a first implementation written, we can capture the messages exchanged on the wire so that, later, we can require other implementations to conform. For the multistream scenario, tests are organized as follows:

```
tests
├── comp # Where compliance tests live
│   └── compliance-test.js
├── impl # Where specific implementation tests live, where we can get code coverage and all that good stuff
│   ├── interactive-test.js
│   └── one-way-test.js
└── spec # Spec tests are the tests where what is passed on the wire is captured, so it can be used in the compliance tests for all the implementations
    ├── capture.js
    ├── interactive-test.js
    ├── one-way-test.js
    └── pristine # The pristine folder where those captures live
        ├── broadcast.in
        ├── broadcast.out # A broadcast.out is the same as a silent.in, since there are only two agents in this exchange,
        ├── interactive.in # the reason both files exist is to avoid mind bending when it is time to use the "in/out", it could get confusing
        ├── interactive.out
        ├── select.in
        ├── select.out
        ├── silent.in
        └── silent.out
```
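The `.in`/`.out` naming in the tree above can be illustrated with a small in-memory recorder. This is a sketch only: `createCapture` and `connect` are hypothetical helpers, and nothing here reflects how the actual `capture.js` works.

```javascript
// Sketch only: createCapture and connect are hypothetical helpers.
// Each agent records what it receives (".in") and what it sends (".out").
function createCapture(agentName) {
  const received = [];
  const sent = [];
  return {
    recordIn(chunk) {
      received.push(Buffer.from(chunk));
    },
    recordOut(chunk) {
      sent.push(Buffer.from(chunk));
    },
    // Contents that would be written to the pristine folder.
    files() {
      return {
        [`${agentName}.in`]: Buffer.concat(received),
        [`${agentName}.out`]: Buffer.concat(sent),
      };
    },
  };
}

// Wire two captures together so every chunk is logged on both sides.
function connect(a, b) {
  return {
    aToB(chunk) {
      a.recordOut(chunk);
      b.recordIn(chunk);
    },
    bToA(chunk) {
      b.recordOut(chunk);
      a.recordIn(chunk);
    },
  };
}

const broadcast = createCapture("broadcast");
const silent = createCapture("silent");
const wire = connect(broadcast, silent);
wire.aToB("/multistream/1.0.0\n");
wire.aToB("/bird/3.2.1\n");

// With only two agents, one side's ".out" is the other side's ".in".
console.log(
  broadcast.files()["broadcast.out"].equals(silent.files()["silent.in"])
);
```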

## Protocol Compliance Test Suite

The protocol compliance test suite for multistream-select can be found in `tests/comp/compliance-test.js`. Each agent is tested alone with the input we prepared for it in the previous step. Once that agent has replied to all the messages, we compare (diff) the output it generated against its "pristine" counterpart, expecting to get 0 differences.
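The replay-and-compare step could look roughly like the sketch below. It is not the contents of `compliance-test.js`: `runCompliance` and `countDifferences` are hypothetical, the agent is modelled as a plain function from a message to a reply (or `null`), and the comparison is a positional byte count rather than a real diff algorithm.

```javascript
// Sketch only: both functions are hypothetical and simplified.

// Count byte positions that differ, plus any difference in length.
function countDifferences(actual, pristine) {
  const shared = Math.min(actual.length, pristine.length);
  let differences = Math.abs(actual.length - pristine.length);
  for (let i = 0; i < shared; i++) {
    if (actual[i] !== pristine[i]) differences += 1;
  }
  return differences;
}

// Feed the captured input to one agent and compare its collected output
// with the pristine capture. Returns 0 when the agent complies.
function runCompliance(agent, inputMessages, pristineOutput) {
  const replies = [];
  for (const message of inputMessages) {
    const reply = agent(message);
    if (reply !== null) replies.push(Buffer.from(reply));
  }
  return countDifferences(Buffer.concat(replies), pristineOutput);
}

// Toy agent that only knows how to refuse protocols.
const refuseAll = (message) => (message === "hey" ? null : "na\n");
console.log(runCompliance(refuseAll, ["/mouse/1.1.0", "hey"], Buffer.from("na\n")));
```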
82 changes: 82 additions & 0 deletions pdd/README.md
@@ -0,0 +1,82 @@
RFC {protocol hash} - Protocol Driven Development
=================================================

# Abstract

# Introduction

Cross compatibility across several implementations and runtimes is historically a hard goal to achieve. Each framework/language offers different testing suites and implements a different flavour of testing (BDD, TDD, RDD, etc). We need a better way to test compatibility across different implementations.

Instead of the common API tests, we can achieve cross implementation testing by leveraging interfaces offered through the network and defined by the Protocol. We call this Protocol Driven Development.

In order for a code artefact to be PDD compatible, it must:
- Expose a connection (duplex stream) interface, which may be synchronous (online, interactive) or asynchronous.
- Implement a well-defined Protocol spec.

## Objectives

The objectives for Protocol Driven Development are:
- Well defined process to test Protocol Spec implementations
- Standard definition of implementation requirements to comply with a certain protocol
- Automate cross implementation tests
- Have a general purpose proxy for packet/message capture

# Process

In order to achieve compliance, we have to follow four main steps:

1 - Define the desired Protocol Spec that is going to be implemented
2 - Design the compliance tests that prove that a certain implementation conforms with the spec
3 - Once an implementation is done, capture the messages traded on the wire using that implementation, so that the behaviour of both participants can be replicated without the agent
4 - Create the Protocol Compliance Tests (consisting of injecting the packets/messages generated in the last step into the other implementations and comparing outputs)

## Protocol Spec

Should define the goals, motivation, messages traded between participants and some use cases. It should not cover language or framework specifics.

## Protocol Compliance Tests Spec

Defines the necessary “use stories” in which the Protocol implementation must be tested to ensure it complies with the Protocol Spec. For example:

```
# Protocol that should always ACK messages of type A and not messages of type B
> A
  {< ACK}
> B
> B
> B
```

**Message Flow DSL:**
- Indentation to communicate a dependency (e.g. an ACK of A can only come after A is sent)
- [ ] for messages that might or might not appear (e.g. heartbeats should be passed on the wire from time to time; we know we should get some, but not how many or exactly when)
- { } for messages that will arrive, although we can't be sure whether they arrive before or after the messages described next

A test passes if the messages transmitted by an implementation follow the expected format and order defined by the message flow DSL. The test fails if the format or order is not respected, or if any extra message is transmitted that was not defined.
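One way to read these pass/fail rules is as a matcher over an observed message trace. The sketch below is an interpretation under stated assumptions, not a definition of the DSL: the flow is already parsed into entries with a hypothetical `kind` of `"required"`, `"optional"` (the `[ ]` form, treated here as zero or one occurrence) or `"floating"` (the `{ }` form, allowed at any point after it is declared). Indentation dependencies are only modelled through ordering, and `matchesFlow` is a hypothetical name.

```javascript
// Sketch only: matchesFlow and the entry "kind" values are hypothetical.
// expected: [{ msg, kind }], observed: [msg, ...] in wire order.
function matchesFlow(expected, observed) {
  const pending = []; // floating messages declared but not yet seen
  let i = 0;          // index of the next in-order expected entry

  for (const msg of observed) {
    let matched = false;
    while (!matched) {
      // Floating entries become pending as soon as we reach them.
      while (i < expected.length && expected[i].kind === "floating") {
        pending.push(expected[i].msg);
        i += 1;
      }
      const p = pending.indexOf(msg);
      if (p !== -1) {
        pending.splice(p, 1);
        matched = true;
      } else if (i >= expected.length) {
        return false; // extra message that was never defined
      } else if (expected[i].msg === msg) {
        i += 1;
        matched = true;
      } else if (expected[i].kind === "optional") {
        i += 1; // optional message did not appear; try the next entry
      } else {
        return false; // required message missing or out of order
      }
    }
  }

  // Anything left over must be optional; floating ones must have arrived.
  for (; i < expected.length; i++) {
    if (expected[i].kind !== "optional") return false;
  }
  return pending.length === 0;
}

const flow = [
  { msg: "> A", kind: "required" },
  { msg: "< ACK", kind: "floating" },
  { msg: "> B", kind: "required" },
  { msg: "> B", kind: "required" },
  { msg: "> B", kind: "required" },
];
console.log(matchesFlow(flow, ["> A", "> B", "< ACK", "> B", "> B"]));
```

The example trace delivers the ACK after the first B, which the `{ }` form permits; a trace with no ACK at all, or with an undeclared extra message, would be rejected.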

Tests should be deterministic, so that different implementations produce the same results:
```
┌─────────┐     ┌─────────┐    ┌───────────────┐
│input.txt│──┬─▶│go-impl  │───▶│ output.go.txt │
└─────────┘  │  └─────────┘    └───────────────┘
             │  ┌─────────┐    ┌───────────────┐
             └─▶│node-impl├───▶│output.node.txt│
                └─────────┘    └───────────────┘
```

A diff between the two outputs should therefore yield no differences:

```
$ diff output.go.txt output.node.txt
$
```

## Interchange Packet/Message Capture

Since most of these protocols leverage some type of encoded format for messages, we have to replicate the transformations applied to those messages before they are sent. The other option is capturing the messages sent by one of the implementations, which should suffice for the majority of scenarios.

## Protocol Compliance Tests Suite

These tests are the last step for testing different implementations independently. By sending the packets/messages, evaluating the responses, and comparing them across different implementations, we can infer whether they are in fact compatible.

#### [Example use case - go-multistream and node-multistream tests](/PDD-multistream.md)
Binary file added pdd/figs/multistream-1.monopic
Binary file not shown.
10 changes: 10 additions & 0 deletions pdd/figs/multistream-1.txt
@@ -0,0 +1,10 @@
┌ ─ ─ ─ ─ ─ ─ ┌ ─ ─ ─ ─ ─ ─ ┐┌ ─ ┐
 dht-id/1.0.1│ bitswap/1.2.3  ...
└ ─ ─ ─ ─ ─ ─ └ ─ ─ ─ ─ ─ ─ ┘└ ─ ┘
┌────────────────────────────────┐
│       multistream-select       │
└────────────────────────────────┘
┌────────────────────────────────┐
│transport                       │
│      [TCP, UDP, uTP, ...]      │
└────────────────────────────────┘
Binary file added pdd/figs/multistream-2.monopic
Binary file not shown.
17 changes: 17 additions & 0 deletions pdd/figs/multistream-2.txt
@@ -0,0 +1,17 @@
┌─── ───┐┌─── ───┐┌ ┌─── ───┐┌─── ───┐┌┐
 a/1.0.0  b/1.0.0  │ a/1.0.0  b/1.0.0 ││
└─── ───┘└─── ───┘└ └─── ───┘└─── ───┘ ┘
┌─── ──── ──── ──── ┌─── ──── ──── ──── ┌───
│multistream-select││multistream-select││...│
└ ──── ──── ──── ──┘└ ──── ──── ──── ──┘└ ──┘
┌─── ──── ──── ──── ──── ──── ──── ──── ──── ┌───
│stream multiplexing                        ││...│
│              [HTTP/2, SPDY]               ││   │
└─ ──── ──── ──── ──── ──── ──── ──── ──── ─┘└─ ─┘
┌────────────────────────────────────────────────┐
│               multistream-select               │
└────────────────────────────────────────────────┘
┌────────────────────────────────────────────────┐
│transport                                       │
│              [TCP, UDP, uTP, ...]              │
└────────────────────────────────────────────────┘
Binary file added pdd/figs/multistream-3.monopic
Binary file not shown.
6 changes: 6 additions & 0 deletions pdd/figs/multistream-3.txt
@@ -0,0 +1,6 @@
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
 ┌───────────────┐┌───────┐│
││varint with    ││message│
 │ message length││       ││
│└───────────────┘└───────┘
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘