Skip to content
This repository has been archived by the owner on Aug 23, 2019. It is now read-only.

observe traffic and expose statistics #243

Merged
merged 26 commits into from
Feb 27, 2018
Merged

observe traffic and expose statistics #243

merged 26 commits into from
Feb 27, 2018

Conversation

pgte
Copy link
Contributor

@pgte pgte commented Feb 16, 2018

Required because of libp2p/js-libp2p#157

@ghost ghost assigned pgte Feb 16, 2018
@ghost ghost added the in progress label Feb 16, 2018
@@ -82,19 +82,47 @@ describe('Stats', () => {
})
})

it('waits a bit', (done) => setTimeout(done, 1900))
it('waits a bit', (done) => setTimeout(done, 1000))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test case that just waits? Should fix the tests to not depend on the ordering...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@victorbjelkholm good call! Fixed by moving the preparation code to a before hook.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pgte better to get rid of the after/before hooks we have and have function calls in the test cases that returns what you need. If all tests where shaped like this, we can run tests in parallel and just run one isolated test case if wanted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my case I need something to run once, not for every test.
In this case, I want to setup some switches, exchange some protocol-specific data, wait for the stats to compute, and then I can infer all the statistics are there.
@victorbjelkholm Any recommendations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@victorbjelkholm changed to isolated tests for stats, would love your feedback.

const peerA = infos[0]
const peerB = infos[1]

peerA.multiaddrs.add('/ip4/127.0.0.1/tcp/9001')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is gonna conflict if we run two test cases at the same time, is it possible to set the port to 0 and then later get it (if you need it) instead? Think you're not actually using the port in the test, so should work fine to be set to 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@victorbjelkholm good call, fixed!

Copy link
Member

@victorb victorb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests LGTM 👍

@pgte pgte changed the title WIP: observe traffic observe traffic and expose statistics Feb 20, 2018
@pgte pgte requested review from daviddias and hacdias February 20, 2018 16:40
Copy link
Member

@hacdias hacdias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Awesome 😄

Copy link
Member

@daviddias daviddias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pgte really solid work here (and in record time!) 👏🏽Made some comments in what I think is a regression (protocolMuxing) and request for more tests.

README.md Outdated
@@ -30,6 +30,7 @@ libp2p-switch is used by [libp2p](https://github.com/libp2p/js-libp2p) but it ca
- [`switch.stop(callback)`](#swarmclosecallback)
- [`switch.connection`](#connection)
- [Internal Transports API](#transports)
- [`switch.stats`](#stats-api)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this above the Internal Transports line.

README.md Outdated
@@ -51,7 +52,7 @@ libp2p-switch is used by [libp2p](https://github.com/libp2p/js-libp2p) but it ca
```JavaScript
const switch = require('libp2p-switch')

const sw = new switch(peerInfo [, peerBook])
const sw = new switch(peerInfo [, peerBook [, options]])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it look like options only exists in the context of passing peerBook. What about changing this to just a single options object and assert on its properties?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid can do this, but will be a breaking API change..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I see that a peerBook is always required. The only optional argument should be options.

README.md Outdated
@@ -223,6 +224,106 @@ To avoid the confusion between connection, stream, transport, and other names th
- connection - something that implements the transversal expectations of a stream between two peers, including the benefits of using a stream plus having a way to do half duplex, full duplex
- transport - something that as a dial/listen interface and return objs that implement a connection interface

### Stats API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you see this all RED too?
image

Mind checking what went wrong here?

README.md Outdated
Should return a stats snapshot, which is an object containing the following keys and respective values:

* dataSent: amount of bytes sent, [Big](https://github.com/MikeMcl/big.js#readme) number
* dataReceived: amount of bytes received, [Big](https://github.com/MikeMcl/big.js#readme) number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/*/-/g for consistency

README.md Outdated

Returns an array containing the tags (string) for each observed transport.

##### `switch.stats.forTransports(transportTag).snapshot`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nickpick s/forTransports/transport/g (singular and no need the word for, I think)


// If identify is enabled
// 1. overload getPeerInfo
// 2. call getPeerInfo
// 3. add this conn to the pool
if (swarm.identify) {
// overload peerInfo to use Identify instead
const setPeerInfo = conn.setPeerInfo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there an overload of setPeerInfo? I'm afraid that this might cause issues in the future. Is it needed?

@@ -120,7 +123,7 @@ module.exports = function connection (swarm) {
swarm.handle(tag, (protocol, conn) => {
const myId = swarm._peerInfo.id
const secure = encrypt(myId, conn, undefined, () => {
protocolMuxer(swarm.protocols, secure)
swarm.protocolMuxer(tag)(secure)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is not correct here. Doing swarm.protocolMuxer(tag) is saying "After you get the conn encrypted, you can encrypt again". It should be for all the Protocols being handled and not just for one.

Note: I'm still reading through the PR and you might have changed something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the exact same behaviour as before, with the exception we're tracking the protocol tag for accounting purposes..

Object.keys(protocols).forEach((protocol) => {
if (!protocol) {
return
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still digesting the code but it seems that there is an assumption that all the switch.handle happen in the beginning and so you are locking the list of protocols we support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The protocols is an object, which can get mutated later as the user adds more protocols.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not get https://github.com/libp2p/js-libp2p-switch/issues/239 even harder to fix :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid again, this was the same behaviour as before: switch.protocols also contained a mutable object with the protocol handlers... I'm just storing the protocols on that closure, but it's still a JS object that will get mutated, same as before..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Makes sense now thanks for explaining :)

forTransport: (transport) => transportStats.get(transport),
protocols: () => Array.from(protocolStats.keys()),
forProtocol: (protocol) => protocolStats.get(protocol)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does one "clear" previous stats from a peer?

teardown(switches, done)
})
})
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need tests for:

  • Clearing out previous stats from a peer (Rep mechanisms will need a reset)
  • Check what happens when a peer disconnects and connects again
  • Check when a peer disconnects and connects again from a different transport

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid Currently, peer disconnections imply losing the stats for that peer.
If you think that it's important to keep them around, we should impose a time limit so that I eventually becomes garbage collected...
@diasdavid thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps tracking the sessions and having them in a LRU cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diasdavid good call, but I'm not going to put all of the sessions there, only once a peer disconnects..

@pgte
Copy link
Contributor Author

pgte commented Feb 22, 2018

I still have one bug to squash that I discovered this morning:
The base protocol (tcp, wbrtc, etc.) is not being accounted for because the peerInfo never resolves on those connections. I'm still trying to track this down..

@pgte
Copy link
Contributor Author

pgte commented Feb 22, 2018

last described issue should be fixed by libp2p/js-libp2p-secio#98

@pgte
Copy link
Contributor Author

pgte commented Feb 22, 2018

@diasdavid everything you asked for (apart from the protocolMuxing regression you described, which I think is not a regression, I argued above) should be done. :)

@hacdias
Copy link
Member

hacdias commented Feb 26, 2018

Hello @pgte! 😄 When is this PR expected to be finished?

@pgte
Copy link
Contributor Author

pgte commented Feb 26, 2018 via email

Copy link
Member

@daviddias daviddias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pgte this looks great!! :) Thanks for kicking ass!

@daviddias daviddias merged commit a4521dd into master Feb 27, 2018
@daviddias daviddias deleted the feat/observe branch February 27, 2018 12:22
@ghost ghost removed the in progress label Feb 27, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants