WIP: Connection Manager #4288
Conversation
supplementary: libp2p/go-libp2p-kad-dht#95
proposal: Merge this roughly as-is (without the connection tagging logic) so that we can start using it in 'dumb' mode ASAP. Then, PR the different connection tagging pieces individually.
Force-pushed from 33d901c to 1f3decc.
Note: this is disabled by default; you will need to set up the configuration as described in the config.md doc.
Looks mostly good here. I still need to dig more into the libp2p side of things.
core/core.go (outdated):
    @@ -814,6 +839,10 @@ func constructPeerHost(ctx context.Context, id peer.ID, ps pstore.Peerstore, bwr
        if !opts.DisableNatPortMap {
            hostOpts = append(hostOpts, p2pbhost.NATPortMap)
        }
        if opts.ConnectionManager != nil {
            fmt.Println("adding conn manager to host opts")
Looks like a debugging leftover.
core/core.go (outdated):
    switch cfg.Type {
    case "", "none":
        return nil, nil
    default:
This should probably be at the bottom of the switch.
    LowWater is the minimum number of connections to maintain.

    - `HighWater`
    HighWater is the number of connections that, when exceeded, will trigger a connection GC operation.
I don't really like these names. Wouldn't a simple `Low`/`High` be enough?
Hrm... I'd prefer something a little more descriptive than that. I've seen these terms used in this context before. But I'm open to suggestions.
How about a simple `Min` and `Max`, or maybe `MinConn` and `MaxConn`?
`Watermark` is something that is familiar to me in terms of setting upper and lower bounds on something in programming. No special feelings attached, though.
Note: I'd avoid min/low as they imply that we'll actively seek peers up to this point.
So... are we in favor of still using `HighWater` and `LowWater`? I'm getting mixed signals (especially from @Stebalien, who thumbs-upped David's comment but suggested changing things).
I agree that "min" would imply we actively open connections until we have that many, but I think "low" avoids this connotation.
`(GC)Threshold(Conn/Peers)`, `(GC)Target(Conn/Peers)`? `Water` may sound confusing to some even if it's technically correct.
Note: connections used in
Maybe we should actually have this enabled by default and limit people to 900-1000 connections or so. Even if it does cause potentially strange behaviour, it's probably better than endless "too many open file descriptors" errors...
Force-pushed from 1f3decc to a5367a3.
For reference, the connection closing logic is here: https://github.com/libp2p/go-libp2p-connmgr
To set the config, run:
    @@ -380,7 +404,7 @@ func setupDiscoveryOption(d config.Discovery) DiscoveryOption {
        if d.MDNS.Interval == 0 {
            d.MDNS.Interval = 5
        }
    -   return discovery.NewMdnsService(ctx, h, time.Duration(d.MDNS.Interval)*time.Second)
    +   return discovery.NewMdnsService(ctx, h, time.Duration(d.MDNS.Interval)*time.Second, discovery.ServiceTag)
Is this change related to this PR?
No, it just happened to change in a dependency that I needed to update here.
    hostopts := &ConstructPeerHostOpts{
        AddrsFactory:      addrsFactory,
        DisableNatPortMap: cfg.Swarm.DisableNatPortMap,
        DisableRelay:      cfg.Swarm.DisableRelay,
        EnableRelayHop:    cfg.Swarm.EnableRelayHop,
        ConnectionManager: connmgr,
Do we need to instantiate the ConnManager here? As a user of libp2p, I would assume that ConnManager is just a thing that exists and that all I need to do is pass an option to select the strategy, saving users from importing yet another package.
It might be nice to have a `libp2p-bundle` package that imports everything all together, but for now the tying together of all the pieces is done here in the core constructor. Importing things at the top level as much as possible makes dependency tree modifications much cheaper. If this were done in a lower package, then every time I make a change to the connection manager logic, I'd have to update that package (or packages) and then update those into here.
    Sets the type of connection manager to use, options are: `"none"` and `"basic"`.

    - `LowWater`
    LowWater is the minimum number of connections to maintain.
How is `lowWater` enforced? Does ConnManager run discovery mechanisms if we are poorly connected? (Didn't see this in the go-libp2p-connmgr code.)
No, it's not a "minimum number of connections"; it's just the point at which we stop closing connections. So if LowWater is 900 and HighWater is 1000, we will trigger a trim once we have over 1000 connections, and then close connections until we have 900 or fewer.
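To make those watermark semantics concrete, here is a minimal, self-contained sketch of the behaviour described in this comment. It is illustrative only: the names and the trimming order are hypothetical, not the actual go-libp2p-connmgr code (which also takes connection tags and the grace period into account).

```go
// Illustrative sketch of the LowWater/HighWater behaviour described above.
// All names here are hypothetical; the real logic lives in go-libp2p-connmgr.
package main

import "fmt"

type conn struct{ peer string }

type connMgr struct {
	lowWater  int
	highWater int
	conns     []conn
}

// maybeTrim does nothing until the connection count exceeds HighWater,
// then closes connections until only LowWater remain.
func (m *connMgr) maybeTrim() {
	if len(m.conns) <= m.highWater {
		return // at or below the high watermark: nothing to do
	}
	for len(m.conns) > m.lowWater {
		victim := m.conns[len(m.conns)-1]
		m.conns = m.conns[:len(m.conns)-1]
		fmt.Println("closing connection to", victim.peer)
	}
}

func main() {
	m := &connMgr{lowWater: 2, highWater: 3}
	for _, p := range []string{"QmA", "QmB", "QmC", "QmD"} {
		m.conns = append(m.conns, conn{p})
		m.maybeTrim() // only trims once we exceed 3 connections, then closes down to 2
	}
	fmt.Println("remaining connections:", len(m.conns))
}
```

Used this way, the manager never refuses new connections; it only sweeps back down toward the low watermark after the high watermark has been exceeded.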
docs/config.md (outdated):

    - `HighWater`
    HighWater is the number of connections that, when exceeded, will trigger a connection GC operation.
    - `GracePeriod`
    GracePeriod is the length of time that new connections are immune from being closed by the connection manager.
I assume this is in seconds? Can we get that written down?
It's a time duration (like all the other time-related things): 30s == 30 seconds, 1h == one hour, 4m == 4 minutes, etc.
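For clarity, those are standard Go duration strings; a small sketch showing how they parse:

```go
// GracePeriod (like the other time-related config values) is a Go duration
// string. A quick demonstration of how such strings parse:
package main

import (
	"fmt"
	"time"
)

func main() {
	for _, s := range []string{"30s", "4m", "1h"} {
		d, err := time.ParseDuration(s)
		if err != nil {
			panic(err) // invalid duration strings fail to parse
		}
		fmt.Printf("%s -> %v\n", s, d)
	}
}
```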
    @@ -11,7 +11,7 @@ import (
        ma "gx/ipfs/QmXY77cVe7rVRQXZZQRioukUM7aRW3BTcAgJe12MCtb3Ji/go-multiaddr"
        peer "gx/ipfs/QmXYjuNuxVzXKJCfWasQk1RqkhVLDM9jtUKhqc2WPQmFSB/go-libp2p-peer"
        pro "gx/ipfs/QmZNkThpqfVXs9GNbexPrfBbXSLNYeKrE7jwFM2oqHbyqN/go-libp2p-protocol"
    -   p2phost "gx/ipfs/QmaSxYRuMq4pkpBBG2CYaRrPx2z7NmMVEs34b9g61biQA6/go-libp2p-host"
    +   p2phost "gx/ipfs/Qmc1XhrFEiSeBNn3mpfg6gEuYCt5im2gYmNVmncsvmpeAk/go-libp2p-host"
Out of curiosity: in Go, don't interface packages get an `interface` in their name?
No, they do. But this is where we're instantiating the concrete type.
I see that https://github.com/libp2p/go-libp2p-connmgr/blob/master/connmgr.go keeps track of the peers to whom we opened connections. I guess in Go memory is very cheap; in our JS implementation I would/will implement this ConnManager as just checking everything that is on PeerBook (your PeerStore) and using the
Primarily, it keeps track of tags on each connection.
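As a rough illustration of what tag bookkeeping of this kind looks like (all names, tag strings, and weights below are hypothetical, not the actual go-libp2p-connmgr types): each peer carries weighted tags, and peers with a low total weight make the cheapest candidates when connections have to be closed.

```go
// Hypothetical sketch of per-connection tag bookkeeping; not the actual
// go-libp2p-connmgr types. Tag names and weights below are made up.
package main

import "fmt"

// peerTags maps peer ID -> tag name -> weight.
type peerTags map[string]map[string]int

func (t peerTags) tag(peer, name string, weight int) {
	if t[peer] == nil {
		t[peer] = make(map[string]int)
	}
	t[peer][name] = weight
}

// score sums a peer's tag weights; low-scoring peers make the cheapest
// candidates when connections need to be closed.
func (t peerTags) score(peer string) int {
	total := 0
	for _, w := range t[peer] {
		total += w
	}
	return total
}

func main() {
	t := peerTags{}
	t.tag("QmPeerA", "kbucket", 5)  // e.g. a peer in our DHT routing table
	t.tag("QmPeerA", "bitswap", 10) // e.g. actively exchanging blocks with us
	t.tag("QmPeerB", "bitswap", 1)
	fmt.Println(t.score("QmPeerA"), t.score("QmPeerB")) // 15 1
}
```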
Force-pushed from a5367a3 to 76c4b04.
So, it's probably not too hard for a determined attacker to DoS an IPFS node but the grace period seems a bit too easy to exploit.
In the future, we can use some form of simple proof of work to ensure peers actually want to talk to us (under high loads) but, for now, the simplest solution I can think of is to limit the number of nodes in the grace period set (note: connections from our node should always have a grace period). It's still possible to make it hard to establish new inbound connections but that can already be an issue.
    - `HighWater`
    HighWater is the number of connections that, when exceeded, will trigger a connection GC operation.
    - `GracePeriod`
    GracePeriod is a time duration that new connections are immune from being closed by the connection manager.
Unless I'm mistaken, this can trivially be used to DoS a node.
I wouldn't call it trivial, but yes, you can use this to exceed a node's maximum connection count and make a connection-closing sweep not reach the low water mark.
Sorry, for some reason, I assumed we had a hard "don't open new connections past the high water mark" limit (I was a bit sleep deprived due to jet lag...). That would have effectively taken the node offline.
However, I'm still worried that this will be a way to monopolize a peer by tying up their connection quota with connections in the grace period. Basically, you can ensure that no connection lasts more than 10 seconds. Maybe don't count connections in the grace period?
Yeah, connections in the grace period are basically not counted; see here: https://github.com/libp2p/go-libp2p-connmgr/blob/master/connmgr.go#L61
I see, so we literally just skip them. Nevermind, LGTM.
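A hedged sketch of what "skipping" grace-period connections means in practice (the linked connmgr.go is the authoritative version; all names here are illustrative): connections younger than GracePeriod are simply excluded from the set considered for closing.

```go
// Illustrative only: what it means for grace-period connections to be
// skipped during a trim. The authoritative version is the linked connmgr.go.
package main

import (
	"fmt"
	"time"
)

type trackedConn struct {
	peer   string
	opened time.Time
}

// trimCandidates returns the connections old enough to be considered for
// closing; anything still inside the grace period is skipped entirely.
func trimCandidates(conns []trackedConn, grace time.Duration, now time.Time) []trackedConn {
	var out []trackedConn
	for _, c := range conns {
		if now.Sub(c.opened) < grace {
			continue // still in the grace period
		}
		out = append(out, c)
	}
	return out
}

func main() {
	now := time.Now()
	conns := []trackedConn{
		{peer: "QmOld", opened: now.Add(-2 * time.Minute)},
		{peer: "QmNew", opened: now.Add(-5 * time.Second)},
	}
	for _, c := range trimCandidates(conns, 30*time.Second, now) {
		fmt.Println("eligible for closing:", c.peer)
	}
}
```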
Interface looks good to me. I trust all of you to make the right decision on figuring out the names for the marks ;)
Observations from some of our nodes, connection closing was run
License: MIT Signed-off-by: Jeromy <jeromyj@gmail.com>
Force-pushed from 76c4b04 to 3b8eed8.
Ping @diasdavid @lgierth @Kubuxu for review (or a response to the previous review follow-up).
"name": "go-testutil", | ||
"version": "1.1.12" | ||
"version": "1.1.11" |
Is go-testutil being downgraded?
Hrm... that's probably the fault of a rebase gone rogue. I'll take care of it.
Ah, it's because we accidentally updated go-testutil in master without bubbling the changes from the rest of the deps at the same time. I think this change is safe for now; we can bubble up those deps and fix things later.
LGTM 👏 👍
One tiny comment about the go-testutil downgrade
This part LGTM. I am still not done reviewing some of the dep changes, but as far as I got: 👍
The only comment I have is that it would be good to have info about this feature returned somewhere from the API, and to have a test that the config is being loaded correctly.
Any suggestions here?
The network section in
@Kubuxu I think we should merge this now and do the API introspection as a separate PR. There's a fair amount of discussion to be had around that (I realize as I write it).
See #4308.
License: MIT Signed-off-by: Jeromy <jeromyj@gmail.com>
Force-pushed from 3b8eed8 to 62d5217.
I am not blocking this PR. I just raised a note that we should have it.
Choo Choo!
Semi-relatedly, and because I want to activate this feature and see what difference it makes: is there a command that will show me the (number of) active connections on the CLI, so I can tune the high/low water marks sensibly?
@skorokithakis |
Implement Connection Manager
This is the first WIP branch of the connection manager. It allows you to set up a ConnMgr that tries to keep the number of open connections your node has within certain bounds.
There's still a lot to do here.
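As a rough companion to that description, here is a hedged sketch of what the configuration-driven setup amounts to, using the config fields discussed in this PR (`Type` with `"none"`/`"basic"`, `LowWater`, `HighWater`, `GracePeriod`). The struct and function names are hypothetical stand-ins, not the actual go-ipfs code; the real wiring lives in core/core.go and go-libp2p-connmgr.

```go
// Hypothetical sketch of turning the connection-manager config section
// (Type, LowWater, HighWater, GracePeriod) into a connection manager.
// Names are stand-ins; the real wiring is in core/core.go.
package main

import (
	"fmt"
	"time"
)

type ConnMgrConfig struct {
	Type        string
	LowWater    int
	HighWater   int
	GracePeriod string // a duration string, e.g. "30s"
}

func connManagerFromConfig(cfg ConnMgrConfig) (string, error) {
	switch cfg.Type {
	case "", "none":
		return "no connection manager (disabled by default)", nil
	case "basic":
		grace, err := time.ParseDuration(cfg.GracePeriod)
		if err != nil {
			return "", fmt.Errorf("parsing GracePeriod: %w", err)
		}
		return fmt.Sprintf("basic connmgr: low=%d high=%d grace=%s",
			cfg.LowWater, cfg.HighWater, grace), nil
	default:
		return "", fmt.Errorf("unknown ConnMgr type: %q", cfg.Type)
	}
}

func main() {
	desc, err := connManagerFromConfig(ConnMgrConfig{
		Type: "basic", LowWater: 900, HighWater: 1000, GracePeriod: "30s",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(desc)
}
```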