Reduce the impact of the DHT #6283

Open
Stebalien opened this issue May 1, 2019 · 10 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature

Comments

@Stebalien
Member

Currently, all nodes participate as full DHT servers by default. Unfortunately, this means:

  1. We have a lot of crappy unreachable DHT servers.
  2. I have to repeatedly tell users to run their client with ipfs daemon --routing=dhtclient because the DHT is causing the network to DoS their system.

Related work:

However, I'm wondering if we should consider an interim solution: run in DHT-client mode by default, at least for now.

  1. Create a "laptop" config profile and make it the default. The laptop profile will use a "client" routing option.
  2. Create a "desktop" config profile and use an "auto" routing option. At the moment, this will default to client until we have the ability to switch between client/server mode dynamically.
  3. Modify the "server" config profile to default the routing option to "dht".
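
A minimal sketch of how the three proposed profiles could map to routing options. The profile and option names come from the list above; the dictionary, the function name, and the fallback logic are hypothetical and not the actual go-ipfs config code.

```python
# Hypothetical sketch: profile and routing names follow the proposal above,
# not the real go-ipfs configuration code.
PROFILE_ROUTING = {
    "laptop": "client",   # proposed default profile
    "desktop": "auto",    # meant to switch dynamically in the future
    "server": "dht",      # full DHT server
}

DEFAULT_PROFILE = "laptop"


def effective_routing(profile: str = DEFAULT_PROFILE) -> str:
    """Return the routing mode a node with this profile would run in."""
    option = PROFILE_ROUTING[profile]
    # Until dynamic client/server switching exists, "auto" behaves as client.
    return "client" if option == "auto" else option
```

Under this sketch, only nodes explicitly initialized with the "server" profile would act as DHT servers.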

The main drawback of this solution is that it'll make the IPFS network significantly less "p2p". That is, in a pure p2p network, all nodes are equal. On the other hand, all nodes are clearly not equal in terms of hardware, so I'm not that concerned about this.

Thoughts and concerns?

cc @whyrusleeping & @daviddias?

@Stebalien Stebalien added the kind/enhancement A net-new feature or improvement to an existing feature label May 1, 2019
@vyzo
Contributor

vyzo commented May 1, 2019

If we get a laptop profile, we might want to enable autorelay by default for it as well.

@vyzo
Contributor

vyzo commented May 1, 2019

> Create a "desktop" config profile and use an "auto" routing option. At the moment, this will default to client until we have the ability to switch between client/server mode dynamically.

This might be unreasonable, we might want to have this be dht by default until we have the magic option to switch dynamically.

@obo20

obo20 commented May 1, 2019

I'm in favor of defaulting people to dhtclient for now. This point specifically resonates with me:

> We have a lot of crappy unreachable DHT servers.

My thoughts are that most people who would opt-in to be a DHT server would have somewhat of an idea what they're doing, and the nodes opting in would likely be more stable as they're intentionally configured to redistribute content.

@Stebalien
Member Author

> If we get a laptop profile, we might want to enable autorelay by default for it as well.

SGTM.

> Create a "desktop" config profile and use an "auto" routing option. At the moment, this will default to client until we have the ability to switch between client/server mode dynamically.

> This might be unreasonable, we might want to have this be dht by default until we have the magic option to switch dynamically.

My thinking is that, at the moment, the DHT is too much overhead even for the average desktop. Fixing the issues I noted in the issue description will help with that but, IMO, not even desktops should be DHT nodes until then.

> My thoughts are that most people who would opt-in to be a DHT server would have somewhat of an idea what they're doing, and the nodes opting in would likely be more stable as they're intentionally configured to redistribute content.

Exactly.

@Stebalien
Member Author

Requires #6287.

@vyzo
Contributor

vyzo commented May 1, 2019

My concern is that we might end up with a DHT that is vastly undersized for the scale of the network.

@Stebalien
Member Author

> My concern is that we might end up with a DHT that is vastly undersized for the scale of the network.

I agree although I think we'll get that simply by defaulting to the "laptop" profile. However, I'd be fine defaulting the desktop profile to "dht" at first (for a slower transition).

@BillDStrong

Keeping this as a stopgap measure sounds fine. As an experimenting user, I don't want to know about all of this.

To prevent an undersized DHT, I would suggest a simple test of the user's hardware resources at first run. Declare some minimum threshold, and if the user exceeds that threshold, ask the user if they would like to enable some services to keep the network healthy.

You would want to overestimate the minimum hardware. You don't want the user to ever have to think about IPFS running in the background, taking precious cycles from their games/work.
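
A rough sketch of what such a first-run check could look like. The threshold values, the `Hardware` type, and the function name are all hypothetical, and how the hardware numbers are measured is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    cpu_cores: int
    ram_gib: float


# Hypothetical thresholds, deliberately set high so that borderline
# machines are never asked to act as DHT servers.
MIN_CPU_CORES = 8
MIN_RAM_GIB = 16.0


def should_offer_dht_server(hw: Hardware) -> bool:
    """Return True if the first-run prompt should be shown at all.

    This only decides whether to ask; the user still has to opt in.
    """
    return hw.cpu_cores >= MIN_CPU_CORES and hw.ram_gib >= MIN_RAM_GIB
```

A machine below either threshold would silently stay in client mode and never see the question.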

@bonekill

bonekill commented May 14, 2019

+1 to interim plan
You should not have to worry too much about destabilizing the network because...

1. Most nodes do not update with haste.
Notably there are a few large projects that have their clients and server clusters hang back a few versions, so in the event that this starts leading to an absurd client to server ratio a patch can be released to start reversing the swarm back to the old behavior before problems arise.

Edit: The new release process mostly outdates this.

2. The change is not applied to upgraded instances.
For this change to take effect, the user would need to run "ipfs init" or make an explicit config alteration, as I don't believe we can distinguish between an existing config explicitly set to "dht" and one that just defaulted to it. This reinforces 1., as the adoption rate is further reduced.

3. Ability to lower the α (alpha concurrency) parameter.
IIRC the α parameter for searching through the DHT is cranked up to deal with all the useless nodes. Once a large number of useless nodes are removed, you can pull α back to a saner number (e.g. α = 3). While you cannot make this particular change in the same patch (because of points 1 and 2), it should eventually lower the swarm "cost" of each query because fewer RPCs are canceled while in flight. Hopefully, while the raw capacity drops, the query efficiency rises, netting a greater effective capacity than before.

Edit: α is already 3, and has been for a very long time... whoops.

4. Reduces "scattershot" behavior.
IPFS seems to increase the number of in-flight requests the longer it takes to find valid results. Lots of useless nodes waste a lot of time on timeouts, and IPFS seems to spawn many RPCs to make up for a failed one. Fewer wasted RPCs result in faster queries and fewer panic "scattershots" through routing tables. This behavior should decrease proportionally as the usable node ratio gets better. I'm not sure if this behavior is intentional or still exists; it's just something I have observed in the past.

Edit: I cannot replicate this behavior anymore.
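
To put rough numbers on the cost of unreachable servers discussed in points 3 and 4, here is a toy model. It assumes each queried peer is reachable independently with the same probability, which real routing tables do not guarantee; the function names are hypothetical and nothing here reflects the actual go-ipfs query logic.

```python
def all_unreachable_probability(reachable_fraction: float, alpha: int = 3) -> float:
    """Probability that all `alpha` parallel RPCs in one round hit dead peers.

    Toy model: peers are assumed independently reachable with the same odds.
    """
    if not 0.0 <= reachable_fraction <= 1.0:
        raise ValueError("reachable_fraction must be between 0 and 1")
    return (1.0 - reachable_fraction) ** alpha


def expected_rpcs_per_response(reachable_fraction: float) -> float:
    """Expected number of RPCs sent per peer that actually answers.

    This is the mean of a geometric distribution, 1 / p.
    """
    if not 0.0 < reachable_fraction <= 1.0:
        raise ValueError("reachable_fraction must be in (0, 1]")
    return 1.0 / reachable_fraction
```

In this model, with half of the servers unreachable and α = 3, one round in eight times out entirely, and each useful answer costs two RPCs on average.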

A warning, however: you should observe after the patch whether your own DHT nodes are getting hit too hard and whether a reversal is required. While unlikely given the above, if the DHT client-to-server ratio hits a critical point, the entire DHT swarm may fail in a cascade and be difficult to bootstrap again. You would then need to wait for a large number of requesting clients to give up and/or bring a large number of healthy DHT server nodes online in bulk to fix it.

@yiannisbot
Member

Isn't this issue obsolete after the IPFS v0.5 version where new nodes use AutoNAT to get their node status?
