This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

HTTP daemon performance is low #3464

Closed
ukstv opened this issue Dec 29, 2020 · 8 comments
Labels
kind/support A question or request for support

Comments

@ukstv
Contributor

ukstv commented Dec 29, 2020

  • Version:
    ipfs-http-server v0.1.4

  • Platform:

    • Darwin feather 20.2.0 Darwin Kernel Version 20.2.0: Wed Dec 2 20:39:59 PST 2020; root:xnu-7195.60.75~1/RELEASE_X86_64 x86_64
    • Node.js v14.15.1
  • Subsystem:
    ipfs-http-daemon

Severity:

Medium: Performance Issues

Description:

The JS-IPFS HTTP daemon refuses any new HTTP connection over a certain threshold, which on my machine is 20. I believe Node.js is capable of more than this.

As far as I understand, the js-ipfs HTTP daemon is built on top of Hapi, which is apparently one of the slowest HTTP frameworks for Node.js. I believe this explains the inability to handle more than 20 simultaneous connections. As an experiment, I implemented the dag.put HTTP endpoint on top of Fastify with the same IPFS instance configuration, and it handles 1000 simultaneously initiated connections just fine.
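
The Fastify experiment itself is not included in this issue; below is a minimal sketch of what such an endpoint could look like, assuming js-ipfs's `create()` API and a Fastify 3-era `listen` call. The route, port and dag.put options are illustrative, not the original code.

```js
// Hypothetical sketch of the Fastify experiment described above, not the
// original code. Stores the posted JSON body as a dag-cbor node, mirroring
// what /api/v0/dag/put does.
const fastify = require('fastify')()
const IPFS = require('ipfs')

async function main () {
  const ipfs = await IPFS.create()

  fastify.post('/dag/put', async (request) => {
    const cid = await ipfs.dag.put(request.body, {
      format: 'dag-cbor',
      hashAlg: 'sha2-256'
    })

    return { Cid: { '/': cid.toString() } }
  })

  // illustrative port, separate from the js-ipfs API port
  await fastify.listen(3000)
}

main()
```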

I understand that one could put something like HAProxy in front, and that js-ipfs may never have been oriented towards performance, but it feels like you should be aware of the issue.

Steps to reproduce the error:

Adjust the TIMES variable in https://gist.github.com/ukstv/25f77d94113f32c0b2d200f8f1e0c3a1 to 20 and run it against a local js-ipfs instance. It works fine and reports no refused connections. If you set it back to 1000, a large proportion of the connections are refused.
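
The gist itself is not reproduced here; the sketch below shows the general shape of such a load test, assuming ipfs-http-client pointed at the local daemon's API port. The payload and counting logic are illustrative.

```js
// Hypothetical reproduction sketch (see the gist linked above for the real
// script): fire TIMES concurrent dag.put requests at a local js-ipfs daemon
// and report how many of them fail.
const createClient = require('ipfs-http-client')

const TIMES = 1000 // 20 works, 1000 sees many refused connections
const client = createClient({ url: 'http://127.0.0.1:5002' })

async function main () {
  const results = await Promise.all(
    Array.from({ length: TIMES }, (_, i) =>
      client.dag.put({ index: i }).then(() => null).catch(err => err)
    )
  )

  const failed = results.filter(Boolean)
  console.log(`${failed.length}/${TIMES} requests failed`)
}

main()
```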

@ukstv ukstv added the need/triage Needs initial labeling and prioritization label Dec 29, 2020
@welcome

welcome bot commented Dec 29, 2020

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

  • "Priority" labels will show how urgent this is for the team.
  • "Status" labels will show if this is ready to be worked on, blocked, or in progress.
  • "Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

@jacobheun
Contributor

We're looking into this.

May be related to #3469

achingbrain added a commit that referenced this issue Jan 12, 2021
Right now no `http.Agent` is used for requests made using the http
client in node, which means each request opens a new connection. This
can end up hitting process resource limits, which in turn means
connections get dropped.

The change here sets a default `http.Agent` with `keepAlive: true` and
`maxSockets` of 6, which is consistent with [browsers](https://tools.ietf.org/html/rfc2616#section-8.1.4)
and [native apps](https://developer.apple.com/documentation/foundation/nsurlsessionconfiguration/1407597-httpmaximumconnectionsperhost?language=objc).

The user can override the agent passed to the ipfs-http-client constructor
to restore the previous functionality:

```js
const http = require('http')
const createClient = require('ipfs-http-client')

const client = createClient({
  url: 'http://127.0.0.1:5002',
  agent: new http.Agent({
    keepAlive: false,
    maxSockets: Infinity
  })
})
```

Refs: #3464
@achingbrain
Member

achingbrain commented Jan 12, 2021

It's important to set a baseline for performance expectations. In this gist I set up an HTTP server and use a client to make requests to it, using only node core - no HTTP framework and no HTTP client abstractions.
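
The gist is not reproduced here, but the idea is roughly the following sketch (not the exact script): a plain `http` server and a client that ramps up the number of simultaneous requests until one fails.

```js
// Sketch of the baseline test: node core only, no framework, no client
// abstractions. Ramp up the number of concurrent requests until one errors.
const http = require('http')

const server = http.createServer((req, res) => res.end('ok'))

function request (port) {
  return new Promise((resolve, reject) => {
    // agent: false means every request opens (and closes) its own connection
    http.get({ port, agent: false }, res => {
      res.resume()
      res.on('end', resolve)
    }).on('error', reject)
  })
}

server.listen(0, async () => {
  const { port } = server.address()

  for (let n = 100; ; n = Math.round(n * 1.11)) {
    console.log(`testing ${n} concurrent requests`)

    try {
      await Promise.all(Array.from({ length: n }, () => request(port)))
      console.log(`max in flight ${n}`)
    } catch (err) {
      console.error(err)
      break
    }
  }

  server.close()
})
```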

Running on Mac OS X 10.15.6 I see the following:

$ node max-concurrent-requests.js
testing 100 concurrent requests
max in flight 100
testing 111 concurrent requests
max in flight 111
testing 123 concurrent requests
max in flight 123
testing 136 concurrent requests
request 125 failed
max in flight 0
Error: connect ECONNRESET 127.0.0.1:51658
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1137:16) {
  errno: 'ECONNRESET',
  code: 'ECONNRESET',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 51658
}

For me, after a few runs it starts to fail somewhere between 120-140 concurrent requests on average.

The same code run on Linux is much better:

$ node max-concurrent-requests.js
testing 100 concurrent requests
max in flight 100
testing 111 concurrent requests
max in flight 111
testing 123 concurrent requests
max in flight 123
... output omitted
testing 2027 concurrent requests
max in flight 2027
testing 2230 concurrent requests
request 2048 failed
... output omitted
max in flight 0
Error: connect EMFILE 127.0.0.1:38212 - Local (undefined:undefined)
    at internalConnect (net.js:923:16)
    at defaultTriggerAsyncIdScope (internal/async_hooks.js:323:12)
    at net.js:1011:9
    at processTicksAndRejections (internal/process/task_queues.js:79:11) {
  errno: 'EMFILE',
  code: 'EMFILE',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 38212
}

Over 2000 concurrent requests before it gets an EMFILE, likely because it's hit a limit on how many files a process can have open.

Why the ECONNRESETs though? A cursory Google search reveals lots of 'Why does my networking code work on Linux but ECONNRESET on OS X?'

There are two interesting parameters: one is the TCP connection backlog (511 by default, set as a parameter to server.listen) and the other is somaxconn. On OS X it's set to 128 by default, which is probably why I get so few concurrent requests compared to Linux. From what I understand, the value is 128 to give some protection against SYN flood attacks.

It's also set to 128 on Linux, but there the maximum number of concurrent connections seems to be limited by the process ulimit -n, so there may be something else at play on that platform.
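
For completeness, the backlog can be raised from JavaScript when the server is created (node's default is 511), though the kernel's somaxconn still caps the effective value. A minimal illustration:

```js
// Illustration only: ask node for a larger accept backlog. The kernel clamps
// this to somaxconn, so it only takes effect once kern.ipc.somaxconn (OS X)
// or net.core.somaxconn (Linux) has been raised as well.
const http = require('http')

const server = http.createServer((req, res) => res.end('ok'))

// illustrative port; node's default backlog of 511 is replaced with 2048
server.listen({ port: 8080, backlog: 2048 }, () => {
  console.log('listening with a 2048 connection backlog')
})
```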

Anyway, for OS X we can increase this until the next reboot with:

$ sudo sysctl kern.ipc.somaxconn=2048
kern.ipc.somaxconn: 128 -> 2048

Now run the test again, also increasing the connection backlog to at least 2048:

$ node max-concurrent-requests.js
testing 100 concurrent requests
max in flight 100
testing 111 concurrent requests
max in flight 111
testing 123 concurrent requests
max in flight 123
... output omitted
testing 2027 concurrent requests
max in flight 2027
testing 2230 concurrent requests
request 2047 failed
max in flight 0
Error: connect ECONNRESET 127.0.0.1:52661
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1137:16) {
  errno: 'ECONNRESET',
  code: 'ECONNRESET',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 52661
}

Great! Now we have way more concurrent requests, similar to Linux.

On to the benchmarks. I've taken your test and modified it slightly to:

a) create the random data before we start making connections, since generating it is not free
b) incorporate the changes from #3474, which give a bit more control over the behaviour of the http client

It also needs this .diff applied to node_modules/ipfs-utils/src/http.js to tell us the number of requests in flight at any one time.
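
The modified script and the ipfs-utils diff are not attached here; the sketch below captures the shape of the benchmark under those assumptions: random data generated up front, a configurable http.Agent per #3474, and the in-flight counter living in the patched ipfs-utils rather than in this script.

```js
// Sketch of the modified benchmark, assuming the `agent` option from #3474.
// The `max in flight` figure comes from the patched ipfs-utils, not from here.
const http = require('http')
const crypto = require('crypto')
const createClient = require('ipfs-http-client')

const TIMES = 1000

const client = createClient({
  url: 'http://127.0.0.1:5002',
  agent: new http.Agent({ keepAlive: true, maxSockets: 100 })
})

async function main () {
  // a) create the random data before we start making connections
  const payloads = Array.from({ length: TIMES }, () => ({
    data: crypto.randomBytes(256).toString('hex')
  }))

  const failed = new Set()
  const start = Date.now()

  await Promise.all(payloads.map((payload, i) =>
    client.dag.put(payload).catch(err => {
      console.error(err)
      failed.add(i)
    })
  ))

  console.log(failed)
  console.log(`${failed.size}/${TIMES}`)
  console.log(`took ${Date.now() - start} ms`)
}

main()
```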

On MacOS X with 1000 requests, maxSockets: 100 and keepAlive: true on the agent I see:

$ node index.js 
Set {}
0/1000
max in flight 1
took 7324 ms

So no errors and it completed in 7.3 seconds.

With 150 sockets I see:

$ node index.js 
Set {}
0/1000
max in flight 1
took 6860 ms

With 200 sockets I see:

$ node index.js 
Set {}
0/1000
max in flight 1
took 6752 ms

Something to note here is that max in flight is only ever 1 - the API server is responding too quickly, so we never have multiple requests open at once. We need to make another change to validate what's going on.

Apply this diff to node_modules/ipfs-http-server/src/api/resources/dag.js to add 100ms of latency to every ipfs.dag.put request.
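
The diff itself isn't attached here; the idea is simply to delay the handler before it calls ipfs.dag.put, roughly like this (a sketch only, the real handler in ipfs-http-server is structured differently):

```js
// Sketch only: add ~100ms of artificial latency to each dag.put so requests
// overlap and `max in flight` tracks the agent's maxSockets.
async function handler (request, h) {
  await new Promise(resolve => setTimeout(resolve, 100)) // artificial latency

  const cid = await request.server.app.ipfs.dag.put(request.payload)

  return h.response({ Cid: { '/': cid.toString() } })
}
```

With that latency in place, max in flight starts to increase in line with maxSockets: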

maxSockets: 100

$ node concurrent-requests.js
Set {}
0/1000
max in flight 100
took 7304 ms

maxSockets: 150

$ node concurrent-requests.js
Set {}
0/1000
max in flight 150
took 6847 ms

maxSockets: 200

$ node concurrent-requests.js
165/1000 FetchError: request to http://localhost:5002/api/v0/dag/put?format=dag-cbor&input-enc=raw&hash=sha2-256 failed, reason: connect ECONNRESET 127.0.0.1:5002
    at ClientRequest.<anonymous> (/Users/alex/test/http/node_modules/node-fetch/lib/index.js:1461:11)
    at ClientRequest.emit (events.js:323:22)
    at Socket.socketErrorListener (_http_client.js:426:9)
    at Socket.emit (events.js:311:20)
    at emitErrorNT (internal/streams/destroy.js:92:8)
    at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
    at processTicksAndRejections (internal/process/task_queues.js:84:21) {
  message: 'request to http://localhost:5002/api/v0/dag/put?format=dag-cbor&input-enc=raw&hash=sha2-256 failed, reason: connect ECONNRESET 127.0.0.1:5002',
  type: 'system',
  errno: 'ECONNRESET',
  code: 'ECONNRESET'
}
Set { 165 }
1/1000
max in flight 200
took 6850 ms

Let's increase kern.ipc.somaxconn to 2048 and set maxSockets: Infinity and keepAlive: false:

$ node concurrent-requests.js
Set {}
0/1000
max in flight 1000
took 7135 ms

kern.ipc.somaxconn=2048, maxSockets: Infinity, keepAlive: true:

$ node concurrent-requests.js 
Set {}
0/1000
max in flight 1000
took 6301 ms

So we can increase the number of incoming connections, but only by tweaking system parameters, which seems a little unreasonable. A better solution is to limit the number of concurrent connections used by the http client in node through the use of an http.Agent, which is the purpose of #3474.
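
Concretely, that means handing the client an agent with keepAlive enabled and a small maxSockets, so surplus requests queue inside node instead of all opening sockets at once. The values below match the defaults described in the commit above:

```js
// Client-side throttling along the lines of #3474: with keepAlive and a small
// maxSockets, surplus requests queue inside node's http.Agent instead of each
// opening a new connection and overflowing the server's accept backlog.
const http = require('http')
const createClient = require('ipfs-http-client')

const client = createClient({
  url: 'http://127.0.0.1:5002',
  agent: new http.Agent({
    keepAlive: true,
    maxSockets: 6 // default from the commit above; raise for more parallelism
  })
})
```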

@achingbrain achingbrain added kind/support A question or request for support and removed need/triage Needs initial labeling and prioritization labels Jan 12, 2021
@ukstv
Contributor Author

ukstv commented Jan 12, 2021

@achingbrain that’s a very rigorous analysis, thank you. So there are two ways of dealing with the refused connections now: either increase the OS-level parameters, or wait until #3474 is released. Ideally both should be applied, as they address different sides of the data flow. When is the release, then? :)

@achingbrain
Member

achingbrain commented Jan 12, 2021

As soon as node 14 15 stops making my life interesting and the build passes 😉

Tomorrow, all things going well.

@ukstv
Contributor Author

ukstv commented Jan 12, 2021

May the force be with you.

achingbrain added a commit that referenced this issue Jan 13, 2021
@achingbrain
Member

An RC with this change is now available:

$ npm install ipfs-http-client@48.1.4-rc.5

@achingbrain
Member

The agent change was shipped in ipfs-http-client@48.2.0 and the required kernel configuration has been outlined above - please re-open this issue if you're still seeing problems.

SgtPooki referenced this issue in ipfs/js-kubo-rpc-client Aug 18, 2022